5 Feb, 2012  |  Written by  |  under BTRFS, Linux

I have been waiting for the video of a talk that Chris Mason gave at this year’s Scale 10x to finally be posted online.  The original Scale 10x talks were streamed live, and the website claims that the videos will be posted online soon, but so far no date has been provided.

In the meantime, however, I found a link to another talk given by Chris, this time hosted at linuxfoundation.org. In order to view the full video you do need to provide your name and email address, but the process is painless and well worth the 30 seconds it takes to fill in the form.

The talk appears to have been put together in December 2011, so it is relatively new and up to date. It provides a nice introduction to btrfs, a look at the upcoming feature set, and a list of work that still needs to be done in order to make btrfs production ready.

Here is a link to the first few minutes of the talk:

5 Feb, 2012  |  Written by  |  under BTRFS, Linux, Xfs

Another file system talk came out of the recent Linux.conf.au conference. This one was given by Dave Chinner and was entitled ‘XFS: Recent and Future Adventures in Filesystem Scalability’.

Here Dave discusses some of the historical roadblocks which prevented XFS from scaling as well as it could have, provides some in-depth details about how these issues were eventually overcome, and shows off some benchmarks comparing throughput and overall scaling using XFS, EXT4 and BTRFS.

Dave finishes up the talk with some discussion about what you can expect next from XFS and then takes some questions from the audience.

30 Jan, 2012  |  Written by  |  under Gluster, Redhat

Now that RedHat has purchased Gluster, and they are in the process of releasing their storage software appliance, many people are wondering what all this means for the GlusterFS project and gluster.org as a whole.

John Mark Walker conducted a webinar last week entitled ‘The Future of GlusterFS and Gluster.org’. At the beginning of this presentation John talks about the history and origins of the Gluster project. He then goes into a basic overview of the features provided by GlusterFS, and finally he talks about what to expect from version 3.3 of GlusterFS and the GlusterFS open source community going forward.

Here are some of the talking points that were discussed during the webinar:

  • Unstructured data is expected to grow 44X by 2020
  • Scale out storage will hold 63,000 PB by 2015
  • RedHat is aggressively hiring developers with file system knowledge
  • Moving back to an open-source model from an open-core model
  • Open source version will be testing ground for new features
  • RHSSA will be more hardened and thoroughly tested
  • Beta 3 for 3.3 due in Feb/Mar 2012
  • GlusterFS 3.3 expected in Q2/Q3 of 2012

Here is the link to the entire presentation in a downloadable .mp4 format.

Here is a link to all the slides that were presented during the talk.

30 Jan, 2012  |  Written by  |  under BTRFS, Linux

Here is a YouTube video of a presentation from this year’s Linux.conf.au conference given by Avi Miller.  The video talks about the current state of btrfs and some of the upcoming features, and Avi also provides a demonstration of one of the filesystem recovery tools in action.

Here are a few of the highlights:

  • Lots of performance and stability fixes
  • Lots of code cleanup
  • New compression options (LZO and snappy)
  • Auto file defrag
  • Kernel 3.3 will allow larger block sizes (4k,8k,16k) for better meta-data throughput
  • A ZFS like send/receive is in the works
  • New filesystem checker (btrfsck) should be released by Feb 14th
  • Raid 5/6 code (from Intel) will go into mainline kernel after the release of btrfsck
  • Options exist/will exist to do mixed raid modes for data and meta-data
  • Btrfs will be production filesystem in next version of Oracle Unbreakable Linux

No doubt about it, if you are interested in the current state of btrfs you should check out this talk.

17 Jan, 2012  |  Written by  |  under Debian, Linux

I have spent some time over the last few weeks getting familiar with mdadm and software RAID on Linux, so I thought I would write down some of the commands and example syntax that I have used while getting started.

1) If we would like to create a new RAID array from scratch we can use the following example commands. Note that mdadm requires --raid-devices when creating an array, and the active and spare counts together must match the number of devices listed:
RAID1 with 2 drives:
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1

RAID5 with 5 drives:
mdadm --create --verbose /dev/md0 --level=5 --raid-devices=5 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

RAID6 with 4 drives and 1 spare:
mdadm --create --verbose /dev/md0 --level=6 --raid-devices=4 --spare-devices=1 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
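The relationship between the device list and the two counts can be sketched with a small helper. This is only an illustration: the function name `mdadm_create_cmd` is made up for this example, and it prints the command instead of running it, so the result can be reviewed before any disk is touched.

```shell
# Hypothetical helper: print (do not run) an mdadm create command.
# Usage: mdadm_create_cmd <md device> <level> <number of spares> <member devices...>
mdadm_create_cmd() {
    md=$1
    level=$2
    spares=$3
    shift 3
    # Every listed device that is not a spare is an active member.
    active=$(( $# - spares ))
    printf 'mdadm --create --verbose %s --level=%s --raid-devices=%s' "$md" "$level" "$active"
    if [ "$spares" -gt 0 ]; then
        printf ' --spare-devices=%s' "$spares"
    fi
    printf ' %s' "$@"
    printf '\n'
}
```

For example, `mdadm_create_cmd /dev/md0 6 1 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1` prints the RAID6 command with 4 active devices and 1 spare.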

2) If we would like to add a disk to an existing array:
mdadm --add /dev/md0 /dev/sdf1 (the new disk is only added as a spare)

mdadm --grow /dev/md0 --raid-devices=[new number of active disks, not counting spares] (grows the array onto the new disk)

3) If we would like to remove a disk from an existing array:
First we need to ‘fail’ the drive:
mdadm --fail /dev/md0 /dev/sdc1

Next it can be safely removed from the array:
mdadm --remove /dev/md0 /dev/sdc1

4) In order to make the array survive a reboot, you need to add its details to the mdadm configuration file (‘/etc/mdadm/mdadm.conf’ on Debian, ‘/etc/mdadm.conf’ on most other distributions):
mdadm --detail --scan >> /etc/mdadm/mdadm.conf (Debian)
mdadm --detail --scan >> /etc/mdadm.conf (Everyone else)
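One thing to watch with `>>` is that running the command twice leaves duplicate ARRAY lines in the file. A minimal sketch of one way around that, assuming it is fine to compare whole lines: the helper name `append_missing_lines` is made up for this example, and it only appends lines that are not already present.

```shell
# Hypothetical helper: read lines on stdin and append to the given file
# only those that are not already in it (exact whole-line match).
append_missing_lines() {
    conf=$1
    touch "$conf"
    while IFS= read -r line; do
        # Skip empty lines.
        [ -n "$line" ] || continue
        # -F fixed string, -x whole line, -q quiet.
        grep -Fxq -- "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
    done
}

# Example use on Debian (as root):
#   mdadm --detail --scan | append_missing_lines /etc/mdadm/mdadm.conf
```

If an array's details change (for example after a rebuild with a new name), the old line will still need to be removed by hand.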

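5) In order to delete and remove the entire array:
First we need to ‘stop’ the array:
mdadm --stop /dev/md0

Next it can be removed:
mdadm --remove /dev/md0

To keep the old member disks from being detected as part of an array again, you can also wipe the RAID superblock from each of them:
mdadm --zero-superblock /dev/sda1 /dev/sdb1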

6) To examine the status of your RAID array, there are two options:
a) cat /proc/mdstat
b) mdadm --detail /dev/md0
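The status field at the end of each array's second line in /proc/mdstat (for example `[UU]` or `[U_]`) shows an underscore for every missing or failed member. Here is a rough sketch of a check built on that; the function name `degraded_arrays` is made up for this example, and it assumes the usual two-line layout per array.

```shell
# Hypothetical helper: print the name of every md array whose status
# field contains an underscore, i.e. has a missing or failed member.
# Reads /proc/mdstat unless another file is given as the first argument.
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md0 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # The status field, e.g. [UU] or [U_], is the last field on its line.
        /\[[U_]+\]/ {
            if ($NF ~ /_/) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Run from cron, something like `[ -z "$(degraded_arrays)" ] || echo "degraded array found"` would be enough to raise a flag, although `mdadm --detail` remains the place to look for the full picture.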

Here is a quick tip for anyone who needs to access files that exist underneath an already mounted filesystem mount point.  For example, suppose that you have some files located in a directory called ‘/tmp/docs’.

At some point someone might accidentally take that same directory and use it as the mount point for an NFS or CIFS mount.  If you need to access the original files that existed before the new mount was put into place, you have two options.

  1. Unmount the NFS or CIFS filesystem, access your files, and then remount.
  2. However, you may find yourself in a situation (such as I did) where it is extremely inconvenient or impossible for you to have the downtime associated with the umount/remount process.  In that case you have another option…you can use a ‘bind’ mount.

All you need to do is something like the following (the target directory must already exist):

mount --bind /tmp /tmp/new_location

Now you should be able to access the original files here:

‘/tmp/new_location/docs’
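This works because a plain bind mount of the parent directory does not bring along the filesystems mounted below it, so the original contents of the covered directory become visible at the new location. As a rough sketch, the check below looks for the directory in /proc/mounts before going to the trouble; the function name `is_mounted_over` is made up for this example, and it does not handle mount points containing spaces.

```shell
# Hypothetical helper: succeed if the given directory is listed as a
# mount point. Reads /proc/mounts unless a file is given as argument 2.
is_mounted_over() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example use (as root), with the paths from the post:
#   if is_mounted_over /tmp/docs; then
#       mkdir -p /tmp/new_location
#       mount --bind /tmp /tmp/new_location
#       ls /tmp/new_location/docs     # the original files
#       umount /tmp/new_location      # clean up when finished
#   fi
```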

29 Nov, 2011  |  Written by  |  under Couchbase, Couchdb

Over the last few weeks I have started to familiarise myself with the current state of the CouchDB ecosystem.  I hope that this will be the first of several posts in which I will be able to detail some of the things I have been able to learn about CouchDB so far, and also what we learn in the future when we are finally able to put this database into production.

With all the different companies, similar sounding project names and technology buzzwords surrounding CouchDB, it can often seem very confusing, and some people have found it difficult to sift through all the jargon and come away with a firm grasp on exactly who and what is behind CouchDB.

Let’s start off by defining some of the players and giving an overview of what they provide:

1) Apache CouchDB – is an open source, document-oriented database.  It is part of the new breed of databases commonly referred to as NoSQL datastores.  CouchDB uses JSON to store documents, and you can interact with the database using a RESTful JSON API.

2) Couchbase – is a company that provides software and enterprise support for several projects that are based on the CouchDB source code.  Let’s take a look at a couple of these projects in more detail:

  • Membase Server (Couchbase) – is an elastic, distributed, key-value database management system optimized for storing data for web applications.  For anyone already familiar with memcached, Membase is basically memcached on steroids: it allows you to build a distributed cluster of memcached instances, and it provides options for both persistent and non-persistent storage.  Another nice feature provided by Membase is a web GUI that displays all sorts of useful statistics that can help you understand exactly what is going on with your servers.
  • Couchbase Single Server – is the software package that you would download if you were looking for a replacement for (or an equivalent to) CouchDB.  This is basically the stock Apache CouchDB source code, with the geocouch extension enabled, as well as some additional patches provided by the developers that work for Couchbase.
  • Couchbase Server 2.0 – represents the future of both the Couchbase and Membase codebases.  Server 2.0 basically removes the SQLite backend that is currently being used by Membase, and replaces it with CouchDB.  At this point Server 2.0 has been released with a ‘Developer Preview’ status, and thus I do not believe it is quite ready for production use.  Currently Couchbase Server 2.0 only allows you to access the data stored in the backend CouchDB database using the memcache protocol (you have to go through memcache to access the data stored in CouchDB). Future versions (3.0, etc) promise to allow you to access the data via both the memcache protocol and the CouchDB RESTful JSON API, but this is currently not the case, and will most likely not be available for some time.

3) BigCouch – an open source version of CouchDB, written in Erlang, that allows scaling beyond a master/slave architecture via database sharding.  A BigCouch deployment will be seen as a single large CouchDB instance from the application’s perspective.

4) Couchdb-lounge – an open source project which uses Nginx and Python to provide a proxy based framework to achieve additional scaling beyond a master/slave architecture for CouchDB.

5) Cloudant – an enterprise software company which provides CouchDB hosting and enterprise support, and is also the company behind BigCouch.
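To give a feel for the RESTful JSON API mentioned under Apache CouchDB above, here is a minimal sketch using curl. It assumes a CouchDB instance listening on its default port 5984 on the local machine with no authentication required; the function names and the `COUCH_URL` variable are made up for this example.

```shell
# Base URL of the CouchDB server (assumption: local instance, default port).
COUCH_URL=${COUCH_URL:-http://127.0.0.1:5984}

# Create a database: PUT /<db>
couch_create_db() {
    curl -s -X PUT "$COUCH_URL/$1"
}

# Store a JSON document under a chosen id: PUT /<db>/<id>
couch_put_doc() {
    curl -s -X PUT "$COUCH_URL/$1/$2" -H 'Content-Type: application/json' -d "$3"
}

# Fetch a document back: GET /<db>/<id>
couch_get_doc() {
    curl -s "$COUCH_URL/$1/$2"
}

# Example use:
#   couch_create_db demo
#   couch_put_doc demo doc1 '{"title": "hello"}'
#   couch_get_doc demo doc1
```

Everything is plain HTTP with JSON bodies, which is a large part of why CouchDB is easy to poke at from the command line.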

**UPDATE**

Dane (see comments) pointed out that ATI has in fact released the 11.10 version of their drivers.  I went ahead and gave them a try, and using them broke most things for me.

Once I booted back in to Gnome…I had some of the Gnome3 look and feel…but everything else (menus, icons, etc) was clearly from Gnome2.  I reinstalled version 11.9 and everything was back to normal.  This update might work for some other setups…but for now I’ll just stick with the version that is working 95% of the time.

——————————————————————————————————————————————————————

I was finally able to get a working desktop using Ubuntu 11.10, Gnome Shell, Gnome 3.2 along with my Radeon HD 2400 XT video card.  The adventure started a few weeks ago when I tried to set up my existing Ubuntu 11.04 desktop using some PPA repositories I found online.

I was able to successfully upgrade from Ubuntu 11.04 to 11.10 beta, and since the 11.10 final release was right around the corner I figured it was safe to go ahead and give it a try.  The upgrade went well, but I spent the next day fighting to try and get gnome-shell to play nicely with my Radeon card using the existing ATI drivers.

I ended up starting from scratch a few days later, by backing up some important files in my home directory and doing a clean install of 11.10 once the final version was released.

After doing an update and installing some other packages such as ubuntu-restricted-extras, vlc, pidgin, etc, installing gnome-shell was painless:

apt-get install gnome-shell

After rebooting, I logged in to find some of the same problems as before with this desktop install (screen tearing, blurry icons, multicolored menus, etc). I found some posts around the net that alluded to the fact that I might be able to solve some of my problems if I used the latest drivers (version 11.9) off the ATI website.

On the other hand, I found other posts by people claiming that even using the latest drivers had not completely solved all their problems and that ATI would be releasing version 11.10 sometime within the next 2 to 3 weeks, and that this new version would be specifically tested against Gnome 3.x (and fix the remaining bugs).

Anyway, I decided that I had nothing to lose at this point and grabbed the latest version from the web:

mkdir ati-11.9; cd ati-11.9
wget http://www2.ati.com/drivers/linux/ati-driver-installer-11-9-x86.x86_64.run
sh ati-driver-installer-11-9-x86.x86_64.run --buildpkg Ubuntu/oneiric
dpkg -i fglrx*.deb
aticonfig --initial -f

After rebooting my machine again, I was pleasantly surprised to see that everything was looking good: no more problems with screen tearing, and all my icons and menus were seemingly in order.

The only thing I needed to do now was to set up my multiple monitors correctly, since at that point I was staring at two cloned spaces instead of one large desktop spread across my two 24″ monitors.

First I launched the Catalyst control panel:

gksu amdcccle

Under the ‘Display Manager’ page I had to select ‘Multi-display desktop with display’ for EACH of my two monitors.

After a reboot I went into the Gnome ‘System Settings’ and chose ‘Displays’….I was finally able to uncheck ‘Mirror displays’ and hit ‘Apply’ without error.

The next step required to get everything working 100% correctly was to install the gnome-tweak-tool:

apt-get install gnome-tweak-tool

and disable the ‘Have file manager handle the desktop’ option in the ‘Desktop’ section (that did away with the extra menu I was seeing).

The final step in the process involved installing a new theme…I really liked the Elementary theme found here, so that is the one I chose….now everything is working as it should be!

12 Oct, 2011  |  Written by  |  under Linux, OpenVZ, Proxmox

Ever since we upgraded from Proxmox 1.8 to version 1.9 we have had users who have periodically complained about receiving out of memory errors when attempting to start or restart their Java apps.

The following two threads contain a little bit more information about the problems people are seeing:

1)Proxmox mailing list thread
2)Openvz mailing list thread

At least one of the threads suggests that you allocate a minimum of 2 CPUs per VM in order to remedy the issue.  We already have 2 CPUs per VM, so that was not a possible workaround for us.

Another suggestion made by one of the posters was to revert to a previous version of the kernel, or downgrade Proxmox 1.9 to Proxmox 1.8 altogether.

I decided I would try to figure out a workaround that did not involve downgrading software versions.

At first I tried to allocate additional memory to the VMs and that seemed to resolve the issue for a short period of time, however after several days I once again started to hear about out of memory errors with Java.

After checking ‘/proc/user_beancounters’ on several of the VMs, I noticed that the failcnt value for the ‘privvmpages’ parameter was increasing steadily over time.

The solution so far for us has been to increase the ‘privvmpages’ parameter (in my case I simply doubled it) to such a level that these errors are no longer incrementing the ‘failcnt’ counter.
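Spotting which counters are failing can be scripted. The sketch below assumes the usual /proc/user_beancounters layout, where each resource line ends with the held, maxheld, barrier, limit and failcnt columns and the first line for each container is prefixed with its id; the function name `failing_beancounters` is made up for this example.

```shell
# Hypothetical helper: print "<container id> <resource> <failcnt>" for
# every resource whose failcnt is greater than zero.
# Reads /proc/user_beancounters unless a file is given as argument 1.
failing_beancounters() {
    awk '
        # Resource lines have at least 6 fields and a numeric last field;
        # this skips the "Version:" line and the column header.
        NF >= 6 && $NF ~ /^[0-9]+$/ {
            # The first line of each container starts with "<id>:".
            if ($1 ~ /:$/) { uid = $1; sub(/:$/, "", uid) }
            if ($NF + 0 > 0) print uid, $(NF - 5), $NF
        }
    ' "${1:-/proc/user_beancounters}"
}
```

Running this periodically and comparing the output over time should show whether a change such as doubling ‘privvmpages’ has actually stopped the counter from climbing.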

If you would like to learn more about the various UBC parameters that can be modified inside openvz you can check out this link.

4 Oct, 2011  |  Written by  |  under Debian, Linux

After spending the last two weeks upgrading various versions of Debian to Squeeze, I figured I would post the details of how to upgrade each version, starting from Debian 3.1 to Debian 6.0.

The safest way to upgrade to Debian Squeeze is to upgrade one release at a time until you reach version 6.x.  In other words, if you are upgrading from Debian 4.x, you need to upgrade to Debian 5.x and THEN to Debian 6.x.  Direct upgrades that skip a release are not at all recommended.

Here are the steps that I took when upgrading between the various versions.

Sarge to Etch:

I was able to upgrade all of our Debian 3.1 machines to Debian 4.0 using the following commands.  I did not encounter any real surprises when I upgraded any of our physical or virtual machines.

Once ‘/etc/apt/sources.list’ points at the new release, you can upgrade using apt and the following commands:

apt-get update
apt-get dist-upgrade
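Each hop starts with pointing ‘/etc/apt/sources.list’ at the next release. A minimal sketch of that step, assuming the file refers to releases by codename (sarge, etch, lenny, squeeze) rather than by ‘stable’ or ‘oldstable’; the function name `switch_release` is made up for this example, and it prints the rewritten file instead of editing it in place.

```shell
# Hypothetical helper: print a sources.list with one release codename
# replaced by another. Nothing is modified on disk.
# Usage: switch_release <from> <to> [file]
switch_release() {
    sed "s/$1/$2/g" "${3:-/etc/apt/sources.list}"
}

# Example use (as root), keeping a backup of the original:
#   cp /etc/apt/sources.list /etc/apt/sources.list.sarge
#   switch_release sarge etch /etc/apt/sources.list.sarge > /etc/apt/sources.list
#   apt-get update
```

Releases this old have since been moved to archive.debian.org, so the mirror URL itself may need changing as well.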

Etch to Lenny:

The only real issue to note when upgrading from Debian 4.0 to 5.0 is that Lenny does not provide, by default, the firmware needed by the Broadcom network adapters used by a majority of our Dell servers.  This caused some stress for me since I was doing the upgrades without physical access to the servers, so after I completed the upgrade to 5.0 and rebooted the server, of course I was not able to access it because the NICs were no longer recognised by Debian.

In order to resolve this issue you will need to install the ‘firmware-bnx2’ package after you do the upgrade but BEFORE you reboot the server.

The reason that the Debian team does not include these drivers by default is due to license restrictions placed on the firmware.  If you want to read more about this issue you can view the very short bug report here.

The best tool for upgrading to Debian 5 is aptitude:

aptitude update
aptitude install apt dpkg aptitude
aptitude full-upgrade

Lenny to Squeeze:

Upgrading Debian 5.0 to 6.0 was relatively painless as well.  One issue that I did run into revolved around the new version of udev and kernel versions prior to 2.6.26.  We had a few servers that were using kernel versions in the 2.6.18 range, and if you don’t upgrade the kernel before you reboot, certain devices may not be recognized or named correctly, which can prevent a successful bootup.

You can use the following apt commands to complete the upgrade process:

apt-get update
apt-get dist-upgrade -u
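Because of the udev issue described above, it can be worth checking kernel versions before rebooting into Squeeze. The sketch below compares a version string against 2.6.26; the function name is made up for this example, and it only looks at the first three numeric components of the version.

```shell
# Hypothetical helper: succeed if the given kernel version string
# (e.g. "2.6.18-6-686") is older than 2.6.26.
kernel_older_than_2_6_26() {
    echo "$1" | awk -F'[.-]' '{
        # Fold major.minor.patch into one comparable number.
        v = $1 * 1000000 + $2 * 1000 + $3
        exit !(v < 2006026)
    }'
}

# Example use:
#   if kernel_older_than_2_6_26 "$(uname -r)"; then
#       echo "Running kernel is too old for the new udev; upgrade it before rebooting."
#   fi
```

Keep in mind that `uname -r` reports the running kernel; what matters is the kernel the machine will actually boot after the upgrade.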