Monthly Archives: January 2011

Updated Native Linux ZFS benchmarks

Phoronix.com just released updated numbers from benchmarks they ran against the recently released GA version of the native ZFS kernel module for Linux. They conducted a total of 10 tests comparing the ZFS kernel module with Ext4, Btrfs, and XFS.

The tests were performed on Ubuntu 10.10. Kernel version 2.6.35 was used for the ZFS tests, and kernel version 2.6.37 was used for the other three filesystems.

It appears that these tests were all run on single-disk setups. I think it would be really great if Phoronix would also provide benchmarks on multi-disk setups, such as ZFS mirrored disks vs. hardware or software RAID1 on Linux. I would also like to see benchmarks comparing RAID5 on Linux with RAIDZ on ZFS. I think these kinds of tests would give a more realistic comparison of real-world, enterprise-level storage configurations.

SUNWattr_ro error:Permission denied on OpenSolaris using Gluster 3.0.5

Last week I noticed an apparently obscure error message in my glusterfsd logfile. I was getting errors similar to this:

[2011-01-15 18:59:45] E [compat.c:206:solaris_setxattr] libglusterfs: Couldn’t set extended attribute for /datapool/glusterfs/other_files (13)
[2011-01-15 18:59:45] E [posix.c:3056:handle_pair] posix1: /datapool/glusterfs/other_files: key:SUNWattr_ro error:Permission denied

on several directories as well as on the files that resided underneath those directories. These errors only occurred when Gluster attempted to stat the file or directory in question (ls -l as opposed to a plain ls).

After reviewing the entire logfile, I was unable to see any real pattern to the error messages. The errors were not very widespread either, given that I was only seeing them on maybe 75 or so files out of our total 3TB of data.
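To get a quick list of which paths are affected, the log can be filtered down to the unique paths that logged this error. This is only a rough sketch: the function name sunwattr_paths is made up for illustration, and it assumes every error line follows the same layout as the handle_pair line quoted above.

```shell
#!/bin/sh
# Print each unique path that logged a SUNWattr_ro "Permission denied" error.
# Assumes lines shaped like the one quoted in the post:
#   [date time] E [posix.c:NNNN:handle_pair] <translator>: <path>: key:SUNWattr_ro error:Permission denied
# sunwattr_paths is a hypothetical helper name, not part of Gluster.
sunwattr_paths() {
    logfile=$1
    grep 'key:SUNWattr_ro error:Permission denied' "$logfile" |
        sed -e 's/: key:SUNWattr_ro error:Permission denied.*$//' \
            -e 's/^.*handle_pair\] [^ ]*: //' |
        sort -u
}

# Usage: sunwattr_paths <path to your glusterfsd logfile>
```

Piping the result through wc -l gives the number of affected files and directories, which makes it easier to check whether the count shrinks after a fix.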

A Google search yielded very few results on the topic, with or without Gluster as a search term. What I was able to find out was this:

SUNWattr_ro and SUNWattr_rw are Solaris ‘system extended attributes’. These attributes cannot be removed from a file or directory. You can, however, prevent users from being able to set them at all by setting xattr=off, either during the creation of the zpool or by changing the parameter after the fact.

This was not a viable solution for me, because several of Gluster’s translators require extended attributes to be enabled on the underlying filesystem.

I was able to list the extended attributes using the following command:

user@solaris1# touch test.file
user@solaris1# runat test.file ls -l
total 2
-r--r--r-- 1 root root 84 Jan 15 11:58 SUNWattr_ro
-rw-r--r-- 1 root root 408 Jan 15 11:58 SUNWattr_rw

I also learned that some people were having problems with these attributes on Solaris 10 systems. This is because the kernels used by those versions of Solaris neither include these ‘system extended attributes’ nor know how to translate them, since they were introduced in newer versions of Solaris. This has caused a headache for some people who have been trying to share files between Solaris 10 and Solaris 11 based servers.

In the end, the solution was not overly complex. I had to recursively copy each affected directory to a temporary location, delete the original folder, and rename the new one:

(cp -r folder folder.new;rm -rf folder;mv folder.new folder)

These commands must be run from a Gluster client mount point, so that Gluster can set or reset the necessary extended attributes.
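The copy, delete, and rename steps above can be wrapped in a small shell function that stops before the delete if the copy fails. This is a minimal sketch, not the exact procedure I ran: the name recopy_dir is made up for illustration, and the extra safety checks are my own additions. Like the one-liner, it uses a plain cp -r, so ownership and timestamps are not preserved.

```shell
#!/bin/sh
# Recreate a directory via copy, delete, rename (the cp -r / rm -rf / mv steps).
# recopy_dir is a hypothetical helper name. Chaining with && means the original
# is only removed after the copy has succeeded.
recopy_dir() {
    dir=${1%/}    # drop one trailing slash so "$dir.new" is a sibling path
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    [ ! -e "$dir.new" ] || { echo "already exists: $dir.new" >&2; return 1; }
    cp -r "$dir" "$dir.new" &&
        rm -rf "$dir" &&
        mv "$dir.new" "$dir"
}

# Usage, from inside a Gluster client mount point: recopy_dir folder
```

The function refuses to run if a folder.new already exists, because cp -r would otherwise copy the original into it rather than alongside it.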

Native Linux ZFS kernel module and stability

UPDATE: If you are interested in ZFS on Linux you have two options at this point:

I have been actively following the zfsonlinux project because, once stable and ready, it should offer superior performance by avoiding the extra overhead that is incurred by using FUSE with the zfs-fuse project.

You can see another one of my posts concerning zfsonlinux here.

————————————————————————————————————————————————————-

There was a question posted in response to my previous blog post found here, about the stability of the native Linux ZFS kernel module release. I thought I would just make a post out of my response:

So far I have only been able to perform some limited testing, given that the GA code was just released earlier this week. Some time ago I had been given access to the beta builds, so I had done some initial testing using those. I configured two mirrored vdevs consisting of two drives each, and it seemed relatively stable as far as I was concerned. As I stated in my previous post, there is a known issue with the ‘zfs rollback’ command. I tested it using the GA release and did in fact have problems with it.

The workaround at this point seems to be to reboot after the rollback and then run a ‘zpool scrub’ on the pool once the system is back up. Personally I am hoping this gets fixed soon, because not everyone has the same level of flexibility when it comes to rebooting their servers and storage nodes.

As far as I understand it, this module really consists of three pieces:

1) SPL -- a Linux kernel module which provides many of the Solaris kernel APIs. This layer makes it possible to run Solaris kernel code in the Linux kernel with relatively minimal modification.
2) ZFS -- a Linux kernel module which provides a fully functional and stable SPA, DMU, and ZVOL layer.
3) LZFS -- a Linux kernel module which provides the necessary POSIX layer.
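A quick way to confirm that all three pieces are present on a test box is to look for them in the kernel's module list. This is just a sketch: I am assuming the modules register under the lowercase names spl, zfs, and lzfs, and the function name check_zfs_modules is made up for illustration.

```shell
#!/bin/sh
# Report which of the three modules appear in a /proc/modules-style listing.
# Assumption: the module names are the lowercase forms spl, zfs and lzfs.
# check_zfs_modules is a hypothetical helper name.
check_zfs_modules() {
    modules_file=${1:-/proc/modules}
    missing=0
    for mod in spl zfs lzfs; do
        # Each /proc/modules line starts with the module name followed by a space.
        if grep -q "^$mod " "$modules_file"; then
            echo "$mod: loaded"
        else
            echo "$mod: not loaded"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_zfs_modules    (reads /proc/modules by default)
```

The function returns non-zero if any of the three is missing, so it can be used as a guard at the top of a test script.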

Pieces #1 and #2 have been available for a while and are derived from code taken from the ZFS on Linux project found here. The folks at KQ Infotech are really building on that and providing piece #3, the missing POSIX layer.

Only time will tell how stable the code really is. My opinion at this point is that most software projects have some number of known bugs (and even more have some unknown number of bugs as well), so I am going to continue to test in a non-production environment for the next few months. At this point I have not experienced any instability (other than what was discussed above) or crashing, and all the commands seem to work as advertised. There are a lot of features I have not been able to test yet, such as dedup and compression, so there is lots more to look at in the upcoming weeks.

KQStor’s business model seems to be one where the source code is provided and support is charged for. So far I have been able to have an open and productive dialog with their developers, and they have been very responsive to my inquiries. However, it does not appear that they are going to be setting up public tools such as mailing lists or forums, due to their current business model. I am hoping that this will change in the near future, as I truly believe that everyone would benefit from those kinds of public resources, and there is no doubt in my mind that such tools will only lead to a more stable product in the long run.