Category Archives: OpenSolaris

zfsonlinux and gluster so far….

Recently I started to revisit the idea of using ZFS on Linux (zfsonlinux) as the basis for a server that will eventually be the foundation of our Gluster storage infrastructure.  At this point we are using the OpenSolaris version of ZFS and an older (but stable) version of Gluster (3.0.5).

The problem with staying on OpenSolaris (besides the fact that it is no longer actively supported itself) is that we would be unable to upgrade Gluster, and thus unable to take advantage of some of the new and upcoming features in later versions (such as geo-replication, snapshots, active-active geo-replication, and various other bugfixes and performance enhancements).

Hardware:

Here are the specs for the current hardware I am using to test:

  • CPU: 2 x Intel Xeon E5410 @ 2.33GHz
  • RAM: 32 GB DDR2 DIMMs
  • Hard drives: 48 x 2TB Western Digital SATA II
  • RAID controller: 2 x 3ware 9650SE-24M8 PCIe
  • Ubuntu 11.10
  • GlusterFS version 3.2.5
  • 1 Gbps interconnects (LAN)

ZFS installation:

I decided to use Ubuntu 11.10 for this round of testing. The daily PPA currently has a lot of bugfixes and performance improvements that do not exist in the latest stable release (0.6.0-rc6), so the daily PPA is the version to use until either v0.6.0-rc7 or v0.6.0 final is released.

Here is what you will need to get zfs installed and running:

# apt-add-repository ppa:zfs-native/daily
# apt-get update
# apt-get install debootstrap ubuntu-zfs
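
These packages build the ZFS kernel modules through DKMS, so before creating a pool it is worth confirming that the module actually loads. A quick sanity check (something like the following, not part of my original notes) does the trick:

# modprobe zfs
# lsmod | grep zfs
# dmesg | grep -i zfs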

At this point we can create our first zpool. Here is the syntax used to create a 6 disk raidz2 vdev:

# zpool create -f tank raidz2 sdc sdd sde sdf sdg sdh

Now let’s check the status of the zpool:

# zpool status tank
  pool: tank
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdf     ONLINE       0     0     0
            sdg     ONLINE       0     0     0
            sdh     ONLINE       0     0     0

errors: No known data errors

ZFS Benchmarks:

I ran a few tests to see what kind of performance I could expect out of ZFS on its own, before adding Gluster on top; that way I would have a better idea of where any bottleneck existed.

linux 3.3-rc5 kernel untar:

single ext4 disk: 3.277s
zfs 2 disk mirror: 19.338s
zfs 6 disk raidz2: 8.256s

dd using block size of 4096:

single ext4 disk: 204 MB/s
zfs 2 disk mirror: 7.5 MB/s
zfs 6 disk raidz2: 174 MB/s

dd using block size of 1M:

single ext4 disk: 153.0 MB/s
zfs 2 disk mirror: 99.7 MB/s
zfs 6 disk raidz2: 381.2 MB/s
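
I did not record the exact commands behind these numbers, but the tests were along these lines; the target paths and dd counts below are illustrative rather than the originals:

# time tar xf linux-3.3-rc5.tar.bz2 -C /tank
# dd if=/dev/zero of=/tank/dd.test bs=4096 count=1000000
# dd if=/dev/zero of=/tank/dd.test bs=1M count=4096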

Gluster + ZFS Benchmarks

Next I added gluster (version 3.2.5) to the mix to see how they performed together:

linux 3.3-rc5 kernel untar:

zfs 6 disk raidz2 + gluster (replication): 4m10.093s
zfs 6 disk raidz2 + gluster (geo replication): 1m12.054s

dd using block size of 4096:

zfs 6 disk raidz2 + gluster (replication): 53.6 MB/s
zfs 6 disk raidz2 + gluster (geo replication): 53.7 MB/s

dd using block size of 1M:

zfs 6 disk raidz2 + gluster (replication): 45.7 MB/s
zfs 6 disk raidz2 + gluster (geo replication): 155 MB/s
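
For reference, a two-node replicated volume and a geo-replication session are created with the 3.2.x CLI roughly as follows; the hostnames, volume name and brick/slave paths here are placeholders, not the actual configuration used for these tests:

# gluster peer probe server2
# gluster volume create testvol replica 2 server1:/tank/brick server2:/tank/brick
# gluster volume start testvol
# gluster volume geo-replication testvol remotehost:/data/georep-slave start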

Conclusion

Well, so far so good: I have been running the zfsonlinux port for two weeks now without any real issues. From what I understand there is still a decent amount of work left to do around dedup and compression (neither of which I necessarily require for this particular setup).

The good news is that the zfsonlinux developers have not even really started looking into improving performance at this point, since their main focus thus far has been overall stability.

A good deal of development is also taking place to allow Linux to boot from a ZFS ‘/boot’ partition.  This is currently an option on several distros, including Ubuntu and Gentoo, however the setup requires a fair amount of effort, so it will be nice when this style of setup is supported out of the box.

In terms of Gluster specifically, it performs quite well using geo-replication with larger file sizes. I am really looking forward to the active-active geo-replication feature planned for v3.4 becoming fully implemented and available. Our current production setup (two-node replication) has a T3 (WAN) interconnect, so having the option to use geo-replication in the future should really speed up our write throughput, which is currently limited by the throughput of the T3 itself.

ZFS crash during high I/O

After successfully completing a ‘zpool replace’ I was not so pleased to get the following error message back from ‘zpool detach’:

cannot detach c5t17d0: no valid replicas
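
For context, the sequence that produced this error was roughly the following; the pool name and the replacement device name are placeholders (c5t17d0 is the actual failed device):

# zpool replace datapool c5t17d0 c5t20d0
# zpool status datapool
# zpool detach datapool c5t17d0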

I decided to upgrade this OpenSolaris 2008.11 instance to OpenSolaris 2009.06 to see if the obvious bug I was encountering had been resolved in the newer version. Since upgrading OpenSolaris automatically creates a new boot environment, there really is not much danger in updating, because you can always boot back into the old environment at any time.
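
The upgrade itself boils down to a package image update plus the boot environment tooling; roughly:

# pkg image-update
# beadm list
# beadm activate <old-be-name>

The last command (with the old boot environment's name substituted in), followed by a reboot, is all it takes to roll back if the new release misbehaves.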

The upgrade was a success, and after I booted into 2009.06 I was able to simply detach the failed drive from the pool and thus remove it from the system.

I recompiled gluster and I ran 2009.06 for a couple of days, until I started noticing that the server was rebooting during times of high I/O. A peek inside ‘/var/adm/messages’ revealed the following errors:

Aug 15 22:33:04 cybertron unix: [ID 836849 kern.notice]
Aug 15 22:33:04 cybertron ^Mpanic[cpu0]/thread=ffffff091060c900:
Aug 15 22:33:04 cybertron genunix: [ID 783603 kern.notice] Deadlock: cycle in blocking chain
Aug 15 22:33:04 cybertron unix: [ID 100000 kern.notice]
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d9651f0 genunix:turnstile_block+795 ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965250 unix:mutex_vector_enter+261 ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d9652f0 zfs:zfs_zget+be ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965380 zfs:zfs_zaccess+7c ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965400 zfs:zfs_lookup+333 ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d9654a0 genunix:fop_lookup+ed ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965550 genunix:xattr_dir_realdir+8b ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d9655a0 genunix:xattr_dir_realvp+5e ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d9655f0 genunix:fop_realvp+32 ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965640 genunix:vn_compare+31 ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965860 genunix:lookuppnvp+94c ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965900 genunix:lookuppnatcred+11b ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965990 genunix:lookuppnat+69 ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965b30 genunix:vn_createat+13a ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965cf0 genunix:vn_openat+1fb ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965e50 genunix:copen+435 ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965e80 genunix:openat64+25 ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965ec0 genunix:fsat32+f5 ()
Aug 15 22:33:04 cybertron genunix: [ID 655072 kern.notice] ffffff003d965f10 unix:brand_sys_sysenter+1e0 ()
Aug 15 22:33:04 cybertron unix: [ID 100000 kern.notice]
Aug 15 22:33:04 cybertron genunix: [ID 672855 kern.notice] syncing file systems…
Aug 15 22:33:04 cybertron genunix: [ID 904073 kern.notice] done

My efforts to find any further details about this bug are ongoing, so for now I have booted back into 2008.11 and will be running that until a fix or a workaround is found.

SUNWattr_ro error: Permission denied on OpenSolaris using Gluster 3.0.5 – Part II

Recently one of our 3ware 9650SE raid cards started spitting out errors indicating that the unit was repeatedly issuing a bunch of soft resets. The lines in the log look similar to this:

WARNING: tw1: tw_aen_task AEN 0x0039 Buffer ECC error corrected address=0xDF420
WARNING: tw1: tw_aen_task AEN 0x005f Cache synchronization failed; some data lost unit=22
WARNING: tw1: tw_aen_task AEN 0x0001 Controller reset occurred resets=13

I downloaded and installed the latest firmware for the card (version 4.10.00.021), which the release notes claimed had several fixes for cards experiencing soft resets. Much to my disappointment, the resets continued to occur despite the revised firmware.

The card was under warranty, so I contacted 3ware support and had a new one sent overnight.  The new card seemed to resolve the issues associated with random soft resets, however the resets and the downtime had left this node a little out of sync with the other Gluster server.

After doing a ‘zpool replace’ on two bad disks (at this point I am still unsure whether the bad drives were a symptom or the cause of the issues with the raid card; what I do know is that the Western Digital Caviar Green drives populating this card have a very high error rate, and we are currently in the process of replacing all 24 drives with Hitachi ones), I set about trying to initiate a ‘self-heal’ from the known up-to-date node using the following command:

server2:/zpool/glusterfs# ls -laR *
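
The Gluster 3.0-era documentation also suggests triggering self-heal by walking a client mount point and stat()ing everything; from a Linux client mount (the path below is a placeholder) that looks roughly like:

# find /mnt/glusterfs -noleaf -print0 | xargs --null stat > /dev/null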

After some time I decided to tail the log file to see if there were any errors that might indicate a problem with the self-heal. Once again the Gluster error log began to fill up with errors related to setting extended attributes on SUNWattr_ro.

At that point I began to worry whether the AFR (Automatic File Replication) portion of the Replicate/AFR translator was actually working correctly.  I started running some tests to determine what exactly was going on, beginning by copying over a few files to test replication.  All the files showed up on both nodes; so far so good.

Next it was time to test AFR, so I deleted a few files off one node and then attempted to self-heal those same files.  After a couple of minutes I re-listed the files, and the deleted files had in fact been restored. Despite the successful copy, the errors continued to show up every single time the file or directory was accessed (via stat).  It seemed that even though AFR was able to copy all the files to the new node correctly, Gluster for some reason continued to want to self-heal the files over and over again.

After finding the function that sets the extended attributes on Solaris, the following patch was created:

--- compat.c    Tue Aug 23 13:24:33 2011
+++ compat_new.c    Tue Aug 23 13:24:49 2011
@@ -193,7 +193,7 @@
 {
         int attrfd = -1;
         int ret = 0;

         attrfd = attropen (path, key, flags|O_CREAT|O_WRONLY, 0777);
         if (attrfd >= 0) {
                 ftruncate (attrfd, 0);
@@ -200,13 +200,16 @@
                 ret = write (attrfd, value, size);
                 close (attrfd);
         } else {
-                if (errno != ENOENT)
-                        gf_log ("libglusterfs", GF_LOG_ERROR,
-                                "Couldn't set extended attribute for %s (%d)",
-                                path, errno);
-                return -1;
+                /* skip the Solaris system attributes SUNWattr_ro/SUNWattr_rw */
+                if (strcmp (key, "SUNWattr_ro") && strcmp (key, "SUNWattr_rw")) {
+                        if (errno != ENOENT)
+                                gf_log ("libglusterfs", GF_LOG_ERROR,
+                                        "Couldn't set extended attribute for %s (%d)",
+                                        path, errno);
+                        return -1;
+                }
+                return 0;
         }

         return 0;
 }

The patch simply ignores the two Solaris specific extended attributes (SUNWattr_ro and SUNWattr_rw), and returns a ‘0’ to the posix layer instead of a ‘-1’ if either of these is encountered.

We’ve been running this code change on both Solaris nodes for several days, and so far so good: the errors are gone, and both replicate and AFR seem to be working very well.

Slow ZFS resilvering on OpenSolaris

Two weeks ago I started receiving automated messages from one of our 3ware 9650SE raid cards concerning an increase in the number of SMART errors on one of the 2TB hard drives attached to the card. Within a few days of the raid card starting to generate these messages, ZFS was nice enough to take the drive in question out of service, and replace it with one of the drives we had set aside as an ‘online spare’ for that specific pool.

So far so good.

Two terabytes of data is a decent amount, so I assumed the resilvering might take some time, and I was able to confirm that after logging in and looking at the output of the ‘zpool status’ command. The output indicated that it was going to take several more hours before the resilvering process was complete.

So far so good.

The next day I logged in to the server to check on the progress. Not only had the job not yet completed, but ‘zpool status’ had almost doubled its estimate of the time required to fully resilver the drive.

It was at this point that I started to suspect that our automated snapshot policy (which runs hourly, daily, weekly and monthly via cron) might be hampering the resilvering progress. A quick Google search showed that, at some point in the past, bug number ‘6343667‘ had in fact been associated with degraded scrub and resilver performance during periods in which snapshots were being taken. It appears that some older versions of ZFS used to restart the entire resilvering process whenever a snapshot was initiated.

According to bug number ‘6343667‘, this issue was resolved with the release of ZFS pool version 11. I double checked the version we are running on the server in question and discovered that we were running version 13.

At this point I am unsure if the problem that I experienced had anything to do with that specific bug number, however what I do know is that after commenting out the automated snapshot entries from the crontab on that server, the drive resilvering finished quickly and without error, and I have not had any problems since.
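
In practice that just meant commenting out the hourly/daily/weekly/monthly snapshot entries with ‘crontab -e’ for the duration, and then checking on the resilver periodically; the pool name below is illustrative:

# crontab -e
# zpool status datapool | grep -i resilver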

Just remember to re-enable the snapshots after the resilver completes and you should be all set.

DIY NAS: Part 1

A few weeks ago I decided to do some research into what it would take to build a NAS unit to act as a storage server for all of my digital assets (audio, video, images, etc). I had purchased a 500GB external HD from Best Buy about a year ago, but recently I was having problems reading the drive from my Linux laptop (since then I have not had any problems reading that same drive from my new MacBook Pro).

That was enough to scare me into investing a little bit more time and money into something that provided a higher level of  fault tolerance,  and might also lead to some more restful nights as well.

HARDWARE:

My initial requirements were not super hefty; I knew I wanted the following:

1) a relatively small form factor case
2) a unit that consumed relatively little power
3) room to scale up to at least 4 drives
4) a 64-bit CPU and a motherboard that could handle at least 4GB of RAM
5) a setup that would allow me to use ZFS as the backend filesystem

After doing some initial research, I came across the case that I thought would be perfect for this build, the Chenbro ES34169. This case fit several of the requirements: it was small, it could take a mini-ITX motherboard, and it provided a backplane for at least four hot-swappable 3.5 inch hard drives.

After finding a case that I was fairly sure I was going to use, I set out to find a motherboard that would allow for at least 4 SATA devices and work well with OpenSolaris, EON, Nexenta or FreeNAS.

Initially I really liked the GA-D525TUD from GIGABYTE.  It has 4 SATA ports, can handle up to 4 GB of RAM, has a built-in Intel Atom D525, and was priced very reasonably at right about $100.  The one huge downside of this motherboard was the onboard Realtek NIC.  I came across several posts (here and here for example) indicating that reliability issues existed with these NICs, and that I was better off using an Intel chipset instead.  Since this server was mainly going to be used as a file server, network performance and reliability were imperative, so I wanted to avoid a motherboard with a Realtek NIC if possible.

I also found the JNC98-525E-LF from JetWay. This motherboard had a lot of the same appeal as the GIGABYTE board, and it also had an HDMI port, a DVI port and analog audio outputs.  I think it would be a good pick if I were building a media server instead of strictly a storage server, but it also used a Realtek chipset for networking, so I decided to continue my search.

When everything was said and done, I went with the MBD-X7SPA-H-O from Supermicro.  This board has 6 SATA ports, supports 4GB of RAM, comes with a 64-bit processor and an onboard USB port (which would allow me to hide my boot device inside the case itself), and most importantly has two Intel-based network cards.  This motherboard was a bit pricey at right around $200, but I guess you are paying for the extra SATA ports and the second NIC.

I found 4 GB of cheap Kingston RAM, so the only thing left hardware wise was to decide what type of hard drives I would purchase and how many.  I decided that I would use two WD20EADS 2TB hard drives from Western Digital.

The only real issue I ran into with this setup was that very few mini-ITX motherboards support ECC RAM, which is a must-have for enterprise-level storage setups. I was not willing to spend the extra money for an enterprise-level mini-ITX board, which sells for about $1000.  My other option was to scrap my plans for a really small form factor mini-ITX setup and go with something bigger like a micro-ATX or regular ATX motherboard, where I am sure I would have an easier time finding ECC support.

I decided that I would give up the ECC RAM option in order to gain the benefits that come with a much smaller machine.

SOFTWARE:

I went back and forth between using EON, OpenSolaris and FreeNAS.  I liked EON because it has a very small footprint, it can be installed on a USB flash drive, it is based on OpenSolaris and it has a stable ZFS implementation.  The downside of using EON for this project is that it would require a bit more expertise to configure and administer.

I have a good amount of OpenSolaris experience, and it obviously has the most stable ZFS implementation that exists, but I think it is overkill for this machine. I am also not a big fan of what Oracle is doing right now in terms of the open source community, so I decided to look a little closer at what FreeNAS had to offer.

FreeNAS is a FreeBSD-based NAS distro with a very nice web interface that lets you configure almost all aspects of the server from any web browser.  Research indicated that it had a stable and reliable ZFS port in place, I would be able to install and boot the OS from my USB flash drive, and if I got hit by a bus, one of my family members would have a better chance of figuring out how to retrieve the data…so FreeNAS it was.

In Part 2 of my post I plan to provide some more details about overall power use and network performance.

SUNWattr_ro error: Permission denied on OpenSolaris using Gluster 3.0.5

Last week I noticed an apparently obscure error message in my glusterfsd logfile. I was getting errors similar to this:

[2011-01-15 18:59:45] E [compat.c:206:solaris_setxattr] libglusterfs: Couldn’t set extended attribute for /datapool/glusterfs/other_files (13)
[2011-01-15 18:59:45] E [posix.c:3056:handle_pair] posix1: /datapool/glusterfs/other_files: key:SUNWattr_ro error:Permission denied

on several directories, as well as on the files that resided underneath those directories. These errors only occurred when Gluster attempted to stat the file or directory in question (ls -l vs ls).

After reviewing the entire logfile I was unable to see any real pattern to the error messages. The errors were not very widespread: I was only seeing them on maybe 75 or so files out of our total 3TB of data.

A google search yielded very few results on the topic, with or without Gluster as a search term. What I was able to find out was this:

SUNWattr_ro and SUNWattr_rw are Solaris ‘system extended attributes’. These attributes cannot be removed from a file or directory, however you can prevent users from setting them at all by setting xattr=off, either during the creation of the zpool or by changing the parameter after the fact.
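
For reference, the property can be toggled per dataset after the fact; a minimal example, with the dataset name as a placeholder:

# zfs set xattr=off datapool/glusterfs
# zfs get xattr datapool/glusterfs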

This was not a viable solution for me, because several of Gluster’s translators require extended attributes to be enabled on the underlying filesystem.

I was able to list the extended attributes using the following command:

user@solaris1# touch test.file
user@solaris1# runat test.file ls -l
total 2
-r--r--r--   1 root     root          84 Jan 15 11:58 SUNWattr_ro
-rw-r--r--   1 root     root         408 Jan 15 11:58 SUNWattr_rw

I also learned that some people were having problems with these attributes on Solaris 10 systems. This is because the kernels used by those versions of Solaris do not include, nor do they understand how to translate, the ‘system extended attributes’ that were introduced in newer versions of Solaris. This has caused headaches for people trying to share files between Solaris 10 and Solaris 11 based servers.

In the end, the solution was not overly complex: I had to recursively copy each affected directory to a temporary location, delete the original folder, and rename the new one:

(cp -r folder folder.new; rm -rf folder; mv folder.new folder)

These commands must be done from a Gluster client mount point, so that Gluster can set or reset the necessary extended attributes.

Gluster on OpenSolaris so far…part 1.

We have been running Gluster in our production environment for about 1 month now, so I figured I would post some details about our setup and our experiences with Gluster and OpenSolaris so far.

Overview:

Currently we have a 2-node Gluster cluster using the replicate translator to provide RAID-1 style mirroring of the filesystem.  The initial requirements called for a solution that would house our digital media archive (audio, video, etc), scale up to around 150TB, support exports such as CIFS and NFS, and be extremely stable.

It was decided that we would use ZFS as our underlying filesystem, due to its data integrity features as well as its support for filesystem snapshots, both of which were also very high on the requirement list for this project.

Although FreeBSD has had ZFS support for quite some time, there were some known issues (with 32 vs 64 bit inode numbers) at the time of my research that prevented us from going that route.

Just this week KQstor released a new version of their native ZFS kernel module for Linux, which is now supposed to fully support extended filesystem attributes, a requirement for Gluster to function properly.  At the time of our research the software was still in beta and did not support extended attributes, so we were unable to consider and/or test this configuration either.

The choice was then made to go with ZFS on OpenSolaris (2008.11 specifically, due to the 3ware drivers available at the time).  There is currently no FUSE support under Solaris, so although a Solaris variant works without a problem on the server side storage nodes, you will need a head node running an OS that does support FUSE on the client side.

The latest version of Gluster to be fully supported on the Solaris platform is 3.0.5. Version 3.1.x introduced some nice new features, however we will have to either port our storage nodes to Linux or wait until the folks at Gluster decide to release 3.1.x for Solaris (which I am not sure will happen anytime soon).

Here is the current hardware/software configuration:

  • CPU: 2 x Intel Xeon E5410 @ 2.33GHz
  • RAM: 32 GB DDR2 DIMMs
  • Hard drives: 48 x 2TB Western Digital SATA II
  • RAID controller: 2 x 3ware 9650SE-24M8 PCIe
  • OpenSolaris version 2008.11
  • GlusterFS version 3.0.5
  • Samba version 3.2.5 (Gluster1)

ZFS Setup:

Setup for the two OS drives was pretty straightforward: we created a two-disk mirrored rpool.  This allows us to lose a disk in the root pool and still be able to boot the system.

Since we have 48 disks to work with for our data pool, we created a total of 6 raidz2 vdevs, each consisting of 7 physical disks.  Each raidz2 vdev gives two disks' worth of capacity to parity, so this layout works out to roughly 75TB of raw space (around 53TB usable) per node, while leaving 6 disks available to use as spares.
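
I have not included the exact creation command here, but a pool laid out this way is built roughly as follows; the c#t#d# device names are placeholders and only the first two of the six raidz2 groups are shown, with the remaining four following the same pattern:

# zpool create datapool \
    raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0 \
    raidz2 c3t7d0 c3t8d0 c3t9d0 c3t10d0 c3t11d0 c3t12d0 c3t13d0 \
    spare c4t22d0 c4t23d0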

user@server1:/# zpool list
NAME       SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
rpool     1.81T  19.6G  1.79T     1%  ONLINE  -
datapool  75.8T  9.01T  66.7T    11%  ONLINE  -

Gluster setup:

Creating the Gluster .vol configuration files is easily done via the glusterfs-volgen command:

user1@host1:/# glusterfs-volgen --name cluster01 --raid 1 server1.hostname.com:/data/path server2.hostname.com:/data/path

That command produces two volume files: ‘glusterfsd.vol’, used on the server side, and ‘glusterfs.vol’, used on the client.

Starting the glusterfsd server daemon is straightforward:

user1@host1:/# /usr/glusterfs/sbin/glusterfsd

Starting gluster on the client side is straightforward as well:

user1@host2:/# /usr/glusterfs/sbin/glusterfs --volfile=/usr/glusterfs/etc/glusterfs/glusterfs.vol /mnt/glusterfs/

In a later blog post I plan to talk more about issues that we have encountered running this specific setup in a production environment.

Gluster 3.1 released

Today the team over at Gluster.com announced the availability of version 3.1 of their software.   There are currently two different offerings available from Gluster.  There is the Gluster Storage Platform, known as ‘GlusterSP’, which provides a Linux-based bare metal installer, a web-based front end, etc.

They also offer ‘GlusterFS’, released as open source, which provides the same functionality as GlusterSP but does not require a fresh install; instead, you can use it on an existing Linux or Solaris based system.

The 3.1 release brings the following new features:

Elastic Volume Management: logical storage volumes are decoupled from physical hardware, allowing administrators to grow, shrink and migrate storage volumes without any application downtime. As storage is added, storage volumes are automatically rebalanced across the cluster making it always available online regardless of changes to the underlying hardware.

New Gluster Console Manager: the Command Line Interface (CLI), Application Programming Interface (API) and shell are merged into a single powerful interface, enabling automation by giving the CLI higher level API’s and scripting capabilities. Languages such as Python, Ruby or PHP can be used to script a series of commands that are invoked through the command line. This new tool requires no new APIs and is able to script out and rapidly automate any information inserted in the CLI allowing cloud administrators the ability to simply automate large scale operations.

Native Network File System (NFS): a native NFSv3 module allows NFS clients to connect directly to any storage server in the cluster, which can serve NFS and the Gluster protocol simultaneously. NFS requires no specialized training, making it simple and easy to deploy.

To find out more about Gluster you can visit Gluster.com, you can also visit Gluster.org if you want to get more familiar with the open source side of the Gluster house.

ZFS Resources

Recently I was given the task of putting together  a storage solution that would be used to  house a large amount of our digital assets.  I was also asked to make sure there would be enough space to meet our needs over the next few years.  The project called for a solution that could scale up to around 120TB of usable space.  Depending on the price, this solution might also be used to store a majority of our digital archive (audio and video).

I will go into the specific hardware and software details of the project in another post, however after about a month of research, we decided to go with a solution that was able to take advantage of the ZFS filesystem.

Here are a few documents that I found invaluable during my setup and overall planning:

ZFS Best Practices Guide

ZFS Configuration Guide

ZFS Troubleshooting Guide

ZFS Troubleshooting and Cheatsheet Guide

These links can be a starting point for anyone who wants to gain a better overall understanding of how to best administer a server running ZFS.  The ‘best practices guide’ is also a great resource to consult during the initial project planning stages.

Nexenta

I came across an interesting project last week while doing some research on OpenSolaris and ZFS.  The distribution is called Nexenta.  The kernel of Nexenta is based on OpenSolaris, while the userspace tools are based on Debian/Ubuntu.

There is also a commercial offshoot called the Nexenta storage appliance, which is the Nexenta distribution packaged as a ZFS-based storage server.  Pricing depends on the maximum size of the storage pool.

I have downloaded the free version and am currently planning to test this distro with Gluster as well.  The FUSE project (which is required by a Gluster client to mount the filesystem) is currently not stable on OpenSolaris.  However, I plan on using Nexenta for the server bricks of the Gluster cluster and Linux for the client, since FUSE has no issues running on Linux.