Ted Neykov, currently a hacker at Rackspace, pointed me in the direction of the new OpenStack Operations Guide. I have only had a chance to browse the PDF at this point, but I believe this will end up being a very informative and useful book for me going forward.
Taken from the guide’s summary:
‘This book offers hard-earned experience from OpenStack operators who have run OpenStack in production for six months or longer. They’ve gathered their notes, shared their stories, and learned from each other in the room. We invite you to join in the quest for best practices in OpenStack cloud operations.’
Here is a quick video that was released along with the guide, which briefly describes the process they used during its creation:
Today I had to track down the cause of an issue we were having with a server: shortly after restarting it, requests would start to hang, and the number of Apache processes would grow very large, very quickly.
I started out using Apache’s mod_status to get some details about the state of each process.
I noticed that many of the processes ended up in a “W” or “Sending Reply” state. I chose a random Apache process and fired up ‘strace’ to try to get some more information:
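The machine-readable mod_status output makes it easy to tally how many workers are in each scoreboard state. A quick sketch, assuming ExtendedStatus is on and the handler is mapped at the usual /server-status location:

```shell
# Pull the scoreboard line from mod_status and count workers per state
# ("W" = sending reply, "K" = keepalive, "_" = waiting for connection).
curl -s 'http://localhost/server-status?auto' |
  awk -F': ' '/^Scoreboard/ {print $2}' |
  grep -o . | sort | uniq -c
```

A large and growing count of “W” workers, as seen here, usually means requests are blocked mid-response rather than sitting idle.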
server7:/root# strace -p 11574
Process 11574 attached – interrupt to quit
flock(26, LOCK_EX <unfinished …>
This process was stuck waiting for an exclusive lock on some file. I used ‘readlink’ to find out the name of the file in question:
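On Linux, /proc maps each of a process’s open file descriptors to its target, so the descriptor number from the flock() call can be resolved to a path directly (the PID and fd below come from the strace output above):

```shell
pid=11574   # the stuck Apache process from strace
fd=26       # the descriptor flock() was blocking on
# Each entry in /proc/<pid>/fd is a symlink to the open file.
readlink "/proc/$pid/fd/$fd"
```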
Ok, I just found this video of Chris Mason giving a talk on Btrfs at LinuxCon 2010. It appears to be very similar to the webcast I linked to a few days ago, hosted on Oracle.com. This video however is hosted on linuxfoundation.org, and there is no registration required, which is nice.
I have been actively following the zfsonlinux project because, once stable and ready, it should offer superior performance by avoiding the extra overhead incurred by using FUSE with the zfs-fuse project.
You can see another one of my posts concerning zfsonlinux here.
KQ Infotech has released (currently in closed beta) code that brings ZFS to Linux via a loadable kernel module.
Here is a link to the current and future feature set. The reason this is exciting is that although other ZFS implementations for Linux have existed for some time, each of the available options has significant drawbacks. For example, ZFS-FUSE is implemented in userspace using FUSE, which adds overhead due to the context switching required while moving back and forth between kernel space and user space.
Another option is ZFS on Linux, which provides a stable SPA, DMU, and ZVOL layer, but does not provide the POSIX layer (ZPL) that would enable you to actually mount a ZFS filesystem from inside Linux. From what I understand, KQ Infotech has basically taken some of the ZFS on Linux code developed by the Lawrence Livermore National Laboratory (LLNL) and implemented the missing ZPL layer.
NPR was recently accepted into the closed beta program, and I took some time last week to get this module installed on a Dell PowerEdge 2950 running a 64-bit version of Ubuntu 10.04. We are currently testing ZFS under kernel version 2.6.32-24. I have not had a ton of time to test things out, but I would say so far, so good. I plan on posting some ZFS and Btrfs benchmarks in the next few weeks, after I get some time to better test performance, throughput, etc.
While doing research into poor write performance with Oracle, I discovered that the server was using the LSI SAS1068E. We had a RAID1 setup with 300GB 10K RPM SAS drives. Google provided some possible insight into why the write performance was so bad (1, 2). The main problem with this card is that there is no battery-backed write cache, which means the write cache is disabled by default. I was able to turn on the write cache using the LSI utility.
This change, however, did not seem to make any difference in performance. At this point I came to the conclusion that the card itself was to blame. I believe this is an inexpensive RAID card that is fine for general RAID0 and RAID1 use; however, for anything where write throughput is important, it might be better to spring for something a little more expensive.
When it was all said and done, we ended up replacing all of these LSI cards with Dell PERC 6/i cards. These cards did come with a battery-backed cache, which allowed us to enable the write cache; needless to say, performance improved significantly.
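One rough way to compare write throughput before and after a change like this is a direct-I/O dd run (the filename and sizes here are arbitrary examples); oflag=direct bypasses the OS page cache, so the controller’s write cache, or lack of one, shows up in the reported MB/s:

```shell
# Sequential-write test: 100 MB of zeros, bypassing the page cache.
# The test file must live on a filesystem backed by the array itself
# (not tmpfs), or O_DIRECT will fail and the numbers will be meaningless.
dd if=/dev/zero of=ddtest.out bs=1M count=100 oflag=direct
rm -f ddtest.out
```

With a battery-backed cache enabled, the controller can acknowledge writes as soon as they hit its cache, so this figure should jump noticeably after the swap.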
Welcome to shainmiley.com. I plan to use this blog to discuss some of the technological issues that I encounter on a day-to-day basis. Topics will include Linux, scaling infrastructure, cloud computing, MySQL, open source, storage, etc.