17 Jan, 2012  |  Written by  |  under Debian, Linux

I have spent some time over the last few weeks getting familiar with mdadm and software RAID on Linux, so I thought I would write down some of the commands and example syntax that I have used while getting started.

1)If we would like to create a new RAID array from scratch we can use the following example commands:
RAID1 with 2 drives:
mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1

RAID5 with 5 drives:
mdadm --create --verbose /dev/md0 --level=5 --raid-devices=5 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

RAID6 with 4 drives and 1 spare:
mdadm --create --verbose /dev/md0 --level=6 --raid-devices=4 --spare-devices=1 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

2)If we would like to add a disk to an existing array:
mdadm --add /dev/md0 /dev/sdf1 (only added as a spare)

mdadm --grow /dev/md0 --raid-devices=N (grow the array to N active disks; spares are not counted in N)

3)If we would like to remove a disk from an existing array:
First we need to ‘fail’ the drive:
mdadm --fail /dev/md0 /dev/sdc1

Next it can be safely removed from the array:
mdadm --remove /dev/md0 /dev/sdc1

4)In order to make the array survive a reboot, you need to add the details to ‘/etc/mdadm/mdadm.conf’
mdadm --detail --scan >> /etc/mdadm/mdadm.conf (Debian)
mdadm --detail --scan >> /etc/mdadm.conf (Everyone else)
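
Running the append command above a second time adds duplicate ARRAY lines to the file. Here is a minimal sketch of one way to avoid that, assuming a POSIX shell. The helper name append_new_arrays is made up for this example and is not part of mdadm:

```shell
# append_new_arrays CONF
# Hypothetical helper (not part of mdadm): reads lines on stdin and appends
# to CONF only those lines that are not already present verbatim.
append_new_arrays() {
    conf=$1
    while IFS= read -r line; do
        # -x matches whole lines only, -F treats the line as a fixed string
        grep -qxF -- "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
    done
}
```

It would be used as 'mdadm --detail --scan | append_new_arrays /etc/mdadm/mdadm.conf', and it is safe to run repeatedly.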

5)In order to delete and remove the entire array:
First we need to ‘stop’ the array:
mdadm --stop /dev/md0

Next it can be removed:
mdadm --remove /dev/md0

6)Examining the status of your RAID array:
There are two options here:
a)cat /proc/mdstat
b)mdadm --detail /dev/md0
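
If you want to check the status from a script, the member map in /proc/mdstat (for example [UU] or [U_]) is easy to parse. The following is only a sketch: the function name degraded_arrays is my own, and it reads mdstat-style text on standard input so it can be tried without a real array.

```shell
# degraded_arrays: reads /proc/mdstat-style text on stdin and prints the name
# of each array whose member map (for example [U_]) shows a missing disk.
degraded_arrays() {
    awk '
        /^md/ { name = $1 }                  # remember the array being described
        /\[[U_]*_[U_]*\]/ { print name }     # "_" marks a failed or absent member
    '
}
```

Typical use would be 'degraded_arrays < /proc/mdstat', which prints nothing when every array is healthy.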

Here is a quick tip for anyone who needs to access files that exist underneath an already mounted filesystem mount point.  For example, suppose that you have some files located in a directory called ‘/tmp/docs’.

At some point someone might accidentally mount an NFS or CIFS share on top of that same directory. If you need to access the original files that existed before the new mount was put into place, you have two options.

  1. Unmount the NFS or CIFS filesystem, access your files, and then remount.
  2. However, you may find yourself in a situation (as I did) where it is extremely inconvenient or impossible to take the downtime associated with the umount/remount process.  In that case you have another option: you can use a ‘bind’ mount.

All you need to do is something like the following:

mount --bind /tmp /tmp/new_location

Now you should be able to access the original files here:

‘/tmp/new_location/docs’
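
The whole procedure, including creating the target directory and cleaning up afterwards, could be wrapped up roughly as follows. This is a sketch only: the function names and the RUN variable are made up for illustration, and by default the commands are printed rather than executed.

```shell
# RUN defaults to "echo", so the commands are only printed (a dry run).
# Run as root with RUN= (empty) to execute them for real.
RUN=${RUN-echo}

# view_under_mount PARENT VIEW
# Hypothetical helper: bind-mounts PARENT on VIEW. A plain (non-recursive)
# bind mount does not carry over filesystems mounted below PARENT, so the
# original contents of PARENT/docs show up under VIEW/docs.
view_under_mount() {
    $RUN mkdir -p "$2"
    $RUN mount --bind "$1" "$2"
}

# drop_view VIEW: remove the bind mount once the files have been copied out.
drop_view() {
    $RUN umount "$1"
}
```

For the example above that would be 'view_under_mount /tmp /tmp/new_location', followed later by 'drop_view /tmp/new_location'.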

4 Oct, 2011  |  Written by  |  under Debian, Linux

After spending the last two weeks upgrading various versions of Debian to Squeeze, I figured I would post the details of how to upgrade each version, starting from Debian 3.1 to Debian 6.0.

The safest way to upgrade to Debian Squeeze is to upgrade one release at a time until you reach version 6.x.  In other words, if you are upgrading from Debian 4.x, you need to upgrade to Debian 5.x and THEN to Debian 6.x.  Direct upgrades that skip a release are not at all recommended.

Here are the steps that I took when upgrading between the various versions.

Sarge to Etch:

I was able to upgrade all of our Debian 3.1 machines to Debian 4.0 using the following commands.  I did not encounter any real surprises when I upgraded any of our physical or virtual machines.

You can upgrade using apt and the following commands:

apt-get update
apt-get dist-upgrade

Etch to Lenny:

The only real issue to note when upgrading from Debian 4.0 to 5.0 is that Lenny does not provide, by default, the firmware needed by the Broadcom network adapter drivers used by a majority of our Dell servers.  This caused some stress for me since I was doing the upgrades without physical access to the servers. After I completed the upgrade to 5.0 and rebooted the server, I was of course not able to access it, because the NICs were no longer recognised by Debian.

In order to resolve this issue you will need to install the ‘firmware-bnx2’ package after you do the upgrade but BEFORE you reboot the server.

The Debian team does not include this firmware by default because of license restrictions placed on it.  If you want to read more about this issue you can view the very short bug report here.

The best tool for upgrading to Debian 5 is aptitude:

aptitude update
aptitude install apt dpkg aptitude
aptitude full-upgrade
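
The firmware-bnx2 check mentioned above can be scripted so that it is not forgotten before the reboot. The following is a sketch under a couple of assumptions: the function names are my own, and it assumes the Broadcom adapter is driven by the bnx2 module, which shows up in lsmod on the still-running system.

```shell
# uses_bnx2: reads `lsmod` output on stdin; succeeds if the bnx2 driver is loaded.
uses_bnx2() {
    awk '$1 == "bnx2" { found = 1 } END { exit !found }'
}

# firmware_installed: succeeds if dpkg reports firmware-bnx2 as fully installed.
firmware_installed() {
    dpkg-query -W -f='${Status}' firmware-bnx2 2>/dev/null | grep -q 'install ok installed'
}

# Example check to run after the upgrade and before the reboot:
# if lsmod | uses_bnx2 && ! firmware_installed; then
#     echo "bnx2 NIC in use but firmware-bnx2 is missing: do not reboot yet"
# fi
```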

Lenny to Squeeze:

Upgrading Debian 5.0 to 6.0 was also relatively painless.  One issue that I did run into revolved around the new version of udev and kernel versions prior to 2.6.26.  We had a few servers that were using kernel versions in the 2.6.18 range, and if you don’t upgrade the kernel before you reboot, certain devices may not be recognized or named correctly, which can prevent a successful bootup.

You can use the following apt commands to complete the upgrade process:

apt-get update
apt-get dist-upgrade -u
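
To spot the machines that are at risk from the udev issue described above, you can compare the kernel release against 2.6.26. This is a rough sketch with a function name of my own choosing. Note that it looks at the running kernel by default, so a machine that fails the check still needs a newer kernel package installed before it is rebooted.

```shell
# kernel_at_least_2_6_26 [RELEASE]
# Succeeds when RELEASE (default: the running kernel from `uname -r`) is
# version 2.6.26 or newer.
kernel_at_least_2_6_26() {
    release=${1:-$(uname -r)}
    version=${release%%-*}            # "2.6.18-6-amd64" -> "2.6.18"
    # Split the version on dots; IFS is changed for this one read only.
    IFS=. read -r major minor patch rest <<EOF
$version
EOF
    major=${major:-0}
    minor=${minor:-0}
    patch=${patch%%[!0-9]*}           # drop any non-numeric suffix
    patch=${patch:-0}
    if [ "$major" -ne 2 ]; then [ "$major" -gt 2 ]; return; fi
    if [ "$minor" -ne 6 ]; then [ "$minor" -gt 6 ]; return; fi
    [ "$patch" -ge 26 ]
}
```

For example, 'kernel_at_least_2_6_26 || echo "upgrade the kernel before rebooting"'.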

30 Sep, 2011  |  Written by  |  under Debian, Proxmox

Martin Maurer sent an email to the Proxmox-users mailing list this morning announcing that a version 2.0 beta ISO had been made available for download.

Here are some links that will provide further information on this latest release:

Roadmap and feature overview:
http://pve.proxmox.com/wiki/Roadmap#Roadmap_for_2.x

Preliminary 2.0 documentation:
http://pve.proxmox.com/wiki/Category:Proxmox_VE_2.0

Community tools (Bugzilla, Git, etc):
http://www.proxmox.com/products/proxmox-ve/get-involved

Proxmox VE 2.0 beta forum:
http://forum.proxmox.com/forums/16-Proxmox-VE-2.0-beta

Downloads:
http://www.proxmox.com/downloads/proxmox-ve/17-iso-images

I have not had a chance to install a test node using this latest 2.0 beta codebase. However, I expect to have a two node cluster up and running in the next week or so, and after I do I will follow up with another blog post detailing my thoughts.

Thanks again to Martin and Dietmar for all their hard work so far on this great open source project!

Recently one of our 3ware 9650SE RAID cards started spitting out errors indicating that the unit was repeatedly issuing soft resets. The lines in the log look similar to this:

WARNING: tw1: tw_aen_task AEN 0x0039 Buffer ECC error corrected address=0xDF420
WARNING: tw1: tw_aen_task AEN 0x005f Cache synchronization failed; some data lost unit=22
WARNING: tw1: tw_aen_task AEN 0x0001 Controller reset occurred resets=13

I downloaded and installed the latest firmware for the card (version 4.10.00.021), which the release notes claimed had several fixes for cards experiencing soft resets.  Much to my disappointment the resets continued to occur despite the new revised firmware.

The card was under warranty, so I contacted 3ware support and had a new one sent overnight.  The new card seemed to resolve the random soft resets; however, the resets and the downtime had left this node a little out of sync with the other Gluster server.

I first did a ‘zpool replace’ on two bad disks. At this point I am still unsure whether the bad drives were a symptom or the cause of the issues with the RAID card. What I do know is that the Western Digital Caviar Green drives populating this card have a very high error rate, and we are currently in the process of replacing all 24 drives with Hitachi ones. I then set about trying to initiate a ‘self-heal’ on the known up-to-date node using the following command:

server2:/zpool/glusterfs# ls -laR *
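
One limitation of the command above is that the ‘*’ glob skips hidden entries at the top level. A variation that walks everything and reports how many entries were visited might look like the sketch below. The function name walk_tree is made up, and I am assuming that looking up each entry is enough to trigger the self-heal, as it was with ‘ls -laR’.

```shell
# walk_tree DIR
# Runs `ls -ld` on DIR and on everything below it (hidden entries included),
# which forces a lookup of each entry, then prints how many were visited.
walk_tree() {
    find "$1" -exec ls -ld {} + | wc -l
}
```

It would be run against the same directory as the command above, for example 'walk_tree /zpool/glusterfs'.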

After some time I decided to tail the log file to see if there were any errors that might indicate a problem with the self-heal. Once again the Gluster error log began to fill up with errors associated with setting extended attributes on SUNWattr_ro.

At that point I began to worry whether the AFR (Automatic File Replication) portion of the Replicate/AFR translator was actually working correctly.  I started running some tests to determine what exactly was going on.  I began by copying over a few files to test replication.  All the files showed up on both nodes, so far so good.

Next it was time to test AFR, so I began deleting a few files off one node and then attempting to self-heal those same deleted files.  After a couple of minutes, I re-listed the files and the deleted files had in fact been restored. Despite the successful copy, the errors continued to show up every single time the file/directory was accessed (via stat).  It seemed that even though AFR was able to copy all the files to the new node correctly, Gluster for some reason continued to want to self-heal the files over and over again.

After finding the function that sets the extended attributes on Solaris, the following patch was created:

--- compat.c    Tue Aug 23 13:24:33 2011
+++ compat_new.c        Tue Aug 23 13:24:49 2011
@@ -193,7 +193,7 @@
 {
        int attrfd = -1;
        int ret = 0;
-
+
        attrfd = attropen (path, key, flags|O_CREAT|O_WRONLY, 0777);
        if (attrfd >= 0) {
                ftruncate (attrfd, 0);
@@ -200,13 +200,16 @@
                ret = write (attrfd, value, size);
                close (attrfd);
        } else {
-               if (errno != ENOENT)
-                       gf_log ("libglusterfs", GF_LOG_ERROR,
+               if (strcmp (key, "SUNWattr_ro") && strcmp (key, "SUNWattr_rw")) {
+
+                       if (errno != ENOENT)
+                               gf_log ("libglusterfs", GF_LOG_ERROR,
                                "Couldn't set extended attribute for %s (%d)",
                                path, errno);
-               return -1;
+                       return -1;
+               }
+               return 0;
        }
-
        return 0;
 }

 

The patch simply ignores the two Solaris-specific extended attributes (SUNWattr_ro and SUNWattr_rw), and returns ‘0’ to the posix layer instead of ‘-1’ if either of these is encountered.

We’ve been running this code change on both Solaris nodes for several days and so far so good, the errors are gone and replicate and AFR both seem to be working very well.

Once you have your OpenStack cluster up and running you will need to either find some pre-created image templates or roll your own.  I’ll leave the details of creating images from scratch for a different post. This post will focus on providing links to both image files and instructions for installing pre-created Linux templates on OpenStack infrastructure.

First, if you are looking to install any version of Ubuntu, you should visit

http://uec-images.ubuntu.com/releases/

and download the file that corresponds to your desired version and architecture.

Once you have that file, you can follow the instructions here.

If you are looking to install a version of Debian, CentOS or Fedora, you should visit

http://open.eucalyptus.com/wiki/EucalyptusUserImageCreatorGuide_v1.6

and download one of the pre-created images that the folks over at Eucalyptus have provided.

Once you are ready to install one of those files, you can follow the instructions here.

1 Jun, 2011  |  Written by  |  under Debian, Linux, OpenVZ, Proxmox

I was recently attempting to back up one of our Proxmox VEs using OpenVZ’s backup tool ‘vzdump’. In the past when using vzdump, a complete backup of a 100GB VE, for example, could be obtained in under an hour or so. This time however, after leaving the process running and returning several hours later, the .tar file was a mere 2.3GB in size.

At first I thought that there might be an issue with one or more nodes in the shared storage cluster, so I decided I would direct vzdump to store the .tar file on one of the server’s local partitions instead. Once again I started the backup, returned several hours later, only to find a file similar in size to the previous one.

Next I decided I would attempt to ‘tar up’ the contents of the VE manually. That, combined with the ‘nohup’ command, would allow me to find out at what point this whole process was stalling.

As it turns out, I had thousands of files in my ‘/var/spool/postfix/incoming/’ directory on that VE. Although almost every single file in that directory was small, and the overall directory size was not large at all, file operations inside that folder had come to a screeching halt.

Luckily for me, I knew for a fact that we did not need any of these particular email messages, so I was simply able to delete the ‘incoming’ folder and then recreate it once all the files had been removed. After that, vzdump was once again functioning as expected.
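
A quicker way to find a directory like this is to count the entries per directory. The following is a small sketch, assuming GNU findutils as shipped with Debian. The function name dirs_by_entry_count is my own, and file names containing newlines would throw the counts off.

```shell
# dirs_by_entry_count ROOT [N]
# Prints the N (default 10) directories under ROOT that directly contain the
# most entries, as "count path" lines, largest first.
# The sed step strips the last path component, leaving each entry's parent
# directory; -xdev keeps find on a single filesystem.
dirs_by_entry_count() {
    find "$1" -mindepth 1 -xdev |
        sed 's|/[^/]*$||' |
        sort | uniq -c | sort -rn |
        head -n "${2:-10}"
}
```

Pointed at the root of the VE’s private area, this would have put ‘/var/spool/postfix/incoming’ at the top of the list.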

I recently had the pleasure(!) of trying to get PHP on Debian working correctly with a Microsoft SQL server so that the data could be migrated from an MSSQL instance into a MySQL one.

Prior to this attempt, the developers were using a Windows machine as a ‘broker’ between the two databases. This setup was much too slow for importing and exporting large amounts of data, so we decided to cut out the middle man (the Windows machine) and do all the processing on a single server.

First I needed to install a few prerequisite packages:

user@computer:$ apt-get install unixodbc-dev
user@computer:$ apt-get install libmysqlclient15-dev

Next we need to download and uncompress the FreeTDS source code:

user@computer:$ wget ftp://ftp.linuxforum.hu/mirrors/frugalware/pub/frugalware/frugalware-testing/source/lib-extra/freetds/freetds-0.82.tar.gz
user@computer:$ tar -xzf freetds-0.82.tar.gz && cd freetds-0.82

Next we configure, build and install FreeTDS with the following options:

user@computer:$ ./configure --enable-msdblib --prefix=/usr/local/freetds --with-tdsver=7.0 --with-unixodbc=/usr
user@computer:$ make
user@computer:$ make install

Next we need to download and uncompress the PHP source code:

user@computer:$ wget -O php-5.3.6.tar.bz2 http://us.php.net/get/php-5.3.6.tar.bz2/from/www.php.net/mirror
user@computer:$ tar -xjf php-5.3.6.tar.bz2 && cd php-5.3.6

Next we configure, build and install PHP with the following options:

user@computer:$ ./configure --with-mssql=/usr/local/freetds --with-mysql --with-mysqli
user@computer:$ make
user@computer:$ make install

Lastly we will need to create and install the mssql module for PHP:

user@computer:$ cd ext/mssql
user@computer:$ phpize
user@computer:$ ./configure --with-mssql=/usr/local/freetds
user@computer:$ make
user@computer:$ make install

Now you should be able to connect to any Microsoft SQL (and MySQL) server from PHP using the functions found here.
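
Before writing any PHP it is worth confirming that the extension is actually loaded. Here is a minimal sketch with a function name of my own invention. It assumes the freshly built ‘php’ binary is the one found first in your PATH.

```shell
# php_has_module NAME
# Reads `php -m` output on stdin; succeeds if NAME is listed as a loaded
# module (whole-line, case-insensitive, fixed-string match).
php_has_module() {
    grep -qixF -- "$1"
}
```

For example, 'php -m | php_has_module mssql && echo "mssql extension loaded"'.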

23 Feb, 2011  |  Written by  |  under Debian, KVM, Linux, OpenVZ, Proxmox

Martin Maurer sent an email to the Proxmox users mailing list detailing some of the features that we can expect from the next iteration of Proxmox VE. Martin expects that the first public beta release of the 2.x branch will be ready for use sometime around the second quarter of this year.

Here are some of the highlights currently slated for this release:

  • Complete new GUI
    • based on Ext JS 4 JavaScript framework
    • fast search-driven interface, capable of handling hundreds and probably thousands of VMs
    • secure VNC console, supporting external VNC viewer with SSL support
    • role based permission management for all objects (VMs, storages, nodes, etc.)
    • Support for multiple authentication sources (e.g. local, MS ADS, LDAP, …)
  • Based on Debian 6.0 Squeeze
    • longterm 2.6.32 Kernel with KVM and OpenVZ as default
    • second kernel branch with 2.6.x, KVM only
  • New cluster communication based on corosync, including:
    • Proxmox Cluster file system (pmcfs): Database-driven file system for storing configuration files, replicated in realtime on all nodes using corosync
    • creates multi-master clusters (no single master anymore!)
    • cluster-wide logging
    • basis for HA setups with KVM guests
  • RESTful web API
    • Resource Oriented Architecture (ROA)
    • declarative API definition using JSON Schema
    • enable easy integration for third party management tools
  • Planned technology previews (CLI only)
    • spice protocol (remote display system for virtualized desktops)
    • sheepdog (distributed storage system)
  • Commitment to Free Software (FOSS): public code repository and bug tracker for the 2.x code base
  • Topics for future releases
    • Better resource monitoring
    • IO limits for VMs
    • Extend pre-built Virtual Appliances downloads, including KVM appliances

17 Jan, 2011  |  Written by  |  under Debian, Zfs

UPDATE: If you are interested in ZFS on Linux you have two options at this point: the zfsonlinux project and the zfs-fuse project.

I have been actively following the zfsonlinux project because, once stable and ready, it should offer better performance than the zfs-fuse project, which incurs extra overhead by going through FUSE.

------------------------------------------------------------
Earlier this week KQInfotech released the latest build of their ZFS kernel modules for Linux. This version has been labeled GA and ready for wider testing (and maybe ready for production).

KQStor has been set up as a place where you can go to sign up for an account, download the software and get additional support.

The source code for the module can be found here:

https://github.com/zfs-linux

Mounting the root filesystem is currently not supported; however, a post here describes a procedure that can be used to do it.

The users guide also hints at possible problems using ‘zfs rollback’ under certain circumstances.  I have asked for more specific information on this issue, and I will pass along any other information I can uncover.

After looking around the various mailing lists, this looks like it might be an issue that exists with zfs-fuse, and thus with the current version of the kernel module as well, since they share a lot of the same code.

Installation and usage:

Installation of the module is fairly simple. I downloaded the pre-packaged .deb packages for Ubuntu 10.10 server and installed them:

root@server1:/root/Deb_Package_Ubuntu10.10_2.6.35-22-server# dpkg -i *.deb

If all goes well you should be able to list the loaded modules:

root@server1:/root/Deb_Package_Ubuntu10.10_2.6.35-22-server# lsmod |grep zfs
lzfs                   36377  3
zfs                   968234  1 lzfs
zcommon                42172  1 zfs
znvpair                47541  2 zfs,zcommon
zavl                    6915  1 zfs
zlib_deflate           21866  1 zfs
zunicode              323430  1 zfs
spl                   116684  6 lzfs,zfs,zcommon,znvpair,zavl,zunicode

Now I can create a test pool:

root@server1:/root# zpool create test-mirror mirror sdc sdd

Now check the status of the zpool:

root@server1:/root# zpool status
  pool: test-mirror
 state: ONLINE
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        test-mirror   ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            sdc1      ONLINE       0     0     0
            sdd1      ONLINE       0     0     0
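
For monitoring, the same health information can be checked from a script. This is a sketch only: the function name unhealthy_pools is made up, and I am assuming that this build of ‘zpool’ supports the scripted ‘list -H -o name,health’ output that the Solaris version provides.

```shell
# unhealthy_pools: reads `zpool list -H -o name,health` output (tab-separated)
# on stdin and prints the name of every pool whose health is not ONLINE.
unhealthy_pools() {
    awk -F '\t' '$2 != "ONLINE" { print $1 }'
}
```

Usage would be 'zpool list -H -o name,health | unhealthy_pools', which prints nothing while all pools are healthy.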