Monthly Archives: November 2013

Ceph braindump part1

After spending about 4 months testing, benchmarking, setting up and breaking down various Ceph clusters, I though I would spend time documenting some of the things I have learned while setting up cephfs, rbd and radosgw along the way.

First let me talk a little bit about the details of the cluster that we will be putting into production over the next several weeks.

Cluster specs:

  • 6 x Dell R-720xd;64 GB of RAM; for OSD nodes
  • 72 x 4TB SAS drives as OSD’s
  • 3 x Dell R-420;32 GB of RAM; for MON/RADOSGW/MDS nodes
  • 2 x Force10 S4810 switches
  • 4 x 10 GigE LCAP bonded Intel cards
  • Ubuntu 12.04 (AMD64)
  • Ceph 0.72.1 (emperor)
  • 2400 placement groups
  • 261TB of usable space

The process I used to set- up and tear down our cluster during testing was quite simple, after installing ‘ceph-deploy’ on the admin node:

  1. ceph-deploy new mon1 mon2 mon3
  2. ceph-deploy install  mon1 mon2 mon3 osd1 osd2 osd3 osd4 osd5 osd6
  3. ceph-deploy mon create mon1 mon2 mon3
  4. ceph-deploy gatherkeys mon1
  5. ceph-deploy osd create osd1:sdb
  6. ceph-deploy osd create osd1:sdc

The uninstall process went something like this:

  1. ceph-deploy disk zap osd1:sdb
  2. ceph-deploy purge mon1 mon2 mon3 osd1 osd2 osd3 osd4 osd5 osd6
  3. ceph-deploy purgedata mon1 mon2 mon3 osd1 osd2 osd3 osd4 osd5 osd6

Additions to ceph.conf:

Since we wanted to configure an appropriate journal size for our 10GigE network, mount xfs with appropriate options and configure radosgw, we added the following to our ceph.conf (after ‘ceph-deploy new but before ‘ceph-deploy install’:

osd_journal_size = 10240
osd_mount_options_xfs = “rw,noatime,nodiratime,logbsize=256k,logbufs=8,inode64”
osd_mkfs_options_xfs = “-f -i size=2048”

host = mon1
keyring = /etc/ceph/keyring.radosgw.gateway
rgw_socket_path = /tmp/radosgw.sock
log_file = /var/log/ceph/radosgw.log
admin_socket = /var/run/ceph/radosgw.asok
rgw_dns_name =
debug rgw = 20
rgw print continue = true
rgw should log = true
rgw enable usage log = true


I used the following commands to benchmark rados, rbd, cephfs, etc

  1. rados -p rbd  bench 20 write --no-cleanup
  2. rados -p rbd  bench 20 seq
  3. dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync
  4. dd bs=4M count=512 if=/dev/zero of=test conv=fdatasync

 Ceph blogs worth reading:

megacli cheat sheet

The information below is based heavily off of a post that can be found here:

I am providing the information on my blog in the event that the original blog post becomes unavailable at some point in the future, as we use this information quite regularly.

1-Gather Info: 

Controller information

megacli -AdpAllInfo -aALL
megacli -CfgDsply -aALL
megacli -adpeventlog -getevents -f controller-events.log -a0 -nolog

Enclosure information

megacli -EncInfo -aALL

Virtual drive information

megacli -LDInfo -Lall -aALL

Physical drive information

megacli -PDList -aALL
megacli -PDInfo -PhysDrv [E:S] -aALL

Battery backup information

megacli -AdpBbuCmd -aALL

Check Battery backup warning on boot

megacli -AdpGetProp BatWarnDsbl -a0

Controller management:

Silence active alarm

megacli -AdpSetProp AlarmSilence -aALL

Disable alarm

megacli -AdpSetProp AlarmDsbl -aALL

Enable alarm

megacli -AdpSetProp AlarmEnbl -aALL

Disable battery backup warning on system boot

megacli -AdpSetProp BatWarnDsbl -a0

Change the adapter rebuild rate to 60%:

megacli -AdpSetProp {RebuildRate -60} -aALL

2-Virtual drive management:

Create RAID 0, 1, 5 drive

megacli -CfgLdAdd -r(0|1|5) [E:S, E:S, …] -aN

Create RAID 10 drive

megacli -CfgSpanAdd -r10 -Array0[E:S,E:S] -Array1[E:S,E:S] -aN

Remove drive

megacli -CfgLdDel -Lx -aN

Physical drive management

Set state to offline

megacli -PDOffline -PhysDrv [E:S] -aN

Set state to online

megacli -PDOnline -PhysDrv [E:S] -aN

Mark as missing

megacli -PDMarkMissing -PhysDrv [E:S] -aN

Prepare for removal

megacli -PdPrpRmv -PhysDrv [E:S] -aN

Replace missing drive

megacli -PdReplaceMissing -PhysDrv [E:S] -ArrayN -rowN -aN

The number N of the array parameter is the Span Reference you get using megacli -CfgDsply -aALL and the number N of the row parameter is the Physical Disk in that span or array starting with zero (it’s not the physical disk’s slot!).

Rebuild drive -- Drive status should be “Firmware state: Rebuild”

megacli -PDRbld -Start -PhysDrv [E:S] -aN
megacli -PDRbld -Stop -PhysDrv [E:S] -aN
megacli -PDRbld -ShowProg -PhysDrv [E:S] -aN
megacli -PDRbld -ProgDsply -physdrv [E:S] -aN

Clear drive

megacli -PDClear -Start -PhysDrv [E:S] -aN
megacli -PDClear -Stop -PhysDrv [E:S] -aN
megacli -PDClear -ShowProg -PhysDrv [E:S] -aN

Bad to good

megacli -PDMakeGood -PhysDrv[E:S] -aN

Changes drive in state Unconfigured-Bad to Unconfigured-Good.

Hot spare management

Set global hot spare

megacli -PDHSP -Set -PhysDrv [E:S] -aN

Remove hot spare

megacli -PDHSP -Rmv -PhysDrv [E:S] -aN

Set dedicated hot spare

megacli -PDHSP -Set -Dedicated -ArrayN,M,… -PhysDrv [E:S] -aN

Walkthrough: Rebuild a Drive that is marked ‘Foreign’ when Inserted:

Bad to good

megacli -PDMakeGood -PhysDrv [E:S] -aALL

Clear the foreign setting

megacli -CfgForeign -Clear -aALL

Set global hot spare

megacli -PDHSP -Set -PhysDrv [E:S] -aN

Walkthrough: Change/replace a drive

a. Set the drive offline, if it is not already offline due to an error

megacli -PDOffline -PhysDrv [E:S] -aN

b. Mark the drive as missing

megacli -PDMarkMissing -PhysDrv [E:S] -aN

c. Prepare drive for removal

megacli -PDPrpRmv -PhysDrv [E:S] -aN

d. Change/replace the drive

e. If you’re using hot spares then the replaced drive should become your new hot spare drive

megacli -PDHSP -Set -PhysDrv [E:S] -aN

f. In case you’re not working with hot spares, you must re-add the new drive to your RAID virtual drive and start the rebuilding

megacli -PdReplaceMissing -PhysDrv [E:S] -ArrayN -rowN -aN
megacli -PDRbld -Start -PhysDrv [E:S] -aN

3-Gathering Standard logs

#rm –f MegaSAS.log
#megacli -adpallinfo -a0
#megacli -encinfo -a0
#megacli -ldinfo -lall -a0
#megacli -pdlist -a0
#megacli -adpeventlog -getevents -f controller-events.log -a0 -nolog
#megacli -fwtermlog -dsply -a0 -nolog > controller-fwterm.log