Replace failed Ceph disk on Dell hardware

We are using Dell 720 and 730xd servers for our Ceph OSD servers. Here is the process that we use in order to replace a disk and/or remove the faulty OSD from service.

In this example we will attempt to replace OSD #45 (slot #9 of this particular server):

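Stop the OSD and unmount the directory:
stop ceph-osd id=45
umount /var/lib/ceph/osd/ceph-45
Set the CRUSH weight of the OSD to 0 and wait for the cluster to rebalance:
ceph osd crush reweight osd.45 0.0
Mark the OSD out and make sure the daemon is stopped:
ceph osd out osd.45
service ceph stop osd.45
Remove the OSD from the CRUSH map, delete its authentication key, and remove it from the cluster:
ceph osd crush remove osd.45
ceph auth del osd.45
ceph osd rm osd.45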
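
The removal steps above can be collected into one function. This is only a minimal sketch, not a tested production script: the function name remove_osd and the RUN variable are made up for illustration, and waiting for HEALTH_OK is an assumption about what "rebalanced" means on your cluster. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: remove a failed OSD, following the manual steps above.
# RUN defaults to "echo", so commands are only printed (dry run).
# Set RUN="" in the environment to execute them for real.
RUN="${RUN-echo}"

remove_osd() {
    osd_id="$1"
    # Refuse anything that is not a plain number.
    case "$osd_id" in
        ''|*[!0-9]*) echo "usage: remove_osd <numeric-osd-id>" >&2; return 1 ;;
    esac

    $RUN stop ceph-osd "id=$osd_id"
    $RUN umount "/var/lib/ceph/osd/ceph-$osd_id"
    $RUN ceph osd crush reweight "osd.$osd_id" 0.0

    # Only when really executing: poll until the cluster reports HEALTH_OK.
    # (Assumption: HEALTH_OK is a good enough signal that rebalancing is done.)
    if [ -z "$RUN" ]; then
        until ceph health | grep -q HEALTH_OK; do sleep 60; done
    fi

    $RUN ceph osd out "osd.$osd_id"
    $RUN service ceph stop "osd.$osd_id"
    $RUN ceph osd crush remove "osd.$osd_id"
    $RUN ceph auth del "osd.$osd_id"
    $RUN ceph osd rm "osd.$osd_id"
}
```

Calling remove_osd 45 prints the eight commands for OSD 45 in order; running the script with RUN="" in the environment executes them instead.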

List the physical drives to find the enclosure and slot of the failed disk:
megacli -PDList -a0
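
The -PDList output is long, so a small filter helps to spot the bad drive. The sketch below assumes the usual MegaCli field names ("Enclosure Device ID", "Slot Number", "Firmware state"); the function name pd_summary is made up for illustration, and the field names should be checked against the output of your own controller.

```shell
#!/bin/sh
# Sketch: condense "megacli -PDList" output (read from stdin) into one
# line per drive of the form "[enclosure:slot] firmware state".
pd_summary() {
    awk -F': *' '
        /^Enclosure Device ID/ { enc = $2 }     # remember enclosure of current drive
        /^Slot Number/         { slot = $2 }    # remember slot of current drive
        /^Firmware state/      { print "[" enc ":" slot "] " $2 }
    '
}
```

Used as megacli -PDList -a0 | pd_summary, a failed disk in slot 9 of enclosure 32 would show up as a line starting with [32:9].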

If the drive is not already offline, take it offline:
megacli -pdoffline -physdrv[32:9] -a0
Mark the disk as missing:
megacli -pdmarkmissing -physdrv[32:9] -a0
Prepare the drive for removal (this spins it down):
megacli -pdprprmv -physdrv[32:9] -a0
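
Since the same enclosure:slot pair is typed three times, it is easy to get one of them wrong. A minimal sketch of a wrapper follows; the names retire_drive, RUN and ADAPTER are made up for illustration. It also quotes the -physdrv[...] argument, because an unquoted [32:9] is a glob pattern to the shell and could be expanded if a matching file name exists in the current directory.

```shell
#!/bin/sh
# Sketch: take a drive out of service on the RAID controller.
# RUN defaults to "echo" (dry run); set RUN="" to execute for real.
RUN="${RUN-echo}"
ADAPTER="${ADAPTER:-0}"   # controller number, as in -a0

retire_drive() {
    pd="$1"   # enclosure:slot, for example 32:9
    case "$pd" in
        *[!0-9:]*|*:*:*|:*|*:) echo "usage: retire_drive <enclosure:slot>" >&2; return 1 ;;
        *:*) ;;   # looks like number:number
        *) echo "usage: retire_drive <enclosure:slot>" >&2; return 1 ;;
    esac

    # Quoted so the shell never treats [..] as a filename pattern.
    $RUN megacli -pdoffline "-physdrv[$pd]" "-a$ADAPTER"
    $RUN megacli -pdmarkmissing "-physdrv[$pd]" "-a$ADAPTER"
    $RUN megacli -pdprprmv "-physdrv[$pd]" "-a$ADAPTER"
}
```

Calling retire_drive 32:9 prints the three megacli commands for review before anything is changed on the controller.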

NOW PHYSICALLY REPLACE THE BAD DRIVE WITH A NEW ONE.

Set drive state to online if not already:
megacli -PDOnline -PhysDrv[32:9] -a0
Create Raid-0 array on new drive:
megacli -CfgLdAdd -r0[32:9] -a0

You may need to discard the preserved cache before the last step will succeed:
First get the cache list:
megacli -GetPreservedCacheList -a0
Clear whichever one you need to:
megacli -DiscardPreservedCache -L2 -a0

Recreate the OSD (Bluestore is now the default). Use the same device in both commands, replacing hqosdNUM and /dev/sdX with your host and device:
ceph-deploy disk zap hqosdNUM /dev/sdX
ceph-deploy osd create --data /dev/sdX hqosdNUM
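
Zapping one device and then creating the OSD on another is an easy mistake, so it can help to pass the host and device once. This is a rough sketch only; recreate_osd and RUN are made-up names, and the host hqosd1 in the usage note is a hypothetical example.

```shell
#!/bin/sh
# Sketch: zap a replaced disk and create a new OSD on it with ceph-deploy.
# RUN defaults to "echo" (dry run); set RUN="" to execute for real.
RUN="${RUN-echo}"

recreate_osd() {
    host="$1"
    dev="$2"
    # Both arguments are required and the device must be a /dev path.
    case "$dev" in
        /dev/?*) ;;
        *) echo "usage: recreate_osd <host> </dev/device>" >&2; return 1 ;;
    esac
    [ -n "$host" ] || { echo "usage: recreate_osd <host> </dev/device>" >&2; return 1; }

    # The same $dev is used for both steps.
    $RUN ceph-deploy disk zap "$host" "$dev"
    $RUN ceph-deploy osd create --data "$dev" "$host"
}
```

For example, recreate_osd hqosd1 /dev/sdm prints the two ceph-deploy commands with the same device in both.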

Force10 troubleshooting commands

Here are a few of the more helpful commands that I have been using recently to troubleshoot some performance issues we have been having with our stacked S4820s.

1) Show interface statistics:
‘show interface tengigabitethernet 0/40’

2) Monitor interface statistics in real time:
‘monitor interface tengigabitethernet 0/40’

3) Show port channel statistics:
‘show interface port-channel 7’

4) Show an overview of port channel groupings:
‘show interface port-channel brief’

5) Monitor port channel statistics in real time:
‘monitor interface port-channel 12’

6) Display VLAN interface statistics:
‘show interface vlan 20’

7) Gather data for tech support:
‘show tech-support’

8) Display serial numbers, service tags, etc.:
‘show inventory’