Replace failed Ceph disk on Dell hardware

We use Dell 720 and 730xd servers as our Ceph OSD servers. Here is the process we follow to replace a failed disk and/or remove the faulty OSD from service.

In this example we will attempt to replace OSD #45 (slot #9 of this particular server):

Stop the OSD and unmount the directory:
stop ceph-osd id=45
umount /var/lib/ceph/osd/ceph-45
List the physical drives to confirm the enclosure and slot of the failed disk (here [32:9]):
megacli -PDList -a0
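
The -PDList output runs to dozens of lines per drive, so it can help to reduce it to one line per disk before deciding which [enclosure:slot] to act on. A minimal sketch, assuming the output carries the usual "Enclosure Device ID:", "Slot Number:" and "Firmware state:" fields; pd_summary is a hypothetical helper name, not part of MegaCLI:

```shell
#!/bin/sh
# pd_summary: read `megacli -PDList -a0` output on stdin and print one
# "[enclosure:slot] firmware-state" line per physical drive.
# Assumes the three field labels below appear at the start of a line.
pd_summary() {
    awk -F': *' '
        /^Enclosure Device ID/ { enc = $2 }
        /^Slot Number/         { slot = $2 }
        /^Firmware state/      { printf "[%s:%s] %s\n", enc, slot, $2 }
    '
}

# On a real host: megacli -PDList -a0 | pd_summary
```

A drive whose state is anything other than online is the likely candidate for the steps that follow.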

If the drive is not already offline, take it offline:
megacli -pdoffline -physdrv[32:9] -a0
Mark disk as missing:
megacli -pdmarkmissing -physdrv[32:9] -a0
Prepare the drive for physical removal:
megacli -pdprprmv -physdrv[32:9] -a0
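
The three controller steps above can be wrapped in a small helper so the enclosure and slot are typed only once. This is a sketch rather than a tested tool: run and retire_drive are hypothetical names, adapter 0 is assumed as in the commands above, and the last option is spelled -pdprprmv (MegaCLI's PdPrpRmv, "prepare for removal"). It defaults to a dry run that only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# run: print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# retire_drive ENCLOSURE SLOT: offline, mark missing, prepare for removal.
retire_drive() {
    pd="[$1:$2]"    # e.g. 32 9 -> [32:9]; quoted below so the shell never globs it
    # A failure here is only a warning: the drive may already be offline.
    run megacli -pdoffline "-physdrv$pd" -a0 ||
        echo "pdoffline failed; the drive may already be offline" >&2
    # The prepare-for-removal step only runs if marking the drive missing worked.
    run megacli -pdmarkmissing "-physdrv$pd" -a0 &&
    run megacli -pdprprmv "-physdrv$pd" -a0
}

# Example (dry run): retire_drive 32 9
```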

NOW PHYSICALLY REPLACE THE BAD DRIVE WITH A NEW ONE.

Set drive state to online:
megacli -PDOnline -PhysDrv [32:9] -a0
Create a RAID-0 array on the new drive:
megacli -CfgLdAdd -r0[32:9] -a0

You may need to discard the preserved cache before doing the last step (creating the array):
First get the preserved cache list:
megacli -GetPreservedCacheList -a0
Then discard the entry for whichever virtual drive needs it (here -L2):
megacli -DiscardPreservedCache -L2 -a0
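
Since the cache discard has to happen before the array is created, it may help to see the post-swap commands in the order they would actually be run. The sketch below only prints them for review and executes nothing; new_drive_plan is a hypothetical name, adapter 0 is assumed, and the virtual drive number to discard (2 in the example) has to be read from the -GetPreservedCacheList output.

```shell
#!/bin/sh
# new_drive_plan ENCLOSURE SLOT [LD]: print the MegaCLI commands for a
# freshly inserted drive, in execution order. Nothing is run.
# LD is optional: the virtual drive whose preserved cache must be discarded.
new_drive_plan() {
    enc=$1
    slot=$2
    ld=${3:-}
    printf 'megacli -PDOnline -PhysDrv [%s:%s] -a0\n' "$enc" "$slot"
    if [ -n "$ld" ]; then
        # Only needed when the controller still holds cache for a missing drive.
        printf 'megacli -DiscardPreservedCache -L%s -a0\n' "$ld"
    fi
    printf 'megacli -CfgLdAdd -r0[%s:%s] -a0\n' "$enc" "$slot"
}

# Example: new_drive_plan 32 9 2
```

Leaving out the third argument skips the discard line, which matches the case where no preserved cache is reported.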

Force10 troubleshooting commands

Here are a few of the more helpful commands that I have been using recently to troubleshoot some performance issues we have been having with our stacked S4820s.

1) Show interface statistics:
‘show interface tengigabitethernet 0/40’

2) Monitor interface statistics in real time:
‘monitor interface tengigabitethernet 0/40’

3) Show port channel statistics:
‘show interface port-channel 7’

4) Show overview of port channel groupings:
‘show interface port-channel brief’

5) Monitor port channel statistics in real time:
‘monitor interface port-channel 12’

6) Display VLAN interface statistics:
‘show interface vlan 20’

7) Gather data for tech support:
‘show tech-support’

8) Display serial numbers, service tags, etc.:
‘show inventory’
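
When several ports need checking, typing the same show command once per port gets tedious. A small sketch, run on a workstation rather than on the switch, that prints the interface statistics command for a range of ports so the lines can be pasted into the switch session. show_cmds is a hypothetical helper, the 0/ prefix is taken from the example above, and the port numbers are only illustrative:

```shell
#!/bin/sh
# show_cmds FIRST LAST: print one "show interface" line per port in the
# inclusive range FIRST..LAST, using the 0/<port> form from the list above.
show_cmds() {
    port=$1
    last=$2
    while [ "$port" -le "$last" ]; do
        printf 'show interface tengigabitethernet 0/%s\n' "$port"
        port=$((port + 1))
    done
}

# Example: show_cmds 40 43
```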