[ceph-users] Another OSD broken today. How can I recover it?

Ronny Aasen ronny+ceph-users at aasen.cx
Tue Dec 5 01:14:41 PST 2017


On 05 Dec 2017 09:18, Gonzalo Aguilar Delgado wrote:
> Hi,
> 
> I created this. http://paste.debian.net/999172/ But the expiration date 
> is too short. So I did this too https://pastebin.com/QfrE71Dg.
> 
> What I want to mention is that there's no known cause for what's 
> happening. It's true that a small time desync happens on reboot because 
> of a few milliseconds of skew, but ntp corrects it quickly. There are no 
> network issues, and the OSD log is in the output.
> 
> I only see, in the other OSDs, errors that are becoming more and more common:
> 
> 2017-12-05 08:58:56.637773 7f0feff7f700 -1 log_channel(cluster) log 
> [ERR] : 10.7a shard 2: soid 
> 10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head data_digest 
> 0xfae07534 != data_digest 0xe2de2a76 from auth oi 
> 10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head(3873'5250781 
> client.5697316.0:51282235 dirty|data_digest|omap_digest s 4194304 uv 
> 5250781 dd e2de2a76 od ffffffff alloc_hint [0 0])
> 2017-12-05 08:58:56.637775 7f0feff7f700 -1 log_channel(cluster) log 
> [ERR] : 10.7a shard 6: soid 
> 10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head data_digest 
> 0xfae07534 != data_digest 0xe2de2a76 from auth oi 
> 10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head(3873'5250781 
> client.5697316.0:51282235 dirty|data_digest|omap_digest s 4194304 uv 
> 5250781 dd e2de2a76 od ffffffff alloc_hint [0 0])
> 2017-12-05 08:58:56.637777 7f0feff7f700 -1 log_channel(cluster) log 
> [ERR] : 10.7a soid 
> 10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head: failed to 
> pick suitable auth object
> 
> Basically, the digests don't match. Someone told me this can be caused 
> by a faulty disk, so I replaced the offending drive, and now the new 
> disk is showing the same problem. OK. But this thread is not about 
> finding the source of the problem; that will be done later.
> 
> This thread is about trying to recover an OSD that looks OK to the 
> object store tool. That is:
> 
> 
> Why does it break here?


If I get errors on a disk that I suspect come from something other than 
the disk being faulty, I remove the disk from the cluster, run it through 
the SMART tests (including the long self-test), and then run it through 
the vendor's diagnostic tools (I have a separate 1U machine for this). 
If the disk clears as OK, I wipe it and reinsert it as a new OSD.
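
For reference, a rough sketch of that workflow with the standard tools. 
The osd id (12) and device (/dev/sdX) are placeholders, and "ceph osd 
purge" needs Luminous or newer; on older releases use the crush remove / 
auth del / osd rm sequence instead:

    # take the suspect OSD out of the cluster and stop the daemon
    ceph osd out 12
    systemctl stop ceph-osd@12

    # remove it completely (Luminous+); on older releases:
    #   ceph osd crush remove osd.12 && ceph auth del osd.12 && ceph osd rm 12
    ceph osd purge 12 --yes-i-really-mean-it

    # test the drive in a separate box
    smartctl -t long /dev/sdX      # start the long self-test
    smartctl -a /dev/sdX           # review attributes and test results

    # if the drive checks out, wipe it and bring it back as a new OSD
    ceph-disk zap /dev/sdX         # or: ceph-volume lvm zap /dev/sdX
    ceph-disk prepare /dev/sdX     # or: ceph-volume lvm create --data /dev/sdX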

The reason you are getting corrupt digests is probably the most common 
way people end up with corruption: you are running size=2, min_size=1.


When you have 2 different versions of an object in the cluster, both 
with internally consistent digests, the cluster cannot know which of the 
2 is the correct one. Just search this list for all the users who ended 
up in your situation for the same reason, and also read this: 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-March/016663.html
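
You can check what your pools are set to, and which shard the scrub 
flagged, with something like this (the pool name is a placeholder; the 
pg id 10.7a is taken from your log above):

    # current replication settings of the pool
    ceph osd pool get <pool-name> size
    ceph osd pool get <pool-name> min_size

    # which PGs are inconsistent, and which shard disagrees
    ceph health detail
    rados list-inconsistent-obj 10.7a --format=json-pretty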


Simple rule of thumb:

size=2, min_size=1 :: I do not care about my data; the data is volatile, 
but I want the cluster to accept writes _all the time_.

size=2, min_size=2 :: I cannot afford real redundancy, but I do care a 
little about my data, and I accept that the cluster will block writes in 
error situations until the problem is fixed.

size=3, min_size=2 :: I want safe and available data, and I understand 
that the Ceph defaults are there for a reason.



Basically: use size=3, min_size=2 if you want to avoid corruption.
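
Changing an existing pool is just (pool name is a placeholder; note that 
going from size=2 to size=3 will trigger a round of backfill while the 
third copies are created):

    ceph osd pool set <pool-name> size 3
    ceph osd pool set <pool-name> min_size 2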

Remove, wipe, and reinstall disks that have developed corruptions or 
inconsistencies with the cluster.

kind regards
Ronny Aasen





