[ceph-users] Another OSD broken today. How can I recover it?

Denes Dolhay denke at denkesys.com
Tue Dec 5 01:26:07 PST 2017


Hi,

This question has come up a few times already, under both filestore and 
bluestore, but please help me understand why this is so:

"when you have 2 different objects, both with correct digests, in your 
cluster, the cluster can not know which of the 2 objects is the correct 
one."

Doesn't it use an epoch, or an omap epoch, when storing new data? If so, 
why can it not use the more recent one?


Thanks,

Denes.


On 12/05/2017 10:14 AM, Ronny Aasen wrote:
> On 05. des. 2017 09:18, Gonzalo Aguilar Delgado wrote:
>> Hi,
>>
>> I created this. http://paste.debian.net/999172/ But the expiration 
>> date is too short. So I did this too https://pastebin.com/QfrE71Dg.
>>
>> What I want to mention is that there's no known cause for what's 
>> happening. It's true that the clocks desync on reboot because of a few 
>> milliseconds of skew, but ntp corrects that quickly. There are no 
>> network issues, and the log of the osd is in the output.
>>
>> On the other osds I only see these errors, which are becoming more and 
>> more frequent:
>>
>> 2017-12-05 08:58:56.637773 7f0feff7f700 -1 log_channel(cluster) log 
>> [ERR] : 10.7a shard 2: soid 
>> 10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head 
>> data_digest 0xfae07534 != data_digest 0xe2de2a76 from auth oi 
>> 10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head(3873'5250781 
>> client.5697316.0:51282235 dirty|data_digest|omap_digest s 4194304 uv 
>> 5250781 dd e2de2a76 od ffffffff alloc_hint [0 0])
>> 2017-12-05 08:58:56.637775 7f0feff7f700 -1 log_channel(cluster) log 
>> [ERR] : 10.7a shard 6: soid 
>> 10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head 
>> data_digest 0xfae07534 != data_digest 0xe2de2a76 from auth oi 
>> 10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head(3873'5250781 
>> client.5697316.0:51282235 dirty|data_digest|omap_digest s 4194304 uv 
>> 5250781 dd e2de2a76 od ffffffff alloc_hint [0 0])
>> 2017-12-05 08:58:56.637777 7f0feff7f700 -1 log_channel(cluster) log 
>> [ERR] : 10.7a soid 
>> 10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head: failed 
>> to pick suitable auth object
>>
>> Basically, the digests do not match. Someone told me that this can be 
>> caused by a faulty disk, so I replaced the offending drive, and now 
>> the same thing is happening with the new disk. OK, but this thread is 
>> not for finding the source of the problem. That will be done later.
>>
>> This thread is about trying to recover an OSD that looks OK to the 
>> object store tool. This is:
>>
>>
>> Why does it break here?
>
>
> if i get errors on a disk that i suspect come from something other than 
> the disk being faulty, i remove the disk from the cluster and run it 
> through smart disk tests + long test, then through the vendor's 
> diagnostic tools (i have a separate 1u machine for this).
> if the disk clears as OK, i wipe it and reinsert it as a new OSD.
>
> the reason you are getting corrupt digests is probably the very 
> common way most people get corruptions: you have size=2, min_size=1
>
>
> when you have 2 different objects, both with correct digests, in your 
> cluster, the cluster can not know which of the 2 objects is the 
> correct one.  just search this list for all the users that end up in 
> your situation for the same reason, also read this: 
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-March/016663.html
>
>
> simple rule of thumb
> size=2, min_size=1 :: i do not care about my data, the data is 
> volatile but i want the cluster to accept writes _all the time_
>
> size=2, min_size=2 :: i can not afford real redundancy, but i do care 
> a little about my data, i accept that the cluster will block writes in 
> error situations until the problem is fixed.
>
> size=3, min_size=2 :: i want safe and available data, and i understand 
> that the ceph defaults are there for a reason.
>
>
>
> basically: size=3, min_size=2 if you want to avoid corruptions.
>
> remove-wipe-reinstall disks that have developed 
> corruptions/inconsistencies with the cluster
>
> kind regards
> Ronny Aasen
>
>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
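For anyone who wants to tally the scrub errors quoted above per shard, here is a minimal Python sketch. The names PATTERN, SAMPLE and parse_digest_mismatch are made up for illustration, and the regular expression only assumes the line layout visible in the quoted log; other Ceph releases may print these lines differently.

```python
import re

# Assumption: the layout matches the "[ERR]" lines quoted in this thread.
PATTERN = re.compile(
    r"\[ERR\] : (?P<pg>\S+) shard (?P<shard>\d+): soid (?P<soid>\S+) "
    r"data_digest (?P<found>0x[0-9a-f]+) != "
    r"data_digest (?P<expected>0x[0-9a-f]+) from auth oi"
)


def parse_digest_mismatch(line):
    """Return pg, shard, soid, found and expected digests, or None."""
    # Collapse whitespace so lines wrapped by a mail client still match.
    match = PATTERN.search(" ".join(line.split()))
    return match.groupdict() if match else None


# First error line from the quoted log, rejoined into one string.
SAMPLE = (
    "2017-12-05 08:58:56.637773 7f0feff7f700 -1 log_channel(cluster) log "
    "[ERR] : 10.7a shard 2: soid "
    "10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head "
    "data_digest 0xfae07534 != data_digest 0xe2de2a76 from auth oi "
    "10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head(3873'5250781 "
    "client.5697316.0:51282235 dirty|data_digest|omap_digest s 4194304 uv "
    "5250781 dd e2de2a76 od ffffffff alloc_hint [0 0])"
)

if __name__ == "__main__":
    info = parse_digest_mismatch(SAMPLE)
    print(info["pg"], info["shard"], info["found"], info["expected"])
```

In the quoted log, shard 2 and shard 6 both report the same on-disk digest (0xfae07534), and both differ from the digest recorded in the object info (0xe2de2a76), which is why the third line says no suitable auth object could be picked.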

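The size/min_size rule of thumb quoted above can be sketched as a toy model in Python. This is not Ceph code; the function names are hypothetical, and the model only encodes the idea that a placement group serves I/O while at least min_size replicas are up, so in the worst case an acknowledged write exists on only min_size disks.

```python
def pg_accepts_io(min_size: int, replicas_up: int) -> bool:
    """Toy rule: a PG serves I/O only while at least min_size replicas are up."""
    return replicas_up >= min_size


def fewest_copies_behind_an_ack(min_size: int) -> int:
    """Worst case: a write is acknowledged while only min_size replicas are up."""
    return min_size


if __name__ == "__main__":
    # The three combinations from the rule of thumb.
    for size, min_size in [(2, 1), (2, 2), (3, 2)]:
        worst = fewest_copies_behind_an_ack(min_size)
        print(f"size={size} min_size={min_size}: "
              f"an acknowledged write may exist on only {worst} disk(s)")
        for up in range(size, 0, -1):
            state = "accepts I/O" if pg_accepts_io(min_size, up) else "blocks I/O"
            print(f"  {up} replica(s) up: {state}")
```

With size=2, min_size=1 the model keeps accepting writes on a single surviving replica, which is the situation where the two copies can later disagree. With min_size=2 it blocks instead.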

