[ceph-users] How to recover from corrupted RocksDb

Paul Emmerich paul.emmerich at croit.io
Thu Nov 29 05:54:59 PST 2018


does objectstore-tool still work? If yes:

export all the PGs on the OSD with objectstore-tool and import them
into a new OSD.
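A rough sketch of that export/import loop (the OSD paths and PG IDs below are placeholders; `DRY_RUN=1` only prints the commands instead of running them):

```shell
# Assumption: osd.0 is the broken OSD, osd.5 is the freshly created one.
DRY_RUN=1
OSD=/var/lib/ceph/osd/ceph-0       # data path of the broken OSD (placeholder)
NEW_OSD=/var/lib/ceph/osd/ceph-5   # data path of the new OSD (placeholder)

# With DRY_RUN=1, just echo what would be run; otherwise execute it.
run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "$@"; else "$@"; fi; }

# Placeholder PG list; in practice get it from:
#   ceph-objectstore-tool --data-path "$OSD" --op list-pgs
for pg in 3.26 3.2a; do
  run ceph-objectstore-tool --data-path "$OSD" --pgid "$pg" \
      --op export --file "/tmp/$pg.export"
  run ceph-objectstore-tool --data-path "$NEW_OSD" --pgid "$pg" \
      --op import --file "/tmp/$pg.export"
done
```

Both OSD daemons must be stopped while the tool runs against their data paths.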

Paul


-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Thu, 29 Nov 2018 at 13:06, Igor Fedotov <ifedotov at suse.de> wrote:
>
> 'ceph-bluestore-tool repair' checks and repairs BlueStore metadata consistency, not RocksDB's.
>
> It looks like you're observing a CRC mismatch during DB compaction, which is probably not triggered during the repair.
>
> The good news is that BlueStore's metadata looks consistent, so data recovery should still be possible - though I can't build a working procedure for it with the existing tools.
>
> Let me check if one can disable DB compaction using rocksdb settings.
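>
> A minimal sketch of how that might look (this assumes Ceph's
> `bluestore_rocksdb_options` setting passes options straight through to
> RocksDB; verify the option name against your Ceph version before relying
> on it):
>
> ```ini
> [osd]
> # assumption: disabling automatic compaction keeps the crashing background
> # compaction from being scheduled, so the OSD may stay up long enough to
> # copy the data off
> bluestore_rocksdb_options = disable_auto_compactions=true
> ```
>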
>
>
> On 11/29/2018 1:42 PM, Mario Giammarco wrote:
>
> The only strange thing is that ceph-bluestore-tool says that the repair was done, no errors were found and all is ok.
> I wonder what that tool really does.
> Mario
>
> On Thu, 29 Nov 2018 at 11:03, Wido den Hollander <wido at 42on.com> wrote:
>>
>>
>>
>> On 11/29/18 10:45 AM, Mario Giammarco wrote:
>> > I have only that copy, it is a showroom system but someone put a
>> > production vm on it.
>> >
>>
>> I have a feeling this won't be easy to fix or actually fixable:
>>
>> - Compaction error: Corruption: block checksum mismatch
>> - submit_transaction error: Corruption: block checksum mismatch
>>
>> RocksDB got corrupted on that OSD and won't be able to start now.
>>
>> I wouldn't know where to start with this OSD.
>>
>> Wido
>>
>> > On Thu, 29 Nov 2018 at 10:43, Wido den Hollander
>> > <wido at 42on.com <mailto:wido at 42on.com>> wrote:
>> >
>> >
>> >
>> >     On 11/29/18 10:28 AM, Mario Giammarco wrote:
>> >     > Hello,
>> >     > I have a ceph installation in a proxmox cluster.
>> >     > Due to a temporary hardware glitch now I get this error on osd startup
>> >     >
>> >     >     -6> 2018-11-26 18:02:33.179327 7fa1d784be00  0 osd.0 1033
>> >     crush map
>> >     >     has features 1009089991638532096, adjusting msgr requires for
>> >     osds
>> >     >        -5> 2018-11-26 18:02:34.143084 7fa1c33f9700  3 rocksdb:
>> >     >
>> >      [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1591]
>> >     >     Compaction error: Corruption: block checksum mismatch
>> >     >     -4> 2018-11-26 18:02:34.143123 7fa1c33f9700 4 rocksdb:
>> >     (Original Log
>> >     >     Time 2018/11/26-18:02:34.143021)
>> >     >     [/build/ceph-12.2.9/src/rocksdb/db/compaction_job.cc:621]
>> >     [default]
>> >     >     compacted to: base level 1 max bytes base268435456 files[17$
>> >     >
>> >     >     -3> 2018-11-26 18:02:34.143126 7fa1c33f9700 4 rocksdb:
>> >     (Original Log
>> >     >     Time 2018/11/26-18:02:34.143068) EVENT_LOG_v1 {"time_micros":
>> >     >     1543251754143044, "job": 3, "event": "compaction_finished",
>> >     >     "compaction_time_micros": 1997048, "out$
>> >     >        -2> 2018-11-26 18:02:34.143152 7fa1c33f9700  2 rocksdb:
>> >     >
>> >      [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1275]
>> >     >     Waiting after background compaction error: Corruption: block
>> >     >     checksum mismatch, Accumulated background err$
>> >     >        -1> 2018-11-26 18:02:34.674171 7fa1c4bfc700 -1 rocksdb:
>> >     >     submit_transaction error: Corruption: block checksum mismatch
>> >     code =
>> >     >     2 Rocksdb transaction:
>> >     >     Delete( Prefix = O key =
>> >     >
>> >      0x7f7ffffffffffffffb64000000217363'rub_3.26!='0xfffffffffffffffeffffffffffffffff'o')
>> >     >     Put( Prefix = S key = 'nid_max' Value size = 8)
>> >     >     Put( Prefix = S key = 'blobid_max' Value size = 8)
>> >     >         0> 2018-11-26 18:02:34.675641 7fa1c4bfc700 -1
>> >     >     /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: In function
>> >     'void
>> >     >     BlueStore::_kv_sync_thread()' thread 7fa1c4bfc700 time 2018-11-26
>> >     >     18:02:34.674193
>> >     >     /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: 8717: FAILED
>> >     >     assert(r == 0)
>> >     >
>> >     >     ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217)
>> >     >     luminous (stable)
>> >     >     1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>> >     >     const*)+0x102) [0x55ec83876092]
>> >     >     2: (BlueStore::_kv_sync_thread()+0x24b5) [0x55ec836ffb55]
>> >     >     3: (BlueStore::KVSyncThread::entry()+0xd) [0x55ec8374040d]
>> >     >     4: (()+0x7494) [0x7fa1d5027494]
>> >     >     5: (clone()+0x3f) [0x7fa1d4098acf]
>> >     >
>> >     >
>> >     > I have tried to recover it using ceph-bluestore-tool fsck and
>> >     > repair DEEP, but it says everything is ok.
>> >     > I see that the rocksdb ldb tool needs .db files to recover, not a
>> >     > partition, so I cannot use it.
>> >     > I do not understand why I cannot start the osd if
>> >     > ceph-bluestore-tool tells me I have lost no data.
>> >     > Can you help me?
>> >
>> >     Why would you try to recover an individual OSD? If all your Placement
>> >     Groups are active(+clean), just wipe the OSD and re-deploy it.
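>> >
>> >     A rough sketch of that check-and-redeploy flow (standard CLI names,
>> >     but treat the exact redeploy step as an assumption that depends on
>> >     your deployment tooling):
>> >
>> >     ```shell
>> >     ceph pg stat    # all PGs active+clean? then no data lives only here
>> >     ceph osd out 0  # mark the broken OSD (osd.0 here) out
>> >     ceph osd purge 0 --yes-i-really-mean-it
>> >     # then wipe the disk and re-create the OSD, e.g. with ceph-volume
>> >     ```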
>> >
>> >     What's the status of your PGs?
>> >
>> >     It says there is a checksum error (probably due to the hardware glitch)
>> >     so it refuses to start.
>> >
>> >     Don't try to outsmart Ceph, let backfill/recovery handle this. Trying to
>> >     manually fix this will only make things worse.
>> >
>> >     Wido
>> >
>> >     > Thanks,
>> >     > Mario
>> >     >
>> >     > _______________________________________________
>> >     > ceph-users mailing list
>> >     > ceph-users at lists.ceph.com <mailto:ceph-users at lists.ceph.com>
>> >     > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >     >
>> >
>
>
>

