[ceph-users] How to recover from corrupted RocksDb

Igor Fedotov ifedotov at suse.de
Thu Nov 29 04:06:31 PST 2018


'ceph-bluestore-tool repair' checks and repairs BlueStore metadata 
consistency, not RocksDB's own consistency.
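
To be explicit about what that covers, a typical check/repair run looks like 
this (a sketch - assuming the OSD is stopped and lives at the usual 
/var/lib/ceph/osd/ceph-<id> path; adjust the id to yours):

    # check BlueStore metadata; --deep also reads object data and verifies checksums
    ceph-bluestore-tool fsck --deep 1 --path /var/lib/ceph/osd/ceph-0
    # fix the BlueStore-level inconsistencies the tool knows how to handle
    ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-0

None of this touches RocksDB's own block checksums, which is why the repair 
can come back clean while compaction still fails.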

It looks like you're observing a CRC mismatch during DB compaction, which 
is probably not triggered during the repair.

The good news is that BlueStore's metadata look consistent, so data 
recovery is still potentially possible - I can't come up with a working 
procedure using the existing tools, though.

Let me check if one can disable DB compaction using rocksdb settings.
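
If it turns out to be doable, I would expect it to look roughly like the 
following - an untested sketch; the exact option string depends on what 
bluestore_rocksdb_options already contains for your release, so append 
rather than replace:

    # ceph.conf on that OSD's host
    [osd]
    bluestore rocksdb options = <existing options>,disable_auto_compactions=true

disable_auto_compactions is a standard RocksDB column-family option; whether 
BlueStore stays usable with compaction switched off is exactly what needs 
checking first.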


On 11/29/2018 1:42 PM, Mario Giammarco wrote:
> The only strange thing is that ceph-bluestore-tool says the repair was 
> done, no errors were found and everything is OK.
> I wonder what that tool really does.
> Mario
>
> On Thu, 29 Nov 2018 at 11:03, Wido den Hollander 
> <wido at 42on.com> wrote:
>
>
>
>     On 11/29/18 10:45 AM, Mario Giammarco wrote:
>     > I only have that copy; it is a showroom system, but someone put a
>     > production VM on it.
>     >
>
>     I have a feeling this won't be easy to fix, if it is fixable at all:
>
>     - Compaction error: Corruption: block checksum mismatch
>     - submit_transaction error: Corruption: block checksum mismatch
>
>     RocksDB got corrupted on that OSD and won't be able to start now.
>
>     I wouldn't know where to start with this OSD.
>
>     Wido
>
>     > On Thu, 29 Nov 2018 at 10:43, Wido den Hollander
>     > <wido at 42on.com> wrote:
>     >
>     >
>     >
>     >     On 11/29/18 10:28 AM, Mario Giammarco wrote:
>     >     > Hello,
>     >     > I have a Ceph installation in a Proxmox cluster.
>     >     > Due to a temporary hardware glitch, I now get this error on OSD
>     >     > startup:
>     >     >
>     >     >     -6> 2018-11-26 18:02:33.179327 7fa1d784be00  0 osd.0 1033 crush map has features 1009089991638532096, adjusting msgr requires for osds
>     >     >     -5> 2018-11-26 18:02:34.143084 7fa1c33f9700  3 rocksdb: [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1591] Compaction error: Corruption: block checksum mismatch
>     >     >     -4> 2018-11-26 18:02:34.143123 7fa1c33f9700  4 rocksdb: (Original Log Time 2018/11/26-18:02:34.143021) [/build/ceph-12.2.9/src/rocksdb/db/compaction_job.cc:621] [default] compacted to: base level 1 max bytes base 268435456 files[17$
>     >     >     -3> 2018-11-26 18:02:34.143126 7fa1c33f9700  4 rocksdb: (Original Log Time 2018/11/26-18:02:34.143068) EVENT_LOG_v1 {"time_micros": 1543251754143044, "job": 3, "event": "compaction_finished", "compaction_time_micros": 1997048, "out$
>     >     >     -2> 2018-11-26 18:02:34.143152 7fa1c33f9700  2 rocksdb: [/build/ceph-12.2.9/src/rocksdb/db/db_impl_compaction_flush.cc:1275] Waiting after background compaction error: Corruption: block checksum mismatch, Accumulated background err$
>     >     >     -1> 2018-11-26 18:02:34.674171 7fa1c4bfc700 -1 rocksdb: submit_transaction error: Corruption: block checksum mismatch code = 2 Rocksdb transaction:
>     >     >     Delete( Prefix = O key = 0x7f7ffffffffffffffb64000000217363'rub_3.26!='0xfffffffffffffffeffffffffffffffff'o')
>     >     >     Put( Prefix = S key = 'nid_max' Value size = 8)
>     >     >     Put( Prefix = S key = 'blobid_max' Value size = 8)
>     >     >     0> 2018-11-26 18:02:34.675641 7fa1c4bfc700 -1 /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_kv_sync_thread()' thread 7fa1c4bfc700 time 2018-11-26 18:02:34.674193
>     >     >     /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: 8717: FAILED assert(r == 0)
>     >     >
>     >     >     ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)
>     >     >     1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55ec83876092]
>     >     >     2: (BlueStore::_kv_sync_thread()+0x24b5) [0x55ec836ffb55]
>     >     >     3: (BlueStore::KVSyncThread::entry()+0xd) [0x55ec8374040d]
>     >     >     4: (()+0x7494) [0x7fa1d5027494]
>     >     >     5: (clone()+0x3f) [0x7fa1d4098acf]
>     >     >
>     >     >
>     >     > I have tried to recover it using ceph-bluestore-tool fsck and
>     >     > repair (deep), but they say everything is OK.
>     >     > I see that the RocksDB ldb tool needs .db files to recover, not a
>     >     > partition, so I cannot use it.
>     >     > I do not understand why I cannot start the OSD if
>     >     > ceph-bluestore-tool tells me that I have lost no data.
>     >     > Can you help me?
>     >
>     >     Why would you try to recover an individual OSD? If all your
>     >     Placement Groups are active(+clean), just wipe the OSD and
>     >     re-deploy it.
>     >
>     >     What's the status of your PGs?
>     >
>     >     It says there is a checksum error (probably due to the hardware
>     >     glitch) so it refuses to start.
>     >
>     >     Don't try to outsmart Ceph; let backfill/recovery handle this.
>     >     Trying to manually fix this will only make things worse.
>     >
>     >     Wido
>     >
>     >     > Thanks,
>     >     > Mario
>     >     >
>     >     > _______________________________________________
>     >     > ceph-users mailing list
>     >     > ceph-users at lists.ceph.com
>     >     > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>     >     >
>     >
>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
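
For completeness, the wipe-and-redeploy route Wido describes would look 
roughly like this on a Luminous cluster - a sketch only, assuming osd.0 is 
the broken OSD and /dev/sdX its data device; double-check both, and make 
sure all PGs are active+clean, before running anything destructive:

    ceph -s                                     # all PGs must be active+clean first
    ceph osd out osd.0
    systemctl stop ceph-osd@0
    ceph osd purge osd.0 --yes-i-really-mean-it
    ceph-volume lvm zap /dev/sdX --destroy      # wipes the old OSD
    ceph-volume lvm create --data /dev/sdX      # deploys a fresh OSD in its place

On Proxmox you may prefer the equivalent pveceph/GUI workflow; either way the 
cluster then backfills the fresh OSD from the surviving replicas.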
