[ceph-users] rocksdb: Corruption: missing start of fragmented record

Michael mehe.schmid at gmx.ch
Wed Nov 1 01:30:06 PDT 2017


Hello everyone,

I've conducted some crash tests (unplugging drives, unplugging the machine, 
terminating and restarting the ceph systemd services) with Ceph 12.2.0 on 
Ubuntu and quite easily managed to break what appears to be rocksdb's 
log replay on a bluestore OSD:

# ceph-bluestore-tool fsck  --path /var/lib/ceph/osd/ceph-2/
[...]
4 rocksdb: 
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2859] 
Recovered from manifest file:db/MANIFEST-000975 
succeeded,manifest_file_number is 975, next_file_number is 1008, 
last_sequence is 51965907, log_number is 0,prev_log_number is 
0,max_column_family is 0
4 rocksdb: 
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2867] 
Column family [default] (ID 0), log number is 1005
4 rocksdb: EVENT_LOG_v1 {"time_micros": 1509298585082794, "job": 1, 
"event": "recovery_started", "log_files": [1003, 1005]}
4 rocksdb: 
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482] 
Recovering log #1003 mode 0
4 rocksdb: 
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482] 
Recovering log #1005 mode 0
3 rocksdb: 
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:424] 
db/001005.log: dropping 3225 bytes; Corruption: missing start of 
fragmented record(2)
4 rocksdb: 
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:217] Shutdown: 
canceling all background work
4 rocksdb: 
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:343] Shutdown 
complete
-1 rocksdb: Corruption: missing start of fragmented record(2)
-1 bluestore(/var/lib/ceph/osd/ceph-2/) _open_db erroring opening db:
1 bluefs umount
1 bdev(0x557f5b6a4240 /var/lib/ceph/osd/ceph-2//block) close

If I understand this right, rocksdb is just trying to replay its WAL-type 
logs, of which presumably "001005.log" is corrupted. It then throws an 
error that stops everything.
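
To make clearer what I think the message means, here is a rough Python sketch of the record framing as I understand it from the RocksDB WAL format description: 32 KiB blocks, a 7-byte header (checksum, length, type), or 11 bytes for the "recyclable" record types, and records split into FIRST / MIDDLE / LAST fragments when they cross a block boundary. A MIDDLE or LAST fragment with no preceding FIRST would then be a "missing start of fragmented record"; my guess is that the "(2)" suffix marks the LAST case. The function name scan_wal and the message strings are my own, checksums are not verified, and it expects the raw bytes of a log file, which I have not been able to get out of BlueFS so far.

```python
import struct

BLOCK_SIZE = 32768        # WAL block size, per the RocksDB format description
LEGACY_HEADER = 7         # checksum (4) + length (2) + type (1)
RECYCLABLE_HEADER = 11    # legacy header + log number (4)

# Fragment position per record type; types 5..8 are the recyclable variants.
POSITION = {1: "full", 2: "first", 3: "middle", 4: "last",
            5: "full", 6: "first", 7: "middle", 8: "last"}


def scan_wal(data):
    """Return (offset, message) pairs for framing problems in a WAL image.

    Hypothetical helper: it only follows the framing and does not
    verify checksums or decode the write batches.
    """
    problems = []
    in_fragmented_record = False
    offset = 0
    while offset < len(data):
        left_in_block = BLOCK_SIZE - offset % BLOCK_SIZE
        if left_in_block < LEGACY_HEADER:
            offset += left_in_block          # block trailer, skip the padding
            continue
        if offset + LEGACY_HEADER > len(data):
            break                            # truncated header at end of file
        _crc, length, rtype = struct.unpack_from("<IHB", data, offset)
        if rtype == 0 and length == 0:
            offset += left_in_block          # zeroed space, move to next block
            continue
        position = POSITION.get(rtype)
        if position is None:
            problems.append((offset, "unknown record type %d" % rtype))
            break
        header = RECYCLABLE_HEADER if rtype >= 5 else LEGACY_HEADER
        if header + length > left_in_block:
            problems.append((offset, "bad record length"))
            break
        if position in ("middle", "last") and not in_fragmented_record:
            problems.append((offset, "missing start of fragmented record"))
        if position == "first":
            in_fragmented_record = True
        elif position in ("full", "last"):
            in_fragmented_record = False
        offset += header + length
    return problems
```

Fed the contents of a log file, this would list the offsets of any orphaned fragments, which at least would tell me whether the damage is confined to the tail of 001005.log.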

I did try to mount the bluestore, as I was assuming that would probably 
be where I'd find rocksdb's files, but that also doesn't seem 
possible:

# ceph-objectstore-tool --op fsck --data-path /var/lib/ceph/osd/ceph-2/ 
--mountpoint /mnt/bluestore-repair/
fsck failed: (5) Input/output error
# ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-2 
--mountpoint /mnt/bluestore-repair/
Mount failed with '(5) Input/output error'
# ceph-objectstore-tool --op fuse --force --skip-journal-replay 
--data-path /var/lib/ceph/osd/ceph-2 --mountpoint /mnt/bluestore-repair/
Mount failed with '(5) Input/output error'

Adding --debug shows the ultimate culprit is just the above rocksdb 
error again.

Q: Is there some way in which I can tell rocksdb to truncate, delete or 
skip the respective log entries? Or can I get access to rocksdb (or its 
files) in some other way, so that I can manipulate it or delete the 
corrupted WAL files manually?

-Michael
