[ceph-users] RBD corruption when removing tier cache

Jan Pekař - Imatic jan.pekar at imatic.cz
Sat Dec 2 17:54:21 PST 2017

Hi all,

Today I continued my investigation, and maybe somebody will be 
interested in my findings, so I'm sending them here.

I compared objects in the hot pool with the corresponding objects in the 
cold pool and they were identical, so I removed the cache tier from the 
cold pool.
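The object comparison can be scripted. A minimal sketch in Python (the rados get commands and file names in the comments are illustrative, not taken from the original post) that checks whether two dumped objects are byte-identical:

```python
import hashlib

def sha256_of(path):
    """Hex SHA-256 digest of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def objects_match(hot_dump, cold_dump):
    """True if two dumped RADOS objects are byte-identical."""
    return sha256_of(hot_dump) == sha256_of(cold_dump)

# Illustrative use, after dumping the same object from both pools, e.g.:
#   rados -p hot  get rbd_data.9c000238e1f29.0000000000000000 hot.bin
#   rados -p cold get rbd_data.9c000238e1f29.0000000000000000 cold.bin
# objects_match("hot.bin", "cold.bin")
```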

Then I tried to fsck my RBD image from a libvirt virtual machine booted 
from a rescue CD.

I was successful only with a read-only mount that skipped journal replay 
(mount -o ro,noload).
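For reference, the read-only mount and a non-destructive check from the rescue environment can be sketched like this (assuming the RBD image appears as /dev/sda1 inside the VM, which matches the errors below but is not stated explicitly):

```shell
# Inside the rescue VM; /dev/sda1 is assumed to be the damaged ext4 partition.
# Mount read-only and skip ext4 journal replay:
mount -o ro,noload /dev/sda1 /mnt
# Run a check that answers "no" to all repair prompts (-n), so nothing is written:
fsck.ext4 -n /dev/sda1
```

These commands require the rescue environment and the attached RBD device, so they are only an illustration of the steps described above.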

I noticed that I was getting I/O errors on the disk:

sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK 
sd 2:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current]
sd 2:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated
sd 2:0:0:0: [sda] tag#0 CDB: Write(10) 2a 00 00 00 08 08 00 00 10 00
blk_update_request: 4 callbacks suppressed
blk_update_request: I/O error, dev sda, sector 2056
buffer_io_error: 61 callbacks suppressed
Buffer I/O error on dev sda1, logical block 1, lost async page write
Buffer I/O error on dev sda1, logical block 2, lost async page write
VFS: Dirty inode writeback failed for block device sda1 (err=-5).

I wanted to write to that block manually. To be safe I first created an 
RBD snapshot of that filesystem, and after I created it, the problems 
disappeared.
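The snapshot step can be sketched as follows (pool and image names are hypothetical, since the original names are not given in the post):

```shell
# Hypothetical names: RBD image "vm-disk" in pool "cold".
# Create a snapshot before further repair attempts:
rbd snap create cold/vm-disk@before-repair
# Confirm the snapshot exists:
rbd snap ls cold/vm-disk
```

These commands need a live cluster, so they are shown only to make the step concrete.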

After creating the snapshot I was able to fsck that filesystem and 
replay the ext4 journal.

It looks as though objects in the cold pool were locked somehow so that 
they could not be modified? After the snapshot they changed names and 
modification became possible? Can I debug this somehow?

I continued with cleaning the hot pool and tried to delete objects. The 
delete operation succeeded with rados rm, but some objects stayed there 
and I couldn't delete or get them anymore.

rados -p hot ls


rados -p hot rm rbd_data.9c000238e1f29.0000000000000000
error removing hot>rbd_data.9c000238e1f29.0000000000000000: (2) No such 
file or directory

How can I clean up that pool? What could have happened to it?

After some additional tests I think that my initial problem was caused 
by switching the cache mode to forward, so I recommend not only warning 
when that mode is used (as happens now) but also updating the official 
webpage


and finding some other way to flush all objects (such as turning off the 
VMs, or setting a short eviction time or a small target size) and 
removing the overlay after that.
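A sketch of that alternative drain-then-remove sequence (pool names are hypothetical; the tunables are the standard cache-tiering pool settings):

```shell
# Hypothetical pool names: "hot-pool" tiered over "cold-pool".
# 1. Stop client I/O first (shut down the VMs using the pool).
# 2. Make the tiering agent drain the cache by shrinking its targets:
ceph osd pool set hot-pool target_max_objects 1
ceph osd pool set hot-pool cache_min_flush_age 1
ceph osd pool set hot-pool cache_min_evict_age 1
# 3. Wait for the agent to flush/evict, then verify the hot pool is empty:
rados -p hot-pool ls
# 4. Only then remove the overlay and the tier:
ceph osd tier remove-overlay cold-pool
ceph osd tier remove cold-pool hot-pool
```

This is only one possible realization of the suggestion above, not a tested procedure.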

With regards
Jan Pekar

On 1.12.2017 03:43, Jan Pekař - Imatic wrote:
> Hi all,
> today I tested adding SSD cache tier to pool.
> Everything worked, but when I tried to remove it and run
> rados -p hot-pool cache-flush-evict-all
> I got
>          rbd_data.9c000238e1f29.0000000000000000
> failed to flush /rbd_data.9c000238e1f29.0000000000000000: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.0000000000000621
> failed to flush /rbd_data.9c000238e1f29.0000000000000621: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.0000000000000001
> failed to flush /rbd_data.9c000238e1f29.0000000000000001: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.0000000000000a2c
> failed to flush /rbd_data.9c000238e1f29.0000000000000a2c: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.0000000000000200
> failed to flush /rbd_data.9c000238e1f29.0000000000000200: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.0000000000000622
> failed to flush /rbd_data.9c000238e1f29.0000000000000622: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.0000000000000009
> failed to flush /rbd_data.9c000238e1f29.0000000000000009: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.0000000000000208
> failed to flush /rbd_data.9c000238e1f29.0000000000000208: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.00000000000000c1
> failed to flush /rbd_data.9c000238e1f29.00000000000000c1: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.0000000000000625
> failed to flush /rbd_data.9c000238e1f29.0000000000000625: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.00000000000000d8
> failed to flush /rbd_data.9c000238e1f29.00000000000000d8: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.0000000000000623
> failed to flush /rbd_data.9c000238e1f29.0000000000000623: (2) No such 
> file or directory
>          rbd_data.9c000238e1f29.0000000000000624
> failed to flush /rbd_data.9c000238e1f29.0000000000000624: (2) No such 
> file or directory
> error from cache-flush-evict-all: (1) Operation not permitted
> I also noticed that switching the cache tier to "forward" is not safe:
> Error EPERM: 'forward' is not a well-supported cache mode and may 
> corrupt your data.  pass --yes-i-really-mean-it to force.
> At the moment of flushing (or switching to forward mode) the RBD got 
> corrupted, and even fsck was unable to repair it (unable to set 
> superblock flags). I don't know whether that is because the cache is 
> still active and corrupted, or because ext4 got so messed up that it 
> cannot work anymore.
> Even when the VM that was using that pool is stopped, I cannot flush it.
> So what did I do wrong? Can I get my data back? Is it safe to remove 
> the tier cache, and how?
> Using rados get I can dump the objects to disk, so why can't I flush 
> (evict) them?
> It looks like the same issue as
> http://tracker.ceph.com/issues/12659
> but that is unresolved.
> I also have a snapshot of the RBD image in the cold pool, but that 
> should not cause problems in production.
> I'm using version 12.2.1 on all 4 nodes.
> With regards
> Jan Pekar
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Ing. Jan Pekař
jan.pekar at imatic.cz | +420603811737
Imatic | Jagellonská 14 | Praha 3 | 130 00
