[ceph-users] RBD image has no active watchers while OpenStack KVM VM is running
Wido den Hollander
wido at 42on.com
Wed Nov 29 05:48:41 PST 2017
In an OpenStack environment I encountered a VM which went into read-only mode after an RBD snapshot was created.
Digging into this I found tens (out of thousands) of RBD images which DO have a running VM, but do NOT have a watcher on the RBD image.
$ rbd status volumes/volume-79773f2e-1f40-4eca-b9f0-953fa8d83086
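To check a whole pool instead of one image at a time, something like the sketch below could be used. It is only a rough sketch: it assumes the plain-text output of `rbd status` lists each watcher on a line starting with `watcher=`, and the helper names `has_watchers` and `images_without_watchers` are made up for illustration. The resulting list also contains volumes that are simply not attached to anything, so it still has to be compared against the volumes of running VMs.

```python
import subprocess


def has_watchers(status_text):
    """Return True if `rbd status` plain-text output lists at least one watcher.

    Assumption: every watcher appears on its own line starting with
    "watcher=" (after indentation); anything else counts as no watcher.
    """
    for line in status_text.splitlines():
        if line.strip().startswith("watcher="):
            return True
    return False


def images_without_watchers(pool):
    """Return "pool/image" specs in `pool` for which no watcher is listed.

    Shells out to the `rbd` CLI, so it needs a working ceph.conf and keyring.
    """
    names = subprocess.check_output(
        ["rbd", "ls", pool], universal_newlines=True
    ).split()
    missing = []
    for name in names:
        spec = "{}/{}".format(pool, name)
        out = subprocess.check_output(
            ["rbd", "status", spec], universal_newlines=True
        )
        if not has_watchers(out):
            missing.append(spec)
    return missing


# Example use against a live cluster (not executed here):
#   for spec in images_without_watchers("volumes"):
#       print(spec)
```

Parsing is kept separate from the `rbd` calls so the check itself can be tried on captured output without touching the cluster.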
The VM has, however, been running since September 5th 2017 with Jewel 10.2.7 on the client.
In the meantime the cluster has already been upgraded to 10.2.10.
Looking further I also found a Compute node with 10.2.10 installed which also has RBD images without watchers.
Restarting or live migrating the VM to a different host resolves this issue.
The internet is full of posts where RBD images still have watchers when people don't expect them, but in this case I'm expecting a watcher which isn't there.
The main problem right now is that creating a snapshot can put a VM into a read-only state, because without a watcher the client is not notified of the snapshot.
Has anybody seen this as well?