[ceph-users] RBD image has no active watchers while OpenStack KVM VM is running

Logan Kuhn logank at wolfram.com
Wed Nov 29 05:53:10 PST 2017


We've seen this. Our environment isn't identical, though: we use oVirt and connect to Ceph (11.2.1) via Cinder (9.2.1). It's so rare that we've never had any luck pinpointing it, and we have far fewer VMs (<300).

Regards,
Logan

----- On Nov 29, 2017, at 7:48 AM, Wido den Hollander wido at 42on.com wrote:

| Hi,
| 
| In an OpenStack environment I encountered a VM which went into R/O mode after an
| RBD snapshot was created.
| 
| Digging into this I found tens (out of thousands) of RBD images which DO have a
| running VM, but do NOT have a watcher on the RBD image.
| 
| For example:
| 
| $ rbd status volumes/volume-79773f2e-1f40-4eca-b9f0-953fa8d83086
| 
| 'Watchers: none'
| 
| The VM has, however, been running since September 5th 2017 with Jewel 10.2.7 on
| the client.
| 
| In the meantime the cluster was already upgraded to 10.2.10.
| 
| Looking further I also found a Compute node with 10.2.10 installed which also
| has RBD images without watchers.
| 
| Restarting or live migrating the VM to a different host resolves this issue.
| 
| The internet is full of posts where RBD images still have Watchers when people
| don't expect them, but in this case I'm expecting a watcher which isn't there.
| 
| The main problem right now is that creating a snapshot potentially puts a VM in
| Read-Only state because, without a watcher, the client is not notified of it.
| 
| Has anybody seen this as well?
| 
| Thanks,
| 
| Wido
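
For anyone who wants to check their own cluster for the condition described in the quoted message, the manual `rbd status` check could be scripted roughly as below. This is only a sketch under assumptions: the function names are made up for illustration, the "no watcher" case is taken from the `Watchers: none` output quoted above, and the `watcher=` prefix for a present watcher is assumed from typical `rbd status` output and may differ between releases. The list of images that should have a running VM has to come from elsewhere (for example from Cinder/Nova), since an image that is simply not attached also shows no watchers.

```python
import subprocess


def has_watchers(status_output: str) -> bool:
    """Return True if `rbd status` text lists at least one watcher.

    Assumption: each watcher is printed on its own line starting with
    "watcher=", and an unwatched image prints "Watchers: none".
    """
    for line in status_output.splitlines():
        if line.strip().startswith("watcher="):
            return True
    return False


def rbd_status(pool: str, image: str) -> str:
    """Run `rbd status pool/image` and return its stdout (needs the rbd CLI)."""
    result = subprocess.run(
        ["rbd", "status", f"{pool}/{image}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def images_without_watchers(pool, images, status_fn=rbd_status):
    """Return those entries of `images` whose status shows no watcher.

    `images` should be the images expected to be in use by a running VM.
    `status_fn` is injectable so the logic can be tried without a cluster.
    """
    return [img for img in images if not has_watchers(status_fn(pool, img))]
```

Usage would be something like `images_without_watchers("volumes", attached_images)`, where `attached_images` is a hypothetical list of volume names known to be attached to running instances; anything returned is a candidate for the restart or live migration workaround mentioned above.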

