[ceph-users] Hangs with qemu/libvirt/rbd when one host disappears
marcus at priesch.co.at
Thu Dec 7 01:24:13 PST 2017
Hello Alwin, Dear All,
Yesterday we finished the cluster migration to proxmox and I had the same
problem again: a couple of OSDs down and out, and a stuck request on a
completely different OSD which blocked the VMs.
I tried to take this specific OSD out (ceph osd out xx) and voilà, the
problem was gone. Later on I put the OSD back in and everything works as
expected.
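For the record, this is roughly the sequence I used (osd id 12 is just a placeholder for whichever OSD reports the stuck request):

```shell
# Show which requests are stuck / blocked (as reported in ceph health)
ceph health detail

# Mark the suspect OSD out so its PGs get remapped to other OSDs
# (this does not stop the daemon, it only reweights it to 0)
ceph osd out 12

# ... once the stuck request has cleared, take it back in
ceph osd in 12
```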
In the meantime I read the post here:
where network problems with switches are also mentioned ...
As the 1Gb network is completely busy in such a scenario, I would assume
that maybe some network communication got stuck somewhere.
However, all in all the transition from ubuntu / jewel to ubuntu /
luminous to proxmox / luminous went rather flawlessly - despite the
problem stated above - but I am aware that I am running ceph outside its
requirements - so definitely *thumbs up* for ceph in general !!!!
to your comments :
>> i am running ceph luminous (have upgraded two weeks ago)
> I guess, you are running on ceph 12.2.1 (12.2.2 is out)? What does ceph versions say?
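For reference, since luminous the versions of all running daemons can be listed cluster-wide; I will check it this way:

```shell
# Reports how many mons/mgrs/osds run each ceph release;
# useful after an upgrade to spot daemons still on the old version
ceph versions
```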
>> ceph communication is carried out on a separate 1Gbit network where we
>> plan to upgrade to bonded 2x10Gbit during the next couple of weeks.
> With 6 hosts you will need 10GbE, alone for lower latency. Also a ceph
> recovery/rebalance might max out the bandwidth of your link.
Yes, I think this is the problem ...
> Mixing of spinners with SSDs is not recommended, as spinners will slow
> down the pools residing on that root.
Why should this happen ? I would assume that OSDs are separate daemons
running on the hosts - not influencing each other ?
Otherwise I would need a different set of hosts for the SSDs and the
spinners ?
>> when i turn off one of the hosts (lets say node7) that do only ceph,
>> after some time the vm's stall and hang until the host comes up again.
> A stall of I/O shouldn't happen, what is your min_size of the pools? How
> is your 'ceph osd tree' looking?
You'll find it at the owncloud link ... at least 'ceph osd df tree'
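These are the commands I ran to produce that output; 'ceph osd pool ls detail' also shows size/min_size per pool:

```shell
# Per-pool replication settings (size, min_size, crush rule, ...)
ceph osd pool ls detail

# Utilization and placement of every OSD within the crush tree
ceph osd df tree
```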
>> but neither osd's 9, 10 or 5 are located on host7 - so can anyone of you
>> tell me why the requests to this nodes got stuck ?
> Those OSDs are waiting on other OSDs on host7, you can see that in the
> ceph logs and you see with 'ceph pg dump' which pgs are located on which
OK, you mean that they are waiting for operations to finish on the
OSDs that just went offline ?
This should be a normal scenario when hardware fails - so it shouldn't
lead to a stuck VM ... I assume ?
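If it helps the diagnosis, I think the waiting operations can be inspected like this (osd.5 stands for one of the OSDs that reported stuck requests; the daemon command has to run on the host where that OSD lives):

```shell
# List the PGs that have a replica on the given OSD
ceph pg ls-by-osd osd.5

# On the OSD's host: dump the operations currently in flight,
# including how long each one has been waiting and on what
ceph daemon osd.5 dump_ops_in_flight
```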
>> i have one pg in state "stuck unclean" which has its replicas on osd's
>> 2, 3 and 15. 3 is on node7, but the first in the active set is 2 - i
>> thought the "write op" should have gone there ... so why unclean ? the
>> manual states "For stuck unclean placement groups, there is usually
>> something preventing recovery from completing, like unfound objects" but
>> there arent ...
> unclean - The placement group has not been clean for too long (i.e., it
> hasn’t been able to completely recover from a previous failure).
I know this ... there was no previous failure ... when I turn off some
OSDs I always get this after some time ...
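For completeness, this is how I query those PGs (the pg id 2.3f is only an example taken from such output):

```shell
# List all PGs that are stuck unclean, and since when
ceph pg dump_stuck unclean

# Show the up and acting set of one of the reported PGs
ceph pg map 2.3f
```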
> How is your 1GbE utilized? I guess, with 6 nodes (3-4 OSDs) your link
> might be maxed out. But you should get something in the ceph logs.
Yes, it is maxed out ... I suspect it may be a problem in the network
hardware, that some packets get lost or stuck somewhere ...
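To confirm the saturation I watch the per-interface throughput during a rebalance, roughly like this (the sysstat package on the hosts is an assumption, and recovery can be throttled while the link is the bottleneck):

```shell
# 1 GbE tops out around ~117 MB/s; watch rxkB/s / txkB/s during recovery,
# sampling once per second for 10 seconds
sar -n DEV 1 10

# Throttle recovery traffic so client I/O still gets through the link
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
```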
>> do i have a configuration issue here (amount of replicas?) or is this
>> behavior simply just because my cluster network is too slow ?
>> you can find detailed outputs here :
>> i hope any of you can help me shed any light on this ...
>> at least the point of all this is that a single host should be allowed
>> to fail and the vm's continue running ... ;)
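Regarding the replica question: as far as I understand, with size=3 / min_size=2 and a crush rule that replicates across hosts, a single host may fail and writes continue. This is how it would be set (the pool name 'vms' is just an assumption):

```shell
# Three replicas, keep serving I/O as long as two of them are available
ceph osd pool set vms size 3
ceph osd pool set vms min_size 2

# Verify the crush rules choose replicas across hosts, not across OSDs
ceph osd crush rule dump
```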
> To get a better look at your setup, a crush map, ceph osd dump, ceph -s
> and some log output would be nice.
You should find everything in ceph_report.txt at the link above ...
> Also you are moving to Proxmox, you might want to have look at the docs
> & the forum.
> Docs: https://pve.proxmox.com/pve-docs/
> Forum: https://forum.proxmox.com
thanks, been there ...
> Some more useful information on PVE + Ceph: https://forum.proxmox.com/threads/ceph-raw-usage-grows-by-itself.38395/#post-189842
Haven't read this one yet ...
thanks a lot !