[ceph-users] Hangs with qemu/libvirt/rbd when one host disappears

Fabian Grünbichler f.gruenbichler at proxmox.com
Thu Dec 7 02:02:23 PST 2017

On Thu, Dec 07, 2017 at 09:59:43AM +0100, Marcus Priesch wrote:
> Hello Brad,
> thanks for your answer !
> >> at least the point of it all is that a single host should be allowed
> >> to fail and the VMs continue running ... ;)
> > 
> > You don't really have six MONs do you (although I know the answer to
> > this question)? I think you need to take another look at some of the
> > docs about monitors.
> however i don't get the point here ...
> because it's an even number ?
> i read the docs ... but didn't find any hints on the number of mons ... i
> would assume, the more the better ... is this wrong ?

an even number of monitors buys you nothing in a quorum-based system: a
strict majority has to stay up, so 6 is no better than 5 (in both cases
you can only tolerate the loss of 2 before losing quorum).
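
to make the arithmetic concrete, here is a minimal sketch in Python. it
assumes the usual strict-majority rule (more than half of the monitors
must be up); the function names are made up for illustration and are not
part of any Ceph tooling.

```python
def quorum_size(n_mons: int) -> int:
    """Smallest strict majority of n_mons monitors."""
    return n_mons // 2 + 1


def tolerated_failures(n_mons: int) -> int:
    """How many monitors can be lost while a majority is still up."""
    return n_mons - quorum_size(n_mons)


if __name__ == "__main__":
    # Print one row per monitor count: count, quorum size, tolerated losses.
    for n in range(1, 8):
        print(n, quorum_size(n), tolerated_failures(n))
```

going from 5 to 6 mons raises the required majority from 3 to 4, so the
number of mons you can lose stays at 2.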

in Ceph, additional monitors require additional resources AND generate
additional overhead (more mons -> more communication). the rule of thumb
is 3 for small to mid-sized clusters. the next step up, performance-wise,
is to move the 3 mons to their own stand-alone nodes. only once that
starts to bottleneck do you increase the number to 5 and/or upgrade the
HW. for really big clusters, you can then start splitting out the mgr
instances to reduce the load further.
