[ceph-users] Cluster Down from reweight-by-utilization

Kevin Hrpcek kevin.hrpcek at ssec.wisc.edu
Sat Nov 4 21:50:08 PDT 2017


Hey Sage,

Thanks for getting back to me this late on a weekend.
> Do you know why the OSDs were going down?  Are there any crash dumps in the
> osd logs, or is the OOM killer getting them?
That's the part I can't nail down yet. The OSDs didn't crash; after the 
reweight-by-utilization, OSDs on some of our earlier-gen servers started 
spinning at 100% CPU and were overwhelmed. Admittedly those early-gen 
OSD servers are undersized on CPU, which is probably why they were 
overwhelmed, but it hasn't escalated like this before. Heartbeats among 
the cluster's OSDs started failing on those OSDs first, and then the 
100% CPU problem seemed to snowball to all hosts. I'm still trying to 
figure out why a relatively small reweight caused this problem.
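
For reference, next time I'll probably dry-run it first to see how large 
the change would be. Something along these lines (the threshold, 
max_change, and max_osds values here are just illustrative, not what I 
actually ran):

  # preview the weight changes without applying anything
  ceph osd test-reweight-by-utilization 120 0.05 4
  # apply with the same conservative limits
  ceph osd reweight-by-utilization 120 0.05 4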
> The usual strategy here is to set 'noup' and get all of the OSDs to catch
> up on osdmaps (you can check progress via the above status command).  Once
> they are all caught up, unset noup and let them all peer at once.
I tried having noup set for a few hours earlier to see if stopping the 
moving osdmap target would help, but I eventually unset it while doing 
more troubleshooting. I'll set it again and let it run overnight; 
patience is probably needed with a cluster this size. I came across a 
similar situation in the archives and was trying your previous solution: 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-May/040030.html
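
Roughly the sequence I'm planning tonight (osd.12 is just a placeholder; 
I'll compare each OSD's newest_map from the admin socket against the 
cluster's current epoch):

  # keep OSDs from being marked up while they catch up on maps
  ceph osd set noup
  # current cluster osdmap epoch
  ceph osd stat
  # per-daemon progress, run on the OSD's host: shows oldest_map / newest_map
  ceph daemon osd.12 status
  # once every OSD has caught up to the cluster epoch, let them peer
  ceph osd unset noup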
>
> The problem that has come up here in the past is when the cluster has been
> unhealthy for a long time and the past intervals use too much memory.  I
> don't see anything in your description about memory usage, though.  If
> that does rear its head there's a patch we can apply to kraken to work
> around it (this is fixed in luminous).
Memory usage doesn't seem too bad. It's a little tight on some of those 
early-gen servers, but I haven't seen the OOM killer taking anything out 
yet. While googling the issue I think I saw mention of that patch and of 
luminous handling this type of situation better... larger osdmap 
increments or something similar, if I recall correctly. My cluster is a 
few weeks away from a luminous upgrade.
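
In case memory does become the problem, I'll keep watching the OSD heaps 
with something like this (osd.12 again just a placeholder):

  # tcmalloc heap stats from one OSD daemon
  ceph tell osd.12 heap stats
  # quick resident-set check for all OSD processes on a host
  ps -C ceph-osd -o pid,rss,cmd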

Kevin