[ceph-users] _committed_osd_maps shutdown OSD via async signal, bug or feature?

Stefan Kooman stefan at bit.nl
Thu Oct 5 06:48:20 PDT 2017


During testing (mimicking BGP / port flaps) on our cluster, we are able
to trigger a "_committed_osd_maps shutdown OSD via async signal" on the
affected OSD servers in one datacenter (the OSDs in that DC become
intermittently isolated from their peers). The result is that all OSD
processes on those servers stop. Is this a bug or a feature? I.e., is
there a "flap" detection mechanism in the Ceph OSD?

If it's a bug, it might be related to
http://tracker.ceph.com/issues/20174. We get a similar error message on
"12.2.0". Version "12.2.1" does not log:

"-1 Fail to open '/proc/0/cmdline' error = (2) No such file or directory
-1 received  signal: Interrupt from  PID: 0 task name: <unknown> UID: 0
-1 osd.21 1846 *** Got signal Interrupt ***
0 osd.21 1846 prepare_to_stop starting shutdown
-1 osd.21 1846 shutdown"
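
To find which OSDs went through this signal-driven shutdown, the log
lines can be scanned for the two messages quoted in this mail. What
follows is only a minimal sketch: it assumes every relevant line carries
an "osd.<id> <epoch>" prefix like the excerpt above, and the names
find_signal_shutdowns, MARKERS and OSD_RE are hypothetical helpers, not
part of Ceph.

```python
import re

# Assumption: relevant lines contain "osd.<id> <epoch>", as in the excerpt above.
OSD_RE = re.compile(r"\bosd\.(\d+)\s+(\d+)\b")

# The two message fragments quoted in this mail.
MARKERS = (
    "Got signal Interrupt",
    "_committed_osd_maps shutdown OSD via async signal",
)


def find_signal_shutdowns(lines):
    """Return sorted, de-duplicated (osd_id, epoch) pairs for lines containing a marker."""
    hits = set()
    for line in lines:
        if any(marker in line for marker in MARKERS):
            match = OSD_RE.search(line)
            if match:
                hits.add((int(match.group(1)), int(match.group(2))))
    return sorted(hits)


# Sample input: the log lines quoted above.
sample = [
    "-1 received  signal: Interrupt from  PID: 0 task name: <unknown> UID: 0",
    "-1 osd.21 1846 *** Got signal Interrupt ***",
    "0 osd.21 1846 prepare_to_stop starting shutdown",
    "-1 osd.21 1846 shutdown",
]
print(find_signal_shutdowns(sample))
```

Feeding it the lines of each OSD log file in turn would show whether
every OSD in the affected DC stopped at the same map epoch.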

Gr. Stefan

| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info at bit.nl
