Stefan Kooman stefan at bit.nl
Sun Mar 3 06:20:26 PST 2019

Quoting Patrick Donnelly (pdonnell at redhat.com):
> On Thu, Feb 28, 2019 at 12:49 PM Stefan Kooman <stefan at bit.nl> wrote:
> >
> > Dear list,
> >
> > After upgrading to 12.2.11 the MDSes are reporting slow metadata IOs
> > (MDS_SLOW_METADATA_IO). The metadata IOs would have been blocked for
> > more that 5 seconds. We have one active, and one active standby MDS. All
> > storage on SSD (Samsung PM863a / Intel DC4500). No other (OSD) slow ops
> > reported. The MDSes are underutilized, only a handful of active clients
> > and almost no load (fast hexacore CPU, 256 GB RAM, 20 Gb/s network). The
> > cluster is also far from busy.
> >
> > I've dumped ops in flight on the MDSes but all ops that are printed are
> > finished in a split second (duration: 0.000152), flag_point": "acquired
> > locks".
> I believe you're looking at the wrong "ops" dump. You want to check
> "objector_requests".

I've done that, but not much timing info in there, i.e.:

    "ops": [
            "tid": 416231,
            "pg": "6.791ce4c5",
            "osd": 31,
            "object_id": "200.00024d0f",
            "object_locator": "@6",
            "target_object_id": "200.00024d0f",
            "target_object_locator": "@6",
            "paused": 0,
            "used_replica": 0,
            "precalc_pgid": 0,
            "last_sent": "79790s",
            "attempts": 1,
            "snapid": "head",
            "snap_context": "0=[]",
            "mtime": "2019-03-01 16:39:29.0.41847s",
            "osd_ops": [
                "write 798204~6249 [fadvise_dontneed]"

The MDS_SLOW_METADATA_IO warning has been added in commit

Tracker issue: http://tracker.ceph.com/issues/24879
PR: 23022

It uses the "osd_op_complaint_time" as the "complaint_time". We have set
the osd_op_complaint_time to 5 seconds. 

The tracker issue hints that slow metadata would be caused by a slow OSD
(joural writes of MDS that take too long). However, we do not see any
slow mds ops (data), nor slow ops. We've changed the parameter
"mon_osd_warn_op_age" to 5 seconds as well so slow OSD ops would be
reported as well.

I do not think our OSDs are that slow, that often (~ at least every 15
minutes we would see that warning) ... so I really wonder what's going

For now we've set the osd_op_complaint_time to "30" on the MDSes and do
not see any SLOW METADATA warnings anymore.

Anyone willing to set the osd_op_complaint_time to 5 seconds to see if
those messages will start popping up? ceph tell mds.* config set
osd_op_complaint_time 5 should do the trick ... you would need to be
running 12.2.11 and /or mimic 13.2.2.



P.s. kudos to Wido den Hollander for helping us debug this.

| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info at bit.nl

More information about the ceph-users mailing list