[ceph-users] slow ops after cephfs snapshot removal

Gregory Farnum gfarnum at redhat.com
Fri Nov 9 13:38:12 PST 2018


On Fri, Nov 9, 2018 at 2:24 AM Kenneth Waegeman <kenneth.waegeman at ugent.be>
wrote:

> Hi all,
>
> On Mimic 13.2.1, we are seeing blocked ops on cephfs after removing some
> snapshots:
>
> [root@osd001 ~]# ceph -s
>    cluster:
>      id:     92bfcf0a-1d39-43b3-b60f-44f01b630e47
>      health: HEALTH_WARN
>              5 slow ops, oldest one blocked for 1162 sec, mon.mds03 has
> slow ops
>
>    services:
>      mon: 3 daemons, quorum mds01,mds02,mds03
>      mgr: mds02(active), standbys: mds03, mds01
>      mds: ceph_fs-2/2/2 up  {0=mds03=up:active,1=mds01=up:active}, 1
> up:standby
>      osd: 544 osds: 544 up, 544 in
>
>    io:
>      client:   5.4 KiB/s wr, 0 op/s rd, 0 op/s wr
>
> [root@osd001 ~]# ceph health detail
> HEALTH_WARN 5 slow ops, oldest one blocked for 1327 sec, mon.mds03 has
> slow ops
> SLOW_OPS 5 slow ops, oldest one blocked for 1327 sec, mon.mds03 has slow
> ops
>
> [root@osd001 ~]# ceph -v
> ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic
> (stable)
>
> Is this a known issue?
>

It's not exactly a known issue, but from the output and story you've got
here it looks like either the OSDs are deleting the snapshot data too fast
and the MDS isn't getting replies back quickly enough, or you have an
overlarge CephFS directory which is taking a long time to clean up somehow.
Either way, you should dump the MDS's in-flight ops and its in-flight
objecter requests and see what specifically is taking so long.
-Greg
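
For reference, those in-flight ops can be pulled through the daemons' admin
sockets; a minimal sketch, assuming the daemon names from the status output
above (mon.mds03, mds.mds03) and that each command is run on the host where
that daemon lives, with the admin socket in its default location:

# show the mon's op tracker, including the ops flagged slow in HEALTH_WARN
ceph daemon mon.mds03 ops

# dump the current MDS ops in flight
ceph daemon mds.mds03 dump_ops_in_flight

# dump the MDS's outstanding objecter requests (its in-flight OSD ops)
ceph daemon mds.mds03 objecter_requests

# if snapshot trimming is the suspect, PGs actively trimming report a
# snaptrim or snaptrim_wait state
ceph pg dump pgs_brief | grep snaptrim

If the objecter requests pile up against a few OSDs, that points at slow
snap trimming on those OSDs rather than at the MDS itself.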


>
> Cheers,
>
> Kenneth
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>