[ceph-users] Luminous with osd flapping, slow requests when deep scrubbing

Igor Fedotov ifedotov at suse.de
Mon Oct 15 06:00:57 PDT 2018


Perhaps this is the same issue as indicated here:

https://tracker.ceph.com/issues/36364


Can you check OSD iostat reports for similarities to this ticket, please?
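
For example (just a sketch - interval and device filtering are up to 
you), running something like:

  # extended per-device stats with timestamps, 5-second intervals
  iostat -x -t 5

on an OSD host while a deep scrub is in progress should show whether 
the spinners are saturated (utilisation near 100%, very high await) 
around the time an OSD is reported down.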

Thanks,
Igor

On 10/15/2018 2:26 PM, Andrei Mikhailovsky wrote:
> Hello,
>
> I am currently running Luminous 12.2.8 on Ubuntu with the 
> 4.15.0-36-generic kernel from the official Ubuntu repo. The cluster 
> has 4 mon + osd servers. Each osd server has a total of 9 spinning 
> osds and 1 ssd serving the hdd and ssd pools. The hdds are backed by 
> S3710 ssds for journaling at a ratio of 1:5. The ssd pool osds do not 
> use external journals. Ceph is used as the primary storage for 
> CloudStack - all VM disk images are stored on the cluster.
>
> I have recently migrated all osds to BlueStore, which was a long 
> process with ups and downs, but I am happy to say that the migration 
> is done. During the migration I disabled scrubbing (both deep and 
> standard). After re-enabling scrubbing I noticed that the cluster 
> started having a large number of slow requests and poor client IO 
> (to the point of VMs stalling for minutes). Further investigation 
> showed that the slow requests happen because the osds are flapping. 
> In a single day my logs have over 1000 entries reporting an osd 
> going down, and it affects random osds. Disabling deep scrubbing 
> stabilises the cluster: the osds no longer flap and the slow requests 
> disappear. As a short-term workaround I have left deep scrubbing 
> disabled, but I was hoping to fix the underlying issue with your help.
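>
> (For reference, the cluster-wide scrub flags I am toggling are the 
> standard ones:
>
> ceph osd set noscrub           # stop scheduling regular scrubs
> ceph osd set nodeep-scrub      # stop scheduling deep scrubs
> ceph osd unset noscrub         # re-enable regular scrubs
> ceph osd unset nodeep-scrub    # re-enable deep scrubs
>
> with only nodeep-scrub currently set.)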
>
> At the moment, I am running the cluster with default settings apart 
> from the following:
>
> [global]
> osd_disk_thread_ioprio_priority = 7
> osd_disk_thread_ioprio_class = idle
> osd_recovery_op_priority = 1
>
> [osd]
> debug_ms = 0
> debug_auth = 0
> debug_osd = 0
> debug_bluestore = 0
> debug_bluefs = 0
> debug_bdev = 0
> debug_rocksdb = 0
>
>
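> The scrub throttling options I have seen mentioned elsewhere, but 
> have not tried yet, would be along these lines (values below are 
> purely illustrative placeholders, not what I run and not 
> recommendations):
>
> [osd]
> osd_max_scrubs = 1
> osd_scrub_sleep = 0.1
> osd_scrub_during_recovery = false
> osd_scrub_load_threshold = 0.5
> osd_scrub_begin_hour = 22
> osd_scrub_end_hour = 6
>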
> Could you share your experiences with deep scrubbing of BlueStore 
> osds? Are there any options I should set so that the osds stop 
> flapping while client IO remains usable?
>
> Thanks
>
> Andrei
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
