[ceph-users] [Cbt] Poor libRBD write performance

Mark Nelson mnelson at redhat.com
Mon Nov 20 08:16:12 PST 2017


On 11/20/2017 10:06 AM, Moreno, Orlando wrote:
> Hi all,
>
>
>
> I’ve been seeing weird performance behavior when using the fio RBD
> engine directly against an RBD volume with numjobs > 1. For a 4KB
> random write test at queue depth 32 and numjobs=1, I can get about
> 40K IOPS, but when I increase numjobs to 4, it plummets to 2800 IOPS.
> I tried running the exact same test on a VM using fio with libaio
> against a block device (volume) attached through QEMU/RBD, and I get
> ~35K-40K IOPS in both situations. In all cases, the CPU was not fully
> utilized and there were no signs of any hardware bottlenecks. I did
> not disable any RBD features, and most of the Ceph parameters are at
> their defaults (besides auth, debug, pool size, etc.).
>
>
>
> My Ceph cluster runs on 6 all-NVMe nodes (22 cores, 376GB memory each)
> with Luminous 12.2.1 on Ubuntu 16.04, and the clients running the fio
> jobs/VMs have similar HW/SW specs. The VM has 16 vCPUs and 64GB of
> memory; its root disk is stored locally, while the persistent disk
> comes from an RBD volume served by the Ceph cluster.
>
>
>
> If anyone has seen this issue or has any suggestions, please let me know.

Hi Orlando,

Try disabling the RBD image exclusive-lock feature to see if that helps
(if only to confirm that's what's going on). To avoid the problem, I
usually test with numjobs=1 and run multiple fio instances with higher
iodepth values instead. See:

https://www.spinics.net/lists/ceph-devel/msg30468.html

and

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-September/004872.html
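
As a rough sketch of the first suggestion (the pool/image names below are
just placeholders, and this assumes the images were created with the
Luminous default feature set), you can check and disable the feature with
the rbd CLI.  Note that object-map and fast-diff depend on exclusive-lock,
so they have to be disabled along with it:

    # show which features are enabled on the image
    rbd info rbd/fio_test | grep features

    # disable exclusive-lock plus the features that depend on it
    rbd feature disable rbd/fio_test object-map fast-diff exclusive-lock

For the multiple-instance approach, something along these lines (image
names and runtime are again only illustrative) keeps each fio process at
numjobs=1 while still driving plenty of outstanding I/O:

    for i in 1 2 3 4; do
        fio --name=rbd_randwrite_$i --ioengine=rbd --clientname=admin \
            --pool=rbd --rbdname=fio_test_$i \
            --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 \
            --time_based --runtime=60 &
    done
    wait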

Mark

>
>
>
> Thanks,
>
> Orlando

