[ceph-users] Poor libRBD write performance

Jason Dillaman jdillama at redhat.com
Mon Nov 20 09:01:05 PST 2017


On Mon, Nov 20, 2017 at 12:00 PM, Moreno, Orlando
<orlando.moreno at intel.com> wrote:
> Hi Jason,
>
> You're right, thanks for pointing that out. I could've sworn I saw the same problem with exclusive-lock disabled, but after trying it again, disabling the feature does fix the write performance :)
>
> So does this mean that when an RBD is attached to a VM, it is considered a single client connection?

Yes, the VM (QEMU) is a single client to librbd.
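
For anyone landing on this thread later, here is a minimal sketch of an fio job file for the direct-librbd case that scales load with iodepth rather than numjobs. As I understand it, each fio job opens the image as its own librbd client, so with exclusive-lock enabled the jobs end up handing the lock back and forth, whereas QEMU funnels all guest I/O through one client. The pool, image, and client names below are placeholders (they follow the rbd.fio example shipped with fio), not values from your setup, and the runtime is arbitrary.

```ini
; Sketch only: 4KB random writes through fio's rbd engine.
; pool, rbdname and clientname are placeholders; adjust for your cluster.
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio_test
rw=randwrite
bs=4k
time_based
runtime=60

[rbd-qd32]
; One job = one librbd client, so the exclusive lock has a single owner.
; Raise iodepth, not numjobs, to push more parallel I/O at the image.
numjobs=1
iodepth=32
```

If you really do need numjobs > 1 against a single image, disable exclusive-lock on it first (features that depend on it, such as object-map and fast-diff, have to be disabled along with it), or point each job at its own image.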

> Thanks,
> Orlando
>
> -----Original Message-----
> From: Jason Dillaman [mailto:jdillama at redhat.com]
> Sent: Monday, November 20, 2017 9:10 AM
> To: Moreno, Orlando <orlando.moreno at intel.com>
> Cc: fio at vger.kernel.org; ceph-users at lists.ceph.com; cbt at lists.ceph.com
> Subject: Re: [ceph-users] Poor libRBD write performance
>
> I suspect you are seeing this issue [1]. TL;DR: never use "numjobs" > 1
> against an RBD image that has the exclusive-lock feature enabled.
>
> [1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-August/012123.html
>
> On Mon, Nov 20, 2017 at 11:06 AM, Moreno, Orlando <orlando.moreno at intel.com> wrote:
>> Hi all,
>>
>>
>>
>> I’ve been seeing odd performance when running the FIO RBD engine
>> directly against an RBD volume with numjobs > 1. For a 4KB random
>> write test at queue depth 32 with numjobs=1, I get about 40K IOPS, but
>> when I increase numjobs to 4, it plummets to 2800 IOPS. I ran the exact
>> same test in a VM using FIO libaio against a block device (volume)
>> attached through QEMU/RBD, and I get ~35K-40K IOPS in both cases.
>> In all cases, CPU was not fully utilized and there were no signs of
>> any hardware bottlenecks. I did not disable any RBD features, and most
>> of the Ceph parameters are at their defaults (besides auth, debug, pool size, etc.).
>>
>>
>>
>> My Ceph cluster is running on 6 nodes, all-NVMe, 22-core, 376GB mem,
>> Luminous 12.2.1, Ubuntu 16.04, and clients running FIO job/VM on
>> similar HW/SW spec. The VM has 16 vCPU, 64GB mem, and the root disk is
>> locally stored while the persistent disk comes from an RBD volume
>> serviced by the Ceph cluster.
>>
>>
>>
>> If anyone has seen this issue or has any suggestions, please let me know.
>>
>>
>>
>> Thanks,
>>
>> Orlando
>
>
>
> --
> Jason



-- 
Jason

