[ceph-users] [Cbt] Poor libRBD write performance
mnelson at redhat.com
Mon Nov 20 08:16:12 PST 2017
On 11/20/2017 10:06 AM, Moreno, Orlando wrote:
> Hi all,
> I’ve been seeing odd performance when running the FIO RBD engine
> directly against an RBD volume with numjobs > 1. For a 4KB random
> write test at QD 32 with numjobs=1, I get about 40K IOPS, but when I
> increase numjobs to 4, it plummets to 2800 IOPS. I tried running the
> same exact test on a VM using FIO libaio targeting a block device
> (volume) attached through QEMU/RBD and I get ~35K-40K IOPS in both
> situations. In all cases, CPU was not fully utilized and there were no
> signs of any hardware bottlenecks. I did not disable any RBD features
> and most of the Ceph parameters are default (besides auth, debug, pool
> size, etc).
> My Ceph cluster is running on 6 nodes, all-NVMe, 22-core, 376GB mem,
> Luminous 12.2.1, Ubuntu 16.04, and clients running FIO job/VM on similar
> HW/SW spec. The VM has 16 vCPU, 64GB mem, and the root disk is locally
> stored while the persistent disk comes from an RBD volume serviced by
> the Ceph cluster.
> If anyone has seen this issue or has any suggestions, please let me know.
Try disabling the exclusive-lock feature on the RBD image and see if that
helps (if only to confirm that's what's going on). To avoid this, I usually
test with numjobs=1 and run multiple fio instances with higher iodepth
values instead.
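
As a rough sketch of both suggestions, the Python below generates one fio job file per instance (each with numjobs=1) and the rbd commands to turn off exclusive-lock on a test image. The image names (fio-test-N), the pool name "rbd", the client name "admin", the 60-second runtime, and the choice of one image per fio instance are assumptions for illustration, not details from this thread. The feature-disable order assumes object-map and fast-diff are enabled on the image (they are in the Luminous default feature set) and must be turned off before exclusive-lock.

```python
import shlex


def fio_rbd_job(image, pool="rbd", client="admin", iodepth=32, runtime=60):
    """Return fio job-file text for a single instance with numjobs=1.

    Pool, client, and image names are placeholders; adjust for your cluster.
    """
    lines = [
        f"[randwrite-{image}]",
        "ioengine=rbd",
        f"clientname={client}",
        f"pool={pool}",
        f"rbdname={image}",
        "rw=randwrite",
        "bs=4k",
        f"iodepth={iodepth}",  # raise this instead of raising numjobs
        "numjobs=1",
        "time_based=1",
        f"runtime={runtime}",
    ]
    return "\n".join(lines) + "\n"


def fio_plan(images, **job_opts):
    """Map job-file name -> (job text, fio argv), one fio process per image."""
    plan = {}
    for image in images:
        path = f"{image}.fio"
        plan[path] = (fio_rbd_job(image, **job_opts), ["fio", path])
    return plan


def disable_exclusive_lock_cmds(pool, image):
    """Return rbd argv lists that disable exclusive-lock on one image.

    fast-diff depends on object-map, which depends on exclusive-lock, so
    they are disabled in that order. If journaling is enabled on the image,
    it would have to be disabled first as well (not handled here).
    """
    spec = f"{pool}/{image}"
    features = ("fast-diff", "object-map", "exclusive-lock")
    return [["rbd", "feature", "disable", spec, feat] for feat in features]


if __name__ == "__main__":
    images = [f"fio-test-{i}" for i in range(4)]  # hypothetical image names
    for path, (text, argv) in fio_plan(images).items():
        print(f"# contents of {path}")
        print(text)
        print(shlex.join(argv) + " &")  # start the instances in parallel
    for argv in disable_exclusive_lock_cmds("rbd", images[0]):
        print(shlex.join(argv))
```

Running the script only prints the job files and command lines; nothing touches the cluster until you run them yourself. Comparing a numjobs=4 run before and after the feature change should show whether lock contention is the cause.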