[ceph-users] luminous vs jewel rbd performance

Linh Vu vul at unimelb.edu.au
Wed Nov 15 17:31:35 PST 2017


Noticed that you're on 12.2.0, Raf. 12.2.1 fixed a lot of the performance issues we saw with 12.2.0 on Luminous/Bluestore. Have you tried upgrading to it?

________________________________
From: ceph-users <ceph-users-bounces at lists.ceph.com> on behalf of Rafael Lopez <rafael.lopez at monash.edu>
Sent: Thursday, 16 November 2017 11:59:14 AM
To: Mark Nelson
Cc: ceph-users
Subject: Re: [ceph-users] luminous vs jewel rbd performance

Hi Mark,

Sorry for the late reply... I have been away on vacation, at the OpenStack Summit, etc. for over a month and am only now looking at this again.

Yeah the snippet was a bit misleading. The fio file contains small block jobs as well as big block jobs:

[write-rbd1-4m-depth1]
rbdname=rbd-tester-fio
bs=4m
iodepth=1
rw=write
stonewall

[write-rbd2-4m-depth16]
rbdname=rbd-tester-fio-2
bs=4m
iodepth=16
rw=write
stonewall

[read-rbd1-4m-depth1]
rbdname=rbd-tester-fio
bs=4m
iodepth=1
rw=read
stonewall

[read-rbd2-4m-depth16]
rbdname=rbd-tester-fio-2
bs=4m
iodepth=16
rw=read
stonewall

The performance hit is more noticeable with big blocks (I think up to 10x slower on some runs), but as a percentage it seems to affect the small block workload too. I understand that runs will vary... I wish I had more runs from before upgrading to luminous, but I only have that single set of results. Regardless, I cannot come close to that single set of results since upgrading to luminous.
I understand the caching issue you mentioned; however, we have not changed any of that config since the upgrade, and the fio job is exactly the same. So if I do many runs on luminous over the course of a day, including when we think the cluster is least busy, we should be able to come pretty close to the jewel result on at least one of the runs. Or is my thinking flawed?
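
To make the "at least one run should come close" reasoning concrete, here is a minimal sketch of how the single jewel baseline could be compared against a set of luminous runs. The function name and its inputs are hypothetical; it assumes each run is reduced to one higher-is-better number (for example the bandwidth or IOPS fio reports for a given job).

```python
from statistics import median


def compare_to_baseline(baseline, runs):
    """Compare post-upgrade results with a single pre-upgrade baseline.

    baseline: one higher-is-better number from the pre-upgrade run.
    runs:     the same metric from each post-upgrade run of the same job.
    Ratios are relative to the baseline, so 1.0 means "matched it".
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if not runs:
        raise ValueError("need at least one post-upgrade run")
    best = max(runs)
    return {
        "best": best,
        "best_ratio": best / baseline,
        "median_ratio": median(runs) / baseline,
    }
```

If even the best ratio stays well below 1.0 across many runs at different times of day, normal run-to-run variation becomes a less likely explanation than a change in the stack, although a single baseline run cannot rule out that the baseline itself was an outlier.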

Sage mentioned at the OpenStack Summit that there was a perf regression in librbd which will be fixed in 12.2.2. Are you aware of this? If so, can you send me the link to the bug?

Cheers,
Raf


On 22 September 2017 at 00:31, Mark Nelson <mnelson at redhat.com> wrote:
Hi Rafael,

In the original email you mentioned 4M block size, seq read, but here it looks like you are doing 4k writes?  Can you clarify?  If you are doing 4k direct sequential writes with iodepth=1 and are also using librbd cache, please make sure that librbd is set to writeback mode in both cases.  RBD by default will not kick into WB mode until it sees a flush request, and the librbd engine in fio doesn't issue one before a test is started.  It can be pretty easy to end up in a situation where writeback cache is active on some tests but not others if you aren't careful.  I.e., if one of your tests was done after a flush and the other was not, you'd likely see a dramatic difference in performance between the two.

You can avoid this by telling librbd to always use WB mode (at least when benchmarking):

rbd cache writethrough until flush = false
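
As a rough way to confirm that the setting above is actually present on the benchmarking client, the following sketch reads a ceph.conf-style text and looks the option up. The helper name is hypothetical, and it relies on two assumptions: that the option lives in the [client] section, and that spaces and underscores in option names are interchangeable in ceph.conf.

```python
import configparser


def client_option(conf_text, option, section="client"):
    """Return the value of `option` in `section`, or None if it is not set.

    Assumes keys are not indented (configparser treats indented lines as
    continuations of the previous value).
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section(section):
        return None
    # Normalise "rbd cache" and "rbd_cache" to the same spelling.
    wanted = option.replace(" ", "_").lower()
    for key, value in parser.items(section):
        if key.replace(" ", "_").lower() == wanted:
            return value.strip()
    return None
```

Running this against the client's configuration before each benchmark makes it easier to confirm that both sides of a comparison used the same cache mode. It only inspects the file, not any value injected at runtime.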

Mark


On 09/20/2017 01:51 AM, Rafael Lopez wrote:
Hi Alexandre,

Yeah, we are using filestore for the moment with luminous. With regard
to the client, I tried both jewel and luminous librbd versions against the
luminous cluster, with similar results.

I am running fio on a physical machine with fio rbd engine. This is a
snippet of the fio config for the runs (the complete jobfile adds
variations of read/write/block size/iodepth).

[global]
ioengine=rbd
clientname=cinder-volume
pool=rbd-bronze
invalidate=1
ramp_time=5
runtime=30
time_based
direct=1

[write-rbd1-4k-depth1]
rbdname=rbd-tester-fio
bs=4k
iodepth=1
rw=write
stonewall

[write-rbd2-4k-depth16]
rbdname=rbd-tester-fio-2
bs=4k
iodepth=16
rw=write
stonewall
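
For readers who want to reproduce the full set of variations, here is a minimal sketch that generates job sections following the naming pattern of the snippet above. The function name is hypothetical, and the sketch assumes every job uses the same five keys as the two jobs shown; the [global] section still has to be written separately.

```python
def job_sections(rw_modes, block_sizes, targets):
    """Build fio job sections named like the jobs in the snippet above.

    rw_modes:    e.g. ["write", "read"]
    block_sizes: e.g. ["4k", "4m"]
    targets:     (rbdname, iodepth) pairs in jobfile order; the first pair
                 becomes "rbd1", the second "rbd2", and so on.
    """
    lines = []
    for rw in rw_modes:
        for bs in block_sizes:
            for index, (rbdname, iodepth) in enumerate(targets, start=1):
                lines += [
                    f"[{rw}-rbd{index}-{bs}-depth{iodepth}]",
                    f"rbdname={rbdname}",
                    f"bs={bs}",
                    f"iodepth={iodepth}",
                    f"rw={rw}",
                    "stonewall",  # wait for earlier jobs to finish first
                    "",
                ]
    return "\n".join(lines)
```

Calling it with the two images from the snippet, for example `job_sections(["write", "read"], ["4k", "4m"], [("rbd-tester-fio", 1), ("rbd-tester-fio-2", 16)])`, yields eight sections that can be appended after the [global] block.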

Raf

On 20 September 2017 at 16:43, Alexandre DERUMIER <aderumier at odiso.com> wrote:

    Hi,

    So you are also using filestore on luminous?

    Have you also upgraded librbd on the client? (Are you benchmarking
    inside a qemu machine, or directly with fio-rbd?)

    (I'm going to do a lot of benchmarks in the coming week; I'll post
    the results to the mailing list soon.)

    ----- Original Message -----
    From: "Rafael Lopez" <rafael.lopez at monash.edu>
    To: "ceph-users" <ceph-users at lists.ceph.com>
    Sent: Wednesday, 20 September 2017 08:17:23
    Subject: [ceph-users] luminous vs jewel rbd performance

    hey guys.
    wondering if anyone else has done some solid benchmarking of jewel
    vs luminous, in particular on the same cluster that has been
    upgraded (same cluster, client and config).

    we have recently upgraded a cluster from 10.2.9 to 12.2.0, and
    unfortunately i only captured results from a single fio (librbd) run
    with a few jobs in it before upgrading. i have run the same fio
    jobfile many times at different times of the day since upgrading,
    and been unable to produce a close match to the pre-upgrade (jewel)
    run from the same client. one particular job is significantly slower
    (4M block size, iodepth=1, seq read), up to 10x in one run.

    i realise i haven't supplied much detail and it could be dozens of
    things, but i just wanted to see if anyone else had done more
    quantitative benchmarking or had similar experiences. keep in mind
    all we changed was daemons were restarted to use luminous code,
    everything else exactly the same. granted it is possible that
    some/all osds had some runtime config injected that differs from
    now, but i'm fairly confident this is not the case as they were
    recently restarted (on jewel code) after OS upgrades.

    cheers,
    Raf

    _______________________________________________
    ceph-users mailing list
    ceph-users at lists.ceph.com
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
*Rafael Lopez*
Research Devops Engineer
Monash University eResearch Centre

T: +61 3 9905 9118
M: +61 (0)427682670
E: rafael.lopez at monash.edu


