[ceph-users] Disabling write cache on SATA HDDs reduces write latency 7 times

Ashley Merrick singapore at amerrick.co.uk
Sun Nov 11 02:43:09 PST 2018


I don’t have any SSDs in the cluster to test with.

Also, without knowing exactly why enabling the write cache has such a
negative effect, I couldn’t say whether the same would hold for SSDs.
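
For anyone who wants to try this on spinning disks only, here is a minimal
sketch that uses the kernel's /sys/block/<dev>/queue/rotational flag to pick
out HDDs. The function name and the SYSFS_BLOCK and APPLY variables are
made up for this example; by default it only prints the hdparm commands it
would run. The rotational flag can be misreported behind some controllers,
so check the printed list before applying anything.

```shell
#!/bin/sh
# Sketch: disable the volatile write cache on rotational sd* devices only.
# SYSFS_BLOCK and APPLY are hypothetical knobs for this example:
#   SYSFS_BLOCK  directory to scan (default /sys/block)
#   APPLY=1      actually run hdparm instead of printing the command
disable_hdd_write_cache() {
    sysfs="${SYSFS_BLOCK:-/sys/block}"
    for dev in "$sysfs"/sd*; do
        # Skip entries without a readable rotational flag
        # (this also covers the case where the glob matched nothing).
        [ -r "$dev/queue/rotational" ] || continue
        # rotational is 1 for spinning disks and 0 for SSDs.
        if [ "$(cat "$dev/queue/rotational")" = "1" ]; then
            name="${dev##*/}"
            if [ "${APPLY:-0}" = "1" ]; then
                hdparm -W 0 "/dev/$name"
            else
                echo "hdparm -W 0 /dev/$name"
            fi
        fi
    done
}
```

Calling `disable_hdd_write_cache` previews the commands; running
`APPLY=1 disable_hdd_write_cache` as root applies them. The setting may not
survive a reboot or power cycle, so it would likely need to be reapplied at
boot.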

On Sun, 11 Nov 2018 at 6:41 PM, Marc Roos <M.Roos at f1-outsourcing.eu> wrote:

>
>
> Does it make sense to test disabling this on an HDD cluster only?
>
>
> -----Original Message-----
> From: Ashley Merrick [mailto:singapore at amerrick.co.uk]
> Sent: Sunday, 11 November 2018 6:24
> To: vitalif at yourcmc.ru
> Cc: ceph-users at lists.ceph.com
> Subject: Re: [ceph-users] Disabling write cache on SATA HDDs reduces
> write latency 7 times
>
> I've just worked out that I had the same issue; I've been trying to find
> the cause for the past few days!
>
> However, I am using brand-new enterprise Toshiba drives with 256MB write
> cache, and was seeing I/O wait peaks of 40% even during a small write
> operation to Ceph, with commit / apply latencies of 40ms+.
>
> I just went through and disabled the write cache on each drive, then ran
> a few tests: exactly the same write performance, but I/O wait is now <1%
> and commit / apply latencies are 1-3ms max.
>
> Something somewhere definitely doesn't seem to like the write cache
> being enabled on the disks. This is an EC pool on the latest Mimic
> version.
>
> On Sun, Nov 11, 2018 at 5:34 AM Vitaliy Filippov <vitalif at yourcmc.ru>
> wrote:
>
>
>         Hi
>
>         A weird thing happens in my test cluster made from desktop hardware.
>
>         The command `for i in /dev/sd?; do hdparm -W 0 $i; done` increases
>         single-thread write iops (reduces latency) 7 times!
>
>         It is a 3-node cluster with Ryzen 2700 CPUs, 3x SATA 7200rpm HDDs +
>         1x SATA desktop SSD for system and ceph-mon + 1x SATA server SSD for
>         block.db/wal in each host. Hosts are linked by 10gbit ethernet (not
>         the fastest one though, average RTT according to flood-ping is
>         0.098ms). Ceph and OpenNebula are installed on the same hosts, OSDs
>         are prepared with ceph-volume and bluestore with default options.
>         SSDs have capacitors ('power-loss protection'), write cache is
>         turned off for them since the very beginning (hdparm -W 0 /dev/sdb).
>         They're quite old, but each of them is capable of delivering ~22000
>         iops in journal mode (fio -sync=1 -direct=1 -iodepth=1 -bs=4k
>         -rw=write).
>
>         However, RBD single-threaded random-write benchmark originally gave
>         awful results - when testing with `fio -ioengine=libaio -size=10G
>         -sync=1 -direct=1 -name=test -bs=4k -iodepth=1 -rw=randwrite
>         -runtime=60 -filename=./testfile` from inside a VM, the result was
>         only 58 iops average (17ms latency). This was not what I expected
>         from the HDD+SSD setup.
>
>         But today I tried to play with cache settings for data disks. And I
>         was really surprised to discover that just disabling HDD write cache
>         (hdparm -W 0 /dev/sdX for all HDD devices) increases single-threaded
>         performance ~7 times! The result from the same VM (without even
>         rebooting it) is iops=405, avg lat=2.47ms. That's nearly an order of
>         magnitude faster, and in fact 2.5ms seems sort of an expected number.
>
>         As I understand it, 4k writes are always deferred at the default
>         setting of bluestore_prefer_deferred_size_hdd=32768, which means they
>         should only need to be written to the journal device before the OSD
>         acks the write operation.
>
>         So my question is WHY? Why does HDD write cache affect commit latency
>         with WAL on an SSD?
>
>         I would also appreciate it if anybody with a similar setup (HDD+SSD
>         with desktop SATA controllers or HBA) could test the same thing...
>
>         --
>         With best regards,
>            Vitaliy Filippov
>         _______________________________________________
>         ceph-users mailing list
>         ceph-users at lists.ceph.com
>         http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>
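
As a quick sanity check on the figures quoted in the original message: with
iodepth=1 and sync writes, each write has to complete before the next one is
issued, so average latency is roughly 1000 / iops milliseconds. The helper
names below are made up for this sketch; the numbers are the ones reported
above.

```shell
#!/bin/sh
# Sketch: relate iops and average latency for a strictly serial workload
# (iodepth=1). Function names are hypothetical, chosen for this example.

# Average latency in ms implied by a given iops figure.
iops_to_ms() {
    LC_ALL=C awk -v iops="$1" 'BEGIN { printf "%.2f\n", 1000 / iops }'
}

# Speedup factor going from the first iops figure to the second.
speedup() {
    LC_ALL=C awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", after / before }'
}

iops_to_ms 58      # prints 17.24 (write cache on, reported as 17ms)
iops_to_ms 405     # prints 2.47  (write cache off, reported as 2.47ms)
speedup 58 405     # prints 7.0   (the "7 times" in the subject)
```

The reported latencies match the iops figures, which suggests the whole
improvement comes from per-write commit latency rather than from any change
in parallelism.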