[ceph-users] ceph zstd not for bluestor due to performance reasons

Sage Weil sage at newdream.net
Sun Nov 12 08:55:05 PST 2017


On Wed, 25 Oct 2017, Sage Weil wrote:
> On Wed, 25 Oct 2017, Stefan Priebe - Profihost AG wrote:
> > Hello,
> > 
> > in the Luminous release notes it is stated that zstd is not supported by
> > BlueStore for performance reasons. I'm wondering why btrfs instead
> > states that zstd is as fast as lz4 but compresses as well as zlib.
> > 
> > Why, then, is zlib supported by BlueStore? And why do btrfs / Facebook
> > behave differently?
> > 
> > "BlueStore supports inline compression using zlib, snappy, or LZ4. (Ceph
> > also supports zstd for RGW compression but zstd is not recommended for
> > BlueStore for performance reasons.)"
> 
> zstd will work but in our testing the performance wasn't great for 
> bluestore in particular.  The problem was that for each compression run 
> there is a relatively high start-up cost initializing the zstd 
> context/state (IIRC a memset of a huge memory buffer) that dominated the 
> execution time... primarily because bluestore is generally compressing 
> pretty small chunks of data at a time, not big buffers or streams.
> 
> Take a look at unittest_compression timings on compressing 16KB buffers 
> (smaller than bluestore usually needs, but illustrative of the problem):
> 
> [ RUN      ] Compressor/CompressorTest.compress_16384/0
> [plugin zlib (zlib/isal)]
> [       OK ] Compressor/CompressorTest.compress_16384/0 (294 ms)
> [ RUN      ] Compressor/CompressorTest.compress_16384/1
> [plugin zlib (zlib/noisal)]
> [       OK ] Compressor/CompressorTest.compress_16384/1 (1755 ms)
> [ RUN      ] Compressor/CompressorTest.compress_16384/2
> [plugin snappy (snappy)]
> [       OK ] Compressor/CompressorTest.compress_16384/2 (169 ms)
> [ RUN      ] Compressor/CompressorTest.compress_16384/3
> [plugin zstd (zstd)]
> [       OK ] Compressor/CompressorTest.compress_16384/3 (4528 ms)
> 
> It's an order of magnitude slower than zlib (with isal) or snappy, which 
> probably isn't acceptable--even if the output is a bit smaller.
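
(A rough way to see the small-buffer effect described above is to time many
independent 16KB compress calls against a single call over the same bytes.
The sketch below is not unittest_compression; it uses Python's stdlib zlib
as a stand-in because zstd bindings are not assumed to be installed, and the
helper names make_data, time_chunked and time_single are made up for
illustration. Any per-call setup cost shows up as the gap between the two
timings.)

```python
import os
import time
import zlib

CHUNK = 16 * 1024  # 16KB, the buffer size used in the timings above


def make_data(total_bytes):
    """Build semi-compressible test data: one random 1KB block, repeated."""
    block = os.urandom(1024)
    reps = total_bytes // len(block) + 1
    return (block * reps)[:total_bytes]


def time_chunked(data, chunk=CHUNK, level=5):
    """Compress each chunk with its own call; return (seconds, output bytes)."""
    start = time.perf_counter()
    out = 0
    for off in range(0, len(data), chunk):
        # Every call sets up fresh compressor state, as a per-blob
        # compressor would when handed small independent buffers.
        out += len(zlib.compress(data[off:off + chunk], level))
    return time.perf_counter() - start, out


def time_single(data, level=5):
    """Compress the whole buffer in one call; return (seconds, output bytes)."""
    start = time.perf_counter()
    out = len(zlib.compress(data, level))
    return time.perf_counter() - start, out


if __name__ == "__main__":
    data = make_data(256 * CHUNK)
    t_chunked, n_chunked = time_chunked(data)
    t_single, n_single = time_single(data)
    print("chunked: %.4f s, %d bytes" % (t_chunked, n_chunked))
    print("single:  %.4f s, %d bytes" % (t_single, n_single))
```

(With zlib the gap is usually modest; the point is only the shape of the
measurement. Swapping in a zstd binding, if one is available, would be the
way to check the behaviour discussed in this thread.)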

Update!  Zstd developer Yann Collet debugged this and it turns out it was 
a build issue, fixed by https://github.com/ceph/ceph/pull/18879/files 
(missing quotes!  yeesh).  The results now look quite good!

[ RUN      ] Compressor/CompressorTest.compress_16384/0
[plugin zlib (zlib/isal)]
[       OK ] Compressor/CompressorTest.compress_16384/0 (370 ms)
[ RUN      ] Compressor/CompressorTest.compress_16384/1
[plugin zlib (zlib/noisal)]
[       OK ] Compressor/CompressorTest.compress_16384/1 (1926 ms)
[ RUN      ] Compressor/CompressorTest.compress_16384/2
[plugin snappy (snappy)]
[       OK ] Compressor/CompressorTest.compress_16384/2 (163 ms)
[ RUN      ] Compressor/CompressorTest.compress_16384/3
[plugin zstd (zstd)]
[       OK ] Compressor/CompressorTest.compress_16384/3 (723 ms)

Not as fast as snappy, but somewhere between Intel-accelerated zlib and 
non-accelerated zlib, with better compression ratios.
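
(For a quick sense of scale, the relative timings can be worked out from the
two runs above. This is just arithmetic on the numbers quoted in this mail;
the dictionary and function names are arbitrary, and the two runs are single
samples, so treat the ratios as rough.)

```python
# Wall-clock times in ms for compress_16384, copied from the two runs above.
before = {"zlib/isal": 294, "zlib/noisal": 1755, "snappy": 169, "zstd": 4528}
after = {"zlib/isal": 370, "zlib/noisal": 1926, "snappy": 163, "zstd": 723}


def ratio(timings, plugin, baseline):
    """How many times longer `plugin` took than `baseline` in one run."""
    return timings[plugin] / timings[baseline]


# Speedup of zstd itself between the broken and the fixed build.
zstd_speedup = before["zstd"] / after["zstd"]

if __name__ == "__main__":
    print("zstd speedup after the fix: %.1fx" % zstd_speedup)
    for base in ("snappy", "zlib/isal", "zlib/noisal"):
        print("zstd vs %-11s before: %5.2fx  after: %5.2fx"
              % (base, ratio(before, "zstd", base), ratio(after, "zstd", base)))
```

(So zstd goes from roughly 15x slower than isal zlib to roughly 2x slower,
and ends up well under half the time of non-accelerated zlib.)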

Also, the zstd compression level is currently hard-coded to level 5.  
That should be fixed at some point.

We can backport this to luminous so it's available in 12.2.3.

sage

