[ceph-users] How to use rados_aio_write correctly?

Alexander Kushnirenko kushnirenko at gmail.com
Tue Oct 3 03:14:34 PDT 2017


I'm working on third-party code (the Bareos Storage daemon) which gives
very low write speeds for Ceph.  The code was written to demonstrate that
it is possible, but the speed is about 3-9 MB/s, which is too slow.  I
modified the routine to use rados_aio_write instead of rados_write, and was
able to back up and restore data successfully at about 30 MB/s, which is
what I would expect given a 1GB/s network and the rados bench results.  I
studied the examples in the documentation and on GitHub, but I'm still
afraid that my code is working merely by accident.  Could someone comment
on the following questions:

Q1. The storage daemon sends write requests of 64K each, so the current
code works like this:

rados_write(....., buffer, len=64K, offset=0)
rados_write(....., buffer, len=64K, offset=64K)
rados_write(....., buffer, len=64K, offset=128K)
... and so on ...

What is the correct way to use AIO: with one completion or with several?
Version 1:

rados_aio_create_completion(NULL, NULL, NULL, &comp);
rados_aio_write(....., comp, buffer, len=64K, offset=0)
rados_aio_write(....., comp, buffer, len=64K, offset=64K)
rados_aio_write(....., comp, buffer, len=64K, offset=128K)
rados_aio_wait_for_complete(comp);    // wait for Async IO in memory
rados_aio_wait_for_safe(comp);        // and on disk

Version 2:
rados_aio_create_completion(NULL, NULL, NULL, &comp1);
rados_aio_create_completion(NULL, NULL, NULL, &comp2);
rados_aio_create_completion(NULL, NULL, NULL, &comp3);
rados_aio_write(....., comp1, buffer, len=64K, offset=0)
rados_aio_write(....., comp2, buffer, len=64K, offset=64K)
rados_aio_write(....., comp3, buffer, len=64K, offset=128K)
rados_aio_write(....., comp1, buffer, len=64K, offset=192K)
rados_aio_write(....., comp2, buffer, len=64K, offset=256K)
rados_aio_write(....., comp3, buffer, len=64K, offset=320K)
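
To make the Version 2 idea concrete, here is a minimal sketch of a bounded
window of in-flight writes.  It assumes, as I read the librados
documentation, that a completion describes the state of a single
asynchronous operation, so a slot is waited on and its result checked
before the slot is used again (unlike Version 2 above, which reuses comp1
without waiting).  The names aio_backend, write_windowed, mem_ctx,
mem_submit, mem_reap and WINDOW are hypothetical and not part of librados;
the comments say which real librados calls would sit behind each hook.  The
in-memory backend is only there so the sketch runs without a cluster.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WINDOW 3   /* hypothetical: maximum number of writes in flight */

/* Hypothetical backend hooks, not part of librados. With librados,
 * submit() would create a fresh completion for the slot with
 * rados_aio_create_completion() and pass it to rados_aio_write();
 * reap() would call rados_aio_wait_for_complete(), read the result with
 * rados_aio_get_return_value(), then rados_aio_release() the completion. */
typedef struct {
    int (*submit)(void *ctx, int slot, const char *buf, size_t len, uint64_t off);
    int (*reap)(void *ctx, int slot);   /* result of the write held by slot */
    void *ctx;
} aio_backend;

/* Write `total` bytes in pieces of at most `chunk` bytes, keeping at most
 * WINDOW writes in flight. Returns 0, or the first negative error seen. */
int write_windowed(const aio_backend *be, const char *data, size_t total, size_t chunk)
{
    int busy[WINDOW] = {0};
    int rc = 0;
    size_t off = 0;

    if (chunk == 0)
        return -1;
    for (size_t n = 0; off < total; n++) {
        int slot = (int)(n % WINDOW);
        if (busy[slot]) {               /* oldest write: finish it before reuse */
            busy[slot] = 0;
            rc = be->reap(be->ctx, slot);
            if (rc < 0)
                break;
        }
        size_t len = total - off < chunk ? total - off : chunk;
        rc = be->submit(be->ctx, slot, data + off, len, (uint64_t)off);
        if (rc < 0)
            break;
        busy[slot] = 1;
        off += len;
    }
    /* Drain: every outstanding write is waited on and its result checked. */
    for (int s = 0; s < WINDOW; s++) {
        if (busy[s]) {
            int r = be->reap(be->ctx, s);
            if (rc >= 0 && r < 0)
                rc = r;
        }
    }
    return rc < 0 ? rc : 0;
}

/* Hypothetical in-memory backend, used only to exercise the logic above. */
typedef struct {
    char store[1024];
    int in_flight, max_in_flight;
} mem_ctx;

int mem_submit(void *ctx, int slot, const char *buf, size_t len, uint64_t off)
{
    mem_ctx *m = ctx;
    (void)slot;
    if (off > sizeof m->store || len > sizeof m->store - off)
        return -1;                      /* write would not fit */
    memcpy(m->store + off, buf, len);
    if (++m->in_flight > m->max_in_flight)
        m->max_in_flight = m->in_flight;
    return 0;
}

int mem_reap(void *ctx, int slot)
{
    mem_ctx *m = ctx;
    (void)slot;
    m->in_flight--;
    return 0;
}
```

The point of the design is that the result of each write is read from its
own completion after waiting, so a failed write is noticed even though the
submit call itself succeeded.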

Q2.  Problem of maximum object size.  When I use rados_write I get an error
when I exceed the maximum object size (132MB in Luminous).  But when I use
rados_aio_write it happily goes beyond the object size limit: it actually
writes nothing, yet it does not report any error.  Is there a way to catch such
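
Independently of how the asynchronous error is reported, a client-side
guard is one way to avoid ever issuing a write past the limit.  The sketch
below is only an illustration under assumptions: OBJ_CAP is a hypothetical
cap chosen to stay below the cluster's configured maximum, and obj_span,
map_write and fits_in_object are hypothetical helper names, not librados
calls.  It checks whether a request fits in one object and maps a logical
volume offset onto a sequence of capped objects.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap per object. The real limit is a cluster setting, so
 * this value is an assumption and should not exceed the configured one. */
#define OBJ_CAP (64ULL * 1024 * 1024)

typedef struct {
    uint64_t index;   /* which object of the volume (0, 1, 2, ...) */
    uint64_t offset;  /* offset inside that object */
    size_t   len;     /* bytes of the request that fit in that object */
} obj_span;

/* Returns 1 if [off, off + len) lies inside one object of size cap. */
int fits_in_object(uint64_t off, size_t len, uint64_t cap)
{
    return off <= cap && len <= cap - off;
}

/* Map a write at logical offset off to the first object it touches.
 * A write that crosses an object boundary is clipped; the caller issues
 * the remaining bytes as a further request at off + len. */
obj_span map_write(uint64_t off, size_t len)
{
    obj_span s;
    uint64_t room;

    s.index = off / OBJ_CAP;
    s.offset = off % OBJ_CAP;
    room = OBJ_CAP - s.offset;
    s.len = len < room ? len : (size_t)room;
    return s;
}
```

With 64K requests and a cap that is a multiple of 64K, a request never
straddles two objects, so the clipping branch only matters for odd sizes.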
