[ceph-users] Some questions concerning filestore --> bluestore migration

Mark Nelson mnelson at redhat.com
Fri Oct 5 11:58:08 PDT 2018


FWIW, here are values I measured directly from the RocksDB SST files 
under different small-write workloads (i.e. the ones where you'd expect 
a larger DB footprint):

https://drive.google.com/file/d/1Ews2WR-y5k3TMToAm0ZDsm7Gf_fwvyFw/view?usp=sharing

These tests were only with 256GB of data written to a single OSD, so 
there's no guarantee that it will scale linearly up to 10TB 
(specifically, it's possible that much larger RocksDB databases could 
have higher space amplification).  Also note that the RGW numbers could 
be very dependent on the client workload and are unlikely to be 
universally representative.

Also remember that if you run out of space on your DB partitions, 
you'll just end up putting higher RocksDB levels on the block device.  
Slower, to be sure, but not necessarily worse than filestore's behavior 
(especially in the RGW case, where large object counts will cause PG 
directory splitting chaos).
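The spillover behavior above can be sketched numerically: a higher 
RocksDB level ends up on the slow block device once it no longer fits 
entirely on the block.db partition, so partition sizes between level 
boundaries buy little. This is a rough sketch, assuming Ceph's default 
RocksDB tuning (max_bytes_for_level_base = 256 MB, level multiplier 10 
-- assumptions to check against your release, not figures from this 
thread):

```python
# Sketch: which whole RocksDB levels fit on a given block.db partition.
# Assumes max_bytes_for_level_base = 256 MB and a level multiplier of 10;
# real values depend on the Ceph release and rocksdb options in use.

def levels_that_fit(db_bytes, base=256 * 2**20, mult=10, max_levels=7):
    """Return (list of level sizes that fit, total bytes used)."""
    fit, level_size, used = [], base, 0
    for _ in range(max_levels):
        if used + level_size > db_bytes:
            break  # this level would overflow -> it spills to the slow device
        used += level_size
        fit.append(level_size)
        level_size *= mult
    return fit, used

GiB = 2**30
for part in (20 * GiB, 200 * GiB):
    fit, used = levels_that_fit(part)
    print(f"{part / GiB:.0f} GiB partition: {len(fit)} levels fit "
          f"({used / GiB:.2f} GiB used); higher levels spill to the block device")
```

Under these assumptions, a 20 GiB partition holds only ~2.75 GiB of 
levels, and the next useful step up is near 30 GiB, then near 300 GiB.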

Mark

On 10/05/2018 01:38 PM, solarflow99 wrote:
> oh my.. yes 2TB enterprise-class SSDs, that's a much higher 
> requirement than filestore needed.  That would be cost-prohibitive 
> for any lower-end ceph cluster.
>
>
>
> On Thu, Oct 4, 2018 at 11:19 PM Massimo Sgaravatto 
> <massimo.sgaravatto at gmail.com> wrote:
>
>     Argg !!
>     With 10x10TB SATA disks and 2 SSD disks this would mean 2 TB for 
>     each SSD!
>     If this is really required I am afraid I will keep using filestore ...
>
>     Cheers, Massimo
>
>         On Fri, Oct 5, 2018 at 7:26 AM <ceph at elchaka.de> wrote:
>
>         Hello
>
>         On 4 October 2018 at 02:38:35 MESZ, solarflow99 
>         <solarflow99 at gmail.com> wrote:
>         >I use the same configuration you have, and I plan on using 
>         >bluestore.  My SSDs are only 240GB and it worked with 
>         >filestore all this time, I suspect bluestore should be fine 
>         >too.
>         >
>         >
>         >On Wed, Oct 3, 2018 at 4:25 AM Massimo Sgaravatto 
>         ><massimo.sgaravatto at gmail.com> wrote:
>         >
>         >> Hi
>         >>
>         >> I have a ceph cluster, running luminous, composed of 5 OSD 
>         >> nodes, which is using filestore.
>         >> Each OSD node has 2 E5-2620 v4 processors, 64 GB of RAM, 
>         >> 10x6TB SATA disks + 2x200GB SSD disks (then I have 2 other 
>         >> disks in RAID for the OS), 10 Gbps.
>         >> So each SSD disk is used for the journal for 5 OSDs. With 
>         >> this configuration everything is running smoothly ...
>         >>
>         >> We are now buying some new storage nodes, and I am trying 
>         >> to buy something which is bluestore compliant. So the idea 
>         >> is to consider a configuration something like:
>         >>
>         >> - 10 SATA disks (8TB / 10TB / 12TB each. TBD)
>         >> - 2 processors (~ 10 cores each)
>         >> - 64 GB of RAM
>         >> - 2 SSDs to be used for WAL+DB
>         >> - 10 Gbps
>         >>
>         >> For what concerns the size of the SSD disks, I read in this 
>         >> mailing list that it is suggested to have at least 10GB of 
>         >> SSD disk/10TB of SATA disk.
>         >>
>         >> So, the questions:
>         >>
>         >> 1) Does this hardware configuration seem reasonable?
>         >>
>         >> 2) Are there problems to live (forever, or until filestore 
>         >> deprecation) with some OSDs using filestore (the old ones) 
>         >> and some OSDs using bluestore (the new ones)?
>         >>
>         >> 3) Would you suggest updating to bluestore also the old 
>         >> OSDs, even if the available SSDs are too small (they don't 
>         >> satisfy the "10GB of SSD disk/10TB of SATA disk" rule)?
>
>         AFAIR the DB size should be 4% of the OSD in question.
>
>         So
>
>         For example, if the block size is 1TB, then block.db shouldn’t
>         be less than 40GB
>
>         See:
>         http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/
>
>         Hth
>         - Mehmet
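The 4% guideline quoted above works out as a simple calculation; a 
quick sketch (the device sizes iterated over here are illustrative, 
not taken from the thread):

```python
# Sketch of the "block.db should be at least 4% of the data device" guideline.
# Uses decimal TB/GB to match the quoted example (1TB -> 40GB).

def min_db_gb(osd_tb, ratio=0.04):
    """Minimum suggested block.db size in GB for an OSD of osd_tb TB."""
    return osd_tb * 1000 * ratio  # TB -> GB, then take 4%

for osd in (1, 6, 10):
    print(f"{osd} TB OSD -> at least {min_db_gb(osd):.0f} GB block.db")
```

This is where the 2 TB-per-SSD figure above comes from: 10 x 10TB OSDs 
at 4% is 4 TB of DB space per node, split across 2 SSDs.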
>
>         >>
>         >> Thanks, Massimo
>         >> _______________________________________________
>         >> ceph-users mailing list
>         >> ceph-users at lists.ceph.com
>         >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>         >>


