[ceph-users] Bluestore WAL/DB decisions

Erik McCormick emccormick at cirrusseven.com
Fri Mar 29 07:37:19 PDT 2019


On Fri, Mar 29, 2019 at 1:48 AM Christian Balzer <chibi at gol.com> wrote:
>
> On Fri, 29 Mar 2019 01:22:06 -0400 Erik McCormick wrote:
>
> > Hello all,
> >
> > Having dug through the documentation and reading mailing list threads
> > until my eyes rolled back in my head, I am left with a conundrum
> > still. Do I separate the DB / WAL or not?
> >
> You clearly didn't find this thread. The most significant post is linked
> here, but read it all:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-March/033799.html
>
> In short, a 30GB DB (and thus WAL) partition should do the trick for many
> use cases and will still be better than nothing.
>

Thanks for the link. I actually had seen it, but since it mentioned
the 4% figure, and my OSDs are larger than those of the original
poster there, I was still concerned that anything I could throw at it
would be insufficient. I have a few OSDs that I've created with the DB
on the device, and this is what they ended up with after backfilling:

Smallest:
        "db_total_bytes": 320063143936,
        "db_used_bytes": 1783627776,

Biggest:
        "db_total_bytes": 320063143936,
        "db_used_bytes": 167883309056,

So given that the biggest is ~160GB in size already, I wasn't certain
if it would be better to have some with only ~20% of it split off onto
an SSD, or leave it all together on the slower disk. I have a new
cluster I'm building out with the same hardware, so I guess I'll see
how it goes with a small DB unless anyone comes back and says it's a
terrible idea ;).
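
For anyone who wants to redo the arithmetic, here is a rough sketch in
Python. It uses the two db_used_bytes values above, the 30GB partition
suggested in the quoted reply, and the 4% guideline from my original
message. The function and variable names are made up for illustration.
It also treats the DB as a flat byte count, which is an assumption: how
much of the DB actually lands on the fast device depends on how
BlueFS/RocksDB places its data, so take the "on SSD" share as a
back-of-envelope figure only.

```python
# Counters copied from the OSD output above (bytes).
DB_TOTAL = 320063143936
DB_USED = {"smallest": 1783627776, "biggest": 167883309056}

GB = 10**9    # decimal gigabyte
GIB = 2**30   # binary gibibyte

# Assumption: 30 GB of SSD per OSD, as suggested in the quoted reply.
SSD_PARTITION = 30 * GB


def db_report(used_bytes, total_bytes, ssd_bytes):
    """Summarise DB usage and the share that a given SSD partition could hold.

    Simplification: the DB is treated as a flat byte count.
    """
    return {
        "used_gb": used_bytes / GB,
        "used_gib": used_bytes / GIB,
        "pct_of_db": 100 * used_bytes / total_bytes,
        "pct_on_ssd": 100 * min(used_bytes, ssd_bytes) / used_bytes,
    }


reports = {
    name: db_report(used, DB_TOTAL, SSD_PARTITION)
    for name, used in DB_USED.items()
}

for name, r in reports.items():
    print(
        f"{name}: {r['used_gb']:.1f} GB ({r['used_gib']:.1f} GiB), "
        f"{r['pct_of_db']:.1f}% of DB space, "
        f"{r['pct_on_ssd']:.1f}% would fit in 30 GB"
    )

# 4% guideline applied to one node: 8 OSDs x 8 TB each.
four_pct_tb = 0.04 * 8 * 8
print(f"4% of 8 x 8 TB = {four_pct_tb:.2f} TB of SSD per node")
```

By this count the biggest DB is a bit over half of its allotted space,
and a 30GB partition would cover a little under a fifth of it, which is
where my ~20% comes from.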

-Erik

> Christian
>
> > I had a bunch of nodes running filestore with 8 x 8TB spinning OSDs
> > and 2 x 240 GB SSDs. I had put the OS on the first SSD, and then split
> > the journals on the remaining SSD space.
> >
> > My initial minimal understanding of Bluestore was that one should
> > stick the DB and WAL on an SSD, and if it filled up it would just
> > spill back onto the OSD itself where it otherwise would have been
> > anyway.
> >
> > So now I start digging and see that the minimum recommended size is 4%
> > of OSD size. For me that's ~2.6 TB of SSD. Clearly I do not have that
> > available to me.
> >
> > I've also read that it's not so much the data size that matters but
> > the number of objects and their size. Just looking at my current usage
> > and extrapolating that to my maximum capacity, I get to ~1.44 million
> > objects / OSD.
> >
> > So the question is, do I:
> >
> > 1) Put everything on the OSD and forget the SSDs exist.
> >
> > 2) Put just the WAL on the SSDs
> >
> > 3) Put the DB (and therefore the WAL) on SSD, ignore the size
> > recommendations, and just give each as much space as I can. Maybe 48GB
> > / OSD.
> >
> > 4) Some scenario I haven't considered.
> >
> > Is the penalty for a too small DB on an SSD partition so severe that
> > it's not worth doing?
> >
> > Thanks,
> > Erik
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users at lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
> --
> Christian Balzer        Network/Systems Engineer
> chibi at gol.com           Rakuten Communications

