[ceph-users] Journal / WAL drive size?

Rudi Ahlers rudiahlers at gmail.com
Thu Nov 23 22:41:04 PST 2017


Thanx for the advice.

We're aiming for the highest possible performance on the cluster. The
servers have 4-port 10GbE NICs, and I have seen traffic of up to about
9Gb/s on the current single port during testing. So I want to make sure the
"cache drive" can handle the current load (9 fairly heavily loaded physical
servers) and accommodate future growth with ease.

In your testing, once the DB reached 50GB, did you grow it any bigger to
see if it made a difference? Our storage is about 87TB.
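
To put that in perspective, here is a rough back-of-the-envelope sketch
using the ~10GB-of-DB-per-TB ratio you mention below (treated purely as an
assumed rule of thumb for sanity-checking sizes, not a recommendation):

# rough block.db sizing sketch, assuming ~10GB of DB per TB of OSD
osd_size_tb=8
db_gb_per_tb=10
echo "block.db per 8TB OSD:       $((osd_size_tb * db_gb_per_tb)) GB"
echo "DB total for ~87TB storage: $((87 * db_gb_per_tb)) GB"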

There is a lot of network traffic between the mail servers and the spam
servers. There's a Postfix server and an MS Exchange server, and the 3
"mail" servers are constantly busy. As mail comes in, it goes to the spam
filter, runs through a couple of different programs and custom scripts to
check for spam, viruses, malware, etc., and is then delivered to either the
Postfix or the Exchange server depending on destination. Outgoing mail has
an email footer injected and is also checked for spam and viruses. The
delay on email, even internal, is very bad right now. Then there are 4 MS
Windows servers with MS SQL and a lot of database traffic between the
servers, and between the servers and office staff. Month-end accounting
sends the servers into an "overload frenzy" due to stocktaking and
invoicing (both internal and to clients).

On Fri, Nov 24, 2017 at 3:33 AM, David Byte <dbyte at suse.com> wrote:

> The answer tends to be “it depends”. For my test system with 144 6TB
> drives, I use a 50GB DB partition. In another test case, we have a ratio
> of about 10GB per TB. What you have to watch out for is the performance
> drop when the DB overruns the SSD partition.
>
> For the WAL, I provision 2GB and haven’t experienced any issues with that.
> You will probably also need to adjust the ratios, but that was covered in
> other threads previously.
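>
> As a minimal ceph.conf sketch of how those figures could be pinned down
> explicitly (bluestore_block_db_size appears in Caspar's mail further down
> the thread, and bluestore_block_wal_size is its WAL counterpart; the byte
> values below are simply 80GiB and 2GiB, assumed numbers for an 8TB
> spinner, not a recommendation):
>
> [global]
> # ~10GB of DB per TB of OSD -> roughly 80GiB for an 8TB drive (assumption)
> bluestore_block_db_size = 85899345920
> # the 2GB WAL provisioned above
> bluestore_block_wal_size = 2147483648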
>
> David Byte
> Sr. Technical Strategist
> IHV Alliances and Embedded
> SUSE
>
> Sent from my iPhone. Typos are Apple's fault.
>
> On Nov 23, 2017, at 3:19 PM, Rudi Ahlers <rudiahlers at gmail.com> wrote:
>
> Hi Richard,
>
> So do you rely on Ceph to automatically decide the WAL device's
> location and size?
>
> On Thu, Nov 23, 2017 at 4:04 PM, Richard Hesketh <
> richard.hesketh at rd.bbc.co.uk> wrote:
>
>> Keep in mind that as yet we don't really have good estimates for how
>> large bluestore metadata DBs may become, but it will be somewhat
>> proportional to your number of objects. Considering the size of your OSDs,
>> a 15GB block.db partition is almost certainly too small. Unless you've got
>> a compelling reason not to, you should probably partition your SSDs to use
>> all the available space. Personally, I manually partition my block.db
>> devices into equally sized partitions and then invoke creation with:
>>
>> ceph-disk prepare --bluestore /dev/sdX --block.db /dev/disk/by-partuuid/whateveritis
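>>
>> For context, a rough sketch of how such equally sized partitions might be
>> cut up front on one of the ~372GB SSDs (sgdisk syntax; the device name and
>> the two-way split are illustrative, assuming each SSD backs two OSDs):
>>
>> # carve the SSD into two equal ~186GB block.db partitions, one per OSD
>> sgdisk --new=1:0:+186G --new=2:0:0 /dev/sdX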
>>
>> Invoking by UUID is done because, if presented with an existing partition
>> rather than the root block device, ceph-disk will simply symlink to
>> exactly the argument given; so using /dev/sdXYZ arguments is dangerous, as
>> they may not be consistent across hardware changes or even just reboots,
>> depending on your system.
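>>
>> As a small illustration of looking up that by-partuuid path with standard
>> tooling (the /dev/sde1 name is just an example taken from the lsblk output
>> further down):
>>
>> # list the stable by-partuuid symlinks and see which one points at sde1
>> ls -l /dev/disk/by-partuuid/ | grep sde1
>> # or query the partition's PARTUUID directly
>> blkid -s PARTUUID -o value /dev/sde1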
>>
>> Rich
>>
>> On 23/11/17 09:53, Caspar Smit wrote:
>> > Rudi,
>> >
>> > First of all, do not deploy an OSD specifying the same separate device
>> > for both DB and WAL.
>> >
>> > Please read the following to see why:
>> >
>> > http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/
>> >
>> >
>> > That said, you have a fairly large amount of SSD space available, so I
>> > recommend using it as block.db.
>> >
>> > You can specify a fixed block.db size in ceph.conf using:
>> >
>> > [global]
>> > bluestore_block_db_size = 16106127360
>> >
>> > The above is a 15GB block.db size
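>> >
>> > As a quick sanity check of that byte value (plain shell arithmetic):
>> >
>> > echo $((15 * 1024 * 1024 * 1024))   # prints 16106127360, i.e. 15GiB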
>> >
>> > Now, when you deploy an OSD with a separate block.db device, the
>> > partition will be 15GB.
>> >
>> > The default size is a percentage of the device, I believe, and not
>> > always a usable amount.
>> >
>> > Caspar
>> >
>> > Met vriendelijke groet,
>> >
>> > Caspar Smit
>> > Systemengineer
>> > SuperNAS
>> > Dorsvlegelstraat 13
>> > 1445 PA Purmerend
>> >
>> > t: (+31) 299 410 414
>> > e: casparsmit at supernas.eu
>> > w: www.supernas.eu
>> >
>> > 2017-11-23 10:27 GMT+01:00 Rudi Ahlers <rudiahlers at gmail.com>:
>> >
>> >     Hi,
>> >
>> >     Can someone please explain this to me in layman's terms: how big a
>> >     WAL drive do I really need?
>> >
>> >     I have 2x 400GB SSDs used as WAL / DB drives and 4x 8TB HDDs used
>> >     as OSDs. When I look at the drive partitions, the DB / WAL
>> >     partitions are only 576MB & 1GB respectively. This feels a bit small.
>> >
>> >
>> >     root at virt1:~# lsblk
>> >     NAME               MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
>> >     sda                  8:0    0   7.3T  0 disk
>> >     ├─sda1               8:1    0   100M  0 part
>> /var/lib/ceph/osd/ceph-0
>> >     └─sda2               8:2    0   7.3T  0 part
>> >     sdb                  8:16   0   7.3T  0 disk
>> >     ├─sdb1               8:17   0   100M  0 part
>> /var/lib/ceph/osd/ceph-1
>> >     └─sdb2               8:18   0   7.3T  0 part
>> >     sdc                  8:32   0   7.3T  0 disk
>> >     ├─sdc1               8:33   0   100M  0 part
>> /var/lib/ceph/osd/ceph-2
>> >     └─sdc2               8:34   0   7.3T  0 part
>> >     sdd                  8:48   0   7.3T  0 disk
>> >     ├─sdd1               8:49   0   100M  0 part
>> /var/lib/ceph/osd/ceph-3
>> >     └─sdd2               8:50   0   7.3T  0 part
>> >     sde                  8:64   0 372.6G  0 disk
>> >     ├─sde1               8:65   0     1G  0 part
>> >     ├─sde2               8:66   0   576M  0 part
>> >     ├─sde3               8:67   0     1G  0 part
>> >     └─sde4               8:68   0   576M  0 part
>> >     sdf                  8:80   0 372.6G  0 disk
>> >     ├─sdf1               8:81   0     1G  0 part
>> >     ├─sdf2               8:82   0   576M  0 part
>> >     ├─sdf3               8:83   0     1G  0 part
>> >     └─sdf4               8:84   0   576M  0 part
>> >     sdg                  8:96   0   118G  0 disk
>> >     ├─sdg1               8:97   0     1M  0 part
>> >     ├─sdg2               8:98   0   256M  0 part /boot/efi
>> >     └─sdg3               8:99   0 117.8G  0 part
>> >       ├─pve-swap       253:0    0     8G  0 lvm  [SWAP]
>> >       ├─pve-root       253:1    0  29.3G  0 lvm  /
>> >       ├─pve-data_tmeta 253:2    0    68M  0 lvm
>> >       │ └─pve-data     253:4    0  65.9G  0 lvm
>> >       └─pve-data_tdata 253:3    0  65.9G  0 lvm
>> >         └─pve-data     253:4    0  65.9G  0 lvm
>> >
>> >
>> >
>> >
>> >     --
>> >     Kind Regards
>> >     Rudi Ahlers
>> >     Website: http://www.rudiahlers.co.za
>> >
>> >     _______________________________________________
>> >     ceph-users mailing list
>> >     ceph-users at lists.ceph.com
>> >     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > ceph-users mailing list
>> > ceph-users at lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>>
>>
>> --
>> Richard Hesketh
>> Systems Engineer, Research Platforms
>> BBC Research & Development
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users at lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
> --
> Kind Regards
> Rudi Ahlers
> Website: http://www.rudiahlers.co.za
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Kind Regards
Rudi Ahlers
Website: http://www.rudiahlers.co.za

