[ceph-users] Unexplainable high memory usage OSD with BlueStore

Wido den Hollander wido at 42on.com
Thu Nov 8 04:36:11 PST 2018



On 11/8/18 12:28 PM, Hector Martin wrote:
> On 11/8/18 5:52 PM, Wido den Hollander wrote:
>> [osd]
>> bluestore_cache_size_ssd = 1G
>>
>> The BlueStore Cache size for SSD has been set to 1GB, so the OSDs
>> shouldn't use more than that.
>>
>> When dumping the mem pools each OSD claims to be using between 1.8GB and
>> 2.2GB of memory.
>>
>> $ ceph daemon osd.X dump_mempools|jq '.total.bytes'
>>
>> Summing up all the values I get to a total of 15.8GB and the system is
>> using 22GB.
>>
>> Looking at 'ps aux --sort rss' I see OSDs using almost 10% of the
>> memory, which would be ~3GB for a single daemon.
> 
> This is similar to what I see on a memory-starved host with the OSDs
> configured with very little cache:
> 
> [osd]
>   bluestore cache size = 180000000
> 
> $ ceph daemon osd.13 dump_mempools|jq '.mempool.total.bytes'
> 163117861
> 

Interesting. Looking at my OSD in this case (cache = 1GB), I see
BlueStore reporting 1548288000 bytes in bluestore_cache_data.

That's about 1.5GB, while the cache size has been set to 1GB.

This OSD claims to be using 2GB in total according to
mempool.total.bytes.

So that's 1.5GB for BlueStore's cache and then roughly 512MB for the rest?

PGLog and OSDMaps aren't using that much memory.
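
To see where "the rest" goes, the per-pool numbers can be broken out of
the same dump. A minimal sketch, assuming the JSON has the
.mempool.total.bytes field used in the jq command in this thread plus a
.mempool.by_pool map with one entry per pool (that layout is an
assumption and may differ between releases). The function names are
hypothetical, and the sample values are placeholders for illustration,
except bluestore_cache_data, which is the figure quoted above:

```python
def mempool_breakdown(dump):
    """Return (total_bytes, [(pool_name, bytes), ...]) sorted largest first.

    Assumes the layout {"mempool": {"by_pool": {...}, "total": {...}}}.
    """
    mempool = dump["mempool"]
    total = mempool["total"]["bytes"]
    rows = sorted(
        ((name, pool["bytes"]) for name, pool in mempool["by_pool"].items()),
        key=lambda row: row[1],
        reverse=True,
    )
    return total, rows


def format_breakdown(total, rows):
    """Format non-empty pools as 'name  bytes  share%' lines."""
    lines = []
    for name, nbytes in rows:
        if nbytes == 0:
            continue
        share = 100.0 * nbytes / total if total else 0.0
        lines.append(f"{name:<28} {nbytes:>14d} {share:5.1f}%")
    return lines


# Placeholder data: only bluestore_cache_data is taken from this thread,
# the other values are invented to make the example runnable.
SAMPLE_DUMP = {
    "mempool": {
        "by_pool": {
            "bluestore_cache_data": {"items": 0, "bytes": 1548288000},
            "bluestore_cache_onode": {"items": 0, "bytes": 100000000},
            "osd_pglog": {"items": 0, "bytes": 50000000},
            "osdmap": {"items": 0, "bytes": 10000000},
        },
        "total": {"items": 0, "bytes": 1708288000},
    }
}

total, rows = mempool_breakdown(SAMPLE_DUMP)
for line in format_breakdown(total, rows):
    print(line)
```

For a live OSD, pass the parsed output of 'ceph daemon osd.X
dump_mempools' (for example via json.loads) instead of SAMPLE_DUMP.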

Wido

> That adds up, but ps says:
> 
> USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> ceph     234576  2.6  6.2 1236200 509620 ?      Ssl  20:10   0:16 /usr/bin/ceph-osd -i 13 --pid-file /run/ceph/osd.13.pid -c /etc/ceph/ceph.conf --foreground
> 
> So ~500MB RSS for this one. Due to an emergency situation that made me
> lose half of the RAM on this host, I'm actually resorting to killing the
> oldest OSD every 5 minutes right now to keep the server from OOMing
> (this will be fixed soon).
> 
> I would very much like to know if this OSD memory usage outside of the
> bluestore cache size can be bounded or reduced somehow. I don't
> particularly care about performance, so it would be useful to be able to
> tune it lower. This would help single-host and smaller Ceph use cases; I
> think Ceph's properties make it a very interesting alternative to things
> like btrfs and zfs, but dedicating several GB of RAM per disk/OSD is not
> always viable. Right now it seems that besides the cache, OSDs will
> creep up in memory usage up to some threshold, and I'm not sure what
> determines what that baseline usage is or whether it can be controlled.
> 
> 
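
For the quoted osd.13 numbers, the part of RSS that the mempools do not
account for can be worked out directly. A minimal sketch, assuming the
RSS column of ps is in KiB (as procps reports it on Linux); the function
name is hypothetical, the three input values are the ones quoted above:

```python
KIB = 1024
MIB = 1024 * 1024


def unaccounted_bytes(rss_kib, mempool_bytes):
    """Bytes of resident memory not covered by the mempool total."""
    return rss_kib * KIB - mempool_bytes


RSS_KIB = 509620                 # RSS column from the quoted ps output
MEMPOOL_TOTAL_BYTES = 163117861  # .mempool.total.bytes for osd.13
CACHE_SIZE_BYTES = 180000000     # configured bluestore cache size

rss_bytes = RSS_KIB * KIB
gap = unaccounted_bytes(RSS_KIB, MEMPOOL_TOTAL_BYTES)

print(f"RSS:             {rss_bytes / MIB:8.1f} MiB")
print(f"mempool total:   {MEMPOOL_TOTAL_BYTES / MIB:8.1f} MiB")
print(f"cache setting:   {CACHE_SIZE_BYTES / MIB:8.1f} MiB")
print(f"outside mempool: {gap / MIB:8.1f} MiB ({100.0 * gap / rss_bytes:.0f}% of RSS)")
```

With these figures the mempool total stays below the configured cache
size, while roughly two thirds of the resident memory sits outside
anything dump_mempools tracks.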

