[ceph-users] Memory leak in OSDs running 12.2.1 beyond the buffer_anon mempool leak

Subhachandra Chandra schandra at grailbio.com
Wed Nov 29 14:24:49 PST 2017


Hello,

   We are trying out Ceph on a small cluster and are observing memory
leakage in the OSD processes. The leak seems to be in addition to the known
leak related to the "buffer_anon" pool and is high enough for the processes
to run against their memory limits in a few hours.

The following table shows how the memory used by one of the OSD processes
grew over an hour (t+63 means 63 minutes after the first snapshot). Full
mempool dumps and top output are at the bottom. Over that hour, the RSS of
the OSDs went from a range of 469-704MB to 735-844MB. The containers are
restricted to 1GB of memory, which causes them to restart after a few
hours.

Metric (KB)                     t+00     t+11     t+63
VmRSS                         683980   706980   786324
buffer_anon (dump_mempools)     5803    10457    32308
total (dump_mempools)         437369   444945   458688
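As a rough cross-check of the table above, the following sketch splits the RSS growth of this OSD into the part that dump_mempools accounts for and the part it does not. All numbers are copied from the table. Treating the 1GB container limit as exactly 1048576 KB, and assuming the growth stays linear, are assumptions made only for this estimate.

```python
# Snapshot values copied from the table above, all in KB.
SNAPSHOTS = {
    0: {"vmrss": 683980, "buffer_anon": 5803, "mempool_total": 437369},
    11: {"vmrss": 706980, "buffer_anon": 10457, "mempool_total": 444945},
    63: {"vmrss": 786324, "buffer_anon": 32308, "mempool_total": 458688},
}

# Assumption: the 1GB container limit is 1 GiB expressed in KB.
LIMIT_KB = 1024 * 1024


def untracked_kb(snap):
    """RSS that dump_mempools does not attribute to any mempool."""
    return snap["vmrss"] - snap["mempool_total"]


first, last = SNAPSHOTS[0], SNAPSHOTS[63]
elapsed_min = 63

rss_growth = last["vmrss"] - first["vmrss"]
anon_growth = last["buffer_anon"] - first["buffer_anon"]
mempool_growth = last["mempool_total"] - first["mempool_total"]
untracked_growth = untracked_kb(last) - untracked_kb(first)

# Assumption: RSS keeps growing linearly at the rate observed so far.
rate_kb_per_min = rss_growth / elapsed_min
minutes_to_limit = (LIMIT_KB - last["vmrss"]) / rate_kb_per_min

print(f"RSS growth:              {rss_growth} KB")
print(f"buffer_anon growth:      {anon_growth} KB")
print(f"all mempools growth:     {mempool_growth} KB")
print(f"growth outside mempools: {untracked_growth} KB")
print(f"minutes until limit:     {minutes_to_limit:.0f}")
```

With these inputs, most of the RSS growth falls outside every mempool, and the linear estimate reaches the limit in under three hours, which is consistent with the restarts described above.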



Our setup is as follows:
* 3 nodes, each with 30 OSDs, for a total of 90 OSDs.
* Running the official Luminous (12.2.1) docker images on top of CoreOS.
* The OSDs use Bluestore with all the db.* partitions on the same drive.
* The nodes have 32GB of RAM and 8 cores. The test cluster nodes
deliberately have less than the recommended amount of RAM per OSD, to
constrain them and find problems.
* The cluster currently has 501 PGs/OSD (again, higher than recommended,
for testing).
* The pools are set up for RGW usage with a replication_factor of 3 on all
the pools (2752 PGs) except default.rgw.buckets.data (4096 PGs), which is
set up with 6+3 erasure coding.
* The clients use the python rados library to push 128MB files directly
into the default.rgw.buckets.data pool. There are 3 clients running in
parallel on VMs, pushing about 350-400MB/s in aggregate.
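
The 501 PGs/OSD figure can be reproduced from the pool layout listed above. This is a small sketch that assumes the figure counts every replica and every erasure-coded shard as one PG instance on an OSD; the variable names are only illustrative.

```python
# Pool layout as described in the list above.
OSD_COUNT = 90

replicated_pgs = 2752      # all pools except default.rgw.buckets.data
replica_count = 3          # replication_factor of 3

ec_pgs = 4096              # default.rgw.buckets.data
ec_shards = 6 + 3          # 6+3 erasure coding: 6 data + 3 coding shards

# Assumption: each replica or EC shard counts as one PG instance on an OSD.
pg_instances = replicated_pgs * replica_count + ec_pgs * ec_shards
pgs_per_osd = pg_instances / OSD_COUNT

print(f"PG instances: {pg_instances}")
print(f"PGs per OSD:  {pgs_per_osd:.1f}")
```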

The non-default settings in the conf file are:
[global]
mon_max_pg_per_osd = 750
mon_osd_down_out_interval = 900
mon_pg_warn_max_per_osd = 600
osd_crush_chooseleaf_type = 0
osd_map_message_max = 10
osd_max_pg_per_osd_hard_ratio = 1.2

[mon]
mon_max_pgmap_epochs = 100
mon_min_osdmap_epochs = 100

[osd]
bluestore_cache_kv_ratio = .95
bluestore_cache_kv_max = 67108864
bluestore_cache_meta_ratio = .05
bluestore_cache_size = 268435456
osd_map_cache_size = 50
osd_map_max_advance = 25
osd_map_share_max_epochs = 25
osd_max_object_size = 1073741824
osd_max_write_size = 256
osd_pg_epoch_persisted_max_stale = 25
osd_pool_erasure_code_stripe_unit = 4194304


top - 20:46:18 up 1 day,  2:21,  2 users,  load average: 3.13, 1.78, 1.25
Tasks: 567 total,   1 running, 566 sleeping,   0 stopped,   0 zombie
%Cpu(s):  6.7 us,  5.9 sy,  0.0 ni, 73.5 id, 11.1 wa,  0.3 hi,  2.5 si,  0.0 st
KiB Mem:  32981660 total, 24427392 used,  8554268 free,   351396 buffers
KiB Swap:        0 total,        0 used,        0 free.  2803348 cached Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 376901 64045     20   0 1522220 704896  29056 S  14.4  2.1   1:26.72 ceph-osd
 370611 64045     20   0 1527432 698080  29476 S   2.0  2.1   1:29.03 ceph-osd
 396886 64045     20   0 1486584 696480  29060 S   2.0  2.1   1:22.93 ceph-osd
 382254 64045     20   0 1516968 690196  28984 S   3.0  2.1   1:27.15 ceph-osd
 359523 64045     20   0 1516888 686728  29332 S   3.5  2.1   1:28.67 ceph-osd
 366478 64045     20   0 1560912 683980  29076 S   1.5  2.1   1:28.59 ceph-osd
 382255 64045     20   0 1493116 669616  29276 S   1.5  2.0   1:29.46 ceph-osd
 360152 64045     20   0 1529896 666628  29268 S   0.5  2.0   1:27.96 ceph-osd
 372155 64045     20   0 1523640 662492  29416 S  17.4  2.0   1:29.79 ceph-osd
 358800 64045     20   0 1513640 662224  29184 S  13.9  2.0   1:29.80 ceph-osd
 360142 64045     20   0 1517992 661868  29328 S   0.5  2.0   1:31.69 ceph-osd
 398310 64045     20   0 1504552 658216  28796 S   1.0  2.0   1:20.62 ceph-osd
 368705 64045     20   0 1505544 657776  29292 S   1.0  2.0   1:27.32 ceph-osd
 386044 64045     20   0 1501488 655960  29536 S   3.0  2.0   1:24.87 ceph-osd
 386940 64045     20   0 1503056 652552  29152 S   4.5  2.0   1:28.22 ceph-osd
 386050 64045     20   0 1489996 650628  28800 S   1.0  2.0   1:28.46 ceph-osd
 402086 64045     20   0 1504528 646672  29344 S   2.5  2.0   1:26.96 ceph-osd
 400590 64045     20   0 1487424 642288  29348 S   3.5  1.9   1:21.55 ceph-osd
 387860 64045     20   0 1504520 641296  29316 S   4.0  1.9   1:19.98 ceph-osd
 392900 64045     20   0 1493492 637572  29156 S   1.5  1.9   1:26.86 ceph-osd
 375314 64045     20   0 1520448 629272  29412 S   1.0  1.9   1:32.04 ceph-osd
 372038 64045     20   0 1497992 627176  29300 S   1.0  1.9   1:30.36 ceph-osd
 385149 64045     20   0 1514284 624428  28960 S   0.5  1.9   1:28.56 ceph-osd
 382236 64045     20   0 1512248 616256  29568 S   2.0  1.9   1:24.03 ceph-osd
 374703 64045     20   0 1511740 571404  29628 S   2.5  1.7   1:27.88 ceph-osd
 367873 64045     20   0 1394740 564488  29012 S   2.5  1.7   1:31.64 ceph-osd
 360104 64045     20   0 1373880 532880  29132 S   2.5  1.6   1:32.11 ceph-osd
 376002 64045     20   0 1391576 516256  29132 S   0.5  1.6   1:27.21 ceph-osd
 402624 64045     20   0 1353104 515964  28876 S   3.0  1.6   1:23.38 ceph-osd
 405130 64045     20   0 1367740 469660  29216 S   2.5  1.4   1:22.75 ceph-osd
    687 root      20   0  345656 237144 236272 S   0.0  0.7   1:00.56 systemd-+
   3179 root      20   0 4072976  85084  25448 S   0.0  0.3   1:04.11 dockerd

data1 core # cat /proc/366478/status  | grep -i rss
VmRSS:   683980 kB
RssAnon:   654904 kB
RssFile:    29076 kB
RssShmem:        0 kB
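
For repeated snapshots, the Rss* fields shown above can be pulled out of /proc/<pid>/status with a few lines of Python instead of cat and grep. This is a minimal sketch; the function name parse_rss_kb is hypothetical, and the sample text is the output shown above.

```python
def parse_rss_kb(status_text):
    """Return the RSS-related fields of /proc/<pid>/status as {name: kB}."""
    fields = {}
    for line in status_text.splitlines():
        name, sep, value = line.partition(":")
        # Keep only lines such as "VmRSS:   683980 kB" or "RssAnon: ...".
        if sep and "rss" in name.lower():
            fields[name.strip()] = int(value.split()[0])
    return fields


# Sample input: the first snapshot of PID 366478 shown above.
SAMPLE_STATUS = """\
VmRSS:   683980 kB
RssAnon:   654904 kB
RssFile:    29076 kB
RssShmem:        0 kB
"""

rss = parse_rss_kb(SAMPLE_STATUS)
parts_sum = rss["RssAnon"] + rss["RssFile"] + rss["RssShmem"]
print(rss)
print("anon + file + shmem =", parts_sum)
```

On a live host the same function could be fed the contents of the status file, for example `parse_rss_kb(open("/proc/366478/status").read())`. In this sample the three Rss* parts add up to VmRSS, and nearly all of it is anonymous memory.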

root@data1:/var/run/ceph# ceph daemon osd.1 dump_mempools
{
    "bloom_filter": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_alloc": {
        "items": 6000,
        "bytes": 6000
    },
    "bluestore_cache_data": {
        "items": 26,
        "bytes": 745472
    },
    "bluestore_cache_onode": {
        "items": 1419,
        "bytes": 964920
    },
    "bluestore_cache_other": {
        "items": 87945,
        "bytes": 1620755
    },
    "bluestore_fsck": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_txc": {
        "items": 2,
        "bytes": 1440
    },
    "bluestore_writing_deferred": {
        "items": 27,
        "bytes": 427916158
    },
    "bluestore_writing": {
        "items": 9,
        "bytes": 1168088
    },
    "bluefs": {
        "items": 84,
        "bytes": 5088
    },
    "buffer_anon": {
        "items": 5640,
        "bytes": 5942318
    },
    "buffer_meta": {
        "items": 114,
        "bytes": 10032
    },
    "osd": {
        "items": 502,
        "bytes": 6056128
    },
    "osd_mapbl": {
        "items": 0,
        "bytes": 0
    },
    "osd_pglog": {
        "items": 9366,
        "bytes": 2374198
    },
    "osdmap": {
        "items": 28585,
        "bytes": 1055944
    },
    "osdmap_mapping": {
        "items": 0,
        "bytes": 0
    },
    "pgmap": {
        "items": 0,
        "bytes": 0
    },
    "mds_co": {
        "items": 0,
        "bytes": 0
    },
    "unittest_1": {
        "items": 0,
        "bytes": 0
    },
    "unittest_2": {
        "items": 0,
        "bytes": 0
    },
    "total": {
        "items": 139719,
        "bytes": 447866541
    }
}
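
A dump like the one above can be summarised programmatically. The sketch below assumes the flat JSON layout shown in this post (other Ceph releases may nest the pools differently) and uses a hypothetical helper name, summarize_mempools. The sample data is the first dump with the all-zero pools left out.

```python
import json


def summarize_mempools(dump_text, top_n=3):
    """Return (reported total, sum of pools, largest pools) in bytes."""
    pools = json.loads(dump_text)
    total_bytes = pools.pop("total")["bytes"]
    by_size = sorted(
        ((name, pool["bytes"]) for name, pool in pools.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    pool_sum = sum(size for _, size in by_size)
    return total_bytes, pool_sum, by_size[:top_n]


# First dump from this post; pools reporting 0 bytes are omitted.
SAMPLE_DUMP = json.dumps({
    "bluestore_alloc": {"items": 6000, "bytes": 6000},
    "bluestore_cache_data": {"items": 26, "bytes": 745472},
    "bluestore_cache_onode": {"items": 1419, "bytes": 964920},
    "bluestore_cache_other": {"items": 87945, "bytes": 1620755},
    "bluestore_txc": {"items": 2, "bytes": 1440},
    "bluestore_writing_deferred": {"items": 27, "bytes": 427916158},
    "bluestore_writing": {"items": 9, "bytes": 1168088},
    "bluefs": {"items": 84, "bytes": 5088},
    "buffer_anon": {"items": 5640, "bytes": 5942318},
    "buffer_meta": {"items": 114, "bytes": 10032},
    "osd": {"items": 502, "bytes": 6056128},
    "osd_pglog": {"items": 9366, "bytes": 2374198},
    "osdmap": {"items": 28585, "bytes": 1055944},
    "total": {"items": 139719, "bytes": 447866541},
})

total_bytes, pool_sum, largest = summarize_mempools(SAMPLE_DUMP)
print("total KB:", total_bytes // 1024)
print("pools add up to total:", pool_sum == total_bytes)
for name, size in largest:
    print(f"{name:28s} {size:>12d} bytes")
```

The integer division by 1024 is how the KB figures in the summary table at the top were obtained from the byte counts in the dumps.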

top - 20:57:34 up 1 day,  2:32,  2 users,  load average: 4.60, 4.05, 2.82
Tasks: 566 total,   1 running, 565 sleeping,   0 stopped,   0 zombie
%Cpu(s):  7.5 us,  6.0 sy,  0.0 ni, 71.0 id, 12.7 wa,  0.3 hi,  2.4 si,  0.0 st
KiB Mem:  32981660 total, 25681656 used,  7300004 free,   351396 buffers
KiB Swap:        0 total,        0 used,        0 free.  2808268 cached Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 360142 64045     20   0 1592744 713812  29328 S   2.5  2.2   1:49.47 ceph-osd
 366478 64045     20   0 1560912 706980  29076 S   0.5  2.1   1:47.25 ceph-osd
 367873 64045     20   0 1581460 706308  29012 S   1.5  2.1   1:49.44 ceph-osd
 392900 64045     20   0 1534452 703872  29156 S   1.5  2.1   1:45.23 ceph-osd
 374703 64045     20   0 1558524 701372  29628 S   2.0  2.1   1:46.90 ceph-osd
 372038 64045     20   0 1547496 699644  29300 S   1.5  2.1   1:50.03 ceph-osd
 402086 64045     20   0 1544464 698036  29344 S   2.0  2.1   1:47.67 ceph-osd
 359523 64045     20   0 1594744 696468  29332 S   3.5  2.1   1:47.64 ceph-osd
 358800 64045     20   0 1553928 695140  29184 S   2.0  2.1   1:50.51 ceph-osd
 370611 64045     20   0 1574888 691988  29476 S   2.0  2.1   1:49.00 ceph-osd
 386940 64045     20   0 1544368 691512  29152 S   3.0  2.1   1:49.43 ceph-osd
 376901 64045     20   0 1545100 691120  29056 S   2.0  2.1   1:45.11 ceph-osd
 376002 64045     20   0 1522648 684568  29132 S   2.0  2.1   1:41.75 ceph-osd
 368705 64045     20   0 1549256 682788  29292 S  12.4  2.1   1:45.15 ceph-osd
 396886 64045     20   0 1574680 682776  29060 S   3.0  2.1   1:43.59 ceph-osd
 360152 64045     20   0 1590664 681848  29268 S  19.9  2.1   1:47.58 ceph-osd
 382254 64045     20   0 1558280 677496  29048 S   4.5  2.1   1:47.19 ceph-osd
 360104 64045     20   0 1534648 676172  29132 S   3.0  2.1   1:49.43 ceph-osd
 375314 64045     20   0 1561408 675596  29412 S   0.5  2.0   1:48.98 ceph-osd
 382255 64045     20   0 1537852 672216  29276 S  17.4  2.0   1:51.59 ceph-osd
 385149 64045     20   0 1519404 671404  28960 S   1.5  2.0   1:46.92 ceph-osd
 372155 64045     20   0 1541048 665004  29416 S   3.0  2.0   1:50.02 ceph-osd
 387860 64045     20   0 1526376 664912  29316 S   2.5  2.0   1:34.52 ceph-osd
 386044 64045     20   0 1532560 659852  29536 S   2.0  2.0   1:42.56 ceph-osd
 405130 64045     20   0 1498812 659236  29216 S   1.5  2.0   1:40.39 ceph-osd
 402624 64045     20   0 1551440 658544  28876 S   3.5  2.0   1:43.41 ceph-osd
 400590 64045     20   0 1529760 657700  29348 S   1.5  2.0   1:38.43 ceph-osd
 382236 64045     20   0 1534104 653224  29568 S   2.0  2.0   1:40.17 ceph-osd
 398310 64045     20   0 1516840 642552  28796 S   1.0  1.9   1:41.02 ceph-osd
 386050 64045     20   0 1539852 641968  28800 S   1.0  1.9   1:48.66 ceph-osd
    687 root      20   0  366148 245996 245124 S   0.0  0.7   1:01.04 systemd-+
   3179 root      20   0 4072976  84748  25448 S   0.0  0.3   1:04.30 dockerd

data1 core # cat /proc/366478/status  | grep -i rss
VmRSS:   706980 kB
RssAnon:   677904 kB
RssFile:    29076 kB
RssShmem:        0 kB

root@data1:/var/run/ceph# ceph daemon osd.1 dump_mempools
{
    "bloom_filter": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_alloc": {
        "items": 6000,
        "bytes": 6000
    },
    "bluestore_cache_data": {
        "items": 26,
        "bytes": 745472
    },
    "bluestore_cache_onode": {
        "items": 1625,
        "bytes": 1105000
    },
    "bluestore_cache_other": {
        "items": 147702,
        "bytes": 4368313
    },
    "bluestore_fsck": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_txc": {
        "items": 2,
        "bytes": 1440
    },
    "bluestore_writing_deferred": {
        "items": 27,
        "bytes": 427916158
    },
    "bluestore_writing": {
        "items": 9,
        "bytes": 1168088
    },
    "bluefs": {
        "items": 86,
        "bytes": 5120
    },
    "buffer_anon": {
        "items": 15580,
        "bytes": 10707976
    },
    "buffer_meta": {
        "items": 114,
        "bytes": 10032
    },
    "osd": {
        "items": 502,
        "bytes": 6056128
    },
    "osd_mapbl": {
        "items": 0,
        "bytes": 0
    },
    "osd_pglog": {
        "items": 9596,
        "bytes": 2478294
    },
    "osdmap": {
        "items": 28585,
        "bytes": 1055944
    },
    "osdmap_mapping": {
        "items": 0,
        "bytes": 0
    },
    "pgmap": {
        "items": 0,
        "bytes": 0
    },
    "mds_co": {
        "items": 0,
        "bytes": 0
    },
    "unittest_1": {
        "items": 0,
        "bytes": 0
    },
    "unittest_2": {
        "items": 0,
        "bytes": 0
    },
    "total": {
        "items": 209854,
        "bytes": 455623965
    }
}


top - 21:49:27 up 1 day,  3:24,  2 users,  load average: 8.10, 5.14, 3.87
Tasks: 564 total,   1 running, 563 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.9 us,  4.7 sy,  0.0 ni, 76.7 id, 11.5 wa,  0.2 hi,  1.9 si,  0.0 st
KiB Mem:  32981660 total, 28791752 used,  4189908 free,   351396 buffers
KiB Swap:        0 total,        0 used,        0 free.  2819272 cached Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 359523 64045     20   0 1668152 844336  29332 S   0.5  2.6   3:16.18 ceph-osd
 358800 64045     20   0 1656328 802864  29184 S   4.0  2.4   3:21.13 ceph-osd
 370611 64045     20   0 1656488 799380  29476 S   3.5  2.4   3:22.27 ceph-osd
 385149 64045     20   0 1661772 798792  28960 S   2.0  2.4   3:19.39 ceph-osd
 376002 64045     20   0 1645208 798224  29132 S   2.0  2.4   3:00.87 ceph-osd
 382255 64045     20   0 1717756 797044  29276 S   2.5  2.4   3:16.91 ceph-osd
 374703 64045     20   0 1678364 795308  29628 S   1.5  2.4   3:15.13 ceph-osd
 372038 64045     20   0 1701800 794968  29300 S   2.5  2.4   3:21.30 ceph-osd
 398310 64045     20   0 1682088 794832  28796 S   2.5  2.4   3:23.65 ceph-osd
 360152 64045     20   0 1664072 794580  29268 S   4.0  2.4   3:25.78 ceph-osd
 386940 64045     20   0 1663152 789976  29152 S   1.5  2.4   3:26.28 ceph-osd
 367873 64045     20   0 1738484 789456  29012 S   1.5  2.4   3:27.69 ceph-osd
 368705 64045     20   0 1749288 788352  29292 S   2.0  2.4   3:08.00 ceph-osd
 402086 64045     20   0 1681040 787748  29344 S   1.5  2.4   3:25.16 ceph-osd
 382236 64045     20   0 1663512 787612  29568 S   2.0  2.4   3:03.32 ceph-osd
 376901 64045     20   0 1770412 785916  29056 S   4.0  2.4   3:09.34 ceph-osd
 375314 64045     20   0 1658016 785684  29412 S   2.5  2.4   3:25.24 ceph-osd
 387860 64045     20   0 1649640 783232  29316 S   1.5  2.4   2:54.61 ceph-osd
 396886 64045     20   0 1691768 775268  29060 S   1.0  2.4   3:21.43 ceph-osd
 386050 64045     20   0 1669260 769440  28800 S   2.0  2.3   3:15.79 ceph-osd
 360142 64045     20   0 1680168 768060  29328 S   3.5  2.3   3:13.78 ceph-osd
 405130 64045     20   0 1651740 766800  29216 S   1.0  2.3   3:09.41 ceph-osd
 366478 64045     20   0 1653072 764060  29076 S   1.5  2.3   3:14.65 ceph-osd
 360104 64045     20   0 1668152 764008  29132 S   0.5  2.3   3:23.04 ceph-osd
 372155 64045     20   0 1653016 761484  29416 S   1.0  2.3   3:19.32 ceph-osd
 382254 64045     20   0 1753192 755592  29048 S   1.0  2.3   3:19.92 ceph-osd
 392900 64045     20   0 1636212 755508  29156 S   1.0  2.3   3:08.03 ceph-osd
 386044 64045     20   0 1671856 753824  29536 S   1.5  2.3   3:13.25 ceph-osd
 400590 64045     20   0 1641056 741944  29348 S   2.0  2.2   2:59.38 ceph-osd
 402624 64045     20   0 1686992 735960  28876 S  16.9  2.2   3:17.28 ceph-osd
    687 root      20   0  382532 266000 265128 S   0.0  0.8   1:03.15 systemd-+
   3179 root      20   0 4072976  84748  25448 S   0.0  0.3   1:05.07 dockerd

data1 core # cat /proc/366478/status  | grep -i rss
VmRSS:   786324 kB
RssAnon:   757248 kB
RssFile:    29076 kB
RssShmem:        0 kB

root@data1:/var/run/ceph# ceph daemon osd.1 dump_mempools
{
    "bloom_filter": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_alloc": {
        "items": 6000,
        "bytes": 6000
    },
    "bluestore_cache_data": {
        "items": 26,
        "bytes": 745472
    },
    "bluestore_cache_onode": {
        "items": 2595,
        "bytes": 1764600
    },
    "bluestore_cache_other": {
        "items": 428389,
        "bytes": 17291843
    },
    "bluestore_fsck": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_txc": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_writing_deferred": {
        "items": 26,
        "bytes": 405541839
    },
    "bluestore_writing": {
        "items": 9,
        "bytes": 1168088
    },
    "bluefs": {
        "items": 94,
        "bytes": 5248
    },
    "buffer_anon": {
        "items": 62355,
        "bytes": 33084140
    },
    "buffer_meta": {
        "items": 113,
        "bytes": 9944
    },
    "osd": {
        "items": 502,
        "bytes": 6056128
    },
    "osd_mapbl": {
        "items": 0,
        "bytes": 0
    },
    "osd_pglog": {
        "items": 10669,
        "bytes": 2967654
    },
    "osdmap": {
        "items": 28585,
        "bytes": 1055944
    },
    "osdmap_mapping": {
        "items": 0,
        "bytes": 0
    },
    "pgmap": {
        "items": 0,
        "bytes": 0
    },
    "mds_co": {
        "items": 0,
        "bytes": 0
    },
    "unittest_1": {
        "items": 0,
        "bytes": 0
    },
    "unittest_2": {
        "items": 0,
        "bytes": 0
    },
    "total": {
        "items": 539363,
        "bytes": 469696900
    }
}
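
To see which mempools moved between the first and the last dump, the byte counts can be diffed per pool. This is only a sketch: the dictionaries below list the pools whose byte counts differ between the two dumps in this post (the unchanged pools cancel out), and the names are illustrative.

```python
# Bytes per pool at t+00 and t+63, copied from the dumps in this post.
# Pools whose byte counts are identical in both dumps are left out.
T00 = {
    "bluestore_cache_onode": 964920,
    "bluestore_cache_other": 1620755,
    "bluestore_txc": 1440,
    "bluestore_writing_deferred": 427916158,
    "bluefs": 5088,
    "buffer_anon": 5942318,
    "buffer_meta": 10032,
    "osd_pglog": 2374198,
}
T63 = {
    "bluestore_cache_onode": 1764600,
    "bluestore_cache_other": 17291843,
    "bluestore_txc": 0,
    "bluestore_writing_deferred": 405541839,
    "bluefs": 5248,
    "buffer_anon": 33084140,
    "buffer_meta": 9944,
    "osd_pglog": 2967654,
}
TOTAL_T00 = 447866541
TOTAL_T63 = 469696900

# Per-pool change in bytes, largest growth first.
deltas = sorted(
    ((name, T63[name] - T00[name]) for name in T00),
    key=lambda item: item[1],
    reverse=True,
)
net_change = sum(delta for _, delta in deltas)

for name, delta in deltas:
    print(f"{name:28s} {delta:+12d} bytes")
print(f"{'net change':28s} {net_change:+12d} bytes")
print("matches reported totals:", net_change == TOTAL_T63 - TOTAL_T00)
```

In these dumps buffer_anon and bluestore_cache_other account for the growth, while bluestore_writing_deferred shrinks, so the mempool total rises less than buffer_anon alone.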

Thanks
Subhachandra