[ceph-users] PGs get placed in the same datacenter (Trying to make a hybrid NVMe/HDD pool with 6 servers, 2 in each datacenter)

Peter Linder peter.linder at fiberdirekt.se
Sat Oct 7 07:12:50 PDT 2017


Hello Ceph-users!

Ok, so I've got 3 separate datacenters (low latency network in between)
and I want to make a hybrid NVMe/HDD pool for performance and cost reasons.

There are 3 servers with NVMe-based OSDs and 2 servers with normal HDDs
(yes, one is missing; there will be 3 of course. It needs some more work
and will be added later), with 1 NVMe server and 1 HDD server in each
datacenter.

I've been trying to use a rule like this:

rule hybrid {
        id 1
        type replicated
        min_size 1
        max_size 3
        step take default class nvme
        step chooseleaf firstn 1 type datacenter
        step emit
        step take default class hdd
        step chooseleaf firstn -1 type datacenter
        step emit
}

(min_size should be 2, I know). The idea is to select an NVMe OSD, and
then select the rest from HDD OSDs in different datacenters (see crush
map below for hierarchy). This would work, I think, if each datacenter
had only NVMe or only HDD OSDs, but currently a datacenter holds servers
of both kinds.
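To illustrate why the rule can collide: each "take ... emit" pass in a
CRUSH rule is evaluated on its own, so the HDD pass does not know which
datacenter the NVMe pass already used. The sketch below only counts the
possible datacenter combinations for a pool of size 2 (an assumption,
based on the two-OSD PG shown further down). It is not a CRUSH
simulation and ignores weights; the names NVME_DCS, HDD_DCS and
candidate_pairs are made up for this example.

```python
from itertools import product

# Datacenters that currently contain at least one OSD of each class,
# read off the crush map in this post (TEG4 has no HDD server yet).
NVME_DCS = ["HORN79", "WAR", "TEG4"]
HDD_DCS = ["HORN79", "WAR"]


def candidate_pairs():
    """All (nvme_dc, hdd_dc) combinations the two independent passes can produce."""
    return list(product(NVME_DCS, HDD_DCS))


pairs = candidate_pairs()
# A collision is a combination where both replicas land in the same datacenter.
collisions = [p for p in pairs if p[0] == p[1]]

print(len(pairs))      # 6
print(collisions)      # [('HORN79', 'HORN79'), ('WAR', 'WAR')]
```

So 2 of the 6 combinations put both copies in one datacenter, which is
consistent with only some PGs being affected.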

Output from "ceph pg dump" shows that some PGs end up in the same
datacenter:

2.6c8        47                  0        0         0       0  197132288  613      613 active+clean 2017-10-07 14:27:33.943589  2222'613   2222:3446  [8,24]          8  [8,24]

Here OSD 8 and OSD 24 are indeed of different types, but they are in the
same datacenter, so redundancy for this PG would depend on a single
datacenter...
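A quick way to check a PG's OSD set against the hierarchy is sketched
below. The host and datacenter membership is copied by hand from the
crush map in this post; the helper names (datacenters_of,
shares_datacenter) are hypothetical, and a real check would parse the
"ceph pg dump" output rather than hard-code one PG.

```python
# Host -> OSD ids, from the "# buckets" section of the crush map below.
HOST_OSDS = {
    "storage11": [0, 3, 6, 9],
    "storage12": [1, 4, 7, 10],
    "storage13": [2, 5, 8, 11],
    "storage21": list(range(12, 24)),
    "storage23": list(range(24, 36)),
}

# Datacenter -> hosts, from the same section.
DATACENTER_HOSTS = {
    "HORN79": ["storage11", "storage21"],
    "WAR": ["storage13", "storage23"],
    "TEG4": ["storage12"],
}

# Flatten the two levels into a direct OSD -> datacenter lookup.
OSD_TO_DC = {
    osd: dc
    for dc, hosts in DATACENTER_HOSTS.items()
    for host in hosts
    for osd in HOST_OSDS[host]
}


def datacenters_of(osds):
    """Return the datacenter of each OSD, in the order given."""
    return [OSD_TO_DC[osd] for osd in osds]


def shares_datacenter(osds):
    """True if at least two of the OSDs sit in the same datacenter."""
    dcs = datacenters_of(osds)
    return len(set(dcs)) < len(dcs)


print(datacenters_of([8, 24]))     # ['WAR', 'WAR']
print(shares_datacenter([8, 24]))  # True
```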

Is there any way I can rethink this?

Please see my full crushmap etc below. Thanks for any help!

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class nvme
device 1 osd.1 class nvme
device 2 osd.2 class nvme
device 3 osd.3 class nvme
device 4 osd.4 class nvme
device 5 osd.5 class nvme
device 6 osd.6 class nvme
device 7 osd.7 class nvme
device 8 osd.8 class nvme
device 9 osd.9 class nvme
device 10 osd.10 class nvme
device 11 osd.11 class nvme
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 15 osd.15 class hdd
device 16 osd.16 class hdd
device 17 osd.17 class hdd
device 18 osd.18 class hdd
device 19 osd.19 class hdd
device 20 osd.20 class hdd
device 21 osd.21 class hdd
device 22 osd.22 class hdd
device 23 osd.23 class hdd
device 24 osd.24 class hdd
device 25 osd.25 class hdd
device 26 osd.26 class hdd
device 27 osd.27 class hdd
device 28 osd.28 class hdd
device 29 osd.29 class hdd
device 30 osd.30 class hdd
device 31 osd.31 class hdd
device 32 osd.32 class hdd
device 33 osd.33 class hdd
device 34 osd.34 class hdd
device 35 osd.35 class hdd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host storage11 {
        id -5           # do not change unnecessarily
        id -6 class nvme                # do not change unnecessarily
        id -10 class hdd                # do not change unnecessarily
        # weight 2.912
        alg straw2
        hash 0  # rjenkins1
        item osd.0 weight 0.728
        item osd.3 weight 0.728
        item osd.6 weight 0.728
        item osd.9 weight 0.728
}
host storage21 {
        id -13          # do not change unnecessarily
        id -14 class nvme               # do not change unnecessarily
        id -15 class hdd                # do not change unnecessarily
        # weight 65.496
        alg straw2
        hash 0  # rjenkins1
        item osd.12 weight 5.458
        item osd.13 weight 5.458
        item osd.14 weight 5.458
        item osd.15 weight 5.458
        item osd.16 weight 5.458
        item osd.17 weight 5.458
        item osd.18 weight 5.458
        item osd.19 weight 5.458
        item osd.20 weight 5.458
        item osd.21 weight 5.458
        item osd.22 weight 5.458
        item osd.23 weight 5.458
}
datacenter HORN79 {
        id -19          # do not change unnecessarily
        id -26 class nvme               # do not change unnecessarily
        id -27 class hdd                # do not change unnecessarily
        # weight 68.406
        alg straw2
        hash 0  # rjenkins1
        item storage11 weight 2.911
        item storage21 weight 65.495
}
host storage13 {
        id -7           # do not change unnecessarily
        id -8 class nvme                # do not change unnecessarily
        id -11 class hdd                # do not change unnecessarily
        # weight 2.912
        alg straw2
        hash 0  # rjenkins1
        item osd.2 weight 0.728
        item osd.5 weight 0.728
        item osd.8 weight 0.728
        item osd.11 weight 0.728
}
host storage23 {
        id -16          # do not change unnecessarily
        id -17 class nvme               # do not change unnecessarily
        id -18 class hdd                # do not change unnecessarily
        # weight 65.496
        alg straw2
        hash 0  # rjenkins1
        item osd.24 weight 5.458
        item osd.25 weight 5.458
        item osd.26 weight 5.458
        item osd.27 weight 5.458
        item osd.28 weight 5.458
        item osd.29 weight 5.458
        item osd.30 weight 5.458
        item osd.31 weight 5.458
        item osd.32 weight 5.458
        item osd.33 weight 5.458
        item osd.34 weight 5.458
        item osd.35 weight 5.458
}
datacenter WAR {
        id -20          # do not change unnecessarily
        id -24 class nvme               # do not change unnecessarily
        id -25 class hdd                # do not change unnecessarily
        # weight 68.406
        alg straw2
        hash 0  # rjenkins1
        item storage13 weight 2.911
        item storage23 weight 65.495
}
host storage12 {
        id -3           # do not change unnecessarily
        id -4 class nvme                # do not change unnecessarily
        id -9 class hdd         # do not change unnecessarily
        # weight 2.912
        alg straw2
        hash 0  # rjenkins1
        item osd.1 weight 0.728
        item osd.4 weight 0.728
        item osd.7 weight 0.728
        item osd.10 weight 0.728
}
datacenter TEG4 {
        id -21          # do not change unnecessarily
        id -22 class nvme               # do not change unnecessarily
        id -23 class hdd                # do not change unnecessarily
        # weight 2.911
        alg straw2
        hash 0  # rjenkins1
        item storage12 weight 2.911
}
root default {
        id -1           # do not change unnecessarily
        id -2 class nvme                # do not change unnecessarily
        id -12 class hdd                # do not change unnecessarily
        # weight 139.721
        alg straw2
        hash 0  # rjenkins1
        item HORN79 weight 68.405
        item WAR weight 68.405
        item TEG4 weight 2.911
}

# rules
rule hybrid {
        id 1
        type replicated
        min_size 1
        max_size 3
        step take default class nvme
        step chooseleaf firstn 1 type datacenter
        step emit
        step take default class hdd
        step chooseleaf firstn -1 type datacenter
        step emit
}
rule hdd {
        id 2
        type replicated
        min_size 1
        max_size 3
        step take default class hdd
        step chooseleaf firstn 0 type datacenter
        step emit
}
rule nvme {
        id 3
        type replicated
        min_size 1
        max_size 3
        step take default class nvme
        step chooseleaf firstn 0 type datacenter
        step emit
}

# end crush map



