[ceph-users] Ceph cluster network bandwidth?

Blair Bethwaite blair.bethwaite at gmail.com
Thu Nov 16 12:09:46 PST 2017

What type of SAS disks, spinners or SSD? You really need to specify
the sustained write throughput of your OSD nodes if you want to figure
out whether your network is sufficient/appropriate.

At 3x replication if you want to sustain e.g. 1 GB/s of write traffic
from clients then you will need 2 GB/s of cluster network capacity -
first write hits the primary OSD on the frontend/client network,
second and third replicas are sent from the primary to those other two
OSDs. So then the question is, do you have 2GB/s of cluster network
capacity? It's easy to get confused thinking about this if you are not
accustomed to cluster computing...
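
To make that rule of thumb concrete, here is a minimal sketch in Python. The function name and parameters are made up for illustration (they are not part of Ceph), and it assumes every client write is forwarded by the primary OSD to each of the other replicas over the cluster network:

```python
def cluster_replication_traffic(client_write_gbytes_per_s: float,
                                replicas: int = 3) -> float:
    """Cluster-network traffic (GB/s) generated by a given client write rate.

    Assumption: the primary OSD receives the write on the client network
    and sends one copy to each of the remaining (replicas - 1) OSDs over
    the cluster network.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return client_write_gbytes_per_s * (replicas - 1)


# 1 GB/s of client writes at 3x replication
print(cluster_replication_traffic(1.0, replicas=3))  # 2.0
```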

E.g. if you have a single 10GbE NIC per host then you can TX & RX at a
max of ~9.8Gb/s (bi-directional), so you might think you'll be limited
to 9.8/8 = 1.2GB/s on the cluster network and thus 1.2/2 = 0.6GB/s
from clients. Luckily, however, your clients can be writing in parallel
across multiple PGs (and thus to different primary OSDs). So the way
to work out your max Ceph network capacity is to first calculate your
average bisection bandwidth. Let's assume for simplicity that you
have a single 10GbE ToR switch for this cluster, so each of your 6 OSD
hosts gets 10Gb/s (~9.8Gb/s usable) through the switch. The bisection
bandwidth is then 6 x 9.8Gb/s / 2 = 29.4Gb/s, and 29.4 / 8 = ~3.67GB/s.
Take that and divide by #replicas-1 (i.e. 2): your theoretical 6 node
10GbE cluster can sustain up to ~1.8GB/s of client write throughput.
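
The same estimate as a small Python sketch. The function and parameter names are hypothetical, and it assumes one reading of the arithmetic above: bisection bandwidth is half the sum of the host link speeds, converted from Gb/s to GB/s, and then divided by the number of extra replicas. It ignores protocol overhead and uneven PG distribution, so treat the result as an upper bound:

```python
def max_client_write_gbytes_per_s(hosts: int,
                                  nic_gbits_per_s: float,
                                  replicas: int = 3) -> float:
    """Rough upper bound on client write throughput (GB/s).

    Assumptions: a single non-blocking switch, one NIC per host, and
    replica traffic spread evenly across all hosts.
    """
    if replicas < 2:
        raise ValueError("need at least 2 replicas for replication traffic")
    # Bisection bandwidth: split the hosts into two halves and add up
    # the link speed of one half.
    bisection_gbits = hosts * nic_gbits_per_s / 2
    bisection_gbytes = bisection_gbits / 8  # bits -> bytes
    # Each client write causes (replicas - 1) copies on the cluster network.
    return bisection_gbytes / (replicas - 1)


# 6 hosts, ~9.8 Gb/s usable per 10GbE NIC, 3x replication
print(f"{max_client_write_gbytes_per_s(6, 9.8):.2f} GB/s")
```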

If you are talking about 10x HDD based OSDs per node then a single
10GbE network is probably ok: 6 (nodes) x 10 (OSDs) x 100MB/s
(optimistic max write throughput per HDD) / 3 (replica count) =
2000MB/s, i.e. ~2GB/s.

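For the disk side of the comparison, a minimal Python sketch of the same arithmetic. The names are illustrative only, and the 100MB/s per-HDD figure is the optimistic assumption from the text, not a measurement:

```python
def max_disk_client_write_mbytes_per_s(nodes: int,
                                       osds_per_node: int,
                                       per_disk_mbytes_per_s: float,
                                       replicas: int = 3) -> float:
    """Client write throughput (MB/s) the disks could absorb in aggregate.

    Assumption: every client byte is written `replicas` times in total,
    and writes are spread evenly over all OSDs.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    raw_mbytes_per_s = nodes * osds_per_node * per_disk_mbytes_per_s
    return raw_mbytes_per_s / replicas


# 6 nodes x 10 HDD OSDs x 100 MB/s, 3x replication
print(max_disk_client_write_mbytes_per_s(6, 10, 100))  # 2000.0
```

If the number printed here is in the same ballpark as the network estimate, neither side is an obvious bottleneck.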

On 16 November 2017 at 07:45, Sam Huracan <nowitzki.sammy at gmail.com> wrote:
> Hi,
> We intend build a new Ceph cluster with 6 Ceph OSD hosts, 10 SAS disks every
> host, using 10Gbps NIC for client network, object is replicated 3.
> So, how could I sizing the cluster network for best performance?
> As i have read, 3x replicate means 3x bandwidth client network = 30 Gbps, is
> it true? I think it is too much and make great cost
> Do you give me a suggestion?
> Thanks in advance.
