[ceph-users] CephFS performance.

jesper at krogh.cc jesper at krogh.cc
Wed Oct 3 22:04:48 PDT 2018

Hi All.

First, thanks for the good discussion and strong answers I've gotten so far.

Current cluster setup is 4 OSD hosts x 10 x 12TB 7.2K RPM drives, 10GbitE
networking, metadata on rotating drives, 3x replication, and 256GB memory
plus 32+ cores in each OSD host. The drives sit behind a PERC controller,
each disk as a single-disk RAID0, with BBWC.

Planned changes:
- get 1-2 more OSD hosts
- experiment with EC pools for CephFS
- move the MDS onto a separate host and metadata onto SSDs.

I'm still struggling to get "non-cached" performance up to "hardware"
speed - whatever that means. I run an "fio" benchmark using 10GB files, 16
threads and a 4M block size, at which point I can "almost" sustainably
fill the 10GbitE NIC. In this configuration I would have expected the
aggregate disk bandwidth to be "way above" 10Gbit, and thus the NIC to be
fully filled rather than "almost" filled. Could that be metadata activity?
But on big-file reads that should not amount to much - right?
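For reference, the multi-threaded test above corresponds roughly to an fio
job file like the one below (the job name and directory are made up;
adjust to your own mount point - this is a sketch, not my exact invocation):

```ini
; approximate fio job: 16 threads, 4M blocks, 10GB sequential reads
[cephfs-seq-read]
directory=/ceph/fio-test   ; hypothetical path on the CephFS mount
rw=read
bs=4M
size=10G
numjobs=16
ioengine=libaio
direct=1
group_reporting
```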

Above is actually OK for production, so not a big issue - just an observation.

Where I'm still struggling is single-threaded performance:

Cold HDD (read from disk at the NFS-server end) / NFS performance:

jk at zebra01:~$ pipebench < /nfs/16GB.file > /dev/null
Piped   15.86 GB in 00h00m27.53s:  589.88 MB/second

Local page cache (just to show it isn't the profiling tool limiting throughput):
jk at zebra03:~$ pipebench < /nfs/16GB.file > /dev/null
Piped   29.24 GB in 00h00m09.15s:    3.19 GB/second
jk at zebra03:~$

Now from the Ceph system:
jk at zebra01:~$ pipebench < /ceph/bigfile.file > /dev/null
Piped   36.79 GB in 00h03m47.66s:  165.49 MB/second

Can block/stripe-size be tuned? Does it make sense?
Does read-ahead on the CephFS kernel-client need tuning?
What performance are other people seeing?
Other thoughts - recommendations?
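On the stripe-size and read-ahead questions, these are the two knobs I'd
experiment with. The ceph.dir.layout xattr names are the standard CephFS
ones, but the directory, monitor address and values below are purely
illustrative assumptions, not recommendations:

```
# File layout applies to NEW files created under a directory
# (the layout of existing files cannot be changed):
getfattr -n ceph.dir.layout /ceph/somedir
setfattr -n ceph.dir.layout.object_size -v 16777216 /ceph/somedir   # 16M objects
setfattr -n ceph.dir.layout.stripe_count -v 4 /ceph/somedir         # stripe wider

# Kernel-client read-ahead is set via the rasize mount option (bytes);
# 128M here is just an example value to test against the default:
mount -t ceph mon1:6789:/ /ceph -o name=admin,rasize=134217728
```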

On some of the shares we're storing pretty large files (GB-sized) and
need the backup to move them to tape, so ideally a single thread should
be able to keep an LTO6 drive streaming at full write speed.

40-ish 7.2K RPM drives should add up to more than the above, right?
This is the only load currently being put on the cluster, plus ~100MB/s
of recovery traffic.
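In aggregate they do, but a single client stream reads one object at a
time, so it is bounded by per-object latency rather than drive count. A
back-of-envelope check using the numbers above (the measured speed is from
the pipebench run; 4M is the default CephFS object size - the arithmetic
is only a rough sanity check, not a measurement):

```shell
# Single-thread CephFS reads fetch one object at a time, so the observed
# 165 MB/s implies a per-object service time, independent of drive count.
MEASURED_MBS=165    # observed single-thread read speed, MB/s
OBJECT_MB=4         # default CephFS object size, MB
objects_per_sec=$((MEASURED_MBS / OBJECT_MB))   # objects fetched per second
ms_per_object=$((1000 / objects_per_sec))       # implied latency per object
echo "objects/s: $objects_per_sec, ms/object: $ms_per_object"
```

~24 ms per 4M object is roughly one rotating-disk seek plus transfer,
which is consistent with the reads being serialized rather than spread
across the 40 spindles.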


