+++
title = "Ceph Benchmarking"
date = 2026-02-21
description = "The results of some of my recent ceph benchmarks"
draft = true

[taxonomies]
categories = ["Homelab"]
tags = ["Homelab", "Ceph"]
+++

## Motivation

I have been running a Ceph cluster in my homelab for about two years now, but I have never properly benchmarked it, let alone written down my findings or any conclusions.

## Setting Everything Up for the Benchmark

On a machine that is already part of the Ceph cluster:

1. Generate a minimal Ceph config using `ceph config generate-minimal-conf`
2. Create a user for benchmarking
   1. Create a new user: `ceph auth add client.linux_pc`
   2. Edit the caps for my use-case:
      - `mgr`: `profile rbd pool=test-rbd`
      - `mon`: `profile rbd`
      - `osd`: `profile rbd pool=test-rbd`
   3. Get the keyring configuration using `ceph auth get client.linux_pc`

On the client machine doing the benchmarking:

1. Install the basic Ceph tools: `apt-get install -y ceph-common`
2. Load the rbd kernel module: `modprobe rbd`
3. Set up the local Ceph config:
   - Copy the generated configuration to `/etc/ceph/ceph.conf`
   - `chmod 644 /etc/ceph/ceph.conf`
4. Set up the local Ceph keyring:
   - Copy the keyring configuration to `/etc/ceph/ceph.client.linux_pc.keyring`
   - `chmod 644 /etc/ceph/ceph.client.linux_pc.keyring`
5. Confirm the configuration is working by running `ceph -s -n client.linux_pc`

Set up the benchmark volume itself:

1. `rbd create -n client.linux_pc --size 10G --pool test-rbd bench-volume`
2. `rbd -n client.linux_pc device map --pool test-rbd bench-volume` (which should create a new block device, likely `/dev/rbd0`)
3. `mkfs.ext4 /dev/rbd0`
4. `mkdir /mnt/bench`
5. `mount /dev/rbd0 /mnt/bench`

## Benchmarks

All benchmarks are run with the same configuration, changing only the access pattern (read/write, random/sequential).
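For reference, the client-side steps above can be collected into a single script. This is a sketch, not a tested installer: it assumes the minimal config and keyring from the cluster side have already been copied into place, and that it is run as root.

```shell
#!/bin/sh
# Client-side setup for the RBD benchmark volume, mirroring the steps above.
set -eu

apt-get install -y ceph-common   # basic Ceph tools
modprobe rbd                     # RBD kernel module

# ceph.conf and the keyring are assumed to have been copied over already
chmod 644 /etc/ceph/ceph.conf
chmod 644 /etc/ceph/ceph.client.linux_pc.keyring

# Sanity check: can we reach the cluster as the benchmark user?
ceph -s -n client.linux_pc

# Create, map, format, and mount the benchmark volume
rbd create -n client.linux_pc --size 10G --pool test-rbd bench-volume
rbd -n client.linux_pc device map --pool test-rbd bench-volume   # likely /dev/rbd0
mkfs.ext4 /dev/rbd0
mkdir -p /mnt/bench
mount /dev/rbd0 /mnt/bench
```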
Key configuration options are:

- libaio as the I/O engine
- direct I/O
- 1 job

{% details(summary="fio config") %}

```
[global]
ioengine=libaio
direct=1
size=4G
numjobs=1
runtime=60s
time_based
startdelay=5s
group_reporting
stonewall
name=write
rw=write
filename=bench

[1io_4k]
iodepth=1
bs=4k

[1io_8k]
iodepth=1
bs=8k

[1io_64k]
iodepth=1
bs=64k

[1io_4M]
iodepth=1
bs=4M

[32io_4k]
iodepth=32
bs=4k

[32io_8k]
iodepth=32
bs=8k

[32io_64k]
iodepth=32
bs=64k

[32io_4M]
iodepth=32
bs=4M
```

{% end %}

## Results

{% details(summary="Random Reads") %}
{{ fio_benchmark(path="content/ceph-benchmarking/benchmarks/random_read.json") }}
{% end %}

{% details(summary="Random Writes") %}
{{ fio_benchmark(path="content/ceph-benchmarking/benchmarks/random_write.json") }}
{% end %}

{% details(summary="Sequential Reads") %}
{{ fio_benchmark(path="content/ceph-benchmarking/benchmarks/seq_read.json") }}
{% end %}

{% details(summary="Sequential Writes") %}
{{ fio_benchmark(path="content/ceph-benchmarking/benchmarks/seq_write.json") }}
{% end %}

## Conclusion

1. Overall I am satisfied with the performance of the cluster for my current use-case.
2. There is a lot of room for improvement in the low queue-depth range.
3. The network is not really a limiting factor currently:
   - None of the nodes in the cluster exceeded 500MiB/s of TX or RX, so there is plenty of room for growth.
   - The client used for testing was limited by the network, as evidenced by the fact that the highest speed achieved is ~1.2GB/s (~10Gb/s).
4.
   My smallest node (the embedded EPYC) could be the limiting factor: in some benchmarks it reached 100% CPU usage, while my other nodes never exceeded 40%.

## Extra Details

{% details(summary="Cluster Hardware") %}

- 10 Gb networking between all nodes
- Node
  - Ryzen 5 5500
  - 64GB RAM
  - 4x 480GB enterprise SSD
- Node
  - Ryzen 5 3600
  - 64GB RAM
  - 4x 480GB enterprise SSD
- Node
  - EPYC 3151
  - 64GB RAM
  - 4x 480GB enterprise SSD

{% end %}

{% details(summary="Command to convert raw data into data for visualisation") %}

```bash
jq '[.jobs[] | {
  iodepth: ."job options".iodepth,
  bs: ."job options".bs,
  operations: {
    iops: .write.iops,
    bw_bytes: .write.bw_bytes
  }
}]' content/ceph-benchmarking/benchmarks/raw_random_write.json | jq '
  # collect sorted unique labels
  (map({key: .bs, value: 1}) | from_entries | keys_unsorted) as $labels
  | {
      labels: $labels,
      iodepths: (
        group_by(.iodepth)
        | map(
            . as $group
            | {
                iodepth: ($group[0].iodepth | tonumber),
                iops: [
                  $labels[] as $l
                  | ($group[] | select(.bs == $l) | .operations.iops) // null
                ],
                bw: [
                  $labels[] as $l
                  | ($group[] | select(.bs == $l) | .operations.bw_bytes) // null
                ]
              }
          )
      )
    }
'
```

{% end %}

## Future Work

- Try the benchmarks directly on the block device
- Try this using XFS instead of ext4
- Try this with and without drive caches
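For the raw-block-device idea above, the existing fio job file should mostly carry over; the main change is pointing `filename` at the mapped device instead of a file on the ext4 mount. A sketch of the adjusted `[global]` section (untested; note that writing directly to `/dev/rbd0` destroys the filesystem created during setup):

```
[global]
ioengine=libaio
direct=1
; target the mapped RBD device instead of a file on the ext4 mount
filename=/dev/rbd0
size=4G
numjobs=1
runtime=60s
time_based
group_reporting
```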