
+++
title = "Ceph Benchmarking"
date = 2026-02-21
description = "The results of some of my recent Ceph benchmarks"
draft = true

[taxonomies]
categories = ["Homelab"]
tags = ["Homelab", "Ceph"]
+++

## Motivation

I have been running a Ceph cluster in my homelab for about two years now, but I never properly benchmarked it, let alone wrote down my findings or any conclusions.

## Setting up for the Benchmark

On a machine that is already part of the Ceph cluster:

1. Generate a minimal Ceph config using `ceph config generate-minimal-conf`
2. Create a user for benchmarking
    1. Create a new user: `ceph auth add client.linux_pc`
    2. Edit the caps for my use-case:
        - mgr: `profile rbd pool=test-rbd`
        - mon: `profile rbd`
        - osd: `profile rbd pool=test-rbd`
    3. Get the keyring configuration using `ceph auth get client.linux_pc`
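Taken together, the cluster-side steps look roughly like this. The `client.linux_pc` name and `test-rbd` pool match the ones above; the `ceph auth caps` call is how I would apply the listed caps, assuming the pool already exists:

```shell
# Generate a minimal ceph.conf to copy to the client later
ceph config generate-minimal-conf

# Create the benchmark user, then restrict it to RBD on the test pool
ceph auth add client.linux_pc
ceph auth caps client.linux_pc \
    mon 'profile rbd' \
    osd 'profile rbd pool=test-rbd' \
    mgr 'profile rbd pool=test-rbd'

# Print the keyring to copy to the client
ceph auth get client.linux_pc
```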

On the client machine doing the benchmarking:

1. Install the basic Ceph tools: `apt-get install -y ceph-common`
2. Load the rbd kernel module: `modprobe rbd`
3. Set up the local Ceph config
    - Copy the generated configuration to `/etc/ceph/ceph.conf`
    - `chmod 644 /etc/ceph/ceph.conf`
4. Set up the local Ceph keyring
    - Copy the keyring configuration to `/etc/ceph/ceph.client.linux_pc.keyring`
    - `chmod 644 /etc/ceph/ceph.client.linux_pc.keyring`
5. Confirm the configuration is working by running `ceph -s -n client.linux_pc`
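As one script, the client setup could look like this. The host name `ceph-node1` and the `/tmp` staging paths are hypothetical; substitute wherever you saved the config and keyring from the previous step:

```shell
# Install tools and make sure the rbd kernel module is available
apt-get install -y ceph-common
modprobe rbd

# Copy the config and keyring from the cluster node (hypothetical host/paths)
scp ceph-node1:/tmp/ceph.conf /etc/ceph/ceph.conf
scp ceph-node1:/tmp/ceph.client.linux_pc.keyring /etc/ceph/ceph.client.linux_pc.keyring
chmod 644 /etc/ceph/ceph.conf /etc/ceph/ceph.client.linux_pc.keyring

# Sanity check: should print the cluster status
ceph -s -n client.linux_pc
```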

Set up the benchmark volume itself:

1. `rbd create -n client.linux_pc --size 10G --pool test-rbd bench-volume`
2. `rbd -n client.linux_pc device map --pool test-rbd bench-volume` (which should create a new block device, likely `/dev/rbd0`)
3. `mkfs.ext4 /dev/rbd0`
4. `mkdir /mnt/bench`
5. `mount /dev/rbd0 /mnt/bench`
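After mapping and mounting, a quick sanity check is worthwhile; these commands assume the device really did come up as `/dev/rbd0`:

```shell
# Show RBD devices mapped by this client
rbd -n client.linux_pc device list

# Confirm the block device exists and the filesystem is mounted
lsblk /dev/rbd0
df -h /mnt/bench
```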

## Benchmarks

All benchmarks are run with the same fio configuration, changing only the access pattern (read/write, random/sequential). The key configuration options are:

- `libaio` as the ioengine
- direct I/O
- 1 job

{% details(summary="fio config") %}

```ini
[global]
ioengine=libaio
direct=1
size=4G
numjobs=1
runtime=60s
time_based
startdelay=5s
group_reporting
stonewall

name=write
rw=write
filename=bench

[1io_4k]
iodepth=1
bs=4k
[1io_8k]
iodepth=1
bs=8k
[1io_64k]
iodepth=1
bs=64k
[1io_4M]
iodepth=1
bs=4M

[32io_4k]
iodepth=32
bs=4k
[32io_8k]
iodepth=32
bs=8k
[32io_64k]
iodepth=32
bs=64k
[32io_4M]
iodepth=32
bs=4M
```

{% end %}
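Each access pattern then gets its own run, presumably by adjusting the `rw=` line (`read`, `randread`, `randwrite`, ...), with fio's JSON output enabled to produce the raw data for the plots. The job-file and output names here are my assumption, matching the conversion step further down:

```shell
# Run the job file from the mounted RBD volume, capturing machine-readable results
cd /mnt/bench
fio --output-format=json --output=raw_random_write.json bench.fio
```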

## Results

{% details(summary="Random Reads") %} {{ fio_benchmark(path="content/ceph-benchmarking/benchmarks/random_read.json") }} {% end %}

{% details(summary="Random Writes") %} {{ fio_benchmark(path="content/ceph-benchmarking/benchmarks/random_write.json") }} {% end %}

{% details(summary="Sequential Reads") %} {{ fio_benchmark(path="content/ceph-benchmarking/benchmarks/seq_read.json") }} {% end %}

{% details(summary="Sequential Writes") %} {{ fio_benchmark(path="content/ceph-benchmarking/benchmarks/seq_write.json") }} {% end %}

## Conclusion

1. Overall, I am satisfied with the performance of the cluster for my current use-case.
2. There is a lot of room for improvement in the low queue-depth range.
3. The network is not really a limiting factor currently:
    - None of the nodes in the cluster exceeded 500 MiB/s of TX or RX, so there is plenty of room for growth.
    - The client used for testing was limited by its own network link, evident from the fact that the highest speed achieved was ~1.2 GB/s (~10 Gb/s).
4. My smallest node (the embedded EPYC) could be the limiting factor: in some benchmarks it reached 100% CPU usage, while the other nodes never exceeded 40%.

## Extra Details

{% details(summary="Cluster Hardware") %}

- 10 Gb networking between all nodes
- Node
    - Ryzen 5 5500
    - 64GB RAM
    - 4x 480GB enterprise SSD
- Node
    - Ryzen 5 3600
    - 64GB RAM
    - 4x 480GB enterprise SSD
- Node
    - EPYC 3151
    - 64GB RAM
    - 4x 480GB enterprise SSD

{% end %}

{% details(summary="Command to convert raw data into data for visualisation") %}

```sh
jq '[.jobs[] | { iodepth: ."job options".iodepth, bs: ."job options".bs, operations: { iops: .write.iops, bw_bytes: .write.bw_bytes } }]' content/ceph-benchmarking/benchmarks/raw_random_write.json | jq '
  # collect sorted unique labels
  (map({key:.bs,value:1})|from_entries|keys_unsorted) as $labels
  |
  {
    labels: $labels,
    iodepths:
      (
        group_by(.iodepth)
        | map(
            . as $group
            | {
                iodepth: ($group[0].iodepth | tonumber),
                iops: [
                  $labels[] as $l
                  | ($group[] | select(.bs == $l) | .operations.iops) // null
                ],
                bw: [
                  $labels[] as $l
                  | ($group[] | select(.bs == $l) | .operations.bw_bytes) // null
                ]
              }
          )
      )
  }
'
```

{% end %}

## Future Work

- Benchmark directly against the block device, without a filesystem
- Repeat the benchmarks using XFS instead of ext4
- Repeat the benchmarks with and without drive caches
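For the first point, a sketch of what the raw-block-device variant of the `[global]` section could look like, writing straight to the mapped device instead of a file on ext4 (this is my assumption of the setup, and note that it destroys any filesystem on the device):

```ini
[global]
ioengine=libaio
direct=1
# target the mapped RBD device directly instead of a file on a filesystem
filename=/dev/rbd0
size=4G
numjobs=1
runtime=60s
time_based
group_reporting
```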