kv_bench Reproduction Guide (Mace vs RocksDB)
This document defines a reproducible workflow for mace and rocksdb across phase0~phase4.
It now has two profiles:
- local (default in this doc): validated on this machine class, intended to run end-to-end without exhausting resources.
- full: benchmark-refactor target matrix (much longer runtime).
1. Prerequisites
- Linux
- Rust/Cargo
- CMake (to build rocksdb_bench)
- Python 3 (reporting/plotting)
- A persistent storage path (NVMe/SSD recommended), not tmpfs
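A quick way to confirm the toolchain before building is to look each tool up on PATH. The sketch below assumes the binaries are named cargo, cmake, and python3; those names are conventional but are not stated in this guide, so adjust them if your system differs.

```python
import shutil

# Assumed binary names for the prerequisites listed above (not taken from
# the guide); change them if your distribution uses different names.
REQUIRED_TOOLS = ["cargo", "cmake", "python3"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing tools:", ", ".join(missing))
    else:
        print("all required tools found")
```

This only checks that the executables exist; it does not verify versions.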
2. Hardware + Storage Baseline
For the local profile, assume approximately:
- CPU: 6C12T
- RAM: 32GB
- Disk: 100GB available benchmark storage
Before running, set paths and verify filesystem type/capacity:
export KV_BENCH_STORAGE_ROOT=/home/abby/kv_bench/target/repro_storage
export KV_BENCH_RESULT_ROOT=/home/abby/kv_bench/target/repro_results
mkdir -p "${KV_BENCH_STORAGE_ROOT}" "${KV_BENCH_RESULT_ROOT}"
df -hT "${KV_BENCH_STORAGE_ROOT}" "${KV_BENCH_RESULT_ROOT}"
free -h
Requirements:
- KV_BENCH_STORAGE_ROOT and KV_BENCH_RESULT_ROOT must not be on tmpfs.
- Keep at least 25GB free under storage root before long runs.
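The two requirements above can also be checked programmatically instead of reading the df output by eye. The following is a minimal sketch, assuming a Linux host where /proc/self/mounts is readable and interpreting "25GB" as 25 GiB; the function names are illustrative and not part of kv_bench.

```python
import os
import shutil

# Assumption: "25GB" is read as 25 GiB, matching the units df -h reports.
MIN_FREE_BYTES = 25 * 1024**3


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing `path`.

    `path` must be absolute and already resolved; `mounts_text` is the
    content of /proc/self/mounts.
    """
    best_mount, best_type = "", "unknown"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same mount point shadow an earlier one.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def check_root(path, min_free_bytes=0):
    """Return a list of problems for one benchmark root (empty means OK)."""
    problems = []
    with open("/proc/self/mounts") as mounts:
        fs_type = fs_type_for(os.path.realpath(path), mounts.read())
    if fs_type == "tmpfs":
        problems.append(f"{path} is on tmpfs")
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(f"{path} has only {free / 1024**3:.1f} GiB free")
    return problems


if __name__ == "__main__":
    checks = (("KV_BENCH_STORAGE_ROOT", MIN_FREE_BYTES), ("KV_BENCH_RESULT_ROOT", 0))
    for var, min_free in checks:
        root = os.environ.get(var)
        if root is None:
            print(f"{var} is not set")
            continue
        for problem in check_root(root, min_free):
            print(f"{var}: {problem}")
```

The free-space threshold is applied only to the storage root, since that is the only path the requirement names.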
3. Initialization
cd /home/abby/kv_bench/scripts
./init.sh
source ./bin/activate
cd /home/abby/kv_bench
4. Quick Baseline (W1~W6)
Clean old data:
rm -rf "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_STORAGE_ROOT}/basic_rocks"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_STORAGE_ROOT}/basic_rocks"
rm -f "${KV_BENCH_RESULT_ROOT}/benchmark_results.csv"
Run both engines (local profile parameters):
WARMUP_SECS=3 MEASURE_SECS=5 PREFILL_KEYS=50000 \
./scripts/mace.sh "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_RESULT_ROOT}/benchmark_results.csv"
WARMUP_SECS=3 MEASURE_SECS=5 PREFILL_KEYS=50000 \
./scripts/rocksdb.sh "${KV_BENCH_STORAGE_ROOT}/basic_rocks" "${KV_BENCH_RESULT_ROOT}/benchmark_results.csv"
Generate plots:
./scripts/bin/python ./scripts/plot.py "${KV_BENCH_RESULT_ROOT}/benchmark_results.csv" "${KV_BENCH_RESULT_ROOT}"
5. Phase Reproduction
5.1 Phase 1
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase1"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/phase1"
rm -f "${KV_BENCH_RESULT_ROOT}/phase1_results.csv"
WARMUP_SECS=10 MEASURE_SECS=20 REPEATS=2 \
PHASE1_WORKLOADS="W1 W3 W6" \
PHASE1_THREADS="1 6" \
PHASE1_PROFILES="P2" \
PHASE1_PREFILL_TIER_S_P2=200000 \
./scripts/phase1.sh "${KV_BENCH_STORAGE_ROOT}/phase1" "${KV_BENCH_RESULT_ROOT}/phase1_results.csv"
5.2 Phase 2
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase2"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/phase2"
rm -f "${KV_BENCH_RESULT_ROOT}/phase2_results.csv"
WARMUP_SECS=10 MEASURE_SECS=20 REPEATS=2 \
PHASE2_WORKLOADS_TIER_M="W1 W3 W6" \
PHASE2_THREADS_TIER_M="1 6" \
PHASE2_PROFILES="P2" \
PHASE2_PREFILL_TIER_M_P2=500000 \
RUN_TIER_L_REPRESENTATIVE=0 \
./scripts/phase2.sh "${KV_BENCH_STORAGE_ROOT}/phase2" "${KV_BENCH_RESULT_ROOT}/phase2_results.csv"
Optional (full profile tier-l representative subset):
RUN_TIER_L_REPRESENTATIVE=1 TIER_L_REPEATS=1 \
./scripts/phase2.sh "${KV_BENCH_STORAGE_ROOT}/phase2" "${KV_BENCH_RESULT_ROOT}/phase2_results.csv"
5.3 Phase 3
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase3"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/phase3"
rm -f "${KV_BENCH_RESULT_ROOT}/phase3_results.csv"
WARMUP_SECS=10 MEASURE_SECS=20 REPEATS=2 \
PHASE3_WORKLOADS="W1 W3" \
PHASE3_THREADS="1 6" \
PHASE3_DURABILITIES="relaxed durable" \
PHASE3_KEY_SIZE=32 PHASE3_VALUE_SIZE=1024 PHASE3_PREFILL_KEYS=500000 \
./scripts/phase3.sh "${KV_BENCH_STORAGE_ROOT}/phase3" "${KV_BENCH_RESULT_ROOT}/phase3_results.csv"
5.4 Phase 4 (run one engine at a time)
local profile (memory-safe on 32GB machines, validated on 2026-03-04):
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase4_mace"
rm -f "${KV_BENCH_RESULT_ROOT}/phase4_results_mace.csv" "${KV_BENCH_RESULT_ROOT}/phase4_restart_mace.csv"
SOAK_HOURS=1 PHASE4_MAX_CYCLES=3 \
PHASE4_WORKLOAD_MAIN=W1 PHASE4_WORKLOAD_VERIFY=W1 \
SEED_MEASURE_SECS=2 RUN_MEASURE_SECS=10 CRASH_INTERVAL_SECS=3 VERIFY_MEASURE_SECS=2 WARMUP_SECS=1 \
PHASE4_THREADS=1 PHASE4_KEY_SIZE=32 PHASE4_VALUE_SIZE=128 PHASE4_PREFILL_KEYS=1000 \
./scripts/phase4_soak.sh mace "${KV_BENCH_STORAGE_ROOT}/phase4_mace" \
"${KV_BENCH_RESULT_ROOT}/phase4_results_mace.csv" "${KV_BENCH_RESULT_ROOT}/phase4_restart_mace.csv"
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase4_rocks"
rm -f "${KV_BENCH_RESULT_ROOT}/phase4_results_rocks.csv" "${KV_BENCH_RESULT_ROOT}/phase4_restart_rocks.csv"
SOAK_HOURS=1 PHASE4_MAX_CYCLES=3 \
PHASE4_WORKLOAD_MAIN=W1 PHASE4_WORKLOAD_VERIFY=W1 \
SEED_MEASURE_SECS=2 RUN_MEASURE_SECS=10 CRASH_INTERVAL_SECS=3 VERIFY_MEASURE_SECS=2 WARMUP_SECS=1 \
PHASE4_THREADS=1 PHASE4_KEY_SIZE=32 PHASE4_VALUE_SIZE=128 PHASE4_PREFILL_KEYS=1000 \
./scripts/phase4_soak.sh rocksdb "${KV_BENCH_STORAGE_ROOT}/phase4_rocks" \
"${KV_BENCH_RESULT_ROOT}/phase4_results_rocks.csv" "${KV_BENCH_RESULT_ROOT}/phase4_restart_rocks.csv"
Notes:
- The previous local phase4 parameters (W3/W6, larger prefill/time windows) can exceed 32GB RAM on current mace builds.
- The local phase4 profile above keeps crash/restart semantics, but intentionally reduces write pressure so it completes on this host class.
For the full profile, remove the PHASE4_* overrides and use the benchmark-refactor defaults (>=64GB RAM and ample NVMe space recommended).
6. Result Files
The commands above write CSVs under ${KV_BENCH_RESULT_ROOT}:
- benchmark_results.csv
- phase1_results.csv
- phase2_results.csv
- phase3_results.csv
- phase4_results_*.csv
- phase4_restart_*.csv
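To see which of these files a run has actually produced, the result root can be scanned for each expected name. This is a small sketch using only the file names listed above; the helper name is illustrative.

```python
import glob
import os

# File names and patterns as listed in this section.
EXPECTED_PATTERNS = [
    "benchmark_results.csv",
    "phase1_results.csv",
    "phase2_results.csv",
    "phase3_results.csv",
    "phase4_results_*.csv",
    "phase4_restart_*.csv",
]


def missing_results(result_root):
    """Return the expected patterns that match no file under `result_root`."""
    root = glob.escape(result_root)  # keep special characters in the path literal
    return [p for p in EXPECTED_PATTERNS if not glob.glob(os.path.join(root, p))]


if __name__ == "__main__":
    result_root = os.environ.get("KV_BENCH_RESULT_ROOT")
    if result_root is None:
        print("KV_BENCH_RESULT_ROOT is not set")
    else:
        print("missing:", missing_results(result_root) or "none")
```

A missing entry is expected for any phase you have not run yet.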
Unified schema columns include:
- engine (mace / rocksdb)
- workload_id (W1..W6)
- durability_mode (relaxed / durable)
- threads, key_size, value_size, prefill_keys
- ops_per_sec
- p50_us, p95_us, p99_us, p999_us
- error_ops
- read_path
7. Report Commands
./scripts/bin/python ./scripts/phase1_eval.py "${KV_BENCH_RESULT_ROOT}/phase1_results.csv"
./scripts/bin/python ./scripts/phase2_report.py "${KV_BENCH_RESULT_ROOT}/phase2_results.csv"
./scripts/bin/python ./scripts/phase3_report.py "${KV_BENCH_RESULT_ROOT}/phase3_results.csv"
./scripts/bin/python ./scripts/phase4_report.py "${KV_BENCH_RESULT_ROOT}/phase4_restart_mace.csv"
./scripts/bin/python ./scripts/phase4_report.py "${KV_BENCH_RESULT_ROOT}/phase4_restart_rocks.csv"
8. Full-Profile Toggle (Benchmark Refactor Matrix)
If you want the full benchmark-refactor matrix:
- Use default phase script matrices (no PHASE*_* narrowing).
- Increase WARMUP_SECS / MEASURE_SECS / REPEATS to target values.
- Enable RUN_TIER_L_REPRESENTATIVE=1 as needed.
- Keep large runs on persistent NVMe storage with enough free disk.
9. Comparison Rules
Only compare rows with identical:
- workload_id
- key_size / value_size
- threads
- durability_mode
- read_path
If error_ops > 0, investigate that case before drawing conclusions.
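These rules can be applied mechanically when reading a result CSV. The sketch below groups rows by the matching fields, sets aside any row with error_ops > 0, and pairs the engines only where both have clean rows. It assumes a header row with the schema column names, and it averages ops_per_sec across repeated rows; the averaging is an illustrative choice, not necessarily what the report scripts do.

```python
import csv
import sys
from collections import defaultdict

# Fields that must be identical for two rows to be comparable.
MATCH_KEYS = ("workload_id", "key_size", "value_size",
              "threads", "durability_mode", "read_path")


def comparable_pairs(rows):
    """Split rows into comparable mace/rocksdb pairs and flagged error cases.

    Returns (pairs, flagged): `pairs` maps a MATCH_KEYS tuple to the mean
    ops_per_sec per engine; `flagged` lists (engine, *key) for rows with
    error_ops > 0, which are excluded from the pairs.
    """
    groups = defaultdict(lambda: defaultdict(list))
    flagged = []
    for row in rows:
        key = tuple(row[name] for name in MATCH_KEYS)
        if float(row["error_ops"]) > 0:
            flagged.append((row["engine"],) + key)
            continue
        groups[key][row["engine"]].append(float(row["ops_per_sec"]))
    pairs = {}
    for key, by_engine in groups.items():
        if "mace" in by_engine and "rocksdb" in by_engine:
            # Assumption: repeated runs are summarized by their mean.
            pairs[key] = {e: sum(v) / len(v) for e, v in by_engine.items()}
    return pairs, flagged


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], newline="") as f:
        pairs, flagged = comparable_pairs(csv.DictReader(f))
    for key, ops in sorted(pairs.items()):
        print(dict(zip(MATCH_KEYS, key)), ops)
    for case in flagged:
        print("investigate (error_ops > 0):", case)
```

Configurations that appear for only one engine are dropped silently, since they have nothing to be compared against.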