# kv_bench Reproduction Guide (Mace vs RocksDB)

This document defines a reproducible workflow for `mace` and `rocksdb` across phase0~phase4.
It now has two profiles:

- `local` (default in this doc): validated on this machine class, intended to run end-to-end without exhausting resources.
- `full`: benchmark-refactor target matrix (much longer runtime).

## 1. Prerequisites
- Linux
- Rust/Cargo
- CMake (to build `rocksdb_bench`)
- Python 3 (reporting/plotting)
- A persistent storage path (NVMe/SSD recommended), **not tmpfs**

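The prerequisites listed above can be checked before initialization. The following is a minimal sketch and not part of the kv_bench scripts: `missing_tools` is a hypothetical helper, and the command names `cargo`, `cmake`, and `python3` are assumptions about how the listed tools appear on `PATH`.

```bash
# Hypothetical helper (not shipped with kv_bench): print each command
# name from the argument list that cannot be found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed usage: the command names are guesses based on the list above.
#   missing_tools cargo cmake python3
```

If the call prints nothing, every listed command was found. Any printed name should be installed before running `init.sh`.
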
## 2. Hardware + Storage Baseline
For the local profile, assume approximately:
- CPU: `6C12T`
- RAM: `32GB`
- Disk: `100GB` available benchmark storage

Before running, set paths and verify filesystem type/capacity:

```bash
export KV_BENCH_STORAGE_ROOT=/home/abby/kv_bench/target/repro_storage
export KV_BENCH_RESULT_ROOT=/home/abby/kv_bench/target/repro_results
mkdir -p "${KV_BENCH_STORAGE_ROOT}" "${KV_BENCH_RESULT_ROOT}"

df -hT "${KV_BENCH_STORAGE_ROOT}" "${KV_BENCH_RESULT_ROOT}"
free -h
```

Requirements:
- `KV_BENCH_STORAGE_ROOT` and `KV_BENCH_RESULT_ROOT` must not be on `tmpfs`.
- Keep at least `25GB` free under storage root before long runs.

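The two requirements above can also be checked mechanically instead of by reading the `df` output. This is a hedged sketch: `evaluate_storage` and `check_bench_path` are hypothetical helper names, the `25GB` threshold is interpreted as 25 GiB, and the `df -PT` / `df -Pk` column positions assume a mount source without spaces in its name.

```bash
# Hypothetical helpers, not part of the kv_bench scripts.

# Decide whether a filesystem is acceptable.
# Arguments: <fstype> <available KiB> <minimum free GiB>
evaluate_storage() {
  local fstype="$1" avail_kib="$2" min_gib="$3"
  local avail_gib=$(( avail_kib / 1048576 ))   # KiB -> GiB, rounded down
  if [ "$fstype" = "tmpfs" ]; then
    echo "rejected: tmpfs is not persistent storage" >&2
    return 1
  fi
  if [ "$avail_gib" -lt "$min_gib" ]; then
    echo "rejected: ${avail_gib}GiB free, need ${min_gib}GiB" >&2
    return 1
  fi
  echo "ok: ${fstype}, ${avail_gib}GiB free"
}

# Read type and free space for a path from df, then evaluate them.
# Arguments: <path> [minimum free GiB, default 0]
check_bench_path() {
  local path="$1" min_gib="${2:-0}" fstype avail_kib
  fstype="$(df -PT "$path" | awk 'NR==2 {print $2}')"
  avail_kib="$(df -Pk "$path" | awk 'NR==2 {print $4}')"
  evaluate_storage "$fstype" "$avail_kib" "$min_gib"
}

# Assumed usage, after exporting the two variables:
#   check_bench_path "${KV_BENCH_STORAGE_ROOT}" 25
#   check_bench_path "${KV_BENCH_RESULT_ROOT}"
```

The free-space minimum is applied only to the storage root because that is the only path the requirements give a threshold for.
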
## 3. Initialization
```bash
cd /home/abby/kv_bench/scripts
./init.sh
source ./bin/activate
cd /home/abby/kv_bench
```

## 4. Quick Baseline (W1~W6)
Clean old data:

```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_STORAGE_ROOT}/basic_rocks"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_STORAGE_ROOT}/basic_rocks"
rm -f "${KV_BENCH_RESULT_ROOT}/benchmark_results.csv"
```

Run both engines (`local` profile parameters):

```bash
WARMUP_SECS=3 MEASURE_SECS=5 PREFILL_KEYS=50000 \
./scripts/mace.sh "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_RESULT_ROOT}/benchmark_results.csv"

WARMUP_SECS=3 MEASURE_SECS=5 PREFILL_KEYS=50000 \
./scripts/rocksdb.sh "${KV_BENCH_STORAGE_ROOT}/basic_rocks" "${KV_BENCH_RESULT_ROOT}/benchmark_results.csv"
```

Generate plots:

```bash
./scripts/bin/python ./scripts/plot.py "${KV_BENCH_RESULT_ROOT}/benchmark_results.csv" "${KV_BENCH_RESULT_ROOT}"
```

## 5. Phase Reproduction

### 5.1 Phase 1
```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase1"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/phase1"
rm -f "${KV_BENCH_RESULT_ROOT}/phase1_results.csv"

WARMUP_SECS=10 MEASURE_SECS=20 REPEATS=2 \
PHASE1_WORKLOADS="W1 W3 W6" \
PHASE1_THREADS="1 6" \
PHASE1_PROFILES="P2" \
PHASE1_PREFILL_TIER_S_P2=200000 \
./scripts/phase1.sh "${KV_BENCH_STORAGE_ROOT}/phase1" "${KV_BENCH_RESULT_ROOT}/phase1_results.csv"
```

### 5.2 Phase 2
```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase2"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/phase2"
rm -f "${KV_BENCH_RESULT_ROOT}/phase2_results.csv"

WARMUP_SECS=10 MEASURE_SECS=20 REPEATS=2 \
PHASE2_WORKLOADS_TIER_M="W1 W3 W6" \
PHASE2_THREADS_TIER_M="1 6" \
PHASE2_PROFILES="P2" \
PHASE2_PREFILL_TIER_M_P2=500000 \
RUN_TIER_L_REPRESENTATIVE=0 \
./scripts/phase2.sh "${KV_BENCH_STORAGE_ROOT}/phase2" "${KV_BENCH_RESULT_ROOT}/phase2_results.csv"
```

Optional (`full` profile tier-l representative subset):

```bash
RUN_TIER_L_REPRESENTATIVE=1 TIER_L_REPEATS=1 \
./scripts/phase2.sh "${KV_BENCH_STORAGE_ROOT}/phase2" "${KV_BENCH_RESULT_ROOT}/phase2_results.csv"
```

### 5.3 Phase 3
```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase3"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/phase3"
rm -f "${KV_BENCH_RESULT_ROOT}/phase3_results.csv"

WARMUP_SECS=10 MEASURE_SECS=20 REPEATS=2 \
PHASE3_WORKLOADS="W1 W3" \
PHASE3_THREADS="1 6" \
PHASE3_DURABILITIES="relaxed durable" \
PHASE3_KEY_SIZE=32 PHASE3_VALUE_SIZE=1024 PHASE3_PREFILL_KEYS=500000 \
./scripts/phase3.sh "${KV_BENCH_STORAGE_ROOT}/phase3" "${KV_BENCH_RESULT_ROOT}/phase3_results.csv"
```

### 5.4 Phase 4 (run one engine at a time)
`local` profile (memory-safe on 32GB machines, validated on 2026-03-04):

```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase4_mace"
rm -f "${KV_BENCH_RESULT_ROOT}/phase4_results_mace.csv" "${KV_BENCH_RESULT_ROOT}/phase4_restart_mace.csv"

SOAK_HOURS=1 PHASE4_MAX_CYCLES=3 \
PHASE4_WORKLOAD_MAIN=W1 PHASE4_WORKLOAD_VERIFY=W1 \
SEED_MEASURE_SECS=2 RUN_MEASURE_SECS=10 CRASH_INTERVAL_SECS=3 VERIFY_MEASURE_SECS=2 WARMUP_SECS=1 \
PHASE4_THREADS=1 PHASE4_KEY_SIZE=32 PHASE4_VALUE_SIZE=128 PHASE4_PREFILL_KEYS=1000 \
./scripts/phase4_soak.sh mace "${KV_BENCH_STORAGE_ROOT}/phase4_mace" \
"${KV_BENCH_RESULT_ROOT}/phase4_results_mace.csv" "${KV_BENCH_RESULT_ROOT}/phase4_restart_mace.csv"

rm -rf "${KV_BENCH_STORAGE_ROOT}/phase4_rocks"
rm -f "${KV_BENCH_RESULT_ROOT}/phase4_results_rocks.csv" "${KV_BENCH_RESULT_ROOT}/phase4_restart_rocks.csv"

SOAK_HOURS=1 PHASE4_MAX_CYCLES=3 \
PHASE4_WORKLOAD_MAIN=W1 PHASE4_WORKLOAD_VERIFY=W1 \
SEED_MEASURE_SECS=2 RUN_MEASURE_SECS=10 CRASH_INTERVAL_SECS=3 VERIFY_MEASURE_SECS=2 WARMUP_SECS=1 \
PHASE4_THREADS=1 PHASE4_KEY_SIZE=32 PHASE4_VALUE_SIZE=128 PHASE4_PREFILL_KEYS=1000 \
./scripts/phase4_soak.sh rocksdb "${KV_BENCH_STORAGE_ROOT}/phase4_rocks" \
"${KV_BENCH_RESULT_ROOT}/phase4_results_rocks.csv" "${KV_BENCH_RESULT_ROOT}/phase4_restart_rocks.csv"
```

Notes:
- The previous local phase4 parameters (`W3/W6`, larger prefill/time windows) can exceed 32GB RAM on current `mace` builds.
- The local phase4 profile above keeps crash/restart semantics, but intentionally reduces write pressure so it completes on this host class.

For `full` profile, remove the `PHASE4_*` overrides and use benchmark-refactor defaults (recommend `>=64GB` RAM and ample NVMe space).

## 6. Result Files
The commands above write CSVs under `${KV_BENCH_RESULT_ROOT}`:
- `benchmark_results.csv`
- `phase1_results.csv`
- `phase2_results.csv`
- `phase3_results.csv`
- `phase4_results_*.csv`
- `phase4_restart_*.csv`

Unified schema columns include:
- `engine` (`mace` / `rocksdb`)
- `workload_id` (`W1..W6`)
- `durability_mode` (`relaxed` / `durable`)
- `threads,key_size,value_size,prefill_keys`
- `ops_per_sec`
- `p50_us,p95_us,p99_us,p999_us`
- `error_ops`
- `read_path`

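A result file can be checked against the column list above before any report is generated. The sketch below rests on assumptions the guide does not state: that the first line of each CSV is a header row and that column names appear there unquoted. `missing_columns` is a hypothetical helper, not one of the kv_bench scripts.

```bash
# Hypothetical helper: print every expected column name that does not
# appear in the first (header) line of a CSV file.
# Arguments: <csv file> <column name>...
missing_columns() {
  local csv="$1" header col
  shift
  header="$(head -n 1 "$csv" | tr -d '\r')"
  for col in "$@"; do
    case ",${header}," in
      *",${col},"*) ;;                 # column present
      *) printf '%s\n' "$col" ;;       # column absent
    esac
  done
}

# Assumed usage with the column names listed in this section:
#   missing_columns "${KV_BENCH_RESULT_ROOT}/phase1_results.csv" \
#     engine workload_id durability_mode threads key_size value_size \
#     prefill_keys ops_per_sec p50_us p95_us p99_us p999_us error_ops read_path
```

Empty output means every named column was found in the header.
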
## 7. Report Commands
```bash
./scripts/bin/python ./scripts/phase1_eval.py "${KV_BENCH_RESULT_ROOT}/phase1_results.csv"
./scripts/bin/python ./scripts/phase2_report.py "${KV_BENCH_RESULT_ROOT}/phase2_results.csv"
./scripts/bin/python ./scripts/phase3_report.py "${KV_BENCH_RESULT_ROOT}/phase3_results.csv"
./scripts/bin/python ./scripts/phase4_report.py "${KV_BENCH_RESULT_ROOT}/phase4_restart_mace.csv"
./scripts/bin/python ./scripts/phase4_report.py "${KV_BENCH_RESULT_ROOT}/phase4_restart_rocks.csv"
```

## 8. Full-Profile Toggle (Benchmark Refactor Matrix)
If you want the full benchmark-refactor matrix:
- Use default phase script matrices (no `PHASE*_*` narrowing).
- Increase `WARMUP_SECS/MEASURE_SECS/REPEATS` to target values.
- Enable `RUN_TIER_L_REPRESENTATIVE=1` as needed.
- Keep large runs on persistent NVMe storage with enough free disk.

## 9. Comparison Rules
Only compare rows with identical:
- `workload_id`
- `key_size/value_size`
- `threads`
- `durability_mode`
- `read_path`

If `error_ops > 0`, investigate that case before drawing conclusions.
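The rules above can be applied with a small `awk` pass over a result CSV. This is an illustrative sketch, not the project's reporting logic: `compare_engines` is a hypothetical helper, it assumes a header row with the unified schema column names and no quoted fields, and it averages `ops_per_sec` over repeated rows for the same case, which is a simplification chosen here rather than something the guide prescribes.

```bash
# Hypothetical helper: pair mace and rocksdb rows that share the same
# comparison key and print the mean ops_per_sec of each engine.
# Rows with error_ops > 0 are reported on stderr and left out.
# Argument: <csv file>
compare_engines() {
  tr -d '\r' < "$1" | awk -F, '
    function f(name) { return $(col[name]) }   # field lookup by column name
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    {
      key = f("workload_id") "," f("key_size") "," f("value_size")
      key = key "," f("threads") "," f("durability_mode") "," f("read_path")
      eng = f("engine")
      if (f("error_ops") + 0 > 0) {
        print "skipped (error_ops > 0): " eng "," key > "/dev/stderr"
        next
      }
      sum[key, eng] += f("ops_per_sec")
      cnt[key, eng]++
      keys[key] = 1
    }
    END {
      for (k in keys) {
        if (cnt[k, "mace"] > 0 && cnt[k, "rocksdb"] > 0) {
          m = sum[k, "mace"] / cnt[k, "mace"]
          r = sum[k, "rocksdb"] / cnt[k, "rocksdb"]
          printf "%s,%.1f,%.1f,%.2f\n", k, m, r, (r > 0 ? m / r : 0)
        }
      }
    }
  ' | sort
}

# Assumed usage:
#   compare_engines "${KV_BENCH_RESULT_ROOT}/phase1_results.csv"
```

Each output line holds the six key fields, then the mean `ops_per_sec` for `mace`, the mean for `rocksdb`, and the `mace`/`rocksdb` ratio. Cases reported on stderr should be investigated rather than silently ignored.
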