docs: rewrite README quickstart and add full repro guide
parent 7d1615b36c
commit c8eec21f7b
77 README.md
@ -1,33 +1,74 @@
# mace 0.0.27 vs rocksdb 10.4.2
# kv_bench (Mace vs RocksDB)

## sequential insert

Quick start for reproducible comparison. Full guide: [docs/repro.md](./docs/repro.md).

## 5-Minute Quickstart
1. Set your storage root (any mount path, not hardcoded to `/nvme`):

## random insert

```bash
export KV_BENCH_STORAGE_ROOT=/path/to/your/storage/kvbench
mkdir -p "${KV_BENCH_STORAGE_ROOT}"
```

2. Initialize Python env once:

---
```bash
cd /home/abby/kv_bench/scripts
./init.sh
source ./bin/activate
cd /home/abby/kv_bench
```

## random get (warm get)
3. Run baseline comparison (both engines write to the same CSV):

```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_STORAGE_ROOT}/basic_rocks"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_STORAGE_ROOT}/basic_rocks"

./scripts/mace.sh "${KV_BENCH_STORAGE_ROOT}/basic_mace" ./scripts/benchmark_results.csv
./scripts/rocksdb.sh "${KV_BENCH_STORAGE_ROOT}/basic_rocks" ./scripts/benchmark_results.csv
```

---
4. View and plot results:

# mixed performance (hot get)
```bash
./scripts/bin/python ./scripts/plot.py ./scripts/benchmark_results.csv ./scripts
```

## Fast Result Reading
- Raw input CSV: `./scripts/benchmark_results.csv`
- Key columns:
  - `engine` (`mace` / `rocksdb`)
  - `workload_id` (`W1..W6`)
  - `ops_per_sec` (higher is better)
  - `p99_us` (lower is better)
  - `error_ops` (must be 0 before drawing conclusions)

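The `error_ops` rule above can be checked mechanically before reading any throughput or latency numbers. The following is a minimal sketch, assuming the CSV has a header row that uses the column names listed above; the function name `rows_with_errors` and the sample rows are hypothetical and not part of the repository.

```python
import csv
import io


def rows_with_errors(csv_text):
    """Return (engine, workload_id, error_ops) for every row with error_ops > 0."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bad = []
    for row in reader:
        errors = int(row["error_ops"])
        if errors > 0:
            bad.append((row["engine"], row["workload_id"], errors))
    return bad


# Illustrative rows only, not real benchmark output.
sample = """engine,workload_id,ops_per_sec,p99_us,error_ops
mace,W1,1000,50,0
rocksdb,W1,900,60,3
"""

for engine, workload, errors in rows_with_errors(sample):
    print(f"{engine} {workload}: {errors} failed operations, investigate first")
```

An empty result means no row reported failed operations; any entry it returns should be investigated before the case is used in a comparison.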
## Phase Reports
- Phase 1 (stability CV):

# sequential scan (warm scan)
```bash
./scripts/bin/python ./scripts/phase1_eval.py ./scripts/phase1_results.csv
```

- Phase 2 (core median + slow scenarios):

```bash
./scripts/bin/python ./scripts/phase2_report.py ./scripts/phase2_results.csv
```

- Phase 3 (durability cost):

```bash
./scripts/bin/python ./scripts/phase3_report.py ./scripts/phase3_results.csv
```

- Phase 4 (restart/recovery):

```bash
./scripts/bin/python ./scripts/phase4_report.py ./scripts/phase4_restart_mace.csv
./scripts/bin/python ./scripts/phase4_report.py ./scripts/phase4_restart_rocks.csv
```

## Full Reproduction
For phase-by-phase commands, knobs, and interpretation rules, use [docs/repro.md](./docs/repro.md).
167 docs/repro.md
Normal file
@ -0,0 +1,167 @@
# kv_bench Reproduction Guide (Mace vs RocksDB)

This repository is used to reproduce and compare `mace` and `rocksdb` benchmark results across phase0~phase4.

## 1. Prerequisites
- Linux
- A high-speed storage mount directory you choose (typically an NVMe mount point)
- Rust/Cargo
- CMake (to build `rocksdb_bench`)
- Python 3 (for result aggregation and plotting)

## 2. Storage Directory Configuration (Important)
`/nvme` is no longer hardcoded. You can use any mount directory.

Recommended: set one shared variable first:

```bash
export KV_BENCH_STORAGE_ROOT=/path/to/your/nvme_mount/kvbench
mkdir -p "${KV_BENCH_STORAGE_ROOT}"
```

All scripts below take this directory (or one of its subdirectories) as the first argument.

## 3. Initialization
```bash
cd /home/abby/kv_bench/scripts
./init.sh
source ./bin/activate
cd /home/abby/kv_bench
```

## 4. Quick Baseline Comparison (W1~W6)
Clean old data first:

```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_STORAGE_ROOT}/basic_rocks"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_STORAGE_ROOT}/basic_rocks"
```

Run both engines:

```bash
./scripts/mace.sh "${KV_BENCH_STORAGE_ROOT}/basic_mace" ./scripts/benchmark_results.csv
./scripts/rocksdb.sh "${KV_BENCH_STORAGE_ROOT}/basic_rocks" ./scripts/benchmark_results.csv
```

Generate plots:

```bash
./scripts/bin/python ./scripts/plot.py ./scripts/benchmark_results.csv ./scripts
```

## 5. Phase Reproduction Commands

### Phase 1
```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase1"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/phase1"
./scripts/phase1.sh "${KV_BENCH_STORAGE_ROOT}/phase1" ./scripts/phase1_results.csv
```

### Phase 2
```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase2"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/phase2"
./scripts/phase2.sh "${KV_BENCH_STORAGE_ROOT}/phase2" ./scripts/phase2_results.csv
```

Optional: enable tier-l representative subset:

```bash
RUN_TIER_L_REPRESENTATIVE=1 TIER_L_REPEATS=1 \
  ./scripts/phase2.sh "${KV_BENCH_STORAGE_ROOT}/phase2" ./scripts/phase2_results.csv
```

### Phase 3
```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase3"
mkdir -p "${KV_BENCH_STORAGE_ROOT}/phase3"
./scripts/phase3.sh "${KV_BENCH_STORAGE_ROOT}/phase3" ./scripts/phase3_results.csv
```

### Phase 4 (run one engine at a time)
Mace:
```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase4_mace"
./scripts/phase4_soak.sh mace "${KV_BENCH_STORAGE_ROOT}/phase4_mace" \
  ./scripts/phase4_results_mace.csv ./scripts/phase4_restart_mace.csv
```

RocksDB:
```bash
rm -rf "${KV_BENCH_STORAGE_ROOT}/phase4_rocks"
./scripts/phase4_soak.sh rocksdb "${KV_BENCH_STORAGE_ROOT}/phase4_rocks" \
  ./scripts/phase4_results_rocks.csv ./scripts/phase4_restart_rocks.csv
```

## 6. Where Result Inputs (CSV) Are Stored
Default output files:
- `./scripts/benchmark_results.csv`
- `./scripts/phase1_results.csv`
- `./scripts/phase2_results.csv`
- `./scripts/phase3_results.csv`
- `./scripts/phase4_results_*.csv`
- `./scripts/phase4_restart_*.csv`

The unified schema is emitted by both engine binaries (same format for mace/rocksdb). Key columns:
- `engine`: `mace` / `rocksdb`
- `workload_id`: `W1..W6`
- `durability_mode`: `relaxed` / `durable`
- `threads,key_size,value_size,prefill_keys`: case configuration
- `ops_per_sec`: throughput
- `p50_us,p95_us,p99_us,p999_us`: latency percentiles
- `error_ops`: number of failed operations
- `read_path`: `snapshot` / `rw_txn`

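Because both engines append to the same file, a quick header check can catch a stale or truncated CSV before any report is run. This is a minimal sketch, assuming the first CSV line is a header that uses exactly the key column names listed above (the real files may contain more columns); `KEY_COLUMNS` and `missing_columns` are hypothetical names, not part of the repository.

```python
import csv
import io

# Key columns as listed in the guide; the real schema may contain more.
KEY_COLUMNS = [
    "engine", "workload_id", "durability_mode",
    "threads", "key_size", "value_size", "prefill_keys",
    "ops_per_sec", "p50_us", "p95_us", "p99_us", "p999_us",
    "error_ops", "read_path",
]


def missing_columns(csv_text):
    """Return the key columns that are absent from the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [name for name in KEY_COLUMNS if name not in header]


# Illustrative header that lacks `read_path`.
header_line = ",".join(KEY_COLUMNS[:-1]) + "\n"
print(missing_columns(header_line))
```

To check a real file, pass the result of `open(path).read()` to the function.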
## 7. Where to Interpret Results

### Phase 1 (stability)
```bash
./scripts/bin/python ./scripts/phase1_eval.py ./scripts/phase1_results.csv
```
Check:
- `throughput_cv` (<=10%)
- `p99_cv` (<=15%)
- `stable` and overall pass ratio

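For readers who want to sanity-check the thresholds by hand, the coefficient of variation (CV) is the standard deviation divided by the mean. The sketch below assumes the sample standard deviation and at least two repeats per case; `phase1_eval.py` may compute it differently, and the names `cv_percent` and `is_stable` are hypothetical.

```python
import statistics


def cv_percent(samples):
    """Coefficient of variation in percent: sample stdev divided by mean."""
    mean = statistics.mean(samples)
    return statistics.stdev(samples) / mean * 100.0


def is_stable(throughputs, p99s, max_throughput_cv=10.0, max_p99_cv=15.0):
    """Apply the guide's thresholds: throughput CV <= 10% and p99 CV <= 15%."""
    return (cv_percent(throughputs) <= max_throughput_cv
            and cv_percent(p99s) <= max_p99_cv)


# Illustrative repeats of one case, not real measurements.
print(is_stable([100.0, 102.0, 98.0], [50.0, 52.0, 48.0]))
```

A case whose repeats swing widely, such as throughputs of 100, 150, and 50, fails the throughput check no matter how steady its p99 is.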
### Phase 2 (core report)
```bash
./scripts/bin/python ./scripts/phase2_report.py ./scripts/phase2_results.csv
```
Check:
- `throughput_median`
- `p95_median`, `p99_median`
- `slower_engine`, `slower_ratio`

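The guide does not define `slower_ratio`. One plausible reading, sketched here purely as an assumption, is the faster engine's median throughput divided by the slower engine's median throughput. The function name `compare_throughput` is hypothetical, and `phase2_report.py` remains the authority on the actual definition.

```python
import statistics


def compare_throughput(mace_runs, rocksdb_runs):
    """Return (medians, slower_engine, ratio) for repeated runs of one case.

    Assumption: ratio = faster median / slower median, so it is always >= 1.
    """
    medians = {
        "mace": statistics.median(mace_runs),
        "rocksdb": statistics.median(rocksdb_runs),
    }
    slower = min(medians, key=medians.get)
    faster = max(medians, key=medians.get)
    return medians, slower, medians[faster] / medians[slower]


# Illustrative ops_per_sec repeats, not real measurements.
medians, slower, ratio = compare_throughput([100, 200, 300], [50, 100, 150])
print(slower, ratio)
```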
### Phase 3 (durability cost)
```bash
./scripts/bin/python ./scripts/phase3_report.py ./scripts/phase3_results.csv
```
Check:
- `throughput_drop_pct` (durable vs relaxed throughput drop)
- `p99_inflation_pct` (durable vs relaxed p99 inflation)

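Both metrics compare a `durable` run against the matching `relaxed` run. The sketch below assumes each is a relative change expressed as a percentage of the relaxed value; the exact formula in `phase3_report.py` is not shown in this guide, so treat these definitions as an illustration only.

```python
def throughput_drop_pct(relaxed_ops, durable_ops):
    """Percent of relaxed throughput lost when running in durable mode."""
    return (relaxed_ops - durable_ops) / relaxed_ops * 100.0


def p99_inflation_pct(relaxed_p99_us, durable_p99_us):
    """Percent increase of p99 latency in durable mode over relaxed mode."""
    return (durable_p99_us - relaxed_p99_us) / relaxed_p99_us * 100.0


# Illustrative values: 1000 -> 750 ops/s and 100 -> 150 us.
print(throughput_drop_pct(1000, 750), p99_inflation_pct(100, 150))
```

Under these definitions a negative value means durable mode was faster (or had a lower p99) than relaxed mode for that case, which usually points to measurement noise.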
### Phase 4 (recovery capability)
```bash
./scripts/bin/python ./scripts/phase4_report.py ./scripts/phase4_restart_mace.csv
./scripts/bin/python ./scripts/phase4_report.py ./scripts/phase4_restart_rocks.csv
```
Check:
- `restart_success`
- `restart_ready_ms` at `p50/p95/p99/max`

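To see what the `p50/p95/p99/max` summary means for a list of restart times, here is a minimal sketch using the nearest-rank percentile method. That method is an assumption: `phase4_report.py` may interpolate instead, and the names `nearest_rank` and `summarize_restart` are hypothetical.

```python
def nearest_rank(sorted_values, pct):
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    n = len(sorted_values)
    rank = max(1, (pct * n + 99) // 100)  # integer ceiling of pct * n / 100
    return sorted_values[rank - 1]


def summarize_restart(ready_ms):
    """Summarize restart_ready_ms samples at p50/p95/p99/max."""
    values = sorted(ready_ms)
    return {
        "p50": nearest_rank(values, 50),
        "p95": nearest_rank(values, 95),
        "p99": nearest_rank(values, 99),
        "max": values[-1],
    }


# Illustrative samples: 1..100 ms, not real restart measurements.
print(summarize_restart(range(1, 101)))
```

With only a few restarts in a soak run, the p95 and p99 values collapse onto the maximum, so `max` is the more informative number in that situation.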
## 8. CLI Configurability (No Hardcoded Disk Prefix)
- Both benchmark binaries support `--path` to set the DB directory.
- All scripts use the first argument as storage root/path.
- You can point `${KV_BENCH_STORAGE_ROOT}` to any mount point (NVMe, SSD, RAID, ephemeral disk).

## 9. Comparison Best Practices
Only compare cases under identical dimensions:
- `workload_id`
- `key_size/value_size`
- `threads`
- `durability_mode`
- `read_path`

If `error_ops > 0`, investigate that case first before drawing performance conclusions.

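The pairing rule above can be sketched as a grouping step: rows are keyed by the comparison dimensions, and a key is kept only when both engines have a row and neither row reports errors. This sketch assumes one row per engine per configuration (repeated runs would need aggregating first) and rows shaped like `csv.DictReader` output; `DIMENSIONS` and `comparable_pairs` are hypothetical names.

```python
from collections import defaultdict

# Dimensions that must match before two rows are compared.
DIMENSIONS = ("workload_id", "key_size", "value_size",
              "threads", "durability_mode", "read_path")


def comparable_pairs(rows):
    """Map each configuration to (mace_row, rocksdb_row) when both are error-free."""
    groups = defaultdict(dict)
    for row in rows:
        key = tuple(row[name] for name in DIMENSIONS)
        groups[key][row["engine"]] = row
    pairs = {}
    for key, by_engine in groups.items():
        has_both = {"mace", "rocksdb"}.issubset(by_engine)
        clean = all(int(r["error_ops"]) == 0 for r in by_engine.values())
        if has_both and clean:
            pairs[key] = (by_engine["mace"], by_engine["rocksdb"])
    return pairs
```

Configurations that drop out of the result are either missing one engine or contain failed operations, and both cases deserve a look before any performance claim is made.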