docs: clarify comparison semantics and add baseline compare script

parent 9e1f1a884c
commit 6b8fcffd00

README.md | 38
@@ -1,6 +1,6 @@
 # kv_bench (Mace vs RocksDB)
 
-Quick start for reproducible comparison. Full guide: [docs/repro.md](./docs/repro.md).
+Quick start for reproducible Mace vs RocksDB comparison. Full guide: [docs/repro.md](./docs/repro.md).
 
 ## 5-Minute Quickstart
 
 1. Set your storage root (any mount path, not hardcoded to `/nvme`):
@@ -13,13 +13,13 @@ mkdir -p "${KV_BENCH_STORAGE_ROOT}"
 2. Initialize Python env once:
 
 ```bash
-cd /home/abby/kv_bench/scripts
+cd "$HOME/kv_bench/scripts"
 ./init.sh
 source ./bin/activate
-cd /home/abby/kv_bench
+cd "$HOME/kv_bench"
 ```
 
-3. Run baseline comparison (both engines write to the same CSV):
+3. Run baseline comparison (both engines append to the same CSV):
 
 ```bash
 rm -rf "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_STORAGE_ROOT}/basic_rocks"
@@ -29,20 +29,32 @@ mkdir -p "${KV_BENCH_STORAGE_ROOT}/basic_mace" "${KV_BENCH_STORAGE_ROOT}/basic_r
 ./scripts/rocksdb.sh "${KV_BENCH_STORAGE_ROOT}/basic_rocks" ./scripts/benchmark_results.csv
 ```
 
-4. View and plot results:
+4. Plot results:
 
 ```bash
 ./scripts/bin/python ./scripts/plot.py ./scripts/benchmark_results.csv ./scripts
 ```
 
-## Fast Result Reading
-- Raw input CSV: `./scripts/benchmark_results.csv`
-- Key columns:
-  - `engine` (`mace` / `rocksdb`)
-  - `workload_id` (`W1..W6`)
-  - `ops_per_sec` (higher is better)
-  - `p99_us` (lower is better)
-  - `error_ops` (must be 0 before drawing conclusions)
+5. Print a direct comparison table from the CSV:
+
+```bash
+./scripts/bin/python ./scripts/compare_baseline.py ./scripts/benchmark_results.csv
+```
+
+## What Is Compared
+- Comparison unit: rows with identical `workload_id`, `threads`, `key_size`, `value_size`, `durability_mode`, `read_path`
+- Throughput metric: workload-level `ops_per_sec` (higher is better)
+  - `W1/W2/W3/W4`: mixed read+update throughput
+  - `W5`: mixed read+update+scan throughput
+  - `W6`: scan throughput (counted by scan requests, not scanned key count)
+- Tail latency metric: workload-level `p99_us` (lower is better)
+  - This is the mixed p99 of all operations executed in that workload row, not per-op-type p99
+  - `W1/W2/W3/W4`: mixed read+update p99
+  - `W5`: mixed read+update+scan p99
+  - `W6`: scan p99
+- Reliability gate: if `error_ops > 0`, debug that case before drawing conclusions
+
+Raw CSV path: `./scripts/benchmark_results.csv`
 
 ## Phase Reports
 - Phase 1 (stability CV):
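The comparison-unit semantics described in "What Is Compared" can be sketched with a toy frame. This is a hedged illustration: the column names follow the CSV schema, but all values (and the `read_path` string) are made up for the example.

```python
import pandas as pd

# Hypothetical rows following the benchmark CSV schema (values are made up).
cols = ["engine", "workload_id", "threads", "key_size", "value_size",
        "durability_mode", "read_path", "ops_per_sec", "p99_us", "error_ops"]
rows = [
    ("mace",    "W1", 4, 16, 100, "sync", "get", 120000.0, 250.0, 0),
    ("rocksdb", "W1", 4, 16, 100, "sync", "get", 100000.0, 300.0, 0),
    ("mace",    "W6", 4, 16, 100, "sync", "get",   9000.0, 900.0, 7),  # fails the gate
]
df = pd.DataFrame(rows, columns=cols)

# One comparison unit = one tuple of these key columns.
keys = ["workload_id", "threads", "key_size", "value_size",
        "durability_mode", "read_path"]

# Reliability gate: rows with error_ops > 0 are excluded before comparing.
ok = df[df["error_ops"] == 0]

# Pair engines within each unit and compute the throughput ratio.
piv = ok.pivot_table(index=keys, columns="engine",
                     values="ops_per_sec", aggfunc="first")
qps_ratio = piv["mace"] / piv["rocksdb"]
print(qps_ratio.iloc[0])  # 1.2: mace has 20% higher throughput in this unit
```

Note how the erroring W6 row never enters the comparison, which is exactly the reliability gate the bullet list describes.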
@@ -22,8 +22,8 @@ For the local profile, assume approximately:
 Before running, set paths and verify filesystem type/capacity:
 
 ```bash
-export KV_BENCH_STORAGE_ROOT=/home/abby/kv_bench/target/repro_storage
-export KV_BENCH_RESULT_ROOT=/home/abby/kv_bench/target/repro_results
+export KV_BENCH_STORAGE_ROOT="$HOME/kv_bench/target/repro_storage"
+export KV_BENCH_RESULT_ROOT="$HOME/kv_bench/target/repro_results"
 mkdir -p "${KV_BENCH_STORAGE_ROOT}" "${KV_BENCH_RESULT_ROOT}"
 
 df -hT "${KV_BENCH_STORAGE_ROOT}" "${KV_BENCH_RESULT_ROOT}"

@@ -36,10 +36,10 @@ Requirements:
 
 ## 3. Initialization
 ```bash
-cd /home/abby/kv_bench/scripts
+cd "$HOME/kv_bench/scripts"
 ./init.sh
 source ./bin/activate
-cd /home/abby/kv_bench
+cd "$HOME/kv_bench"
 ```
 
 ## 4. Quick Baseline (W1~W6)
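The point of these hunks is replacing the hardcoded `/home/abby` prefix with `$HOME`. A quick sanity check of the quoting used in the new lines (paths are illustrative):

```shell
# $HOME expands inside double quotes; the braces guard against adjacent
# characters being read as part of the variable name.
export KV_BENCH_STORAGE_ROOT="$HOME/kv_bench/target/repro_storage"
echo "${KV_BENCH_STORAGE_ROOT}"

# Single quotes would NOT expand, so the literal string would leak into paths:
echo '$HOME/kv_bench/target/repro_storage'
```

Double quotes also keep the whole path as one word even if `$HOME` contains spaces, which is why the new diff lines quote every expansion.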
scripts/compare_baseline.py | 90 (new file)
@@ -0,0 +1,90 @@
+#!/usr/bin/env python3
+
+import argparse
+import sys
+import pandas as pd
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(
+        description="Compare mace vs rocksdb from benchmark_results.csv"
+    )
+    parser.add_argument(
+        "csv_path",
+        nargs="?",
+        default="./scripts/benchmark_results.csv",
+        help="Path to benchmark CSV (default: ./scripts/benchmark_results.csv)",
+    )
+    args = parser.parse_args()
+
+    df = pd.read_csv(args.csv_path)
+
+    required = {
+        "engine",
+        "workload_id",
+        "threads",
+        "key_size",
+        "value_size",
+        "durability_mode",
+        "read_path",
+        "ops_per_sec",
+        "p99_us",
+        "error_ops",
+    }
+    missing = required - set(df.columns)
+    if missing:
+        raise ValueError(f"Missing columns in csv: {sorted(missing)}")
+
+    keys = [
+        "workload_id",
+        "threads",
+        "key_size",
+        "value_size",
+        "durability_mode",
+        "read_path",
+    ]
+
+    ok = df[df["error_ops"] == 0].copy()
+    if ok.empty:
+        print("No rows with error_ops == 0, cannot compare.")
+        return 0
+
+    agg = ok.groupby(keys + ["engine"], as_index=False).agg(
+        ops_per_sec=("ops_per_sec", "median"),
+        p99_us=("p99_us", "median"),
+    )
+
+    piv = agg.pivot_table(
+        index=keys,
+        columns="engine",
+        values=["ops_per_sec", "p99_us"],
+        aggfunc="first",
+    )
+    piv.columns = [f"{metric}_{engine}" for metric, engine in piv.columns]
+    out = piv.reset_index()
+
+    for col in [
+        "ops_per_sec_mace",
+        "ops_per_sec_rocksdb",
+        "p99_us_mace",
+        "p99_us_rocksdb",
+    ]:
+        if col not in out.columns:
+            out[col] = pd.NA
+
+    out["qps_ratio_mace_over_rocksdb"] = (
+        out["ops_per_sec_mace"] / out["ops_per_sec_rocksdb"]
+    )
+    out["p99_ratio_mace_over_rocksdb"] = out["p99_us_mace"] / out["p99_us_rocksdb"]
+    out = out.sort_values(keys)
+
+    print(out.to_string(index=False))
+    print("\nInterpretation:")
+    print("- qps_ratio_mace_over_rocksdb > 1: mace has higher throughput")
+    print("- p99_ratio_mace_over_rocksdb < 1: mace has lower p99 latency")
+
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
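One subtle piece of the new script is the column guard before the ratio math: if only one engine survives the `error_ops == 0` gate, the pivot produces columns for that engine only, and indexing the missing columns would raise `KeyError`. A minimal reproduction of that case (toy data, not benchmark output):

```python
import pandas as pd

keys = ["workload_id", "threads", "key_size", "value_size",
        "durability_mode", "read_path"]

# Only mace rows survive the error gate in this toy frame.
agg = pd.DataFrame({
    "workload_id": ["W1"], "threads": [4], "key_size": [16], "value_size": [100],
    "durability_mode": ["sync"], "read_path": ["get"],
    "engine": ["mace"], "ops_per_sec": [120000.0], "p99_us": [250.0],
})

piv = agg.pivot_table(index=keys, columns="engine",
                      values=["ops_per_sec", "p99_us"], aggfunc="first")
piv.columns = [f"{metric}_{engine}" for metric, engine in piv.columns]
out = piv.reset_index()

# Without this guard, out["ops_per_sec_rocksdb"] below would raise KeyError.
for col in ["ops_per_sec_mace", "ops_per_sec_rocksdb",
            "p99_us_mace", "p99_us_rocksdb"]:
    if col not in out.columns:
        out[col] = pd.NA

out["qps_ratio_mace_over_rocksdb"] = (
    out["ops_per_sec_mace"] / out["ops_per_sec_rocksdb"]
)
print(out["qps_ratio_mace_over_rocksdb"].isna().all())  # ratio is NA, not a crash
```

So a one-sided result degrades to NA ratios in the printed table instead of failing, which matches the script's "compare only where both engines have clean rows" intent.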