Benchmarks: Statistics — model & plot Aiur/Zisk benchmark cost data#469
Draft
samuelburnham wants to merge 1 commit into
General-purpose Python project (`benchstats`) for modelling and plotting
measured benchmark data across proving stacks. Two regimes per system:
profiled features -> cost (predict Aiur FFT / Zisk cycles from cheap
out-of-circuit counters), and cost -> runtime (time / throughput / RAM).
- benchstats/ : load, fit, plot, CLI (aiur-predictor, aiur-runtime,
zisk-predictor, zisk-runtime, zisk-prove-validate, all).
- data/aiur/ : FFT cost + exec/prove/RAM (98); profiler features + FFT (65).
- data/zisk/ : single-shard (66) / multi-shard (12) features+cycles;
prove_real.csv (23 real GPU leaves) + multishard_failed.csv.
- docs/ : Aiur cost models + predictor; Zisk shard.rs model vs real proves.
- flake.nix : dev shell ships python3 + matplotlib.
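The "profiled features -> cost" regime can be illustrated with a minimal sketch, assuming the simplest possible model: one counter and an ordinary least-squares line. The function name `fit_line` and the sample values are hypothetical and are not taken from `benchstats` or from the CSVs under data/; the real predictors may use several features or a different functional form.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x (pure Python)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic, illustrative values only: a cheap out-of-circuit counter
# and the cost it is meant to predict.
counters = [1.0, 2.0, 3.0, 4.0]
costs = [5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_line(counters, costs)
predicted_cost = intercept + slope * 5.0  # extrapolate to an unseen counter value
```

The same fitted pair `(intercept, slope)` can then be plotted against the measured points with matplotlib, which the dev shell already provides.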
Validation of the *actual* load-bearing Zisk model: `zisk-prove-validate` reads
the planner constants straight from the ix repo's crates/kernel/src/shard.rs
(no Python-fit function) and plots them against 23 real GPU leaf proves measured
via zisk-host + /usr/bin/time -v on an RTX PRO 6000 (the calibration HW):
8 single-subject + 6 mergesort sharded (witness=5) + 9 multi-shard
(mergesort/rbmap/binsearch) at witness=10.
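The measurement step above relies on `/usr/bin/time -v`, which in GNU time prints a "Maximum resident set size (kbytes)" line and an "Elapsed (wall clock) time" line. A minimal sketch of extracting both is shown below; the function name `parse_time_v` and the sample text are hypothetical, and this is not necessarily how `benchstats` ingests the data.

```python
import re


def parse_time_v(text: str) -> dict:
    """Extract peak RSS (GiB) and wall-clock seconds from GNU `time -v` output."""
    rss = re.search(r"Maximum resident set size \(kbytes\):\s*(\d+)", text)
    wall = re.search(
        r"Elapsed \(wall clock\) time \(h:mm:ss or m:ss\):\s*([\d:.]+)", text
    )
    if rss is None or wall is None:
        raise ValueError("input does not look like GNU time -v output")
    # Wall clock is printed as h:mm:ss or m:ss.ss; fold the fields left to right.
    seconds = 0.0
    for part in wall.group(1).split(":"):
        seconds = seconds * 60 + float(part)
    # GNU time reports RSS in units of 1024 bytes, so divide by 1024**2 for GiB.
    return {"peak_rss_gib": int(rss.group(1)) / 1024**2, "wall_s": seconds}


# Made-up sample lines in the GNU time -v layout (not a real measurement).
SAMPLE = (
    "\tElapsed (wall clock) time (h:mm:ss or m:ss): 3:25.50\n"
    "\tMaximum resident set size (kbytes): 52428800\n"
)
result = parse_time_v(SAMPLE)
```

Since `time -v` writes its report to stderr, the text passed in would typically come from a captured stderr stream or a log file.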
Findings (B = billions of steps):
- RAM model 50+33·B GiB: R²=0.80 (slope exact at 33 GiB/Bstep; base is ~12 GiB conservative).
- Prove-time model 54+158·B s: R²=0.93 (slope tracks; the 54s intercept is a cold-start artifact).
- --max-witness-stored 10 barely changes RAM (+2-4 GiB; no recalibration needed).
- Sharding the whole initStd library env is infeasible: single atomic Muts blocks either crash the prover (~13.5 B steps) or OOM at >235 GB (~2.4 B steps, where the model under-predicts by ~2x). Clean multi-shard data comes from baked programs.

See docs/zisk-prove-validation.md.
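The two linear models quoted in the findings can be evaluated with a short sketch. The coefficients below are copied from this description rather than read from shard.rs, and the function names are hypothetical. The `r_squared` helper scores a fixed (not refitted) model against measurements, which is one plausible reading of how the R² values were obtained, not a confirmed one.

```python
def predicted_ram_gib(bsteps: float) -> float:
    """RAM model from the findings: 50 + 33*B GiB, B in billions of steps."""
    return 50.0 + 33.0 * bsteps


def predicted_prove_s(bsteps: float) -> float:
    """Prove-time model from the findings: 54 + 158*B seconds."""
    return 54.0 + 158.0 * bsteps


def r_squared(measured, predicted):
    """Coefficient of determination of a fixed model against measurements."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


ram_1b = predicted_ram_gib(1.0)    # 83.0 GiB
time_1b = predicted_prove_s(1.0)   # 212.0 s
ram_oom_case = round(predicted_ram_gib(2.4), 1)  # 129.2 GiB
```

At roughly 2.4 B steps the RAM line gives about 129 GiB, while the run was OOM-killed above 235 GB, which is in line with the ~2x under-prediction reported for the atomic-block case.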
Reproducible: Aiur via bench-typecheck / ix check --stats-out; native features
via only_const_profile; Zisk cycles/proves via zisk-host; model constants from
shard.rs (the single source of truth).