Benchmarks: Statistics — model & plot Aiur/Zisk benchmark cost data by samuelburnham · Pull Request #469 · argumentcomputer/ix

samuelburnham · 2026-07-02T02:22:44Z

General-purpose Python project (benchstats) for modelling and plotting measured benchmark data across proving stacks. Two regimes per system: profiled features -> cost (predict Aiur FFT / Zisk cycles from cheap out-of-circuit counters), and cost -> runtime (time / throughput / RAM).

- benchstats/ : load, fit, plot, CLI (aiur-predictor, aiur-runtime, zisk-predictor, zisk-runtime, zisk-prove-validate, all).
- data/aiur/ : FFT cost + exec/prove/RAM (98); profiler features + FFT (65).
- data/zisk/ : single-shard (66) / multi-shard (12) features+cycles; prove_real.csv (23 real GPU leaves) + multishard_failed.csv.
- docs/ : Aiur cost models + predictor; Zisk shard.rs model vs real proves.
- flake.nix : dev shell ships python3 + matplotlib.
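
For readers unfamiliar with the "profiled features -> cost" regime, here is a minimal sketch of a one-feature least-squares fit. It is an illustration only: the PR text does not say which fitting method benchstats uses, and `fit_line`, `predict`, and the argument names are hypothetical, not part of the package.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y ~ intercept + slope * x.

    xs: one cheap out-of-circuit counter per benchmark (hypothetical feature).
    ys: the measured cost for the same benchmark (e.g. FFT cost or cycles).
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature has no variance; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept: float, slope: float, x: float) -> float:
    """Predicted cost for a new feature value."""
    return intercept + slope * x
```

The same shape applies to the second regime (cost -> runtime), with cost as `xs` and time, throughput, or RAM as `ys`.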

Validation of the actual load-bearing Zisk model: zisk-prove-validate reads the planner constants straight from the ix repo's crates/kernel/src/shard.rs (no Python-fit function) and plots them against 23 real GPU leaf proves measured via zisk-host + /usr/bin/time -v on an RTX PRO 6000 (the calibration HW): 8 single-subject + 6 mergesort sharded (witness=5) + 9 multi-shard (mergesort/rbmap/binsearch) at witness=10.

Findings: RAM 50+33·B is R²=0.80 (slope exact at 33 GiB/Bstep; base ~12 GiB conservative). prove-time 54+158·B is R²=0.93 (slope tracks; 54s intercept is a cold-start artifact). --max-witness-stored 10 barely changes RAM (+2-4 GiB; no recalibration needed). Sharding the whole initStd library env is infeasible — single atomic Muts blocks crash the prover (~13.5 B steps) or OOM >235 GB (~2.4 B, model under-predicts ~2x); clean multi-shard data comes from baked programs. See docs/zisk-prove-validation.md.
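
The two linear models quoted above can be written out as a small sketch. The constants 50, 33, 54, and 158 come from the findings; the function names are hypothetical, and the units (GiB for RAM, seconds for prove time, B in billions of steps) are inferred from "33 GiB/Bstep" and "54s intercept". The real constants live in shard.rs, so treat this as an illustration rather than the planner's code.

```python
from typing import Sequence


def predicted_ram_gib(b_steps: float, base: float = 50.0, slope: float = 33.0) -> float:
    """RAM model from the findings: 50 + 33*B, with B in billions of steps."""
    return base + slope * b_steps


def predicted_prove_seconds(b_steps: float, base: float = 54.0, slope: float = 158.0) -> float:
    """Prove-time model from the findings: 54 + 158*B, with B in billions of steps."""
    return base + slope * b_steps


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination of a fixed model against measurements.

    Because the model is not refit to the data, the result can be negative
    when the model does worse than predicting the mean.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need at least two paired values")
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("observed values have no variance")
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot
```

For a block of about 2.4 B steps, `predicted_ram_gib(2.4)` gives roughly 129 GiB, which is in line with the reported ~2x under-prediction against the >235 GB OOM.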

Reproducible: Aiur via bench-typecheck / ix check --stats-out; native features via only_const_profile; Zisk cycles/proves via zisk-host; model constants from shard.rs (the single source of truth).

samuelburnham wants to merge 1 commit into main from sb/aiur-fft-cost-model
