Build #97 + #99 + #100 MVPs (recommended defaults): nudges, benchmark engine, multi-device sync by 0bserver07 · Pull Request #110 · 0bserver07/StackUnderflow

0bserver07 · 2026-07-03T14:31:11Z

Builds the MVPs of the three designed roadmap issues (#97, #99, #100) with the maintainer-ratified recommended defaults. Three file-disjoint commits (only cli.py is shared, in non-overlapping regions), integrated cleanly, full suite green.

#97 — Active-surfacing nudges (`hooks/proactive.py`)

Phase 0 (governance) + Phase 1 (command-cluster nudge), default-off. A deterministic should_surface gate with per-session dedupe, frequency cap, cross-session cooldown, and dismiss-based adaptive quieting; state in a file-locked ~/.stackunderflow/proactive_state.json (never the DB, no writer contention). The shipped recall.py file-risk nudge is wrapped so default-off = byte-identical shipped behavior (zero regression); the command-cluster signal is precomputed on ingest and looked up O(1) via _normalise_command. No LLM/network on the hook path. 43 tests.
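For reviewers, the gate order described above can be sketched roughly as follows. This is a minimal illustration, not the code in `hooks/proactive.py`: `GateConfig`, `NudgeState`, `record_shown`, every threshold value, and the `should_surface` signature are assumptions made for the example, and the file-locked JSON persistence is left out.

```python
from dataclasses import dataclass, field


@dataclass
class GateConfig:
    # Hypothetical knobs; the values are illustrative, not the shipped defaults.
    enabled: bool = False             # default-off: nothing surfaces unless opted in
    max_per_session: int = 3          # frequency cap
    cooldown_seconds: float = 3600.0  # cross-session cooldown per nudge key
    quiet_after_dismissals: int = 2   # dismiss-based adaptive quieting


@dataclass
class NudgeState:
    # Hypothetical in-memory stand-in for the persisted state file.
    shown_this_session: set = field(default_factory=set)
    last_shown_at: dict = field(default_factory=dict)  # nudge key -> unix time
    dismissals: dict = field(default_factory=dict)     # nudge key -> dismiss count


def should_surface(key: str, now: float, cfg: GateConfig, state: NudgeState) -> bool:
    """Deterministic gate: same inputs always give the same answer."""
    if not cfg.enabled:
        return False
    if key in state.shown_this_session:                      # per-session dedupe
        return False
    if len(state.shown_this_session) >= cfg.max_per_session:  # frequency cap
        return False
    last = state.last_shown_at.get(key)
    if last is not None and now - last < cfg.cooldown_seconds:  # cooldown
        return False
    if state.dismissals.get(key, 0) >= cfg.quiet_after_dismissals:  # quieting
        return False
    return True


def record_shown(key: str, now: float, state: NudgeState) -> None:
    """Update state after a nudge was actually shown."""
    state.shown_this_session.add(key)
    state.last_shown_at[key] = now
```

Because the gate takes `now` as an argument and reads no clock, network, or model, it stays cheap on the hook path and easy to test.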

#99 — Comparative benchmark engine (`reports/benchmark.py`)

Observational, stratified (intent × size) benchmark over local history with statistical honesty: Wilson intervals, seeded bootstrap, Benjamini–Hochberg FDR, direct standardization (Simpson's-paradox-safe), sample floors, and "insufficient evidence" as a first-class verdict. services/benchmark_stats.py is stdlib-only. Rubric ratified in docs/specs/benchmark-rubric-v1.md (weights .45/.35/.20, τ=7.0, 90% CI). Route + benchmark CLI group (--json via the memory envelope) + recommend_model_for_task meta-agent tool + a Compare-tab "Which model wins" panel. Cost only ever read from session_mart. /api/benchmark warm = 1.7ms vs a 200ms budget. Move 0 unified task-classification into one canonical classify_task (tag/recommender tests kept green). 62 tests.
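As an illustration of the Benjamini–Hochberg step named above, here is a stdlib-only sketch of the standard step-up procedure. It is not the code in `services/benchmark_stats.py`: the function name, its signature, and the `alpha` value are assumptions for the example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Standard step-up rule: sort the p-values, find the largest rank k with
    p_(k) <= (k / m) * alpha, and reject every hypothesis ranked 1..k.
    The alpha default here is illustrative only.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank  # keep the largest rank that passes
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected
```

Note that the rule is step-up: a p-value that fails its own threshold is still rejected if a larger-ranked one passes, which is why the cutoff is computed over the whole sorted list before any decision is made.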

#100 — Multi-device sync MVP (`stackunderflow/sync/`)

Phase 1: one-way, client-side-encrypted, BYO-bucket backup of the Overview/Cost-core mart aggregates only (never transcripts/usage_events/price_book). age via pyrage; a narrow ObjectStore (boto3 + an in-memory fake); deterministic shards re-keyed local project_id → stable (provider, slug); two-phase manifest commit; skip-if-unchanged outbox. sync init/push/status. Additive v028 migration (sync_identity, sync_outbox); an import-guarded [sync] extra (version untouched). Default-off is byte-identical. 59 tests (crypto path gated on pyrage).
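The re-keying and skip-if-unchanged ideas above might look roughly like this. It is a sketch under assumptions, not the code in `stackunderflow/sync/`: `shard_key`, `canonical_bytes`, `plan_push`, the key layout, and the choice of SHA-256 over canonical JSON of the plaintext are all hypothetical, and encryption, upload, and the manifest commit are omitted.

```python
import hashlib
import json


def shard_key(provider: str, slug: str) -> str:
    """Stable shard name derived from (provider, slug), not a local project_id."""
    return f"{provider}/{slug}"


def canonical_bytes(rows) -> bytes:
    """Serialize deterministically so equal content always hashes the same."""
    return json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")


def plan_push(shards: dict, outbox: dict) -> list:
    """Return (key, digest) pairs for shards whose content changed.

    `outbox` maps shard key -> digest of the last successfully pushed content.
    It is not mutated here; the caller records a digest only after the upload
    and manifest commit have succeeded.
    """
    pending = []
    for key in sorted(shards):
        digest = hashlib.sha256(canonical_bytes(shards[key])).hexdigest()
        if outbox.get(key) != digest:
            pending.append((key, digest))
    return pending
```

Keeping the outbox update out of `plan_push` means a failed upload leaves the shard marked as still pending, so the next push retries it.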

Verification

Full suite: 4001 passed, 2 skipped (the one transient failure was a wall-clock hook-latency p99 budget test under concurrent build load; it passes 5/5 in isolation and is one of the known load-sensitive perf tests)
ruff --select E,F clean · tsc typecheck clean · vite build clean (benchmark panel bundled) · version guard green · contract validator green · test_pricing_invariants green
Recommended defaults applied throughout; all maintainer-owned knobs surfaced/configurable

🤖 Generated with Claude Code

Note

Supersedes #109 (its branch was deleted before merge). Fixes a platform-dependent float-precision assert in a #99 Wilson-interval test (lo == 0.0 → approx(0.0, abs=1e-9)) that failed only on the CI runner (2.8e-17 != 0.0).
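To show why the exact comparison was fragile, here is a sketch of a textbook Wilson score interval. It is not the project's implementation: the function name and signature are assumptions, and `math.isclose` stands in for the `pytest.approx` call used in the actual fix. With zero successes the centre and the half-width are mathematically equal, so the lower bound is 0 on paper, but the two terms are computed along different floating-point paths and can differ by a few ulps on some platforms.

```python
import math
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.90):
    """Wilson score interval for a binomial proportion (textbook formula)."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


lo, hi = wilson_interval(0, 20)
# Compare with an absolute tolerance instead of `lo == 0.0`.
assert math.isclose(lo, 0.0, abs_tol=1e-9)
```

An absolute tolerance is needed here because a relative tolerance is meaningless when the expected value is exactly zero.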

socket-security · 2026-07-03T14:32:02Z

Review the following changes in direct dependencies:

pypi/boto3@1.43.40
pypi/pyrage@1.3.0


0bserver07 and others added 3 commits July 3, 2026 10:18

feat(hooks): active-surfacing proactive nudges (#97) — governance lay…

87b7ef2

…er + command-cluster nudge Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

feat(sync): multi-device encrypted-backup MVP (#100) — age/BYO-bucket…

13c5655

…, aggregates-only, v028 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

feat(benchmark): comparative benchmark engine MVP (#99) — stratified/…

92270da

…CI-guarded, Compare-tab panel Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

0bserver07 merged commit 99b7df5 into main Jul 3, 2026
15 checks passed

0bserver07 deleted the feat/roadmap-mvps-97-99-100 branch July 3, 2026 14:34
