Compiler fixes for auxiliary constant generation#473
Open
johnchandlerburnham wants to merge 3 commits into
Open
Compiler fixes for auxiliary constant generation#473johnchandlerburnham wants to merge 3 commits into
johnchandlerburnham wants to merge 3 commits into
Conversation
Remove the `recr`/`refl` bools and the `nested` count from the Ixon `Inductive` constant and its serialization (Rust and Lean), and from the `Indc` reveal-proof variant, renumbering the field-presence mask bits. These flags are derivable from constructor structure, so storing them was redundant and trusting declared values was an adversarial surface (e.g. is_rec = false on a recursive inductive enables improper struct-eta). - kernel: KConst::Indc loses is_rec/is_refl/nested. is_rec is now computed on demand (computed_is_rec), memoized in the new env is_rec_cache with a provisional entry to break the whnf -> try_struct_eta_iota -> is_struct_like cycle. This replaces the declared-vs-computed H1 verification in check_inductive. - compile: new compute_lean_ind_flags recomputes Lean's block-wide isRec/isReflexive/numNested wherever an InductiveVal is reconstructed without a source Lean env (kernel egress, decompile), since Ixon no longer stores the flags; validate_lean_ind_flags checks a whole env against the recomputation. - tests/benchmarks: add AuxDedup1/AuxDedup2 mutual fixtures exercising aux-constant dedup across blocks (fix forthcoming); add a CompileMutualFixtures benchmark lib building the mutual test fixtures; ignore *.ixe.
Evaporated auxiliaries (over-merge splits): when SCC splitting strands a nested aux's spec-param inductives outside the owner's SCC, no SCC holds the joint family, and dropping the irrelevant over-merged motives leaves exactly the external inductive's generic recursor. Canonical treatment: `rec_N` claims alias `<Ext>.rec` (e.g. `List.rec`), call sites are rebuilt onto the external telescope via head-rewrite CallSitePlans (owner-gated, single-motive targets), and `below_N`/`brecOn_N` compile as surgered originals like `_sizeOf_N`. Fixes the AuxDedup kernel-check failures (28 -> 0); AuxDedup1 now generates identical auxiliaries to AuxDedup2 (the canonical structure). New AuxDedupMixed fixture covers a perm mixing a canonical slot and PERM_OUT_OF_SCC for the same owner. Documented in docs/ix_canonicity.md 6.5. Call-site surgery guard is now durable across serialization: aux-regen detection accepts `Named.original.is_some()` in addition to the in-memory `aux_name_to_addr`, so deserialized-state roundtrip recompiles no longer misapply surgery. Shift-aware `instantiate_rev` replaces unshifted substitution in the type-walking helpers (fixes fvar leaks in `.brecOn.go` bodies). Byte-exact aux roundtrip: `roundtrip_block` Phase A now preseeds the ref/univ tables (`preseed_expr_tables`) like every production compile path. The serialized constant embeds those tables in sorted order; compiling without the preseed filled them in traversal order instead, permuting every `Ref`/univ index — byte-different but semantically identical constants (decode resolves through the embedded table). This silently failed the Phase-A address comparison against `Named.original.0` for 1529 of 1545 aux constants (including plain stdlib like `Nat.casesOn`); a debug probe proved compile(original) == compile(regen) in every case, i.e. the regeneration itself was always faithful. With the invariant holding corpus-wide, the Phase-A recompile-hash mismatch is now a hard error with no aux exemption, and every roundtrip arm records failures in `aux_gen_errors` (recovery keeps the Lean-facing env populated for diagnosis but is never silent). Pass-2 scope hygiene: the below-def roundtrip loop filters by the original-gated `aux_members` like its sibling loops, so evaporated `below_N` keep their faithful Pass-1 decompile. IX_ROUNDTRIP_DEBUG now dumps hashed component scalars/hashes and runs an original-form recompile probe for any mismatch. Test fixes: kernel-tutorial `bad_raw_consts` inductive fixtures carry recomputation-honest flags so compile-side `validate_ind_flags` no longer poisons the shared tutorial env (73/335 -> 335/335, with the kernel rejecting each bad fixture as designed); validate-aux seeds match module-private fixture names via `privateToUserName?` and enable the Canonicity prefix; Phase 4b gains per-module markers so a fully absent identity group fails loudly when its fixture module is loaded (previously vacuous at 0 pass / 0 fail, now 109 pass). Gates: kernel-check-env 201296/201296; rust-compile all phases with 0 aux_gen errors, 0 mismatches, and 0 Phase-A address divergences on the full 213k env (live and deserialized); validate-aux 0 failures at 4393-constant scope; rust-serialize byte-exact; kernel-ixon-roundtrip 143694/0; kernel-tutorial 335/335; cargo test and lake test green.
Behavior-neutral cleanups flagged by `cargo clippy --all-targets`: map_or over map+unwrap_or and slice::contains in surgery.rs, an enumerate loop for the motive-peeling walk in aux_motive_sigs, and let-chain collapses for the inductive-flags fixup loops in decompile.rs and kernel_egress.rs. Plus `cargo fmt` line-wrapping drift left over from the previous commit.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two related changes to the content-addressing layer and its aux-constant handling, fixing underlying issue reported in #465
1. Ixon no longer stores
recr/refl/nestedonInductive. These are derivablefrom constructor structure, so storing them was redundant and trusting declared values
was an adversarial surface (e.g.
is_rec = falseon a recursive inductive enablesimproper struct-eta). The kernel now computes
is_recon demand, memoized in a new envcache with a provisional entry to break the
whnf → try_struct_eta_iota → is_struct_likecycle; this replaces the declared-vs-computed check in
check_inductive. The compileside gains
compute_lean_ind_flagsto recompute Lean's block-wideisRec/isReflexive/numNestedwherever anInductiveValis reconstructed without asource Lean env (kernel egress, decompile), and
validate_lean_ind_flagsto check awhole env against the recomputation.
2. Evaporated auxiliaries get a canonical form. When SCC splitting strands a nested
aux's spec-param inductives outside the owner's SCC, no SCC holds the joint family, and
dropping the irrelevant over-merged motives leaves exactly the external inductive's
generic recursor. Canonical treatment:
rec_Nclaims alias<Ext>.rec(e.g.List.rec), call sites are rebuilt onto the external telescope via head-rewriteCallSitePlans (owner-gated, single-motive targets), and
below_N/brecOn_Ncompile assurgered originals like
_sizeOf_N. Fixes the AuxDedup kernel-check failures (28 → 0);AuxDedup1 now generates auxiliaries identical to AuxDedup2's canonical structure.
Documented in
docs/ix_canonicity.md§6.5.Byte-exact aux roundtrip
roundtrip_blockPhase A (recompile the regenerated Lean form, compare against thestored original address) silently failed for 1529 of 1545 aux constants — including
plain stdlib like
Nat.casesOn. Root cause: every production compile path preseeds theref/univ tables in sorted order (
preseed_expr_tables) before compiling, and theserialized constant embeds those tables; Phase A compiled without the preseed, filling
the tables in traversal order instead — every
Ref/univ index permuted, byte-differentbut semantically identical constants (decode resolves through the embedded table). A
debug probe recompiling the Lean original through the identical path proved
compile(original) == compile(regen)in every case: regeneration was always faithful,the comparison context was not.
With the preseed mirrored in Phase A the invariant holds corpus-wide, so a Phase-A
recompile-hash mismatch is now a hard error with no aux exemption, and every
roundtrip arm records failures in
aux_gen_errors(recovery keeps the Lean-facing envpopulated for diagnosis but is never silent). Related hardening: call-site surgery
detection is durable across serialization (
Named.original.is_some()alongside thein-memory map), shift-aware
instantiate_revin the type-walking helpers (fixes fvarleaks in
.brecOn.gobodies), and the below-def roundtrip loop filters by theoriginal-gated members like its sibling loops.
IX_ROUNDTRIP_DEBUGnow dumps hashedcomponent summaries and runs an original-form recompile probe on any mismatch.
Test fixes and fixtures
bad_raw_constsinductive fixtures carry recomputation-honest flagsso the whole-env
validate_ind_flagsno longer poisons the shared tutorial env(73/335 → 335/335, with the kernel rejecting each bad fixture as designed).
privateToUserName?, theCanonicity prefix is enabled, and Phase 4b gains per-module markers so a fully absent
identity group fails loudly when its fixture module is loaded (previously vacuous at
0 pass / 0 fail, now 109 pass / 0 fail).
mixing a canonical slot and PERM_OUT_OF_SCC for the same owner), plus a
CompileMutualFixtures benchmark lib.
Gates
divergences on the full 213k env (live and deserialized)
cargo clippy --all-targetsclean