feat(estimators): iterative alternating-projection demeaning for N-way absorbed FE #586
PR Review:
93f8586 to 7af6c6e
🔁 AI review rerun (requested by @igerber)
Head SHA: […]

PR Review: ✅ Looks good

Overall Assessment
✅ Looks good: no unmitigated P0 or P1 findings. The previous P1 zero-total-weight absorbed-group issue and previous P2 […]

Executive Summary
[…]

Methodology
No methodology blockers found. The new MAP path is documented in […]. The previous zero-weight-group concern is resolved by guarded weighted means in […].

Code Quality
No findings.

Performance
No findings. The iterative groupby loop is more expensive than the old closed-form unweighted two-way transform, but that cost follows from switching to exact MAP on unbalanced panels.

Maintainability
No findings. The […]

Tech Debt
Finding 1 (P3): Stale TODO cross-reference still says weighted N>1 absorb is gated
Location: […]
Impact: […]
Concrete fix: […]

Security
No findings.

Documentation/Tests
No blocking findings. The registry and docstring updates cover the methodology change. Focused test execution could not be run in this environment because […]
…y absorbed FE

N>1 absorbed fixed effects used single-pass sequential demeaning, which is the exact (weighted) Frisch-Waugh-Lovell residualization only on balanced orthogonal-FE panels; on unbalanced panels it was a biased approximation (coefficients off by ~1e-2 in tested cases).

Add an N-way method-of-alternating-projections engine demean_by_groups() in utils.py; route the DiD/MultiPeriodDiD absorb= paths and the shared two-way within_transform() through it, fixing TWFE / SunAbraham / BaconDecomposition on unbalanced unweighted panels too. Lift the weighted-multi-absorb rejection (now supported via weighted MAP).

Single-absorb and balanced-panel results are byte-stable; the weighted within_transform output is bit-identical; R-parity goldens unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
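To make the single-pass problem concrete, here is a minimal standard-library sketch. It is not the library's code: `demean_once`, `max_abs_group_mean`, and `single_pass` are hypothetical helper names, the toy data is invented, and the unit-then-period ordering is an assumption. It demeans once per dimension and then checks whether any unit mean is left over. On the balanced panel one pass is already exact; on the unbalanced panel the second pass reintroduces nonzero unit means.

```python
from collections import defaultdict


def demean_once(values, groups):
    """Subtract each group's mean from its members (one FE dimension)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def max_abs_group_mean(values, groups):
    """Largest |group mean|; zero means the values are orthogonal to the group dummies."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return max(abs(sums[g] / counts[g]) for g in sums)


def single_pass(values, unit, period):
    """Sequential demeaning done exactly once: unit first, then period (assumed order)."""
    return demean_once(demean_once(values, unit), period)


# Balanced 2x2 panel: every unit is observed in every period.
bal_unit, bal_period, bal_y = ["a", "a", "b", "b"], [1, 2, 1, 2], [1.0, 2.0, 3.0, 5.0]
# Unbalanced panel: unit "a" has an extra period that unit "b" lacks.
unb_unit, unb_period = ["a", "a", "a", "b", "b"], [1, 2, 3, 1, 2]
unb_y = [1.0, 2.0, 6.0, 3.0, 5.0]

balanced_leftover = max_abs_group_mean(single_pass(bal_y, bal_unit, bal_period), bal_unit)
unbalanced_leftover = max_abs_group_mean(single_pass(unb_y, unb_unit, unb_period), unb_unit)

print(balanced_leftover)    # zero up to rounding: one pass is already exact
print(unbalanced_leftover)  # clearly nonzero: unit means survive the period pass
```

A nonzero leftover unit mean means the output is not orthogonal to the unit dummies, so it cannot be the FWL residual. That is the gap the iterative engine closes.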
7af6c6e to 91c6381
🔁 AI review rerun (requested by @igerber)
Head SHA: […]

PR Review: ✅ Looks good

Overall Assessment
✅ Looks good: no unmitigated P0 or P1 findings. The prior weighted multi-absorb zero-weight concern is addressed, and the stale TODO cross-reference from the previous review is corrected.

Executive Summary
[…]

Methodology
No findings. The methodology change is documented in […]

Code Quality
No findings. The new helper centralizes MAP behavior at […]

Performance
No findings. The iterative MAP loop is more expensive than the old closed-form unweighted two-way transform, but that follows from switching to exact residualization on unbalanced panels.

Maintainability
No findings. The prior stale TODO concern is resolved: the old actionable multi-absorb row is gone, and the SE consistency note now says N>1 absorbed FE with weights is supported via iterative demeaning at […]

Tech Debt
No findings.

Security
No findings.

Documentation/Tests
No blocking findings. Coverage includes unbalanced full-dummy parity and 3-way MAP checks in […]
Summary
- N>1 absorbed fixed effects (absorb=[a, b, ...] in DifferenceInDifferences / MultiPeriodDiD) and the shared unweighted two-way within_transform (used by TwoWayFixedEffects, SunAbraham, BaconDecomposition) used single-pass sequential demeaning, which is the exact (weighted) Frisch-Waugh-Lovell residualization only when the FE subspaces are orthogonal (balanced fully-crossed panels). On unbalanced panels it was a biased approximation (coefficients off by ~1e-2 in tested cases).
- Adds an N-way method-of-alternating-projections (MAP) engine, diff_diff.utils.demean_by_groups(): each variable is demeaned by each FE dimension in turn until convergence. The result is the exact (weighted) FWL residual with respect to the combined column space of all absorbed dummies, matching R fixest / reghdfe / lfe. The two-way within_transform() now delegates to it.
- […] absorb= call sites (analytical + replicate-refit, ×2 estimators) route through the engine; the weighted-multi-absorb ValueError rejection is lifted (now supported via weighted MAP).
- […] demean_by_group; the weighted within_transform output is bit-identical (Wooldridge golden at atol=1e-14 stays green); balanced multi-way matches the prior closed-form demean to machine precision. The unweighted two-way path now also emits the non-convergence UserWarning (previously only the weighted path could).
- […] (n_absorbed_effects = sum_d(nunique_d - 1)) is preserved unchanged; multi-way-FE DOF correctness is out of scope.

Methodology references (required if estimator / math changes)
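The alternating-projection scheme described in the Summary can be sketched in standard-library Python as follows. This is an illustration under stated assumptions, not the implementation of diff_diff.utils.demean_by_groups(): the names `weighted_group_demean` and `map_demean`, the `tol` and `max_iter` defaults, the max-change stopping rule, and the choice to leave zero-total-weight groups untouched are all hypothetical.

```python
import warnings
from collections import defaultdict


def weighted_group_demean(values, groups, weights):
    """One projection: subtract each group's weighted mean from its members.

    Assumption for this sketch: a group whose total weight is zero is left
    untouched, which avoids a 0/0 division.
    """
    wsum, wtot = defaultdict(float), defaultdict(float)
    for v, g, w in zip(values, groups, weights):
        wsum[g] += w * v
        wtot[g] += w
    return [v - wsum[g] / wtot[g] if wtot[g] > 0 else v
            for v, g in zip(values, groups)]


def map_demean(values, group_columns, weights=None, tol=1e-10, max_iter=10_000):
    """Cycle over the FE dimensions until a full sweep changes nothing by more than tol."""
    weights = [1.0] * len(values) if weights is None else list(weights)
    current = list(values)
    for _ in range(max_iter):
        previous = current
        for groups in group_columns:
            current = weighted_group_demean(current, groups, weights)
        if max(abs(c - p) for c, p in zip(current, previous)) < tol:
            return current
    # Mirrors the idea of a non-convergence UserWarning; the message is made up.
    warnings.warn("alternating projections did not converge", UserWarning)
    return current


# Unbalanced toy panel (invented data): unit "a" has a period that unit "b" lacks.
unit = ["a", "a", "a", "b", "b"]
period = [1, 2, 3, 1, 2]
y = [1.0, 2.0, 6.0, 3.0, 5.0]

resid = map_demean(y, [unit, period])
print([round(r, 6) for r in resid])

# Zero-total-weight guard: group "z" carries no weight, so its value passes through.
guarded = weighted_group_demean([1.0, 3.0, 5.0], ["x", "x", "z"], [1.0, 1.0, 0.0])
print(guarded)
```

For this toy panel the two-way residual can be worked out by hand: the lone period-3 observation is fitted exactly, and the remaining 2x2 block leaves only the interaction contrast (1 - 2 - 3 + 5) / 4 = 0.25 with alternating signs. The loop converges to those values, whereas a single unit-then-period pass does not.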
- […] lfe; Correia reghdfe).
- docs/methodology/REGISTRY.md (TwoWayFixedEffects within-transform Note; "Absorbed Fixed Effects with Survey Weights" Note).
- […] fixest / reghdfe / lfe for N>1 absorbed FE on unbalanced panels (previously it deviated via single-pass).

Validation
- tests/test_utils.py (new TestDemeanByGroups: engine vs full-dummy OLS for unbalanced 2-way and 3-way, len==1 byte-identity, weighted byte-identity guard vs a frozen copy of the old loop, orthogonality, non-convergence warning); tests/test_within_transform.py (relaxed 5 balanced closed-form assertions to assert_allclose ~1e-12, added unbalanced-correctness + convergence-warning tests); tests/test_methodology_did.py (new TestMultiAbsorbIterativeDemean: unbalanced unweighted and non-uniform-weighted DiD/MPD absorb= vs fixed_effects= parity at atol≈1e-8, FE-collinear regressor reports NaN coef); tests/test_survey.py (flipped the two multi-absorb+survey rejection tests to now-supported + full-dummy parity).
- test_methodology_twfe.py / test_methodology_sun_abraham.py / test_methodology_bacon.py / test_methodology_wooldridge.py all green (balanced panels: MAP == closed-form to ~1 ULP; weighted byte-identical).

Security / privacy