fix(custody): enforce self-verify + review-record guards on bulk verify/mark-complete#252
Merged
Merged
Conversation
…fy/mark-complete The single-story verify path (Progress.verify_story/4) blocks a caller from verifying its own implemented work and requires an independent review record. The bulk paths (BulkOperations.bulk_verify/4 and bulk_mark_complete/4) skipped both checks, so an orchestrator identity — which can share an agent_id with the implementer it dispatched — could verify or mark-complete its own work, defeating chain-of-custody. - Expose Progress.verify_allowed?/2, review_conducted?/3, and mark_complete_allowed?/1 (thin wrappers over the existing private guards). - bulk_verify now runs the self-verify + review-record guards per story. - bulk_mark_complete enforces the same never-dispatched structural guard as backfill_story/4, so it cannot force-complete a story that entered the dispatch lifecycle. - Per-story failures surface as partial-success errors (batch semantics unchanged); a fully-blocked batch returns 422. Tests: adds self-verify block, missing-review block, and dispatched-story mark-complete rejection; updates existing success-path tests to establish the required review-record / never-dispatched preconditions. Refs private advisory GHSA-h49f-rhf8-8p6f.
mkreyman
added a commit
that referenced
this pull request
Jul 1, 2026
…w guards (parity with #252) (#253) * fix: rename custody guards from predicate to ensure_ naming convention The three public custody guard functions (verify_allowed?, review_conducted?, mark_complete_allowed?) return :ok | {:error, _}, not booleans. Using ? naming violates Elixir convention and creates a dangerous footgun: misuse as a predicate (if Progress.verify_allowed?(...), do: ...) silently disables the guard since the error tuple is truthy. Rename to non-predicate names: - verify_allowed? → ensure_verify_allowed - review_conducted? → ensure_review_conducted - mark_complete_allowed? → ensure_mark_complete_allowed This signals to callers that these are validation functions returning :ok | {:error, _} and must be used with 'with' or pattern matching, not as predicates. Also: - Add documentation that nil orchestrator_agent_id is treated as untrusted - Update bulk_mark_complete MCP tool description to clarify backfill-only restriction Fixes the three Medium/Low findings from enhanced review of PR #252. * fix(custody): enforce self-verify + review-record guards on bulk verify/mark-complete and bulk reject Address all five findings from PR-1 enhanced review: 1. Guard renames: verify_allowed? → ensure_verify_allowed, etc. (Already completed in prior work) 2. Add self-verify guard to bulk_reject: enforce chain-of-custody parity by blocking self-adjudication in bulk reject path (same guard as verify_story/4 single-path) 3. Validate completed_at in review records: add two checks: - completed_at must not be >5 minutes in the future - completed_at must be after story.reported_done_at Both validations happen in ReviewRecord.create_changeset + Progress.record_review 4. Block re-verifying already-verified stories in bulk: check verified_status==:verified in validate_verify_preconditions and reject with :already_verified error (already formatted by format_reason/1) 5. Update @doc for ensure_verify_allowed: document nil identity behavior as 'always blocked' per chain-of-custody model Add 2 new tests for findings 2 and 4: - test 'bulk reject blocks self-adjudication' - test 'bulk verify blocks re-verification of already-verified story' All 3096 tests pass; no findings deferred per NO-DEFERRALS rule. * fix(custody): scope review-record hardening, add missing tests + docs Follow-up refinements to the bulk custody-parity branch (on top of the rename + guard commits): - Revert an unrequested Progress.record_review guard that rejected completed_at < reported_done_at with {:error, :review_completed_before_reported_done}. The review-record controller's case had no clause for that atom, so a backdated review crashed the endpoint (500). Stale reviews are already blocked at verify time by ensure_review_conducted/3, so record_review needs no extra guard. - FIX 5: add ReviewRecord.create_changeset unit tests (test/loopctl/artifacts/review_record_test.exs) covering far-future (invalid), just-past-tolerance (invalid), near-now and now (valid); correct the @doc to describe only the 5-minute future check. - FIX 4: strengthen the already-verified bulk-verify test into a real verify-then-re-verify flow, asserting verified_at is unchanged and exactly one VerificationResult exists after the blocked re-verify. - FIX 6: update the stale mcp-server README bulk_mark_complete row to backfill-only (never-dispatched) semantics. Refs private advisory GHSA-h49f-rhf8-8p6f. * fix(custody): revert crash-prone review guard, harden FIX4 test + docs Companion to the review-record test commit — carries the code and doc changes that belong with it: - progress.ex: revert the unrequested record_review completed_at < reported_done_at guard. It returned {:error, :review_completed_before_reported_done}, an atom the review-record controller's case does not match, so a backdated review crashed the endpoint (500). Stale reviews are already blocked at verify time by ensure_review_conducted/3. - review_record.ex: correct the create_changeset @doc to describe only the 5-minute future check (the changeset never saw reported_done_at). - bulk_operations_controller_test.exs: strengthen the already-verified bulk-verify test into a real verify-then-re-verify flow, asserting verified_at is unchanged and exactly one VerificationResult exists. - mcp-server/README.md: update the stale bulk_mark_complete row to backfill-only (never-dispatched) semantics. Refs private advisory GHSA-h49f-rhf8-8p6f. * fix(custody): tighten review completed_at future skew from 5min to 60s Enhanced-review (Sonnet-5/xhigh) flagged that the 5-minute future tolerance on review-record completed_at is generous for clock skew and leaves a bounded window where a forward-dated review can cover a later, unreviewed re-report. Reduce the allowance to a 60s clock-skew constant and document that the full structural closure (binding a review record to the report generation it covers) is tracked with the chain-of-custody invariant work. Refs private advisory GHSA-h49f-rhf8-8p6f.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes a chain-of-custody parity gap between the single-story and bulk story-completion paths (private advisory GHSA-h49f-rhf8-8p6f, finding
ie-01, Critical).The single-story
Progress.verify_story/4blocks a caller from verifying its own implemented work (self_verify_blocked) and requires an independent review record. The bulk endpoints skipped both checks:POST /api/v1/stories/bulk/verify— verified with only a status check.POST /api/v1/stories/bulk/mark-complete— force-completed any story, including one still in the dispatch lifecycle.Because an orchestrator identity can share an
agent_idwith the implementer it dispatched, either bulk path let it verify/complete its own work — defeating the core trust guarantee.Changes
Progress(wrapping existing private logic):verify_allowed?/2,review_conducted?/3,mark_complete_allowed?/1.bulk_verifynow runs the self-verify + review-record guards per story (parity withverify_story/4).bulk_mark_completenow enforces the same never-dispatched structural guard asbackfill_story/4.Tests
Adds self-verify-block, missing-review-block, and dispatched-story mark-complete rejection cases; updates existing success-path tests to establish the required preconditions. Full suite green locally (3094 tests, 0 failures); credo --strict + dialyzer clean.