
fix(release): make changelog generation resilient to GitHub flakiness #21

Merged
alerizzo merged 2 commits into main from fix/resilient-changelog-generation on Jun 25, 2026


@alerizzo

Collaborator

Summary

  • The release workflow's "version packages" step has been failing on transient GitHub API errors (Failed to parse data from GitHub / Invalid response body ... Premature close). @changesets/changelog-github calls the GitHub GraphQL API to enrich changelog entries, and @changesets/apply-release-plan generates every entry inside one Promise.all() — so a single dropped connection rejects the whole batch and aborts changeset version, leaving the release PR uncreated.
  • Wrap the GitHub changelog generator in .changeset/changelog.cjs and point .changeset/config.json at it. The enrichment is now best-effort: it retries the GraphQL call a few times (dataloader clears failed keys on rejection, so retries genuinely re-issue the query), then falls back to @changesets/changelog-git (plain entries with commit SHAs, no network) if GitHub stays unreachable. The release proceeds either way; only the changelog decoration degrades when GitHub is down.
  • Pin @changesets/changelog-git explicitly as a devDependency so the fallback never relies on transitive hoisting. Retry behaviour is tunable via CHANGELOG_GITHUB_ATTEMPTS (default 3) and CHANGELOG_GITHUB_RETRY_MS (default 1000ms).
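The failure mode described in the summary can be reproduced in miniature. The sketch below is illustrative only: `enrich`, `plain`, `generateAllStrict`, and `generateAllBestEffort` are hypothetical stand-ins, not functions from Changesets or from this PR. It shows why a single rejected promise sinks a whole Promise.all() batch, and how handling the failure per entry lets the batch resolve.

```javascript
// Hypothetical stand-ins: `enrich` plays the GitHub-backed generator,
// `plain` plays the offline git-style fallback.
async function enrich(entry) {
  if (entry.flaky) throw new Error("Premature close");
  return `- ${entry.summary} (enriched)`;
}

function plain(entry) {
  return `- ${entry.summary}`;
}

// Strict: one rejected promise rejects the whole Promise.all() batch.
function generateAllStrict(entries) {
  return Promise.all(entries.map((entry) => enrich(entry)));
}

// Best-effort: each entry absorbs its own failure, so the batch resolves.
function generateAllBestEffort(entries) {
  return Promise.all(
    entries.map((entry) => enrich(entry).catch(() => plain(entry))),
  );
}
```

Given one healthy entry and one flaky entry, `generateAllStrict` rejects with "Premature close", while `generateAllBestEffort` resolves with one enriched line and one plain line. The PR's wrapper additionally retries before falling back, which this sketch leaves out.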

This is an internal release-infra change — it doesn't alter the published CLI, so the changeset is empty (no version bump).

Test plan

  • npm test — 399 tests pass, including 5 new tests for the retry/fallback logic in .changeset/changelog.test.ts
  • Verified end-to-end: loading the module the way apply-release-plan does and running it with no GITHUB_TOKEN (forcing the real GitHub generator to fail) produces a valid git-style line instead of throwing
  • npx changeset status parses the updated config cleanly

🤖 Generated with Claude Code

The release workflow's "version packages" step kept failing with
"Failed to parse data from GitHub / Invalid response body ... Premature
close". @changesets/changelog-github calls the GitHub GraphQL API to
enrich changelog entries, and @changesets/apply-release-plan generates
all entries inside one Promise.all() — so a single transient API failure
rejects the whole batch and aborts versioning. No changeset gets applied
and the release PR is never created.

Wrap the GitHub changelog generator in .changeset/changelog.cjs so the
enrichment is best-effort: retry the GraphQL call a few times (dataloader
clears failed keys, so retries genuinely re-issue the query), then fall
back to @changesets/changelog-git (plain entries with commit SHAs, no
network) if GitHub stays unreachable. The release now proceeds even when
GitHub's API is down; only the changelog decoration degrades.

- Pin @changesets/changelog-git explicitly so the fallback never relies
  on transitive hoisting.
- Tunable via CHANGELOG_GITHUB_ATTEMPTS / CHANGELOG_GITHUB_RETRY_MS.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 25, 2026 19:26
@codacy-production

codacy-production Bot commented Jun 25, 2026


Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues


🟢 Metrics 12 complexity · 6 duplication

Metric Results
Complexity 12
Duplication 6



@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a resilient changelog generator wrapper (changelog.cjs) that retries GitHub API calls and falls back to git-based changelog generation upon failure, preventing CI pipeline disruptions. It also includes corresponding unit tests, configuration updates, and documentation. The review feedback highlights critical improvements for robustness, including handling non-Error rejections to avoid runtime crashes, validating environment variables to prevent NaN values in timers, and using Vitest's environment stubbing utilities to avoid global state pollution in tests.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread .changeset/changelog.cjs
Comment on lines +47 to +73
async function withFallback(label, githubFn, gitFn) {
  const { maxAttempts, retryDelayMs } = getConfig();
  let lastError;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await githubFn();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        console.warn(
          `[changelog] GitHub enrichment for ${label} failed ` +
            `(attempt ${attempt}/${maxAttempts}): ${error.message}. Retrying...`,
        );
        // Linear backoff: 1x, 2x, ... the base delay.
        await sleep(retryDelayMs * attempt);
      }
    }
  }

  console.warn(
    `[changelog] GitHub enrichment for ${label} failed after ${maxAttempts} ` +
      `attempts: ${lastError && lastError.message}. Falling back to a plain ` +
      `(git) changelog entry for this release.`,
  );
  return gitFn();
}


high

If githubFn() rejects with a non-Error value (such as null, undefined, or a plain string), accessing error.message will throw a TypeError (e.g., Cannot read properties of null (reading 'message')). Since this TypeError occurs inside the catch block, it will propagate uncaught and crash the entire changelog generation process, defeating the resilience of this wrapper.

Additionally, if maxAttempts is configured to be 0 or negative, the loop will not execute, leaving lastError as undefined and causing the final log message to print attempts: undefined.

We should safely extract the error message using optional chaining/fallback (error?.message || String(error)) and add an early return if maxAttempts <= 0.

async function withFallback(label, githubFn, gitFn) {
  const { maxAttempts, retryDelayMs } = getConfig();
  if (maxAttempts <= 0) {
    return gitFn();
  }

  let lastError;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await githubFn();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        const errMsg = error?.message || String(error);
        console.warn(
          "[changelog] GitHub enrichment for " + label + " failed " +
            "(attempt " + attempt + "/" + maxAttempts + "): " + errMsg + ". Retrying..."
        );
        // Linear backoff: 1x, 2x, ... the base delay.
        await sleep(retryDelayMs * attempt);
      }
    }
  }

  const finalErrMsg = lastError?.message || String(lastError);
  console.warn(
    "[changelog] GitHub enrichment for " + label + " failed after " + maxAttempts + " " +
      "attempts: " + finalErrMsg + ". Falling back to a plain " +
      "(git) changelog entry for this release."
  );
  return gitFn();
}

Collaborator Author

Good catch — fixed. Added an errorMessage() helper ((err && err.message) || String(err)) used in both warnings so a non-Error rejection can't throw inside the catch and defeat the resilience. getConfig() now also clamps maxAttempts to ≥ 1, so the loop always runs at least once (no more lastError: undefined). Added a test for the non-Error rejection path.

🤖 Generated by /pr-fixup command
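The fix described in this reply can be sketched as follows. The `errorMessage()` expression is the one quoted in the reply; everything else is an assumption made so the example runs standalone: the `options` parameter (with an injectable `warn`) is hypothetical, whereas the PR reads its settings from getConfig() and logs through console.warn, and the warning text is shortened.

```javascript
// Safe message extraction: works for Error objects, strings, null, undefined.
function errorMessage(err) {
  return (err && err.message) || String(err);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `options` is a hypothetical parameter added for this sketch only.
async function withFallback(label, githubFn, gitFn, options = {}) {
  const { maxAttempts = 3, retryDelayMs = 1000, warn = console.warn } = options;
  let lastError;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await githubFn();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        warn(
          `[changelog] ${label} failed (attempt ${attempt}/${maxAttempts}): ` +
            `${errorMessage(error)}. Retrying...`,
        );
        // Linear backoff: 1x, 2x, ... the base delay.
        await sleep(retryDelayMs * attempt);
      }
    }
  }

  warn(
    `[changelog] ${label} failed after ${maxAttempts} attempts: ` +
      `${errorMessage(lastError)}. Falling back to a plain entry.`,
  );
  return gitFn();
}
```

With this shape, a rejection of `null` or a bare string is logged as "null" or as the string itself, and the git fallback still runs instead of a TypeError escaping from the catch block.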

Comment thread .changeset/changelog.cjs
Comment on lines +33 to +41
function getConfig() {
  return {
    maxAttempts: Number(process.env.CHANGELOG_GITHUB_ATTEMPTS) || 3,
    retryDelayMs:
      process.env.CHANGELOG_GITHUB_RETRY_MS !== undefined
        ? Number(process.env.CHANGELOG_GITHUB_RETRY_MS)
        : 1000,
  };
}


medium

If process.env.CHANGELOG_GITHUB_RETRY_MS is defined but is not a valid number (e.g., "abc"), Number(process.env.CHANGELOG_GITHUB_RETRY_MS) will return NaN. This causes sleep(retryDelayMs * attempt) to receive NaN. If sleep is built on setTimeout, Node.js coerces a NaN delay to 1 ms, so the retries would fire almost back-to-back with no real backoff.

We should validate that both parsed values are valid non-negative integers, falling back to safe defaults if they are not.

function getConfig() {
  const attempts = Number(process.env.CHANGELOG_GITHUB_ATTEMPTS);
  const delay = Number(process.env.CHANGELOG_GITHUB_RETRY_MS);
  return {
    maxAttempts: Number.isInteger(attempts) && attempts >= 0 ? attempts : 3,
    retryDelayMs: Number.isInteger(delay) && delay >= 0 ? delay : 1000,
  };
}

Collaborator Author

Agreed — getConfig() now validates both tunables: maxAttempts must be an integer ≥ 1 and retryDelayMs must be finite and ≥ 0, otherwise it falls back to the safe defaults (3 / 1000). Added tests for an invalid attempt count and a non-numeric delay.

🤖 Generated by /pr-fixup command
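The validation rules stated in this reply (an integer attempt count of at least 1, a finite non-negative delay, defaults of 3 and 1000 otherwise) could look roughly like the sketch below. It is not the merged code: the injectable `env` parameter is hypothetical, and the empty-string guard is an extra assumption, added because Number("") evaluates to 0 and would otherwise silently disable the delay.

```javascript
// `env` is injectable for illustration; the PR reads process.env directly.
function getConfig(env = process.env) {
  const attempts = Number(env.CHANGELOG_GITHUB_ATTEMPTS);
  const rawDelay = env.CHANGELOG_GITHUB_RETRY_MS;
  const delay = Number(rawDelay);

  // Assumption: treat an unset or blank variable as "not provided".
  const delayProvided = rawDelay !== undefined && rawDelay.trim() !== "";

  return {
    // Must be an integer >= 1 so the retry loop always runs at least once.
    maxAttempts: Number.isInteger(attempts) && attempts >= 1 ? attempts : 3,
    // Must be finite and >= 0; NaN, Infinity, and negatives use the default.
    retryDelayMs:
      delayProvided && Number.isFinite(delay) && delay >= 0 ? delay : 1000,
  };
}
```

Under these rules, CHANGELOG_GITHUB_ATTEMPTS values such as "0", "-2", "2.5", or "abc" all resolve to 3 attempts, and a CHANGELOG_GITHUB_RETRY_MS of "abc" or "-1" resolves to 1000 ms.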

Comment thread .changeset/changelog.test.ts Outdated
Comment on lines +1 to +17
import { describe, it, expect, beforeEach, vi } from "vitest";
// @ts-expect-error - plain CJS module, no type declarations
import changelog from "./changelog.cjs";

const { withFallback } = changelog;

// The wrapper delegates getReleaseLine / getDependencyReleaseLine to
// `withFallback`, which holds the retry-then-fall-back logic. We test that logic
// directly with injected fakes rather than mocking @changesets/changelog-github
// (vi.mock cannot intercept the require() inside the .cjs module).
describe("changelog withFallback", () => {
  beforeEach(() => {
    // Remove retry delays so the failing-path tests run instantly.
    process.env.CHANGELOG_GITHUB_RETRY_MS = "0";
    delete process.env.CHANGELOG_GITHUB_ATTEMPTS;
    vi.spyOn(console, "warn").mockImplementation(() => {});
  });


medium

Modifying process.env globally in tests without restoring them can cause side effects in other tests if they run in the same process and rely on those env vars. Using Vitest's built-in vi.stubEnv and vi.unstubAllEnvs() is much cleaner and safer.

import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
// @ts-expect-error - plain CJS module, no type declarations
import changelog from "./changelog.cjs";

const { withFallback } = changelog;

// The wrapper delegates getReleaseLine / getDependencyReleaseLine to
// `withFallback`, which holds the retry-then-fall-back logic. We test that logic
// directly with injected fakes rather than mocking @changesets/changelog-github
// (vi.mock cannot intercept the require() inside the .cjs module).
describe("changelog withFallback", () => {
  beforeEach(() => {
    // Remove retry delays so the failing-path tests run instantly.
    vi.stubEnv("CHANGELOG_GITHUB_RETRY_MS", "0");
    vi.stubEnv("CHANGELOG_GITHUB_ATTEMPTS", undefined);
    vi.spyOn(console, "warn").mockImplementation(() => {});
  });

  afterEach(() => {
    vi.unstubAllEnvs();
  });

Collaborator Author

Done — switched the test to vi.stubEnv + vi.unstubAllEnvs() (in afterEach) so env mutations don't leak across tests.

🤖 Generated by /pr-fixup command

@codacy-production codacy-production Bot left a comment


Pull Request Overview

The PR successfully implements the resilient wrapper and fallback logic for GitHub changelog generation, and Codacy analysis indicates the code is up to standards.

However, there is a risk regarding the module export structure in .changeset/changelog.cjs. The current implementation assumes a .default export, which may not be present depending on how the dependencies are packaged. This could result in a runtime failure during the release process, potentially negating the resilience the PR aims to provide. This should be addressed to ensure compatibility across different environments.

Test suggestions

  • GitHub generator returns successfully on the first attempt
  • GitHub generator recovers and returns successfully after a transient failure
  • System falls back to Git generator after exhausting all GitHub retry attempts
  • Retry count is configurable via the CHANGELOG_GITHUB_ATTEMPTS environment variable
  • Linear backoff calculation correctly scales the sleep delay between attempts
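For the last suggestion, the linear backoff in the reviewed code (sleep(retryDelayMs * attempt) after every failed attempt except the final one) can be expressed as a small pure function. `backoffSchedule` is a hypothetical helper written for this illustration; it does not exist in the PR.

```javascript
// Delays slept between attempts: after failed attempt k (k < maxAttempts)
// the wrapper waits retryDelayMs * k. No sleep follows the final attempt.
function backoffSchedule(maxAttempts, retryDelayMs) {
  const delays = [];
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    delays.push(retryDelayMs * attempt);
  }
  return delays;
}

// Total time spent sleeping before the wrapper gives up on GitHub.
function totalBackoffMs(maxAttempts, retryDelayMs) {
  return backoffSchedule(maxAttempts, retryDelayMs).reduce((a, b) => a + b, 0);
}
```

With the defaults (3 attempts, 1000 ms), the schedule is [1000, 2000], so an entry spends 3000 ms sleeping between attempts before falling back.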


Comment thread .changeset/changelog.cjs
Comment on lines +28 to +29
const github = require("@changesets/changelog-github").default;
const git = require("@changesets/changelog-git").default;


🟡 MEDIUM RISK

Suggestion: The import logic may fail if the required modules do not have a .default export. Consider using a fallback to the module itself for better compatibility.

Suggested change
- const github = require("@changesets/changelog-github").default;
- const git = require("@changesets/changelog-git").default;
+ const github = require("@changesets/changelog-github").default || require("@changesets/changelog-github");
+ const git = require("@changesets/changelog-git").default || require("@changesets/changelog-git");

Collaborator Author

Holding off on this one. Both deps are exact-pinned (@changesets/changelog-github@0.6.0, @changesets/changelog-git@0.2.1) and I verified each exports only default (Object.keys(require(...)) returns ['default']). Changesets' own loader (apply-release-plan) also unwraps .default, so if it were ever missing the whole release would break regardless, and || require(...) would just resolve to the namespace object (which has no getReleaseLine), so it wouldn't actually add safety. Keeping the explicit .default.

🤖 Generated by /pr-fixup command
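The argument in this reply can be checked against hypothetical module shapes. The objects below are made-up stand-ins for what require() might return, not the real @changesets packages, and `resolveGenerator` and `isUsable` are illustrative helpers.

```javascript
// Hypothetical module shapes, standing in for what require() could return.
const withDefault = { default: { getReleaseLine: async () => "line" } };
const bareExport = { getReleaseLine: async () => "line" }; // functions at top level
const emptyNamespace = {}; // .default missing and nothing else exported

// The suggested pattern: prefer .default, else use the module object itself.
function resolveGenerator(mod) {
  return mod.default || mod;
}

// A changelog generator is only usable if it exposes getReleaseLine.
const isUsable = (gen) => typeof gen.getReleaseLine === "function";
```

The fallback only helps for the `bareExport` shape. For a namespace whose `default` is missing and which exports nothing else, `resolveGenerator` returns an object with no getReleaseLine, which matches the reply's point that the extra `|| require(...)` would not add safety for the pinned versions.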

Copilot AI left a comment

Contributor

Pull request overview

This PR hardens the Changesets release workflow by making changelog enrichment via GitHub GraphQL best-effort: it retries transient GitHub failures and falls back to a git-based changelog generator so changeset version can proceed even when GitHub is flaky.

Changes:

  • Added a resilient Changesets changelog wrapper that retries GitHub enrichment and falls back to @changesets/changelog-git.
  • Updated Changesets config to use the wrapper and pinned @changesets/changelog-git as an explicit devDependency.
  • Added unit tests for retry/fallback behavior and documented the approach in deployment specs.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| .changeset/changelog.cjs | Implements retry + fallback wrapper around Changesets GitHub changelog generator |
| .changeset/changelog.test.ts | Adds tests covering retry and fallback behavior |
| .changeset/config.json | Points Changesets to the new local changelog wrapper |
| package.json | Pins @changesets/changelog-git as a devDependency to ensure fallback availability |
| package-lock.json | Locks the added devDependency |
| SPECS/deployment.md | Documents the resilient changelog wrapper behavior and tunables |
| .changeset/bold-views-kiss.md | Adds an empty changeset (currently missing a summary line) |


Comment thread .changeset/changelog.cjs
Comment on lines +33 to +41
function getConfig() {
  return {
    maxAttempts: Number(process.env.CHANGELOG_GITHUB_ATTEMPTS) || 3,
    retryDelayMs:
      process.env.CHANGELOG_GITHUB_RETRY_MS !== undefined
        ? Number(process.env.CHANGELOG_GITHUB_RETRY_MS)
        : 1000,
  };
}

Collaborator Author

Fixed — getConfig() now clamps maxAttempts to an integer ≥ 1 and ignores non-numeric/negative retryDelayMs, defaulting to 3 / 1000ms. Tests added.

🤖 Generated by /pr-fixup command

Comment thread .changeset/changelog.cjs
Comment on lines +57 to +60
console.warn(
  `[changelog] GitHub enrichment for ${label} failed ` +
    `(attempt ${attempt}/${maxAttempts}): ${error.message}. Retrying...`,
);

Collaborator Author

Fixed — both warnings now go through a safe errorMessage() helper instead of error.message, so a non-Error throw can't break the catch.

🤖 Generated by /pr-fixup command

Comment thread .changeset/changelog.cjs
Comment on lines +67 to +71
console.warn(
  `[changelog] GitHub enrichment for ${label} failed after ${maxAttempts} ` +
    `attempts: ${lastError && lastError.message}. Falling back to a plain ` +
    `(git) changelog entry for this release.`,
);

Collaborator Author

Fixed — the same errorMessage() helper is applied to the final warning too, and there's a new test covering the non-Error rejection path.

🤖 Generated by /pr-fixup command

Comment on lines +1 to +2
---
---

Collaborator Author

Done — added a short note to the changeset explaining it's an internal release-infra change with no version bump.

🤖 Generated by /pr-fixup command

Address the AI review round on the resilient changelog wrapper:
- Guard against non-Error rejections via an errorMessage() helper so a
  `throw undefined`/string can't make .message throw inside the catch and
  defeat the resilience (Gemini, Copilot).
- Validate the CHANGELOG_GITHUB_* tunables in getConfig(): integer
  attempts >= 1 and a finite non-negative delay, ignoring bad overrides
  (Gemini, Copilot).
- Switch tests to vi.stubEnv/vi.unstubAllEnvs and add edge-case tests
  (non-Error rejection, invalid attempt count, non-numeric delay).
- Add an explanatory note to the empty changeset (Copilot).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alerizzo alerizzo merged commit c9ac858 into main Jun 25, 2026
4 checks passed
@alerizzo alerizzo deleted the fix/resilient-changelog-generation branch June 25, 2026 19:44

2 participants