Ignore invalid language tags in vocab parsing #848

Open
dahlia wants to merge 3 commits into fedify-dev:2.0-maintenance from dahlia:fix/issue-847-invalid-language-tags

@dahlia dahlia commented Jun 28, 2026

Member

Why this patch handles the error in the parser

Remote JSON-LD can contain a language map key that Intl.Locale does not accept. Before this patch, the generated vocabulary parser still passed that key into new LanguageString(...), so a malformed remote field could surface as a RangeError while parsing an inbox activity.

The fix keeps LanguageString itself strict. Constructing one directly with an invalid tag should still fail, because that is useful feedback for local code. The generated parser has a different job: it is reading remote input and already skips scalar values that do not match any supported decoded shape. This patch makes invalid language tags follow that same path.

How the parser now avoids the crash

The rdf:langString data check in packages/vocab-tools/src/type.ts now requires the @language value to be accepted by Intl.Locale before emitting a LanguageString constructor call. packages/vocab-tools/src/class.ts emits the helper used by generated vocabulary classes.

That helper catches only RangeError, which is the error Intl.Locale throws for a malformed locale identifier. Other errors still propagate. This keeps the fallback narrow: malformed remote language tags are treated as bad input, but unrelated programming or runtime errors are not hidden.
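
A minimal sketch of such a helper, for illustration only. The name isValidLanguageTag is taken from the review summaries in this thread; the body is an assumption and may differ from what packages/vocab-tools/src/class.ts actually emits.

```typescript
// Hypothetical reconstruction; the generated helper may differ.
function isValidLanguageTag(tag: string): boolean {
  try {
    // Intl.Locale throws a RangeError for a structurally invalid tag.
    new Intl.Locale(tag);
    return true;
  } catch (error) {
    // Only a bad tag counts as "invalid input".
    if (error instanceof RangeError) return false;
    // Anything else is unexpected and must not be hidden.
    throw error;
  }
}

console.log(isValidLanguageTag("en")); // true
console.log(isValidLanguageTag("invalid tag!")); // false
```

Rethrowing everything except RangeError is what keeps the fallback narrow: a bug elsewhere still surfaces instead of being read as "bad remote input".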

The regression test in packages/vocab/src/vocab.test.ts uses a Note with a mixed contentMap: one malformed language key and one valid en key. The valid entry still parses, and the malformed entry is ignored.
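
The behavior the regression test checks can be sketched without Fedify itself. Everything below is a stand-in: StrictLanguageString and decodeLanguageMap are hypothetical names, not the real LanguageString class or the generated parser. The sketch only shows the split between a strict constructor and a lenient decoder.

```typescript
// Stand-in for LanguageString: construction stays strict.
class StrictLanguageString {
  readonly value: string;
  readonly locale: Intl.Locale;

  constructor(value: string, language: string) {
    this.value = value;
    // Throws RangeError when the tag is malformed.
    this.locale = new Intl.Locale(language);
  }
}

// Same validation idea as the helper described in the PR description.
function isValidLanguageTag(tag: string): boolean {
  try {
    new Intl.Locale(tag);
    return true;
  } catch (error) {
    if (error instanceof RangeError) return false;
    throw error;
  }
}

// Hypothetical decoder for a language map such as contentMap.
function decodeLanguageMap(
  map: Record<string, string>,
): StrictLanguageString[] {
  const decoded: StrictLanguageString[] = [];
  for (const [language, value] of Object.entries(map)) {
    // Validate first, so the strict constructor is never reached
    // with a malformed remote tag.
    if (!isValidLanguageTag(language)) continue;
    decoded.push(new StrictLanguageString(value, language));
  }
  return decoded;
}

const contents = decodeLanguageMap({
  "invalid tag!": "ignored",
  en: "Hello",
});
console.log(contents.map((c) => `${c.locale.baseName}: ${c.value}`));
// [ "en: Hello" ]
```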

Verification

I ran the targeted regression test first and confirmed that it failed with the old RangeError. After the fix, these checks passed:

  • deno task -f @fedify/vocab test --filter "Note.fromJsonLd() ignores malformed language tags"
  • mise run test:update_snapshots
  • git diff --check
  • mise run check
  • mise run test-each vocab-tools vocab

Generated vocabulary parsers now validate language tags before
constructing LanguageString values.  Malformed remote language-map entries
are skipped like other undecodable scalar values, while valid entries still
parse normally.

Fixes fedify-dev#847

Assisted-by: Codex:gpt-5.5
@dahlia dahlia self-assigned this Jun 28, 2026
@dahlia dahlia added labels Jun 28, 2026: component/vocab (Activity Vocabulary related), component/inbox (Inbox related), component/vocab-tools (Vocabulary code generation, @fedify/vocab-tools)
@coderabbitai

coderabbitai Bot commented Jun 28, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@dahlia

dahlia commented Jun 28, 2026

Member Author

@codex review

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request fixes an issue where malformed language tags in remote JSON-LD language maps caused parsing to abort with a RangeError. It introduces a helper function isValidLanguageTag using Intl.Locale to validate language tags, integrates this check into the scalar type data validation, and adds a test case to ensure malformed tags are gracefully ignored. There are no review comments, and I have no feedback to provide.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9cfdea25b3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/vocab-tools/src/type.ts
@codecov

codecov Bot commented Jun 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

Files with missing lines             Coverage Δ
packages/vocab-tools/src/class.ts    97.40% <100.00%> (+0.06%) ⬆️
packages/vocab-tools/src/codec.ts    99.61% <100.00%> (+<0.01%) ⬆️
packages/vocab-tools/src/type.ts     84.29% <100.00%> (+0.02%) ⬆️

... and 1 file with indirect coverage changes

Malformed language-map entries are dropped during decoding, but the
parsed object could still keep the original JSON-LD cache and return
those entries from a default toJsonLd() call.

Track whether any decoded value was skipped and carry that state through
subclass parsing so objects only reuse the original JSON-LD when it still
matches the accepted decoded values.

fedify-dev#848 (comment)

Assisted-by: Codex:gpt-5.5
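
The cache rule in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the generated code: ParsedNote and parseNote are hypothetical names, and the real classes track far more state than a single language map.

```typescript
type LanguageMap = Record<string, string>;
type NoteJson = { contentMap: LanguageMap };

// Hypothetical parsed object; the generated classes are more involved.
class ParsedNote {
  readonly contents: LanguageMap;
  // Original JSON-LD, kept only when it still matches `contents`.
  readonly cachedJsonLd: NoteJson | undefined;

  constructor(contents: LanguageMap, cachedJsonLd: NoteJson | undefined) {
    this.contents = contents;
    this.cachedJsonLd = cachedJsonLd;
  }

  toJsonLd(): NoteJson {
    // Reuse the cache when present; otherwise serialize accepted values.
    return this.cachedJsonLd ?? { contentMap: { ...this.contents } };
  }
}

function parseNote(json: NoteJson): ParsedNote {
  const contents: LanguageMap = {};
  let skipped = false;
  for (const [tag, text] of Object.entries(json.contentMap)) {
    try {
      new Intl.Locale(tag);
    } catch (error) {
      if (!(error instanceof RangeError)) throw error;
      skipped = true; // the input no longer matches what was accepted
      continue;
    }
    contents[tag] = text;
  }
  // Drop the cache as soon as anything was skipped.
  return new ParsedNote(contents, skipped ? undefined : json);
}

const note = parseNote({ contentMap: { "invalid tag!": "x", en: "Hello" } });
console.log(JSON.stringify(note.toJsonLd()));
// {"contentMap":{"en":"Hello"}}
```

When nothing is skipped, the original object is returned unchanged, so the fast path for well-formed input is preserved.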
@dahlia

dahlia commented Jun 28, 2026

Member Author

@codex review

@dahlia

dahlia commented Jun 28, 2026

Member Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request fixes an issue where malformed language tags in remote JSON-LD language maps caused parsing to abort with a RangeError. It introduces a helper function isValidLanguageTag using Intl.Locale to validate language tags, skips caching the original JSON-LD if any properties fail to decode, and adds a test case to verify that malformed language tags are ignored while the rest of the object is successfully parsed. I have no feedback to provide as there are no review comments.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4c18ea8f27

Comment thread packages/vocab-tools/src/codec.ts Outdated
A child vocabulary object can reject malformed language-map values while
decoding, which makes its original JSON-LD cache unsafe to reuse.

Carry that cache state back to the parent decoder before storing the
decoded child so parse-then-serialize paths do not re-emit skipped nested
values from the parent's cached payload.

fedify-dev#848 (comment)

Assisted-by: Codex:gpt-5.5
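
A sketch of how that cache state might travel from a nested object to its parent. The types and function names here (Decoded, decodeNote, decodeCreate, toJsonLd) are hypothetical and chosen for illustration; they are not the identifiers used in packages/vocab-tools/src/codec.ts.

```typescript
type LanguageMap = Record<string, string>;
type NoteJson = { contentMap: LanguageMap };
type CreateJson = { object: NoteJson };

// A decoded value plus whether its original JSON-LD is still safe to reuse.
type Decoded<T> = { value: T; cacheable: boolean };

function accepts(tag: string): boolean {
  try {
    new Intl.Locale(tag);
    return true;
  } catch (error) {
    if (error instanceof RangeError) return false;
    throw error;
  }
}

function decodeNote(json: NoteJson): Decoded<NoteJson> {
  const contentMap: LanguageMap = {};
  let cacheable = true;
  for (const [tag, text] of Object.entries(json.contentMap)) {
    if (accepts(tag)) contentMap[tag] = text;
    else cacheable = false; // a value was dropped
  }
  return { value: { contentMap }, cacheable };
}

function decodeCreate(json: CreateJson): Decoded<CreateJson> {
  const child = decodeNote(json.object);
  // The parent inherits the child's verdict: if the child dropped
  // anything, the parent's original payload is stale too.
  return { value: { object: child.value }, cacheable: child.cacheable };
}

function toJsonLd(
  original: CreateJson,
  decoded: Decoded<CreateJson>,
): CreateJson {
  return decoded.cacheable ? original : decoded.value;
}

const remote: CreateJson = {
  object: { contentMap: { "invalid tag!": "x", en: "Hello" } },
};
console.log(JSON.stringify(toJsonLd(remote, decodeCreate(remote))));
// {"object":{"contentMap":{"en":"Hello"}}}
```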
@dahlia

dahlia commented Jun 28, 2026

Member Author

@codex review

@dahlia

dahlia commented Jun 28, 2026

Member Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request addresses an issue in @fedify/vocab where malformed language tags in remote JSON-LD language maps would cause parsing to abort with a RangeError. It introduces a validation helper, isValidLanguageTag, using Intl.Locale, and updates the generated class decoder logic to skip caching when invalid or uncacheable values are encountered. Unit tests have been added to verify that malformed language tags are ignored both at the root level and within nested objects. There are no review comments, and I have no feedback to provide.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Can't wait for the next one!

Reviewed commit: 0445b15826

@dahlia dahlia requested review from 2chanhaeng and sij411 June 28, 2026 02:22