Skip to content

feat: RFC 6570 URI templates with operator-aware security#2356

Merged
maxisbey merged 1 commit into
mainfrom
feat/uri-template-rfc6570
Jun 26, 2026
Merged

feat: RFC 6570 URI templates with operator-aware security#2356
maxisbey merged 1 commit into
mainfrom
feat/uri-template-rfc6570

Conversation

@maxisbey

@maxisbey maxisbey commented Mar 26, 2026

Copy link
Copy Markdown
Contributor

Replaces the regex-based resource template matcher with a linear-time RFC 6570 implementation, adds configurable path-safety validation, and exports a standalone UriTemplate usable from both client and low-level server.

Motivation

The existing matcher only supports RFC 6570 Level 1 ({var}). Users writing file://{+path} for multi-segment paths hit a confusing "mismatch" error because the regex-based extraction doesn't understand the + operator. The TypeScript, C#, and Go SDKs all support Level 3+. The matcher also doesn't escape template literals, so data://v1.0/{id} incorrectly matches data://v1X0/42.

Separately, encoded ..%2F sequences in extracted parameters are passed to handlers unchecked. The 2026 spec adds a normative requirement that file:// resource servers sanitize paths.

Closes #436, closes #220, closes #159, closes #378
Supersedes #197, #427, #1439, #2353

What's in here

mcp.shared.uri_template.UriTemplate — RFC 6570 Levels 1-3 plus path-style explode ({/var*}, {.var*}, {;var*}). parse() / expand() / match(). Re-exported at the top level so clients can expand a template a server advertises.

The matcher is a two-ended linear scan with no backtracking. Rather than handling ambiguous templates with scan-time special cases, the parser rejects them up front: two variables adjacent with no literal between them, more than one multi-segment variable, and more than one {?...} expression all raise InvalidUriTemplate at decoration time. Operators that emit their own delimiter ({+path}{.ext}, {+path}{/seg}, {+path}{;k}) still compose. Query expressions match leniently (order-agnostic, partial, extras ignored), and a handler parameter bound to one must declare a Python default — also checked at decoration time.

mcp.shared.path_securitycontains_path_traversal(), is_absolute_path(), safe_join(). The last resolves and verifies containment, catching symlink escapes.

ResourceSecurity policy — wired into MCPServer(resource_security=...) and @mcp.resource(..., security=...). By default rejects .. escapes, absolute paths, and null bytes. A rejection is indistinguishable on the wire from a not-found resource (-32602) and halts template iteration. Per-parameter opt-out via exempt_params={"name"}.

Tests

Example-based suites for UriTemplate and path_security, e2e tests through Client for the decorator and policy, plus two seeded property tests over the full space the parser accepts: match(expand(v)) round-trips and match() never raises on any input. 100% coverage.

Breaking changes

See docs/migration.md for the full list. Headline items:

  • Path-safety checks applied by default (opt out via ResourceSecurity)
  • Template literals match exactly; {var} stops at structural delimiters and now matches an empty value
  • Two adjacent variables, more than one multi-segment variable, more than one {?...}, and other malformed templates raise at decoration time
  • A {?...}-bound handler parameter without a default raises at decoration time

Docs

Tested reference page at docs/advanced/uri-templates.md with runnable examples under docs_src/uri_templates/.

AI Disclaimer

@maxisbey maxisbey marked this pull request as ready for review March 26, 2026 23:13
Comment thread src/mcp/shared/uri_template.py Outdated
Comment thread src/mcp/shared/uri_template.py

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No new bugs found and the two issues from my earlier review are fixed, but this is a large feature PR (~1000 lines of new security-sensitive code across uri_template.py and path_security.py) with breaking changes that warrants human review.

Extended reasoning...

Overview

This PR replaces the regex-based resource template matcher with a full RFC 6570 URI template engine (uri_template.py, ~876 lines), adds filesystem path safety primitives (path_security.py, ~156 lines), and wires both into the MCPServer decorator via a configurable ResourceSecurity policy. It also adds 376 lines of new documentation and extensive test coverage (~800 lines across two new test files plus additions to existing test files). The PR claims to close 4 issues and supersede 4 other PRs.

Security risks

The path_security.py module implements path-traversal detection, absolute-path detection, and safe_join — all security-critical primitives. The contains_path_traversal function uses a depth-tracking algorithm rather than a simple substring check, which is the right approach but needs careful human verification. The ResourceSecurity policy is secure-by-default (rejecting traversal and absolute paths), which is good. However, the correctness of these security boundaries is too important for automated approval alone.

The uri_template.py module includes ReDoS protection that rejects quadratic-backtracking patterns at parse time. The design choices here (which patterns to allow vs reject) are architectural decisions that benefit from human review.

Level of scrutiny

This PR requires high scrutiny. It introduces ~1000 lines of new production code in two new modules, contains multiple breaking changes, and touches security-sensitive code paths (path traversal protection, URI parsing for resource routing). The RFC 6570 implementation involves complex regex construction and matching logic that could have subtle edge cases beyond what automated testing catches.

Other factors

The test coverage is thorough (211 tests for URI templates, 50 for path security, plus E2E tests), and the two bugs I found in my previous review were promptly fixed with regression tests. The documentation is comprehensive. However, the breaking changes listed in the migration guide affect existing users and represent design decisions (e.g., static URI + Context-only handler now errors, query params match leniently by default) that a maintainer should explicitly sign off on.

Comment thread src/mcp/shared/uri_template.py Outdated
Comment thread docs/advanced/uri-templates.md Outdated
Comment thread src/mcp/shared/uri_template.py Outdated
Comment thread docs/migration.md Outdated
Comment thread docs/server/resources.md Outdated
@maxisbey

maxisbey commented Mar 27, 2026

Copy link
Copy Markdown
Contributor Author

putting this back as a draft as idk if the maintenance burden is worth it. This is defined in the MCP spec as RFC 6570 but like... RFC 6570 technically only defines expansion and not matching. So in reality it's awkward for us to be setting resources as strings in the form of RFC 6570 and then the SDKs somehow have to be able to match a resources/read request to the resource template. Additionally actually pulling the values out of the fully formed URL into an RFC 6570 adds a crazy amount of complexity.

So idk dude, I'm going to draft this for now and figure out:

  1. Rip out a lot of it and only the stuff people really use
    • technically not spec compliant as MCP says it should be the full RFC
    • potentially way simpler to implement
    • covers pretty much all/most cases people will actually use resource templates for (who the heck is going to implement the full rfc like this pr does on the client side... I guess llms who are trying to put data in might?)
  2. Only do matching and not value extraction
    • seems not great
  3. Leave the code as is
    • very complex
    • potentially high maintenance burden if anything is wrong
    • but maybe it's fine and if something break llms are genuinely smart enough to handle this

either way, will leave as draft for a few days and think through it.

@maxisbey maxisbey marked this pull request as draft March 27, 2026 23:22
@maxisbey maxisbey force-pushed the feat/uri-template-rfc6570 branch from 8b5ca89 to 0517df0 Compare June 26, 2026 10:59
@maxisbey maxisbey marked this pull request as ready for review June 26, 2026 11:04

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found and verified against the latest diff

Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.

Fix all with cubic | Re-trigger cubic

Comment thread docs/advanced/uri-templates.md Outdated
Comment thread src/mcp/shared/uri_template.py Outdated
Comment thread docs/migration.md Outdated
Comment thread docs/advanced/uri-templates.md Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 7 files (changes from recent commits).

Tip: Review your code locally with the cubic CLI to iterate faster.

Fix all with cubic | Re-trigger cubic

Comment thread docs/migration.md Outdated
Comment thread src/mcp/shared/uri_template.py Outdated

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest revisions address all the issues I flagged in earlier review rounds (the floor-bounded suffix scan, the adjacency/ReDoS parse-time rejections, and the doc fixes), and I didn't find new problems — but given the scope here (a new linear-scan RFC 6570 matcher, default-on path-safety policy, and several breaking changes to template matching), this still needs a human maintainer's review of the overall design and maintenance trade-off you raised.

Extended reasoning...

Overview

This PR replaces the regex-based resource template matcher with a from-scratch RFC 6570 engine (mcp.shared.uri_template, ~1,100 lines), adds path-safety primitives (mcp.shared.path_security), and wires a configurable ResourceSecurity policy into MCPServer / @mcp.resource. It also introduces breaking behavior changes (exact literal matching, empty-value matches, decoration-time template validation, default rejection of traversal/absolute-path/null-byte values) plus new docs and a large test suite (~1,300 new test lines).

Security risks

The change is squarely security-relevant: it defines the path-traversal / absolute-path / null-byte rejection policy applied to client-controlled URIs, the safe_join containment boundary for filesystem handlers, and the parse-time rules that prevent quadratic-time matching on attacker-controlled input. Earlier review rounds surfaced two real ReDoS gaps and an unhandled AssertionError reachable from client-controlled URIs; all were fixed with regression tests, and this round found no further issues. Remaining risk is mostly in subtle matcher edge cases and in whether the default security policy (e.g. flagging x:y as Windows drive-relative) is the right ergonomic trade-off — judgment calls for maintainers, not bugs.

Level of scrutiny

This deserves the highest level of human scrutiny: it is a large new shared component on the resources/read hot path, it changes default behavior for existing users (breaking changes documented in the migration guide), and it commits the SDK to maintaining a bespoke RFC 6570 matcher long-term. The author has themselves questioned whether the maintenance burden is worth it relative to a reduced operator subset, which is exactly the kind of design decision a maintainer should weigh in on.

Other factors

Test coverage is extensive (matcher unit tests, path-security tests, e2e tests through Client, regression tests for every previously-found bug), and all prior inline findings from my earlier passes plus the cubic-bot findings have been addressed in follow-up commits. I did not find new defects in the current revision, so my non-approval is about scope, breaking changes, and the open design question — not outstanding correctness issues.

@maxisbey maxisbey force-pushed the feat/uri-template-rfc6570 branch 3 times, most recently from 828c6ca to 9438378 Compare June 26, 2026 15:20

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional findings (outside current diff — PR may have been updated during review):

  • 🟡 docs/advanced/uri-templates.md:388-410 — All 14 Python examples in the new docs/advanced/uri-templates.md are inline fenced code blocks rather than --8<-- includes from docs_src/, which is the convention every other docs/advanced/ page (except deprecated.md) follows and that pyproject.toml documents ("docs_src/ holds the complete, runnable code examples included into docs/*.md"). Because inline snippets are only ruff-linted with F821 ignored — never type-checked or executed — import omissions keep slipping through: the "Listing templates" example (lines ~388-410) still uses ServerRequestContext[Any] and Server(...) without importing them. Moving these examples into docs_src/ would catch this class of error mechanically.

    Extended reasoning...

    What the issue is

    The new docs/advanced/uri-templates.md page embeds all 14 of its Python examples as inline fenced code blocks, with zero --8<-- snippet includes. This breaks the project's documented convention: pyproject.toml (line ~170) states that "docs_src/ holds the complete, runnable code examples included into docs/*.md", and tests/test_examples.py reiterates that "the real validation of those files is pyright + ruff + tests/docs_src/". Every other page in docs/advanced/ except deprecated.md follows the convention — authorization.md (2 includes), low-level-server.md (6), middleware.md (1), multi-round-trip.md (2), oauth-clients.md (2), opentelemetry.md (1), pagination.md (2), session-groups.md (4) — and even docs/tutorial/resources.md, which this PR touches, includes docs_src/resources/tutorial00*.py.

    Why it matters in practice

    Inline snippets get a much weaker validation pipeline than docs_src/ files. tests/test_examples.py lints inline docs/advanced snippets with ruff only and explicitly ignores F821 (undefined names, since snippets often lack imports); they are never type-checked by pyright and never executed. docs_src/ files, by contrast, are pyright-checked and exercised by tests/docs_src/. The consequence has already shown up repeatedly during this PR's review: an earlier round flagged a missing from typing import Any across all three low-level examples (fixed in 19822fb), and a later round flagged missing Server/ServerRequestContext imports in the third example.

    The remaining concrete symptom

    The "Listing templates" example (currently lines ~388-410) still has the problem. Step through what a user copying it would hit:

    1. The example's import block contains only from typing import Any and from mcp_types import ListResourceTemplatesResult, PaginatedRequestParams, ResourceTemplate.
    2. The handler signature on the next lines is async def on_list_resource_templates(ctx: ServerRequestContext[Any], ...)ServerRequestContext is not imported, so Python raises NameError: name 'ServerRequestContext' is not defined at definition time.
    3. After fixing that, the bottom of the example calls Server("my-server", ...)Server is not imported either, raising another NameError.
    4. The first two low-level examples on the same page ("Static resources" and "Matching requests") import both names correctly, so this is a copy-paste omission that the docs_src + pyright pipeline would have caught immediately.

    Several other snippets on the page also reference undefined helpers (check_health, db, helpdesk, log_store, root, describe_binary), which is acceptable for illustrative fragments but is exactly the case the F821 ignore exists for — and why import errors in the genuinely copy-paste-ready examples can't be detected.

    Why nothing prevents it

    The convention isn't enforced mechanically: nothing fails if a docs page uses inline blocks instead of includes. The only guard for inline blocks is ruff with F821 disabled, so undefined names — including missing imports — pass silently. Two review rounds of manual fixes on this same file demonstrate the gap.

    How to fix

    Move the runnable examples (at minimum the three low-level server examples, and ideally the @mcp.resource examples) into docs_src/ modules — there is already a docs_src/resources/ directory they could live in — and reference them via --8<-- includes, the same as the rest of docs/advanced/. As an immediate stopgap, add from mcp.server.context import ServerRequestContext and from mcp.server.lowlevel import Server to the "Listing templates" import block.

    This is a documentation-consistency/maintainability issue with no impact on shipped runtime code, hence nit severity. All verifiers agreed; there were no refutations.

Comment thread src/mcp/shared/uri_template.py Outdated
Comment thread src/mcp/server/mcpserver/server.py
Comment thread src/mcp/shared/uri_template.py
Comment thread docs/advanced/uri-templates.md
@maxisbey maxisbey force-pushed the feat/uri-template-rfc6570 branch 2 times, most recently from a994db9 to b03c2c1 Compare June 26, 2026 17:14
Replace the regex-based resource template matcher with a linear-time
RFC 6570 implementation (mcp.shared.uri_template.UriTemplate), add
filesystem path-safety primitives (mcp.shared.path_security), and
wire a configurable ResourceSecurity policy into MCPServer.

The matcher is a two-ended linear scan with no backtracking. Rather
than handling ambiguous templates with scan-time special cases, the
parser rejects them up front: two variables adjacent with no literal
between them (including a variable adjacent to the multi-segment
variable), more than one multi-segment variable, and more than one
{?...} expression all raise InvalidUriTemplate when the decorator
runs. Operators that emit their own delimiter ({.ext}, {/seg},
{;name}) anchor themselves and still compose with a multi-segment
variable. A handler parameter bound to an optional {?...}/{&...}
query variable must declare a Python default; this is also checked
at decoration time so the mistake cannot reach a request.

ResourceSecurity rejects extracted parameter values that contain a
null byte, look like an absolute path, or would resolve outside their
starting directory. A rejection is indistinguishable on the wire from
a not-found resource (-32602) and halts template iteration so a later
permissive template is never tried. safe_join() is exported for
filesystem handlers. UriTemplate is re-exported at the top level so
clients can expand a template a server advertises.

Beyond the example-based suite, two seeded property tests cover the
whole space the parser accepts: match(expand(v)) round-trips and
re-expands to the same URI for every accepted template, and match()
never raises on any input.

Docs: a tested reference page at docs/advanced/uri-templates.md with
runnable examples under docs_src/uri_templates/, a forward link from
the resources tutorial, and migration notes for every behaviour
change.
@maxisbey maxisbey force-pushed the feat/uri-template-rfc6570 branch from b03c2c1 to c75e71c Compare June 26, 2026 17:23
Comment on lines +1084 to +1113
# to the stop-char scan when the value is empty.
if uri.startswith(nxt.text, pos):
# Following literal begins immediately: value is empty.
# Checked before '=' so a literal that itself starts
# with '=' is not mistaken for the ifemp separator.
result[var.name] = ""
continue
if pos < limit and uri[pos] == "=":
pos += 1 # value follows; fall through to the scan
else:
# The following literal does not start here and there is
# no '=': the URI's name continued past the template's
# (e.g. ;keys vs ;key) — no parse.
return None

# Latest valid end: the var stops at the first stop-char or
# the scan limit, whichever comes first.
latest = pos
while latest < limit and uri[latest] not in stops:
latest += 1

# First occurrence of the following literal: the capture takes
# the minimum span, leaving the greedy variable as much of the
# URI as possible. The search window's upper bound already
# forces any hit to start at or before ``latest``, so the var
# never extends past a stop-char.
end = uri.find(nxt.text, pos, latest + len(nxt.text))
if end == -1:
return None

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 In _scan_prefix, an ifemp capture ({;name}) sitting before the template's greedy variable is matched lazily after consuming the =, so when the parameter's value itself begins with the following literal the capture comes back empty and the rest of the value leaks into the greedy variable — e.g. UriTemplate.parse('file://x{;rev}.{+path}') matching its own expand({'rev': '.5', 'path': 'a/b'}) ('file://x;rev=.5.a/b') returns {'rev': '', 'path': '5.a/b'}, which re-expands to 'file://x;rev.5.a/b' (no =), i.e. not a valid pre-image of the URI. Since expand() never emits ;name= for an empty value, an empty capture reached via the = branch can simply be rejected (return None) or a longer capture preferred, turning the silently-wrong match into a clean non-match.

Extended reasoning...

What the bug is

_scan_prefix handles a {;name} (ifemp) capture that sits in the prefix before the template's single greedy variable. The ifemp branch first checks whether the following literal starts at pos (empty value, bare ;name form); otherwise it requires a = and advances past it, then falls through to the lazy scan that ends the capture at the first occurrence of the following literal:

end = uri.find(nxt.text, pos, latest + len(nxt.text))

If the parameter's value itself begins with the following literal text, this find lands at pos itself: the capture becomes the empty string, and everything that should have been the value is left for the greedy variable to absorb.

Step-by-step proof

t = UriTemplate.parse('file://x{;rev}.{+path}')
uri = t.expand({'rev': '.5', 'path': 'a/b'})   # 'file://x;rev=.5.a/b'
t.match(uri)                                    # {'rev': '', 'path': '5.a/b'}   <- wrong
t.expand(t.match(uri))                          # 'file://x;rev.5.a/b'           != uri

Walking through _scan_prefix for 'file://x;rev=.5.a/b':

  1. The literal 'file://x;rev' is consumed; pos now points at '='.
  2. uri.startswith('.', pos) is False (the char is '='), so the empty/bare-name branch is not taken.
  3. uri[pos] == '='pos += 1; the cursor now points at '.', the first character of the value '.5'.
  4. The lazy uri.find('.', pos, ...) finds the following literal '.' immediately at pos, so uri[pos:end] is ''rev = ''.
  5. The greedy {+path} then takes the gap, producing path = '5.a/b'.

A second reproduction with a non-dot literal: UriTemplate.parse('api{;key}v{+rest}')expand({'key': 'v1', 'rest': '/data'}) gives 'api;key=v1v/data', but match() returns {'key': '', 'rest': '1v/data'}, re-expanding to 'api;keyv1v/data'.

Why this is not the documented lazy/greedy ambiguity

For non-ifemp operators a lazy split still yields a valid alternative pre-image (the module docstring documents this), and the match() docstring's round-trip caveat only covers values containing the operator's separator unencoded — '.5' contains no ;, so the caveat does not apply here. The ; operator is special because its wire form differs between empty (;rev, no =) and non-empty (;rev=v): the returned assignment {'rev': '', 'path': '5.a/b'} re-expands to a different URI, so it is not a pre-image of the input at all. The handler silently receives values the client never sent.

Why existing safeguards miss it

_partition_greedy only rejects two captures that are directly adjacent; here a literal (. or v) sits between {;rev} and {+path}, so the template parses — templates of exactly this shape (api{;key}/{+rest}) are explicitly in the allowed-templates test list. The PR's property test (test_match_inverts_expand_for_every_parseable_template) asserts expand(match(expand(v))) == expand(v), which this case violates, but its value alphabet (lowercase + digits) is deliberately disjoint from its literal alphabet, so it never generates a value that begins with the following literal and never trips on this.

Impact

A server registering a template like file://x{;rev}.{+path} or api{;key}v{+rest} gets an empty value for the ; parameter and a polluted greedy value for URIs the template's own expand() produces — whenever the ; value starts with the literal that follows it. No error is raised, so this is silently-wrong handler arguments rather than a crash. The trigger is narrow ({;var} + literal + greedy variable, value starting with that literal), so this is non-blocking but worth fixing since it is in code newly added by this PR.

How to fix

Since expand() never emits ;name= with an empty value, an empty capture reached via the = branch can never correspond to a real expansion. Simplest fix: in the ifemp branch of _scan_prefix, after the lazy find, return None when the capture is empty and it was reached via the = branch — turning the wrong match into a clean non-match. Alternatively, prefer a longer capture in that case (e.g. search again from pos + 1), which would make the round-trip succeed for these inputs.

@maxisbey maxisbey enabled auto-merge (squash) June 26, 2026 18:28
@maxisbey maxisbey merged commit 24717cc into main Jun 26, 2026
34 checks passed
@maxisbey maxisbey deleted the feat/uri-template-rfc6570 branch June 26, 2026 18:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants