feat: RFC 6570 URI templates with operator-aware security#2356
Conversation
There was a problem hiding this comment.
No new bugs found and the two issues from my earlier review are fixed, but this is a large feature PR (~1000 lines of new security-sensitive code across uri_template.py and path_security.py) with breaking changes that warrants human review.
Extended reasoning...
Overview
This PR replaces the regex-based resource template matcher with a full RFC 6570 URI template engine (uri_template.py, ~876 lines), adds filesystem path safety primitives (path_security.py, ~156 lines), and wires both into the MCPServer decorator via a configurable ResourceSecurity policy. It also adds 376 lines of new documentation and extensive test coverage (~800 lines across two new test files plus additions to existing test files). The PR claims to close 4 issues and supersede 4 other PRs.
Security risks
The path_security.py module implements path-traversal detection, absolute-path detection, and safe_join — all security-critical primitives. The contains_path_traversal function uses a depth-tracking algorithm rather than a simple substring check, which is the right approach but needs careful human verification. The ResourceSecurity policy is secure-by-default (rejecting traversal and absolute paths), which is good. However, the correctness of these security boundaries is too important for automated approval alone.
The uri_template.py module includes ReDoS protection that rejects quadratic-backtracking patterns at parse time. The design choices here (which patterns to allow vs reject) are architectural decisions that benefit from human review.
Level of scrutiny
This PR requires high scrutiny. It introduces ~1000 lines of new production code in two new modules, contains multiple breaking changes, and touches security-sensitive code paths (path traversal protection, URI parsing for resource routing). The RFC 6570 implementation involves complex regex construction and matching logic that could have subtle edge cases beyond what automated testing catches.
Other factors
The test coverage is thorough (211 tests for URI templates, 50 for path security, plus E2E tests), and the two bugs I found in my previous review were promptly fixed with regression tests. The documentation is comprehensive. However, the breaking changes listed in the migration guide affect existing users and represent design decisions (e.g., static URI + Context-only handler now errors, query params match leniently by default) that a maintainer should explicitly sign off on.
|
putting this back as a draft as idk if the maintenance burden is worth it. This is defined in the MCP spec as RFC 6570 but like... RFC 6570 technically only defines expansion and not matching. So in reality it's awkward for us to be setting resources as strings in the form of RFC 6570 and then the SDKs somehow have to be able to match a So idk dude, I'm going to draft this for now and figure out:
either way, will leave as draft for a few days and think through it. |
8b5ca89 to
0517df0
Compare
There was a problem hiding this comment.
2 issues found and verified against the latest diff
Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.
Fix all with cubic | Re-trigger cubic
There was a problem hiding this comment.
1 issue found across 7 files (changes from recent commits).
Tip: Review your code locally with the cubic CLI to iterate faster.
Fix all with cubic | Re-trigger cubic
There was a problem hiding this comment.
The latest revisions address all the issues I flagged in earlier review rounds (the floor-bounded suffix scan, the adjacency/ReDoS parse-time rejections, and the doc fixes), and I didn't find new problems — but given the scope here (a new linear-scan RFC 6570 matcher, default-on path-safety policy, and several breaking changes to template matching), this still needs a human maintainer's review of the overall design and maintenance trade-off you raised.
Extended reasoning...
Overview
This PR replaces the regex-based resource template matcher with a from-scratch RFC 6570 engine (mcp.shared.uri_template, ~1,100 lines), adds path-safety primitives (mcp.shared.path_security), and wires a configurable ResourceSecurity policy into MCPServer / @mcp.resource. It also introduces breaking behavior changes (exact literal matching, empty-value matches, decoration-time template validation, default rejection of traversal/absolute-path/null-byte values) plus new docs and a large test suite (~1,300 new test lines).
Security risks
The change is squarely security-relevant: it defines the path-traversal / absolute-path / null-byte rejection policy applied to client-controlled URIs, the safe_join containment boundary for filesystem handlers, and the parse-time rules that prevent quadratic-time matching on attacker-controlled input. Earlier review rounds surfaced two real ReDoS gaps and an unhandled AssertionError reachable from client-controlled URIs; all were fixed with regression tests, and this round found no further issues. Remaining risk is mostly in subtle matcher edge cases and in whether the default security policy (e.g. flagging x:y as Windows drive-relative) is the right ergonomic trade-off — judgment calls for maintainers, not bugs.
Level of scrutiny
This deserves the highest level of human scrutiny: it is a large new shared component on the resources/read hot path, it changes default behavior for existing users (breaking changes documented in the migration guide), and it commits the SDK to maintaining a bespoke RFC 6570 matcher long-term. The author has themselves questioned whether the maintenance burden is worth it relative to a reduced operator subset, which is exactly the kind of design decision a maintainer should weigh in on.
Other factors
Test coverage is extensive (matcher unit tests, path-security tests, e2e tests through Client, regression tests for every previously-found bug), and all prior inline findings from my earlier passes plus the cubic-bot findings have been addressed in follow-up commits. I did not find new defects in the current revision, so my non-approval is about scope, breaking changes, and the open design question — not outstanding correctness issues.
828c6ca to
9438378
Compare
There was a problem hiding this comment.
Additional findings (outside current diff — PR may have been updated during review):
-
🟡
docs/advanced/uri-templates.md:388-410— All 14 Python examples in the new docs/advanced/uri-templates.md are inline fenced code blocks rather than--8<--includes fromdocs_src/, which is the convention every other docs/advanced/ page (except deprecated.md) follows and that pyproject.toml documents ("docs_src/ holds the complete, runnable code examples included into docs/*.md"). Because inline snippets are only ruff-linted with F821 ignored — never type-checked or executed — import omissions keep slipping through: the "Listing templates" example (lines ~388-410) still usesServerRequestContext[Any]andServer(...)without importing them. Moving these examples into docs_src/ would catch this class of error mechanically.Extended reasoning...
What the issue is
The new
docs/advanced/uri-templates.mdpage embeds all 14 of its Python examples as inline fenced code blocks, with zero--8<--snippet includes. This breaks the project's documented convention:pyproject.toml(line ~170) states that "docs_src/ holds the complete, runnable code examples included into docs/*.md", andtests/test_examples.pyreiterates that "the real validation of those files is pyright + ruff + tests/docs_src/". Every other page indocs/advanced/exceptdeprecated.mdfollows the convention — authorization.md (2 includes), low-level-server.md (6), middleware.md (1), multi-round-trip.md (2), oauth-clients.md (2), opentelemetry.md (1), pagination.md (2), session-groups.md (4) — and evendocs/tutorial/resources.md, which this PR touches, includesdocs_src/resources/tutorial00*.py.Why it matters in practice
Inline snippets get a much weaker validation pipeline than
docs_src/files.tests/test_examples.pylints inlinedocs/advancedsnippets with ruff only and explicitly ignores F821 (undefined names, since snippets often lack imports); they are never type-checked by pyright and never executed.docs_src/files, by contrast, are pyright-checked and exercised bytests/docs_src/. The consequence has already shown up repeatedly during this PR's review: an earlier round flagged a missingfrom typing import Anyacross all three low-level examples (fixed in 19822fb), and a later round flagged missingServer/ServerRequestContextimports in the third example.The remaining concrete symptom
The "Listing templates" example (currently lines ~388-410) still has the problem. Step through what a user copying it would hit:
- The example's import block contains only
from typing import Anyandfrom mcp_types import ListResourceTemplatesResult, PaginatedRequestParams, ResourceTemplate. - The handler signature on the next lines is
async def on_list_resource_templates(ctx: ServerRequestContext[Any], ...)—ServerRequestContextis not imported, so Python raisesNameError: name 'ServerRequestContext' is not definedat definition time. - After fixing that, the bottom of the example calls
Server("my-server", ...)—Serveris not imported either, raising anotherNameError. - The first two low-level examples on the same page ("Static resources" and "Matching requests") import both names correctly, so this is a copy-paste omission that the docs_src + pyright pipeline would have caught immediately.
Several other snippets on the page also reference undefined helpers (
check_health,db,helpdesk,log_store,root,describe_binary), which is acceptable for illustrative fragments but is exactly the case the F821 ignore exists for — and why import errors in the genuinely copy-paste-ready examples can't be detected.Why nothing prevents it
The convention isn't enforced mechanically: nothing fails if a docs page uses inline blocks instead of includes. The only guard for inline blocks is ruff with F821 disabled, so undefined names — including missing imports — pass silently. Two review rounds of manual fixes on this same file demonstrate the gap.
How to fix
Move the runnable examples (at minimum the three low-level server examples, and ideally the
@mcp.resourceexamples) intodocs_src/modules — there is already adocs_src/resources/directory they could live in — and reference them via--8<--includes, the same as the rest of docs/advanced/. As an immediate stopgap, addfrom mcp.server.context import ServerRequestContextandfrom mcp.server.lowlevel import Serverto the "Listing templates" import block.This is a documentation-consistency/maintainability issue with no impact on shipped runtime code, hence nit severity. All verifiers agreed; there were no refutations.
- The example's import block contains only
a994db9 to
b03c2c1
Compare
Replace the regex-based resource template matcher with a linear-time
RFC 6570 implementation (mcp.shared.uri_template.UriTemplate), add
filesystem path-safety primitives (mcp.shared.path_security), and
wire a configurable ResourceSecurity policy into MCPServer.
The matcher is a two-ended linear scan with no backtracking. Rather
than handling ambiguous templates with scan-time special cases, the
parser rejects them up front: two variables adjacent with no literal
between them (including a variable adjacent to the multi-segment
variable), more than one multi-segment variable, and more than one
{?...} expression all raise InvalidUriTemplate when the decorator
runs. Operators that emit their own delimiter ({.ext}, {/seg},
{;name}) anchor themselves and still compose with a multi-segment
variable. A handler parameter bound to an optional {?...}/{&...}
query variable must declare a Python default; this is also checked
at decoration time so the mistake cannot reach a request.
ResourceSecurity rejects extracted parameter values that contain a
null byte, look like an absolute path, or would resolve outside their
starting directory. A rejection is indistinguishable on the wire from
a not-found resource (-32602) and halts template iteration so a later
permissive template is never tried. safe_join() is exported for
filesystem handlers. UriTemplate is re-exported at the top level so
clients can expand a template a server advertises.
Beyond the example-based suite, two seeded property tests cover the
whole space the parser accepts: match(expand(v)) round-trips and
re-expands to the same URI for every accepted template, and match()
never raises on any input.
Docs: a tested reference page at docs/advanced/uri-templates.md with
runnable examples under docs_src/uri_templates/, a forward link from
the resources tutorial, and migration notes for every behaviour
change.
b03c2c1 to
c75e71c
Compare
| # to the stop-char scan when the value is empty. | ||
| if uri.startswith(nxt.text, pos): | ||
| # Following literal begins immediately: value is empty. | ||
| # Checked before '=' so a literal that itself starts | ||
| # with '=' is not mistaken for the ifemp separator. | ||
| result[var.name] = "" | ||
| continue | ||
| if pos < limit and uri[pos] == "=": | ||
| pos += 1 # value follows; fall through to the scan | ||
| else: | ||
| # The following literal does not start here and there is | ||
| # no '=': the URI's name continued past the template's | ||
| # (e.g. ;keys vs ;key) — no parse. | ||
| return None | ||
|
|
||
| # Latest valid end: the var stops at the first stop-char or | ||
| # the scan limit, whichever comes first. | ||
| latest = pos | ||
| while latest < limit and uri[latest] not in stops: | ||
| latest += 1 | ||
|
|
||
| # First occurrence of the following literal: the capture takes | ||
| # the minimum span, leaving the greedy variable as much of the | ||
| # URI as possible. The search window's upper bound already | ||
| # forces any hit to start at or before ``latest``, so the var | ||
| # never extends past a stop-char. | ||
| end = uri.find(nxt.text, pos, latest + len(nxt.text)) | ||
| if end == -1: | ||
| return None | ||
|
|
There was a problem hiding this comment.
🟡 In _scan_prefix, an ifemp capture ({;name}) sitting before the template's greedy variable is matched lazily after consuming the =, so when the parameter's value itself begins with the following literal the capture comes back empty and the rest of the value leaks into the greedy variable — e.g. UriTemplate.parse('file://x{;rev}.{+path}') matching its own expand({'rev': '.5', 'path': 'a/b'}) ('file://x;rev=.5.a/b') returns {'rev': '', 'path': '5.a/b'}, which re-expands to 'file://x;rev.5.a/b' (no =), i.e. not a valid pre-image of the URI. Since expand() never emits ;name= for an empty value, an empty capture reached via the = branch can simply be rejected (return None) or a longer capture preferred, turning the silently-wrong match into a clean non-match.
Extended reasoning...
What the bug is
_scan_prefix handles a {;name} (ifemp) capture that sits in the prefix before the template's single greedy variable. The ifemp branch first checks whether the following literal starts at pos (empty value, bare ;name form); otherwise it requires a = and advances past it, then falls through to the lazy scan that ends the capture at the first occurrence of the following literal:
end = uri.find(nxt.text, pos, latest + len(nxt.text))If the parameter's value itself begins with the following literal text, this find lands at pos itself: the capture becomes the empty string, and everything that should have been the value is left for the greedy variable to absorb.
Step-by-step proof
t = UriTemplate.parse('file://x{;rev}.{+path}')
uri = t.expand({'rev': '.5', 'path': 'a/b'}) # 'file://x;rev=.5.a/b'
t.match(uri) # {'rev': '', 'path': '5.a/b'} <- wrong
t.expand(t.match(uri)) # 'file://x;rev.5.a/b' != uriWalking through _scan_prefix for 'file://x;rev=.5.a/b':
- The literal
'file://x;rev'is consumed;posnow points at'='. uri.startswith('.', pos)is False (the char is'='), so the empty/bare-name branch is not taken.uri[pos] == '='→pos += 1; the cursor now points at'.', the first character of the value'.5'.- The lazy
uri.find('.', pos, ...)finds the following literal'.'immediately atpos, souri[pos:end]is''→rev = ''. - The greedy
{+path}then takes the gap, producingpath = '5.a/b'.
A second reproduction with a non-dot literal: UriTemplate.parse('api{;key}v{+rest}') — expand({'key': 'v1', 'rest': '/data'}) gives 'api;key=v1v/data', but match() returns {'key': '', 'rest': '1v/data'}, re-expanding to 'api;keyv1v/data'.
Why this is not the documented lazy/greedy ambiguity
For non-ifemp operators a lazy split still yields a valid alternative pre-image (the module docstring documents this), and the match() docstring's round-trip caveat only covers values containing the operator's separator unencoded — '.5' contains no ;, so the caveat does not apply here. The ; operator is special because its wire form differs between empty (;rev, no =) and non-empty (;rev=v): the returned assignment {'rev': '', 'path': '5.a/b'} re-expands to a different URI, so it is not a pre-image of the input at all. The handler silently receives values the client never sent.
Why existing safeguards miss it
_partition_greedy only rejects two captures that are directly adjacent; here a literal (. or v) sits between {;rev} and {+path}, so the template parses — templates of exactly this shape (api{;key}/{+rest}) are explicitly in the allowed-templates test list. The PR's property test (test_match_inverts_expand_for_every_parseable_template) asserts expand(match(expand(v))) == expand(v), which this case violates, but its value alphabet (lowercase + digits) is deliberately disjoint from its literal alphabet, so it never generates a value that begins with the following literal and never trips on this.
Impact
A server registering a template like file://x{;rev}.{+path} or api{;key}v{+rest} gets an empty value for the ; parameter and a polluted greedy value for URIs the template's own expand() produces — whenever the ; value starts with the literal that follows it. No error is raised, so this is silently-wrong handler arguments rather than a crash. The trigger is narrow ({;var} + literal + greedy variable, value starting with that literal), so this is non-blocking but worth fixing since it is in code newly added by this PR.
How to fix
Since expand() never emits ;name= with an empty value, an empty capture reached via the = branch can never correspond to a real expansion. Simplest fix: in the ifemp branch of _scan_prefix, after the lazy find, return None when the capture is empty and it was reached via the = branch — turning the wrong match into a clean non-match. Alternatively, prefer a longer capture in that case (e.g. search again from pos + 1), which would make the round-trip succeed for these inputs.
Replaces the regex-based resource template matcher with a linear-time RFC 6570 implementation, adds configurable path-safety validation, and exports a standalone
UriTemplateusable from both client and low-level server.Motivation
The existing matcher only supports RFC 6570 Level 1 (
{var}). Users writingfile://{+path}for multi-segment paths hit a confusing "mismatch" error because the regex-based extraction doesn't understand the+operator. The TypeScript, C#, and Go SDKs all support Level 3+. The matcher also doesn't escape template literals, sodata://v1.0/{id}incorrectly matchesdata://v1X0/42.Separately, encoded
..%2Fsequences in extracted parameters are passed to handlers unchecked. The 2026 spec adds a normative requirement thatfile://resource servers sanitize paths.Closes #436, closes #220, closes #159, closes #378
Supersedes #197, #427, #1439, #2353
What's in here
mcp.shared.uri_template.UriTemplate— RFC 6570 Levels 1-3 plus path-style explode ({/var*},{.var*},{;var*}).parse()/expand()/match(). Re-exported at the top level so clients can expand a template a server advertises.The matcher is a two-ended linear scan with no backtracking. Rather than handling ambiguous templates with scan-time special cases, the parser rejects them up front: two variables adjacent with no literal between them, more than one multi-segment variable, and more than one
{?...}expression all raiseInvalidUriTemplateat decoration time. Operators that emit their own delimiter ({+path}{.ext},{+path}{/seg},{+path}{;k}) still compose. Query expressions match leniently (order-agnostic, partial, extras ignored), and a handler parameter bound to one must declare a Python default — also checked at decoration time.mcp.shared.path_security—contains_path_traversal(),is_absolute_path(),safe_join(). The last resolves and verifies containment, catching symlink escapes.ResourceSecuritypolicy — wired intoMCPServer(resource_security=...)and@mcp.resource(..., security=...). By default rejects..escapes, absolute paths, and null bytes. A rejection is indistinguishable on the wire from a not-found resource (-32602) and halts template iteration. Per-parameter opt-out viaexempt_params={"name"}.Tests
Example-based suites for
UriTemplateandpath_security, e2e tests throughClientfor the decorator and policy, plus two seeded property tests over the full space the parser accepts:match(expand(v))round-trips andmatch()never raises on any input. 100% coverage.Breaking changes
See
docs/migration.mdfor the full list. Headline items:ResourceSecurity){var}stops at structural delimiters and now matches an empty value{?...}, and other malformed templates raise at decoration time{?...}-bound handler parameter without a default raises at decoration timeDocs
Tested reference page at
docs/advanced/uri-templates.mdwith runnable examples underdocs_src/uri_templates/.AI Disclaimer