Let explicit selectors route unverifiable ARNs to BedrockInvokeModel#10
Let explicit selectors route unverifiable ARNs to BedrockInvokeModel#10samuelboland wants to merge 1 commit into
Conversation
invoke_model? consulted anthropic_model? before the selector, so an
application-inference-profile ARN with no provider_name metadata could
never route to InvokeModel — even when an Array or Proc selector opted
in explicitly. Since assume_model_exists chats carry no registry
metadata, every ARN-routed request fell back to Converse regardless of
the operator's configuration, and the refusal warning's own advice
("use an Array or Proc selector to opt in explicitly") was impossible
to follow.
Split vendor verification into two tiers: provably non-Anthropic ids
(known vendor prefixes, bare or cross-region) are never routed under
any selector; unverifiable ids (ARNs without metadata) route when an
Array or Proc opts in explicitly. Only the blanket true selector still
requires positive verification, since it expresses 'all Anthropic
models' rather than a specific opt-in.
|
🤖 Supernova Code Review — View trace |
There was a problem hiding this comment.
The fix is logically correct and well-motivated. The two-tier vendor verification — hard-blocking known non-Anthropic prefixes at the top of invoke_model? while letting Array/Proc selectors bypass anthropic_model? for unverifiable ARNs — cleanly solves the problem described. Test coverage is thorough, including the key negative cases (non-Anthropic ids stay on Converse even under a permissive Proc selector).
Two minor observations inlined; neither is a blocker.
|
|
||
| # Matches known non-Anthropic vendor ids in both bare ("amazon.nova-pro-v1:0") and | ||
| # cross-region ("us.amazon.nova-pro-v1:0") forms. | ||
| NON_ANTHROPIC_PATTERN = /\A(?:[a-z0-9-]+\.)?(?:#{NON_ANTHROPIC_VENDORS.join('|')})\./ |
There was a problem hiding this comment.
nit: the optional geo-prefix segment [a-z0-9-]+ in NON_ANTHROPIC_PATTERN is unboundedly wide — it would treat any single-dot-prefixed string (e.g. a hypothetical "foo.amazon.x") as a cross-region non-Anthropic id. The existing cross-region geo codes are all short (us, eu, apac, global, jp, au, us-gov), so a tighter bound like [a-z]{2,6}(?:-[a-z]+)? would make the intent clearer and prevent surprise matches on longer prefixes. Low severity since real Bedrock model IDs don't look like that today.
| context 'with an unverifiable ARN id (no provider_name metadata)' do # rubocop:disable RSpec/MultipleMemoizedHelpers | ||
| before { allow(RubyLLM.logger).to receive(:warn) } | ||
|
|
||
| it 'routes to BedrockInvokeModel when an Array selector lists the ARN' do |
There was a problem hiding this comment.
The before { allow(RubyLLM.logger).to receive(:warn) } stub is shared across all four examples in this context, but the three cases that bypass anthropic_model? (Array, Proc-true, Proc-false) never actually trigger the warning path. Consider adding a negative assertion to the first two cases (expect(RubyLLM.logger).not_to have_received(:warn)) to confirm the warning is silenced by bypassing anthropic_model? rather than by coincidence. The test for the true selector case at line 256–260 already does this correctly.
Problem
invoke_model?consultsanthropic_model?before the selector, so an application-inference-profile ARN with noprovider_namemetadata can never route to InvokeModel — even when an Array or Proc selector opts in explicitly. The refusal warning's own advice ("Use an Array or Proc selector with bedrock_use_invoke_model to opt in explicitly") is impossible to follow.This breaks agents_app in staging/production: chats there use application-inference-profile ARNs as model ids (cost-allocation tagging), and every chat is created with
assume_model_exists: true, which buildsModel::Info.default— no registry metadata ever reaches the request-time model object. Result: with theBEDROCK_INVOKE_MODELflag on, every request silently fell back to Converse, wherewith_thinking(effort:)renders as a barereasoning_effortfield that Anthropic rejects with a 400 ("reasoning_effort: Extra inputs are not permitted") — on every model.Fix
Split vendor verification into two tiers:
amazon.,meta., etc., bare or cross-region likeus.amazon.) are never routed, under any selector. The InvokeModel payload is Anthropic Messages format and would be rejected.Model::Infocarries noprovider_name) route when the selector opts in explicitly (Array listing the id, or Proc returning truthy). An operator naming the exact id is the verification.trueselector still requires positive verification viaanthropic_model?, since it expresses "all Anthropic models", not "this specific model".Verified against the exact production scenario:
Model::Info.default(<staging profile ARN>, 'bedrock')with a Proc selector now routes toBedrockInvokeModel;us.amazon.nova-pro-v1:0under the same Proc still routes to Converse. Full suite: 1628 examples, 0 failures.🤖 Generated with Claude Code