feat(aicore/fallback): add opt-in model fallback for Orchestration v2 #185
Open
lenin-ribeiro wants to merge 4 commits into
Add a sibling 'fallback' subpackage exposing FallbackModel, FallbackConfig (dataclasses) and a set_fallbacks() entry point that mirrors the filtering API style. Fallback is opt-in: set_aicore_config() does not enable it; the developer activates it explicitly via set_fallbacks() or by setting AICORE_FALLBACK_ENABLED=true and AICORE_FALLBACK_MODELS / _CONFIG.

The litellm SAP provider already builds modules as a list when fallback_sap_modules is present in optional_params. The SDK now injects that kwarg from the active FallbackConfig via the existing transport patch.

Refactor filtering/filters.py to host both concerns in a single subclass, OrchestrationPatchConfig (FilteringOrchestrationConfig kept as alias):
- _install split into _install_filter + _install_fallback sharing _apply_patch(); _install retained as alias for back-compat.
- transform_request now injects fallback_sap_modules before super(), and BROADCASTS filtering to every module entry (was modules[0] only).
- transform_response attaches response.intermediate_failures from the body so callers can inspect which preferences were skipped. Non-streaming only in v1; streaming surfacing is deferred.

Tests:
- 35 new unit tests across test_fallback_config.py, test_patch.py and test_set_fallbacks.py covering dataclass shape, env parsing, patch injection, filtering broadcast, intermediate_failures attachment, and install lifecycle composition with filtering.
- New BDD fallback.feature + test_fallback_bdd.py with 4 scenarios (primary success, primary unsupported -> fallback used, filtering composition, streaming + fallback). conftest skips cleanly when AICORE_FALLBACK_TEST_* env vars are missing.
- Bump expected enum count for AICORE_SET_FALLBACKS.

Docs & ops:
- user-guide.md gains a Model Fallback (opt-in) section with programmatic and env-driven examples, composition with filtering, and the v1 streaming limitation.
- .env_integration_tests.example documents the new AICORE_FALLBACK_TEST_PRIMARY_MODEL / _FALLBACK_MODEL secrets.
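The env-driven opt-in could be resolved roughly as follows. This is a minimal sketch, not the SDK's parser: `FallbackModelSketch`, its field names, `parse_fallback_env`, and the assumed JSON shape of `AICORE_FALLBACK_CONFIG` are all hypothetical. Only the variable names, the `false` default, and the precedence of the JSON form over the comma list come from this PR.

```python
import json
import os
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional


@dataclass
class FallbackModelSketch:
    """Hypothetical stand-in for FallbackModel; field names are assumptions."""

    name: str
    params: dict[str, Any] = field(default_factory=dict)
    version: Optional[str] = None


def parse_fallback_env(
    env: Mapping[str, str] = os.environ,
) -> Optional[list[FallbackModelSketch]]:
    """Return the declared fallback models, or None when fallback is not enabled."""
    # Opt-in: anything other than an explicit "true" leaves fallback off.
    if env.get("AICORE_FALLBACK_ENABLED", "false").strip().lower() != "true":
        return None

    # Tier 2 (full per-model config) takes precedence over the simple list.
    # Assumed shape: a JSON list of objects with name / params / version keys.
    raw_config = env.get("AICORE_FALLBACK_CONFIG")
    if raw_config:
        return [FallbackModelSketch(**entry) for entry in json.loads(raw_config)]

    # Tier 1: comma-separated model names, blanks ignored.
    raw_models = env.get("AICORE_FALLBACK_MODELS", "")
    return [
        FallbackModelSketch(name=item.strip())
        for item in raw_models.split(",")
        if item.strip()
    ]
```

Taking the environment as a mapping argument keeps the sketch testable without mutating `os.environ`.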
…odule entries litellm's transform_request only builds the primary module's template from `messages`; fallback entries get whatever was popped from their dict's "messages" key (transformation.py:371), which is `[]` for FallbackModel.to_dict(). The orchestration server then rejected with "config.modules[N].prompt_templating.prompt.template should be non-empty". Mirrors the existing filtering broadcast in the same transform_request. Adds a realistic unit test (and helper) that would have caught this before integration — the previous list-modules fixture hardcoded an empty template on both the primary AND fallback entries, normalising the bug away.
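The two broadcasts can be pictured on a plain request body. This is a sketch under assumptions, not the code in this branch: `broadcast_to_modules` is a hypothetical helper and the `"filtering"` key on each module entry is an assumed name. The `prompt_templating.prompt.template` path is taken from the server error quoted above.

```python
import copy
from typing import Any, Optional


def broadcast_to_modules(
    body: dict[str, Any],
    filtering: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Copy the primary prompt template (and optional filtering) to every entry."""
    out = copy.deepcopy(body)  # leave the caller's body untouched
    modules = out["config"]["modules"]
    primary_template = modules[0]["prompt_templating"]["prompt"]["template"]

    # Fallback entries arrive with an empty template; give each one its own
    # copy of the primary template so the server accepts the request.
    for entry in modules[1:]:
        prompt = entry.setdefault("prompt_templating", {}).setdefault("prompt", {})
        if not prompt.get("template"):
            prompt["template"] = copy.deepcopy(primary_template)

    # Filtering applies to the primary AND every fallback entry.
    if filtering is not None:
        for entry in modules:
            entry["filtering"] = copy.deepcopy(filtering)  # key name is an assumption

    return out
```

A fixture for this should start from a populated primary template and an empty fallback template, which is the shape the earlier fixture hid.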
Parity with sibling subpackages aicore/ and aicore/filtering/, both of which already ship a py.typed marker. The parent package marker already covers the subpackage transitively, but the one-marker-per-subpackage convention is what docs/GUIDELINES.md prescribes.
…d filtering package

The parent branch refactored aicore/filtering/filters.py into four files (_api.py, _models.py, _patch.py, config.py). This branch's fallback code hooked directly into filters.py; the merge requires porting:
- src/sap_cloud_sdk/aicore/fallback/_patch.py (new): owns OrchestrationPatchConfig (now a subclass of FilteringOrchestrationConfig), _active_fallback_cfg, and _install_fallback. Keeps the same hooks as before: fallback_sap_modules injection, prompt-template broadcast to every fallback module entry, filtering broadcast across all entries (overriding the parent's primary-only injection), intermediate_failures attachment.
- src/sap_cloud_sdk/aicore/filtering/_patch.py: _install now defers to the installed fallback subclass when _active_fallback_cfg is set, so calling set_filtering() while fallback is active no longer clobbers the patch. Lazy import of fallback._patch avoids a circular dependency.
- src/sap_cloud_sdk/aicore/fallback/fallback.py: import _install_fallback from the new ._patch module instead of the deleted filtering.filters.
- tests/aicore/fallback/unit/test_patch.py + test_set_fallbacks.py: rewired to the new import paths. Adjusted test_patch_installed_when_only_filtering to assert FilteringOrchestrationConfig (not OrchestrationPatchConfig) is installed; filtering-only no longer uses the combined subclass under the refactored design.

Local verification: pytest tests/aicore → 145 passed, 8 skipped; pytest tests (sans live-credential integration suites) → 2610 passed, 73 skipped; ruff check + ruff format --check + ty check → all green.
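The "defer to the fallback subclass" rule can be sketched in a single file. All names ending in `Sketch` are hypothetical stand-ins, and the module-level state is a simplification. In the real layout the combined class lives in a separate module and is imported lazily inside the install function.

```python
from typing import Any, Optional


class FilteringPatchSketch:
    """Stand-in for the filtering-only patch class."""


class CombinedPatchSketch(FilteringPatchSketch):
    """Stand-in for the fallback subclass that also handles filtering."""


_active_fallback_cfg: Optional[Any] = None  # non-None while fallback is enabled
_installed_patch: Optional[type] = None  # whichever class is currently patched in


def set_active_fallback_sketch(cfg: Optional[Any]) -> None:
    """Record the active fallback config; None clears it."""
    global _active_fallback_cfg
    _active_fallback_cfg = cfg


def install_filtering_sketch() -> type:
    """Install filtering without clobbering an active fallback patch."""
    global _installed_patch
    if _active_fallback_cfg is not None:
        # A real split-module layout would import the combined class here,
        # inside the function, to avoid a circular import at module load.
        _installed_patch = CombinedPatchSketch
    else:
        _installed_patch = FilteringPatchSketch
    return _installed_patch
```

With no fallback active the filtering-only class is installed, which matches the adjusted `test_patch_installed_when_only_filtering` expectation.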
Description
Adds opt-in model fallback for SAP AI Core Orchestration v2 to the `sap_cloud_sdk.aicore` module.

Orchestration v2 supports preference-ordered fallback module configurations: when the primary call fails (model unsupported in region, 429, 408, or any 5xx; for streaming, only the unsupported-model case), the server transparently retries with the next preference. The underlying litellm SAP provider already builds `body["config"]["modules"]` as a list when `fallback_sap_modules` is present in `optional_params`; what was missing was the SDK-side ergonomic surface and the response-side visibility into which preferences were skipped.

This PR introduces:

- `FallbackModel`, `FallbackConfig`: typed dataclasses for declaring per-preference model + params + version.
- `set_fallbacks(config)`: single entry point mirroring `set_filtering()`. Fallback is opt-in: `set_aicore_config()` does NOT activate it. Developers either call `set_fallbacks(...)` programmatically or set `AICORE_FALLBACK_ENABLED=true` (with `AICORE_FALLBACK_MODELS` or `AICORE_FALLBACK_CONFIG`) and call `set_fallbacks()` with no args.
- `response.intermediate_failures`: when the fallback path fires, the per-preference failure list from the orchestration response is surfaced as an attribute on the returned `ModelResponse`. It is `None` when the primary succeeded, which makes it useful as a quick check.
- `FilteringOrchestrationConfig` is renamed to `OrchestrationPatchConfig` (alias kept for back-compat) and now owns both filtering and fallback. One install/uninstall path, no ordering issues.
- Filtering is broadcast to every module entry (was `modules[0]` only). Consistent SDK-side default; if a fallback should run unfiltered, the developer can call `disable_filtering()` before the call.

Related Issue
N/A — additive feature, no issue tracked
Type of Change
How to Test
Unit tests (no live credentials required)
Expect 142 passed, 8 skipped (the 8 skips are integration scenarios waiting for live env vars).
Integration tests (requires AI Core access)
Copy `.env_integration_tests.example` to `.env_integration_tests` and fill in the AI Core creds.

Manual smoke
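A manual check largely comes down to reading `intermediate_failures` off the response. The following is a minimal sketch that uses a stand-in object rather than a live `ModelResponse`. `fallback_summary` is a hypothetical helper, and the shape of each failure entry is an assumption.

```python
from types import SimpleNamespace
from typing import Any


def fallback_summary(response: Any) -> str:
    """Describe whether the primary model answered or a fallback was used."""
    # getattr with a default also covers responses that never got the attribute.
    failures = getattr(response, "intermediate_failures", None)
    if not failures:
        return "primary succeeded"
    return f"fallback used after {len(failures)} failed preference(s)"


# Stand-ins for real responses; entry contents are illustrative only.
primary_ok = SimpleNamespace(intermediate_failures=None)
fell_back = SimpleNamespace(intermediate_failures=[{"status": 429}])

print(fallback_summary(primary_ok))
print(fallback_summary(fell_back))
```

Against a live deployment, the same helper would be applied to the object returned by a non-streaming completion call, since the attribute is not surfaced for streaming in v1.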
Checklist
- Local checks: `pytest tests/aicore`, `ruff check`, `ruff format --check`, `ty check`
- `aicore/user-guide.md` gains a "Model Fallback (opt-in)" section

Breaking Changes
None for users of `sap_cloud_sdk.aicore` public APIs. All existing names (`FilteringOrchestrationConfig`, `_install`, etc.) remain importable via aliases.

There is one user-visible behavioural change that only affects users who use filtering AND fallback together (an impossible combination on `main` today since fallback didn't exist): when both are active, the filtering configuration now applies to every module entry (primary + every fallback), not just `modules[0]`. This is the safe-by-default semantic: to run a fallback unfiltered, explicitly call `disable_filtering()` before that call. Documented in the user guide.

Additional Notes
Design choices (selected via user Q&A during planning)
- `set_fallbacks()` entry point, global state, no `disable_fallbacks()` (developers either opt in via `set_fallbacks()` or never call it; runtime clearing is `set_fallbacks(None)`).
- `OrchestrationPatchConfig` handles filtering injection AND fallback injection in one `transform_request`. Single install/uninstall lifecycle. Idempotent.
- `AICORE_FALLBACK_ENABLED` defaults to `false` (unlike filtering, which is on by default after `set_aicore_config()`). Two-tier schema: `AICORE_FALLBACK_MODELS` (comma list, simple case) + `AICORE_FALLBACK_CONFIG` (JSON, full per-model config; takes precedence).
- `intermediate_failures` on the response object. Pydantic `ModelResponse` uses `extra="allow"`, so we can attach the field directly. Accessed via `getattr(response, "intermediate_failures", None)`.

v1 limitations (documented)
- `intermediate_failures` is surfaced for non-streaming responses only. Capturing the field from `SAPStreamIterator` chunks requires deeper changes to litellm internals and is deferred to a future iteration. The streaming integration test asserts that fallback still fires correctly server-side; it doesn't assert `intermediate_failures`.

Telemetry
Added `Operation.AICORE_SET_FALLBACKS = "set_fallbacks"`. The `set_fallbacks` entry point is decorated with `@record_metrics(Module.AICORE, Operation.AICORE_SET_FALLBACKS)` per `docs/GUIDELINES.md`.

Files added / changed