Skip to content

feat(api): Add global IP rate limiting framework (ADR-0022 phase 1)#846

Draft
ymh1874 wants to merge 1 commit into
openstack-experimental:mainfrom
ymh1874:feature/843-rate-limiting
Draft

feat(api): Add global IP rate limiting framework (ADR-0022 phase 1)#846
ymh1874 wants to merge 1 commit into
openstack-experimental:mainfrom
ymh1874:feature/843-rate-limiting

Conversation

@ymh1874

@ymh1874 ymh1874 commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Implements ADR-0022 phase 1: a handler-level, governor-backed rate-limiting framework wired on POST /v3/auth/tokens.
  • Adds a reusable RateLimitSection config struct and RateLimitState service field so future buckets (per-user, per-domain, per-IdP) are a one-field addition to the framework.
  • Fires the global per-IP check before password hashing (ADR-0022 Invariant 4); returns 429 Too Many Requests with Retry-After.

Depends on #842 (feature/358-capture-client-info). This branch is stacked on top of it — ConnectInfo<SocketAddr> populated by into_make_service_with_connect_info from that PR is required. Base should be changed to feature/358-capture-client-info once #842 merges.

ADR-0022 invariants addressed

# Invariant How
1 No hardcoded limits All limits from [rate_limit_global_ip] in keystone.conf
2 Fail-hard init RateLimitState::from_config returns KeystoneError::RateLimitConfig and aborts startup when enabled=true with zero burst or replenish rate
3 Uniform response Single Retry-After header; no key-identifying information exposed
4 Check before hash Rate limit fires before authenticate_request (password hash)
5 Distinct buckets One Arc<DefaultKeyedRateLimiter<String>> per bucket; deferred buckets are None fields
6 Monotonic clock governor::DefaultClock = QuantaClock on std targets (TSC-backed, always monotonic)

Invariants 7 (username normalization) and 8 (post-lookup per-user throttle) are deferred — they require keying on a confirmed user ID after DB lookup but before hashing, tracked as a follow-up driver refactor.

SPIFFE bypass

Internal (mTLS/TCP) and admin (mTLS/UDS) interfaces do not populate ConnectInfo<SocketAddr>; the handler receives None and skips IP limiting. Only the public TCP listener is subject to this check.

Note: Option<Extension<ConnectInfo<SocketAddr>>> is used (not Option<ConnectInfo<SocketAddr>>) because in axum 0.8, Option<T>: FromRequestParts<S> requires T: OptionalFromRequestParts<S>, which ConnectInfo<T> does not implement but Extension<T> does.

Files changed

File Change
Cargo.toml Add governor = "0.10" workspace dep
crates/core/Cargo.toml Use governor.workspace = true
crates/config/src/rate_limit.rs (new) RateLimitSection — reusable config struct for any bucket
crates/config/src/lib.rs Add rate_limit_global_ip: RateLimitSection to Config
crates/core-types/src/error.rs Add KeystoneError::RateLimitConfig
crates/core/src/rate_limit.rs (new) RateLimitState, check_ip, retain_recent, IPv6 /64 key derivation, unit tests
crates/core/src/keystone.rs Add rate_limiters: RateLimitState to Service
crates/api-types/src/error.rs Add KeystoneApiError::TooManyRequests { retry_after }
crates/api-types/src/error_conv.rs Map TooManyRequests → 429 + Retry-After header
crates/keystone/src/api/v3/auth/token/create.rs Add IP rate-limit check to create handler
crates/keystone/src/bin/keystone.rs Add 60 s background eviction task
tools/keystone.conf Document [rate_limit_global_ip] with default values

Test plan

  • cargo fmt --check clean
  • cargo clippy --lib --tests clean on all modified crates
  • 252 unit tests pass in core/config/api-types (including 8 new rate-limit tests)
  • 234 unit tests pass in keystone crate (all existing token handler tests pass)
  • Manual end-to-end: enable [rate_limit_global_ip] with small burst, loop curl -X POST /v3/auth/tokens, confirm 401 → 429 flip with Retry-After header

Partially implements #843.

🤖 Generated with Claude Code

@ymh1874 ymh1874 force-pushed the feature/843-rate-limiting branch from 5c825d0 to 346b29c Compare June 25, 2026 13:02
Implements ADR-0022 phase 1: a handler-level rate-limiting framework
backed by the `governor` crate. Wires a global per-IP bucket on the
`POST /v3/auth/tokens` handler, checking the limit before the
CPU-intensive password-hash path (Invariant 4).

Framework design:
- `crates/config/src/rate_limit.rs`: reusable `RateLimitSection` struct
  (enabled/burst_size/replenish_rate_per_second) shared by all future
  buckets (per-user, per-domain, per-IdP).
- `crates/core/src/rate_limit.rs`: `RateLimitState` with one
  `Option<Arc<DefaultKeyedRateLimiter<String>>>` per bucket. Disabled
  buckets cost only an `Option` discriminant. Includes `check_ip`,
  `retain_recent`, and IPv6 /64 prefix aggregation.
- Fail-hard init (Invariant 2): `RateLimitState::from_config` returns
  `KeystoneError::RateLimitConfig` when `enabled=true` with zero burst
  or replenish rate, aborting startup rather than silently mis-configuring.
- `KeystoneApiError::TooManyRequests` → HTTP 429 with `Retry-After` header.
- SPIFFE bypass: `Option<Extension<ConnectInfo<SocketAddr>>>` as the
  first handler argument gives `None` on internal/admin mTLS interfaces
  (which don't populate `ConnectInfo`), so rate limiting applies only
  to the public TCP listener.
- Background eviction task (60 s interval) calls `retain_recent()` on
  all keyed stores, preventing unbounded memory growth under adversarial
  unique-key flooding.

Handler tests cover:
- burst_size=1 with `ConnectInfo` injected: first request passes rate
  limit (auth error → non-429), second is rejected with 429 + Retry-After.
- No `ConnectInfo` (SPIFFE bypass): both requests reach identity even
  after the burst is exhausted, and neither returns 429.

Deferred: per-user, per-domain, and per-IdP buckets require keying on a
confirmed user/domain ID after DB lookup but before password hashing — an
invasive driver refactor tracked as a follow-up to this framework PR.

Partially implements openstack-experimental#843.

Note: This commit was done with the help of AI.
Signed-off-by: Yousef Hussein <ymh1874@gmail.com>
@ymh1874 ymh1874 force-pushed the feature/843-rate-limiting branch from 346b29c to c98c7cb Compare June 26, 2026 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant