fix(media): support IRSA/credential-chain S3 auth and configurable signing region #1406

Merged
tlongwell-block merged 3 commits into main from fix/s3-irsa-auth
Jun 30, 2026
@tlongwell-block

Collaborator

What

Make the relay's S3 media client work with IRSA (EKS pod IAM role) and with non-us-east-1 regions, without long-lived static IAM keys.

Two coupled bugs in MediaStorage::new (crates/buzz-media/src/storage.rs) blocked the bb-block production deploy:

  1. Auth was static-only. It always called Credentials::new(Some(access), Some(secret), …). That path short-circuits the AWS credential chain the moment a key is present (aws-creds credentials.rs:284), so the pod's IRSA role (role/buzz) — which the media bucket policy already trusts — was never used. The only way to run was to mint a long-lived IAM user + static keys.
  2. Signing region was hardcoded us-east-1. The region in Region::Custom { region: "us-east-1", … } is what goes into the SigV4 credential scope (aws-region region.rs:204). When the client is pointed at https://s3.us-west-2.amazonaws.com, AWS rejects the mismatched scope.
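
To make the second bug concrete, here is a minimal sketch of how a SigV4 credential scope is assembled. The scope format (`<date>/<region>/<service>/aws4_request`) is standard SigV4; the function names and the sample date are hypothetical and are not taken from rust-s3 or aws-region.

```rust
/// Builds a SigV4 credential scope: <yyyymmdd>/<region>/<service>/aws4_request.
/// Hypothetical helper for illustration; rust-s3 does this internally.
fn credential_scope(date_yyyymmdd: &str, region: &str, service: &str) -> String {
    format!("{date_yyyymmdd}/{region}/{service}/aws4_request")
}

/// Extracts the region from a regional S3 host such as
/// `s3.us-west-2.amazonaws.com`. Returns None for hosts that do not
/// follow that exact pattern (for example the global `s3.amazonaws.com`).
fn region_from_s3_host(host: &str) -> Option<&str> {
    host.strip_prefix("s3.")?.strip_suffix(".amazonaws.com")
}
```

With the old hardcode, a request to `s3.us-west-2.amazonaws.com` would carry a scope built from `us-east-1`, while `region_from_s3_host` on that host yields `us-west-2`. The two regions disagree, which is the mismatch described above.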

The key insight

The rust-s3 / aws-creds stack already supports the AWS default credential chain (env → profile → web-identity/IRSA → container → instance-metadata) via Credentials::default(). The http-credentials feature that gates the IRSA AssumeRoleWithWebIdentity path is already enabled transitively through our tokio-rustls-tls feature (aws-creds Cargo.toml:47-49; confirmed via cargo tree -e features and attohttpc in Cargo.lock). No new dependency, no feature flag. We just weren't calling it.
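
As a rough model of why static keys kept the chain from ever reaching IRSA, the sketch below walks the sources in the order listed above. It is a simplified illustration, not the aws-creds implementation: `CHAIN_ORDER`, `resolve_source`, and the `available` slice are hypothetical names, and real providers perform I/O rather than a list lookup.

```rust
/// Chain order as described in the PR text (env -> profile ->
/// web-identity/IRSA -> container -> instance metadata).
const CHAIN_ORDER: [&str; 5] = [
    "env",
    "profile",
    "web-identity",
    "container",
    "instance-metadata",
];

/// Returns the name of the source that would supply credentials.
/// If static keys are passed in, the chain is never consulted, which
/// models the short-circuit described above. Otherwise the first
/// chain entry listed in `available` wins.
fn resolve_source(
    static_keys: Option<(&str, &str)>,
    available: &[&str],
) -> Option<&'static str> {
    if static_keys.is_some() {
        return Some("static");
    }
    CHAIN_ORDER
        .iter()
        .copied()
        .find(|name| available.contains(name))
}
```

In this model, a pod with web-identity configured still resolves to `"static"` whenever a key pair is supplied, and only resolves to `"web-identity"` once the key pair is absent.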

Changes

  • MediaStorage::new: if both s3_access_key and s3_secret_key are non-empty → use them as static credentials (MinIO/local/dev, any static-key deploy; unchanged). If both are empty → Credentials::default(), resolving IRSA/env/profile/metadata. If only one is set, the constructor rejects the configuration instead of silently switching auth modes.
  • MediaConfig.s3_region (new): env BUZZ_S3_REGION, falling back to AWS_REGION, default us-east-1. Used for SigV4 signing in place of the hardcode.
  • Unit test: static keys build a client without touching the credential chain, and the configured region propagates to the bucket.

Behavior

  • Local dev unchanged. With BUZZ_S3_ACCESS_KEY/BUZZ_S3_SECRET_KEY unset, the relay still defaults them to buzz_dev/buzz_dev_secret (static path), and s3_region defaults to us-east-1.
  • Production opts into IRSA by setting BUZZ_S3_ACCESS_KEY=""/BUZZ_S3_SECRET_KEY="" (empty strings — an empty env var stays empty; only a missing var falls back to the buzz_dev default) and providing AWS_REGION/BUZZ_S3_REGION. The pod already injects AWS_REGION, AWS_DEFAULT_REGION, and the IRSA web-identity env.
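
The env semantics above can be sketched as two small resolvers. This is an illustration under assumptions, not the relay's actual config code: the function names are hypothetical, and the lookup is passed in as a closure so the logic can be exercised without touching the process environment. How an empty BUZZ_S3_REGION is treated is not stated in this PR; the sketch only falls back when a variable is missing.

```rust
/// Region resolution: BUZZ_S3_REGION, then AWS_REGION, then "us-east-1".
/// `get` returns Some(value) when the variable is set (even to "")
/// and None when it is missing.
fn resolve_region(get: impl Fn(&str) -> Option<String>) -> String {
    get("BUZZ_S3_REGION")
        .or_else(|| get("AWS_REGION"))
        .unwrap_or_else(|| "us-east-1".to_string())
}

/// Access-key resolution: a missing variable falls back to the dev
/// default, while a variable set to the empty string stays empty
/// (which is how production opts into the credential chain).
fn resolve_access_key(get: impl Fn(&str) -> Option<String>) -> String {
    get("BUZZ_S3_ACCESS_KEY").unwrap_or_else(|| "buzz_dev".to_string())
}
```

In a running process the closure could be `|k| std::env::var(k).ok()`, which returns `Some(String::new())` for a variable that is set but empty and `None` for one that is missing.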

Companion (bb-block, separate)

Once this is on the relay image, the bb-block manifest drops the s3-access-key/s3-secret-key ExternalSecret data refs and sets the two BUZZ_S3_* key envs to empty. No IAM user, no static keys to rotate. (Context: bb-block #110 / Buzz deploy thread.)

Testing

  • cargo test -p buzz-media — 41 passed (incl. new static_keys_build_client_with_configured_region).
  • cargo test -p buzz-relay --lib config — 14 passed.
  • cargo clippy -p buzz-media -p buzz-relay --all-targets — clean.
  • Pre-commit + pre-push hooks (fmt, rust-tests, etc.) green.

…gion

The media S3 client always built static credentials from
BUZZ_S3_ACCESS_KEY/BUZZ_S3_SECRET_KEY and hardcoded the SigV4 signing
region to us-east-1. On EKS this blocks two things: the relay can't use
its IRSA pod role (role/buzz) for media, forcing long-lived static IAM
keys; and any non-us-east-1 deployment signs requests with the wrong
credential scope, which AWS rejects.

The underlying rust-s3 / aws-creds stack already supports the AWS
default credential chain (env -> profile -> web-identity/IRSA ->
container -> instance metadata) via Credentials::default(); the
http-credentials feature is already enabled transitively through our
tokio-rustls-tls feature. We just never called it.

- MediaStorage::new: when both access/secret keys are non-empty, keep
  using them as static credentials (MinIO/local/dev unchanged); when
  both are empty, fall back to Credentials::default() so the pod's IAM
  role resolves via IRSA. Reject partial static credentials instead of
  silently switching auth modes.
- Add MediaConfig.s3_region (env BUZZ_S3_REGION, falling back to
  AWS_REGION, default us-east-1) and use it for SigV4 signing instead
  of the hardcoded us-east-1.

Local dev is unchanged: with the AWS key envs unset, the relay still
defaults them to buzz_dev/buzz_dev_secret (static path). Production opts
into IRSA by setting the key envs to empty strings (and dropping the
s3-access-key/s3-secret-key ExternalSecret refs).

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
npub1qyvc0c5kl4gqv2fd97fsk46tu378sqgy35vc83rvgfwne90sel7s0ed67d and others added 2 commits June 30, 2026 17:01
Ignored integration test proving the IRSA/credential-chain fallback did
not regress hardcoded credentials: builds MediaStorage::new with static
keys (buzz_dev/buzz_dev_secret) and round-trips put -> head -> get ->
delete against the docker-compose MinIO. Opt-in via --ignored; reads
config from BUZZ_S3_* env with MinIO defaults.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…to git store

GitStore::new had the same two bugs the media fix addressed: a hardcoded
us-east-1 signing region and static-only credentials (Credentials::new
with Some/Some), which short-circuits the AWS credential chain and never
reaches web-identity/IRSA. The git store is wired from the same
config.media.* fields, so under IRSA it would receive empty-string keys
and fail.

Mirror MediaStorage::new exactly:
- take a region param, threaded from config.media.s3_region at the call
  site in state.rs (BUZZ_S3_REGION -> AWS_REGION -> us-east-1);
- select credentials by (access_key.is_empty(), secret_key.is_empty()):
  both non-empty -> static; both empty -> Credentials::default() (chain);
  mixed -> StoreError::Config fail-fast, so a half-configured static
  deploy can't silently fall through to the chain.

Add a StoreError::Config(String) variant for the fail-fast case. Update
the two test call sites (store.rs probe, hydrate.rs live) to pass an
explicit region. Add unit tests for the static-region and partial-key
paths. Verified locally: the BUZZ_GIT_S3_PROBE live MinIO git pack
round-trip (store -> hydrate -> clone) passes through the modified
constructor, so the static path is unaffected.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
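
The credential selection described in the commit above might look roughly like the following. This is a self-contained sketch, not the code from this PR: `StoreError` here has only the `Config` variant (the real enum presumably has others), and `CredentialMode`, `select_credentials`, and the error message text are hypothetical stand-ins for the calls into aws-creds.

```rust
use std::fmt;

/// Reduced stand-in for the store's error type; only the new
/// fail-fast variant is modeled here.
#[derive(Debug, PartialEq)]
enum StoreError {
    Config(String),
}

impl fmt::Display for StoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StoreError::Config(msg) => write!(f, "store configuration error: {msg}"),
        }
    }
}

impl std::error::Error for StoreError {}

/// Which auth path the constructor would take (hypothetical type).
#[derive(Debug, PartialEq)]
enum CredentialMode {
    /// Both keys supplied: use them as static credentials.
    Static { access_key: String, secret_key: String },
    /// Both keys empty: defer to the default credential chain.
    DefaultChain,
}

/// Selects the auth mode from the (is_empty, is_empty) pair, rejecting
/// a half-configured key pair instead of falling through to the chain.
fn select_credentials(
    access_key: &str,
    secret_key: &str,
) -> Result<CredentialMode, StoreError> {
    match (access_key.is_empty(), secret_key.is_empty()) {
        (false, false) => Ok(CredentialMode::Static {
            access_key: access_key.to_string(),
            secret_key: secret_key.to_string(),
        }),
        (true, true) => Ok(CredentialMode::DefaultChain),
        _ => Err(StoreError::Config(
            "S3 access key and secret key must both be set or both be empty".to_string(),
        )),
    }
}
```

Matching on the tuple keeps the three cases explicit, so a deploy that sets only one of the two keys gets an error at construction time rather than an unexpected switch to the chain.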
@tlongwell-block tlongwell-block merged commit 06ef533 into main Jun 30, 2026
29 checks passed
@tlongwell-block tlongwell-block deleted the fix/s3-irsa-auth branch June 30, 2026 22:26