
feat(observability): query gateway core + dashboard shell #5378

Draft
Ma77Ball wants to merge 88 commits into apache:main from Ma77Ball:obs/pr4/gateway-core

@Ma77Ball Ma77Ball commented Jun 5, 2026

Contributor

What changes were proposed in this PR?

Introduces the tenant-scoped read path the dashboard queries, plus the Angular shell that hosts the per-signal panels. This PR warrants the closest review: it enforces tenancy and rate limiting and is the first user-visible surface.
Backend:

  • BackendClient: HTTP client to the telemetry backends.
  • ScopeResolver: derives the caller's tenant scope and constrains every query to it.
  • RateLimiter, AuditLogger, GatewayContext: per-request rate limiting, audit logging, and shared request context.
  • dtos.scala: typed request objects with validators (time window, page size, free text, service name).
  • ObservabilityResources with the /observability/health endpoint, registered in TexeraWebApplication.
  • RequestContextMdcFilter and UserContextMdcFilter: inject request and user context into the logging MDC.
  • ObservabilityGatewayConfig and its configuration file.
Frontend:

  • Observability dashboard page, route, and navigation entry, plus observability.service, observability.types, and the traces-pivot.service.
  • Health gating: each tab is guarded by the per-signal reachability check; an unreachable signal renders an explicit state rather than a broken panel. Signal panels follow in PR5 through PR8.
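A minimal sketch of the health-gating idea described above, for reviewers who want a concrete picture. Every name here (`HealthReport`, `TabState`, `tabStateFor`) is hypothetical and not taken from observability.service or observability.types, and the shape of the health response is an assumption. The sketch only shows the fail-closed decision: a signal that is reported unreachable, or missing from the report, gets an explicit state instead of a panel.

```typescript
// Hypothetical shape: reachability per signal, as a health endpoint might report it.
type HealthReport = Record<string, boolean>;

// What a tab renders: either the signal panel or an explicit "unreachable" state.
type TabState =
  | { kind: "ready"; signal: string }
  | { kind: "unreachable"; signal: string; message: string };

// Decide what a tab should render. A signal absent from the report is treated
// as unreachable, so the tab fails closed rather than mounting a broken panel.
function tabStateFor(signal: string, health: HealthReport): TabState {
  if (health[signal] === true) {
    return { kind: "ready", signal };
  }
  return {
    kind: "unreachable",
    signal,
    message: `The ${signal} backend is not reachable right now.`,
  };
}

// Example: one reachable signal, one unreachable, one not reported at all.
const exampleHealth: HealthReport = { traces: true, logs: false };
const exampleStates: TabState[] = ["traces", "logs", "metrics"].map((s) =>
  tabStateFor(s, exampleHealth)
);
```

Modeling the result as a discriminated union means a template has to handle the unreachable case explicitly before it can reach the panel, which is one way to keep the gating from being skipped by accident.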

Any related issues, documentation, or discussions?

Closes: #5370
Part of #4070. Stacked on #5377.

How was this PR tested?

  • Backend specs for the gateway core, DTO validation, scope resolver, rate limiter, and MDC filters; sbt scalafmtCheckAll passes.
  • Frontend component and service specs; prettier-eslint and eslint pass.
  • Compile and the full test suites run in this PR's CI.

Was this PR authored or co-authored using generative AI tooling?

Co-authored with Claude Opus 4.8 in compliance with ASF

Ma77Ball and others added 4 commits June 5, 2026 04:49
…, SDK bootstrap (default-off)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nt scope, health, routing)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ca/eBPF profiling

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tracing primitives

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added engine dependencies Pull requests that update a dependency file frontend Changes related to the frontend GUI docs Changes related to documentations dev common labels Jun 5, 2026
codecov-commenter commented Jun 5, 2026

Codecov Report

❌ Patch coverage is 65.50000% with 483 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.17%. Comparing base (6d31f46) to head (71ac9ce).
⚠️ Report is 52 commits behind head on main.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...apache/texera/web/observability/gateway/dtos.scala | 41.98% | 100 Missing and 5 partials ⚠️ |
| ...ala/org/apache/texera/observability/OtelInit.scala | 63.30% | 43 Missing and 8 partials ⚠️ |
| ...texera/web/observability/gateway/AuditLogger.scala | 0.00% | 41 Missing ⚠️ |
| ...observability/gateway/ObservabilityResources.scala | 0.00% | 38 Missing ⚠️ |
| ...xera/web/observability/gateway/ScopeResolver.scala | 19.44% | 27 Missing and 2 partials ⚠️ |
| ...kspace/service/notebook-migration/migration-llm.ts | 79.61% | 14 Missing and 7 partials ⚠️ |
| ...era/web/observability/gateway/GatewayContext.scala | 0.00% | 19 Missing ⚠️ |
| ...a/org/apache/texera/web/TexeraWebApplication.scala | 0.00% | 15 Missing ⚠️ |
| ...org/apache/texera/observability/TexeraTracer.scala | 0.00% | 14 Missing ⚠️ |
| ...wer-button/computing-unit-selection.component.html | 0.00% | 14 Missing ⚠️ |

... and 27 more

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5378      +/-   ##
============================================
+ Coverage     53.86%   55.17%   +1.31%     
- Complexity     2756     3017     +261     
============================================
  Files          1099     1140      +41     
  Lines         42541    44114    +1573     
  Branches       4577     4819     +242     
============================================
+ Hits          22916    24342    +1426     
- Misses        18290    18326      +36     
- Partials       1335     1446     +111     
| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| access-control-service | 70.14% <100.00%> (-0.30%) ⬇️ | |
| agent-service | 34.36% <ø> (ø) | Carriedforward from f6bf45d |
| amber | 57.43% <57.92%> (+2.23%) ⬆️ | |
| computing-unit-managing-service | 0.00% <0.00%> (-1.66%) ⬇️ | |
| config-service | 50.76% <40.00%> (-5.95%) ⬇️ | |
| file-service | 58.88% <68.96%> (+1.81%) ⬆️ | |
| frontend | 48.96% <83.65%> (+0.90%) ⬆️ | |
| notebook-migration-service | 78.57% <ø> (?) | |
| pyamber | 90.15% <100.00%> (+0.02%) ⬆️ | Carriedforward from f6bf45d |
| python | 90.76% <ø> (-0.04%) ⬇️ | Carriedforward from f6bf45d |
| workflow-compiling-service | 54.74% <ø> (-3.96%) ⬇️ | |

*This pull request uses carry forward flags.

@github-actions github-actions Bot added the platform Non-amber Scala service paths label Jun 5, 2026
Ma77Ball and others added 20 commits June 16, 2026 14:08
Call OtelInit.init(<service.name>) in each service main so its logs
bridge to the OTel collector under its own service.name; cap noisy
framework loggers (pekko/iceberg/hadoop/kafka/jetty/jersey/grpc/
netty/hikari/awssdk) at WARN in each service config.

Services: access-control, config, file, computing-unit-managing,
workflow-compiling, computing-unit-master, texera-web, amber.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e span

- WorkflowMetricsRecorder: emit workflow lifecycle metrics keyed by
  execution, driven from the ExecutionStateStore state-transition
  chokepoint; registered via WorkflowMetricsRecorder.init() in
  ComputingUnitMaster
- WorkflowService: wrap initExecutionService in a run-level TexeraTracer
  span so setup-path logs carry the trace id

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- bin/observability/docker-compose.yml: collector + parca-agent stack
- bin/single-node/docker-compose.yml: mount the otel-collector and parca
  configs and run the parca-agent sidecar

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- dtos: drop the per-signal Signal/maxWindowSeconds enum and the upper
  bound on TimeWindow.validate -- DB-backed counts have no retention
  limit and the backends just return what they retain; BadTimeWindow
  becomes a plain value (only empty/inverted windows are rejected)
- DtoValidationSpec: cover the new unbounded-window behavior
- UI shell: observability route + dashboard navigation entry

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
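
For readers following the dtos change above, here is a minimal sketch of the rule the commit message describes: a window is rejected only when it is empty or inverted, and there is no cap on its length. This is an illustration in TypeScript, not the code in dtos.scala; the field names, the epoch-millisecond unit, and the result type are all assumptions. Only the name BadTimeWindow comes from the commit message.

```typescript
// Hypothetical rendering of a time-window DTO; field names and units are assumed.
interface TimeWindow {
  startMs: number; // window start, epoch milliseconds (assumed unit)
  endMs: number; // window end, epoch milliseconds (assumed unit)
}

// The validation result carries the error as a plain value rather than an exception.
type WindowCheck =
  | { ok: true; window: TimeWindow }
  | { ok: false; error: "BadTimeWindow" };

function validateTimeWindow(w: TimeWindow): WindowCheck {
  // Reject empty (start == end) and inverted (start > end) windows.
  // Written as a negated "<" so that NaN bounds are rejected as well.
  if (!(w.startMs < w.endMs)) {
    return { ok: false, error: "BadTimeWindow" };
  }
  // Deliberately no upper bound on (endMs - startMs): the backends are
  // expected to return whatever they still retain.
  return { ok: true, window: w };
}

// Example: a ten-year window passes, since only ordering is checked.
const tenYearsMs = 10 * 365 * 24 * 60 * 60 * 1000;
const longWindow = validateTimeWindow({ startMs: 0, endMs: tenYearsMs });
```
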
github-actions Bot commented Jun 26, 2026

Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • Contributors with relevant context: @bobbai00, @aicam, @Yicong-Huang
    You can notify them by mentioning @bobbai00, @aicam, @Yicong-Huang in a comment.

Labels

common dependencies Pull requests that update a dependency file dev docs Changes related to documentations engine frontend Changes related to the frontend GUI platform Non-amber Scala service paths

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Observability] Tenant-scoped query gateway and dashboard shell

2 participants