
feat(observability): logs end-to-end (gateway + UI panel)#5379

Draft
Ma77Ball wants to merge 113 commits into apache:main from Ma77Ball:obs/pr5/logs


@Ma77Ball Ma77Ball commented Jun 5, 2026

Contributor

What changes were proposed in this PR?

First complete signal: log search through the gateway plus a UI panel to drive it.
Backend:

  • Logs query builder and response parsers (parseLogs, parseLogSources), applying per-field redaction.
  • LogsResource exposing log search and the source-facets endpoint, registered in TexeraWebApplication.

Frontend:

  • Logs panel with time-window, service, workflow, computing-unit, execution, level, and free-text filters, server-side paging, and sources-backed autofill.

Any related issues, documentation, or discussions?

Closes: #5371
Part of #4070. Stacked on #5378.

How was this PR tested?

  • Backend specs for the logs query builder and response parsers; sbt scalafmtCheckAll passes.
  • Frontend logs-panel and service specs; prettier-eslint and eslint pass.
  • Compile and the full test suites run in this PR's CI.

Was this PR authored or co-authored using generative AI tooling?

Co-authored with Claude Opus 4.8 in compliance with ASF

Ma77Ball and others added 5 commits June 5, 2026 04:49
…, SDK bootstrap (default-off)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s panel

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nt scope, health, routing)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ca/eBPF profiling

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tracing primitives

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added engine dependencies Pull requests that update a dependency file frontend Changes related to the frontend GUI docs Changes related to documentations infra common labels Jun 5, 2026

codecov-commenter commented Jun 5, 2026


Codecov Report

❌ Patch coverage is 63.93636% with 612 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.29%. Comparing base (6d31f46) to head (6f9292a).
⚠️ Report is 52 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...observability/gateway/ObservabilityResources.scala | 0.00% | 165 Missing ⚠️ |
| ...observability/logs-panel/logs-panel.component.html | 56.71% | 58 Missing ⚠️ |
| ...apache/texera/web/observability/gateway/dtos.scala | 70.32% | 48 Missing and 6 partials ⚠️ |
| ...ala/org/apache/texera/observability/OtelInit.scala | 63.30% | 43 Missing and 8 partials ⚠️ |
| ...texera/web/observability/gateway/AuditLogger.scala | 0.00% | 41 Missing ⚠️ |
| ...xera/web/observability/gateway/ScopeResolver.scala | 19.44% | 27 Missing and 2 partials ⚠️ |
| ...era/web/observability/gateway/GatewayContext.scala | 0.00% | 19 Missing ⚠️ |
| ...a/org/apache/texera/web/TexeraWebApplication.scala | 0.00% | 16 Missing ⚠️ |
| ...r/observability/logs-panel/logs-panel.component.ts | 85.18% | 3 Missing and 13 partials ⚠️ |
| ...ra/web/observability/gateway/ResponseParsers.scala | 78.78% | 8 Missing and 6 partials ⚠️ |
| ... and 27 more | | |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5379      +/-   ##
============================================
+ Coverage     53.86%   55.29%   +1.43%     
- Complexity     2756     3018     +262     
============================================
  Files          1099     1144      +45     
  Lines         42541    44603    +2062     
  Branches       4577     4924     +347     
============================================
+ Hits          22916    24665    +1749     
- Misses        18290    18469     +179     
- Partials       1335     1469     +134     

| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| access-control-service | 70.14% <100.00%> (-0.30%) ⬇️ | |
| agent-service | 34.36% <ø> (ø) | Carriedforward from f6bf45d |
| amber | 57.46% <58.20%> (+2.26%) ⬆️ | |
| computing-unit-managing-service | 0.00% <0.00%> (-1.66%) ⬇️ | |
| config-service | 50.76% <40.00%> (-5.95%) ⬇️ | |
| file-service | 58.88% <68.96%> (+1.81%) ⬆️ | |
| frontend | 49.28% <77.93%> (+1.22%) ⬆️ | |
| notebook-migration-service | 78.57% <ø> (?) | |
| pyamber | 90.15% <100.00%> (+0.02%) ⬆️ | Carriedforward from f6bf45d |
| python | 90.76% <ø> (-0.04%) ⬇️ | Carriedforward from f6bf45d |
| workflow-compiling-service | 54.74% <ø> (-3.96%) ⬇️ | |

*This pull request uses carry forward flags. Click here to find out more.


Ma77Ball and others added 10 commits June 20, 2026 21:58
Call OtelInit.init(<service.name>) in each service main so its logs
bridge to the OTel collector under its own service.name; cap noisy
framework loggers (pekko/iceberg/hadoop/kafka/jetty/jersey/grpc/
netty/hikari/awssdk) at WARN in each service config.

Services: access-control, config, file, computing-unit-managing,
workflow-compiling, computing-unit-master, texera-web, amber.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e span

- WorkflowMetricsRecorder: emit workflow lifecycle metrics keyed by
  execution, driven from the ExecutionStateStore state-transition
  chokepoint; registered via WorkflowMetricsRecorder.init() in
  ComputingUnitMaster
- WorkflowService: wrap initExecutionService in a run-level TexeraTracer
  span so setup-path logs carry the trace id

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- bin/observability/docker-compose.yml: collector + parca-agent stack
- bin/single-node/docker-compose.yml: mount the otel-collector and parca
  configs and run the parca-agent sidecar

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- dtos: drop the per-signal Signal/maxWindowSeconds enum and the upper
  bound on TimeWindow.validate -- DB-backed counts have no retention
  limit and the backends just return what they retain; BadTimeWindow
  becomes a plain value (only empty/inverted windows are rejected)
- DtoValidationSpec: cover the new unbounded-window behavior
- UI shell: observability route + dashboard navigation entry

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
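
For reviewers, the window rule from the dtos commit above (reject only empty or inverted windows, with no upper bound on the span) can be illustrated with a minimal TypeScript sketch. This is not the PR's Scala `TimeWindow.validate`. The `TimeWindow` fields, the epoch-millisecond unit, the `WindowCheck` type, and the `validateTimeWindow` name are all assumptions made for illustration. Only the `BadTimeWindow` name and the rule itself come from the commit message.

```typescript
// Hypothetical shape; the PR's real TimeWindow lives in the Scala dtos.
interface TimeWindow {
  startMs: number; // window start, epoch milliseconds (assumed unit)
  endMs: number;   // window end, epoch milliseconds (assumed unit)
}

// Result type invented for this sketch; "BadTimeWindow" is modeled as a plain value.
type WindowCheck = { ok: true } | { ok: false; error: "BadTimeWindow" };

function validateTimeWindow(w: TimeWindow): WindowCheck {
  // Reject empty (start == end) and inverted (start > end) windows.
  // NaN bounds also fail this comparison and are rejected.
  if (!(w.startMs < w.endMs)) {
    return { ok: false, error: "BadTimeWindow" };
  }
  // Deliberately no upper bound on the span: the backend returns what it retains.
  return { ok: true };
}
```

Under this sketch a one-year window passes, while a zero-length or reversed window is rejected.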

github-actions Bot commented Jun 23, 2026

Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • Contributors with relevant context: @bobbai00, @aicam, @Yicong-Huang
    You can notify them by mentioning @bobbai00, @aicam, @Yicong-Huang in a comment.

Ma77Ball and others added 19 commits June 23, 2026 03:22
…ements

- gateway LogsResource: resolve log user ids to display names via UserDao,
  and pull id-field autofill values from VictoriaLogs /field_values (those
  ids are record fields, not stream labels); adapt to the unbounded
  TimeWindow.validate signature
- builders: fix the body filter to the correct `_msg:"..."` phrase form
  (contains_str is not valid LogsQL)
- ResponseParsers: parseFieldValueLongs for the autofill values
- dtos/observability.types: LogSourcesResponse.userNames
- logs panel: user-name dropdown + 7-day default window

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
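
The `_msg:"..."` phrase form named in the builders fix above can be sketched as follows. This is a TypeScript illustration only, not the PR's Scala builder. The function name `bodyPhraseFilter` is hypothetical, and the escaping (a backslash before embedded backslashes and double quotes) is an assumption about how free text is made safe inside a quoted LogsQL phrase.

```typescript
// Hypothetical helper: turns free text into a LogsQL phrase filter on _msg.
function bodyPhraseFilter(text: string): string {
  // Assumed escaping: backslashes first, then double quotes,
  // so the user's text cannot terminate the quoted phrase early.
  const escaped = text.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return `_msg:"${escaped}"`;
}
```

For example, `bodyPhraseFilter("connection refused")` produces `_msg:"connection refused"`.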

Labels

  • common
  • dependencies: Pull requests that update a dependency file
  • docs: Changes related to documentations
  • engine
  • frontend: Changes related to the frontend GUI
  • infra
  • platform: Non-amber Scala service paths

Development

Successfully merging this pull request may close these issues.

[Observability] Log search: gateway endpoint and dashboard panel

2 participants