diff --git a/docs/design/application-name-telemetry.md b/docs/design/application-name-telemetry.md
new file mode 100644
index 0000000000..f6163f6cd9
--- /dev/null
+++ b/docs/design/application-name-telemetry.md
@@ -0,0 +1,241 @@
+# Application Name Telemetry
+
+> **Status:** Implemented · **Tracking issue:** [#3216](https://github.com/Azure/data-api-builder/issues/3216)
+
+## Summary
+
+Data API builder (DAB) embeds a compact, anonymous **usage-telemetry token** into the `Application Name` property of the connection strings it uses to reach SQL Server, Azure SQL, Azure SQL Data Warehouse (DWSQL), and PostgreSQL. Because `Application Name` is surfaced on the database side (for example in `sys.dm_exec_sessions.program_name`), this lets the team understand, **in aggregate and without any per-customer identifiers**, which DAB version is running and which features are enabled, using telemetry the database already collects.
+
+The token has the shape:
+
+```text
+dab_oss_<version>+<context>|<runtime>|<entity>+
+```
+
+Example (an MSSQL pool, REST + GraphQL on, Static Web Apps auth):
+
+```text
+dab_oss_2.0.0+XXSX|110000M1M000MMMMMWMM|100?111001110?+
+```
+
+It is opt-out (`DAB_TELEMETRY_APPNAME_OPT_OUT=1`), carries no secrets or identifiers, and is purely additive to the existing `Application Name` value.
+
+## Motivation
+
+DAB ships as an open-source container that customers run anywhere. We have very little visibility into how it is configured or which features are exercised. The connection's `Application Name` is a standard, low-cost signal that the database side already records, so encoding a small feature fingerprint there gives us aggregate usage insight with:
+
+- **no new endpoints, services, or network calls,**
+- **no per-customer data,** and
+- **a single, easy place to query** on the database side.
+
+### Goals
+
+- Encode the DAB **version** and a **feature fingerprint** of a deployment into `Application Name`.
+- Make the token **queryable and decodable** so the data can be aggregated and read back.
+- Be **safe by construction**: no secrets, no identifiers, easy to opt out of, and additive to any user-supplied `Application Name`.
+- Avoid changing **connection-pool behavior** (see [Pooling model](#pooling-model-why-some-fields-are-x)).
+
+### Non-goals
+
+- **Per-request** telemetry (which API was called, which entity, which role). Those facets are not knowable when a pooled connection opens and belong in DAB's request-level telemetry (OpenTelemetry / Application Insights), not the `Application Name`.
+- Telemetry for **MySQL** and **Cosmos DB**. MySQL does not get a payload, and Cosmos connection strings are left untouched.
+
+## Background: `Application Name` and connection pooling
+
+`Application Name` is a first-class keyword in both the SQL Server and Npgsql connection-string builders, and it is part of the **connection-pool key**. Two connection strings that differ only in `Application Name` produce two separate pools.
+
+Two consequences shape the design:
+
+1. The token is computed **once per data source at configuration load** and is constant for the lifetime of that data source, so embedding it does **not** create additional pools per request; every request to a data source reuses the same `Application Name`.
+2. Each data source already has its own connection string (its own pool), so the token is naturally emitted **per pool**.
+
+DAB already appended a plain `dab_oss_<version>` user agent to the `Application Name`; this feature replaces that plain value with the richer, decodable token (while preserving any user-supplied `Application Name` as a comma-separated prefix).
+
+## The token format
+
+```text
+dab_oss_<version>+<context>|<runtime>|<entity>+
+```
+
+- `dab_oss_`: a fixed marker (`ProductInfo.DAB_USER_AGENT_MARKER`) used to locate the token and to decode it.
+- `<version>`: the product version `Major.Minor.Patch` (from `ProductInfo.DAB_USER_AGENT`).
+  The telemetry is always based on the product version, so it is independent of any host label (see [Hosted label](#hosted-label-dab_app_name_env)).
+- The payload is wrapped in `+ ... +` and split into **three `|`-delimited sections**: `context`, `runtime`, and `entity`.
+
+> **Note on the issue's example.** The original issue's example string listed a fourth `general` segment, but the issue only defined three settings tables (Context, Runtime, Entity). `general` was never defined, so the implementation uses the three authoritative sections only.
+
+Each position in a section is a single character drawn from a small alphabet. The shared sentinel values are:
+
+| Char | Meaning |
+| --- | --- |
+| `0` | feature present and off / false |
+| `1` | feature present and on / true |
+| `M` | **missing**: the config section that would answer this is absent |
+| `X` | **not applicable**: not knowable when the pool opens (per-request fields) |
+| `?` | **not supported**: the concept is not yet modeled in DAB |
+
+A few positions use field-specific letters instead (Source and Auth provider), described below.
+
+### Context section (4 characters)
+
+Identifies *what kind of connection* this is. Only `Source` is knowable when a pooled connection opens; the rest are per-request and therefore `X` (see [Pooling model](#pooling-model-why-some-fields-are-x)).
+
+| Pos | Field | Encoding |
+| --- | --- | --- |
+| 1 | Protocol | always `X` (per-request: REST / GraphQL / MCP) |
+| 2 | Object | always `X` (per-request: table / view / stored-proc / document) |
+| 3 | Source | the database engine of this data source (see table) |
+| 4 | Role | always `X` (per-request: anonymous / authenticated / custom) |
+
+**Source map:** `MSSQL -> S`, `DWSQL -> D`, `PostgreSQL -> P`, `MySQL -> M`, `Cosmos -> C`, and `X` when there is no live data source (for example the CLI, which has no open connection).
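
As a rough illustration of the format described so far, the following C# sketch splits an `Application Name` value into its version and three payload sections and reads the `Source` character from the context section. It is only a sketch and not DAB's own decoder: the class and method names (`TokenSketch`, `Split`, `DescribeSource`) are hypothetical, and it assumes the version itself never contains a `+`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of DAB.
public static class TokenSketch
{
    private const string Marker = "dab_oss_";

    // Source letters as listed in the Source map of the Context section.
    private static readonly Dictionary<char, string> SourceNames = new()
    {
        ['S'] = "MSSQL",
        ['D'] = "DWSQL",
        ['P'] = "PostgreSQL",
        ['M'] = "MySQL",
        ['C'] = "Cosmos",
        ['X'] = "no live data source",
    };

    // Splits an Application Name into version and payload sections.
    // Anything missing from the input comes back as an empty string.
    public static (string Version, string Context, string Runtime, string Entity) Split(string applicationName)
    {
        int start = applicationName.IndexOf(Marker, StringComparison.Ordinal);
        if (start < 0)
        {
            return ("", "", "", "");
        }

        // Skipping to the marker ignores any comma-separated prefix before it.
        string rest = applicationName.Substring(start + Marker.Length);
        int plus = rest.IndexOf('+');
        if (plus < 0)
        {
            // No payload present: only the version follows the marker.
            return (rest, "", "", "");
        }

        string version = rest.Substring(0, plus);
        string payload = rest.Substring(plus + 1).TrimEnd('+');
        string[] sections = payload.Split('|');
        return (
            version,
            sections.Length > 0 ? sections[0] : "",
            sections.Length > 1 ? sections[1] : "",
            sections.Length > 2 ? sections[2] : "");
    }

    // Reads position 3 of the context section (zero-based index 2).
    public static string DescribeSource(string context)
    {
        if (context.Length < 3)
        {
            return "unknown";
        }

        return SourceNames.TryGetValue(context[2], out var name) ? name : "unknown";
    }
}
```

For the example token from the Summary, `Split` yields the version `2.0.0` and the sections `XXSX`, `110000M1M000MMMMMWMM`, and `100?111001110?`, and `DescribeSource("XXSX")` returns `MSSQL`.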
+
+### Runtime section (20 characters)
+
+A fingerprint of the **global** `runtime` configuration. Each position is `0` / `1` / `M` unless noted.
+
+| Pos | Setting |
+| --- | --- |
+| 1 | `runtime.rest.enabled` |
+| 2 | `runtime.graphql.enabled` |
+| 3 | `runtime.mcp.enabled` |
+| 4 | `runtime.host.mode` (`0` = Development, `1` = Production, `M` = missing) |
+| 5 | `data-source-files` present (multi-database) |
+| 6 | `azure-key-vault` configured |
+| 7 | `runtime.health.enabled` |
+| 8 | `runtime.cache.enabled` |
+| 9 | `runtime.cache.level-2.enabled` |
+| 10 | data source uses on-behalf-of (OBO) auth |
+| 11 | auto-entities present |
+| 12 | `runtime.rest.request-body-strict` |
+| 13 | `runtime.graphql.multiple-mutations.create.enabled` |
+| 14 | `runtime.telemetry.open-telemetry.enabled` |
+| 15 | `runtime.telemetry.application-insights.enabled` |
+| 16 | `runtime.telemetry.azure-log-analytics.enabled` |
+| 17 | `runtime.telemetry.file.enabled` (file sink) |
+| 18 | `runtime.host.authentication.provider` (letter, see below) |
+| 19 | embeddings enabled |
+| 20 | embeddings endpoint configured |
+
+**Auth provider letters (position 18):** `U` = Unauthenticated/Simulator-disabled, `S` = Simulator, `W` = StaticWebApps, `A` = AppService, `E` = EntraID / AzureAD, `C` = a custom JWT provider, `M` = no authentication section. The single-letter mapping is a DAB-chosen convention (the issue gave the alphabet without a legend); it is trivial to adjust because encode and decode share one table.
+
+### Entity section (14 characters)
+
+An "**is any entity using X?**" fingerprint computed across the (merged) entity set. Each position is `0` / `1` / `M`; positions 4 and 14 may also be `?` because they are not yet modeled.
+
+| Pos | "Any entity …" |
+| --- | --- |
+| 1 | is a table |
+| 2 | is a view |
+| 3 | is a stored procedure |
+| 4 | is an MCP persisted document (`?`: not modeled) |
+| 5 | has caching enabled |
+| 6 | has REST enabled |
+| 7 | has GraphQL enabled |
+| 8 | exposes MCP DML tools |
+| 9 | exposes an MCP custom tool |
+| 10 | uses a custom role (not `anonymous` / `authenticated`) |
+| 11 | uses an item-level policy |
+| 12 | has a description |
+| 13 | has relationships |
+| 14 | uses parameter embedding (`?`: not modeled) |
+
+`M` here means "no entities at all," distinguishing an empty deployment from one whose entities simply do not use a feature.
+
+## Design
+
+### Encoder / decoder
+
+A single class, `ApplicationNameTelemetry` (in `Azure.DataApiBuilder.Config.Telemetry`), owns the format:
+
+- `EncodeTelemetryString(config, liveDataSource)` produces the pure `dab_oss_<version>+...+` token. It is **independent of the opt-out switch and of any host label**; it always emits the full payload, which is why the CLI uses it for inspection.
+- `BuildApplicationNameSegment(config, liveDataSource)` produces what is actually embedded: it honors the opt-out switch and prepends any host label.
+- `Decode(applicationName)` turns a token back into human-readable lines and is tolerant of truncation, a missing trailing delimiter, an absent payload, and extra (newer) flags.
+
+Encode and decode are driven by **one ordered list of settings per section** (`_contextSettings`, `_runtimeSettings`, `_entitySettings`). Each setting knows how to encode itself and how to describe a decoded character, so the two directions can never drift apart, and adding a flag is a one-line, append-only change.
+
+### Where and when the token is embedded
+
+The token is woven into connection strings at **configuration load time**, never per request.
+
+- **File / standard load.** `RuntimeConfigLoader.TryParseConfig` post-processes the parsed config and, for every MSSQL / DWSQL / PostgreSQL data source, replaces the `Application Name` with the embedded token. A single public dispatcher, `GetConnectionStringWithApplicationName(connectionString, config, dataSource)`, selects the engine-specific builder (`SqlConnectionStringBuilder` vs `NpgsqlConnectionStringBuilder`); engines without telemetry support return the connection string unchanged.
+- **Hosted / late-config.** The `POST /configuration` endpoint supplies configuration after startup with environment-variable replacement disabled, which bypasses the file-load post-processing. `RuntimeConfigProvider.Initialize` therefore embeds the token itself for every data source after the config is materialized, so hosted deployments, exactly where the `dab_hosted` label matters most, are covered for both the single-connection-string and merged-config endpoint variants.
+
+### Pooling model: why some fields are `X`
+
+The `Application Name` is part of the pool key. If `Protocol`, `Object`, and `Role` were encoded per request, DAB would need a distinct pool per `(protocol, object, role)` combination (3 × 4 × 3 = 36 per data source, multiplied again per user under OBO), exploding the pool count and harming performance. We therefore adopt **Model A**: encode only what is fixed when the pool opens (`Source`) and emit `X` for the per-request facets. Those per-request dimensions, when needed, belong in DAB's request-level telemetry, not the connection's `Application Name`.
+
+### Global telemetry per pool (multi-database)
+
+In a multi-database deployment each data source is its own pool, so the token is embedded into each. We deliberately encode the **global** runtime and the **complete (merged) entity set** at every pool, rather than scoping the entity fingerprint to the entities of that specific data source.
+The decisive reason is that **the token carries no deployment-correlation identifier**, so the consumer cannot stitch per-pool slices back into one deployment. Encoding the global picture at every pool means **any single sampled connection is sufficient** to know the deployment's full feature profile, which makes it robust to sampling and to rarely-opened pools. (The `Source` character still differs per pool, so the engine mix is preserved.)
+
+### Idempotency
+
+Embedding is idempotent: the engine-specific helpers parse the existing `Application Name` and **skip** if it already contains the `dab_oss_` marker. This guarantees a value can never accumulate a duplicated payload (`...+...+,dab_oss_...+...+`) even if the embed path runs more than once (for example loader post-processing followed by the late-config provider). A user-supplied `Application Name` with no marker is preserved and the token is appended after a comma.
+
+### Opt-out (`DAB_TELEMETRY_APPNAME_OPT_OUT`)
+
+Setting `DAB_TELEMETRY_APPNAME_OPT_OUT=1` reduces the embedded value to **version only** (`dab_oss_<version>`, no payload). Any other value (or unset) leaves telemetry on. The marker is preserved even when opted out so the version remains decodable.
+
+### Hosted label (`DAB_APP_NAME_ENV`)
+
+When `DAB_APP_NAME_ENV` is set (DAB's hosted offering sets it to `dab_hosted`), its value is preserved as a **comma prefix**: `dab_hosted,dab_oss_<version>+...+`. Telemetry is always computed from the product version, so the host label **never suppresses** the token and the `dab_oss_` marker stays intact for decoding.
+
+### CLI: `dab appname`
+
+A new offline command supports inspection without a database:
+
+- `dab appname --config ` parses the config and prints the token. Context is emitted as placeholders (no live connection), so the `Source` is `X`. This command performs **no validation and opens no connection**; it is a static inspection tool, and it intentionally always shows the full encoding regardless of the opt-out switch.
+- `dab appname --decode ""` prints a human-readable legend, tolerant of truncation.
+- `-o, --output ` writes the result to a file instead of stdout.
+
+### Diagnostic logging
+
+When the token is computed, DAB emits a single Debug log of the **token only** (never the full connection string, which can contain secrets). Because the log level may not be known when connection strings are first computed, the entry is buffered in a shared `LogBuffer` and flushed once the logger is available (at startup, on hot reload, and on the hosted late-config path). The buffer is **bounded** (drop-oldest beyond a cap) so it cannot grow without limit if it is ever left undrained.
+
+## Privacy and security
+
+- The token contains **no connection-string contents, secrets, server names, database names, or customer identifiers**; it carries only the DAB version and boolean/categorical feature flags.
+- It is **opt-out** via `DAB_TELEMETRY_APPNAME_OPT_OUT=1`.
+- The diagnostic log emits the **token**, never the connection string.
+
+## Scope
+
+| Engine | Telemetry token | Notes |
+| --- | --- | --- |
+| SQL Server / Azure SQL (`MSSQL`) | Yes (`Source = S`) | |
+| SQL Data Warehouse (`DWSQL`) | Yes (`Source = D`) | shares the SQL Server builder |
+| PostgreSQL | Yes (`Source = P`) | |
+| MySQL | No | connection string left unchanged |
+| Cosmos DB | No | connection string left unchanged |
+
+## Decoding and observing in production
+
+On the database side, the token appears as the session's program name. For SQL Server:
+
+```sql
+-- In a T-SQL LIKE pattern `_` matches any single character, so `[_]` is used to match a literal underscore.
+SELECT program_name
+FROM sys.dm_exec_sessions
+WHERE program_name LIKE 'dab[_]oss[_]%' OR program_name LIKE '%,dab[_]oss[_]%';
+```
+
+A captured token can be decoded back to a legend with `dab appname --decode ""`.
+
+## Testing
+
+- **Encoder / decoder unit tests** for token shape, each section's flag mapping, the Source and auth-provider maps, opt-out, the host-label prefix, and round-trip / truncation-tolerant decoding.
+- **Connection-string injection tests** for MSSQL, DWSQL, and PostgreSQL (including the user-supplied `Application Name` prefix case), and the no-op cases for MySQL / Cosmos.
+- **Multi-database tests** asserting child data sources encode the global runtime and merged entities, and a heterogeneous (MSSQL + PostgreSQL) case asserting the per-pool `Source` character.
+- **Hosted / late-config tests** asserting telemetry is embedded through `RuntimeConfigProvider.Initialize` for the single-source and multi-database cases, plus end-to-end `/configuration` and `/configuration/v2` endpoint tests.
+- **Idempotency test** asserting a re-embed is a no-op (exactly one marker).
+- **`LogBuffer` tests** asserting the bounded drop-oldest behavior and flush-and-drain.
+- **CLI tests** for encode (to file and stdout), decode, the config-not-found error path, and opt-out independence.
+
+## Extensibility
+
+- **Adding a flag.** Append a `Setting` to the relevant section list; encode and decode update together. Sentinels (`M`, `?`, `X`) keep older decoders forward-compatible.
+- **Adding an engine.** Implement an engine-specific `Get...ConnectionStringWithApplicationName`, add it to the dispatcher's switch, and map the engine to a `Source` character. The injection sites (file load, hosted, multi-database) are already engine-agnostic.
+
+
+## References
+
+- Tracking issue: [#3216](https://github.com/Azure/data-api-builder/issues/3216)
+- Key types: `ApplicationNameTelemetry` (`src/Config/Telemetry/`), `RuntimeConfigLoader` / `FileSystemRuntimeConfigLoader` (`src/Config/`), `RuntimeConfigProvider` (`src/Core/Configurations/`), `LogBuffer` (`src/Config/`), `AppNameOptions` (`src/Cli/Commands/`).
diff --git a/src/Cli.Tests/EndToEndTests.cs b/src/Cli.Tests/EndToEndTests.cs
index f625dba101..7b31bba90c 100644
--- a/src/Cli.Tests/EndToEndTests.cs
+++ b/src/Cli.Tests/EndToEndTests.cs
@@ -2,6 +2,7 @@
 // Licensed under the MIT License.
 
 using Azure.DataApiBuilder.Config.Converters;
+using Azure.DataApiBuilder.Config.Telemetry;
 using Azure.DataApiBuilder.Product;
 using Azure.DataApiBuilder.Service;
 using Cli.Constants;
@@ -128,7 +129,10 @@ public void TestInitializingRestAndGraphQLGlobalSettings()
                 replacementSettings: replacementSettings));
 
         SqlConnectionStringBuilder builder = new(runtimeConfig.DataSource!.ConnectionString);
-        Assert.AreEqual(ProductInfo.GetDataApiBuilderUserAgent(), builder.ApplicationName);
+        // Application Name now embeds the dab_oss telemetry block (dab_oss_++),
+        // so assert it begins with the product user agent rather than exact-matching it.
+        Assert.IsTrue(builder.ApplicationName.StartsWith(ProductInfo.GetDataApiBuilderUserAgent()),
+            $"Expected Application Name to start with '{ProductInfo.GetDataApiBuilderUserAgent()}' but was '{builder.ApplicationName}'.");
 
         Assert.IsNotNull(runtimeConfig);
         Assert.AreEqual(DatabaseType.MSSQL, runtimeConfig.DataSource.DatabaseType);
@@ -139,6 +143,133 @@ public void TestInitializingRestAndGraphQLGlobalSettings()
         Assert.IsTrue(runtimeConfig.Runtime.GraphQL?.Enabled);
     }
 
+    /// <summary>
+    /// The `appname` command encodes the telemetry Application Name from a config (offline: no
+    /// validation and no database connection) and decodes a telemetry string back into a
+    /// human-readable description.
+    /// </summary>
+    [TestMethod]
+    public void TestAppNameEncodeAndDecode()
+    {
+        // Arrange: a minimal, self-contained MSSQL config in the mock file system.
+        string configJson = @"{
+            ""$schema"": ""https://github.com/Azure/data-api-builder/releases/download/vmajor.minor.patch/dab.draft.schema.json"",
+            ""data-source"": { ""database-type"": ""mssql"", ""connection-string"": ""Server=localhost;Database=demo;User Id=sa;Password=Placeholder1;"" },
+            ""runtime"": { ""rest"": { ""enabled"": true }, ""graphql"": { ""enabled"": true }, ""host"": { ""mode"": ""development"", ""authentication"": { ""provider"": ""StaticWebApps"" } } },
+            ""entities"": { ""Book"": { ""source"": { ""object"": ""dbo.books"", ""type"": ""table"" }, ""permissions"": [ { ""role"": ""anonymous"", ""actions"": [ ""read"" ] } ] } }
+        }";
+        _fileSystem!.File.WriteAllText("appname-config.json", configJson);
+
+        // Act: encode to an output file. This must succeed offline (no validation / no DB connection).
+        int encodeCode = Program.Execute(
+            new[] { "appname", "--config", "appname-config.json", "--output", "appname-out.txt" },
+            _cliLogger!, _fileSystem!, _runtimeConfigLoader!);
+
+        // Assert: encode succeeded and produced a well-formed telemetry string.
+        Assert.AreEqual(0, encodeCode, "appname --config should succeed offline");
+        string telemetry = _fileSystem.File.ReadAllText("appname-out.txt");
+        Assert.IsTrue(telemetry.StartsWith("dab_oss_"), telemetry);
+        Assert.IsTrue(telemetry.EndsWith("+"), telemetry);
+
+        // Act: decode the produced string back into a human-readable description.
+        int decodeCode = Program.Execute(
+            new[] { "appname", "--decode", telemetry, "--output", "appname-decoded.txt" },
+            _cliLogger!, _fileSystem!, _runtimeConfigLoader!);
+
+        // Assert: decode succeeded and produced recognizable lines.
+        Assert.AreEqual(0, decodeCode, "appname --decode should succeed");
+        string decoded = _fileSystem.File.ReadAllText("appname-decoded.txt");
+        Assert.IsTrue(decoded.Contains("Version: dab_oss_"), decoded);
+        Assert.IsTrue(decoded.Contains("runtime.rest.enabled"), decoded);
+        Assert.IsTrue(decoded.Contains("entities.any.table"), decoded);
+    }
+
+    /// <summary>
+    /// The `appname` encode path returns GENERAL_ERROR (and writes no output file) when the config
+    /// file cannot be found.
+    /// </summary>
+    [TestMethod]
+    public void TestAppNameEncodeFailsWhenConfigMissing()
+    {
+        int code = Program.Execute(
+            new[] { "appname", "--config", "does-not-exist.json", "--output", "appname-out.txt" },
+            _cliLogger!, _fileSystem!, _runtimeConfigLoader!);
+
+        Assert.AreEqual(CliReturnCode.GENERAL_ERROR, code, "appname encode should fail when the config cannot be found.");
+        Assert.IsFalse(_fileSystem!.File.Exists("appname-out.txt"), "No output file should be written on failure.");
+    }
+
+    /// <summary>
+    /// The `appname` encode path writes the telemetry string to stdout when --output is omitted.
+    /// </summary>
+    [TestMethod]
+    public void TestAppNameEncodeWritesToStdout()
+    {
+        string configJson = @"{
+            ""$schema"": ""https://github.com/Azure/data-api-builder/releases/download/vmajor.minor.patch/dab.draft.schema.json"",
+            ""data-source"": { ""database-type"": ""mssql"", ""connection-string"": ""Server=localhost;Database=demo;User Id=sa;Password=Placeholder1;"" },
+            ""runtime"": { ""rest"": { ""enabled"": true }, ""graphql"": { ""enabled"": true } },
+            ""entities"": { }
+        }";
+        _fileSystem!.File.WriteAllText("appname-config.json", configJson);
+
+        TextWriter originalOut = Console.Out;
+        StringWriter capturedOut = new();
+        Console.SetOut(capturedOut);
+        int code;
+        try
+        {
+            code = Program.Execute(
+                new[] { "appname", "--config", "appname-config.json" },
+                _cliLogger!, _fileSystem!, _runtimeConfigLoader!);
+        }
+        finally
+        {
+            Console.SetOut(originalOut);
+        }
+
+        Assert.AreEqual(CliReturnCode.SUCCESS, code, "appname encode to stdout should succeed.");
+        string stdout = capturedOut.ToString();
+        Assert.IsTrue(stdout.Contains("dab_oss_"), $"stdout should contain the telemetry marker but was '{stdout}'.");
+        Assert.IsTrue(stdout.TrimEnd().EndsWith("+"), $"stdout telemetry should end with '+' but was '{stdout}'.");
+    }
+
+    /// <summary>
+    /// The `appname` command is a design-time inspection tool: it always shows the full telemetry
+    /// encoding so users can see what would be collected, independent of the runtime opt-out switch
+    /// (DAB_TELEMETRY_APPNAME_OPT_OUT). This pins that intentional behavior.
+    /// </summary>
+    [TestMethod]
+    public void TestAppNameEncodeIsIndependentOfOptOut()
+    {
+        string?
+            originalOptOut = Environment.GetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR);
+        Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, "1");
+        try
+        {
+            string configJson = @"{
+                ""$schema"": ""https://github.com/Azure/data-api-builder/releases/download/vmajor.minor.patch/dab.draft.schema.json"",
+                ""data-source"": { ""database-type"": ""mssql"", ""connection-string"": ""Server=localhost;Database=demo;User Id=sa;Password=Placeholder1;"" },
+                ""runtime"": { ""rest"": { ""enabled"": true }, ""graphql"": { ""enabled"": true } },
+                ""entities"": { }
+            }";
+            _fileSystem!.File.WriteAllText("appname-config.json", configJson);
+
+            int code = Program.Execute(
+                new[] { "appname", "--config", "appname-config.json", "--output", "appname-out.txt" },
+                _cliLogger!, _fileSystem!, _runtimeConfigLoader!);
+
+            Assert.AreEqual(CliReturnCode.SUCCESS, code, "appname encode should succeed even when opted out.");
+            string telemetry = _fileSystem.File.ReadAllText("appname-out.txt");
+            Assert.IsTrue(
+                telemetry.StartsWith("dab_oss_") && telemetry.EndsWith("+"),
+                $"appname should show the full telemetry encoding regardless of opt-out, but was '{telemetry}'.");
+        }
+        finally
+        {
+            Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, originalOptOut);
+        }
+    }
+
     /// <summary>
     /// Test to validate the usage of --graphql.multiple-mutations.create.enabled option of the init command for all database types.
     /// </summary>
diff --git a/src/Cli/Commands/AppNameOptions.cs b/src/Cli/Commands/AppNameOptions.cs
new file mode 100644
index 0000000000..3255109533
--- /dev/null
+++ b/src/Cli/Commands/AppNameOptions.cs
@@ -0,0 +1,104 @@
+// Copyright (c) Microsoft Corporation.
+// Licensed under the MIT License.
+
+using System.IO.Abstractions;
+using Azure.DataApiBuilder.Config;
+using Azure.DataApiBuilder.Config.ObjectModel;
+using Azure.DataApiBuilder.Config.Telemetry;
+using Azure.DataApiBuilder.Core.Configurations;
+using Cli.Constants;
+using CommandLine;
+using Microsoft.Extensions.Logging;
+
+namespace Cli.Commands
+{
+    /// <summary>
+    /// Options for the appname command, which encodes the DAB telemetry Application Name
+    /// from a config file, or decodes a telemetry Application Name into a human-readable description.
+    /// </summary>
+    [Verb("appname", isDefault: false, HelpText = "Show or decode the DAB telemetry 'Application Name' embedded in SQL connections.", Hidden = false)]
+    public class AppNameOptions : Options
+    {
+        public AppNameOptions(string? decode = null, string? output = null, string? config = null)
+            : base(config)
+        {
+            Decode = decode;
+            Output = output;
+        }
+
+        /// <summary>
+        /// When provided, decodes the given telemetry Application Name string into a human-readable
+        /// description instead of encoding from a config file. Decoding is tolerant of truncation.
+        /// </summary>
+        [Option("decode", Required = false, HelpText = "Decode a telemetry Application Name string into a human-readable description.")]
+        public string? Decode { get; }
+
+        /// <summary>
+        /// Optional file path to write the result to. When omitted, the result is written to stdout.
+        /// </summary>
+        [Option('o', "output", Required = false, HelpText = "Write the result to the specified file instead of stdout.")]
+        public string? Output { get; }
+
+        /// <summary>
+        /// Handles the appname command.
+        /// </summary>
+        public int Handler(ILogger logger, FileSystemRuntimeConfigLoader loader, IFileSystem fileSystem)
+        {
+            // Decode mode: a pure, tolerant string decode. No config or validation is required.
+            // Presence of the option (even with an empty/whitespace value) selects decode mode; the
+            // decoder itself reports a friendly message for empty input.
+            if (Decode is not null)
+            {
+                IReadOnlyList<string> decodedLines = ApplicationNameTelemetry.Decode(Decode);
+                WriteResult(string.Join(Environment.NewLine, decodedLines), fileSystem, logger, trailingNewLine: true);
+                return CliReturnCode.SUCCESS;
+            }
+
+            // Encode mode: parse the config and emit the telemetry Application Name.
+            // We intentionally do NOT run full `validate` here: validation opens a database
+            // connection, whereas encoding only needs the parsed runtime/entity settings.
+            // Requiring a live database would defeat the purpose of this static inspection command.
+            if (!ConfigGenerator.TryGetConfigForRuntimeEngine(Config, loader, fileSystem, out _))
+            {
+                logger.LogError("Could not determine the config file to use.");
+                return CliReturnCode.GENERAL_ERROR;
+            }
+
+            RuntimeConfigProvider runtimeConfigProvider = new(loader);
+            if (!runtimeConfigProvider.TryGetConfig(out RuntimeConfig? runtimeConfig) || runtimeConfig is null)
+            {
+                logger.LogError("Failed to parse the config file.");
+                return CliReturnCode.GENERAL_ERROR;
+            }
+
+            // There is no live connection context at design time, so the context fields
+            // (Protocol/Object/Source/Role) are emitted as placeholders.
+            string telemetryAppName = ApplicationNameTelemetry.EncodeTelemetryString(runtimeConfig, liveDataSource: null);
+            WriteResult(telemetryAppName, fileSystem, logger, trailingNewLine: false);
+            return CliReturnCode.SUCCESS;
+        }
+
+        /// <summary>
+        /// Writes the result to the output file when --output is provided, otherwise to stdout.
+        /// </summary>
+        private void WriteResult(string content, IFileSystem fileSystem, ILogger logger, bool trailingNewLine)
+        {
+            if (!string.IsNullOrWhiteSpace(Output))
+            {
+                // Mirror stdout behavior: append a trailing newline for human-readable (decode) output,
+                // but keep encode output exact (no trailing newline) so it can be copied/piped verbatim.
+                string fileContent = trailingNewLine ?
+                    content + Environment.NewLine : content;
+                fileSystem.File.WriteAllText(Output, fileContent);
+                logger.LogInformation("Wrote output to '{outputFile}'.", Output);
+            }
+            else if (trailingNewLine)
+            {
+                Console.WriteLine(content);
+            }
+            else
+            {
+                Console.Write(content);
+            }
+        }
+    }
+}
diff --git a/src/Cli/Program.cs b/src/Cli/Program.cs
index a9d9247ebd..faba1ee6d5 100644
--- a/src/Cli/Program.cs
+++ b/src/Cli/Program.cs
@@ -88,7 +88,7 @@ public static int Execute(string[] args, ILogger cliLogger, IFileSystem fileSyst
             });
 
             // Parsing user arguments and executing required methods.
-            int result = parser.ParseArguments(args)
+            int result = parser.ParseArguments(args)
                 .MapResult(
                     (InitOptions options) => options.Handler(cliLogger, loader, fileSystem),
                     (AddOptions options) => options.Handler(cliLogger, loader, fileSystem),
@@ -100,6 +100,7 @@ public static int Execute(string[] args, ILogger cliLogger, IFileSystem fileSyst
                     (AutoConfigOptions options) => options.Handler(cliLogger, loader, fileSystem),
                     (AutoConfigSimulateOptions options) => options.Handler(cliLogger, loader, fileSystem),
                     (ExportOptions options) => options.Handler(cliLogger, loader, fileSystem),
+                    (AppNameOptions options) => options.Handler(cliLogger, loader, fileSystem),
                     errors => DabCliParserErrorHandler.ProcessErrorsAndReturnExitCode(errors));
 
             return result;
diff --git a/src/Config/DeserializationVariableReplacementSettings.cs b/src/Config/DeserializationVariableReplacementSettings.cs
index 350824409b..8448c9a7bf 100644
--- a/src/Config/DeserializationVariableReplacementSettings.cs
+++ b/src/Config/DeserializationVariableReplacementSettings.cs
@@ -18,6 +18,14 @@ public class DeserializationVariableReplacementSettings
         public bool DoReplaceAkvVar { get; set; }
         public EnvironmentVariableReplacementFailureMode EnvFailureMode { get; set; } = EnvironmentVariableReplacementFailureMode.Throw;
 
+        /// <summary>
+        /// When true, connection-string Application Name (telemetry) injection is skipped during this
+        /// parse.
+        /// This is set for nested child config loads in a multi-database setup: a child config
+        /// lacks the global runtime section and knows only its own entities, so the top-level load
+        /// performs the injection once over the fully-merged config for every data source.
+        /// </summary>
+        public bool SkipApplicationNameInjection { get; set; }
+
         // @env\(' : match @env('
         // @akv\(' : match @akv('
         // .*? : lazy match any character except newline 0 or more times
diff --git a/src/Config/FileSystemRuntimeConfigLoader.cs b/src/Config/FileSystemRuntimeConfigLoader.cs
index 016931bf62..e3529c696f 100644
--- a/src/Config/FileSystemRuntimeConfigLoader.cs
+++ b/src/Config/FileSystemRuntimeConfigLoader.cs
@@ -360,6 +360,11 @@ private void HotReloadConfig(bool isDevMode, ILogger? logger = null)
         IsNewConfigValidated = false;
         SignalConfigChanged();
 
+        // Telemetry (and any other) logs buffered during the reload parse are otherwise only
+        // drained once at startup. Flush them now so hot-reload logs are actually emitted and the
+        // shared static buffer does not accumulate entries across successive reloads.
+        FlushLogBuffer();
+
         logger?.LogInformation("Hot-reload process finished.");
     }
 
@@ -551,13 +556,10 @@ public void SetLogger(ILogger logger)
     }
 
     /// <summary>
-    /// Flush all logs from the buffer after the log level is set from the RuntimeConfig.
-    /// Logger needs to be present, or else the logs will be lost.
+    /// The logger this loader emits to once set, consumed by the base
+    /// to drain buffered logs.
     /// </summary>
-    public void FlushLogBuffer()
-    {
-        _logBuffer.FlushToLogger(_logger!);
-    }
+    protected override ILogger? Logger => _logger;
 
     /// <summary>
     /// Helper method that sends the log to the buffer if the logger has not being set up.
diff --git a/src/Config/LogBuffer.cs b/src/Config/LogBuffer.cs
index f014f012ed..63e1cd3236 100644
--- a/src/Config/LogBuffer.cs
+++ b/src/Config/LogBuffer.cs
@@ -12,6 +12,12 @@ namespace Azure.DataApiBuilder.Config
     /// </summary>
     public class LogBuffer
     {
+        /// <summary>
+        /// Upper bound on buffered entries.
Prevents unbounded growth when the buffer is never drained + /// (e.g. a loader with no logger in a hot-reload loop). The oldest entries are dropped first. + /// + internal const int MAX_BUFFERED_ENTRIES = 1000; + private readonly ConcurrentQueue<(LogLevel LogLevel, string Message, Exception? Exception)> _logBuffer; private readonly object _flushLock = new(); @@ -26,6 +32,12 @@ public LogBuffer() public void BufferLog(LogLevel logLevel, string message, Exception? exception = null) { _logBuffer.Enqueue((logLevel, message, exception)); + + // Keep the buffer bounded so it cannot grow without limit if it is never drained. Dropping + // the oldest entries first preserves the most recent (most useful) diagnostics. + while (_logBuffer.Count > MAX_BUFFERED_ENTRIES && _logBuffer.TryDequeue(out _)) + { + } } /// diff --git a/src/Config/ObjectModel/RuntimeConfig.cs b/src/Config/ObjectModel/RuntimeConfig.cs index 21d2c23252..a8b71d10c9 100644 --- a/src/Config/ObjectModel/RuntimeConfig.cs +++ b/src/Config/ObjectModel/RuntimeConfig.cs @@ -379,7 +379,14 @@ public RuntimeConfig( // be resolved using the parent's Key Vault configuration. // If a child config defines its own azure-key-vault section, TryParseConfig's // ExtractAzureKeyVaultOptions will detect it and override these parent options. - DeserializationVariableReplacementSettings replacementSettings = new(azureKeyVaultOptions: this.AzureKeyVault, doReplaceEnvVar: true, doReplaceAkvVar: true, envFailureMode: EnvironmentVariableReplacementFailureMode.Ignore); + DeserializationVariableReplacementSettings replacementSettings = new(azureKeyVaultOptions: this.AzureKeyVault, doReplaceEnvVar: true, doReplaceAkvVar: true, envFailureMode: EnvironmentVariableReplacementFailureMode.Ignore) + { + // Defer Application Name (telemetry) injection to the top-level load. 
A child config + // has no global runtime section and only its own entities; the root performs the + // injection once over the fully-merged config so each data source's pool reflects the + // global runtime and the complete entity set. + SkipApplicationNameInjection = true + }; foreach (string dataSourceFile in DataSourceFiles.SourceFiles) { diff --git a/src/Config/RuntimeConfigLoader.cs b/src/Config/RuntimeConfigLoader.cs index b40d0b084f..169348ac28 100644 --- a/src/Config/RuntimeConfigLoader.cs +++ b/src/Config/RuntimeConfigLoader.cs @@ -10,9 +10,11 @@ using Azure.DataApiBuilder.Config.Converters; using Azure.DataApiBuilder.Config.NamingPolicies; using Azure.DataApiBuilder.Config.ObjectModel; +using Azure.DataApiBuilder.Config.Telemetry; using Azure.DataApiBuilder.Product; using Azure.DataApiBuilder.Service.Exceptions; using Microsoft.Data.SqlClient; +using Microsoft.Extensions.Logging; using Microsoft.Extensions.Primitives; using Npgsql; using static Azure.DataApiBuilder.Config.DabConfigEvents; @@ -28,6 +30,26 @@ public abstract class RuntimeConfigLoader protected static LogBuffer _logBuffer = new(); + /// + /// Logger used to drain buffered logs. null on a base loader with no logging; loaders that + /// own a logger (e.g. ) override this so + /// can emit to it. + /// + protected virtual ILogger? Logger => null; + + /// + /// Flushes any logs buffered during config parsing / telemetry embedding (notably the telemetry + /// Application Name Debug log) to . Safe no-op when no logger is available (the + /// buffered logs remain until a later flush), so it cannot lose logs or regress flush behavior. + /// + public void FlushLogBuffer() + { + if (Logger is not null) + { + _logBuffer.FlushToLogger(Logger); + } + } + // Public to allow the RuntimeProvider and other users of class to set via out param. // May be candidate to refactor by changing all of the Parse/Load functions to save // state in place of using out params. 
@@ -223,7 +245,12 @@ public static bool TryParseConfig(string json, azureKeyVaultOptions: azureKeyVaultOptions, doReplaceEnvVar: replacementSettings.DoReplaceEnvVar, doReplaceAkvVar: replacementSettings.DoReplaceAkvVar, - envFailureMode: replacementSettings.EnvFailureMode); + envFailureMode: replacementSettings.EnvFailureMode) + { + // Preserve the child-config skip flag across this AKV-driven rebuild so nested + // configs still defer Application Name injection to the top-level load. + SkipApplicationNameInjection = replacementSettings.SkipApplicationNameInjection + }; } } @@ -238,47 +265,51 @@ public static bool TryParseConfig(string json, return false; } - // retreive current connection string from config - string updatedConnectionString = config.DataSource?.ConnectionString ?? string.Empty; - - if (!string.IsNullOrEmpty(connectionString)) + // Embed the DAB Application Name (with anonymous usage telemetry) into the connection + // string of every MSSQL / DWSQL / PostgreSQL data source. + // + // We iterate the fully-merged data-source map and pass the merged `config`, so that in a + // multi-database setup each data source reflects the GLOBAL runtime settings and the + // COMPLETE (merged) entity set rather than its own partial child config. Child configs skip + // this step during their own parse (SkipApplicationNameInjection); the top-level load runs + // it once here, after the merge, so every connection pool carries a self-contained snapshot + // of the deployment. + // + // The explicit connection-string override (the `connectionString` parameter), when present, + // applies only to the default data source, and it is applied regardless of env-var + // replacement. + // Telemetry embedding, by contrast, is gated on DoReplaceEnvVar and is skipped for nested + // child configs, which defer injection to the top-level load.
+ bool embedTelemetry = replacementSettings?.DoReplaceEnvVar == true && replacementSettings?.SkipApplicationNameInjection != true; + bool hasConnectionStringOverride = !string.IsNullOrEmpty(connectionString); + + if (embedTelemetry || hasConnectionStringOverride) { - // update connection string if provided. - updatedConnectionString = connectionString; - } - - // Post-processing for connection strings only applies when a data source is present. - // Root configs (with data-source-files) may not have a data source. - if (config.DataSource is not null) - { - Dictionary datasourceNameToConnectionString = new(); - - // add to dictionary if datasourceName is present - datasourceNameToConnectionString.TryAdd(config.DefaultDataSourceName, updatedConnectionString); - - // iterate over dictionary and update runtime config with connection strings. - foreach ((string dataSourceKey, string connectionValue) in datasourceNameToConnectionString) + foreach ((string dataSourceName, DataSource dataSource) in config.GetDataSourceNamesToDataSourcesIterator().ToList()) { - string updatedConnection = connectionValue; + bool isDefaultDataSource = string.Equals(dataSourceName, config.DefaultDataSourceName, StringComparison.OrdinalIgnoreCase); - DataSource ds = config.GetDataSourceFromDataSourceName(dataSourceKey); + // The override applies only to the default data source; others keep their own value. + bool applyOverrideHere = isDefaultDataSource && hasConnectionStringOverride; - // Add Application Name for telemetry for MsSQL or PgSql - if (ds.DatabaseType is DatabaseType.MSSQL && replacementSettings?.DoReplaceEnvVar == true) + // Nothing to do for a non-default data source when we're not embedding telemetry. 
+ if (!embedTelemetry && !applyOverrideHere) { - updatedConnection = GetConnectionStringWithApplicationName(connectionValue); - } - else if (ds.DatabaseType is DatabaseType.PostgreSQL && replacementSettings?.DoReplaceEnvVar == true) - { - updatedConnection = GetPgSqlConnectionStringWithApplicationName(connectionValue); + continue; } - ds = ds with { ConnectionString = updatedConnection }; - config.UpdateDataSourceNameToDataSource(config.DefaultDataSourceName, ds); + string baseConnectionString = applyOverrideHere ? connectionString! : dataSource.ConnectionString; + + string updatedConnectionString = embedTelemetry + ? GetConnectionStringWithApplicationName(baseConnectionString, config, dataSource) + : baseConnectionString; + + DataSource updatedDataSource = dataSource with { ConnectionString = updatedConnectionString }; + config.UpdateDataSourceNameToDataSource(dataSourceName, updatedDataSource); - if (string.Equals(dataSourceKey, config.DefaultDataSourceName, StringComparison.OrdinalIgnoreCase)) + if (isDefaultDataSource) { - config = config with { DataSource = ds }; + config = config with { DataSource = updatedDataSource }; } } } @@ -362,14 +393,37 @@ public static JsonSerializerOptions GetSerializationOptions( return options; } + /// + /// Embeds the DAB Application Name (with anonymous usage telemetry) into the connection + /// string for the given data source, dispatching to the engine-specific implementation. Engines + /// that do not support telemetry (e.g. MySQL) return the connection string unchanged. + /// + /// Connection string for connecting to the database. + /// The fully-resolved runtime config used to compute the telemetry payload. + /// The data source whose connection is being opened (selects the engine and per-pool fields). + /// The connection string with the telemetry-bearing Application Name embedded. 
+ public static string GetConnectionStringWithApplicationName(string connectionString, RuntimeConfig config, DataSource dataSource) + { + return dataSource.DatabaseType switch + { + DatabaseType.MSSQL or DatabaseType.DWSQL => GetMsSqlConnectionStringWithApplicationName(connectionString, config, dataSource), + DatabaseType.PostgreSQL => GetPgSqlConnectionStringWithApplicationName(connectionString, config, dataSource), + _ => connectionString, + }; + } + /// /// It adds or replaces a property in the connection string with `Application Name` property. /// If the connection string already contains the property, it appends the property `Application Name` to the connection string, /// else add the Application Name property with DataApiBuilder Application Name based on hosted/oss platform. /// /// Connection string for connecting to database. + /// When provided, anonymous DAB telemetry is embedded into the `Application Name` + /// (honoring the `DAB_TELEMETRY_APPNAME_OPT_OUT` opt-out). When null, only the plain user agent is used. + /// The data source whose connection is being opened, used to encode per-pool + /// fields (Source, OBO). Ignored when is null. /// Updated connection string with `Application Name` property. - internal static string GetConnectionStringWithApplicationName(string connectionString) + internal static string GetMsSqlConnectionStringWithApplicationName(string connectionString, RuntimeConfig? config = null, DataSource? liveDataSource = null) { // If the connection string is null, empty, or whitespace, return it as is. if (string.IsNullOrWhiteSpace(connectionString)) @@ -377,8 +431,6 @@ internal static string GetConnectionStringWithApplicationName(string connectionS return connectionString; } - string applicationName = ProductInfo.GetDataApiBuilderUserAgent(); - // Create a StringBuilder from the connection string. 
SqlConnectionStringBuilder connectionStringBuilder; try @@ -394,6 +446,26 @@ internal static string GetConnectionStringWithApplicationName(string connectionS innerException: ex); } + // Idempotency guard: if DAB telemetry was already embedded into the Application Name (e.g. by + // the loader's post-processing), do not append it again — that would duplicate the payload. + if (connectionStringBuilder.ApplicationName?.Contains(ProductInfo.DAB_USER_AGENT_MARKER, StringComparison.Ordinal) == true) + { + return connectionString; + } + + // When the full runtime config is available, embed anonymous DAB telemetry into the + // Application Name (honoring the opt-out switch). Otherwise fall back to the plain user agent. + string applicationName = config is null + ? ProductInfo.GetDataApiBuilderUserAgent() + : ApplicationNameTelemetry.BuildApplicationNameSegment(config, liveDataSource); + + if (config is not null) + { + // Emit the telemetry-bearing Application Name (never the full connection string, which can + // contain secrets) at Debug, once per pool, as required by the telemetry design. + _logBuffer.BufferLog(LogLevel.Debug, $"DAB telemetry Application Name computed for '{liveDataSource?.DatabaseType}' data source: {applicationName}"); + } + string defaultApplicationName = new SqlConnectionStringBuilder().ApplicationName; // If the connection string does not contain the `Application Name` property, add it. @@ -420,8 +492,11 @@ internal static string GetConnectionStringWithApplicationName(string connectionS /// else add the Application Name property with DataApiBuilder Application Name based on hosted/oss platform. /// /// Connection string for connecting to database. + /// When provided, anonymous DAB usage telemetry is embedded in the Application Name (honoring the opt-out switch); otherwise the plain user agent is used. + /// The data source whose connection is being opened, used to encode per-pool + /// fields (Source, OBO). Ignored when is null. 
/// Updated connection string with `Application Name` property. - internal static string GetPgSqlConnectionStringWithApplicationName(string connectionString) + internal static string GetPgSqlConnectionStringWithApplicationName(string connectionString, RuntimeConfig? config = null, DataSource? liveDataSource = null) { // If the connection string is null, empty, or whitespace, return it as is. if (string.IsNullOrWhiteSpace(connectionString)) @@ -429,8 +504,6 @@ internal static string GetPgSqlConnectionStringWithApplicationName(string connec return connectionString; } - string applicationName = ProductInfo.GetDataApiBuilderUserAgent(); - // Create a StringBuilder from the connection string. NpgsqlConnectionStringBuilder connectionStringBuilder; try @@ -446,6 +519,26 @@ internal static string GetPgSqlConnectionStringWithApplicationName(string connec innerException: ex); } + // Idempotency guard: if DAB telemetry was already embedded into the Application Name (e.g. by + // the loader's post-processing), do not append it again — that would duplicate the payload. + if (connectionStringBuilder.ApplicationName?.Contains(ProductInfo.DAB_USER_AGENT_MARKER, StringComparison.Ordinal) == true) + { + return connectionString; + } + + // When the full runtime config is available, embed anonymous DAB telemetry into the + // Application Name (honoring the opt-out switch). Otherwise fall back to the plain user agent. + string applicationName = config is null + ? ProductInfo.GetDataApiBuilderUserAgent() + : ApplicationNameTelemetry.BuildApplicationNameSegment(config, liveDataSource); + + if (config is not null) + { + // Emit the telemetry-bearing Application Name (never the full connection string, which can + // contain secrets) at Debug, once per pool, as required by the telemetry design. 
+ _logBuffer.BufferLog(LogLevel.Debug, $"DAB telemetry Application Name computed for '{liveDataSource?.DatabaseType}' data source: {applicationName}"); + } + // If the connection string does not contain the `Application Name` property, add it. // or if the connection string contains the `Application Name` property, replace it with the DataApiBuilder Application Name. if (string.IsNullOrEmpty(connectionStringBuilder.ApplicationName)) diff --git a/src/Config/Telemetry/ApplicationNameTelemetry.cs b/src/Config/Telemetry/ApplicationNameTelemetry.cs new file mode 100644 index 0000000000..3a6b4f6352 --- /dev/null +++ b/src/Config/Telemetry/ApplicationNameTelemetry.cs @@ -0,0 +1,469 @@ +// Copyright (c) Microsoft Corporation. +// Licensed under the MIT License. + +using System.Text; +using Azure.DataApiBuilder.Config.ObjectModel; +using Azure.DataApiBuilder.Product; + +namespace Azure.DataApiBuilder.Config.Telemetry; + +/// +/// Encodes (and decodes) lightweight, anonymous DAB telemetry into the SQL Server / PostgreSQL +/// Application Name connection-string property. +/// +/// Format: +/// +/// dab_oss_<version>+<context>|<runtime>|<entity>+ +/// +/// Example: dab_oss_1.2.3+XXSX|11111M10...|10111101M...+ +/// +/// The block is self-delimiting: it always starts with the dab_oss_ marker and ends +/// with +, so it can be located and decoded even when it is appended after a user's +/// custom Application Name (e.g. MyApp,dab_oss_...+...+) or an OBO per-user pool hash +/// (e.g. {hash}|MyApp,dab_oss_...+...+). Because of this, the inner | separators +/// never need to change and the existing composition separators (, and OBO |) are +/// left untouched. +/// +/// Encoding notes: +/// +/// Sections are additive: new flags are appended to the end of a section (before the +/// |) so older decoders remain forward-compatible. +/// Boolean-style settings encode as 1 (enabled/true), 0 (disabled/false) or +/// M (the owning config section is missing). +/// ?
marks a setting whose concept does not yet exist in the engine. +/// Context fields that are per-request (Protocol, Object, Role) are not knowable when a +/// pooled connection is opened, so they are encoded as X. Only Source (known per +/// data source) is populated at runtime; the CLI, which has no live connection, emits all +/// X. +/// +/// +public static class ApplicationNameTelemetry +{ + /// + /// Environment variable used to opt out of embedding telemetry in the Application Name. + /// When set to exactly "1" the payload is omitted and only dab_oss_<version> + /// is emitted. Any other value (including "0", missing or invalid) keeps telemetry on. + /// + public const string OPT_OUT_ENV_VAR = "DAB_TELEMETRY_APPNAME_OPT_OUT"; + + /// Placeholder used for values that are unknown/not-applicable at the current scope. + private const char NOT_APPLICABLE = 'X'; + + /// Placeholder used for settings whose concept does not yet exist in the engine. + private const char NOT_SUPPORTED = '?'; + + /// Marks a config section that is missing entirely. + private const char MISSING = 'M'; + + private const char SECTION_SEPARATOR = '|'; + private const char PAYLOAD_DELIMITER = '+'; + + /// Inputs available to a setting encoder. + private readonly record struct EncodeInputs(RuntimeConfig Config, DataSource? LiveDataSource); + + /// A single telemetry setting: its name, how to encode it, and how to describe a value. + private sealed record Setting(string Name, Func Encode, Func Describe); + + /// + /// Produces the pure telemetry string (dab_oss_<version>+<context>|<runtime>|<entity>+), + /// independent of the opt-out switch and of DAB_APP_NAME_ENV. Used by the CLI and as the + /// telemetry-bearing portion of the connection-string segment. + /// + /// The runtime config to encode. + /// + /// The data source whose connection is being opened, or null when there is no live + /// connection context (e.g. the dab appname --config CLI command). 
When null, the + /// Source field is emitted as X and per–data-source flags (such as OBO) fall back to the + /// config's default data source. + /// + public static string EncodeTelemetryString(RuntimeConfig config, DataSource? liveDataSource = null) + { + EncodeInputs inputs = new(config, liveDataSource); + + string context = EncodeSection(_contextSettings, inputs); + string runtime = EncodeSection(_runtimeSettings, inputs); + string entity = EncodeSection(_entitySettings, inputs); + + return new StringBuilder() + .Append(ProductInfo.DAB_USER_AGENT) + .Append(PAYLOAD_DELIMITER) + .Append(context).Append(SECTION_SEPARATOR) + .Append(runtime).Append(SECTION_SEPARATOR) + .Append(entity) + .Append(PAYLOAD_DELIMITER) + .ToString(); + } + + /// + /// Builds the DAB-owned portion of the Application Name to embed in a connection string. + /// + /// When opted out, only dab_oss_<version> is returned (no payload). + /// Telemetry is always based on the product version (), + /// so it is never suppressed by DAB_APP_NAME_ENV. + /// When DAB_APP_NAME_ENV is set, its value is preserved as a comma prefix + /// (e.g. dab_hosted,dab_oss_...+...+) so a host/custom label survives while the + /// dab_oss_ marker remains intact for decoding. + /// + /// + /// The runtime config to encode. + /// The data source whose connection is being opened. + public static string BuildApplicationNameSegment(RuntimeConfig config, DataSource? liveDataSource) + { + string telemetry = IsOptedOut() + ? ProductInfo.DAB_USER_AGENT + : EncodeTelemetryString(config, liveDataSource); + + string? customLabel = Environment.GetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV); + + return string.IsNullOrWhiteSpace(customLabel) + ? telemetry + : $"{customLabel},{telemetry}"; + } + + /// + /// Decodes a telemetry-bearing Application Name into human-readable lines. The input may be a + /// raw telemetry string or a full Application Name with a user prefix and/or OBO hash. 
Decoding + /// is tolerant: a value truncated by SQL Server's 128-character limit, a missing trailing + /// delimiter, or extra (newer) flags are all handled without throwing. + /// + /// The Application Name (or telemetry string) to decode. + /// One human-readable line per recognized value. + public static IReadOnlyList Decode(string? applicationName) + { + List lines = new(); + + if (string.IsNullOrWhiteSpace(applicationName)) + { + lines.Add("No DAB telemetry found (empty Application Name)."); + return lines; + } + + int markerIndex = applicationName.IndexOf(ProductInfo.DAB_USER_AGENT_MARKER, StringComparison.Ordinal); + if (markerIndex < 0) + { + lines.Add("No DAB telemetry found (missing 'dab_oss_' marker)."); + return lines; + } + + // Everything from the marker onward, ignoring any user prefix / OBO hash before it. + string block = applicationName[markerIndex..]; + + int payloadStart = block.IndexOf(PAYLOAD_DELIMITER); + string version = payloadStart < 0 ? block : block[..payloadStart]; + lines.Add($"Version: {version}"); + + if (payloadStart < 0) + { + lines.Add("Telemetry payload: none (opted out or version-only Application Name)."); + return lines; + } + + // Payload sits between the opening '+' and the (optional, possibly truncated) closing '+'. + string payload = block[(payloadStart + 1)..]; + if (payload.EndsWith(PAYLOAD_DELIMITER)) + { + payload = payload[..^1]; + } + + string[] sections = payload.Split(SECTION_SEPARATOR); + DecodeSection(lines, "Context", _contextSettings, sections, index: 0); + DecodeSection(lines, "Runtime", _runtimeSettings, sections, index: 1); + DecodeSection(lines, "Entity", _entitySettings, sections, index: 2); + + return lines; + } + + /// Returns true when telemetry has been explicitly opted out via the environment variable. 
+ private static bool IsOptedOut() => + string.Equals( + Environment.GetEnvironmentVariable(OPT_OUT_ENV_VAR)?.Trim(), + "1", + StringComparison.Ordinal); + + private static string EncodeSection(IReadOnlyList settings, EncodeInputs inputs) + { + char[] chars = new char[settings.Count]; + for (int i = 0; i < settings.Count; i++) + { + chars[i] = settings[i].Encode(inputs); + } + + return new string(chars); + } + + private static void DecodeSection( + List lines, + string sectionName, + IReadOnlyList settings, + string[] sections, + int index) + { + if (index >= sections.Length) + { + // Section absent (truncated before this section was reached). + return; + } + + string section = sections[index]; + for (int i = 0; i < section.Length; i++) + { + char value = section[i]; + if (i < settings.Count) + { + lines.Add($"{sectionName} > {settings[i].Name}: {value} ({settings[i].Describe(value)})"); + } + else + { + // A newer engine added a flag this decoder does not know about. + lines.Add($"{sectionName} > [position {i + 1}]: {value} (unrecognized – added by a newer version)"); + } + } + } + + // --------------------------------------------------------------------------------------------- + // Value helpers + // --------------------------------------------------------------------------------------------- + + /// Encodes a tri-state flag: 1=true, 0=false, M=missing section. + private static char Flag(bool? value) => value switch + { + true => '1', + false => '0', + null => MISSING, + }; + + /// Encodes a presence flag: 1=present, 0=absent. + private static char Present(bool present) => present ? '1' : '0'; + + /// Evaluates an "any entity matches" predicate, returning M when no entities exist. + private static char AnyEntity(RuntimeConfig config, Func predicate) + { + IReadOnlyDictionary? 
entities = config.Entities?.Entities; + if (entities is null || entities.Count == 0) + { + return MISSING; + } + + return Present(entities.Values.Any(predicate)); + } + + private static char EncodeSource(DatabaseType? source) => source switch + { + DatabaseType.MSSQL => 'S', + DatabaseType.DWSQL => 'D', + DatabaseType.PostgreSQL => 'P', + DatabaseType.MySQL => 'M', + DatabaseType.CosmosDB_NoSQL => 'C', + DatabaseType.CosmosDB_PostgreSQL => 'C', + _ => NOT_APPLICABLE, + }; + + /// + /// Encodes whether on-behalf-of (user-delegated) auth is enabled for the data source. Uses the + /// live data source when one is supplied, so each connection pool reflects its own setting; + /// otherwise falls back to the config's default data source (e.g. the CLI, which has no live + /// connection). Encoded as M when no data source is available. + /// + private static char EncodeObo(EncodeInputs inputs) + { + DataSource? dataSource = inputs.LiveDataSource ?? inputs.Config.DataSource; + return dataSource is null ? MISSING : Present(dataSource.IsUserDelegatedAuthEnabled); + } + + private static char EncodeHostMode(RuntimeConfig config) + { + HostOptions? host = config.Runtime?.Host; + if (host is null) + { + return MISSING; + } + + return host.Mode == HostMode.Production ? '1' : '0'; + } + + /// + /// Encodes the authentication provider. The issue defines the alphabet U E C S A W without a + /// legend; the mapping below is the chosen interpretation and is easy to adjust if needed. + /// + private static char EncodeAuthProvider(RuntimeConfig config) + { + AuthenticationOptions? 
auth = config.Runtime?.Host?.Authentication; + if (auth is null) + { + return MISSING; + } + + string provider = auth.Provider; + if (provider.Equals(AuthenticationOptions.UNAUTHENTICATED_AUTHENTICATION, StringComparison.OrdinalIgnoreCase)) + { + return 'U'; + } + + if (provider.Equals(AuthenticationOptions.SIMULATOR_AUTHENTICATION, StringComparison.OrdinalIgnoreCase)) + { + return 'S'; + } + + if (provider.Equals(nameof(EasyAuthType.StaticWebApps), StringComparison.OrdinalIgnoreCase)) + { + return 'W'; + } + + if (provider.Equals(nameof(EasyAuthType.AppService), StringComparison.OrdinalIgnoreCase)) + { + return 'A'; + } + + if (provider.Equals("AzureAD", StringComparison.OrdinalIgnoreCase) + || provider.Equals("EntraID", StringComparison.OrdinalIgnoreCase)) + { + return 'E'; + } + + // Any other (custom) JWT provider. + return 'C'; + } + + private static bool UsesCustomRole(Entity entity) => + entity.Permissions is not null && + entity.Permissions.Any(p => + !p.Role.Equals("anonymous", StringComparison.OrdinalIgnoreCase) && + !p.Role.Equals("authenticated", StringComparison.OrdinalIgnoreCase)); + + private static bool UsesPolicy(Entity entity) => + entity.Permissions is not null && + entity.Permissions.Any(p => + p.Actions is not null && + p.Actions.Any(a => a.Policy is not null && + (a.Policy.Database is not null || a.Policy.Request is not null))); + + // --------------------------------------------------------------------------------------------- + // Describers (value char -> human-readable meaning) used for decoding. 
+ // --------------------------------------------------------------------------------------------- + + private static string DescribeFlag(char value) => value switch + { + '1' => "enabled/yes", + '0' => "disabled/no", + MISSING => "missing", + NOT_SUPPORTED => "not yet supported", + _ => "unrecognized", + }; + + private static string DescribeProtocol(char value) => value switch + { + 'R' => "REST", + 'G' => "GraphQL", + 'M' => "MCP", + NOT_APPLICABLE => "not applicable", + _ => "unrecognized", + }; + + private static string DescribeObject(char value) => value switch + { + 'T' => "Table", + 'V' => "View", + 'S' => "Stored Procedure", + 'P' => "Persisted Document", + NOT_APPLICABLE => "not applicable", + _ => "unrecognized", + }; + + private static string DescribeSource(char value) => value switch + { + 'S' => "SQL", + 'D' => "DWSQL", + 'P' => "Postgres", + 'M' => "MySQL", + 'C' => "Cosmos", + NOT_APPLICABLE => "not applicable", + _ => "unrecognized", + }; + + private static string DescribeRole(char value) => value switch + { + 'N' => "Anonymous", + 'A' => "Authenticated", + 'C' => "Custom", + NOT_APPLICABLE => "not applicable", + _ => "unrecognized", + }; + + private static string DescribeHostMode(char value) => value switch + { + '0' => "Development", + '1' => "Production", + MISSING => "missing", + _ => "unrecognized", + }; + + private static string DescribeAuthProvider(char value) => value switch + { + 'U' => "Unauthenticated", + 'E' => "EntraId", + 'C' => "Custom", + 'S' => "Simulator", + 'A' => "AppService", + 'W' => "StaticWebApps", + MISSING => "missing", + _ => "unrecognized", + }; + + // --------------------------------------------------------------------------------------------- + // Schema – the ordered list of settings per section. Encoding and decoding share these lists so + // they can never drift out of sync. Append new settings to the END of a section only. 
+ // --------------------------------------------------------------------------------------------- + + private static readonly IReadOnlyList _contextSettings = new[] + { + // Protocol/Object/Role are per-request and unknown when a pooled connection is opened. + new Setting("Protocol", _ => NOT_APPLICABLE, DescribeProtocol), + new Setting("Object", _ => NOT_APPLICABLE, DescribeObject), + new Setting("Source", i => EncodeSource(i.LiveDataSource?.DatabaseType), DescribeSource), + new Setting("Role", _ => NOT_APPLICABLE, DescribeRole), + }; + + private static readonly IReadOnlyList _runtimeSettings = new[] + { + new Setting("runtime.rest.enabled", i => Flag(i.Config.Runtime?.Rest?.Enabled), DescribeFlag), + new Setting("runtime.graphql.enabled", i => Flag(i.Config.Runtime?.GraphQL?.Enabled), DescribeFlag), + new Setting("runtime.mcp.enabled", i => Flag(i.Config.Runtime?.Mcp?.Enabled), DescribeFlag), + new Setting("runtime.host.mode", i => EncodeHostMode(i.Config), DescribeHostMode), + new Setting("data-source-files", i => Present(i.Config.DataSourceFiles?.SourceFiles?.Any() == true), DescribeFlag), + new Setting("azure-key-vault", i => Present(!string.IsNullOrEmpty(i.Config.AzureKeyVault?.Endpoint)), DescribeFlag), + new Setting("health.enabled", i => Flag(i.Config.Runtime?.Health?.Enabled), DescribeFlag), + new Setting("cache.enabled", i => Flag(i.Config.Runtime?.Cache?.Enabled), DescribeFlag), + new Setting("cache.l2", i => Flag(i.Config.Runtime?.Cache?.Level2?.Enabled), DescribeFlag), + new Setting("data-source.obo", EncodeObo, DescribeFlag), + new Setting("autoentities", i => Present(i.Config.Autoentities?.Any() == true), DescribeFlag), + new Setting("rest.request-body-strict", i => Flag(i.Config.Runtime?.Rest?.RequestBodyStrict), DescribeFlag), + new Setting("graphql.multiple-mutations.create.enabled", i => Flag(i.Config.Runtime?.GraphQL?.MultipleMutationOptions?.MultipleCreateOptions?.Enabled), DescribeFlag), + new Setting("telemetry.open-telemetry.enabled", i => 
Flag(i.Config.Runtime?.Telemetry?.OpenTelemetry?.Enabled), DescribeFlag), + new Setting("telemetry.application-insights.enabled", i => Flag(i.Config.Runtime?.Telemetry?.ApplicationInsights?.Enabled), DescribeFlag), + new Setting("telemetry.azure-log-analytics.enabled", i => Flag(i.Config.Runtime?.Telemetry?.AzureLogAnalytics?.Enabled), DescribeFlag), + new Setting("telemetry.file-sink.enabled", i => Flag(i.Config.Runtime?.Telemetry?.File?.Enabled), DescribeFlag), + new Setting("auth.provider", i => EncodeAuthProvider(i.Config), DescribeAuthProvider), + new Setting("embedding.enabled", i => Flag(i.Config.Runtime?.Embeddings?.Enabled), DescribeFlag), + new Setting("embedding.endpoint.enabled", i => Flag(i.Config.Runtime?.Embeddings?.Endpoint?.Enabled), DescribeFlag), + }; + + private static readonly IReadOnlyList _entitySettings = new[] + { + new Setting("entities.any.table", i => AnyEntity(i.Config, e => e.Source?.Type == EntitySourceType.Table), DescribeFlag), + new Setting("entities.any.view", i => AnyEntity(i.Config, e => e.Source?.Type == EntitySourceType.View), DescribeFlag), + new Setting("entities.any.stored-procedure", i => AnyEntity(i.Config, e => e.Source?.Type == EntitySourceType.StoredProcedure), DescribeFlag), + // MCP persisted documents are not yet a modeled concept in the engine. 
+ new Setting("entities.any.mcp-persisted-document", _ => NOT_SUPPORTED, DescribeFlag), + new Setting("entities.any.cache", i => AnyEntity(i.Config, e => e.Cache?.Enabled == true), DescribeFlag), + new Setting("entities.any.rest.enabled", i => AnyEntity(i.Config, e => e.IsRestEnabled), DescribeFlag), + new Setting("entities.any.graphql.enabled", i => AnyEntity(i.Config, e => e.IsGraphQLEnabled), DescribeFlag), + new Setting("entities.any.mcp.dml-tools", i => AnyEntity(i.Config, e => e.Mcp?.DmlToolEnabled == true), DescribeFlag), + new Setting("entities.any.mcp.custom-tool", i => AnyEntity(i.Config, e => e.Mcp?.CustomToolEnabled == true), DescribeFlag), + new Setting("entities.any.custom-roles", i => AnyEntity(i.Config, UsesCustomRole), DescribeFlag), + new Setting("entities.any.policies", i => AnyEntity(i.Config, UsesPolicy), DescribeFlag), + new Setting("entities.any.descriptions", i => AnyEntity(i.Config, e => !string.IsNullOrEmpty(e.Description)), DescribeFlag), + new Setting("entities.any.relationships", i => AnyEntity(i.Config, e => e.Relationships?.Any() == true), DescribeFlag), + // Parameter-level embeddings are not yet a modeled concept in the engine. + new Setting("entities.any.parameter.embed", _ => NOT_SUPPORTED, DescribeFlag), + }; +} diff --git a/src/Core/Configurations/RuntimeConfigProvider.cs b/src/Core/Configurations/RuntimeConfigProvider.cs index d0fa320313..c38f666d5b 100644 --- a/src/Core/Configurations/RuntimeConfigProvider.cs +++ b/src/Core/Configurations/RuntimeConfigProvider.cs @@ -210,6 +210,15 @@ public async Task Initialize( _configLoader.RuntimeConfig = HandleCosmosNoSqlConfiguration(schema, runtimeConfig, runtimeConfig.DataSource.ConnectionString); } + // Hosted / late-config (V2) parses with telemetry injection skipped; embed it into every + // data source's connection string so hosted connection pools carry the usage snapshot. 
+ _configLoader.RuntimeConfig = EmbedTelemetryInDataSourceConnectionStrings(_configLoader.RuntimeConfig, skipDataSourceName: null); + + // Flush the telemetry Debug log(s) buffered during embedding. The startup-time flush has + // already run by the time this late-config path executes, so without flushing here the + // buffered telemetry logs would never be emitted. + _configLoader.FlushLogBuffer(); + ManagedIdentityAccessToken[_configLoader.RuntimeConfig.DefaultDataSourceName] = accessToken; } @@ -293,17 +302,65 @@ public async Task Initialize( _configLoader.RuntimeConfig = runtimeConfig.DataSource.DatabaseType switch { DatabaseType.CosmosDB_NoSQL => HandleCosmosNoSqlConfiguration(graphQLSchema, runtimeConfig, connectionString), - _ => runtimeConfig with { DataSource = runtimeConfig.DataSource with { ConnectionString = connectionString } } + // Embed anonymous usage telemetry into the hosted / late-config connection string's + // Application Name (honoring the opt-out switch and the DAB_APP_NAME_ENV host label). + // Hosted deployments take this path, so it is exactly where the dab_hosted label matters. + _ => runtimeConfig with { DataSource = runtimeConfig.DataSource with { ConnectionString = RuntimeConfigLoader.GetConnectionStringWithApplicationName(connectionString, runtimeConfig, runtimeConfig.DataSource) } } }; ManagedIdentityAccessToken[_configLoader.RuntimeConfig.DefaultDataSourceName] = accessToken; _configLoader.RuntimeConfig.UpdateDataSourceNameToDataSource(_configLoader.RuntimeConfig.DefaultDataSourceName, _configLoader.RuntimeConfig.DataSource!); + // The default data source was supplemented with the separately-supplied connection string + // above. Embed telemetry into any additional (child / multi-database) data sources too, so + // every hosted connection pool carries the usage snapshot. 
+ _configLoader.RuntimeConfig = EmbedTelemetryInDataSourceConnectionStrings(_configLoader.RuntimeConfig, skipDataSourceName: _configLoader.RuntimeConfig.DefaultDataSourceName); + + // Flush the telemetry Debug log(s) buffered during embedding. The startup-time flush has + // already run by the time this late-config path executes, so without flushing here the + // buffered telemetry logs would never be emitted. + _configLoader.FlushLogBuffer(); + return await InvokeConfigLoadedHandlersAsync(); } return false; } + /// <summary> + /// Embeds anonymous usage telemetry into the Application Name of each data source's + /// connection string. Hosted / late-config initialization parses with env-var replacement disabled, + /// so the loader's telemetry injection is skipped; this re-applies it so every hosted connection + /// pool — single- or multi-database — carries the usage snapshot. Engines without telemetry support + /// (e.g. MySQL, Cosmos) are left unchanged by the underlying dispatcher. + /// </summary> + /// <param name="config">The runtime config whose data-source connection strings are updated.</param> + /// <param name="skipDataSourceName">An optional data source to skip (e.g. the default, already + /// supplemented with a separately-supplied connection string); pass null to process all.</param> + /// <returns>The config with telemetry embedded (and its default data source kept in sync).</returns> + private static RuntimeConfig EmbedTelemetryInDataSourceConnectionStrings(RuntimeConfig config, string? 
skipDataSourceName) + { + foreach ((string dataSourceName, DataSource dataSource) in config.GetDataSourceNamesToDataSourcesIterator().ToList()) + { + if (skipDataSourceName is not null && string.Equals(dataSourceName, skipDataSourceName, StringComparison.OrdinalIgnoreCase)) + { + continue; + } + + DataSource updatedDataSource = dataSource with + { + ConnectionString = RuntimeConfigLoader.GetConnectionStringWithApplicationName(dataSource.ConnectionString, config, dataSource) + }; + config.UpdateDataSourceNameToDataSource(dataSourceName, updatedDataSource); + + if (string.Equals(dataSourceName, config.DefaultDataSourceName, StringComparison.OrdinalIgnoreCase)) + { + config = config with { DataSource = updatedDataSource }; + } + } + + return config; + } + /// /// Runtimeconfig is hot-reloadable when the configuration is not in production mode and not late configured. /// diff --git a/src/Product/ProductInfo.cs b/src/Product/ProductInfo.cs index f42d7802c0..ca14eab115 100644 --- a/src/Product/ProductInfo.cs +++ b/src/Product/ProductInfo.cs @@ -10,7 +10,13 @@ public static class ProductInfo { public const string DAB_APP_NAME_ENV = "DAB_APP_NAME_ENV"; public const string COSMOSDB_DATABASE_NAME = "COSMOSDB_DATABASE_NAME"; - public static readonly string DAB_USER_AGENT = $"dab_oss_{GetProductVersion()}"; + + /// + /// Prefix that identifies a DAB open-source user agent / telemetry block. Kept as a separate + /// constant so consumers (e.g. Application Name telemetry decoding) can locate the block. 
+ /// + public const string DAB_USER_AGENT_MARKER = "dab_oss_"; + public static readonly string DAB_USER_AGENT = $"{DAB_USER_AGENT_MARKER}{GetProductVersion()}"; public static readonly string CLOUD_ROLE_NAME = "DataApiBuilder"; /// diff --git a/src/Service.Tests/Configuration/ConfigurationTests.cs b/src/Service.Tests/Configuration/ConfigurationTests.cs index 71ef6ed6cf..9e2c200240 100644 --- a/src/Service.Tests/Configuration/ConfigurationTests.cs +++ b/src/Service.Tests/Configuration/ConfigurationTests.cs @@ -21,6 +21,7 @@ using Azure.DataApiBuilder.Auth; using Azure.DataApiBuilder.Config; using Azure.DataApiBuilder.Config.ObjectModel; +using Azure.DataApiBuilder.Config.Telemetry; using Azure.DataApiBuilder.Core; using Azure.DataApiBuilder.Core.AuthenticationHelpers; using Azure.DataApiBuilder.Core.Authorization; @@ -903,6 +904,12 @@ public void MsSqlConnStringSupplementedWithAppNameProperty( string expectedDabModifiedConnString, bool dabEnvOverride) { + string originalOptOut = Environment.GetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR); + string originalAppName = Environment.GetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV); + + // Ensure telemetry is enabled (not opted out) so the Application Name carries the dab_oss payload. + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, null); + // Explicitly set the DAB_APP_NAME_ENV to null to ensure that the DAB_APP_NAME_ENV is not set. if (dabEnvOverride) { @@ -913,27 +920,39 @@ public void MsSqlConnStringSupplementedWithAppNameProperty( Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, null); } - // Resolve assembly version. Not possible to do in DataRow as DataRows expect compile-time constants. - string resolvedAssemblyVersion = ProductInfo.GetDataApiBuilderUserAgent(); - expectedDabModifiedConnString += resolvedAssemblyVersion; + try + { + // The DAB-owned portion of the Application Name is always the dab_oss_ telemetry block. 
+ // When DAB_APP_NAME_ENV is set, its value is preserved as a comma prefix. The encoded telemetry + // payload then follows, so we assert the Application Name prefix and that the payload terminates with '+'. + string expectedAppNamePrefix = expectedDabModifiedConnString + + (dabEnvOverride ? $"dab_hosted,{ProductInfo.DAB_USER_AGENT}" : ProductInfo.DAB_USER_AGENT); - RuntimeConfig runtimeConfig = CreateBasicRuntimeConfigWithNoEntity(DatabaseType.MSSQL, configProvidedConnString); + RuntimeConfig runtimeConfig = CreateBasicRuntimeConfigWithNoEntity(DatabaseType.MSSQL, configProvidedConnString); - // Act - bool configParsed = RuntimeConfigLoader.TryParseConfig( - json: runtimeConfig.ToJson(), - config: out RuntimeConfig updatedRuntimeConfig, - replacementSettings: new(doReplaceEnvVar: true)); + // Act + bool configParsed = RuntimeConfigLoader.TryParseConfig( + json: runtimeConfig.ToJson(), + config: out RuntimeConfig updatedRuntimeConfig, + replacementSettings: new(doReplaceEnvVar: true)); - // Assert - Assert.AreEqual( - expected: true, - actual: configParsed, - message: "Runtime config unexpectedly failed parsing."); - Assert.AreEqual( - expected: expectedDabModifiedConnString, - actual: updatedRuntimeConfig.DataSource.ConnectionString, - message: "DAB did not properly set the 'Application Name' connection string property."); + // Assert + Assert.AreEqual( + expected: true, + actual: configParsed, + message: "Runtime config unexpectedly failed parsing."); + Assert.IsTrue( + updatedRuntimeConfig.DataSource.ConnectionString.StartsWith(expectedAppNamePrefix, StringComparison.Ordinal), + $"Expected connection string to start with '{expectedAppNamePrefix}' but was '{updatedRuntimeConfig.DataSource.ConnectionString}'."); + Assert.IsTrue( + updatedRuntimeConfig.DataSource.ConnectionString.EndsWith("+", StringComparison.Ordinal), + $"Expected telemetry payload to terminate with '+' but connection string was '{updatedRuntimeConfig.DataSource.ConnectionString}'."); + } + 
finally + { + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, originalOptOut); + Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, originalAppName); + } } /// @@ -956,6 +975,12 @@ public void PgSqlConnStringSupplementedWithAppNameProperty( string expectedDabModifiedConnString, bool dabEnvOverride) { + string originalOptOut = Environment.GetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR); + string originalAppName = Environment.GetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV); + + // Ensure telemetry is enabled (not opted out) so the Application Name carries the dab_oss payload. + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, null); + // Explicitly set the DAB_APP_NAME_ENV to null to ensure that the DAB_APP_NAME_ENV is not set. if (dabEnvOverride) { @@ -966,27 +991,269 @@ public void PgSqlConnStringSupplementedWithAppNameProperty( Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, null); } - // Resolve assembly version. Not possible to do in DataRow as DataRows expect compile-time constants. - string resolvedAssemblyVersion = ProductInfo.GetDataApiBuilderUserAgent(); - expectedDabModifiedConnString += resolvedAssemblyVersion; + try + { + // The DAB-owned portion of the Application Name is always the dab_oss_ telemetry block. + // When DAB_APP_NAME_ENV is set, its value is preserved as a comma prefix. The encoded telemetry + // payload then follows, so we assert the Application Name prefix and that the payload terminates with '+'. + string expectedAppNamePrefix = expectedDabModifiedConnString + + (dabEnvOverride ? 
$"dab_hosted,{ProductInfo.DAB_USER_AGENT}" : ProductInfo.DAB_USER_AGENT); - RuntimeConfig runtimeConfig = CreateBasicRuntimeConfigWithNoEntity(DatabaseType.PostgreSQL, configProvidedConnString); + RuntimeConfig runtimeConfig = CreateBasicRuntimeConfigWithNoEntity(DatabaseType.PostgreSQL, configProvidedConnString); - // Act - bool configParsed = RuntimeConfigLoader.TryParseConfig( + // Act + bool configParsed = RuntimeConfigLoader.TryParseConfig( + json: runtimeConfig.ToJson(), + config: out RuntimeConfig updatedRuntimeConfig, + replacementSettings: new(doReplaceEnvVar: true)); + + // Assert + Assert.AreEqual( + expected: true, + actual: configParsed, + message: "Runtime config unexpectedly failed parsing."); + Assert.IsTrue( + updatedRuntimeConfig.DataSource.ConnectionString.StartsWith(expectedAppNamePrefix, StringComparison.Ordinal), + $"Expected connection string to start with '{expectedAppNamePrefix}' but was '{updatedRuntimeConfig.DataSource.ConnectionString}'."); + Assert.IsTrue( + updatedRuntimeConfig.DataSource.ConnectionString.EndsWith("+", StringComparison.Ordinal), + $"Expected telemetry payload to terminate with '+' but connection string was '{updatedRuntimeConfig.DataSource.ConnectionString}'."); + } + finally + { + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, originalOptOut); + Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, originalAppName); + } + } + + /// + /// Validates that DWSQL data sources also receive the telemetry-bearing Application Name. + /// DWSQL uses the SqlClient connection-string builder (like MSSQL) and supports Application Name, + /// so the dab_oss telemetry block (with Source encoded as 'D') is embedded. 
+ /// + [TestMethod] + public void DwSqlConnStringSupplementedWithAppNameProperty() + { + string originalOptOut = Environment.GetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR); + string originalAppName = Environment.GetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV); + + // Ensure telemetry is enabled (not opted out) and no host label is set. + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, null); + Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, null); + + try + { + RuntimeConfig runtimeConfig = CreateBasicRuntimeConfigWithNoEntity(DatabaseType.DWSQL, "Data Source=<>;"); + + bool configParsed = RuntimeConfigLoader.TryParseConfig( + json: runtimeConfig.ToJson(), + config: out RuntimeConfig updatedRuntimeConfig, + replacementSettings: new(doReplaceEnvVar: true)); + + Assert.IsTrue(configParsed, "Runtime config unexpectedly failed parsing."); + + string connectionString = updatedRuntimeConfig.DataSource.ConnectionString; + Assert.IsTrue( + connectionString.StartsWith("Data Source=<>;Application Name=" + ProductInfo.DAB_USER_AGENT, StringComparison.Ordinal), + $"Expected DWSQL Application Name to carry the telemetry block but was '{connectionString}'."); + Assert.IsTrue( + connectionString.EndsWith("+", StringComparison.Ordinal), + $"Expected DWSQL telemetry payload to terminate with '+' but was '{connectionString}'."); + + // The encoded Source for a DWSQL pool must decode as 'D'. 
+ IReadOnlyList<string> decoded = ApplicationNameTelemetry.Decode(connectionString); + Assert.IsTrue( + decoded.Any(line => line.Contains("Source: D (DWSQL)")), + string.Join(Environment.NewLine, decoded)); + } + finally + { + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, originalOptOut); + Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, originalAppName); + } + } + + /// + /// Validates that when telemetry is opted out via DAB_TELEMETRY_APPNAME_OPT_OUT=1, the connection + /// string Application Name carries only the version marker (dab_oss_<version>) with no payload. + /// + [TestMethod] + public void ConnStringAppNameOmitsPayloadWhenOptedOut() + { + string originalAppName = Environment.GetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV); + string originalOptOut = Environment.GetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR); + + Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, null); + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, "1"); + + try + { + RuntimeConfig runtimeConfig = CreateBasicRuntimeConfigWithNoEntity(DatabaseType.MSSQL, "Data Source=<>;"); + + bool configParsed = RuntimeConfigLoader.TryParseConfig( + json: runtimeConfig.ToJson(), + config: out RuntimeConfig updatedRuntimeConfig, + replacementSettings: new(doReplaceEnvVar: true)); + + Assert.IsTrue(configParsed, "Runtime config unexpectedly failed parsing."); + Assert.AreEqual( + "Data Source=<>;Application Name=" + ProductInfo.DAB_USER_AGENT, + updatedRuntimeConfig.DataSource.ConnectionString, + "Opted-out Application Name should be version-only with no telemetry payload."); + } + finally + { + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, originalOptOut); + Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, originalAppName); + } + } + + /// + /// Validates that the hosted / late-configured path (POST /configuration, which supplies the + /// connection 
string separately with doReplaceEnvVar:false) still embeds anonymous usage + /// telemetry — including the DAB_APP_NAME_ENV host label — into the connection string's + /// Application Name. This is the deployment shape where the 'dab_hosted' label is most valuable. + /// + [TestMethod] + public async Task HostedLateConfigConnStringSupplementedWithTelemetry() + { + string originalOptOut = Environment.GetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR); + string originalAppName = Environment.GetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV); + + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, null); + Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, "dab_hosted"); + + try + { + RuntimeConfig runtimeConfig = CreateBasicRuntimeConfigWithNoEntity(DatabaseType.MSSQL, "Server=placeholder;"); + FileSystemRuntimeConfigLoader loader = new(new MockFileSystem()); + RuntimeConfigProvider provider = new(loader); + + // Mirror the hosted /configuration path: connection string supplied separately, env-var + // replacement disabled. Telemetry must still be embedded. 
+ bool initialized = await provider.Initialize( + runtimeConfig.ToJson(), + graphQLSchema: null, + connectionString: "Server=hosted-sql;Database=hosteddb;", + accessToken: null, + replacementSettings: new(azureKeyVaultOptions: null, doReplaceEnvVar: false, doReplaceAkvVar: false)); + + Assert.IsTrue(initialized, "Hosted late-config initialization should succeed."); + + string connectionString = provider.GetConfig().DataSource.ConnectionString; + Assert.IsTrue( + connectionString.Contains("Application Name=dab_hosted," + ProductInfo.DAB_USER_AGENT, StringComparison.Ordinal), + $"Hosted connection string should carry the dab_hosted label and dab_oss marker but was '{connectionString}'."); + Assert.IsTrue( + connectionString.EndsWith("+", StringComparison.Ordinal), + $"Hosted connection string should carry the telemetry payload but was '{connectionString}'."); + } + finally + { + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, originalOptOut); + Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, originalAppName); + } + } + + /// + /// Validates that an explicit connection-string override is applied to the default data source + /// even when env-var replacement is disabled (DoReplaceEnvVar == false). Telemetry, which is + /// gated on DoReplaceEnvVar, is not embedded in that case, but the override must still take effect. 
+ /// + [TestMethod] + public void ConnStringOverrideAppliedWhenEnvVarReplacementDisabled() + { + RuntimeConfig runtimeConfig = CreateBasicRuntimeConfigWithNoEntity(DatabaseType.MSSQL, "Server=in-config;"); + + bool parsed = RuntimeConfigLoader.TryParseConfig( json: runtimeConfig.ToJson(), config: out RuntimeConfig updatedRuntimeConfig, - replacementSettings: new(doReplaceEnvVar: true)); + parseError: out _, + replacementSettings: new(doReplaceEnvVar: false), + connectionString: "Server=override-server;Database=overridedb;"); - // Assert - Assert.AreEqual( - expected: true, - actual: configParsed, - message: "Runtime config unexpectedly failed parsing."); + Assert.IsTrue(parsed, "Runtime config unexpectedly failed parsing."); Assert.AreEqual( - expected: expectedDabModifiedConnString, - actual: updatedRuntimeConfig.DataSource.ConnectionString, - message: "DAB did not properly set the 'Application Name' connection string property."); + "Server=override-server;Database=overridedb;", + updatedRuntimeConfig.DataSource.ConnectionString, + "The explicit connection-string override should be applied even when env-var replacement (and telemetry) is disabled."); + } + + /// + /// Multi-database hosted scenario: the late-config path supplements the default data source with + /// the separately-supplied connection string, and must also embed telemetry into child data + /// sources (from data-source-files) so every hosted connection pool carries the usage snapshot. 
+ /// + [TestMethod] + public async Task HostedLateConfigMultiDbChildConnStringSupplementedWithTelemetry() + { + string originalOptOut = Environment.GetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR); + string originalAppName = Environment.GetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV); + + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, null); + Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, "dab_hosted"); + + // The RuntimeConfig constructor loads child data-source-files from a real FileSystem. + string childFilePath = System.IO.Path.Combine(System.IO.Path.GetTempPath(), System.IO.Path.GetRandomFileName() + ".json"); + + try + { + string childConfig = @"{ + ""$schema"": ""https://github.com/Azure/data-api-builder/releases/download/vmajor.minor.patch/dab.draft.schema.json"", + ""data-source"": { ""database-type"": ""mssql"", ""connection-string"": ""Server=child-sql;Database=childdb;TrustServerCertificate=True;"" }, + ""entities"": { ""ChildEntity"": { ""source"": ""dbo.ChildTable"", ""permissions"": [{ ""role"": ""anonymous"", ""actions"": [""read""] }] } } + }"; + await File.WriteAllTextAsync(childFilePath, childConfig); + + string rootConfig = $@"{{ + ""$schema"": ""https://github.com/Azure/data-api-builder/releases/download/vmajor.minor.patch/dab.draft.schema.json"", + ""data-source"": {{ ""database-type"": ""mssql"", ""connection-string"": ""Server=placeholder;"" }}, + ""data-source-files"": [""{childFilePath.Replace("\\", "\\\\")}""], + ""runtime"": {{ ""rest"": {{ ""enabled"": true }} }}, + ""entities"": {{ ""RootEntity"": {{ ""source"": ""dbo.RootTable"", ""permissions"": [{{ ""role"": ""anonymous"", ""actions"": [""read""] }}] }} }} + }}"; + + FileSystemRuntimeConfigLoader loader = new(new MockFileSystem()); + RuntimeConfigProvider provider = new(loader); + + bool initialized = await provider.Initialize( + rootConfig, + graphQLSchema: null, + connectionString: 
"Server=hosted-default;Database=defaultdb;", + accessToken: null, + replacementSettings: new(azureKeyVaultOptions: null, doReplaceEnvVar: false, doReplaceAkvVar: false)); + + Assert.IsTrue(initialized, "Hosted multi-database late-config initialization should succeed."); + + RuntimeConfig loaded = provider.GetConfig(); + string expectedAppName = "Application Name=dab_hosted," + ProductInfo.DAB_USER_AGENT; + + // Default data source: supplemented with the supplied connection string + telemetry. + Assert.IsTrue( + loaded.DataSource.ConnectionString.Contains(expectedAppName, StringComparison.Ordinal) + && loaded.DataSource.ConnectionString.EndsWith("+", StringComparison.Ordinal), + $"Default data source should carry telemetry but was '{loaded.DataSource.ConnectionString}'."); + + // Child data source: its own server, also supplemented with telemetry. + DataSource childDataSource = loaded.GetDataSourceFromDataSourceName(loaded.GetDataSourceNameFromEntityName("ChildEntity")); + Assert.IsTrue( + childDataSource.ConnectionString.Contains(expectedAppName, StringComparison.Ordinal) + && childDataSource.ConnectionString.EndsWith("+", StringComparison.Ordinal), + $"Child data source should carry telemetry but was '{childDataSource.ConnectionString}'."); + Assert.IsTrue( + childDataSource.ConnectionString.Contains("child-sql", StringComparison.Ordinal), + $"Child data source should retain its own server but was '{childDataSource.ConnectionString}'."); + } + finally + { + Environment.SetEnvironmentVariable(ApplicationNameTelemetry.OPT_OUT_ENV_VAR, originalOptOut); + Environment.SetEnvironmentVariable(ProductInfo.DAB_APP_NAME_ENV, originalAppName); + + if (File.Exists(childFilePath)) + { + File.Delete(childFilePath); + } + } } /// diff --git a/src/Service.Tests/Configuration/RuntimeConfigLoaderTests.cs b/src/Service.Tests/Configuration/RuntimeConfigLoaderTests.cs index ae30698592..c94f523b3d 100644 --- a/src/Service.Tests/Configuration/RuntimeConfigLoaderTests.cs +++ 
b/src/Service.Tests/Configuration/RuntimeConfigLoaderTests.cs @@ -3,6 +3,7 @@ using System; using System.Collections.Generic; +using System.Data.Common; using System.IO; using System.IO.Abstractions; using System.IO.Abstractions.TestingHelpers; @@ -11,6 +12,7 @@ using Azure.DataApiBuilder.Config; using Azure.DataApiBuilder.Config.Converters; using Azure.DataApiBuilder.Config.ObjectModel; +using Azure.DataApiBuilder.Product; using Microsoft.Extensions.Logging; using Microsoft.VisualStudio.TestTools.UnitTesting; using Newtonsoft.Json.Linq; @@ -531,6 +533,253 @@ public async Task ChildConfigLoadFailureHaltsParentConfigLoading() } } + /// + /// In a multi-database setup, the connection-string "Application Name" (anonymous usage telemetry) + /// for a child data source must reflect the GLOBAL runtime settings and the merged entity set — not + /// the child config's own (absent) runtime. Child configs defer Application Name injection to the + /// top-level load, which performs it once over the fully-merged config so every connection pool + /// carries a self-contained snapshot of the deployment. + /// Regression test for https://github.com/Azure/data-api-builder/issues/3216 + /// + [TestMethod] + public async Task MultiDbChildDataSourceConnectionStringEncodesGlobalTelemetry() + { + // Root config: defines the GLOBAL runtime (REST on, GraphQL off, StaticWebApps auth) plus its + // own default MSSQL data source. GraphQL-off / StaticWebApps make the runtime section distinctive. 
+ string parentConfig = @"{ + ""$schema"": ""https://github.com/Azure/data-api-builder/releases/download/vmajor.minor.patch/dab.draft.schema.json"", + ""data-source"": { + ""database-type"": ""mssql"", + ""connection-string"": ""Server=tcp:127.0.0.1,1433;Database=ParentDb;TrustServerCertificate=True;"" + }, + ""runtime"": { + ""rest"": { ""enabled"": true }, + ""graphql"": { ""enabled"": false }, + ""host"": { + ""cors"": { ""origins"": [] }, + ""authentication"": { ""provider"": ""StaticWebApps"" } + } + }, + ""entities"": { + ""ParentEntity"": { + ""source"": ""dbo.ParentTable"", + ""permissions"": [{ ""role"": ""anonymous"", ""actions"": [""read""] }] + } + } + }"; + + // Child config: a second MSSQL data source with NO runtime section of its own. Before the fix, + // its telemetry was computed from this partial config and the runtime section was all-missing. + string childConfig = @"{ + ""$schema"": ""https://github.com/Azure/data-api-builder/releases/download/vmajor.minor.patch/dab.draft.schema.json"", + ""data-source"": { + ""database-type"": ""mssql"", + ""connection-string"": ""Server=tcp:127.0.0.1,1433;Database=ChildDb;TrustServerCertificate=True;"" + }, + ""entities"": { + ""ChildEntity"": { + ""source"": ""dbo.ChildTable"", + ""permissions"": [{ ""role"": ""anonymous"", ""actions"": [""read""] }] + } + } + }"; + + // The RuntimeConfig constructor loads child data-source-files from a real FileSystem, so the + // child config must live in a real temp file. 
+ string childFilePath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".json"); + try + { + await File.WriteAllTextAsync(childFilePath, childConfig); + + JObject parentJson = JObject.Parse(parentConfig); + parentJson.Add("data-source-files", new JArray(childFilePath)); + + MockFileSystem fs = new(new Dictionary<string, MockFileData>() + { + { "dab-config.json", new MockFileData(parentJson.ToString()) } + }); + + FileSystemRuntimeConfigLoader loader = new(fs); + + DeserializationVariableReplacementSettings replacementSettings = new( + azureKeyVaultOptions: null, + doReplaceEnvVar: true, + doReplaceAkvVar: false, + envFailureMode: EnvironmentVariableReplacementFailureMode.Ignore); + + Assert.IsTrue( + loader.TryLoadConfig("dab-config.json", out RuntimeConfig runtimeConfig, replacementSettings: replacementSettings), + "Multi-database config should load successfully."); + + // Resolve both data sources from the merged config. + DataSource parentDataSource = runtimeConfig.GetDataSourceFromDataSourceName(runtimeConfig.GetDataSourceNameFromEntityName("ParentEntity")); + DataSource childDataSource = runtimeConfig.GetDataSourceFromDataSourceName(runtimeConfig.GetDataSourceNameFromEntityName("ChildEntity")); + + (_, string parentRuntime, string parentEntity) = GetTelemetrySections(parentDataSource.ConnectionString); + (_, string childRuntime, string childEntity) = GetTelemetrySections(childDataSource.ConnectionString); + + // Sanity: the root has a real runtime, so its encoded runtime section is meaningful, i.e. not + // entirely the 'M' (missing) sentinel. This guarantees the equality checks below are meaningful. + Assert.IsTrue( + parentRuntime.Any(flag => flag != 'M'), + $"Root runtime telemetry section should be meaningful, but was all-missing: '{parentRuntime}'."); + + // The fix: a child data source with no runtime of its own must encode the GLOBAL runtime, + // identical to the default data source's pool. 
+ Assert.AreEqual( + parentRuntime, + childRuntime, + "Child data source telemetry should encode the global runtime, identical to the default data source."); + + // Entities are global (merged), so every pool encodes the same entity section. + Assert.AreEqual( + parentEntity, + childEntity, + "Child data source telemetry should encode the merged (global) entity section, identical to the default data source."); + } + finally + { + if (File.Exists(childFilePath)) + { + File.Delete(childFilePath); + } + } + } + + /// + /// Heterogeneous multi-database setup: a PostgreSQL child data source must also embed usage telemetry + /// in its connection-string "Application Name", encoding the GLOBAL runtime + merged entities (identical + /// to the MSSQL default pool) while reporting its own Source character ('P'). Companion to the MSSQL + /// multi-DB test; guards the Postgres extension of the telemetry feature. + /// Regression test for https://github.com/Azure/data-api-builder/issues/3216 + /// + [TestMethod] + public async Task MultiDbPostgresChildDataSourceEncodesGlobalTelemetryWithPostgresSource() + { + // Root (default) MSSQL data source carrying the GLOBAL runtime. + string parentConfig = @"{ + ""$schema"": ""https://github.com/Azure/data-api-builder/releases/download/vmajor.minor.patch/dab.draft.schema.json"", + ""data-source"": { + ""database-type"": ""mssql"", + ""connection-string"": ""Server=tcp:127.0.0.1,1433;Database=ParentDb;TrustServerCertificate=True;"" + }, + ""runtime"": { + ""rest"": { ""enabled"": true }, + ""graphql"": { ""enabled"": false }, + ""host"": { + ""cors"": { ""origins"": [] }, + ""authentication"": { ""provider"": ""StaticWebApps"" } + } + }, + ""entities"": { + ""ParentEntity"": { + ""source"": ""dbo.ParentTable"", + ""permissions"": [{ ""role"": ""anonymous"", ""actions"": [""read""] }] + } + } + }"; + + // Child PostgreSQL data source with NO runtime section of its own. 
+ string childConfig = @"{ + ""$schema"": ""https://github.com/Azure/data-api-builder/releases/download/vmajor.minor.patch/dab.draft.schema.json"", + ""data-source"": { + ""database-type"": ""postgresql"", + ""connection-string"": ""Host=localhost;Database=ChildDb;Username=testuser;"" + }, + ""entities"": { + ""ChildEntity"": { + ""source"": ""public.ChildTable"", + ""permissions"": [{ ""role"": ""anonymous"", ""actions"": [""read""] }] + } + } + }"; + + // The RuntimeConfig constructor loads child data-source-files from a real FileSystem, so the + // child config must live in a real temp file. + string childFilePath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".json"); + try + { + await File.WriteAllTextAsync(childFilePath, childConfig); + + JObject parentJson = JObject.Parse(parentConfig); + parentJson.Add("data-source-files", new JArray(childFilePath)); + + MockFileSystem fs = new(new Dictionary<string, MockFileData>() + { + { "dab-config.json", new MockFileData(parentJson.ToString()) } + }); + + FileSystemRuntimeConfigLoader loader = new(fs); + + DeserializationVariableReplacementSettings replacementSettings = new( + azureKeyVaultOptions: null, + doReplaceEnvVar: true, + doReplaceAkvVar: false, + envFailureMode: EnvironmentVariableReplacementFailureMode.Ignore); + + Assert.IsTrue( + loader.TryLoadConfig("dab-config.json", out RuntimeConfig runtimeConfig, replacementSettings: replacementSettings), + "Heterogeneous multi-database config should load successfully."); + + DataSource parentDataSource = runtimeConfig.GetDataSourceFromDataSourceName(runtimeConfig.GetDataSourceNameFromEntityName("ParentEntity")); + DataSource childDataSource = runtimeConfig.GetDataSourceFromDataSourceName(runtimeConfig.GetDataSourceNameFromEntityName("ChildEntity")); + + (string parentContext, string parentRuntime, string parentEntity) = GetTelemetrySections(parentDataSource.ConnectionString); + (string childContext, string childRuntime, string childEntity) =
GetTelemetrySections(childDataSource.ConnectionString); + + // Context = [Protocol][Object][Source][Role]; only Source is known at pool time. + // The PostgreSQL pool encodes Source 'P'; the MSSQL pool encodes Source 'S'. + Assert.AreEqual('P', childContext[2], $"PostgreSQL data source should encode Source 'P'. Actual context: '{childContext}'."); + Assert.AreEqual('S', parentContext[2], $"MSSQL data source should encode Source 'S'. Actual context: '{parentContext}'."); + + // Runtime and entities are global, so both pools (regardless of engine) encode identical sections. + Assert.AreEqual( + parentRuntime, + childRuntime, + "PostgreSQL child should encode the global runtime, identical to the MSSQL default data source."); + Assert.AreEqual( + parentEntity, + childEntity, + "PostgreSQL child should encode the merged (global) entity section, identical to the MSSQL default data source."); + } + finally + { + if (File.Exists(childFilePath)) + { + File.Delete(childFilePath); + } + } + } + + /// + /// Extracts the three telemetry sections (context, runtime, entity) from the DAB usage-telemetry + /// payload embedded in a connection string's "Application Name" property. + /// Payload shape: [{env},]dab_oss_<version>+<context>|<runtime>|<entity>+ + /// + private static (string Context, string Runtime, string Entity) GetTelemetrySections(string connectionString) + { + // Use the engine-agnostic base builder so this works for both SQL Server and PostgreSQL connection strings. 
+ DbConnectionStringBuilder builder = new() { ConnectionString = connectionString }; + Assert.IsTrue( + builder.TryGetValue("Application Name", out object applicationNameValue), + $"Connection string '{connectionString}' should contain an Application Name."); + string applicationName = (string)applicationNameValue; + + Assert.IsTrue( + applicationName.Contains(ProductInfo.DAB_USER_AGENT_MARKER) && applicationName.EndsWith("+", StringComparison.Ordinal), + $"Application Name '{applicationName}' should carry a DAB telemetry payload ending with '+'."); + + // Drop the trailing delimiter, then take the region after the last '+' (the version itself can + // contain '+' build metadata, so anchoring on the last '+' before the sections is robust). + string sectionsRegion = applicationName.TrimEnd('+'); + sectionsRegion = sectionsRegion.Substring(sectionsRegion.LastIndexOf('+') + 1); + + string[] sections = sectionsRegion.Split('|'); + Assert.AreEqual(3, sections.Length, $"Telemetry payload in '{applicationName}' should have 3 sections, but was '{sectionsRegion}'."); + + return (sections[0], sections[1], sections[2]); + } + /// /// Tests that EnableAggregation returns true by default when runtime.graphql section is absent. /// This is a regression test for the bug where EnableAggregation returned false (disabled) @@ -616,6 +865,91 @@ public void EnableAggregation_WhenExplicitlySet_ReturnsConfiguredValue(bool expl $"EnableAggregation should be {explicitValue} when explicitly set to {explicitValue} in config."); } + /// + /// Embedding telemetry into a connection string that already carries the DAB telemetry marker must + /// be a no-op (idempotent), so a value can never accumulate a duplicated payload + /// (...+...+,dab_oss_...+...+) if the embed ever runs more than once (e.g. the loader's + /// post-processing followed by the late-config provider). 
+ /// + [TestMethod] + public void GetConnectionStringWithApplicationName_IsIdempotent() + { + DataSource dataSource = new(DatabaseType.MSSQL, "Server=localhost;Database=test;"); + RuntimeConfig config = new( + Schema: "s", + DataSource: dataSource, + Entities: new RuntimeEntities(new Dictionary<string, Entity>())); + + string once = RuntimeConfigLoader.GetConnectionStringWithApplicationName("Server=localhost;Database=test;", config, dataSource); + string twice = RuntimeConfigLoader.GetConnectionStringWithApplicationName(once, config, dataSource); + + Assert.AreEqual(once, twice, "Re-embedding telemetry should be a no-op (idempotent)."); + + int markerOccurrences = (twice.Length - twice.Replace(ProductInfo.DAB_USER_AGENT_MARKER, string.Empty).Length) / ProductInfo.DAB_USER_AGENT_MARKER.Length; + Assert.AreEqual(1, markerOccurrences, $"Telemetry marker should appear exactly once but was '{twice}'."); + } + + /// + /// Computing the telemetry Application Name buffers a Debug log into a shared static buffer that, + /// before the fix, was only drained once at startup, so the log was never emitted on hot reload and + /// the buffer accumulated an entry per data source on every reload. This validates that + /// (a) FlushLogBuffer is null-safe when no logger has been + /// set, and (b) emits the buffered telemetry log to a configured logger (so reloads drain the buffer). + /// Regression test for https://github.com/Azure/data-api-builder/issues/3216 + /// + [TestMethod] + public void FlushLogBuffer_IsNullSafe_AndEmitsBufferedTelemetryLog() + { + DataSource dataSource = new(DatabaseType.MSSQL, "Server=localhost;Database=test;"); + RuntimeConfig config = new( + Schema: "s", + DataSource: dataSource, + Entities: new RuntimeEntities(new Dictionary<string, Entity>())); + + // Computing the Application Name buffers a Debug telemetry log into the shared static buffer.
+ RuntimeConfigLoader.GetConnectionStringWithApplicationName("Server=localhost;Database=test;", config, dataSource); + + // (a) A loader with no logger set must not throw when flushing a non-empty buffer. Previously this + // threw a NullReferenceException because the buffer was flushed to a null logger. + FileSystemRuntimeConfigLoader loaderWithoutLogger = new(new MockFileSystem()); + loaderWithoutLogger.FlushLogBuffer(); + + // (b) With a logger set, a freshly buffered telemetry log is emitted rather than silently + // accumulating in the static buffer until the next startup. + RuntimeConfigLoader.GetConnectionStringWithApplicationName("Server=localhost;Database=test;", config, dataSource); + + CapturingLogger logger = new(); + FileSystemRuntimeConfigLoader loaderWithLogger = new(new MockFileSystem()); + loaderWithLogger.SetLogger(logger); + loaderWithLogger.FlushLogBuffer(); + + Assert.IsTrue( + logger.Messages.Any(m => m.Contains("DAB telemetry Application Name computed")), + "FlushLogBuffer should emit the buffered telemetry Debug log to the configured logger."); + } + + /// Minimal in-memory <see cref="ILogger"/> that records formatted messages for assertions. + private sealed class CapturingLogger : ILogger + { + public List<string> Messages { get; } = new(); + + public IDisposable BeginScope<TState>(TState state) => NullScope.Instance; + + public bool IsEnabled(LogLevel logLevel) => true; + + public void Log<TState>(LogLevel logLevel, EventId eventId, TState state, Exception exception, Func<TState, Exception, string> formatter) + => Messages.Add(formatter(state, exception)); + + private sealed class NullScope : IDisposable + { + public static readonly NullScope Instance = new(); + + public void Dispose() + { + } + } + } + /// /// Loads a from a JSON string using a mock file system.
/// diff --git a/src/Service.Tests/UnitTests/ApplicationNameTelemetryTests.cs b/src/Service.Tests/UnitTests/ApplicationNameTelemetryTests.cs new file mode 100644 index 0000000000..da5af83a70 --- /dev/null +++ b/src/Service.Tests/UnitTests/ApplicationNameTelemetryTests.cs @@ -0,0 +1,422 @@ +// Copyright (c) Microsoft Corporation. +// Licensed under the MIT License. + +using System; +using System.Collections.Generic; +using System.Linq; +using Azure.DataApiBuilder.Config.ObjectModel; +using Azure.DataApiBuilder.Config.ObjectModel.Embeddings; +using Azure.DataApiBuilder.Config.Telemetry; +using Azure.DataApiBuilder.Product; +using Microsoft.VisualStudio.TestTools.UnitTesting; +using static Azure.DataApiBuilder.Service.Tests.GraphQLBuilder.Helpers.GraphQLTestHelpers; + +namespace Azure.DataApiBuilder.Service.Tests.UnitTests +{ + /// + /// Unit tests for <see cref="ApplicationNameTelemetry"/> – the encoder/decoder that embeds DAB + /// telemetry into the SQL Server Application Name. + /// + [TestClass] + public class ApplicationNameTelemetryTests + { + private const string OPT_OUT_VAR = ApplicationNameTelemetry.OPT_OUT_ENV_VAR; + private const string APP_NAME_VAR = ProductInfo.DAB_APP_NAME_ENV; + + /// Ensure a clean environment for the env-sensitive tests. + [TestInitialize] + public void ClearEnvironment() + { + Environment.SetEnvironmentVariable(OPT_OUT_VAR, null); + Environment.SetEnvironmentVariable(APP_NAME_VAR, null); + } + + [TestCleanup] + public void ResetEnvironment() + { + Environment.SetEnvironmentVariable(OPT_OUT_VAR, null); + Environment.SetEnvironmentVariable(APP_NAME_VAR, null); + } + + /// + /// The encoded string must start with the product user agent, be wrapped in '+', and contain + /// exactly three '|'-separated sections of fixed widths (context=4, runtime=20, entity=14).
+ /// + [TestMethod] + public void EncodeTelemetryString_HasExpectedShape() + { + string telemetry = ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(), Source(DatabaseType.MSSQL)); + + Assert.IsTrue(telemetry.StartsWith(ProductInfo.DAB_USER_AGENT + "+", StringComparison.Ordinal), telemetry); + Assert.IsTrue(telemetry.EndsWith("+", StringComparison.Ordinal), telemetry); + + (string context, string runtime, string entity) = Sections(telemetry); + Assert.AreEqual(4, context.Length, "context width"); + Assert.AreEqual(20, runtime.Length, "runtime width"); + Assert.AreEqual(14, entity.Length, "entity width"); + } + + [DataTestMethod] + [DataRow(DatabaseType.MSSQL, 'S')] + [DataRow(DatabaseType.DWSQL, 'D')] + [DataRow(DatabaseType.PostgreSQL, 'P')] + [DataRow(DatabaseType.MySQL, 'M')] + [DataRow(DatabaseType.CosmosDB_NoSQL, 'C')] + public void EncodeTelemetryString_EncodesSource(DatabaseType dbType, char expectedSource) + { + string telemetry = ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(), Source(dbType)); + (string context, _, _) = Sections(telemetry); + + // Context = [Protocol][Object][Source][Role]; only Source is known at pool time. + Assert.AreEqual("XX", context[..2], "Protocol/Object are placeholders"); + Assert.AreEqual(expectedSource, context[2], "Source"); + Assert.AreEqual('X', context[3], "Role is a placeholder"); + } + + [TestMethod] + public void EncodeTelemetryString_NoLiveSource_EmitsAllPlaceholders() + { + string telemetry = ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(), liveDataSource: null); + (string context, _, _) = Sections(telemetry); + Assert.AreEqual("XXXX", context); + } + + [TestMethod] + public void EncodeTelemetryString_RuntimeFlags_MissingSectionsEncodeAsM() + { + // No Runtime section at all -> all runtime "enabled"-style flags are 'M'. 
+ string telemetry = ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(runtime: null), Source(DatabaseType.MSSQL)); + (_, string runtime, _) = Sections(telemetry); + + Assert.AreEqual('M', runtime[0], "rest.enabled missing"); + Assert.AreEqual('M', runtime[1], "graphql.enabled missing"); + Assert.AreEqual('M', runtime[2], "mcp.enabled missing"); + Assert.AreEqual('M', runtime[3], "host.mode missing"); + Assert.AreEqual('M', runtime[17], "auth.provider missing"); + } + + [TestMethod] + public void EncodeTelemetryString_RuntimeFlags_ReflectConfiguredValues() + { + RuntimeOptions runtime = new( + Rest: new RestRuntimeOptions(Enabled: false, RequestBodyStrict: false), + GraphQL: new GraphQLRuntimeOptions(Enabled: true, MultipleMutationOptions: new MultipleMutationOptions(new MultipleCreateOptions(true))), + Mcp: new McpRuntimeOptions(Enabled: false), + Host: new HostOptions(Cors: null, Authentication: new AuthenticationOptions("StaticWebApps"), Mode: HostMode.Production), + Telemetry: new TelemetryOptions( + ApplicationInsights: new ApplicationInsightsOptions(Enabled: true), + OpenTelemetry: new OpenTelemetryOptions(Enabled: false), + AzureLogAnalytics: new AzureLogAnalyticsOptions(enabled: true), + File: new FileSinkOptions(enabled: true)), + Cache: new RuntimeCacheOptions(Enabled: true) { Level2 = new RuntimeCacheLevel2Options(Enabled: true) }, + Health: new RuntimeHealthCheckConfig(enabled: false), + Embeddings: new EmbeddingsOptions( + Provider: EmbeddingProviderType.AzureOpenAI, + BaseUrl: "https://example.openai.azure.com", + ApiKey: "test-key", + Enabled: true, + Endpoint: new EmbeddingsEndpointOptions { Enabled = false })); + + string telemetry = ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(runtime: runtime), Source(DatabaseType.MSSQL)); + (_, string r, _) = Sections(telemetry); + + Assert.AreEqual('0', r[0], "rest.enabled=false"); + Assert.AreEqual('1', r[1], "graphql.enabled=true"); + Assert.AreEqual('0', r[2], "mcp.enabled=false"); + 
Assert.AreEqual('1', r[3], "host.mode=Production"); + Assert.AreEqual('0', r[6], "health.enabled=false"); + Assert.AreEqual('1', r[7], "cache.enabled=true"); + Assert.AreEqual('1', r[8], "cache.l2=true"); + Assert.AreEqual('0', r[11], "rest.request-body-strict=false"); + Assert.AreEqual('1', r[12], "graphql.multiple-mutations.create.enabled=true"); + Assert.AreEqual('0', r[13], "open-telemetry.enabled=false"); + Assert.AreEqual('1', r[14], "application-insights.enabled=true"); + Assert.AreEqual('1', r[15], "azure-log-analytics.enabled=true"); + Assert.AreEqual('1', r[16], "file-sink.enabled=true"); + Assert.AreEqual('W', r[17], "auth.provider=StaticWebApps"); + Assert.AreEqual('1', r[18], "embedding.enabled=true"); + Assert.AreEqual('0', r[19], "embedding.endpoint.enabled=false"); + } + + [TestMethod] + public void EncodeTelemetryString_HostMode_EncodesDevAndProd() + { + RuntimeOptions dev = new(Rest: null, GraphQL: null, Mcp: null, Host: new HostOptions(null, null, HostMode.Development)); + RuntimeOptions prod = new(Rest: null, GraphQL: null, Mcp: null, Host: new HostOptions(null, null, HostMode.Production)); + + Assert.AreEqual('0', Sections(ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(runtime: dev), Source(DatabaseType.MSSQL))).runtime[3]); + Assert.AreEqual('1', Sections(ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(runtime: prod), Source(DatabaseType.MSSQL))).runtime[3]); + } + + [TestMethod] + public void EncodeTelemetryString_Obo_EncodesPresence() + { + DataSource oboSource = new(DatabaseType.MSSQL, "Server=localhost;Database=test;") + { + UserDelegatedAuth = new UserDelegatedAuthOptions(Enabled: true) + }; + RuntimeConfig config = new(Schema: "t", DataSource: oboSource, Entities: new(new Dictionary<string, Entity>())); + + (_, string runtime, _) = Sections(ApplicationNameTelemetry.EncodeTelemetryString(config, oboSource)); + Assert.AreEqual('1', runtime[9], "data-source.obo=true"); + } + + [DataTestMethod] + [DataRow("Unauthenticated", 'U')] +
[DataRow("Simulator", 'S')] + [DataRow("StaticWebApps", 'W')] + [DataRow("AppService", 'A')] + [DataRow("AzureAD", 'E')] + [DataRow("EntraID", 'E')] + [DataRow("SomeCustomJwtProvider", 'C')] + public void EncodeTelemetryString_EncodesAuthProvider(string provider, char expected) + { + RuntimeOptions runtime = new( + Rest: null, GraphQL: null, Mcp: null, + Host: new HostOptions(Cors: null, Authentication: new AuthenticationOptions(provider))); + + (_, string r, _) = Sections(ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(runtime: runtime), Source(DatabaseType.MSSQL))); + Assert.AreEqual(expected, r[17], $"auth.provider letter for '{provider}'"); + } + + [TestMethod] + public void EncodeTelemetryString_AuthProvider_MissingWhenNoAuthentication() + { + RuntimeOptions runtime = new(Rest: null, GraphQL: null, Mcp: null, Host: new HostOptions(Cors: null, Authentication: null)); + (_, string r, _) = Sections(ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(runtime: runtime), Source(DatabaseType.MSSQL))); + Assert.AreEqual('M', r[17], "auth.provider is M when no authentication is configured"); + } + + [TestMethod] + public void EncodeTelemetryString_EntityFlags_MissingWhenNoEntities() + { + string telemetry = ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(), Source(DatabaseType.MSSQL)); + (_, _, string entity) = Sections(telemetry); + + // Tri-state flags are 'M' (no entities); the two unmodeled concepts are '?'. 
+ Assert.AreEqual('M', entity[0], "any table"); + Assert.AreEqual('?', entity[3], "MCP persisted documents (not modeled)"); + Assert.AreEqual('?', entity[13], "parameter embed (not modeled)"); + } + + [TestMethod] + public void EncodeTelemetryString_EntityFlags_ReflectEntities() + { + Dictionary<string, Entity> entities = new() + { + ["Tbl"] = GenerateEmptyEntity(EntitySourceType.Table), + ["Vw"] = GenerateEmptyEntity(EntitySourceType.View), + ["Described"] = GenerateEmptyEntity() with { Description = "a description" }, + ["Cached"] = GenerateEmptyEntity() with { Cache = new EntityCacheOptions(Enabled: true) }, + ["CustomRole"] = GenerateEmptyEntity() with { Permissions = new[] { new EntityPermission("manager", Array.Empty<EntityAction>()) } }, + ["Policy"] = GenerateEmptyEntity() with + { + Permissions = new[] + { + new EntityPermission("anonymous", new[] + { + new EntityAction(EntityActionOperation.Read, null, new EntityActionPolicy(Database: "@item.id eq 1")), + }), + }, + }, + ["Related"] = GenerateEmptyEntity() with + { + Relationships = new Dictionary<string, EntityRelationship> + { + ["rel"] = new(Cardinality.One, "Tbl", Array.Empty<string>(), Array.Empty<string>(), null, Array.Empty<string>(), Array.Empty<string>()), + }, + }, + }; + + string telemetry = ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(entities: entities), Source(DatabaseType.MSSQL)); + (_, _, string e) = Sections(telemetry); + + Assert.AreEqual('1', e[0], "any table"); + Assert.AreEqual('1', e[1], "any view"); + Assert.AreEqual('0', e[2], "any stored procedure"); + Assert.AreEqual('1', e[5], "any rest.enabled"); + Assert.AreEqual('1', e[6], "any graphql.enabled"); + Assert.AreEqual('1', e[9], "any custom roles"); + Assert.AreEqual('1', e[10], "any policies"); + Assert.AreEqual('1', e[11], "any descriptions"); + Assert.AreEqual('1', e[12], "any relationships"); + Assert.AreEqual('1', e[4], "any cache"); + } + + [TestMethod] + public void EncodeTelemetryString_EntityFlags_ReflectMcpToolUsage() + { + // One entity opts into MCP DML tools only, another opts into the MCP
custom tool only, + // so the "any entity uses ..." flags for both must be set. + Dictionary<string, Entity> entities = new() + { + ["Dml"] = GenerateEmptyEntity() with { Mcp = new EntityMcpOptions(customToolEnabled: false, dmlToolsEnabled: true) }, + ["Custom"] = GenerateEmptyEntity() with { Mcp = new EntityMcpOptions(customToolEnabled: true, dmlToolsEnabled: false) }, + }; + + string telemetry = ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(entities: entities), Source(DatabaseType.MSSQL)); + (_, _, string e) = Sections(telemetry); + + Assert.AreEqual('1', e[7], "any mcp dml-tools"); + Assert.AreEqual('1', e[8], "any mcp custom-tool"); + } + + // ----- Opt-out + DAB_APP_NAME_ENV ----------------------------------------------------- + + [TestMethod] + public void BuildApplicationNameSegment_OptedIn_ContainsPayload() + { + string segment = ApplicationNameTelemetry.BuildApplicationNameSegment(BuildConfig(), Source(DatabaseType.MSSQL)); + Assert.IsTrue(segment.StartsWith(ProductInfo.DAB_USER_AGENT + "+", StringComparison.Ordinal), segment); + Assert.IsTrue(segment.EndsWith("+", StringComparison.Ordinal), segment); + } + + [TestMethod] + public void BuildApplicationNameSegment_OptedOut_OmitsPayload() + { + Environment.SetEnvironmentVariable(OPT_OUT_VAR, "1"); + string segment = ApplicationNameTelemetry.BuildApplicationNameSegment(BuildConfig(), Source(DatabaseType.MSSQL)); + Assert.AreEqual(ProductInfo.DAB_USER_AGENT, segment); + } + + [DataTestMethod] + [DataRow("0")] + [DataRow("")] + [DataRow("true")] + [DataRow("yes")] + public void BuildApplicationNameSegment_InvalidOptOutValue_KeepsTelemetry(string optOutValue) + { + Environment.SetEnvironmentVariable(OPT_OUT_VAR, optOutValue); + string segment = ApplicationNameTelemetry.BuildApplicationNameSegment(BuildConfig(), Source(DatabaseType.MSSQL)); + Assert.IsTrue(segment.EndsWith("+", StringComparison.Ordinal), $"telemetry should remain on for '{optOutValue}': {segment}"); + } + + [TestMethod] + public void
BuildApplicationNameSegment_AppNameEnv_RidesAsPrefixWithoutSuppressingTelemetry() + { + Environment.SetEnvironmentVariable(APP_NAME_VAR, "dab_hosted"); + string segment = ApplicationNameTelemetry.BuildApplicationNameSegment(BuildConfig(), Source(DatabaseType.MSSQL)); + + Assert.IsTrue(segment.StartsWith("dab_hosted," + ProductInfo.DAB_USER_AGENT + "+", StringComparison.Ordinal), segment); + Assert.IsTrue(segment.EndsWith("+", StringComparison.Ordinal), segment); + } + + [TestMethod] + public void BuildApplicationNameSegment_AppNameEnvWithOptOut_PrefixWithoutPayload() + { + Environment.SetEnvironmentVariable(APP_NAME_VAR, "dab_hosted"); + Environment.SetEnvironmentVariable(OPT_OUT_VAR, "1"); + string segment = ApplicationNameTelemetry.BuildApplicationNameSegment(BuildConfig(), Source(DatabaseType.MSSQL)); + Assert.AreEqual("dab_hosted," + ProductInfo.DAB_USER_AGENT, segment); + } + + // ----- Decode ------------------------------------------------------------------------- + + [TestMethod] + public void Decode_RoundTrips_ProducesReadableLines() + { + string telemetry = ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(), Source(DatabaseType.MSSQL)); + IReadOnlyList<string> lines = ApplicationNameTelemetry.Decode(telemetry); + + Assert.IsTrue(lines.Any(l => l.StartsWith("Version: " + ProductInfo.DAB_USER_AGENT, StringComparison.Ordinal)), "version line"); + Assert.IsTrue(lines.Any(l => l.Contains("Source: S (SQL)")), "decoded source"); + Assert.IsTrue(lines.Any(l => l.Contains("runtime.rest.enabled")), "decoded runtime setting"); + Assert.IsTrue(lines.Any(l => l.Contains("entities.any.table")), "decoded entity setting"); + } + + [TestMethod] + public void Decode_IgnoresUserPrefixAndOboHash() + { + string telemetry = ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(), Source(DatabaseType.MSSQL)); + string fullAppName = $"abc123hash==|MyCustomApp,{telemetry}"; + + IReadOnlyList<string> lines = ApplicationNameTelemetry.Decode(fullAppName); + Assert.IsTrue(lines.Any(l
=> l.StartsWith("Version: " + ProductInfo.DAB_USER_AGENT, StringComparison.Ordinal)), string.Join('\n', lines)); + } + + [TestMethod] + public void Decode_TruncatedPayload_DoesNotThrowAndDecodesPartial() + { + string telemetry = ApplicationNameTelemetry.EncodeTelemetryString(BuildConfig(), Source(DatabaseType.MSSQL)); + // Simulate SQL Server truncation by cutting the string mid-payload. + string truncated = telemetry[..(telemetry.Length - 10)]; + + IReadOnlyList<string> lines = ApplicationNameTelemetry.Decode(truncated); + Assert.IsTrue(lines.Count > 1, "should decode the portion that survived truncation"); + Assert.IsTrue(lines.Any(l => l.StartsWith("Version:", StringComparison.Ordinal))); + } + + [TestMethod] + public void Decode_NoMarker_ReturnsFriendlyMessage() + { + IReadOnlyList<string> lines = ApplicationNameTelemetry.Decode("SomeUnrelatedApplicationName"); + Assert.AreEqual(1, lines.Count); + StringAssert.Contains(lines[0], "No DAB telemetry found"); + } + + [TestMethod] + public void Decode_VersionOnly_ReportsNoPayload() + { + IReadOnlyList<string> lines = ApplicationNameTelemetry.Decode("dab_oss_1.2.3"); + Assert.IsTrue(lines.Any(l => l.Contains("Version: dab_oss_1.2.3"))); + Assert.IsTrue(lines.Any(l => l.Contains("none")), "should report no payload"); + } + + [TestMethod] + public void Decode_NullOrEmpty_ReturnsFriendlyMessage() + { + Assert.AreEqual(1, ApplicationNameTelemetry.Decode(null).Count); + Assert.AreEqual(1, ApplicationNameTelemetry.Decode(" ").Count); + } + + /// + /// Per-pool consistency: the OBO flag must reflect the LIVE data source being encoded, not the + /// config's default data source. This matters in multi-database setups where data sources differ. + /// + [TestMethod] + public void EncodeTelemetryString_Obo_ReflectsLiveDataSourceNotDefault() + { + // Default data source has OBO OFF. + DataSource defaultSource = new(DatabaseType.MSSQL, "Server=localhost;Database=default;"); + // A different (live) data source has OBO ON.
+ DataSource liveOboSource = new(DatabaseType.PostgreSQL, "Host=localhost;Database=live;Username=u;") + { + UserDelegatedAuth = new UserDelegatedAuthOptions(Enabled: true) + }; + RuntimeConfig config = new(Schema: "t", DataSource: defaultSource, Entities: new(new Dictionary<string, Entity>())); + + // Encoding for the live OBO-enabled pool reports obo=1 even though the default is off. + (_, string liveRuntime, _) = Sections(ApplicationNameTelemetry.EncodeTelemetryString(config, liveOboSource)); + Assert.AreEqual('1', liveRuntime[9], "obo must reflect the live data source"); + + // Encoding with no live source falls back to the default data source (obo off). + (_, string defaultRuntime, _) = Sections(ApplicationNameTelemetry.EncodeTelemetryString(config, liveDataSource: null)); + Assert.AreEqual('0', defaultRuntime[9], "obo falls back to the default data source when no live source"); + } + + // ----- Helpers ------------------------------------------------------------------------ + + /// Builds a live data source of the given type for the per-pool encoder inputs. + private static DataSource Source(DatabaseType type) => + new(type, "Server=localhost;Database=test;", Options: null); + + private static RuntimeConfig BuildConfig( + RuntimeOptions runtime = null, + Dictionary<string, Entity> entities = null) + { + return new RuntimeConfig( + Schema: "test-schema", + DataSource: new DataSource(DatabaseType.MSSQL, "Server=localhost;Database=test;", Options: null), + Entities: new RuntimeEntities(entities ??
new Dictionary<string, Entity>()), + Runtime: runtime); + } + + private static (string context, string runtime, string entity) Sections(string telemetry) + { + int firstPlus = telemetry.IndexOf('+'); + string payload = telemetry[(firstPlus + 1)..].TrimEnd('+'); + string[] parts = payload.Split('|'); + return (parts[0], parts[1], parts[2]); + } + } +} diff --git a/src/Service.Tests/UnitTests/LogBufferTests.cs b/src/Service.Tests/UnitTests/LogBufferTests.cs new file mode 100644 index 0000000000..994e1ee088 --- /dev/null +++ b/src/Service.Tests/UnitTests/LogBufferTests.cs @@ -0,0 +1,90 @@ +// Copyright (c) Microsoft Corporation. +// Licensed under the MIT License. + +using System; +using System.Collections.Generic; +using Azure.DataApiBuilder.Config; +using Microsoft.Extensions.Logging; +using Microsoft.VisualStudio.TestTools.UnitTesting; + +namespace Azure.DataApiBuilder.Service.Tests.UnitTests +{ + [TestClass] + public class LogBufferTests + { + /// + /// The buffer is bounded so it cannot grow without limit if it is never drained (e.g. a loader + /// with no logger in a hot-reload loop). Beyond the cap, the oldest entries are dropped while the + /// most recent are retained.
+ /// + [TestMethod] + public void BufferLog_IsBounded_DropsOldestBeyondCap() + { + LogBuffer buffer = new(); + int overflow = LogBuffer.MAX_BUFFERED_ENTRIES + 50; + + for (int i = 0; i < overflow; i++) + { + buffer.BufferLog(LogLevel.Debug, $"entry-{i}"); + } + + CapturingLogger logger = new(); + buffer.FlushToLogger(logger); + + Assert.AreEqual( + LogBuffer.MAX_BUFFERED_ENTRIES, + logger.Messages.Count, + "Buffer should be capped at MAX_BUFFERED_ENTRIES regardless of how many entries were buffered."); + Assert.IsTrue( + logger.Messages.Contains($"entry-{overflow - 1}"), + "The most recent entry should be retained."); + Assert.IsFalse( + logger.Messages.Contains("entry-0"), + "The oldest entries should be dropped once the cap is exceeded."); + } + + /// + /// Within the cap, all buffered entries are flushed in order and the buffer is drained (a second + /// flush emits nothing). + /// + [TestMethod] + public void FlushToLogger_EmitsAllBufferedEntries_AndDrains() + { + LogBuffer buffer = new(); + buffer.BufferLog(LogLevel.Information, "first"); + buffer.BufferLog(LogLevel.Warning, "second"); + + CapturingLogger logger = new(); + buffer.FlushToLogger(logger); + + CollectionAssert.AreEqual(new[] { "first", "second" }, logger.Messages.ToArray()); + + // The queue is drained on flush, so a second flush emits nothing more. + CapturingLogger secondLogger = new(); + buffer.FlushToLogger(secondLogger); + Assert.AreEqual(0, secondLogger.Messages.Count, "A drained buffer should emit nothing on a subsequent flush."); + } + + /// Minimal in-memory that records formatted messages for assertions. 
+ private sealed class CapturingLogger : ILogger + { + public List<string> Messages { get; } = new(); + + public IDisposable BeginScope<TState>(TState state) => NullScope.Instance; + + public bool IsEnabled(LogLevel logLevel) => true; + + public void Log<TState>(LogLevel logLevel, EventId eventId, TState state, Exception exception, Func<TState, Exception, string> formatter) + => Messages.Add(formatter(state, exception)); + + private sealed class NullScope : IDisposable + { + public static readonly NullScope Instance = new(); + + public void Dispose() + { + } + } + } + } +} diff --git a/src/Service.Tests/UnitTests/RuntimeConfigLoaderJsonDeserializerTests.cs b/src/Service.Tests/UnitTests/RuntimeConfigLoaderJsonDeserializerTests.cs index 3dfaf71b2e..15962c7dfb 100644 --- a/src/Service.Tests/UnitTests/RuntimeConfigLoaderJsonDeserializerTests.cs +++ b/src/Service.Tests/UnitTests/RuntimeConfigLoaderJsonDeserializerTests.cs @@ -958,7 +958,7 @@ public void TestEnvVariableResolvingToAkvPatternIsExpandedInSecondPass() Assert.IsTrue(parsed, "Config should parse successfully."); Assert.IsNotNull(config); - string expected = RuntimeConfigLoader.GetConnectionStringWithApplicationName(finalSecretValue); + string expected = RuntimeConfigLoader.GetMsSqlConnectionStringWithApplicationName(finalSecretValue); var builderExpected = new SqlConnectionStringBuilder(expected); var builderActual = new SqlConnectionStringBuilder(config.DataSource.ConnectionString); Assert.AreEqual(builderExpected["Data Source"], builderActual["Data Source"], "Data Source should match.");