A layered architecture for cross-resource joins and Dublin Core discovery in CSAPI #40
Sam-Bolling started this conversation in Ideas
Discovery, Enrichment, and Composition in OGC API – Connected Systems: A Layered Architecture for Cross-Resource Data Access
Introduction and Scope
OGC API – Connected Systems (CSAPI) organizes data into discrete, independently addressable resource types rather than a single unified data source. This design is efficient for operating on any one resource type but creates friction whenever a client requirement spans more than one, and that friction takes at least two distinct forms. The first is narrow and concrete: enriching a fact-type resource with descriptive detail from a related resource one or more hops away — a table of Observations enriched with metadata about the System that produced each one is a common example. The second is broad and structural: building a single, browsable discovery view across every resource type a deployment exposes — Systems, Procedures, Deployments, Datastreams, and beyond — modeled on Dublin Core–style catalog metadata. This discussion examines the use case and technical cause of each form of friction, the reason the two are related but architecturally distinct, and a layered architecture that resolves both without requiring any deviation from standards compliance.
This kind of friction, in which a client assembles one coherent view from more than one independently addressable resource, is a well-documented and widely expected characteristic of REST-style APIs generally. Industry guidance on REST API design names the trade-off directly, observing that web APIs which expose a large number of small resources are known as chatty, with the standard prescription being for clients to compose views by following the hyperlinks and identifiers each resource provides rather than expecting the server to pre-assemble them (Microsoft, n.d.). GitHub's REST API follows the same convention at production scale, instructing integrators not to parse URLs or predict resource structure and instead to follow the fields and link relations a response provides (GitHub, n.d.). Building an architecture to compose views across CSAPI's resource types is therefore not a bespoke workaround but the same ordinary integration work expected of any client consuming a well-designed, resource-oriented API.
Part I: Cross-Resource Joins Within CSAPI
The Use Case
CSAPI is not a single flat data source; it is organized around discrete resource types — Systems, Deployments, Procedures, Datastreams, Observations, and Commands — each with its own collection endpoint (Robin 2025a). Because of this organization, any client view that draws descriptive detail from a second resource type to enrich or contextualize records from a first constitutes a distinct integration requirement, and a CSAPI deployment of realistic scope will accumulate a number of such requirements over time: an Observation enriched with the System that produced it, a Command enriched with its target System, a Datastream enriched with the Procedure its System implements, or a System Event enriched with the System it describes. Each pairing is an instance of the same general use case type — a resource of one type displayed alongside descriptive attributes that live on a different, related resource type — rather than an isolated request.
The diagram above shows these pairings as edges of one underlying resource graph rather than as unrelated cases: Observation, Command, and System Event each reference a resource one or two hops away, and every such reference is a candidate for the same kind of enrichment requirement. Deployment is included as well, but with a dashed edge to mark a different shape of relationship: it lists the Systems deployed during a given period, a one-to-many, time-scoped lookup rather than the single-reference pattern the other edges share.
The Observation-plus-System pairing is used throughout this section as a concrete illustration of that general type, because it is common and because its structure is representative of the others. A discovery table of Observations in which each row also carries metadata about the producing System — name, type, or location, for instance — is a client-side view that spans two of these resource types at once: Observations, defined in Part 2, and the Systems that generated them, defined in Part 1. The technical cause of the friction it produces, examined below, generalizes directly to the other pairings named above.
The Technical Dilemma
The obstacle is that CSAPI does not model an Observation as a self-contained record. Observation resources are always associated with a parent Datastream, and identifying details that would otherwise repeat on every observation, including the link back to the observing System, are deliberately left off the Observation and provided only at the Datastream level (Robin 2025b). Datastreams, in turn, are homogeneous collections tied to exactly one System. Resolving which System produced a given Observation therefore requires a two-hop traversal (Observation → Datastream → System) rather than a single lookup, and the standard does not yet define a built-in way to request that join in one call: certain types of join queries may be defined in a future extension (Open Geospatial Consortium Connected Systems Standards Working Group, n.d.). Resolved naively, this becomes the classic N+1 problem: after one request for the table's rows, each row triggers its own round trips to resolve descriptive metadata. This is a scaling and latency issue common to any resource-oriented or microservice API, not specific to CSAPI (Richardson, n.d.).
The gap between what CSAPI hands back and what a discovery table needs is easiest to see directly. The following is an illustrative, representative shape — not reproduced from the specification — of what a server might return for a single Observation:
    {
      "id": "obs-48213",
      "datastream@id": "ds-1042",
      "phenomenonTime": "2026-07-04T14:30:00Z",
      "result": 21.4
    }
Nothing in this record names the System that produced the reading; a client has that only after a second lookup on ds-1042. The row a discovery table actually wants looks more like this composed, illustrative shape:
    {
      "observationId": "obs-48213",
      "phenomenonTime": "2026-07-04T14:30:00Z",
      "result": 21.4,
      "systemId": "sys-207",
      "systemName": "Buoy 12 — Temperature Sensor"
    }
Producing the second shape from the first is exactly the two-hop traversal described above, performed once per row unless the architecture in Part IV intervenes.
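The per-row cost of the naive approach can be made concrete with a minimal sketch that runs against in-memory stand-ins rather than a live server. The `fetch` helper, the dictionary contents, and field names such as `system@id` and `name` are assumptions made for illustration; they are not taken from the CSAPI specification.

```python
# Hypothetical in-memory stand-ins for CSAPI resources. All IDs and field
# names are illustrative assumptions, not taken from the specification.
DATASTREAMS = {"ds-1042": {"id": "ds-1042", "system@id": "sys-207"}}
SYSTEMS = {"sys-207": {"id": "sys-207", "name": "Buoy 12 Temperature Sensor"}}

calls = {"count": 0}


def fetch(store, resource_id):
    """Stand-in for one HTTP GET; counts each simulated round trip."""
    calls["count"] += 1
    return store[resource_id]


def enrich_naively(observations):
    """Resolve Observation -> Datastream -> System once per row."""
    rows = []
    for obs in observations:
        datastream = fetch(DATASTREAMS, obs["datastream@id"])  # hop 1
        system = fetch(SYSTEMS, datastream["system@id"])       # hop 2
        rows.append({
            "observationId": obs["id"],
            "phenomenonTime": obs["phenomenonTime"],
            "result": obs["result"],
            "systemId": system["id"],
            "systemName": system["name"],
        })
    return rows


observations = [
    {"id": f"obs-{n}", "datastream@id": "ds-1042",
     "phenomenonTime": "2026-07-04T14:30:00Z", "result": 21.4}
    for n in range(3)
]
rows = enrich_naively(observations)
print(calls["count"])  # two simulated lookups per row
```

Three rows sharing one Datastream still cost six simulated lookups here, which is the redundancy a shared lookup cache is meant to remove.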
The same structural cause applies to each of the other pairings named above, since it follows from how CSAPI separates resource types rather than from any property specific to Observations or Systems. A Command resource reaches its target System through a parent ControlStream, the same two-hop shape as Observation reaching System through a parent Datastream. A Datastream reaches the Procedure implemented by its System through the same kind of reference, one hop further out by way of the System resource. Any client view that spans two CSAPI resource types will therefore encounter the same multi-hop traversal and the same absence of a native join query, regardless of which two resource types are involved.
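Because every pairing shares this shape, one traversal routine driven by a small table of hops could, in principle, cover all of them. The sketch below assumes hypothetical reference field names (`datastream@id`, `controlstream@id`, `system@id`) and an assumed `lookup` callable; it illustrates the shared structure and is not a rendering of actual CSAPI property names.

```python
# Each entry maps a resource type to (reference field, parent type).
# The field names are hypothetical placeholders, not CSAPI property names.
HOPS = {
    "observation": ("datastream@id", "datastream"),
    "datastream": ("system@id", "system"),
    "command": ("controlstream@id", "controlstream"),
    "controlstream": ("system@id", "system"),
}


def resolve_system(resource_type, resource, lookup):
    """Follow parent references until a System is reached.

    `lookup(kind, resource_id)` is an assumed callable that returns the
    referenced resource; it stands in for whatever client does the request.
    """
    hops = 0
    while resource_type != "system":
        field, parent_type = HOPS[resource_type]
        resource = lookup(parent_type, resource[field])
        resource_type = parent_type
        hops += 1
    return resource, hops


# Illustrative data only.
STORE = {
    ("datastream", "ds-1042"): {"id": "ds-1042", "system@id": "sys-207"},
    ("controlstream", "cs-9"): {"id": "cs-9", "system@id": "sys-207"},
    ("system", "sys-207"): {"id": "sys-207", "name": "Buoy 12"},
}


def lookup(kind, resource_id):
    return STORE[(kind, resource_id)]


command = {"id": "cmd-1", "controlstream@id": "cs-9"}
observation = {"id": "obs-48213", "datastream@id": "ds-1042"}
cmd_system, cmd_hops = resolve_system("command", command, lookup)
obs_system, obs_hops = resolve_system("observation", observation, lookup)
print(cmd_system["id"], cmd_hops, obs_system["id"], obs_hops)
```

Both calls reach the same System in two hops, which mirrors the claim that the Command and Observation cases differ only in which intermediate resource they pass through.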
Part II: Discovery Across All Resource Types
From One Join to Many: The Dublin Core Use Case
The single-pair join described above is one instance of a broader class of need: a unified table for browsing all discoverable resources in a deployment — Systems, Procedures, Deployments, Datastreams — with a consistent set of descriptive columns (name, type, last updated, spatial or temporal coverage) regardless of which underlying resource type a given row represents. This is a different shape of problem from the Observation-plus-System case, and it maps to a different part of the OGC API family: OGC API – Records.
Records addresses the fact that CSAPI-style resource APIs are built to operate on data, not to browse across it. Its atomic building block, the Record, is defined as an atomic unit of information in a catalog that provides a description (i.e. metadata) about a resource that the provider of the resource wishes to make discoverable, built from a small set of properties common across all resource types and extensible per resource type as needed (Vretanos 2025). This common-core-plus-extension design reflects established lineage: Records is explicitly designed to be compatible but not conformant with the OGC Catalogue Service for the Web (CSW) Standard, and CSW's own metadata model is based on an extension of Dublin Core (Vretanos 2025). Records carries that same common-denominator-across-resource-types design forward into a modern, OpenAPI-based interface.
Deployment Patterns
The finalized Records standard (OGC 20-004r1, approved April 2025 and published May 2025) defines three top-level, independently implementable catalog patterns (Vretanos 2025): the Crawlable Catalog, the Searchable Catalog, and the Local Resources Catalog.
The Local Resources Catalog pattern is the most directly relevant of the three, since the mechanism it describes maps closely onto the single-endpoint, multi-resource-type discovery requirement. The pattern originated as a working-group proposal to allow cataloging of a deployment's own collections and processes directly, and its design rationale was documented in the standard's public review process. In that discussion, the standard's editor described building a single searchable catalog by crawling a deployment's entire resource tree and harvesting metadata from every resource type into one collection, then narrowing results with a type predicate: for example, retrieving only feature collections that satisfy a bounding-box and free-text search by combining a type=collection parameter with bbox and q parameters against one endpoint (Vretanos 2024).
A request of that shape is illustrative rather than reproduced from the source, but it shows concretely what "one endpoint, many resource types" means in practice. The stated rationale was that this is an easier way to locate local resources since you only need to hit a single endpoint (Vretanos 2024), rather than requiring a client to understand and separately query every resource type a deployment exposes. The Local Resources Catalog requirements class was carried through public comment into the final, approved standard alongside Searchable and Crawlable Catalog (Vretanos 2025), in which each top-level requirements class represents an implementable catalog composed of aggregations of the common components the specification defines.
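One way to picture such a request is to assemble its URL programmatically. The sketch below is an assumption-laden illustration: the host and catalog path are hypothetical placeholders, and only the parameter names type, bbox, and q come from the discussion summarized above.

```python
from urllib.parse import urlencode

# Hypothetical catalog endpoint; a real deployment defines its own URL.
CATALOG_URL = "https://example.org/catalog"


def build_catalog_query(resource_type, bbox, text):
    """Return one catalog URL combining a type predicate with bbox and q."""
    params = {
        "type": resource_type,
        "bbox": ",".join(str(value) for value in bbox),
        "q": text,
    }
    # urlencode percent-encodes the commas in the bbox value.
    return f"{CATALOG_URL}?{urlencode(params)}"


url = build_catalog_query("collection", (-10.0, 40.0, 5.0, 55.0), "temperature")
print(url)
```

Changing only the `resource_type` argument would redirect the same search to a different resource type, which is the practical meaning of a single endpoint serving many types.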
Records as a Metadata Layer Over Other OGC APIs
The Local Resources Catalog pattern is documented as extending resource collections defined by other OGC API standards, not only resources native to Records itself. OGC's developer documentation for the pattern notes that the /collections endpoint construct is shared across the OGC API family, including Features and Coverages, and describes the Local Resources Catalog pattern as one that extends the endpoints of other OGC APIs to behave like catalogs, naming processes and coverage scenes explicitly as additional endpoint types that can be repurposed this way (Open Geospatial Consortium, n.d.-a). Records here functions as a metadata and search layer laid over collections that are natively defined and served by separate OGC API standards, which is the same relationship a CSAPI deployment's Systems, Procedures, Deployments, and Datastreams would have to a Records-based catalog.
The diagram above generalizes the point beyond this essay's own use case: the same Local Resources Catalog pattern applied to a CSAPI deployment's Systems and Datastreams applies equally to a Features deployment's feature collections or a Coverages deployment's coverage scenes, all cataloged into one searchable index without duplicating the underlying data.
This design intent is corroborated by the standard's own relationship to OGC API – Features. OGC's implementation guidance describes Records as leveraging OGC API - Features as a baseline, sharing its endpoint and request/response structure for the Searchable and Local Resources Catalog patterns. The specific value Records adds is a common record schema and queryable set that allows for interoperability and integration across catalogs, so that resources originating from different, independently governed APIs can be described and searched in a single consistent shape (Open Geospatial Consortium, n.d.-b). Together, these sources indicate that cataloging heterogeneous resource types drawn from more than one OGC API standard under a single searchable schema is the specific problem the Local Resources Catalog pattern was designed to solve, rather than an incidental capability. Independent of this design evidence, the standard also carries a substantial adoption record, including use as part of the World Meteorological Organization's WIS2 baseline, where a working-group participant noted that 193 countries voted to approve the use of OGC API - Records (Kralidis 2024). That record indicates the pattern is deployed at scale rather than purely theoretical.
Part III: Recognizing the Underlying Pattern
Two Problems, Not One
A single mechanism is not well suited to both the single-pair join and the Dublin Core discovery table, because the two problems operate on different classes of data. This distinction has a long-established name in data architecture: the separation between facts and dimensions. In dimensional modeling terms, facts are very specific, well-defined numeric attributes recorded repeatedly and at high volume, while the surrounding descriptive context needed to interpret them — dimensions — is comparatively low in volume and changes slowly (Kimball 2003).
Mapped onto CSAPI, as shown above: Observations, Commands, and System Events are fact-like — high cardinality, high write frequency, appended continuously. Systems, Procedures, Deployments, and Datastreams are dimension-like — comparatively few in number, descriptive, and slow-changing. The Observation-plus-System use case is a fact row being enriched with a dimension attribute, which is a lookup problem. The Dublin Core discovery table is dimension-to-dimension browsing, in which Systems, Procedures, Deployments, and Datastreams are described uniformly so they can be found and navigated.
A discovery catalog should not index Observations or Commands individually; doing so would mean generating a catalog record per data point, which defeats the purpose of a catalog as a coarse, human-browsable index and would grow without bound. The boundary between what belongs in the catalog and what does not is therefore not an implementation shortcut but the architecturally correct line, and it falls exactly where the fact/dimension distinction predicts it should.
Why the Distinction Matters for Architecture
Because both problems draw on the same underlying dimension data — Systems, Procedures, Deployments, Datastreams — a single architecture can serve both from one synchronized source, even though the access patterns differ: the enrichment case requires fast point-lookups keyed by ID, while the discovery case requires full-text search, filtering, and sorting across many records at once. This shared dependency is the basis for the layered architecture below.
Part IV: A Layered Architecture
Client applications consume this architecture through three parallel modes, shown above: a discovery UI that queries the catalog directly, enrichment views that compose fact rows with dimension attributes through the composition layer, and an analytics UI that queries the escalation index for combined filtering and sorting. Each is described in turn below.
Layer 1 — CSAPI as the System of Record
Systems, Deployments, Procedures, Datastreams, Observations, and Commands remain exactly where CSAPI Parts 1 and 2 define them, exposed exactly as specified. Standards compliance is unaffected; every subsequent layer is additive.
Layer 2 — A Shared Dimension Registry
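A single indexing process, run on a schedule or driven by change events from CSAPI writes, reads the dimension-type resources (Systems, Procedures, Deployments, Datastreams) and produces two outputs from that one pass: a set of catalog records backing the standards-based discovery catalog described in Part II, and an ID-keyed lookup cache serving the fast point-lookups that enrichment requires.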
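A pass of this kind might be sketched as follows, on the assumption (drawn from the essay's conclusion) that the two outputs are a list of uniform catalog records and an ID-keyed lookup cache. The input keys `id`, `type`, `name`, and `updated` are hypothetical simplifications; real CSAPI payloads and Records schemas carry more structure.

```python
def index_dimensions(resources):
    """One pass over dimension-type resources, producing two outputs.

    `resources` is an iterable of dicts with the hypothetical keys
    `id`, `type`, `name`, and `updated`.
    """
    catalog_records = []  # output 1: uniform records for discovery
    lookup_cache = {}     # output 2: ID-keyed entries for point lookups
    for res in resources:
        catalog_records.append({
            "id": res["id"],
            "type": res["type"],
            "title": res["name"],
            "updated": res["updated"],
        })
        lookup_cache[res["id"]] = res
    return catalog_records, lookup_cache


# Illustrative data only.
dimensions = [
    {"id": "sys-207", "type": "system", "name": "Buoy 12",
     "updated": "2026-07-01"},
    {"id": "ds-1042", "type": "datastream", "name": "Water temperature",
     "updated": "2026-07-04", "system@id": "sys-207"},
]
catalog, cache = index_dimensions(dimensions)
print(len(catalog), sorted(cache))
```

Producing both outputs in the same loop is what keeps the discovery view and the enrichment cache from drifting apart, since neither can be refreshed without the other.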
Layer 3 — A Composition Layer for Enrichment
Between the UI and the two data sources (CSAPI's dynamic endpoints and the Layer 2 registry) sits a thin composition layer responsible for the actual join. Architecturally, this is the API Composition pattern, in which a query is implemented by invoking the services that own the data and performing an in-memory join of the results (Richardson, n.d.). Where this layer is maintained specifically for a given client experience, it also functions as a Backend for Frontend, in that it is tightly coupled to a specific user experience and owned by the team building that UI rather than a general-purpose backend team (Newman 2015). Whether implemented as a small set of REST endpoints, a GraphQL schema with resolver fields backed by the Layer 2 lookup interface, or an in-process join within the client, the underlying principle is constant: new UI requirements become new queries against existing infrastructure rather than new integration projects.
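The in-memory join at the heart of this layer can be sketched in a few lines. The example assumes an ID-keyed `lookup_cache` dictionary holding both Datastreams and Systems, and hypothetical field names such as `system@id`; it is one possible shape for the join, not a prescribed implementation.

```python
def compose_rows(observations, lookup_cache):
    """Join fact rows with dimension attributes already held in memory.

    `lookup_cache` is an assumed ID-keyed dict of Datastreams and Systems.
    No per-row remote call is made; unresolved references yield None.
    """
    rows = []
    for obs in observations:
        datastream = lookup_cache.get(obs["datastream@id"], {})
        system = lookup_cache.get(datastream.get("system@id"), {})
        rows.append({
            "observationId": obs["id"],
            "phenomenonTime": obs["phenomenonTime"],
            "result": obs["result"],
            "systemId": system.get("id"),
            "systemName": system.get("name"),
        })
    return rows


# Illustrative data only.
lookup_cache = {
    "ds-1042": {"id": "ds-1042", "system@id": "sys-207"},
    "sys-207": {"id": "sys-207", "name": "Buoy 12"},
}
observations = [
    {"id": "obs-48213", "datastream@id": "ds-1042",
     "phenomenonTime": "2026-07-04T14:30:00Z", "result": 21.4},
    {"id": "obs-48214", "datastream@id": "ds-unknown",
     "phenomenonTime": "2026-07-04T14:31:00Z", "result": 21.5},
]
rows = compose_rows(observations, lookup_cache)
print(rows[0]["systemName"], rows[1]["systemName"])
```

Returning `None` for an unresolved reference is a design choice made here so that a stale cache degrades a row instead of failing the whole table; a stricter layer might raise instead.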
This closes the loop back to the discovery table described at the start of this essay. The illustrative raw Observation from Part I is the input; the row that table actually displays is the output, produced by a single lookup against the Layer 2 cache rather than a direct call to CSAPI's System endpoint:
CSAPI returns                             Layer 3 produces
id: "obs-48213"                           observationId: "obs-48213"
phenomenonTime: "2026-07-04T14:30:00Z"    phenomenonTime: "2026-07-04T14:30:00Z"
result: 21.4                              result: 21.4
datastream@id: "ds-1042"                  systemId: "sys-207", systemName: "Buoy 12 — Temperature Sensor"
The left column is exactly what CSAPI returns; the right column is exactly what Layer 3 produces after resolving ds-1042 against the Layer 2 lookup cache. No field in the right column required a direct call to the System resource for this row.
Escalation Path
Lookup-and-enrich does not address every case. When a dimension attribute must act as a filter or sort key over high-volume fact data — for example, retrieving Observations from Systems located within a region, sorted by System name — enrichment after the fact cannot satisfy the query, since the constraint must be applied before the result set is assembled rather than after. This scenario requires denormalizing the relevant dimension attributes directly into whatever store answers the fact-level query — a search index or materialized view kept current from the same event stream feeding Layer 2 — which represents a heavier, CQRS-style investment (Richardson, n.d.). This investment is best triggered by a concrete requirement reaching this limit rather than built speculatively in advance.
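The difference from lookup-and-enrich can be illustrated with a small sketch in which dimension attributes are copied onto each fact row at index time, so the filter runs before the result set is assembled. A plain `region` label stands in for a real spatial predicate, and all field names and data are hypothetical.

```python
def build_escalation_index(observations, lookup_cache):
    """Copy the System attributes needed for filtering onto each fact row."""
    index = []
    for obs in observations:
        datastream = lookup_cache[obs["datastream@id"]]
        system = lookup_cache[datastream["system@id"]]
        index.append({
            **obs,
            "systemName": system["name"],
            "region": system["region"],
        })
    return index


def query(index, region):
    """Filter on a dimension attribute first, then sort by System name."""
    matches = [row for row in index if row["region"] == region]
    return sorted(matches, key=lambda row: row["systemName"])


# Illustrative data only.
lookup_cache = {
    "ds-1": {"system@id": "sys-b"},
    "ds-2": {"system@id": "sys-a"},
    "ds-3": {"system@id": "sys-c"},
    "sys-a": {"name": "Alpha Buoy", "region": "north"},
    "sys-b": {"name": "Bravo Buoy", "region": "north"},
    "sys-c": {"name": "Charlie Buoy", "region": "south"},
}
observations = [
    {"id": "obs-1", "datastream@id": "ds-1", "result": 20.1},
    {"id": "obs-2", "datastream@id": "ds-2", "result": 19.8},
    {"id": "obs-3", "datastream@id": "ds-3", "result": 22.0},
]
index = build_escalation_index(observations, lookup_cache)
north = query(index, "north")
print([row["id"] for row in north])
```

In a real deployment the list comprehension and `sorted` call would be replaced by the search index or materialized view's own query engine; the point of the sketch is only that the System attributes must already sit on the fact rows for that engine to use them.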
Conclusion
CSAPI's resource-oriented design deliberately keeps Observations lean and pushes descriptive detail up to Datastreams and Systems (Robin 2025b). OGC API – Records provides a uniform, standards-based discovery surface over those same descriptive, dimension-like resources (Vretanos 2025), with a specific, production-tested pattern — the Local Resources Catalog — for cataloging a deployment's own resources without introducing a parallel metadata system (Vretanos 2024; Kralidis 2024). Distinguishing between fact and dimension data (Kimball 2003), rather than treating each cross-resource requirement as an independent problem, allows a single architecture — a shared dimension registry exposed through both a standards-based catalog and a fast lookup interface, joined via a thin composition layer — to accommodate this class of requirements as it grows, while preserving a defined escalation path for the cases it cannot resolve through lookup alone.
References
Byron, Lee, and Nicholas Schrock. n.d. "DataLoader." GitHub repository, graphql/dataloader. Accessed July 3, 2026. https://github.com/graphql/dataloader.
GitHub. n.d. "Best Practices for Using the REST API." GitHub Docs. Accessed July 3, 2026. https://docs.github.com/en/rest/using-the-rest-api/best-practices-for-using-the-rest-api.
Kimball, Ralph. 2003. "Fact Tables and Dimension Tables." Kimball Group Design Tip, January 2003. https://www.kimballgroup.com/2003/01/fact-tables-and-dimension-tables/.
Kralidis, Tom. 2024. Comment on "Local Resource Catalogs and req. class organization." GitHub Issue #300, opengeospatial/ogcapi-records repository. Accessed July 3, 2026. opengeospatial/ogcapi-records#300.
Microsoft. n.d. "Web API Design Best Practices." Azure Architecture Center, Microsoft Learn. Accessed July 3, 2026. https://learn.microsoft.com/en-us/azure/architecture/best-practices/api-design.
Newman, Sam. 2015. "Pattern: Backends for Frontends." samnewman.io, November 18, 2015. https://samnewman.io/patterns/architectural/bff/.
Open Geospatial Consortium. n.d.-a. "Records Deployment Patterns." OGC API – Records Developer Guide. Accessed July 3, 2026. https://records.developer.ogc.org/patterns.html.
Open Geospatial Consortium. n.d.-b. "OGC API - Records — API Deep Dive." OGC API Workshop. Accessed July 3, 2026. https://ogcapi-workshop.ogc.org/api-deep-dive/records/.
Open Geospatial Consortium Connected Systems Standards Working Group. n.d. "ogcapi-connected-systems." GitHub repository. Accessed July 3, 2026. https://github.com/opengeospatial/ogcapi-connected-systems.
Richardson, Chris. n.d. "Pattern: API Composition." Microservices.io. Accessed July 3, 2026. https://microservices.io/patterns/data/api-composition.html.
Robin, Alexandre. 2025a. OGC API — Connected Systems — Part 1: Feature Resources. OGC 23-001, version 1.0. Open Geospatial Consortium. Published July 16, 2025. https://docs.ogc.org/is/23-001/23-001.html.
Robin, Alexandre. 2025b. OGC API — Connected Systems — Part 2: Dynamic Data. OGC 23-002, version 1.0. Open Geospatial Consortium. Published July 16, 2025. https://docs.ogc.org/is/23-002/23-002.html.
Vretanos, Panagiotis (Peter) A. 2024. Comment on "Local Resource Catalogs and req. class organization." GitHub Issue #300, opengeospatial/ogcapi-records repository. Accessed July 3, 2026. opengeospatial/ogcapi-records#300.
Vretanos, Panagiotis (Peter) A. 2025. OGC API — Records — Part 1: Core. OGC 20-004r1, version 1.0. Open Geospatial Consortium. Published May 2, 2025. https://docs.ogc.org/is/20-004r1/20-004r1.html.
Note on Citation Mechanics