I've been watching a quiet collision happen in our industry.
Everyone's talking about semantic layers, but not everyone means the same thing.
The analytics meaning: trust in structure
On the analytics side, "semantic layer" usually refers to something we've seen before: the business-friendly model that sits between raw data and BI tools. It is a governed way to define joins, metrics, and consistent logic.
Snowflake recently announced the Open Semantic Interchange.
"…introducing a common, vendor-neutral semantic model specification that standardizes how semantic metadata is defined and shared — ensuring consistent business logic across AI and business intelligence (BI) applications"
This version of semantics is about trust.
It makes sure that when someone (or something) asks, "What's our total revenue last month?", the answer is accurate, governed, and consistent.
The first generation of analytic semantic layers thrived in the early 2000s, when enterprise BI tools like BusinessObjects and Cognos dominated. They provided a governed model of joins and metrics that business users could trust. Over time, those systems became rigid and costly to maintain, so the industry moved toward self-service tools and direct SQL access.
The idea has returned, driven by the need for consistent, governed definitions across every touchpoint. Whether in AI models or Excel spreadsheets, stable definitions are essential. The modern versions are lighter, code-based, and far easier to maintain.
The dbt semantic layer is one of several signs of this resurgence.
Here is an example of a dbt semantic model and a derived metric:
semantic_models:
  - name: transaction # A semantic model with the name Transactions
    model: ref('fact_transactions') # References the dbt model named `fact_transactions`
    description: "Transaction fact table at the transaction level. This table contains one row per transaction and includes the transaction timestamp."
    defaults:
      agg_time_dimension: transaction_date
    entities: # Entities included in the table are defined here. MetricFlow will use these columns as join keys.
      - name: transaction
        type: primary
        expr: transaction_id
      - name: customer
        type: foreign
        expr: customer_id
    dimensions: # dimensions are qualitative values such as names, dates, or geographical data. They provide context to metrics and allow "metric by group" data slicing.
      - name: transaction_date
        type: time
        type_params:
          time_granularity: day
      - name: transaction_location
        type: categorical
        expr: order_country
    measures: # Measures are columns we perform an aggregation over. Measures are inputs to metrics.
      - name: transaction_total
        description: "The total value of the transaction."
        agg: sum
metrics:
  - name: order_gross_profit
    description: Gross profit from each order.
    type: derived
    label: Order gross profit
    type_params:
      expr: revenue - cost
      metrics:
        - name: order_total
          alias: revenue
        - name: order_cost
          alias: cost
The move to an open standard in this space is welcome, since each vendor currently has its own syntax.
A clearly documented set of metrics and joins makes structured data safer to query, and it lets AI-driven analytics work from governed definitions, which reduces the risk of hallucinations or surprises.
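To make the "define once, reuse everywhere" idea concrete, here is a minimal Python sketch of what the derived metric in the dbt example expresses. It is not how MetricFlow works internally (MetricFlow generates SQL); the sample rows, the `BASE_METRICS` registry, and the function names are my own assumptions for illustration.

```python
# Hypothetical order rows; the values are made up for illustration.
ORDERS = [
    {"order_id": 1, "order_total": 120.0, "order_cost": 70.0},
    {"order_id": 2, "order_total": 80.0, "order_cost": 50.0},
]

# One governed definition per base metric: the column it sums.
# (Assumed registry; a real semantic layer stores this in config.)
BASE_METRICS = {
    "order_total": "order_total",
    "order_cost": "order_cost",
}


def base_metric(name, rows):
    """Sum the column that the named base metric is defined on."""
    column = BASE_METRICS[name]
    return sum(row[column] for row in rows)


def order_gross_profit(rows):
    """Derived metric: revenue - cost, using the same aliases as the YAML."""
    revenue = base_metric("order_total", rows)
    cost = base_metric("order_cost", rows)
    return revenue - cost
```

Every consumer, whether a dashboard, a spreadsheet, or an AI agent, would call `order_gross_profit` rather than re-deriving the formula, which is where the consistency comes from.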
The ontology meaning: understanding through context
There's another camp using the same phrase, and they've been doing it far longer.
As Jessica Talisman, author of The Ontology Pipeline, reminded me recently in a LinkedIn comment:
"Semantic layer is a concept that dates back to Semantic Web days"
To them, a semantic layer isn't a metrics model, it's a knowledge model. It's about representing meaning, relationships, and context across systems and content.
This is where schema.org, controlled vocabularies, metadata standards, and ontologies come into play. These are the structures that let AI understand that customer, client, and member all point to the same concept.
This version of semantics is about understanding. It connects definitions and relationships, providing the context AI uses to make sense of information.
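The customer, client, and member example can be sketched as a tiny controlled vocabulary. This is only a rough illustration in Python; real ontologies carry far richer relationships than synonyms, and the `VOCABULARY` structure and `resolve_concept` helper are hypothetical names of my own.

```python
# Hypothetical controlled vocabulary: one canonical concept and the
# labels that people actually use for it.
VOCABULARY = {
    "customer": {"customer", "client", "member"},
}

# Invert the vocabulary into a lookup from any label to its concept.
LABEL_TO_CONCEPT = {
    label: concept
    for concept, labels in VOCABULARY.items()
    for label in labels
}


def resolve_concept(term):
    """Return the canonical concept for a term, or None if it is unknown."""
    return LABEL_TO_CONCEPT.get(term.strip().lower())
```

Returning `None` for an unknown term, rather than guessing, keeps the gap visible so someone can decide whether the vocabulary needs a new entry.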
I'm learning more about this space myself at the moment, and if you want to go deeper, I highly recommend Jessica Talisman's Substack, The Ontology Pipeline.
While digging into this, several people have pointed out to me that semantics alone isn't the full story. There's also pragmatics.
According to Wikipedia, pragmatics is the study of how context contributes to meaning.
For AI (and people too), that distinction matters. In the analytics sense, a semantic layer tells you which columns to sum and group by. In the ontology sense, a semantic layer defines what "revenue" or "customer" means. Pragmatics is what helps you understand what someone is really asking when they say, "Who are our most valuable customers right now?" It's about the intent, not just the definition.
Both views are useful
These meanings of semantic layer aren't competing, they're complementary. To get the most benefit, combine them:
- Structured semantics → Governed queries over structured data → Trust in numbers
- Ontological semantics → Contextual understanding of concepts → Understanding of meaning
Together, they create a foundation where AI can use your structured and unstructured data with knowledge of what the data means and how its concepts relate.
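Here is a rough Python sketch of how the two layers might work together: the ontology side maps the words people use to canonical concepts, and the analytics side maps those concepts to governed definitions. Both dictionaries and the `plan_query` function are assumptions for illustration; only `transaction_total` and `customer_id` come from the dbt example earlier.

```python
# Ontology side (hypothetical): labels people use, mapped to one concept.
CONCEPTS = {
    "customer": "customer",
    "client": "customer",
    "member": "customer",
    "revenue": "revenue",
    "sales": "revenue",
}

# Analytics side (hypothetical): concepts mapped to governed definitions.
GOVERNED = {
    "revenue": {"kind": "metric", "sql": "SUM(transaction_total)"},
    "customer": {"kind": "entity", "sql": "customer_id"},
}


def plan_query(terms):
    """Resolve free-text terms to governed SQL fragments.

    Terms with no governed definition are reported, not guessed.
    """
    metrics, group_by, unknown = [], [], []
    for term in terms:
        concept = CONCEPTS.get(term.lower())
        definition = GOVERNED.get(concept)
        if definition is None:
            unknown.append(term)
        elif definition["kind"] == "metric":
            metrics.append(definition["sql"])
        else:
            group_by.append(definition["sql"])
    return {"metrics": metrics, "group_by": group_by, "unknown": unknown}
```

A request phrased as "sales by member" and one phrased as "revenue by customer" would resolve to the same governed fragments, while a term like "churn" that has no definition is surfaced as unknown instead of being improvised.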
The next step for data teams
If your organisation already has a Knowledge or Content function, work with them! They've been managing meaning and taxonomies for years.
If not, this is where the Data team can step forward. We already know how to manage definitions, lineage, and quality. Managing meaning is the natural extension of that work.
So the next time someone mentions a "semantic layer" in a meeting, it's worth pausing to ask: which one do they mean? Because increasingly, the answer might be "both", and that's exactly where things get interesting.