Derive It Once, Use It Everywhere: Introducing the Derived Meaning Store

A practical approach for storing LLM-derived meaning so it becomes stable, governed and reusable across your analytics pipeline.

Imagine a world where you derive meaning from unstructured data once, and then use it reliably everywhere.

Think of customer tone, sentiment and urgency in support tickets.

Or images flagged for potential health and safety risks.

Or emails where an LLM detects frustration, confusion, intent to cancel or a possible escalation.

In this world, sentiment, tone, topics, intent and risk signals become governed data assets. You can join them, trend them and reuse them across analytics, BI, ML and operational workflows.

The LLM does the heavy lifting a single time. After that, the outputs live in the warehouse like any other data: stable, explainable and versioned.

But we don't live in that world yet.

We can now extract meaning from documents at a scale that wasn't possible until recently. Tools like Snowflake Document AI and Cortex functions make it straightforward to pull out sentiment, themes, tone, intent and risk indicators.

The challenge is not the extraction. The challenge is where that meaning should live.

Meaning behaves differently from facts. Meaning is not a score. Meaning changes when definitions and allowed values change. Meaning changes when models improve. Meaning can even shift when nothing else changes at all, because the same model can return a different answer for the same input.

Our existing data structures are not designed for this.

If we store meaning on a core dimension, we mix truth with interpretation. If we store it in a fact, we pretend an interpretation is an event. If we treat it like an ML feature, analysts will never find it. And if we call an LLM at query time, we invite drift, cost and inconsistency.

We are extracting meaning more than ever. We have not modernised how we store it.

A simple idea: give meaning its own governed home

Over the past week I've been exploring a broader pattern. The idea is simple:

Derive the meaning once. Store it somewhere stable. Use it everywhere.

This requires a structure that is:

  • separate from core truth
  • versioned
  • explainable
  • easy to join
  • safe for reuse across analytics
  • flexible across architectures

Meaning needs its own place. Not forced into dimensions, not hidden inside facts, not left floating in notebooks, and not recomputed endlessly inside dashboards.
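
As a rough sketch of "derive once, store, use everywhere", the snippet below keeps derived values in a store keyed by record and version, and only calls the model when no stored value exists. Everything here is illustrative: `MeaningStore`, `fake_sentiment` and the version labels are hypothetical names, the dictionary stands in for a governed warehouse table, and the keyword check stands in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Meaning:
    record_id: str
    sentiment: str
    model_version: str
    ontology_version: str


class MeaningStore:
    """In-memory stand-in for a governed table keyed by record and versions."""

    def __init__(self, derive: Callable[[str], str],
                 model_version: str, ontology_version: str) -> None:
        self._derive = derive
        self._model_version = model_version
        self._ontology_version = ontology_version
        self._rows: Dict[Tuple[str, str, str], Meaning] = {}
        self.derive_calls = 0  # counts how often the (expensive) model is called

    def get(self, record_id: str, text: str) -> Meaning:
        key = (record_id, self._model_version, self._ontology_version)
        if key not in self._rows:
            # Derive only when nothing is stored for this record and version pair.
            self.derive_calls += 1
            self._rows[key] = Meaning(
                record_id,
                self._derive(text),
                self._model_version,
                self._ontology_version,
            )
        return self._rows[key]


def fake_sentiment(text: str) -> str:
    # Hypothetical stand-in for an LLM call; not a real sentiment model.
    return "negative" if "cancel" in text.lower() else "neutral"


store = MeaningStore(fake_sentiment, model_version="model-v1", ontology_version="ontology-v1")
first = store.get("T-1", "I want to cancel my plan")
second = store.get("T-1", "I want to cancel my plan")  # served from the store
```

The second lookup returns the stored row rather than calling the model again, so every consumer sees the same interpretation for a given model and ontology version.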

A new pattern: the Derived Meaning Store

To describe this, I've started calling the broader concept the Derived Meaning Store.

The Derived Meaning Store is the part of your data architecture that stores derived interpretations (sentiment, tone, topics, intent, risk signals) as governed, versioned and reusable data.

Any time you use an LLM or a function that 'generates' a result, and you want to use that result again, it belongs in a Derived Meaning Store.

[Figure: The Derived Meaning Store process. Unstructured data flows through LLM extraction to create derived meaning, which is stored in a governed store and then used across analytics, BI and ML workflows.]

It's not a single model. It's a pattern that can appear in different places depending on your modelling preferences and pipeline:

  • as an enrichment table
  • as a Kimball mini-dimension
  • as a Data Vault satellite
  • as a dbt model feeding downstream marts
  • as a governed output from an LLM pipeline

Each of these is a valid expression of the same idea.

The Derived Meaning Store gives meaning a stable, joinable home across your analytics ecosystem.

One example: the Derived Mini-Dimension

One implementation of the Derived Meaning Store, within a dimensional model, is what I have been calling a Derived Mini-Dimension.

This is a small, SCD2-style structure (a Type 2 slowly changing dimension) that stores meaning extracted from text separately from core attributes. For example, for a support ticket:

dim_ticket_meaning
  ticket_meaning_sk
  ticket_id
  sentiment
  dominant_topic
  tone
  urgency
  risk_signal
  model_version
  ontology_version
  extracted_at
  valid_from
  valid_to
  is_current
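
To make the SCD2 behaviour concrete, here is a minimal sketch that builds this table and versions a ticket's meaning. It assumes SQLite as a stand-in for the warehouse; the column types, the `9999-12-31` open-ended date, the helper names `create_store` and `record_meaning`, and the sample values are all illustrative assumptions rather than part of the pattern.

```python
import sqlite3

OPEN_ENDED = "9999-12-31"  # assumed convention for "still valid"


def create_store(conn: sqlite3.Connection) -> None:
    conn.execute("""
        CREATE TABLE dim_ticket_meaning (
            ticket_meaning_sk INTEGER PRIMARY KEY AUTOINCREMENT,
            ticket_id         TEXT NOT NULL,
            sentiment         TEXT,
            dominant_topic    TEXT,
            tone              TEXT,
            urgency           TEXT,
            risk_signal       TEXT,
            model_version     TEXT NOT NULL,
            ontology_version  TEXT NOT NULL,
            extracted_at      TEXT NOT NULL,
            valid_from        TEXT NOT NULL,
            valid_to          TEXT NOT NULL,
            is_current        INTEGER NOT NULL
        )
    """)


def record_meaning(conn, ticket_id, meaning, model_version,
                   ontology_version, extracted_at):
    """Close the ticket's current row (if any), then insert the new current row."""
    conn.execute(
        "UPDATE dim_ticket_meaning SET valid_to = ?, is_current = 0 "
        "WHERE ticket_id = ? AND is_current = 1",
        (extracted_at, ticket_id),
    )
    conn.execute(
        "INSERT INTO dim_ticket_meaning (ticket_id, sentiment, dominant_topic, "
        "tone, urgency, risk_signal, model_version, ontology_version, "
        "extracted_at, valid_from, valid_to, is_current) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 1)",
        (ticket_id, meaning["sentiment"], meaning["dominant_topic"],
         meaning["tone"], meaning["urgency"], meaning["risk_signal"],
         model_version, ontology_version, extracted_at, extracted_at, OPEN_ENDED),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
create_store(conn)

# Same ticket interpreted by two (hypothetical) model versions.
v1 = {"sentiment": "neutral", "dominant_topic": "billing", "tone": "polite",
      "urgency": "low", "risk_signal": "none"}
v2 = {"sentiment": "negative", "dominant_topic": "billing", "tone": "frustrated",
      "urgency": "high", "risk_signal": "churn"}
record_meaning(conn, "T-1", v1, "model-v1", "ontology-v1", "2024-01-01")
record_meaning(conn, "T-1", v2, "model-v2", "ontology-v1", "2024-02-01")

history = conn.execute(
    "SELECT model_version, sentiment, valid_from, valid_to, is_current "
    "FROM dim_ticket_meaning WHERE ticket_id = 'T-1' ORDER BY ticket_meaning_sk"
).fetchall()
```

The earlier interpretation is never overwritten. It is closed off with a `valid_to` date, so both versions remain available for comparison.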

This lets analysts:

  • join meaning to facts safely
  • trend interpretations over time
  • compare meaning across different model versions
  • track changes as definitions evolve
  • avoid calling LLMs inside dashboards or downstream joins
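
The first two points can be sketched as a pair of joins. This example again assumes SQLite as a stand-in for the warehouse, uses a cut-down version of the meaning table, and invents a hypothetical `fact_ticket_event` table with made-up rows purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_ticket_meaning (
        ticket_id TEXT, sentiment TEXT, model_version TEXT,
        valid_from TEXT, valid_to TEXT, is_current INTEGER
    );
    -- Hypothetical fact table, not part of the original design.
    CREATE TABLE fact_ticket_event (
        ticket_id TEXT, event_date TEXT, event_type TEXT
    );
    INSERT INTO dim_ticket_meaning VALUES
        ('T-1', 'neutral',  'model-v1', '2024-01-01', '2024-02-01', 0),
        ('T-1', 'negative', 'model-v2', '2024-02-01', '9999-12-31', 1);
    INSERT INTO fact_ticket_event VALUES
        ('T-1', '2024-01-15', 'reply'),
        ('T-1', '2024-03-10', 'escalation');
""")

# Point-in-time join: each event gets the interpretation valid on its date.
AS_OF_JOIN = """
    SELECT f.event_date, f.event_type, m.sentiment, m.model_version
    FROM fact_ticket_event AS f
    JOIN dim_ticket_meaning AS m
      ON m.ticket_id = f.ticket_id
     AND f.event_date >= m.valid_from
     AND f.event_date <  m.valid_to
    ORDER BY f.event_date
"""
as_of_rows = conn.execute(AS_OF_JOIN).fetchall()

# Current-view join: every event gets the latest interpretation.
CURRENT_JOIN = """
    SELECT f.event_date, f.event_type, m.sentiment, m.model_version
    FROM fact_ticket_event AS f
    JOIN dim_ticket_meaning AS m
      ON m.ticket_id = f.ticket_id
     AND m.is_current = 1
    ORDER BY f.event_date
"""
current_rows = conn.execute(CURRENT_JOIN).fetchall()
```

Both queries read stored rows only. No model is called at query time, so a dashboard built on either join returns the same answer each time it runs.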

But the Derived Mini-Dimension is just one expression. The broader pattern, the Derived Meaning Store, is bigger than dimensional modelling. It applies in warehouses, lakehouses, semantic layers, dbt projects, and event pipelines.

Where I'm up to

This is still a working theory. I'm building examples, testing the pattern, and running real unstructured data through Snowflake's services.

Early signs are promising, and the feedback so far tells me others are feeling this gap too.

If you work with unstructured data, or you're exploring how LLM outputs should flow into analytics, I'd love to hear your thoughts.

Where would this help in your organisation? What would you change? What challenges do you see?

I'll keep expanding this as the idea develops.