Dec. 30, 2025
Microsoft Fabric Governance Explained: Why Your Data Model Will Drift
Episode Overview
This episode explores how organizations approach data governance, why many initiatives stall, and what practical, human-centered governance can look like in reality. Rather than framing governance as a purely technical or compliance-driven exercise, the conversation emphasizes trust, clarity, accountability, and organizational design. The discussion draws from real-world experience helping organizations move from ad-hoc data practices toward sustainable, value-driven governance models.
Key Themes & Takeaways
1. Why Most Organizations Struggle with Data Governance
- Many organizations begin their data governance journey reactively—often due to regulatory pressure, data incidents, or leadership mandates.
- Governance is frequently introduced as a top-down control mechanism, which leads to resistance, workarounds, and superficial compliance.
- A common failure mode is over-indexing on tools, frameworks, or committees before clarifying purpose and ownership.
- Without clear incentives, governance becomes "extra work" rather than part of how people already operate.
- Tools can support governance, but they cannot create accountability or shared understanding.
- Successful governance starts with clearly defined decision rights: who owns data, who can change it, and who is accountable for outcomes.
- Organizations often confuse data governance with data management, metadata, or documentation—these are enablers, not governance itself.
- Governance must align with how the organization already makes decisions, not fight against it.
- Governance works best in high-trust environments where people feel safe raising issues and asking questions about data quality and usage.
- Low-trust cultures tend to produce heavy-handed rules that slow teams down without improving outcomes.
- Psychological safety is critical: people must feel comfortable admitting uncertainty or mistakes in data.
- Transparency about how data is used builds confidence and reduces fear-driven behavior.
- Effective governance begins by identifying high-value data products and critical business decisions.
- Policies should emerge from real use cases, not abstract ideals.
- Focusing on a small number of high-impact datasets creates momentum and credibility.
- Governance tied to outcomes (revenue, risk reduction, customer experience) gains executive support faster.
- Clear data ownership is non-negotiable, but ownership does not mean sole control.
- Data owners are responsible for quality, definitions, and access decisions—not for doing all the work themselves.
- Stewardship roles help distribute responsibility while keeping accountability clear.
- Governance fails when ownership is assigned in name only, without time, authority, or support.
- Purely centralized governance does not scale in complex organizations.
- Purely decentralized models often result in inconsistency and duplication.
- Federated models balance local autonomy with shared standards and principles.
- Central teams should act as enablers and coaches, not gatekeepers.
- Measuring governance success by the number of policies or meetings is misleading.
- Better metrics include:
  - Time to find and understand data
  - Data quality issues detected earlier
  - Reduced rework and duplication
  - Confidence in decision-making
- Qualitative feedback from data users is often as important as quantitative metrics.
- Governance is not a one-time project—it evolves as the organization and its data mature.
- Policies and standards should be revisited regularly based on real usage.
- Lightweight governance that adapts over time outperforms rigid, comprehensive frameworks.
- Iteration and learning are signs of healthy governance, not failure.
- Start small: pick one domain, one dataset, or one decision and govern that well.
- Use existing forums and workflows instead of creating new committees whenever possible.
- Write policies in plain language that people can actually understand and follow.
- Treat governance conversations as design sessions, not enforcement actions.
- Invest in education so teams understand not just the rules, but the reasons behind them.
- Treating governance as a documentation exercise
- Rolling out enterprise-wide rules before testing them locally
- Assigning ownership without authority or incentives
- Confusing compliance with effectiveness
- Expecting tools to solve human and organizational problems
- Data leaders struggling to gain traction with governance initiatives
- Executives looking for practical, non-bureaucratic approaches to data accountability
- Data practitioners frustrated by unclear ownership and inconsistent standards
- Organizations transitioning from ad-hoc analytics to data-driven decision-making
Become a supporter of this podcast: https://www.spreaker.com/podcast/m365-show-modern-work-security-and-productivity-with-microsoft-365--6704921/support.
Transcript
1
00:00:00,000 --> 00:00:03,180
Most organizations come to Fabric with the same comforting assumption.
2
00:00:03,180 --> 00:00:05,760
If we get the permissions right, the numbers will be right.
3
00:00:05,760 --> 00:00:08,760
They are wrong. Your real problem is not who can open a report.
4
00:00:08,760 --> 00:00:11,760
It's what that report thinks revenue means this week.
5
00:00:11,760 --> 00:00:14,680
Fabric feels dangerous because it industrializes meaning.
6
00:00:14,680 --> 00:00:18,160
Once a metric definition leaks into the platform, it doesn't stay local.
7
00:00:18,160 --> 00:00:23,440
It propagates, mutates, and quietly competes with every other definition you never retired.
8
00:00:23,440 --> 00:00:26,320
So the useful question is not, how do I lock Fabric down?
9
00:00:26,320 --> 00:00:27,480
The useful question is,
10
00:00:27,480 --> 00:00:31,240
how do I keep my data model from drifting faster than my governance can follow?
11
00:00:31,240 --> 00:00:34,440
If you're responsible for enterprise data trust, this is for you.
12
00:00:34,440 --> 00:00:37,840
Because Fabric is secure by design. Your data model will still drift.
13
00:00:37,840 --> 00:00:41,080
The only real choice you have is whether that drift is observed and governed,
14
00:00:41,080 --> 00:00:43,800
or silent and catastrophic at AI speed.
15
00:00:43,800 --> 00:00:45,960
Why Fabric feels like too much power.
16
00:00:45,960 --> 00:00:51,280
Most people meet Fabric through the marketing diagram: one lake, many workloads, everything unified.
17
00:00:51,280 --> 00:00:53,080
They translate that as a tool story.
18
00:00:53,080 --> 00:00:54,800
Architecturally, it is something else.
19
00:00:54,800 --> 00:00:59,800
You've collapsed engineering, warehousing, BI and AI into a single plane, all wired into
20
00:00:59,800 --> 00:01:04,800
Entra as the decision engine. One tenant, one identity fabric, one logical lake.
21
00:01:04,800 --> 00:01:10,040
Every role assignment, every workspace, every shortcut participates in a shared authorization
22
00:01:10,040 --> 00:01:12,280
graph. That is a lot of power in one place.
23
00:01:12,280 --> 00:01:15,280
So the first questions you hear internally are completely predictable.
24
00:01:15,280 --> 00:01:16,560
Who can see what?
25
00:01:16,560 --> 00:01:17,680
Who owns this model?
26
00:01:17,680 --> 00:01:20,200
Why do we already have two answers for the same KPI?
27
00:01:20,200 --> 00:01:21,440
Those questions aren't new.
28
00:01:21,440 --> 00:01:23,920
What's new is the speed at which Fabric lets you get to them.
29
00:01:23,920 --> 00:01:27,400
Historically, your architecture protected you from yourself through friction.
30
00:01:27,400 --> 00:01:32,080
ERP lived over here, the data warehouse team lived over there, Power BI sat on top, usually
31
00:01:32,080 --> 00:01:34,160
a release cycle behind.
32
00:01:34,160 --> 00:01:38,800
To change a core metric, someone had to fight their way through ETL, schema changes, cube
33
00:01:38,800 --> 00:01:40,800
processing and a deployment window.
34
00:01:40,800 --> 00:01:44,120
That slowness was annoying, but it also throttled semantic drift.
35
00:01:44,120 --> 00:01:47,920
Every change hurt just enough that people argued before they shipped a new definition.
36
00:01:47,920 --> 00:01:49,880
Fabric removes most of that friction.
37
00:01:49,880 --> 00:01:53,920
Direct Lake lets the Power BI semantic model sit almost directly on top of Delta tables in
38
00:01:53,920 --> 00:01:57,760
OneLake: no scheduled imports, no fragile refresh window.
39
00:01:57,760 --> 00:02:01,580
Self-service workspaces let domain teams spin up their own lakehouses, warehouses and
40
00:02:01,580 --> 00:02:04,560
models without ever filing a central ticket.
41
00:02:04,560 --> 00:02:08,280
Cloning a workspace or semantic model is a couple of clicks, not a quarter's work.
42
00:02:08,280 --> 00:02:09,840
You didn't just modernize performance.
43
00:02:09,840 --> 00:02:11,960
You modernized the propagation of meaning.
44
00:02:11,960 --> 00:02:14,560
Here's the pattern that shows up in tenant after tenant.
45
00:02:14,560 --> 00:02:16,680
Adoption starts small and controlled.
46
00:02:16,680 --> 00:02:21,920
One or two central workspaces, a lakehouse or warehouse, a curated semantic model, a handful
47
00:02:21,920 --> 00:02:23,560
of official reports.
48
00:02:23,560 --> 00:02:24,800
Everything feels orderly.
49
00:02:24,800 --> 00:02:27,800
Then acceleration kicks in. More teams want access.
50
00:02:27,800 --> 00:02:32,720
They ask for "just a copy of the semantic model to tweak a filter" or "just our own workspace
51
00:02:32,720 --> 00:02:34,200
to move faster."
52
00:02:34,200 --> 00:02:37,960
Direct Lake makes that copy basically free. So does OneLake's shared storage.
53
00:02:37,960 --> 00:02:43,640
Semantic drift follows. Those copied models diverge. One region excludes certain customers,
54
00:02:43,640 --> 00:02:46,120
one business unit adds a manual adjustment.
55
00:02:46,120 --> 00:02:49,440
Another team redefines "active customer" to meet their local target.
56
00:02:49,440 --> 00:02:52,800
The names on the measures don't change, the DAX does.
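One way to make this kind of divergence visible is to fingerprint every measure definition and flag names that map to more than one fingerprint. The sketch below is a minimal illustration in Python, assuming the measure definitions have already been exported into plain dictionaries. The model names, the DAX strings, and the helper names (`normalize_dax`, `fingerprint`, `find_divergent_measures`) are all hypothetical and do not come from any Fabric API.

```python
import hashlib
import re
from collections import defaultdict


def normalize_dax(expression: str) -> str:
    """Crude normalization (an assumption): drop whitespace and case so
    purely cosmetic edits do not register as a different definition."""
    return re.sub(r"\s+", "", expression).lower()


def fingerprint(expression: str) -> str:
    """Short, stable hash of the normalized expression."""
    digest = hashlib.sha256(normalize_dax(expression).encode("utf-8"))
    return digest.hexdigest()[:12]


def find_divergent_measures(models):
    """Return {measure name: {fingerprint: [model names]}} for every
    measure name that has more than one distinct definition."""
    by_name = defaultdict(lambda: defaultdict(list))
    for model_name, measures in models.items():
        for measure_name, expression in measures.items():
            by_name[measure_name][fingerprint(expression)].append(model_name)
    return {
        name: dict(variants)
        for name, variants in by_name.items()
        if len(variants) > 1
    }


# Hypothetical export: three models that all expose "Total Revenue".
models = {
    "Central Sales": {"Total Revenue": "SUM ( Sales[Amount] )"},
    "EMEA Sales Copy": {
        "Total Revenue": 'CALCULATE ( SUM ( Sales[Amount] ), Customer[Segment] <> "Internal" )'
    },
    "APAC Sales Copy": {"Total Revenue": "sum(Sales[Amount])"},
}

divergent = find_divergent_measures(models)
for name, variants in divergent.items():
    print(name, "has", len(variants), "distinct definitions:")
    for fp, owners in variants.items():
        print("  ", fp, "->", ", ".join(owners))
```

The APAC copy differs from the central model only in spacing and case, so it lands in the same group. The EMEA copy adds a filter and surfaces as a second definition under the same display name.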
57
00:02:52,800 --> 00:02:54,800
Finally, trust collapses.
58
00:02:54,800 --> 00:02:58,120
Executives walk into a steering committee and discover three different truths about
59
00:02:58,120 --> 00:03:03,800
revenue, churn, risk, all sourced from the same ERP, all flowing through the same Fabric
60
00:03:03,800 --> 00:03:07,200
tenant, all protected by the same RBAC model.
61
00:03:07,200 --> 00:03:08,200
Security worked.
62
00:03:08,200 --> 00:03:09,200
Nobody broke in.
63
00:03:09,200 --> 00:03:12,920
The platform behaved exactly as designed, but the meaning moved.
64
00:03:12,920 --> 00:03:15,360
This is why Fabric feels like too much power.
65
00:03:15,360 --> 00:03:19,320
It isn't that the platform is out of control. It's that your previous architecture hid the
66
00:03:19,320 --> 00:03:22,720
fact that you never had a robust way to govern semantics in the first place.
67
00:03:22,720 --> 00:03:24,600
The friction made the gaps survivable.
68
00:03:24,600 --> 00:03:27,240
Now that friction is gone, the gaps are visible.
69
00:03:27,240 --> 00:03:29,560
Direct Lake is a good example of this acceleration.
70
00:03:29,560 --> 00:03:34,320
In a traditional import model, changing a schema or measure definition had a natural governor:
71
00:03:34,320 --> 00:03:35,840
the refresh process.
72
00:03:35,840 --> 00:03:38,160
Break the model and the refresh fails.
73
00:03:38,160 --> 00:03:40,480
Someone notices. There's a clear failure point.
74
00:03:40,480 --> 00:03:44,360
Direct Lake connects the semantic model almost directly to the storage engine.
75
00:03:44,360 --> 00:03:49,360
The data is the Delta table, and the semantic model might keep working just enough to be dangerous.
76
00:03:49,360 --> 00:03:51,880
Columns still exist, relationships still resolve.
77
00:03:51,880 --> 00:03:54,680
Only a handful of reports show subtle shifts in filters and totals.
78
00:03:54,680 --> 00:03:58,840
You removed the mechanical canary. Self-service workspaces compound this.
79
00:03:58,840 --> 00:04:03,040
When every domain can spin up its own engineering, modeling and reporting stack, you've effectively
80
00:04:03,040 --> 00:04:07,440
created dozens of parallel semantic layers over the same physical data.
81
00:04:07,440 --> 00:04:11,560
Some are carefully modeled, some are copied from wherever someone had access, some are experiments
82
00:04:11,560 --> 00:04:16,160
that accidentally become production because an executive bookmarked the report. Easy cloning
83
00:04:16,160 --> 00:04:17,480
is the final accelerant.
84
00:04:17,480 --> 00:04:21,520
Every time someone says, "We'll just clone it and tweak a bit for our needs,"
85
00:04:21,520 --> 00:04:24,200
you've created another fork of meaning with no lifecycle plan.
86
00:04:24,200 --> 00:04:28,560
There's no compilation step where a central architect has to approve the new definition.
87
00:04:28,560 --> 00:04:31,760
Most organizations respond to this unease by reaching for the wrong lever.
88
00:04:31,760 --> 00:04:33,480
They double down on access control.
89
00:04:33,480 --> 00:04:37,760
More groups, more conditional access, more workspace restrictions, more reviews of who can
90
00:04:37,760 --> 00:04:40,280
see which report. All of that is fine.
91
00:04:40,280 --> 00:04:41,280
Necessary, even.
92
00:04:41,280 --> 00:04:42,960
But here's the uncomfortable truth.
93
00:04:42,960 --> 00:04:46,480
You can have perfect RBAC and completely untrustworthy numbers.
94
00:04:46,480 --> 00:04:50,480
You can pass every security audit and still have no idea which revenue metric your CEO should
95
00:04:50,480 --> 00:04:51,680
quote to the market.
96
00:04:51,680 --> 00:04:55,680
The perceived risk is external: hackers, leaks, regulatory fines.
97
00:04:55,680 --> 00:04:59,080
The lived daily risk is internal: nobody believes the dashboards.
98
00:04:59,080 --> 00:05:03,040
Fabric didn't invent that risk. It just stopped hiding it behind batch windows and siloed
99
00:05:03,040 --> 00:05:04,040
tools.
100
00:05:04,040 --> 00:05:06,720
So before we go any further, hold on to this distinction.
101
00:05:06,720 --> 00:05:10,000
Platform security answers, who is allowed to touch this object?
102
00:05:10,000 --> 00:05:13,640
Governance of meaning answers, what does this object actually say?
103
00:05:13,640 --> 00:05:15,560
Fabric gives you an excellent answer to the first question.
104
00:05:15,560 --> 00:05:17,600
It gives you almost no answer to the second.
105
00:05:17,600 --> 00:05:19,240
That's why it feels like too much power.
106
00:05:19,240 --> 00:05:22,760
Once you see that clearly, the next move is obvious. You have to separate the layers you've
107
00:05:22,760 --> 00:05:26,880
been blurring together: security, data governance and semantic governance.
108
00:05:26,880 --> 00:05:30,840
Only then can you decide what Fabric actually secures for you and where your data model is
109
00:05:30,840 --> 00:05:32,680
guaranteed to drift.
110
00:05:32,680 --> 00:05:35,720
Security versus data governance versus semantic governance.
111
00:05:35,720 --> 00:05:38,920
Once you stop blaming Fabric for being too open, you can ask the only question that
112
00:05:38,920 --> 00:05:39,920
matters.
113
00:05:39,920 --> 00:05:46,040
What exactly is Microsoft securing for you and what isn't even in scope?
114
00:05:46,040 --> 00:05:52,240
Most organizations blur three completely different layers into one vague word: governance.
115
00:05:52,240 --> 00:05:55,480
Architecturally, those layers are separate systems with separate responsibilities.
116
00:05:55,480 --> 00:05:57,320
The first layer is platform security.
117
00:05:57,320 --> 00:06:00,760
This is Microsoft's job. Entra handles authentication.
118
00:06:00,760 --> 00:06:06,120
Conditional Access decides under which device, network and risk conditions a token is issued.
119
00:06:06,120 --> 00:06:11,640
Workspace and item permissions express who can administer, contribute or view.
120
00:06:11,640 --> 00:06:16,080
OneLake security and role-based access decide which tables, folders or rows a given identity
121
00:06:16,080 --> 00:06:17,080
can touch.
122
00:06:17,080 --> 00:06:21,800
Underneath that you get encryption at rest, TLS in transit, multi-geo boundaries, audit logs,
123
00:06:21,800 --> 00:06:26,400
DLP and sensitivity labels through Purview, and a compliance envelope that already satisfies
124
00:06:26,400 --> 00:06:30,120
regulators who are far more aggressive than your internal audit team.
125
00:06:30,120 --> 00:06:33,440
In other words, the control plane and data plane are well defended.
126
00:06:33,440 --> 00:06:36,840
If Fabric were fundamentally insecure, Microsoft wouldn't be running its own financials
127
00:06:36,840 --> 00:06:38,200
on the same architecture.
128
00:06:38,200 --> 00:06:42,240
That distinction matters because it means your core risk is almost never "is the platform
129
00:06:42,240 --> 00:06:43,240
safe?"
130
00:06:43,240 --> 00:06:45,280
The platform is as safe as it is going to get.
131
00:06:45,280 --> 00:06:47,840
Your real problem is the second layer, data governance.
132
00:06:47,840 --> 00:06:49,320
Data governance is your job.
133
00:06:49,320 --> 00:06:53,720
This is where you decide who owns which data sets, which domains exist, which workspaces
134
00:06:53,720 --> 00:06:58,320
are allowed to touch regulated data, how long data is retained and how classification
135
00:06:58,320 --> 00:06:59,320
flows.
136
00:06:59,320 --> 00:07:04,000
You define read and write boundaries, onboarding and offboarding, lifecycle for lakehouses
137
00:07:04,000 --> 00:07:08,640
and warehouses and what happens when a project ends but its data assets live on.
138
00:07:08,640 --> 00:07:11,560
You decide whether PHI sits only in a healthcare domain.
139
00:07:11,560 --> 00:07:15,800
You decide whether payroll data can ever be shortcut into a general analytics lakehouse.
140
00:07:15,800 --> 00:07:20,080
You decide whether every production data set has a named owner or just "belongs to BI."
141
00:07:20,080 --> 00:07:23,600
When people say "we need better governance," this is usually the layer they think they're
142
00:07:23,600 --> 00:07:27,120
talking about even if they only ever touch it via tickets to the central team.
143
00:07:27,120 --> 00:07:30,560
But there is a third layer, and this is where Fabric quietly hurts you.
144
00:07:30,560 --> 00:07:33,680
Semantic governance. Semantic governance answers questions that platform security and
145
00:07:33,680 --> 00:07:35,880
data governance don't even try to address.
146
00:07:35,880 --> 00:07:37,600
What does "customer" mean in this model?
147
00:07:37,600 --> 00:07:39,360
At what grain is revenue defined?
148
00:07:39,360 --> 00:07:42,520
Which filters are always applied when we report "active" anything?
149
00:07:42,520 --> 00:07:44,840
Which team is allowed to redefine that logic?
150
00:07:44,840 --> 00:07:48,480
This is the ignored layer: the layer of metrics, business logic, naming and grain.
151
00:07:48,480 --> 00:07:52,880
The layer where DAX lives, where SQL views define golden tables, where notebooks materialize
152
00:07:52,880 --> 00:07:56,640
KPI logic in code because there was no certified model available.
153
00:07:56,640 --> 00:08:00,480
Most catastrophic Fabric failures happen here, not in the platform security layer.
154
00:08:00,480 --> 00:08:03,800
Nothing in Entra stops you from having five different total revenue measures, all with
155
00:08:03,800 --> 00:08:07,960
the same display name, each applying slightly different filters.
156
00:08:07,960 --> 00:08:11,920
Nothing in OneLake security complains when three domains create their own customer dimension
157
00:08:11,920 --> 00:08:13,720
with conflicting surrogate keys.
158
00:08:13,720 --> 00:08:18,280
Nothing in Purview lights up when someone silently changes the definition of "churned customer"
159
00:08:18,280 --> 00:08:20,560
from 90 days of inactivity to 30.
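A change like that only becomes visible if someone keeps a baseline to compare against. The following is a minimal Python sketch, assuming business definitions are captured as plain text in a dictionary at each review point. The snapshot contents and the `diff_definitions` helper are hypothetical illustrations, not an export format that Fabric or Purview provides.

```python
def diff_definitions(baseline, current):
    """Compare two snapshots of {metric name: definition text} and report
    which metrics were added, removed, or changed between them."""
    changes = {"added": [], "removed": [], "changed": []}
    for name in sorted(set(baseline) | set(current)):
        if name not in baseline:
            changes["added"].append(name)
        elif name not in current:
            changes["removed"].append(name)
        elif baseline[name] != current[name]:
            changes["changed"].append(name)
    return changes


# Hypothetical snapshots taken at two review points.
baseline = {
    "Churned Customer": "no activity in the last 90 days",
    "Net Sales": "gross sales minus returns",
}
current = {
    "Churned Customer": "no activity in the last 30 days",
    "Net Sales": "gross sales minus returns",
    "Active Customer": "at least one order in the last 30 days",
}

changes = diff_definitions(baseline, current)
print(changes)
```

The value of a check like this is less the diff itself than the forced conversation: every entry under `changed` needs an owner who can say why the definition moved.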
160
00:08:20,560 --> 00:08:22,160
Fabric doesn't just host semantics.
161
00:08:22,160 --> 00:08:24,360
Fabric industrializes meaning.
162
00:08:24,360 --> 00:08:28,320
Once a DAX measure, a SQL view or a curated table is shared and referenced, its definition
163
00:08:28,320 --> 00:08:31,040
stops being local. It becomes a dependency.
164
00:08:31,040 --> 00:08:35,600
Every downstream report, Excel workbook, dataflow and Copilot answer inherits that meaning
165
00:08:35,600 --> 00:08:39,160
until someone forks it, renames nothing and quietly diverges.
166
00:08:39,160 --> 00:08:41,680
This is where semantic drift becomes dangerous.
167
00:08:41,680 --> 00:08:44,840
Schema drift (columns added, types changed) is noisy.
168
00:08:44,840 --> 00:08:50,600
Queries break, refreshes fail, engineers get paged, someone notices. Semantic drift is silent.
169
00:08:50,600 --> 00:08:54,560
The columns are still there, the data types still match, security is still enforced.
170
00:08:54,560 --> 00:08:58,400
But the calculation behind net sales has gained an exclusion or lost an adjustment or changed
171
00:08:58,400 --> 00:08:59,400
time windows.
172
00:08:59,400 --> 00:09:02,920
Only the people who lived through the change even remember it happened.
173
00:09:02,920 --> 00:09:04,520
Security tooling is blind to this.
174
00:09:04,520 --> 00:09:08,040
Data governance catalogs it at best as another data set.
175
00:09:08,040 --> 00:09:12,080
Semantic governance asks a different set of questions. Which semantic models are authoritative
176
00:09:12,080 --> 00:09:16,720
for core entities like customer, product, revenue? Who is allowed to publish or change those
177
00:09:16,720 --> 00:09:21,320
definitions? How do we signal to the rest of the organization which meanings are reusable
178
00:09:21,320 --> 00:09:24,880
at scale? How do we detect when those meanings drift?
179
00:09:24,880 --> 00:09:29,200
Without explicit answers, Fabric will happily let every domain define its own truth, secure
180
00:09:29,200 --> 00:09:32,600
it perfectly, classify it correctly and serve it at speed.
181
00:09:32,600 --> 00:09:36,040
So when you look at your Fabric tenant and feel that mix of power and unease, remember
182
00:09:36,040 --> 00:09:40,720
the layering. Platform security is Microsoft's problem and they've largely solved it.
183
00:09:40,720 --> 00:09:45,000
Data governance is your problem and most of you have at least a partial handle on it.
184
00:09:45,000 --> 00:09:49,000
Semantic governance is nobody's problem by default, which means it's where reality quietly
185
00:09:49,000 --> 00:09:50,360
diverges from intent.
186
00:09:50,360 --> 00:09:53,880
Once you accept that, the next step is to be precise about what Fabric actually secures
187
00:09:53,880 --> 00:09:58,440
for you and then catalog the forms of drift that walk straight past that perfectly functional
188
00:09:58,440 --> 00:10:01,000
security model every single day.
189
00:10:01,000 --> 00:10:03,080
What Microsoft Fabric actually secures.
190
00:10:03,080 --> 00:10:07,640
Now that the layers are separated, we can finally answer the boring but essential question,
191
00:10:07,640 --> 00:10:11,320
what does Fabric actually secure for you in deterministic terms?
192
00:10:11,320 --> 00:10:15,640
Start with identity and access. Every interaction with Fabric flows through Entra. A user,
193
00:10:15,640 --> 00:10:19,520
a service principal, a managed identity: they all authenticate there.
194
00:10:19,520 --> 00:10:23,080
Conditional Access decides whether that token is allowed under your rules.
195
00:10:23,080 --> 00:10:26,720
Compliant device, trusted network, MFA, risk level acceptable.
196
00:10:26,720 --> 00:10:30,000
Only once that token exists does Fabric even enter the picture.
197
00:10:30,000 --> 00:10:34,400
Inside Fabric, that identity hits workspace roles and item-level permissions.
198
00:10:34,400 --> 00:10:39,960
Workspaces define who administers, who can edit, who can contribute data, who only
199
00:10:39,960 --> 00:10:41,280
views.
200
00:10:41,280 --> 00:10:47,400
Items (lakehouses, warehouses, semantic models, notebooks) add their own ACLs on top.
201
00:10:47,400 --> 00:10:50,720
OneLake security sits underneath as the data plane gatekeeper.
202
00:10:50,720 --> 00:10:54,520
At that layer, you can decide that a particular table is visible to one group and invisible
203
00:10:54,520 --> 00:10:55,520
to another.
204
00:10:55,520 --> 00:10:59,800
You can constrain access to specific folders and files, or enforce row- and column-level rules
205
00:10:59,800 --> 00:11:02,040
at the storage engine, not just in a report.
206
00:11:02,040 --> 00:11:06,840
So if someone asks, can user X query table Y through Spark, SQL or Direct Lake?
207
00:11:06,840 --> 00:11:09,360
The system can give you a clear rule-driven answer.
208
00:11:09,360 --> 00:11:10,560
That part is solid.
209
00:11:10,560 --> 00:11:13,200
Move up a level and you have the data plane itself.
210
00:11:13,200 --> 00:11:17,800
OneLake is the logical lake: shortcuts, mirrored sources, native Delta and Parquet files,
211
00:11:17,800 --> 00:11:21,640
but from a security perspective, it's still just objects and ACLs.
212
00:11:21,640 --> 00:11:25,980
A warehouse table, a lakehouse table, a KQL database, they're all governed by roles and
213
00:11:25,980 --> 00:11:29,320
permissions that Fabric and Entra evaluate on every request.
214
00:11:29,320 --> 00:11:31,680
Workspaces provide a kind of blast radius boundary here.
215
00:11:31,680 --> 00:11:36,080
A badly written notebook can still be dangerous, but only inside the capacity and the access
216
00:11:36,080 --> 00:11:38,320
scope you've given that workspace.
217
00:11:38,320 --> 00:11:42,880
Domains add a logical overlay. They don't enforce security by themselves, but they group workspaces
218
00:11:42,880 --> 00:11:46,920
so you can reason about where regulated data shouldn't live.
219
00:11:46,920 --> 00:11:51,120
On the consumption side, the same model repeats. A Power BI semantic model, whether it's
220
00:11:51,120 --> 00:11:56,200
Import, DirectQuery or Direct Lake, still respects the underlying identity, workspace
221
00:11:56,200 --> 00:12:00,320
role and any row- or object-level security you've defined.
222
00:12:00,320 --> 00:12:04,480
If a user isn't allowed to see a customer segment at the table, they won't see it in
223
00:12:04,480 --> 00:12:05,480
the visual.
224
00:12:05,480 --> 00:12:08,960
If a column is masked or hidden at the OneLake layer, Copilot doesn't magically resurrect
225
00:12:08,960 --> 00:12:10,040
it in a chat.
226
00:12:10,040 --> 00:12:15,040
From Fabric's point of view, every engine (Spark, SQL, DAX, Copilot) is just another client
227
00:12:15,040 --> 00:12:17,240
of the same authorization fabric.
228
00:12:17,240 --> 00:12:20,880
Compliance instrumentation wraps around all of this. Purview can scan Fabric, register
229
00:12:20,880 --> 00:12:26,320
it as a data source, classify assets and apply sensitivity labels that then flow downstream.
230
00:12:26,320 --> 00:12:30,320
Activity logs tell you who touched which artifact, when and from where.
231
00:12:30,320 --> 00:12:35,440
DLP policies can flag or block data moving out of safe zones: exports to Excel, downloads
232
00:12:35,440 --> 00:12:37,600
of PBIX files, sharing outside the tenant.
233
00:12:37,600 --> 00:12:42,200
If you want to answer who has accessed this PHI table in the last 30 days, or where does
234
00:12:42,200 --> 00:12:44,640
this highly confidential data set flow?
235
00:12:44,640 --> 00:12:49,360
The combination of Fabric logs and Purview lineage can give you a satisfactory audit trail.
236
00:12:49,360 --> 00:12:52,560
So, viewed purely as a platform, the picture is reassuring.
237
00:12:52,560 --> 00:12:56,520
Authentication is centralized, authorization is consistent, data access is controllable
238
00:12:56,520 --> 00:12:58,320
down to rows and columns.
239
00:12:58,320 --> 00:13:01,120
Activity is observable, compliance envelopes are present.
240
00:13:01,120 --> 00:13:04,800
If you stay inside that frame, the natural instinct is to keep tightening it, more labels,
241
00:13:04,800 --> 00:13:07,080
more policies, more reviews, more conditions.
242
00:13:07,080 --> 00:13:10,600
But this is where the earlier distinction becomes lethal if you ignore it.
243
00:13:10,600 --> 00:13:14,240
Fabric is very good at securing objects. It is completely agnostic about what those objects
244
00:13:14,240 --> 00:13:15,240
mean.
245
00:13:15,240 --> 00:13:19,920
You can have two semantic models, both carefully labeled, both living in the right domain,
246
00:13:19,920 --> 00:13:23,720
both with restricted access, both with full audit trails, and still have them implement
247
00:13:23,720 --> 00:13:25,840
revenue in mutually exclusive ways.
248
00:13:25,840 --> 00:13:27,920
From a security standpoint, nothing is wrong.
249
00:13:27,920 --> 00:13:30,360
From a decision standpoint, everything is broken.
250
00:13:30,360 --> 00:13:35,480
This is why the right question is not "is Fabric safe?" but "what exactly are we securing?"
251
00:13:35,480 --> 00:13:38,640
The platform can guarantee only the right identities connect.
252
00:13:38,640 --> 00:13:42,520
Only allowed tables, folders and rows are returned, only approved paths are used to move
253
00:13:42,520 --> 00:13:45,320
data, only labeled exports leave the boundary.
254
00:13:45,320 --> 00:13:49,840
It cannot guarantee that the metric you're staring at means what you think it means, that
255
00:13:49,840 --> 00:13:51,680
it meant the same thing last quarter,
256
00:13:51,680 --> 00:13:56,600
or that the AI agent you just enabled is grounded on the correct version of that meaning.
257
00:13:56,600 --> 00:14:00,640
So when you hear "we need better Fabric governance," translate it. You almost never mean "we don't
258
00:14:00,640 --> 00:14:01,640
trust Entra."
259
00:14:01,640 --> 00:14:05,640
You mean "we don't know which definitions we've actually put into production." The security
260
00:14:05,640 --> 00:14:07,640
model is done. You inherit it.
261
00:14:07,640 --> 00:14:09,120
The governance of meaning is not.
262
00:14:09,120 --> 00:14:15,320
You design it, or you operate a perfectly secure platform for uncontrolled semantic drift.
263
00:14:15,320 --> 00:14:17,600
Where governance breaks: the four drift patterns.
264
00:14:17,600 --> 00:14:21,280
Once you accept that Fabric secures objects, not meaning, you can start naming the ways
265
00:14:21,280 --> 00:14:24,040
that meaning quietly walks away from intent.
266
00:14:24,040 --> 00:14:28,640
There are four drift patterns that show up in almost every serious tenant: access drift,
267
00:14:28,640 --> 00:14:31,800
model drift, metric drift, ownership drift.
268
00:14:31,800 --> 00:14:35,840
Each one is predictable, each one is cumulative, and each one walks straight past your perfectly
269
00:14:35,840 --> 00:14:39,720
functioning security model. Start with access drift.
270
00:14:39,720 --> 00:14:41,640
On day one, access is simple.
271
00:14:41,640 --> 00:14:46,040
A few core groups, a couple of workspaces, clear roles. Over time, reality intervenes.
272
00:14:46,040 --> 00:14:49,880
Someone important needs temporary access to a workspace "just for this quarter."
273
00:14:49,880 --> 00:14:53,480
A project team needs broader read rights "until we stabilize."
274
00:14:53,480 --> 00:14:56,560
Contractors arrive, external partners get guest accounts.
275
00:14:56,560 --> 00:14:59,760
Nested groups come in from Entra that nobody fully understands.
276
00:14:59,760 --> 00:15:03,080
Because fabric is a collaboration platform, the path of least resistance is always the
277
00:15:03,080 --> 00:15:04,080
same.
278
00:15:04,080 --> 00:15:05,080
Just add them as a viewer.
279
00:15:05,080 --> 00:15:06,080
We'll clean it up later.
280
00:15:06,080 --> 00:15:07,080
Later never comes.
281
00:15:07,080 --> 00:15:08,920
Those exceptions accumulate.
282
00:15:08,920 --> 00:15:13,280
Groups contain other groups. People change roles but keep access in case they need to
283
00:15:13,280 --> 00:15:15,120
help with something.
284
00:15:15,120 --> 00:15:18,520
Service principals get granted broader rights because nobody wants to debug a failing
285
00:15:18,520 --> 00:15:20,400
pipeline in the middle of the night.
286
00:15:20,400 --> 00:15:22,960
Your security posture on paper is least privilege.
287
00:15:22,960 --> 00:15:25,400
Your effective access graph in Entra is anything but.
288
00:15:25,400 --> 00:15:28,440
The drift here isn't just that more people can see more things.
289
00:15:28,440 --> 00:15:31,960
It's that your mental model of who can touch which semantic layer stops matching the
290
00:15:31,960 --> 00:15:32,960
actual configuration.
291
00:15:32,960 --> 00:15:34,320
You're no longer governing.
292
00:15:34,320 --> 00:15:35,320
You're guessing.
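A minimal sketch of what "effective access" means here, assuming you can export group nesting and direct membership into plain dictionaries. The group and user names are hypothetical, and nothing below calls a real Entra or Fabric API; it only shows how nested groups widen access beyond what the design document says.

```python
from collections import deque

def effective_members(group, nested, direct):
    """Return every identity that reaches `group` through any chain of nested groups."""
    seen_groups = {group}
    members = set()
    queue = deque([group])
    while queue:
        current = queue.popleft()
        members |= direct.get(current, set())
        for child in nested.get(current, set()):
            if child not in seen_groups:  # guard against cyclic nesting
                seen_groups.add(child)
                queue.append(child)
    return members

# Hypothetical data: what the design says versus what the directory holds.
intended_viewers = {"ana", "ben"}
direct = {
    "finance-viewers": {"ana", "ben"},
    "q3-project": {"carl"},
    "contractors": {"dee", "svc-pipeline"},
}
nested = {
    "finance-viewers": {"q3-project"},
    "q3-project": {"contractors"},
}

actual = effective_members("finance-viewers", nested, direct)
drift = sorted(actual - intended_viewers)
print(drift)  # ['carl', 'dee', 'svc-pipeline']
```

The gap between `intended_viewers` and `actual` is the access drift: identities nobody granted directly, but who can still read.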
293
00:15:35,320 --> 00:15:37,520
Now model drift.
294
00:15:37,520 --> 00:15:39,320
This is the physical shape of the data.
295
00:15:39,320 --> 00:15:41,920
Schemas, tables, relationships and views.
296
00:15:41,920 --> 00:15:43,520
A new source system is onboarded.
297
00:15:43,520 --> 00:15:47,760
A team adds a temporary staging table that turns into a de facto gold table because
298
00:15:47,760 --> 00:15:49,200
somebody built a report on it.
299
00:15:49,200 --> 00:15:51,640
A column's meaning changes, but the name doesn't.
300
00:15:51,640 --> 00:15:55,320
Someone optimizes a table for performance and silently drops attributes that downstream
301
00:15:55,320 --> 00:15:56,320
models depend on.
302
00:15:56,320 --> 00:15:57,680
None of this is malicious.
303
00:15:57,680 --> 00:15:59,080
It's normal engineering churn.
304
00:15:59,080 --> 00:16:03,360
In a traditional warehouse that churn was gated by ETL processes, integration tests, release
305
00:16:03,360 --> 00:16:08,880
cycles. In Fabric, engineers, analysts and even power users can all participate in changing
306
00:16:08,880 --> 00:16:11,200
the shape of the data with fewer barriers.
307
00:16:11,200 --> 00:16:15,000
So the tables your semantic models point at are not static objects.
308
00:16:15,000 --> 00:16:16,720
They are moving targets.
309
00:16:16,720 --> 00:16:19,640
Without explicit model contracts and impact analysis,
310
00:16:19,640 --> 00:16:22,880
every schema tweak risks creating a forked reality.
311
00:16:22,880 --> 00:16:27,840
One part of the organization referencing the new shape, another still living in the old.
312
00:16:27,840 --> 00:16:29,200
Then there is metric drift.
313
00:16:29,200 --> 00:16:30,880
This is the semantic layer itself.
314
00:16:30,880 --> 00:16:36,000
DAX measures, SQL-defined KPIs, calculated columns, business rules embedded in notebooks.
315
00:16:36,000 --> 00:16:39,400
Metric drift is what happens when multiple teams use the same words,
316
00:16:39,400 --> 00:16:43,360
revenue, customer, churn, risk,
317
00:16:43,360 --> 00:16:47,680
and implement them with different filters, grains or business assumptions.
318
00:16:47,680 --> 00:16:50,880
One team excludes internal transfers from revenue.
319
00:16:50,880 --> 00:16:51,880
Another doesn't.
320
00:16:51,880 --> 00:16:54,680
One region reports churn on a 30-day inactivity window.
321
00:16:54,680 --> 00:16:56,280
Another uses 90 days.
322
00:16:56,280 --> 00:17:00,040
Finance defines active customer as billable in the last period.
323
00:17:00,040 --> 00:17:03,000
Sales defines it as any customer with an open opportunity.
324
00:17:03,000 --> 00:17:06,000
In isolation, each definition is locally rational.
325
00:17:06,000 --> 00:17:08,680
At enterprise scale, they are mutually incompatible.
326
00:17:08,680 --> 00:17:12,680
Fabric accelerates this drift because it makes it trivial to clone semantic models, tweak
327
00:17:12,680 --> 00:17:16,880
a measure and publish a new version without any central arbitration.
328
00:17:16,880 --> 00:17:18,400
Every fork looks trustworthy.
329
00:17:18,400 --> 00:17:20,920
Every fork can be certified inside its own workspace.
330
00:17:20,920 --> 00:17:24,200
Every fork shows up in Copilot as a viable source of truth.
331
00:17:24,200 --> 00:17:25,960
Finally, ownership drift.
332
00:17:25,960 --> 00:17:28,480
This one is less visible but architecturally fatal.
333
00:17:28,480 --> 00:17:31,800
On the day a data set or semantic model is created, somebody owns it.
334
00:17:31,800 --> 00:17:34,360
There's an engineer, an analyst, a product owner.
335
00:17:34,360 --> 00:17:35,880
Over time, people change roles.
336
00:17:35,880 --> 00:17:41,120
Teams reorg, projects end, contractors leave. The Fabric item remains.
337
00:17:41,120 --> 00:17:45,680
These accumulate: orphaned lakehouses, abandoned semantic models, half-maintained pipelines.
338
00:17:45,680 --> 00:17:49,280
Reports nobody admits to owning continue to refresh because they might be used by someone
339
00:17:49,280 --> 00:17:50,280
important.
340
00:17:50,280 --> 00:17:51,800
When something breaks, nobody is accountable.
341
00:17:51,800 --> 00:17:54,560
When a definition needs to change, nobody has the authority.
342
00:17:54,560 --> 00:17:59,600
When AI starts consuming those models, nobody feels responsible for what the agent is saying.
343
00:17:59,600 --> 00:18:03,440
Ownership drift turns every other form of drift into unpayable security debt because here
344
00:18:03,440 --> 00:18:06,360
is the uncomfortable law of large Fabric environments:
345
00:18:06,360 --> 00:18:07,960
Drift is not a failure.
346
00:18:07,960 --> 00:18:10,000
Unobserved drift is.
347
00:18:10,000 --> 00:18:15,960
Access will widen, models will evolve, metrics will fork, people will move on. Self-service,
348
00:18:15,960 --> 00:18:18,640
domain teams and agile delivery guarantee it.
349
00:18:18,640 --> 00:18:23,280
If you design as though drift is avoidable, you will always be surprised, always be reactive
350
00:18:23,280 --> 00:18:25,680
and always be tempted to blame the platform.
351
00:18:25,680 --> 00:18:29,960
If you design on the assumption that drift is constant, then the question changes.
352
00:18:29,960 --> 00:18:33,960
Not how do we stop this, but how do we see it, measure it and decide which meanings
353
00:18:33,960 --> 00:18:35,800
we are willing to let scale.
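One way to "see it and measure it" for metric drift, sketched under heavy assumptions: you have some inventory of measures as (workspace, name, expression) rows. The rows and expressions below are made up for illustration, and the normalization is deliberately crude (it strips all whitespace and case), so treat this as a starting point rather than a real comparison of DAX semantics.

```python
from collections import defaultdict

def normalize(expression):
    """Strip whitespace and case so purely cosmetic edits are not reported as forks."""
    return "".join(expression.split()).lower()

def find_metric_forks(measures):
    """Group measures by name; keep names that have more than one distinct definition."""
    by_name = defaultdict(lambda: defaultdict(list))
    for workspace, name, expression in measures:
        by_name[name.lower()][normalize(expression)].append(workspace)
    return {
        name: dict(definitions)
        for name, definitions in by_name.items()
        if len(definitions) > 1
    }

# Hypothetical inventory of (workspace, measure name, expression) rows.
measures = [
    ("finance", "Revenue", "SUM(Sales[Amount]) - SUM(Sales[InternalTransfers])"),
    ("sales", "Revenue", "SUM(Sales[Amount])"),
    ("ops", "Revenue", "sum(Sales[Amount])"),
    ("finance", "Units", "SUM(Sales[Units])"),
    ("ops", "Units", "SUM( Sales[Units] )"),
]

forks = find_metric_forks(measures)
for name, definitions in forks.items():
    print(name, "has", len(definitions), "definitions")
```

Here only "revenue" is flagged: the two "Units" rows differ in spacing alone, so they count as one definition. The output is a list of shared words with competing meanings, which is what an owner would have to arbitrate.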
354
00:18:35,800 --> 00:18:39,960
These four drift patterns are not theoretical. They show up in recognizable, painful ways.
355
00:18:39,960 --> 00:18:43,720
So to make this real, we are going to walk through five enterprise scenarios where everything
356
00:18:43,720 --> 00:18:46,480
we've just described plays out in public.
357
00:18:46,480 --> 00:18:51,520
Finance arguing over revenue, healthcare exposing PHI in the wrong lakehouse, retail turning
358
00:18:51,520 --> 00:18:57,080
self-service into shadow analytics, manufacturing turning OneLake into a junk drawer, and AI
359
00:18:57,080 --> 00:19:01,200
confidently weaponizing every ungoverned definition you ever shipped.
360
00:19:01,200 --> 00:19:05,680
Concrete, anonymized, but architecturally identical to what's already brewing inside your tenant.
361
00:19:05,680 --> 00:19:09,200
Scenario 1. Finance. Same revenue, three answers.
362
00:19:09,200 --> 00:19:11,480
Let's start where drift hurts fastest.
363
00:19:11,480 --> 00:19:12,480
Finance.
364
00:19:12,480 --> 00:19:14,080
Assume for a moment that the plumbing is perfect.
365
00:19:14,080 --> 00:19:18,280
You have one ERP, the chart of accounts is clean enough to survive audit, there's a well-understood
366
00:19:18,280 --> 00:19:20,520
data export or integration pattern.
367
00:19:20,520 --> 00:19:24,280
Fabric is ingesting that data into OneLake through a controlled pipeline, a warehouse,
368
00:19:24,280 --> 00:19:25,800
a lakehouse or both.
369
00:19:25,800 --> 00:19:28,360
The numbers landing in storage match the source system.
370
00:19:28,360 --> 00:19:30,520
No data quality story to hide behind.
371
00:19:30,520 --> 00:19:33,160
No exotic multi-ERP nightmare.
372
00:19:33,160 --> 00:19:36,000
On top of that shared foundation, three teams go to work.
373
00:19:36,000 --> 00:19:37,400
Finance, sales and operations.
374
00:19:37,400 --> 00:19:39,480
Each of them gets their own fabric workspace.
375
00:19:39,480 --> 00:19:40,480
That sounds healthy.
376
00:19:40,480 --> 00:19:45,120
It lines up with org structure. They all connect to the same curated tables in OneLake.
377
00:19:45,120 --> 00:19:48,800
Maybe they even start from a shared base semantic model the central team provided. And
378
00:19:48,800 --> 00:19:51,080
then the entropy generators switch on.
379
00:19:51,080 --> 00:19:54,040
Finance takes the base model and adds the adjustments they care about.
380
00:19:54,040 --> 00:19:58,040
They exclude certain internal orders, they apply their standard FX logic, and they define
381
00:19:58,040 --> 00:20:01,040
recognized revenue with the cutoff rules the auditors expect.
382
00:20:01,040 --> 00:20:03,920
They create a revenue measure that reflects that view of the world.
383
00:20:03,920 --> 00:20:07,360
Sales clones the same model, because someone told them, correctly, that we should all
384
00:20:07,360 --> 00:20:08,840
be using the same data.
385
00:20:08,840 --> 00:20:11,320
But their reality is pipeline and performance.
386
00:20:11,320 --> 00:20:14,040
They tweak the date logic to align with their commission periods.
387
00:20:14,040 --> 00:20:18,320
They exclude a small set of accounts that are house customers nobody carries a quota on.
388
00:20:18,320 --> 00:20:22,200
They build revenue in a way that matches how they manage the field.
389
00:20:22,200 --> 00:20:24,600
Operations does the same for fulfillment and capacity.
390
00:20:24,600 --> 00:20:26,920
They care about shipped units, backlog and throughput.
391
00:20:26,920 --> 00:20:28,240
They tweak the filters again.
392
00:20:28,240 --> 00:20:32,240
They might even introduce a simple lag so that revenue better reflects operational load
393
00:20:32,240 --> 00:20:33,680
instead of pure booking.
394
00:20:33,680 --> 00:20:35,720
None of these adaptations are stupid.
395
00:20:35,720 --> 00:20:36,920
None of them are malicious.
396
00:20:36,920 --> 00:20:38,480
All of them are locally rational.
397
00:20:38,480 --> 00:20:40,400
And in Fabric, all of them are fast.
398
00:20:40,400 --> 00:20:45,320
Copy the semantic model, adjust a bit of DAX, publish, build reports, share with leadership,
399
00:20:45,320 --> 00:20:46,320
bookmark in Teams.
400
00:20:46,320 --> 00:20:50,920
Before long, you have three parallel semantic layers over the same ERP facts, all describing
401
00:20:50,920 --> 00:20:53,120
revenue, all with good intent.
402
00:20:53,120 --> 00:20:54,360
Then comes board deck day.
403
00:20:54,360 --> 00:20:59,000
Finance walks in with a slide that shows revenue for the quarter, 1.02 billion.
404
00:20:59,000 --> 00:21:02,200
Sales shows 987 million with a breakdown by region.
405
00:21:02,200 --> 00:21:05,800
Operations shows 1.05 billion with an explanation of capacity strain.
406
00:21:05,800 --> 00:21:07,800
The chair asks the only question that matters.
407
00:21:07,800 --> 00:21:09,120
What is our revenue?
408
00:21:09,120 --> 00:21:12,480
Every number is defensible from the perspective of the team that produced it.
409
00:21:12,480 --> 00:21:13,680
None of them are aligned.
410
00:21:13,680 --> 00:21:15,640
You can watch the psychology in the room shift.
411
00:21:15,640 --> 00:21:17,560
First, they question the tools.
412
00:21:17,560 --> 00:21:18,880
Is this a Fabric issue?
413
00:21:18,880 --> 00:21:20,120
Then they question the teams.
414
00:21:20,120 --> 00:21:21,600
Why are you all using different numbers?
415
00:21:21,600 --> 00:21:24,280
And finally, they question the entire analytics estate.
416
00:21:24,280 --> 00:21:27,160
If we can't agree on revenue, what else is wrong?
417
00:21:27,160 --> 00:21:28,880
You have just experienced trust collapse.
418
00:21:28,880 --> 00:21:30,080
Notice what did not fail here.
419
00:21:30,080 --> 00:21:31,080
ERP was fine.
420
00:21:31,080 --> 00:21:32,440
Pipelines were fine.
421
00:21:32,440 --> 00:21:33,440
OneLake was fine.
422
00:21:33,440 --> 00:21:35,680
Entra and workspace security were fine.
423
00:21:35,680 --> 00:21:37,880
Sensitivity labels and audit logs were fine.
424
00:21:37,880 --> 00:21:42,000
The platform delivered consistent data to every team, secured access correctly and enforced
425
00:21:42,000 --> 00:21:43,520
your compliance envelope.
426
00:21:43,520 --> 00:21:45,120
What failed was semantic governance.
427
00:21:45,120 --> 00:21:48,920
There was no authoritative, certified semantic model for revenue that all three teams were
428
00:21:48,920 --> 00:21:50,240
obligated to reuse.
429
00:21:50,240 --> 00:21:53,800
There was no metric owner empowered to say, "This is the enterprise definition.
430
00:21:53,800 --> 00:21:56,160
If you need a variant, it gets a different name."
431
00:21:56,160 --> 00:22:00,320
There was no distinction between local operational revenue for sales and legal revenue for external
432
00:22:00,320 --> 00:22:01,320
reporting.
433
00:22:01,320 --> 00:22:04,120
Fabric simply made the absence of that discipline visible.
434
00:22:04,120 --> 00:22:05,120
Fast.
435
00:22:05,120 --> 00:22:07,600
In the legacy world, the friction would have slowed this down.
436
00:22:07,600 --> 00:22:10,960
It would have taken months for each team to get their own cubes, their own extracts, their
437
00:22:10,960 --> 00:22:12,880
own bespoke logic deployed.
438
00:22:12,880 --> 00:22:14,960
The inconsistency would still exist.
439
00:22:14,960 --> 00:22:18,080
You just might never see all three numbers on the table at the same time.
440
00:22:18,080 --> 00:22:21,840
In Fabric, the path from idea to propagated meaning is short and smooth.
441
00:22:21,840 --> 00:22:26,240
So the lesson here is not "stop sales from building models" or "lock everything down
442
00:22:26,240 --> 00:22:27,920
in a central finance workspace."
443
00:22:27,920 --> 00:22:30,040
The lesson is simpler and more uncomfortable.
444
00:22:30,040 --> 00:22:33,880
If you don't govern semantics, Fabric will happily let every domain industrialize
445
00:22:33,880 --> 00:22:35,080
its own truth.
446
00:22:35,080 --> 00:22:39,360
You'll end up with three revenue numbers, all secured, all audited, all cataloged, and
447
00:22:39,360 --> 00:22:42,360
none of them reliably reusable at enterprise scale.
448
00:22:42,360 --> 00:22:47,320
This is why in mature tenants, you start seeing a hard distinction between certified semantic
449
00:22:47,320 --> 00:22:52,920
models for core entities and KPIs owned and curated by a platform or domain steward, and
450
00:22:52,920 --> 00:22:57,880
everything else, local, promoted, experimental, explicitly not authoritative.
451
00:22:57,880 --> 00:23:00,600
Without that distinction, you're not doing self-service.
452
00:23:00,600 --> 00:23:03,360
You're manufacturing semantic drift at scale.
453
00:23:03,360 --> 00:23:07,360
And no amount of tightening access rights will fix a board deck with three answers to the
454
00:23:07,360 --> 00:23:08,880
same question.
455
00:23:08,880 --> 00:23:09,880
Scenario 2.
456
00:23:09,880 --> 00:23:10,880
Healthcare.
457
00:23:10,880 --> 00:23:12,720
PHI in the wrong lakehouse.
458
00:23:12,720 --> 00:23:14,600
Finance drift costs you trust.
459
00:23:14,600 --> 00:23:16,360
Healthcare drift costs you your license.
460
00:23:16,360 --> 00:23:20,880
So take the same architectural pattern and move it into a regulated clinical environment.
461
00:23:20,880 --> 00:23:23,000
You roll out Fabric in a healthcare organization.
462
00:23:23,000 --> 00:23:24,720
On paper you do the right things.
463
00:23:24,720 --> 00:23:26,080
There is a clinical domain.
464
00:23:26,080 --> 00:23:27,640
There is a research domain.
465
00:23:27,640 --> 00:23:29,000
There is an operations domain.
466
00:23:29,000 --> 00:23:32,920
PHI is supposed to live only in tightly controlled clinical lakehouses and warehouses
467
00:23:32,920 --> 00:23:36,920
with hardened workspaces, restricted groups and very nervous compliance officers watching
468
00:23:36,920 --> 00:23:38,000
Purview.
469
00:23:38,000 --> 00:23:40,680
But project reality does not care about your diagram.
470
00:23:40,680 --> 00:23:45,680
A cross-functional analytics initiative spins up: reducing emergency department wait times,
471
00:23:45,680 --> 00:23:46,680
for example.
472
00:23:46,680 --> 00:23:51,400
It needs scheduling data, triage codes, lab turnaround times, maybe some patient journey information
473
00:23:51,400 --> 00:23:53,040
to see where delays occur.
474
00:23:53,040 --> 00:23:55,120
The project team does the natural thing in Fabric.
475
00:23:55,120 --> 00:23:56,520
They create a new workspace.
476
00:23:56,520 --> 00:23:59,960
It sits in the operations domain because that's who's sponsoring the work.
477
00:23:59,960 --> 00:24:00,880
They add a lakehouse.
478
00:24:00,880 --> 00:24:05,960
They call it something like ED analytics temp because it's just for this project.
479
00:24:05,960 --> 00:24:09,160
Nobody intends this to be a long-lived clinical data product.
480
00:24:09,160 --> 00:24:10,920
It's a sandbox with a deadline.
481
00:24:10,920 --> 00:24:13,160
Data engineers start shortcutting in what they need.
482
00:24:13,160 --> 00:24:15,440
Some comes from operational systems.
483
00:24:15,440 --> 00:24:18,080
Bed management, staffing, equipment tracking.
484
00:24:18,080 --> 00:24:21,760
Some comes from mirrored clinical data, de-identified upstream.
485
00:24:21,760 --> 00:24:26,680
Some, however, comes directly from a clinical source because the upstream masking isn't ready
486
00:24:26,680 --> 00:24:28,320
and the project sponsor is impatient.
487
00:24:28,320 --> 00:24:32,200
A few tables in this temporary lakehouse now contain direct identifiers.
488
00:24:32,200 --> 00:24:34,240
MRNs, dates of birth, visit IDs.
489
00:24:34,240 --> 00:24:36,920
The intent is to strip them out later in the pipeline.
490
00:24:36,920 --> 00:24:40,880
The reality is that the lakehouse is now PHI-bearing, regardless of what your architecture
491
00:24:40,880 --> 00:24:41,880
diagram says.
492
00:24:41,880 --> 00:24:43,880
At the same time, workspace access is generous.
493
00:24:43,880 --> 00:24:46,080
It has to be because this is cross-functional.
494
00:24:46,080 --> 00:24:50,400
You have operations analysts, clinical leads, vendor consultants tuning triage models.
495
00:24:50,400 --> 00:24:53,320
A few data scientists doing patient flow simulations.
496
00:24:53,320 --> 00:24:56,760
And some integration engineers wiring up real-time feeds.
497
00:24:56,760 --> 00:24:59,640
The fastest way to get everyone unblocked is familiar.
498
00:24:59,640 --> 00:25:01,960
Add them as members or contributors.
499
00:25:01,960 --> 00:25:03,240
We can tighten it later.
500
00:25:03,240 --> 00:25:07,440
So you now have PHI in a workspace that was never designed or classified as a clinical
501
00:25:07,440 --> 00:25:12,320
environment, granted to a wider and less controlled audience than any of your regulated
502
00:25:12,320 --> 00:25:13,320
domains.
503
00:25:13,320 --> 00:25:14,880
Fabric does exactly what you asked.
504
00:25:14,880 --> 00:25:16,320
Entra authenticates every identity.
505
00:25:16,320 --> 00:25:18,040
Workspace roles are respected.
506
00:25:18,040 --> 00:25:21,000
OneLake security enforces who can query which tables.
507
00:25:21,000 --> 00:25:23,120
Audit logs capture every access.
508
00:25:23,120 --> 00:25:24,520
Nothing escapes the tenant.
509
00:25:24,520 --> 00:25:27,760
From a pure platform security perspective, nothing is on fire.
510
00:25:27,760 --> 00:25:30,240
Purview scanning eventually runs over this lakehouse.
511
00:25:30,240 --> 00:25:32,360
It detects patterns that look like identifiers.
512
00:25:32,360 --> 00:25:35,640
Maybe some columns get labeled as confidential or highly confidential.
513
00:25:35,640 --> 00:25:39,920
A few automated rules apply, but the workspace is still tagged in your head as operational
514
00:25:39,920 --> 00:25:42,640
analytics, not clinical PHI.
515
00:25:42,640 --> 00:25:46,320
The problem only becomes visible when someone asks the wrong question at the right time.
516
00:25:46,320 --> 00:25:51,200
An auditor traces PHI lineage and lands unexpectedly in the operations domain.
517
00:25:51,200 --> 00:25:55,680
A consultant exports a subset to work on it in their own environment, believing it to be
518
00:25:55,680 --> 00:25:58,080
de-identified operations data.
519
00:25:58,080 --> 00:26:02,760
A routine access review shows dozens of non-clinical identities with read rights on tables
520
00:26:02,760 --> 00:26:05,120
that now clearly contain PHI.
521
00:26:05,120 --> 00:26:06,640
Environment and intent were misaligned.
522
00:26:06,640 --> 00:26:09,120
You thought clinical domain meant clinical data.
523
00:26:09,120 --> 00:26:10,120
The system did not.
524
00:26:10,120 --> 00:26:13,440
It only understands where data actually is, not where you wish it would be.
525
00:26:13,440 --> 00:26:16,920
This is data governance drift, but the mechanics are the same as in finance.
526
00:26:16,920 --> 00:26:19,680
A temporary lakehouse became a de facto data product.
527
00:26:19,680 --> 00:26:23,360
A workspace created for speed became a long-lived environment.
528
00:26:23,360 --> 00:26:25,160
Access expanded faster than classification.
529
00:26:25,160 --> 00:26:29,960
The meaning of that workspace shifted from operations metrics to actual patient data,
530
00:26:29,960 --> 00:26:33,200
and nobody updated the mental or technical model to match.
531
00:26:33,200 --> 00:26:34,960
Semantic governance shows up here as well.
532
00:26:34,960 --> 00:26:40,020
Downstream, someone builds a notebook that computes re-admission risk on this mixed data
533
00:26:40,020 --> 00:26:41,020
set.
534
00:26:41,020 --> 00:26:43,080
Another person wraps it in a semantic model.
535
00:26:43,080 --> 00:26:48,400
A dashboard appears in yet another workspace, surfacing risk scores by facility.
536
00:26:48,400 --> 00:26:52,680
Copilot arrives and happily answers "Which hospitals have the highest re-admission risk
537
00:26:52,680 --> 00:26:53,680
this month?"
538
00:26:53,680 --> 00:26:57,920
using whatever model is easiest to reach. From a tool perspective, this is a success story.
539
00:26:57,920 --> 00:27:01,560
From a regulatory perspective, it is a slow-motion incident, because the question you
540
00:27:01,560 --> 00:27:04,840
will be asked after any investigation is not "did you have RBAC?"
541
00:27:04,840 --> 00:27:09,360
It is "how did PHI end up accessible in that environment, to those identities, with that
542
00:27:09,360 --> 00:27:10,440
lineage?"
543
00:27:10,440 --> 00:27:14,680
And the honest architectural answer is: you treated "where" as a proxy for "what."
544
00:27:14,680 --> 00:27:17,240
You assumed environments implied classification.
545
00:27:17,240 --> 00:27:18,240
Fabric did not.
546
00:27:18,240 --> 00:27:20,160
The lesson here is brutal but simple.
547
00:27:20,160 --> 00:27:23,880
Lakehouse sprawl without hard domain and classification boundaries is a compliance trap.
548
00:27:23,880 --> 00:27:28,000
You will end up with PHI in the wrong lakehouse, shared with the wrong people, powering semantics
549
00:27:28,000 --> 00:27:29,480
nobody ever approved.
550
00:27:29,480 --> 00:27:31,360
Security will say, access was authenticated.
551
00:27:31,360 --> 00:27:33,400
Audit will say, logs exist.
552
00:27:33,400 --> 00:27:36,200
Regulators will say, you lost control of meaning and location.
553
00:27:36,200 --> 00:27:39,720
Governance for Fabric in healthcare is not just about locking PHI down.
554
00:27:39,720 --> 00:27:44,080
It is about engineering domains, workspaces and semantic layers, so that regulated meaning
555
00:27:44,080 --> 00:27:48,600
cannot quietly drift into unregulated places, no matter how many temporary lakehouses your
556
00:27:48,600 --> 00:27:51,680
project teams create.
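A small sketch of checking "what" against "where," assuming each workspace carries a declared classification ceiling and each table carries a detected sensitivity label. The workspace names, table names and label levels are hypothetical, and the data would have to come from whatever scanning or catalog export you actually have; this only shows the comparison.

```python
# Hypothetical label ordering, lowest to highest sensitivity.
LEVELS = {"general": 0, "confidential": 1, "highly confidential": 2}

def misplaced_tables(workspaces, tables):
    """Return tables whose detected label exceeds their workspace's declared ceiling."""
    findings = []
    for table in tables:
        ceiling = workspaces[table["workspace"]]
        if LEVELS[table["label"]] > LEVELS[ceiling]:
            findings.append((table["workspace"], table["name"], table["label"]))
    return findings

# Declared intent per workspace (the architecture diagram).
workspaces = {
    "clinical-core": "highly confidential",
    "ed-analytics-temp": "general",
}

# Detected reality per table (what scanning found).
tables = [
    {"workspace": "clinical-core", "name": "encounters", "label": "highly confidential"},
    {"workspace": "ed-analytics-temp", "name": "bed_status", "label": "general"},
    {"workspace": "ed-analytics-temp", "name": "triage_visits", "label": "highly confidential"},
]

findings = misplaced_tables(workspaces, tables)
print(findings)
```

Only `triage_visits` is reported: its label is higher than anything the temporary operations workspace was declared to hold. That mismatch between declared intent and detected content is the drift this scenario describes.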
557
00:27:51,680 --> 00:27:52,680
Scenario 3.
558
00:27:52,680 --> 00:27:53,680
Retail.
559
00:27:53,680 --> 00:27:55,280
Self-service becomes shadow analytics.
560
00:27:55,280 --> 00:27:59,040
If healthcare shows you the cost of getting domains wrong, retail shows you the cost of
561
00:27:59,040 --> 00:28:03,480
getting self-service wrong. Different stakes, same mechanics.
562
00:28:03,480 --> 00:28:08,000
Picture a large retail organization that prides itself on being data-driven.
563
00:28:08,000 --> 00:28:11,400
They've rolled out Fabric with a very explicit mandate from the top.
564
00:28:11,400 --> 00:28:15,720
Empower analysts, remove bottlenecks, no more six-month waits for a new report.
565
00:28:15,720 --> 00:28:20,280
On paper this is healthy. There is a central lakehouse or warehouse with clean sales, store,
566
00:28:20,280 --> 00:28:21,960
product and promotion tables.
567
00:28:21,960 --> 00:28:24,120
The data engineering team has done the right thing.
568
00:28:24,120 --> 00:28:28,760
Standardized schemas, built conformed dimensions, set up incremental loads. A baseline semantic
569
00:28:28,760 --> 00:28:31,200
model exists with the obvious KPIs:
570
00:28:31,200 --> 00:28:33,960
sales, margin, units, same-store sales, promotion uplift.
571
00:28:33,960 --> 00:28:35,920
Then the self-service story starts.
572
00:28:35,920 --> 00:28:40,120
Analysts in merchandising, pricing and marketing are all told, correctly, to reuse that central
573
00:28:40,120 --> 00:28:41,120
semantic model.
574
00:28:41,120 --> 00:28:46,040
They connect to it from their own workspaces, build some reports and start asking for tweaks.
575
00:28:46,040 --> 00:28:48,000
The first round of requests is reasonable.
576
00:28:48,000 --> 00:28:51,120
Can we get a version of sales that excludes staff purchases?
577
00:28:51,120 --> 00:28:54,960
We need same-store sales defined at a market cluster level, not just store.
578
00:28:54,960 --> 00:28:58,360
Our region wants to see promotion uplift excluding clearance items.
579
00:28:58,360 --> 00:28:59,800
The platform team tries to keep up.
580
00:28:59,800 --> 00:29:01,040
They add a few more measures.
581
00:29:01,040 --> 00:29:03,160
They expose some calculation groups.
582
00:29:03,160 --> 00:29:06,840
But the backlog grows and the pressure to move fast doesn't go away.
583
00:29:06,840 --> 00:29:10,360
At some point a senior analyst discovers how cheap cloning is.
584
00:29:10,360 --> 00:29:14,600
They take the certified semantic model, hit "Save As" into their own workspace and start
585
00:29:14,600 --> 00:29:15,880
adjusting DAX.
586
00:29:15,880 --> 00:29:18,640
Maybe they rename nothing to keep reports working.
587
00:29:18,640 --> 00:29:21,280
Maybe they append "merch region" to a few measures.
588
00:29:21,280 --> 00:29:23,600
Either way, they now have a forked model.
589
00:29:23,600 --> 00:29:24,600
Direct Lake makes this frictionless.
590
00:29:24,600 --> 00:29:26,720
There's no extra storage cost.
591
00:29:26,720 --> 00:29:28,440
No duplicated refresh schedules.
592
00:29:28,440 --> 00:29:32,600
The cloned model reads the same Delta tables underneath with the same performance profile.
593
00:29:32,600 --> 00:29:34,240
Word spreads.
594
00:29:34,240 --> 00:29:39,400
In a quarter, you have dozens of near-identical semantic models orbiting the same sales tables.
595
00:29:39,400 --> 00:29:41,680
Each workspace has its own flavor.
596
00:29:41,680 --> 00:29:45,480
Merchandising has net sales that excludes returns after a certain window.
597
00:29:45,480 --> 00:29:48,560
Pricing has margin adjusted for vendor rebates.
598
00:29:48,560 --> 00:29:52,360
Marketing has promotion uplift that ignores campaigns below a spend threshold.
599
00:29:52,360 --> 00:29:56,880
E-commerce has same-store sales defined in terms of digital traffic cohorts.
600
00:29:56,880 --> 00:29:58,840
Locally every one of these models is useful.
601
00:29:58,840 --> 00:30:02,200
They answer specific questions faster than the central team ever could.
602
00:30:02,200 --> 00:30:05,400
The self-service mandate looks on the surface like a success.
603
00:30:05,400 --> 00:30:07,040
But something subtle has changed.
604
00:30:07,040 --> 00:30:09,320
You no longer have one semantic layer for sales.
605
00:30:09,320 --> 00:30:10,720
You have a semantic swarm.
606
00:30:10,720 --> 00:30:13,760
And because Fabric is doing its job, they all look equally legitimate.
607
00:30:13,760 --> 00:30:15,080
They live in proper workspaces.
608
00:30:15,080 --> 00:30:18,800
They respect RLS. Some are even endorsed or promoted because a local manager liked the
609
00:30:18,800 --> 00:30:20,320
dashboards.
610
00:30:20,320 --> 00:30:23,480
In the OneLake catalog and in Copilot, they all show up when somebody searches for
611
00:30:23,480 --> 00:30:24,960
sales or margin.
612
00:30:24,960 --> 00:30:26,440
Then the business pressure ramps up.
613
00:30:26,440 --> 00:30:27,720
A bad quarter hits.
614
00:30:27,720 --> 00:30:30,160
Execs start asking hard questions.
615
00:30:30,160 --> 00:30:32,720
Which promotions actually drove incremental revenue?
616
00:30:32,720 --> 00:30:35,080
Are we discounting too deeply in certain regions?
617
00:30:35,080 --> 00:30:38,680
Is our same-store sales trend hiding underlying volume decline?
618
00:30:38,680 --> 00:30:40,760
Different teams run to their preferred models.
619
00:30:40,760 --> 00:30:42,640
Marketing pulls numbers from their workspace.
620
00:30:42,640 --> 00:30:43,640
Pricing pulls theirs.
621
00:30:43,640 --> 00:30:47,480
Finance tries to stick to the original certified model, but they've quietly added a few measures
622
00:30:47,480 --> 00:30:50,520
of their own to keep pace with ad hoc asks.
623
00:30:50,520 --> 00:30:52,440
Everyone is technically using fabric.
624
00:30:52,440 --> 00:30:53,840
Nobody is using the same semantics.
625
00:30:53,840 --> 00:30:55,920
The first sign of trouble is not a big debate.
626
00:30:55,920 --> 00:30:56,920
It's a subtle mismatch.
627
00:30:56,920 --> 00:31:02,040
A VP sees two different same-store sales percentages in two separate decks and asks, "Why is your
628
00:31:02,040 --> 00:31:04,080
chart showing minus 1.8
629
00:31:04,080 --> 00:31:06,480
and theirs is minus 0.5?"
630
00:31:06,480 --> 00:31:08,320
Nobody can answer cleanly without opening DAX.
631
00:31:08,320 --> 00:31:10,440
The conversation that follows is never about DAX.
632
00:31:10,440 --> 00:31:12,600
It's about trust.
633
00:31:12,600 --> 00:31:14,200
Which model is correct?
634
00:31:14,200 --> 00:31:16,720
Who owns the definition of same-store sales?
635
00:31:16,720 --> 00:31:19,200
Why do we have five different versions of promotion uplift?
636
00:31:19,200 --> 00:31:21,000
The honest answer is architectural.
637
00:31:21,000 --> 00:31:25,120
In the rush to empower self-service, nobody drew a hard boundary between reusable, governed
638
00:31:25,120 --> 00:31:27,720
semantics that represent enterprise truth
639
00:31:27,720 --> 00:31:30,880
and local, contextual semantics that are allowed to diverge.
640
00:31:30,880 --> 00:31:32,600
Fabric didn't create the shadow analytics.
641
00:31:32,600 --> 00:31:33,600
It industrialized it.
642
00:31:33,600 --> 00:31:39,280
In the Excel era, analysts did all of this anyway, just on their laptops.
643
00:31:39,280 --> 00:31:43,120
You knew it was a risk, but it was mostly trapped in files and email threads.
644
00:31:43,120 --> 00:31:45,480
In Fabric, every fork is a first-class object.
645
00:31:45,480 --> 00:31:49,080
Every fork is shareable, discoverable, and consumable by AI.
646
00:31:49,080 --> 00:31:51,040
Shadow analytics graduates into shadow truth.
647
00:31:51,040 --> 00:31:52,600
Notice again what did not fail.
648
00:31:52,600 --> 00:31:54,240
Workspace security was fine.
649
00:31:54,240 --> 00:31:55,920
The access was fine.
650
00:31:55,920 --> 00:31:57,320
Performance was often excellent.
651
00:31:57,320 --> 00:31:59,080
Purview could see all the assets.
652
00:31:59,080 --> 00:32:02,240
What failed was semantic governance and platform boundaries.
653
00:32:02,240 --> 00:32:04,840
Self-service without semantic boundaries is not empowerment.
654
00:32:04,840 --> 00:32:06,720
It is drift with a friendly UI.
655
00:32:06,720 --> 00:32:09,200
The mature pattern in retail tenants looks very different.
656
00:32:09,200 --> 00:32:14,520
There is a small set of certified domain-owned semantic models for core commercial concepts.
657
00:32:14,520 --> 00:32:17,640
Sales, margin, same-store sales, promotion uplift.
658
00:32:17,640 --> 00:32:21,080
These are treated as APIs, not convenience layers, and if you want to use those meanings,
659
00:32:21,080 --> 00:32:22,840
you reference those models.
660
00:32:22,840 --> 00:32:26,200
Other teams are explicitly allowed to build their own metrics, but they must give them
661
00:32:26,200 --> 00:32:30,000
different names and those models are never certified as authoritative for the core
662
00:32:30,000 --> 00:32:31,000
APIs.
663
00:32:31,000 --> 00:32:33,560
In other words, you don't stop analysts from moving fast.
664
00:32:33,560 --> 00:32:35,840
You stop them from silently redefining shared words.
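The rule "same name, same meaning" can be checked mechanically once you have an inventory of measures per model. What follows is a minimal sketch, not a Fabric API: it assumes you have already exported measure names and expressions into a plain dictionary, and the model names and DAX strings are made up for illustration.

```python
from collections import defaultdict


def find_redefined_measures(models):
    """Return {measure name: {normalized expression: [model names]}}
    for every name that carries more than one distinct definition."""
    by_name = defaultdict(lambda: defaultdict(list))
    for model_name, measures in models.items():
        for measure_name, expression in measures.items():
            key = measure_name.strip().lower()
            # Simplification: collapse whitespace and case so cosmetic
            # differences do not count as a redefinition.
            normalized = " ".join(expression.split()).lower()
            by_name[key][normalized].append(model_name)
    return {name: dict(defs) for name, defs in by_name.items() if len(defs) > 1}


# Hypothetical inventory; in practice this would come from your own export.
inventory = {
    "Retail Core (certified)": {
        "Same-Store Sales": "DIVIDE([Sales CY] - [Sales PY], [Sales PY])",
        "Margin": "[Sales] - [COGS]",
    },
    "Region North Fork": {
        "Same-Store Sales": "DIVIDE([Sales CY] - [Sales PY], [Sales CY])",
        "Margin": "[Sales]  -  [COGS]",
    },
}

conflicts = find_redefined_measures(inventory)
for name, definitions in conflicts.items():
    print(name, "has", len(definitions), "definitions")
```

In this made-up inventory only same-store sales is flagged, because the two margin expressions differ only in spacing. A report like this does not decide who is right. It only makes the silent redefinition visible.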
665
00:32:35,840 --> 00:32:40,120
If you skip that step, Fabric will faithfully implement your self-service mandate.
666
00:32:40,120 --> 00:32:44,120
And you will wake up one quarter with excellent tooling, fast reports, and an organization that
667
00:32:44,120 --> 00:32:46,600
can no longer answer a basic question.
668
00:32:46,600 --> 00:32:50,800
When we say same-store sales in this meeting, whose meaning did we just import?
669
00:32:50,800 --> 00:32:53,840
Scenario 4: Manufacturing, OneLake as a shared junk drawer.
670
00:32:53,840 --> 00:32:57,480
If retail shows you semantic drift in KPIs, manufacturing shows you physical drift in the
671
00:32:57,480 --> 00:33:02,160
lake itself, and this one usually starts with a sentence every architect has heard.
672
00:33:02,160 --> 00:33:05,360
Let's just put it all in OneLake, so it's easy to find later.
673
00:33:05,360 --> 00:33:09,880
You roll Fabric into a manufacturing organization that has been starved of integrated data for
674
00:33:09,880 --> 00:33:10,880
a decade.
675
00:33:10,880 --> 00:33:16,200
You've got IoT telemetry from machines, MES data about work orders and line states, ERP data
676
00:33:16,200 --> 00:33:20,520
for materials, production orders, and cost, quality systems for defects and inspections.
677
00:33:20,520 --> 00:33:23,480
A long tail of spreadsheets living in shared drives.
678
00:33:23,480 --> 00:33:25,400
On the whiteboard, the story is elegant.
679
00:33:25,400 --> 00:33:27,720
OneLake will be the single enterprise lake.
680
00:33:27,720 --> 00:33:29,320
Domains will own their workspaces.
681
00:33:29,320 --> 00:33:33,120
Data products will emerge, engineering will have a proper foundation for predictive maintenance
682
00:33:33,120 --> 00:33:35,440
OEE and supply chain visibility.
683
00:33:35,440 --> 00:33:36,440
Then projects start.
684
00:33:36,440 --> 00:33:40,520
A plant-level team spins up a workspace to look at downtime patterns on line 7.
685
00:33:40,520 --> 00:33:41,520
They add a lakehouse.
686
00:33:41,520 --> 00:33:42,920
They shortcut in some IoT data.
687
00:33:42,920 --> 00:33:46,440
They drop some CSVs from legacy systems into files because the connector work isn't done
688
00:33:46,440 --> 00:33:47,440
yet.
689
00:33:47,440 --> 00:33:49,240
A global quality initiative kicks off.
690
00:33:49,240 --> 00:33:50,360
A new workspace.
691
00:33:50,360 --> 00:33:51,360
Another lakehouse.
692
00:33:51,360 --> 00:33:53,400
They ingest defect records from a separate system,
693
00:33:53,400 --> 00:33:55,840
plus a subset of machine telemetry
694
00:33:55,840 --> 00:33:57,600
someone exported for them last year
695
00:33:57,600 --> 00:34:00,000
and uploaded "just so we have it nearby."
696
00:34:00,000 --> 00:34:04,000
The supply chain analytics group wants to correlate supplier performance with scrap.
697
00:34:04,000 --> 00:34:07,440
They create their own workspace, their own lakehouse, and start pulling from whatever looks
698
00:34:07,440 --> 00:34:11,240
vaguely relevant in OneLake, plus some mirrored ERP tables.
699
00:34:11,240 --> 00:34:12,440
Everyone is moving fast.
700
00:34:12,440 --> 00:34:15,320
Everyone is doing the right thing from their local point of view.
701
00:34:15,320 --> 00:34:17,440
Nobody is curating OneLake as a whole.
702
00:34:17,440 --> 00:34:19,600
What emerges over 12 months is not a lake.
703
00:34:19,600 --> 00:34:20,600
It is a junk drawer.
704
00:34:20,600 --> 00:34:24,600
You see the symptoms immediately if you open the OneLake catalog without rose-tinted
705
00:34:24,600 --> 00:34:25,600
glasses.
706
00:34:25,600 --> 00:34:28,600
Dozens of lakehouses named after projects, not domains.
707
00:34:28,600 --> 00:34:34,480
Plant X 2024, OEE Pilot, Supplier Scrap Study, Temp Machine Data. Tables copied three
708
00:34:34,480 --> 00:34:38,960
or four times with slight naming differences. Folders full of one-off CSVs that were just
709
00:34:38,960 --> 00:34:42,320
for exploration but are now feeding production reports.
710
00:34:42,320 --> 00:34:45,280
Domains exist on paper, but in practice they're not enforced.
711
00:34:45,280 --> 00:34:49,240
Workspaces drifted into whatever domain someone happened to pick during creation.
712
00:34:49,240 --> 00:34:50,720
Some aren't assigned at all.
713
00:34:50,720 --> 00:34:54,160
Downstream, semantic models and notebooks start hard-coding paths.
714
00:34:54,160 --> 00:34:58,200
An analyst building a downtime dashboard doesn't think in terms of data products.
715
00:34:58,200 --> 00:35:01,160
They think in terms of the table that worked last time.
716
00:35:01,160 --> 00:35:06,760
So their DAX or SQL points at the Plant X 2024 lakehouse, dbo.downtime events, with a fixed
717
00:35:06,760 --> 00:35:11,720
path, not a dedicated, domain-owned table that's guaranteed to exist in five years.
718
00:35:11,720 --> 00:35:16,480
A data scientist training a predictive maintenance model grabs CSVs from a files folder in OEE
719
00:35:16,480 --> 00:35:18,880
pilot because those files had the right columns.
720
00:35:18,880 --> 00:35:22,240
They point notebooks directly at those blobs with absolute paths.
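One way to avoid those fixed paths is to have consumers ask for a logical name and let the owning domain decide where the data physically lives. This is a minimal sketch under assumptions: the registry, the logical name, and the location string are all hypothetical, and a real tenant would more likely back this with a catalog or shortcuts than a Python dictionary.

```python
# Hypothetical registry: logical data product name -> current physical location.
# The names and locations are illustrative, not real Fabric objects.
DATA_PRODUCTS = {
    "manufacturing.downtime_events": "ManufacturingOperations/Tables/downtime_events",
}


def resolve(logical_name, registry=DATA_PRODUCTS):
    """Return the current location for a logical name, or fail loudly."""
    try:
        return registry[logical_name]
    except KeyError:
        raise LookupError(
            f"{logical_name!r} is not a published data product; "
            "ask the owning domain instead of hard-coding a path"
        ) from None


# A notebook or model asks for the meaning, not the folder.
location = resolve("manufacturing.downtime_events")
print(location)
```

If the domain team later moves the table, only the registry entry changes. A request for something that was never published fails immediately instead of quietly reading a stale copy.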
721
00:35:22,240 --> 00:35:24,840
At first it feels productive and then you try to clean up.
722
00:35:24,840 --> 00:35:28,960
A central platform team finally looks at capacity metrics and says we need to archive some
723
00:35:28,960 --> 00:35:29,960
of this clutter.
724
00:35:29,960 --> 00:35:35,240
They propose consolidating lakehouses, renaming a few for consistency, maybe reorganizing
725
00:35:35,240 --> 00:35:38,520
folders so that gold data lives somewhere predictable.
726
00:35:38,520 --> 00:35:41,680
The minute they do, invisible dependencies start to snap.
727
00:35:41,680 --> 00:35:44,840
A downtime dashboard fails silently because the table moved.
728
00:35:44,840 --> 00:35:46,440
It doesn't crash spectacularly.
729
00:35:46,440 --> 00:35:50,680
It just starts returning fewer rows because the analyst's hard-coded filter no longer matches
730
00:35:50,680 --> 00:35:52,280
the reorganized schema.
731
00:35:52,280 --> 00:35:56,360
The predictive model still runs, but it's now pointing at an old CSV that nobody updates
732
00:35:56,360 --> 00:36:01,280
because the data scientist's notebook refers to a path in OEE pilot that the cleanup script
733
00:36:01,280 --> 00:36:03,600
copied but nobody maintains.
734
00:36:03,600 --> 00:36:07,840
A monthly KPI report for executive manufacturing reviews flips from green to red because
735
00:36:07,840 --> 00:36:12,040
someone optimized a lakehouse by dropping a column they thought nobody used, breaking a
736
00:36:12,040 --> 00:36:15,000
join in a semantic model they'd never heard of.
737
00:36:15,000 --> 00:36:17,440
From the platform's perspective nothing special happened.
738
00:36:17,440 --> 00:36:22,680
Folders were renamed, tables were consolidated, shortcuts moved. All legal operations.
739
00:36:22,680 --> 00:36:26,880
From the business perspective, core production KPIs just started lying.
740
00:36:26,880 --> 00:36:29,960
Operations managers see availability numbers jump around without corresponding changes
741
00:36:29,960 --> 00:36:31,200
on the floor.
742
00:36:31,200 --> 00:36:34,720
Quality leaders see defect rates mysteriously flat-line for one product family because the
743
00:36:34,720 --> 00:36:37,200
source table got filtered when it moved.
744
00:36:37,200 --> 00:36:41,320
Finance starts questioning whether the cost-per-unit metrics they've been using to justify
745
00:36:41,320 --> 00:36:44,040
capital spend are even based on current data.
746
00:36:44,040 --> 00:36:47,160
You now have the worst possible combination: OneLake is full.
747
00:36:47,160 --> 00:36:48,160
Nobody trusts it.
748
00:36:48,160 --> 00:36:51,360
Everyone keeps their own side spreadsheets just in case.
749
00:36:51,360 --> 00:36:54,240
Again notice where the platform did its job.
750
00:36:54,240 --> 00:36:55,240
Security was enforced.
751
00:36:55,240 --> 00:36:58,040
Telemetry came in, ERP mirrors updated.
752
00:36:58,040 --> 00:37:00,040
Audit logs captured who changed what.
753
00:37:00,040 --> 00:37:03,600
What you never did was declare, for manufacturing, what OneLake is allowed to be.
754
00:37:03,600 --> 00:37:07,480
Is it a governed data product surface where only curated, domain-owned lakehouses are
755
00:37:07,480 --> 00:37:12,120
allowed to serve as sources of record? Or a convenient dumping ground where any project
756
00:37:12,120 --> 00:37:16,280
can create a lakehouse, throw data in, and hope somebody else makes sense of it later?
757
00:37:16,280 --> 00:37:18,280
You can't have both.
758
00:37:18,280 --> 00:37:21,480
In mature manufacturing tenants the pattern looks different.
759
00:37:21,480 --> 00:37:26,720
There is a small set of domain lakehouses: manufacturing operations, quality, supply chain.
760
00:37:26,720 --> 00:37:31,000
They own the gold tables, they own the contracts, they publish certified semantic models on
761
00:37:31,000 --> 00:37:35,560
top. Projects do not create their own permanent lakehouses. They get ephemeral workspaces
762
00:37:35,560 --> 00:37:37,080
with clear expiry.
763
00:37:37,080 --> 00:37:42,160
Anything that graduates to "used in production" must be promoted into a domain lakehouse under
764
00:37:42,160 --> 00:37:44,240
domain ownership with a life cycle.
765
00:37:44,240 --> 00:37:46,560
OneLake stops being a shared junk drawer.
766
00:37:46,560 --> 00:37:51,200
It becomes what it was sold as: a shared storage fabric behind governed data products.
767
00:37:51,200 --> 00:37:53,800
If you skip that discipline the outcome is guaranteed.
768
00:37:53,800 --> 00:37:55,840
You will still have a single logical lake.
769
00:37:55,840 --> 00:37:59,520
You will simply have recreated every bad pattern from your old file shares.
770
00:37:59,520 --> 00:38:03,440
This time on top of a platform that makes it easier than ever for those bad patterns
771
00:38:03,440 --> 00:38:07,440
to power critical decisions and AI systems you can't easily unwind.
772
00:38:07,440 --> 00:38:10,960
Scenario 5: AI and Copilot, garbage meaning accelerated.
773
00:38:10,960 --> 00:38:14,520
By the time AI shows up in your Fabric tenant, all of the drift we've talked about is already
774
00:38:14,520 --> 00:38:15,520
there.
775
00:38:15,520 --> 00:38:19,400
You have finance with three defensible revenue numbers, healthcare with PHI straying
776
00:38:19,400 --> 00:38:24,120
into the wrong lakehouses, retail with a swarm of near-identical sales models,
777
00:38:24,120 --> 00:38:27,480
manufacturing with OneLake treated as shared storage, not a product surface.
778
00:38:27,480 --> 00:38:31,920
In that environment, someone enables Copilot and Fabric data agents. On paper, the story is
779
00:38:31,920 --> 00:38:32,920
compelling.
780
00:38:32,920 --> 00:38:34,680
You point Copilot at your Fabric tenant.
781
00:38:34,680 --> 00:38:39,600
It discovers semantic models, reports, lakehouses. It reads descriptions, it inspects relationships,
782
00:38:39,600 --> 00:38:44,520
it learns that revenue exists in multiple models, that churn is a measure with dependencies,
783
00:38:44,520 --> 00:38:48,120
that risk score is computed in a particular warehouse.
784
00:38:48,120 --> 00:38:51,560
Executives are told now you can just ask questions in natural language and get answers
785
00:38:51,560 --> 00:38:53,040
grounded in your own data.
786
00:38:53,040 --> 00:38:54,360
And technically that's true.
787
00:38:54,360 --> 00:38:57,240
But here's the part the marketing diagrams don't emphasize.
788
00:38:57,240 --> 00:39:01,040
AI agents do not consume raw data, they consume semantics.
789
00:39:01,040 --> 00:39:04,640
When Copilot answers, what was our revenue last quarter?
790
00:39:04,640 --> 00:39:09,320
It is not reading Parquet files. It is selecting a semantic model, a measure, a filter context.
791
00:39:09,320 --> 00:39:14,440
It is choosing one implementation of revenue over every other one that exists in your tenant.
792
00:39:14,440 --> 00:39:16,680
How does it choose?
793
00:39:16,680 --> 00:39:21,520
By design it optimizes for ease of use, relevance and connectivity.
794
00:39:21,520 --> 00:39:25,400
Models that are easier to query, better documented, more frequently used or closer to the
795
00:39:25,400 --> 00:39:27,160
question context tend to win.
796
00:39:27,160 --> 00:39:31,000
Certified models matter, but only if they exist and are discoverable.
797
00:39:31,000 --> 00:39:36,440
Local popular models in active workspaces often look more relevant than a pristine but obscure
798
00:39:36,440 --> 00:39:38,520
enterprise model nobody tagged correctly.
799
00:39:38,520 --> 00:39:42,720
So if your semantic layer is already fractured, Copilot doesn't fix that. It routes through
800
00:39:42,720 --> 00:39:43,720
it.
801
00:39:43,720 --> 00:39:47,880
Imagine your finance department did the right thing and built an authoritative certified semantic
802
00:39:47,880 --> 00:39:49,960
model in a controlled workspace.
803
00:39:49,960 --> 00:39:54,440
It has legal revenue, adjusted revenue and a dozen carefully curated measures.
804
00:39:54,440 --> 00:39:56,840
Ownership is clear, documentation is solid.
805
00:39:56,840 --> 00:40:01,240
At the same time sales has its own workspace with a forked model where revenue is really
806
00:40:01,240 --> 00:40:04,120
commissionable revenue, but nobody renamed the measure.
807
00:40:04,120 --> 00:40:05,600
It is heavily used.
808
00:40:05,600 --> 00:40:09,000
Reports referencing it are open daily, it lives in a workspace with sales in the name and
809
00:40:09,000 --> 00:40:10,200
a lot of executive traffic.
810
00:40:10,200 --> 00:40:14,440
You ask Copilot in Teams, what was our revenue last quarter in EMEA?
811
00:40:14,440 --> 00:40:17,040
From Copilot's perspective, both models are viable.
812
00:40:17,040 --> 00:40:22,040
One is certified but lives in a workspace with a finance-centric name, used mostly by finance.
813
00:40:22,040 --> 00:40:26,520
The other is promoted, heavily used and sits right next to the chat context of the person
814
00:40:26,520 --> 00:40:27,520
asking.
815
00:40:27,520 --> 00:40:29,240
Both answer the question syntactically.
816
00:40:29,240 --> 00:40:31,760
If your governance is weak, the sales model often wins.
817
00:40:31,760 --> 00:40:35,760
You get a perfectly fluent, confident answer based on commissionable revenue.
818
00:40:35,760 --> 00:40:38,160
The number doesn't match last quarter's board deck.
819
00:40:38,160 --> 00:40:39,160
Someone notices.
820
00:40:39,160 --> 00:40:41,400
The story becomes "Copilot is hallucinating."
821
00:40:41,400 --> 00:40:43,480
But architecturally that's not what happened.
822
00:40:43,480 --> 00:40:46,320
The AI did exactly what your semantics told it to do.
823
00:40:46,320 --> 00:40:50,520
It faithfully reflected the ambiguity you allowed to exist between revenue and commissionable
824
00:40:50,520 --> 00:40:51,520
revenue.
825
00:40:51,520 --> 00:40:57,040
It selected the path of least resistance through your authorization graph and your usage patterns.
826
00:40:57,040 --> 00:40:58,960
The hallucination wasn't in the model.
827
00:40:58,960 --> 00:41:02,120
It was in your belief that revenue meant one thing tenant-wide.
828
00:41:02,120 --> 00:41:04,240
The same pattern shows up in risk and churn.
829
00:41:04,240 --> 00:41:08,520
A data science team builds a churn score in a notebook, exposes it through a warehouse and
830
00:41:08,520 --> 00:41:11,960
wraps it in a semantic model for operational dashboards.
831
00:41:11,960 --> 00:41:16,560
Another team, months earlier, built a simpler churned customer flag in a different workspace
832
00:41:16,560 --> 00:41:18,000
with different thresholds.
833
00:41:18,000 --> 00:41:20,720
Both end up in Fabric, both are labeled churn.
834
00:41:20,720 --> 00:41:25,200
And a COO asks Copilot, which customer segments have the highest churn risk right now?
835
00:41:25,200 --> 00:41:26,640
The agent must pick one.
836
00:41:26,640 --> 00:41:30,800
If the simpler flag happens to be in the more active workspace with more dashboards and
837
00:41:30,800 --> 00:41:33,960
more daily usage, it will often be treated as the default.
838
00:41:33,960 --> 00:41:38,720
Your most sophisticated model with regulatory justification and careful calibration is bypassed
839
00:41:38,720 --> 00:41:41,240
because it lives in a quieter corner of the tenant.
840
00:41:41,240 --> 00:41:44,720
Again, from the outside this looks like AI is unreliable.
841
00:41:44,720 --> 00:41:48,560
From the inside it is semantic governance debt being called in with interest.
842
00:41:48,560 --> 00:41:50,440
Data agents amplify this further.
843
00:41:50,440 --> 00:41:54,600
When you build a Fabric data agent, you select up to a handful of sources: semantic models,
844
00:41:54,600 --> 00:41:58,560
lakehouses, warehouses, maybe an ontology when IQ matures.
845
00:41:58,560 --> 00:42:03,680
You wire tools around them, answer this class of questions, trigger this workflow, summarize
846
00:42:03,680 --> 00:42:04,760
these metrics.
847
00:42:04,760 --> 00:42:08,520
If you feed an agent a mix of certified and non-certified models because that's what
848
00:42:08,520 --> 00:42:11,920
we had, you've just encoded your drift into an API.
849
00:42:11,920 --> 00:42:17,240
Every bot built on top of that agent, in Teams, in custom apps, in operation centers, will
850
00:42:17,240 --> 00:42:18,960
inherit those ambiguities.
851
00:42:18,960 --> 00:42:23,080
Every automation that takes an AI answer and turns it into action will be grounded on whatever
852
00:42:23,080 --> 00:42:26,400
meaning was easiest for the agent to reach at configuration time.
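A simple guard against encoding drift into an agent is to refuse any source list that contains something other than certified models. The sketch below assumes you keep your own inventory of sources with an endorsement field. The class, the field values, and the source names are illustrative and not part of any Fabric SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    workspace: str
    endorsement: str  # assumed values: "certified", "promoted", "none"


def uncertified_sources(sources):
    """Return the sources that should not ground an agent."""
    return [s for s in sources if s.endorsement != "certified"]


# Hypothetical agent configuration mixing certified and forked models.
proposed = [
    Source("Finance Revenue", "Finance Prod", "certified"),
    Source("Sales Revenue Fork", "Sales Team", "promoted"),
    Source("Churn Flag v1", "Marketing Sandbox", "none"),
]

blocked = uncertified_sources(proposed)
for source in blocked:
    print(f"Not certified: {source.name} ({source.workspace})")
```

Running a check like this at configuration time turns "that's what we had" into an explicit decision: either the model gets certified under an owner, or it stays out of the agent.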
853
00:42:26,400 --> 00:42:29,040
So here is the high-level pattern.
854
00:42:29,040 --> 00:42:33,480
Before AI, semantic drift costs you trust in reports and meetings.
855
00:42:33,480 --> 00:42:38,880
After AI, semantic drift drives decisions and automation at speed without you in the loop.
856
00:42:38,880 --> 00:42:40,560
That is the actual risk curve.
857
00:42:40,560 --> 00:42:44,520
And this is why MVPs who live in this space keep saying a version of the same thing.
858
00:42:44,520 --> 00:42:46,160
AI doesn't break your data governance.
859
00:42:46,160 --> 00:42:49,280
It removes your ability to hide from how bad it already is.
860
00:42:49,280 --> 00:42:51,200
Copilot is not a hallucination engine.
861
00:42:51,200 --> 00:42:52,400
It is a semantic mirror.
862
00:42:52,400 --> 00:42:56,320
If you're uncomfortable with what you see in that mirror, the work is not to tune prompts.
863
00:42:56,320 --> 00:43:00,840
It is to decide, finally, which meanings in your Fabric tenant are allowed to exist at scale.
864
00:43:00,840 --> 00:43:05,800
And to constrain AI to those layers until you've paid down the rest of your semantic debt.
865
00:43:05,800 --> 00:43:09,520
Governance is not permissions: redefining the Fabric operating model.
866
00:43:09,520 --> 00:43:11,160
By now one thing should be obvious.
867
00:43:11,160 --> 00:43:12,840
You do not have a permissions problem.
868
00:43:12,840 --> 00:43:14,000
You have a meaning problem.
869
00:43:14,000 --> 00:43:17,440
So if you respond to everything we've just walked through by asking your Fabric admin
870
00:43:17,440 --> 00:43:19,760
to tighten access, you've missed the point.
871
00:43:19,760 --> 00:43:22,600
Governance is not the same thing as permissions administration.
872
00:43:22,600 --> 00:43:23,600
Permissions answer:
873
00:43:23,600 --> 00:43:25,760
Can this identity touch this object?
874
00:43:25,760 --> 00:43:26,840
Governance answers:
875
00:43:26,840 --> 00:43:28,720
What is this thing, who owns it,
876
00:43:28,720 --> 00:43:31,160
and when is it allowed to be reused?
877
00:43:31,160 --> 00:43:32,640
Those are different questions.
878
00:43:32,640 --> 00:43:34,240
They need different teams.
879
00:43:34,240 --> 00:43:37,920
Most organizations today run Fabric with one of two operating models.
880
00:43:37,920 --> 00:43:39,960
On one side you have the central BI police.
881
00:43:39,960 --> 00:43:41,240
Everything goes through one team.
882
00:43:41,240 --> 00:43:42,320
They own every dataset.
883
00:43:42,320 --> 00:43:43,280
They own every model.
884
00:43:43,280 --> 00:43:44,560
They gate every change.
885
00:43:44,560 --> 00:43:47,400
Self-service is tolerated, but only on the margins.
886
00:43:47,400 --> 00:43:48,800
The result is predictable.
887
00:43:48,800 --> 00:43:53,720
Long queues, frustrated domains, a flourishing black market of extracts and side systems.
888
00:43:53,720 --> 00:43:56,920
On the other side you have pure self-service anarchy.
889
00:43:56,920 --> 00:44:01,120
Every domain owns its own workspaces, builds its own models and answers its own questions.
890
00:44:01,120 --> 00:44:05,440
The platform team manages capacity and maybe sets some basic guardrails, but semantics
891
00:44:05,440 --> 00:44:06,880
are not their problem.
892
00:44:06,880 --> 00:44:08,800
The result is also predictable.
893
00:44:08,800 --> 00:44:13,640
Fast local wins, slow enterprise failures, and AI agents grounded on whatever model somebody
894
00:44:13,640 --> 00:44:14,960
happened to click first.
895
00:44:14,960 --> 00:44:15,960
Fabric needs a third thing.
896
00:44:15,960 --> 00:44:17,760
A Fabric platform team.
897
00:44:17,760 --> 00:44:20,400
Not a ticket queue, not a reporting factory.
898
00:44:20,400 --> 00:44:21,680
An architectural function.
899
00:44:21,680 --> 00:44:26,160
This team's job is to design and maintain the semantic and structural rules of the environment,
900
00:44:26,160 --> 00:44:27,640
not to build every report.
901
00:44:27,640 --> 00:44:31,040
They define domain boundaries, which domains exist.
902
00:44:31,040 --> 00:44:35,840
Finance, sales, HR, operations, clinical, retail, whatever matches your organization, and which
903
00:44:35,840 --> 00:44:38,520
workspaces are allowed to belong to each.
904
00:44:38,520 --> 00:44:44,040
They decide with business partners where regulated data is permitted to live and where it is not.
905
00:44:44,040 --> 00:44:49,120
They make temporary workspaces a conscious, time-boxed construct instead of an accident. They
906
00:44:49,120 --> 00:44:51,680
own workspace strategy.
907
00:44:51,680 --> 00:44:55,400
Not in the sense of approving every creation but of defining patterns.
908
00:44:55,400 --> 00:44:58,560
Domain-aligned workspaces, not endless project-aligned ones.
909
00:44:58,560 --> 00:45:02,120
Clear dev, test, and prod tiers for anything that matters.
910
00:45:02,120 --> 00:45:06,880
Explicit isolation for highly regulated domains so that PHI cannot quietly drift into someone's
911
00:45:06,880 --> 00:45:09,320
analytic sandbox because it was convenient.
912
00:45:09,320 --> 00:45:11,520
They define certification rules.
913
00:45:11,520 --> 00:45:13,800
What qualifies a semantic model as certified?
914
00:45:13,800 --> 00:45:16,280
What qualifies a table as master data?
915
00:45:16,280 --> 00:45:18,960
Who can apply those endorsements and under what process?
916
00:45:18,960 --> 00:45:23,320
How do you ensure that there is exactly one certified definition of customer and revenue
917
00:45:23,320 --> 00:45:27,240
per domain and that everything else is explicitly second-class?
918
00:45:27,240 --> 00:45:32,000
They maintain semantic standards, naming conventions, default grains, time intelligence policies,
919
00:45:32,000 --> 00:45:34,400
handling of adjustments and exclusions.
920
00:45:34,400 --> 00:45:39,760
The boring work that prevents churn, churned and churn rate from all meaning different things
921
00:45:39,760 --> 00:45:44,240
in different workspaces while using identical display names. And critically, they do all of
922
00:45:44,240 --> 00:45:45,240
this as a platform.
923
00:45:45,240 --> 00:45:48,800
They do not sit between domains and their work, they sit underneath it.
924
00:45:48,800 --> 00:45:51,960
Their mandate is not, you must file a ticket to add a measure.
925
00:45:51,960 --> 00:45:56,080
It is: we will make it easier to reuse the right meanings than to reinvent them, and we will
926
00:45:56,080 --> 00:45:57,840
make it visible when you drift.
927
00:45:57,840 --> 00:46:02,760
In practice, that means designing Fabric so that core entities and KPIs live in shared,
928
00:46:02,760 --> 00:46:08,120
domain-owned semantic models, discoverable and clearly certified. Local teams consume those
929
00:46:08,120 --> 00:46:12,720
models by reference, not by cloning and forking when they just need one more slice.
930
00:46:12,720 --> 00:46:16,760
If they truly need a different meaning, they declare a new metric with a new name and
931
00:46:16,760 --> 00:46:18,440
they own it explicitly.
932
00:46:18,440 --> 00:46:22,640
It also means that this team arbitrates common entities across domains.
933
00:46:22,640 --> 00:46:26,840
Customer shows up in sales, marketing, finance, support. Do you want four incompatible
934
00:46:26,840 --> 00:46:31,840
customer dimensions or one shared backbone with domain-specific extensions?
935
00:46:31,840 --> 00:46:36,960
The Fabric platform team forces that conversation before every domain builds its own version.
936
00:46:36,960 --> 00:46:38,720
Architecturally this is the difference.
937
00:46:38,720 --> 00:46:42,840
You are not governing who is allowed to open Power BI. You are governing which meanings
938
00:46:42,840 --> 00:46:45,040
are allowed to be reused at scale.
939
00:46:45,040 --> 00:46:49,440
Access control is necessary, it stops the wrong people from touching the right things.
940
00:46:49,440 --> 00:46:53,000
Semantic governance is what stops the right people from trusting the wrong things.
941
00:46:53,000 --> 00:46:57,320
Once you adopt that posture, everything about your Fabric operating model changes.
942
00:46:57,320 --> 00:47:01,520
You stop seeing workspaces as folders where people put stuff and start seeing them as
943
00:47:01,520 --> 00:47:04,480
surfaces where particular meanings are allowed to exist.
944
00:47:04,480 --> 00:47:09,200
You stop measuring success by number of dashboards and start measuring it by percentage of consumption
945
00:47:09,200 --> 00:47:10,880
that hits certified semantics.
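That success measure is easy to compute once you have query or view counts per semantic model. A rough sketch, assuming the counts come from whatever usage export you already have; the model names and numbers below are invented for illustration.

```python
def certified_consumption_share(query_counts, certified_models):
    """Fraction of all consumption that hit a certified semantic model."""
    total = sum(query_counts.values())
    if total == 0:
        return 0.0
    certified = sum(
        count for model, count in query_counts.items() if model in certified_models
    )
    return certified / total


# Hypothetical usage counts for one month.
usage = {
    "Finance Revenue": 600,
    "Sales Revenue Fork": 300,
    "Churn Flag v1": 100,
}
share = certified_consumption_share(usage, certified_models={"Finance Revenue"})
print(f"{share:.0%} of consumption hit certified semantics")
```

Tracked over time, this single number says more about governance than a count of dashboards: it goes up only when people actually reuse the authoritative meanings.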
946
00:47:10,880 --> 00:47:15,120
You stop asking your admins to be gatekeepers and start asking your platform team to be designers
947
00:47:15,120 --> 00:47:20,200
of an authorization and meaning fabric that reflects how your organization actually makes
948
00:47:20,200 --> 00:47:21,600
decisions.
949
00:47:21,600 --> 00:47:24,080
And once you have that, you can talk about something practical.
950
00:47:24,080 --> 00:47:26,920
How do you stand this up without freezing the tenant for a year?
951
00:47:26,920 --> 00:47:28,200
You don't need a five year program.
952
00:47:28,200 --> 00:47:32,440
You need a clear charter, some domains, and a day one, day 30, day 90 plan that turns governance
953
00:47:32,440 --> 00:47:37,120
from a slide into a set of irreversible decisions about how Fabric is allowed to behave.
954
00:47:37,120 --> 00:47:42,040
The Fabric governance model: charter, domains, day one, 30, 90.
955
00:47:42,040 --> 00:47:43,760
At this point you know what goes wrong.
956
00:47:43,760 --> 00:47:48,440
Now you need something uncomfortably specific, a way to run fabric that doesn't depend on
957
00:47:48,440 --> 00:47:49,440
heroics or hope.
958
00:47:49,440 --> 00:47:54,560
And that starts with a charter. Not a slide that says "enable self-service." An architectural
959
00:47:54,560 --> 00:47:58,920
sentence that answers a harder question: why does Fabric exist in this organization?
960
00:47:58,920 --> 00:48:02,680
If your honest answer is because it was in the E5 bundle, stop there.
961
00:48:02,680 --> 00:48:06,440
You're not governing, you're decorating. The only charter that survives contact with Fabric
962
00:48:06,440 --> 00:48:07,800
looks more like this.
963
00:48:07,800 --> 00:48:13,240
We run Fabric so the organization can make trusted decisions at speed using shared semantics
964
00:48:13,240 --> 00:48:14,960
over governed data.
965
00:48:14,960 --> 00:48:19,080
Trusted, at speed, shared semantics, governed data. Everything you do next has to either
966
00:48:19,080 --> 00:48:22,720
support that sentence or admit that you're optimizing for something else.
967
00:48:22,720 --> 00:48:27,640
On that charter, you define domains. Not just for security, for meaning. Finance, sales, HR,
968
00:48:27,640 --> 00:48:31,440
operations, clinical, retail, whatever maps cleanly to how your organization actually makes
969
00:48:31,440 --> 00:48:32,440
decisions.
970
00:48:32,440 --> 00:48:35,000
Each domain gets two explicit owners.
971
00:48:35,000 --> 00:48:38,840
A data owner accountable for what data is allowed to exist, how it's classified, how it
972
00:48:38,840 --> 00:48:44,160
flows. A semantic owner accountable for what core entities and KPIs mean in that domain,
973
00:48:44,160 --> 00:48:46,760
which models are authoritative and what gets certified.
974
00:48:46,760 --> 00:48:50,680
Those are not abstract titles, they are names you can put on a page and more importantly
975
00:48:50,680 --> 00:48:52,480
in Fabric and Purview.
976
00:48:52,480 --> 00:48:55,680
Workspaces then align to those domains, not to projects.
977
00:48:55,680 --> 00:49:00,680
Instead of project X and pilot Y scattered everywhere you get finance prod, finance dev,
978
00:49:00,680 --> 00:49:06,960
sales prod, HR prod, and so on. Regulated domains, clinical, HR, anything with PHI or payroll,
979
00:49:06,960 --> 00:49:12,400
are isolated by design: dedicated capacities if necessary, hardened workspace settings,
980
00:49:12,400 --> 00:49:14,640
no casual shortcuts in or out.
981
00:49:14,640 --> 00:49:17,640
Dev, test, prod become tiers you enforce, not vibes.
982
00:49:17,640 --> 00:49:21,760
Dev workspaces are where experiments happen, they are noisy, short lived and cheap.
983
00:49:21,760 --> 00:49:24,920
Test workspaces exist only for things on a path to production.
984
00:49:24,920 --> 00:49:28,840
Prod workspaces contain only assets someone is willing to sign for.
985
00:49:28,840 --> 00:49:32,960
Inside those prod workspaces you draw the sharpest line you have.
986
00:49:32,960 --> 00:49:35,280
Certified versus promoted semantic models.
987
00:49:35,280 --> 00:49:37,840
Certified models are governed, owned and authoritative.
988
00:49:37,840 --> 00:49:43,080
They represent the meanings you are willing to let AI, executives and downstream systems reuse
989
00:49:43,080 --> 00:49:47,920
at scale. They have an explicit owner, documented logic and a change process.
990
00:49:47,920 --> 00:49:49,720
Promoted models are useful but not trusted.
991
00:49:49,720 --> 00:49:52,840
They are allowed to exist, they are allowed to help people think, they are not allowed
992
00:49:52,840 --> 00:49:58,400
to silently redefine revenue or customer for the organization. Everything else is experimental
993
00:49:58,400 --> 00:49:59,960
and you treat it that way.
994
00:49:59,960 --> 00:50:03,460
Of course, none of this appears magically. You have a tenant full of drift already, so you
995
00:50:03,460 --> 00:50:05,560
need a day 0, day 30, day 90 plan.
996
00:50:05,560 --> 00:50:09,600
Day 0 is the freeze. You don't turn Fabric off; you stop net-new chaos. You freeze uncontrolled
997
00:50:09,600 --> 00:50:13,360
workspace creation by tightening who can create them and under which domains.
998
00:50:13,360 --> 00:50:17,520
You stop people from publishing brand new semantic models into production workspaces without
999
00:50:17,520 --> 00:50:22,600
review. You identify the top 10 or 20 datasets and models by consumption, the ones most
1000
00:50:22,600 --> 00:50:27,480
of the organization already depends on, and you put a temporary "do not clone casually" sticker
1001
00:50:27,480 --> 00:50:28,760
on them.
1002
00:50:28,760 --> 00:50:33,200
The goal of day 0 is simple: stop digging. Day 30 is inventory and ownership. By then your
1003
00:50:33,200 --> 00:50:37,640
platform team has run through usage metrics, Purview scans and workspace lists.
1004
00:50:37,640 --> 00:50:41,520
For anything in a prod-like workspace that is clearly in active use, reports opened
1005
00:50:41,520 --> 00:50:46,640
regularly, models queried constantly, tables feeding multiple downstream artifacts, you
1006
00:50:46,640 --> 00:50:50,040
assign an owner. Not a team, a person.
1007
00:50:50,040 --> 00:50:54,200
Every production dataset, every widely used semantic model, every lakehouse that feeds
1008
00:50:54,200 --> 00:51:00,320
critical dashboards gets a named data owner and where applicable a semantic owner.
1009
00:51:00,320 --> 00:51:04,640
You log that somewhere boring and durable: a catalog, a table, whatever you will actually
1010
00:51:04,640 --> 00:51:09,520
maintain. In parallel, you define semantic standards for the top handful of KPIs: revenue,
1011
00:51:09,520 --> 00:51:12,280
customer, churn, risk, whatever your board obsesses over.
1012
00:51:12,280 --> 00:51:16,520
For each you decide which model is authoritative, what the definition is and how it will be
1013
00:51:16,520 --> 00:51:17,520
surfaced.
1014
00:51:17,520 --> 00:51:21,200
You stand up the smallest possible certification process that makes changes visible and
1015
00:51:21,200 --> 00:51:24,360
reviewable without recreating the BI police.
1016
00:51:24,360 --> 00:51:28,240
By day 30, nothing is everybody's problem anymore. Drift can still happen, but at least you
1017
00:51:28,240 --> 00:51:30,080
know whose job it is to notice.
1018
00:51:30,080 --> 00:51:32,000
Day 90 is enforcement.
1019
00:51:32,000 --> 00:51:36,160
This is where domains stop being labels and start being boundaries. You align every workspace
1020
00:51:36,160 --> 00:51:40,640
to a domain, you shut down or archive the ones nobody will claim, you enforce that regulated
1021
00:51:40,640 --> 00:51:44,960
data lives only in regulated domains, and you back that up with Purview classification
1022
00:51:44,960 --> 00:51:47,560
and DLP, not just naming conventions.
1023
00:51:47,560 --> 00:51:52,160
In Fabric, you turn on the features you were pretending to use: enforce domain isolation,
1024
00:51:52,160 --> 00:51:56,520
constrain cross-workspace semantic reuse to certified models where it matters, and start
1025
00:51:56,520 --> 00:52:02,000
wiring AI access (Copilot, data agents) so that by default they can only see certified
1026
00:52:02,000 --> 00:52:03,080
semantic layers.
1027
00:52:03,080 --> 00:52:07,160
You also start measuring what percentage of consumption is hitting certified models versus
1028
00:52:07,160 --> 00:52:10,760
everything else, how many production data sets lack an owner, how many definitions of
1029
00:52:10,760 --> 00:52:13,800
revenue still exist, and whether that number is going down.
1030
00:52:13,800 --> 00:52:17,920
By day 90, the platform will look roughly the same from the outside. The critical difference
1031
00:52:17,920 --> 00:52:19,880
is this.
1032
00:52:19,880 --> 00:52:24,360
Before, Fabric reflected your drift and hid it behind good intentions. After, Fabric reflects
1033
00:52:24,360 --> 00:52:28,040
your intent and makes drift visible the moment it matters.
1034
00:52:28,040 --> 00:52:32,200
Metrics that matter in Fabric governance. At this point intent is clear and the operating model
1035
00:52:32,200 --> 00:52:35,560
is defined. Now you need something much less glamorous.
1036
00:52:35,560 --> 00:52:36,560
Proof.
1037
00:52:36,560 --> 00:52:40,280
If you can't measure whether semantic drift is shrinking or growing, you are not governing
1038
00:52:40,280 --> 00:52:42,080
Fabric, you're decorating it.
1039
00:52:42,080 --> 00:52:45,680
So what does proof look like in a Fabric tenant? Start with the only metric that really tells
1040
00:52:45,680 --> 00:52:49,760
you whether semantics are landing: what percentage of enterprise consumption hits certified
1041
00:52:49,760 --> 00:52:54,880
semantic models. Not how many models are certified; that's vanity. Consumption is the
1042
00:52:54,880 --> 00:52:56,680
reality.
1043
00:52:56,680 --> 00:53:01,600
You want to know across reports, dashboards, Excel connections and AI agents, what fraction
1044
00:53:01,600 --> 00:53:05,920
of queries resolve against a small known set of certified models versus everything else.
1045
00:53:05,920 --> 00:53:10,760
If that number is low, you've built a beautiful semantic layer nobody is actually using.
1046
00:53:10,760 --> 00:53:14,400
If that number grows over time, you're bending drift back toward intent.
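As a rough illustration of that first metric, here is a minimal Python sketch. It assumes you can already export query events that name the semantic model each query resolved against; the `QueryEvent` shape, the model IDs and the `certified_consumption_share` helper are all hypothetical and are not a Fabric or Purview API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryEvent:
    """One consumption event. Hypothetical shape, not a Fabric export format."""
    model_id: str  # semantic model the query resolved against
    source: str    # e.g. "report", "excel", "ai_agent"


def certified_consumption_share(events, certified_model_ids):
    """Return the fraction of events that hit a certified model (0.0 if no events)."""
    events = list(events)
    if not events:
        return 0.0
    certified = set(certified_model_ids)
    hits = sum(1 for e in events if e.model_id in certified)
    return hits / len(events)


# Illustrative data only: two of four queries hit the one certified model.
events = [
    QueryEvent("finance_revenue", "report"),
    QueryEvent("finance_revenue", "ai_agent"),
    QueryEvent("sales_copy_v3", "excel"),
    QueryEvent("sandbox_churn", "report"),
]
share = certified_consumption_share(events, {"finance_revenue"})
print(f"{share:.0%} of consumption hits certified models")
```

The point of counting events rather than models is the same one made above: a tenant with many certified models and little traffic on them would still score low here.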
1047
00:53:14,400 --> 00:53:16,520
Second, data owner coverage.
1048
00:53:16,520 --> 00:53:21,040
For every dataset, lakehouse, warehouse and semantic model that behaves like production,
1049
00:53:21,040 --> 00:53:25,320
meaning people depend on it to make decisions, not just to experiment, you want a named
1050
00:53:25,320 --> 00:53:26,600
accountable owner.
1051
00:53:26,600 --> 00:53:30,800
This is binary. There is no partial credit: either there is a person whose name you can
1052
00:53:30,800 --> 00:53:33,400
put next to that asset or there isn't.
1053
00:53:33,400 --> 00:53:37,800
Your metric is simple: the proportion of production artifacts with a named data owner and
1054
00:53:37,800 --> 00:53:39,680
where relevant, a semantic owner.
1055
00:53:39,680 --> 00:53:44,720
If a model powers revenue, risk, churn or anything else executives argue about, "it belongs
1056
00:53:44,720 --> 00:53:46,800
to the team" is not an answer.
1057
00:53:46,800 --> 00:53:49,320
Teams don't attend change approval calls, people do.
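To make the binary nature of owner coverage concrete, here is a small Python sketch. It assumes you keep an inventory of artifacts with a production flag and an owner field; the `Artifact` record and `owner_coverage` function are hypothetical names for illustration, not part of any Microsoft tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Artifact:
    """Inventory row. Hypothetical shape for illustration."""
    name: str
    is_production: bool
    data_owner: Optional[str] = None  # a named person, not a team alias


def owner_coverage(artifacts):
    """Return (coverage, unowned_names) over production artifacts only.

    An empty or whitespace-only owner counts as no owner: no partial credit.
    """
    prod = [a for a in artifacts if a.is_production]
    if not prod:
        return 1.0, []
    unowned = [a.name for a in prod if not (a.data_owner or "").strip()]
    return (len(prod) - len(unowned)) / len(prod), unowned


# Illustrative inventory: three production artifacts, one without an owner.
inventory = [
    Artifact("finance_revenue_model", True, "A. Rivera"),
    Artifact("sales_lakehouse", True, "J. Chen"),
    Artifact("hr_headcount", True, None),
    Artifact("scratch_experiment", False, None),
]
coverage, unowned = owner_coverage(inventory)
print(f"owner coverage: {coverage:.0%}, unowned: {unowned}")
```

Returning the list of unowned names alongside the ratio keeps the metric actionable: the number tells you the trend, the list tells you whom to go and find.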
1058
00:53:49,320 --> 00:53:53,640
Third, duplicate semantic models per core KPI. Pick the half-dozen metrics that define your
1059
00:53:53,640 --> 00:53:54,640
business.
1060
00:53:54,640 --> 00:53:58,960
Revenue, customer, churn, risk, same store sales, whatever fits your reality.
1061
00:53:58,960 --> 00:54:03,040
For each one, count how many distinct implementations exist in Fabric. Not how many times the word
1062
00:54:03,040 --> 00:54:06,920
appears, but how many different DAX or SQL definitions you would find if you traced them.
1063
00:54:06,920 --> 00:54:11,440
If you have 10 versions of revenue and one certified model, your goal is not to drag everything
1064
00:54:11,440 --> 00:54:15,240
into central finance. It is to create a visible trajectory.
1065
00:54:15,240 --> 00:54:20,880
Those 10 becoming 8, then 5, then 3 as teams converge on shared semantics or rename their
1066
00:54:20,880 --> 00:54:21,880
local variants honestly.
1067
00:54:21,880 --> 00:54:25,360
You're not chasing perfection, you're chasing monotonic improvement.
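One crude way to count distinct implementations is sketched below in Python. It assumes you have already extracted measure expressions per model into `(kpi, model, expression)` tuples, which is a hypothetical format. The normalization only strips whitespace and case, so it catches copy-paste duplicates and cannot tell that two differently written expressions are logically equivalent.

```python
import re
from collections import defaultdict


def normalize(expr):
    """Crude textual fingerprint: drop all whitespace and lowercase.

    This is an assumption-laden shortcut, not a DAX or SQL parser.
    """
    return re.sub(r"\s+", "", expr).lower()


def distinct_definitions(measures):
    """measures: iterable of (kpi, model, expression). Returns {kpi: distinct count}."""
    seen = defaultdict(set)
    for kpi, _model, expr in measures:
        seen[kpi].add(normalize(expr))
    return {kpi: len(exprs) for kpi, exprs in seen.items()}


# Illustrative data: the EMEA copy differs only in spacing and case,
# the APAC copy really does define revenue differently.
measures = [
    ("revenue", "finance_certified", "SUM(Sales[Amount])"),
    ("revenue", "emea_copy", "sum( Sales[Amount] )"),
    ("revenue", "apac_copy", "SUM(Sales[Amount]) - SUM(Sales[Discount])"),
    ("churn", "cs_model", "DIVIDE([Lost], [Start])"),
]
counts = distinct_definitions(measures)
print(counts)
```

Tracked over time, the per-KPI count is the trajectory described above: what matters is that it keeps going down, not that it reaches one immediately.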
1068
00:54:25,360 --> 00:54:30,480
Fourth, sensitivity label compliance in regulated domains. In clinical, HR, finance, anywhere
1069
00:54:30,480 --> 00:54:34,280
regulators like to live, you want a hard number for how much of the data estate is actually
1070
00:54:34,280 --> 00:54:36,080
classified and enforced.
1071
00:54:36,080 --> 00:54:41,160
Purview can tell you how many assets are labeled, which labels they carry and where labels propagate.
1072
00:54:41,160 --> 00:54:45,320
Your governance metric is the percentage of tables, files and models that contain regulated
1073
00:54:45,320 --> 00:54:50,600
data and also carry the correct sensitivity labels with enforcement policies active.
1074
00:54:50,600 --> 00:54:54,840
If that number is low, you are not running regulated domains, you are running wishful thinking
1075
00:54:54,840 --> 00:54:56,520
with good slideware.
1076
00:54:56,520 --> 00:54:59,440
Fifth, AI consumption from uncertified models.
1077
00:54:59,440 --> 00:55:02,040
This one is your early warning system for semantic risk.
1078
00:55:02,040 --> 00:55:06,840
For every Copilot interaction, every data agent query, every AI-driven workflow, you want
1079
00:55:06,840 --> 00:55:10,560
to know which models were used to answer the question or drive the action.
1080
00:55:10,560 --> 00:55:15,120
And whether those models were certified, merely promoted or completely ad hoc.
1081
00:55:15,120 --> 00:55:19,080
If a large share of AI usage is grounded on uncertified semantics, then your most powerful
1082
00:55:19,080 --> 00:55:22,360
amplification engine is wired directly into your drift.
1083
00:55:22,360 --> 00:55:25,560
You shouldn't be surprised when it embarrasses you in front of executives.
1084
00:55:25,560 --> 00:55:27,880
You should be surprised it hasn't done so more often.
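A sketch of that early-warning breakdown, in Python. It assumes you can obtain the list of model IDs that AI interactions were grounded on, plus a lookup of each model's endorsement state; both inputs and the `ai_usage_by_endorsement` name are hypothetical, and a model missing from the lookup is treated as ad hoc ("none").

```python
from collections import Counter


def ai_usage_by_endorsement(ai_queries, endorsement):
    """Share of AI-grounded queries per endorsement state.

    ai_queries: iterable of model IDs used to answer AI interactions.
    endorsement: {model_id: "certified" | "promoted"}; anything else is "none".
    """
    counts = Counter(endorsement.get(model_id, "none") for model_id in ai_queries)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {state: n / total for state, n in counts.items()}


# Illustrative data: half of AI usage is grounded on the certified model.
endorsement = {"finance_revenue": "certified", "regional_sales": "promoted"}
ai_queries = ["finance_revenue", "finance_revenue", "sandbox_churn", "regional_sales"]

usage = ai_usage_by_endorsement(ai_queries, endorsement)
uncertified_share = 1.0 - usage.get("certified", 0.0)
print(usage, f"uncertified: {uncertified_share:.0%}")
```

Keeping "promoted" and "none" separate matters because the response differs: promoted models may be candidates for certification, while ad hoc ones probably should not be reachable by agents at all.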
1085
00:55:27,880 --> 00:55:32,080
There are other metrics you can track, like workspace sprawl, orphaned artifacts and time to certify,
1086
00:55:32,080 --> 00:55:36,520
but these five form a minimum viable truth dashboard for Fabric governance: percentage
1087
00:55:36,520 --> 00:55:41,160
of consumption on certified semantics, owner coverage for production assets, duplicate
1088
00:55:41,160 --> 00:55:43,520
implementations per core KPI.
1089
00:55:43,520 --> 00:55:48,080
Label compliance in regulated domains, AI usage grounded on uncertified models.
1090
00:55:48,080 --> 00:55:51,840
If any of those are unmeasured, you are flying a very expensive platform with the instruments
1091
00:55:51,840 --> 00:55:52,840
turned off.
1092
00:55:52,840 --> 00:55:56,920
And if you find yourself tempted to invent softer, more flattering metrics, number of
1093
00:55:56,920 --> 00:56:01,800
workspaces created, reports published, Copilot questions asked.
1094
00:56:01,800 --> 00:56:04,040
Stop and ask a harder question.
1095
00:56:04,040 --> 00:56:07,800
Do these numbers tell us anything about whether we are enforcing meaning or just about
1096
00:56:07,800 --> 00:56:10,840
how busy we are at generating more of it?
1097
00:56:10,840 --> 00:56:15,280
Because once you start treating semantics as a governed surface, not an accidental byproduct,
1098
00:56:15,280 --> 00:56:18,840
something else becomes possible. You can apply the same posture to meaning that you already
1099
00:56:18,840 --> 00:56:23,960
pretend to apply to networks and identities: zero trust. Not just for who connects and
1100
00:56:23,960 --> 00:56:27,840
from where, but for which definitions you are willing to trust by default.
1101
00:56:27,840 --> 00:56:29,440
Zero trust for data and semantics.
1102
00:56:29,440 --> 00:56:31,920
Zero trust has been marketed to you for years.
1103
00:56:31,920 --> 00:56:34,640
Never trust, always verify.
1104
00:56:34,640 --> 00:56:35,640
Assume breach.
1105
00:56:35,640 --> 00:56:37,240
Enforce least privilege.
1106
00:56:37,240 --> 00:56:41,520
Most organizations dutifully apply that to networks, devices and sign-ins.
1107
00:56:41,520 --> 00:56:42,880
They turned on conditional access.
1108
00:56:42,880 --> 00:56:43,880
They tightened VPNs.
1109
00:56:43,880 --> 00:56:45,840
They added MFA to anything that moved.
1110
00:56:45,840 --> 00:56:49,240
Almost nobody applied it to data and nobody applied it to meaning.
1111
00:56:49,240 --> 00:56:51,760
In a Fabric world, that gap is no longer theoretical.
1112
00:56:51,760 --> 00:56:56,480
So take the same three zero trust principles and translate them ruthlessly into how you
1113
00:56:56,480 --> 00:56:58,880
treat both data and semantics.
1114
00:56:58,880 --> 00:57:00,840
Start with verify explicitly.
1115
00:57:00,840 --> 00:57:04,680
In identity terms, that means you don't trust a token just because it exists.
1116
00:57:04,680 --> 00:57:07,520
You evaluate device state, location, risk.
1117
00:57:07,520 --> 00:57:10,440
In Fabric terms, you extend that posture to three layers.
1118
00:57:10,440 --> 00:57:12,520
Who is this identity in Entra?
1119
00:57:12,520 --> 00:57:14,720
Which domain and workspace are they operating in?
1120
00:57:14,720 --> 00:57:16,680
Which semantic models are they allowed to reuse?
1121
00:57:16,680 --> 00:57:20,520
You stop assuming that because someone works in finance, they should see every finance model.
1122
00:57:20,520 --> 00:57:25,080
You stop assuming that because a semantic model lives in a prod workspace, it must be
1123
00:57:25,080 --> 00:57:26,240
authoritative.
1124
00:57:26,240 --> 00:57:28,840
Every access to a certified semantic model is a decision.
1125
00:57:28,840 --> 00:57:32,560
Does this person in this role, in this domain, need to reuse this meaning?
1126
00:57:32,560 --> 00:57:36,120
If the answer is no, they can still explore data, but they do it against non-certified layers
1127
00:57:36,120 --> 00:57:39,240
where their experiments can't redefine enterprise truth.
1128
00:57:39,240 --> 00:57:41,040
Then apply least privilege.
1129
00:57:41,040 --> 00:57:45,400
Most people implement least privilege as "give analysts Viewer, not Member."
1130
00:57:45,400 --> 00:57:46,880
Architecturally that's shallow.
1131
00:57:46,880 --> 00:57:50,360
Least privilege for data is not just about rows and columns. It's about which metrics can
1132
00:57:50,360 --> 00:57:51,360
be referenced where.
1133
00:57:51,360 --> 00:57:56,120
A sales analyst might need row-level access to detailed transactions in their region.
1134
00:57:56,120 --> 00:57:59,800
That doesn't mean they should be allowed to build new AI agents grounded on the global
1135
00:57:59,800 --> 00:58:01,080
certified revenue model.
1136
00:58:01,080 --> 00:58:05,240
A data scientist working on churn experiments might need full column access to customer features
1137
00:58:05,240 --> 00:58:06,240
in a sandbox.
1138
00:58:06,240 --> 00:58:09,920
That doesn't mean their experimental churn score 7 should ever show up in Copilot suggestions
1139
00:58:09,920 --> 00:58:10,920
for executives.
1140
00:58:10,920 --> 00:58:15,160
Least privilege for semantics means you constrain which models can be used as sources for other
1141
00:58:15,160 --> 00:58:16,160
models.
1142
00:58:16,160 --> 00:58:20,480
You constrain which models AI can see by default. You constrain which models can be referenced
1143
00:58:20,480 --> 00:58:22,280
across domains without review.
1144
00:58:22,280 --> 00:58:26,320
You are not just asking "can you query this table?" You are asking "can you compose with
1145
00:58:26,320 --> 00:58:29,800
this meaning?" Finally, assume breach.
1146
00:58:29,800 --> 00:58:34,720
In the network world, that means you operate as though an attacker is already inside. In Fabric,
1147
00:58:34,720 --> 00:58:39,600
you operate as though drift is already inside, because it is. Assume access has already widened
1148
00:58:39,600 --> 00:58:43,960
beyond what your diagram says. Assume there are already five versions of revenue and
1149
00:58:43,960 --> 00:58:46,120
three of churn in production.
1150
00:58:46,120 --> 00:58:50,480
Assume there are orphaned models that AI will happily root through if you don't stop it.
1151
00:58:50,480 --> 00:58:54,160
Assume drift means you build continuous observation into the platform.
1152
00:58:54,160 --> 00:58:58,680
You monitor for semantic changes: a certified measure's definition changes, a model's filters
1153
00:58:58,680 --> 00:59:01,840
are edited, a key relationship is dropped.
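The kind of semantic change monitoring described here can be sketched in a few lines of Python. This assumes you periodically snapshot each certified model's measures as a `{measure_name: expression}` mapping; how you obtain that snapshot is out of scope, and `fingerprint` and `semantic_changes` are hypothetical helper names.

```python
import hashlib


def fingerprint(expr):
    """Stable hash of a definition, so snapshots can be stored and compared cheaply."""
    return hashlib.sha256(expr.encode("utf-8")).hexdigest()


def semantic_changes(previous, current):
    """Compare two {measure_name: expression} snapshots.

    Returns a list of (kind, name) governance events, sorted by measure name,
    where kind is "added", "removed" or "changed".
    """
    events = []
    for name in sorted(previous.keys() | current.keys()):
        if name not in current:
            events.append(("removed", name))
        elif name not in previous:
            events.append(("added", name))
        elif fingerprint(previous[name]) != fingerprint(current[name]):
            events.append(("changed", name))
    return events


# Illustrative snapshots of one certified model, a week apart.
last_week = {
    "Revenue": "SUM(Sales[Amount])",
    "Active Customers": "DISTINCTCOUNT(Sales[CustomerId])",
}
this_week = {
    "Revenue": "SUM(Sales[Amount]) - SUM(Sales[Returns])",
    "Churn": "DIVIDE([Lost], [Start])",
}
changes = semantic_changes(last_week, this_week)
for kind, name in changes:
    print(f"governance event: {kind} {name}")
```

In this sketch every change to a certified measure becomes an event someone can be alerted on; deciding which events require downstream re-acknowledgement is a policy choice the code does not make.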
1154
00:59:01,840 --> 00:59:06,320
Those events aren't just version history. They are governance events. Someone should be alerted.
1155
00:59:06,320 --> 00:59:09,960
In some cases, downstream consumers should be forced to re-acknowledge that the meaning
1156
00:59:09,960 --> 00:59:15,920
changed. You monitor for ownership gaps: production artifacts without owners, certified models
1157
00:59:15,920 --> 00:59:20,440
where the listed owner hasn't logged in for six months, workspaces with high consumption
1158
00:59:20,440 --> 00:59:22,600
and zero named stewards.
1159
00:59:22,600 --> 00:59:27,120
You monitor for usage anomalies: AI agents pulling heavily from uncertified models, sudden spikes
1160
00:59:27,120 --> 00:59:31,800
in consumption of a sandbox lakehouse, reports in executive apps grounded on non-certified
1161
00:59:31,800 --> 00:59:36,520
semantics. In other words, you stop trusting your own configuration. You treat every semantic
1162
00:59:36,520 --> 00:59:42,320
object as guilty until it has a clear owner, a clear definition, a clear endorsement state
1163
00:59:42,320 --> 00:59:46,760
and a clear usage pattern that matches its intent. Zero trust for data is not about locking
1164
00:59:46,760 --> 00:59:51,400
everything down. It is about trusting fewer things by default. You deliberately shrink the
1165
00:59:51,400 --> 00:59:56,280
set of meanings that are allowed to flow freely into AI, into board decks, into automated workflows.
1166
00:59:56,280 --> 01:00:00,720
Everything else has to earn its way in through ownership, certification and observation.
1167
01:00:00,720 --> 01:00:05,760
Once you take that posture seriously, design decisions change. You create fewer one-off semantic
1168
01:00:05,760 --> 01:00:09,480
models, because you know they will become attack surfaces for drift. You push harder on
1169
01:00:09,480 --> 01:00:13,800
reuse of certified entities, because that's the only way to keep your AI surface area small
1170
01:00:13,800 --> 01:00:18,520
enough to reason about. You stop pretending that "more models" is progress and start measuring
1171
01:00:18,520 --> 01:00:22,960
"more consumption on fewer, better models" as the real signal. And you stop blaming Fabric
1172
01:00:22,960 --> 01:00:28,440
for doing exactly what you told it to do: secure the objects, accelerate the drift, and expose
1173
01:00:28,440 --> 01:00:32,400
whether you are willing to govern meaning with the same paranoia you already claim to apply
1174
01:00:32,400 --> 01:00:38,880
to identity. The future: Fabric, AI agents, and meaning at scale. Most organizations still talk
1175
01:00:38,880 --> 01:00:43,760
about AI as if it were a smarter reporting tool. Architecturally, it is something else. The future
1176
01:00:43,760 --> 01:00:48,240
of Fabric is not people opening Power BI and clicking through semantic models. It is agents
1177
01:00:48,240 --> 01:00:53,960
(Copilot, data agents, operational bots) consuming your semantics directly, without you in the loop.
1178
01:00:53,960 --> 01:00:58,520
Your semantic model stops being the thing behind the report. It becomes an API to AI. Fabric
1179
01:00:58,520 --> 01:01:03,760
IQ makes that explicit. Underneath the branding, it is building an ontology, an entity graph
1180
01:01:03,760 --> 01:01:11,880
of your existing artifacts: customer, order, shipment, sensor, contract, churn, risk, the relationships,
1181
01:01:11,880 --> 01:01:16,120
the rules, the constraints. It is taking the semantics you already encoded in models, tables
1182
01:01:16,120 --> 01:01:20,720
and logs and compiling them into something agents can reason over. That sounds powerful. It
1183
01:01:20,720 --> 01:01:25,680
is also unforgiving, because wrong meaning scales faster than wrong data. If you mistype
1184
01:01:25,680 --> 01:01:30,800
a value in a source system, the blast radius is local. A few reports are wrong, someone notices,
1185
01:01:30,800 --> 01:01:35,920
you fix it. If you misdefine "at-risk customer" in the ontology that every AI agent uses to
1186
01:01:35,920 --> 01:01:40,720
triage support tickets, route renewals and trigger discounts, that error propagates everywhere,
1187
01:01:40,720 --> 01:01:45,640
instantly. Every bot that calls that definition makes the same wrong decision. Every automation
1188
01:01:45,640 --> 01:01:49,560
built on top of those bots inherits the same flaw. You don't just have a bad report; you
1189
01:01:49,560 --> 01:01:54,320
have institutionalized a bad rule. Fabric's direction is clear: more of your operational logic
1190
01:01:54,320 --> 01:01:59,000
will live in that semantic layer. Operational agents watch real-time streams from OneLake,
1191
01:01:59,000 --> 01:02:03,320
KQL or IoT. They use the ontology to understand that truck temperature above threshold for more
1192
01:02:03,320 --> 01:02:07,880
than four hours, and shipment contains vaccine, means cold chain breach, which means: alert compliance,
1193
01:02:07,880 --> 01:02:12,760
hold inventory, notify customer. That's not a dashboard. That is a controlled process grounded
1194
01:02:12,760 --> 01:02:17,560
in semantics. If the ontology is wrong, if "vaccine" doesn't include a new product line, if the
1195
01:02:17,560 --> 01:02:22,360
threshold logic was copied from a pilot and never updated, then the agent will do exactly
1196
01:02:22,360 --> 01:02:27,720
what you told it, at machine speed, with no hesitation. This is the uncomfortable consistency
1197
01:02:27,720 --> 01:02:33,480
of AI. Agents are not creative about your meanings. They are obedient. They will apply whatever
1198
01:02:33,480 --> 01:02:37,400
definitions they can reach. They will not stop mid-workflow and ask "are you sure this is the right
1199
01:02:37,400 --> 01:02:42,680
revenue scenario?" or "did legal approve this risk score for use in Europe?" That's your job, upstream.
1200
01:02:42,680 --> 01:02:47,880
So the question "how do I govern Fabric data access?" quietly becomes "which semantics am I willing
1201
01:02:47,880 --> 01:02:53,320
to expose as APIs to automation, and under what conditions?" That is what Fabric IQ and future
1202
01:02:53,320 --> 01:02:58,040
agents formalize. They turn your loosely managed semantic layer into a first-class dependency graph
1203
01:02:58,040 --> 01:03:02,520
for decision-making, compliance and control. They don't create new risk categories. They compress the
1204
01:03:02,520 --> 01:03:07,880
time it takes for your existing semantic risk to turn into real-world consequences. Organizations
1205
01:03:07,880 --> 01:03:12,520
that understand this treat semantics as critical infrastructure. They build change control for measures
1206
01:03:12,520 --> 01:03:17,480
the way they build change control for firewall rules. They test new ontology relationships with
1207
01:03:17,480 --> 01:03:22,040
synthetic scenarios before letting agents act on them. They restrict AI access to a narrow band
1208
01:03:22,040 --> 01:03:26,600
of certified concepts until they've earned the right to widen it. Everyone else keeps asking why
1209
01:03:26,600 --> 01:03:31,880
Copilot hallucinates when, in reality, it is just following their ontology. The future of Fabric is
1210
01:03:31,880 --> 01:03:36,920
not optional semantics. It is meaning at scale. You either govern that meaning on purpose, or you watch
1211
01:03:36,920 --> 01:03:42,840
AI industrialize whatever you left lying around. The real answer to "how do I govern Fabric data access?"
1212
01:03:42,840 --> 01:03:47,480
So here's the honest answer to the question people type into search. You govern Fabric data access
1213
01:03:47,480 --> 01:03:53,240
by treating access as the easy part and meaning as the hard part. Entra, OneLake, workspace roles,
1214
01:03:53,240 --> 01:03:57,960
Purview: the platform already knows how to lock doors. Your real work is deciding which semantics
1215
01:03:57,960 --> 01:04:03,240
are allowed to leave the room, who owns them, and when AI is allowed to reuse them without asking.
1216
01:04:03,240 --> 01:04:09,080
Fabric is secure by design. Your data model will drift unless governance is engineered into creation,
1217
01:04:09,080 --> 01:04:14,520
sharing and consumption. If you're responsible for enterprise data trust, your next steps are simple
1218
01:04:14,520 --> 01:04:19,960
and non-optional: define domains, stand up a real platform team, certify semantics, measure drift,
1219
01:04:19,960 --> 01:04:24,840
and keep AI constrained to what you actually trust. Because Microsoft Fabric doesn't break data
1220
01:04:24,840 --> 01:04:29,560
governance. It exposes whether your organization can agree on truth fast enough to scale AI.
00:00:00,000 --> 00:00:03,180
Most organizations come to Fabric with the same comforting assumption.
2
00:00:03,180 --> 00:00:05,760
If we get the permissions right, the numbers will be right.
3
00:00:05,760 --> 00:00:08,760
They are wrong. Your real problem is not who can open a report.
4
00:00:08,760 --> 00:00:11,760
It's what that report thinks revenue means this week.
5
00:00:11,760 --> 00:00:14,680
Fabric feels dangerous because it industrializes meaning.
6
00:00:14,680 --> 00:00:18,160
Once a metric definition leaks into the platform, it doesn't stay local.
7
00:00:18,160 --> 00:00:23,440
It propagates, mutates, and quietly competes with every other definition you never retired.
8
00:00:23,440 --> 00:00:26,320
So the useful question is not "how do I lock Fabric down?"
9
00:00:26,320 --> 00:00:27,480
The useful question is,
10
00:00:27,480 --> 00:00:31,240
how do I keep my data model from drifting faster than my governance can follow?
11
00:00:31,240 --> 00:00:34,440
If you're responsible for enterprise data trust, this is for you.
12
00:00:34,440 --> 00:00:37,840
Because Fabric is secure by design, and your data model will still drift.
13
00:00:37,840 --> 00:00:41,080
The only real choice you have is whether that drift is observed and governed,
14
00:00:41,080 --> 00:00:43,800
or silent and catastrophic at AI speed.
15
00:00:43,800 --> 00:00:45,960
Why Fabric feels like too much power.
16
00:00:45,960 --> 00:00:51,280
Most people meet Fabric through the marketing diagram: one lake, many workloads, everything unified.
17
00:00:51,280 --> 00:00:53,080
They translate that as a tool story.
18
00:00:53,080 --> 00:00:54,800
Architecturally, it is something else.
19
00:00:54,800 --> 00:00:59,800
You've collapsed engineering, warehousing, BI and AI into a single plane, all wired into
20
00:00:59,800 --> 00:01:04,800
Entra as the decision engine. One tenant, one identity fabric, one logical lake.
21
00:01:04,800 --> 00:01:10,040
Every role assignment, every workspace, every shortcut participates in a shared authorization
22
00:01:10,040 --> 00:01:12,280
graph. That is a lot of power in one place.
23
00:01:12,280 --> 00:01:15,280
So the first questions you hear internally are completely predictable.
24
00:01:15,280 --> 00:01:16,560
Who can see what?
25
00:01:16,560 --> 00:01:17,680
Who owns this model?
26
00:01:17,680 --> 00:01:20,200
Why do we already have two answers for the same KPI?
27
00:01:20,200 --> 00:01:21,440
Those questions aren't new.
28
00:01:21,440 --> 00:01:23,920
What's new is the speed at which Fabric lets you get to them.
29
00:01:23,920 --> 00:01:27,400
Historically, your architecture protected you from yourself through friction.
30
00:01:27,400 --> 00:01:32,080
ERP lived over here, the data warehouse team lived over there, Power BI sat on top, usually
31
00:01:32,080 --> 00:01:34,160
a release cycle behind.
32
00:01:34,160 --> 00:01:38,800
To change a core metric, someone had to fight their way through ETL, schema changes, cube
33
00:01:38,800 --> 00:01:40,800
processing and a deployment window.
34
00:01:40,800 --> 00:01:44,120
That slowness was annoying, but it also throttled semantic drift.
35
00:01:44,120 --> 00:01:47,920
Every change hurt just enough that people argued before they shipped a new definition.
36
00:01:47,920 --> 00:01:49,880
Fabric removes most of that friction.
37
00:01:49,880 --> 00:01:53,920
Direct Lake lets the Power BI semantic model sit almost directly on top of Delta tables in
38
00:01:53,920 --> 00:01:57,760
OneLake: no scheduled imports, no fragile refresh window.
39
00:01:57,760 --> 00:02:01,580
Self-service workspaces let domain teams spin up their own lakehouses, warehouses and
40
00:02:01,580 --> 00:02:04,560
models without ever filing a central ticket.
41
00:02:04,560 --> 00:02:08,280
Cloning a workspace or semantic model is a couple of clicks, not a quarter's work.
42
00:02:08,280 --> 00:02:09,840
You didn't just modernize performance.
43
00:02:09,840 --> 00:02:11,960
You modernized the propagation of meaning.
44
00:02:11,960 --> 00:02:14,560
Here's the pattern that shows up in tenant after tenant.
45
00:02:14,560 --> 00:02:16,680
Adoption starts small and controlled.
46
00:02:16,680 --> 00:02:21,920
One or two central workspaces, a lakehouse or warehouse, a curated semantic model, a handful
47
00:02:21,920 --> 00:02:23,560
of official reports.
48
00:02:23,560 --> 00:02:24,800
Everything feels orderly.
49
00:02:24,800 --> 00:02:27,800
Then acceleration kicks in, more teams want access.
50
00:02:27,800 --> 00:02:32,720
They ask for "just a copy of the semantic model to tweak a filter" or "just our own workspace
51
00:02:32,720 --> 00:02:34,200
to move faster."
52
00:02:34,200 --> 00:02:37,960
Direct Lake makes that copy basically free. So does OneLake's shared storage.
53
00:02:37,960 --> 00:02:43,640
Semantic drift follows. Those copied models diverge. One region excludes certain customers,
54
00:02:43,640 --> 00:02:46,120
one business unit adds a manual adjustment.
55
00:02:46,120 --> 00:02:49,440
Another team redefines "active customer" to meet their local target.
56
00:02:49,440 --> 00:02:52,800
The names on the measures don't change, the DAX does.
57
00:02:52,800 --> 00:02:54,800
Finally, trust collapses.
58
00:02:54,800 --> 00:02:58,120
Executives walk into a steering committee and discover three different truths about
59
00:02:58,120 --> 00:03:03,800
revenue, churn, risk, all sourced from the same ERP, all flowing through the same Fabric
60
00:03:03,800 --> 00:03:07,200
tenant, all protected by the same RBAC model.
61
00:03:07,200 --> 00:03:08,200
Security worked.
62
00:03:08,200 --> 00:03:09,200
Nobody broke in.
63
00:03:09,200 --> 00:03:12,920
The platform behaved exactly as designed, but the meaning moved.
64
00:03:12,920 --> 00:03:15,360
This is why Fabric feels like too much power.
65
00:03:15,360 --> 00:03:19,320
It isn't that the platform is out of control. It's that your previous architecture hid the
66
00:03:19,320 --> 00:03:22,720
fact that you never had a robust way to govern semantics in the first place.
67
00:03:22,720 --> 00:03:24,600
The friction made the gaps survivable.
68
00:03:24,600 --> 00:03:27,240
Now that friction is gone, the gaps are visible.
69
00:03:27,240 --> 00:03:29,560
Direct Lake is a good example of this acceleration.
70
00:03:29,560 --> 00:03:34,320
In a traditional import model, changing a schema or measure definition had a natural governor,
71
00:03:34,320 --> 00:03:35,840
the refresh process.
72
00:03:35,840 --> 00:03:38,160
Break the model and the refresh fails.
73
00:03:38,160 --> 00:03:40,480
Someone notices. There's a clear failure point.
74
00:03:40,480 --> 00:03:44,360
Direct Lake connects the semantic model almost directly to the storage engine.
75
00:03:44,360 --> 00:03:49,360
The data is the delta table and the semantic model might keep working just enough to be dangerous.
76
00:03:49,360 --> 00:03:51,880
Columns still exist, relationships still resolve.
77
00:03:51,880 --> 00:03:54,680
Only a handful of reports show subtle shifts in filters and totals.
78
00:03:54,680 --> 00:03:58,840
You've removed the mechanical canary. Self-service workspaces compound this.
79
00:03:58,840 --> 00:04:03,040
When every domain can spin up its own engineering, modeling and reporting stack, you've effectively
80
00:04:03,040 --> 00:04:07,440
created dozens of parallel semantic layers over the same physical data.
81
00:04:07,440 --> 00:04:11,560
Some are carefully modeled, some are copied from wherever someone had access, some are experiments
82
00:04:11,560 --> 00:04:16,160
that accidentally become production because an executive bookmarked the report. Easy cloning
83
00:04:16,160 --> 00:04:17,480
is the final accelerant.
84
00:04:17,480 --> 00:04:21,520
Every time someone says, "We'll just clone it and tweak it a bit for our needs,"
85
00:04:21,520 --> 00:04:24,200
you've created another fork of meaning with no lifecycle plan.
86
00:04:24,200 --> 00:04:28,560
There's no compilation step where a central architect has to approve the new definition.
87
00:04:28,560 --> 00:04:31,760
Most organizations respond to this unease by reaching for the wrong lever.
88
00:04:31,760 --> 00:04:33,480
They double down on access control.
89
00:04:33,480 --> 00:04:37,760
More groups, more conditional access, more workspace restrictions, more reviews of who can
90
00:04:37,760 --> 00:04:40,280
see which report. All of that is fine.
91
00:04:40,280 --> 00:04:41,280
Necessary even.
92
00:04:41,280 --> 00:04:42,960
But here's the uncomfortable truth.
93
00:04:42,960 --> 00:04:46,480
You can have perfect RBAC and completely untrustworthy numbers.
94
00:04:46,480 --> 00:04:50,480
You can pass every security audit and still have no idea which revenue metric your CEO should
95
00:04:50,480 --> 00:04:51,680
quote to the market.
96
00:04:51,680 --> 00:04:55,680
The perceived risk is external: hackers, leaks, regulatory fines.
97
00:04:55,680 --> 00:04:59,080
The lived daily risk is internal: nobody believes the dashboards.
98
00:04:59,080 --> 00:05:03,040
Fabric didn't invent that risk; it just stopped hiding it behind batch windows and siloed
99
00:05:03,040 --> 00:05:04,040
tools.
100
00:05:04,040 --> 00:05:06,720
So before we go any further, hold on to this distinction.
101
00:05:06,720 --> 00:05:10,000
Platform security answers, who is allowed to touch this object?
102
00:05:10,000 --> 00:05:13,640
Governance of meaning answers, what does this object actually say?
103
00:05:13,640 --> 00:05:15,560
Fabric gives you an excellent answer to the first question.
104
00:05:15,560 --> 00:05:17,600
It gives you almost no answer to the second.
105
00:05:17,600 --> 00:05:19,240
That's why it feels like too much power.
106
00:05:19,240 --> 00:05:22,760
Once you see that clearly, the next move is obvious: you have to separate the layers you've
107
00:05:22,760 --> 00:05:26,880
been blurring together: security, data governance and semantic governance.
108
00:05:26,880 --> 00:05:30,840
Only then can you decide what Fabric actually secures for you and where your data model is
109
00:05:30,840 --> 00:05:32,680
guaranteed to drift.
110
00:05:32,680 --> 00:05:35,720
Security versus data governance versus semantic governance.
111
00:05:35,720 --> 00:05:38,920
Once you stop blaming Fabric for being too open, you can ask the only question that
112
00:05:38,920 --> 00:05:39,920
matters.
113
00:05:39,920 --> 00:05:46,040
What exactly is Microsoft securing for you and what isn't even in scope?
114
00:05:46,040 --> 00:05:52,240
Most organizations blur three completely different layers into one vague word, governance.
115
00:05:52,240 --> 00:05:55,480
Architecturally, those layers are separate systems with separate responsibilities.
116
00:05:55,480 --> 00:05:57,320
The first layer is platform security.
117
00:05:57,320 --> 00:06:00,760
This is Microsoft's job. Entra handles authentication.
118
00:06:00,760 --> 00:06:06,120
Conditional Access decides under which device, network and risk conditions a token is issued.
119
00:06:06,120 --> 00:06:11,640
Workspace roles and item permissions express who can administer, contribute or view.
120
00:06:11,640 --> 00:06:16,080
OneLake security and role-based access decide which tables, folders or rows a given identity
121
00:06:16,080 --> 00:06:17,080
can touch.
122
00:06:17,080 --> 00:06:21,800
Underneath that you get encryption at rest, TLS in transit, multi-geo boundaries, audit logs,
123
00:06:21,800 --> 00:06:26,400
DLP and sensitivity labels through Purview, and a compliance envelope that already satisfies
124
00:06:26,400 --> 00:06:30,120
regulators who are far more aggressive than your internal audit team.
125
00:06:30,120 --> 00:06:33,440
In other words, the control plane and data plane are well defended.
126
00:06:33,440 --> 00:06:36,840
If Fabric were fundamentally insecure, Microsoft wouldn't be running its own financials
127
00:06:36,840 --> 00:06:38,200
on the same architecture.
128
00:06:38,200 --> 00:06:42,240
That distinction matters because it means your core risk is almost never "is the platform
129
00:06:42,240 --> 00:06:43,240
safe?"
130
00:06:43,240 --> 00:06:45,280
The platform is as safe as it is going to get.
131
00:06:45,280 --> 00:06:47,840
Your real problem is the second layer, data governance.
132
00:06:47,840 --> 00:06:49,320
Data governance is your job.
133
00:06:49,320 --> 00:06:53,720
This is where you decide who owns which data sets, which domains exist, which workspaces
134
00:06:53,720 --> 00:06:58,320
are allowed to touch regulated data, how long data is retained and how classification
135
00:06:58,320 --> 00:06:59,320
flows.
136
00:06:59,320 --> 00:07:04,000
You define read and write boundaries, onboarding and offboarding, lifecycle for lakehouses
137
00:07:04,000 --> 00:07:08,640
and warehouses and what happens when a project ends but its data assets live on.
138
00:07:08,640 --> 00:07:11,560
You decide whether PHI sits only in a healthcare domain.
139
00:07:11,560 --> 00:07:15,800
You decide whether payroll data can ever be shortcut into a general analytics lakehouse.
140
00:07:15,800 --> 00:07:20,080
You decide whether every production data set has a named owner or just belongs to BI.
141
00:07:20,080 --> 00:07:23,600
When people say we need better governance, this is usually the layer they think they're
142
00:07:23,600 --> 00:07:27,120
talking about even if they only ever touch it via tickets to the central team.
143
00:07:27,120 --> 00:07:30,560
But there is a third layer, and this is where Fabric quietly hurts you.
144
00:07:30,560 --> 00:07:33,680
Semantic governance. Semantic governance answers questions that platform security and
145
00:07:33,680 --> 00:07:35,880
data governance don't even try to address.
146
00:07:35,880 --> 00:07:37,600
What does customer mean in this model?
147
00:07:37,600 --> 00:07:39,360
At what grain is revenue defined?
148
00:07:39,360 --> 00:07:42,520
Which filters are always applied when we report active anything?
149
00:07:42,520 --> 00:07:44,840
Which team is allowed to redefine that logic?
150
00:07:44,840 --> 00:07:48,480
This is the ignored layer, the layer of metrics, business logic, naming and grain.
151
00:07:48,480 --> 00:07:52,880
The layer where DAX lives, where SQL views define golden tables, where notebooks materialize
152
00:07:52,880 --> 00:07:56,640
KPI logic in code because there was no certified model available.
153
00:07:56,640 --> 00:08:00,480
Most catastrophic Fabric failures happen here, not in the platform security layer.
154
00:08:00,480 --> 00:08:03,800
Nothing in Entra stops you from having five different total revenue measures, all with
155
00:08:03,800 --> 00:08:07,960
the same display name, each applying slightly different filters.
156
00:08:07,960 --> 00:08:11,920
Nothing in OneLake security complains when three domains create their own customer dimension
157
00:08:11,920 --> 00:08:13,720
with conflicting surrogate keys.
158
00:08:13,720 --> 00:08:18,280
Nothing in Purview lights up when someone silently changes the definition of churned customer
159
00:08:18,280 --> 00:08:20,560
from 90 days of inactivity to 30.
160
00:08:20,560 --> 00:08:22,160
Fabric doesn't just host semantics.
161
00:08:22,160 --> 00:08:24,360
It industrializes meaning.
162
00:08:24,360 --> 00:08:28,320
Once a DAX measure, a SQL view or a curated table is shared and referenced, its definition
163
00:08:28,320 --> 00:08:31,040
stops being local; it becomes a dependency.
164
00:08:31,040 --> 00:08:35,600
Every downstream report, Excel workbook, dataflow and Copilot answer inherits that meaning
165
00:08:35,600 --> 00:08:39,160
until someone forks it, renames nothing and quietly diverges.
166
00:08:39,160 --> 00:08:41,680
This is where semantic drift becomes dangerous.
167
00:08:41,680 --> 00:08:44,840
Schema drift (columns added, types changed) is noisy.
168
00:08:44,840 --> 00:08:50,600
Queries break, refreshes fail, engineers get paged, someone notices. Semantic drift is silent:
169
00:08:50,600 --> 00:08:54,560
the columns are still there, the data types still match, security is still enforced.
170
00:08:54,560 --> 00:08:58,400
But the calculation behind net sales has gained an exclusion, or lost an adjustment, or changed
171
00:08:58,400 --> 00:08:59,400
time windows.
172
00:08:59,400 --> 00:09:02,920
Only the people who lived through the change even remember it happened.
173
00:09:02,920 --> 00:09:04,520
Security tooling is blind to this.
174
00:09:04,520 --> 00:09:08,040
Data governance catalogs it at best as another data set.
175
00:09:08,040 --> 00:09:12,080
Semantic governance asks a different set of questions. Which semantic models are authoritative
176
00:09:12,080 --> 00:09:16,720
for core entities like customer, product, revenue? Who is allowed to publish or change those
177
00:09:16,720 --> 00:09:21,320
definitions? How do we signal to the rest of the organization which meanings are reusable
178
00:09:21,320 --> 00:09:24,880
at scale? How do we detect when those meanings drift?
179
00:09:24,880 --> 00:09:29,200
Without explicit answers, Fabric will happily let every domain define its own truth, secure
180
00:09:29,200 --> 00:09:32,600
it perfectly, classify it correctly and serve it at speed.
181
00:09:32,600 --> 00:09:36,040
So when you look at your Fabric tenant and feel that mix of power and unease, remember
182
00:09:36,040 --> 00:09:40,720
the layering. Platform security is Microsoft's problem, and they've largely solved it.
183
00:09:40,720 --> 00:09:45,000
Data governance is your problem and most of you have at least a partial handle on it.
184
00:09:45,000 --> 00:09:49,000
Semantic governance is nobody's problem by default, which means it's where reality quietly
185
00:09:49,000 --> 00:09:50,360
diverges from intent.
186
00:09:50,360 --> 00:09:53,880
Once you accept that, the next step is to be precise about what Fabric actually secures
187
00:09:53,880 --> 00:09:58,440
for you and then catalog the forms of drift that walk straight past that perfectly functional
188
00:09:58,440 --> 00:10:01,000
security model every single day.
189
00:10:01,000 --> 00:10:03,080
What Microsoft Fabric actually secures.
190
00:10:03,080 --> 00:10:07,640
Now that the layers are separated, we can finally answer the boring but essential question,
191
00:10:07,640 --> 00:10:11,320
what does Fabric actually secure for you in deterministic terms?
192
00:10:11,320 --> 00:10:15,640
Start with identity and access. Every interaction with Fabric flows through Entra. A user,
193
00:10:15,640 --> 00:10:19,520
a service principal, a managed identity: they all authenticate there.
194
00:10:19,520 --> 00:10:23,080
Conditional access decides whether that token is allowed under your rules.
195
00:10:23,080 --> 00:10:26,720
Compliant device, trusted network, MFA, risk level acceptable.
196
00:10:26,720 --> 00:10:30,000
Only once that token exists does Fabric even enter the picture.
197
00:10:30,000 --> 00:10:34,400
Inside Fabric, that identity hits workspace roles and item-level permissions.
198
00:10:34,400 --> 00:10:39,960
Workspaces define who administers, who can edit, who can contribute data, who only
199
00:10:39,960 --> 00:10:41,280
views.
200
00:10:41,280 --> 00:10:47,400
Items (lakehouses, warehouses, semantic models, notebooks) add their own ACLs on top.
201
00:10:47,400 --> 00:10:50,720
OneLake security sits underneath as the data plane gatekeeper.
202
00:10:50,720 --> 00:10:54,520
At that layer, you can decide that a particular table is visible to one group and invisible
203
00:10:54,520 --> 00:10:55,520
to another.
204
00:10:55,520 --> 00:10:59,800
You can constrain access to specific folders and files, or enforce row- and column-level rules
205
00:10:59,800 --> 00:11:02,040
at the storage engine, not just in a report.
206
00:11:02,040 --> 00:11:06,840
So if someone asks, can user X query table Y through Spark, SQL or Direct Lake?
207
00:11:06,840 --> 00:11:09,360
The system can give you a clear rule-driven answer.
208
00:11:09,360 --> 00:11:10,560
That part is solid.
209
00:11:10,560 --> 00:11:13,200
Move up a level and you have the data plane itself.
210
00:11:13,200 --> 00:11:17,800
OneLake is the logical lake: shortcuts, mirrored sources, native Delta and Parquet files.
211
00:11:17,800 --> 00:11:21,640
but from a security perspective, it's still just objects and ACLs.
212
00:11:21,640 --> 00:11:25,980
A warehouse table, a lakehouse table, a KQL database: they're all governed by roles and
213
00:11:25,980 --> 00:11:29,320
permissions that Fabric and Entra evaluate on every request.
214
00:11:29,320 --> 00:11:31,680
Workspaces provide a kind of blast radius boundary here.
215
00:11:31,680 --> 00:11:36,080
A badly written notebook can still be dangerous, but only inside the capacity and the access
216
00:11:36,080 --> 00:11:38,320
scope you've given that workspace.
217
00:11:38,320 --> 00:11:42,880
Domains add a logical overlay. They don't enforce security by themselves, but they group workspaces
218
00:11:42,880 --> 00:11:46,920
so you can reason about where regulated data shouldn't live.
219
00:11:46,920 --> 00:11:51,120
On the consumption side, the same model repeats. A Power BI semantic model, whether it's
220
00:11:51,120 --> 00:11:56,200
Import, DirectQuery or Direct Lake, still respects the underlying identity, workspace
221
00:11:56,200 --> 00:12:00,320
role and any row- or object-level security you've defined.
222
00:12:00,320 --> 00:12:04,480
If a user isn't allowed to see a customer segment at the table, they won't see it in
223
00:12:04,480 --> 00:12:05,480
the visual.
224
00:12:05,480 --> 00:12:08,960
If a column is masked or hidden at the OneLake layer, Copilot doesn't magically resurrect
225
00:12:08,960 --> 00:12:10,040
it in a chat.
226
00:12:10,040 --> 00:12:15,040
From Fabric's point of view, every engine (Spark, SQL, DAX, Copilot) is just another client
227
00:12:15,040 --> 00:12:17,240
of the same authorization fabric.
228
00:12:17,240 --> 00:12:20,880
Compliance instrumentation wraps around all of this. Purview can scan Fabric, register
229
00:12:20,880 --> 00:12:26,320
it as a data source, classify assets and apply sensitivity labels that then flow downstream.
230
00:12:26,320 --> 00:12:30,320
Activity logs tell you who touched which artifact, when and from where.
231
00:12:30,320 --> 00:12:35,440
DLP policies can flag or block data moving out of safe zones: exports to Excel, downloads
232
00:12:35,440 --> 00:12:37,600
of PBIX files, sharing outside the tenant.
233
00:12:37,600 --> 00:12:42,200
If you want to answer "who has accessed this PHI table in the last 30 days" or "where does
234
00:12:42,200 --> 00:12:44,640
this highly confidential dataset flow",
235
00:12:44,640 --> 00:12:49,360
the combination of Fabric logs and Purview lineage can give you a satisfactory audit trail.
236
00:12:49,360 --> 00:12:52,560
So viewed purely as a platform, the picture is reassuring.
237
00:12:52,560 --> 00:12:56,520
Authentication is centralized, authorization is consistent, data access is controllable
238
00:12:56,520 --> 00:12:58,320
down to rows and columns.
239
00:12:58,320 --> 00:13:01,120
Activity is observable, compliance envelopes are present.
240
00:13:01,120 --> 00:13:04,800
If you stay inside that frame, the natural instinct is to keep tightening it, more labels,
241
00:13:04,800 --> 00:13:07,080
more policies, more reviews, more conditions.
242
00:13:07,080 --> 00:13:10,600
But this is where the earlier distinction becomes lethal if you ignore it.
243
00:13:10,600 --> 00:13:14,240
Fabric is very good at securing objects. It is completely agnostic about what those objects
244
00:13:14,240 --> 00:13:15,240
mean.
245
00:13:15,240 --> 00:13:19,920
You can have two semantic models, both carefully labeled, both living in the right domain,
246
00:13:19,920 --> 00:13:23,720
both with restricted access, both with full audit trails, and still have them implement
247
00:13:23,720 --> 00:13:25,840
revenue in mutually exclusive ways.
248
00:13:25,840 --> 00:13:27,920
From a security standpoint, nothing is wrong.
249
00:13:27,920 --> 00:13:30,360
From a decision standpoint, everything is broken.
250
00:13:30,360 --> 00:13:35,480
This is why the right question is not "is Fabric safe?" but "what exactly are we securing?"
251
00:13:35,480 --> 00:13:38,640
The platform can guarantee that only the right identities connect,
252
00:13:38,640 --> 00:13:42,520
only allowed tables, folders and rows are returned, only approved paths are used to move
253
00:13:42,520 --> 00:13:45,320
data, only labeled exports leave the boundary.
254
00:13:45,320 --> 00:13:49,840
It cannot guarantee that the metric you're staring at means what you think it means, that
255
00:13:49,840 --> 00:13:51,680
it meant the same thing last quarter,
256
00:13:51,680 --> 00:13:56,600
or that the AI agent you just enabled is grounded on the correct version of that meaning.
257
00:13:56,600 --> 00:14:00,640
So when you hear "we need better Fabric governance," translate it. You almost never mean "we don't
258
00:14:00,640 --> 00:14:01,640
trust Entra."
259
00:14:01,640 --> 00:14:05,640
You mean "we don't know which definitions we've actually put into production." The security
260
00:14:05,640 --> 00:14:07,640
model is done; you inherit it.
261
00:14:07,640 --> 00:14:09,120
The governance of meaning is not.
262
00:14:09,120 --> 00:14:15,320
You design it, or you operate a perfectly secure platform for uncontrolled semantic drift.
263
00:14:15,320 --> 00:14:17,600
Where governance breaks, the four drift patterns.
264
00:14:17,600 --> 00:14:21,280
Once you accept that Fabric secures objects, not meaning, you can start naming the ways
265
00:14:21,280 --> 00:14:24,040
that meaning quietly walks away from intent.
266
00:14:24,040 --> 00:14:28,640
There are four drift patterns that show up in almost every serious tenant: access drift,
267
00:14:28,640 --> 00:14:31,800
model drift, metric drift, ownership drift.
268
00:14:31,800 --> 00:14:35,840
Each one is predictable, each one is cumulative, and each one walks straight past your perfectly
269
00:14:35,840 --> 00:14:39,720
functioning security model. Start with access drift.
270
00:14:39,720 --> 00:14:41,640
On day one, access is simple.
271
00:14:41,640 --> 00:14:46,040
A few core groups, a couple of workspaces, clear roles. Over time, reality intervenes.
272
00:14:46,040 --> 00:14:49,880
Someone important needs temporary access to a workspace just for this quarter.
273
00:14:49,880 --> 00:14:53,480
A project team needs broader read rights until we stabilize.
274
00:14:53,480 --> 00:14:56,560
Contractors arrive, external partners get guest accounts.
275
00:14:56,560 --> 00:14:59,760
Nested groups come in from Entra that nobody fully understands.
276
00:14:59,760 --> 00:15:03,080
Because fabric is a collaboration platform, the path of least resistance is always the
277
00:15:03,080 --> 00:15:04,080
same.
278
00:15:04,080 --> 00:15:05,080
Just add them as a viewer.
279
00:15:05,080 --> 00:15:06,080
We'll clean it up later.
280
00:15:06,080 --> 00:15:07,080
Later never comes.
281
00:15:07,080 --> 00:15:08,920
Those exceptions accumulate.
282
00:15:08,920 --> 00:15:13,280
Groups contain other groups, people change roles, but keep access in case they need to
283
00:15:13,280 --> 00:15:15,120
help with something.
284
00:15:15,120 --> 00:15:18,520
Service principals get granted broader rights because nobody wants to debug a failing
285
00:15:18,520 --> 00:15:20,400
pipeline in the middle of the night.
286
00:15:20,400 --> 00:15:22,960
Your security posture on paper is least privilege.
287
00:15:22,960 --> 00:15:25,400
Your effective access graph in Entra is anything but.
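One way to make that gap concrete is to expand nested groups into the set of people who can actually reach a workspace and compare it with the set you intended. The sketch below is only an illustration: the group names, the users, and the `membership` mapping are hypothetical stand-ins for whatever you export from your directory, not an Entra API.

```python
def effective_members(group, membership, _seen=None):
    """Return every user reachable from `group` through nested groups.

    `membership` maps a group name to its direct members (users or groups).
    Any member that is itself a key of `membership` is treated as a group.
    """
    seen = set() if _seen is None else _seen
    if group in seen:  # guard against groups that contain each other
        return set()
    seen.add(group)
    users = set()
    for member in membership.get(group, ()):
        if member in membership:
            users |= effective_members(member, membership, seen)
        else:
            users.add(member)
    return users


# Hypothetical export: who is directly inside which group.
membership = {
    "finance-viewers": ["alice", "fin-project-team"],
    "fin-project-team": ["bob", "contractors"],
    "contractors": ["carol", "fin-project-team"],  # accidental cycle
}

intended = {"alice"}  # who you believe can view the workspace
effective = effective_members("finance-viewers", membership)
unexpected = effective - intended  # access nobody planned for
print(sorted(unexpected))
```

The point of the comparison is the difference set: it is the part of the access graph that exists in configuration but not in anyone's mental model.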
288
00:15:25,400 --> 00:15:28,440
The drift here isn't just that more people can see more things.
289
00:15:28,440 --> 00:15:31,960
It's that your mental model of who can touch which semantic layer stops matching the
290
00:15:31,960 --> 00:15:32,960
actual configuration.
291
00:15:32,960 --> 00:15:34,320
You're no longer governing.
292
00:15:34,320 --> 00:15:35,320
You're guessing.
293
00:15:35,320 --> 00:15:37,520
Now model drift.
294
00:15:37,520 --> 00:15:39,320
This is the physical shape of the data.
295
00:15:39,320 --> 00:15:41,920
Schemas, tables, relationships and views.
296
00:15:41,920 --> 00:15:43,520
A new source system is onboarded.
297
00:15:43,520 --> 00:15:47,760
A team adds a temporary staging table that turns into a de facto gold table because
298
00:15:47,760 --> 00:15:49,200
somebody built a report on it.
299
00:15:49,200 --> 00:15:51,640
A column's meaning changes, but the name doesn't.
300
00:15:51,640 --> 00:15:55,320
Someone optimizes a table for performance and silently drops attributes that downstream
301
00:15:55,320 --> 00:15:56,320
models depend on.
302
00:15:56,320 --> 00:15:57,680
None of this is malicious.
303
00:15:57,680 --> 00:15:59,080
It's normal engineering churn.
304
00:15:59,080 --> 00:16:03,360
In a traditional warehouse that churn was gated by ETL processes, integration tests, release
305
00:16:03,360 --> 00:16:08,880
cycles. In Fabric, engineers, analysts and even power users can all participate in changing
306
00:16:08,880 --> 00:16:11,200
the shape of the data with fewer barriers.
307
00:16:11,200 --> 00:16:15,000
So the tables your semantic models point at are not static objects.
308
00:16:15,000 --> 00:16:16,720
They are moving targets.
309
00:16:16,720 --> 00:16:19,640
Without explicit model contracts and impact analysis,
310
00:16:19,640 --> 00:16:22,880
every schema tweak risks creating a forked reality.
311
00:16:22,880 --> 00:16:27,840
One part of the organization referencing the new shape, another still living in the old.
312
00:16:27,840 --> 00:16:29,200
Then there is metric drift.
313
00:16:29,200 --> 00:16:30,880
This is the semantic layer itself.
314
00:16:30,880 --> 00:16:36,000
DAX measures, SQL-defined KPIs, calculated columns, business rules embedded in notebooks.
315
00:16:36,000 --> 00:16:39,400
Metric drift is what happens when multiple teams use the same words,
316
00:16:39,400 --> 00:16:43,360
revenue, customer, churn, risk,
317
00:16:43,360 --> 00:16:47,680
and implement them with different filters, grains or business assumptions.
318
00:16:47,680 --> 00:16:50,880
One team excludes internal transfers from revenue.
319
00:16:50,880 --> 00:16:51,880
Another doesn't.
320
00:16:51,880 --> 00:16:54,680
One region reports churn on a 30-day inactivity window.
321
00:16:54,680 --> 00:16:56,280
Another uses 90 days.
322
00:16:56,280 --> 00:17:00,040
Finance defines active customer as billable in the last period.
323
00:17:00,040 --> 00:17:03,000
Sales defines it as any customer with an open opportunity.
324
00:17:03,000 --> 00:17:06,000
In isolation, each definition is locally rational.
325
00:17:06,000 --> 00:17:08,680
At enterprise scale, they are mutually incompatible.
326
00:17:08,680 --> 00:17:12,680
Fabric accelerates this drift because it makes it trivial to clone semantic models, tweak
327
00:17:12,680 --> 00:17:16,880
a measure and publish a new version without any central arbitration.
328
00:17:16,880 --> 00:17:18,400
Every fork looks trustworthy.
329
00:17:18,400 --> 00:17:20,920
Every fork can be certified inside its own workspace.
330
00:17:20,920 --> 00:17:24,200
Every fork shows up in Copilot as a viable source of truth.
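Because forked measures keep their names, a simple way to surface metric drift is to group measure definitions by display name and flag any name that maps to more than one distinct expression. The sketch below assumes you have already exported `(model, measure name, expression)` tuples by some means; the model names and DAX snippets are invented for illustration, and the whitespace stripping is a deliberately crude normalization.

```python
import hashlib
from collections import defaultdict


def fingerprint(expression):
    """Hash an expression with all whitespace removed, so that pure
    formatting changes do not register as a new definition."""
    compact = "".join(expression.split())
    return hashlib.sha256(compact.encode("utf-8")).hexdigest()[:12]


def find_metric_drift(measures):
    """Group (model, name, expression) tuples by measure name and return
    only the names that have more than one distinct definition."""
    by_name = defaultdict(lambda: defaultdict(list))
    for model, name, expression in measures:
        by_name[name][fingerprint(expression)].append(model)
    return {name: dict(defs) for name, defs in by_name.items() if len(defs) > 1}


# Hypothetical export of measures from three cloned semantic models.
measures = [
    ("Finance", "Revenue", "SUM(Sales[Amount])"),
    ("Sales", "Revenue",
     "CALCULATE(SUM(Sales[Amount]), Sales[IsHouseAccount] = FALSE())"),
    ("Operations", "Revenue", "SUM( Sales[Amount] )"),  # formatting only
    ("Finance", "Units", "SUM(Sales[Units])"),
    ("Operations", "Units", "SUM(Sales[Units])"),
]

drift = find_metric_drift(measures)
for name, definitions in drift.items():
    print(name, "has", len(definitions), "distinct definitions")
```

A report like this does not say which definition is right. It only makes the fork observable, which is the precondition for someone owning the decision.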
331
00:17:24,200 --> 00:17:25,960
Finally, ownership drift.
332
00:17:25,960 --> 00:17:28,480
This one is less visible but architecturally fatal.
333
00:17:28,480 --> 00:17:31,800
On the day a data set or semantic model is created, somebody owns it.
334
00:17:31,800 --> 00:17:34,360
There's an engineer, an analyst, a product owner.
335
00:17:34,360 --> 00:17:35,880
Over time, people change roles.
336
00:17:35,880 --> 00:17:41,120
Teams reorg, projects end, contractors leave. The Fabric item remains.
337
00:17:41,120 --> 00:17:45,680
These accumulate: orphaned lakehouses, abandoned semantic models, half-maintained pipelines.
338
00:17:45,680 --> 00:17:49,280
Reports nobody admits to owning continue to refresh because they might be used by someone
339
00:17:49,280 --> 00:17:50,280
important.
340
00:17:50,280 --> 00:17:51,800
When something breaks, nobody is accountable.
341
00:17:51,800 --> 00:17:54,560
When a definition needs to change, nobody has the authority.
342
00:17:54,560 --> 00:17:59,600
When AI starts consuming those models, nobody feels responsible for what the agent is saying.
343
00:17:59,600 --> 00:18:03,440
Ownership drift turns every other form of drift into unpayable security debt because here
344
00:18:03,440 --> 00:18:06,360
is the uncomfortable law of large fabric environments.
345
00:18:06,360 --> 00:18:07,960
Drift is not a failure.
346
00:18:07,960 --> 00:18:10,000
Unobserved drift is.
347
00:18:10,000 --> 00:18:15,960
Access will widen, models will evolve, metrics will fork, people will move on. Self-service,
348
00:18:15,960 --> 00:18:18,640
domain teams and agile delivery guarantee it.
349
00:18:18,640 --> 00:18:23,280
If you design as though drift is avoidable, you will always be surprised, always be reactive
350
00:18:23,280 --> 00:18:25,680
and always be tempted to blame the platform.
351
00:18:25,680 --> 00:18:29,960
If you design on the assumption that drift is constant, then the question changes.
352
00:18:29,960 --> 00:18:33,960
Not how do we stop this, but how do we see it, measure it and decide which meanings
353
00:18:33,960 --> 00:18:35,800
we are willing to let scale.
354
00:18:35,800 --> 00:18:39,960
These four drift patterns are not theoretical, they show up in recognizable, painful ways.
355
00:18:39,960 --> 00:18:43,720
So to make this real, we are going to walk through five enterprise scenarios where everything
356
00:18:43,720 --> 00:18:46,480
we've just described plays out in public.
357
00:18:46,480 --> 00:18:51,520
Finance arguing over revenue, healthcare exposing PHI in the wrong lakehouse, retail turning
358
00:18:51,520 --> 00:18:57,080
self-service into shadow analytics, manufacturing turning OneLake into a junk drawer, and AI
359
00:18:57,080 --> 00:19:01,200
confidently weaponizing every ungoverned definition you ever shipped.
360
00:19:01,200 --> 00:19:05,680
Concrete, anonymized, but architecturally identical to what's already brewing inside your tenant.
361
00:19:05,680 --> 00:19:09,200
Scenario one: finance. Same revenue, three answers.
362
00:19:09,200 --> 00:19:11,480
Let's start where drift hurts fastest.
363
00:19:11,480 --> 00:19:12,480
Finance.
364
00:19:12,480 --> 00:19:14,080
Assume for a moment that the plumbing is perfect.
365
00:19:14,080 --> 00:19:18,280
You have one ERP, the chart of accounts is clean enough to survive audit, there's a well-understood
366
00:19:18,280 --> 00:19:20,520
data export or integration pattern.
367
00:19:20,520 --> 00:19:24,280
Fabric is ingesting that data into OneLake through a controlled pipeline, a warehouse,
368
00:19:24,280 --> 00:19:25,800
a lakehouse or both.
369
00:19:25,800 --> 00:19:28,360
The numbers landing in storage match the source system.
370
00:19:28,360 --> 00:19:30,520
No data quality story to hide behind.
371
00:19:30,520 --> 00:19:33,160
No exotic multi-ERP nightmare.
372
00:19:33,160 --> 00:19:36,000
On top of that shared foundation, three teams go to work.
373
00:19:36,000 --> 00:19:37,400
Finance, sales and operations.
374
00:19:37,400 --> 00:19:39,480
Each of them gets their own Fabric workspace.
375
00:19:39,480 --> 00:19:40,480
That sounds healthy.
376
00:19:40,480 --> 00:19:45,120
It lines up with org structure. They all connect to the same curated tables in OneLake.
377
00:19:45,120 --> 00:19:48,800
Maybe they even start from a shared base semantic model the central team provided. And
378
00:19:48,800 --> 00:19:51,080
then the entropy generators switch on.
379
00:19:51,080 --> 00:19:54,040
Finance takes the base model and adds the adjustments they care about.
380
00:19:54,040 --> 00:19:58,040
They exclude certain internal orders, they apply their standard FX logic, and they define
381
00:19:58,040 --> 00:20:01,040
recognized revenue with the cutoff rules the auditors expect.
382
00:20:01,040 --> 00:20:03,920
They create a revenue measure that reflects that view of the world.
383
00:20:03,920 --> 00:20:07,360
Sales clones the same model, because someone told them correctly that we should all
384
00:20:07,360 --> 00:20:08,840
be using the same data.
385
00:20:08,840 --> 00:20:11,320
But their reality is pipeline and performance.
386
00:20:11,320 --> 00:20:14,040
They tweak the date logic to align with their commission periods.
387
00:20:14,040 --> 00:20:18,320
They exclude a small set of accounts that are house customers nobody carries a quota on.
388
00:20:18,320 --> 00:20:22,200
They build revenue in a way that matches how they manage the field.
389
00:20:22,200 --> 00:20:24,600
Operations does the same for fulfillment and capacity.
390
00:20:24,600 --> 00:20:26,920
They care about shipped units, backlog and throughput.
391
00:20:26,920 --> 00:20:28,240
They tweak the filters again.
392
00:20:28,240 --> 00:20:32,240
They might even introduce a simple lag so that revenue better reflects operational load
393
00:20:32,240 --> 00:20:33,680
instead of pure booking.
394
00:20:33,680 --> 00:20:35,720
None of these adaptations are stupid.
395
00:20:35,720 --> 00:20:36,920
None of them are malicious.
396
00:20:36,920 --> 00:20:38,480
All of them are locally rational.
397
00:20:38,480 --> 00:20:40,400
And in Fabric, all of them are fast.
398
00:20:40,400 --> 00:20:45,320
Copy the semantic model, adjust a bit of DAX, publish, build reports, share with leadership,
399
00:20:45,320 --> 00:20:46,320
bookmark in Teams.
400
00:20:46,320 --> 00:20:50,920
Before long, you have three parallel semantic layers over the same ERP facts, all describing
401
00:20:50,920 --> 00:20:53,120
revenue, all with good intent.
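A toy version of that situation: one shared set of order rows, three locally reasonable filters, three different totals under the same name. The rows, field names and amounts below are invented for illustration and are not the figures from the scenario.

```python
# Hypothetical order rows shared by all three teams; every field is illustrative.
orders = [
    {"amount": 100, "internal": False, "house_account": False, "shipped": True},
    {"amount": 40, "internal": True, "house_account": False, "shipped": True},
    {"amount": 60, "internal": False, "house_account": True, "shipped": True},
    {"amount": 80, "internal": False, "house_account": False, "shipped": False},
]


def revenue(rows, keep):
    """Sum the amount of every row that a team's filter keeps."""
    return sum(row["amount"] for row in rows if keep(row))


# Each team applies its own rule but still calls the result "revenue".
revenue_by_team = {
    "Finance": revenue(orders, lambda r: not r["internal"]),      # drop internal orders
    "Sales": revenue(orders, lambda r: not r["house_account"]),   # drop house accounts
    "Operations": revenue(orders, lambda r: r["shipped"]),        # shipped orders only
}
print(revenue_by_team)
```

Nothing in the data changed between the three calls. Only the filter did, which is why no pipeline or security check would ever flag the disagreement.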
402
00:20:53,120 --> 00:20:54,360
Then comes board deck day.
403
00:20:54,360 --> 00:20:59,000
Finance walks in with a slide that shows revenue for the quarter, 1.02 billion.
404
00:20:59,000 --> 00:21:02,200
Sales shows 987 million with a breakdown by region.
405
00:21:02,200 --> 00:21:05,800
Operations shows 1.05 billion with an explanation of capacity strain.
406
00:21:05,800 --> 00:21:07,800
The chair asks the only question that matters.
407
00:21:07,800 --> 00:21:09,120
What is our revenue?
408
00:21:09,120 --> 00:21:12,480
Every number is defensible from the perspective of the team that produced it.
409
00:21:12,480 --> 00:21:13,680
None of them are aligned.
410
00:21:13,680 --> 00:21:15,640
You can watch the psychology in the room shift.
411
00:21:15,640 --> 00:21:17,560
First, they question the tools.
412
00:21:17,560 --> 00:21:18,880
Is this a Fabric issue?
413
00:21:18,880 --> 00:21:20,120
Then they question the teams.
414
00:21:20,120 --> 00:21:21,600
Why are you all using different numbers?
415
00:21:21,600 --> 00:21:24,280
And finally, they question the entire analytics estate.
416
00:21:24,280 --> 00:21:27,160
If we can't agree on revenue, what else is wrong?
417
00:21:27,160 --> 00:21:28,880
You have just experienced trust collapse.
418
00:21:28,880 --> 00:21:30,080
Notice what did not fail here.
419
00:21:30,080 --> 00:21:31,080
ERP was fine.
420
00:21:31,080 --> 00:21:32,440
Pipelines were fine.
421
00:21:32,440 --> 00:21:33,440
OneLake was fine.
422
00:21:33,440 --> 00:21:35,680
Entra and workspace security were fine.
423
00:21:35,680 --> 00:21:37,880
Sensitivity labels and audit logs were fine.
424
00:21:37,880 --> 00:21:42,000
The platform delivered consistent data to every team, secured access correctly and enforced
425
00:21:42,000 --> 00:21:43,520
your compliance envelope.
426
00:21:43,520 --> 00:21:45,120
What failed was semantic governance.
427
00:21:45,120 --> 00:21:48,920
There was no authoritative, certified semantic model for revenue that all three teams were
428
00:21:48,920 --> 00:21:50,240
obligated to reuse.
429
00:21:50,240 --> 00:21:53,800
There was no metric owner empowered to say, "This is the enterprise definition.
430
00:21:53,800 --> 00:21:56,160
If you need a variant, it gets a different name."
431
00:21:56,160 --> 00:22:00,320
There was no distinction between local operational revenue for sales and legal revenue for external
432
00:22:00,320 --> 00:22:01,320
reporting.
433
00:22:01,320 --> 00:22:04,120
Fabric simply made the absence of that discipline visible.
434
00:22:04,120 --> 00:22:05,120
Fast.
435
00:22:05,120 --> 00:22:07,600
In the legacy world, the friction would have slowed this down.
436
00:22:07,600 --> 00:22:10,960
It would have taken months for each team to get their own cubes, their own extracts, their
437
00:22:10,960 --> 00:22:12,880
own bespoke logic deployed.
438
00:22:12,880 --> 00:22:14,960
The inconsistency would still exist.
439
00:22:14,960 --> 00:22:18,080
You just might never see all three numbers on the table at the same time.
440
00:22:18,080 --> 00:22:21,840
In Fabric, the path from idea to propagated meaning is short and smooth.
441
00:22:21,840 --> 00:22:26,240
So the lesson here is not "stop sales from building models" or "lock everything down
442
00:22:26,240 --> 00:22:27,920
in a central finance workspace."
443
00:22:27,920 --> 00:22:30,040
The lesson is simpler and more uncomfortable.
444
00:22:30,040 --> 00:22:33,880
If you don't govern semantics, Fabric will happily let every domain industrialize
445
00:22:33,880 --> 00:22:35,080
its own truth.
446
00:22:35,080 --> 00:22:39,360
You'll end up with three revenue numbers, all secured, all audited, all cataloged, and
447
00:22:39,360 --> 00:22:42,360
none of them reliably reusable at enterprise scale.
448
00:22:42,360 --> 00:22:47,320
This is why in mature tenants, you start seeing a hard distinction between certified semantic
449
00:22:47,320 --> 00:22:52,920
models for core entities and KPIs owned and curated by a platform or domain steward, and
450
00:22:52,920 --> 00:22:57,880
everything else, local, promoted, experimental, explicitly not authoritative.
451
00:22:57,880 --> 00:23:00,600
Without that distinction, you're not doing self-service.
452
00:23:00,600 --> 00:23:03,360
You're manufacturing semantic drift at scale.
453
00:23:03,360 --> 00:23:07,360
And no amount of tightening access rights will fix a board deck with three answers to the
454
00:23:07,360 --> 00:23:08,880
same question.
455
00:23:08,880 --> 00:23:09,880
Scenario 2.
456
00:23:09,880 --> 00:23:10,880
Healthcare.
457
00:23:10,880 --> 00:23:12,720
PHI in the wrong lakehouse.
458
00:23:12,720 --> 00:23:14,600
Finance drift costs you trust.
459
00:23:14,600 --> 00:23:16,360
Healthcare drift costs you your license.
460
00:23:16,360 --> 00:23:20,880
So take the same architectural pattern and move it into a regulated clinical environment.
461
00:23:20,880 --> 00:23:23,000
You roll out Fabric in a healthcare organization.
462
00:23:23,000 --> 00:23:24,720
On paper you do the right things.
463
00:23:24,720 --> 00:23:26,080
There is a clinical domain.
464
00:23:26,080 --> 00:23:27,640
There is a research domain.
465
00:23:27,640 --> 00:23:29,000
There is an operations domain.
466
00:23:29,000 --> 00:23:32,920
PHI is supposed to live only in tightly controlled clinical lakehouses and warehouses
467
00:23:32,920 --> 00:23:36,920
with hardened workspaces, restricted groups and very nervous compliance officers watching
468
00:23:36,920 --> 00:23:38,000
Purview.
469
00:23:38,000 --> 00:23:40,680
But project reality does not care about your diagram.
470
00:23:40,680 --> 00:23:45,680
A cross-functional analytics initiative spins up: reducing emergency department wait times,
471
00:23:45,680 --> 00:23:46,680
for example.
472
00:23:46,680 --> 00:23:51,400
It needs scheduling data, triage codes, lab turnaround times, maybe some patient journey information
473
00:23:51,400 --> 00:23:53,040
to see where delays occur.
474
00:23:53,040 --> 00:23:55,120
The project team does the natural thing in Fabric.
475
00:23:55,120 --> 00:23:56,520
They create a new workspace.
476
00:23:56,520 --> 00:23:59,960
It sits in the operations domain because that's who's sponsoring the work.
477
00:23:59,960 --> 00:24:00,880
They add a lakehouse.
478
00:24:00,880 --> 00:24:05,960
They call it something like ED analytics temp because it's just for this project.
479
00:24:05,960 --> 00:24:09,160
Nobody intends this to be a long-lived clinical data product.
480
00:24:09,160 --> 00:24:10,920
It's a sandbox with a deadline.
481
00:24:10,920 --> 00:24:13,160
Data engineers start shortcutting in what they need.
482
00:24:13,160 --> 00:24:15,440
Some comes from operational systems.
483
00:24:15,440 --> 00:24:18,080
Bed management, staffing, equipment tracking.
484
00:24:18,080 --> 00:24:21,760
Some comes from mirrored clinical data that was de-identified upstream.
485
00:24:21,760 --> 00:24:26,680
Some, however, comes directly from a clinical source because the upstream masking isn't ready
486
00:24:26,680 --> 00:24:28,320
and the project sponsor is impatient.
487
00:24:28,320 --> 00:24:32,200
A few tables in this temporary lakehouse now contain direct identifiers.
488
00:24:32,200 --> 00:24:34,240
MRNs, dates of birth, visit IDs.
489
00:24:34,240 --> 00:24:36,920
The intent is to strip them out later in the pipeline.
490
00:24:36,920 --> 00:24:40,880
The reality is that the lakehouse is now PHI-bearing, regardless of what your architecture
491
00:24:40,880 --> 00:24:41,880
diagram says.
492
00:24:41,880 --> 00:24:43,880
At the same time, workspace access is generous.
493
00:24:43,880 --> 00:24:46,080
It has to be because this is cross-functional.
494
00:24:46,080 --> 00:24:50,400
You have operations analysts, clinical leads, vendor consultants tuning triage models.
495
00:24:50,400 --> 00:24:53,320
A few data scientists doing patient flow simulations.
496
00:24:53,320 --> 00:24:56,760
And some integration engineers wiring up real-time feeds.
497
00:24:56,760 --> 00:24:59,640
The fastest way to get everyone unblocked is familiar.
498
00:24:59,640 --> 00:25:01,960
Add them as members or contributors.
499
00:25:01,960 --> 00:25:03,240
We can tighten it later.
500
00:25:03,240 --> 00:25:07,440
So you now have PHI in a workspace that was never designed or classified as a clinical
501
00:25:07,440 --> 00:25:12,320
environment, granted to a wider and less controlled audience than any of your regulated
502
00:25:12,320 --> 00:25:13,320
domains.
503
00:25:13,320 --> 00:25:14,880
Fabric does exactly what you asked.
504
00:25:14,880 --> 00:25:16,320
Entra authenticates every identity.
505
00:25:16,320 --> 00:25:18,040
Workspace roles are respected.
506
00:25:18,040 --> 00:25:21,000
OneLake security enforces who can query which tables.
507
00:25:21,000 --> 00:25:23,120
Audit logs capture every access.
508
00:25:23,120 --> 00:25:24,520
Nothing escapes the tenant.
509
00:25:24,520 --> 00:25:27,760
From a pure platform security perspective, nothing is on fire.
510
00:25:27,760 --> 00:25:30,240
Purview scanning eventually runs over this lakehouse.
511
00:25:30,240 --> 00:25:32,360
It detects patterns that look like identifiers.
512
00:25:32,360 --> 00:25:35,640
Maybe some columns get labeled as confidential or highly confidential.
513
00:25:35,640 --> 00:25:39,920
A few automated rules apply, but the workspace is still tagged in your head as operational
514
00:25:39,920 --> 00:25:42,640
analytics, not clinical PHI.
515
00:25:42,640 --> 00:25:46,320
The problem only becomes visible when someone asks the wrong question at the right time.
516
00:25:46,320 --> 00:25:51,200
An auditor traces PHI lineage and lands unexpectedly in the operations domain.
517
00:25:51,200 --> 00:25:55,680
A consultant exports a subset to work on it in their own environment, believing it to be
518
00:25:55,680 --> 00:25:58,080
de-identified operations data.
519
00:25:58,080 --> 00:26:02,760
A routine access review shows dozens of non-clinical identities with read rights on tables
520
00:26:02,760 --> 00:26:05,120
that now clearly contain PHI.
521
00:26:05,120 --> 00:26:06,640
Environment and intent were misaligned.
522
00:26:06,640 --> 00:26:09,120
You thought clinical domain meant clinical data.
523
00:26:09,120 --> 00:26:10,120
The system did not.
524
00:26:10,120 --> 00:26:13,440
It only understands where data actually is, not where you wish it would be.
525
00:26:13,440 --> 00:26:16,920
This is data governance drift, but the mechanics are the same as in finance.
526
00:26:16,920 --> 00:26:19,680
A temporary lake house became a de facto data product.
527
00:26:19,680 --> 00:26:23,360
A workspace created for speed became a long-lived environment.
528
00:26:23,360 --> 00:26:25,160
Access expanded faster than classification.
529
00:26:25,160 --> 00:26:29,960
The meaning of that workspace shifted from operations metrics to actual patient data,
530
00:26:29,960 --> 00:26:33,200
and nobody updated the mental or technical model to match.
531
00:26:33,200 --> 00:26:34,960
Semantic governance shows up here as well.
532
00:26:34,960 --> 00:26:40,020
Downstream, someone builds a notebook that computes re-admission risk on this mixed data
533
00:26:40,020 --> 00:26:41,020
set.
534
00:26:41,020 --> 00:26:43,080
Another person wraps it in a semantic model.
535
00:26:43,080 --> 00:26:48,400
A dashboard appears in yet another workspace, surfacing risk scores by facility.
536
00:26:48,400 --> 00:26:52,680
Copilot arrives and happily answers, which hospitals have the highest re-admission risk
537
00:26:52,680 --> 00:26:53,680
this month.
538
00:26:53,680 --> 00:26:57,920
Using whatever model is easiest to reach. From a tool perspective, this is a success story.
539
00:26:57,920 --> 00:27:01,560
From a regulatory perspective, it is a slow motion incident, because the question you
540
00:27:01,560 --> 00:27:04,840
will be asked after any investigation is not, did you have RBAC?
541
00:27:04,840 --> 00:27:09,360
It is how did PHI end up accessible in that environment, to those identities, with that
542
00:27:09,360 --> 00:27:10,440
lineage?
543
00:27:10,440 --> 00:27:14,680
And the honest architectural answer is, you treated where as a proxy for what.
544
00:27:14,680 --> 00:27:17,240
You assumed environments implied classification.
545
00:27:17,240 --> 00:27:18,240
Fabric did not.
546
00:27:18,240 --> 00:27:20,160
The lesson here is brutal but simple.
547
00:27:20,160 --> 00:27:23,880
Lakehouse sprawl without hard domain and classification boundaries is a compliance trap.
548
00:27:23,880 --> 00:27:28,000
You will end up with PHI in the wrong lakehouse, shared with the wrong people, powering semantics
549
00:27:28,000 --> 00:27:29,480
nobody ever approved.
550
00:27:29,480 --> 00:27:31,360
Security will say, access was authenticated.
551
00:27:31,360 --> 00:27:33,400
Audit will say, logs exist.
552
00:27:33,400 --> 00:27:36,200
Regulators will say, you lost control of meaning and location.
553
00:27:36,200 --> 00:27:39,720
Governance for Fabric in healthcare is not just about locking PHI down.
554
00:27:39,720 --> 00:27:44,080
It is about engineering domains, workspaces and semantic layers, so that regulated meaning
555
00:27:44,080 --> 00:27:48,600
cannot quietly drift into unregulated places, no matter how many temporary lakehouses your
556
00:27:48,600 --> 00:27:51,680
project teams create.
557
00:27:51,680 --> 00:27:52,680
Scenario 3.
558
00:27:52,680 --> 00:27:53,680
Retail.
559
00:27:53,680 --> 00:27:55,280
Self-service becomes shadow analytics.
560
00:27:55,280 --> 00:27:59,040
If healthcare shows you the cost of getting domains wrong, retail shows you the cost of
561
00:27:59,040 --> 00:28:03,480
getting self-service wrong. Different stakes, same mechanics.
562
00:28:03,480 --> 00:28:08,000
Picture a large retail organization that prides itself on being data-driven.
563
00:28:08,000 --> 00:28:11,400
They've rolled out Fabric with a very explicit mandate from the top.
564
00:28:11,400 --> 00:28:15,720
Self-service analytics, remove bottlenecks, no more six-month waits for a new report.
565
00:28:15,720 --> 00:28:20,280
On paper this is healthy. There is a central lakehouse or warehouse with clean sales, store,
566
00:28:20,280 --> 00:28:21,960
product and promotion tables.
567
00:28:21,960 --> 00:28:24,120
The data engineering team has done the right thing.
568
00:28:24,120 --> 00:28:28,760
Standardized schemas, built conformed dimensions, set up incremental loads. A baseline semantic
569
00:28:28,760 --> 00:28:31,200
model exists with the obvious KPIs.
570
00:28:31,200 --> 00:28:33,960
Sales, margin, units, same-store sales, promotion uplift.
571
00:28:33,960 --> 00:28:35,920
Then the self-service story starts.
572
00:28:35,920 --> 00:28:40,120
Analysts in merchandising, pricing and marketing are all told correctly to reuse that central
573
00:28:40,120 --> 00:28:41,120
semantic model.
574
00:28:41,120 --> 00:28:46,040
They connect to it from their own workspaces, build some reports and start asking for tweaks.
575
00:28:46,040 --> 00:28:48,000
The first round of requests is reasonable.
576
00:28:48,000 --> 00:28:51,120
Can we get a version of sales that excludes staff purchases?
577
00:28:51,120 --> 00:28:54,960
We need same-store sales defined at a market cluster level, not just store.
578
00:28:54,960 --> 00:28:58,360
Our region wants to see promotion uplift excluding clearance items.
579
00:28:58,360 --> 00:28:59,800
The platform team tries to keep up.
580
00:28:59,800 --> 00:29:01,040
They add a few more measures.
581
00:29:01,040 --> 00:29:03,160
They expose some calculation groups.
582
00:29:03,160 --> 00:29:06,840
But the backlog grows and the pressure to move fast doesn't go away.
583
00:29:06,840 --> 00:29:10,360
At some point a senior analyst discovers how cheap cloning is.
584
00:29:10,360 --> 00:29:14,600
They take the certified semantic model, hit "Save as" into their own workspace and start
585
00:29:14,600 --> 00:29:15,880
adjusting DAX.
586
00:29:15,880 --> 00:29:18,640
Maybe they rename nothing to keep reports working.
587
00:29:18,640 --> 00:29:21,280
Maybe they append "merch region" to a few measures.
588
00:29:21,280 --> 00:29:23,600
Either way, they now have a forked model.
589
00:29:23,600 --> 00:29:24,600
Direct Lake makes this frictionless.
590
00:29:24,600 --> 00:29:26,720
There's no extra storage cost.
591
00:29:26,720 --> 00:29:28,440
No duplicated refresh schedules.
592
00:29:28,440 --> 00:29:32,600
The cloned model reads the same Delta tables underneath with the same performance profile.
593
00:29:32,600 --> 00:29:34,240
Word spreads.
594
00:29:34,240 --> 00:29:39,400
In a quarter, you have dozens of near-identical semantic models orbiting the same sales tables.
595
00:29:39,400 --> 00:29:41,680
Each workspace has its own flavor.
596
00:29:41,680 --> 00:29:45,480
Merchandising has net sales that excludes returns after a certain window.
597
00:29:45,480 --> 00:29:48,560
Pricing has margin adjusted for vendor rebates.
598
00:29:48,560 --> 00:29:52,360
Marketing has promotion uplift that ignores campaigns below a spend threshold.
599
00:29:52,360 --> 00:29:56,880
E-commerce has same-store sales defined in terms of digital traffic cohorts.
600
00:29:56,880 --> 00:29:58,840
Locally every one of these models is useful.
601
00:29:58,840 --> 00:30:02,200
They answer specific questions faster than the central team ever could.
602
00:30:02,200 --> 00:30:05,400
The self-service mandate looks on the surface like a success.
603
00:30:05,400 --> 00:30:07,040
But something subtle has changed.
604
00:30:07,040 --> 00:30:09,320
You no longer have one semantic layer for sales.
605
00:30:09,320 --> 00:30:10,720
You have a semantic swarm.
606
00:30:10,720 --> 00:30:13,760
And because Fabric is doing its job, they all look equally legitimate.
607
00:30:13,760 --> 00:30:15,080
They live in proper workspaces.
608
00:30:15,080 --> 00:30:18,800
They respect RLS. Some are even endorsed or promoted because a local manager liked the
609
00:30:18,800 --> 00:30:20,320
dashboards.
610
00:30:20,320 --> 00:30:23,480
In the OneLake catalog and in Copilot, they all show up when somebody searches for
611
00:30:23,480 --> 00:30:24,960
sales or margin.
612
00:30:24,960 --> 00:30:26,440
Then the business pressure ramps up.
613
00:30:26,440 --> 00:30:27,720
A bad quarter hits.
614
00:30:27,720 --> 00:30:30,160
Execs start asking hard questions.
615
00:30:30,160 --> 00:30:32,720
Which promotions actually drove incremental revenue?
616
00:30:32,720 --> 00:30:35,080
Are we discounting too deeply in certain regions?
617
00:30:35,080 --> 00:30:38,680
Is our same-store sales trend hiding underlying volume decline?
618
00:30:38,680 --> 00:30:40,760
Different teams run to their preferred models.
619
00:30:40,760 --> 00:30:42,640
Marketing pulls numbers from their workspace.
620
00:30:42,640 --> 00:30:43,640
Pricing pulls theirs.
621
00:30:43,640 --> 00:30:47,480
Finance tries to stick to the original certified model, but they've quietly added a few measures
622
00:30:47,480 --> 00:30:50,520
of their own to keep pace with ad hoc asks.
623
00:30:50,520 --> 00:30:52,440
Everyone is technically using Fabric.
624
00:30:52,440 --> 00:30:53,840
Nobody is using the same semantics.
625
00:30:53,840 --> 00:30:55,920
The first sign of trouble is not a big debate.
626
00:30:55,920 --> 00:30:56,920
It's a subtle mismatch.
627
00:30:56,920 --> 00:31:02,040
A VP sees two different same-store sales percentages in two separate decks and asks, why is your
628
00:31:02,040 --> 00:31:04,080
chart showing minus 1.8?
629
00:31:04,080 --> 00:31:06,480
And theirs is minus 0.5.
630
00:31:06,480 --> 00:31:08,320
Nobody can answer cleanly without opening decks.
631
00:31:08,320 --> 00:31:10,440
The conversation that follows is never about decks.
632
00:31:10,440 --> 00:31:12,600
It's about trust.
633
00:31:12,600 --> 00:31:14,200
Which model is correct?
634
00:31:14,200 --> 00:31:16,720
Who owns the definition of same-store sales?
635
00:31:16,720 --> 00:31:19,200
Why do we have five different versions of promotion uplift?
636
00:31:19,200 --> 00:31:21,000
The honest answer is architectural.
637
00:31:21,000 --> 00:31:25,120
In the rush to empower self-service, nobody drew a hard boundary between reusable, governed
638
00:31:25,120 --> 00:31:27,720
semantics that represent enterprise truth.
639
00:31:27,720 --> 00:31:30,880
And local contextual semantics that are allowed to diverge.
640
00:31:30,880 --> 00:31:32,600
Fabric didn't create the shadow analytics.
641
00:31:32,600 --> 00:31:33,600
It industrialized it.
642
00:31:33,600 --> 00:31:39,280
In the Excel era, analysts did all of this anyway, just on their laptops.
643
00:31:39,280 --> 00:31:43,120
You knew it was a risk, but it was mostly trapped in files and email threads.
644
00:31:43,120 --> 00:31:45,480
In Fabric, every fork is a first-class object.
645
00:31:45,480 --> 00:31:49,080
Every fork is shareable, discoverable, and consumable by AI.
646
00:31:49,080 --> 00:31:51,040
Shadow analytics graduates into shadow truth.
647
00:31:51,040 --> 00:31:52,600
Notice again what did not fail.
648
00:31:52,600 --> 00:31:54,240
Workspace security was fine.
649
00:31:54,240 --> 00:31:55,920
The access was fine.
650
00:31:55,920 --> 00:31:57,320
Performance was often excellent.
651
00:31:57,320 --> 00:31:59,080
Purview could see all the assets.
652
00:31:59,080 --> 00:32:02,240
What failed was semantic governance and platform boundaries.
653
00:32:02,240 --> 00:32:04,840
Self-service without semantic boundaries is not empowerment.
654
00:32:04,840 --> 00:32:06,720
It is drift with a friendly UI.
655
00:32:06,720 --> 00:32:09,200
The mature pattern in retail tenants looks very different.
656
00:32:09,200 --> 00:32:14,520
There is a small set of certified domain-owned semantic models for core commercial concepts.
657
00:32:14,520 --> 00:32:17,640
Sales, margin, same-store sales, promotion uplift.
658
00:32:17,640 --> 00:32:21,080
These are treated as APIs, not convenience layers, and if you want to use those meanings,
659
00:32:21,080 --> 00:32:22,840
you reference those models.
660
00:32:22,840 --> 00:32:26,200
Other teams are explicitly allowed to build their own metrics, but they must give them
661
00:32:26,200 --> 00:32:30,000
different names and those models are never certified as authoritative for the core
662
00:32:30,000 --> 00:32:31,000
APIs.
663
00:32:31,000 --> 00:32:33,560
In other words, you don't stop analysts from moving fast.
664
00:32:33,560 --> 00:32:35,840
You stop them from silently redefining shared words.
665
00:32:35,840 --> 00:32:40,120
If you skip that step, Fabric will faithfully implement your self-service mandate.
666
00:32:40,120 --> 00:32:44,120
And you will wake up one quarter with excellent tooling, fast reports, and an organization that
667
00:32:44,120 --> 00:32:46,600
can no longer answer a basic question.
668
00:32:46,600 --> 00:32:50,800
When we say same-store sales in this meeting, whose meaning did we just import?
669
00:32:50,800 --> 00:32:53,840
Scenario 4. Manufacturing. OneLake as a shared junk drawer.
670
00:32:53,840 --> 00:32:57,480
If retail shows you semantic drift in KPIs, manufacturing shows you physical drift in the
671
00:32:57,480 --> 00:33:02,160
lake itself, and this one usually starts with a sentence every architect has heard.
672
00:33:02,160 --> 00:33:05,360
Let's just put it all in OneLake, so it's easy to find later.
673
00:33:05,360 --> 00:33:09,880
You roll Fabric into a manufacturing organization that has been starved of integrated data for
674
00:33:09,880 --> 00:33:10,880
a decade.
675
00:33:10,880 --> 00:33:16,200
You've got IoT telemetry from machines, MES data about work orders and line states, ERP data
676
00:33:16,200 --> 00:33:20,520
for materials, production orders, and cost. Quality systems for defects and inspections.
677
00:33:20,520 --> 00:33:23,480
A long tail of spreadsheets living in shared drives.
678
00:33:23,480 --> 00:33:25,400
On the whiteboard, the story is elegant.
679
00:33:25,400 --> 00:33:27,720
OneLake will be the single enterprise lake.
680
00:33:27,720 --> 00:33:29,320
Domains will own their workspaces.
681
00:33:29,320 --> 00:33:33,120
Data products will emerge. Engineering will have a proper foundation for predictive maintenance,
682
00:33:33,120 --> 00:33:35,440
OEE and supply chain visibility.
683
00:33:35,440 --> 00:33:36,440
Then projects start.
684
00:33:36,440 --> 00:33:40,520
A plant-level team spins up a workspace to look at downtime patterns on line 7.
685
00:33:40,520 --> 00:33:41,520
They add a lakehouse.
686
00:33:41,520 --> 00:33:42,920
They shortcut in some IoT data.
687
00:33:42,920 --> 00:33:46,440
They drop some CSVs from legacy systems into files because the connector work isn't done
688
00:33:46,440 --> 00:33:47,440
yet.
689
00:33:47,440 --> 00:33:49,240
A global quality initiative kicks off.
690
00:33:49,240 --> 00:33:50,360
A new workspace.
691
00:33:50,360 --> 00:33:51,360
Another lakehouse.
692
00:33:51,360 --> 00:33:53,400
They ingest defect records from a separate system.
693
00:33:53,400 --> 00:33:55,840
Plus a subset of machine telemetry.
694
00:33:55,840 --> 00:33:57,600
Someone exported for them last year.
695
00:33:57,600 --> 00:34:00,000
And uploaded just so we have it nearby.
696
00:34:00,000 --> 00:34:04,000
The supply chain analytics group wants to correlate supplier performance with scrap.
697
00:34:04,000 --> 00:34:07,440
They create their own workspace, their own lakehouse, and start pulling from whatever looks
698
00:34:07,440 --> 00:34:11,240
vaguely relevant in OneLake, plus some mirrored ERP tables.
699
00:34:11,240 --> 00:34:12,440
Everyone is moving fast.
700
00:34:12,440 --> 00:34:15,320
Everyone is doing the right thing from their local point of view.
701
00:34:15,320 --> 00:34:17,440
Nobody is curating OneLake as a whole.
702
00:34:17,440 --> 00:34:19,600
What emerges over 12 months is not a lake.
703
00:34:19,600 --> 00:34:20,600
It is a junk drawer.
704
00:34:20,600 --> 00:34:24,600
You see the symptoms immediately if you open the OneLake catalog without rose-tinted
705
00:34:24,600 --> 00:34:25,600
glasses.
706
00:34:25,600 --> 00:34:28,600
Dozens of lakehouses named after projects, not domains.
707
00:34:28,600 --> 00:34:34,480
Plant X-2024, OEE pilot, supplier scrap study, temp machine data. Tables copied three
708
00:34:34,480 --> 00:34:38,960
or four times with slight naming differences. Folders full of one-off CSVs that were just
709
00:34:38,960 --> 00:34:42,320
for exploration but are now feeding production reports.
710
00:34:42,320 --> 00:34:45,280
Domains exist on paper, but in practice they're not enforced.
711
00:34:45,280 --> 00:34:49,240
Workspaces drifted into whatever domain someone happened to pick during creation.
712
00:34:49,240 --> 00:34:50,720
Some aren't assigned at all.
713
00:34:50,720 --> 00:34:54,160
Downstream, semantic models and notebooks start hard-coding paths.
714
00:34:54,160 --> 00:34:58,200
An analyst building a downtime dashboard doesn't think in terms of data products.
715
00:34:58,200 --> 00:35:01,160
They think in terms of the table that worked last time.
716
00:35:01,160 --> 00:35:06,760
So their DAX or SQL points at the Plant X-2024 lakehouse, dbo.downtime events, with a fixed
717
00:35:06,760 --> 00:35:11,720
path, not a curated, domain-owned table that's guaranteed to exist in five years.
718
00:35:11,720 --> 00:35:16,480
A data scientist training a predictive maintenance model grabs CSVs from a files folder in OEE
719
00:35:16,480 --> 00:35:18,880
pilot because those files had the right columns.
720
00:35:18,880 --> 00:35:22,240
They point notebooks directly at those blobs with absolute paths.
721
00:35:22,240 --> 00:35:24,840
At first it feels productive and then you try to clean up.
722
00:35:24,840 --> 00:35:28,960
A central platform team finally looks at capacity metrics and says we need to archive some
723
00:35:28,960 --> 00:35:29,960
of this clutter.
724
00:35:29,960 --> 00:35:35,240
They propose consolidating lakehouses, renaming a few for consistency, maybe reorganizing
725
00:35:35,240 --> 00:35:38,520
folders so that gold data lives somewhere predictable.
726
00:35:38,520 --> 00:35:41,680
The minute they do, invisible dependencies start to snap.
727
00:35:41,680 --> 00:35:44,840
A downtime dashboard fails silently because the table moved.
728
00:35:44,840 --> 00:35:46,440
It doesn't crash spectacularly.
729
00:35:46,440 --> 00:35:50,680
It just starts returning fewer rows because the analyst's hard-coded filter no longer matches
730
00:35:50,680 --> 00:35:52,280
the reorganized schema.
731
00:35:52,280 --> 00:35:56,360
The predictive model still runs, but it's now pointing at an old CSV that nobody updates
732
00:35:56,360 --> 00:36:01,280
because the data scientist's notebook refers to a path in OEE pilot that the cleanup script
733
00:36:01,280 --> 00:36:03,600
copied but nobody maintains.
734
00:36:03,600 --> 00:36:07,840
A monthly KPI report for executive manufacturing reviews flips from green to red because
735
00:36:07,840 --> 00:36:12,040
someone optimized a lakehouse by dropping a column they thought nobody used, breaking a
736
00:36:12,040 --> 00:36:15,000
join in a semantic model they'd never heard of.
737
00:36:15,000 --> 00:36:17,440
From the platform's perspective nothing special happened.
738
00:36:17,440 --> 00:36:22,680
Folders were renamed, tables were consolidated, shortcuts moved. All legal operations.
739
00:36:22,680 --> 00:36:26,880
From the business perspective, core production KPIs just started lying.
740
00:36:26,880 --> 00:36:29,960
Operations managers see availability numbers jump around without corresponding changes
741
00:36:29,960 --> 00:36:31,200
on the floor.
742
00:36:31,200 --> 00:36:34,720
Quality leaders see defect rates mysteriously flat-line for one product family because the
743
00:36:34,720 --> 00:36:37,200
source table got filtered when it moved.
744
00:36:37,200 --> 00:36:41,320
Finance starts questioning whether the cost-per-unit metrics they've been using to justify
745
00:36:41,320 --> 00:36:44,040
capital spend are even based on current data.
746
00:36:44,040 --> 00:36:47,160
You now have the worst possible combination: OneLake is full.
747
00:36:47,160 --> 00:36:48,160
Nobody trusts it.
748
00:36:48,160 --> 00:36:51,360
Everyone keeps their own side spreadsheets just in case.
749
00:36:51,360 --> 00:36:54,240
Again, notice where the platform did its job.
750
00:36:54,240 --> 00:36:55,240
Security was enforced.
751
00:36:55,240 --> 00:36:58,040
Telemetry came in, ERP mirrors updated.
752
00:36:58,040 --> 00:37:00,040
Audit logs captured who changed what.
753
00:37:00,040 --> 00:37:03,600
What you never did was declare, for manufacturing, what OneLake is allowed to be.
754
00:37:03,600 --> 00:37:07,480
Is it a governed data product surface where only curated, domain-owned lakehouses are
755
00:37:07,480 --> 00:37:12,120
allowed to serve as sources of record? Or a convenient dumping ground where any project
756
00:37:12,120 --> 00:37:16,280
can create a lakehouse, throw data in, and hope somebody else makes sense of it later?
757
00:37:16,280 --> 00:37:18,280
You can't have both.
758
00:37:18,280 --> 00:37:21,480
In mature manufacturing tenants the pattern looks different.
759
00:37:21,480 --> 00:37:26,720
There is a small set of domain lakehouses: manufacturing operations, quality, supply chain.
760
00:37:26,720 --> 00:37:31,000
They own the gold tables, they own the contracts, they publish certified semantic models on
761
00:37:31,000 --> 00:37:35,560
top. Projects do not create their own permanent lakehouses. They get ephemeral workspaces
762
00:37:35,560 --> 00:37:37,080
with clear expiry.
763
00:37:37,080 --> 00:37:42,160
Anything that graduates to "used in production" must be promoted into a domain lakehouse under
764
00:37:42,160 --> 00:37:44,240
domain ownership with a lifecycle.
765
00:37:44,240 --> 00:37:46,560
OneLake stops being a shared junk drawer.
766
00:37:46,560 --> 00:37:51,200
It becomes what it was sold as: a shared storage fabric behind governed data products.
767
00:37:51,200 --> 00:37:53,800
If you skip that discipline the outcome is guaranteed.
768
00:37:53,800 --> 00:37:55,840
You will still have a single logical lake.
769
00:37:55,840 --> 00:37:59,520
You will simply have recreated every bad pattern from your old file shares.
770
00:37:59,520 --> 00:38:03,440
This time on top of a platform that makes it easier than ever for those bad patterns
771
00:38:03,440 --> 00:38:07,440
to power critical decisions and AI systems you can't easily unwind.
772
00:38:07,440 --> 00:38:10,960
Scenario 5. AI and Copilot. Garbage meaning, accelerated.
773
00:38:10,960 --> 00:38:14,520
By the time AI shows up in your Fabric tenant, all of the drift we've talked about is already
774
00:38:14,520 --> 00:38:15,520
there.
775
00:38:15,520 --> 00:38:19,400
You have finance with three defensible revenue numbers, healthcare with PHI straying
776
00:38:19,400 --> 00:38:24,120
into the wrong lakehouses, retail with a swarm of near-identical sales models.
777
00:38:24,120 --> 00:38:27,480
Manufacturing with OneLake treated as shared storage, not a product surface.
778
00:38:27,480 --> 00:38:31,920
In that environment, someone enables Copilot and Fabric data agents. On paper, the story is
779
00:38:31,920 --> 00:38:32,920
compelling.
780
00:38:32,920 --> 00:38:34,680
You point Copilot at your Fabric tenant.
781
00:38:34,680 --> 00:38:39,600
It discovers semantic models, reports, lakehouses. It reads descriptions, it inspects relationships,
782
00:38:39,600 --> 00:38:44,520
it learns that revenue exists in multiple models, that churn is a measure with dependencies,
783
00:38:44,520 --> 00:38:48,120
that risk score is computed in a particular warehouse.
784
00:38:48,120 --> 00:38:51,560
Executives are told: now you can just ask questions in natural language and get answers
785
00:38:51,560 --> 00:38:53,040
grounded in your own data.
786
00:38:53,040 --> 00:38:54,360
And technically that's true.
787
00:38:54,360 --> 00:38:57,240
But here's the part the marketing diagrams don't emphasize.
788
00:38:57,240 --> 00:39:01,040
AI agents do not consume raw data, they consume semantics.
789
00:39:01,040 --> 00:39:04,640
When Copilot answers "What was our revenue last quarter?"
790
00:39:04,640 --> 00:39:09,320
It is not reading Parquet files. It is selecting a semantic model, a measure, a filter context.
791
00:39:09,320 --> 00:39:14,440
It is choosing one implementation of revenue over every other one that exists in your tenant.
792
00:39:14,440 --> 00:39:16,680
How does it choose?
793
00:39:16,680 --> 00:39:21,520
By design it optimizes for ease of use, relevance and connectivity.
794
00:39:21,520 --> 00:39:25,400
Models that are easier to query, better documented, more frequently used or closer to the
795
00:39:25,400 --> 00:39:27,160
question context tend to win.
796
00:39:27,160 --> 00:39:31,000
Certified models matter, but only if they exist and are discoverable.
797
00:39:31,000 --> 00:39:36,440
Local popular models in active workspaces often look more relevant than a pristine but obscure
798
00:39:36,440 --> 00:39:38,520
enterprise model nobody tagged correctly.
799
00:39:38,520 --> 00:39:42,720
So if your semantic layer is already fractured, Copilot doesn't fix that. It routes through
800
00:39:42,720 --> 00:39:43,720
it.
801
00:39:43,720 --> 00:39:47,880
Imagine your finance department did the right thing and built an authoritative certified semantic
802
00:39:47,880 --> 00:39:49,960
model in a controlled workspace.
803
00:39:49,960 --> 00:39:54,440
It has legal revenue, adjusted revenue and a dozen carefully curated measures.
804
00:39:54,440 --> 00:39:56,840
Ownership is clear, documentation is solid.
805
00:39:56,840 --> 00:40:01,240
At the same time sales has its own workspace with a forked model where revenue is really
806
00:40:01,240 --> 00:40:04,120
commissionable revenue, but nobody renamed the measure.
807
00:40:04,120 --> 00:40:05,600
It is heavily used.
808
00:40:05,600 --> 00:40:09,000
Reports referencing it are opened daily. It lives in a workspace with sales in the name and
809
00:40:09,000 --> 00:40:10,200
a lot of executive traffic.
810
00:40:10,200 --> 00:40:14,440
You ask Copilot in Teams, "What was our revenue last quarter in EMEA?"
811
00:40:14,440 --> 00:40:17,040
From Copilot's perspective, both models are viable.
812
00:40:17,040 --> 00:40:22,040
One is certified but lives in a workspace with a finance-centric name, used mostly by finance.
813
00:40:22,040 --> 00:40:26,520
The other is promoted, heavily used and sits right next to the chat context of the person
814
00:40:26,520 --> 00:40:27,520
asking.
815
00:40:27,520 --> 00:40:29,240
Both answer the question syntactically.
816
00:40:29,240 --> 00:40:31,760
If your governance is weak, the sales model often wins.
817
00:40:31,760 --> 00:40:35,760
You get a perfectly fluent, confident answer based on commissionable revenue.
818
00:40:35,760 --> 00:40:38,160
The number doesn't match last quarter's board deck.
819
00:40:38,160 --> 00:40:39,160
Someone notices.
820
00:40:39,160 --> 00:40:41,400
The story becomes "Copilot is hallucinating."
821
00:40:41,400 --> 00:40:43,480
But architecturally that's not what happened.
822
00:40:43,480 --> 00:40:46,320
The AI did exactly what your semantics told it to do.
823
00:40:46,320 --> 00:40:50,520
It faithfully reflected the ambiguity you allowed to exist between revenue and commissionable
824
00:40:50,520 --> 00:40:51,520
revenue.
825
00:40:51,520 --> 00:40:57,040
It selected the path of least resistance through your authorization graph and your usage patterns.
826
00:40:57,040 --> 00:40:58,960
The hallucination wasn't in the model.
827
00:40:58,960 --> 00:41:02,120
It was in your belief that revenue meant one thing tenant-wide.
828
00:41:02,120 --> 00:41:04,240
The same pattern shows up in risk and churn.
829
00:41:04,240 --> 00:41:08,520
A data science team builds a churn score in a notebook, exposes it through a warehouse and
830
00:41:08,520 --> 00:41:11,960
wraps it in a semantic model for operational dashboards.
831
00:41:11,960 --> 00:41:16,560
Another team, months earlier, built a simpler churned customer flag in a different workspace
832
00:41:16,560 --> 00:41:18,000
with different thresholds.
833
00:41:18,000 --> 00:41:20,720
Both end up in Fabric, both are labeled churn.
834
00:41:20,720 --> 00:41:25,200
And a COO asks Copilot, "Which customer segments have the highest churn risk right now?"
835
00:41:25,200 --> 00:41:26,640
The agent must pick one.
836
00:41:26,640 --> 00:41:30,800
If the simpler flag happens to be in the more active workspace with more dashboards and
837
00:41:30,800 --> 00:41:33,960
more daily usage, it will often be treated as the default.
838
00:41:33,960 --> 00:41:38,720
Your most sophisticated model with regulatory justification and careful calibration is bypassed
839
00:41:38,720 --> 00:41:41,240
because it lives in a quieter corner of the tenant.
840
00:41:41,240 --> 00:41:44,720
Again, from the outside this looks like AI is unreliable.
841
00:41:44,720 --> 00:41:48,560
From the inside it is semantic governance debt being called in with interest.
842
00:41:48,560 --> 00:41:50,440
Data agents amplify this further.
843
00:41:50,440 --> 00:41:54,600
When you build a Fabric data agent, you select up to a handful of sources: semantic models,
844
00:41:54,600 --> 00:41:58,560
lakehouses, warehouses, maybe an ontology when IQ matures.
845
00:41:58,560 --> 00:42:03,680
You wire tools around them: answer this class of questions, trigger this workflow, summarize
846
00:42:03,680 --> 00:42:04,760
these metrics.
847
00:42:04,760 --> 00:42:08,520
If you feed an agent a mix of certified and non-certified models because that's what
848
00:42:08,520 --> 00:42:11,920
we had, you've just encoded your drift into an API.
849
00:42:11,920 --> 00:42:17,240
Every bot built on top of that agent, in Teams, in custom apps, in operations centers, will
850
00:42:17,240 --> 00:42:18,960
inherit those ambiguities.
851
00:42:18,960 --> 00:42:23,080
Every automation that takes an AI answer and turns it into action will be grounded on whatever
852
00:42:23,080 --> 00:42:26,400
meaning was easiest for the agent to reach at configuration time.
853
00:42:26,400 --> 00:42:29,040
So here is the high level pattern.
854
00:42:29,040 --> 00:42:33,480
Before AI, semantic drift costs you trust in reports and meetings.
855
00:42:33,480 --> 00:42:38,880
After AI, semantic drift drives decisions and automation at speed without you in the loop.
856
00:42:38,880 --> 00:42:40,560
That is the actual risk curve.
857
00:42:40,560 --> 00:42:44,520
And this is why MVPs who live in this space keep saying a version of the same thing.
858
00:42:44,520 --> 00:42:46,160
AI doesn't break your data governance.
859
00:42:46,160 --> 00:42:49,280
It removes your ability to hide from how bad it already is.
860
00:42:49,280 --> 00:42:51,200
Co-pilot is not a hallucination engine.
861
00:42:51,200 --> 00:42:52,400
It is a semantic mirror.
862
00:42:52,400 --> 00:42:56,320
If you're uncomfortable with what you see in that mirror, the work is not to tune prompts.
863
00:42:56,320 --> 00:43:00,840
It is to decide, finally, which meanings in your Fabric tenant are allowed to exist at scale.
864
00:43:00,840 --> 00:43:05,800
And to constrain AI to those layers until you've paid down the rest of your semantic debt.
865
00:43:05,800 --> 00:43:09,520
Governance is not permissions: redefining the Fabric operating model.
866
00:43:09,520 --> 00:43:11,160
By now one thing should be obvious.
867
00:43:11,160 --> 00:43:12,840
You do not have a permissions problem.
868
00:43:12,840 --> 00:43:14,000
You have a meaning problem.
869
00:43:14,000 --> 00:43:17,440
So if you respond to everything we've just walked through by asking your Fabric admin
870
00:43:17,440 --> 00:43:19,760
to tighten access, you've missed the point.
871
00:43:19,760 --> 00:43:22,600
Governance is not the same thing as permissions administration.
872
00:43:22,600 --> 00:43:23,600
Permissions answer:
873
00:43:23,600 --> 00:43:25,760
Can this identity touch this object?
874
00:43:25,760 --> 00:43:26,840
Governance answers:
875
00:43:26,840 --> 00:43:28,720
What is this thing? Who owns it?
876
00:43:28,720 --> 00:43:31,160
And when is it allowed to be reused?
877
00:43:31,160 --> 00:43:32,640
Those are different questions.
878
00:43:32,640 --> 00:43:34,240
They need different teams.
879
00:43:34,240 --> 00:43:37,920
Most organizations today run Fabric with one of two operating models.
880
00:43:37,920 --> 00:43:39,960
On one side you have the central BI police.
881
00:43:39,960 --> 00:43:41,240
Everything goes through one team.
882
00:43:41,240 --> 00:43:42,320
They own every dataset.
883
00:43:42,320 --> 00:43:43,280
They own every model.
884
00:43:43,280 --> 00:43:44,560
They gate every change.
885
00:43:44,560 --> 00:43:47,400
Self-service is tolerated, but only on the margins.
886
00:43:47,400 --> 00:43:48,800
The result is predictable.
887
00:43:48,800 --> 00:43:53,720
Long queues, frustrated domains, a flourishing black market of extracts and side systems.
888
00:43:53,720 --> 00:43:56,920
On the other side you have pure self-service anarchy.
889
00:43:56,920 --> 00:44:01,120
Every domain owns its own workspaces, builds its own models and answers its own questions.
890
00:44:01,120 --> 00:44:05,440
The platform team manages capacity and maybe sets some basic guardrails, but semantics
891
00:44:05,440 --> 00:44:06,880
are not their problem.
892
00:44:06,880 --> 00:44:08,800
The result is also predictable.
893
00:44:08,800 --> 00:44:13,640
Fast local wins, slow enterprise failures, and AI agents grounded on whatever model somebody
894
00:44:13,640 --> 00:44:14,960
happened to click first.
895
00:44:14,960 --> 00:44:15,960
Fabric needs a third thing.
896
00:44:15,960 --> 00:44:17,760
A Fabric platform team.
897
00:44:17,760 --> 00:44:20,400
Not a ticket queue, not a reporting factory.
898
00:44:20,400 --> 00:44:21,680
An architectural function.
899
00:44:21,680 --> 00:44:26,160
This team's job is to design and maintain the semantic and structural rules of the environment,
900
00:44:26,160 --> 00:44:27,640
not to build every report.
901
00:44:27,640 --> 00:44:31,040
They define domain boundaries: which domains exist
902
00:44:31,040 --> 00:44:35,840
(finance, sales, HR, operations, clinical, retail, whatever matches your organization) and which
903
00:44:35,840 --> 00:44:38,520
workspaces are allowed to belong to each.
904
00:44:38,520 --> 00:44:44,040
They decide with business partners where regulated data is permitted to live and where it is not.
905
00:44:44,040 --> 00:44:49,120
They make temporary workspaces a conscious, time-boxed construct instead of an accident. They
906
00:44:49,120 --> 00:44:51,680
own workspace strategy.
907
00:44:51,680 --> 00:44:55,400
Not in the sense of approving every creation but of defining patterns.
908
00:44:55,400 --> 00:44:58,560
Domain-aligned workspaces, not endless project-aligned ones.
909
00:44:58,560 --> 00:45:02,120
Clear dev, test, and prod tiers for anything that matters.
910
00:45:02,120 --> 00:45:06,880
Explicit isolation for highly regulated domains so that PHI cannot quietly drift into someone's
911
00:45:06,880 --> 00:45:09,320
analytic sandbox because it was convenient.
912
00:45:09,320 --> 00:45:11,520
They define certification rules.
913
00:45:11,520 --> 00:45:13,800
What qualifies a semantic model as certified?
914
00:45:13,800 --> 00:45:16,280
What qualifies a table as master data?
915
00:45:16,280 --> 00:45:18,960
Who can apply those endorsements and under what process?
916
00:45:18,960 --> 00:45:23,320
How do you ensure that there is exactly one certified definition of customer and revenue
917
00:45:23,320 --> 00:45:27,240
per domain and that everything else is explicitly second-class?
918
00:45:27,240 --> 00:45:32,000
They maintain semantic standards, naming conventions, default grains, time intelligence policies,
919
00:45:32,000 --> 00:45:34,400
handling of adjustments and exclusions.
920
00:45:34,400 --> 00:45:39,760
The boring work that prevents churn, churned and churn rate from all meaning different things
921
00:45:39,760 --> 00:45:44,240
in different workspaces while using identical display names. And critically, they do all of
922
00:45:44,240 --> 00:45:45,240
this as a platform.
923
00:45:45,240 --> 00:45:48,800
They do not sit between domains and their work, they sit underneath it.
924
00:45:48,800 --> 00:45:51,960
Their mandate is not "you must file a ticket to add a measure."
925
00:45:51,960 --> 00:45:56,080
It is "we will make it easier to reuse the right meanings than to reinvent them, and we will
926
00:45:56,080 --> 00:45:57,840
make it visible when you drift."
927
00:45:57,840 --> 00:46:02,760
In practice that means designing Fabric so that core entities and KPIs live in shared,
928
00:46:02,760 --> 00:46:08,120
domain-owned semantic models, discoverable and clearly certified. Local teams consume those
929
00:46:08,120 --> 00:46:12,720
models by reference, not by cloning and forking when they just need one more slice.
930
00:46:12,720 --> 00:46:16,760
If they truly need a different meaning, they declare a new metric with a new name and
931
00:46:16,760 --> 00:46:18,440
they own it explicitly.
932
00:46:18,440 --> 00:46:22,640
It also means that this team arbitrates common entities across domains.
933
00:46:22,640 --> 00:46:26,840
Customer shows up in sales, marketing, finance, support. Do you want four incompatible
934
00:46:26,840 --> 00:46:31,840
customer dimensions or one shared backbone with domain specific extensions?
935
00:46:31,840 --> 00:46:36,960
The Fabric platform team forces that conversation before every domain builds its own version.
936
00:46:36,960 --> 00:46:38,720
Architecturally this is the difference.
937
00:46:38,720 --> 00:46:42,840
You are not governing who is allowed to open Power BI, you are governing which meanings
938
00:46:42,840 --> 00:46:45,040
are allowed to be reused at scale.
939
00:46:45,040 --> 00:46:49,440
Access control is necessary, it stops the wrong people from touching the right things.
940
00:46:49,440 --> 00:46:53,000
Semantic governance is what stops the right people from trusting the wrong things.
941
00:46:53,000 --> 00:46:57,320
Once you adopt that posture, everything about your Fabric operating model changes.
942
00:46:57,320 --> 00:47:01,520
You stop seeing workspaces as folders where people put stuff and start seeing them as
943
00:47:01,520 --> 00:47:04,480
surfaces where particular meanings are allowed to exist.
944
00:47:04,480 --> 00:47:09,200
You stop measuring success by number of dashboards and start measuring it by percentage of consumption
945
00:47:09,200 --> 00:47:10,880
that hits certified semantics.
946
00:47:10,880 --> 00:47:15,120
You stop asking your admins to be gatekeepers and start asking your platform team to be designers
947
00:47:15,120 --> 00:47:20,200
of an authorization and meaning fabric that reflects how your organization actually makes
948
00:47:20,200 --> 00:47:21,600
decisions.
949
00:47:21,600 --> 00:47:24,080
And once you have that, you can talk about something practical.
950
00:47:24,080 --> 00:47:26,920
How do you stand this up without freezing the tenant for a year?
951
00:47:26,920 --> 00:47:28,200
You don't need a five year program.
952
00:47:28,200 --> 00:47:32,440
You need a clear charter, some domains, and a Day 0, 30, 90 plan that turns governance
953
00:47:32,440 --> 00:47:37,120
from a slide into a set of irreversible decisions about how Fabric is allowed to behave.
954
00:47:37,120 --> 00:47:42,040
The Fabric governance model: charter, domains, Day 0, 30, 90.
955
00:47:42,040 --> 00:47:43,760
At this point you know what goes wrong.
956
00:47:43,760 --> 00:47:48,440
Now you need something uncomfortably specific: a way to run Fabric that doesn't depend on
957
00:47:48,440 --> 00:47:49,440
heroics or hope.
958
00:47:49,440 --> 00:47:54,560
And that starts with a charter. Not a slide that says "enable self-service," but an architectural
959
00:47:54,560 --> 00:47:58,920
sentence that answers a harder question: why does Fabric exist in this organization?
960
00:47:58,920 --> 00:48:02,680
If your honest answer is "because it was in the E5 bundle," stop there.
961
00:48:02,680 --> 00:48:06,440
You're not governing, you're decorating. The only charter that survives contact with Fabric
962
00:48:06,440 --> 00:48:07,800
looks more like this.
963
00:48:07,800 --> 00:48:13,240
We run Fabric so the organization can make trusted decisions at speed using shared semantics
964
00:48:13,240 --> 00:48:14,960
over governed data.
965
00:48:14,960 --> 00:48:19,080
Trusted, at speed, shared semantics, governed data. Everything you do next has to either
966
00:48:19,080 --> 00:48:22,720
support that sentence or admit that you're optimizing for something else.
967
00:48:22,720 --> 00:48:27,640
On that charter you define domains, not just for security but for meaning: finance, sales, HR,
968
00:48:27,640 --> 00:48:31,440
operations, clinical, retail, whatever maps cleanly to how your organization actually makes
969
00:48:31,440 --> 00:48:32,440
decisions.
970
00:48:32,440 --> 00:48:35,000
Each domain gets two explicit owners.
971
00:48:35,000 --> 00:48:38,840
A data owner accountable for what data is allowed to exist, how it's classified, how it
972
00:48:38,840 --> 00:48:44,160
flows. A semantic owner accountable for what core entities and KPIs mean in that domain,
973
00:48:44,160 --> 00:48:46,760
which models are authoritative and what gets certified.
974
00:48:46,760 --> 00:48:50,680
Those are not abstract titles, they are names you can put on a page and more importantly
975
00:48:50,680 --> 00:48:52,480
in Fabric and Purview.
976
00:48:52,480 --> 00:48:55,680
Workspaces then align to those domains, not to projects.
977
00:48:55,680 --> 00:49:00,680
Instead of project X and pilot Y scattered everywhere you get finance prod, finance dev,
978
00:49:00,680 --> 00:49:06,960
sales prod, HR prod, and so on. Regulated domains (clinical, HR, anything with PHI or payroll)
979
00:49:06,960 --> 00:49:12,400
are isolated by design: dedicated capacities if necessary, hardened workspace settings,
980
00:49:12,400 --> 00:49:14,640
no casual shortcuts in or out.
981
00:49:14,640 --> 00:49:17,640
Dev, test, prod become tiers you enforce, not vibes.
982
00:49:17,640 --> 00:49:21,760
Dev workspaces are where experiments happen, they are noisy, short lived and cheap.
983
00:49:21,760 --> 00:49:24,920
Test workspaces exist only for things on a path to production.
984
00:49:24,920 --> 00:49:28,840
Prod workspaces contain only assets someone is willing to sign for.
985
00:49:28,840 --> 00:49:32,960
Inside those prod workspaces you draw the sharpest line you have.
986
00:49:32,960 --> 00:49:35,280
Certified versus promoted semantic models.
987
00:49:35,280 --> 00:49:37,840
Certified models are governed, owned and authoritative.
988
00:49:37,840 --> 00:49:43,080
They represent the meanings you are willing to let AI, executives, and downstream systems reuse
989
00:49:43,080 --> 00:49:47,920
at scale. They have an explicit owner, documented logic, and a change process.
990
00:49:47,920 --> 00:49:49,720
Promoted models are useful but not trusted.
991
00:49:49,720 --> 00:49:52,840
They are allowed to exist, they are allowed to help people think, they are not allowed
992
00:49:52,840 --> 00:49:58,400
to silently redefine revenue or customer for the organization. Everything else is experimental
993
00:49:58,400 --> 00:49:59,960
and you treat it that way.
994
00:49:59,960 --> 00:50:03,460
Of course none of this appears magically, you have a tenant full of drift already, so you
995
00:50:03,460 --> 00:50:05,560
need a Day 0, Day 30, Day 90 plan.
996
00:50:05,560 --> 00:50:09,600
Day 0 is the freeze. You don't turn Fabric off; you stop net-new chaos. You freeze uncontrolled
997
00:50:09,600 --> 00:50:13,360
workspace creation by tightening who can create them and under which domains.
998
00:50:13,360 --> 00:50:17,520
You stop people from publishing brand new semantic models into production workspaces without
999
00:50:17,520 --> 00:50:22,600
review. You identify the top 10 or 20 datasets and models by consumption, the ones most
1000
00:50:22,600 --> 00:50:27,480
of the organization already depends on, and you put a temporary "do not clone casually" sticker
1001
00:50:27,480 --> 00:50:28,760
on them.
1002
00:50:28,760 --> 00:50:33,200
The goal of Day 0 is simple: stop digging. Day 30 is inventory and ownership. By then your
1003
00:50:33,200 --> 00:50:37,640
platform team has run through usage metrics, Purview scans, and workspace lists.
1004
00:50:37,640 --> 00:50:41,520
For anything in a prod-like workspace that is clearly in active use, reports opened
1005
00:50:41,520 --> 00:50:46,640
regularly, models queried constantly, tables feeding multiple downstream artifacts, you
1006
00:50:46,640 --> 00:50:50,040
assign an owner, not a team, a person.
1007
00:50:50,040 --> 00:50:54,200
Every production data set, every widely used semantic model, every lake house that feeds
1008
00:50:54,200 --> 00:51:00,320
critical dashboards gets a named data owner and where applicable a semantic owner.
1009
00:51:00,320 --> 00:51:04,640
You log that somewhere boring and durable: a catalog, a table, whatever you will actually
1010
00:51:04,640 --> 00:51:09,520
maintain. In parallel, you define semantic standards for the top handful of KPIs: revenue,
1011
00:51:09,520 --> 00:51:12,280
customer, churn, risk, whatever your board obsesses over.
1012
00:51:12,280 --> 00:51:16,520
For each you decide which model is authoritative, what the definition is and how it will be
1013
00:51:16,520 --> 00:51:17,520
surfaced.
1014
00:51:17,520 --> 00:51:21,200
You stand up the smallest possible certification process that makes changes visible and
1015
00:51:21,200 --> 00:51:24,360
reviewable without recreating the BI police.
1016
00:51:24,360 --> 00:51:28,240
By Day 30, nothing is everybody's problem anymore. Drift can still happen, but at least you
1017
00:51:28,240 --> 00:51:30,080
know whose job it is to notice.
1018
00:51:30,080 --> 00:51:32,000
Day 90 is enforcement.
1019
00:51:32,000 --> 00:51:36,160
This is where domains stop being labels and start being boundaries. You align every workspace
1020
00:51:36,160 --> 00:51:40,640
to a domain, you shut down or archive the ones nobody will claim, you enforce that regulated
1021
00:51:40,640 --> 00:51:44,960
data only lives in regulated domains, and you back that up with Purview classification
1022
00:51:44,960 --> 00:51:47,560
and DLP, not just naming conventions.
1023
00:51:47,560 --> 00:51:52,160
In Fabric you turn on the features you were pretending to use: enforce domain isolation,
1024
00:51:52,160 --> 00:51:56,520
constrain cross-workspace semantic reuse to certified models where it matters, and start
1025
00:51:56,520 --> 00:52:02,000
wiring AI access (Copilot, data agents) so that by default they can only see certified
1026
00:52:02,000 --> 00:52:03,080
semantic layers.
1027
00:52:03,080 --> 00:52:07,160
You also start measuring what percentage of consumption is hitting certified models versus
1028
00:52:07,160 --> 00:52:10,760
everything else, how many production data sets lack an owner, how many definitions of
1029
00:52:10,760 --> 00:52:13,800
revenue still exist, and is that number going down?
1030
00:52:13,800 --> 00:52:17,920
By Day 90 the platform will look roughly the same from the outside. The critical difference
1031
00:52:17,920 --> 00:52:19,880
is this.
1032
00:52:19,880 --> 00:52:24,360
Before, Fabric reflected your drift and hid it behind good intentions. After, Fabric reflects
1033
00:52:24,360 --> 00:52:28,040
your intent and makes drift visible the moment it matters.
1034
00:52:28,040 --> 00:52:32,200
Metrics that matter in Fabric governance. At this point intent is clear, the operating model
1035
00:52:32,200 --> 00:52:35,560
is defined. Now you need something much less glamorous.
1036
00:52:35,560 --> 00:52:36,560
Proof.
1037
00:52:36,560 --> 00:52:40,280
If you can't measure whether semantic drift is shrinking or growing, you are not governing
1038
00:52:40,280 --> 00:52:42,080
Fabric, you're decorating it.
1039
00:52:42,080 --> 00:52:45,680
So what does proof look like in a Fabric tenant? Start with the only metric that really tells
1040
00:52:45,680 --> 00:52:49,760
you whether semantics are landing: what percentage of enterprise consumption hits certified
1041
00:52:49,760 --> 00:52:54,880
semantic models. Not how many models are certified; that's vanity. Consumption is the
1042
00:52:54,880 --> 00:52:56,680
reality.
1043
00:52:56,680 --> 00:53:01,600
You want to know across reports, dashboards, Excel connections and AI agents, what fraction
1044
00:53:01,600 --> 00:53:05,920
of queries resolve against a small known set of certified models versus everything else.
1045
00:53:05,920 --> 00:53:10,760
If that number is low, you've built a beautiful semantic layer nobody is actually using.
1046
00:53:10,760 --> 00:53:14,400
If that number grows over time, you're bending drift back toward intent.
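A rough sketch of that first metric, assuming you can export one record per query with the model it hit and that model's endorsement status. The field names and sample records here are hypothetical, not a Fabric export schema:

```python
from collections import Counter

# Hypothetical usage log: one record per query. Field names and values
# are assumptions for illustration only.
query_log = [
    {"model": "Finance Revenue", "endorsement": "certified"},
    {"model": "Sales Revenue (fork)", "endorsement": "promoted"},
    {"model": "Finance Revenue", "endorsement": "certified"},
    {"model": "Churn scratch", "endorsement": "none"},
]


def certified_share(log):
    """Return the fraction of queries that resolved against certified models."""
    if not log:
        return 0.0
    counts = Counter(record["endorsement"] for record in log)
    return counts["certified"] / len(log)


share = certified_share(query_log)
print(f"{share:.0%} of consumption hits certified models")
```

The point of counting queries rather than models is the one made above: a certified model nobody queries does not move this number.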
1047
00:53:14,400 --> 00:53:16,520
Second, data owner coverage.
1048
00:53:16,520 --> 00:53:21,040
For every data set, lake house, warehouse and semantic model that behaves like production,
1049
00:53:21,040 --> 00:53:25,320
meaning people depend on it to make decisions, not just to experiment, you want a named
1050
00:53:25,320 --> 00:53:26,600
accountable owner.
1051
00:53:26,600 --> 00:53:30,800
This is binary, there is no partial credit, either there is a person whose name you can
1052
00:53:30,800 --> 00:53:33,400
put next to that asset or there isn't.
1053
00:53:33,400 --> 00:53:37,800
Your metric is simple, the proportion of production artifacts with a named data owner and
1054
00:53:37,800 --> 00:53:39,680
where relevant a semantic owner.
1055
00:53:39,680 --> 00:53:44,720
If a model powers revenue, risk, churn or anything else executives argue about, it belongs
1056
00:53:44,720 --> 00:53:46,800
to the team" is not an answer.
1057
00:53:46,800 --> 00:53:49,320
Teams don't attend change approval calls, people do.
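Owner coverage can be sketched the same way, assuming an inventory you maintain yourself. The record layout below (tier, data_owner) is hypothetical, and an empty or missing owner counts as unowned because the metric is binary:

```python
# Hypothetical asset inventory; names and fields are illustrative assumptions.
inventory = [
    {"name": "Finance Revenue", "tier": "prod", "data_owner": "A. Person"},
    {"name": "Sales Lakehouse", "tier": "prod", "data_owner": ""},
    {"name": "Quality Gold", "tier": "prod", "data_owner": "B. Person"},
    {"name": "Pilot scratch", "tier": "dev", "data_owner": None},
]


def owner_coverage(assets):
    """Proportion of production assets that have a named person as data owner."""
    prod = [asset for asset in assets if asset["tier"] == "prod"]
    if not prod:
        return 1.0  # nothing in production, so nothing is unowned
    owned = [asset for asset in prod if asset.get("data_owner")]
    return len(owned) / len(prod)


print(f"owner coverage: {owner_coverage(inventory):.0%}")
```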
1058
00:53:49,320 --> 00:53:53,640
Third, duplicate semantic models per core KPI. Pick the half-dozen metrics that define your
1059
00:53:53,640 --> 00:53:54,640
business.
1060
00:53:54,640 --> 00:53:58,960
Revenue, customer, churn, risk, same store sales, whatever fits your reality.
1061
00:53:58,960 --> 00:54:03,040
For each one, count how many distinct implementations exist in Fabric. Not how many times the word
1062
00:54:03,040 --> 00:54:06,920
appears, but how many different DAX or SQL definitions you would find if you traced them.
1063
00:54:06,920 --> 00:54:11,440
If you have 10 versions of revenue and one certified model, your goal is not to drag everything
1064
00:54:11,440 --> 00:54:15,240
into central finance, it is to create a visible trajectory.
1065
00:54:15,240 --> 00:54:20,880
Those 10 becoming 8, then 5, then 3 as teams converge on shared semantics or rename their
1066
00:54:20,880 --> 00:54:21,880
local variants honestly.
1067
00:54:21,880 --> 00:54:25,360
You're not chasing perfection, you're chasing monotonic improvement.
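As a sketch of that count, assume you have collected each measure's KPI name and its expression text. The sample expressions are made up, and the normalization is deliberately crude (lowercasing and stripping whitespace), so two definitions that differ only in formatting count once while genuinely different logic counts separately:

```python
# Hypothetical measure inventory gathered from semantic models.
measures = [
    {"kpi": "revenue", "expression": "SUM(Sales[Amount])"},
    {"kpi": "revenue", "expression": "sum( Sales[Amount] )"},
    {"kpi": "revenue", "expression": "SUM(Sales[Commissionable])"},
    {"kpi": "churn", "expression": "COUNTROWS(Churned)"},
]


def distinct_implementations(items):
    """Count distinct definitions per KPI, ignoring case and whitespace."""
    by_kpi = {}
    for item in items:
        normalized = "".join(item["expression"].lower().split())
        by_kpi.setdefault(item["kpi"], set()).add(normalized)
    return {kpi: len(definitions) for kpi, definitions in by_kpi.items()}


def is_non_increasing(counts):
    """True if each periodic count is no higher than the one before it."""
    return all(later <= earlier for earlier, later in zip(counts, counts[1:]))


print(distinct_implementations(measures))
print(is_non_increasing([10, 8, 5, 3]))
```

Tracking the per-KPI count at each review and checking that the series never goes up is one way to make the trajectory visible.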
1068
00:54:25,360 --> 00:54:30,480
Fourth, sensitivity label compliance in regulated domains. In clinical, HR, finance, anywhere
1069
00:54:30,480 --> 00:54:34,280
regulators like to live, you want a hard number for how much of the data estate is actually
1070
00:54:34,280 --> 00:54:36,080
classified and enforced.
1071
00:54:36,080 --> 00:54:41,160
Purview can tell you how many assets are labeled, which labels they carry, and where labels propagate.
1072
00:54:41,160 --> 00:54:45,320
Your governance metric is the percentage of tables, files and models that contain regulated
1073
00:54:45,320 --> 00:54:50,600
data and also carry the correct sensitivity labels with enforcement policies active.
1074
00:54:50,600 --> 00:54:54,840
If that number is low, you are not running regulated domains, you are running wishful thinking
1075
00:54:54,840 --> 00:54:56,520
with good slideware.
1076
00:54:56,520 --> 00:54:59,440
Fifth, AI consumption from uncertified models.
1077
00:54:59,440 --> 00:55:02,040
This one is your early warning system for semantic risk.
1078
00:55:02,040 --> 00:55:06,840
For every Copilot interaction, every data agent query, every AI-driven workflow, you want
1079
00:55:06,840 --> 00:55:10,560
to know which models were used to answer the question or drive the action.
1080
00:55:10,560 --> 00:55:15,120
And whether those models were certified, merely promoted or completely ad hoc.
1081
00:55:15,120 --> 00:55:19,080
If a large share of AI usage is grounded on uncertified semantics, then your most powerful
1082
00:55:19,080 --> 00:55:22,360
amplification engine is wired directly into your drift.
1083
00:55:22,360 --> 00:55:25,560
You shouldn't be surprised when it embarrasses you in front of executives.
1084
00:55:25,560 --> 00:55:27,880
You should be surprised it hasn't done so more often.
1085
00:55:27,880 --> 00:55:32,080
There are other metrics you can track: workspace sprawl, orphaned artifacts, time to certify,
1086
00:55:32,080 --> 00:55:36,520
but these five form a minimum viable truth dashboard for Fabric governance: percentage
1087
00:55:36,520 --> 00:55:41,160
of consumption on certified semantics, owner coverage for production assets, duplicate
1088
00:55:41,160 --> 00:55:43,520
implementations per core KPI.
1089
00:55:43,520 --> 00:55:48,080
Label compliance in regulated domains, AI usage grounded on uncertified models.
1090
00:55:48,080 --> 00:55:51,840
If any of those are unmeasured, you are flying a very expensive platform with the instruments
1091
00:55:51,840 --> 00:55:52,840
turned off.
1092
00:55:52,840 --> 00:55:56,920
And if you find yourself tempted to invent softer, more flattering metrics, number of
1093
00:55:56,920 --> 00:56:01,800
workspaces created, reports published, Copilot questions asked.
1094
00:56:01,800 --> 00:56:04,040
Stop and ask a harder question.
1095
00:56:04,040 --> 00:56:07,800
Do these numbers tell us anything about whether we are enforcing meaning or just about
1096
00:56:07,800 --> 00:56:10,840
how busy we are at generating more of it?
1097
00:56:10,840 --> 00:56:15,280
Because once you start treating semantics as a governed surface, not an accidental byproduct,
1098
00:56:15,280 --> 00:56:18,840
something else becomes possible. You can apply the same posture to meaning that you already
1099
00:56:18,840 --> 00:56:23,960
pretend to apply to networks and identities: zero trust. Not just for who connects and
1100
00:56:23,960 --> 00:56:27,840
from where, but for which definitions you are willing to trust by default.
1101
00:56:27,840 --> 00:56:29,440
Zero trust for data and semantics.
1102
00:56:29,440 --> 00:56:31,920
Zero trust has been marketed to you for years.
1103
00:56:31,920 --> 00:56:34,640
Never trust, always verify.
1104
00:56:34,640 --> 00:56:35,640
Assume breach.
1105
00:56:35,640 --> 00:56:37,240
Enforce least privilege.
1106
00:56:37,240 --> 00:56:41,520
Most organizations dutifully apply that to networks, devices and sign-ins.
1107
00:56:41,520 --> 00:56:42,880
They turned on conditional access.
1108
00:56:42,880 --> 00:56:43,880
They tightened VPNs.
1109
00:56:43,880 --> 00:56:45,840
They added MFA to anything that moved.
1110
00:56:45,840 --> 00:56:49,240
Almost nobody applied it to data and nobody applied it to meaning.
1111
00:56:49,240 --> 00:56:51,760
In a Fabric world, that gap is no longer theoretical.
1112
00:56:51,760 --> 00:56:56,480
So take the same three zero trust principles and translate them ruthlessly into how you
1113
00:56:56,480 --> 00:56:58,880
treat both data and semantics.
1114
00:56:58,880 --> 00:57:00,840
Start with verify explicitly.
1115
00:57:00,840 --> 00:57:04,680
In identity terms, that means you don't trust a token just because it exists.
1116
00:57:04,680 --> 00:57:07,520
You evaluate device state, location, risk.
1117
00:57:07,520 --> 00:57:10,440
In Fabric terms, you extend that posture to three layers.
1118
00:57:10,440 --> 00:57:12,520
Who is this identity in Entra?
1119
00:57:12,520 --> 00:57:14,720
Which domain and workspace are they operating in?
1120
00:57:14,720 --> 00:57:16,680
Which semantic models are they allowed to reuse?
1121
00:57:16,680 --> 00:57:20,520
You stop assuming that because someone works in finance, they should see every finance model.
1122
00:57:20,520 --> 00:57:25,080
You stop assuming that because a semantic model lives in a prod workspace, it must be
1123
00:57:25,080 --> 00:57:26,240
authoritative.
1124
00:57:26,240 --> 00:57:28,840
Every access to a certified semantic model is a decision.
1125
00:57:28,840 --> 00:57:32,560
Does this person in this role, in this domain, need to reuse this meaning?
1126
00:57:32,560 --> 00:57:36,120
If the answer is no, they can still explore data, but they do it against non-certified layers
1127
00:57:36,120 --> 00:57:39,240
where their experiments can't redefine enterprise truth.
1128
00:57:39,240 --> 00:57:41,040
Then, least privilege.
1129
00:57:41,040 --> 00:57:45,400
Most people implement least privilege as "give analysts Viewer, not Member."
1130
00:57:45,400 --> 00:57:46,880
Architecturally that's shallow.
1131
00:57:46,880 --> 00:57:50,360
Least privilege for data is not just about rows and columns; it's about which metrics can
1132
00:57:50,360 --> 00:57:51,360
be referenced where.
1133
00:57:51,360 --> 00:57:56,120
A sales analyst might need row-level access to detailed transactions in their region.
1134
00:57:56,120 --> 00:57:59,800
That doesn't mean they should be allowed to build new AI agents grounded on the global
1135
00:57:59,800 --> 00:58:01,080
certified revenue model.
1136
00:58:01,080 --> 00:58:05,240
A data scientist working on churn experiments might need full column access to customer features
1137
00:58:05,240 --> 00:58:06,240
in a sandbox.
1138
00:58:06,240 --> 00:58:09,920
That doesn't mean their experimental churn score 7 should ever show up in Copilot suggestions
1139
00:58:09,920 --> 00:58:10,920
for executives.
1140
00:58:10,920 --> 00:58:15,160
Least privilege for semantics means you constrain which models can be used as sources for other
1141
00:58:15,160 --> 00:58:16,160
models.
1142
00:58:16,160 --> 00:58:20,480
You constrain which models AI can see by default. You constrain which models can be referenced
1143
00:58:20,480 --> 00:58:22,280
across domains without review.
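Those three constraints can be read as a single composition check. The sketch below is illustrative only: `Model`, `can_compose`, and `reviewed_pairs` are hypothetical names, and Fabric does not expose a policy function like this. It simply shows "can you compose with this meaning?" as a decision separate from "can you query this table?":

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    # Hypothetical description of a semantic model for policy purposes.
    name: str
    domain: str
    certified: bool


def can_compose(source, consumer_domain, consumer_is_ai, reviewed_pairs=frozenset()):
    """Decide whether a consumer may build on `source`.

    reviewed_pairs holds (model name, consumer domain) pairs that have
    already passed a cross-domain review.
    Returns (allowed, reason).
    """
    # AI only sees certified models by default.
    if consumer_is_ai and not source.certified:
        return False, "AI may only ground on certified models by default"
    # Cross-domain reuse needs an explicit review on record.
    if source.domain != consumer_domain and (source.name, consumer_domain) not in reviewed_pairs:
        return False, "cross-domain reuse requires review"
    return True, "ok"
```

A sandbox model stays usable for human exploration inside its own domain, while agents and other domains are denied until someone makes the decision explicit.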
1144
00:58:22,280 --> 00:58:26,320
You are not just asking "Can you query this table?" You are asking "Can you compose with
1145
00:58:26,320 --> 00:58:29,800
this meaning?" Finally, assume breach.
1146
00:58:29,800 --> 00:58:34,720
In the network world, that means you operate as though an attacker is already inside. In Fabric,
1147
00:58:34,720 --> 00:58:39,600
you operate as though drift is already inside, because it is. Assume access has already widened
1148
00:58:39,600 --> 00:58:43,960
beyond what your diagram says. Assume there are already five versions of revenue and
1149
00:58:43,960 --> 00:58:46,120
three of churn in production.
1150
00:58:46,120 --> 00:58:50,480
Assume there are orphaned models that AI will happily root through if you don't stop it.
1151
00:58:50,480 --> 00:58:54,160
Assume drift means you build continuous observation into the platform.
1152
00:58:54,160 --> 00:58:58,680
You monitor for semantic changes: a certified measure's definition changes, a model's filters
1153
00:58:58,680 --> 00:59:01,840
are edited, a key relationship is dropped.
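One way to picture that monitoring, as a sketch under stated assumptions: suppose you take periodic snapshots of a certified model as a dictionary mapping object names (measures, filters, relationships) to their definition text. The snapshot format and the `governance_events` helper are hypothetical, not a Fabric feature. Comparing fingerprints of two snapshots then yields the events worth alerting on:

```python
import hashlib


def fingerprint(definition: str) -> str:
    """Hash a definition after collapsing whitespace, so reformatting alone is not flagged."""
    normalized = " ".join(definition.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def governance_events(previous: dict, current: dict) -> list:
    """Compare two snapshots (name -> definition text) and list what changed.

    Returns (event, name) tuples where event is "added", "removed", or "changed".
    """
    events = []
    for name in sorted(previous.keys() | current.keys()):
        old, new = previous.get(name), current.get(name)
        if old is None:
            events.append(("added", name))
        elif new is None:
            events.append(("removed", name))
        elif fingerprint(old) != fingerprint(new):
            events.append(("changed", name))
    return events
```

Each returned tuple is a candidate governance event: something to route to the model's owner rather than leave buried in version history.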
1154
00:59:01,840 --> 00:59:06,320
Those events aren't just version history; they are governance events. Someone should be alerted;
1155
00:59:06,320 --> 00:59:09,960
in some cases, downstream consumers should be forced to re-acknowledge that the meaning
1156
00:59:09,960 --> 00:59:15,920
changed. You monitor for ownership gaps: production artifacts without owners, certified models
1157
00:59:15,920 --> 00:59:20,440
where the listed owner hasn't logged in for six months, workspaces with high consumption
1158
00:59:20,440 --> 00:59:22,600
and zero named stewards.
1159
00:59:22,600 --> 00:59:27,120
You monitor for usage anomalies: AI agents pulling heavily from uncertified models, sudden spikes
1160
00:59:27,120 --> 00:59:31,800
in consumption of a sandbox lakehouse, reports in executive apps grounded on non-certified
1161
00:59:31,800 --> 00:59:36,520
semantics. In other words, you stop trusting your own configuration. You treat every semantic
1162
00:59:36,520 --> 00:59:42,320
object as guilty until it has a clear owner, a clear definition, a clear endorsement state,
1163
00:59:42,320 --> 00:59:46,760
and a clear usage pattern that matches its intent. Zero trust for data is not about locking
1164
00:59:46,760 --> 00:59:51,400
everything down; it is about trusting fewer things by default. You deliberately shrink the
1165
00:59:51,400 --> 00:59:56,280
set of meanings that are allowed to flow freely into AI, into board decks, into automated workflows.
1166
00:59:56,280 --> 01:00:00,720
Everything else has to earn its way in through ownership, certification, and observation.
1167
01:00:00,720 --> 01:00:05,760
Once you take that posture seriously, design decisions change. You create fewer one-off semantic
1168
01:00:05,760 --> 01:00:09,480
models because you know they will become attack surfaces for drift. You push harder on
1169
01:00:09,480 --> 01:00:13,800
reuse of certified entities because that's the only way to keep your AI surface area small
1170
01:00:13,800 --> 01:00:18,520
enough to reason about. You stop pretending that more models is progress and start measuring
1171
01:00:18,520 --> 01:00:22,960
more consumption on fewer, better models as the real signal. And you stop blaming Fabric
1172
01:00:22,960 --> 01:00:28,440
for doing exactly what you told it to do: secure the objects, accelerate the drift, and expose
1173
01:00:28,440 --> 01:00:32,400
whether you are willing to govern meaning with the same paranoia you already claim to apply
1174
01:00:32,400 --> 01:00:38,880
to identity. The future: Fabric, AI agents, and meaning at scale. Most organizations still talk
1175
01:00:38,880 --> 01:00:43,760
about AI as if it were a smarter reporting tool. Architecturally, it is something else. The future
1176
01:00:43,760 --> 01:00:48,240
of Fabric is not people opening Power BI and clicking through semantic models. It is agents
1177
01:00:48,240 --> 01:00:53,960
(Copilot, data agents, operational bots) consuming your semantics directly, without you in the loop.
1178
01:00:53,960 --> 01:00:58,520
Your semantic model stops being the thing behind the report. It becomes an API to AI. Fabric
1179
01:00:58,520 --> 01:01:03,760
IQ makes that explicit. Underneath the branding, it is building an ontology, an entity graph
1180
01:01:03,760 --> 01:01:11,880
of your existing artifacts (customer, order, shipment, sensor, contract, churn risk), the relationships,
1181
01:01:11,880 --> 01:01:16,120
the rules, the constraints. It is taking the semantics you already encoded in models, tables,
1182
01:01:16,120 --> 01:01:20,720
and logs and compiling them into something agents can reason over. That sounds powerful. It
1183
01:01:20,720 --> 01:01:25,680
is also unforgiving, because wrong meaning scales faster than wrong data. If you mistype
1184
01:01:25,680 --> 01:01:30,800
a value in a source system, the blast radius is local. A few reports are wrong, someone notices,
1185
01:01:30,800 --> 01:01:35,920
you fix it. If you misdefine "at-risk customer" in the ontology that every AI agent uses to
1186
01:01:35,920 --> 01:01:40,720
triage support tickets, route renewals, and trigger discounts, that error propagates everywhere,
1187
01:01:40,720 --> 01:01:45,640
instantly. Every bot that calls that definition makes the same wrong decision. Every automation
1188
01:01:45,640 --> 01:01:49,560
built on top of those bots inherits the same flaw. You don't just have a bad report; you
1189
01:01:49,560 --> 01:01:54,320
have institutionalized a bad rule. Fabric's direction is clear: more of your operational logic
1190
01:01:54,320 --> 01:01:59,000
will live in that semantic layer. Operational agents watch real-time streams from OneLake,
1191
01:01:59,000 --> 01:02:03,320
KQL, or IoT. They use the ontology to understand that truck temperature above threshold for more
1192
01:02:03,320 --> 01:02:07,880
than four hours, and shipment contains vaccine, means cold chain breach, which means: alert compliance,
1193
01:02:07,880 --> 01:02:12,760
hold inventory, notify customer. That's not a dashboard; that is a controlled process grounded
1194
01:02:12,760 --> 01:02:17,560
in semantics. If the ontology is wrong, if "vaccine" doesn't include a new product line, if the
1195
01:02:17,560 --> 01:02:22,360
threshold logic was copied from a pilot and never updated, then the agent will do exactly
1196
01:02:22,360 --> 01:02:27,720
what you told it, at machine speed, with no hesitation. This is the uncomfortable consistency
1197
01:02:27,720 --> 01:02:33,480
of AI. Agents are not creative about your meanings. They are obedient. They will apply whatever
1198
01:02:33,480 --> 01:02:37,400
definitions they can reach. They will not stop mid-workflow and ask, "Are you sure this is the right
1199
01:02:37,400 --> 01:02:42,680
revenue?" or "Did legal approve this risk score for use in Europe?" That's your job, upstream.
1200
01:02:42,680 --> 01:02:47,880
So the question "How do I govern Fabric data access?" quietly becomes "Which semantics am I willing
1201
01:02:47,880 --> 01:02:53,320
to expose as APIs to automation, and under what conditions?" That is what Fabric IQ and future
1202
01:02:53,320 --> 01:02:58,040
agents formalize. They turn your loosely managed semantic layer into a first-class dependency graph
1203
01:02:58,040 --> 01:03:02,520
for decision-making, compliance, and control. They don't create new risk categories; they compress the
1204
01:03:02,520 --> 01:03:07,880
time it takes for your existing semantic risk to turn into real-world consequences. Organizations
1205
01:03:07,880 --> 01:03:12,520
that understand this treat semantics as critical infrastructure. They build change control for measures
1206
01:03:12,520 --> 01:03:17,480
the way they build change control for firewall rules. They test new ontology relationships with
1207
01:03:17,480 --> 01:03:22,040
synthetic scenarios before letting agents act on them. They restrict AI access to a narrow band
1208
01:03:22,040 --> 01:03:26,600
of certified concepts until they've earned the right to widen it. Everyone else keeps asking why
1209
01:03:26,600 --> 01:03:31,880
Copilot hallucinates when, in reality, it is just following their ontology. The future of Fabric is
1210
01:03:31,880 --> 01:03:36,920
not optional semantics; it is meaning at scale. You either govern that meaning on purpose, or you watch
1211
01:03:36,920 --> 01:03:42,840
AI industrialize whatever you left lying around. The real answer to "How do I govern Fabric data access?"
1212
01:03:42,840 --> 01:03:47,480
So here's the honest answer to the question people type into search: you govern Fabric data access
1213
01:03:47,480 --> 01:03:53,240
by treating access as the easy part and meaning as the hard part. Entra, OneLake, workspace roles,
1214
01:03:53,240 --> 01:03:57,960
Purview: the platform already knows how to lock doors. Your real work is deciding which semantics
1215
01:03:57,960 --> 01:04:03,240
are allowed to leave the room, who owns them, and when AI is allowed to reuse them without asking.
1216
01:04:03,240 --> 01:04:09,080
Fabric is secure by design. Your data model will drift unless governance is engineered into creation,
1217
01:04:09,080 --> 01:04:14,520
sharing, and consumption. If you're responsible for enterprise data trust, your next steps are simple
1218
01:04:14,520 --> 01:04:19,960
and non-optional: define domains, stand up a real platform team, certify semantics, measure drift,
1219
01:04:19,960 --> 01:04:24,840
and keep AI constrained to what you actually trust. Because Microsoft Fabric doesn't break data
1220
01:04:24,840 --> 01:04:29,560
governance. It exposes whether your organization can agree on truth fast enough to scale AI.