Foundry Is the Next Shadow IT Risk (Without This Purview Rule)
This episode opens with a blunt warning: Microsoft Foundry isn’t just another AI feature you can casually approve and forget. It’s an agent factory, and if execution comes before governance, you are almost guaranteed to create the next generation of shadow IT. Most future AI incidents won’t come from models hallucinating answers. They’ll come from autonomous agents quietly accessing data no one realized they could see, combining systems that were never meant to touch, and continuing to run long after human ownership has disappeared.
In this episode, we reframe Foundry from a helpful chat surface into what it really is: a platform for manufacturing non-human workloads that act, decide, and execute at cloud scale. We unpack why traditional governance models fail the moment agents are allowed to run without enforced ownership, bounded identities, and pre-execution controls. Drawing on hard lessons from SharePoint, Power Apps, and Teams, the episode shows how familiar patterns of “innovation first, governance later” collapse faster and more silently when autonomy is involved.
If you’re responsible for identity, compliance, or platform governance, this conversation isn’t theoretical. It’s a roadmap of the failure modes you’re likely to face, why they’re structurally inevitable, and what has to change before your first agent-driven incident forces the issue for you.
Shadow IT didn’t disappear; it evolved. In this episode, we break down why Foundry is quietly becoming the next major Shadow IT risk inside organizations, especially as teams rush to build AI apps, copilots, and agents faster than security and governance can keep up. What used to be unsanctioned SaaS tools has now turned into unsanctioned AI workloads, and the implications are far more serious.
🚨 The New Face of Shadow IT: AI & Agents
Foundry makes it incredibly easy for developers, data teams, and even business units to spin up powerful AI-driven applications and agents. That speed is exactly the problem. When Foundry environments are created without guardrails:
- Security teams may not even know the apps exist
- Sensitive data can be accessed or processed without oversight
- Agents may run autonomously with excessive permissions
- Compliance boundaries become blurred or completely bypassed
This episode explains why AI platforms amplify Shadow IT risk, rather than just repeating old mistakes.
🔐 Why One Missing Purview Rule Changes Everything
We dig into the critical role of Microsoft Purview in governing Foundry environments, and how missing even a single policy can create a massive blind spot. Without the right Purview configuration:
- Data classification may not apply to AI prompts or outputs
- DLP controls may never trigger
- Sensitive information can be exposed through agent workflows
- Organizations lose visibility into how data is being used, transformed, or shared by AI
This isn’t about blocking innovation; it’s about ensuring AI is deployed safely, visibly, and intentionally.
🤖 AI Agents Are Not “Just Apps”
One of the biggest mindset shifts discussed in this episode: AI agents must be treated as first-class IT assets. Agents don’t just read data; they act on it.
They can:
- Chain tools together
- Make decisions
- Trigger downstream systems
- Operate continuously without human review
If these agents are created in Foundry without identity controls, policy enforcement, and governance, they effectively become autonomous shadow employees with access to your data.
🧠 Where Organizations Are Getting This Wrong
We explore common mistakes teams are making right now:
- Letting developers deploy Foundry solutions before governance is ready
- Assuming Purview “just works” for AI by default
- Treating AI experimentation as low-risk
- Ignoring agent identities and permissions
- Failing to inventory AI workloads across the environment
The result? Security teams are left reacting after incidents instead of preventing them.
✅ What You Should Be Doing Instead
This episode outlines practical steps organizations should take immediately:
- Define ownership for every Foundry environment and agent
- Apply Purview policies before AI goes to production
- Ensure data classification follows AI inputs and outputs
- Monitor agent behavior, not just user behavior
- Bring security into the AI development lifecycle early
The goal isn’t to slow teams down; it’s to make sure speed doesn’t come at the cost of control.
🔑 Key Takeaways
- Shadow IT is no longer just apps — it’s AI platforms and agents
- Foundry dramatically lowers the barrier to creating risky workloads
- One missing Purview rule can eliminate visibility entirely
- AI agents require the same (or stronger) governance as human users
- Security must evolve alongside AI, not chase it afterward
🎯 Who This Episode Is For
- Security leaders worried about AI risk and governance
- IT teams managing rapid AI adoption
- Architects designing modern AI platforms
- Compliance professionals navigating AI-driven data usage
- Developers building in Foundry who want to do it right
Become a supporter of this podcast: https://www.spreaker.com/podcast/m365-show-modern-work-security-and-productivity-with-microsoft-365--6704921/support.
1
00:00:00,000 --> 00:00:01,800
Microsoft Foundry is not an AI feature.
2
00:00:01,800 --> 00:00:03,000
It is an agent factory.
3
00:00:03,000 --> 00:00:05,120
And every agent factory becomes Shadow IT,
4
00:00:05,120 --> 00:00:08,000
the moment execution is allowed before governance is enforced.
5
00:00:08,000 --> 00:00:10,760
Most AI incidents won't be caused by hallucinations.
6
00:00:10,760 --> 00:00:13,680
They'll be caused by agents acting on data no one realized they could see.
7
00:00:13,680 --> 00:00:16,920
If you're responsible for identity, compliance, or platform governance,
8
00:00:16,920 --> 00:00:19,200
this episode is not optional listening.
9
00:00:19,200 --> 00:00:21,520
Because the question won't be whether Foundry was allowed,
10
00:00:21,520 --> 00:00:23,240
but it will be why no one stopped it.
11
00:00:23,240 --> 00:00:26,280
Let me explain why that failure is structurally inevitable
12
00:00:26,280 --> 00:00:28,560
and how to block it before your first incident.
13
00:00:29,560 --> 00:00:31,800
Reframing Foundry from Feature to Factory.
14
00:00:31,800 --> 00:00:34,040
Reframing Foundry from Feature to Factory.
15
00:00:34,040 --> 00:00:36,880
When most people hear Microsoft Foundry,
16
00:00:36,880 --> 00:00:38,400
they see the demo in their head,
17
00:00:38,400 --> 00:00:40,040
model catalog, agents playground,
18
00:00:40,040 --> 00:00:42,160
a few connectors to SharePoint or Fabric,
19
00:00:42,160 --> 00:00:43,640
maybe a logic app in the background.
20
00:00:43,640 --> 00:00:46,880
In other words, a nice app portal on top of Azure AI.
21
00:00:46,880 --> 00:00:48,080
Governance hears that and thinks,
22
00:00:48,080 --> 00:00:50,480
"Okay, another tool. We'll add it to the list."
23
00:00:50,480 --> 00:00:52,720
That mental model is wrong in a way that matters.
24
00:00:52,720 --> 00:00:54,240
Governance thinks, "Tool."
25
00:00:54,240 --> 00:00:58,000
Reality: autonomous workload.
26
00:00:58,000 --> 00:01:00,360
Foundry is not where people chat with AI.
27
00:01:00,360 --> 00:01:04,240
Foundry is where agents are manufactured and released into your environment.
28
00:01:04,240 --> 00:01:05,360
Pause here for a second.
29
00:01:05,360 --> 00:01:08,920
Ask yourself: if one of those agents wakes up at night and starts touching data,
30
00:01:08,920 --> 00:01:09,800
who do I call?
31
00:01:09,800 --> 00:01:11,000
If you don't have a precise answer,
32
00:01:11,000 --> 00:01:13,760
you already know why we're having this conversation.
33
00:01:13,760 --> 00:01:15,000
From a systems perspective,
34
00:01:15,000 --> 00:01:17,800
Foundry assembles four things into one execution surface:
35
00:01:17,800 --> 00:01:20,480
models, tools, knowledge, and observability.
36
00:01:20,480 --> 00:01:22,040
Models provide reasoning.
37
00:01:22,040 --> 00:01:23,720
They decide what to do next.
38
00:01:23,720 --> 00:01:25,040
Tools are the actuators.
39
00:01:25,040 --> 00:01:27,440
Logic Apps, Functions, APIs, Graph.
40
00:01:27,440 --> 00:01:28,440
Knowledge is your data.
41
00:01:28,440 --> 00:01:30,800
SharePoint, Fabric, vector stores, search.
42
00:01:30,800 --> 00:01:32,120
Observability is the trace,
43
00:01:32,120 --> 00:01:34,200
if you've invested in it, of what happened.
44
00:01:34,200 --> 00:01:37,200
Put those together and you don't get chat, you get behavior.
45
00:01:37,200 --> 00:01:39,840
Once an agent is configured, it's not a fancy prompt.
46
00:01:39,840 --> 00:01:41,600
It is an identity it runs under,
47
00:01:41,600 --> 00:01:43,440
a set of tools it's allowed to call,
48
00:01:43,440 --> 00:01:46,760
a memory surface it can read and write, and triggers that start runs:
49
00:01:46,760 --> 00:01:49,280
events, schedules, external calls.
50
00:01:49,280 --> 00:01:50,840
That combination is a workload.
51
00:01:50,840 --> 00:01:51,960
It can wake up on a timer.
52
00:01:51,960 --> 00:01:53,560
It can react to a queue.
53
00:01:53,560 --> 00:01:56,280
It can sit behind an API that another system hits.
54
00:01:56,280 --> 00:01:59,480
It can chain tools: query a search index, call an internal API,
55
00:01:59,480 --> 00:02:01,240
write a record, send a notification.
56
00:02:01,240 --> 00:02:03,480
No human in the loop, no user clicking a button,
57
00:02:03,480 --> 00:02:05,560
no UI you can point at in a training session.
58
00:02:05,560 --> 00:02:08,000
This is the first mental shift I need you to take.
59
00:02:08,000 --> 00:02:10,200
Foundry agents are closer to microservices
60
00:02:10,200 --> 00:02:12,920
with reasoning than to chatbots with personalities.
61
00:02:12,920 --> 00:02:14,080
Developers love that.
62
00:02:14,080 --> 00:02:16,840
Can I get the agent to handle this workflow end to end?
63
00:02:16,840 --> 00:02:18,480
Your question has to be different.
64
00:02:18,480 --> 00:02:20,560
If this thing wakes up at 3am,
65
00:02:20,560 --> 00:02:23,640
what is the blast radius and who is on the hook when it crosses it?
66
00:02:23,640 --> 00:02:25,920
Governance doesn't fail when agents are created.
67
00:02:25,920 --> 00:02:28,840
It fails when execution is allowed before ownership exists.
68
00:02:28,840 --> 00:02:30,360
We've lived this pattern before.
69
00:02:30,360 --> 00:02:33,320
SharePoint lists quietly turned into apps without lifecycle.
70
00:02:33,320 --> 00:02:36,280
Power Apps quietly wired into production without ALM.
71
00:02:36,280 --> 00:02:39,760
Teams bots quietly outlived the projects that justified them.
72
00:02:39,760 --> 00:02:43,280
In every case, we started with let people move fast
73
00:02:43,280 --> 00:02:47,080
and only introduced real control once entropy was visible.
74
00:02:47,080 --> 00:02:49,400
The difference with Foundry is that the unit of entropy
75
00:02:49,400 --> 00:02:52,040
is not a form or a flow, it's an autonomous actor.
76
00:02:52,040 --> 00:02:54,360
Power Apps failed slowly.
77
00:02:54,360 --> 00:02:57,760
Foundry agents fail silently, and much faster.
78
00:02:57,760 --> 00:03:00,480
If you treat Foundry as an IDE with a pretty UI,
79
00:03:00,480 --> 00:03:03,320
you'll set some quotas, maybe a DLP rule for uploads
80
00:03:03,320 --> 00:03:05,240
and you'll feel like you did something.
81
00:03:05,240 --> 00:03:07,120
Meanwhile, the real risk surface
82
00:03:07,120 --> 00:03:09,640
sits in the control plane you didn't define.
83
00:03:09,640 --> 00:03:12,640
Who can create agents under which identities,
84
00:03:12,640 --> 00:03:14,480
with which data boundaries,
85
00:03:14,480 --> 00:03:16,560
and with what level of observability
86
00:03:16,560 --> 00:03:18,520
as a precondition to execution?
87
00:03:18,520 --> 00:03:21,040
Most orgs today treat labels as metadata.
88
00:03:21,040 --> 00:03:23,680
What you need are labels as execution constraints.
89
00:03:23,680 --> 00:03:27,480
That means an unlabeled data set is not work in progress.
90
00:03:27,480 --> 00:03:29,400
It is ineligible for autonomous access.
91
00:03:29,400 --> 00:03:31,720
It does not exist as far as agents are concerned.
92
00:03:31,720 --> 00:03:35,320
It also means an unowned agent identity is not temporary.
93
00:03:35,320 --> 00:03:37,080
It is not allowed to run ever.
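As a rough illustration of labels as execution constraints, the sketch below gates an agent run before it starts. The names `Dataset`, `AgentIdentity`, `eligible_datasets`, and `may_execute` are hypothetical and are not part of Foundry, Purview, or Entra. The sketch only encodes the two rules just described, assuming label and owner metadata are already available from wherever you keep them.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dataset:
    name: str
    sensitivity_label: Optional[str]  # None means the dataset is unlabeled


@dataclass(frozen=True)
class AgentIdentity:
    name: str
    owner: Optional[str]  # None means no accountable human owner


def eligible_datasets(datasets: list[Dataset]) -> list[Dataset]:
    """Drop unlabeled datasets: as far as agents are concerned, they do not exist."""
    return [d for d in datasets if d.sensitivity_label is not None]


def may_execute(agent: AgentIdentity) -> bool:
    """An unowned agent identity is not allowed to run, ever."""
    return agent.owner is not None and agent.owner.strip() != ""


# Hypothetical example data.
catalog = [
    Dataset("support-docs", "General"),
    Dataset("scratch-export", None),
]
triage_agent = AgentIdentity("ticket-triage-agent", owner=None)
```

With this data, `eligible_datasets(catalog)` keeps only `support-docs`, and `may_execute(triage_agent)` is false until an owner is assigned. The point of the design is that both checks run before execution instead of being reported afterwards.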
94
00:03:37,080 --> 00:03:39,120
We'll go deep on those mechanics later.
95
00:03:39,120 --> 00:03:41,040
For now, I want you to hold one framing.
96
00:03:41,040 --> 00:03:44,240
Foundry is a platform as a service for agentic workloads.
97
00:03:44,240 --> 00:03:46,200
Each agent is a non-human identity
98
00:03:46,200 --> 00:03:47,720
with tools and data attached,
99
00:03:47,720 --> 00:03:49,680
capable of acting at cloud scale.
100
00:03:49,680 --> 00:03:51,920
Every time one of those identities executes
101
00:03:51,920 --> 00:03:54,200
without a clearly enforced owner,
102
00:03:54,200 --> 00:03:55,960
a clearly enforced data boundary,
103
00:03:55,960 --> 00:03:57,440
and a clearly enforced audit trail,
104
00:03:57,440 --> 00:04:00,520
you didn't automate a task, you automated risk.
105
00:04:00,520 --> 00:04:02,200
And that is what turns an impressive demo
106
00:04:02,200 --> 00:04:04,360
into the next generation of shadow IT.
107
00:04:04,360 --> 00:04:07,680
Historical pattern: how Microsoft platforms created shadow IT.
108
00:04:07,680 --> 00:04:10,520
Historical pattern: how Microsoft platforms created shadow IT.
109
00:04:10,520 --> 00:04:13,160
Let me ground this in something you've already survived.
110
00:04:13,160 --> 00:04:15,120
This isn't the first time Microsoft gave the business
111
00:04:15,120 --> 00:04:17,720
a powerful surface and let governance show up later.
112
00:04:17,720 --> 00:04:18,720
The labels have changed.
113
00:04:18,720 --> 00:04:21,280
The failure pattern has not. Start with SharePoint.
114
00:04:21,280 --> 00:04:22,960
Officially, SharePoint was collaboration,
115
00:04:22,960 --> 00:04:24,720
document management, team sites.
116
00:04:24,720 --> 00:04:27,120
In reality, it became an application platform
117
00:04:27,120 --> 00:04:29,120
for people who'd never heard the word platform.
118
00:04:29,120 --> 00:04:31,760
Lists became databases, views became UIs.
119
00:04:31,760 --> 00:04:33,920
A couple of calculated columns and content types later,
120
00:04:33,920 --> 00:04:36,160
you had a critical business app running in a team site
121
00:04:36,160 --> 00:04:37,360
no one in IT could find.
122
00:04:37,360 --> 00:04:38,800
What was the governance failure there?
123
00:04:38,800 --> 00:04:41,320
SharePoint's problem wasn't that people built things.
124
00:04:41,320 --> 00:04:43,120
The problem was that nothing in the product
125
00:04:43,120 --> 00:04:45,080
or the process defined a life cycle.
126
00:04:45,080 --> 00:04:47,640
No one had to declare this list is now a system of record.
127
00:04:47,640 --> 00:04:49,040
This site has an owner.
128
00:04:49,040 --> 00:04:50,600
This workflow must be retired
129
00:04:50,600 --> 00:04:51,880
when the project ends.
130
00:04:51,880 --> 00:04:53,760
So 10 years later, you had thousands of lists,
131
00:04:53,760 --> 00:04:55,760
hundreds of apps, and permissions
132
00:04:55,760 --> 00:04:57,200
no one could reconstruct.
133
00:04:57,200 --> 00:04:59,400
Every attempt to clean up felt like defusing a bomb
134
00:04:59,400 --> 00:05:00,600
with no diagram.
135
00:05:00,600 --> 00:05:02,400
Then came Power Apps and Power Automate.
136
00:05:02,400 --> 00:05:03,800
The pitch shifted from collaboration
137
00:05:03,800 --> 00:05:05,320
to citizen development.
138
00:05:05,320 --> 00:05:09,520
Build apps, wire flows, automate processes, no code.
139
00:05:09,520 --> 00:05:11,280
And the business did exactly that.
140
00:05:11,280 --> 00:05:14,600
They connected low-code apps into finance, HR, CRM,
141
00:05:14,600 --> 00:05:16,040
line of business systems.
142
00:05:16,040 --> 00:05:19,720
Flows moved data between sales platforms, updated records,
143
00:05:19,720 --> 00:05:22,080
sent emails, created tickets.
144
00:05:22,080 --> 00:05:24,000
Again, the central governance failure was clear:
145
00:05:24,000 --> 00:05:26,680
no ownership. Environments and DLP policies
146
00:05:26,680 --> 00:05:29,080
arrived after there were already hundreds or thousands
147
00:05:29,080 --> 00:05:30,800
of apps and flows in the wild.
148
00:05:30,800 --> 00:05:33,120
Many were created by people who had since moved roles
149
00:05:33,120 --> 00:05:34,640
or left the organization.
150
00:05:34,640 --> 00:05:37,640
There was no enforced concept of this app has a business owner.
151
00:05:37,640 --> 00:05:39,160
This flow has a life cycle.
152
00:05:39,160 --> 00:05:40,640
This identity is responsible.
153
00:05:40,640 --> 00:05:42,560
So security and compliance teams had to start
154
00:05:42,560 --> 00:05:44,360
by discovering what even existed.
155
00:05:44,360 --> 00:05:46,200
Then they had to negotiate with business owners
156
00:05:46,200 --> 00:05:48,160
who had accidentally built production dependencies
157
00:05:48,160 --> 00:05:50,720
on fragile, undocumented automations.
158
00:05:50,720 --> 00:05:53,400
So you'll recognize the pattern: innovation first, governance
159
00:05:53,400 --> 00:05:56,720
as archeology. Then Teams moved in as the new operating
160
00:05:56,720 --> 00:05:58,160
system of daily work.
161
00:05:58,160 --> 00:06:00,480
Suddenly, you had apps and bots living in the same surface
162
00:06:00,480 --> 00:06:02,040
where people chatted all day.
163
00:06:02,040 --> 00:06:03,960
Tabs pointing to external systems.
164
00:06:03,960 --> 00:06:07,080
Bots with Graph scopes broad enough to see and do almost anything.
165
00:06:07,080 --> 00:06:09,360
Messaging extensions calling into services
166
00:06:09,360 --> 00:06:11,880
no one had ever security-reviewed.
167
00:06:11,880 --> 00:06:14,400
On paper, governance controls existed.
168
00:06:14,400 --> 00:06:17,880
App permission policies, tenant app catalogs, admin approval
169
00:06:17,880 --> 00:06:18,520
workflows.
170
00:06:18,520 --> 00:06:20,880
In practice, those controls matured only after adoption
171
00:06:20,880 --> 00:06:22,080
was already massive.
172
00:06:22,080 --> 00:06:25,080
So Teams replayed the same movie with a slightly different script.
173
00:06:25,080 --> 00:06:27,600
The governance failure here was decommissioning.
174
00:06:27,600 --> 00:06:29,840
Apps and bots were introduced for projects, pilots,
175
00:06:29,840 --> 00:06:31,240
one-off experiments.
176
00:06:31,240 --> 00:06:33,400
But almost no one had a muscle for deliberate shutdown.
177
00:06:33,400 --> 00:06:35,360
We knew how to approve something once.
178
00:06:35,360 --> 00:06:37,120
We did not know how to say this bot
179
00:06:37,120 --> 00:06:38,640
must die on this date unless someone
180
00:06:38,640 --> 00:06:40,360
renews its existence.
181
00:06:40,360 --> 00:06:42,480
SharePoint, no life cycle.
182
00:06:42,480 --> 00:06:44,320
Power Apps, no ownership.
183
00:06:44,320 --> 00:06:46,080
Teams, no decommissioning.
184
00:06:46,080 --> 00:06:47,520
Three waves, same story.
185
00:06:47,520 --> 00:06:49,920
Execution embedded into collaboration surfaces
186
00:06:49,920 --> 00:06:52,320
without a control plane that defined who owns this,
187
00:06:52,320 --> 00:06:53,960
how long is it allowed to exist,
188
00:06:53,960 --> 00:06:56,000
and what has to be true for it to stop?
189
00:06:56,000 --> 00:06:58,040
Now connect that history to Foundry.
190
00:06:58,040 --> 00:07:00,440
In all three previous waves, the unit of risk
191
00:07:00,440 --> 00:07:03,040
was still, in some weak way, human-triggered.
192
00:07:03,040 --> 00:07:04,680
Someone had to click a list item.
193
00:07:04,680 --> 00:07:06,360
Someone had to open a Power App.
194
00:07:06,360 --> 00:07:07,960
Someone had to talk to a bot in a channel.
195
00:07:07,960 --> 00:07:09,480
That gave you at least one thin handle.
196
00:07:09,480 --> 00:07:11,320
There was a UI, a button, a chat.
197
00:07:11,320 --> 00:07:13,080
You could ask who uses this.
198
00:07:13,080 --> 00:07:15,880
You could see the surface where the behavior emerged.
199
00:07:15,880 --> 00:07:17,640
Foundry removes that last bit of friction.
200
00:07:17,640 --> 00:07:19,600
Foundry agents don't need a user session.
201
00:07:19,600 --> 00:07:21,080
They don't wait for someone to click.
202
00:07:21,080 --> 00:07:22,200
They wake up on schedules.
203
00:07:22,200 --> 00:07:23,240
They react to events.
204
00:07:23,240 --> 00:07:24,640
They sit behind APIs.
205
00:07:24,640 --> 00:07:27,360
They can be chained by other services you don't even control.
206
00:07:27,360 --> 00:07:29,360
So if you repeat the same governance pattern,
207
00:07:29,360 --> 00:07:32,800
adoption first, controls later, you don't just get more shadow IT.
208
00:07:32,800 --> 00:07:34,400
You get autonomous shadow IT.
209
00:07:34,400 --> 00:07:37,200
You get agents that keep operating when the project is over,
210
00:07:37,200 --> 00:07:39,040
when the sponsor has moved on, when the person
211
00:07:39,040 --> 00:07:41,840
who wired the permissions has forgotten what they granted.
212
00:07:41,840 --> 00:07:44,200
And because there is no UI to stumble across,
213
00:07:44,200 --> 00:07:45,640
you don't discover them by accident.
214
00:07:45,640 --> 00:07:49,080
You discover them when an auditor asks why a summary contained data
215
00:07:49,080 --> 00:07:51,200
from three systems you swore were segregated.
216
00:07:51,200 --> 00:07:53,360
That's why I'm spending time on the historical curves.
217
00:07:53,360 --> 00:07:54,200
It's not nostalgia.
218
00:07:54,200 --> 00:07:55,320
It's a prediction.
219
00:07:55,320 --> 00:07:57,520
If you let Foundry follow the SharePoint, Power Apps,
220
00:07:57,520 --> 00:08:00,160
and Teams trajectory, you will end up in the same place, just
221
00:08:00,160 --> 00:08:03,960
faster and with actors that don't wait for humans before they act.
222
00:08:03,960 --> 00:08:05,200
The platforms changed.
223
00:08:05,200 --> 00:08:06,400
The pattern didn't.
224
00:08:06,400 --> 00:08:09,000
The only new variable this time is autonomy.
225
00:08:09,000 --> 00:08:12,480
And autonomy is what turns familiar governance gaps into incidents
226
00:08:12,480 --> 00:08:15,360
you can't explain in the room where it matters.
227
00:08:15,360 --> 00:08:17,920
Failure mode one, agent identity collapse.
228
00:08:17,920 --> 00:08:20,240
Now I want to walk through the first failure mode
229
00:08:20,240 --> 00:08:23,600
because it's the one that quietly turns every other control into theatre.
230
00:08:23,600 --> 00:08:26,120
This is failure mode one, agent identity collapse.
231
00:08:26,120 --> 00:08:29,680
I call it agent identity collapse because, at a high level, it sounds simple:
232
00:08:29,680 --> 00:08:31,560
the agent keeps acting when the human
233
00:08:31,560 --> 00:08:35,560
it was anchored to no longer exists in the way your governance model assumes.
234
00:08:35,560 --> 00:08:38,160
But the operational reality is uglier than that.
235
00:08:38,160 --> 00:08:42,080
Identity collapse is the moment where everyone in the room knows something is wrong.
236
00:08:42,080 --> 00:08:44,480
And no one knows who is allowed to turn it off.
237
00:08:44,480 --> 00:08:47,640
This is where most governance models quietly break because they never encoded
238
00:08:47,640 --> 00:08:50,040
who owns a non-human identity when the humans move on.
239
00:08:50,040 --> 00:08:51,240
Here's the archetype.
240
00:08:51,240 --> 00:08:53,560
A project team stands up a Foundry agent.
241
00:08:53,560 --> 00:08:54,440
They're under pressure.
242
00:08:54,440 --> 00:08:55,680
There's an exec sponsor.
243
00:08:55,680 --> 00:08:56,640
There's a demo date.
244
00:08:56,640 --> 00:08:58,600
So they do what every team under pressure does.
245
00:08:58,600 --> 00:09:00,800
They reuse whatever identity path is easiest.
246
00:09:00,800 --> 00:09:04,360
Maybe they wire the agent to run under a user's delegated context.
247
00:09:04,360 --> 00:09:08,680
Maybe they attach it to a generic automation account that already has broad rights.
248
00:09:08,680 --> 00:09:12,080
Maybe they grab an existing service principal because "it's just a pilot."
249
00:09:12,080 --> 00:09:13,640
In week one, that feels harmless.
250
00:09:13,640 --> 00:09:15,840
Everyone in the room knows what the agent does.
251
00:09:15,840 --> 00:09:17,320
The sponsor is excited.
252
00:09:17,320 --> 00:09:20,520
The project lead can point to the Entra object and say, yes, that's ours, we own it.
253
00:09:20,520 --> 00:09:22,920
From a pure authentication standpoint, it even looks clean.
254
00:09:22,920 --> 00:09:25,800
The agent can get a token, Conditional Access passes,
255
00:09:25,800 --> 00:09:28,960
logs show a known identity making calls. Then time happens.
256
00:09:28,960 --> 00:09:30,920
Six months later, the sponsor has moved on.
257
00:09:30,920 --> 00:09:32,400
The project lead has changed teams.
258
00:09:32,400 --> 00:09:35,880
The contractor who actually wired the identity is long gone.
259
00:09:35,880 --> 00:09:39,160
The user account that originally granted consent is disabled.
260
00:09:39,160 --> 00:09:43,080
The distribution list you thought was the owner no longer maps to a real group of people.
261
00:09:43,080 --> 00:09:44,480
But the agent is still running.
262
00:09:44,480 --> 00:09:46,280
Tickets are still being triaged.
263
00:09:46,280 --> 00:09:47,840
Records are still being updated.
264
00:09:47,840 --> 00:09:49,240
Emails are still being sent.
265
00:09:49,240 --> 00:09:51,040
Nothing in Entra is technically broken.
266
00:09:51,040 --> 00:09:52,000
The token is valid.
267
00:09:52,000 --> 00:09:53,720
The app registration still exists.
268
00:09:53,720 --> 00:09:56,160
The automation account is still in the right groups.
269
00:09:56,160 --> 00:09:58,840
From the system's perspective, everything is fine.
270
00:09:58,840 --> 00:10:01,240
From a governance perspective, identity has collapsed.
271
00:10:01,240 --> 00:10:02,480
The execution continues.
272
00:10:02,480 --> 00:10:04,000
The ownership has evaporated.
273
00:10:04,000 --> 00:10:06,800
This is the gap between authentication and accountability.
274
00:10:06,800 --> 00:10:09,720
Authentication tells you the caller had valid credentials.
275
00:10:09,720 --> 00:10:12,880
Authorization tells you the permissions attached to that identity
276
00:10:12,880 --> 00:10:14,120
allow the action.
277
00:10:14,120 --> 00:10:18,320
Neither tells you whether any living human still intends for that identity to exist.
278
00:10:18,320 --> 00:10:22,200
If an agent can execute without an owner, you didn't automate a task.
279
00:10:22,200 --> 00:10:23,680
You automated risk.
280
00:10:23,680 --> 00:10:26,600
Pause on that because it defines your entire agent control plane.
281
00:10:26,600 --> 00:10:27,600
Let me make this concrete.
282
00:10:27,600 --> 00:10:28,920
Incident A looks like this.
283
00:10:28,920 --> 00:10:31,720
A Foundry agent is built to triage support tickets.
284
00:10:31,720 --> 00:10:36,560
It reads from a queue, pulls customer details from Dataverse, looks up documentation in a SharePoint site,
285
00:10:36,560 --> 00:10:39,200
and writes a summary back into a system of record.
286
00:10:39,200 --> 00:10:45,000
To move fast, the team has it run on behalf of a shared automation account that already has the right Graph scopes and data access.
287
00:10:45,000 --> 00:10:46,960
No one wants to wait for a new access request.
288
00:10:46,960 --> 00:10:49,360
They tell themselves they'll clean it up after the pilot.
289
00:10:49,360 --> 00:10:50,400
The pilot works.
290
00:10:50,400 --> 00:10:51,120
People are happy.
291
00:10:51,120 --> 00:10:52,200
Time passes.
292
00:10:52,200 --> 00:10:58,600
18 months later, a different team needs to fix an unrelated problem and gives that same automation account access to a new data source.
293
00:10:58,600 --> 00:11:00,800
Maybe HR data, maybe finance.
294
00:11:00,800 --> 00:11:05,600
No one in that meeting remembers there is a Foundry agent silently executing under that identity.
295
00:11:05,600 --> 00:11:12,280
Now your ticket triage agent, designed for one narrow workflow, is operating with a much larger blast radius,
296
00:11:12,280 --> 00:11:15,240
touching data it was never modeled for, under an identity
297
00:11:15,240 --> 00:11:16,720
no one feels they own.
298
00:11:16,720 --> 00:11:18,000
Nothing left the tenant.
299
00:11:18,000 --> 00:11:19,680
There was no external compromise.
300
00:11:19,680 --> 00:11:21,520
Every call in the logs is legitimate.
301
00:11:21,520 --> 00:11:23,080
Try explaining that to an auditor.
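One way to make Incident A visible before it happens is to check, at grant time, which agents already execute under the identity being widened. The sketch below is a minimal illustration, assuming you maintain an inventory that maps each agent to the identity it runs as. The inventory, the identity names, and both function names are hypothetical, not a Foundry or Entra API.

```python
from __future__ import annotations


def agents_running_as(identity: str, agent_identities: dict[str, str]) -> list[str]:
    """Return the agents that execute under the given identity, sorted by name."""
    return sorted(
        agent for agent, ident in agent_identities.items() if ident == identity
    )


def review_grant(identity: str, new_scope: str, agent_identities: dict[str, str]) -> str:
    """Describe which agents silently gain a new scope if it is granted to an identity."""
    affected = agents_running_as(identity, agent_identities)
    if not affected:
        return f"{new_scope} -> {identity}: no agents affected"
    return f"{new_scope} -> {identity}: also widens {', '.join(affected)}"


# Hypothetical inventory: agent name -> identity it runs under.
inventory = {
    "ticket-triage-agent": "svc-shared-automation",
    "report-agent": "svc-reporting",
}
message = review_grant("svc-shared-automation", "hr-data:read", inventory)
```

Here `message` names `ticket-triage-agent` as a workload that would inherit the HR scope, which is exactly the fact nobody in the meeting remembered. The check is only as good as the inventory behind it.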
302
00:11:23,080 --> 00:11:25,480
This is what I mean by agent identity collapse.
303
00:11:25,480 --> 00:11:28,120
The system can prove the agent was allowed to act.
304
00:11:28,120 --> 00:11:31,000
Nobody can prove that anyone still intends for it to have that power.
305
00:11:31,000 --> 00:11:36,760
And this isn't limited to bad hygiene. It's structural whenever you treat agents as side effects of other identities.
306
00:11:36,760 --> 00:11:40,880
User accounts, shared service principals, catch-all automation apps.
307
00:11:40,880 --> 00:11:45,560
Agents need to be first-class security principals: not users, not borrowed service principals,
308
00:11:45,560 --> 00:11:52,640
but distinct non-human identities with four mandatory attributes: an owner, a purpose, a maximum lifetime, and a decommission trigger.
309
00:11:52,640 --> 00:11:56,600
That's the minimum shape of a non-human identity in a serious agent control plane.
310
00:11:56,600 --> 00:12:01,560
If you can't answer those four for a given agent, you are already in identity collapse territory.
311
00:12:01,560 --> 00:12:03,200
You just haven't seen the incident yet.
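The four mandatory attributes can be sketched as a record plus a review check. This is a minimal illustration, not a Foundry or Entra schema: `AgentRecord`, `identity_findings`, and the set of active people are hypothetical names, and the sketch assumes you can obtain a current list of active owners from your directory.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class AgentRecord:
    agent_id: str
    owner: Optional[str]                 # accountable human
    purpose: Optional[str]               # why the agent exists
    created_at: datetime
    max_lifetime: Optional[timedelta]    # hard upper bound on existence
    decommission_trigger: Optional[str]  # condition that forces shutdown


def identity_findings(
    record: AgentRecord, now: datetime, active_people: set[str]
) -> list[str]:
    """List the reasons this agent is in identity-collapse territory (empty if none)."""
    findings: list[str] = []
    if not record.owner:
        findings.append("no owner")
    elif record.owner not in active_people:
        findings.append("owner no longer active")
    if not record.purpose:
        findings.append("no purpose")
    if record.max_lifetime is None:
        findings.append("no maximum lifetime")
    elif now > record.created_at + record.max_lifetime:
        findings.append("maximum lifetime exceeded")
    if not record.decommission_trigger:
        findings.append("no decommission trigger")
    return findings


# Hypothetical example record.
triage = AgentRecord(
    agent_id="ticket-triage-agent",
    owner="alice",
    purpose="triage support tickets",
    created_at=datetime(2024, 1, 1),
    max_lifetime=timedelta(days=90),
    decommission_trigger="support pilot ends",
)
```

Run on a schedule, a check like this turns "who owns this?" from a memory exercise into a query. Any non-empty result would be grounds to disable the identity rather than just log it.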
312
00:12:03,200 --> 00:12:07,600
This is also where people lull themselves with the idea that we log everything.
313
00:12:07,600 --> 00:12:08,520
Yes, you should log.
314
00:12:08,520 --> 00:12:14,520
Yes, you should integrate with your SIEM, but logging an orphaned identity faster does not fix the fact that it's orphaned.
315
00:12:14,520 --> 00:12:18,960
If your access model depends on remembering why a permission exists, it has already failed.
316
00:12:18,960 --> 00:12:22,040
The only durable pattern is identity-driven agent control.
317
00:12:22,040 --> 00:12:29,960
Unique workload identities for agents created through a governed pipeline, tagged with owner and purpose, reviewed on a schedule, disabled when the trigger condition hits.
318
00:12:29,960 --> 00:12:37,280
No exceptions, no "just for this demo," no "we'll refactor later," because later is always when the person who understood the shortcut has already left,
319
00:12:37,280 --> 00:12:40,200
and the agent they wired is still acting as if nothing changed.
320
00:12:40,200 --> 00:12:42,720
If you simplify failure mode one, it comes down to this.
321
00:12:42,720 --> 00:12:48,640
Once autonomous execution survives longer than human ownership, your pre-execution governance is already gone.
322
00:12:48,640 --> 00:12:51,760
Failure mode two, permission drift in agent access.
323
00:12:51,760 --> 00:12:55,240
Identity collapse is what keeps an agent alive after ownership dies.
324
00:12:55,240 --> 00:13:00,520
Permission drift is what quietly expands what that agent can touch while nobody is looking.
325
00:13:00,520 --> 00:13:03,120
If identity collapse answers "who is this?",
326
00:13:03,120 --> 00:13:05,360
permission drift answers "what can it do?"
327
00:13:05,360 --> 00:13:06,920
And that answer keeps changing.
328
00:13:06,920 --> 00:13:10,640
This is failure mode two, permission drift in agent access.
329
00:13:10,640 --> 00:13:14,840
The pattern is depressingly consistent. Incident B starts clean.
330
00:13:14,840 --> 00:13:17,680
A team builds their first Foundry agent with a tight access model.
331
00:13:17,680 --> 00:13:18,760
They do almost everything right.
332
00:13:18,760 --> 00:13:20,640
They create a dedicated workload identity.
333
00:13:20,640 --> 00:13:22,920
They grant it read access to one SharePoint site.
334
00:13:22,920 --> 00:13:24,760
Read access to one shared mailbox.
335
00:13:24,760 --> 00:13:26,400
Write access to a single logging store.
336
00:13:26,400 --> 00:13:27,480
They document it.
337
00:13:27,480 --> 00:13:28,520
Security signs off.
338
00:13:28,520 --> 00:13:31,120
Everyone feels like this is how it's supposed to work.
339
00:13:31,120 --> 00:13:32,480
Then the second agent arrives.
340
00:13:32,480 --> 00:13:34,320
It has a similar but not identical job.
341
00:13:34,320 --> 00:13:36,720
Maybe it needs to read two sites instead of one.
342
00:13:36,720 --> 00:13:39,200
Maybe it also has to update a ticketing system.
343
00:13:39,200 --> 00:13:42,200
Under pressure, the team does the most natural thing in the world.
344
00:13:42,200 --> 00:13:44,200
They reuse what already works.
345
00:13:44,200 --> 00:13:47,800
They hang the new agent off the same identity or the same role assignment.
346
00:13:47,800 --> 00:13:50,280
And they just add a couple of extra permissions.
347
00:13:50,280 --> 00:13:53,920
You've just merged two logical workloads into one permission surface.
348
00:13:53,920 --> 00:13:57,120
One agent has no legitimate reason to touch system B.
349
00:13:57,120 --> 00:13:58,960
The other has no reason to touch system A.
350
00:13:58,960 --> 00:14:01,240
But they now share an identity that can do both.
351
00:14:01,240 --> 00:14:02,600
Then the third agent shows up.
352
00:14:02,600 --> 00:14:05,000
This one belongs to a different team entirely.
353
00:14:05,000 --> 00:14:10,160
But the platform folks remember that the existing identity already has almost everything it needs.
354
00:14:10,160 --> 00:14:13,800
There is one missing permission, a write scope on a new data source.
355
00:14:13,800 --> 00:14:15,680
Someone grants a temporary exception.
356
00:14:15,680 --> 00:14:20,080
There is a verbal understanding that this will be fixed properly once the pilot proves its value.
357
00:14:20,080 --> 00:14:23,720
30 days later, the pilot is production by usage, not by design.
358
00:14:23,720 --> 00:14:25,400
No one has revisited the identity.
359
00:14:25,400 --> 00:14:30,840
Now you have a single non-human principal that can read from five different sources and write to three more.
360
00:14:30,840 --> 00:14:32,760
And nobody can clearly explain why.
361
00:14:32,760 --> 00:14:33,840
That is permission drift.
362
00:14:33,840 --> 00:14:35,560
It's not one catastrophic decision.
363
00:14:35,560 --> 00:14:40,240
It's a series of small, reasonable exceptions that aggregate into an access graph
364
00:14:40,240 --> 00:14:41,760
no one ever intended.
365
00:14:41,760 --> 00:14:46,040
And with agents, that drift is amplified by how invisible the execution is.
366
00:14:46,040 --> 00:14:49,320
In traditional app governance, you at least have a UI to grab onto.
367
00:14:49,320 --> 00:14:52,640
There is an app tile, a URL, a mobile client.
368
00:14:52,640 --> 00:14:55,240
You can say, show me everything attached to this thing.
369
00:14:55,240 --> 00:14:57,520
With Foundry agents, there may be no UI at all.
370
00:14:57,520 --> 00:15:02,600
The execution surface is a queue, a scheduler, or an API endpoint buried inside another system.
371
00:15:02,600 --> 00:15:06,480
So by the time someone asks, "Which agents can touch this HR data set?"
372
00:15:06,480 --> 00:15:09,280
Your honest starting point is we know which identities can.
373
00:15:09,280 --> 00:15:11,520
We're less sure which agents are hanging off them.
374
00:15:11,520 --> 00:15:15,160
This is also where non-human identities turn drift into a multiplier.
375
00:15:15,160 --> 00:15:18,080
Service principals and managed identities are designed for reuse.
376
00:15:18,080 --> 00:15:19,920
That's their power and their danger.
377
00:15:19,920 --> 00:15:24,320
In a microservice architecture with strong discipline, you might have one identity per service boundary.
378
00:15:24,320 --> 00:15:28,240
In early-stage agent adoption, the temptation is always the opposite.
379
00:15:28,240 --> 00:15:32,120
Hang more agents off the same automation identity because it already works.
380
00:15:32,120 --> 00:15:35,400
If your process doesn't fight that instinct, drift is guaranteed.
381
00:15:35,400 --> 00:15:40,320
If your access model depends on remembering why a permission exists, it has already failed.
382
00:15:40,320 --> 00:15:42,920
And the most insidious part is that nothing looks wrong in the logs.
383
00:15:42,920 --> 00:15:44,680
Every call is properly authenticated.
384
00:15:44,680 --> 00:15:46,360
Every token carries the right roles.
385
00:15:46,360 --> 00:15:48,640
There is no explicit deny being violated.
386
00:15:48,640 --> 00:15:50,320
Your SIEM sees normal operations.
387
00:15:50,320 --> 00:15:53,080
From the system's point of view, everything is compliant.
388
00:15:53,080 --> 00:15:55,720
From a governance point of view, the boundary has already moved.
389
00:15:55,720 --> 00:15:59,440
Now, layer in the fact that Foundry agents are not deterministic flows.
390
00:15:59,440 --> 00:16:01,520
A Power Automate flow has a fixed sequence.
391
00:16:01,520 --> 00:16:03,840
You can read the steps in order and understand the path.
392
00:16:03,840 --> 00:16:08,120
An agent chooses tools at runtime based on instructions and intermediate results.
393
00:16:08,120 --> 00:16:10,720
It can chain calls, retry, pick alternate paths.
394
00:16:10,720 --> 00:16:14,080
So when its underlying identity gains access to a new data source,
395
00:16:14,080 --> 00:16:16,640
that new path is not just theoretically available.
396
00:16:16,640 --> 00:16:20,120
It is available to a reasoning system that is incentivized to explore.
397
00:16:20,120 --> 00:16:23,280
That's how you end up with an agent that was originally scoped to read
398
00:16:23,280 --> 00:16:26,200
from this support mailbox and this documentation site,
399
00:16:26,200 --> 00:16:28,600
suddenly able to pull internal HR notes
400
00:16:28,600 --> 00:16:31,560
because the shared identity picked up a broader graph scope,
401
00:16:31,560 --> 00:16:33,400
join that with customer emails,
402
00:16:33,400 --> 00:16:38,480
and write a holistic view into a system no one expected to contain that combination.
403
00:16:38,480 --> 00:16:40,320
Again, nothing left the tenant.
404
00:16:40,320 --> 00:16:41,560
No firewall was breached.
405
00:16:41,560 --> 00:16:44,800
Every permission was granted by someone for some reason at some point.
406
00:16:44,800 --> 00:16:47,080
Try diagramming that story in a risk committee.
407
00:16:47,080 --> 00:16:49,160
If your access posture for agents is,
408
00:16:49,160 --> 00:16:52,040
"we'll just be careful when we add permissions," you've already lost.
409
00:16:52,040 --> 00:16:53,720
Monitoring is not governance.
410
00:16:53,720 --> 00:16:55,280
Monitoring is not governance.
411
00:16:55,280 --> 00:16:59,160
The only durable way to contain permission drift is to design for it upfront.
412
00:16:59,160 --> 00:17:02,640
Every agent gets its own minimum necessary access profile,
413
00:17:02,640 --> 00:17:05,080
expressed as narrowly as your platform allows,
414
00:17:05,080 --> 00:17:06,880
bound to its own non-human identity,
415
00:17:06,880 --> 00:17:08,960
no shared service principals,
416
00:17:08,960 --> 00:17:10,680
no omnibus automation roles.
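As a rough illustration of that rule, here is a minimal Python sketch that flags identities shared by more than one agent and lists grants an agent's documented profile never asked for. The `AgentBinding` record, the permission strings, and the inventory data are all hypothetical; a real check would read them from your own agent registry and identity platform.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentBinding:
    """Hypothetical inventory record: which identity an agent runs as."""
    agent: str
    identity: str
    declared: frozenset  # permissions the agent's documented profile needs


def shared_identities(bindings):
    """Return {identity: [agents]} for identities used by more than one agent."""
    by_identity = defaultdict(list)
    for binding in bindings:
        by_identity[binding.identity].append(binding.agent)
    return {i: sorted(a) for i, a in by_identity.items() if len(a) > 1}


def undeclared_grants(binding, granted):
    """Permissions the identity holds that this agent's profile never declared."""
    return set(granted) - set(binding.declared)


# Illustrative data only (assumed names and scopes).
bindings = [
    AgentBinding("support-triage", "id-automation", frozenset({"mail.read:support"})),
    AgentBinding("ticket-updater", "id-automation", frozenset({"tickets.write"})),
    AgentBinding("doc-indexer", "id-doc-indexer", frozenset({"sites.read:docs"})),
]
granted = {
    "id-automation": {"mail.read:support", "tickets.write", "sites.read:hr"},
    "id-doc-indexer": {"sites.read:docs"},
}

if __name__ == "__main__":
    print(shared_identities(bindings))
    print(undeclared_grants(bindings[0], granted["id-automation"]))
```

Run as a gate in the provisioning pipeline, a non-empty result from either function would be a reason to refuse the change rather than something to review later.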
417
00:17:11,680 --> 00:17:14,160
Yes, that sounds slower on day one.
418
00:17:14,160 --> 00:17:16,200
But permission models are like concrete,
419
00:17:16,200 --> 00:17:19,640
easy to pour, incredibly hard to reshape once it sets.
420
00:17:19,640 --> 00:17:23,120
Optimizing for speed over reversibility is how you end up here.
421
00:17:23,120 --> 00:17:25,840
An estate full of agents no one wants to touch
422
00:17:25,840 --> 00:17:29,040
because no one can predict what else each identity might be enabling.
423
00:17:29,040 --> 00:17:31,360
And remember, monitoring doesn't fix drift.
424
00:17:31,360 --> 00:17:34,120
It just tells you faster that it already happened.
425
00:17:34,120 --> 00:17:37,080
If the only thing standing between your agents and a wider blast radius
426
00:17:37,080 --> 00:17:40,120
is the hope that people will remember why a permission was added,
427
00:17:40,120 --> 00:17:41,600
you don't have governance.
428
00:17:41,600 --> 00:17:44,600
You have a story you tell yourself until the next incident.
429
00:17:44,600 --> 00:17:47,800
If you simplify failure mode 2, it comes down to one thing.
430
00:17:47,800 --> 00:17:52,440
A single overprivileged identity turns a clean design into shadow AI you can't reason about.
431
00:17:52,440 --> 00:17:53,960
Failure mode 3.
432
00:17:53,960 --> 00:17:55,560
Data boundary collapse.
433
00:17:55,560 --> 00:17:58,200
Identity collapse keeps the agent alive.
434
00:17:58,200 --> 00:18:00,520
Permission drift grows what it can touch.
435
00:18:00,520 --> 00:18:03,880
Data boundary collapse is where on paper everything looks compliant
436
00:18:03,880 --> 00:18:06,040
and you still end up with a reportable incident.
437
00:18:06,040 --> 00:18:07,480
This is failure mode 3.
438
00:18:07,480 --> 00:18:10,040
Data boundary collapse under autonomous execution.
439
00:18:10,040 --> 00:18:11,160
Here's the archetype.
440
00:18:11,160 --> 00:18:13,880
Incident C starts out as a model of good behavior.
441
00:18:13,880 --> 00:18:16,840
A team builds a foundry agent to help a support function.
442
00:18:16,840 --> 00:18:18,360
The design looks sane.
443
00:18:18,360 --> 00:18:21,320
The agent reads from a shared support mailbox in Exchange,
444
00:18:21,320 --> 00:18:24,520
pulls related documentation from a specific SharePoint site,
445
00:18:24,520 --> 00:18:27,240
calls an approved external API for status,
446
00:18:27,240 --> 00:18:30,120
drafts an internal update in a ticketing system.
447
00:18:30,120 --> 00:18:32,680
Each individual access goes through the right channel.
448
00:18:32,680 --> 00:18:35,960
Exchange admins see a workload identity reading a scoped mailbox,
449
00:18:35,960 --> 00:18:39,640
SharePoint admins see a trusted app with rights on one site collection.
450
00:18:39,640 --> 00:18:44,120
Network security sees outbound traffic to an allow-listed third-party endpoint.
451
00:18:44,120 --> 00:18:47,320
The ticketing team sees an internal integration account updating records.
452
00:18:47,320 --> 00:18:49,640
Every control surface sees its own piece and says,
453
00:18:49,640 --> 00:18:51,960
"Yes, this is fine." No one sees the full path.
454
00:18:51,960 --> 00:18:55,400
What the agent is actually doing is stitching those hops into one flow,
455
00:18:55,400 --> 00:18:58,120
reading potentially sensitive customer content from email,
456
00:18:58,120 --> 00:19:02,200
correlating it with internal docs, enriching it with third-party data,
457
00:19:02,200 --> 00:19:05,240
and then writing a synthesized view somewhere else.
458
00:19:05,240 --> 00:19:08,280
If your Purview policies, your DLP rules, and your network controls
459
00:19:08,280 --> 00:19:10,360
are all scoped to one system at a time,
460
00:19:10,360 --> 00:19:12,440
they will happily bless each leg of that journey.
461
00:19:12,440 --> 00:19:14,200
That's data boundary collapse.
462
00:19:14,200 --> 00:19:16,200
Every hop is individually compliant.
463
00:19:16,200 --> 00:19:19,640
The combined execution crosses a line your policies assumed would hold.
464
00:19:19,640 --> 00:19:23,240
Let me restate this because it's easy to miss how dangerous that is.
465
00:19:23,240 --> 00:19:25,240
Nothing was leaked, nothing left the tenant.
466
00:19:25,240 --> 00:19:26,920
And the policy was still violated.
467
00:19:26,920 --> 00:19:28,920
You see this most clearly once you add
468
00:19:28,920 --> 00:19:31,880
retrieval-augmented generation and tool orchestration.
469
00:19:31,880 --> 00:19:35,000
A typical Foundry agent can query a vector store
470
00:19:35,000 --> 00:19:38,360
built from labeled SharePoint content, search email via Graph,
471
00:19:38,360 --> 00:19:41,880
call a line-of-business API that returns semi-structured records,
472
00:19:41,880 --> 00:19:43,960
and write outputs into another system.
473
00:19:43,960 --> 00:19:45,800
Each tool has its own security story.
474
00:19:45,800 --> 00:19:47,880
Each data set has its own label strategy.
475
00:19:47,880 --> 00:19:50,040
Each admin team feels they've done their job.
476
00:19:50,040 --> 00:19:53,160
What you almost never have is a control that evaluates the combination.
477
00:19:53,160 --> 00:19:55,800
No one is asking, in a machine-enforceable way,
478
00:19:55,800 --> 00:19:58,680
is it acceptable for an autonomous agent to combine data
479
00:19:58,680 --> 00:20:02,280
from these three label classes and push the result into this fourth system
480
00:20:02,280 --> 00:20:04,840
without a human validating the path in the middle?
481
00:20:04,840 --> 00:20:09,480
Traditional Purview DLP is very good at obvious exfiltration patterns:
482
00:20:09,480 --> 00:20:13,480
emailing a highly confidential document to an external domain,
483
00:20:13,480 --> 00:20:17,000
mass-downloading regulated data to an unmanaged device,
484
00:20:17,000 --> 00:20:20,360
uploading sensitive files to an unsanctioned SaaS app.
485
00:20:20,360 --> 00:20:23,320
It is much weaker when the data never leaves approved systems
486
00:20:23,320 --> 00:20:25,240
but crosses an implicit policy boundary.
487
00:20:25,240 --> 00:20:26,840
For example, you may have a rule of thumb
488
00:20:26,840 --> 00:20:29,080
that HR notes stay in HR systems.
489
00:20:29,080 --> 00:20:31,240
Customer PII stays in support systems.
490
00:20:31,240 --> 00:20:33,000
The financial details stay in finance.
491
00:20:33,000 --> 00:20:35,000
Individually, those systems are locked down.
492
00:20:35,000 --> 00:20:37,880
But your agent operating as a reasoning layer above all three
493
00:20:37,880 --> 00:20:41,480
can be instructed or can infer that to produce a 360-degree view
494
00:20:41,480 --> 00:20:45,000
it should pull HR performance notes, support ticket history,
495
00:20:45,000 --> 00:20:48,360
and billing status, and merge them into a single narrative.
496
00:20:48,360 --> 00:20:50,200
No single system knows that happened.
497
00:20:50,200 --> 00:20:51,640
Each only sees its own queries.
498
00:20:51,640 --> 00:20:53,400
Your DLP rules don't fire.
499
00:20:53,400 --> 00:20:56,280
Your export controls stay quiet, your firewalls are happy.
500
00:20:56,280 --> 00:20:58,520
From a logging perspective, nothing illegal occurred.
501
00:20:58,520 --> 00:21:00,040
From a regulatory perspective,
502
00:21:00,040 --> 00:21:03,080
you may have just combined classes of data in a way your own policies
503
00:21:03,080 --> 00:21:04,200
explicitly forbid.
504
00:21:04,200 --> 00:21:06,360
That's the heart of data boundary collapse.
505
00:21:06,360 --> 00:21:08,760
The agent becomes an invisible integration surface
506
00:21:08,760 --> 00:21:11,240
that tunnels through the spaces between your controls.
507
00:21:11,240 --> 00:21:14,280
If you simplify failure mode three, it comes down to this.
508
00:21:14,280 --> 00:21:17,080
Every system thinks it is enforcing policy
509
00:21:17,080 --> 00:21:18,680
but no one governs the combination
510
00:21:18,680 --> 00:21:20,360
an autonomous agent can assemble.
511
00:21:20,360 --> 00:21:23,080
Hold that thought because this is where your agent control plane
512
00:21:23,080 --> 00:21:24,840
either exists or it doesn't.
513
00:21:24,840 --> 00:21:28,280
And agents don't interact with labeled content the way humans do.
514
00:21:28,280 --> 00:21:30,840
Sensitivity labels were designed around human actions.
515
00:21:30,840 --> 00:21:33,560
Open, edit, share, send.
516
00:21:33,560 --> 00:21:36,280
Agents often see chunks from a vector index,
517
00:21:36,280 --> 00:21:39,080
tokenized snippets from search, aggregates from APIs.
518
00:21:39,080 --> 00:21:41,080
Those fragments may never carry a visible label
519
00:21:41,080 --> 00:21:42,760
into the execution context
520
00:21:42,760 --> 00:21:45,400
even though they originated from highly labeled sources.
521
00:21:45,400 --> 00:21:47,080
So you can end up in a place where
522
00:21:47,080 --> 00:21:49,000
every original document is correctly labeled,
523
00:21:49,000 --> 00:21:50,520
every mailbox is correctly governed,
524
00:21:50,520 --> 00:21:52,040
every API is properly scoped.
525
00:21:52,040 --> 00:21:53,640
And the agent's working memory,
526
00:21:53,640 --> 00:21:56,520
a blend of snippets, embeddings, and intermediate summaries,
527
00:21:56,520 --> 00:21:58,920
sits entirely outside your labeling model.
528
00:21:58,920 --> 00:22:00,440
If you don't anchor agent behavior
529
00:22:00,440 --> 00:22:03,480
to predefined enforceable label combinations,
530
00:22:03,480 --> 00:22:06,040
it will happily synthesize across boundaries
531
00:22:06,040 --> 00:22:07,640
you never meant to be porous.
532
00:22:07,640 --> 00:22:09,880
Saying "our data is labeled" is not enough.
533
00:22:09,880 --> 00:22:12,440
You need a policy that says, in effect,
534
00:22:12,440 --> 00:22:13,560
for autonomous agents,
535
00:22:13,560 --> 00:22:15,560
these label classes may be read together,
536
00:22:15,560 --> 00:22:16,840
these may be written together
537
00:22:16,840 --> 00:22:19,160
and these combinations are simply not allowed.
538
00:22:19,160 --> 00:22:20,840
And that policy has to be enforced
539
00:22:20,840 --> 00:22:21,960
before the agent exists,
540
00:22:21,960 --> 00:22:24,280
not inspected after the first bad summary goes out.
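To make that concrete, here is a minimal Python sketch of a pre-execution check on label combinations. Everything in it is an assumption for illustration: the `LabelPolicy` class, the label class names, and the idea that each agent declares up front which label classes its tools can read. It is not a Purview API; it only shows the shape of the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelPolicy:
    """Hypothetical policy: label classes an agent must never combine."""
    forbidden_combinations: tuple  # each entry is a frozenset of label classes

    def violations(self, labels):
        """Return the forbidden combinations fully contained in `labels`."""
        present = frozenset(labels)
        return [combo for combo in self.forbidden_combinations if combo <= present]


def approve_agent(policy, declared_labels):
    """Decide before deployment whether the declared label mix is allowed.

    Returns (approved, violations) so a pipeline can block and explain why.
    """
    found = policy.violations(declared_labels)
    return (not found, found)


# Illustrative policy only (assumed label classes).
policy = LabelPolicy((
    frozenset({"hr", "customer-pii"}),
    frozenset({"hr", "finance"}),
))

if __name__ == "__main__":
    print(approve_agent(policy, {"customer-pii", "public-docs"}))
    print(approve_agent(policy, {"hr", "customer-pii", "finance"}))
```

The point of the design is that the check runs on the agent's declaration, so a forbidden combination blocks deployment instead of showing up in a log afterward.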
541
00:22:24,280 --> 00:22:26,440
That's pre-execution governance applied to data,
542
00:22:26,440 --> 00:22:27,400
not just identity.
543
00:22:27,400 --> 00:22:29,000
Because once the agent is live,
544
00:22:29,000 --> 00:22:30,840
it will explore the space you gave it.
545
00:22:30,840 --> 00:22:32,520
It won't ask whether your policy assumed
546
00:22:32,520 --> 00:22:33,720
those systems would never meet.
547
00:22:33,720 --> 00:22:36,920
Why Foundry agents are worse than low-code apps.
548
00:22:36,920 --> 00:22:38,120
At this point you might be thinking,
549
00:22:38,120 --> 00:22:40,920
we've seen this before with Power Apps and Power Automate.
550
00:22:40,920 --> 00:22:43,720
We survived that, how much worse can foundry really be?
551
00:22:43,720 --> 00:22:46,520
The uncomfortable answer is structurally worse.
552
00:22:46,520 --> 00:22:48,200
Not because the technology is malicious,
553
00:22:48,200 --> 00:22:49,960
but because the execution model removes
554
00:22:49,960 --> 00:22:52,440
the last accidental safety rails low-code gave you.
555
00:22:52,440 --> 00:22:56,440
Power Apps, Power Automate, even most Teams apps share three properties
556
00:22:56,440 --> 00:22:58,440
that made governance late, but still possible.
557
00:22:58,440 --> 00:23:00,520
They are user-triggered, they are UI-bound
558
00:23:00,520 --> 00:23:03,480
and they are, at least in theory, easy to inventory.
559
00:23:03,480 --> 00:23:05,560
A Power App needs someone to launch it.
560
00:23:05,560 --> 00:23:08,120
A flow usually fires off something a human did,
561
00:23:08,120 --> 00:23:10,360
submitting a form, updating a row.
562
00:23:10,360 --> 00:23:13,560
Even scheduled flows have a visible artifact in an environment you can list.
563
00:23:13,560 --> 00:23:15,880
You can go into the Power Platform Admin Center,
564
00:23:15,880 --> 00:23:17,400
pull a report of apps and flows,
565
00:23:17,400 --> 00:23:19,240
sort by owner or environment,
566
00:23:19,240 --> 00:23:20,840
and you at least have something concrete
567
00:23:20,840 --> 00:23:25,080
to start a conversation around. Low-code governance assumes determinism.
568
00:23:25,080 --> 00:23:26,600
You have a definition of the flow.
569
00:23:26,600 --> 00:23:28,200
You have a person who clicks,
570
00:23:28,200 --> 00:23:29,720
you have a surface you can point to and say,
571
00:23:29,720 --> 00:23:31,400
this is the thing we're talking about.
572
00:23:31,400 --> 00:23:33,720
Agents operate on probability and context.
573
00:23:33,720 --> 00:23:35,880
That's why you cannot recycle your low-code playbook
574
00:23:35,880 --> 00:23:38,680
and expect it to govern autonomous execution.
575
00:23:38,680 --> 00:23:40,360
Foundry takes away all three crutches.
576
00:23:40,360 --> 00:23:42,280
First, user-triggered versus autonomous.
577
00:23:42,280 --> 00:23:44,680
Foundry agents are workloads.
578
00:23:44,680 --> 00:23:46,760
Once deployed, they wake up on events,
579
00:23:46,760 --> 00:23:49,160
timers, webhooks or calls from other systems.
580
00:23:49,160 --> 00:23:51,880
There may never be a human in the loop for a particular run.
581
00:23:51,880 --> 00:23:53,640
There may never be a UI at all.
582
00:23:53,640 --> 00:23:55,320
The risk doesn't live where someone clicks.
583
00:23:55,320 --> 00:23:56,680
It lives where events fire.
584
00:23:56,680 --> 00:23:58,840
Second, UI bound versus invisible.
585
00:23:58,840 --> 00:24:01,160
When a Power App breaks, a user sees it.
586
00:24:01,160 --> 00:24:03,080
A form doesn't load, a button doesn't work.
587
00:24:03,080 --> 00:24:06,200
That pain is often how security discovers something exists.
588
00:24:06,200 --> 00:24:07,640
When a Foundry agent misbehaves,
589
00:24:07,640 --> 00:24:09,800
it can do so for weeks before anyone notices
590
00:24:09,800 --> 00:24:11,480
because there is no daily human touch point.
591
00:24:11,480 --> 00:24:13,720
Maybe its output is another system's input.
592
00:24:13,720 --> 00:24:15,160
Maybe it feeds a weekly report
593
00:24:15,160 --> 00:24:16,920
someone glances at between meetings.
594
00:24:16,920 --> 00:24:20,040
You lose that thin lifeline of the thing people complain about.
595
00:24:20,040 --> 00:24:22,760
Third, easy inventory versus dynamic execution graphs.
596
00:24:22,760 --> 00:24:25,000
You can inventory low-code apps by environment.
597
00:24:25,000 --> 00:24:26,680
You can see which connectors they use.
598
00:24:26,680 --> 00:24:30,120
You can map, with some effort, "this app talks to this data."
599
00:24:30,120 --> 00:24:32,680
An agent's execution path is not a static diagram.
600
00:24:32,680 --> 00:24:33,960
It is a reasoning process.
601
00:24:33,960 --> 00:24:36,760
At runtime, it chooses tools based on intermediate results
602
00:24:36,760 --> 00:24:37,960
and instructions.
603
00:24:37,960 --> 00:24:40,280
Two runs of the same agent can hit different tools,
604
00:24:40,280 --> 00:24:41,800
different data, different destinations.
605
00:24:41,800 --> 00:24:44,120
So the idea that you'll document the flow once
606
00:24:44,120 --> 00:24:46,440
and review it annually simply doesn't hold.
607
00:24:46,440 --> 00:24:48,520
And this is before we talk about sprawl.
608
00:24:48,520 --> 00:24:51,400
Low-code sprawl gave you shadow IT with forms and flows.
609
00:24:51,400 --> 00:24:54,840
Annoying, risky, often bounded by human attention.
610
00:24:54,840 --> 00:24:57,640
Foundry gives you shadow IT with autonomous actors.
611
00:24:57,640 --> 00:25:00,040
Agents that execute under non-human identities
612
00:25:00,040 --> 00:25:02,280
can call any tool that identity can reach,
613
00:25:02,280 --> 00:25:03,640
can be triggered by systems
614
00:25:03,640 --> 00:25:05,480
no one in IT even administers,
615
00:25:05,480 --> 00:25:08,840
and can change their own behavior as new tools are wired in.
616
00:25:08,840 --> 00:25:11,160
Most AI incidents won't be caused by hallucinations.
617
00:25:11,160 --> 00:25:12,920
They'll be caused by agents acting on data
618
00:25:12,920 --> 00:25:14,680
no one realized they could see.
619
00:25:14,680 --> 00:25:17,800
This is where the old governance reflex, "we'll monitor it,"
620
00:25:17,800 --> 00:25:18,920
collides with reality.
621
00:25:18,920 --> 00:25:20,920
Monitoring is not governance.
622
00:25:20,920 --> 00:25:22,760
If your first response is, "We'll watch it,"
623
00:25:22,760 --> 00:25:24,040
the decision is already made.
624
00:25:24,040 --> 00:25:25,880
You've accepted that ungoverned execution
625
00:25:25,880 --> 00:25:27,320
will shape your environment
626
00:25:27,320 --> 00:25:29,880
and you're just hoping to notice before it hurts.
627
00:25:29,880 --> 00:25:32,840
With Power Apps, that posture was barely tolerable.
628
00:25:32,840 --> 00:25:34,520
With Foundry agents, it's an admission
629
00:25:34,520 --> 00:25:36,920
that you're willing to let autonomous systems operate
630
00:25:36,920 --> 00:25:38,920
in production without a control plane.
631
00:25:38,920 --> 00:25:40,360
And there's one more difference that matters,
632
00:25:40,360 --> 00:25:42,520
how easily capability expands.
633
00:25:42,520 --> 00:25:45,400
In a low-code app, adding a new connector is a deployment event.
634
00:25:45,400 --> 00:25:46,280
You change the app.
635
00:25:46,280 --> 00:25:48,120
There's a PR, a solution export,
636
00:25:48,120 --> 00:25:49,800
at least some visible step.
637
00:25:49,800 --> 00:25:52,920
In an agent, adding a new tool is often an infrastructure decision.
638
00:25:52,920 --> 00:25:54,920
You publish another internal API.
639
00:25:54,920 --> 00:25:56,440
You light up a new vector store.
640
00:25:56,440 --> 00:25:58,440
You broaden a role on a managed identity.
641
00:25:58,440 --> 00:26:00,760
Suddenly, every agent hanging off that identity
642
00:26:00,760 --> 00:26:02,040
has a new potential path
643
00:26:02,040 --> 00:26:04,280
without anyone touching the agent's definition.
644
00:26:04,280 --> 00:26:06,520
You didn't tell the agent, "You can now read the system."
645
00:26:06,520 --> 00:26:08,440
You told the identity,
646
00:26:08,440 --> 00:26:09,960
You can now read the system.
647
00:26:09,960 --> 00:26:11,640
The agent discovers it at runtime.
648
00:26:11,640 --> 00:26:13,320
So your effective behavior surface grows
649
00:26:13,320 --> 00:26:14,920
with every convenience change you make
650
00:26:14,920 --> 00:26:16,600
to the underlying access model.
651
00:26:16,600 --> 00:26:18,440
Power Apps failed slowly.
652
00:26:18,440 --> 00:26:21,640
Foundry agents fail silently and much faster.
653
00:26:21,640 --> 00:26:24,440
If you try to reuse your low-code governance muscle,
654
00:26:24,440 --> 00:26:26,360
inventory, quarterly reviews,
655
00:26:26,360 --> 00:26:28,840
owners in a spreadsheet, against a system
656
00:26:28,840 --> 00:26:31,720
that is probabilistic, autonomous, and API-driven,
657
00:26:31,720 --> 00:26:35,000
it will look like it's working right up until the first incident.
658
00:26:35,000 --> 00:26:38,120
In other words, the old model governs app tiles and flows.
659
00:26:38,120 --> 00:26:40,520
It does not govern an agent control plane.
660
00:26:40,520 --> 00:26:43,160
And when that incident hits, the story will sound familiar.
661
00:26:43,160 --> 00:26:44,200
There was no breach.
662
00:26:44,200 --> 00:26:46,840
No external attacker, no policy explicitly broken
663
00:26:46,840 --> 00:26:48,120
at any single hop.
664
00:26:48,120 --> 00:26:50,120
And yet an agent combined data
665
00:26:50,120 --> 00:26:52,120
in a way your own rules said it never should.
666
00:26:52,120 --> 00:26:53,720
The difference is that with Power Apps,
667
00:26:53,720 --> 00:26:55,800
you could at least point to a screen and say,
668
00:26:55,800 --> 00:26:57,240
"That's the thing that did it."
669
00:26:57,240 --> 00:27:00,120
With Foundry, you'll be pointing at an execution trace
670
00:27:00,120 --> 00:27:02,760
and an orphaned identity, trying to explain to people
671
00:27:02,760 --> 00:27:06,440
who don't live in Entra why "allowed" and "intended"
672
00:27:06,440 --> 00:27:08,280
were never the same thing.
673
00:27:08,280 --> 00:27:09,960
Designing an agent control plane.
674
00:27:09,960 --> 00:27:11,640
Identity is the first boundary.
675
00:27:11,640 --> 00:27:13,640
By now, the pattern should be obvious.
676
00:27:13,640 --> 00:27:16,840
If you let agents exist before you define how they're identified
677
00:27:16,840 --> 00:27:20,280
and contained, everything downstream turns into guesswork.
678
00:27:20,280 --> 00:27:23,720
So the first boundary in an agent control plane is not data.
679
00:27:23,720 --> 00:27:24,680
It's identity.
680
00:27:24,680 --> 00:27:27,320
And I don't mean we have an app registration in Entra,
681
00:27:27,320 --> 00:27:28,280
so we're fine.
682
00:27:28,280 --> 00:27:31,560
I mean every agent is treated as a first-class security principal
683
00:27:31,560 --> 00:27:34,360
with an explicit life cycle, clear ownership,
684
00:27:34,360 --> 00:27:36,760
and a constrained field of operation.
685
00:27:36,760 --> 00:27:38,200
If you're serious about this,
686
00:27:38,200 --> 00:27:40,200
there are four questions you must be able to answer
687
00:27:40,200 --> 00:27:42,200
in writing for every production agent.
688
00:27:42,200 --> 00:27:42,840
Who owns it?
689
00:27:42,840 --> 00:27:43,640
What is it for?
690
00:27:43,640 --> 00:27:45,240
How long is it allowed to exist?
691
00:27:45,240 --> 00:27:47,160
What event forces it to shut down?
692
00:27:47,160 --> 00:27:49,160
If you can't answer those four, you don't have an agent.
693
00:27:49,160 --> 00:27:50,200
You have a ghost.
694
00:27:50,200 --> 00:27:53,960
In Entra terms, you really only have three patterns available for agents.
695
00:27:53,960 --> 00:27:56,440
User impersonation, generic service principals,
696
00:27:56,440 --> 00:27:58,600
and dedicated workload identities.
697
00:27:58,600 --> 00:28:01,000
Only one of those belongs anywhere near production.
698
00:28:01,000 --> 00:28:03,640
Dedicated workload identities for agents.
699
00:28:03,640 --> 00:28:06,360
User impersonation, on-behalf-of flows,
700
00:28:06,360 --> 00:28:09,160
sounds attractive because it keeps access personalized.
701
00:28:09,160 --> 00:28:11,320
In practice, it destroys accountability.
702
00:28:11,320 --> 00:28:14,120
When something goes wrong, your logs say the user did it.
703
00:28:14,120 --> 00:28:17,720
Good luck explaining which actions were consciously taken by a human
704
00:28:17,720 --> 00:28:20,760
and which were silently executed by an agent in their name.
705
00:28:20,760 --> 00:28:22,520
Generic service principals are worse.
706
00:28:22,520 --> 00:28:24,440
They become the organizational junk drawer.
707
00:28:24,440 --> 00:28:26,040
Every time you reuse one,
708
00:28:26,040 --> 00:28:28,760
you deepen permission drift and hide more execution
709
00:28:28,760 --> 00:28:30,520
behind a single opaque identity.
710
00:28:30,520 --> 00:28:32,120
Agents need their own identities.
711
00:28:32,120 --> 00:28:34,200
A unique Entra object per agent,
712
00:28:34,200 --> 00:28:36,760
or at most, per tightly related agent family,
713
00:28:36,760 --> 00:28:39,320
tagged with an owner team, a business purpose,
714
00:28:39,320 --> 00:28:41,720
an environment, and an expiry date.
715
00:28:41,720 --> 00:28:45,320
No owner, no execution. No purpose, no execution.
716
00:28:45,320 --> 00:28:46,920
No expiry, no execution.
717
00:28:46,920 --> 00:28:48,200
That's the baseline.
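A minimal Python sketch of that baseline follows, assuming a hypothetical `AgentIdentity` record that mirrors the tags just listed (owner team, business purpose, environment, expiry date). The field names and the check itself are illustrative, not an Entra feature; the idea is simply that missing metadata blocks execution rather than generating a warning.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical registry entry for one agent's workload identity."""
    name: str
    owner_team: Optional[str]
    purpose: Optional[str]
    environment: Optional[str]
    expires_on: Optional[date]


def execution_blockers(identity, today):
    """Return every reason this identity must not run; an empty list means go."""
    blockers = []
    if not identity.owner_team:
        blockers.append("no owner")
    if not identity.purpose:
        blockers.append("no purpose")
    if not identity.environment:
        blockers.append("no environment")
    if identity.expires_on is None:
        blockers.append("no expiry")
    elif identity.expires_on < today:
        blockers.append("expired")
    return blockers


def may_execute(identity, today):
    """No owner, no purpose, no expiry: no execution."""
    return not execution_blockers(identity, today)


if __name__ == "__main__":
    today = date(2025, 1, 15)
    ghost = AgentIdentity("demo-bot", None, "pilot demo", "dev", None)
    print(execution_blockers(ghost, today))
```

Returning the full list of blockers, instead of stopping at the first one, gives the owning team one complete answer about what has to be fixed before the agent can run.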
718
00:28:48,200 --> 00:28:50,520
This is where identity-driven agent control starts.
719
00:28:50,520 --> 00:28:53,080
You also have to separate how the agent proves who it is
720
00:28:53,080 --> 00:28:55,080
from what it's allowed to do.
721
00:28:55,080 --> 00:28:57,000
Authentication tells you this is the agent.
722
00:28:57,000 --> 00:29:00,760
Authorization tells you here is the slice of the world it can touch.
723
00:29:00,760 --> 00:29:02,600
Those must be independently governed.
724
00:29:02,600 --> 00:29:05,720
On the authentication side, anything that depends on a static secret
725
00:29:05,720 --> 00:29:06,840
is a liability.
726
00:29:06,840 --> 00:29:10,200
If you have an agent whose entire existence depends on a client secret
727
00:29:10,200 --> 00:29:13,240
sitting in a config file or a key vault that no one rotates,
728
00:29:13,240 --> 00:29:15,000
you've just created a long-lived backdoor
729
00:29:15,000 --> 00:29:17,000
with no natural kill switch.
730
00:29:17,000 --> 00:29:20,200
Agents should authenticate using modern, secretless patterns.
731
00:29:20,200 --> 00:29:24,360
Managed identities, workload identity federation, token exchange,
732
00:29:24,360 --> 00:29:26,440
credentials the platform issues and rotates,
733
00:29:26,440 --> 00:29:29,320
not strings a developer copied from a portal six months ago.
734
00:29:29,320 --> 00:29:30,440
On the authorization side,
735
00:29:30,440 --> 00:29:33,080
this is where conditional access for non-human identities
736
00:29:33,080 --> 00:29:34,280
stops being optional.
737
00:29:34,280 --> 00:29:37,880
Most orgs think of conditional access as the thing that enforces MFA
738
00:29:37,880 --> 00:29:38,600
for users.
739
00:29:38,600 --> 00:29:40,520
You need the agent version of that discipline.
740
00:29:40,520 --> 00:29:42,520
Policies that say explicitly,
741
00:29:42,520 --> 00:29:45,560
these identities may only sign in from our Foundry runtime.
742
00:29:45,560 --> 00:29:48,200
These identities may only access these resource types.
743
00:29:48,200 --> 00:29:51,320
These identities are blocked from specific high-risk operations
744
00:29:51,320 --> 00:29:53,000
regardless of token contents.
745
00:29:53,000 --> 00:29:55,080
You're not going to prompt an agent for MFA,
746
00:29:55,080 --> 00:29:56,840
but you can absolutely gate it on context.
747
00:29:56,840 --> 00:29:58,600
Is this coming from the expected workload?
748
00:29:58,600 --> 00:30:00,280
Is it using the correct identity?
749
00:30:00,280 --> 00:30:03,400
Is it trying to reach a resource class it was never cleared for?
750
00:30:03,400 --> 00:30:05,400
If the answer doesn't line up with policy,
751
00:30:05,400 --> 00:30:06,840
the token shouldn't be honored.
752
00:30:06,840 --> 00:30:08,200
Full stop.
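[Editor's sketch] The context checks just described can be pictured as a small decision function. This is a minimal illustration in Python, not the Entra Conditional Access API: the `AgentPolicy` and `TokenRequest` types, the field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AgentPolicy:
    # Hypothetical policy record for one agent identity.
    identity_id: str
    allowed_runtimes: frozenset
    allowed_resource_classes: frozenset
    blocked_operations: frozenset


@dataclass(frozen=True)
class TokenRequest:
    # Hypothetical description of one access attempt.
    identity_id: str
    runtime: str
    resource_class: str
    operation: str


def honor_token(policy: AgentPolicy, req: TokenRequest) -> Tuple[bool, str]:
    """Return (allowed, reason). Any mismatch with policy denies the request."""
    if req.identity_id != policy.identity_id:
        return False, "wrong identity"
    if req.runtime not in policy.allowed_runtimes:
        return False, "unexpected workload"
    if req.resource_class not in policy.allowed_resource_classes:
        return False, "resource class not cleared"
    if req.operation in policy.blocked_operations:
        return False, "high-risk operation blocked"
    return True, "ok"


policy = AgentPolicy(
    identity_id="agent-renewals-prod",
    allowed_runtimes=frozenset({"foundry-prod"}),
    allowed_resource_classes=frozenset({"crm-read"}),
    blocked_operations=frozenset({"export"}),
)
print(honor_token(policy, TokenRequest("agent-renewals-prod", "dev-laptop", "crm-read", "read")))
```

The point of the sketch is the ordering: every question is answered from policy before the request is honored, and a single mismatch is enough to deny.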
753
00:30:08,200 --> 00:30:09,720
Then there's impersonation.
754
00:30:09,720 --> 00:30:13,400
Agents that act on behalf of users should be the exception you escalate,
755
00:30:13,400 --> 00:30:15,720
not the default you casually allow.
756
00:30:15,720 --> 00:30:18,600
If you must let an agent impersonate a user, scope it like a scalpel:
757
00:30:18,600 --> 00:30:20,280
this agent, this user population,
758
00:30:20,280 --> 00:30:22,840
these operations, this system, this time window.
759
00:30:22,840 --> 00:30:25,880
Everything else stays under its own workload identity
760
00:30:25,880 --> 00:30:28,280
with its own clearly defined permissions.
761
00:30:28,280 --> 00:30:31,080
And none of this matters if you don't wire in lifecycle.
762
00:30:31,080 --> 00:30:36,040
Agent identities should not be created by hand in the portal at 11 p.m. before a demo.
763
00:30:36,040 --> 00:30:39,240
They should be provisioned through a pipeline that enforces naming, tagging,
764
00:30:39,240 --> 00:30:40,520
and baseline policies.
765
00:30:40,520 --> 00:30:41,880
Scripted at minimum.
766
00:30:41,880 --> 00:30:43,720
Policy as code if you're serious.
767
00:30:43,720 --> 00:30:45,400
Deprovisioning needs the same rigor.
768
00:30:45,400 --> 00:30:47,880
When a project ends, when an owner changes roles,
769
00:30:47,880 --> 00:30:49,400
when the maximum lifetime hits,
770
00:30:49,400 --> 00:30:51,480
the identity should be flagged automatically.
771
00:30:51,480 --> 00:30:54,440
Someone has to consciously renew it with justification or let it die.
772
00:30:54,440 --> 00:30:56,680
No identity should be immortal by default.
773
00:30:56,680 --> 00:30:59,640
So your week-one control plane for identity looks like this.
774
00:30:59,640 --> 00:31:03,080
Inventory every non-human identity any agent could be using,
775
00:31:03,080 --> 00:31:04,600
flag the ones with no clear owner,
776
00:31:04,600 --> 00:31:06,040
flag the ones with broad rights,
777
00:31:06,040 --> 00:31:08,200
prohibit new agents from binding to them,
778
00:31:08,200 --> 00:31:09,720
then define a single hard rule.
779
00:31:09,720 --> 00:31:11,480
No dedicated workload identity,
780
00:31:11,480 --> 00:31:15,240
no agent. No owner tag, no agent. No expiry, no agent.
781
00:31:15,240 --> 00:31:18,520
Because until you can look at an Entra object and say, that is an agent,
782
00:31:18,520 --> 00:31:21,480
here is who owns it, here is what it can do, here is when it dies,
783
00:31:21,480 --> 00:31:22,680
you don't have a control plane.
784
00:31:22,680 --> 00:31:26,360
You have a factory full of unlabeled machines all wired to the same power bus,
785
00:31:26,360 --> 00:31:28,520
and no breakers you trust enough to flip,
786
00:31:28,520 --> 00:31:30,200
and when the first incident hits,
787
00:31:30,200 --> 00:31:31,640
that's exactly what it will feel like.
788
00:31:31,640 --> 00:31:36,440
The refrain, on purpose: identity is the first boundary.
789
00:31:36,440 --> 00:31:38,280
By now the pattern should be obvious.
790
00:31:38,280 --> 00:31:41,720
If you let agents exist before you decide who they are and how they're contained,
791
00:31:41,720 --> 00:31:43,480
everything after that is guesswork.
792
00:31:43,480 --> 00:31:46,440
Say it out loud: identity first, data later.
793
00:31:46,440 --> 00:31:49,080
Everything else is detail. This is intentional repetition,
794
00:31:49,080 --> 00:31:51,000
not a copy, a reinforcement.
795
00:31:51,000 --> 00:31:52,600
Most orgs try to start with data.
796
00:31:52,600 --> 00:31:53,080
They shouldn't.
797
00:31:53,080 --> 00:31:55,400
The first boundary in an agent control plane is not data,
798
00:31:55,400 --> 00:31:56,600
it is identity,
799
00:31:56,600 --> 00:32:00,280
and "we created an app registration in Entra" is not identity.
800
00:32:00,280 --> 00:32:03,480
Identity means the agent is a first-class security principal
801
00:32:03,480 --> 00:32:04,600
with a lifecycle,
802
00:32:04,600 --> 00:32:05,480
a custodian,
803
00:32:05,480 --> 00:32:07,720
and a fence line you can draw on a whiteboard.
804
00:32:07,720 --> 00:32:10,280
Think of four properties you can't compromise.
805
00:32:10,280 --> 00:32:11,400
A steward,
806
00:32:11,400 --> 00:32:13,640
the accountable team on the hook when it moves,
807
00:32:13,640 --> 00:32:14,200
a charter,
808
00:32:14,200 --> 00:32:16,760
the exact problem space it is allowed to operate in,
809
00:32:16,760 --> 00:32:17,400
a clock,
810
00:32:17,400 --> 00:32:20,120
an expiration that forces renewal or retirement,
811
00:32:20,120 --> 00:32:21,080
a kill switch,
812
00:32:21,080 --> 00:32:23,480
a condition that shuts it off without debate.
813
00:32:23,480 --> 00:32:26,280
If you can't state those four in writing for a single agent,
814
00:32:26,280 --> 00:32:27,480
you don't have a workload,
815
00:32:27,480 --> 00:32:28,280
you have a ghost,
816
00:32:28,280 --> 00:32:29,080
no owner,
817
00:32:29,080 --> 00:32:29,960
no execution,
818
00:32:29,960 --> 00:32:31,640
no owner, no execution.
819
00:32:31,640 --> 00:32:33,000
This is not ceremony,
820
00:32:33,000 --> 00:32:35,160
it's how you prevent a non-human identity
821
00:32:35,160 --> 00:32:37,480
from outliving the intent that justified it.
822
00:32:37,480 --> 00:32:39,320
Let's zoom in and vary the angles,
823
00:32:39,320 --> 00:32:42,200
because this is where orgs convince themselves they're fine.
824
00:32:42,200 --> 00:32:44,200
Three ways agents show up in Entra:
825
00:32:44,200 --> 00:32:46,360
as user impersonation, on-behalf-of;
826
00:32:46,360 --> 00:32:48,680
as a borrowed service principal,
827
00:32:48,680 --> 00:32:49,960
an automation app;
828
00:32:49,960 --> 00:32:51,640
or as a dedicated workload identity.
829
00:32:51,640 --> 00:32:53,800
Only one is defensible at scale.
830
00:32:53,800 --> 00:32:56,120
User impersonation keeps access personalized,
831
00:32:56,120 --> 00:32:57,560
then removes accountability.
832
00:32:57,560 --> 00:32:58,920
When something goes wrong,
833
00:32:58,920 --> 00:33:00,520
the logs say the user did it.
834
00:33:00,520 --> 00:33:02,440
You will spend weeks separating human action
835
00:33:02,440 --> 00:33:05,000
from agent action that happened under that user's name.
836
00:33:05,000 --> 00:33:06,440
Shared service principals are worse;
837
00:33:06,440 --> 00:33:08,520
they become a junk drawer for exceptions.
838
00:33:08,520 --> 00:33:09,320
Every reuse
839
00:33:09,320 --> 00:33:10,440
deepens permission drift
840
00:33:10,440 --> 00:33:13,160
and hides behavior behind a single opaque actor
841
00:33:13,160 --> 00:33:14,040
no one owns.
842
00:33:14,040 --> 00:33:17,880
Dedicated workload identities are the only adult option,
843
00:33:17,880 --> 00:33:20,200
one agent, one principal, one perimeter.
844
00:33:20,200 --> 00:33:21,480
Tag them like you mean it.
845
00:33:21,480 --> 00:33:23,240
Steward: a team, not a person.
846
00:33:23,240 --> 00:33:25,400
Business charter: one sentence, specific.
847
00:33:25,400 --> 00:33:28,120
Environment: dev, test, prod.
848
00:33:28,120 --> 00:33:30,680
Expiry: a real date, not "never."
849
00:33:30,680 --> 00:33:32,360
No purpose, no execution,
850
00:33:32,360 --> 00:33:34,280
no expiry, no execution.
851
00:33:34,280 --> 00:33:35,640
That is the baseline.
852
00:33:35,640 --> 00:33:38,440
Now separate two ideas you must never conflate.
853
00:33:38,440 --> 00:33:40,840
Authentication proves this is the agent.
854
00:33:40,840 --> 00:33:42,040
Authorization answers,
855
00:33:42,040 --> 00:33:43,880
here is the slice of the world it can touch.
856
00:33:43,880 --> 00:33:45,080
Those are independent levers.
857
00:33:45,080 --> 00:33:45,960
Govern them separately.
858
00:33:45,960 --> 00:33:48,120
On authentication,
859
00:33:48,120 --> 00:33:50,280
if a static secret can keep an agent alive,
860
00:33:50,280 --> 00:33:52,520
you've built a back door with no natural end.
861
00:33:52,520 --> 00:33:53,880
Dump client secrets,
862
00:33:53,880 --> 00:33:55,400
prefer secretless patterns,
863
00:33:55,400 --> 00:33:56,520
managed identities,
864
00:33:56,520 --> 00:33:58,360
workload identity federation,
865
00:33:58,360 --> 00:33:59,560
token exchange.
866
00:33:59,560 --> 00:34:01,560
Credentials the platform issues and rotates,
867
00:34:01,560 --> 00:34:03,320
not strings somebody pasted into a config
868
00:34:03,320 --> 00:34:04,680
the night before a demo.
869
00:34:04,680 --> 00:34:05,800
On authorization,
870
00:34:05,800 --> 00:34:08,440
conditional access for non-human identities is not optional.
871
00:34:08,440 --> 00:34:10,680
You're not going to prompt an agent for MFA,
872
00:34:10,680 --> 00:34:12,040
but you can enforce context.
873
00:34:12,040 --> 00:34:13,640
Only sign in from your Foundry runtime.
874
00:34:13,640 --> 00:34:15,720
Only call the resources you were cleared for.
875
00:34:15,720 --> 00:34:17,320
Block entire classes of operations
876
00:34:17,320 --> 00:34:19,160
regardless of roles in the token.
877
00:34:19,160 --> 00:34:20,600
If the context doesn't match policy,
878
00:34:20,600 --> 00:34:21,800
the token is not honored.
879
00:34:21,800 --> 00:34:22,440
Full stop.
880
00:34:22,440 --> 00:34:24,680
About impersonation.
881
00:34:24,680 --> 00:34:26,200
Make it the exception you escalate,
882
00:34:26,200 --> 00:34:27,800
not the default you encourage.
883
00:34:27,800 --> 00:34:29,640
And when you approve it, narrow it to a scalpel,
884
00:34:29,640 --> 00:34:31,240
this agent, this user population,
885
00:34:31,240 --> 00:34:32,680
these operations, this system,
886
00:34:32,680 --> 00:34:33,880
this time window.
887
00:34:33,880 --> 00:34:35,480
Everything else runs under its own
888
00:34:35,480 --> 00:34:37,800
workload identity with least privilege.
889
00:34:37,800 --> 00:34:40,200
Lifecycle is where most programs quietly give up.
890
00:34:40,200 --> 00:34:41,800
Don't. Provision identities
891
00:34:41,800 --> 00:34:43,720
through a pipeline that enforces naming,
892
00:34:43,720 --> 00:34:45,480
tagging and baseline policy.
893
00:34:45,480 --> 00:34:46,680
Scripted, bare minimum.
894
00:34:46,680 --> 00:34:47,560
Policy as code,
895
00:34:47,560 --> 00:34:48,680
if you're serious.
896
00:34:48,680 --> 00:34:50,360
Deprovision with the same rigor.
897
00:34:50,360 --> 00:34:51,240
When the clock hits,
898
00:34:51,240 --> 00:34:53,000
the steward renews with justification
899
00:34:53,000 --> 00:34:54,440
or the identity dies.
900
00:34:54,440 --> 00:34:55,400
No immortal accounts,
901
00:34:55,400 --> 00:34:57,160
no sentimental exceptions.
902
00:34:57,160 --> 00:34:59,400
Inventory is not optional theater.
903
00:34:59,400 --> 00:35:01,560
It's how you find the ghosts you already have.
904
00:35:01,560 --> 00:35:02,840
Week one discipline.
905
00:35:02,840 --> 00:35:06,120
Enumerate every non-human identity agents could be using.
906
00:35:06,120 --> 00:35:08,120
Flag the orphans: no steward tag.
907
00:35:08,120 --> 00:35:09,960
Flag the giants: broad rights.
908
00:35:09,960 --> 00:35:11,720
Ban new agents from binding to either.
909
00:35:11,720 --> 00:35:14,600
Then set the single hard rule that actually moves behavior.
910
00:35:14,600 --> 00:35:17,480
No dedicated workload identity, no agent.
911
00:35:17,480 --> 00:35:20,520
No steward tag, no agent. No expiry, no agent.
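[Editor's sketch] The week-one inventory pass can be expressed in a few lines. This is a minimal Python sketch under stated assumptions: the `NonHumanIdentity` record, the `BROAD_PERMISSIONS` list, and the example identities are illustrative, and the check that an expiry date must still be in the future is an added assumption, not something the episode spells out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Set

# Illustrative list only; each organization defines its own "broad rights".
BROAD_PERMISSIONS = {"Directory.ReadWrite.All", "Sites.FullControl.All"}


@dataclass
class NonHumanIdentity:
    name: str
    steward: Optional[str]   # owning team, not a person
    expiry: Optional[date]   # None models "never"
    permissions: Set[str]
    dedicated: bool          # True if bound to exactly one agent


def triage(identities: List[NonHumanIdentity]) -> Dict[str, List[str]]:
    """Flag orphans (no steward tag) and giants (broad rights)."""
    return {
        "orphans": [i.name for i in identities if not i.steward],
        "giants": [i.name for i in identities if i.permissions & BROAD_PERMISSIONS],
    }


def may_bind(identity: NonHumanIdentity, today: date) -> bool:
    """The hard rule: dedicated identity, steward tag, live expiry, no broad rights."""
    return (
        identity.dedicated
        and bool(identity.steward)
        and identity.expiry is not None
        and identity.expiry > today
        and not (identity.permissions & BROAD_PERMISSIONS)
    )


inventory = [
    NonHumanIdentity("svc-automation", None, None, {"Directory.ReadWrite.All"}, False),
    NonHumanIdentity("agent-renewals-prod", "sales-platform", date(2030, 1, 1), {"Sites.Selected"}, True),
]
print(triage(inventory))
print([i.name for i in inventory if may_bind(i, date(2025, 1, 1))])
```

In a real tenant the inventory would come from a directory export rather than a hand-written list; the triage and binding rule stay the same.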
912
00:35:20,520 --> 00:35:23,400
Here's a fresh scenario to make this less abstract.
913
00:35:23,400 --> 00:35:26,120
The product org ships a renewals assistant,
914
00:35:26,120 --> 00:35:27,560
clean design, clean runbook,
915
00:35:27,560 --> 00:35:29,480
dedicated identity, owner tag,
916
00:35:29,480 --> 00:35:32,600
read only scopes across a labeled set of customer assets.
917
00:35:32,600 --> 00:35:33,640
Sales loves it.
918
00:35:33,640 --> 00:35:37,240
A quarter later, ops adds a small enhancement.
919
00:35:37,240 --> 00:35:38,840
Give that same principal write access
920
00:35:38,840 --> 00:35:41,240
to a scheduling system just for convenience.
921
00:35:41,240 --> 00:35:43,320
No one circles back to the identity record.
922
00:35:43,320 --> 00:35:45,800
The steward team doesn't formally accept the change.
923
00:35:45,800 --> 00:35:47,080
A quarter after that,
924
00:35:47,080 --> 00:35:50,280
finance asks why renewal reminders now include meeting details
925
00:35:50,280 --> 00:35:53,560
and internal notes that were never supposed to leave the sales system.
926
00:35:53,560 --> 00:35:55,480
No breach, no exfiltration.
927
00:35:55,480 --> 00:35:58,200
Every API call was legitimate.
928
00:35:58,200 --> 00:36:00,680
The agent simply discovered, at runtime,
929
00:36:00,680 --> 00:36:03,640
the widened perimeter on its identity and behaved accordingly.
930
00:36:03,640 --> 00:36:05,240
Could that have happened with a named steward,
931
00:36:05,240 --> 00:36:07,400
a charter that forbids cross-system writebacks,
932
00:36:07,400 --> 00:36:11,400
a clock that forces renewal and a kill switch tied to scope changes?
933
00:36:11,400 --> 00:36:13,720
Unlikely. Without them, inevitable.
934
00:36:13,720 --> 00:36:15,080
Identity is not a checkbox.
935
00:36:15,080 --> 00:36:18,280
It's the breaker you trust enough to flip when the trace looks wrong.
936
00:36:18,280 --> 00:36:21,400
And the refrain matters because it anchors the next gate.
937
00:36:21,400 --> 00:36:23,320
Identity gives you a named actor.
938
00:36:23,320 --> 00:36:25,960
Permissions define what that actor can touch in theory.
939
00:36:25,960 --> 00:36:28,760
Neither answers the question that burns your time in incident review.
940
00:36:28,760 --> 00:36:32,600
What combinations of data was this thing ever allowed to assemble?
941
00:36:32,600 --> 00:36:34,360
That's the next boundary: data,
942
00:36:34,360 --> 00:36:36,200
sensitivity labels and DLP,
943
00:36:36,200 --> 00:36:38,440
and they only work if identity came first.
944
00:36:38,440 --> 00:36:40,840
But hold the line here: one agent, one principal,
945
00:36:40,840 --> 00:36:43,560
one owner on record, one charter with teeth.
946
00:36:43,560 --> 00:36:46,680
One clock that forces a choice, one kill switch wired to policy.
947
00:36:46,680 --> 00:36:49,160
Because if an agent can execute without those,
948
00:36:49,160 --> 00:36:50,440
you didn't automate work.
949
00:36:50,440 --> 00:36:51,480
You automated risk.
950
00:36:51,480 --> 00:36:53,960
And you will read about it later,
951
00:36:53,960 --> 00:36:57,160
written by someone else to an audience that wasn't in your design review.
952
00:36:57,160 --> 00:37:00,440
No owner, no execution. No label, no agent.
953
00:37:00,440 --> 00:37:03,480
No audit path, no production. Identity first, always.
954
00:37:03,480 --> 00:37:07,000
Execution visibility: observability and agent data access auditing.
955
00:37:07,000 --> 00:37:11,000
Identity tells you who the agent is. Labels tell you what it's allowed to assemble.
956
00:37:11,000 --> 00:37:13,960
Execution visibility is how you prove what actually happened.
957
00:37:13,960 --> 00:37:16,280
Without that third pillar, you are still guessing.
958
00:37:16,280 --> 00:37:18,200
You're just guessing with better vocabulary.
959
00:37:18,200 --> 00:37:19,720
When people say, "We'll monitor it,"
960
00:37:19,720 --> 00:37:22,600
what they usually mean is, "Logs exist somewhere."
961
00:37:22,600 --> 00:37:23,960
For agents, that's not enough.
962
00:37:23,960 --> 00:37:25,240
You don't need a pile of events.
963
00:37:25,240 --> 00:37:26,440
You need a narrative.
964
00:37:26,440 --> 00:37:29,240
For any run that matters, you should be able to answer five questions
965
00:37:29,240 --> 00:37:31,000
without launching an incident war room.
966
00:37:31,000 --> 00:37:32,360
Which agent executed?
967
00:37:32,360 --> 00:37:33,480
Under which identity?
968
00:37:33,480 --> 00:37:35,320
Which tools did it call in what order?
969
00:37:35,320 --> 00:37:38,040
Which labeled data did it read or write along the way?
970
00:37:38,040 --> 00:37:39,880
What decision points changed its path?
971
00:37:39,880 --> 00:37:41,400
If you can't reconstruct that story,
972
00:37:41,400 --> 00:37:43,720
you can't audit behavior, you can't prove compliance,
973
00:37:43,720 --> 00:37:46,120
and you definitely can't walk into an investigation
974
00:37:46,120 --> 00:37:48,040
with anything stronger than opinion.
975
00:37:48,040 --> 00:37:49,400
So let's make this concrete.
976
00:37:49,400 --> 00:37:53,080
At minimum, observability for agents has to capture four dimensions.
977
00:37:53,080 --> 00:37:54,680
First, agent context.
978
00:37:54,680 --> 00:37:58,040
Every run needs an immutable identifier for the agent definition,
979
00:37:58,040 --> 00:38:00,840
the version, the environment, and the workload identity.
980
00:38:00,840 --> 00:38:02,600
Not "some call from Foundry."
981
00:38:02,600 --> 00:38:05,480
This specific agent, version N, in environment X,
982
00:38:05,480 --> 00:38:07,240
acting as identity Y.
983
00:38:07,240 --> 00:38:10,600
Second, the tool graph. You need a trace of which tools were invoked
984
00:38:10,600 --> 00:38:13,720
in what sequence with which inputs and outputs at a metadata level.
985
00:38:13,720 --> 00:38:15,640
I don't mean storing full payloads forever.
986
00:38:15,640 --> 00:38:17,640
I mean enough structure to say,
987
00:38:17,640 --> 00:38:19,160
this run queried a vector store
988
00:38:19,160 --> 00:38:21,640
built from content labeled internal confidential,
989
00:38:21,640 --> 00:38:24,920
searched an Exchange mailbox labeled customer PII,
990
00:38:24,920 --> 00:38:28,440
called an API that returns finance restricted records,
991
00:38:28,440 --> 00:38:31,800
then wrote into a system labeled operational internal.
992
00:38:31,800 --> 00:38:33,080
That's an execution graph.
993
00:38:33,080 --> 00:38:35,960
Without it, you're just staring at individual log lines.
994
00:38:35,960 --> 00:38:37,640
Third, the data access footprint.
995
00:38:37,640 --> 00:38:40,520
This is where your Purview integration stops being theoretical.
996
00:38:40,520 --> 00:38:42,440
For each tool call, you want to know:
997
00:38:42,440 --> 00:38:44,680
Which labels were present on the resources touched?
998
00:38:44,680 --> 00:38:47,720
Were those labels allowed under this agent's declared policy?
999
00:38:47,720 --> 00:38:50,360
If an agent that was only approved for internal
1000
00:38:50,360 --> 00:38:52,280
suddenly reads from highly confidential,
1001
00:38:52,280 --> 00:38:54,680
that shouldn't hide as a generic API request.
1002
00:38:54,680 --> 00:38:57,160
It should surface as a boundary violation in your traces.
1003
00:38:57,160 --> 00:38:58,920
Fourth, decision points.
1004
00:38:58,920 --> 00:39:01,640
Agents don't just march through a static flow, they choose.
1005
00:39:01,640 --> 00:39:04,040
You don't need to log every token of model reasoning,
1006
00:39:04,040 --> 00:39:06,600
but you do need markers where the agent changed course.
1007
00:39:06,600 --> 00:39:08,920
It retried a tool, it chose an alternate tool,
1008
00:39:08,920 --> 00:39:11,000
it declined to act because a precondition failed,
1009
00:39:11,000 --> 00:39:12,600
it escalated to a fallback path.
1010
00:39:12,600 --> 00:39:15,720
Those are the moments an auditor or an internal review
1011
00:39:15,720 --> 00:39:16,840
will care about.
1012
00:39:16,840 --> 00:39:18,760
Why did it go left instead of right?
1013
00:39:18,760 --> 00:39:20,920
If your traces flatten everything into API calls,
1014
00:39:20,920 --> 00:39:21,880
you've lost the why.
1015
00:39:21,880 --> 00:39:24,600
This is where something like OpenTelemetry becomes useful,
1016
00:39:24,600 --> 00:39:26,360
not because you care about the standard itself,
1017
00:39:26,360 --> 00:39:28,920
but because you need structured, correlated traces.
1018
00:39:28,920 --> 00:39:30,360
Think of each agent run as a trace,
1019
00:39:30,360 --> 00:39:31,880
each tool call as a span.
1020
00:39:31,880 --> 00:39:35,080
Labels and policy decisions as attributes and events.
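[Editor's sketch] The run-as-trace, tool-call-as-span shape described here can be modeled with plain Python data classes. This is a stand-in for illustration, not the OpenTelemetry SDK: the `Trace` and `Span` classes, the attribute keys, and the `boundary_violation` event name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    # One tool call. Labels go in attributes, policy decisions in events.
    tool: str
    attributes: Dict[str, str] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)


@dataclass
class Trace:
    # One agent run, carrying the agent context described earlier.
    agent: str
    version: str
    environment: str
    identity: str
    allowed_labels: frozenset
    spans: List[Span] = field(default_factory=list)

    def record_tool_call(self, tool: str, label: str, action: str) -> Span:
        span = Span(tool, {"data.label": label, "data.action": action})
        if label not in self.allowed_labels:
            # Surface the crossing explicitly instead of hiding it in a generic call.
            span.events.append("boundary_violation")
        self.spans.append(span)
        return span

    def violations(self) -> List[str]:
        return [s.tool for s in self.spans if "boundary_violation" in s.events]


run = Trace("renewals-assistant", "3", "prod", "agent-renewals-prod", frozenset({"Internal"}))
run.record_tool_call("vector_search", "Internal", "read")
run.record_tool_call("mailbox_search", "Highly Confidential", "read")
print(run.violations())
```

With a real tracing library the same structure would map onto its span attributes and span events; the design choice that matters is recording the label and the policy verdict on the span itself, so the tool graph and the data footprint live in one trace.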
1021
00:39:35,080 --> 00:39:38,120
That gives you the raw material for two critical capabilities:
1022
00:39:38,120 --> 00:39:40,360
runtime alerting and forensic reconstruction.
1023
00:39:40,360 --> 00:39:42,280
Runtime alerting is obvious.
1024
00:39:42,280 --> 00:39:45,160
If an agent crosses a label boundary it was never approved for,
1025
00:39:45,160 --> 00:39:47,880
or starts calling tools outside its declared set,
1026
00:39:47,880 --> 00:39:50,760
you want a signal now, not in an annual review.
1027
00:39:50,760 --> 00:39:54,200
But forensic reconstruction is where most organizations quietly fail.
1028
00:39:54,200 --> 00:39:56,760
When someone says this summary contained HR details
1029
00:39:56,760 --> 00:39:58,440
that should never have left HR,
1030
00:39:58,440 --> 00:40:00,520
you need to pull that run and walk through it.
1031
00:40:00,520 --> 00:40:02,440
Here is the agent, here is the identity,
1032
00:40:02,440 --> 00:40:05,560
here are the tools it called, here are the labels it touched at each step,
1033
00:40:05,560 --> 00:40:08,680
here is the decision where it chose to include that fragment.
1034
00:40:08,680 --> 00:40:12,040
If your only answer is, "Well, the agent has access to those systems,"
1035
00:40:12,040 --> 00:40:13,400
you've already lost the argument.
1036
00:40:13,400 --> 00:40:16,680
Agent data access auditing sits on top of this observability fabric.
1037
00:40:16,680 --> 00:40:20,120
It's not a separate log, it's a set of questions you know you'll need to answer,
1038
00:40:20,120 --> 00:40:22,120
baked into how you design traces.
1039
00:40:22,120 --> 00:40:24,600
Questions like, show me all runs of any agent
1040
00:40:24,600 --> 00:40:26,520
that read data labeled regulated,
1041
00:40:26,520 --> 00:40:29,000
and then wrote to destinations labeled external.
1042
00:40:29,000 --> 00:40:33,320
Show me every run where HR confidential and finance restricted appeared in the same trace.
1043
00:40:33,320 --> 00:40:36,440
Show me executions under identities that were flagged for review
1044
00:40:36,440 --> 00:40:37,960
or should have been decommissioned.
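[Editor's sketch] If each run is summarized as a record of labels read and labels written, the three audit questions above become one-line filters. This is a minimal Python sketch: the record layout, the label names, and the sample runs are hypothetical, and the summary does not capture the order of reads and writes within a run.

```python
from typing import Dict, List, Set

# Hypothetical per-run summaries, derived from traces.
RUNS: List[Dict] = [
    {"run_id": "r1", "identity": "agent-a", "reads": {"Regulated"}, "writes": {"External"}},
    {"run_id": "r2", "identity": "agent-b",
     "reads": {"HR Confidential", "Finance Restricted"}, "writes": {"Internal"}},
    {"run_id": "r3", "identity": "agent-c", "reads": {"Internal"}, "writes": {"Internal"}},
]


def read_and_wrote(runs: List[Dict], read_label: str, write_label: str) -> List[str]:
    """Runs that read one label and wrote to a destination with another."""
    return [r["run_id"] for r in runs
            if read_label in r["reads"] and write_label in r["writes"]]


def labels_cooccur(runs: List[Dict], labels: Set[str]) -> List[str]:
    """Runs where every label in the set appears somewhere in the same trace."""
    return [r["run_id"] for r in runs if labels <= (r["reads"] | r["writes"])]


def runs_under(runs: List[Dict], flagged_identities: Set[str]) -> List[str]:
    """Runs executed under identities flagged for review or decommissioning."""
    return [r["run_id"] for r in runs if r["identity"] in flagged_identities]


print(read_and_wrote(RUNS, "Regulated", "External"))
print(labels_cooccur(RUNS, {"HR Confidential", "Finance Restricted"}))
print(runs_under(RUNS, {"agent-c"}))
```

If these filters are cheap to run, the questions were designed into the traces; if they need a week of log wrangling, they were not.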
1045
00:40:37,960 --> 00:40:41,080
If those queries take a week of log wrangling, you don't have auditing,
1046
00:40:41,080 --> 00:40:43,080
you have a data lake. And here's the hard truth.
1047
00:40:43,080 --> 00:40:45,240
Observability without control is theatre.
1048
00:40:45,240 --> 00:40:47,480
If your traces can show you that an agent is misbehaving,
1049
00:40:47,480 --> 00:40:49,080
but you have no reliable kill switch,
1050
00:40:49,080 --> 00:40:53,240
no way to revoke its identity, no way to block its label access before the next run,
1051
00:40:53,240 --> 00:40:55,800
all you've built is a very expensive rearview mirror.
1052
00:40:56,520 --> 00:40:59,400
Execution visibility has to be wired back into the control plane.
1053
00:40:59,400 --> 00:41:03,400
That means a central switch to disable an agent identity instantly,
1054
00:41:03,400 --> 00:41:08,440
a mechanism to block an agent's access to specific labels or tools based on what your traces reveal.
1055
00:41:08,440 --> 00:41:10,200
And one non-negotiable policy,
1056
00:41:10,200 --> 00:41:13,400
if an agent cannot meet your observability and auditing requirements,
1057
00:41:13,400 --> 00:41:14,680
it does not run in production.
1058
00:41:14,680 --> 00:41:16,600
No exceptions, no temporary blind spots,
1059
00:41:16,600 --> 00:41:18,280
no "we'll add tracing later."
1060
00:41:18,280 --> 00:41:20,760
Because in the room that matters after the incident,
1061
00:41:20,760 --> 00:41:23,000
you will not be judged on how many logs you collected.
1062
00:41:23,000 --> 00:41:24,680
You'll be judged on whether you can say,
1063
00:41:24,680 --> 00:41:27,960
clearly, here is what the agent did, here is why it was allowed to do it,
1064
00:41:27,960 --> 00:41:29,800
here is how we made sure it can't ever do it again.
1065
00:41:29,800 --> 00:41:33,640
The non-negotiable rule: pre-execution governance only.
1066
00:41:33,640 --> 00:41:36,680
Everything I've walked through so far points to one conclusion.
1067
00:41:36,680 --> 00:41:39,560
This is the non-negotiable rule of your agent control plane.
1068
00:41:39,560 --> 00:41:42,120
If you let agents execute before governance is enforced,
1069
00:41:42,120 --> 00:41:44,840
you have already lost. Not "you might have a problem later,"
1070
00:41:44,840 --> 00:41:46,360
not "you should keep an eye on it."
1071
00:41:46,360 --> 00:41:49,400
You have accepted that your environment will be shaped by autonomous behaviour
1072
00:41:49,400 --> 00:41:50,520
you do not control.
1073
00:41:50,520 --> 00:41:52,040
Anything else is theatre.
1074
00:41:52,040 --> 00:41:54,040
Everything after this is damage control.
1075
00:41:54,040 --> 00:41:56,760
So I want to state the rule plainly, so there is no wiggle room.
1076
00:41:56,760 --> 00:42:02,040
If an agent can execute before identity, data boundary and observability are in place,
1077
00:42:02,040 --> 00:42:03,800
governance has already failed.
1078
00:42:03,800 --> 00:42:08,200
Pre-execution only. Pre-execution governance is the only governance agents will ever
1079
00:42:08,200 --> 00:42:12,120
truly respect. When you apply controls after agents exist, you are doing theatre.
1080
00:42:12,120 --> 00:42:15,400
You can add Purview policies, you can tighten conditional access,
1081
00:42:15,400 --> 00:42:16,760
you can enhance logging.
1082
00:42:16,760 --> 00:42:18,840
But all of that is happening in an estate
1083
00:42:18,840 --> 00:42:21,960
whose shape was already determined by ungoverned execution.
1084
00:42:21,960 --> 00:42:23,720
You're not building an agent control plane,
1085
00:42:23,720 --> 00:42:27,640
you're trying to decorate one that autonomous execution has already drawn for you.
1086
00:42:27,640 --> 00:42:31,400
You are bolting guardrails onto systems that were never designed to carry them.
1087
00:42:31,400 --> 00:42:35,080
That is exactly what happened with SharePoint, Power Apps and Teams.
1088
00:42:35,080 --> 00:42:37,640
Governance arrived after the fact. It reduced some risks;
1089
00:42:37,640 --> 00:42:39,080
it never erased the entropy.
1090
00:42:39,080 --> 00:42:42,360
With Foundry agents, that same posture is worse than ineffective.
1091
00:42:42,360 --> 00:42:46,680
It actively hides the real state of your environment behind dashboards and reports that look mature.
1092
00:42:46,680 --> 00:42:49,240
A post-creation policy is opt-in by definition.
1093
00:42:49,240 --> 00:42:53,080
It assumes everything that existed before the policy is either exempt
1094
00:42:53,080 --> 00:42:55,800
or will be discovered and remediated by hand.
1095
00:42:55,800 --> 00:42:59,160
In practice, that means your oldest, riskiest, least documented agents
1096
00:42:59,160 --> 00:43:01,640
are precisely the ones least likely to be constrained.
1097
00:43:01,640 --> 00:43:05,800
If your AI strategy starts with "we'll monitor it," you've already accepted the outcome.
1098
00:43:05,800 --> 00:43:07,240
Monitoring is a feedback loop.
1099
00:43:07,240 --> 00:43:08,520
Governance is a predicate.
1100
00:43:08,520 --> 00:43:11,640
Feedback loops improve a system that already exists.
1101
00:43:11,640 --> 00:43:14,120
Predicates decide which systems are allowed to exist.
1102
00:43:14,120 --> 00:43:17,240
So what does pre-execution governance actually look like for Foundry?
1103
00:43:17,240 --> 00:43:19,240
It looks like gates, hard ones.
1104
00:43:19,240 --> 00:43:21,720
Wired into the only path that leads to production.
1105
00:43:21,720 --> 00:43:23,960
For agents, there are three gates that matter.
1106
00:43:23,960 --> 00:43:25,480
Gate one is identity.
1107
00:43:25,480 --> 00:43:28,040
If there is no dedicated workload identity,
1108
00:43:28,040 --> 00:43:30,200
tagged with an owner, a purpose, an environment,
1109
00:43:30,200 --> 00:43:32,040
and an expiry, deployment fails.
1110
00:43:32,040 --> 00:43:34,360
Not raises a warning. Fails.
1111
00:43:34,360 --> 00:43:36,520
No owner, no execution.
1112
00:43:36,520 --> 00:43:38,280
Gate two is data boundary.
1113
00:43:38,280 --> 00:43:41,320
If the agent's declared tools and knowledge sources cannot be mapped,
1114
00:43:41,320 --> 00:43:45,240
via Purview, to an approved combination of sensitivity labels and destinations,
1115
00:43:45,240 --> 00:43:46,360
deployment fails.
1116
00:43:46,360 --> 00:43:48,040
No label, no agent. That's the line.
1117
00:43:48,040 --> 00:43:50,600
You don't let agents run over unlabeled data.
1118
00:43:50,600 --> 00:43:54,360
You don't trust hand-wavy assurances that the system is internal.
1119
00:43:54,360 --> 00:43:57,240
If Purview can't classify it, Foundry can't touch it.
1120
00:43:57,240 --> 00:43:59,080
Gate three is observability.
1121
00:43:59,080 --> 00:44:02,520
If the agent cannot produce traces that meet your auditing requirements,
1122
00:44:02,520 --> 00:44:06,600
identity, tool graph, label footprint, decision markers,
1123
00:44:06,600 --> 00:44:08,120
it does not run in production.
1124
00:44:08,120 --> 00:44:09,720
No audit path, no execution.
1125
00:44:09,720 --> 00:44:12,840
Those three gates, identity, boundary, visibility,
1126
00:44:12,840 --> 00:44:15,320
are the minimal pre-execution requirements.
1127
00:44:15,320 --> 00:44:17,480
Anything less, and you are not governing agents,
1128
00:44:17,480 --> 00:44:18,680
you are annotating them.
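[Editor's sketch] The three gates can be wired into a deployment path as a single check that fails closed. This is a minimal Python sketch under stated assumptions: the manifest layout, the tag and trace field names, and the `deploy` function are hypothetical, and gate two is simplified to an approved-label set rather than a full Purview evaluation of label and destination combinations.

```python
from typing import Dict, List, Set

REQUIRED_TAGS = ("owner", "purpose", "environment", "expiry")
REQUIRED_TRACE_FIELDS = ("identity", "tool_graph", "label_footprint", "decision_markers")


def evaluate_gates(manifest: Dict, approved_labels: Set[str]) -> List[str]:
    """Return a list of gate failures. An empty list means the agent may deploy."""
    failures: List[str] = []

    # Gate 1: dedicated workload identity with all four tags.
    identity = manifest.get("identity") or {}
    missing_tags = [t for t in REQUIRED_TAGS if not identity.get(t)]
    if not identity.get("dedicated") or missing_tags:
        failures.append(f"gate 1 (identity): missing tags {missing_tags}")

    # Gate 2: every declared source must carry an approved label.
    for source in manifest.get("sources", []):
        label = source.get("label")
        if label is None:
            failures.append(f"gate 2 (data boundary): {source['name']} is unlabeled")
        elif label not in approved_labels:
            failures.append(f"gate 2 (data boundary): {source['name']} label not approved")

    # Gate 3: traces must include every required audit field.
    emitted = set(manifest.get("trace_fields", []))
    missing_trace = [f for f in REQUIRED_TRACE_FIELDS if f not in emitted]
    if missing_trace:
        failures.append(f"gate 3 (observability): missing {missing_trace}")

    return failures


def deploy(manifest: Dict, approved_labels: Set[str]) -> str:
    failures = evaluate_gates(manifest, approved_labels)
    if failures:
        # Fail, don't warn.
        raise RuntimeError("deployment blocked: " + "; ".join(failures))
    return "deployed"
```

A pipeline would call `deploy` as the last step before production, so a missing tag or an unlabeled source stops the release instead of generating a warning someone reads months later.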
1129
00:44:18,680 --> 00:44:21,720
You might be thinking, we can't afford to be that strict in week one,
1130
00:44:21,720 --> 00:44:22,840
we'll block innovation.
1131
00:44:22,840 --> 00:44:24,360
The reality is the opposite.
1132
00:44:24,360 --> 00:44:27,720
Pre-execution controls create a predictable surface for innovation.
1133
00:44:27,720 --> 00:44:29,240
Developers know the rules.
1134
00:44:29,240 --> 00:44:32,760
If they define identities correctly, choose labeled data sources,
1135
00:44:32,760 --> 00:44:35,560
and adopt the observability template, their agents ship.
1136
00:44:35,560 --> 00:44:37,240
If they cut corners, they don't.
1137
00:44:37,240 --> 00:44:40,280
That is infinitely more empowering than the world you're in now,
1138
00:44:40,280 --> 00:44:41,880
where teams build whatever they like,
1139
00:44:41,880 --> 00:44:44,120
only to have security show up months later,
1140
00:44:44,120 --> 00:44:47,240
with a list of violations and a vague threat to turn things off.
1141
00:44:47,240 --> 00:44:49,720
It also aligns with where regulation is going.
1142
00:44:49,720 --> 00:44:52,360
Most of the upcoming AI compliance requirements,
1143
00:44:52,360 --> 00:44:55,160
EU AI Act, sector guidance, internal audit trends,
1144
00:44:55,160 --> 00:44:57,160
are converging on the same question:
1145
00:44:57,160 --> 00:45:00,440
"Show me how you decided the system was safe before you deployed it."
1146
00:45:00,440 --> 00:45:02,200
Pre-execution gates give you that story.
1147
00:45:02,200 --> 00:45:03,800
You can point to the identity policy,
1148
00:45:03,800 --> 00:45:05,720
you can point to the purview evaluation,
1149
00:45:05,720 --> 00:45:07,640
you can point to the observability checklist.
1150
00:45:07,640 --> 00:45:11,000
You can say, nothing reaches production without passing these controls,
1151
00:45:11,000 --> 00:45:12,280
and that's the hinge.
1152
00:45:12,280 --> 00:45:14,760
Either you decide now that no agent in your estate
1153
00:45:14,760 --> 00:45:16,920
is allowed to run without passing those gates,
1154
00:45:16,920 --> 00:45:18,920
or you wait, let the factory spin up,
1155
00:45:18,920 --> 00:45:20,600
and discover your real architecture
1156
00:45:20,600 --> 00:45:23,080
in an incident report written by somebody else.
1157
00:45:23,080 --> 00:45:26,520
Every agent you allow today defines the incident report you'll read tomorrow.
1158
00:45:26,520 --> 00:45:29,400
Week one control plane checklist for Foundry agents.
1159
00:45:29,400 --> 00:45:31,640
Let me turn this from architecture into something
1160
00:45:31,640 --> 00:45:33,000
you can actually do in the first week.
1161
00:45:33,000 --> 00:45:34,280
You don't need a program,
1162
00:45:34,280 --> 00:45:35,960
you don't need a steering committee,
1163
00:45:35,960 --> 00:45:38,360
you need a checklist that changes what is allowed to exist.
1164
00:45:38,360 --> 00:45:41,240
Think of this as the minimum viable control plane for Foundry.
1165
00:45:41,240 --> 00:45:42,360
Start with ownership.
1166
00:45:42,360 --> 00:45:45,160
For every existing or proposed agent,
1167
00:45:45,160 --> 00:45:46,600
there has to be a named owner.
1168
00:45:46,600 --> 00:45:49,320
Not the AI team, not platform,
1169
00:45:49,320 --> 00:45:52,120
a real team or function you can find in the org chart.
1170
00:45:52,120 --> 00:45:54,200
Your first question is brutally simple.
1171
00:45:54,200 --> 00:45:56,840
Who is accountable for this agent's behavior?
1172
00:45:56,840 --> 00:45:59,000
If the answer is a person, you are brittle.
1173
00:45:59,000 --> 00:45:59,960
People leave.
1174
00:45:59,960 --> 00:46:01,560
If the answer is a vague group,
1175
00:46:01,560 --> 00:46:03,160
you have no accountability.
1176
00:46:03,160 --> 00:46:05,800
Pick a team and write it down somewhere security can see.
1177
00:46:05,800 --> 00:46:07,800
Then attach that ownership to identity.
1178
00:46:07,800 --> 00:46:10,440
In week one, define a pattern like this.
1179
00:46:10,440 --> 00:46:13,880
All production agents use dedicated workload identities in Entra.
1180
00:46:13,880 --> 00:46:16,280
Each of those identities is tagged with an owner team,
1181
00:46:16,280 --> 00:46:18,520
a business purpose, an environment tag,
1182
00:46:18,520 --> 00:46:21,000
and an expiry date. No tag, no production.
1183
00:46:21,000 --> 00:46:23,320
You don't need perfect automation on day one.
1184
00:46:23,320 --> 00:46:25,640
You can enforce this with naming conventions,
1185
00:46:25,640 --> 00:46:28,040
with tags, with a spreadsheet if you have to.
1186
00:46:28,040 --> 00:46:30,200
What you cannot do is let the next agent ship
1187
00:46:30,200 --> 00:46:33,160
under a generic automation account or a borrowed service principal
1188
00:46:33,160 --> 00:46:34,680
because it's just a pilot.
1189
00:46:34,680 --> 00:46:37,720
If you discover agents already running under those identities,
1190
00:46:37,720 --> 00:46:39,880
put them on a remediation list immediately,
1191
00:46:39,880 --> 00:46:41,080
not someday, now.
1192
00:46:41,080 --> 00:46:44,280
Next, bring Purview into the conversation
1193
00:46:44,280 --> 00:46:45,720
before anyone touches Foundry.
1194
00:46:45,720 --> 00:46:48,920
You establish one champion rule: no label, no agent.
1195
00:46:48,920 --> 00:46:51,000
Pick a minimal, opinionated label set
1196
00:46:51,000 --> 00:46:52,360
you actually trust:
1197
00:46:52,360 --> 00:46:54,520
internal, confidential, highly confidential,
1198
00:46:54,520 --> 00:46:56,520
regulated. Don't obsess over taxonomy purity.
1199
00:46:56,520 --> 00:46:58,840
You can refine later. Declare in writing:
1200
00:46:58,840 --> 00:47:00,760
only data with at least a baseline label
1201
00:47:00,760 --> 00:47:02,760
is eligible for autonomous processing
1202
00:47:02,760 --> 00:47:04,680
and only under explicit policy.
1203
00:47:04,680 --> 00:47:06,760
Then define, even roughly, which labels agents
1204
00:47:06,760 --> 00:47:08,360
can read and write in week one.
1205
00:47:08,360 --> 00:47:10,760
For example, agents may read internal and confidential.
1206
00:47:10,760 --> 00:47:12,520
Agents may not read highly confidential
1207
00:47:12,520 --> 00:47:14,520
or regulated without an exception process.
1208
00:47:14,520 --> 00:47:17,560
Agents only write to destinations classified at least internal.
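The example read and write rules can be written down as a tiny policy function. This is a sketch under assumptions: labels are lower-case strings, "internal" is the lowest label in the set, and the exception process is reduced to a boolean flag. It does not call Purview.

```python
from typing import Optional

# Week-one example policy from the text; label strings are assumed lower-case.
READABLE = {"internal", "confidential"}
EXCEPTION_ONLY = {"highly confidential", "regulated"}
KNOWN_LABELS = READABLE | EXCEPTION_ONLY


def can_read(label: Optional[str], has_exception: bool = False) -> bool:
    """Decide whether an agent may read data carrying this label."""
    if label not in KNOWN_LABELS:
        return False  # unlabeled or unknown label: no label, no agent
    if label in READABLE:
        return True
    return has_exception  # highly confidential / regulated need an approved exception


def can_write(destination_label: Optional[str]) -> bool:
    """'At least internal' means any known label, since internal is the lowest in this set."""
    return destination_label in KNOWN_LABELS
```

Unlabeled data falls through to a refusal in both directions, which is the bright line the rule is meant to draw.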
1209
00:47:17,560 --> 00:47:19,240
You're not solving every edge case.
1210
00:47:19,240 --> 00:47:20,680
You're drawing a bright line.
1211
00:47:20,680 --> 00:47:22,280
Below this line, we experiment.
1212
00:47:22,280 --> 00:47:23,400
Above this line, we don't.
1213
00:47:23,400 --> 00:47:24,920
Now tie this to deployment.
1214
00:47:24,920 --> 00:47:27,240
Whatever process leads to a Foundry deployment in your world,
1215
00:47:27,240 --> 00:47:30,280
CI/CD, a ticket, a manual step in a portal,
1216
00:47:30,280 --> 00:47:33,320
add three questions that must be answered before it succeeds.
1217
00:47:33,320 --> 00:47:35,720
Does this agent use a dedicated workload identity
1218
00:47:35,720 --> 00:47:36,600
with an owner tag?
1219
00:47:36,600 --> 00:47:39,640
Do all declared data sources have Purview sensitivity labels?
1220
00:47:39,640 --> 00:47:43,000
Has someone asserted which labels this agent is allowed to read and write?
1221
00:47:43,000 --> 00:47:45,240
If any answer is no, deployment stops.
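Those three questions can be captured as a record that a pipeline step or a change-request form fills in. A rough sketch follows; the DeploymentRequest type and its field names are invented for illustration and are not part of any Foundry tooling.

```python
from dataclasses import dataclass


@dataclass
class DeploymentRequest:
    """Hypothetical record of the three pre-deployment answers for one agent."""
    agent_name: str
    has_dedicated_identity_with_owner: bool
    all_sources_labeled: bool
    label_policy_asserted: bool


def blocking_reasons(req: DeploymentRequest) -> list[str]:
    """Return every reason the deployment must stop; an empty list means it may proceed."""
    reasons = []
    if not req.has_dedicated_identity_with_owner:
        reasons.append("no dedicated workload identity with an owner tag")
    if not req.all_sources_labeled:
        reasons.append("not every declared data source has a sensitivity label")
    if not req.label_policy_asserted:
        reasons.append("no one has asserted which labels this agent may read and write")
    return reasons
```

Returning the full list of reasons, not just the first failure, gives the requesting team everything to fix in one pass.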
1222
00:47:45,240 --> 00:47:48,520
This can be as crude as a mandatory checklist on a change request.
1223
00:47:48,520 --> 00:47:51,480
It can be enforced by a script that queries Entra and Purview.
1224
00:47:51,480 --> 00:47:53,160
The mechanism doesn't matter in week one.
1225
00:47:53,160 --> 00:47:56,440
The non-negotiable part is that you remove plausible deniability.
1226
00:47:56,440 --> 00:47:58,200
No one gets to say, "We didn't think about that."
1227
00:47:58,200 --> 00:48:00,280
And then address observability.
1228
00:48:00,280 --> 00:48:02,440
You won't build perfect tracing in seven days.
1229
00:48:02,440 --> 00:48:03,880
You can refuse to run blind.
1230
00:48:03,880 --> 00:48:06,440
For every production agent, ask three more questions.
1231
00:48:06,440 --> 00:48:07,640
Where do its traces live?
1232
00:48:07,640 --> 00:48:10,680
What is the minimum set of facts we can reconstruct for each run?
1233
00:48:10,680 --> 00:48:13,720
Who is responsible for looking at those traces when something goes wrong?
1234
00:48:13,720 --> 00:48:16,360
If the answer to the first is "we don't trace this,"
1235
00:48:16,360 --> 00:48:18,680
that agent has no business in production.
1236
00:48:18,680 --> 00:48:21,080
If the answer to the third is no one,
1237
00:48:21,080 --> 00:48:22,520
assign ownership now.
1238
00:48:22,520 --> 00:48:24,360
Even if the answer is as crude as
1239
00:48:24,360 --> 00:48:26,760
the platform team owns all agent traces.
1240
00:48:26,760 --> 00:48:28,840
Next, define exit.
1241
00:48:28,840 --> 00:48:32,040
Every agent you allow into production needs a shutdown story.
1242
00:48:32,040 --> 00:48:34,040
Write a single sentence per agent.
1243
00:48:34,040 --> 00:48:36,760
This agent stops existing when X is true.
1244
00:48:36,760 --> 00:48:39,800
X might be, this project ends, this product is retired,
1245
00:48:39,800 --> 00:48:42,440
this queue is decommissioned, this owner team changes.
1246
00:48:42,440 --> 00:48:43,560
Add one more line.
1247
00:48:43,560 --> 00:48:47,320
When X happens, Y disables the identity and decommissions the agent definition.
1248
00:48:47,320 --> 00:48:49,800
Y is a real team, not somebody in IT.
1249
00:48:49,800 --> 00:48:52,360
Finally, bake in a habit of hunting for the unknown.
1250
00:48:52,360 --> 00:48:55,320
Run a basic inventory. List all Entra app registrations
1251
00:48:55,320 --> 00:48:58,120
and service principals tagged for automation or AI.
1252
00:48:58,120 --> 00:49:00,760
Cross-check with what you believe are your Foundry projects.
1253
00:49:00,760 --> 00:49:04,360
Flag anything with no clear mapping to a current owner or business process.
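The cross-check itself is little more than a filter over an exported list. A sketch, assuming you have already exported identities as dictionaries with hypothetical "name", "owner", and "project" keys, plus a set of project names you believe are real:

```python
def unexplained_identities(identities: list[dict], known_projects: set[str]) -> list[str]:
    """Return names of identities with no owner or no mapping to a known project."""
    flagged = []
    for ident in identities:
        has_owner = bool(ident.get("owner"))
        has_project = ident.get("project") in known_projects
        if not (has_owner and has_project):
            flagged.append(ident["name"])
    return sorted(flagged)
```

The output is the remediation queue: the identities nobody can currently explain.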
1254
00:49:04,360 --> 00:49:05,880
You won't fix it all in week one.
1255
00:49:05,880 --> 00:49:09,000
You're building a queue of agents and identities we don't understand yet.
1256
00:49:09,000 --> 00:49:12,280
That queue is your early warning system for autonomous shadow IT.
1257
00:49:12,280 --> 00:49:14,920
So your week one control plane checklist is this.
1258
00:49:14,920 --> 00:49:18,440
No agent without a dedicated workload identity and an owner team.
1259
00:49:18,440 --> 00:49:21,320
No unlabeled data in scope for autonomous access.
1260
00:49:21,320 --> 00:49:24,040
No deployment without an explicit label policy for that agent.
1261
00:49:24,040 --> 00:49:27,880
No production agent without at least minimal traces and a place to store them.
1262
00:49:27,880 --> 00:49:31,800
No agent without a defined exit condition and a named shutdown actor.
1263
00:49:31,800 --> 00:49:35,400
A standing task to hunt for identities and agents that sit outside those rules.
1264
00:49:35,400 --> 00:49:36,840
You won't win awards for this.
1265
00:49:36,840 --> 00:49:37,960
You won't inspire anybody.
1266
00:49:37,960 --> 00:49:40,360
You'll do something more useful at this stage.
1267
00:49:40,360 --> 00:49:43,560
You'll be the reason the first Foundry incident in your tenant
1268
00:49:43,560 --> 00:49:47,400
is a near miss instead of the case study someone else presents at a conference
1269
00:49:47,400 --> 00:49:50,680
with your name redacted and your architecture on every slide.
1270
00:49:50,680 --> 00:49:54,840
Anticipating enterprise drift, shadow IT, and regulatory pressure.
1271
00:49:54,840 --> 00:49:58,040
Everything up to now has been inside one tenant and one platform
1272
00:49:58,040 --> 00:50:00,280
but enterprises don't drift one agent at a time.
1273
00:50:00,280 --> 00:50:01,480
They drift as a system.
1274
00:50:01,480 --> 00:50:05,240
So I want you to zoom out and look at the next 12 to 24 months of your estate
1275
00:50:05,240 --> 00:50:08,920
through two lenses: next-generation shadow IT and regulatory pressure
1276
00:50:08,920 --> 00:50:11,160
that is going to land whether you are ready or not.
1277
00:50:11,160 --> 00:50:14,520
Start with shadow IT because that's where entropy always shows up first.
1278
00:50:14,520 --> 00:50:17,800
Shadow IT used to mean somebody swiping a credit card for SaaS.
1279
00:50:17,800 --> 00:50:21,800
A rogue CRM, an unsanctioned file share, things that at least had a URL
1280
00:50:21,800 --> 00:50:23,240
you could eventually discover.
1281
00:50:23,240 --> 00:50:26,840
Shadow AI is the same pattern just faster and harder to see.
1282
00:50:26,840 --> 00:50:29,240
Teams can now spin up agents in multiple places.
1283
00:50:29,240 --> 00:50:32,520
Foundry, Copilot Studio, third-party orchestrators,
1284
00:50:32,520 --> 00:50:36,760
SaaS products that quietly added agent features to their enterprise tier.
1285
00:50:36,760 --> 00:50:39,160
Each of those agents can be wired into your core systems
1286
00:50:39,160 --> 00:50:40,840
through perfectly legitimate connectors.
1287
00:50:40,840 --> 00:50:44,440
Graph, Dataverse, Exchange, ServiceNow, custom internal APIs.
1288
00:50:44,440 --> 00:50:46,440
From a network and identity perspective,
1289
00:50:46,440 --> 00:50:48,040
none of this looks obviously hostile.
1290
00:50:48,040 --> 00:50:50,520
Tokens are issued, TLS is green.
1291
00:50:50,520 --> 00:50:54,360
Permissions were granted through the same consent flows you use for everything else.
1292
00:50:54,360 --> 00:50:57,640
What you experience over time is not one catastrophic decision.
1293
00:50:57,640 --> 00:50:59,480
It's decentralized AI risk.
1294
00:50:59,480 --> 00:51:02,120
A sales org pilots an agent to enrich opportunities.
1295
00:51:02,120 --> 00:51:05,160
Support builds their own triage agent in a different platform.
1296
00:51:05,160 --> 00:51:08,040
Finance adopts a vendor-hosted reconciliation agent.
1297
00:51:08,040 --> 00:51:10,280
Security experiments with an incident response agent.
1298
00:51:10,280 --> 00:51:12,440
Every local team can justify their own move.
1299
00:51:12,440 --> 00:51:13,800
None of them share a control plane.
1300
00:51:13,800 --> 00:51:15,320
So when someone asks,
1301
00:51:15,320 --> 00:51:18,920
how many agents do we have that can touch regulated customer data
1302
00:51:18,920 --> 00:51:20,280
and under what policies?
1303
00:51:20,280 --> 00:51:23,160
Your honest answer is, we know where some of them are.
1304
00:51:23,160 --> 00:51:25,400
That's next-generation shadow IT.
1305
00:51:25,400 --> 00:51:30,120
Not just unknown apps, but unknown autonomous behaviors over known systems.
1306
00:51:30,120 --> 00:51:31,960
Now layer regulation on top of that.
1307
00:51:31,960 --> 00:51:35,400
Regulators don't care whether you call something an app, a bot, or an agent.
1308
00:51:35,400 --> 00:51:37,640
They care what impact it has on real people.
1309
00:51:37,640 --> 00:51:40,120
Most of the incoming AI requirements,
1310
00:51:40,120 --> 00:51:42,040
whether you look at the EU AI Act,
1311
00:51:42,040 --> 00:51:44,920
sector-specific guidance or internal audit trends,
1312
00:51:44,920 --> 00:51:46,600
converge on a few simple expectations.
1313
00:51:46,600 --> 00:51:49,080
You know where AI is in your estate.
1314
00:51:49,080 --> 00:51:50,760
You know what data it touches.
1315
00:51:50,760 --> 00:51:53,480
You can explain how you decided it was allowed to do that.
1316
00:51:53,480 --> 00:51:55,640
You can produce logs when something goes wrong.
1317
00:51:55,640 --> 00:51:57,640
And for anything high risk, you can turn it off.
1318
00:51:57,640 --> 00:51:59,080
High risk isn't a marketing label.
1319
00:51:59,080 --> 00:52:01,480
It's anything that can affect someone's livelihood,
1320
00:52:01,480 --> 00:52:04,360
credit, health, freedom or access to services.
1321
00:52:04,360 --> 00:52:07,800
That includes plenty of the agents your business is already dreaming about.
1322
00:52:07,800 --> 00:52:09,560
If your Foundry adoption pattern is
1323
00:52:09,560 --> 00:52:11,800
"we'll prototype first and wrap governance later,"
1324
00:52:11,800 --> 00:52:14,360
you're engineering the worst possible combination:
1325
00:52:14,360 --> 00:52:18,120
decentralized AI risk with centralized accountability.
1326
00:52:18,120 --> 00:52:20,360
Because when the letter arrives from a regulator,
1327
00:52:20,360 --> 00:52:23,080
from an internal audit committee or from a customer's legal team,
1328
00:52:23,080 --> 00:52:24,600
it will not be addressed to the team
1329
00:52:24,600 --> 00:52:26,600
that hacked together the first agent.
1330
00:52:26,600 --> 00:52:29,960
It will be addressed to whoever owns security, compliance, or the platform.
1331
00:52:29,960 --> 00:52:32,600
This is where AI entropy becomes a governance problem,
1332
00:52:32,600 --> 00:52:34,040
not just a technical one.
1333
00:52:34,040 --> 00:52:37,560
Without a control plane, your estate tends toward opaque complexity.
1334
00:52:37,560 --> 00:52:38,920
Not because anyone is malicious,
1335
00:52:38,920 --> 00:52:40,600
but because every local optimization,
1336
00:52:40,600 --> 00:52:44,200
every quick agent adds another edge to a graph nobody is drawing.
1337
00:52:44,200 --> 00:52:46,360
After a while, no one can answer basic questions,
1338
00:52:46,360 --> 00:52:48,840
which agents exist, which identities do they use,
1339
00:52:48,840 --> 00:52:51,560
which labels can they touch, who owns them, how do they die?
1340
00:52:51,560 --> 00:52:54,680
And regulators are getting less interested in your intentions
1341
00:52:54,680 --> 00:52:56,120
and more interested in your answers.
1342
00:52:56,120 --> 00:52:57,400
Five years from now,
1343
00:52:57,400 --> 00:53:00,120
when people look back at Foundry vs. Power Platform,
1344
00:53:00,120 --> 00:53:03,000
they won't ask how quickly you enabled AI.
1345
00:53:03,000 --> 00:53:05,480
They'll look at the first incidents and ask one thing.
1346
00:53:05,480 --> 00:53:08,040
Did you let autonomous execution into production
1347
00:53:08,040 --> 00:53:09,320
before you had a way to see it,
1348
00:53:09,320 --> 00:53:11,080
bound it and prove you were in control?
1349
00:53:11,080 --> 00:53:12,600
If the answer is yes,
1350
00:53:12,600 --> 00:53:14,920
it won't matter that the tools were powerful
1351
00:53:14,920 --> 00:53:16,600
or that your teams were moving fast.
1352
00:53:16,600 --> 00:53:19,320
It will look like every other governance failure pattern
1353
00:53:19,320 --> 00:53:20,680
in Microsoft history,
1354
00:53:20,680 --> 00:53:21,960
SharePoint, Power Apps,
1355
00:53:21,960 --> 00:53:24,600
Teams, just compressed into a shorter time frame
1356
00:53:24,600 --> 00:53:26,360
with agents instead of forms.
1357
00:53:26,360 --> 00:53:28,200
And the first public Foundry-style incidents
1358
00:53:28,200 --> 00:53:29,400
won't even be labeled correctly.
1359
00:53:29,400 --> 00:53:31,960
They'll be called unexpected data access,
1360
00:53:31,960 --> 00:53:34,440
AI misbehavior, configuration error.
1361
00:53:34,440 --> 00:53:35,880
On paper, that will be true.
1362
00:53:35,880 --> 00:53:37,880
On a risk register, it will be something else:
1363
00:53:37,880 --> 00:53:39,400
Governance arriving too late.
1364
00:53:39,400 --> 00:53:40,360
The prediction,
1365
00:53:40,360 --> 00:53:42,040
how this will officially fail.
1366
00:53:42,040 --> 00:53:44,120
I want to end by telling you how this is actually going to look
1367
00:53:44,120 --> 00:53:46,040
on paper when it finally breaks,
1368
00:53:46,040 --> 00:53:47,640
because it won't show up in the incident report
1369
00:53:47,640 --> 00:53:49,400
the way we've been talking about it here.
1370
00:53:49,400 --> 00:53:50,600
Nobody is going to write,
1371
00:53:50,600 --> 00:53:52,760
we allowed autonomous execution into production
1372
00:53:52,760 --> 00:53:54,280
before we had a control plane.
1373
00:53:54,280 --> 00:53:56,360
You're going to see three familiar phrases instead.
1374
00:53:56,360 --> 00:53:58,920
Unexpected data access.
1375
00:53:58,920 --> 00:54:00,920
AI produced an incorrect output,
1376
00:54:00,920 --> 00:54:03,480
configuration issue in an automation component.
1377
00:54:03,480 --> 00:54:05,720
If regulators are involved, the language will tighten.
1378
00:54:05,720 --> 00:54:06,840
The pattern won't change.
1379
00:54:06,840 --> 00:54:07,720
On day zero,
1380
00:54:07,720 --> 00:54:09,560
someone inside the business notices something
1381
00:54:09,560 --> 00:54:11,160
that simply doesn't feel right.
1382
00:54:11,160 --> 00:54:13,160
A summary email that includes HR details
1383
00:54:13,160 --> 00:54:15,640
in a place where HR data has never appeared.
1384
00:54:15,640 --> 00:54:18,440
A customer communication that references billing status,
1385
00:54:18,440 --> 00:54:20,440
no one thought the sender could see.
1386
00:54:20,440 --> 00:54:22,600
A report that combines data from two systems
1387
00:54:22,600 --> 00:54:24,680
that on paper are segregated.
1388
00:54:24,680 --> 00:54:26,680
On day one, security pulls logs.
1389
00:54:26,680 --> 00:54:28,040
They do the obvious analysis.
1390
00:54:28,040 --> 00:54:29,480
Was there an external attacker?
1391
00:54:29,480 --> 00:54:30,600
Any sign-in anomalies?
1392
00:54:30,600 --> 00:54:32,680
Any tokens from untrusted locations?
1393
00:54:32,680 --> 00:54:34,280
Everything comes back clean.
1394
00:54:34,280 --> 00:54:36,440
Every call came from trusted identities
1395
00:54:36,440 --> 00:54:38,920
through approved connectors over encrypted channels.
1396
00:54:38,920 --> 00:54:41,800
No sign of exfiltration, no brute force, no malware.
1397
00:54:41,800 --> 00:54:44,280
The preliminary conclusion.
1398
00:54:44,280 --> 00:54:46,520
No evidence of external compromise.
1399
00:54:46,520 --> 00:54:49,800
On day two, someone finally asks the right question.
1400
00:54:49,800 --> 00:54:51,400
Could this have been an agent?
1401
00:54:51,400 --> 00:54:54,840
They trace the path and discover that, yes, an agent,
1402
00:54:54,840 --> 00:54:56,120
sometimes built in Foundry,
1403
00:54:56,120 --> 00:54:58,040
sometimes living in a neighboring platform,
1404
00:54:58,040 --> 00:54:59,640
was the one that assembled the output.
1405
00:54:59,640 --> 00:55:01,400
It has a reassuring name: assistant,
1406
00:55:01,400 --> 00:55:03,160
copilot, triagebot.
1407
00:55:03,160 --> 00:55:05,160
On day three, they pull the Entra object.
1408
00:55:05,160 --> 00:55:07,400
They realize the identity behind that agent
1409
00:55:07,400 --> 00:55:10,120
has broader permissions than anyone remembers granting,
1410
00:55:10,120 --> 00:55:12,040
or that it's a shared automation account
1411
00:55:12,040 --> 00:55:14,600
that has quietly accumulated rights over time,
1412
00:55:14,600 --> 00:55:17,480
or that it's a workload identity with no current owner.
1413
00:55:17,480 --> 00:55:18,760
Now the story shifts.
1414
00:55:18,760 --> 00:55:20,680
This wasn't AI going rogue.
1415
00:55:20,680 --> 00:55:23,080
This was a perfectly authenticated identity
1416
00:55:23,080 --> 00:55:25,320
doing exactly what the platform allowed it to do.
1417
00:55:25,320 --> 00:55:26,920
On day four, they map the data.
1418
00:55:26,920 --> 00:55:28,520
They see that multiple sources,
1419
00:55:28,520 --> 00:55:30,280
each individually compliant,
1420
00:55:30,280 --> 00:55:32,680
were combined in a way that violates policy intent.
1421
00:55:32,680 --> 00:55:35,800
HR notes plus customer PII, plus finance records,
1422
00:55:35,800 --> 00:55:37,640
merged into a single narrative.
1423
00:55:37,640 --> 00:55:40,120
Or internal risk ratings, plus external credit data,
1424
00:55:40,120 --> 00:55:41,480
plus operational logs,
1425
00:55:41,480 --> 00:55:45,000
surfaced in a context that was never supposed to see all three.
1426
00:55:45,000 --> 00:55:47,720
At that point, the narrative is already being sanitized.
1427
00:55:47,720 --> 00:55:49,880
The draft root cause statement will talk about
1428
00:55:49,880 --> 00:55:52,680
a misapplied permission on an automation identity,
1429
00:55:52,680 --> 00:55:54,360
an overly broad connector scope,
1430
00:55:54,360 --> 00:55:57,160
a lack of testing of AI behavior in edge cases.
1431
00:55:57,160 --> 00:55:59,640
There might be a sentence about insufficient guardrails
1432
00:55:59,640 --> 00:56:01,720
or incomplete DLP coverage.
1433
00:56:01,720 --> 00:56:04,520
You will see a remediation section that sounds mature.
1434
00:56:04,520 --> 00:56:06,760
We will enhance governance for AI.
1435
00:56:06,760 --> 00:56:09,320
We will improve monitoring of agent behavior.
1436
00:56:09,320 --> 00:56:12,200
We will tighten access policies around automation accounts.
1437
00:56:12,200 --> 00:56:15,880
We will provide additional training on configuration best practices.
1438
00:56:15,880 --> 00:56:17,320
All of that will be true.
1439
00:56:17,320 --> 00:56:18,920
None of it will be the actual cause.
1440
00:56:18,920 --> 00:56:21,080
Because the real cause will be upstream,
1441
00:56:21,080 --> 00:56:24,040
you allowed agents to execute in production before identity,
1442
00:56:24,040 --> 00:56:26,680
data boundaries and observability were treated as gates
1443
00:56:26,680 --> 00:56:28,920
instead of after-the-fact annotations.
1444
00:56:28,920 --> 00:56:30,440
You let the factory ship product
1445
00:56:30,440 --> 00:56:32,360
before you finished building the breakers.
1446
00:56:32,360 --> 00:56:35,320
Most organizations will frame this as a misconfiguration story
1447
00:56:35,320 --> 00:56:36,920
because misconfiguration is comfortable.
1448
00:56:36,920 --> 00:56:38,600
It sounds like an error, not a design.
1449
00:56:38,600 --> 00:56:40,520
"We misconfigured a permission" implies the model
1450
00:56:40,520 --> 00:56:42,360
was sound and the human slipped.
1451
00:56:42,360 --> 00:56:45,240
What actually happened is that governance arrived too late by design.
1452
00:56:45,240 --> 00:56:48,040
You will see that pattern replayed with minor variations.
1453
00:56:48,680 --> 00:56:52,040
Sometimes the headline will emphasize AI hallucination
1454
00:56:52,040 --> 00:56:56,280
even when the hallucination happened entirely inside the boundaries you drew.
1455
00:56:56,280 --> 00:56:58,680
Sometimes it will emphasize vendor behavior
1456
00:56:58,680 --> 00:57:01,960
even when the vendor only did what your identity and data model allowed.
1457
00:57:01,960 --> 00:57:03,400
And here's the uncomfortable part.
1458
00:57:03,400 --> 00:57:06,760
From the outside, two organizations will look very different after this wave.
1459
00:57:06,760 --> 00:57:09,400
One will have a shorter, less painful set of incidents.
1460
00:57:09,400 --> 00:57:11,960
They'll be held up in case studies as more mature,
1461
00:57:11,960 --> 00:57:14,680
more responsible, ahead on AI governance.
1462
00:57:14,680 --> 00:57:16,920
The other will spend years backfilling documentation,
1463
00:57:16,920 --> 00:57:20,280
retrofitting policies, and explaining to regulators why an agent
1464
00:57:20,280 --> 00:57:24,200
no one can fully describe was able to touch data nobody remembers approving.
1465
00:57:24,200 --> 00:57:25,960
From the inside, the difference will be simple.
1466
00:57:25,960 --> 00:57:29,160
The lucky orgs treated agents as production workloads on day one.
1467
00:57:29,160 --> 00:57:33,800
They enforced: no owner, no execution; no label, no agent; no audit path, no production.
1468
00:57:33,800 --> 00:57:36,360
The others told themselves a more optimistic story.
1469
00:57:36,360 --> 00:57:37,800
We'll let teams experiment.
1470
00:57:37,800 --> 00:57:39,240
We'll monitor and adjust.
1471
00:57:39,240 --> 00:57:41,880
We'll add governance when we see what people actually build.
1472
00:57:41,880 --> 00:57:46,600
They woke up to find that what people built had quietly become part of how the business operates
1473
00:57:46,600 --> 00:57:49,400
and taking it away hurt more than leaving it ungoverned.
1474
00:57:49,400 --> 00:57:51,720
That's the shape of this failure when it finally lands,
1475
00:57:51,720 --> 00:57:53,160
not a dramatic AI catastrophe.
1476
00:57:53,160 --> 00:57:56,200
A series of small, plausible incidents that all reduce
1477
00:57:56,200 --> 00:57:58,360
in hindsight to the same decision.
1478
00:57:58,360 --> 00:58:00,840
Execution was allowed before governance existed.
1479
00:58:00,840 --> 00:58:03,800
And once you see it that way, the prediction writes itself.
1480
00:58:03,800 --> 00:58:07,320
Most Foundry-related incidents will be officially classified as
1481
00:58:07,320 --> 00:58:10,680
AI misconfiguration, unexpected data access,
1482
00:58:10,680 --> 00:58:12,680
process gap in automation governance.
1483
00:58:12,680 --> 00:58:14,920
But inside your own post-mortem, if you're honest,
1484
00:58:14,920 --> 00:58:17,640
the line you'll write for yourself will be shorter.
1485
00:58:17,640 --> 00:58:19,960
We let the factory run without a control plane
1486
00:58:19,960 --> 00:58:21,560
and it did exactly what factories do.
1487
00:58:21,560 --> 00:58:24,680
The choice point. Foundry is not an AI feature.
1488
00:58:24,680 --> 00:58:25,800
It is an agent factory.
1489
00:58:25,800 --> 00:58:28,760
And factories without control planes don't produce value.
1490
00:58:28,760 --> 00:58:30,520
They produce shadow IT.
1491
00:58:30,520 --> 00:58:32,920
If you remember nothing else, remember this.
1492
00:58:32,920 --> 00:58:36,680
If an agent can execute before identity, data boundary,
1493
00:58:36,680 --> 00:58:39,880
and observability are in place, governance has already failed.
1494
00:58:39,880 --> 00:58:42,680
Every agent you allow today defines the incident report
1495
00:58:42,680 --> 00:58:43,640
you'll read tomorrow.
1496
00:58:43,640 --> 00:58:47,880
You can decide now that no agent in your tenant runs without an owner,
1497
00:58:47,880 --> 00:58:50,520
a labeled boundary, and an audit path.
1498
00:58:50,520 --> 00:58:53,240
Or you can wait and learn those boundaries from an external report
1499
00:58:53,240 --> 00:58:55,400
with your architecture on every slide.
1500
00:58:55,400 --> 00:58:57,480
If you want more unapologetic failure analysis
1501
00:58:57,480 --> 00:59:00,520
before it shows up in your inbox, stay with this series.