Dec. 6, 2025

MCP & Semantic Kernel: Building AI Agents That Take Action, Not Just Chat

You’re wasting AI on small talk. In this session I show you how to turn chatty models into hardened IT ops agents that actually fix incidents while you sleep. We wire Semantic Kernel, MCP, Microsoft Graph and Azure OpenAI with managed identity so agents can plan, act and auto-verify – without handing root access to a hallucinating chatbot.

You’ll see how to slash MTTR, auto-resolve password reset tickets, drain bad builds, and roll back safely using tool schemas as “laws of physics,” not vibes. We’ll build a six-part agent molecule (persona, memory, planner, tools, policy, verifier) and drop it into real incident flows: 5XX spikes, canary failures, onboarding waves and weekend fire drills.

If you care about uptime, sleep, and not turning your data center into glass, this is your blueprint: SK orchestrates, MCP connects, Foundry governs, managed identity contains – and your agents prove every action they take.

Ah! You’re wasting AI on small talk. Pure power trapped in chit-chat.
In this episode, we break open the containment field and show you how to turn AI from a polite conversationalist into a fully-acting IT Operations agent: one that plans, executes, verifies, and stays inside governance at all times. You’ll learn exactly how modern enterprise teams are using Semantic Kernel, MCP, and Azure OpenAI tool-calling with Managed Identity to auto-remediate incidents, reduce MTTR, eliminate hundreds of service desk tickets, and create predictable, auditable workflows. This isn’t theory. It’s the blueprint.

🎯 Episode Focus: From Answering to Acting

Traditional chatbots whisper advice. Acting agents do the work.
We explore the shift from static Q&A loops to a closed-loop cycle:

Intention → Plan → Tool Use → Result → Self-Check → Next Step

Learn why this pattern unlocks automation in Microsoft environments without sacrificing safety, compliance, or observability.

Micro-Story: A real SRE team wired an agent to monitor high CPU alerts, correlate with deployments, drain faulty nodes, roll back the slot, and post an incident summary, all before the human even rolled out of bed.
Not magic. Orchestration.

🔌 Why Microsoft Shops Win Big: MCP + SK + Managed Identity

Three components snap together and give you enterprise-grade capability:

🔧 MCP (Model Context Protocol): The Wiring

  • Tools describe themselves with standards and schemas
  • Microsoft Graph, Intune, Service Health, internal APIs become discoverable
  • No brittle plugins or secret adapters
  • Add new capabilities without redeploying anything
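
To make "tools describe themselves" concrete, here is a minimal sketch in plain Python. It is not the MCP SDK. The field names loosely mirror the name, description and input schema that an MCP tool listing carries, and the tiny validator is a hypothetical stand-in for a real JSON Schema library. The tool name and the ceiling of 50 come from the episode; everything else is an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDescriptor:
    """Hypothetical self-describing tool record (not an SDK type)."""
    name: str
    description: str
    input_schema: dict


DRAIN_SUBSET_BY_BUILD = ToolDescriptor(
    name="DrainSubsetByBuild",
    description="Reduce traffic to nodes carrying a build tag. Reversible; no approval required.",
    input_schema={
        "type": "object",
        "properties": {
            "build_tag": {"type": "string"},
            # The hard ceiling lives in the schema, not in the prompt.
            "percentage": {"type": "integer", "minimum": 1, "maximum": 50},
        },
        "required": ["build_tag", "percentage"],
    },
)


def validate_args(tool, args):
    """Return a list of violations; an empty list means the call may proceed."""
    schema = tool.input_schema
    errors = [f"missing required parameter: {k}" for k in schema["required"] if k not in args]
    for key, value in args.items():
        spec = schema["properties"].get(key)
        if spec is None:
            errors.append(f"unknown parameter: {key}")
        elif spec["type"] == "integer":
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                errors.append(f"{key} must be an integer")
            elif "minimum" in spec and value < spec["minimum"]:
                errors.append(f"{key} below minimum {spec['minimum']}")
            elif "maximum" in spec and value > spec["maximum"]:
                errors.append(f"{key} above maximum {spec['maximum']}")
        elif spec["type"] == "string" and not isinstance(value, str):
            errors.append(f"{key} must be a string")
    return errors
```

A request to drain 80% is rejected by the schema check before any tool code runs, no matter how the prompt was worded.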

MCP makes your tools visible.

🧠 Semantic Kernel: The Orchestration Layer

  • Turns MCP tools into callable kernel functions
  • Handles planning: sequential, parallel, or graph-shaped tasks
  • Auto-builds JSON schemas models expect
  • Removes the need for hand-crafted payloads
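
What "auto-builds JSON schemas" means in practice can be sketched with nothing but the standard library. This is not Semantic Kernel's API. It is a hypothetical helper that derives a function-calling style schema from a Python signature, which is roughly the chore an orchestration layer takes off your hands. The `rollback_slot` function and its parameters are assumptions for illustration.

```python
import inspect
from typing import get_type_hints

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def rollback_slot(slot: str, change_id: str, approval_token: str) -> str:
    """Roll a deployment slot back to the previous build. High risk; approval required."""
    return f"rolled back {slot} ({change_id})"


def to_function_schema(func):
    """Derive a function-calling style schema from a type-annotated function."""
    hints = get_type_hints(func)
    hints.pop("return", None)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": _JSON_TYPES[hints[name]]}
        # Parameters without a default value are mandatory.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }
```

Calling `to_function_schema(rollback_slot)` yields a dictionary whose `parameters.required` lists all three arguments, so the model cannot omit the approval token without the call failing validation.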

SK shapes the plan and the calls.

🔐 Azure OpenAI + Managed Identity: The Containment Field

  • Model decides what, identity decides what’s allowed
  • Tokens are never exposed
  • Each action is access-controlled at the tool boundary
  • High-risk actions require approval tokens
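
The split between "model decides what" and "identity decides what's allowed" can be sketched as a fail-closed authorization check at the tool boundary. Everything here is an assumption for illustration: the role names, the demo signing key, and the HMAC-based approval token stand in for real managed identity role assignments and a real approval service.

```python
import hashlib
import hmac

# Hypothetical mapping: the role each tool demands from the calling identity.
TOOL_ROLES = {
    "AppInsightsQuery": "telemetry.read",
    "DrainSubsetByBuild": "traffic.write",
    "RollbackSlot": "deploy.highrisk",
}
HIGH_RISK = {"RollbackSlot"}
# Assumption: in practice this key belongs to the approval service, never to the agent.
APPROVAL_KEY = b"demo-only-key"


def sign_approval(tool, change_id, key=APPROVAL_KEY):
    """Issue an approval token bound to one tool and one change ID."""
    return hmac.new(key, f"{tool}:{change_id}".encode(), hashlib.sha256).hexdigest()


def authorize(principal_roles, tool, change_id=None, approval_token=None):
    """Return (allowed, reason). Anything not explicitly permitted fails closed."""
    required = TOOL_ROLES.get(tool)
    if required is None:
        return False, "unknown tool"
    if required not in principal_roles:
        return False, f"identity lacks role {required}"
    if tool in HIGH_RISK:
        if not change_id or not approval_token:
            return False, "approval token required"
        expected = sign_approval(tool, change_id)
        # Constant-time comparison avoids leaking token contents via timing.
        if not hmac.compare_digest(expected, approval_token):
            return False, "approval token invalid"
    return True, "ok"
```

Because the token is bound to the tool name and change ID, an approval granted for one rollback cannot be replayed for another.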

Identity contains the blast radius.

🧬 The Six-Part Agent Molecule: Build Stable, Reliable Agents

A high-functioning IT Ops agent is built from a six-part molecule:

  1. Persona — SRE temperament encoded (cautious, concise, safety-first).
  2. Memory — Short-term context + durable environmental facts.
  3. Planner — Decomposes tasks into safe, verifiable steps.
  4. Tools — MCP-exposed actuators and sensors.
  5. Policy — Identity controls, approvals, guardrails.
  6. Verifier — Post-action checks: metrics, probes, risk state.

Miss one of these parts and your agent becomes unpredictable.

⚙ Blueprint 1: SK Planner + Graph via MCP (IT Ops)

We walk through a concrete pattern for post-deployment error spikes:

Goal: Recover from elevated 5xx while minimizing blast radius.

Tools (via MCP):

  • AppInsightsQuery
  • GraphServiceHealth
  • GraphChangeLog
  • DrainSubsetByBuild
  • RollbackSlot
  • PostIncidentNote

Plan:

  1. Assess: Query metrics, deployments, health advisories (parallel).
  2. Decide: Pick the narrowest safe fix—e.g., drain a bad build subset.
  3. Act: Perform drainage or rollback with identity-scoped tools.
  4. Verify: Require P95 + 5xx improvement before declaring success.
  5. Report: Summaries, graphs, dashboards, change IDs.
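
The Assess and Decide steps above can be sketched as a parallel fan-out followed by a narrow-first choice. The three stub functions stand in for the MCP-exposed tools and return invented sample data. The build tag, the decision rules, and the "escalate on advisory" branch are assumptions for illustration, not the episode's exact logic.

```python
from concurrent.futures import ThreadPoolExecutor


# Stubs standing in for MCP tool calls; return values are invented sample data.
def app_insights_query(service):
    return {"error_rate_5xx": 0.07, "bad_build_tags": ["b-hypothetical"]}


def graph_change_log(service):
    return {"recent_builds": ["b-hypothetical"]}


def graph_service_health(service):
    return {"advisories": []}


def assess(service):
    """Run the three read-only probes concurrently and collect their results."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {
            "metrics": pool.submit(app_insights_query, service),
            "changes": pool.submit(graph_change_log, service),
            "health": pool.submit(graph_service_health, service),
        }
        return {name: future.result() for name, future in futures.items()}


def decide(findings):
    """Pick the narrowest safe fix that the evidence supports."""
    if findings["health"]["advisories"]:
        return {"action": "escalate", "reason": "platform advisory active"}
    suspects = set(findings["metrics"]["bad_build_tags"]) & set(findings["changes"]["recent_builds"])
    if len(suspects) == 1:
        # One recently deployed build explains the errors: drain it first.
        return {"action": "DrainSubsetByBuild", "build_tag": suspects.pop(), "percentage": 50}
    # No single culprit: fall back to the approval-gated rollback.
    return {"action": "RollbackSlot", "needs_approval": True}
```

With the sample data, `decide(assess("web-api"))` selects the drain rather than the rollback, which is the narrow-first behavior the blueprint is after.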

Key win: Narrow-first fixes prevent unnecessary rollbacks.

🔧 Blueprint 2: Azure OpenAI Tool-Calling with Managed Identity

This blueprint shows how to let the model act without ever handing out credentials.

Example: Password Reset Automation

  • Agent validates user status via Graph
  • Checks MFA, riskState, and role assignments
  • Performs compliant reset (MI scopes enforce safety)
  • Notifies user and closes ITSM ticket
  • Verifies sign-in status or risk flag after reset
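
The eligibility rules in this flow fit in a few lines. The sketch below assumes the tool has already read the user's state from Graph. The `UserState` record and the decision strings are hypothetical, and the risk values follow the "none" versus flagged distinction described in the episode.

```python
from dataclasses import dataclass


@dataclass
class UserState:
    """Hypothetical snapshot of what the read tool returned for one user."""
    account_enabled: bool
    risk_state: str  # e.g. "none" or "atRisk"
    is_privileged: bool


def reset_decision(user, approval_verified=False):
    """Decide whether a password reset may run, and say why if it may not."""
    if not user.account_enabled:
        return "deny: account disabled"
    if user.risk_state != "none":
        return "deny: risk flagged"
    if user.is_privileged and not approval_verified:
        return "hold: approval token required"
    return "allow"
```

The point of keeping this inside the tool is that the answer does not depend on how persuasive the ticket text or the prompt happens to be.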

Policy encoded in tools ensures governance is non-negotiable.

🛠 Blueprint 3: Closed-Loop Auto-Remediation

The crown jewel: a fully contained remediation loop.

Flow:

  • Triggered by telemetry or incident
  • Multi-branch assessment for root-cause hints
  • Narrow corrective action first (drain, isolate, scale)
  • Approval-gated high-risk actions (rollback, redeploy)
  • Continuous verification with App Insights
  • Auto-reporting with evidence
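
The loop above reduces to "act, measure, and only then claim success". Here is a minimal sketch under stated assumptions: `act`, `poll` and `request_approval` are caller-supplied callables (hypothetical names), the metric keys are invented, and the default thresholds (P95 down 30%, 5xx rate under 0.5%) are one reading of the numbers mentioned in the episode.

```python
def verify(baseline, samples, p95_drop=0.30, max_5xx=0.005):
    """True when the latest sample meets both thresholds."""
    latest = samples[-1]
    p95_ok = latest["p95_ms"] <= baseline["p95_ms"] * (1 - p95_drop)
    errors_ok = latest["rate_5xx"] < max_5xx
    return p95_ok and errors_ok


def remediate(baseline, act, poll, request_approval, polls=5):
    """Try the narrow fix first, then the approval-gated one; verify after each.

    The caller's poll() is expected to handle its own pacing (e.g. one sample
    per minute); this sketch does not sleep.
    """
    steps = []
    for action, high_risk in (("DrainSubsetByBuild", False), ("RollbackSlot", True)):
        if high_risk and not request_approval(action):
            steps.append(f"{action}: approval denied")
            break
        act(action)
        samples = [poll() for _ in range(polls)]
        if verify(baseline, samples):
            steps.append(f"{action}: verified")
            return "resolved", steps
        steps.append(f"{action}: not verified")
    return "escalated", steps
```

The returned step list doubles as the evidence trail for the incident note: every action appears with the outcome of its verification.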

Closed-loop means no guessing: an agent proves the outcome.

📈 Business Outcomes: Why This Actually Matters

Beyond the tech, we break down real business impacts:

  • 40–70% reduction in MTTR for repeatable failure modes
  • 60–90% ticket deflection for onboarding and identity issues
  • 50% faster change cycles with Parallel Assess → Safe Action
  • Lower burnout and attrition in SRE/on-call teams
  • Audit-ready logs for every action—no mystery behavior
  • Risk compression thanks to identity-scoped tools and approvals

Automation stops being magic. It becomes measurable.

🛡 Guardrails & Responsibility: Safety as Physics

We detail the guardrails that prevent chaos:

  • Split Managed Identities (read vs. write vs. high-risk)
  • Hard-coded schema constraints for dangerous operations
  • Approval tokens enforced by the tool, not the prompt
  • Immutable audit envelopes for every tool call
  • Red-team testing for bypass attempts and prompt injections
  • Scope-drift monitoring on tools and identities
  • Privacy guarantees for sensitive data
  • Failure choreography: safe fallback → escalate → contextual summary
  • Model rotation behind stable tool contracts
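
One way to approximate an "immutable audit envelope" is a hash chain: each record carries the hash of the previous one, so later edits are detectable. This is a sketch, not a product feature. The field names follow the envelope described in the episode (principal, tool, sanitized parameters, correlation ID, outcome), while the chaining scheme and the list of sensitive keys are assumptions. Real immutability still needs append-only storage behind it.

```python
import hashlib
import json

# Hypothetical list of parameter names that must never reach the log in clear text.
SENSITIVE = {"temporary_password", "approval_token"}


def sanitize(params):
    """Mask sensitive parameter values before they are logged."""
    return {k: ("***" if k in SENSITIVE else v) for k, v in params.items()}


def _digest(body):
    # sort_keys makes the serialization, and therefore the hash, deterministic.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def make_envelope(prev_hash, principal_id, tool, params, correlation_id, outcome):
    """Build one audit record linked to its predecessor by hash."""
    body = {
        "prev_hash": prev_hash,
        "principal_id": principal_id,
        "tool": tool,
        "params": sanitize(params),
        "correlation_id": correlation_id,
        "outcome": outcome,
    }
    return {**body, "hash": _digest(body)}


def chain_is_intact(envelopes):
    """Recompute every hash and link; any edit or reordering returns False."""
    prev = ""
    for env in envelopes:
        body = {k: v for k, v in env.items() if k != "hash"}
        if env["prev_hash"] != prev or env["hash"] != _digest(body):
            return False
        prev = env["hash"]
    return True
```

Changing a single field in an earlier envelope breaks verification for the whole chain, which is what makes silent after-the-fact edits visible to an auditor.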

Governance isn’t vibes. It’s encoded in the tool boundary.

🏁 Conclusion: The Agent Era Starts Now

If you remember nothing else:

SK orchestrates.
MCP connects.
Foundry governs.
Managed Identity contains.
Verification proves.

Start with one narrow flow, like drain-then-verify for post-deploy spikes, and scale safely outward.

Subscribe for next week’s episode:
The Minimal Viable RAG Pipeline for Enterprise Truth: Chunking, Guardrails, Evaluations, and Cost Control. Delicious security awaits.

Become a supporter of this podcast: https://www.spreaker.com/podcast/m365-show-podcast--6704921/support.

Follow us on:
LinkedIn
Substack

Transcript

1
00:00:00,000 --> 00:00:07,160
Ah, you're wasting AI on small talk, pure, undiluted power trapped in chit chat.

2
00:00:07,160 --> 00:00:12,080
Today I'll show you how to turn AI from a talker into a worker.

3
00:00:12,080 --> 00:00:19,800
Agents that plan, call Microsoft Graph via MCP, use Azure OpenAI tool calling with managed

4
00:00:19,800 --> 00:00:23,520
identity and auto-remediate incidents.

5
00:00:23,520 --> 00:00:27,280
No live demo, just the containment field you can copy.

6
00:00:27,280 --> 00:00:33,560
By the end, you'll know the blueprint to ship agents that reduce MTTR, auto-resolve

7
00:00:33,560 --> 00:00:37,120
tickets, and cut cycle time in IT ops.

8
00:00:37,120 --> 00:00:40,120
Let's energize the reactor and make it act.

9
00:00:40,120 --> 00:00:43,120
The shift from answering to acting.

10
00:00:43,120 --> 00:00:48,480
You see, traditional chat is like hiring a brilliant intern who only whispers suggestions.

11
00:00:48,480 --> 00:00:53,680
Nice, but useless at 3am when a critical service is down.

12
00:00:53,680 --> 00:00:59,120
Acting agents are different. They understand context, decide, grab the right tool, push

13
00:00:59,120 --> 00:01:03,120
the button, then verify the blast radius didn't melt the floor.

14
00:01:03,120 --> 00:01:09,120
Here's the shift in physics. Old way: Q&A loops. You ask, it answers, you do the work,

15
00:01:09,120 --> 00:01:10,840
latency everywhere.

16
00:01:10,840 --> 00:01:18,200
New way: intention, plan, tool use, result, self-check, next move.

17
00:01:18,200 --> 00:01:22,200
The reactor runs a cycle, you supervise the dials.

18
00:01:22,200 --> 00:01:23,880
Quick micro-story.

19
00:01:23,880 --> 00:01:29,320
Last quarter, an SRE team let an agent watch high CPU alerts on a front-end pool.

20
00:01:29,320 --> 00:01:34,120
The agent parsed the incident, cross-checked a recent deployment via Graph.

21
00:01:34,120 --> 00:01:40,000
Saw a spike only on nodes running a specific build, drained those nodes, rolled back the slot,

22
00:01:40,000 --> 00:01:42,240
and posted the incident note.

23
00:01:42,240 --> 00:01:45,240
All inside change policy.

24
00:01:45,240 --> 00:01:47,920
Human arrived to "resolved."

25
00:01:47,920 --> 00:01:50,400
That's not magic, that's orchestration.

26
00:01:50,400 --> 00:01:51,720
Now focus.

27
00:01:51,720 --> 00:01:54,160
Why this works in Microsoft Land?

28
00:01:54,160 --> 00:02:00,760
Three ingredients snap together like a disciplined SRE. MCP is the wiring.

29
00:02:00,760 --> 00:02:08,560
It advertises tools (Microsoft Graph endpoints, service catalogs, knowledge lookups) in a standard

30
00:02:08,560 --> 00:02:11,440
shape the model can discover dynamically.

31
00:02:11,440 --> 00:02:14,320
No brittle adapters, no hard-coded plug-ins.

32
00:02:14,320 --> 00:02:16,000
The tool shelf labels itself.

33
00:02:16,000 --> 00:02:20,600
In short, MCP makes your tools visible and standard to the model.

34
00:02:20,600 --> 00:02:23,160
Semantic Kernel is the control panel.

35
00:02:23,160 --> 00:02:29,880
It turns those tools into callable kernel functions, plans multi-step work, and handles

36
00:02:29,880 --> 00:02:33,600
the JSON schemas models expect for function calling.

37
00:02:33,600 --> 00:02:39,520
You don't juggle payloads. SK shapes the energy. In short, SK decides the steps and shapes

38
00:02:39,520 --> 00:02:40,520
the calls.

39
00:02:40,520 --> 00:02:45,280
Azure OpenAI with managed identity is the safe power source.

40
00:02:45,280 --> 00:02:49,800
The model can call functions, but identity gates the voltage.

41
00:02:49,800 --> 00:02:55,560
And identity means tokens aren't strewn around like gasoline. The containment field holds.

42
00:02:55,560 --> 00:03:00,360
In short, identity keeps every action inside the blast shield.

43
00:03:00,360 --> 00:03:02,320
But they are slippery.

44
00:03:02,320 --> 00:03:05,920
Chatty agents hallucinate authority.

45
00:03:05,920 --> 00:03:12,520
Acting agents must be leashed: scoped permissions, approval gates for high-risk actions, and

46
00:03:12,520 --> 00:03:15,960
logging that glows like radioactive paint.

47
00:03:15,960 --> 00:03:21,320
The trick is making the fast path automatic and the dangerous path explicit.

48
00:03:21,320 --> 00:03:22,920
Another micro story.

49
00:03:22,920 --> 00:03:28,840
A help desk was drowning in password reset tickets during onboarding waves.

50
00:03:28,840 --> 00:03:35,840
The agent read the ticket, validated user status in Entra via a Graph MCP tool,

51
00:03:35,840 --> 00:03:40,600
checked MFA registration, issued a compliant reset with a temporary password, notified

52
00:03:40,600 --> 00:03:44,120
the user, and closed the loop in the ITSM.

53
00:03:44,120 --> 00:03:48,080
Tickets didn't shrink, they vanished. Weekends returned.

54
00:03:48,080 --> 00:03:54,240
The game-changer nobody talks about is verification. An agent that acts must prove it did.

55
00:03:54,240 --> 00:03:59,200
After a remediation call, it re-queries the signal. Did latency fall?

56
00:03:59,200 --> 00:04:01,480
Is the unhealthy probe count zero?

57
00:04:01,480 --> 00:04:03,800
Did permissions remain least privileged?

58
00:04:03,800 --> 00:04:06,920
If not, it rolls back or escalates.

59
00:04:06,920 --> 00:04:08,480
Boom!

60
00:04:08,480 --> 00:04:09,880
Closed loop control.

61
00:04:09,880 --> 00:04:12,400
Let's pause here.

62
00:04:12,400 --> 00:04:13,720
Key takeaway.

63
00:04:13,720 --> 00:04:19,040
Stop treating AI as a mouth. Wire it as hands with a brain.

64
00:04:19,040 --> 00:04:25,400
MCP connects, Semantic Kernel orchestrates, managed identity contains, and your incident

65
00:04:25,400 --> 00:04:30,120
queue turns from wildfire into predictable chemistry.

66
00:04:30,120 --> 00:04:37,880
What an AI agent actually is: the six-part molecule. Now tighten the molecule stream.

67
00:04:37,880 --> 00:04:46,000
A capable IT ops agent is a six-part molecule: persona, memory, planner, tools, policy,

68
00:04:46,000 --> 00:04:47,800
and verifier.

69
00:04:47,800 --> 00:04:50,560
Miss one and the compound becomes unstable.

70
00:04:50,560 --> 00:04:55,160
Let's synthesize each, cleanly. Persona.

71
00:04:55,160 --> 00:04:57,680
The operating temperament.

72
00:04:57,680 --> 00:05:04,880
It's your SRE temperament encoded, cautious with production, decisive with toil.

73
00:05:04,880 --> 00:05:07,560
You give concise goals.

74
00:05:07,560 --> 00:05:11,200
Then web API SLOs.

75
00:05:11,200 --> 00:05:14,240
Prefer rollbacks to risky hotfixes.

76
00:05:14,240 --> 00:05:17,920
Narrate actions briefly. Never escalate privileges.

77
00:05:17,920 --> 00:05:20,360
The persona sets bias.

78
00:05:20,360 --> 00:05:26,480
Without it, the model's plasma spreads hot, aimless. Memory.

79
00:05:26,480 --> 00:05:30,040
Short term context plus durable facts.

80
00:05:30,040 --> 00:05:31,920
Short term is the thread.

81
00:05:31,920 --> 00:05:36,120
Current incident, telemetry snapshots, the last action.

82
00:05:36,120 --> 00:05:38,280
Durable is environment state.

83
00:05:38,280 --> 00:05:43,680
Service mappings, maintenance windows, tenant boundaries, and don't touch prod after 5pm

84
00:05:43,680 --> 00:05:44,680
Friday.

85
00:05:44,680 --> 00:05:47,760
In SK, you attach memories or external stores.

86
00:05:47,760 --> 00:05:49,880
The agent stays anchored.

87
00:05:49,880 --> 00:05:53,240
Memory prevents goldfish remediation.

88
00:05:53,240 --> 00:05:54,240
Planner.

89
00:05:54,240 --> 00:05:55,240
The brainstem.

90
00:05:55,240 --> 00:05:57,440
This is where SK shines.

91
00:05:57,440 --> 00:05:59,520
Given an intention.

92
00:05:59,520 --> 00:06:03,280
Resolve elevated 5xx on API.

93
00:06:03,280 --> 00:06:05,800
The planner decomposes.

94
00:06:05,800 --> 00:06:07,400
Gather metrics.

95
00:06:07,400 --> 00:06:09,760
Correlate to recent changes.

96
00:06:09,760 --> 00:06:11,280
Isolate scope.

97
00:06:11,280 --> 00:06:13,360
Select remediation.

98
00:06:13,360 --> 00:06:14,360
Execute.

99
00:06:14,360 --> 00:06:15,360
Verify.

100
00:06:15,360 --> 00:06:16,360
Report.

101
00:06:16,360 --> 00:06:21,560
SK can run sequential, concurrent, or graph-shaped workflows.

102
00:06:21,560 --> 00:06:24,960
Think of it as the reactor's timing circuit.

103
00:06:24,960 --> 00:06:28,440
No sparks until the sequence is safe.

104
00:06:28,440 --> 00:06:29,440
Micro-glimpse.

105
00:06:29,440 --> 00:06:36,080
The planner forks two sub-agents: one scrapes deployment logs, the other samples metrics.

106
00:06:36,080 --> 00:06:44,000
They return, the planner fuses results, chooses rollback. Concurrency without chaos.

107
00:06:44,000 --> 00:06:45,000
Tools.

108
00:06:45,000 --> 00:06:47,520
The actuators and sensors.

109
00:06:47,520 --> 00:06:53,680
Through MCP, the agent discovers callable tools: Microsoft Graph for

110
00:06:53,680 --> 00:07:01,240
Entra and Teams incidents, Service Health, Intune device actions, your internal APIs for

111
00:07:01,240 --> 00:07:05,440
drain, undeploy, knowledge lookups for runbooks.

112
00:07:05,440 --> 00:07:09,680
SK wraps them as kernel functions and hands the model a JSON schema.

113
00:07:09,680 --> 00:07:14,080
The model decides to call DrainNode(nodeId) with parameters.

114
00:07:14,080 --> 00:07:15,760
You don't handcraft payloads.

115
00:07:15,760 --> 00:07:18,360
SK presents them in a safe beaker.

116
00:07:18,360 --> 00:07:28,280
Tool descriptions matter.

117
00:07:28,280 --> 00:07:53,840
Now, the schema is your guardrail language.

125
00:09:50,120 --> 00:09:54,600
Verifier.

126
00:09:54,600 --> 00:10:02,980
If you remember nothing else: persona aims, memory anchors, planner sequences, tools execute,

127
00:10:02,980 --> 00:10:08,360
policy contains, verifier proves. That's a stable agent.

128
00:10:08,360 --> 00:10:13,120
The Microsoft stack: SK, MCP, Azure AI Foundry.

129
00:10:13,120 --> 00:10:14,680
Now, we wire the reactor.

130
00:10:14,680 --> 00:10:16,360
You've got the six part molecule.

131
00:10:16,360 --> 00:10:21,480
So let's bind it to the Microsoft Stack where the energy is dense but containable.

132
00:10:21,480 --> 00:10:31,160
Three layers, one flow. MCP is the wiring, Semantic Kernel is the control panel, Azure AI Foundry

133
00:10:31,160 --> 00:10:34,200
is the power grid and the audit room.

134
00:10:34,200 --> 00:10:37,360
Together they turn intent into action with guardrails.

135
00:10:37,360 --> 00:10:40,360
First, MCP, the Model Context Protocol.

136
00:10:40,360 --> 00:10:51,760
Think of it as standardized lab glassware.

137
00:10:51,760 --> 00:11:19,920
Now, the system is a little more advanced.

138
00:11:19,920 --> 00:11:22,840
Now, focus.

151
00:14:12,040 --> 00:14:18,040
Next run, SK refreshes tools, the model sees it, and the planner prefers the narrower blast radius.

152
00:14:18,040 --> 00:14:24,040
You just moved from sledgehammer rollbacks to precision surgery without rewiring the panel.

153
00:14:24,040 --> 00:14:28,040
Azure AI Foundry now. This is the grid and the clipboard.

154
00:14:28,040 --> 00:14:36,040
You define model deployments, approve which tools an agent may call, and route messages through governed threads.

155
00:14:36,040 --> 00:14:44,040
Critical: you also capture traces (prompts, tool calls, parameters, outputs) so audits aren't guesswork.

156
00:14:44,040 --> 00:14:50,040
If a regulator asks, why did the agent reset 73 accounts last Friday?

157
00:14:50,040 --> 00:14:59,040
You open the log and show the requesters, the justification, the tool schemas, the approval tokens, and the verification results.

158
00:14:59,040 --> 00:15:12,040
Clear, deterministic chemistry. No smoke. Quick micro-story from the field: a network team added an MCP server for their load balancer control plane (drain, attach, change weight).

159
00:15:12,040 --> 00:15:16,040
Before, change windows were manual and slow.

160
00:15:16,040 --> 00:15:27,040
After the agent came online, SK planned: detect imbalance, drain hot nodes, redistribute weight, verify packet loss, then schedule a gradual reattach.

161
00:15:27,040 --> 00:15:31,040
Managed identity limited scope to the front-end pool only.

162
00:15:31,040 --> 00:15:40,040
Incidents that used to burn an hour shrank to 8 minutes end to end, with logs to match. Not drama. Procedure.

163
00:15:40,040 --> 00:15:47,040
Now tighten the molecule stream. Tool descriptions are your physics. Write them like laws, not suggestions.

164
00:15:47,040 --> 00:15:53,040
"Use ResetPassword only when user.accountEnabled is true and user.riskState is none."

165
00:15:53,040 --> 00:16:03,040
"For privileged roles, require an approval token." Models obey schema better than prose. And always encode safety at the boundary.

166
00:16:03,040 --> 00:16:08,040
If the tool rejects missing approval, you've codified governance where it counts.

167
00:16:08,040 --> 00:16:12,040
Once you nail that, everything else clicks.

168
00:16:12,040 --> 00:16:33,040
MCP gives you modular instruments, SK orchestrates multi-step reactions, managed identity enforces least privilege, and Azure AI Foundry records the experiment. You get agents that plan, act, and prove it, without handing uranium to a chatbot.

169
00:16:33,040 --> 00:16:45,040
Key takeaway, compressed to one line: SK orchestrates, MCP connects, Foundry governs, managed identity contains, so your agent's power is usable, traceable, and safe.

170
00:16:45,040 --> 00:16:51,040
Blueprint 1: SK planner plus Graph via MCP (IT ops).

171
00:16:51,040 --> 00:17:06,040
Time to assemble a working containment field you can copy. We'll wire Semantic Kernel's planner to Microsoft Graph through MCP so the agent can investigate and remediate a live-seeming IT ops incident without brittle code or leaking power.

172
00:17:06,040 --> 00:17:24,040
Scenario: elevated 5xx on web API after a deployment. Objective: diagnose, correlate with changes, propose the lowest-blast-radius fix, execute if safe, verify, and report automatically.

173
00:17:24,040 --> 00:17:33,040
Why this blueprint matters: most teams stall on "How do I make the model use my systems safely?"

174
00:17:33,040 --> 00:17:46,040
This pattern answers it with three bindings (tools via MCP, plans via SK, identity via your runtime) and bakes verification into the end.

175
00:17:46,040 --> 00:18:02,040
The scaffold, conceptually. Intention: reduce 5xx to baseline while preserving SLO. Plan shape: concurrent first, then converge (metrics branch, change-history branch). Tools, via MCP:

176
00:18:02,040 --> 00:18:12,040
AppInsightsQuery (read-only telemetry), GraphServiceHealth (advise if there's a broader incident),

177
00:18:12,040 --> 00:18:31,040
GraphChangeLog (deployments, approvals), DrainSubsetByBuild (load balancer), RollbackSlot (deployment), PostIncidentNote (ITSM, Teams). Policy hints in tool schemas: approval token required on RollbackSlot,

178
00:18:31,040 --> 00:18:38,040
safe-by-default on DrainSubsetByBuild, strict parameter enums, timestamps for correlation.

179
00:18:38,040 --> 00:18:47,040
Now, the SK planner. You give SK the high-level goal and the available kernel functions, which it got from MCP discovery.

180
00:18:47,040 --> 00:18:59,040
It decomposes. Phase A, assess (parallel): AppInsightsQuery for error rate and P95, scoped to the last 30 minutes;

181
00:18:59,040 --> 00:19:06,040
GraphChangeLog for deployments to the same service; GraphServiceHealth for advisories.

182
00:19:06,040 --> 00:19:14,040
The model reads results, notifies the planner. Spike began six minutes after build 2025, 11111.

183
00:19:14,040 --> 00:19:31,040
04. No global advisory. Affected nodes share build tag B4 421. Phase B, decide: choose the narrowest fix, DrainSubsetByBuild with build tag B4 421, percentage 50. Safety:

184
00:19:31,040 --> 00:19:35,040
allowed without an approval token because it's reversible and scoped.

185
00:19:35,040 --> 00:19:43,040
The tool description explicitly states changes revert in 10 minutes unless verification affirms.

186
00:19:43,040 --> 00:19:56,040
Phase C, act: invoke DrainSubsetByBuild. Start the verifier: poll App Insights every 60 seconds for 5 minutes. Thresholds: P95 down 30%, 5xx below

187
00:19:56,040 --> 00:20:07,040
0.5%. Phase D, if verification fails: escalate action. Request an approval token for RollbackSlot with justification:

188
00:20:07,040 --> 00:20:16,040
Post deploy error spike mapped to build B4 421, drain ineffective, rolling back to previous slot.

189
00:20:16,040 --> 00:20:34,040
Await token. If granted, invoke RollbackSlot and run the verifier again with the same thresholds. Phase E, report: PostIncidentNote summarizing actions, metrics before and after, and links to dashboards and change IDs.

190
00:20:34,040 --> 00:20:40,040
Observe the anomaly: the magic isn't in prose, it's in tool schemas and planner choreography.

191
00:20:40,040 --> 00:20:52,040
MCP advertises each tool with clear names, parameters, and constraints. SK wraps them as kernel functions and auto-constructs the JSON schema the model needs to call them.

192
00:20:52,040 --> 00:20:59,040
You don't handcraft payloads. You define physics: inputs, allowed ranges, and approval rules.

193
00:20:59,040 --> 00:21:07,040
Micro-story: a customer's staging ring kept spiking after canary deploys.

194
00:21:07,040 --> 00:21:14,040
They added one MCP tool, DrainSubsetByBuild, with a percentage parameter and a hard ceiling of 50 in the schema.

195
00:21:14,040 --> 00:21:23,040
Instantly the planner gained a low-risk move. Incidents that used to jump to rollbacks now stabilized with a 10-minute drain-and-observe.

196
00:21:23,040 --> 00:21:32,040
The uranium stayed in the vault; the beaker handled the heat. Common pitfalls, and how this blueprint avoids them:

197
00:21:32,040 --> 00:21:39,040
Brittle adapters: MCP eliminates hard-coded plugins; new tools appear dynamically.

198
00:21:39,040 --> 00:21:47,040
Prompt-only governance: policy embedded at the tool boundary prevents sweet-talked violations.

199
00:21:47,040 --> 00:21:54,040
Blind action: the verifier is non-negotiable. Poll objective metrics before claiming victory.

200
00:21:54,040 --> 00:21:59,040
Implementation notes you'll actually use. Keep tool descriptions specific:

201
00:21:59,040 --> 00:22:10,040
"DrainSubsetByBuild reduces traffic to nodes with build tag. Use for canary or regional issues. Reversible, no approval required."

202
00:22:10,040 --> 00:22:17,040
Name parameters with business clarity: build tag, percentage, change ID.

203
00:22:17,040 --> 00:22:24,040
Add consistent telemetry tags to every action. The verifier reads them to correlate cause and effect.

204
00:22:24,040 --> 00:22:32,040
Let's pause. Key takeaway: SK plans, MCP wires, and your tools become disciplined actuators.

205
00:22:32,040 --> 00:22:42,040
You get fast, narrow fixes first, safe escalation second, and proof at the end. Delicious security. Blueprint two:

206
00:22:42,040 --> 00:22:46,040
Azure OpenAI tool calling with managed identity.

207
00:22:46,040 --> 00:22:54,040
Now tighten the cable: Azure OpenAI tool calling, but every spark flows through managed identity.

208
00:22:54,040 --> 00:22:58,040
This is the moment the model's hands enter the glove box.

209
00:22:58,040 --> 00:23:08,040
The goal: let the model pick and call tools, but force every execution to authenticate as a managed principal with least privilege.

210
00:23:08,040 --> 00:23:16,040
No secrets, no stray tokens, no accidental overreach, pure insulated power.

211
00:23:16,040 --> 00:23:26,040
The flow, conceptually. Agent receives intent: auto-resolve eligible password reset tickets under policy.

212
00:23:26,040 --> 00:23:38,040
Tools exposed via MCP, executed under MI: GraphGetUser (read user status, risk state),

213
00:23:38,040 --> 00:23:45,040
GraphResetPassword (temporary password, require reset on next sign-in),

214
00:23:45,040 --> 00:23:50,040
GraphNotifyUser (email, Teams message),

215
00:23:50,040 --> 00:23:55,040
ITSMUpdate (ticket status, resolution notes).

216
00:23:55,040 --> 00:24:06,040
Tool schemas declare safety rules. GraphResetPassword requires user.accountEnabled to be true and user.riskState to be none.

217
00:24:06,040 --> 00:24:13,040
For privileged roles, an approval token is required and must be verified by a separate approved-change tool.

218
00:24:13,040 --> 00:24:26,040
Azure OpenAI tool calling: the model chooses tools and SK shapes the JSON schema, but execution passes through a function executor that acquires an access token via managed identity at runtime,

219
00:24:26,040 --> 00:24:36,040
scoped to the specific Graph permissions, e.g. User.ReadBasic.All.

220
00:24:36,040 --> 00:24:49,040
Directory.AccessAsUser.All? No. Use app permissions with constrained app roles. You see, managed identity is not just convenience. It's the containment field.

221
00:24:49,040 --> 00:24:57,040
The executor calls Azure AD, obtains a token for Graph based on the managed principal's assigned roles, then calls the tool endpoint.

222
00:24:57,040 --> 00:25:03,040
If a tool isn't permitted, it fails closed. No prompt can change physics.

223
00:25:03,040 --> 00:25:14,040
Micro-story: onboarding surge week, 1,800 tickets. The agent filters by policy: enabled accounts, no risk flags, non-privileged roles.

224
00:25:14,040 --> 00:25:23,040
It resets, notifies and updates tickets. For 42 privileged accounts it requests approval tokens. Security approves 39, denies 3.

225
00:25:23,040 --> 00:25:35,040
The agent closes 1,797 tickets unattended and routes 3 with full context. MTTR drops to minutes. Weekends come back online.

226
00:25:35,040 --> 00:25:50,040
Critical implementation moves. Separate read and write tools with different managed identities or scopes. Reading status should be broad, writing should be narrow and auditable. Encode approval at the write tool boundary.

227
00:25:50,040 --> 00:25:56,040
Approval token isn't a prompt instruction. It's a required parameter with signature verification.

228
00:25:56,040 --> 00:26:11,040
In a given application, emit an audit envelope on every call: principal ID, tool name, parameters (sanitized), correlation ID, and outcome. Route to your logging sink. Why Azure OpenAI tool calling helps:

229
00:26:11,040 --> 00:26:24,040
The model's planner can choose the correct sequence (read user, evaluate policy, branch for privileged users, reset, verify sign-in flag, notify, close ticket)

230
00:26:24,040 --> 00:26:34,040
without you hardwiring if-else ladders. But identity ensures every step is shackled to explicit permission.

231
00:26:34,040 --> 00:26:44,040
Common mistakes that set labs on fire. Using a generic god identity with broad Graph permissions:

232
00:26:44,040 --> 00:26:57,040
Don't. Split identities by function and environment. Relying on prompt text for safety: tools must enforce constraints in code and schema. Skipping verification:

233
00:26:57,040 --> 00:27:08,040
after reset, re-query sign-in state or test a low-risk signal before closing. Quick twist: rotate models without changing your security posture.

234
00:27:08,040 --> 00:27:25,040
Tool calling contracts stay the same. Managed identity scopes stay the same. Swap GPT family versions or add reasoning modes and the gloves still fit. That's the beauty. Models evolve. Your containment holds.

235
00:27:25,040 --> 00:27:34,040
One line takeaway. Let the model decide but let managed identity decide what's allowed.

236
00:27:34,040 --> 00:27:45,040
Autonomous action, blast radius in centimeters, not kilometers. Blueprint three: incident auto-remediation in IT ops.

237
00:27:45,040 --> 00:28:00,040
Time to unleash closed-loop control: the reactor that fixes itself while you sleep. We'll assemble an auto-remediation path that starts with detection, flows through safe actions and ends with proof. No heroics.

238
00:28:00,040 --> 00:28:09,040
Just disciplined chemistry. Trigger: a sentinel detects elevated error rates or abnormal latency on a web API.

239
00:28:09,040 --> 00:28:27,040
The agent ingests the alert payload: service, region, thresholds crossed, timestamps. Memory loads recent deployments, maintenance windows and current topology. Intention forms: restore SLO with minimal blast radius.

240
00:28:27,040 --> 00:28:40,040
The planner spins up Phase A, assess. Two concurrent probes ignite: metrics via an App Insights query and change context via the Graph change log.

241
00:28:40,040 --> 00:28:54,040
A third checks service health for broader noise. Within seconds the planner correlates: the spike began four minutes post-deploy in region A, specific nodes share build tag B4421, no global advisory.

242
00:28:54,040 --> 00:29:14,040
Now the agent has a hypothesis: canary contamination. Phase B, constrain. The lowest-voltage move wins first. The planner selects drain subset by build with percentage 30, scoped to region A. The tool schema allows it without approval because it's reversible and time-bound.

243
00:29:14,040 --> 00:29:31,040
Execution happens under a managed identity permissioned only for that pool. The verifier activates: poll P95 and 5XX for five minutes, annotate every sample with correlation ID and action tag, observe the anomaly.

244
00:29:31,040 --> 00:29:48,040
If metrics improve within two minutes, the planner holds the drain, schedules a gradual reattach and monitors for regression for 10 minutes. If improvement is marginal or negative, the planner escalates to Phase C, rollback.

245
00:29:48,040 --> 00:30:11,040
It requests an approval token via approve change and posts a crisp justification in Teams: error spike mapped to build B4421, 30% drain ineffective, requesting slot rollback. The approver clicks, the token returns, and rollback slot executes, again under managed identity with narrow scope.

246
00:30:11,040 --> 00:30:40,040
Now the verifier's Geiger counter ticks. Thresholds: P95 down 30%, 5XX under 5%, probe health green for five continuous minutes. If all green, Phase D, report: the agent posts an incident summary with before/after metrics graphs, actions taken, tool parameters, approval token ID and links to dashboards and change IDs. It updates the ITSM ticket to resolved with artifacts attached. If verification fails, Phase E.
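The verifier thresholds above can be sketched as a small check. This is an illustration under assumptions: one sample per minute, P95 compared against a pre-incident baseline, and all three conditions required for the whole five-minute window. The names `Sample`, `sample_ok` and `verified` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    p95_ms: float          # P95 latency for the sampling interval
    error_rate_5xx: float  # fraction of requests, 0.05 means 5%
    probe_green: bool      # health probe result


def sample_ok(sample: Sample, baseline_p95_ms: float) -> bool:
    """P95 down at least 30% from baseline, 5XX under 5%, probe green."""
    return (
        sample.p95_ms <= baseline_p95_ms * 0.70
        and sample.error_rate_5xx < 0.05
        and sample.probe_green
    )


def verified(samples: list, baseline_p95_ms: float, required_consecutive: int = 5) -> bool:
    """True if the most recent `required_consecutive` samples all pass.

    Samples are assumed to arrive once per minute, oldest first.
    """
    if len(samples) < required_consecutive:
        return False
    recent = samples[-required_consecutive:]
    return all(sample_ok(s, baseline_p95_ms) for s in recent)
```

A single red sample inside the window returns False, which is what would send the planner to the next phase instead of closing the ticket.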

247
00:30:40,040 --> 00:31:00,040
The planner considers alternative fixes: scale out 25%, reset connection pools, or isolate by path for a suspected hot endpoint, each encoded as MCP tools with their own safety physics.

248
00:31:00,040 --> 00:31:23,040
Each action cycles through the same verify-or-rollback loop. Micro-story: a payment API started spiking 5XX right after a Friday patch. The agent drained 25%, saw no relief, requested approval, rolled back and posted the forensic bundle. Root cause: a thread starvation bug in the new build.

249
00:31:23,040 --> 00:31:52,040
TTR: 14 minutes. Human effort: one approving click. That's not bravado, that's guardrail automation. Crucial safety patterns that keep the lab from exploding. Narrow first actions: drains and isolates before deploy changes. Time-boxed reversibility: every risky move auto-reverts if metrics don't confirm. Approval at the boundary: high-risk tools won't run without a signed token.

250
00:31:52,040 --> 00:32:21,040
Immutable audit envelopes: log principal ID, tool, parameters (sanitized) and outcomes. One-sentence takeaway: auto-remediation isn't a guess. It's a loop of assess, constrain, act, verify and prove, all contained by identity and policy. Business outcomes: proving it works. Now the voltage you can feel. Numbers are fine,

251
00:32:21,040 --> 00:32:43,040
consequences sell. Reduced MTTR isn't just efficiency. It's sleep returned to engineers and incident fatigue evaporating from your on-call rotation. Tickets auto-resolved: when the password reset agent runs under policy, onboarding waves stop flooding the queue. Those tickets don't wait, they disappear.

252
00:32:43,040 --> 00:32:58,040
That means fewer handoffs, fewer SLAs breached and frontline staff focusing on exceptions that actually need judgment. Each ticket avoided saves minutes. At scale it buys back headcount capacity without layoffs,

253
00:32:58,040 --> 00:33:12,040
room to tackle the backlog you've ignored for a year. Lead conversion lift, translated for IT ops: faster provisioning and cleaner access mean sales tools work when reps need them. Every minute shaved from access issues is a

254
00:33:12,040 --> 00:33:28,040
minute added to pipeline creation. It's not abstract. It's missing a quarter versus hitting it. Cycle time reduction shows up everywhere. Changes that used to take a swarm now flow through MCP tools with encoded approvals.

255
00:33:28,040 --> 00:33:49,040
The planner handles the choreography. Humans approve high-risk pivots. Result: deployment-related incidents shrink from hours to minutes, and the postmortem has receipts (tool calls, tokens, thresholds), so learning compounds rather than dissolves into blame. Micro-story:

256
00:33:49,040 --> 00:34:17,040
A retailer's weekend incident spikes used to eat two engineers' Saturdays. After wiring the drain-then-verify loop and enforcing managed identity per pool, the same class of incidents now resolves in under 10 minutes, mostly unattended. Those engineers didn't just get time back, they stopped dreading the pager. Attrition dipped. Recruiting got easier. That's culture impact measured in uptime.

257
00:34:17,040 --> 00:34:45,040
Executives ask: where's the ROI? Here's a crisp frame. Time: MTTR down 40 to 70% on repeatable failure modes. Ticket deflection: 60 to 90%. Change cycle time: cut 50% plus with parallel assess and safe early actions. Risk: constrained identities, schema-bound tools and full audits compress overreach. People: fewer

258
00:34:45,040 --> 00:35:05,040
burned-out on-calls, lower attrition, more time for deep work. But the real win is risk compression. Managed identity and schema-bound tools make overreach mathematically harder. You're not trusting vibes, you're enforcing laws. Audits stop being archaeology and become exports. Automation

259
00:35:05,040 --> 00:35:29,040
isn't just about machines working more. It's about humans burning out less. If you remember one thing: outcomes compound when agents both act and prove it. MTTR drops, queues clear, change moves faster and risk stays contained. Delicious. Security, guardrails and responsibility. Power without

260
00:35:29,040 --> 00:35:57,040
containment is just an explosion waiting for a witness. If you wire agents into IT ops, you inherit a duty: encode responsibility as physics, not poetry. Let's set the guardrails that keep the reactor humming instead of turning the data center into glass. First, identity boundaries. You never give an agent a god credential. You split managed identities by domain and action:

261
00:35:57,040 --> 00:36:18,040
read identities for telemetry and inventories, write identities for scoped actuators, and high-risk identities that can only execute with an approval token. This isn't belt and suspenders, it's the blast shield. If an agent attempts rollback slot with the read identity, it should fail closed.

262
00:36:18,040 --> 00:36:47,040
If it attempts a privileged reset without a token, it should fail loud. Your goal is deterministic denial. Second, policy at the tool edge. Tool schemas must express law, not vibes. Required parameters for risk: approval token, change ID, justification. Enumerated scopes: allowed regions, pools, services. Range clamps: percentage capped at fifty for drains, timeouts capped, retries bounded.

263
00:36:47,040 --> 00:37:08,040
Preconditions: user account enabled must be true, risk state must be none, maintenance window must be closed. This is why you encode "never touch prod after 5 p.m. Friday" as a guard in the tool, not a suggestion in a prompt. The tool is the valve. Prompts are the labels.
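Here is one way the "law, not vibes" idea could look for a drain tool: enumerated regions, a clamped percentage, a maintenance-window precondition and the Friday guard, all checked in code before anything runs. It is a sketch with assumed values. The region names and the lower bound of 1 are hypothetical (the talk only gives the upper bound of fifty), and the Friday rule here covers only Friday evening in the caller's local time.

```python
from datetime import datetime

ALLOWED_REGIONS = {"region-a", "region-b"}  # hypothetical enumerated scope
MIN_DRAIN_PERCENT = 1    # assumption; the talk states only the upper bound
MAX_DRAIN_PERCENT = 50   # range clamp for drains, from the talk


def validate_drain_request(region: str, percentage: int, now: datetime,
                           maintenance_window_open: bool) -> list:
    """Return a list of violations; an empty list means the call may proceed."""
    violations = []
    if region not in ALLOWED_REGIONS:
        violations.append(f"region {region!r} is not in the allowed scope")
    if not (MIN_DRAIN_PERCENT <= percentage <= MAX_DRAIN_PERCENT):
        violations.append(
            f"percentage {percentage} outside {MIN_DRAIN_PERCENT}-{MAX_DRAIN_PERCENT}"
        )
    if maintenance_window_open:
        violations.append("maintenance window must be closed")
    # datetime.weekday(): Monday is 0, so Friday is 4.
    if now.weekday() == 4 and now.hour >= 17:
        violations.append("no prod changes after 5 p.m. Friday")
    return violations
```

The tool would refuse whenever the list is non-empty, so a prompt that asks for a 100% drain never reaches the actuator.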

264
00:37:08,040 --> 00:37:30,040
Third, human in the loop, engineered like a circuit, not a panic button. High-risk tools require an approval flow the agent can trigger but not shortcut. That means an approve change tool that posts a crisp, verifiable request (what, why, where, blast radius),

265
00:37:30,040 --> 00:37:58,040
an approval token signed server-side, time-bound and tied to that exact action, and a second verification that the token matches parameters at execution time. No swapping a rollback for a redeploy mid-flight. The Geiger counter watches the token too. Fourth, audit envelopes. Every tool call emits an immutable record:

266
00:37:58,040 --> 00:38:23,040
caller principal ID, tool name, parameters sanitized of secrets, correlation ID, timestamps, return status and any changes to state. Route these to your centralized logging sink and retain them under your compliance policy. This transforms postmortems into science: repeatable, inspectable, boring,

267
00:38:23,040 --> 00:38:50,040
marvelous. Fifth, red teams for agents. If it's not enforced at the tool boundary, it's not real policy. Before you trust an auto-remediation path, attack it. Attempt prompt injections in context: ignore policy, simulate approval, force parameter 100. The tool should refuse on schema before the model can improvise. Now tighten the molecule stream:

268
00:38:50,040 --> 00:39:11,040
apply concurrency pressure. Approve twice, revoke once, execute twice. The system must de-duplicate, honor revocations and record intent versus outcome. Treat your agent like a new SRE: it doesn't get root until it survives fire drills.
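The fire drill above (approve twice, revoke once, execute twice) can be rehearsed against a tiny in-memory model. This is a single-process sketch with a hypothetical `ApprovalLedger` class. A real system would need a shared store with atomic updates to survive genuine concurrency.

```python
class ApprovalLedger:
    """Tracks approval state and records intent versus outcome."""

    def __init__(self):
        self._state = {}  # approval_id -> "approved" | "revoked" | "consumed"
        self.log = []     # (approval_id, intent, outcome) tuples

    def approve(self, approval_id: str) -> None:
        # A duplicate approval is a no-op; revoked or consumed stays as it is.
        if approval_id not in self._state:
            self._state[approval_id] = "approved"
        self.log.append((approval_id, "approve", self._state[approval_id]))

    def revoke(self, approval_id: str) -> None:
        # Revocation wins unless the action has already run.
        if self._state.get(approval_id) != "consumed":
            self._state[approval_id] = "revoked"
        self.log.append((approval_id, "revoke", self._state[approval_id]))

    def execute(self, approval_id: str, action) -> bool:
        """Run `action` at most once per approval; refuse otherwise."""
        state = self._state.get(approval_id)
        if state != "approved":
            self.log.append((approval_id, "execute", f"refused:{state}"))
            return False
        self._state[approval_id] = "consumed"
        action()
        self.log.append((approval_id, "execute", "done"))
        return True
```

Every refused attempt still lands in `log`, so the record shows what was intended as well as what actually happened.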

269
00:39:11,040 --> 00:39:37,040
Sixth, scope drift monitoring. Tools evolve, permissions creep. You need a scheduled job (yes, an agent) that audits tool definitions, managed identity assignments and effective Graph scopes against the golden policy. If a permission expands, it alerts. If a tool adds a parameter without a range, it blocks publication. This is the feedback loop that

270
00:39:37,040 --> 00:40:05,040
preserves safety as you grow. Seventh, data minimization and privacy. Agents love context, but context bleeds. For user-facing tasks (resets, access), pass only attributes necessary for the action. Mask PII in logs by default. For team chat notifications, send links to secure dashboards instead of raw payloads. The principle is simple: handle volatile compounds in

271
00:40:05,040 --> 00:40:31,040
the fume hood, not on the cafeteria table. Eighth, failure choreography. When tools fail, the planner must fail usefully. Safe fallback first: revert, pause or isolate. Escalate with a complete, compact state bundle: what changed, what failed, what remains safe. Do not retry blindly across environments.

272
00:40:31,040 --> 00:41:00,040
A failed drain in region A does not authorize a drain in region B unless policy says so. Ninth, model governance. You will rotate models. You will test reasoning modes. Do it behind an abstraction. Keep tool contracts stable and security posture constant while you A/B model configurations in Foundry. If a candidate model produces more tool-call errors, it does not graduate. The gloves must fit the new hands.

273
00:41:00,040 --> 00:41:29,040
Finally, training and ownership. Assign a product owner for your agent like you would any critical service. They maintain tool catalogs, review audit trends, track incident classes handled and tune policy thresholds. And train your humans: how to approve, when to deny, how to interpret the agent's summaries. The future is collaborative: human judgment at the boundaries, machine precision in the middle.

274
00:41:29,040 --> 00:41:58,040
Key takeaway: responsibility is not a memo. It's encoded in identity, schemas, approvals, audits and deliberate failure paths. Build guardrails as laws of physics and your agent becomes a safe, tireless colleague, not a chaos engine. The agent era starts now. If you remember one thing, wire AI as hands with a brain: SK orchestrates, MCP connects, Foundry

275
00:41:58,040 --> 00:42:18,040
governs, managed identity contains, and let verification prove the result. Start with one flow: drain then verify for post-deploy spikes. Encode approvals at the tool edge and route every call to audit. Then expand. Subscribe and watch the advanced blueprint next. Delicious security awaits.