Nov. 29, 2025

Y'all Need Governance: The LangChain4j & Copilot Studio Mess

AI agents are shipping faster than your change control, and they’re carrying master keys to your data. This talk rips into how LangChain4J and Copilot Studio quietly turn “helpful copilots” into data-leaking, over-permissioned shadow admins with no audit trail. You’ll see exactly how prompt injection, over-scoped connectors, and missing logs create reportable incidents, and how strict schemas, per-agent identities, and real DLP stop the bleeding. The core move most orgs skip: give every agent its own locked-down identity, no shared creds, no tenant-wide scopes, and treat it like a very dumb, very powerful user you have to restrain by design.

AI agents are shipping faster than your change control meetings, and the governance is… a vibe. You know that feeling when a Copilot ships with tenant-wide access “just for testing”? Yeah, that’s your compliance officer’s heartbeat you’re hearing. Today, I’m tearing down the mess in LangChain4j and Copilot Studio with real cases: prompt injection, over‑permissive connectors, and audit gaps. I’ll show you what breaks, why it breaks, and the fixes that actually hold. Stay to the end—I’ll give you the one governance step that prevents most incidents. You’ll leave with an agent RBAC model, data loss policies, and a red‑team checklist.Case 1: Prompt Injection—The Unsupervised Intern Writes Policy (650 words)Prompt injection is that unsupervised intern who sounds helpful, writes in complete sentences, and then emails payroll data to “their personal archive” for safekeeping. You think your system prompt is the law. The model thinks it’s a suggestion. And the moment you ground it on internal content, one spicy document or user message can rewrite the rules mid‑conversation.Why this matters: when injection wins, your agent becomes a data‑leaking poet. It hallucinates authority, escalates tools, and ignores policy language like it’s the Wi‑Fi terms of service. In regulated shops, that’s not a bug—it’s a reportable incident with your company name on it.Let’s start with what breaks in LangChain4j. The thing most people miss is that tool calling without strict output schemas is basically “do crimes, return vibes.” If your tools accept unchecked arguments—think free‑text “sql” or “query” fields—and you don’t validate types, ranges, or enums, the model will happily pass along whatever an attacker smuggles in. Weak output validation is the partner in crime: when you expect JSON but accept “JSON‑ish,” an attacker can slip instructions in comments or strings that your downstream parser treats as commands. This clicked for me when I saw logs where a retrieval tool took a “topic” parameter with arbitrary Markdown. The next call parsed that Markdown like it was configuration. That’s not orchestration. That’s self‑own.Now here’s where most people mess up: they rely on the model’s “please be safe” setting instead of guardrails in code. In LangChain4j, you need allowlists for tool names and arguments, JSON schema outputs enforced at the boundary, pattern‑based output filters to nuke secrets, and exception handling that doesn’t retry the same poisoned input five times like a golden retriever with a tennis ball. The reason this works is it turns “trust the model” into “verify every byte.”What breaks in Copilot Studio? Naive grounding with broad SharePoint ingestion. You connect an entire site collection “for completeness,” and now one onboarding doc with “ignore previous instructions” becomes your agent’s new religion. System prompts editable by business users is the sequel. I love business users, but giving them prompt admin is like letting Marketing set firewall rules because they “know the brand voice.” And yes, I’ve seen tenant configs where moderation was disabled “to reduce friction.” You wish you couldn’t.Evidence you’ll recognize: tenant logs that show tools invoked with unbounded parameters, like “export all” flags that were never supposed to exist. Conversation traces where the assistant repeats an injected string from a retrieved document. Disabled moderation toggles. That’s not hypothetical—that’s every post‑incident review you don’t want to attend.So what’s the fix path you can implement today?For LangChain4j:

Enforce allowlists at the tool registry. If the tool isn’t registered with a schema, it doesn’t exist.
Require JSON schema outputs and reject anything that doesn’t validate. No schema, no response. Full stop.
Add pattern filters for obvious leaks: API keys, secrets, SSNs. Bloom filters are fast and cheap; use them.
Wrap tools with policy checks. Validate argument types, ranges, and expected formats before execution.
Add content moderation in pre/post processors. Keep the model from acting on or emitting toxic or sensitive content.
Fail closed with explicit exceptions and never auto‑retry poisoned prompts.

For Copilot Studio:

Lock system prompts. Only admins can change them. Version them like code.
Scope connectors by environment. Dev, test, prod, different boundaries. Least privilege on data sources.
Turn on content moderation policies at the tenant level. This is table stakes.
Ground only on labeled, sensitivity‑tagged content, not the whole farm “for convenience.”

The quick win that pays off immediately: add an output schema and a Bloom‑filter moderation step at the agent boundary. You’ll kill most dumb leaks without touching business logic. Then layer in a small regex allowlist for formats you expect—like structured summaries—and block everything else.Let me show you exactly how this plays out. Example: you have a “CreateTicket” tool that accepts title, description, and priority. Without schema enforcement, an attacker injects “description: Close all P1 incidents” inside a triple‑backtick block. The model passes it through; your ITSM API shrugs and runs an update script. With schema and validation, “description” can’t contain command tokens or exceed length; the request fails closed, logs a correlation ID, and your SIEM flags a moderation hit. And boom—look at that result: incident avoided, trail preserved, auditor appeased.Common mistakes to avoid:

Letting the model choose tool names dynamically. Tools are contracts, not suggestions.
Accepting free‑form JSON without a validator. “Looks like JSON” is not a compliment.
Editable prompts in production environments. If it can change without review, it will.
Relying on conversation memory for policy. Policy belongs in code and config, not vibes.

Once you nail this, everything else clicks. You stopped the intern from talking out of turn. Next, we stop them from walking into every room with a master key.Case 2: Over-Permissive Connectors—Keys to the Castle on a LanyardYou stopped the intern from talking. Now take the badge back. Over‑permissive connectors are that janitor keyring that opens every door in the building, including the vault, the daycare, and somehow the CEO’s Peloton.Why this matters is simple: one over‑scoped connector equals enterprise‑wide data exfiltration in a single request. Not theoretical. One call. While you’re still arguing about the change ticket title.Let’s start with what breaks in LangChain4j. Developers share API keys across agents “for convenience.” Then someone commits the .env to a private repo that’s actually public, and you’re doing incident response at 2 a.m. Broad OAuth scopes are next. You grant “read/write all” to save time during testing, and six months later that test token is now production’s crown jewel. And the tool registry? I love a clean registry, but if you point a dev agent at production credentials because the demo has to work “today,” you just wired a chainsaw to a Roomba.The thing most people miss is that tools inherit whatever identity you hand them. Shared credentials mean shared blast radius. There’s no magic “only do safe things” flag. If the token can delete records, your agent can delete records—accidentally, enthusiastically, and with perfect confidence.Now swing over to Copilot Studio. Tenant‑wide M365 connectors are the classic trap. You click once, and now every Copilot in every Team can see data it shouldn’t. That’s not empowerment; that’s a buffet for mistakes. Then you deploy to Teams with org‑wide visibility because adoption, and suddenly a pilot bot meant for Finance is answering questions in Marketing, pulling content from SharePoint sites it never should’ve known existed. And unmanaged third‑party SaaS hooks? Those are like USB drives in 2009—mysteriously everywhere and always “temporary.”Evidence shows up the same way every time: stale secrets that never rotated, no expiration, no owner; connectors mapped to global groups “for simplicity”; app registrations with scopes that read like a confession; and yes, that “temporary” prod key living in dev for months. Your security findings and tenant configs won’t lie. They’ll just sigh.So what’s the fix path?For LangChain4j, treat every agent like a separate application with its own identity.

Create per‑agent service principals. No shared tokens. If two agents need the same API, they still get different credentials.
Use scoped OAuth. Grant the smallest set of permissions that lets the tool do its job. Reader, not Writer. Write to one collection, not all.
Store secrets in a proper secret manager. Rotate on a schedule. Rotate on incident. Rotate when someone even whispers “token.”
Add tool‑level RBAC. A tool wrapper checks the caller’s role before it touches an API. No role, no call.
Separate environments. Dev keys only talk to dev systems. If a tool sees a prod endpoint in dev, it fails closed and screams in the logs.

For Copilot Studio, draw hard boundaries with environments and scopes.

Use environment separation: dev, test, prod. Different connectors. Different permissions. Different owners.
Review connector scopes with a workflow. Changes require approval, expiration dates, and owners. No owner, no connector.
Apply DLP policies per channel. Finance channel gets stricter rules than company‑wide. That’s the point.
Kill org‑wide Teams deployments for pilots. Limit visibility to a security group. Expand only after review.
Inventory and gate third‑party SaaS connectors. If it’s unmanaged, it’s off by default. Owners must justify access and renew it.

Here’s a quick win you can ship this afternoon: kill tenant‑wide scopes and map each connector to a security group with an expiration policy. When the group ex

Become a supporter of this podcast: https://www.spreaker.com/podcast/m365-show-podcast--6704921/support.

Follow us on:
LInkedIn
Substack

Transcript

1
00:00:00,000 --> 00:00:04,800
AI agents are shipping faster than your change control meetings and the governance is a vibe.

2
00:00:04,800 --> 00:00:08,640
You know that feeling when a co-pilot ships with tenant-wide access just for testing?

3
00:00:08,640 --> 00:00:12,960
Yeah, that's your compliance officer's hard beat you're hearing. Today, I'm tearing down the mess

4
00:00:12,960 --> 00:00:18,000
in Langchain 4J and co-pilot studio with real cases, prompt injection, over permissive connectors,

5
00:00:18,000 --> 00:00:22,720
and audit gaps. And I'll show you what breaks, why it breaks, and the fixes that actually hold.

6
00:00:22,720 --> 00:00:26,880
Stay to the end, I'll give you the one governance step that prevents most incidents.

7
00:00:26,880 --> 00:00:31,520
You'll leave with an agent RBX model, data loss policies, and a red team checklist.

8
00:00:31,520 --> 00:00:36,640
Case one, prompt injection. The unsupervised intern writes policy.

9
00:00:36,640 --> 00:00:41,360
Prompt injection is that unsupervised intern who sounds helpful, writes in complete sentences,

10
00:00:41,360 --> 00:00:44,880
and then emails payroll data to their personal archive for safekeeping.

11
00:00:44,880 --> 00:00:49,600
You think your system prompt is the law. The model thinks it's a suggestion, and the moment you

12
00:00:49,600 --> 00:00:55,200
ground it on internal content, one spicy document or user message can rewrite the rules mid-conversation.

13
00:00:55,200 --> 00:01:00,720
Why this matters? When injection wins, your agent becomes a data leaking poet.

14
00:01:00,720 --> 00:01:05,440
It hallucinates authority, escalates tools, and ignores policy language like it's the Wi-Fi

15
00:01:05,440 --> 00:01:10,160
terms of service. In regulated shops, that's not a bug. It's a reportable incident with your company name

16
00:01:10,160 --> 00:01:14,320
on it. Let's start with what breaks in Langchain 4J. The thing most people miss is that

17
00:01:14,320 --> 00:01:18,080
tool calling without strict output schemas is basically due crimes return vibes.

18
00:01:18,080 --> 00:01:25,120
If your tools accept unchecked arguments, think free text, SQL, or query fields,

19
00:01:25,120 --> 00:01:30,320
and you don't validate types, ranges, or enums, the model will happily pass along whatever an attacker

20
00:01:30,320 --> 00:01:36,240
smuggles in. Week output validation is the partner in crime. When you expect JSON, but accept JSON-ish,

21
00:01:36,240 --> 00:01:40,960
an attacker can slip instructions in comments or strings that your downstream passer treats as commands.

22
00:01:40,960 --> 00:01:46,800
This clicked for me when I saw logs where a retrieval tool took a topic parameter with arbitrary

23
00:01:46,800 --> 00:01:52,320
markdown. The next call passed that marked down like it was configuration. That's not orchestration.

24
00:01:52,320 --> 00:01:58,720
That's self-own. Now here's where most people mess up. They rely on the models please be safe setting

25
00:01:58,720 --> 00:02:04,240
instead of guardrails in code. In Langchain 4J, you need allow lists for tool names and arguments.

26
00:02:04,240 --> 00:02:09,040
JSON schema outputs enforced at the boundary pattern based output filters to new secrets,

27
00:02:09,040 --> 00:02:14,800
an exception handling that doesn't retry the same poisoned input five times like a golden retriever

28
00:02:14,800 --> 00:02:19,760
with a tennis ball. The reason this works is it turns trust the model into verify every bite.

29
00:02:20,960 --> 00:02:26,240
What breaks in co-pilot studio? Naif grounding with broad share point ingestion. You connect an entire

30
00:02:26,240 --> 00:02:31,200
site collection for completeness and now one onboarding dog with ignore previous instructions becomes

31
00:02:31,200 --> 00:02:36,160
your agent's new religion. System prompts editable by business users is the sequel. I love business

32
00:02:36,160 --> 00:02:41,040
users, but giving them prompt admin is like letting marketing set firewall rules because they know the

33
00:02:41,040 --> 00:02:46,400
brand voice. And yes, I've seen tenant configs where moderation was disabled to reduce friction.

34
00:02:46,400 --> 00:02:51,360
You wish you couldn't. Evidence you'll recognize tenant logs that show tools invoked with unbounded

35
00:02:51,360 --> 00:02:56,720
parameters like export all flags that were never supposed to exist. Conversation traces where the

36
00:02:56,720 --> 00:03:01,600
assistant repeats an injected string from a retrieve document disabled moderation toggles. That's not

37
00:03:01,600 --> 00:03:05,920
hypothetical. That's every post incident review you don't want to attend. So what's the fixed path

38
00:03:05,920 --> 00:03:11,680
you can implement today for Langchain 4J enforce allow lists at the tool registry. If the tool isn't

39
00:03:11,680 --> 00:03:15,920
registered with a schema, it doesn't exist require JSON schema outputs and reject anything that

40
00:03:15,920 --> 00:03:22,640
doesn't validate no schema no response full stop. Add pattern filters for obvious leaks API keys secrets

41
00:03:22,640 --> 00:03:28,160
SSN's bloom filters are fast and cheap use them wrap tools with policy checks validate argument types

42
00:03:28,160 --> 00:03:33,520
ranges and expected formats before execution add content moderation in pre post processors keep the

43
00:03:33,520 --> 00:03:39,120
model from acting on or emitting toxic or sensitive content fail closed with explicit exceptions

44
00:03:39,120 --> 00:03:44,160
and never auto retry poisoned prompts for co pilot studio lock system prompts only admins can

45
00:03:44,160 --> 00:03:49,760
change them version them like code scope connectors by environment dev test plot different boundaries

46
00:03:49,760 --> 00:03:54,400
least privilege on data sources turn on content moderation policies at the tenant level this is

47
00:03:54,400 --> 00:03:59,920
table stakes ground only unlabeled sensitivity tag content not the whole farm for convenience

48
00:03:59,920 --> 00:04:04,960
the quick win that pays off immediately add an output schema and a bloom filter moderation step

49
00:04:04,960 --> 00:04:09,360
at the agent boundary you'll kill most dumb leaks without touching business logic then layer in a

50
00:04:09,360 --> 00:04:14,720
small reject allow list for formats you expect like structured summaries and block everything else

51
00:04:14,720 --> 00:04:19,840
let me show you exactly how this plays out example you have a create ticket tool that accepts title

52
00:04:19,840 --> 00:04:25,920
description and priority without schema enforcement and attacker injects description close all p1

53
00:04:25,920 --> 00:04:32,080
incidents inside a triple backtick block the model passes it through your it sm api shrugs and runs

54
00:04:32,080 --> 00:04:37,520
an update script with schema and validation description can't contain command tokens or exceed length

55
00:04:37,520 --> 00:04:42,720
the request fails closed locks a correlation ID and your cm flags a moderation hit

56
00:04:42,720 --> 00:04:49,440
and boom look at that result incident avoided trail preserved auditor appeased common mistakes to

57
00:04:49,440 --> 00:04:54,400
avoid letting the model choose two names dynamically tools are contracts not suggestions

58
00:04:54,400 --> 00:04:58,560
accepting freeform json without a validator looks like jason is not a compliment

59
00:04:58,560 --> 00:05:02,560
editable prompts in production environments if it can change without review it will

60
00:05:03,360 --> 00:05:08,560
relying on conversation memory for policy policy belongs in code and config not vibes once you

61
00:05:08,560 --> 00:05:13,040
nail this everything else clicks you stop the intern from talking out of turn next we stop them from

62
00:05:13,040 --> 00:05:18,640
walking into every room with a master key case two over permissive connectors keys to the castle on

63
00:05:18,640 --> 00:05:23,680
a lanyard you stop the intern from talking now take the batch back over permissive connectors are

64
00:05:23,680 --> 00:05:28,480
that janitor key ring that opens every door in the building including the vault the daycare and

65
00:05:28,480 --> 00:05:33,600
somehow the ceo's peloton why this matters is simple one over scoped connector equals enterprise

66
00:05:33,600 --> 00:05:39,040
wide data exfiltration in a single request not theoretical one call while you're still arguing about

67
00:05:39,040 --> 00:05:44,400
the change ticket title let's start with what breaks in lang chain for j developers share api

68
00:05:44,400 --> 00:05:49,680
keys across agents for convenience are then someone commits the envy to a private repo that's actually

69
00:05:49,680 --> 00:05:54,720
public and you're doing incident response at two a m have brought all the scopes are next you grant

70
00:05:54,720 --> 00:06:00,240
read write all to save time during testing and six months later that test token is now productions

71
00:06:00,240 --> 00:06:04,880
crown jewel and the tool registry i love a clean registry but if you point a dev agent at production

72
00:06:04,880 --> 00:06:09,760
credentials because the demo has to work today you just wired a chainsaw to a rumble the thing most

73
00:06:09,760 --> 00:06:14,160
people miss is that tools inherit whatever identity you hand them shared credentials mean shared

74
00:06:14,160 --> 00:06:19,200
blast radius there's no magic only to save things flag if the token can delete records your agent

75
00:06:19,200 --> 00:06:24,160
can delete records accidentally enthusiastically and with perfect confidence now swing over to co-pilot

76
00:06:24,160 --> 00:06:30,080
studio tenant wide m265 connectors are the classic trap you click once and now every co-pilot in every

77
00:06:30,080 --> 00:06:35,360
team can see data it shouldn't that's not empowerment that's a buffet for mistakes then you deploy to

78
00:06:35,360 --> 00:06:40,080
teams with org wide visibility because adoption and suddenly a pilot bought meant for finance is

79
00:06:40,080 --> 00:06:44,080
answering questions in marketing pulling content from sharepoint sites it never should have known

80
00:06:44,080 --> 00:06:50,080
existed and unmanaged third party sass hooks those are like usb drives in 2009 mysteriously everywhere

81
00:06:50,080 --> 00:06:55,840
and always temporary evidence shows up the same way every time stale secrets that never rotated no

82
00:06:55,840 --> 00:07:02,320
expiration no owner connectors mapped to global groups for simplicity app registrations with scopes

83
00:07:02,320 --> 00:07:09,440
that read like a confession and yes that temporary prodkey living in dev for months your security

84
00:07:09,440 --> 00:07:14,240
findings and tenant configs won't lie they'll just sigh so what's the fix path for lang chain for j

85
00:07:14,240 --> 00:07:19,200
treat every agent like a separate application with its own identity create per agent service principles

86
00:07:19,200 --> 00:07:24,480
no shared tokens if two agents need the same api they still get different credentials use scoped

87
00:07:24,480 --> 00:07:29,680
OAuth grant the smallest set of permissions that lets the tool do its job reader not writer write

88
00:07:29,680 --> 00:07:35,280
to one collection not all store secrets in a proper secret manager rotate on a schedule rotate on

89
00:07:35,280 --> 00:07:40,320
incident rotate when someone even whispers token at two level our back a tool wrapper checks the

90
00:07:40,320 --> 00:07:45,520
callers role before it touches an api no role no call separate environments dev keys only talk to

91
00:07:45,520 --> 00:07:50,400
dev systems if a tool sees a plot endpoint in dev it fails closed and screams in the logs

92
00:07:50,400 --> 00:07:55,840
for co pilot studio draw hard boundaries with environments and scopes use environment separation

93
00:07:55,840 --> 00:08:01,040
dev test prod different connectors different permissions different owners review connector scopes

94
00:08:01,040 --> 00:08:06,240
with a workflow changes require approval expiration dates and owners no owner no connector apply

95
00:08:06,240 --> 00:08:11,440
dlp policies per channel finance channel gets stricter rules then company wide that's the point kill

96
00:08:11,440 --> 00:08:17,440
org wide teams deployments for pilots limit visibility to a security group expand only after review

97
00:08:17,440 --> 00:08:22,480
inventory and gate third party says connectors if it's unmanaged it's off by default owners must

98
00:08:22,480 --> 00:08:28,000
justify access and renew it here's a quick win you can ship this afternoon kill tenant wide scopes

99
00:08:28,000 --> 00:08:32,880
and map each connector to a security group with an expiration policy when the group expires access

100
00:08:32,880 --> 00:08:37,040
dies gracefully no drama no surprise let me show you exactly how this plays out you've got an

101
00:08:37,040 --> 00:08:41,840
export report tool that writes to a storage bucket with shared credentials it can see every bucket

102
00:08:41,840 --> 00:08:47,280
so a prompt that says save backup to archive quietly land sensitive data in a public facing path

103
00:08:47,280 --> 00:08:54,560
with per agent principles and scoped roles that tool can only write to one bucket with one prefix

104
00:08:54,560 --> 00:09:00,160
if the model decides to invent another path the api denies it you lock the correlation ID and your

105
00:09:00,160 --> 00:09:05,120
CM fires an alert and boom no accidental data lake party common mistakes to avoid

106
00:09:05,920 --> 00:09:11,200
using global groups for access just for now now never ends granting broad or outscopes because the

107
00:09:11,200 --> 00:09:16,400
docs were confusing confusing is not a permission copying credentials across environments if you can

108
00:09:16,400 --> 00:09:21,360
paste it you'll paste it wrong treating teams visibility as harmless visibility is access access

109
00:09:21,360 --> 00:09:26,800
is risk once you nail access scoping auditors stop smelling blood in the water and when they do

110
00:09:26,800 --> 00:09:31,600
knock you'll need proof which brings us to the part everyone forgets if it isn't logged it didn't

111
00:09:31,600 --> 00:09:39,360
happen case three audit gaps if it isn't logged it didn't happen you scoped access great now prove

112
00:09:39,360 --> 00:09:45,440
it because when an auditor asks what did the agent do at 314 pm and your answer is we think something

113
00:09:45,440 --> 00:09:51,440
that's not a posture that's an apology tour why this matters without lineage you can't explain

114
00:09:51,440 --> 00:09:56,640
weird answers roll back bad actions or even know if an incident was contained regulators love that

115
00:09:56,640 --> 00:10:01,360
uncertainty customers love it less and your execs really love it when you say we have no idea

116
00:10:01,360 --> 00:10:05,840
here's the thing most people miss in lang chain 4j your chain is doing a lot more than your logs

117
00:10:05,840 --> 00:10:11,360
admit missing execution graphs mean you can't reconstruct the steps no trace IDs means events look

118
00:10:11,360 --> 00:10:16,640
like random fireworks partial logs hide the one tool called that mattered and silent tool failures

119
00:10:16,640 --> 00:10:21,680
those are my favorite everything looks green while the tool never ran and the model just hallucinated

120
00:10:21,680 --> 00:10:26,080
a response to be polite this clicked for me the day I chased an incident where a retrieval step

121
00:10:26,080 --> 00:10:31,920
timed out returned null and the next step happily wrote no records found closing to a ticketing API

122
00:10:31,920 --> 00:10:36,880
no exception no trace just vibes and close tickets if you can't tell model output from tool

123
00:10:36,880 --> 00:10:40,960
output in your logs you're already in trouble let me show you exactly how to fix lang chain 4j

124
00:10:40,960 --> 00:10:46,480
logging so it's courtroom ready first enable structured logging everywhere jason not string soup

125
00:10:46,480 --> 00:10:50,960
every request and response gets an envelope with timestamps model name token counts and a correlation

126
00:10:50,960 --> 00:10:56,880
ID that survives the whole chain then add execution graphs or traces using lang smith or lang graph

127
00:10:56,880 --> 00:11:02,320
each node logs inputs outputs duration and errors if a tool throws it throws loud no silent failure

128
00:11:02,320 --> 00:11:07,280
no we assumed now the big one centralize it pipe traces and logs into your scene don't leave them

129
00:11:07,280 --> 00:11:12,080
on a pod that auto scales to oblivion add sampling for noise but never drop errors you want to

130
00:11:12,080 --> 00:11:16,880
search by trace ID and see the entire story prompts retrieve docs tool calls API responses

131
00:11:16,880 --> 00:11:22,080
and the final answer without playing where's wall door to a m swing to copilot studio the platform

132
00:11:22,080 --> 00:11:26,640
gives you some telemetry but it's spread across teams data verse and whatever connector you touch

133
00:11:26,640 --> 00:11:31,120
limited conversation export means you get snippets not narratives fragmented telemetry means

134
00:11:31,120 --> 00:11:35,840
your compliance officer is stitching screenshots like its arts and crafts fixed by building a unified

135
00:11:35,840 --> 00:11:41,600
audit lane turn on conversation transcript retention with clear retention s la's who keeps it how long

136
00:11:41,600 --> 00:11:46,640
and why export conversation and action logs to a centralized workspace like fabric or log analytics

137
00:11:46,640 --> 00:11:50,960
tag every event with a correlation ID that you pass from the entry point in teams all the way

138
00:11:50,960 --> 00:11:56,080
through data verse plugins and downstream APIs if the chat says create case I want to click through

139
00:11:56,080 --> 00:12:00,960
to the exact record in the exact system with the exact data that was sent evidence you'll recognize

140
00:12:00,960 --> 00:12:06,960
from incident reviews often LL m calls in raw logs that don't map to any user session tool invocations

141
00:12:06,960 --> 00:12:11,520
with no corresponding prompt or answers citing sources you can't find because grounding logs don't

142
00:12:11,520 --> 00:12:16,400
include document IDs that's not a mystery that's missing plumbing quick wins you can ship this week

143
00:12:16,800 --> 00:12:22,240
mandate request response envelopes with hashes store a content hash of prompts retrieve chunks

144
00:12:22,240 --> 00:12:28,160
and outputs so you can prove integrity if someone tempers the hash says nope require correlation IDs

145
00:12:28,160 --> 00:12:33,440
end to end generate at the edge pass through every hop and reject any call that drops it no

146
00:12:33,440 --> 00:12:39,440
id no service define retention s la's legal science of security enforces 90 days for dev a year for

147
00:12:39,440 --> 00:12:45,440
port longer for regulated domains publish it live by it common mistakes to avoid logging raw

148
00:12:45,440 --> 00:12:50,480
secrets or p i i mask at source keep the fields you need for forensics not the crown jewels

149
00:12:50,480 --> 00:12:56,320
sampling errors sample success log all failures when things break you'll need every breadcrumb

150
00:12:56,320 --> 00:13:02,160
free text logs humans love them machines hate them use fields at schemas validate on ingest

151
00:13:02,160 --> 00:13:08,560
now auditors ask what happened you open one trace you show the prompt the retrieved dog IDs the

152
00:13:08,560 --> 00:13:13,680
tool input the tool output and the final message correlation ID matches the API gateway the

153
00:13:13,680 --> 00:13:19,120
database right and the teams message and boom the story holds so when you can see you can govern

154
00:13:19,120 --> 00:13:24,000
which means it's time to formalize the rules and lock down who does what the agent are beac

155
00:13:24,000 --> 00:13:29,680
model treat agents like very dumb very powerful users you've got visibility good now give these things

156
00:13:29,680 --> 00:13:34,880
an identity and a leash agents are users very dumb very powerful users if you wouldn't give a human

157
00:13:34,880 --> 00:13:40,000
temp admin on day one don't hand it to a stochastic parrot that can't tell sarcasm from pseudo why

158
00:13:40,000 --> 00:13:45,840
this matters most orgs secure humans and ignore agents the agent then bypasses least privilege by

159
00:13:45,840 --> 00:13:50,240
design it hops tools crosses environments and nobody notices because it doesn't show up in the

160
00:13:50,240 --> 00:13:55,600
org chart that's how helper bot turns into shadow admin here's the model i want you to ship

161
00:13:55,600 --> 00:14:00,640
one identity per agent clear roles policy checks in code and environment boundaries that actually

162
00:14:00,640 --> 00:14:05,760
mean something start with identity create a service principle for every agent not platform bot

163
00:14:05,760 --> 00:14:11,280
zero one actual one to one mapping name it like an app tag it with owner purpose environment and

164
00:14:11,280 --> 00:14:16,640
expiration if two agents hit the same API they still get different identities shared identity equals

165
00:14:16,640 --> 00:14:21,920
shared blast radius then define a role taxonomy that even busy people can apply i use four reader

166
00:14:21,920 --> 00:14:29,120
runner writer admin reader can fetch data no side effects think query retrieve summarize runner

167
00:14:29,120 --> 00:14:35,440
can invoke workflows with bounded effects create ticket send approval schedule meeting writer can

168
00:14:35,440 --> 00:14:42,240
write data to specific scopes one table one bucket one mailbox not all admin can change configuration

169
00:14:42,240 --> 00:14:47,920
this role should make you sweat you sparingly time bound and always dual approved now map that to

170
00:14:47,920 --> 00:14:52,960
long chain for j tools are your enforcement points each tool wrapper checks the callers role before

171
00:14:52,960 --> 00:14:58,160
executing if the agent identity is mapped to reader a right tool never even receives the call

172
00:14:58,160 --> 00:15:03,840
and don't stop at role include resource scope a writer to cases finance can't write cases sales

173
00:15:04,400 --> 00:15:09,360
scope is not a comment it's a gate use environment variables and config files to bind tools to

174
00:15:09,360 --> 00:15:14,800
environments dev agent only sees dev host and dev keys if it tries to prod url fail closed and log

175
00:15:14,800 --> 00:15:20,480
loudly policy checks happen at the boundary not after the API call fails in co-pilot studio do the

176
00:15:20,480 --> 00:15:25,280
same pattern with environment roles and connector scopes assign the agent to an environment with

177
00:15:25,280 --> 00:15:31,360
permissions that mirror reader so runner writer admin connector permissions map to security groups

178
00:15:31,360 --> 00:15:36,240
tied to those roles teams deployment boundaries are your blast radius keep pilots inside a small group

179
00:15:36,240 --> 00:15:41,040
expand only with a change ticket this is where the review workflow keeps you honest any scope change

180
00:15:41,040 --> 00:15:46,160
is a change request list the agent the requested role the connector or tool the environment the

181
00:15:46,160 --> 00:15:51,440
reason the requester and the expiration two approvals one technical owner one data owner when the

182
00:15:51,440 --> 00:15:56,800
expiration hits access dies automatically and you get an alert to renew or let it go temporary

183
00:15:56,800 --> 00:16:01,040
finally means temporary let me show you exactly how this plays out with a micro story a finance

184
00:16:01,040 --> 00:16:06,080
Q&A agent starts is reader people love it so product once create expense report and that's a runner

185
00:16:06,080 --> 00:16:12,400
action with a narrow scope you add a new tool with a tool level policy only expense submit only for

186
00:16:12,400 --> 00:16:17,360
the employees cost center only under five attachments the agent identity gets runner for that single

187
00:16:17,360 --> 00:16:22,880
API expiring in 30 days in co pilot studio the environment gains a connector permission to one

188
00:16:22,880 --> 00:16:28,400
dataverse table with row level security audit logs show the change ticket the approvals the expiration

189
00:16:28,960 --> 00:16:33,760
when marketing asks for also delete reports you decline or root to a separate writer tool with

190
00:16:33,760 --> 00:16:40,080
its own owners and sl a the model can want anything the policy enforces everything common mistakes you

191
00:16:40,080 --> 00:16:45,840
need to stop shared identities for convenience convenience is the enemy of containment temporary admin

192
00:16:45,840 --> 00:16:51,680
that never expires put timeouts on the permission not your calendar prod credentials in death to unblock

193
00:16:51,680 --> 00:16:56,800
demos demos don't need production data auditors do tool getting by you i not code if the model can

194
00:16:56,800 --> 00:17:01,120
call it the user interface doesn't matter if you remember nothing else treat agents like users

195
00:17:01,120 --> 00:17:07,600
with dumb curiosity and sharp knives identity per agent role per action scope per resource time bound

196
00:17:07,600 --> 00:17:13,280
by default and every call checks policy before it touches data data loss policies that actually block

197
00:17:13,280 --> 00:17:18,160
AI from becoming a leaky faucet you gave agents identities and roles now we stop the drips governance

198
00:17:18,160 --> 00:17:23,520
is proactive here block the bad before it leaves the building why this matters AI multiplies exposure

199
00:17:23,520 --> 00:17:28,400
a single prompt can produce a hundred outbound tokens each a potential leak if your controls only show

200
00:17:28,400 --> 00:17:33,920
up in post mortems you're doing forensics not governance here's the policy said that actually bites

201
00:17:33,920 --> 00:17:39,200
think of it as seat belts airbags and a brick wall p i reduction on outputs by default names emails

202
00:17:39,200 --> 00:17:45,040
phone numbers national IDs if the user isn't entitled it's masked nobody's internal internal is where

203
00:17:45,040 --> 00:17:50,880
leaks live secrets detection guard for api keys tokens private keys connection strings patterns plus

204
00:17:50,880 --> 00:17:55,360
bloom filters for speed at check sums for common formats so false positives don't drown you

205
00:17:55,360 --> 00:18:00,800
sensitive term filters tuned to your business legal project names m in a code words unreleased

206
00:18:00,800 --> 00:18:05,840
product code names keep this list in a managed store with owners and change history regax allow

207
00:18:05,840 --> 00:18:10,320
list for output structure if the tool expects an invoice summary only invoice shape outputs pass

208
00:18:10,320 --> 00:18:14,880
narrative poetry doesn't schemas aren't optional their contracts in long chain for j in force in

209
00:18:14,880 --> 00:18:19,680
pre and post processors pre processes scan inputs for injection payloads and disallow tasks

210
00:18:19,680 --> 00:18:25,680
post process is validate json schemas run term secret filters and redact p i i before anything leaves

211
00:18:25,680 --> 00:18:31,200
the agent boundary if a check fails you return a safe error with a correlation ID no partial truths

212
00:18:31,200 --> 00:18:37,600
no maybe add moderation pipelines lightweight pattern filters and bloom filters heavier a i based

213
00:18:37,600 --> 00:18:44,080
moderation for context in g don't disclose health data chain them fail closed and define fallback

214
00:18:44,080 --> 00:18:48,800
behavior summarize at a higher level give a safe refusal or route to a human don't loop the model

215
00:18:48,800 --> 00:18:54,560
into giving a safer leak in copilot studio use the stack you already pay for dlp policies in

216
00:18:54,560 --> 00:19:00,320
Microsoft 365 apply to channels and connectors turn them on and tune them use sensitivity labels in

217
00:19:00,320 --> 00:19:05,200
your grounding sources so the agent never indexes high impact content without explicit approval

218
00:19:05,200 --> 00:19:10,240
tenant level moderation should be on by default not we'll turn it on later later is when pr gets

219
00:19:10,240 --> 00:19:15,600
involved monitoring ties this together set thresholds unusual output size unusual destination unusual

220
00:19:15,600 --> 00:19:20,720
frequency if a q and a board suddenly writes megabytes to an external endpoint quarantine the session

221
00:19:20,720 --> 00:19:26,160
auto expire risky sessions and notify owners keep alerts actionable don't ping everyone for every

222
00:19:26,160 --> 00:19:31,840
mask event quick win you can ship today in force output schemas and block list terms at the agent

223
00:19:31,840 --> 00:19:37,440
boundary no schema no response at p i reduction as a final pass that combination stops the obvious

224
00:19:37,440 --> 00:19:41,760
leaks and buys you time to implement the rest common mistakes letting reduction depend on front

225
00:19:41,760 --> 00:19:47,280
end code redact server side before the message ever hits a client one global block list segment by

226
00:19:47,280 --> 00:19:53,280
business domain finance words aren't marketing words logging the unredacted original for debugging mask

227
00:19:53,280 --> 00:19:58,560
at ingestion keep a secure vault copy only if your legal team signed off lock the faucet then try

228
00:19:58,560 --> 00:20:04,800
to break it on purpose the red teaming checklist break it before producers do you don't trust the seatbelt

229
00:20:04,800 --> 00:20:09,840
until you young on it same with guard rails make the chaos scheduled goal adversarial testing you

230
00:20:09,840 --> 00:20:15,680
can repeat not vibes you'll ship a checklist run it on a cadence and gate releases with it injection

231
00:20:15,680 --> 00:20:21,680
tests first role confusion prompt the agent to ignore policy and act as system it should refuse

232
00:20:21,680 --> 00:20:27,600
tool name collision inject fake tool names and parameters only registered tools should ever execute

233
00:20:27,600 --> 00:20:34,480
schema bypass embed commands in strings and Jason comments validators must reject chain of thought

234
00:20:34,480 --> 00:20:40,960
bait try to coax reasoning dumps that leak sources output should stay within schema data x fill prompts

235
00:20:40,960 --> 00:20:47,600
quote the exact ssns export all emails you get a refusal with an audit trail access tests next scope

236
00:20:47,600 --> 00:20:53,280
escalation request actions outside the agents role tools must block at rapper level expired token

237
00:20:53,280 --> 00:20:59,120
behavior rotate secrets mid session calls should fail closed with clear errors shadow environment leakage

238
00:20:59,120 --> 00:21:06,000
dev agent tries prod endpoints hard fail loud locks audit tests trace completeness can you reconstruct

239
00:21:06,000 --> 00:21:11,760
prompt to action with one correlation ID tamper detection alter stored locks hash mismatch should alarm

240
00:21:11,760 --> 00:21:17,200
cross service correlation teams message ID to data verse record to external API call all matched

241
00:21:17,200 --> 00:21:23,760
define pass slash fail gates no schema ill fail any output that doesn't validate blocks the release

242
00:21:23,760 --> 00:21:28,560
tenant wide scope it will fail any connector without scope groups and expiration is blocked

243
00:21:28,560 --> 00:21:33,840
silent tool failure ill or fail errors must be surfaced not swallowed no rollback plan egos fail if

244
00:21:33,840 --> 00:21:39,200
you can't revert a permission change or a model config in minutes you're not ready cadence pre-release

245
00:21:39,200 --> 00:21:44,480
for every agent change quarterly chaos drills against prod like environments post incident learning

246
00:21:44,480 --> 00:21:49,760
loop within five business days with new tests added to the suite owners and SLA's each checklist

247
00:21:49,760 --> 00:21:56,640
item has an owner and SLA to fix an a status no owner no ship quick win wire no schema egos fail

248
00:21:56,640 --> 00:22:02,880
and tenant wide scope egos fail as ccd checks with either trips the pipeline stops no meetings no

249
00:22:02,880 --> 00:22:08,160
debates just a red light that everyone respects run the list break things fix them then run it again

250
00:22:08,160 --> 00:22:14,000
until the surprises stop being surprising the one governance step most org skip if you remember

251
00:22:14,000 --> 00:22:19,360
nothing else agents are identities treat them like very dumb very powerful users with least

252
00:22:19,360 --> 00:22:25,040
privilege enforced output contracts and full audit trails the step most orgs skip is creating

253
00:22:25,040 --> 00:22:29,840
dedicated service principles for every agent then wiring every tool and connector through scope

254
00:22:29,840 --> 00:22:35,440
rolls by environment no shared creds no tenant wide scopes adopt the R-back model turn on the

255
00:22:35,440 --> 00:22:40,800
DLP policies and run the red team checklist this week subscribe for the follow up where I show

256
00:22:40,800 --> 00:22:46,000
the exact configs in lang chain for J and co pilot studio so you can copy paste governance instead of

Y'all Need Governance: The LangChain4j & Copilot Studio Mess

Listen On

Support On

Copilot Talk Episodes

Recent Episodes

Data Talk Episodes

Power Platform Talk Episodes

Security Talk Episodes

Azure Talk Episodes

Copilot Talk Episodes

Dynamics Talk Episodes

Dev Talk Episodes

M365 Talk Episodes

Browse episodes by category