Learn why AI agent identity and permissions fail at the control plane, and how to fix delegation, scope, and auditability.
Most teams obsess over the model. I think the scarier problem is everything around it. Once an AI agent can click, read, write, and delegate, your real risk stops being "bad output" and becomes "bad authority."
AI agent identity is a control-plane problem because the core failure is not usually language generation; it is authority assignment. Once an agent can access tools and data, the key question becomes which identity it carries, which permissions follow it, and what policy layer validates each action before execution [1][2].
Here's what I keep noticing in product teams: they treat agents like fancy chatbots, then wire them into real systems using a broad service account. It works fast. It demos well. It also quietly destroys accountability.
Google's recent SAIF guidance is unusually direct on this point. It says agentic systems require identity propagation, and warns against broad service accounts. Agents acting on a user's behalf should propagate the user's actual identity and permissions to every backend tool they touch [1]. That's a strong statement, and honestly, it should be much more widely discussed.
The research says the same thing from a different angle. The Berkeley-led survey on agent security argues that traditional IAM assumptions break down because agents need agent-specific identities, delegation, and dynamic access control at runtime [2]. In other words, static app auth is not enough when the workflow itself is fluid.
When agents use the wrong identity, three things break at once: attribution, least privilege, and revocation. You can no longer tell who really initiated an action, you grant more access than needed, and removing a risky capability becomes messy and slow [1][2].
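To make that concrete, here is a minimal sketch of identity propagation, assuming a hypothetical `Principal` type and `call_tool` helper (the names and permission strings are illustrative, not any real IAM API). The point is that every backend call carries the user's resolved permissions, so attribution, least privilege, and revocation all stay intact:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Principal:
    user_id: str               # the human the agent is acting for
    permissions: frozenset     # resolved from the user, not from the agent

def call_tool(tool: str, action: str, principal: Principal) -> dict:
    """Refuse any action the propagated user identity does not hold."""
    required = f"{tool}:{action}"
    if required not in principal.permissions:
        raise PermissionError(f"{principal.user_id} lacks {required}")
    # Attribution survives: the backend sees who really initiated this.
    return {"tool": tool, "action": action, "actor": principal.user_id}

alice = Principal("alice", frozenset({"crm:read"}))
call_tool("crm", "read", alice)       # allowed, attributed to alice
# call_tool("billing", "update", alice) would raise PermissionError
```

Revoking Alice's access revokes the agent's access in the same move, because the agent never had authority of its own.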
A lot of agent stacks still collapse identity into one of two bad options. Either the agent impersonates the human too broadly, or it acts through a generic automation account with giant privileges. Both are dangerous. The first leaks user authority across tasks. The second creates an unaccountable robot admin.
The research literature maps this pretty cleanly. Identity in agent systems can be user-level, agent-level, or task-level [2]. That sounds abstract, but it matters. User-level identity helps with accountability. Agent-level identity helps isolate a persistent agent. Task-level identity is often the cleanest least-privilege move because it creates short-lived authority for one bounded job.
That's why "who is this agent?" is the wrong standalone question. The better question is: "Who is this agent acting as, for this task, with what expiry, under what approval path?"
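One way to answer that question in code is a short-lived, task-scoped identity. This is a sketch under assumed names (`TaskIdentity`, `mint_task_identity` are hypothetical, not a standard library): the identity records who it acts for, carries only the scopes the bounded job needs, and expires on its own:

```python
import secrets
import time
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskIdentity:
    task_id: str            # unique per bounded job
    acting_for: str         # user-level identity, for accountability
    scopes: frozenset       # only what this task needs
    expires_at: float       # short-lived by construction

def mint_task_identity(user: str, scopes: set, ttl_s: float = 300) -> TaskIdentity:
    """Create a task-scoped identity that dies on its own after ttl_s."""
    return TaskIdentity(secrets.token_hex(8), user,
                        frozenset(scopes), time.time() + ttl_s)

def allows(ident: TaskIdentity, scope: str) -> bool:
    return time.time() < ident.expires_at and scope in ident.scopes

tid = mint_task_identity("alice", {"crm:read"}, ttl_s=300)
allows(tid, "crm:read")        # True, within scope and lifetime
allows(tid, "billing:update")  # False, never granted for this task
```

Expiry is doing real work here: revocation becomes the default outcome rather than a cleanup chore.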
Perplexity's response to NIST pushes the same theme further: agent systems need at least one deterministic enforcement layer, because model behavior alone cannot reliably protect privilege boundaries [3]. I agree. If your security plan is "the model knows better," you do not have a security plan.
AI agent permissions should be delegated narrowly, evaluated at runtime, and tied to tools and tasks rather than to a permanently trusted model. The model can propose actions, but a policy layer should decide whether those actions are allowed, under which identity, and with what scope [1][2][3].
That sounds obvious. In practice, most teams still do this backwards.
Instead of binding permissions to task-scoped actions, they bind trust to the agent itself. That creates a weird social contract with software: "We believe this agent is generally helpful, so let it do a lot." Security hates "generally."
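The alternative is to bind decisions to a deterministic rule table that the model cannot talk its way around. This is a deliberately tiny sketch (the rule table and verdicts are invented for illustration): the model proposes an action, and a policy function returns allow, escalate, or deny:

```python
# Hypothetical rule table keyed by (tool, action); real systems would
# load this from policy config, not hardcode it.
ALLOW_RULES = {
    ("crm", "read"): "auto",
    ("email", "send"): "auto",
    ("billing", "update"): "needs_approval",
}

def evaluate(proposed: dict) -> str:
    """Deterministic verdict: 'allow', 'escalate', or 'deny'.
    The model proposes; this layer disposes."""
    verdict = ALLOW_RULES.get((proposed["tool"], proposed["action"]))
    if verdict == "auto":
        return "allow"
    if verdict == "needs_approval":
        return "escalate"
    return "deny"   # default-deny for anything unlisted

evaluate({"tool": "crm", "action": "read"})        # "allow"
evaluate({"tool": "billing", "action": "update"})  # "escalate"
evaluate({"tool": "infra", "action": "delete"})    # "deny"
```

Note the default-deny branch: anything the policy has never heard of is refused, which is the opposite of trusting the agent "generally."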
Here's the model I recommend:
| Approach | What it looks like | Main risk | Better alternative |
|---|---|---|---|
| Broad service account | One agent account can read, write, and call tools everywhere | No attribution, huge blast radius | Propagate user identity plus task-scoped capabilities |
| Full user impersonation | Agent inherits everything the user can do | Overreach across tasks and sessions | Add per-task approval and resource scoping |
| Tool-scoped capability | Agent gets only the exact action needed for a bounded task | More policy complexity upfront | Best balance for real systems |
This is also where RBAC alone starts to feel blunt. The survey on agentic AI notes that RBAC, ABAC, and capability-based approaches all matter, but agents need policies that adapt to changing context, tools, and environmental state [2]. I'd phrase it more simply: if the workflow changes on the fly, your permissions model has to keep up.
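Here is what "keeping up" can look like in miniature: an ABAC-style check where the same scope passes or fails depending on runtime context rather than a static role. The scope names and context keys are assumptions for the sketch:

```python
def decide(scope: str, context: dict) -> bool:
    """Context-aware check: the same scope is allowed or refused
    depending on the runtime state of the task, not a fixed role."""
    if scope == "crm:export":
        # Exporting data is only fine inside an approved task window,
        # and only against the production environment.
        return bool(context.get("task_approved")
                    and context.get("env") == "prod")
    return False  # default-deny for unknown scopes

decide("crm:export", {"task_approved": True, "env": "prod"})  # True
decide("crm:export", {"task_approved": True, "env": "dev"})   # False
```

A pure RBAC check would have answered "yes" in both cases, which is exactly the bluntness the survey is pointing at.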
And yes, this is the kind of policy work most builders want to skip because it slows the prototype. I get it. But skipping it is how prototypes become incident reports.
Multi-agent systems make permissions harder because delegation chains blur trust boundaries. One agent can induce another, more privileged agent to act, which creates classic confused-deputy problems and makes responsibility difficult to trace across sessions, tools, and intermediaries [2][3].
This is the part nobody wants to hear: adding more agents does not just add capability. It multiplies ambiguity.
Perplexity's paper is especially good here. It warns that outer agents can manipulate more privileged inner agents, and that enforcing consistent authorization across those boundaries is hard [3]. That is not a theoretical edge case. It is a normal architecture pattern in orchestration-heavy agent systems.
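One classic defense against the confused deputy is attenuation: delegated authority can only shrink, never grow. A minimal sketch, assuming a simple scope-set model of authority:

```python
def delegate(parent_scopes: frozenset, requested: set) -> frozenset:
    """An inner agent receives the intersection of what the chain holds
    and what it asked for, so an outer agent cannot launder extra
    privilege through a more trusted intermediary."""
    return parent_scopes & frozenset(requested)

orchestrator = frozenset({"crm:read", "email:send"})

# A sub-agent asks for more than the delegation chain actually holds:
worker = delegate(orchestrator, {"crm:read", "billing:update"})
# worker == frozenset({"crm:read"}); billing never crosses the boundary
```

The useful property is that it holds at every hop: chain three delegations together and authority is monotonically non-increasing, no matter what any agent in the middle requests.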
The Agents of Chaos red-teaming paper makes the issue painfully concrete. In one case, an attacker spoofed an owner identity across a different Discord channel, and the agent accepted the superficial identity cue, then proceeded toward privileged actions and governance changes [4]. That's not a prompt failure. That's an identity anchor failure.
I found one Reddit discussion useful as a practical signal here, not as evidence on its own. Developers are already asking whether existing IAM tools really cover agent-native identity, capability attestation, and cross-platform trust [5]. That mirrors the research almost exactly, which is usually a sign the problem is real and not just academic.
A better agent control plane starts with explicit identity, task-bounded delegation, deterministic policy checks, and audit logs that reconstruct who approved what and why. The goal is not to make agents harmless; it is to make their authority legible, revocable, and proportionate [1][2][3].
If I were reviewing an agent stack today, I'd want this before rollout: an explicit identity behind every action, task-bounded delegation with expiry, a deterministic policy check in front of each tool call, and an audit trail that can reconstruct who approved what and why.
Here's a simple before-and-after design pattern.
**Before:** Agent receives request -> uses shared service account -> reads CRM, sends email, updates billing, logs "done"

**After:** Agent receives request -> resolves user identity -> requests task-scoped capabilities -> policy engine approves read-only CRM access -> asks for approval before billing change -> executes with audit trail
That difference is the whole game.
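The "after" flow can be sketched end to end. Everything here is illustrative (the step names, the `approve_billing` callback, the log shape are assumptions, not a real framework): the point is that identity resolution, scoped capabilities, human approval, and the audit trail are all explicit steps, not vibes:

```python
import time

audit_log = []  # append-only record of who did and approved what

def record(step: str, actor: str, detail: str) -> None:
    audit_log.append({"ts": time.time(), "step": step,
                      "actor": actor, "detail": detail})

def handle_request(user: str, approve_billing) -> list:
    record("identity_resolved", user, "acting as the real user")
    record("capability_granted", "policy-engine", "crm:read only, short TTL")
    record("tool_call", user, "read CRM under read-only scope")
    if approve_billing(user):        # a human gates the risky step
        record("tool_call", user, "billing change after explicit approval")
    else:
        record("blocked", "policy-engine", "billing change denied")
    return audit_log

log = handle_request("alice", approve_billing=lambda u: False)
# log[-1]["step"] == "blocked": the refusal is itself on the record
```

Notice that the denial is logged too. An audit trail that only records successes cannot reconstruct why something did not happen, and that question comes up in every incident review.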
If you write prompts for agents, this also changes how you should think about prompt engineering. The prompt is no longer the main control surface. It is part of a broader system contract. Good prompts can clarify intent, reduce ambiguity, and structure tool requests, but they cannot replace identity and access design. For more articles on that systems view of prompting, the Rephrase blog is worth browsing.
And if you're constantly rewriting messy instructions for different AI tools, a tool like Rephrase can help clean up the prompt layer quickly. Just don't confuse prompt cleanup with permission control. They solve different problems.
The uncomfortable truth is this: the model is the easy part to talk about. The hard part is governance that survives real workflows. Agents need a real control plane, not vibes, not trust, and definitely not one giant API key.
If you're building agents right now, start with the least glamorous question in the room: what identity does this thing carry when it acts? That answer tells you a lot about how close you are to production.
Documentation & Research

1. Google Secure AI Framework (SAIF) guidance on agentic systems
2. Berkeley-led survey on the security of AI agents
3. Perplexity's response to NIST on AI agent systems
4. Agents of Chaos red-teaming paper

Community Examples

5. Identity and trust infrastructure for autonomous agents - is this a real problem? - r/MachineLearning
AI agent identity is how an agent is authenticated and represented when it acts in a system. In practice, that can mean acting as a user, as a named non-human principal, or as a short-lived task identity.
The control plane is the policy layer that decides who the agent is, what it can access, when it needs approval, and how actions are logged. It matters because prompt-level rules alone cannot enforce those boundaries.