Char’s security model is built on architectural isolation rather than on the hope that the model behaves correctly. The agent sees only what you explicitly provide, and every tool invocation passes through policy enforcement.

The Prompt Injection Problem

Prompt injection is an attack in which untrusted content manipulates an AI agent into taking unintended actions. Adapting Simon Willison’s “Lethal Trifecta”, three conditions make this dangerous:
  1. Untrusted content in context — The agent processes data it didn’t generate
  2. Access to tools — The agent can take actions in the world
  3. Trusted user intent — The system assumes the agent acts on behalf of the user
When all three conditions exist, an attacker can embed malicious instructions in data the agent reads, causing it to execute unauthorized actions.

Char’s Defense Model

Char addresses each condition architecturally:

Explicit Context Boundaries

The agent sees only:
| Source | Content |
| --- | --- |
| Conversation | User messages in the widget |
| Skills | SKILL.md files you’ve registered |
| Tool schemas | Names, descriptions, input schemas |
| Tool outputs | Results of tool invocations |
The agent does not see:
  • Raw page DOM
  • Cookies or localStorage
  • Hidden fields or private state
  • Other users’ conversations
  • Network requests or responses
If you want the agent to access page data, you expose it through a tool. This makes the boundary explicit and auditable.
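The boundary above can be pictured as an allowlist: the context is assembled by copying the four permitted sources and nothing else. The following is a minimal sketch, not Char's implementation; `PageState`, `AgentContext`, and `buildAgentContext` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical shapes for illustration; Char's real types may differ.
interface ToolSchema {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
}

// The four sources the agent is allowed to see.
interface AgentContext {
  conversation: string[]; // user messages typed into the widget
  skills: string[];       // contents of registered SKILL.md files
  toolSchemas: ToolSchema[];
  toolOutputs: string[];  // results of earlier tool invocations
}

// Everything the host page knows, including data the agent must not see.
interface PageState {
  userMessages: string[];
  registeredSkills: string[];
  tools: ToolSchema[];
  toolResults: string[];
  cookies: string;
  domHtml: string;
}

// Copy only the allowed sources. Cookies and DOM are never read here,
// so they cannot leak into the context by accident.
function buildAgentContext(page: PageState): AgentContext {
  return {
    conversation: [...page.userMessages],
    skills: [...page.registeredSkills],
    toolSchemas: page.tools.map((t) => ({ ...t })),
    toolOutputs: [...page.toolResults],
  };
}
```

Because the function builds a new object from named fields instead of filtering a copy of the page state, adding a new field to `PageState` does not expose it to the agent unless someone edits the allowlist.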

Policy-Mediated Tool Access

All tool invocations flow through the Tool Hub, which enforces policy. The agent cannot invoke tools directly: every call is intercepted and evaluated before it runs.
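One way to picture this interception is a hub object that owns the tool handlers and exposes a single `invoke` method. The sketch below assumes a simple allow/deny policy function; `SketchToolHub`, `ToolCall`, and `Policy` are hypothetical names, and the real Tool Hub's interface is not described in this document.

```typescript
type Decision = "allow" | "deny";

interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

type Policy = (call: ToolCall) => Decision;
type ToolHandler = (args: Record<string, unknown>) => unknown;

// Hypothetical hub: handlers are private, so the only way to reach a
// tool is through invoke(), which consults the policy first.
class SketchToolHub {
  private handlers = new Map<string, ToolHandler>();
  private policy: Policy;

  constructor(policy: Policy) {
    this.policy = policy;
  }

  register(name: string, handler: ToolHandler): void {
    this.handlers.set(name, handler);
  }

  invoke(call: ToolCall): unknown {
    const handler = this.handlers.get(call.tool);
    if (!handler) {
      throw new Error(`Unknown tool: ${call.tool}`);
    }
    // Policy is evaluated on every call, before the handler runs.
    if (this.policy(call) !== "allow") {
      throw new Error(`Denied by policy: ${call.tool}`);
    }
    return handler(call.args);
  }
}
```

The design point is that the agent never holds a reference to a handler, only the ability to submit a `ToolCall` for evaluation.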

Tool Classification

Tools are categorized by risk level:
| Classification | Description | Example |
| --- | --- | --- |
| read | Retrieves data without modification | getCustomer() |
| write | Modifies state | updateLead() |
| exfil | Sends data externally | sendEmail() |
Different classifications can have different approval requirements:
  • read — Automatic approval
  • write — Requires user confirmation
  • exfil — Requires explicit approval each time
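The mapping from classification to approval requirement can be expressed as a small lookup table. This is a sketch under assumptions: the `Approval` labels and the `mayProceed` helper are hypothetical, and the document does not specify how Char records or prompts for approval.

```typescript
type Classification = "read" | "write" | "exfil";

// Hypothetical labels for the three approval levels listed above.
type Approval = "automatic" | "user-confirmation" | "explicit-each-time";

const APPROVAL_BY_CLASS: Record<Classification, Approval> = {
  read: "automatic",
  write: "user-confirmation",
  exfil: "explicit-each-time",
};

function requiredApproval(classification: Classification): Approval {
  return APPROVAL_BY_CLASS[classification];
}

// A call proceeds if its class is auto-approved, or if the user has
// approved this specific call. How approval is collected is out of scope.
function mayProceed(classification: Classification, userApprovedThisCall: boolean): boolean {
  return requiredApproval(classification) === "automatic" || userApprovedThisCall;
}
```

Typing the table as `Record<Classification, Approval>` means the compiler rejects a new classification that has no approval level assigned.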

Role-Based Visibility

Tools can be scoped by user role:
- Admin users see: all tools
- Sales users see: CRM tools only
- Read-only users see: read-classified tools only
If a role can’t see a tool, the agent doesn’t know it exists. There’s no attack surface for tools outside the user’s scope.
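Role scoping amounts to filtering the tool list before the agent ever receives it. The sketch below mirrors the three example roles; the `category` field and the `visibleTools` function are hypothetical, introduced here only to make "CRM tools" checkable.

```typescript
type Role = "admin" | "sales" | "read-only";

interface ScopedTool {
  name: string;
  classification: "read" | "write" | "exfil";
  category: string; // hypothetical grouping, e.g. "crm" or "email"
}

// Return only the tools whose schemas should be shown to the agent
// for a user with the given role.
function visibleTools(role: Role, tools: ScopedTool[]): ScopedTool[] {
  switch (role) {
    case "admin":
      return tools;
    case "sales":
      return tools.filter((t) => t.category === "crm");
    case "read-only":
      return tools.filter((t) => t.classification === "read");
  }
}
```

Since filtering happens before schemas reach the agent, an injected instruction that names a hidden tool refers to something the agent has no schema for.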

Why Architectural Defense Matters

Behavioral defenses (“please don’t do bad things”) are unreliable:
| Defense Type | Approach | Limitation |
| --- | --- | --- |
| Prompt engineering | System prompts instructing safe behavior | Can be overridden by injected content |
| Output filtering | Checking agent responses for malicious patterns | Attackers can encode instructions |
| Input sanitization | Filtering dangerous patterns from input | Incomplete coverage |
Architectural defenses are structural:
| Defense Type | Approach | Property |
| --- | --- | --- |
| Context isolation | Agent only sees explicit inputs | No unexpected data exposure |
| Policy enforcement | All tool calls pass through Hub | No direct tool access |
| Classification | Tools categorized by risk | Graduated approval requirements |
| Role scoping | Tools filtered by user role | Reduced attack surface |

The Tool-Mediated Access Pattern

A key design principle: agents access application data only through explicitly registered tools. This inverts the typical pattern where agents are given broad read access and trusted to behave appropriately.

Why this matters for security:
  • Explicit boundaries — You decide exactly what data the agent can access
  • Documented interfaces — Tool schemas serve as contracts, making the attack surface auditable
  • Minimal privilege — Each tool exposes only what’s needed for its function
The alternative—giving agents DOM access or broad API permissions—creates implicit trust that attackers can exploit. If an agent can read anything, injected instructions can direct it to read sensitive data.
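To illustrate minimal privilege, the sketch below shows a read tool that returns a narrow projection of a record instead of the record itself. It is plain TypeScript and does not use the WebMCP registration API; the record fields, the in-memory `customerStore`, and this `getCustomer` body are assumptions made for the example.

```typescript
// Full record as the application stores it (hypothetical fields).
interface CustomerRecord {
  id: string;
  name: string;
  email: string;
  paymentToken: string;
  internalNotes: string;
}

// What the tool is willing to hand to the agent.
interface CustomerSummary {
  id: string;
  name: string;
  email: string;
}

// Stand-in for the application's data layer.
const customerStore = new Map<string, CustomerRecord>([
  ["c1", {
    id: "c1",
    name: "Ada",
    email: "ada@example.com",
    paymentToken: "tok_123",
    internalNotes: "VIP",
  }],
]);

// The tool copies named fields one by one, so sensitive fields never
// appear in the tool output the agent sees.
function getCustomer(input: { id: string }): CustomerSummary | null {
  const record = customerStore.get(input.id);
  if (!record) return null;
  return { id: record.id, name: record.name, email: record.email };
}
```

Even if injected content tells the agent to "return the payment token", the tool output simply does not contain one.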
For implementation details, see the WebMCP Tools guide.

Kill Switch

In Tier 2 deployments, administrators can instantly disable:
  • Individual tools — Disable a specific tool across all users
  • Tool providers — Disable all tools from a domain
  • Agent access — Disable agent functionality entirely
This provides an emergency brake if unexpected behavior is detected.
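The three kill-switch levels can be thought of as checks applied from broadest to narrowest before any tool runs. This is a minimal sketch with hypothetical names (`KillSwitchState`, `ProvidedTool`, `isToolEnabled`); how Tier 2 deployments store and distribute this state is not covered here.

```typescript
interface KillSwitchState {
  agentDisabled: boolean;         // agent functionality off entirely
  disabledProviders: Set<string>; // provider domains switched off
  disabledTools: Set<string>;     // individual tool names switched off
}

interface ProvidedTool {
  name: string;
  providerDomain: string;
}

// Check the broadest switch first, then provider, then individual tool.
function isToolEnabled(state: KillSwitchState, tool: ProvidedTool): boolean {
  if (state.agentDisabled) return false;
  if (state.disabledProviders.has(tool.providerDomain)) return false;
  return !state.disabledTools.has(tool.name);
}
```

A check like this would sit alongside policy evaluation, so that flipping a switch takes effect on the next call without redeploying tools.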
