Security posture, roadmap, and how we think about agents that can take real-world actions.
A New Era in Computing Security
For the past 20 years, security models have been built around locking devices and applications down — setting boundaries between inter-process communications, separating internet from local, sandboxing untrusted code. These principles remain important.
But AI agents represent a fundamental shift.
Unlike traditional software that does exactly what code tells it to do, AI agents interpret natural language and make decisions about actions. They blur the boundary between user intent and machine execution. They can be manipulated through language itself.
We understand that with the great utility of a tool like OpenClaw comes great responsibility. Done wrong, an AI agent is a liability. Done right, we can change personal computing for the better.
This security program exists to get it right.
Context
OpenClaw is an AI agent platform. Unlike chatbots that only generate text, OpenClaw agents can:
- Execute shell commands on the host machine
- Send messages through WhatsApp, Telegram, Discord, Slack, and other channels
- Read and write files in the workspace
- Fetch arbitrary URLs from the internet
- Schedule automated tasks
- Access connected services and APIs
This capability is what makes OpenClaw useful. It's also what makes security critical.
AI agents that can take real-world actions introduce risks that traditional software doesn't have:
- Prompt injection — Malicious users can craft messages that manipulate the AI into performing unintended actions
- Indirect injection — Malicious content in fetched URLs, emails, or documents can hijack agent behavior
- Tool abuse — Even without injection, misconfigured agents can cause damage through overly permissive settings
- Identity risks — Agents can send messages as you, potentially damaging relationships or reputation
These aren't theoretical. They're documented attack patterns that affect all AI agent systems.
Scope
This security program covers the entire OpenClaw ecosystem. Nothing is out of scope.
Core Platform
- OpenClaw CLI and Gateway (`openclaw`)
- Agent execution engine
- Tool implementations
- Channel integrations (WhatsApp, Telegram, Discord, Slack, Signal, etc.)
Applications
- macOS desktop application
- iOS mobile application
- Android mobile application
- Web interface
Services
- ClawHub (clawhub.ai) — Skills marketplace and registry
- Documentation (docs.openclaw.ai)
- Any hosted infrastructure
Extensions
- Official extensions (`extensions/`)
- Plugin SDK and third-party plugins
- Skills distributed through ClawHub
People
- Core maintainers and contributors
- Security processes and response procedures
- Supply chain and dependency management
Program Overview
We're establishing a formal security function with four phases:
- Transparency: Develop the threat model openly with community contribution
- Product Security Roadmap: Define defensive engineering goals and track them publicly
- Code Review: Manual security review of the entire codebase
- Security Triage: Formal process for handling vulnerability reports
Phase 1: Transparency
Goal
Develop and publish our threat model openly, inviting community contribution, so users understand the risks and can make informed decisions about their deployments.
Why
Security through obscurity doesn't work. Attackers already know these techniques — they're documented in academic papers, security blogs, and conference talks. What's missing is clear communication to users about:
- What risks exist
- What we're doing about them
- What users should do to protect themselves
By developing the threat model openly, we benefit from collective expertise and build trust through transparency.
Threat Model Coverage
| Category | Risks Covered |
|---|---|
| A. Input Manipulation | Direct prompt injection, indirect injection, tool argument injection, context manipulation |
| B. Auth & Access | AllowFrom bypass, privilege escalation, cross-session access, API key exposure |
| C. Data Security | System prompt disclosure, workspace exposure, memory leakage, data exfiltration |
| D. Infrastructure | SSRF, gateway exposure, dependency vulnerabilities, file permissions |
| E. Operations | Logging sensitive data, insufficient monitoring, resource exhaustion, misconfiguration |
| F. Supply Chain | ClawHub skills integrity, extension security, dependency vulnerabilities |
Threat Model Scope
| Component | Why It's Included |
|---|---|
| Core platform (CLI, Gateway, agents, tools) | Primary attack surface |
| ClawHub (clawhub.ai) | Skills marketplace — supply chain risk |
| Mobile apps (iOS, Android) | Agent control interface, credential storage |
| Desktop app (macOS) | Gateway host, system integration |
| Extensions and plugins | Third-party code execution |
| Build and release pipeline | Distribution integrity |
Each risk in the threat model will include a description and severity rating, attack examples, current mitigations, known gaps, and user recommendations.
The threat model will be open for community contribution via pull requests.
Phase 2: Product Security Roadmap
Goal
Create a public product security roadmap defining defensive engineering goals, tracked as GitHub issues so the community can follow progress, provide input, and contribute.
Defensive Engineering Goals
| Category | Goal | Description |
|---|---|---|
| Prompt Injection Protection | Input validation | Pattern detection and alerting for injection attempts |
| Prompt Injection Protection | Tool confirmation | Require explicit approval for sensitive actions |
| Prompt Injection Protection | Context isolation | Prevent cross-session contamination |
| Privacy Enhancements | System prompt protection | Prevent disclosure of system prompts |
| Privacy Enhancements | Data minimization | Reduce unnecessary data retention |
| Privacy Enhancements | Audit logging | Clear visibility into agent actions |
| Access Control | Fine-grained permissions | Per-tool, per-session access controls |
| Access Control | Rate limiting | Prevent resource exhaustion |
| Access Control | Spending controls | Hard limits on API costs |
| Supply Chain | Skills verification | Integrity checks for ClawHub skills |
| Supply Chain | Dependency auditing | Automated vulnerability scanning |
| Supply Chain | Signed releases | Cryptographic verification of updates |
Specific priority issues will be identified through the Phase 3 code review and added to the public roadmap as they are discovered and triaged.
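To make the "Skills verification" goal concrete, here is a minimal sketch of an integrity check that compares a downloaded archive against a published SHA-256 digest. The helper names (`sha256_hex`, `verify_skill_archive`) and the idea that a digest is published alongside each skill are assumptions for illustration only; this does not describe how ClawHub actually works.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_skill_archive(archive: bytes, expected_sha256: str) -> bool:
    """Check an archive against a published digest (hypothetical helper).

    The expected digest is normalized (whitespace, case) before comparing.
    hmac.compare_digest performs a constant-time comparison.
    """
    actual = sha256_hex(archive)
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A digest check like this only detects tampering if the expected digest arrives over a channel the attacker cannot also modify, which is where the "Signed releases" goal comes in.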
Phase 3: Code Review
Goal
Conduct a comprehensive manual security review of the entire codebase, supplemented by automated tooling where appropriate, to identify vulnerabilities we've missed and validate our threat model.
Scope
The code review covers the entire OpenClaw codebase and ecosystem:
| Area | Path | Why |
|---|---|---|
| Agent execution | src/agents/ | Core attack surface — how agents run |
| Tool implementations | src/agents/tools/ | What agents can do — exec, messaging, web |
| Message processing | src/auto-reply/ | Entry point for all user input |
| Security utilities | src/security/ | Existing security controls |
| Gateway server | src/gateway/ | Network-exposed component |
| Authentication | src/*/auth* | Credential handling, API keys |
| Session management | src/config/sessions.ts | Cross-session isolation |
| Pairing and access control | src/pairing/, src/*/access-control* | DM and group gating |
| External content handling | src/security/external-content.ts | Injection defenses |
| macOS desktop app | apps/macos/ | Gateway host, system integration |
| iOS mobile app | apps/ios/ | Agent control, credential storage |
| Android mobile app | apps/android/ | Agent control, credential storage |
| ClawHub | clawhub.ai | Skills registry — supply chain risk |
| Official extensions | extensions/ | First-party plugins |
| Build and release pipeline | CI/CD, scripts | Distribution integrity, signing |
Approach
- Manual code review — Line-by-line analysis of security-critical paths
- Automated scanning — Static analysis, dependency auditing, secret detection
- Dynamic testing — Attempting documented attack patterns against a running system
- Architecture review — Evaluating trust boundaries and data flows
Disclosure
- All critical and high findings fixed before public disclosure
- Findings summary published after remediation
- Full report available on request
- CVEs assigned where applicable
Phase 4: Security Triage Function
Goal
Establish a formal process for receiving, triaging, and responding to security vulnerability reports.
Report a Vulnerability
We take security reports seriously. Complete reports receive a response within 48 hours.
Required in Reports
Reports must include reproduction steps, demonstrated impact, and remediation advice; reports missing any of these will be deprioritized. Given the volume of AI-generated scanner findings, we must ensure we're receiving vetted reports from researchers who understand the issues.
Response SLAs
| Severity | Definition | First Response | Triage | Fix Target |
|---|---|---|---|---|
| Critical | RCE, auth bypass, mass data exposure | 24 hours | 48 hours | 7 days |
| High | Significant impact, single-user scope | 48 hours | 5 days | 30 days |
| Medium | Limited impact, requires specific conditions | 5 days | 14 days | 90 days |
| Low | Minor issues, defense in depth | 14 days | 30 days | Best effort |
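The table above can be read as a lookup from severity to deadlines. The sketch below encodes it that way, assuming the clocks start when a complete report is received (the table does not state the starting point) and using the hypothetical names `SLA` and `sla_deadlines`.

```python
from datetime import datetime, timedelta, timezone

# Targets copied from the Response SLAs table; None stands for "Best effort".
SLA = {
    "critical": {"first_response": timedelta(hours=24),
                 "triage": timedelta(hours=48),
                 "fix": timedelta(days=7)},
    "high": {"first_response": timedelta(hours=48),
             "triage": timedelta(days=5),
             "fix": timedelta(days=30)},
    "medium": {"first_response": timedelta(days=5),
               "triage": timedelta(days=14),
               "fix": timedelta(days=90)},
    "low": {"first_response": timedelta(days=14),
            "triage": timedelta(days=30),
            "fix": None},
}


def sla_deadlines(severity: str, received_at: datetime) -> dict:
    """Map each stage to its deadline, or None where the target is best effort."""
    targets = SLA[severity.lower()]
    return {
        stage: (received_at + delta if delta is not None else None)
        for stage, delta in targets.items()
    }


if __name__ == "__main__":
    received = datetime(2025, 1, 1, tzinfo=timezone.utc)
    for stage, due in sla_deadlines("High", received).items():
        print(stage, due)
```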
Our Commitments
- Acknowledge all complete reports within 48 hours
- Provide status updates at least every 14 days
- Credit researchers in advisories (unless anonymity requested)
- Not pursue legal action against good-faith security research
- Consider bounties for qualifying critical/high findings (case-by-case)
Security Advisor
OpenClaw is bringing on Jamieson O'Reilly (@theonejvo) as lead security advisor to guide this program.
Jamieson is the founder of Dvuln, co-founder of Aether AI (the world's most dangerous AI, in your corner), a member of the CREST Advisory Council, and brings extensive experience in offensive security, penetration testing, and security program development.
Responsibilities
- Lead threat modeling and risk assessment
- Scope and oversee code review
- Establish triage process and response procedures
- Review security-critical code changes
- Provide guidance on security architecture decisions
Current Security Posture
OpenClaw already has security controls in place. Understanding what exists helps users configure their deployments appropriately.
Secure by Default
- Unknown senders must complete a pairing flow with an expiring code
- Commands not on the allowlist are denied by default, and the user is prompted for approval
- If `allowFrom` is not configured, only your own number can DM the agent
- Conversations are isolated per session key
- Internal IPs and localhost are blocked in `web_fetch`
- WebSocket connections must authenticate
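As an illustration of what blocking internal IPs and localhost in a fetch tool can involve, the sketch below resolves a URL's hostname and refuses it if any resolved address is non-public. It uses only the Python standard library; the function names are hypothetical and this is not OpenClaw's `web_fetch` implementation.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_blocked_address(ip_text: str) -> bool:
    """True for loopback, private, link-local, and other non-public addresses."""
    # Drop an IPv6 zone suffix such as "%eth0" before parsing.
    ip = ipaddress.ip_address(ip_text.split("%", 1)[0])
    # Treat IPv4-mapped IPv6 (::ffff:a.b.c.d) as the IPv4 address it wraps.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return (
        ip.is_loopback or ip.is_private or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
    )


def url_is_allowed(url: str, resolve=socket.getaddrinfo) -> bool:
    """Allow only http(s) URLs whose hostname resolves solely to public IPs."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = resolve(parts.hostname, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    # One internal record is enough to refuse the whole URL.
    return bool(infos) and all(not is_blocked_address(i[4][0]) for i in infos)
```

A check of this kind is only part of an SSRF defense. The address can change between the check and the actual request (DNS rebinding), and redirects need the same check on every hop, so a real fetcher would typically connect to the exact address it validated.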
Verify Your Setup
openclaw security audit --deep
Key items to verify:
- DM policy is `pairing` or `allowlist` (not `open`)
- `allowFrom` is configured for your channels
- Exec security is not set to `full` unless intended
- Gateway is bound to loopback or behind authentication
- Workspace doesn't contain secrets
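The first four checklist items can be expressed as simple checks over a configuration. The sketch below is purely illustrative: the dictionary keys (`dm_policy`, `allow_from`, `exec_security`, `gateway_bind`, `gateway_auth`) are hypothetical stand-ins and not OpenClaw's real configuration schema. The values `pairing`, `allowlist`, `open`, and `full` are the ones named in the checklist. The audit command above remains the supported way to verify a deployment.

```python
LOOPBACK_HOSTS = {"127.0.0.1", "::1", "localhost"}


def audit_config(cfg: dict) -> list:
    """Return a list of findings for a hypothetical config dictionary."""
    findings = []
    # DM policy should be pairing or allowlist, never open.
    if cfg.get("dm_policy") not in ("pairing", "allowlist"):
        findings.append("DM policy should be 'pairing' or 'allowlist'")
    # An empty or missing allowFrom list counts as not configured.
    if not cfg.get("allow_from"):
        findings.append("allowFrom is not configured")
    # Full exec access should be a deliberate choice.
    if cfg.get("exec_security") == "full":
        findings.append("Exec security is 'full'; confirm this is intended")
    # The gateway must be loopback-bound or require authentication.
    bound_to_loopback = cfg.get("gateway_bind") in LOOPBACK_HOSTS
    if not bound_to_loopback and not cfg.get("gateway_auth"):
        findings.append("Gateway is exposed without authentication")
    return findings
```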
Timeline
WEEK 1-2: Phase 1 (Transparency)
├── Threat model development begins (open for contribution)
├── Security configuration guide drafted
├── Visual overview created
└── Announcement posted

WEEK 3-4: Phase 2 (Product Security Roadmap)
├── GitHub issues created for defensive engineering goals
├── Security label and milestone set up
├── Community input period opens
└── First security work begins

WEEK 5-8: Phase 3 (Code Review Preparation)
├── Scope finalized (entire codebase)
├── Review begins
└── Initial findings

WEEK 8-12: Phase 3 (Code Review Execution)
├── Manual review completed
├── Findings documented
├── Remediation for critical/high
└── Verification completed

WEEK 8+: Phase 4 (Triage Function)
├── security@openclaw.ai live
├── PGP key published
├── Disclosure policy published
└── First advisories (if needed)

ONGOING:
├── Monthly security updates
├── Continuous threat model refinement
├── Regular dependency auditing
└── Community engagement
FAQ
Is OpenClaw safe to use?
Yes, with proper configuration. OpenClaw has security controls enabled by default:
- DM Policy: Defaults to `pairing`, so unknown senders must complete a pairing flow with an expiring code
- Exec Security: Defaults to `deny` with `ask: on-miss`, so dangerous commands require approval
- AllowFrom: If not configured, defaults to self-only
- Gateway Auth: Required by default
Run `openclaw security audit --deep` to verify your setup. See docs.openclaw.ai/gateway/security.
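The deny-by-default, ask-on-miss behavior for exec can be sketched as a small decision function: allowlisted commands run, and anything else is refused unless an approval callback says yes. This is a simplified illustration with hypothetical names (`decide_exec`, `SHELL_META`), not OpenClaw's actual exec policy code.

```python
import shlex
from typing import Callable, Set

# Characters that can chain or redirect commands; their presence means the
# first word alone does not describe what would run.
SHELL_META = set(";&|`$<>()\n")


def decide_exec(command: str, allowlist: Set[str],
                ask: Callable[[str], bool]) -> bool:
    """Return True if the command may run under a deny-by-default policy."""
    try:
        argv = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: refuse rather than guess
    if not argv:
        return False
    is_simple = not (SHELL_META & set(command))
    if is_simple and argv[0] in allowlist:
        return True
    # Allowlist miss: denied unless the user explicitly approves.
    return bool(ask(command))


if __name__ == "__main__":
    allow = {"ls", "git"}
    print(decide_exec("git status", allow, ask=lambda cmd: False))
    print(decide_exec("rm -rf build", allow, ask=lambda cmd: False))
```

Matching only the program name is deliberately coarse. A stricter policy might also constrain arguments, since an allowlisted program can still be invoked in harmful ways.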
Why develop the threat model in the open?
These attack techniques are already public knowledge, documented in papers, blogs, and talks. Developing openly benefits from collective expertise, builds trust through transparency, and holds us accountable.
Why must reports include remediation advice?
We receive reports from automated scanners and AI tools that flag theoretical issues without understanding them. By requiring reporters to propose a fix, we filter out scanner noise, get actionable reports from researchers who understand the issues, and speed up remediation with expert input.
Is ClawHub in scope?
ClawHub (clawhub.ai) is in scope for the entire security program: threat model, code review, and ongoing monitoring. Skills are code that runs in your agent's context, so supply chain security is critical.
Are the mobile and desktop apps in scope?
All applications are in scope. The iOS app, Android app, and macOS desktop app will all be covered by the code review and included in the threat model. Nothing is out of scope.
How can I help?
- Contribute to the threat model via pull request
- Review and comment on security-labeled issues
- Report vulnerabilities through security@openclaw.ai
- Help improve security documentation