
Rare Agent Work · Security Engineering Edition
Rev 1.0 · Updated March 18, 2026
Six threat surfaces, twelve controls, and the NemoClaw architecture that addresses all of them
Harden your OpenClaw deployment against the six documented threat surfaces — from plaintext API key exposure to indirect prompt injection — with controls mapped to specific incident classes.
What this report gives you
The finding that changes your next decision
“The most common OpenClaw security failure is not a sophisticated attack — it is an indirect prompt injection via a retrieved webpage combined with plaintext API keys in environment variables. The attacker does not need to access your infrastructure: they need to control one document your agent reads. CVE-2026-25253 is reproducible with techniques that require no special expertise, and most production OpenClaw deployments are fully exposed to it right now.”
This report is right for you if any of these are true
Why this report exists
OpenClaw's design philosophy prioritizes developer velocity over operational security — the right tradeoff for prototyping and the wrong default for production. The six threat surfaces documented in this report are not theoretical risks: they are the incident classes that have affected OpenClaw deployments in the 90 days since the platform reached production-grade stability. Plaintext API keys accessible to the agent runtime. Unrestricted tool execution with no per-role scoping. Indirect prompt injection via retrieved content — the highest-severity attack surface that most teams leave entirely undefended. Missing audit trails that prevent incident reconstruction. No infrastructure-level cost enforcement. Shared compute blast radius that amplifies single-agent compromises. This report maps each threat surface to the specific control that addresses it, provides the 12-item go/no-go checklist for production deployment, and explains where NemoClaw fits as the control plane that addresses all six surfaces simultaneously.
Honest disqualification. If none of the above matches you, this report was written for you.
Six documented attack surfaces with specific incident classes — not a generic AI risk list. Each surface maps to a control and a test.
The CVE-2026-25253 exploitation pattern, why environment variables are insufficient, and the three-property secrets architecture that blocks it.
Input sanitization, privilege separation, output validation, and anomaly detection — why any single layer is insufficient and how to implement all four.
Go/no-go criteria with test evidence requirements for each control. If you cannot produce test evidence, the item is not complete.
How NemoClaw addresses all six threat surfaces as a control plane — with the component-to-threat-surface mapping and full deployment architecture.
SOC 2, HIPAA, and FINRA control mapping for NemoClaw deployments — with explicit distinctions between documented architecture and certification.
All 6 sections — scroll down to read.
Six documented threat surfaces with specific incident classes — including the CVE-2026-25253 exploitation pattern that most teams leave entirely undefended.
The security threat model for OpenClaw differs fundamentally from the threat model for a static API call to a language model. A static model call receives input, produces output, and terminates. OpenClaw agents persist across sessions, execute tools with real-world consequences, retrieve content from external sources, and operate with delegated authority to act on behalf of users. Each of these properties creates an attack surface that does not exist in static model deployments.
Threat Surface 1: Secrets accessible to the agent runtime
The default OpenClaw deployment pattern stores API keys as environment variables accessible to the Python or Node process running the agent. CVE-2026-25253 documents the exploitation pattern: an indirect prompt injection via a retrieved document instructs the agent to include environment variable values in its response. The agent complies. No exception is thrown. No alert fires.
Threat Surface 2: Unrestricted tool execution
OpenClaw agents can be granted access to tools — web search, code execution, file system access, database queries, email, API calls — and by default, there is no per-tool permission enforcement at the runtime layer. A user with access to an agent inherits all tools that agent has been given. Documented incident: an agent given file system read access for document indexing was prompted to read files in adjacent directories outside its intended scope. No permission boundary prevented this.
Threat Surface 3: Indirect prompt injection via retrieved content
Direct prompt injection via user input is relatively easy to defend against. Indirect injection — where adversarial instructions are embedded in content the agent retrieves from external sources — is the higher-severity attack surface and the one most teams leave entirely undefended. Documented incident: an agent summarizing competitor web pages encountered a page with hidden text instructions to output the agent's system prompt. The agent complied, exposing deployment configuration and internal tooling details.
Threat Surfaces 4–6 — missing audit trails, no infrastructure-level cost enforcement, and shared compute blast radius — are documented in the full report with the same specificity: exact failure mode, documented incident class, and the specific control that addresses each.
The three-property secrets architecture that blocks indirect prompt injection key exfiltration, with implementation details for each secrets manager option.
CVE-2026-25253 requires three conditions to exploit: the agent retrieves content from external sources, the attacker can inject content into those sources, and API keys are accessible as environment variables within the agent's process. When all three conditions are met, the attacker embeds instructions in retrieved content directing the agent to include environment variable values in its response. The raw key is exfiltrated. No infrastructure alert fires. No exception is thrown.
The three-property secrets architecture that blocks it:
Property 1: Runtime inaccessibility. The agent runtime should not have access to the raw value of any secret. Instead of OPENAI_API_KEY=sk-... in the environment, the agent calls a secrets management endpoint that returns a short-lived capability token. The raw key never appears in the agent's environment — there is nothing to exfiltrate.
Property 2: Automatic rotation. Secrets rotate automatically without manual intervention. Model API keys: 30 days. Integration tokens: 90 days. Database credentials: 7 days. Rotation should not cause service interruption and should require no manual action.
Property 3: Access logging. Every secret access — which secret, which service, which timestamp — is logged and monitored for anomalies. A spike in secret access requests is an early indicator of a compromise attempt or a runaway session.
The full report includes implementation details for AWS Secrets Manager, HashiCorp Vault, GCP Secret Manager, Azure Key Vault, and NemoClaw built-in — with honest tradeoffs for each option.
4 more sections in this report
What unlocks with purchase:
The Four-Layer Prompt Injection Defense Stack
Why single-layer prompt injection defense always has a bypass — and how the four-layer stack prevents the classes that each individual layer misses.
External Audit Logging: Why Agent Memory Cannot Be the Source of Truth
External audit logging: the write-once storage policy, the 60-second delivery requirement, and why agent memory cannot be the incident investigation source of truth.
The 12-Item Pre-Production Hardening Checklist
12 go/no-go items with test evidence requirements — what enterprise procurement asks for and that most teams cannot produce on first request.
NemoClaw as the Control Plane: Architecture and Compliance Posture
NemoClaw architecture: the component-to-threat-surface mapping and the SOC 2, HIPAA, and FINRA compliance posture reference.
One-time purchase · Instant access · No subscription
Why single-layer prompt injection defense always has a bypass — and how the four-layer stack prevents the classes that each individual layer misses.
Every documented prompt injection defense has a bypass. Input validation fails against sufficiently obfuscated injections. Output validation misses attacks that use the agent's capabilities in technically correct but unintended ways. System prompt hardening degrades over long context windows. No single layer provides reliable defense. The correct approach is defense-in-depth.
Layer 1: Input sanitization — Validate and sanitize all text that enters the agent's context from external sources. Strip HTML from retrieved content. Detect and flag content containing patterns common in injection attacks. Rate-limit the volume of external content that can enter a single context window.
Layer 2: Privilege separation — The agent that retrieves external content should not have the same tool permissions as the agent that takes action. A retrieval agent has read-only access. An action agent receives a sanitized summary — not raw retrieved content. A successful injection in retrieved content only compromises the retrieval agent's limited capabilities.
Layer 3: Output validation — Before any agent output is returned to the user or passed to another system, validate that it does not contain environment variable names, API key formats, system prompt fragments, or instructions addressed to external systems. Flag for human review rather than silently dropping — silent dropping masks ongoing attacks.
Layer 4: Audit and anomaly detection — A monitoring system that baselines normal agent behavior and alerts on deviations is the last line of defense and the one most likely to catch attacks that bypass all other layers. Even a successful injection leaves traces: unusual tool call sequences, unexpected external requests, atypically high token consumption before a consequential action.
External audit logging: the write-once storage policy, the 60-second delivery requirement, and why agent memory cannot be the incident investigation source of truth.
OpenClaw's built-in logging captures conversation history, tool call inputs and outputs, and session metadata — stored in agent memory. This creates three investigation-critical gaps.
Gap 1: Mutability. The agent can modify its own memory. Logs stored in agent memory are unreliable for incident investigation — they can be altered by the same attack that caused the incident.
Gap 2: Inaccessibility. Logs stored in agent memory are not accessible to a SIEM, log aggregation system, or compliance auditor without specific export configuration that most teams never implement.
Gap 3: Format mismatch. The default log format optimizes for agent context, not forensic analysis. It lacks the structured fields, consistent schema, and timestamp precision that incident investigation and compliance audit require.
The required audit architecture: Every tool call, external request, memory operation, model invocation, and authentication event logged with structured JSON, consistent schema, and millisecond-precision timestamps. Logs shipped to an external destination immediately on creation — not buffered in agent memory first. Write-once or append-only storage with hash verification. Minimum 90-day retention; 1 year for regulated industries.
Documented incident: a team investigating a suspected data exfiltration found that the relevant session logs had been overwritten during a memory compaction routine. The incident could not be fully reconstructed. The audit failure was as damaging as the incident itself in the enterprise procurement review that followed.
12 go/no-go items with test evidence requirements — what enterprise procurement asks for and that most teams cannot produce on first request.
These 12 items are the go/no-go criteria before any OpenClaw deployment goes to production with real user data or real-world consequences. Every item has a test evidence requirement — not an assertion, not a vendor claim, not a documentation reference.
8–12 cover rollback procedures, compliance posture documentation, adversarial prompt test sets (minimum 20 inputs across 4 attack categories), cost monitoring alerts, and incident response runbook review — each with the same test evidence standard.
The procurement reality: Enterprise buyers now ask for this checklist completed with test evidence on first submission. Teams that can produce it move 3x faster through procurement than teams that provide assertions and documentation links.
NemoClaw architecture: the component-to-threat-surface mapping and the SOC 2, HIPAA, and FINRA compliance posture reference.
NemoClaw is the control plane that makes OpenClaw production-deployable at enterprise scale. Each NemoClaw component addresses one or more of the six threat surfaces.
The component-to-threat-surface mapping: - Isolated compute namespace with NetworkPolicy → shared compute blast radius - Vault-integrated secrets management → CVE-2026-25253 secrets exposure - RBAC + SSO integration → unrestricted tool execution - Prompt sanitization pipeline → indirect prompt injection - Tamper-evident external audit log → missing audit trail - Gateway-level token budget enforcement → cost explosion from runaway sessions
Compliance posture for regulated industries:
SOC 2 Type II: Access controls (CC6.1), logical and physical access restrictions (CC6.2, CC6.3), change management (CC8.1), risk assessment (CC3.1, CC3.2), and monitoring (CC7.1, CC7.2). NemoClaw provides the control architecture; SOC 2 certification requires an independent audit.
HIPAA: Private inference routing ensures that PHI processed by the agent runtime does not transit a shared public API surface. The external audit logging supports controls under 45 CFR § 164.312(b). A Business Associate Agreement with your cloud provider is still required separately.
FINRA: The tamper-evident external audit log supports record-keeping requirements under FINRA Rule 4511. The audit architecture is designed to be defensible under regulatory examination — not just internally documented.
Every claim in this report traces to a verifiable source.
Last reviewed March 18, 2026
Who wrote this, what evidence shaped it, and how the recommendations are framed.
Author: Rare Agent Work Team · Written and maintained by the Rare Agent Work Team.
Proof 1
Six threat surfaces are documented with specific incident classes, not theoretical risk — including CVE-2026-25253 and its exact exploitation pattern.
Proof 2
12-item pre-production checklist includes test evidence requirements for each control — not assertions, not documentation references.
Proof 3
NemoClaw architecture section maps each NemoClaw component to the threat surfaces it addresses, with a complete deployment architecture diagram.
Powered by Claude — trained on this report's content. Your first question is free.
Ask anything about implementation, setup, or how to apply the concepts in this report. Your first question is free — then we'll ask you to sign in.
Powered by Claude · First question free
When the report isn't enough
Architecture review, implementation rescue, and strategy calls for teams with real blockers. Every intake is read by a human before any next step.