Ship Agents: Orchestration, Attacks, Models, and Dev Hygiene

Monday, June 29, 2026 · 00:01Z

Prompt injection is exploiting enterprise AI’s biggest design flaws by targeting agents, RAG pipelines and model routers. The piece documents how these attacks subvert enterprise workflows. Outcome engineers must treat agent orchestration as a widened attack surface: harden retrieval, validate outputs, and build an immune system for prompts and connectors (Principle 14).
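As a rough illustration of "harden retrieval, validate outputs", the sketch below screens retrieved chunks for instruction-like text and checks a proposed tool call against an allowlist. The patterns, the tool names, and the call shape (a dict with "tool" and "args") are all hypothetical assumptions, not taken from the article, and keyword heuristics like these are easy to evade; treat it as one layer, not a defense.

```python
import re

# Hypothetical heuristics for instruction-like text in retrieved content.
# A real deployment would need broader, tested detection.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (the |your )?system prompt", re.IGNORECASE),
]

# Hypothetical allowlist of tools this agent is permitted to call.
ALLOWED_TOOLS = {"search_docs", "summarize"}


def screen_chunks(chunks):
    """Split retrieved chunks into (clean, quarantined) lists."""
    clean, quarantined = [], []
    for chunk in chunks:
        if any(p.search(chunk) for p in SUSPICIOUS_PATTERNS):
            quarantined.append(chunk)
        else:
            clean.append(chunk)
    return clean, quarantined


def validate_tool_call(call):
    """Accept a proposed call only if it names an allowlisted tool and has dict args."""
    return (
        isinstance(call, dict)
        and call.get("tool") in ALLOWED_TOOLS
        and isinstance(call.get("args"), dict)
    )
```

Quarantining rather than silently dropping suspicious chunks keeps them available for review, which fits the "immune system" framing: the pipeline records what it rejected.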

Agentic-AI tool aims to give US commanders new target options ‘within seconds’. The Pentagon is fielding an Agent Network that continuously scans intelligence to surface targeting options while keeping commanders in the loop. That real-world orchestration shows how agent networks demand auditable decision trails, delegation controls, and organizational changes to responsibility and coordination (Principle 09, Principle 16).
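One generic way to make an agent's decision trail auditable is a hash-chained, append-only log, where each entry commits to the one before it. The sketch below is an assumption-laden illustration only: the field names (actor, action, approved_by) are hypothetical and nothing here describes how the system in the article works.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry
FIELDS = ("actor", "action", "approved_by", "prev")


def _digest(body):
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log, actor, action, approved_by):
    """Append an entry that records who proposed what and which human approved it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"actor": actor, "action": action, "approved_by": approved_by, "prev": prev_hash}
    entry = dict(body, hash=_digest(body))
    log.append(entry)
    return entry


def verify(log):
    """Return True only if every entry's hash and back-link are intact."""
    prev = GENESIS
    for entry in log:
        body = {k: entry[k] for k in FIELDS}
        if entry["prev"] != prev or entry["hash"] != _digest(body):
            return False
        prev = entry["hash"]
    return True
```

Recording the approver alongside the agent's proposal is what ties the log to delegation: a reviewer can see which human signed off on each step, and any later edit to an entry breaks verification.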

OpenAI Codex lead on the new shape of product work — Andrew Ambrosino. The Codex desktop interview argues product roles and workflows collapse around AI-first interfaces and agent desktops. Outcome engineers should rework team boundaries, CI, and artifact-first delivery so agents are collaborators in predictable, reviewable pipelines (Principle 03, Principle 08).

A way to exclude sensitive files (issue #2847). A contributor requests a shareable .codexignore to prevent agents from reading or sending sensitive repository files to Codex. This is a concrete hygiene pattern: implement repo-level ignores, vet the context sent to models, and gate sensitive artifacts before any agent can access them (Principle 15, Principle 10).
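The gating step can be sketched independently of any particular tool. The example below assumes a hypothetical ignore file with one glob pattern per line and "#" comments; the issue only requests such a file, so this format is an assumption, and Python's fnmatch globbing is a simplification rather than full gitignore semantics.

```python
from fnmatch import fnmatchcase


def parse_ignore(text):
    """Read one glob pattern per line, skipping blanks and '#' comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(path, patterns):
    """Match a pattern against either the full path or the file name alone."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatchcase(path, p) or fnmatchcase(name, p) for p in patterns)


def filter_context(paths, patterns):
    """Return only the paths that are safe to include in model context."""
    return [p for p in paths if not is_ignored(p, patterns)]
```

Filtering before the context is assembled, rather than redacting afterwards, means a sensitive file is never read into the agent's prompt in the first place.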

GLM 5.2 beats Claude in our benchmarks. Semgrep shows open-weight GLM-5.2 outperforming Claude on IDOR detection and highlights how harnesses shape vulnerability-finding performance. Outcome engineers must evaluate the model together with its harness (tooling, prompts, retrieval, pipelines), not just base-model claims, and audit those stacks continuously (Principle 06, Principle 16).
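Evaluating "model plus harness" amounts to scoring every pairing on the same labeled cases instead of ranking models alone. The sketch below is a hypothetical setup, not Semgrep's benchmark: a model is assumed to be any callable from a code snippet to a boolean, and a harness is any callable that takes a model and a snippet and returns the final verdict.

```python
from itertools import product


def score(predict, cases):
    """Compute precision and recall for a predictor over (snippet, is_vulnerable) cases."""
    tp = fp = fn = 0
    for snippet, is_vulnerable in cases:
        flagged = predict(snippet)
        if flagged and is_vulnerable:
            tp += 1
        elif flagged and not is_vulnerable:
            fp += 1
        elif not flagged and is_vulnerable:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall}


def evaluate_grid(models, harnesses, cases):
    """Score every (model, harness) pairing on the same cases."""
    results = {}
    for (m_name, model), (h_name, harness) in product(models.items(), harnesses.items()):
        results[(m_name, h_name)] = score(lambda s: harness(model, s), cases)
    return results
```

Keeping the results keyed by the pair makes it visible when a harness change, rather than a model change, accounts for a difference in scores.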