Agent Ops Brief: orchestration, safety, review, and cost

How Sakana trained a 7B model to orchestrate GPT-5, Claude Sonnet 4 and Gemini 2.5 Pro: Sakana trained a compact 7B “conductor” model that dynamically coordinates top LLMs and powers a cheaper multi‑model orchestrator. Outcome engineers can adopt the conductor pattern to reduce inference cost while retaining best-of-breed model behavior, which makes it a practical approach to agentic orchestration (Principle 09, Graph 11).
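
The routing idea behind the conductor pattern can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Sakana's method: a hand-written rule stands in for the trained 7B conductor, and the worker names, costs, skill tags, and the `conduct` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Worker:
    name: str                   # hypothetical label, not a real model id
    cost_per_call: float        # illustrative relative cost, not real pricing
    skills: FrozenSet[str]      # task kinds this worker is assumed to handle
    run: Callable[[str], str]   # stand-in for a real model API call


def conduct(task_kind: str, prompt: str, workers: List[Worker]) -> Tuple[str, str]:
    """Route a task to the cheapest capable worker and return (name, output)."""
    capable = [w for w in workers if task_kind in w.skills]
    if capable:
        chosen = min(capable, key=lambda w: w.cost_per_call)
    else:
        # Assumption: the most expensive worker is the most general fallback.
        chosen = max(workers, key=lambda w: w.cost_per_call)
    return chosen.name, chosen.run(prompt)


# Stub workers so the sketch runs without any network access.
WORKERS = [
    Worker("small", 1.0, frozenset({"summarize"}), lambda p: "small:" + p),
    Worker("large", 10.0, frozenset({"summarize", "prove"}), lambda p: "large:" + p),
]
```

Calling `conduct("summarize", "x", WORKERS)` picks the cheaper stub, while a task kind that only the larger stub covers is routed there. A learned conductor would replace the skill lookup with its own policy; the cost saving comes from sending the expensive model only the tasks that need it.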

Anthropic Mythos Identifies Hundreds of Longstanding Vulnerabilities: Anthropic Mythos surfaced hundreds of previously unknown, decade‑old software vulnerabilities using an AI testing harness. Treat AI-driven vulnerability discovery as a standard pre‑release step, and plan for the new triage, verification, and prioritization workflows it requires for secure, validated delivery (Principle 14, 16).

Claude Chrome Extension Vulnerability Allows Agent Takeover: The report details ShadowPrompt, a zero‑click flaw that let websites inject prompts into the browser extension and exfiltrate tokens and files until it was patched. The attack shows that browser integrations and agent endpoints are first‑class threat surfaces: tighten permission models, add prompt‑injection defenses, and enforce Gate controls in any agent product.
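
One piece of a tighter permission model is strict origin checking on anything a web page sends to an agent. The sketch below is an assumption-laden illustration, not the actual patch and not a description of how ShadowPrompt worked: the allowlist, the action names, and both functions are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of (scheme, host, port); None means the default port.
ALLOWED_ORIGINS = {("https", "app.example.com", None)}
# Hypothetical set of actions a page is ever allowed to request.
ALLOWED_ACTIONS = {"open_panel"}


def is_trusted_origin(origin: str) -> bool:
    """Exact-match the origin; suffix or substring checks are easy to bypass."""
    try:
        parts = urlsplit(origin)
        key = (parts.scheme, parts.hostname, parts.port)
    except ValueError:
        # Malformed origins (for example an invalid port) are never trusted.
        return False
    return key in ALLOWED_ORIGINS


def accept_message(origin: str, message: dict) -> bool:
    """Accept only allowlisted actions from trusted origins.

    Free-form text from the page is never forwarded as agent instructions.
    """
    if not is_trusted_origin(origin):
        return False
    return message.get("action") in ALLOWED_ACTIONS and "prompt" not in message
```

Look-alike hosts such as `https://app.example.com.evil.test` fail the exact match, and a message carrying a `prompt` field is rejected even from a trusted origin. Keeping page input to a fixed set of actions is the design choice that matters here; the origin check alone would not stop an injected prompt from a compromised trusted page.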

GitHub Improves Token Efficiency in Agentic Workflows: GitHub instruments agentic workflows at the API proxy and adds output constraints to materially cut token consumption and recurring CI spend. Instrumentation plus output shaping is a low‑friction operational lever that outcome engineers can use to reduce runtime costs and make agent behavior more predictable (Principle 06, 11).
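
Proxy-level instrumentation combined with an output cap might look like the following. This is a minimal sketch, not GitHub's implementation: `MeteredProxy`, `count_tokens`, and `fake_model` are hypothetical names, and whitespace-separated words are assumed to approximate tokens.

```python
from collections import defaultdict
from typing import Callable, Dict


def count_tokens(text: str) -> int:
    """Assumption: whitespace-separated words approximate tokens."""
    return len(text.split())


class MeteredProxy:
    """Wrap a model call, pass along an output cap, and record usage per workflow."""

    def __init__(self, model_call: Callable[[str, int], str], max_output_tokens: int):
        self._model_call = model_call
        self.max_output_tokens = max_output_tokens
        self.usage: Dict[str, Dict[str, int]] = defaultdict(
            lambda: {"calls": 0, "input": 0, "output": 0}
        )

    def call(self, workflow: str, prompt: str) -> str:
        # The cap is handed to the model so excess tokens are never generated,
        # rather than generated and then discarded.
        output = self._model_call(prompt, self.max_output_tokens)
        stats = self.usage[workflow]
        stats["calls"] += 1
        stats["input"] += count_tokens(prompt)
        stats["output"] += count_tokens(output)
        return output


def fake_model(prompt: str, max_tokens: int) -> str:
    """Stub that would emit 50 tokens but respects the cap."""
    return " ".join(["tok"] * min(50, max_tokens))
```

With `proxy = MeteredProxy(fake_model, 10)`, each `proxy.call("ci-triage", ...)` adds to `proxy.usage["ci-triage"]`, so spend can be broken down per workflow before any constraint is tuned. Measuring first and capping second keeps the cap from being set blindly.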

Engineers Review Agent-Generated Pull Requests Effectively: The piece reports that agent‑generated PRs are saturating reviewer bandwidth and recommends review heuristics, policy gates, and semantic duplication detection to preserve maintainability. If agents author code in your stack, build review workflows, automated policy gates, and duplication checks into CI to keep code auditable and maintainable (Principle 14, 15).
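
A duplication check suitable for CI can start much simpler than full semantic comparison. The sketch below swaps in a coarser technique, a structural fingerprint built with Python's standard `ast` module, so it catches functions that differ only in naming. The `fingerprint` and `find_duplicates` helpers are hypothetical, and each source string is assumed to contain exactly one top-level function.

```python
import ast
from typing import Dict, List, Tuple


class _EraseNames(ast.NodeTransformer):
    """Replace every identifier and argument name with a placeholder."""

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = "_"
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = "_"
        node.annotation = None
        return node


def fingerprint(func_source: str) -> str:
    """Return a name-independent dump of the function's syntax tree."""
    func = ast.parse(func_source).body[0]  # assumption: one function per string
    func.name = "_"
    _EraseNames().visit(func)
    return ast.dump(func)  # line and column attributes are excluded by default


def find_duplicates(functions: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair each function with the first earlier one sharing its fingerprint."""
    seen: Dict[str, str] = {}
    pairs: List[Tuple[str, str]] = []
    for name, source in functions.items():
        fp = fingerprint(source)
        if fp in seen:
            pairs.append((seen[fp], name))
        else:
            seen[fp] = name
    return pairs
```

Because all names are erased, including the names of called functions, this over-matches: `max(xs)` and `min(xs)` produce the same fingerprint. Hits are better treated as prompts for a human reviewer than as a blocking gate, and an embedding-based comparison would be needed to catch duplicates that differ in structure.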