Ship at Inference Speed — Parallel Agents, WebMCP, SkillsBench

Shipping at Inference-Speed. Developers now ship software at model-inference speed, relying on agent-driven code streams and trusting Codex for large refactors. If your delivery loop is bounded by model latency, you need streaming CI, tighter human checkpoints, and new error budgets to keep outcome velocity predictable.

Just Talk To It — the no-bs Way of Agentic Engineering. Peter Steinberger lays out pragmatic parallel agents, human checkpoints, and blast-radius controls using the Codex CLI to deliver code faster. Use this pattern to design agent fleets with clear handoffs, containment, and explicit human approval gates (Principle 09).
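One way to picture an approval gate with a blast-radius limit is the minimal Python sketch below. It is not taken from the post or from the Codex CLI: the `ProposedChange` and `ApprovalGate` names, the three-file limit, and the protected path prefixes are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProposedChange:
    """A change an agent wants to apply (hypothetical structure)."""
    agent: str
    files: list[str]
    description: str


@dataclass
class ApprovalGate:
    """Auto-approves small changes and queues risky ones for a human."""
    max_files: int = 3  # assumed blast-radius limit
    protected_prefixes: tuple[str, ...] = ("migrations/", "infra/")  # assumed
    pending: list[ProposedChange] = field(default_factory=list)

    def needs_human(self, change: ProposedChange) -> bool:
        # A change is risky if it is too wide or touches a protected path.
        too_wide = len(change.files) > self.max_files
        touches_protected = any(
            path.startswith(self.protected_prefixes) for path in change.files
        )
        return too_wide or touches_protected

    def submit(self, change: ProposedChange) -> str:
        if self.needs_human(change):
            self.pending.append(change)  # held until a person reviews it
            return "pending_human_approval"
        return "auto_approved"
```

In this sketch a one-file edit under `src/` goes straight through, while anything under `infra/` or spanning more than three files waits in `pending` for review.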

Building SQLite with a Small Swarm. Six agents collaborate to build a 19k-line Rust SQLite clone with 282 passing tests, demonstrating that agents can produce substantial systems code. To reproduce their orchestration, define roles, enforce cross-agent contracts, and make tests the primary verification loop.
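The roles, contracts, and tests loop can be sketched roughly as follows. This is an assumption-laden illustration, not the project's actual orchestration: the `Role` type, the path-ownership contract, and the `run_tests` callback are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Role:
    """An agent role and the path prefixes it may modify (hypothetical)."""
    name: str
    owns: tuple[str, ...]


def violations(role: Role, changed_files: list[str]) -> list[str]:
    """Return the files that fall outside the role's owned prefixes."""
    return [path for path in changed_files if not path.startswith(role.owns)]


def accept(role: Role, changed_files: list[str],
           run_tests: Callable[[], bool]) -> bool:
    """Accept work only if it honors the contract and the tests pass."""
    if violations(role, changed_files):
        return False  # contract breach: reject without running tests
    return run_tests()
```

Here the contract is simply path ownership; a real setup might also check interface signatures or file formats before running the test suite.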

WebMCP Proposal. WebMCP defines a discoverable, schema-driven way for web apps to expose functions so browser and LLM agents can act inside interfaces with shared context. Implementing a WebMCP layer gives you standardized tool descriptors, safer automation boundaries, and simpler context engineering for product-facing agents.
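To make "discoverable, schema-driven" concrete, here is a small Python sketch of a tool registry that publishes descriptors and validates arguments before calling a handler. It does not use the WebMCP API or reproduce its descriptor format; `ToolRegistry`, its methods, and the `add_to_cart` tool are hypothetical names used only to show the general idea.

```python
import json
from typing import Any, Callable


class ToolRegistry:
    """Hypothetical registry of schema-described tools (not the WebMCP API)."""

    def __init__(self) -> None:
        self._tools: dict[str, dict[str, Any]] = {}

    def register(self, name: str, description: str,
                 params: dict[str, type], handler: Callable[..., Any]) -> None:
        self._tools[name] = {
            "description": description,
            "params": params,
            "handler": handler,
        }

    def describe(self) -> str:
        """Return JSON descriptors that an agent could discover."""
        descriptors = [
            {
                "name": name,
                "description": tool["description"],
                "params": {k: t.__name__ for k, t in tool["params"].items()},
            }
            for name, tool in self._tools.items()
        ]
        return json.dumps(descriptors, sort_keys=True)

    def call(self, name: str, args: dict[str, Any]) -> Any:
        """Validate arguments against the declared schema, then invoke."""
        tool = self._tools[name]
        params = tool["params"]
        if set(args) != set(params):
            raise ValueError(f"{name}: expected arguments {sorted(params)}")
        for key, expected in params.items():
            if not isinstance(args[key], expected):
                raise TypeError(f"{name}: {key} must be {expected.__name__}")
        return tool["handler"](**args)
```

Rejecting unknown or mistyped arguments before the handler runs is what gives the automation boundary: the agent can only do what the descriptor declares.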

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks. The benchmark finds that curated agent Skills materially boost task pass rates, while self-generated Skills add almost no benefit across 86 diverse tasks. Treat curated Skills as testable artifacts: version them, evaluate them across representative tasks, and include deterministic verification in your outcome audits.
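Evaluating a Skill as a testable artifact can be as simple as comparing pass rates with and without it on the same tasks. The sketch below is not the SkillsBench harness; the `Task` and `Agent` types and the `pass_rate` and `compare` helpers are assumptions, and each task carries its own deterministic verifier.

```python
from typing import Callable

# A task is a prompt plus a deterministic verifier of the agent's output.
Task = tuple[str, Callable[[str], bool]]
Agent = Callable[[str], str]


def pass_rate(agent: Agent, tasks: list[Task]) -> float:
    """Fraction of tasks whose verifier accepts the agent's output."""
    if not tasks:
        raise ValueError("need at least one task")
    passed = sum(1 for prompt, verify in tasks if verify(agent(prompt)))
    return passed / len(tasks)


def compare(baseline: Agent, with_skill: Agent,
            tasks: list[Task]) -> dict[str, float]:
    """Report pass rates for both agents and the difference between them."""
    base = pass_rate(baseline, tasks)
    skilled = pass_rate(with_skill, tasks)
    return {"baseline": base, "with_skill": skilled, "delta": skilled - base}
```

Running `compare` on every Skill version against a fixed task set gives a number you can track over time, which is the point of versioning Skills in the first place.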