Agent Infrastructure: routing, tooling, payments, evaluation, context
Give GitHub Copilot CLI real code intelligence with language servers. GitHub adds an LSP Setup skill to Copilot CLI that installs and configures language servers, giving agents precise semantic code intelligence across 14 languages. Outcome engineers get a simple, reproducible way to give agents true code awareness — a core ingredient for reliable developer-facing agent workflows (Principles 06, 02).
Lium raises $5.5M to unlock complex scientific data for AI models. Following a $5.5M seed raise, Lium debuts an agentic harness that helps LLMs navigate messy scientific datasets. If you build domain agents, this is a blueprint for context engineering and dataset interfaces that make long-horizon reasoning over real-world data tractable (Principles 06, 11).
Ari Jacoby’s Concentrate AI enters the AI routing fight as token bills bite. Concentrate AI launches a model control plane that routes requests across models to optimize cost, latency, and policy. Treat this as infrastructure for outcome engineering: centralized routing lets you enforce data policies, cost controls, and model-suitability decisions at runtime (Principles 09, 12).
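To make the routing idea concrete, here is a minimal sketch of a policy-aware router that picks the cheapest model satisfying a latency budget and a data policy. It is not Concentrate AI's implementation; the `ModelOption` fields, the `route` function, the model names, and all prices and latencies are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                    # hypothetical model identifier
    cost_per_1k_tokens: float    # illustrative price, not a real quote
    p50_latency_ms: int          # illustrative median latency
    allows_sensitive_data: bool  # assumed data-policy flag


def route(options, *, sensitive, max_latency_ms):
    """Return the cheapest option that satisfies policy and latency limits."""
    eligible = [
        o for o in options
        if o.p50_latency_ms <= max_latency_ms
        and (o.allows_sensitive_data or not sensitive)
    ]
    if not eligible:
        raise LookupError("no model satisfies the policy and latency constraints")
    return min(eligible, key=lambda o: o.cost_per_1k_tokens)


# Hypothetical catalog used only for this example.
CATALOG = [
    ModelOption("small-fast", 0.10, 300, False),
    ModelOption("mid-private", 0.50, 800, True),
    ModelOption("large-slow", 2.00, 2500, True),
]
```

Calling `route(CATALOG, sensitive=True, max_latency_ms=1000)` skips the cheapest model because it fails the data policy. Policy is applied as a hard filter before cost is optimized, so a cheaper model can never win by violating a constraint.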
OpenAI and Visa partner to let AI agents make purchases online with user permission. OpenAI and Visa enable agents to complete online purchases with explicit user consent, embedding payments into agent workflows. That capability widens both product scope and risk surface: design consent, audit trails, and revocation gates before letting agents act on users’ money (Principles 15, 10).
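One way to picture consent, audit trails, and revocation working together is a gate that every agent purchase must pass through. The sketch below is an assumption-laden illustration, not the OpenAI or Visa API: `Consent`, `PurchaseGate`, and their methods are hypothetical names, and a real system would persist state and call an actual payment provider.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Consent:
    user_id: str
    max_amount_cents: int  # spending cap the user agreed to
    revoked: bool = False


@dataclass
class PurchaseGate:
    consents: dict = field(default_factory=dict)   # user_id -> Consent
    audit_log: list = field(default_factory=list)  # append-only record

    def grant(self, user_id, max_amount_cents):
        self.consents[user_id] = Consent(user_id, max_amount_cents)
        self._record(user_id, "grant", max_amount_cents, True)

    def revoke(self, user_id):
        if user_id in self.consents:
            self.consents[user_id].revoked = True
        self._record(user_id, "revoke", 0, True)

    def authorize(self, user_id, amount_cents):
        """Allow a purchase only under live consent and within the cap."""
        consent = self.consents.get(user_id)
        allowed = (
            consent is not None
            and not consent.revoked
            and 0 < amount_cents <= consent.max_amount_cents
        )
        self._record(user_id, "authorize", amount_cents, allowed)
        return allowed

    def _record(self, user_id, action, amount_cents, allowed):
        # Every decision is logged, including denials.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user_id": user_id,
            "action": action,
            "amount_cents": amount_cents,
            "allowed": allowed,
        })
```

Denials are logged alongside approvals, so the audit trail shows what the agent attempted as well as what it completed.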
Surprise upset: GPT-5.5 beats Claude Fable 5 on brutal new Agents’ Last Exam benchmark. The Agents’ Last Exam (ALE) benchmark exposes how top models fail at deterministic, artifact-based professional workflows over long horizons. Outcome engineers should adopt artifact-first, repeatable evaluations like ALE to validate agent reliability in production and catch silent drift (Principle 16).
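As a rough illustration of an artifact-first, repeatable check, the sketch below runs a task several times and compares the SHA-256 digest of each produced artifact with an expected digest. This is not how ALE scores submissions; `evaluate`, `artifact_digest`, and `fake_agent_task` are hypothetical names, and the stand-in task simply returns fixed bytes where a real agent would produce a file.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Fingerprint an artifact by its SHA-256 hex digest."""
    return hashlib.sha256(data).hexdigest()


def evaluate(task_fn, expected_digest, runs=3):
    """Run task_fn several times; pass only if every run yields the expected artifact."""
    digests = [artifact_digest(task_fn()) for _ in range(runs)]
    return {
        "passed": all(d == expected_digest for d in digests),
        "deterministic": len(set(digests)) == 1,
        "digests": digests,
    }


def fake_agent_task() -> bytes:
    # Stand-in for an agent run that writes a report; returns the artifact bytes.
    return b"quarter,revenue\nQ1,100\n"
```

Re-running the same check on a schedule and comparing the stored digests is one simple way to surface drift: a change in `passed` or `deterministic` between runs signals that agent behavior has shifted.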