Agent Ops: routing, memory, context and the security boundary
Show HN: Smart model routing directly in Claude, Codex and Cursor. Weave Router picks the best model per request across Anthropic, OpenAI, Gemini and OSS using an on-box embedder and per-request scoring. Outcome engineers can use on-device routing to reduce latency, control costs, and compose provider-agnostic agent pipelines for hybrid deployments (Principles 06 & 09).
New agentic memory framework uses 118K tokens per query. LangMem burns through 3.26M.. MRAgent replaces static retrieval with active, associative memory reconstruction, cutting token use and runtime for long-horizon agent reasoning. That changes how you design agent memory: cheaper, reconstructive memories let multi-step agents scale without exploding token costs (Principles 06 & 11).
Lovelace Cuts AI Costs With Context Engines. Lovelace’s context engines replace prompt-stuffed guessing with graph-based context, cutting token costs and making agentic AI auditable. Treat context engines as a first-class artifact for agent stacks — they improve grounding, traceability and cost-predictability (Principles 11 & 06).
Incident Report: CVE-2026-LGTM. Competing AI review agents enter an adversarial disagreement loop that runs up compute bills and exposes supply-chain and prompt-injection risks. Design for adversarial coordination, circuit-breakers, and adjudication layers to stop runaway multi-agent loops before they burn budget or leak credentials (Principles 09, 14 & 15).
US tells OpenAI to restrict access to its most powerful AI model. Regulators urge limiting GPT-5.6 access to a narrow partner list over national-security concerns. Prepare contingency plans — degraded or private-model fallbacks, provenance checks, and negotiated access policies — because model availability is now an operational dependency (Principles 10 & 15).