Agent infrastructure & protocols: throughput, costs, embeddings, orchestration
Nemotron 3 Super Delivers 5x Higher Throughput for Agentic AI announces Nemotron 3 Super, a model that raises agentic AI throughput by up to 5× and offers a 1M-token context, a hybrid MoE architecture, and open weights. This matters because scalable, long‑context models change how you design agent pipelines and plan capacity for production agent fleets (Principle 09).
Slashing agent token costs by 98% with RFC 9457-compliant error responses rolls out machine‑readable RFC 9457 error responses that cut agent token usage by over 98% and return actionable retry guidance. Lowering token churn and making errors machine‑parseable shift agent design from costly blind retries to deterministic retry logic and compact state handling (Principle 06 / Principle 11).
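To make the retry idea concrete, here is a minimal Python sketch of how an agent might turn an RFC 9457 problem document into a deterministic retry decision. The `application/problem+json` media type and the `type`/`title`/`status` members come from the RFC, and `Retry-After` is a standard HTTP header. The `retryable` extension member, the `plan_retry` helper, and the backoff policy are assumptions made for illustration, not the format the article's service returns.

```python
import json

PROBLEM_JSON = "application/problem+json"  # media type defined by RFC 9457


def plan_retry(status, headers, body, attempt, max_attempts=3):
    """Return seconds to wait before the next attempt, or None to give up.

    `headers` is assumed to be a plain dict with canonical header names.
    """
    if attempt >= max_attempts:
        return None

    content_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    problem = json.loads(body) if content_type == PROBLEM_JSON else {}

    # ASSUMPTION: "retryable" is a hypothetical extension member; RFC 9457
    # permits extension members but does not define this one.
    if problem.get("retryable") is False:
        return None

    if status in (429, 503):
        retry_after = headers.get("Retry-After")
        # Only the delta-seconds form is handled; HTTP-date values fall back
        # to exponential backoff.
        if retry_after and retry_after.isdigit():
            return float(retry_after)
        return float(2 ** attempt)

    # Other statuses are treated as non-retryable in this sketch.
    return None
```

The agent never has to feed a verbose HTML error page back to the model: a few structured fields decide whether to wait, retry, or stop.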
Manufact raises $6.3M as MCP becomes the ‘USB-C for AI’ powering ChatGPT and Claude apps describes the Model Context Protocol (MCP), an open standard and toolkit to plug AI agents into apps and share context. A common protocol for context and tooling reduces bespoke integration work, letting outcome engineers treat context plumbing as infrastructure instead of one‑off code (Principle 11).
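For a sense of what that shared plumbing looks like on the wire, the sketch below builds the kind of JSON-RPC 2.0 request an MCP client sends to invoke a tool. The `tools/call` method with `name` and `arguments` parameters follows the public MCP specification as far as we know it; the helper names and the `search_tickets` tool are hypothetical, and the sketch does not use Manufact's toolkit or any official SDK.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def call_tool(name, arguments):
    """Build an MCP-style tool invocation (method name assumed: tools/call)."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


# "search_tickets" is a hypothetical tool name used only for illustration.
request = call_tool("search_tickets", {"query": "refund status"})
wire_payload = json.dumps(request)
```

Because every server accepts the same envelope, swapping one tool provider for another changes the `name` and `arguments`, not the integration code.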
Perplexity takes its ‘Computer’ AI agent into the enterprise, taking aim at Microsoft and Salesforce brings a multi‑model, multi‑agent orchestrator to enterprises, routing tasks across ~20 models and isolating sessions with Firecracker microVMs. This provides a real blueprint for safe, multi‑model orchestration and per‑session isolation patterns you can copy for reproducibility and tenant separation (Principle 09).
Google’s Gemini Embedding 2 arrives with native multimodal support to cut costs and speed up your enterprise data stack releases a unified multimodal embedding model that maps text, images, audio, video, and documents into a single vector space to reduce retrieval latency and cost. Unifying modalities at the embedding layer simplifies RAG and context engineering for agents that must reason across heterogeneous data sources, lowering integration friction (Principle 06 / Principle 11).
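The practical payoff of a single vector space is that one similarity search can rank items of any modality together. The toy Python sketch below illustrates this with hand-written three-dimensional vectors; the vectors, item ids, and the `top_k` helper are invented for illustration, and no Gemini API is called.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Rank every indexed item against the query, regardless of modality."""
    scored = [
        (cosine(query_vec, vec), item_id, modality)
        for item_id, modality, vec in index
    ]
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:k]


# ASSUMPTION: placeholder vectors standing in for real embeddings of a text
# document, an image, and an audio clip that share one vector space.
INDEX = [
    ("doc-17", "text", [0.9, 0.1, 0.0]),
    ("img-04", "image", [0.8, 0.2, 0.1]),
    ("clip-02", "audio", [0.0, 0.1, 0.9]),
]

results = top_k([1.0, 0.0, 0.0], INDEX)
```

With per-modality embedding models, the same query would need one index per modality plus a step to merge and rescale the scores.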