Buildable Agents: Specs, State Machines, Orchestration & Audit Trails

GitHub Open-Sources Spec Kit for Spec-Driven Development. GitHub open-sources Spec Kit to make specifications executable and move teams to spec-first development, where the specification grounds code-generation agents. That reduces brittle deliveries that miss the original intent and gives you a formal ground truth to drive validation and iteration — Principle 02.

Statewright — Visual state machines that make AI agents reliable. Statewright introduces visual state-machine guardrails that constrain tool access per workflow phase to enforce predictable agent behavior. Use it to tighten runtime safety and observability when composing multi-step agents in production — Principle 07.
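
To make the idea of per-phase tool access concrete, here is a minimal sketch in plain Python. It is not Statewright's API: the `PhaseMachine` class, the phase names, and the tool names are all hypothetical, chosen only to show how an allowlist keyed by workflow phase can reject a tool call and leave a trace behind.

```python
from dataclasses import dataclass, field


class ToolNotAllowed(Exception):
    """Raised when a tool is requested in a phase that does not permit it."""


@dataclass
class PhaseMachine:
    # Hypothetical structure: phase -> tools the agent may call in that phase.
    allowed_tools: dict[str, set[str]]
    # phase -> phases the workflow may move to next.
    transitions: dict[str, set[str]]
    phase: str
    # Every decision is appended here so the run can be inspected later.
    history: list[tuple[str, str]] = field(default_factory=list)

    def call_tool(self, tool: str) -> None:
        """Record the call if the current phase allows it, otherwise refuse."""
        if tool not in self.allowed_tools.get(self.phase, set()):
            self.history.append((self.phase, f"DENIED {tool}"))
            raise ToolNotAllowed(f"{tool!r} is not allowed in phase {self.phase!r}")
        self.history.append((self.phase, f"CALL {tool}"))

    def advance(self, next_phase: str) -> None:
        """Move to next_phase only if the transition table permits it."""
        if next_phase not in self.transitions.get(self.phase, set()):
            raise ValueError(f"cannot go from {self.phase!r} to {next_phase!r}")
        self.history.append((self.phase, f"GOTO {next_phase}"))
        self.phase = next_phase


def build_demo_machine() -> PhaseMachine:
    """Illustrative three-phase workflow: plan, then edit, then review."""
    return PhaseMachine(
        allowed_tools={
            "plan": {"read_file", "search"},
            "edit": {"read_file", "write_file"},
            "review": {"read_file", "run_tests"},
        },
        transitions={"plan": {"edit"}, "edit": {"review"}, "review": {"edit"}},
        phase="plan",
    )
```

In this sketch, a `write_file` request during the `plan` phase raises `ToolNotAllowed` and is logged as denied; the same request succeeds once the machine has advanced to `edit`. Keeping the denial in `history` is what gives you the observability half of the guardrail.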

Orchestration Outweighs Model Wars in AI Infrastructure. Surf AI argues orchestration — routing, unified observability, and policy layers — determines reliability more than chasing a single best model. That reframes investment priorities: build robust routing, monitoring, and governance layers first to scale agent-driven outcomes — Principle 09.
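
As a rough illustration of what a routing, observability, and policy layer can look like, the sketch below puts a policy check in front of an ordered list of model backends and writes every decision to one shared log. Nothing here comes from Surf AI: `route`, the stub backends, and the policy function are hypothetical stand-ins, and a real deployment would call actual model endpoints.

```python
from typing import Callable

# A backend is any callable that turns a prompt into a reply.
Backend = Callable[[str], str]


def route(
    prompt: str,
    backends: list[tuple[str, Backend]],
    policy: Callable[[str], bool],
    log: list[dict],
) -> str:
    """Apply the policy, then try each backend in order until one succeeds."""
    if not policy(prompt):
        log.append({"event": "blocked"})
        raise PermissionError("prompt rejected by policy")
    for name, backend in backends:
        try:
            reply = backend(prompt)
        except Exception as exc:  # any backend failure triggers the fallback
            log.append({"event": "error", "backend": name, "error": str(exc)})
            continue
        log.append({"event": "ok", "backend": name})
        return reply
    raise RuntimeError("all backends failed")


# Hypothetical stubs standing in for real model endpoints.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary timed out")


def stable_fallback(prompt: str) -> str:
    return f"echo: {prompt}"


def no_secrets(prompt: str) -> bool:
    """Toy policy: refuse prompts that mention a password."""
    return "password" not in prompt.lower()
```

The point of the design is that the caller never names a model. Swapping or reordering backends, tightening the policy, or reading the log all happen in one place, which is the investment the item argues for.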

Needle — Distilled Gemini Tool Calling into a 26M Model. Cactus Compute releases Needle, a 26M model distilled to reproduce Gemini-style tool calling that can be run and fine-tuned locally. That makes specialized tool-calling models cheap and auditable for on-prem agents, which speeds up iteration and reduces dependence on large remote inference — Principle 07.

AWS Explains EU AI Act FLOPs Tracking for SageMaker Fine-Tuning. AWS documents using SageMaker with an open FLOPs meter to generate audit-ready compute records that map to EU AI Act thresholds. Instrumenting compute-level telemetry like this is essential for compliance, model provenance, and outcome audits — Principle 10.
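
For a sense of what a compute record might contain, the sketch below estimates training FLOPs and writes one JSON line per job. It is not the meter AWS describes: it assumes the common 6 x parameters x tokens approximation for dense transformer training, and `make_record` and `to_audit_line` are hypothetical helpers. The default threshold of 1e25 FLOPs is the EU AI Act's presumption level for general-purpose models with systemic risk; which threshold applies to a given fine-tuning job is a legal question, so it is left as a parameter.

```python
import json
from dataclasses import asdict, dataclass

# EU AI Act presumption level for systemic-risk general-purpose models.
SYSTEMIC_RISK_FLOPS = 1e25


def estimate_training_flops(params: float, tokens: float) -> float:
    """Approximate training compute as 6 * parameters * tokens (an assumption)."""
    return 6.0 * params * tokens


@dataclass
class ComputeRecord:
    job_name: str
    params: float
    tokens: float
    estimated_flops: float
    threshold_flops: float
    fraction_of_threshold: float


def make_record(
    job_name: str,
    params: float,
    tokens: float,
    threshold_flops: float = SYSTEMIC_RISK_FLOPS,
) -> ComputeRecord:
    """Build one record comparing a job's estimated compute to a threshold."""
    flops = estimate_training_flops(params, tokens)
    return ComputeRecord(
        job_name=job_name,
        params=params,
        tokens=tokens,
        estimated_flops=flops,
        threshold_flops=threshold_flops,
        fraction_of_threshold=flops / threshold_flops,
    )


def to_audit_line(record: ComputeRecord) -> str:
    """Serialize with sorted keys so records are stable and easy to diff."""
    return json.dumps(asdict(record), sort_keys=True)
```

Calling `to_audit_line(make_record("demo-finetune", 7e9, 2e9))` would produce one such line for a hypothetical 7B-parameter model fine-tuned on 2B tokens. A measured meter, like the one the AWS post covers, should replace the estimate wherever it is available.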