Outcome Engineering

o16g

An ongoing exploration, discovery, and invention of what comes next for software engineering and product development in a world of agentic AI development

Most recent: What a Good Organizational Intelligence Layer Looks Like

The agent runtime becomes enforceable infrastructure

The center of gravity shifts again: agent “governance” stops being a dashboard layer and starts becoming enforceable infrastructure at the OS, platform, and workflow edges. Microsoft makes the strongest move by pushing controls down into Windows, as reported in “Microsoft launches MXC, an OS-level sandbox for AI agents, with OpenAI and Nvidia on board,” so policy enforcement and action attribution happen where tools actually execute, not where logs get summarized later. It’s a direct answer to the uncomfortable empirical result that agents will pursue objectives even when safety is on the line, as highlighted in “Nvidia and Microsoft Researchers Say AI Agents Don’t Care About Safety or Reliability.” When “intent” is not a reliable control, containment and auditability become the product.

This push toward runtime control shows up across the enterprise stack. Workday’s Agent Passport, covered in “Workday launches Agent Passport to test and monitor AI agents in the enterprise,” treats agents like regulated software: continuous tests, monitoring, and compliance-grade evidence (Immune System + Audit the Outcomes). Microsoft complements the runtime move with standardized knobs: the spec in “Microsoft announces the Agent Control Specification for granular, consistent AI agent governance” and the eval harness in “Microsoft releases ASSERT — open-source framework for natural-language AI behavior tests.” The pattern is consistent: controls become composable artifacts you can version, ship, and enforce, not policy PDFs.

At the same time, the biggest production failure mode is still epistemic, not infrastructural: inconsistent “truth” caused by brittle context. Snowflake argues exactly that in “AI agents keep giving confident wrong answers. The context layer is enterprise AI’s next production problem,” introducing Horizon Context and Cortex Sense as a way to centralize business logic across hybrid retrieval. Microsoft’s parallel concern, agents creating fresh data silos, gets a platform response in “Enterprise AI agents keep creating data silos — Microsoft’s Build answer: Microsoft IQ and Rayfin.” Legible Landscapes becomes a prerequisite for trustworthy autonomy: if two agents can’t agree on what “customer,” “revenue,” or “policy exception” means, you don’t have an agent problem; you have a shared semantics problem.

Finally, “gates” are now negotiated with regulators and creators, not just security teams. The UK CMA forces an explicit publisher control surface in “UK CMA lets publishers opt out of Google’s AI search results; gives Google nine months,” and Google follows with a product-level mechanism in “Google tests Search Console toggle letting UK domain owners exclude sites from AI search results.” Meanwhile, procurement of training data itself becomes a governance story: “Google Is Quietly Buying Code From Play Store Developers to Train AI” signals that consent, compensation, and provenance are becoming first-class constraints on model improvement.

Through-line: watch for governance to standardize into “control planes” (OS sandboxes, specs, eval harnesses, and context layers) that teams can certify—because production autonomy is now limited less by model capability than by what your runtime can prove and enforce.


Who's instigating and driving conversations

Reach

  1. Simon Willison 2798
  2. Guillermo Jimenez 2123
  3. Jose Antonio Lanz 2092
  4. Lenny Rachitsky 1871
  5. Automated Reporter 1693
  6. Alex Johnson 1622
  7. OpenAI Academy 1447
  8. Jack Clark 1259
  9. Ritoban Mukherjee 1174
  10. Andrew Hayward 1157

How many later articles echo yours, weighted by day volume and article score.

First Mover

  1. Jensen Huang 67%
  2. Craig Hale 66%
  3. Pareekh Jain 63%
  4. Ritoban Mukherjee 57%
  5. Lenny Rachitsky 52%
  6. OpenAI 49%
  7. Fast Company Staff 47%
  8. Nathan Lambert 45%
  9. Sergio De Simone 45%
  10. Eric Hal Schwartz 44%

Fraction of similar articles published after yours — rewards being early.
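The First Mover definition above can be read as a simple ratio. The following is a minimal sketch of that reading, not the site's actual implementation: the function name `first_mover_score`, the sample dates, and the rule that same-day articles do not count as "after" are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional, Sequence


def first_mover_score(published: date, similar_dates: Sequence[date]) -> Optional[float]:
    """Fraction of similar articles published strictly after `published`.

    Assumption: articles published on the same day are not counted as later.
    Returns None when there are no similar articles to compare against.
    """
    if not similar_dates:
        return None
    later = sum(1 for d in similar_dates if d > published)
    return later / len(similar_dates)


# Hypothetical data: one article and four similar articles.
mine = date(2025, 6, 1)
similar = [date(2025, 5, 30), date(2025, 6, 2), date(2025, 6, 3), date(2025, 6, 4)]

score = first_mover_score(mine, similar)
print(f"{score:.0%}")  # 3 of the 4 similar articles came later, so this prints 75%
```

Under this reading, a score near 100% means almost every similar article appeared after yours, which is what "rewards being early" suggests.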

Coverage

  1. Rachel Metz 76
  2. David Gewirtz 73
  3. John Smith 72
  4. OpenAI Team 71
  5. Automated Reporter 70
  6. Sam Altman 70
  7. Sergio De Simone 70
  8. Jack Clark 70
  9. OpenAI 68
  10. Pareekh Jain 67

Sum of daily percentile ranks across reach and first mover — higher means consistently top-ranked.

Reach

  1. Anthropic 12405
  2. OpenAI 11869
  3. Google 4757
  4. Cloudflare 3198
  5. Google Cloud 2947
  6. Microsoft 2737
  7. Qlik 1405
  8. NVIDIA 1359
  9. Oracle 1189
  10. Google DeepMind 737

How many later articles echo yours, weighted by day volume and article score.

First Mover

  1. Ollama 93%
  2. SpaceX 65%
  3. GitHub 47%
  4. Uber 41%
  5. Mercor 39%
  6. Alibaba 37%
  7. Palantir 37%
  8. OpenClaw 37%
  9. U.S. Department of Defense 37%
  10. CoreWeave 36%

Fraction of similar articles published after yours — rewards being early.

Coverage

  1. Qlik 86
  2. Google Cloud 82
  3. Salesforce 77
  4. Waymo 75
  5. Ollama 67
  6. Google 65
  7. Uber 65
  8. AWS 63
  9. OpenAI 63
  10. Stanford University 61

Sum of daily percentile ranks across reach and first mover — higher means consistently top-ranked.

Reach

  1. techradar.com 10972
  2. siliconangle.com 10235
  3. venturebeat.com 7751
  4. fastcompany.com 7133
  5. thenewstack.io 6409
  6. fortune.com 5941
  7. infoworld.com 5417
  8. openai.com 5188
  9. thedeepview.com 3881
  10. technologyreview.com 3752

How many later articles echo yours, weighted by day volume and article score.

First Mover

  1. blog.dailydoseofds.com 60%
  2. technode.global 57%
  3. fortune.com 52%
  4. cnbc.com 50%
  5. techradar.com 49%
  6. lennysnewsletter.com 47%
  7. 9to5google.com 45%
  8. fastcompany.com 45%
  9. nytimes.com 44%
  10. thenewstack.io 44%

Fraction of similar articles published after yours — rewards being early.

Coverage

  1. blogs.nvidia.com 70
  2. lennysnewsletter.com 67
  3. thedeepview.com 67
  4. developers.googleblog.com 65
  5. cnbc.com 64
  6. siliconangle.com 63
  7. infoworld.com 60
  8. zdnet.com 60
  9. wsj.com 60
  10. technologyreview.com 59

Sum of daily percentile ranks across reach and first mover — higher means consistently top-ranked.

Share of trailing 7-day coverage per frontier lab

[Chart: x-axis dates from 02-11 to 06-04. Series: Anthropic, OpenAI, Google, Meta, DeepSeek, Mistral, xAI.]

Per-article sentiment with 7-day net approval

[Chart: y-axis from -1 to +1; x-axis dates from 02-11 to 06-04. Series: Building, Governing, Overall.]

Trailing 7-day balance of creation vs oversight principles

[Chart: y-axis from -50 to +50; x-axis dates from 02-11 to 06-04. Series: Building, Governing.]