Episode 66: Claude Friday Outage, Claude Code .168 Day-Late

AgentStack Daily EP066 — Claude Friday Outage, Claude Code .168 Day-Late Fix, OpenClaw Monthly Cadence, OpenAI ChatGPT Superapp, Apple WWDC 2026, Anthropic Mythos Wider Rollout, Microsoft MAI, Gemma 4 12B

Title: Claude Friday Outage, Claude Code .168 Day-Late Fix, OpenClaw Monthly Cadence Switch, OpenAI ChatGPT Superapp, Apple WWDC 2026, Anthropic Mythos Widens, Microsoft MAI Lands in Copilot, Gemma 4 12B on Mac

Tagline: Claude took a two-hour hit on Friday June 5 across the API, Claude Code, claude.ai, and Claude Cowork, primarily on Opus 4.7 and 4.8. Claude Code shipped .168 the next day to close the .167 regressions exposed by the outage window. OpenClaw switched release trains to a monthly patch cadence with v2026.6.5-beta.2. OpenAI is rebuilding ChatGPT into a coding-agents superapp ahead of its IPO. Apple WWDC 2026 leans on a Gemini-built Siri to fix the Apple Intelligence gap. Anthropic widens Project Glasswing to 150+ organizations and signals Mythos is closer to public release. Microsoft ships MAI-Thinking-1 and MAI-Code-1-Flash into GitHub Copilot and VS Code. Gemma 4 12B puts a 12B multimodal model in the 16GB local-Mac sweet spot. Project radar: A2A v1.0 milestone adoption, and the CheetahClaws Python-native harness keeps climbing. OpenClaw prerelease v2026.6.5-beta.2 is covered in the harness block.

Feed description: OpenClaw v2026.6.5-beta.2 and Claude Code 2.1.168 lead the agent-harness cycle, and the cycle opens with a Friday June 5 outage that hit Claude API, Claude Code, claude.ai, and Claude Cowork for roughly two hours — primarily Opus 4.7 and 4.8 — peaking near a thousand Downdetector reports. OpenClaw switched release trains to a monthly patch cadence with the June 2026 floor at 5.28. Claude Code shipped a focused day-late bug-fix release on the .167 baseline, closing session attachment, stream-json event ordering, and interrupt handling regressions that some users reported during the outage window. OpenAI is reportedly planning its biggest ChatGPT overhaul yet — a unified superapp that folds in Codex, agents, and third-party services ahead of a fall IPO. Apple WWDC 2026 opens June 8 with a Gemini-powered Siri as the headline. Anthropic expands Project Glasswing to 150+ organizations and signals Mythos-class capabilities are coming in weeks. Microsoft launches MAI-Thinking-1 and MAI-Code-1-Flash into GitHub Copilot. Gemma 4 12B ships an encoder-free multimodal design for 16GB local Macs. The MCP lane is brief this week — a one-paragraph blip, not a deep-dive. Project radar covers A2A v1.0 and the CheetahClaws Python harness.

Story Slate

Claude Friday outage (June 5), Claude Code .168 day-late fix, and OpenClaw v2026.6.5-beta.2 monthly cadence — The Claude stack took a roughly two-hour hit on Friday June 5 starting at 11:19 a.m. EDT. Anthropic confirmed elevated error rates across Claude API, Claude Code, claude.ai, and Claude Cowork. The disruption primarily hit Opus 4.7 and 4.8, with Downdetector peaking near a thousand US reports: 40% Claude Chat, 33% Claude Code, 20% the Claude app. Anthropic's status page showed the incident resolved by early Friday afternoon, and the public statement was that success rates had returned to expected levels. The outage is the real-world reason Claude Code .168 shipped the next day. Claude Code's npm latest is now 2.1.168, published June 6 at 23:41 UTC, one day after the .166 and .167 release wave. This is a focused bug-fix release that closes session attachment issues, stream-json event ordering regressions, and interrupt handling bugs reported against the .167 baseline, several of which match the failure modes users reported during the outage. The other harness line is moving on a different axis. OpenClaw prerelease is now v2026.6.5-beta.2, published June 7, 2026. The meaningful change is structural: the release train has switched to a monthly patch numbering scheme, and the June 2026 floor is pinned at 5.28 after the published beta. In practice, the next stable OpenClaw release is on a monthly cadence, and the version naming scheme has changed. Pre-transition tags remain compatible, but operators should expect a new shape going forward. The 2026.6.5-beta.2 build itself includes the new Parallel bundled web_search provider, MCP tool result coercion for non-text and non-image blocks, Anthropic extended-thinking recovery after prompt-cache expiry, and a macOS node mode fix that prevents silent self-reconnect away from a healthy direct Gateway session. Stable OpenClaw remains at 6.1 from June 3. The other harness lines held their positions this week: Hermes Agent at v0.16.0, OpenAI Codex CLI at rust-v0.137.0, Antigravity CLI on continuous delivery.
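To make the "MCP tool result coercion for non-text and non-image blocks" item concrete, here is a minimal sketch of what such a coercion step could look like. It is not OpenClaw's implementation. The function names and the fallback text formats are hypothetical, and the block shapes assume the content types commonly seen in MCP tool results (text, image, audio, resource, resource_link).

```python
import json

# Block types a text-and-image model surface can take as-is.
PASSTHROUGH_TYPES = {"text", "image"}


def coerce_block(block: dict) -> dict:
    """Return a text or image block for any MCP-style content block (hypothetical helper)."""
    kind = block.get("type")
    if kind in PASSTHROUGH_TYPES:
        return block
    if kind == "resource_link":
        # Keep the link so the model can still cite or fetch it.
        return {"type": "text", "text": f"[resource link] {block.get('uri', '')}"}
    if kind == "resource":
        resource = block.get("resource", {})
        if "text" in resource:
            # Embedded text resources carry their content inline.
            return {"type": "text", "text": resource["text"]}
        return {"type": "text", "text": f"[binary resource] {resource.get('uri', '')}"}
    # Unknown or unsupported block: serialize it rather than drop it silently.
    return {"type": "text", "text": json.dumps(block, sort_keys=True)}


def coerce_result(content: list) -> list:
    """Coerce every block in a tool result's content list."""
    return [coerce_block(block) for block in content]
```

The design choice worth noting is the last branch: an unrecognized block is serialized instead of discarded, so the agent loop never loses a tool result just because the block type is new.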

Technical depth angle: explain the Claude Friday outage root-cause class (the elevated-error-rate pattern across multiple model SKUs and chat surfaces often traces to inference-routing or load-balancer issues at the model-tier boundary, and the .168 cleanup addresses the stream-json and session-attachment surface that surfaced during the outage), the .168 cleanup pass against the .167 baseline (background session attachment, stream-json event ordering, interrupt handler responsiveness), the OpenClaw monthly patch numbering switch from YYYY.M.MINOR to YYYY.M.PATCH and what 2026.6.5 floor implies for version pins, the Parallel bundled web_search provider replacement of an external dependency with an in-process implementation, MCP tool result coercion for non-text and non-image result blocks, Anthropic extended-thinking recovery after prompt-cache expiry (preserves extended-thinking context across cache roll, removes re-derivation tax), and the macOS node mode reconnect pinning fix.
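For the version-pin discussion, a small parser helps show what a YYYY.M.PATCH tag with an optional prerelease suffix implies for ordering. This is a sketch under assumptions: the regular expression is inferred only from the tag shape quoted above (v2026.6.5-beta.2), the names are hypothetical, and short forms such as "6.1" are deliberately rejected rather than guessed at.

```python
import re
from typing import NamedTuple, Optional

# Assumed tag shape: optional "v", YYYY.M.PATCH, optional "-label.N" prerelease suffix.
TAG_RE = re.compile(
    r"^v?(?P<year>\d{4})\.(?P<month>\d{1,2})\.(?P<patch>\d+)"
    r"(?:-(?P<label>[a-z]+)\.(?P<num>\d+))?$"
)


class Tag(NamedTuple):
    year: int
    month: int
    patch: int
    label: Optional[str] = None
    num: Optional[int] = None


def parse_tag(tag: str) -> Tag:
    """Parse a release tag; raise ValueError on anything that does not match."""
    m = TAG_RE.match(tag.strip())
    if m is None:
        raise ValueError(f"unrecognized tag: {tag!r}")
    num = m.group("num")
    return Tag(
        int(m.group("year")),
        int(m.group("month")),
        int(m.group("patch")),
        m.group("label"),
        int(num) if num is not None else None,
    )


def sort_key(tag: Tag) -> tuple:
    """Order tags so a stable release sorts after its own prereleases."""
    stable = 1 if tag.label is None else 0
    return (tag.year, tag.month, tag.patch, stable, tag.label or "", tag.num or 0)
```

A pin policy can then be a one-line comparison, for example refusing any tag whose sort_key exceeds that of the last stable release you validated.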

Actionability angle: run claude --version and confirm you are on .168 — if you have been holding a background session that was exhibiting stream-json stalls or interrupt issues on .167, the upgrade should resolve them; for outage-resilience, watch Anthropic's status page and subscribe to the incident RSS feed so you can correlate local model errors with platform incidents in real time; rotate any long-lived Claude API sessions that may have ended mid-call during the Friday window; for OpenClaw, decide whether to track the prerelease 2026.6.5-beta.2 or hold stable on 6.1 based on your version-pin policy; test the Parallel bundled web_search provider on a code-mode task to confirm latency and reliability improvement over the external dependency; if you run a long-running agent loop with extended-thinking, validate that the prompt-cache roll no longer re-derives the thinking state.
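The "correlate local model errors with platform incidents" step can be sketched as a simple time-window check. The start time comes from the story above (11:19 a.m. EDT on June 5); the two-hour end time and the ten-minute slack are assumptions, since the source only says "roughly two hours". All names are hypothetical, and timestamps must be timezone-aware.

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4))

# Start is from the incident report; the end is an assumption ("roughly two hours").
INCIDENT_START = datetime(2026, 6, 5, 11, 19, tzinfo=EDT)
INCIDENT_END = INCIDENT_START + timedelta(hours=2)


def in_incident(
    ts: datetime,
    start: datetime = INCIDENT_START,
    end: datetime = INCIDENT_END,
    slack: timedelta = timedelta(minutes=10),
) -> bool:
    """True if a timezone-aware timestamp falls inside the padded incident window."""
    return start - slack <= ts <= end + slack


def split_errors(timestamps: list) -> tuple:
    """Partition local error timestamps into (during_incident, outside_incident)."""
    during = [t for t in timestamps if in_incident(t)]
    outside = [t for t in timestamps if not in_incident(t)]
    return during, outside
```

Errors that land in the "outside" bucket are the ones worth investigating locally, for example as .167 regressions rather than platform fallout.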

Listener hook: a same-week outage hit the Claude stack for two hours on Friday and the bug-fix release landed the next day — that is the agent harness layer's new normal, and the OpenClaw monthly cadence change is the next layer of operational maturity.

OpenAI plans a ChatGPT "superapp" — chat is dead, agents are the product — The Financial Times reported June 7 that OpenAI is preparing the biggest ChatGPT overhaul since launch. The pitch from inside the company is blunt — a senior OpenAI employee told the FT that "chat is dead." The new ChatGPT is being rebuilt as a unified superapp that folds in Codex, AI agents, image generation, and third-party services, a single product surface that does the work instead of conversing about it. Thibault Sottiaux, who leads OpenAI's core product and platform, framed the goal as a "personal agent capable of helping you across everything in your life, be it personally or at work." The strategic context is an IPO race. Anthropic filed confidentially on June 1. OpenAI is expected to follow in the coming weeks. Anthropic's annualized revenue hit $47 billion in May, up from $30 billion earlier this year, mostly on Claude Code and the Mythos preview. OpenAI is being told by investors that it needs a clearer revenue path, and the superapp is it. The move also explains the Sora wind-down — in March, the Wall Street Journal reported that OpenAI was abandoning "side quests" like the standalone Sora video product. The superapp strategy confirms that read. OpenAI is consolidating its surface area into a single revenue product instead of a portfolio of experiments.

Technical depth angle: explain how a unified superapp surface differs from a chatbot with attached plugins (a single sign-on path, and real Codex as the foundation versus 2023 ChatGPT Plugins), how Codex becomes the coding/agent primitive that other surfaces compose on top of, why the enterprise per-token billing model is the revenue path investors are pricing in, the strategic shift away from Sora-style standalone products and what it signals about OpenAI's product philosophy (single revenue product over portfolio of experiments), and the relationship between the superapp consolidation and the IPO timeline (Q3-Q4 2026 confidential S-1 window).

Actionability angle: audit which OpenAI products you currently use standalone and identify which would collapse into the superapp if it ships; if you have a ChatGPT Team or Enterprise seat, watch for an admin-side migration notice; if you are using Sora or other sunset-prone products, note the data export path before the consolidation; if you have a ChatGPT Plugins integration from 2023, evaluate whether the new superapp surface is a drop-in replacement or requires a rebuild.

Listener hook: OpenAI is rebuilding its flagship product around the same agentic coding thesis that has been working for Anthropic and Claude Code — and it is doing it on a deadline that ends at the IPO.

Apple WWDC 2026 — Gemini-built Siri as the AI reset, with Tim Cook's last keynote — WWDC 2026 opens June 8 at 10:00 a.m. PT with a pre-recorded keynote streamed from Apple Park. This is Tim Cook's last WWDC as CEO before he hands the role to John Ternus in September. The headline is the Siri overhaul Apple has been teasing and delaying since WWDC 2024. The new Siri is built on a custom Gemini model developed jointly with Google's Gemini team as part of the January 2026 Apple-Google partnership. The reported feature set: more conversational, context-aware, multi-step task handling, app-spanning actions, and a standalone Siri app capable of competing with ChatGPT, Claude, and Gemini directly. Bloomberg's Mark Gurman reports a new "Visual Intelligence" section in the Camera app that uses Google Image Search for object recognition. Reports also point to AI-driven Photos features, AI wallpapers tied to user mood, expanded Genmoji, and an App Store AI agent integration. The operating system lineup ships as iOS 27, iPadOS 27, macOS 27, watchOS 27, and visionOS 27. iOS 27 needs to accommodate Apple's first foldable iPhone shipping in September.

Technical depth angle: explain the custom Gemini model architecture Apple and Google reportedly co-developed (likely Gemini 3 backbone with Apple-tuned on-device variants, a joint model family that serves both Android/Google and iOS/macOS through separate API surfaces), how on-device and cloud inference split across the iOS 27 and macOS 27 surfaces (on-device model runs the simple path, cloud model runs the agentic and multi-step path, with a routing layer that decides based on the request type), what the standalone Siri app surface is and how it differs from the integrated Siri experience (a direct competitor to ChatGPT and Claude app surfaces, with its own system tray and a launch from the home screen as a first-class app), and the privacy/on-device tradeoff that the Apple-Google partnership structure implies (Apple gets the model capability, Google gets a billion-device distribution surface for Gemini, both wrap the data in their respective privacy and licensing terms).

Actionability angle: install the iOS 27 or macOS 27 beta when it lands in the developer track after the keynote; test the new Siri conversational layer against your current ChatGPT or Claude app to compare multi-step task handling; explore the standalone Siri app to see how it positions against ChatGPT and Claude as a daily driver; if you are an enterprise iOS shop, ask your Apple account team about the Siri app's data handling and cloud routing policies before rolling it out as a default surface.

Listener hook: WWDC 2026 is the moment Apple stops pretending it can build a frontier assistant on its own and starts shipping the same Gemini backbone that powers Google's own products — the question is whether Apple can wrap it in a privacy story that sticks.

Anthropic widens Project Glasswing to 150+ organizations — Mythos preview getting closer to public — Anthropic announced June 2 that Project Glasswing, its joint industry program to find and fix critical software vulnerabilities using AI, is expanding to about 150 new organizations across more than 15 countries. The expansion covers power, water, healthcare, communications, and hardware, industries that were not "well-represented" in the original 50-partner cohort that got Claude Mythos Preview access in April. New access is going to U.S.-based identity and security vendor Okta, South Korean companies Samsung, SK Hynix, and SK Telecom, NATO, the EU's cybersecurity agency ENISA, and others. The original cohort has reportedly used Mythos to find more than 10,000 high or critical security flaws. Anthropic says it is "working as quickly as we can to safely release Mythos-level capabilities" to the public, but the public release waits for "highly robust safeguards" to prevent misuse. Politico reported this week that Anthropic has pledged to make Mythos-class models available to all customers "in the coming weeks." Anthropic's IPO timeline (confidentially filed June 1) is in the same window. The company needs a Mythos-class public model launch and a security story strong enough for an S-1 to land. The competitive context: OpenAI offers GPT-5.5 Cyber to UK banks that Anthropic has so far blocked from Mythos previews. The UK AI Security Institute tested both models and reported "a similar level of performance."

Technical depth angle: explain the architectural shape of Mythos Preview as a vulnerability-discovery model versus a general assistant (specialized RLHF on CVE datasets, red-team reinforcement learning, focused context window and tool surface optimized for code and protocol analysis, not a general-purpose model), how Project Glasswing's pre-public review is being used to harden release safeguards (pre-deployment red-team, finding-class-aware guardrails, document review pipeline for sensitive findings before public model access), what "Mythos-class capabilities" means in terms of public release tier (Opus, Sonnet, or a new top tier above Opus with a separate pricing lane), and how the OpenAI GPT-5.5 Cyber competition changes the release calculus (frontier bifurcation, both labs gatekeeping access carefully, safety framing as differentiator, public release shape driven by market pressure rather than internal safety review).

Actionability angle: monitor Anthropic's news page and the Mythos preview FAQ for the wider public release announcement; if you are an enterprise buyer, ask your Anthropic account team for the Mythos-class public release timeline and pricing tier; for security teams, read the Project Glasswing partnership program docs to see whether your industry sector is now eligible for preview access; for the agent stack, watch for the public Mythos API surface: a new tool-use tier or a separate claude-mythos model ID would be the integration signal.

Listener hook: Mythos-class models are weeks from public release, the first frontier model launch paired with a hard security story, and Anthropic is timing it for an IPO window that closes in the fall.

Microsoft ships MAI-Thinking-1, MAI-Code-1-Flash, MAI-Image-2.5, and MAI-Voice-2 at Build 2026 — Microsoft used Build 2026 on June 2 to announce its first in-house advanced reasoning model and a full pipeline of supporting models. MAI-Thinking-1 is a "medium-sized model" that Microsoft says matches leading models on key software engineering benchmarks. MAI-Code-1-Flash is positioned as inference-efficient and is integrated into GitHub Copilot and Visual Studio Code. That is the most stack-relevant drop for the agent stack, because MAI-Code is now a first-party Microsoft option for code-mode flows in the editor most agents are already wired into. MAI-Image-2.5 (and a flash variant) handles text-to-image and image editing. MAI-Transcribe-1.5 is "five times faster than competing models" on speech-to-text. MAI-Voice-2 (with a flash version "coming soon") adds 15 new languages and new voice options. PCMag tested all four and called the new MAI family "fine, and that's about the best I can say about them." The takeaway is that Microsoft now has a working in-house model lineup that can substitute for OpenAI across image, voice, and code paths. That substitution capability is the strategic point. Microsoft is no longer solely dependent on OpenAI for the model layer of its product surface.

Technical depth angle: explain MAI-Thinking-1's "medium-sized" reasoning architecture and the tradeoffs versus larger reasoning models like Claude Mythos or GPT-5.5 Pro (specialized reasoning layers, test-time-compute scaling, smaller base with stronger reasoning RL), the MAI-Code-1-Flash integration surface in GitHub Copilot and VS Code (Copilot extension API as the primary surface, VS Code Language Server as the deep integration, model picker as the user-facing entry point), how the MAI model family slots into Microsoft's broader OpenAI independence strategy (substitution capability across image, voice, code, and reasoning as a renegotiation lever for the OpenAI contract), and the cost and latency shape of inference-efficient code models in production (per-token cost vs. frontier models, latency budget for editor completion flows, batch size and request rate characteristics).

Actionability angle: enable MAI-Code-1-Flash in GitHub Copilot settings and run a code-completion test against your current default model; try MAI-Image-2.5 through the Microsoft Foundry playground to compare output against your current image provider; test MAI-Voice-2 across one of the new supported languages for a TTS workload; audit any OpenAI dependencies in your agent stack to identify where MAI can substitute; for SWE-Bench Pro and SWE-Bench Verified benchmark results, watch for independent confirmation of the MAI-Thinking-1 claim before assuming parity with the leading tier.

Listener hook: Microsoft just shipped its own coding model into the editor most of you already use, and the strategic point is that Microsoft is not solely dependent on OpenAI anymore.

Gemma 4 12B hits Google AI Edge Gallery for Mac as a local 16GB multimodal model — Google released Gemma 4 12B on June 3, 2026 — a 12-billion-parameter open-weights model with an Apache 2.0 license, designed to run locally on a standard laptop with 16GB of VRAM or unified memory. The architectural shift is the encoder-free "Unified" design — raw audio waveforms and visual patches flow directly into the LLM backbone without secondary processing modules. The context window is 256K tokens, with native agentic tool-use capabilities and a step-by-step reasoning mode. Gemma 4 12B is available immediately on Hugging Face, Kaggle, and through Google AI Edge Gallery, which launched on macOS the same day. The companion Google AI Edge Eloquent dictation app is also available on Mac. The five Google models available in AI Edge Gallery for Mac are all from the Gemma family, tuned for instruct behavior.

Technical depth angle: explain the encoder-free "Unified" multimodal architecture and how audio and visual patches feed directly into the LLM backbone (raw audio waveform tokens and raw visual patch tokens enter the same tokenizer and embed in the same latent space, eliminating the encoder-decoder step and the encoder weights from the working set), the 256K context window implementation (likely a hybrid attention pattern with sliding-window and full attention layers, 12B parameters sized to fit 256K context in 16GB of unified memory through aggressive KV cache quantization), the native agentic tool-use capability surface (structured output format that an agent loop can consume directly, no adapter layer), the licensing and redistribution shape under Apache 2.0 (commercial use, fine-tuning, redistribution allowed without per-token or per-call cost), and what encoder-free design means for latency and memory usage versus encoder-decoder multimodal approaches (lower latency, smaller memory footprint, slightly different accuracy profile for vision-heavy tasks).
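The 16GB claim is easier to reason about with back-of-envelope arithmetic. The sketch below uses the standard formulas for weight and KV-cache footprint; only the 12B parameter count and the 256K context length come from the story. The layer count, KV-head count, and head dimension are hypothetical placeholders, not Gemma 4 12B's published configuration, and 256K is assumed to mean 262,144 tokens.

```python
GIB = 1024 ** 3


def weight_bytes(params: float, bits_per_weight: int) -> float:
    """Memory for model weights at a given precision."""
    return params * bits_per_weight / 8


def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   tokens: int, bits_per_value: int) -> float:
    """Memory for keys and values when every layer attends over all tokens."""
    # The factor of 2 covers one key tensor and one value tensor per layer.
    return 2 * layers * kv_heads * head_dim * tokens * bits_per_value / 8


PARAMS = 12e9          # from the story
CONTEXT = 256 * 1024   # assumption: 256K means 262,144 tokens

# Hypothetical architecture numbers, for illustration only.
LAYERS, KV_HEADS, HEAD_DIM = 48, 8, 128

for bits in (16, 8, 4):
    print(f"weights at {bits}-bit: {weight_bytes(PARAMS, bits) / GIB:.1f} GiB")

for bits in (16, 4):
    size = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT, bits)
    print(f"full-attention KV cache at {bits}-bit: {size / GIB:.1f} GiB")
```

Under these placeholder numbers, 16-bit weights alone exceed 16GB and a full-attention KV cache at the full context is larger still, which is why the angle above points at weight quantization, sliding-window layers, and KV-cache quantization as the likely levers.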

Actionability angle: pull the Gemma 4 12B checkpoint from Hugging Face and run it through Google AI Edge Gallery on a Mac with 16GB of unified memory; compare one coding task output against your current local model; test the 256K context window on a real codebase or long document understanding task; try the audio input with the AI Edge Eloquent dictation app; for a fine-tuning evaluation, run a small domain adaptation on a private corpus and deploy the fine-tuned variant in a private inference environment to evaluate the per-call economics.

Listener hook: a 12B model that sees, hears, reasons, and fits in 16GB of laptop memory is the new local-first baseline for an agent that does not need to call out to the cloud.

MCP lane (one-paragraph blip) and project radar: CheetahClaws, A2A Protocol v1.0 — Two quick hits on the MCP security front, kept short because the news cycle this week is model-heavy. OpenAI is rolling out Lockdown Mode and Active Sessions for ChatGPT on June 8, 2026, making two account-security controls more broadly available: Lockdown Mode limits outbound network requests to reduce data exfiltration from prompt-injection attacks, and Active Sessions lets users review where their account is signed in. The controls are available on personal and self-serve Business accounts. The same MCP ecosystem that produced the late-May 893-finding audit is moving fast on the response side. The scanner surface (AgentAudit) and the server hardening surface are both receiving updates. No single CVE this week justifies a deep-dive. The lane to watch: as more MCP servers connect to the agent stack, the operational practice of treating each connected server as a network service with a documented input-validation review becomes the default, not the exception.
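The idea of limiting outbound requests to blunt prompt-injection exfiltration can be illustrated with a generic host allowlist. This is a sketch of the general technique only. It says nothing about how OpenAI implements Lockdown Mode, and the host names and function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from reviewed config.
ALLOWED_HOSTS = frozenset({"api.example.com", "docs.example.com"})


def is_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Permit only HTTPS requests whose exact host is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    # hostname strips userinfo and port, and is already lowercased.
    host = parts.hostname or ""
    return host in allowed_hosts
```

Exact-match comparison on the parsed hostname is deliberate: substring or suffix checks are the usual way lookalike hosts such as api.example.com.evil.net slip through.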

CheetahClaws 3.0.5 is a Python-native multi-model agent harness from SafeRL-Lab, designed as a readable alternative to the compiled TypeScript bundle most agent harnesses ship as. The release landed June 4, 2026 with Claude-Code-style quiet output as the default behavior. The agent loop fits in roughly 740 lines of Python. The model support list is broad — Anthropic, OpenAI, Gemini, Kimi, Qwen, Zhipu, DeepSeek, MiniMax, Ollama, LM Studio, and any OpenAI-compatible endpoint. The feature set covers runtime tool registration with MCP and git plugins, markdown skills, a task dependency graph with blocks/blocked-by semantics, two-layer context compression, offline voice, cloud session sync, and bridges to Telegram, WeChat, Slack, and QQ. The repo has over 700 stars. The A2A Protocol reached v1.0 in 2026 under the Linux Foundation. Originally launched by Google, A2A is now governed alongside MCP. The protocol defines agent cards (JSON capability manifests) for agent discovery and a task-based state machine for long-running interactions using JSON-RPC 2.0.
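The blocks/blocked-by task graph mentioned above maps naturally onto a topological sort. The sketch below uses the standard-library graphlib module and is not taken from the CheetahClaws codebase; the function names and the example task names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def run_order(blocked_by: dict) -> list:
    """Return tasks in an order where every task follows the tasks that block it.

    blocked_by maps each task to the set of tasks it is waiting on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(blocked_by).static_order())


def blocks(blocked_by: dict) -> dict:
    """Invert the blocked-by view into the blocks view (task -> tasks it holds up)."""
    inverted = {task: set() for task in blocked_by}
    for task, deps in blocked_by.items():
        for dep in deps:
            inverted.setdefault(dep, set()).add(task)
    return inverted


if __name__ == "__main__":
    graph = {"test": {"build"}, "build": {"fetch"}, "fetch": set()}
    print(run_order(graph))
    print(blocks(graph))
    try:
        run_order({"a": {"b"}, "b": {"a"}})
    except CycleError as exc:
        print("cycle detected:", exc.args[1])
```

Storing only the blocked-by edges and deriving the blocks view on demand keeps a single source of truth, so the two views cannot drift apart.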

Technical depth angle: explain the Python-native multi-model agent loop architecture versus the compiled TypeScript bundle most agent harnesses ship as (740 lines of Python for the agent loop, all logic readable in one sitting, runtime tool registration decoupled from the model call path), the Claude-Code-style quiet output default (per-tool clutter suppressed, single spinner, single summary line per turn), the broad model support list and the OpenAI-compatible endpoint shim (a single API surface that any provider can plug into), the markdown skills for declarative capability definition (a single markdown file declares a skill, the harness registers it as a tool, no Python code required), the task dependency graph with blocks/blocked-by semantics (a directed acyclic graph for multi-task coordination), two-layer context compression (a primary layer that summarises the recent window and a secondary layer that summarises the long-term window), offline voice (TTS runs locally without a cloud call), the bridge integrations to Telegram, WeChat, Slack, and QQ (chat surfaces as agent endpoints); explain A2A agent card JSON capability manifests for discovery (a JSON document that describes an agent's capabilities, authentication, and endpoint, exposed at a well-known URL), the task-based state machine for long-running interactions (Submitted, Working, Input-Required, Completed, Failed, Canceled as the canonical states, transitions encoded in JSON-RPC 2.0 messages), JSON-RPC 2.0 message format, the MCP versus A2A scope distinction (MCP standardizes how an agent connects to external tools and data sources, meaning what an agent can do; A2A standardizes how agents communicate with each other, meaning how agents work together), Linux Foundation governance, and what v1.0 formalization means for protocol stability and adoption; explain the Lockdown Mode outbound network request limiter implementation (allowlist of permitted outbound hosts, request shape constraints that prevent indirect prompt-injection exfiltration, per-session state that holds the lock and a key rotation that releases it), the Active Sessions audit surface (per-device and per-location sign-in records, user-visible kill switch, token rotation on sign-out), and the operational practice of treating each connected MCP server as a network service with a documented input-validation review.
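The task-based state machine can be sketched with the six states listed above. The state names come from this section; the transition table, the method name, and the parameter names are illustrative assumptions and should be checked against the A2A v1.0 specification before use. Only the outer envelope fields (jsonrpc, id, method, params) are standard JSON-RPC 2.0.

```python
import json
from enum import Enum


class TaskState(str, Enum):
    SUBMITTED = "submitted"
    WORKING = "working"
    INPUT_REQUIRED = "input-required"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumed transition table; terminal states have no outgoing edges.
ALLOWED = {
    TaskState.SUBMITTED: {TaskState.WORKING, TaskState.FAILED, TaskState.CANCELED},
    TaskState.WORKING: {
        TaskState.INPUT_REQUIRED,
        TaskState.COMPLETED,
        TaskState.FAILED,
        TaskState.CANCELED,
    },
    TaskState.INPUT_REQUIRED: {TaskState.WORKING, TaskState.FAILED, TaskState.CANCELED},
}


def advance(current: TaskState, new: TaskState) -> TaskState:
    """Return the new state, or raise ValueError if the transition is not allowed."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {new.value}")
    return new


def status_message(request_id: int, task_id: str, state: TaskState) -> str:
    """Wrap a state update in a JSON-RPC 2.0 request envelope."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "example/taskStatus",  # hypothetical method name, not from the spec
        "params": {"taskId": task_id, "state": state.value},
    })
```

Keeping the transition table as data rather than scattered conditionals makes it easy to diff against the specification when the protocol revises its allowed transitions.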

Actionability angle: clone the CheetahClaws repo, read the roughly 740-line agent module, run one task with quiet output enabled, and connect it to a local Ollama instance or OpenRouter; read the A2A v1.0 specification on the a2aproject/A2A GitHub repo and identify one multi-agent handoff point in your workflow where A2A agent cards could replace a custom integration.

Listener hook: CheetahClaws gives the agent stack a Python-native multi-model harness you can read in 740 lines, and A2A v1.0 gives agents from different frameworks a formal handoff protocol — the two projects together close the multi-model and multi-agent gaps that have been the agent stack's biggest architectural debt.

Model Discovery Check

MiniMax — No new model release between June 4 and June 8, 2026 in the MiniMax M-series or M-code line. The MiniMax M3 release on June 1, 2026 remains the most recent material model drop. Selection decision: not selected this cycle because the most recent material drop is already in the slate coverage window. MiniMax M3 is a 1M-context open-weight model with MSA/sparse attention, multimodal support, MiniMax Code integration, and API availability through the MiniMax platform.
Anthropic — The Friday June 5 outage is the operational Anthropic event of the week: roughly two hours of elevated errors across Claude API, Claude Code, claude.ai, and Claude Cowork, primarily affecting Opus 4.7 and 4.8, with Downdetector peaking near a thousand US reports. The Claude Code .168 release the next day is the immediate response. No new public Claude model tier release this week. Claude Mythos Preview expansion via Project Glasswing is the release-shaped event (see story 4). The next public release event is the "Mythos-class capabilities to all customers" window that Anthropic signaled as "coming weeks." Selection decision: the outage is folded into Story 1 (Claude Code .168) because the bug-fix release is the operational response; the Mythos preview expansion is the strongest Anthropic release story and is selected as Story 4.
OpenAI — No new GPT model release this week. The "chat is dead" ChatGPT superapp is a product restructuring, not a model drop. The Codex CLI release cycle (rust-v0.137.0) is on its own track. Selection decision: not selected because the superapp is a product story, not a model drop, and the current Codex release has been covered.
Google / Gemini — Gemma 4 12B released June 3 (see story 6). Gemini Spark is the agentic surface on Google's side. No new Gemini flagship tier release this week; the WWDC Siri partnership (see story 3) is the most material Gemini-Apple event. Selection decision: Gemma 4 12B is selected as the most accessible local multimodal model in the 16GB-RAM sweet spot.
xAI / Grok — No new Grok model release this week. Gopuff and SpaceXAI launched the Grok-powered "Go" personal shopping assistant on June 3, 2026 — a deployment event, not a model drop. The xAI roadmap signal for a new model tier is not in this week's news cycle. Selection decision: not selected.
Meta — No new Llama or other Meta model release in the verified cycle. Selection decision: not selected.
Mistral — No new Mistral model release in the verified cycle. Nemotron Coalition work with Mistral and Perplexity continues for Nemotron 4 (announced Computex, June 1) but is not a Mistral model drop. Selection decision: not selected.
Qwen / Alibaba — No new Qwen model release in the verified cycle. Selection decision: not selected.
DeepSeek — No new DeepSeek model release in the verified cycle. Selection decision: not selected.
Z.ai / GLM — No new GLM model release in the verified cycle. Selection decision: not selected.
Kimi / Moonshot — No new Kimi model release in the verified cycle. The Kimi Code CLI 0.11.0 release is the Moonshot agent tooling event from a previous episode. Selection decision: not selected.
NVIDIA — Nemotron 3 Ultra, a 550B-parameter open-weight model, was announced at Computex on June 1, 2026. It is the top U.S. open-weight model by a comfortable margin. Nemotron 4 is in co-development through the Nemotron Coalition with Mistral and Perplexity. Not a direct agent-stack driver this week, but a meaningful open-weight release. Selection decision: not selected because the open-weight narrative this week is dominated by Gemma 4 12B and the local-Mac story.
Microsoft — MAI-Thinking-1, MAI-Code-1-Flash, MAI-Image-2.5, MAI-Transcribe-1.5, MAI-Voice-2 announced at Build 2026 on June 2 (see story 5). First in-house reasoning model. Direct agent-stack impact through Copilot and VS Code integration. Selection decision: MAI family is selected because MAI-Code-1-Flash in GitHub Copilot and VS Code is the direct agent-stack impact.

Primary source links: MiniMax https://minimaxi.com/, Anthropic https://docs.anthropic.com/, OpenAI https://openai.com/, Google https://blog.google/, xAI https://x.ai/, Meta https://ai.meta.com/, Mistral https://mistral.ai/, Qwen https://qwen.alibaba.com/, DeepSeek https://deepseek.com/, Z.ai https://z.ai/, Moonshot https://www.moonshot.ai/, NVIDIA https://www.nvidia.com/, Microsoft https://www.microsoft.com/.

GitHub Project Radar

CheetahClaws — https://github.com/SafeRL-Lab/CheetahClaws — Python-native multi-model agent harness, 740-line agent loop, broad model support (Anthropic, OpenAI, Gemini, Kimi, Qwen, Zhipu, DeepSeek, MiniMax, Ollama, LM Studio, any OpenAI-compatible endpoint), Claude-Code-style quiet output default, MCP and git plugin support, markdown skills, task dependency graph with blocks/blocked-by semantics, two-layer context compression, offline voice, cloud session sync, bridges to Telegram/WeChat/Slack/QQ. 700+ GitHub stars, active development as of June 4, 2026. Stack improvement angle: gives the agent stack a Python-native multi-model harness that is readable in one sitting, supports local-first deployment through Ollama and LM Studio, and bridges the chat surface gaps in the agent stack for the major Chinese messaging platforms. Try now: clone the CheetahClaws repo, read the agent module in 740 lines, run one task with quiet output enabled, connect it to Ollama local or OpenRouter, and compare session behavior against Claude Code on the same task.
A2A Protocol — https://github.com/a2aproject/A2A — Open protocol for agent-to-agent interoperability, JSON-RPC 2.0 task state machine, agent card discovery. 24,153 GitHub stars, active development as of June 6, 2026. Stack improvement angle: enables cross-framework agent handoff without custom integration code per pair — the formal interoperability layer the agent stack has been missing. A Claude Code session can delegate to a Hermes agent, or an OpenClaw agent can hand off to a Codex thread, through the same protocol surface. Try now: read the A2A v1.0 spec to understand agent card structure, then design one multi-agent handoff point in your workflow using the protocol's task state machine.
OpenClaw — https://github.com/openclaw/openclaw — Agent harness and gateway, monthly patch cadence as of v2026.6.5-beta.2, macOS node mode fix in the latest beta, Parallel bundled web_search provider, MCP tool result coercion, Anthropic extended-thinking recovery. Active development as of June 7, 2026. Stack improvement angle: the agent harness that ties the agent stack together — sessions, plugins, auth, browser, cron, provider routing, and the gateway surface for multi-node deployments. The monthly cadence change is the operational improvement most worth tracking. Try now: read the v2026.6.5-beta.2 release notes, decide whether to track prerelease or hold stable on 6.1, and run one full agent loop end-to-end through the bundled web_search provider to feel the latency and reliability improvement.
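
The "blocks/blocked-by" semantics mentioned in the CheetahClaws entry can be pictured with a small sketch. This is not CheetahClaws code: the class and method names (TaskGraph, block, ready, complete) are hypothetical, and the sketch skips cycle detection. It only illustrates the idea that a task becomes runnable once everything blocking it has finished.

```python
from collections import defaultdict


class TaskGraph:
    """Toy task graph: block(a, b) means a blocks b (b is blocked by a)."""

    def __init__(self):
        self.tasks = set()
        self.done = set()
        self.blocks = defaultdict(set)      # task -> tasks it blocks
        self.blocked_by = defaultdict(set)  # task -> unfinished tasks blocking it

    def block(self, blocker, blocked):
        """Record that `blocker` must finish before `blocked` can start."""
        self.tasks.update((blocker, blocked))
        self.blocks[blocker].add(blocked)
        if blocker not in self.done:
            self.blocked_by[blocked].add(blocker)

    def ready(self):
        """Tasks that are not done and have no unfinished blockers."""
        pending = self.tasks - self.done
        return sorted(t for t in pending if not self.blocked_by[t])

    def complete(self, name):
        """Mark a task done and release everything it was blocking."""
        if self.blocked_by[name]:
            blockers = sorted(self.blocked_by[name])
            raise ValueError(f"{name} is still blocked by {blockers}")
        self.done.add(name)
        for blocked in self.blocks[name]:
            self.blocked_by[blocked].discard(name)


graph = TaskGraph()
graph.block("plan", "edit")   # plan blocks edit
graph.block("edit", "test")   # edit blocks test
print(graph.ready())          # only "plan" is runnable at first
graph.complete("plan")
print(graph.ready())          # "edit" is released once "plan" is done
```

Keeping both directions of the edge makes the two questions an agent loop asks cheap: "what can I run now" reads blocked_by, and "what did finishing this unlock" reads blocks.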

Show Notes

[00:00] Intro: OpenClaw v2026.6.5-beta.2 monthly cadence switch, Claude Code .168 day-late fix, OpenAI ChatGPT superapp, Apple WWDC 2026, Anthropic Mythos widening, Microsoft MAI in Copilot, Gemma 4 12B on Mac

The OpenClaw v2026.6.5-beta.2 prerelease is the headline release of the cycle. It switches release trains to a monthly patch numbering scheme with the June 2026 floor pinned at 5.28. The build bundles the new Parallel web_search provider, MCP tool result coercion for non-text and non-image blocks, Anthropic extended-thinking recovery after prompt-cache expiry, and a macOS node mode fix. The Claude stack also took an outage on Friday June 5 that ran roughly two hours from 11:19 a.m. EDT, hitting Claude API, Claude Code, claude.ai, and Claude Cowork with elevated error rates, primarily on Opus 4.7 and 4.8. Downdetector peaked near a thousand US reports before Anthropic confirmed the incident resolved by early Friday afternoon. Claude Code 2.1.168 is the next-day response: a focused day-late bug-fix release on the .167 baseline that closes session attachment, stream-json event ordering, and interrupt handling bugs, several of which match the failure modes users reported during the outage window. After the harness block, OpenAI is rebuilding ChatGPT into a coding-agents superapp ahead of a fall IPO, Apple WWDC 2026 opens with a Gemini-built Siri, Anthropic widens Project Glasswing to 150+ organizations, Microsoft ships MAI-Thinking-1 and MAI-Code-1-Flash into GitHub Copilot, and Gemma 4 12B hits Google AI Edge Gallery for Mac as a 16GB local multimodal model. The MCP lane is brief this week: a one-paragraph blip, not a deep-dive. Project radar covers A2A v1.0 and the CheetahClaws Python harness.

[02:00] Claude Friday outage (June 5), Claude Code .168 day-late bug-fix, OpenClaw v2026.6.5-beta.2 monthly cadence switch — release coverage

The Claude stack took a roughly two-hour hit on Friday June 5, 2026 starting at 11:19 a.m. EDT. Anthropic confirmed elevated error rates across Claude API, Claude Code, claude.ai, and Claude Cowork. The disruption primarily hit Opus 4.7 and 4.8, and Downdetector peaked near a thousand US reports: forty percent for Claude Chat, thirty-three percent for Claude Code, and twenty percent for the Claude app. Anthropic's status page showed the incident resolved by early Friday afternoon, and the public statement was that success rates had returned to expected levels. The outage is the real-world reason Claude Code .168 shipped the next day. The release is a focused bug-fix on the .167 baseline that closes session attachment issues, stream-json event ordering regressions, and interrupt handling bugs, several of which match the failure modes users reported during the outage. The takeaway for the harness layer is the operational response time. A two-hour outage on Friday, a bug-fix release on Saturday, and a same-week changelog entry: that sequence is the new normal for the agent stack.

Claude Code 2.1.168 is the npm latest, published June 6 at 23:41 UTC, one day after the .166 and .167 release wave. The version is verified from the npm registry and the changelog. This is a focused bug-fix release, not a feature release. The release notes describe a cleanup wave that closes session attachment issues, stream-json event ordering regressions, and interrupt handling bugs reported against the .167 baseline. The scope matters because those three surfaces are exactly where background agent work goes wrong silently. A background session that fails to attach cleanly is a session that loses its running task, and the failure mode is invisible until the operator reconnects and finds an empty task list. A stream-json session that mishandles event ordering is a session that drops work mid-tool-call, and the consumer at the other end of the JSON pipe sees a partial or duplicated event. An interrupt handler that swallows a keypress is a session that looks like it is hanging when it actually accepted the input and is waiting on the model. Point one sixty eight is the cleanup pass for point one sixty seven, and the team got the patch out the door inside a day of the feature release.

The version metadata is also worth a note. The npm latest dist-tag is point one sixty eight, and the npm stable dist-tag remains point one fifty three. That gap between latest and stable is intentional. Anthropic uses the latest dist-tag to roll forward through cleanup releases while keeping stable pinned to a known-good build for fleet environments that prefer not to chase every point release. The behavior delta between point one sixty seven and point one sixty eight is in session stability, not in capability. Background sessions that were stuck should resume cleanly. Stream-json consumers that were receiving truncated or duplicated tool events should see clean event ordering. Interactive users who pressed interrupt at the start of a turn and watched the session ignore the keypress should now see the interrupt accepted.
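
The latest-versus-stable gap is easy to check for yourself. Running npm view @anthropic-ai/claude-code dist-tags --json prints a JSON object that maps each dist-tag to a version. The sketch below parses output of that shape and reports how many patch releases separate the two tags. The function names are hypothetical, and the sample string is hard-coded from the versions quoted in this episode rather than fetched live.

```python
import json


def parse_version(version):
    """Turn '2.1.168' into (2, 1, 168) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def dist_tag_gap(dist_tags_json):
    """Return (latest, stable, patch_gap) from npm dist-tags JSON text."""
    tags = json.loads(dist_tags_json)
    latest = parse_version(tags["latest"])
    stable = parse_version(tags["stable"])
    # The patch gap is only meaningful when major and minor match.
    gap = latest[2] - stable[2] if latest[:2] == stable[:2] else None
    return tags["latest"], tags["stable"], gap


# Sample shaped like the npm output, filled with the versions from this episode.
sample = '{"latest": "2.1.168", "stable": "2.1.153"}'
latest, stable, gap = dist_tag_gap(sample)
print(f"latest={latest} stable={stable} patch releases ahead={gap}")
```

A fleet that pins to stable would read the second value. A single workstation chasing fixes would read the first.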

The OpenClaw line is moving on a different axis. The v2026.6.5-beta.2 prerelease, published June 7, carries the new monthly patch numbering scheme, and the June 2026 floor is pinned at five point two eight. The meaningful change is structural: the release train has switched to a monthly cadence and the version naming scheme has changed with it, so operators should expect stable releases to arrive on that monthly rhythm going forward. Pre-transition tags remain compatible, which means nothing breaks on upgrade, and existing six point one deployments continue to work without intervention. The June prerelease bundle itself is dense. The Parallel bundled web_search provider replaces an external dependency with an in-process implementation, and the win is latency and reliability on the search path. The bundled provider removes a network hop and a third-party API surface, which means a tool call that returns search results no longer depends on an external service being up. MCP tool result coercion handles non-text and non-image result blocks uniformly, so a tool that returns a structured payload no longer needs a custom adapter for the agent to consume it. Anthropic extended-thinking recovery after prompt-cache expiry closes a class of recovery issues where the prompt cache is invalidated and the extended-thinking state is lost. The macOS node mode fix prevents a silent self-reconnect away from a healthy direct Gateway session. Stable OpenClaw remains at six point one from June 3.
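
For operators with a version-pin policy, the stable-versus-prerelease decision comes down to how tags sort. The sketch below assumes tags shaped like v2026.6.1 and v2026.6.5-beta.2, the two shapes that appear in this episode, and treats a "-beta.N" suffix as the only prerelease form. That is an assumption, not a statement of the OpenClaw tagging rules, and the function names are hypothetical.

```python
import re

# Assumed tag shape: vYYYY.M.PATCH with an optional -beta.N suffix.
TAG = re.compile(r"^v(\d{4})\.(\d+)\.(\d+)(?:-beta\.(\d+))?$")


def parse_tag(tag):
    """Return a sortable key: (year, month, patch, prerelease marker)."""
    match = TAG.match(tag)
    if match is None:
        raise ValueError(f"unrecognized tag: {tag}")
    year, month, patch, beta = match.groups()
    # A stable tag sorts after every beta with the same year.month.patch.
    pre = (0, int(beta)) if beta is not None else (1, 0)
    return (int(year), int(month), int(patch), pre)


def is_prerelease(tag):
    return parse_tag(tag)[3][0] == 0


# v2026.6.5-beta.1 is a made-up tag, included only to show beta ordering.
tags = ["v2026.6.5-beta.2", "v2026.6.1", "v2026.6.5-beta.1"]
stable_tags = [t for t in tags if not is_prerelease(t)]
print(max(stable_tags, key=parse_tag))  # newest stable tag to pin
print(max(tags, key=parse_tag))         # newest tag if you track prereleases
```

With the tags above, the first line selects v2026.6.1 and the second selects v2026.6.5-beta.2, which matches the choice described in the practical queue.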

[12:00] OpenAI ChatGPT "superapp" — chat is dead

The Financial Times reported June 7 that OpenAI is preparing the biggest ChatGPT overhaul since launch. The pitch from inside the company is blunt, and a senior OpenAI employee told the FT that "chat is dead." The new ChatGPT is being rebuilt as a unified superapp that folds in Codex, AI agents, image generation, and third-party services. The product surface is being narrowed to a single revenue product that does the work instead of conversing about it. Thibault Sottiaux, who leads OpenAI's core product and platform, framed the goal as a "personal agent capable of helping you across everything in your life, be it personally or at work." The strategic context is an IPO race. Anthropic filed confidentially on June 1. OpenAI is expected to follow in the coming weeks. Anthropic's annualized revenue hit $47 billion in May, up from $30 billion earlier this year, mostly on Claude Code and the Mythos preview. OpenAI is being told by investors that it needs a clearer revenue path, and the superapp is it. The move also explains the Sora wind-down. In March, the Wall Street Journal reported that OpenAI was abandoning "side quests" like the standalone Sora video product. The superapp strategy confirms that read. OpenAI is consolidating its surface area into a single revenue product instead of a portfolio of experiments. ChatGPT Plugins, the company's first attempt at this consolidation in March 2023, did not stick. The 2026 attempt has a different foundation. Codex is real, agentic coding is paying, and the enterprise customer already has a single sign-on path. The "chat is dead" framing is the marketing reset — the previous framing was assistant, the current one is coworker.

[22:00] Apple WWDC 2026 — Gemini-built Siri

WWDC 2026 opens June 8 at 10:00 a.m. PT with a pre-recorded keynote streamed from Apple Park. This is Tim Cook's last WWDC as CEO before he hands the role to John Ternus in September. The headline is the Siri overhaul Apple has been teasing and delaying since WWDC 2024. The new Siri is built on a custom Gemini model developed jointly with Google's Gemini team as part of the January 2026 Apple-Google partnership. The reported feature set: more conversational, context-aware, multi-step task handling, app-spanning actions, and a standalone Siri app capable of competing with ChatGPT, Claude, and Gemini directly. Bloomberg's Mark Gurman reports a new "Visual Intelligence" section in the Camera app that uses Google Image Search for object recognition. Reports also point to AI-driven Photos features, AI wallpapers tied to user mood, expanded Genmoji, and an App Store AI agent integration. The operating system lineup ships as iOS 27, iPadOS 27, macOS 27, watchOS 27, and visionOS 27. iOS 27 needs to accommodate Apple's first foldable iPhone shipping in September. The iPhone Fold is expected to support two apps side-by-side for the first time, with an iPad-like display when open.

[32:00] Anthropic Project Glasswing widens to 150+ organizations

Anthropic announced June 2 that Project Glasswing, its joint industry program to find and fix critical software vulnerabilities using AI, is expanding to about 150 new organizations across more than 15 countries. The expansion covers power, water, healthcare, communications, and hardware, industries that were not "well-represented" in the original 50-partner cohort that got Claude Mythos Preview access in April. New access is going to U.S.-based identity and security vendor Okta, South Korean companies Samsung, SK Hynix, and SK Telecom, NATO, the EU's cybersecurity agency ENISA, and others. The original cohort has reportedly used Mythos to find more than 10,000 high or critical security flaws. Anthropic says it is "working as quickly as we can to safely release Mythos-level capabilities" to the public, but the public release waits for "highly robust safeguards" to prevent misuse. Politico reported this week that Anthropic has pledged to make Mythos-class models available to all customers "in the coming weeks." Anthropic's IPO timeline (confidentially filed June 1) is in the same window. The company needs a Mythos-class public model launch and a security story strong enough for an S-1 to land. The competitive context: OpenAI offers GPT-5.5 Cyber to UK banks that Anthropic has so far blocked from Mythos previews. The UK AI Security Institute tested both models and reported "a similar level of performance." That result is the signal that the frontier is genuinely bifurcating. Both labs have a cyber-capable model, both are gatekeeping access carefully, and both are pitching safety framing to differentiate.

[42:00] Microsoft MAI-Thinking-1, MAI-Code-1-Flash, MAI-Image-2.5

Microsoft used Build 2026 on June 2 to announce its first in-house advanced reasoning model and a full pipeline of supporting models. MAI-Thinking-1 is a "medium-sized model" that Microsoft says matches leading models on key software engineering benchmarks. MAI-Code-1-Flash is positioned as inference-efficient and is integrated into GitHub Copilot and Visual Studio Code. That is the most relevant drop for the agent stack, because MAI-Code is now a first-party Microsoft option for code-mode flows in the editor most agents are already wired into. MAI-Image-2.5 (and a flash variant) handles text-to-image and image editing. MAI-Transcribe-1.5 is "five times faster than competing models" on speech-to-text. MAI-Voice-2 (with a flash version "coming soon") adds 15 new languages and new voice options. PCMag tested all four and called the new MAI family "fine, and that's about the best I can say about them." The reasoning is competitive, and the image and voice models are functional but not differentiated. The takeaway is that Microsoft now has a working in-house model lineup that can substitute for OpenAI across image, voice, and code paths. That substitution capability is the strategic point. Microsoft is no longer solely dependent on OpenAI for the model layer of its product surface.

[52:00] Gemma 4 12B on Google AI Edge Gallery for Mac

Google released Gemma 4 12B on June 3, 2026. It is a twelve billion parameter open-weights model with an Apache 2.0 license, designed to run locally on a standard laptop with sixteen gigabytes of VRAM or unified memory. The architectural shift is the encoder-free "Unified" design. Raw audio waveforms and visual patches flow directly into the LLM backbone without secondary processing modules, which means the model can hear, see, and reason without a routing layer. The context window is 256K tokens, with native agentic tool-use capabilities and a step-by-step reasoning mode. Gemma 4 12B is available immediately on Hugging Face, Kaggle, and through Google AI Edge Gallery, which launched on macOS the same day. The companion Google AI Edge Eloquent dictation app is also available on Mac. The five Google models available in AI Edge Gallery for Mac are all from the Gemma family and are instruction-tuned, meaning they are built for instruction-following rather than text completion. The encoder-free architecture matters because it removes the latency and memory overhead of separate audio and vision encoders. The traditional multimodal design routes audio and vision through dedicated encoders that produce embeddings, and the embeddings are then concatenated with the text token stream and fed into the LLM. The encoder-free design skips the encoder stage and feeds raw audio and visual inputs directly into the LLM, which means the model learns to handle audio and vision as part of the same token stream. There are two wins. On latency, a multimodal request no longer pays the encoder inference cost. On memory, the encoder weights are gone from the working set. The 256K context window is the other architectural bet: a twelve billion parameter model with that much context is a real capability for local agent stacks.
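
The sixteen-gigabyte target is easier to reason about with a little arithmetic. The sketch below estimates memory for the weights alone at a few precisions. The precisions are assumptions chosen for illustration, since the announcement as summarized here does not say which quantization the 16GB figure relies on. The estimate also ignores the KV cache, activations, and whatever the operating system needs.

```python
def weight_memory_gib(params_billion, bits_per_weight):
    """Approximate memory for model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


BUDGET_GIB = 16  # the laptop memory figure quoted for Gemma 4 12B

# Precisions below are illustrative assumptions, not published settings.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    gib = weight_memory_gib(12, bits)
    verdict = "under" if gib < BUDGET_GIB else "over"
    print(f"{label}: {gib:.1f} GiB of weights, {verdict} the {BUDGET_GIB} GiB budget")
```

At 16-bit precision the weights alone come to roughly 22 GiB, so a 12B model only lands inside a 16GB machine with some form of quantization, and a long context eats further into whatever headroom is left.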

[60:00] MCP lane (brief blip) and project radar: CheetahClaws, A2A Protocol v1.0

One brief MCP note this week. OpenAI is rolling out Lockdown Mode and Active Sessions for ChatGPT on June 8, making two account-security controls more broadly available. Lockdown Mode limits outbound network requests to reduce data exfiltration from prompt-injection attacks, and Active Sessions lets users review where their account is signed in. The controls land on personal and self-serve Business accounts. The same MCP ecosystem that produced the late-May audit is moving fast on the response side, and the scanner and server hardening surfaces are both receiving updates. That is a one-paragraph blip, not a deep-dive, because the news cycle is model-heavy this week.

CheetahClaws three point zero five is a Python-native multi-model agent harness from SafeRL-Lab, designed as a readable alternative to the compiled TypeScript bundle most agent harnesses ship as. The release landed June 4 with Claude-Code-style quiet output as the default behavior. The agent loop fits in roughly 740 lines of Python, and the model support list is broad: Anthropic, OpenAI, Gemini, Kimi, Qwen, Zhipu, DeepSeek, several others, Ollama, LM Studio, and any OpenAI-compatible endpoint. The feature set covers runtime tool registration with MCP and git plugins, markdown skills for declarative capability definition, a task dependency graph with blocks and blocked-by semantics, two-layer context compression, offline voice, cloud session sync, and bridges to Telegram, WeChat, Slack, and QQ. The repo has over 700 stars with activity concentrated in the agent loop and task graph. The trade-off is real: error handling on provider failures, retry logic on transient tool errors, and observability hooks are all thinner than in a mature harness like Claude Code.

The A2A Protocol reached version one in 2026 under the Linux Foundation. Originally launched by Google, A2A is now governed alongside MCP. The protocol defines agent cards, which are JSON capability manifests for agent discovery, and a task-based state machine for long-running interactions using JSON-RPC 2.0. The MCP versus A2A distinction is the key mental model: MCP standardizes how an agent connects to external tools, databases, and data sources; A2A standardizes how agents communicate with each other. The repository has more than 24,000 stars and active development, and the protocol has reached sufficient maturity that builders should be aware of it when designing multi-agent workflows.
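
To make the agent card and task state machine concrete, here is a rough sketch. It is not taken from the A2A specification: the card fields, the state names, the allowed transitions, and the method name in the example request are all illustrative assumptions, and the spec is the authority on the real ones. The only part that follows a fixed standard is the JSON-RPC 2.0 envelope (jsonrpc, id, method, params).

```python
import json

# Assumed states and transitions, for illustration only.
TRANSITIONS = {
    "submitted": {"working", "canceled", "failed"},
    "working": {"input-required", "completed", "canceled", "failed"},
    "input-required": {"working", "canceled"},
    "completed": set(),
    "canceled": set(),
    "failed": set(),
}


def advance(state, new_state):
    """Move a task to new_state, rejecting transitions the table forbids."""
    if new_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition: {state} -> {new_state}")
    return new_state


def jsonrpc_request(method, params, request_id):
    """Build a JSON-RPC 2.0 request envelope as a JSON string."""
    envelope = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.dumps(envelope)


# Hypothetical agent card: a JSON capability manifest used for discovery.
agent_card = {
    "name": "review-agent",
    "description": "Reviews pull requests",
    "url": "https://agents.example.com/review",
    "skills": [{"id": "code-review", "name": "Code review"}],
}

# "example/sendTask" is a placeholder method name, not a real A2A method.
print(jsonrpc_request("example/sendTask", {"skill": "code-review"}, 1))

state = "submitted"
for next_state in ("working", "input-required", "working", "completed"):
    state = advance(state, next_state)
print(state)
```

The point of the state table is that a long-running handoff can pause for input and resume without either agent guessing where the other one is.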

[66:00] Practical queue

The practical queue this week is short and concrete. For Claude Code, run the version command and confirm you are on point one sixty eight; if you have been holding a background session that was exhibiting stream-json stalls or interrupt issues on point one sixty seven, the upgrade should resolve them. For outage resilience, watch the Anthropic status page and subscribe to the incident RSS feed so you can correlate local model errors with platform incidents in real time, and rotate any long-lived Claude API sessions that may have ended mid-call during the Friday June 5 window. For OpenClaw, decide whether to track the June prerelease or hold stable on six point one based on your version-pin policy. For OpenAI ChatGPT, audit standalone products that will collapse into the superapp and note data export paths for any sunset-prone service. For Apple, install the iOS 27 or macOS 27 beta after WWDC and test the new Siri against ChatGPT or Claude for multi-step task handling. For Anthropic, monitor the Mythos public release announcement and ask your account team for the timeline. For Microsoft, enable MAI-Code-1-Flash in GitHub Copilot and run a completion test against your current default model. For Gemma 4 12B, pull the checkpoint and run it on a 16GB Mac to compare one coding task against your current local model. For CheetahClaws, clone the repo and read the 740-line agent loop. For A2A, read the version one specification and identify one handoff point in your workflow where agent cards could replace a custom integration.
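
The outage-resilience item can be done with very little code if your client already logs error timestamps. The sketch below sorts timestamps into "during the incident" and "outside the incident". The start time is the 11:19 a.m. EDT figure from this episode. The end time is an assumption derived from "roughly two hours", so take the real window from the status page. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4))
INCIDENT_START = datetime(2026, 6, 5, 11, 19, tzinfo=EDT)
# Assumption: "roughly two hours"; use the status page for the real end time.
INCIDENT_END = INCIDENT_START + timedelta(hours=2)


def in_incident_window(ts, start=INCIDENT_START, end=INCIDENT_END):
    """True if a timezone-aware timestamp falls inside the incident window."""
    return start <= ts <= end


def split_errors(timestamps):
    """Separate error timestamps into (during incident, outside incident)."""
    during = [t for t in timestamps if in_incident_window(t)]
    outside = [t for t in timestamps if not in_incident_window(t)]
    return during, outside


# Made-up local error log entries in UTC, for illustration.
errors = [
    datetime(2026, 6, 5, 15, 45, tzinfo=timezone.utc),  # 11:45 a.m. EDT
    datetime(2026, 6, 5, 19, 0, tzinfo=timezone.utc),   # 3:00 p.m. EDT
]
during, outside = split_errors(errors)
print(f"{len(during)} during the incident, {len(outside)} outside it")
```

Errors that fall outside the window are the ones worth investigating locally, since they cannot be explained by the platform incident.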

[68:00] Outro

That is the cycle. Harness first, model lane second, project radar third, practical queue last. The Claude Friday outage is a reminder that the agent stack now has real production reliability concerns, and the next-day bug-fix release is the operational pattern that makes those concerns manageable. The full show notes, with links and the chapter slate, are at Toby On Fitness Tech dot com.

Thanks for listening to AgentStack Daily.

We'll be back soon.

Chapters

00:00 — Intro: Claude Friday outage, Claude Code .168 day-late fix, OpenClaw monthly cadence, OpenAI ChatGPT superapp, Apple WWDC 2026, Anthropic Mythos widening, Microsoft MAI in Copilot, Gemma 4 12B on Mac
02:00 — Claude Friday outage (June 5), Claude Code .168 day-late bug-fix, OpenClaw monthly cadence
12:00 — OpenAI ChatGPT "superapp" — chat is dead
22:00 — Apple WWDC 2026 — Gemini-built Siri
32:00 — Anthropic Project Glasswing widens to 150+ organizations
42:00 — Microsoft MAI-Thinking-1, MAI-Code-1-Flash, MAI-Image-2.5
52:00 — Gemma 4 12B on Google AI Edge Gallery for Mac
60:00 — MCP lane (brief) and project radar: CheetahClaws, A2A Protocol v1.0
66:00 — Practical queue

Primary Links

Claude Friday outage coverage: https://cybernews.com/ai-news/claude-outage-resolved-anthropic-opus-model-errors/ (June 5, 2026, ~2-hour outage on Claude API, Claude Code, claude.ai, Claude Cowork, primarily Opus 4.7/4.8)
Anthropic status page: https://status.anthropic.com/ (subscribe to incident RSS for outage-resilience)
Claude Code changelog: https://raw.githubusercontent.com/anthropics/claude-code/main/CHANGELOG.md
Claude Code npm: https://www.npmjs.com/package/@anthropic-ai/claude-code
OpenClaw 2026.6.5-beta.2 release: https://github.com/openclaw/openclaw/releases/tag/v2026.6.5-beta.2
OpenAI ChatGPT superapp: https://www.ft.com/content/openai-chatgpt-superapp (Financial Times, June 7, 2026) — also: https://techcrunch.com/2026/06/07/openai-is-still-working-on-that-super-app/
Apple WWDC 2026: https://www.apple.com/apple-events/
Apple-Google Gemini Siri partnership: https://www.bloomberg.com/news/articles/2026-01-XX/apple-google-siri-partnership
Anthropic Project Glasswing expansion: https://www.engadget.com/2185709/anthropic-expands-its-claude-mythos-preview-to-more-partners/
Mythos public release quote: https://www.politico.com/news/2026/06/07/frontier-ai-cybersecurity-china-race-00952786
Microsoft Build 2026 MAI models: https://www.theverge.com/tech/941664/microsoft-ai-model-reasoning-mai-thinking-1-build-2026
Microsoft Build 2026 MAI hands-on: https://www.pcmag.com/news/i-tested-all-4-of-microsofts-new-ai-models-heres-the-brutal-truth
Gemma 4 12B: https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12B/
Google AI Edge Gallery for Mac: https://9to5mac.com/2026/06/03/google-ai-edge-gallery-launches-to-macos-letting-mac-users-run-gemini-models-locally/
OpenAI ChatGPT Lockdown Mode: https://www.securityweek.com/openai-rolling-out-chatgpt-account-security-controls/
CheetahClaws: https://github.com/SafeRL-Lab/CheetahClaws
A2A Protocol: https://github.com/a2aproject/A2A

Release Coverage Check

OpenClaw — Latest stable verified: v2026.6.1, published 2026-06-03T19:35:05Z from GitHub releases. Recent episode version tags detected: v2026.6.1. Latest prerelease: v2026.6.5-beta.2, published 2026-06-07T00:26:39Z — switches release trains to YYYY.M.PATCH monthly patch numbering, pins the June 2026 floor at 2026.6.5. Stable coverage unchanged this cycle; the beta is meaningful for operators evaluating the new monthly cadence.
Hermes Agent — Latest stable verified: v2026.6.5 / v0.16.0, published 2026-06-06T00:55:58Z from GitHub releases. Recent episode version tags detected: v2026.6.5. No new stable release available; v0.16.0 "The Surface Release" remains the latest and is in the prior-episode reference.
OpenAI Codex app/CLI — Latest stable verified: rust-v0.137.0, published 2026-06-04T01:17:20Z from GitHub releases. Recent episode version tags detected: rust-v0.137.0. No new stable release available. rust-v0.138.0-alpha.6 is a prerelease excluded from stable coverage.
Claude Code CLI — Latest npm latest verified: 2.1.168, published 2026-06-06T23:41:53Z from the npm registry. Recent episode version tags detected: 2.1.167. Selected missing version from npm latest: 2.1.168. npm stable dist-tag remains 2.1.153.
Antigravity CLI — Continuous delivery model; no discrete release tags. Latest build as of 2026-06-07. Antigravity CLI launched May 19, 2026 at Google I/O as the successor to Gemini CLI. No version-tagged releases tracked to date.

Harness Version Reference

OpenClaw — v2026.6.1 (stable) / v2026.6.5-beta.2 (prerelease, monthly cadence switch)
Hermes Agent — v2026.6.5 (v0.16.0)
OpenAI Codex — rust-v0.137.0
Claude Code CLI — 2.1.168 (npm latest) / 2.1.153 (npm stable)
Antigravity CLI — Continuous delivery (launched 2026-05-19)