Episode 51: OpenClaw 2026.5.12, Hermes Foundation, Claude

AgentStack Daily EP051 — OpenClaw 2026.5.12, Hermes Foundation, Claude Code Background Controls, and Gemini Agent Deployments

Release Coverage Check

OpenClaw — Latest stable/verified version: v2026.5.12 from https://api.github.com/repos/openclaw/openclaw/releases?per_page=10 with prereleases skipped. Recent episode version tags detected: v2026.4.29, v2026.5.2, v2026.5.3, v2026.5.4, v2026.5.5, v2026.5.6, v2026.5.7. Selected missing versions: v2026.5.12.
Hermes Agent — Latest stable/verified version: v2026.5.16 / Hermes Agent 0.14.0 from https://api.github.com/repos/NousResearch/hermes-agent/releases?per_page=10 with prereleases skipped. Recent episode version tags detected: v2026.5.7 and earlier. Selected missing versions: v2026.5.16.
OpenAI Codex app/CLI — Latest stable/verified version: rust-v0.130.0 from https://api.github.com/repos/openai/codex/releases?per_page=100 with alpha/prerelease builds skipped. Recent episode version tags detected: rust-v0.130.0. Selected missing versions: none.
Claude Code CLI — Latest verified npm package version: 2.1.143 from https://registry.npmjs.org/@anthropic-ai/claude-code/latest; concrete changes verified in Anthropic's Claude Code changelog at https://code.claude.com/docs/en/changelog. Recent episode version tags detected: 2.1.141 and earlier. Selected missing versions: 2.1.143, 2.1.142.
Candidate verification — OpenClaw, Hermes Agent, and Claude Code have latest-contiguous uncovered blocks starting at their latest stable/verified release. Codex does not: its latest stable release is present in the recent-version scan, while newer Codex releases in the GitHub feed are prerelease alpha builds and are skipped.

Episode Title

OpenClaw 2026.5.12, Hermes Foundation, Claude Code Background Controls, and Gemini Agent Deployments

Tagline

Install leaner, run agents faster, keep background work safer, and ship Gemini agents with revision-level rollout control.

Feed Description

AgentStack Daily EP051 opens with an agent-stack release readout: OpenClaw v2026.5.12 trims core installs, hardens Telegram, Codex, plugin, gateway, browser, and config paths, and improves reply delivery; Hermes Agent 2026.5.16 adds native Windows beta, PyPI installation, faster startup, a local OpenAI-compatible proxy, vision, video, browser, LSP, and verification upgrades; Claude Code 2.1.143 and 2.1.142 tighten plugin dependencies, background-session flags, PowerShell behavior, worktree isolation, MCP timeout handling, and agent-dashboard defaults. Then the episode turns to Google Cloud's Gemini Enterprise Agent Platform release notes for immutable agent revisions, traffic splitting, and Priority PayGo, and to Google's Interactions API breaking-change guide for the new steps timeline and response_format migration.

Story Slate

1. Agent-stack release readout: OpenClaw v2026.5.12, Hermes Agent 2026.5.16, and Claude Code 2.1.143/2.1.142 move the operator surface

OpenClaw's stable release is the practical host update: leaner provider/plugin installs, isolated Telegram polling with local spooling, Codex/OpenAI auth and MCP projection fixes, plugin install hardening, gateway/session protocol repairs, richer reply delivery, browser-scope fixes, and Windows sandbox/config credential protections. Hermes Agent's Foundation release is the distribution and runtime jump: native Windows beta, pip install hermes-agent, lazy dependencies, install-time supply-chain checks, an OpenAI-compatible local proxy for OAuth-backed providers, faster startup, faster CDP browser console calls, cross-session prompt caching, LSP diagnostics, vision_analyze, video_generate, computer_use, /handoff, x_search, and per-turn mutation verification. Claude Code adds the adjacent CLI controls for plugin dependency enforcement, token-cost visibility, background worktree isolation, PowerShell defaults, preserved background models/settings/MCP config, stop-hook caps, and background agent reliability. Technical depth angle: explain install dependency cones, polling/spooling failure modes, auth-profile-backed Codex media and MCP projection, gateway delta semantics, plugin dependency graphs, background-session state preservation, worktree isolation knobs, local OpenAI-compatible proxy semantics, LSP diagnostics after edits, CDP persistent-session latency, and what operators should test before upgrading. Primary release link: https://github.com/openclaw/openclaw/releases/tag/v2026.5.12

2. Gemini Enterprise Agent Platform adds agent revisions, traffic splitting, and Priority PayGo

Google Cloud's May 15 release notes put immutable deployed-agent revisions and traffic splitting into public preview, one day after Priority PayGo became generally available for more consistent performance without a provisioned-throughput commitment. For builders, that turns agent deployment from a single mutable endpoint into a safer release lane: create a revision, send a slice of traffic to it, watch quality and latency, then roll forward or back. Technical depth angle: explain immutable revision resources, canary traffic splits, runtime deployment safety, latency/cost tradeoffs between Standard PayGo, Priority PayGo, and Provisioned Throughput, and how agent observability should be wired around revision IDs rather than only endpoint names.

3. Google's Interactions API migration replaces flat outputs with a typed steps timeline

Google's May 2026 breaking-change guide for the Gemini Interactions API says the legacy schema is being removed on June 8 and introduces a new steps array plus a polymorphic response_format. That matters for agent clients because a turn is no longer just the last generated text blob; it becomes a structured timeline that can support future mid-flight steering and asynchronous tool calls. Technical depth angle: explain request/response schema migration, Api-Revision migration control, outputs to steps, type discriminators, GET /interactions/{id} full timelines, POST /interactions output-step behavior, response_mime_type removal, and why durable agent logs should store timeline steps rather than only final text.

Extra Research Candidates

Agent Browser v0.27.0 Adds React Introspection, Web Vitals, SPA Navigation, Init Scripts, and Network Resource Filters — Primary source: https://agent-browser.dev/changelog. Technical depth angle: explain how browser-control CLIs are moving from generic click/snapshot loops into framework-aware debugging with React fiber inspection, render profiling, Suspense boundary classification, Web Vitals, SPA pushstate, pre-navigation init scripts, and CDP resource-type filtering.
Augment Code Changelog Tightens MCP OAuth, Context Recovery, Token Counting, Cloud Agent Tools, and Retrieval State — Primary source: https://www.augmentcode.com/changelog. Technical depth angle: explain stdio MCP auth-header boundaries, OAuth refresh edge cases when no new refresh token is returned, forked-conversation recovery, character-estimate token counting fallbacks, pinned-file retrieval context, checkpoint migration, and why cloud-agent tool exposure must match plan-mode policy.
Anthropic Claude Agent SDK Documentation Clarifies Library-Mode Agents, Built-In Tools, Hooks, Subagents, MCP, and Subscription Credit Changes — Primary source: https://code.claude.com/docs/en/agent-sdk/overview. Technical depth angle: explain the SDK's async message stream, built-in Read/Write/Edit/Bash/Monitor/Glob/Grep/WebSearch/WebFetch tools, hook lifecycle, custom subagent definitions, MCP server configuration, provider authentication modes, and how separate Agent SDK credits change production cost planning.

Show Notes

[00:00] Hook — the upgrade starts with install size, polling resilience, and background agents
OpenClaw v2026.5.12 is the first thing to look at today because it changes the host surfaces that decide whether an agent stack is pleasant to run every day: what gets installed by default, how Telegram survives event-loop stalls, how Codex/OpenAI auth-backed media and MCP paths behave, how plugin updates avoid wedging, and how reply delivery handles rich-only cards and source replies. Beside it, Hermes Agent 2026.5.16 is a major distribution and runtime release: native Windows is in early beta, `pip install hermes-agent` becomes real, cold start drops, CDP browser calls get dramatically faster, and OAuth-backed provider access can be exposed through an OpenAI-compatible local proxy. Claude Code 2.1.143 and 2.1.142 add the background-session and plugin controls that matter when CLI agents are doing real unattended work: dependency-aware plugin enable/disable, projected context cost, background worktree isolation, preserved MCP and settings flags, PowerShell defaults, and caps on stop-hook loops.

The external story after the release readout is about production rollout shape. Google Cloud now lets Gemini Enterprise Agent Platform users create immutable agent revisions and split traffic between active revisions, while Priority PayGo is generally available for more predictable latency without a committed throughput contract. Then we close with a schema migration builders should not leave to the last week: Google's Interactions API is replacing flat `outputs` with a typed `steps` timeline and consolidating output configuration under `response_format`.

[03:00] Agent-stack release readout — OpenClaw v2026.5.12, Hermes Agent 2026.5.16, and Claude Code 2.1.143/2.1.142
OpenClaw v2026.5.12 is not a single headline feature release; it is a host-quality release. The first operator-facing change is dependency shape. Bedrock, Bedrock Mantle, Slack, OpenShell sandbox, Anthropic Vertex, WhatsApp, and related packages are moved out of the core runtime so an installation only pulls what it needs. That matters because agent hosts age badly when optional providers quietly become mandatory dependency cones. Leaner installs mean fewer platform-specific build failures, smaller update blast radius, and less time spent debugging a provider you never enabled.

The second change cluster is channel resilience. Telegram polling moves to an isolated worker with durable local spooling, so a main event-loop stall is less likely to drop or delay inbound messages. The release also preserves rendered HTML formatting in lazy cron announcements, skips unmentioned group media before download when mention-gating is active, and deletes tool-progress-only draft bubbles before rotating to a real answer. The practical operator recipe is simple: after upgrading, test one streamed reply, one scheduled or cron-style announcement, one group-media edge case, and one interrupted turn. This release is trying to make the messaging layer behave like a transport, not like a fragile UI side effect.

The Codex and OpenAI paths are the other major OpenClaw reason to upgrade. Auth-profile-backed media tools stay available when OpenAI credentials live in the agent's auth-profile store rather than the environment. Codex OAuth refresh errors are classified more cleanly, high-confidence app-server refresh failures no longer collapse into raw runtime failures, and selectable OpenAI agent models are treated as Codex runtime requirements even when the primary config is Anthropic. The release also keeps per-agent `CODEX_HOME` isolation without rewriting `HOME` by default, which is the difference between isolated Codex credentials and breaking ordinary subprocess user-home discovery. For builders running mixed Claude/OpenAI/Codex hosts, the point is fewer false reauth loops and fewer model-switch failures.

OpenClaw also tightens plugin and gateway mechanics. Plugin installs preserve peer dependencies, handle pnpm 11, restore a deprecated memory SDK subpath for companion plugins, scan runtime entry points more narrowly, discover provider plugins through structured setup credentials, and preserve install records through doctor cleanup. Gateway and session history now carry monotonic transcript sequence numbers and stream explicit `deltaText` and `replace` frames so SDK clients do not need to diff assistant output locally. Rich-only replies, cards, buttons, and message-tool-only responses are treated as real outbound content instead of being dropped as empty. If you are building on the gateway protocol, this is the kind of release where client assumptions should be tested against cards, media, source replies, and reconnects, not just plain text.

Security and config hardening are also concrete. Windows user-profile roots are included in sandbox blocked home roots so credential-bearing folders are denied even when `HOME` points somewhere else. Provider credentials are resolved through structured secret references instead of broad environment-variable-looking strings, reducing accidental credential inference. Semantic config mutations are serialized and retried centrally, which reduces clobber risk when concurrent commands edit config. Browser CLI commands explicitly request the existing operator-admin gateway scope, avoiding approval-loop noise. These are not glamorous changes, but they are exactly the changes that keep an agent host from becoming a credentials accident or an update-time mystery.

Hermes Agent 2026.5.16 is the broader runtime story. The release names native Windows support as early beta, with a PowerShell installer, native subprocess and PTY paths, taskkill-based process management, MinGit auto-install, Python stub detection, Ctrl+C preservation, and many Windows-only fixes. It also ships a real PyPI wheel: `pip install hermes-agent && hermes`. That changes onboarding because a user no longer has to clone a repo or run a custom shell installer just to try the agent. The lazy-dependency framework and advisory checker are equally important: heavy provider libraries defer to first use, installer fallbacks move through extras tiers, and install/update scans look for unsafe versions.

The performance numbers in Hermes are worth calling out because they map directly to daily agent feel. The release says cold start drops by roughly nineteen seconds through skills caching, lazy imports, disk-cache-first model lookup, deferred provider libraries, and parallel doctor checks. `hermes tools` All-Platforms falls from about fourteen seconds to under one and a half seconds. `browser_console` evaluations become dramatically faster by reusing the supervisor's persistent CDP WebSocket instead of spawning a fresh DevTools session per call. For browser-heavy agents, persistent CDP is not an implementation detail; it changes whether a debugging loop feels instant or constantly waits on browser setup.

Hermes also adds capabilities that make it more of a hub. `hermes proxy` exposes OAuth-authenticated providers through an OpenAI-compatible local endpoint, so tools such as Codex, Aider, Cline, or editor extensions can talk to Claude Pro, ChatGPT Pro, SuperGrok, or similar OAuth-backed accounts through an interface they already understand. Cross-session one-hour Claude prompt caching reduces repeated-prefix cost across resumes and new sessions. `vision_analyze` now passes pixels to vision-capable models instead of reducing the image to text. A unified `video_generate` tool supports pluggable video providers. `computer_use` gets a non-Anthropic-capable cua-driver backend. `/handoff` moves the live session to another model, persona, or profile while preserving context and tool history. The operational question after this release is not only, "Does Hermes run?" It is, "Which local tools should point at Hermes as their provider bridge?"

For code-writing agents, Hermes adds two guardrails that are especially relevant. LSP semantic diagnostics run after `write_file` and `patch`, so the agent sees language-server errors on the changed file before downstream work continues. A per-turn file-mutation verifier footer tells the agent what actually changed on disk after a turn that wrote files. That is a direct answer to a common failure mode: the model believes it edited a file, but the patch missed, overwrote the wrong region, or produced a silent type error. Diagnostics plus mutation summaries do not replace tests, but they shorten the loop before tests even run.

Claude Code 2.1.143 and 2.1.142 round out the CLI-agent side. Plugin dependency enforcement means disabling a plugin now refuses when another enabled plugin depends on it, with a disable-chain hint, and enabling a plugin force-enables transitive dependencies. The plugin marketplace browse pane shows projected context cost per turn and invocation, which helps operators see when a plugin is not just installed but expensive. A new `worktree.bgIsolation: "none"` setting lets background sessions edit the working copy directly when Git worktrees are impractical, while worktree cleanup no longer falls back to destructive removal if `git worktree remove` fails.

The background-agent fixes are the ones to test in real work. Background sessions preserve model and effort level after waking from idle. `/bg` preserves MCP config, settings, add-dir, plugin-dir, strict MCP config, fallback model, and bypass-permission availability across respawn or detach. Claude agents accepts flags for add-dir, settings, MCP config, plugin directories, permission mode, model, effort, and skip-permission defaults, and background sessions launched from the dashboard honor the configured default permission mode. MCP HTTP and SSE tool calls now respect the configured timeout instead of being capped at sixty seconds. Stop hooks that keep blocking now end with a warning after eight consecutive blocks unless overridden. In short: fewer background workers lose their environment, permissions, model, or long-running MCP calls.

[24:00] Gemini Enterprise Agent Platform — revisions, traffic splitting, and Priority PayGo
Google Cloud's May 15 Gemini Enterprise Agent Platform update adds a deployment primitive that agent teams need: immutable agent revisions with traffic splitting. Before this kind of feature, an agent deployment often behaves like a mutable service endpoint. You update prompt, tools, model settings, routing, or container code; the endpoint changes; and rollback depends on how disciplined your release process was. Revisions give you a named deployment artifact. Traffic splitting lets you move a controlled percentage of production traffic to the new version while the old version still serves most users.

That sounds like standard software deployment, but it matters more for agents because a small change can alter tool choice, latency, refusal behavior, memory use, or hallucination profile. A canary release for a deterministic API often watches error rate and p95 latency. A canary release for an agent should watch those plus task completion, tool-call count, escalation rate, user correction rate, retrieval miss rate, and cost per successful outcome. Revision IDs should appear in traces, logs, evaluation records, and user-feedback bundles. If you only log the endpoint name, you will not know which agent version caused a regression.

The May 14 Priority PayGo update adds the cost-and-latency side of the story. Provisioned Throughput is best when you know the traffic and can commit. Standard PayGo is flexible but can have more variable performance. Priority PayGo sits between them: more consistent performance than standard consumption without the upfront commitment. For production agents, that maps to workloads that are important but bursty: customer support triage, internal research assistants, incident helpers, and workflow agents that spike during business hours or outages.

The builder recommendation is to think of these two releases together. Use revisions and traffic splitting to make behavioral change safe. Use Priority PayGo where latency variance would make the rollout look worse than it is. If a new agent revision is slower because the platform is under variable load, you may misdiagnose a model or prompt regression. If a new revision actually increases tool calls or retrieval depth, Priority PayGo will not hide the cost profile; you still need per-revision metrics. The minimum useful rollout dashboard should show revision, traffic share, latency, model/tool costs, tool errors, human escalation, and task success.

[34:00] Gemini Interactions API — from flat outputs to a typed steps timeline
Google's Interactions API breaking-change guide is a schema migration with bigger agent-design implications. The old shape returned a flat `outputs` array. The new shape returns a `steps` array with type discriminators. For a simple request, you may still grab the last text chunk and move on. But for long-running agents, research agents, tool-using agents, and future asynchronous tool calls, a timeline is the right abstraction. A turn is not just the final answer; it is user input, model output, tool activity, intermediate state, and potentially steering events.

The guide also changes output configuration. Instead of `response_mime_type`, output controls move into a polymorphic `response_format`. That reduces the number of one-off fields clients need to branch on and gives the API room to add structured modes without growing a pile of unrelated request parameters. For SDK maintainers, this is a type-generation and compatibility issue. For app builders, it is a persistence issue: update response readers, fixtures, tests, and database schemas that assumed `outputs[-1].text` was the canonical answer.

The migration-control detail is the `Api-Revision` request header. That gives teams a way to pin behavior during migration rather than discovering the removal date through production failures. The guide says the legacy schema is removed on June 8, so the practical plan is: add dual-read support, store raw interaction objects during the migration, update summaries and replay tools to understand `steps`, and run a small set of old transcripts through the new parser. If your agent logs are used for evals, support, or audit, do not throw away step types just to keep an old text-only shape.

The reason this is worth an episode segment is the future direction it signals. Google says the new API shape supports future capabilities like mid-flight steering and asynchronous tool calls. Those features need a structured event timeline. If your client collapses the new timeline back into a single string immediately, you will be technically compatible but architecturally behind. Treat the migration as a chance to make agent traces first-class: each step gets an ID, a type, timestamps, content, tool metadata, and linkage to the interaction. That is how you debug an agent that changes course halfway through a job.

[43:00] Closing — what to upgrade and what to watch
The upgrade priority is clear. If you operate OpenClaw, test v2026.5.12 against your channels, Codex/OpenAI profiles, gateway clients, browser commands, plugin installs, and config mutations. If you operate Hermes, test the new install paths, proxy, browser latency, diagnostics, and file-mutation verifier on a real repo rather than a toy prompt. If you use Claude Code background agents, update and verify that `/bg`, `claude agents`, MCP configs, settings, permission modes, fallback models, and PowerShell behavior survive detach, wake, and idle.

For platform builders, Gemini Enterprise Agent Platform's revisions and traffic splitting are the production pattern to copy: agents need canaries, rollback, revision-aware observability, and cost-aware rollout gates. For API builders, the Interactions migration is a reminder that agent APIs are becoming event timelines. Store the steps. Keep the type metadata. Build the parser now, before the removal date turns a schema cleanup into an outage.

Verified Links

OpenClaw v2026.5.12 release: https://github.com/openclaw/openclaw/releases/tag/v2026.5.12
OpenClaw releases API: https://api.github.com/repos/openclaw/openclaw/releases?per_page=10
Hermes Agent 2026.5.16 release: https://github.com/NousResearch/hermes-agent/releases/tag/v2026.5.16
Hermes Agent releases API: https://api.github.com/repos/NousResearch/hermes-agent/releases?per_page=10
OpenAI Codex releases API: https://api.github.com/repos/openai/codex/releases?per_page=10
Claude Code npm package latest: https://registry.npmjs.org/@anthropic-ai/claude-code/latest
Claude Code changelog: https://code.claude.com/docs/en/changelog
Gemini Enterprise Agent Platform release notes: https://docs.cloud.google.com/gemini-enterprise-agent-platform/release-notes
Gemini Interactions API breaking-change guide: https://ai.google.dev/gemini-api/docs/interactions-breaking-changes-may-2026
Agent Browser changelog: https://agent-browser.dev/changelog
Augment Code changelog: https://www.augmentcode.com/changelog
Claude Agent SDK overview: https://code.claude.com/docs/en/agent-sdk/overview

Chapters

[00:00] Hook — the upgrade starts with install size, polling resilience, and background agents
[03:00] Agent-stack release readout — OpenClaw v2026.5.12, Hermes Agent 2026.5.16, and Claude Code 2.1.143/2.1.142
[24:00] Gemini Enterprise Agent Platform — revisions, traffic splitting, and Priority PayGo
[34:00] Gemini Interactions API — from flat outputs to a typed steps timeline
[43:00] Closing — what to upgrade and what to watch