Hermes Agent: the open-source coding agent that just topped OpenClaw on OpenRouter token usage

Nous Research's Hermes Agent crossed OpenClaw in OpenRouter usage in May. The 0.16 Surface release adds a real desktop app, remote-gateway support, and an admin dashboard.

C Charles Lin · June 6, 2026

Hermes Agent crossed OpenClaw in OpenRouter token usage in May 2026. That is not a benchmark; it is the unfaked usage signal that working engineers are running it as a primary daily driver in numbers that now exceed the previous open-source incumbent. The Nous Research team shipped the 0.16 “Surface” release on June 6 — a native desktop app, a major admin dashboard, remote-gateway support, multi-profile sessions, OAuth login. It is the kind of release that converts a credible CLI tool into a serious team-scale product.

This piece is the two-week read after running Hermes Agent alongside Claude Code on a personal homelab, cross-checked against the strongest creator coverage from May 20 and June 6.

Why the OpenClaw → Hermes shift is the story

The single most useful summary of why Hermes specifically — and not the next generic open-source agent — became the OpenClaw alternative is NetworkChuck’s “you need to use Hermes RIGHT NOW!!” (May 20, the early-window evangelist take). His framing, after a month of use: “It’s the fastest growing GitHub project. It just topped OpenClaw on the OpenRouter token usage.”

His five reasons for the switch — vibe / mission, memory, the “fact that it learns” (a skill-acquisition pattern), the self-improvement loop, and an integration story — are not the things that would normally drive a tool migration. Engineers do not typically switch agents because of vibe. What NetworkChuck’s video is actually documenting is the cultural fit between Hermes and the open-source AI-coding community. Nous Research’s positioning — explicitly community-built, explicitly the open alternative — lands in a way the more commercial open-source projects have not.

The technical reason underneath the cultural reason is the skill-acquisition pattern. Hermes is explicitly designed to learn from its sessions — the agent that solves a problem today builds a skill that the agent solving a similar problem tomorrow can reuse. This is the same compounding-investment pattern Anthropic’s large-codebase masterclass articulates, applied to an open-source tool where the engineer can audit and modify the skill catalogue directly.

What the 0.16 Surface release actually shipped

AICodeKing’s “Hermes Agent 5.0 (New Upgrades): HERMES BECAME ULTRA-HERMES!” (June 6, launch-day) walks through the 0.16 release in detail. The version numbering is confusing — the headline is “Agent 5.0” but the actual semver tag is 0.16 — but the shipped surface is substantial.

Native desktop app for macOS, Linux, Windows. Streaming chat, searchable session list, session archive, drag-and-drop files, clipboard image paste, command palette, model picker in the status bar. AICodeKing’s framing on what this matters for: “If someone looked at Hermes before and thought ‘this is interesting’ but it feels too much like a terminal tool, this release directly addresses that and that matters a lot.” For engineers who were not going to install a CLI agent because the terminal-only ergonomics felt like a backward step, the desktop app changes the calculus.

Remote gateway support in the desktop app. This is the feature serious users should care about most. The desktop app does not have to run Hermes locally. You can point the GUI at a remote Hermes gateway running on a homelab box, a hosted VPS, or a teammate’s machine, authenticate via OAuth or username/password, and treat your laptop as just the control surface. AICodeKing’s framing: “Your API keys, your compute, your long running tasks, and your gateway integrations may not belong on your laptop, they may belong on a machine that stays online. Now the desktop app can connect to that.”

This is the architectural pattern that turns Hermes from a personal agent into a team or homelab agent. The same pattern Codex CLI and Claude Code do not yet ship cleanly, which is part of why Hermes is taking share at the homelab edge.

Multi-profile sessions. Each profile can point to its own remote host, with concurrent sessions across profiles in one window. The natural use case is “this profile runs my personal agent on my homelab, this profile runs my client-work agent on a hosted VPS, this profile runs my work agent on the team’s shared gateway.” All inside the same desktop window.

Full admin dashboard. Web UI for managing the MCP catalogue (enable/disable toggles), messaging channels (Telegram, Discord, Slack), credentials, webhooks, hook creation, memory configuration, gateway controls, system settings, update checks, debug sharing. The dashboard is the upgrade from “you SSH into a server and edit YAML” to “you click through the admin panel like a normal piece of software.” For engineers who treat Hermes as a serious tool worth investing in, the dashboard pays back the configuration time within a week.

AICodeKing’s earlier video “Hermes Agent Desktop + FREE APIs” (June 3) covered the desktop app in detail before the 0.16 surface release made it the main story. The desktop app was already credible at that point; the surface release made it the centerpiece.

What the r/LocalLLaMA community is doing with it

The r/LocalLLaMA “Best Coding Harness for Qwen3.6 35B” (32 ups, June 7) thread is the relevant signal for how Hermes is being adopted by the local-model crowd. The community has converged on a pattern: run Hermes as the harness, route it to a local Qwen 3.6 35B or DeepSeek V4 instance, get genuinely competitive agentic coding without sending any data to a closed-model API.

The shape of that workflow is what makes the “Hermes vs Claude Code” comparison strange. The two products optimise for different user constraints. Claude Code is for engineers who want the best closed-model agent loop with the best harness. Hermes is for engineers who want a serious agent loop they can run on local models or remote gateways, with full source-level control and no vendor lock-in. They are not direct competitors. They are competitive for the same engineer-hour budget — which is what makes the OpenRouter token usage crossover meaningful.

The broader open-model context comes from the 689-upvote r/LocalLLaMA “Cohere’s unreleased coding model” thread that dominated the same week. The community pattern in June 2026 is to chase early access to the next open coding model rather than to commit to whatever just shipped. Hermes benefits from this because the model-agnostic harness rewards exactly the engineer who is willing to rotate cheap-tier models every two months. If you commit to a closed-model agent (Claude Code, Codex CLI, Antigravity), the model rotation is the vendor’s job. If you commit to Hermes, the model rotation is your job — and the harness is built to make that rotation easy.

What this means for working engineers right now

Three practical implications:

1. If you have a homelab, install Hermes 0.16. The remote-gateway pattern is genuinely the right architecture for serious agent work — agent compute on the always-on box, GUI on the laptop. The deploy story has matured enough that this is now a one-evening setup rather than a weekend project. The NetworkChuck walkthrough covers the VPS deployment path in detail if you do not have homelab hardware.

2. Treat Hermes as the harness for your cheap-tier model rotation. Run it pointed at DeepSeek V4, Qwen 3.6, Gemma 4 — rotate the model every two months as the open-weights frontier moves. The harness investment is the durable one; the model choice is the rotating one. This is the inverse of how engineers thought about model commitment a year ago.

3. Keep Claude Code or Codex CLI for the closed-model frontier work. Hermes is the right tool for the open-model slice of your work. It is not yet the right tool for “I need the absolute best agentic coding loop available”, which still runs through Claude Code + Opus 4.8 or Codex CLI + GPT-5.5. The right pattern for most engineers is to run both — Hermes for the cost-sensitive volume, Claude Code or Codex CLI for the frontier-tier work.

Creator POV vs the harder dissent

NetworkChuck’s coverage is positive bordering on evangelical, which is consistent with how he covers tools he genuinely uses. AICodeKing’s coverage across his three Hermes videos is more technical and more measured — he is right to flag that the 0.16 surface release has rough edges in the multi-profile UX and that the admin dashboard’s MCP catalogue is still a work in progress.

The harder dissent that has not yet surfaced in the videos but is the right question to ask: does the harness compounding investment in Hermes pay back relative to the harness compounding investment in Claude Code or Codex CLI? The answer depends on what model you are running underneath. For open-weights work, Hermes is now genuinely the right harness. For closed-model work, the established players’ harnesses are more mature.

That makes the migration calculus tricky for engineers who are currently mid-investment in one tool. The right discipline is to not migrate just because Hermes is winning the open-source attention game. Migrate if your model strategy is genuinely open-weights-heavy. Stay if your model strategy is closed-model-heavy. Treat the two as different stacks for different workloads.

The honest summary

Hermes Agent crossing OpenClaw in OpenRouter token usage is the kind of milestone the open-source AI-coding community has been working toward for two years. The Nous Research team executed cleanly through the May 20 evangelism wave, the June 3 desktop app preview, and the June 6 Surface release. The product is now mature enough that engineers who want a serious open-model agent stack should evaluate it seriously.

The deeper strategic read is that the AI-coding tooling market has bifurcated. The closed-model side (Claude Code, Codex CLI, Antigravity) competes on harness depth + model quality. The open-model side (Hermes, with Qwen / DeepSeek / Gemma underneath) competes on portability + cost + audit-ability. Both sides have working answers. The right answer for most engineers is to know how to run a workflow in both, then pick by workload. That is the harness-engineering thesis cashed out as a real-world tool rotation.

Sources

Every reference behind this piece. If we make a claim, it's because at least one of these said so — or we lived it ourselves.

Firsthand Two weeks running Hermes Agent alongside Claude Code on a personal homelab
Docs Nous Research — Hermes Agent documentation — Nous Research
YouTube you need to use Hermes RIGHT NOW!! (goodbye OpenClaw!!) — NetworkChuck
YouTube Hermes Agent 5.0 (New Upgrades): HERMES BECAME ULTRA-HERMES! — AICodeKing
YouTube Hermes Agent Desktop + FREE APIs: This is ACTUALLY ONE OF THE BEST AGENT APPS! — AICodeKing
Blog r/LocalLLaMA — Best Coding Harness for Qwen3.6 35B (32 ups, harness-curious community signal) — r/LocalLLaMA
Blog r/LocalLLaMA — Cohere's unreleased coding model early access (689 ups, broader open-model context) — r/LocalLLaMA