Open models inference

We run the best open coding models on our own GPUs, and we host the cloud agents that use them. Plug us into the harness you already use, or open a wired-up cloud agent from any device. Open stack. No lock-in.

Start with Umans Read docs

Use it with the tools you already use

OpenCode

Claude CodeπPi Zed

Zed

Cursor

Copilotand more

Your tools

Any harness, any device

Umans inference

Open models

Our GPUs

Cloud agent

live

Always-on. Reach from anywhere.

Why teams choose Umans

Infrastructure we own

We run open inference on our own GPUs and host the cloud agents that use them. Mostly in Europe, with US capacity for US customers.

Open models, by design

Open weights you can audit. The model that wrote your code is the same one anyone can inspect and run elsewhere.

Open stack, no lock-in

Plug our endpoint into the harness you already use with BYOK. Mix our inference with proprietary models. Walk away whenever.

Reach your agent from anywhere

Cloud agents are always-on and reachable from your laptop or phone. Same session, any device.

How it works

Two ways to use Umans

Local

Use Umans from your tool

Three ways to plug in. Pick the one that fits your harness.

Pick Umans in your tool

We're on models.dev. OpenCode, Zed and other modern tools list Umans as a provider out of the box.

Provider

Umans

and more

Runumans claude

Install the umans CLI, then `umans claude` powers Claude Code with our inference. Switch back to `claude` anytime.

$ umans claude

Claude Code, powered by Umans.

Set base URL and key

Two env vars in any OpenAI- or Anthropic-compatible tool.

BASE_URL=https://api.code.umans.ai/v1

API_KEY=umans_••••••••••••

Remote

Run your agent in our cloud

One click. The agent runs on our infra, already wired up to our inference.

Close your laptop
The agent keeps working. Come back later, the session is still there.
Open it from your phone
Same session, any device. Walk in and out of the work.
Sandboxed in its own box
The agent does its thing on isolated infra. Your local setup stays clean.

See cloud agents

Pricing at a glance

Popular

Pro

For individuals

Hands-on coding

200 effective requests per rolling 5h window
Up to 3 concurrent requests
Full open model lineup

Get Pro

Max

For power users

Long sessions, heavy use

Unlimited tokens
Up to 4 concurrent sessions
Full open model lineup

Get Max

Orgs

For teams and automations

Service keys + role management

Pay per token via service keys
Centralized usage and billing
Role management for seats

Plans for individuals. Service keys for automations.

Early release

Cloud agents

Spin up a coding agent on our infra in one click. It lands wired up to our inference, with the umans CLI installed and configured. Reach it from your laptop or phone. Same session, any device.

Open cloud agents Read docs

What you can do with one

Refactor while you sleep

Set the goal, close your laptop. The agent keeps working through the night.

Triage from anywhere

Open the agent on your phone. Walk through issues, ship the small fix.

Sandbox bold ideas

Spin up a fresh box, try a wild change, throw it away. Your local stays clean.