Claude Code—Anthropic’s agentic coding assistant that lives in your terminal—now works with local models.

Ollama v0.14.0 added Anthropic Messages API compatibility, which means you can run the exact same Claude Code experience against models running on your laptop instead of calling Anthropic’s cloud API.
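To see what that compatibility means concretely, here is a minimal sketch of the request shape Anthropic's Messages API defines, which Ollama now accepts at `http://localhost:11434/v1/messages`. The endpoint path and headers follow Anthropic's published spec; sending the request requires a running Ollama server, so this snippet only builds the payload:

```python
import json

# Request body in Anthropic Messages API format. Any model you've pulled
# with `ollama pull` can go in the "model" field.
payload = {
    "model": "qwen3-coder",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Write a function to check if a number is prime"}
    ],
}

# Ollama ignores the API key, but Anthropic SDKs still send these headers:
headers = {
    "x-api-key": "ollama",
    "anthropic-version": "2023-06-01",
    "content-type": "application/json",
}

body = json.dumps(payload)
```

POST that body to `/v1/messages` on either Anthropic's cloud API or a local Ollama server and you get the same response format back, which is exactly what lets Claude Code run unmodified against both.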

Getting Started

1. Install Claude Code

macOS, Linux, WSL:

curl -fsSL https://claude.ai/install.sh | bash

Windows PowerShell:

irm https://claude.ai/install.ps1 | iex

2. Install Ollama

Get Ollama from ollama.com and pull a coding model:

ollama pull qwen3-coder

3. Configure environment variables

Point Claude Code to Ollama instead of Anthropic’s cloud API:

export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_BASE_URL=http://localhost:11434

4. Run Claude Code

claude --model qwen3-coder

That’s it. Same Claude Code experience—multi-file editing, codebase understanding, terminal integration—but running entirely on your machine.

What You Get

Zero API costs: Models run locally. No per-token charges after downloading.

Data stays local: Code never leaves your machine. Perfect for sensitive codebases.

Works offline: No network required once models are downloaded.

Same workflows: If you’ve used Claude Code with Anthropic’s API, nothing changes except where it runs.

Ollama’s announcement recommends these models for Claude Code:

Local models:

  • qwen3-coder - Strong coding performance
  • gpt-oss:20b - Larger context window

Cloud models (via ollama.com):

  • glm-4.7:cloud - Fast inference
  • minimax-m2.1:cloud - Good reasoning

They recommend a context length of at least 64k tokens for coding tasks. Check Ollama’s context length docs to configure this.

Beyond Claude Code: Using the Anthropic SDK

The same API compatibility that makes Claude Code work also means you can use the Anthropic SDK in your own applications.

If you’re building agents or tools that call Anthropic’s API, you can swap in Ollama for local development.

C# example:

using Anthropic.SDK;
using Anthropic.SDK.Messaging;

var client = new AnthropicClient(new APIAuthentication("ollama"))
{
    // Point the SDK at the local Ollama server instead of api.anthropic.com
    BaseUrl = "http://localhost:11434"
};

var parameters = new MessageParameters
{
    Messages = new List<Message>
    {
        new Message(RoleType.User, "Write a function to check if a number is prime")
    },
    Model = "qwen3-coder",
    MaxTokens = 1024
};

var response = await client.Messages.GetClaudeMessageAsync(parameters);

Console.WriteLine(response.Message);

Make it configurable:

var baseUrl = Environment.GetEnvironmentVariable("ANTHROPIC_BASE_URL") 
    ?? "https://api.anthropic.com";
var apiKey = Environment.GetEnvironmentVariable("ANTHROPIC_API_KEY");

var client = new AnthropicClient(new APIAuthentication(apiKey))
{
    BaseUrl = baseUrl
};

Now you can prototype locally (set ANTHROPIC_BASE_URL=http://localhost:11434) and deploy to production (use Anthropic’s cloud API) without changing code.

Python and JavaScript work the same way—see Ollama’s documentation for examples.
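The environment-driven switch translates directly to Python. A sketch under the same assumptions as the C# snippet, with `resolve_endpoint` as a hypothetical helper name; the official `anthropic` package's client overrides are shown in a comment so the snippet stays dependency-free:

```python
import os

def resolve_endpoint(env=os.environ):
    """Read the endpoint and key from the environment, defaulting to
    Anthropic's cloud API -- the same pattern as the C# snippet."""
    base_url = env.get("ANTHROPIC_BASE_URL", "https://api.anthropic.com")
    api_key = env.get("ANTHROPIC_API_KEY")  # None when unset, as in the C# version
    return base_url, api_key

# Local development:  export ANTHROPIC_BASE_URL=http://localhost:11434
# With the official `anthropic` package, the values plug straight in:
#   client = anthropic.Anthropic(base_url=base_url, api_key=api_key)
base_url, api_key = resolve_endpoint()
```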

Why Anthropic Enables This

Why would Anthropic document and support running their tools against competitors’ models when they can’t monetize it?

Ecosystem growth.

When developers use Claude Code (whether local or cloud), they learn Anthropic’s tooling patterns. Teams that prototype locally often move production workloads to Anthropic’s cloud API when they need:

  • Specific model capabilities (Claude Sonnet or Opus)
  • Scale beyond local hardware
  • Team collaboration features
  • Production reliability

By removing friction for getting started, Anthropic captures value when teams graduate to cloud workloads.

What’s Supported

Ollama v0.14.0 implements the full Anthropic Messages API spec:

  • Streaming responses - Get tokens as they’re generated
  • Tool/function calling - Let models use external tools
  • System prompts - Set model behavior
  • Multi-turn conversations - Maintain context across messages
  • Extended thinking - Deeper reasoning mode
  • Vision - Image input support

If a tool uses the Anthropic SDK, it should work with Ollama.
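Tool calling, for instance, uses the JSON Schema shape the Messages API defines for tool definitions. A hedged sketch, with `get_weather` as a made-up example tool, of how a tool appears in a request body:

```python
import json

# A tool definition in Anthropic Messages API format: "input_schema" is
# standard JSON Schema describing the tool's arguments. Ollama forwards
# this shape to models that support function calling.
get_weather = {
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
        },
        "required": ["city"],
    },
}

request = {
    "model": "qwen3-coder",
    "max_tokens": 1024,
    "tools": [get_weather],
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
}

body = json.dumps(request)
```

When the model decides to call the tool, the response contains a `tool_use` content block, again in the same format the cloud API returns.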

The Broader Pattern

This isn’t unique to Anthropic. The AI tooling ecosystem is standardizing on API compatibility:

MCP servers work with any MCP client—build infrastructure once, use anywhere.

OpenAI compatibility (which Ollama also supports) lets you swap providers by changing the base URL.

Local-first architectures give you flexibility—prototype on your laptop, deploy to cloud for production.

When APIs become interchangeable, you get options. Use the right tool for each workload without locking into one provider’s roadmap.


Building with Claude Code locally? Connect with me on LinkedIn to discuss what you’re working on.
