How to Get a Free OpenRouter API Key (2026)
39 free models available, no credit card required.
Overview
Single API key for 300+ models from every major lab.
OpenRouter is a unified API gateway that routes requests to models from OpenAI, Anthropic, Google, Meta, Mistral, and dozens of other providers. With a single API key and one base URL, developers can switch between models by changing only the model name in the request. The free tier (models marked :free) allows 200 requests per day, or 1,000 per day with a $10 lifetime top-up.
- 35+ permanently free models
- OpenAI-compatible endpoint
- 200 RPD free, 1,000 RPD with $10 lifetime credit
- One key for 300+ models
API Compatibility: OpenAI SDK-compatible (Chat Completions)
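The daily limits quoted above can be mirrored in a small client-side budget check. This is only a sketch: the names `daily_request_limit` and `DailyBudget` are hypothetical, the numbers are the ones stated in this guide, and the real limits are enforced by OpenRouter on the server side.

```python
from dataclasses import dataclass

# Limits as quoted in this guide (requests per day).
FREE_RPD = 200
TOPPED_UP_RPD = 1_000


def daily_request_limit(has_lifetime_topup: bool) -> int:
    """Return the daily request limit for the free tier (hypothetical helper)."""
    return TOPPED_UP_RPD if has_lifetime_topup else FREE_RPD


@dataclass
class DailyBudget:
    """Counts requests against a daily limit; resetting each day is left to the caller."""

    limit: int
    used: int = 0

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def try_spend(self) -> bool:
        """Record one request if budget remains; return False when exhausted."""
        if self.used >= self.limit:
            return False
        self.used += 1
        return True
```

Calling `try_spend()` before each request lets a script stop cleanly instead of running into the limit mid-batch.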
Quick Start Guide
1. Sign up at openrouter.ai. No credit card is required; an optional $10 lifetime top-up raises the rate limits 5×.
2. Go to the Keys page.
3. Create a new API key.
4. Find free models. Look for the :free suffix in the model name, or browse the list below.
5. Configure the OpenAI client with the base URL https://openrouter.ai/api/v1.
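The last two steps can be sketched in Python. The helper names `is_free_model` and `build_client_config` are hypothetical, and reading the key from an `OPENROUTER_API_KEY` environment variable is an assumption made here to keep the key out of source code; only the base URL and the `:free` suffix come from the steps above.

```python
import os

OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def is_free_model(model_id: str) -> bool:
    """True when the model ID carries the :free suffix."""
    return model_id.endswith(":free")


def build_client_config(env=os.environ) -> dict:
    """Build keyword arguments for an OpenAI-compatible client.

    OPENROUTER_API_KEY is an assumed variable name, not an OpenRouter requirement.
    """
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("Set OPENROUTER_API_KEY to your OpenRouter key")
    return {"base_url": OPENROUTER_BASE_URL, "api_key": api_key}
```

With the `openai` package installed, `OpenAI(**build_client_config())` should give a client pointed at OpenRouter rather than at the default OpenAI endpoint.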
All Free OpenRouter Models — Context Windows & Rate Limits
| Model | Context | Max Output | Rate Limit | Released | Status |
|---|---|---|---|---|---|
| Nex AGI: Nex-N2-Pro (free) | 262K | 262K | See provider page | Jun 8, 2026 | Online |
| NVIDIA: Nemotron 3.5 Content Safety (free) | 128K | 8K | See provider page | Jun 4, 2026 | Online |
| NVIDIA: Nemotron 3 Ultra (free) | 1.0M | 66K | See provider page | Jun 4, 2026 | Online |
| MiniMax: MiniMax M3 | 1.0M | 512K | See provider page | May 31, 2026 | Online |
| inclusionAI: Ring-2.6-1T | 262K | 66K | See provider page | May 8, 2026 | Online |
| Owl Alpha | 1.0M | 262K | See provider page | Apr 28, 2026 | Online |
| NVIDIA: Nemotron 3 Nano Omni (free) | 256K | 66K | See provider page | Apr 28, 2026 | Online |
| Poolside: Laguna XS.2 (free) | 262K | 33K | See provider page | Apr 28, 2026 | Online |
| Poolside: Laguna M.1 (free) | 262K | 33K | See provider page | Apr 28, 2026 | Online |
| DeepSeek: DeepSeek V4 Flash | 1.0M | 131K | See provider page | Apr 24, 2026 | Online |
| MoonshotAI: Kimi K2.6 (free) | 262K | 8K | See provider page | Apr 20, 2026 | Online |
| Z.ai: GLM 5.1 | 203K | 8K | See provider page | Apr 7, 2026 | Online |
| Google: Gemma 4 26B A4B (free) | 262K | 33K | See provider page | Apr 3, 2026 | Online |
| Google: Gemma 4 31B (free) | 262K | 33K | See provider page | Apr 2, 2026 | Online |
| Arcee AI: Trinity Large Thinking | 262K | 262K | See provider page | Apr 1, 2026 | Online |
| Google: Lyria 3 Pro Preview | 1.0M | 66K | See provider page | Mar 30, 2026 | Online |
| Google: Lyria 3 Clip Preview | 1.0M | 66K | See provider page | Mar 30, 2026 | Online |
| NVIDIA: Nemotron 3 Super (free) | 1.0M | 262K | See provider page | Mar 11, 2026 | Online |
| MiniMax: MiniMax M2.5 | 205K | 197K | See provider page | Feb 12, 2026 | Online |
| Free Models Router | 200K | 8K | See provider page | Feb 1, 2026 | Online |
| LiquidAI: LFM2.5-1.2B-Thinking (free) | 33K | 8K | See provider page | Jan 20, 2026 | Online |
| LiquidAI: LFM2.5-1.2B-Instruct (free) | 33K | 8K | See provider page | Jan 20, 2026 | Online |
| NVIDIA: Nemotron 3 Nano 30B A3B (free) | 256K | 8K | See provider page | Dec 14, 2025 | Online |
| OpenAI: gpt-oss-safeguard-20b | 131K | 66K | See provider page | Oct 29, 2025 | Online |
| NVIDIA: Nemotron Nano 12B 2 VL (free) | 128K | 128K | See provider page | Oct 28, 2025 | Online |
| Qwen: Qwen3 Next 80B A3B Instruct (free) | 262K | 8K | See provider page | Sep 11, 2025 | Online |
| NVIDIA: Nemotron Nano 9B V2 (free) | 128K | 8K | See provider page | Sep 5, 2025 | Online |
| OpenAI: gpt-oss-120b (free) | 131K | 131K | See provider page | Aug 5, 2025 | Online |
| OpenAI: gpt-oss-20b (free) | 131K | 8K | See provider page | Aug 5, 2025 | Online |
| Z.ai: GLM 4.5 Air (free) | 131K | 96K | See provider page | Jul 25, 2025 | Online |
| Qwen: Qwen3 Coder 480B A35B (free) | 1.0M | 262K | See provider page | Feb 4, 2026 | Online |
| Venice: Uncensored (free) | 33K | 8K | See provider page | — | Online |
| Meta: Llama 3.3 70B Instruct (free) | 131K | 8K | See provider page | Dec 6, 2024 | Online |
| Meta: Llama 3.2 3B Instruct (free) | 131K | 8K | See provider page | Sep 25, 2024 | Online |
| Nous: Hermes 3 405B Instruct (free) | 131K | 8K | See provider page | Aug 16, 2024 | Online |
| Baidu Qianfan: CoBuddy | 131K | 65K | See provider page | — | Unavailable |
| NVIDIA: Llama Nemotron Embed VL 1B V2 (free) | 131K | 8K | See provider page | Feb 25, 2026 | Unavailable |
| Sourceful: Riverflow V2.5 Fast (free) | 8K | 8K | See provider page | Jun 4, 2026 | Unavailable |
| Sourceful: Riverflow V2.5 Pro (free) | 8K | 8K | See provider page | Jun 4, 2026 | Unavailable |
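To shortlist models from the table by context size, the abbreviated sizes can be converted to numbers first. The sketch below is a hypothetical helper, not part of any OpenRouter tooling: it treats K as 1,000 and M as 1,000,000 (an approximation of the rounded figures shown), and the sample rows are copied from the table above.

```python
def parse_token_count(label: str) -> int:
    """Convert a table label such as '262K' or '1.0M' to an approximate token count."""
    multipliers = {"K": 1_000, "M": 1_000_000}
    text = label.strip().upper()
    if text and text[-1] in multipliers:
        return round(float(text[:-1]) * multipliers[text[-1]])
    return int(text)


# (model, context, max output, status) rows copied from the table above.
SAMPLE_ROWS = [
    ("NVIDIA: Nemotron 3 Ultra (free)", "1.0M", "66K", "Online"),
    ("Google: Gemma 4 31B (free)", "262K", "33K", "Online"),
    ("LiquidAI: LFM2.5-1.2B-Instruct (free)", "33K", "8K", "Online"),
    ("Sourceful: Riverflow V2.5 Pro (free)", "8K", "8K", "Unavailable"),
]


def models_with_context(rows, min_tokens: int) -> list:
    """Names of online models whose context window is at least min_tokens."""
    return [
        name
        for name, context, _max_output, status in rows
        if status == "Online" and parse_token_count(context) >= min_tokens
    ]
```

For example, `models_with_context(SAMPLE_ROWS, 200_000)` keeps only the two sample rows with a context window of 262K or more.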
Nex AGI: Nex-N2-Pro (free)
Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active parameters out of 397B total. Built on the Qwen3.5 architecture, it accepts text and image input and produces text output, and supports reasoning, function calling, and structured outputs. It is designed for coding, tool use, deep research, and long-horizon agentic workflows, unifying planning, code implementation, debugging, and iteration into a single execution loop.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nex-agi/nex-n2-pro:free",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nex-agi/nex-n2-pro:free",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nex-agi/nex-n2-pro:free","messages":[{"role":"user","content":"Hello"}]}'
NVIDIA: Nemotron 3.5 Content Safety (free)
NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrail model from NVIDIA, fine-tuned from Google Gemma-3-4B. It moderates both inputs to and responses from LLMs and VLMs, accepting text and image input and returning text output: a safe/unsafe classification for the user prompt and the response, safety category labels, and an optional reasoning trace. It covers 12 languages with a context window of up to 128K tokens. It is suited for prompt and response moderation, content classification, safety pipelines, and enterprise AI guardrails with policy enforcement, and includes a togglable reasoning mode. It is part of the NVIDIA Nemotron family of open models for agentic AI.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nvidia/nemotron-3.5-content-safety:free",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nvidia/nemotron-3.5-content-safety:free",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-3.5-content-safety:free","messages":[{"role":"user","content":"Hello"}]}'
NVIDIA: Nemotron 3 Ultra (free)
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it supports text input and output with a context window of up to 1M tokens. It is suited for long-running agentic workflows, including agent orchestration, coding agents, deep research, and complex enterprise tasks. It is particularly strong at multi-step reasoning and planning, with high-throughput inference designed for high-volume agent pipelines. It is part of the NVIDIA Nemotron family of open models for agentic AI.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nvidia/nemotron-3-ultra-550b-a55b:free",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nvidia/nemotron-3-ultra-550b-a55b:free",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-3-ultra-550b-a55b:free","messages":[{"role":"user","content":"Hello"}]}'
MiniMax: MiniMax M3
MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding, and tool use. It is built on MiniMax Sparse Attention (MSA), which replaces full attention with KV-block selection to cut per-token compute at long context — roughly 1/20 the cost of the previous generation at 1M tokens, with substantially faster prefill and decode while retaining quality across most tasks. Trained as a native multimodal model on interleaved data and tuned for multi-turn, production-like collaboration via an interactive user-simulator framework, the model is oriented toward sustained, multi-step tasks rather than single-turn execution.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="minimax/minimax-m3",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "minimax/minimax-m3",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"minimax/minimax-m3","messages":[{"role":"user","content":"Hello"}]}'
inclusionAI: Ring-2.6-1T
inclusionAI Ring-2.6-1T is a 1-trillion-parameter thinking model with 63B active parameters (MoE), available free on OpenRouter. Optimized for coding agents, tool use, and long-horizon task execution. Features adaptive reasoning with high and xhigh modes that dynamically adjust the reasoning budget based on task complexity — delivering stronger performance with lower token overhead in tool-heavy, multi-turn agent workflows. Strong results on PinchBench, ClawEval, TAU2-Bench, and GAIA2-search. 262K context window. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="inclusionai/ring-2.6-1t",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "inclusionai/ring-2.6-1t",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"inclusionai/ring-2.6-1t","messages":[{"role":"user","content":"Hello"}]}'
Owl Alpha
Owl Alpha is a high-performance foundation model designed for agentic workloads, available free on OpenRouter. Features a 1M context window — one of the largest of any model. Natively handles tool use, long-context tasks, code generation, automated workflows, and complex instruction execution. Compatible with Claude Code, OpenClaw, and other mainstream productivity tools. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD (or 1,000 with $10 lifetime credit). Note: prompts and completions may be logged by the provider for model improvement.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="openrouter/owl-alpha",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "openrouter/owl-alpha",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openrouter/owl-alpha","messages":[{"role":"user","content":"Hello"}]}'
NVIDIA: Nemotron 3 Nano Omni (free)
NVIDIA Nemotron 3 Nano Omni 30B A3B Reasoning is an open multimodal model with 30B total parameters (3B active via MoE), available free on OpenRouter. Accepts text, image, video, and audio input — built as a perception and context sub-agent for enterprise agent systems. Uses a Hybrid MoE Transformer-Mamba architecture with Conv3D video layers and Efficient Video Sampling (EVS) for ~2x higher throughput and 2.5x lower compute on video tasks vs. separate vision+speech pipelines. Up to 300K context, 16K reasoning budget. OpenAI-compatible. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free","messages":[{"role":"user","content":"Hello"}]}'
Poolside: Laguna XS.2 (free)
Poolside Laguna XS.2 is the second-generation efficient coding agent model from Poolside, available free on OpenRouter. Combines tool calling and reasoning capabilities with a compact footprint for software engineering and agentic coding workflows. 131K context window, up to 8K output. Quantized to fp8 for fast, cost-efficient inference. Released under Apache 2.0. Text/code-focused. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="poolside/laguna-xs.2:free",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "poolside/laguna-xs.2:free",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"poolside/laguna-xs.2:free","messages":[{"role":"user","content":"Hello"}]}'
Poolside: Laguna M.1 (free)
Poolside Laguna M.1 is the flagship coding agent model from Poolside, available free on OpenRouter. Designed for complex software engineering tasks with agentic coding workflows — supports tool calling and reasoning. Features a 131K context window with up to 8K output tokens. Quantized to fp8 for efficient inference. Users must agree to Poolside's End User License Agreement. Text/code-focused. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="poolside/laguna-m.1:free",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "poolside/laguna-m.1:free",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"poolside/laguna-m.1:free","messages":[{"role":"user","content":"Hello"}]}'
DeepSeek: DeepSeek V4 Flash
DeepSeek V4 Flash is a Mixture-of-Experts model with 284B total parameters (13B active per token), available free on OpenRouter. It features a 1M context window with hybrid attention for efficient long-context processing. Supports configurable reasoning effort (high/xhigh levels). Strong performance on coding, reasoning, and agent workflows. Model weights available on Hugging Face. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD (or 1,000 with $10 lifetime credit).
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="deepseek/deepseek-v4-flash",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "deepseek/deepseek-v4-flash",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek/deepseek-v4-flash","messages":[{"role":"user","content":"Hello"}]}'
MoonshotAI: Kimi K2.6 (free)
Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and can convert prompts and visual inputs into production-ready interfaces. Its agent swarm architecture scales to hundreds of parallel sub-agents for autonomous task decomposition - delivering documents, websites, and spreadsheets in a single run without human oversight.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="moonshotai/kimi-k2.6:free",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "moonshotai/kimi-k2.6:free",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"moonshotai/kimi-k2.6:free","messages":[{"role":"user","content":"Hello"}]}'
Z.ai: GLM 5.1
GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on a single task for more than 8 hours, autonomously planning, executing, and improving itself throughout the process, ultimately delivering complete, engineering-grade results.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="z-ai/glm-5.1",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "z-ai/glm-5.1",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"z-ai/glm-5.1","messages":[{"role":"user","content":"Hello"}]}'
Google: Gemma 4 26B A4B (free)
Google Gemma 4 26B is a Mixture-of-Experts vision-language model with 25.2B total parameters (3.8B active per token), available free on OpenRouter. It supports multimodal input — text, images, and video (up to 60s at 1fps) — with a 262K context window. Includes native function calling, configurable thinking/reasoning mode, and structured output support. Delivers near-31B quality at a fraction of the compute. Apache 2.0 license. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="google/gemma-4-26b-a4b-it:free",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "google/gemma-4-26b-a4b-it:free",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/gemma-4-26b-a4b-it:free","messages":[{"role":"user","content":"Hello"}]}'
Google: Gemma 4 31B (free)
Google Gemma 4 31B is a dense 30.7B-parameter multimodal model from Google DeepMind, available free on OpenRouter. Supports text and image input with a 262K context window. Strong on coding, reasoning, and document understanding tasks. Includes native function calling, configurable thinking/reasoning mode, and multilingual support across 140+ languages. Apache 2.0 license. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="google/gemma-4-31b-it:free",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "google/gemma-4-31b-it:free",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/gemma-4-31b-it:free","messages":[{"role":"user","content":"Hello"}]}'
Arcee AI: Trinity Large Thinking
Arcee AI Trinity Large Thinking is a powerful open-source reasoning model available free on OpenRouter. With a 262K context window, it is optimized for agentic workflows and performs well on PinchBench and reasoning benchmarks. Best results come from preserving interleaved thinking (chain-of-thought) tokens. Model weights are open on Hugging Face. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD (or 1,000 with $10 lifetime credit).
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="arcee-ai/trinity-large-thinking",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "arcee-ai/trinity-large-thinking",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"arcee-ai/trinity-large-thinking","messages":[{"role":"user","content":"Hello"}]}'
Google: Lyria 3 Pro Preview
Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz stereo audio from text prompts or from images. These models deliver structural coherence, including vocals, timed lyrics, and full instrumental arrangements. Lyria 3 Pro can generate full-length songs with verses, choruses, bridges.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="google/lyria-3-pro-preview",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "google/lyria-3-pro-preview",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/lyria-3-pro-preview","messages":[{"role":"user","content":"Hello"}]}'
Google: Lyria 3 Clip Preview
30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz stereo audio from text prompts or from images. These models deliver structural coherence, including vocals, timed lyrics, and full instrumental arrangements. Lyria 3 Clip can generate short clips, loops, previews.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="google/lyria-3-clip-preview",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "google/lyria-3-clip-preview",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/lyria-3-clip-preview","messages":[{"role":"user","content":"Hello"}]}'
NVIDIA: Nemotron 3 Super (free)
NVIDIA Nemotron 3 Super 120B A12B is an open hybrid MoE model with 120B total parameters (12B active), available free on OpenRouter. Built on a hybrid Mamba-Transformer Mixture-of-Experts architecture with multi-token prediction (MTP), it delivers over 50% higher token generation speed compared to leading open models. Designed for multi-agent applications — long-term agent coherence, cross-document reasoning, and multi-step task planning. Trained with multi-environment RL across 10+ environments. Latent MoE calls 4 experts for the cost of one. Up to 1M context window. Fully open: weights, datasets, and recipes. OpenAI-compatible. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nvidia/nemotron-3-super-120b-a12b:free",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nvidia/nemotron-3-super-120b-a12b:free",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-3-super-120b-a12b:free","messages":[{"role":"user","content":"Hello"}]}'
MiniMax: MiniMax M2.5
MiniMax M2.5 is a SOTA large language model designed for real-world productivity, available free on OpenRouter. It builds on M2.1's coding strengths and extends into general office tasks — handling Word, Excel, and PowerPoint files, switching between software environments, and working across teams. Achieved 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp. 197K context window. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="minimax/minimax-m2.5",
    messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "minimax/minimax-m2.5",
  messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"minimax/minimax-m2.5","messages":[{"role":"user","content":"Hello"}]}'
Free Models Router
OpenRouter Free Models Router selects at random from a pool of 25 free models, first filtering for models that support the features each request needs: image understanding, tool calling, and structured outputs. Rather than locking into a single model, it routes each request to an eligible free model at request time. 200K context window. OpenAI-compatible via OpenRouter. Free tier: 200 RPD (or 1,000 with $10 lifetime credit).
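The filter-then-pick behavior described for the router can be illustrated with a short sketch. This is not OpenRouter's implementation: the `FreeModel` type, the feature names, and the example model IDs are all hypothetical, and the real router runs server-side when `openrouter/free` is requested.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FreeModel:
    """Hypothetical catalog entry: a model ID plus the features it supports."""

    model_id: str
    features: frozenset


def pick_free_model(models, required, rng=random):
    """Filter to models supporting every required feature, then pick one at random."""
    needed = frozenset(required)
    candidates = [m for m in models if needed <= m.features]
    if not candidates:
        raise LookupError(f"no free model supports {sorted(needed)}")
    return rng.choice(candidates)


# Hypothetical catalog; the IDs and feature names are placeholders.
CATALOG = [
    FreeModel("example/text-only:free", frozenset()),
    FreeModel("example/tools:free", frozenset({"tools"})),
    FreeModel("example/vision-tools:free", frozenset({"images", "tools"})),
]
```

Passing a seeded `random.Random` as `rng` makes the choice reproducible, which is convenient when testing code that depends on the selection.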
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="openrouter/free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "openrouter/free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"openrouter/free","messages":[{"role":"user","content":"Hello"}]}'
LiquidAI: LFM2.5-1.2B-Thinking (free)
Liquid LFM 2.5 1.2B Thinking is a compact reasoning model optimized for agentic tasks, data extraction, and RAG, available free on OpenRouter. Despite its small 1.2B parameter footprint, it produces high-quality chain-of-thought responses and runs comfortably on edge devices. 33K context window, supporting long-context reasoning in a lightweight package. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="liquid/lfm-2.5-1.2b-thinking:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "liquid/lfm-2.5-1.2b-thinking:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"liquid/lfm-2.5-1.2b-thinking:free","messages":[{"role":"user","content":"Hello"}]}'
LiquidAI: LFM2.5-1.2B-Instruct (free)
Liquid LFM 2.5 1.2B Instruct is a compact, high-performance instruction-tuned model designed for fast on-device AI, available free on OpenRouter. With a 1.2B parameter footprint and 33K context window, it delivers strong chat quality for edge deployment scenarios. Built for efficient inference across a broad range of runtimes. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="liquid/lfm-2.5-1.2b-instruct:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "liquid/lfm-2.5-1.2b-instruct:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"liquid/lfm-2.5-1.2b-instruct:free","messages":[{"role":"user","content":"Hello"}]}'
NVIDIA: Nemotron 3 Nano 30B A3B (free)
NVIDIA Nemotron 3 Nano 30B A3B is a Mixture-of-Experts language model with 30B total parameters (3B active), available free on OpenRouter. Designed for highest compute efficiency and accuracy in agentic AI systems. Features a 262K context window — exceptionally long for its size class. Fully open with weights, datasets, and recipes under the NVIDIA Open License for custom deployment. Also available directly on NVIDIA NIM. OpenAI-compatible. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="nvidia/nemotron-3-nano-30b-a3b:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "nvidia/nemotron-3-nano-30b-a3b:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"nvidia/nemotron-3-nano-30b-a3b:free","messages":[{"role":"user","content":"Hello"}]}'
OpenAI: gpt-oss-safeguard-20b
gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers lower latency for safety tasks like content classification, LLM filtering, and trust & safety labeling. Learn more about this model in OpenAI's gpt-oss-safeguard [user guide](https://cookbook.openai.com/articles/gpt-oss-safeguard-guide).
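For a classification task, a bare "Hello" prompt says little about how the model is meant to be used. The sketch below pairs a written policy with the content to label. It assumes the policy goes in the system message and the content in the user message, as OpenAI's guide linked above describes; the helper name and the toy policy are hypothetical.

```python
def build_classification_messages(policy, content):
    """Pair a written policy (system message) with the content to label (user message)."""
    if not policy.strip():
        raise ValueError("a non-empty policy is required")
    return [
        {"role": "system", "content": policy},
        {"role": "user", "content": content},
    ]


# Toy policy written for this example only.
POLICY = (
    "Classify the user content as SPAM or NOT_SPAM. "
    "SPAM: unsolicited promotion or repeated links. "
    "Reply with the label only."
)

messages = build_classification_messages(POLICY, "Buy cheap watches at example.com!!!")
```

The resulting list can be passed as `messages=` to `client.chat.completions.create` with `model="openai/gpt-oss-safeguard-20b"`.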
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="openai/gpt-oss-safeguard-20b",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "openai/gpt-oss-safeguard-20b",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"openai/gpt-oss-safeguard-20b","messages":[{"role":"user","content":"Hello"}]}'
NVIDIA: Nemotron Nano 12B 2 VL (free)
NVIDIA Nemotron Nano 12B V2 VL is a 12B-parameter multimodal reasoning model available free on OpenRouter. Uses a hybrid Transformer-Mamba architecture for better throughput and lower latency. Handles text and multi-image document inputs with 128K context. Optimized for optical character recognition (OCR), chart reasoning, and multimodal comprehension. Supports long-form video via Efficient Video Sampling (EVS). Achieves ~74 average across MMMU, MathVista, AI2D, OCRBench, ChartQA, DocVQA, and Video-MME. Open-weights under NVIDIA license. Also available on NVIDIA NIM. OpenAI-compatible. Free tier: 200 RPD.
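Because this model accepts several images per request, a text-only "Hello" does not exercise it. A minimal sketch of a multi-image message follows, using the OpenAI Chat Completions content-part format (`text` and `image_url` parts). The helper name and the example URLs are placeholders, and it is an assumption that the free endpoint accepts remote image URLs in this form.

```python
def build_multi_image_message(question, image_urls):
    """Build one user message: a text part followed by one part per image."""
    if not image_urls:
        raise ValueError("at least one image URL is required")
    parts = [{"type": "text", "text": question}]
    parts += [{"type": "image_url", "image_url": {"url": url}} for url in image_urls]
    return {"role": "user", "content": parts}


payload = {
    "model": "nvidia/nemotron-nano-12b-v2-vl:free",
    "messages": [
        build_multi_image_message(
            "Compare the totals on these two invoices.",
            # Placeholder URLs; substitute documents you can actually fetch.
            ["https://example.com/invoice-1.png", "https://example.com/invoice-2.png"],
        )
    ],
}
```

The `payload` dictionary can be unpacked into `client.chat.completions.create(**payload)`.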
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="nvidia/nemotron-nano-12b-v2-vl:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "nvidia/nemotron-nano-12b-v2-vl:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"nvidia/nemotron-nano-12b-v2-vl:free","messages":[{"role":"user","content":"Hello"}]}'
Qwen: Qwen3 Next 80B A3B Instruct (free)
Qwen3 Next 80B A3B Instruct is a Mixture-of-Experts chat model from Alibaba's Qwen team, available free on OpenRouter. With 80B total parameters (3B active), it is optimized for fast, stable responses without chain-of-thought traces — returning only final answers. Designed for production settings where deterministic, instruction-following outputs are preferred. Handles reasoning, code generation, knowledge QA, and multilingual tasks across a 262K context window. Supports RAG, tool use, and agentic workflows. Model weights on Hugging Face. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="qwen/qwen3-next-80b-a3b-instruct:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "qwen/qwen3-next-80b-a3b-instruct:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"qwen/qwen3-next-80b-a3b-instruct:free","messages":[{"role":"user","content":"Hello"}]}'
NVIDIA: Nemotron Nano 9B V2 (free)
NVIDIA Nemotron Nano 9B V2 is a 9B-parameter language model trained from scratch by NVIDIA, available free on OpenRouter. Designed as a unified model for both reasoning and non-reasoning tasks — it generates a reasoning trace before delivering a final answer, with the ability to skip intermediate reasoning via system prompt configuration. This makes it a single model that handles both chain-of-thought and direct-response tasks. 131K context window. Text-only. Also available directly on NVIDIA NIM. OpenAI-compatible. Free tier: 200 RPD.
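The system-prompt switch mentioned above might look like the sketch below. It assumes the `/think` and `/no_think` control strings from NVIDIA's model card are honoured when the model is called through OpenRouter; verify this against the provider page before relying on it. The helper name is hypothetical.

```python
def build_messages(user_text, reasoning=True):
    """Prepend a system prompt that turns the reasoning trace on or off."""
    # Assumed control strings; see the lead-in for the caveat.
    control = "/think" if reasoning else "/no_think"
    return [
        {"role": "system", "content": control},
        {"role": "user", "content": user_text},
    ]


direct = build_messages("What is the capital of France?", reasoning=False)
```

Passing `messages=direct` to the Python snippet below would then request a direct answer without the intermediate trace.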
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="nvidia/nemotron-nano-9b-v2:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "nvidia/nemotron-nano-9b-v2:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"nvidia/nemotron-nano-9b-v2:free","messages":[{"role":"user","content":"Hello"}]}'
OpenAI: gpt-oss-120b (free)
GPT-OSS 120B is OpenAI's open-weight 117B-parameter Mixture-of-Experts model with 5.1B active parameters per forward pass, available free on OpenRouter (also on Groq, Cerebras, and Cloudflare). Supports configurable reasoning depth, full chain-of-thought access, and native tool use — function calling, browsing, and structured outputs. Optimized to run on a single H100 GPU with native MXFP4 quantization. Built for reasoning-heavy and agentic tasks. 131K context window. Text-only. OpenAI-compatible. Free tier: 200 RPD on OpenRouter.
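Configurable reasoning depth is typically set per request. The sketch below builds keyword arguments for the OpenAI Python SDK and passes an OpenRouter-style `reasoning` object through the SDK's `extra_body` parameter. The `{"reasoning": {"effort": ...}}` shape and the three effort names are assumptions to check against OpenRouter's current documentation; `completion_kwargs` is a hypothetical helper.

```python
VALID_EFFORTS = ("low", "medium", "high")


def completion_kwargs(prompt, effort="medium"):
    """Build arguments for client.chat.completions.create with a reasoning effort."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return {
        "model": "openai/gpt-oss-120b:free",
        "messages": [{"role": "user", "content": prompt}],
        # extra_body lets the OpenAI SDK forward fields it does not model itself.
        "extra_body": {"reasoning": {"effort": effort}},
    }


kwargs = completion_kwargs("Plan a three-step database migration.", effort="high")
```

With a client configured as in the snippet below, the call becomes `client.chat.completions.create(**kwargs)`.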
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="openai/gpt-oss-120b:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "openai/gpt-oss-120b:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"openai/gpt-oss-120b:free","messages":[{"role":"user","content":"Hello"}]}'
OpenAI: gpt-oss-20b (free)
GPT-OSS 20B is OpenAI's open-weight 21B-parameter Mixture-of-Experts model with 3.6B active per forward pass, available free on OpenRouter (also on Groq, Cerebras, and Cloudflare). Supports reasoning level configuration, fine-tuning, and agentic capabilities — function calling, tool use, and structured outputs. Trained in OpenAI's Harmony response format. Released under Apache 2.0. The low active parameter count enables lower-latency inference on consumer or single-GPU hardware. 131K context window. Text-only. OpenAI-compatible. Free tier: 200 RPD on OpenRouter.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="openai/gpt-oss-20b:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "openai/gpt-oss-20b:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"openai/gpt-oss-20b:free","messages":[{"role":"user","content":"Hello"}]}'
Z.ai: GLM 4.5 Air (free)
GLM-4.5 Air is the lightweight Mixture-of-Experts variant of Zhipu AI's (Z.ai) flagship GLM model family, available free on OpenRouter. Purpose-built for agent-centric applications with 131K context window. Supports hybrid inference modes — a thinking mode for reasoning and tool use, plus a non-thinking mode for real-time interaction, toggled via a reasoning.enabled parameter. Compact MoE architecture for efficient deployment. Text-only. Also available directly from Z AI's platform. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
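To make the `reasoning.enabled` toggle concrete, here is a minimal sketch that prepares (but does not send) a raw HTTP request with thinking mode switched off. It assumes the toggle is sent as a top-level `reasoning` object in the JSON body; `build_request` is a hypothetical helper and `YOUR_API_KEY` is a placeholder.

```python
import json
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(prompt, thinking, api_key="YOUR_API_KEY"):
    """Prepare a chat request with thinking mode on or off, without sending it."""
    body = {
        "model": "z-ai/glm-4.5-air:free",
        "messages": [{"role": "user", "content": prompt}],
        # Assumed placement of the toggle named in the description above.
        "reasoning": {"enabled": bool(thinking)},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


fast = build_request("Give me a one-line status update.", thinking=False)
# Sending is left to the caller, e.g. urllib.request.urlopen(fast).
```

Non-thinking mode suits short interactive replies; passing `thinking=True` builds the same request for reasoning or tool-use tasks.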
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="z-ai/glm-4.5-air:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "z-ai/glm-4.5-air:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"z-ai/glm-4.5-air:free","messages":[{"role":"user","content":"Hello"}]}'
Qwen: Qwen3 Coder 480B A35B (free)
Qwen3 Coder 480B A35B is a Mixture-of-Experts code generation model from Alibaba's Qwen team, available free on OpenRouter. With 480B total parameters and 35B active per forward pass (8 of 160 experts), it is optimized for agentic coding tasks — function calling, tool use, and long-context reasoning over repositories. 262K context window. One of the most capable free coding models on OpenRouter. Model weights available on Hugging Face. Text/code. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="qwen/qwen3-coder:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "qwen/qwen3-coder:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"qwen/qwen3-coder:free","messages":[{"role":"user","content":"Hello"}]}'
Venice: Uncensored (free)
Venice: Uncensored is a free text chat model on OpenRouter. It accepts up to 32,768 input tokens and produces up to 8,192 output tokens. It is OpenAI-compatible and requires no credit card.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="cognitivecomputations/dolphin-mistral-24b-venice-edition:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "cognitivecomputations/dolphin-mistral-24b-venice-edition:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"cognitivecomputations/dolphin-mistral-24b-venice-edition:free","messages":[{"role":"user","content":"Hello"}]}'
Meta: Llama 3.3 70B Instruct (free)
Meta Llama 3.3 70B Instruct is Meta's flagship open-weight multilingual dialogue model, available free on OpenRouter (also widely hosted on Groq, NVIDIA NIM, Cerebras, and Cloudflare). With a 131K context window, it outperforms many open and closed chat models on common industry benchmarks. Supports 8 languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. Text-only. Model weights available on Hugging Face. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="meta-llama/llama-3.3-70b-instruct:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "meta-llama/llama-3.3-70b-instruct:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"meta-llama/llama-3.3-70b-instruct:free","messages":[{"role":"user","content":"Hello"}]}'
Meta: Llama 3.2 3B Instruct (free)
Meta Llama 3.2 3B Instruct is a 3-billion-parameter multilingual model available free on OpenRouter. Trained on 9 trillion tokens, it targets instruction following, complex reasoning, and tool use across 8 languages (English, Spanish, Hindi, and 5 others). 80K context window. One of the two lightweight text-only models in Meta's Llama 3.2 family (alongside the 1B), optimized for accuracy and efficiency in text generation. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="meta-llama/llama-3.2-3b-instruct:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "meta-llama/llama-3.2-3b-instruct:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"meta-llama/llama-3.2-3b-instruct:free","messages":[{"role":"user","content":"Hello"}]}'
Nous: Hermes 3 405B Instruct (free)
Nous Research Hermes 3 405B is a frontier-level, full-parameter finetune of Llama 3.1 405B, available free on OpenRouter. It builds on Hermes 2 with advanced agentic capabilities, roleplaying, multi-turn conversation, and long-context coherence. Features powerful and reliable function calling, structured output capabilities, and improved code generation. Emphasizes user alignment — providing strong steering control to the end user. 131K context window. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="nousresearch/hermes-3-llama-3.1-405b:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "nousresearch/hermes-3-llama-3.1-405b:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"nousresearch/hermes-3-llama-3.1-405b:free","messages":[{"role":"user","content":"Hello"}]}'
Baidu Qianfan: CoBuddy
Baidu CoBuddy is a code generation model optimized for coding tasks and AI agent workflows, available free on OpenRouter. It features a 131K context window with up to 65K output tokens — one of the largest output windows among free coding models. Runs on fp8 quantization for high throughput and low latency. Supports native tool calling and reasoning. OpenAI-compatible via OpenRouter. Free tier: 200 RPD (or 1,000 with $10 lifetime credit).
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="baidu/cobuddy",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "baidu/cobuddy",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"baidu/cobuddy","messages":[{"role":"user","content":"Hello"}]}'
NVIDIA: Llama Nemotron Embed VL 1B V2 (free)
NVIDIA Llama Nemotron Embed VL 1B V2 is a compact multimodal retrieval model available free on OpenRouter. Optimized for multimodal question-answering retrieval — it embeds documents as image, text, or both, retrievable via text query. Supports images containing text, tables, charts, and infographics. 131K context window, 1B parameters. Not a chat model — purpose-built for embedding and retrieval pipelines. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.
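In a retrieval pipeline, the vectors an embedding model returns are usually compared by cosine similarity. The sketch below ranks documents against a query using plain Python. The three-dimensional vectors are hand-made stand-ins for real embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cannot compare a zero-magnitude vector")
    return dot / norm


def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted from most to least similar to the query."""
    scores = {doc_id: cosine_similarity(query_vec, vec) for doc_id, vec in doc_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Tiny hand-made vectors standing in for real embeddings.
docs = {"chart.png": [0.9, 0.1, 0.0], "memo.txt": [0.1, 0.9, 0.2]}
print(rank_documents([1.0, 0.0, 0.0], docs))  # ['chart.png', 'memo.txt']
```

Real embeddings have many more dimensions, and larger collections are normally served from a vector index rather than a linear scan.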
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
# Embedding models take `input`, not chat messages.
# Assumes OpenRouter serves this model on its OpenAI-compatible embeddings endpoint.
response = client.embeddings.create(
model="nvidia/llama-nemotron-embed-vl-1b-v2-20260224:free",
input="Hello"
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
// Embedding models take `input`, not chat messages (endpoint availability assumed).
const response = await client.embeddings.create({
model: "nvidia/llama-nemotron-embed-vl-1b-v2-20260224:free",
input: "Hello"
});
cURL
curl https://openrouter.ai/api/v1/embeddings \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"nvidia/llama-nemotron-embed-vl-1b-v2-20260224:free","input":"Hello"}'
Sourceful: Riverflow V2.5 Fast (free)
Riverflow V2.5 Fast is the speed-optimized variant of Sourceful's Riverflow 2.5 lineup, best for production deployments and latency-critical workflows. The Riverflow 2.5 series is a unified text-to-image and image-to-image family that treats generation as a production workflow, using an integrated reasoning model to plan multi-step edits and judge candidates before accepting a result.
Reasoning effort is controllable via the reasoning parameter (low/medium/high): higher levels do more editing passes and apply a stricter internal judge, while lower levels return faster for early exploration. It generates at 1K and 2K resolution (no 4K) and accepts up to 4 input images for editing. Pricing is dynamic: cost is finalized per job at completion based on billable processing, so it scales with reasoning effort, resolution, and editing complexity rather than a fixed per-image rate.
Additional features (via image_config):
- Custom font rendering via font_inputs (max 2) to match brand lettering, spacing, and weight
- Custom scoring via scoring_prompt and scoring_rubric, so the reasoning model evaluates and steers each candidate against the criteria you care about
- Background control via background_mode (original, transparent, solid) and background_hex_color
See the image generation docs for details: https://openrouter.ai/docs/features/multimodal/image-generation
Note: Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.
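The limits described for the Riverflow 2.5 variants can be checked client-side before a request is sent. The sketch below builds an image-editing request body and validates it against the per-variant limits quoted in this guide. Several details are assumptions rather than confirmed API facts: the top-level `modalities`, `reasoning`, and `image_config` placement, the `{"effort": ...}` shape, and the helper name `build_edit_request`.

```python
FAST = "sourceful/riverflow-v2.5-fast-20260605:free"
PRO = "sourceful/riverflow-v2.5-pro-20260605:free"

# Limits taken from the model descriptions in this guide.
LIMITS = {
    FAST: {"max_images": 4, "efforts": ("low", "medium", "high")},
    PRO: {"max_images": 10, "efforts": ("low", "medium", "high", "xhigh")},
}
BACKGROUND_MODES = ("original", "transparent", "solid")


def build_edit_request(model, prompt, image_urls, effort="low", background_mode=None):
    """Validate inputs against the variant's limits and return a request body."""
    limits = LIMITS[model]
    if len(image_urls) > limits["max_images"]:
        raise ValueError(f"{model} accepts at most {limits['max_images']} input images")
    if effort not in limits["efforts"]:
        raise ValueError(f"{model} supports efforts {limits['efforts']}, got {effort!r}")
    if background_mode is not None and background_mode not in BACKGROUND_MODES:
        raise ValueError(f"background_mode must be one of {BACKGROUND_MODES}")

    # Image URLs (not Base64) keep the request under the 4.5MB size limit.
    content = [{"type": "text", "text": prompt}]
    content += [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]

    body = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "modalities": ["image", "text"],   # assumed request field
        "reasoning": {"effort": effort},   # assumed shape of the reasoning parameter
    }
    if background_mode is not None:
        body["image_config"] = {"background_mode": background_mode}
    return body


body = build_edit_request(
    FAST,
    "Remove the background and keep the product centered.",
    ["https://example.com/product.png"],  # placeholder URL
    effort="high",
    background_mode="transparent",
)
```

Validating locally avoids spending one of the day's free requests on a call the provider would reject.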
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="sourceful/riverflow-v2.5-fast-20260605:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "sourceful/riverflow-v2.5-fast-20260605:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sourceful/riverflow-v2.5-fast-20260605:free","messages":[{"role":"user","content":"Hello"}]}'
Sourceful: Riverflow V2.5 Pro (free)
Riverflow V2.5 Pro is the most powerful variant of Sourceful's Riverflow 2.5 lineup, best for top-tier control and quality-sensitive outputs. The Riverflow 2.5 series is a unified text-to-image and image-to-image family that treats generation as a production workflow, using an integrated reasoning model to plan multi-step edits and judge candidates before accepting a result.
Reasoning effort is controllable via the reasoning parameter (low/medium/high/xhigh): higher levels do more editing passes and apply a stricter internal judge, with xhigh suited to batch runs that need high repeatability. It generates at 1K, 2K, and 4K resolution and accepts up to 10 input images for editing. Pricing is dynamic: cost is finalized per job at completion based on billable processing, so it scales with reasoning effort, resolution, and editing complexity rather than a fixed per-image rate.
Additional features (via image_config):
- Custom font rendering via font_inputs (max 2) to match brand lettering, spacing, and weight
- Custom scoring via scoring_prompt and scoring_rubric, so the reasoning model evaluates and steers each candidate against the criteria you care about
- Background control via background_mode (original, transparent, solid) and background_hex_color
See the image generation docs for details: https://openrouter.ai/docs/features/multimodal/image-generation
Note: Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="YOUR_API_KEY" # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
model="sourceful/riverflow-v2.5-pro-20260605:free",
messages=[{"role": "user", "content": "Hello"}]
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.API_KEY // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
model: "sourceful/riverflow-v2.5-pro-20260605:free",
messages: [{ role: "user", content: "Hello" }]
});
cURL
curl https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"sourceful/riverflow-v2.5-pro-20260605:free","messages":[{"role":"user","content":"Hello"}]}'
Free Tier Pricing & Rate Limits
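One way to stay within the free quota of 200 requests per day (1,000 with the $10 lifetime top-up) is to count requests on the client. The sketch below is a minimal local counter. The `DailyBudget` class is hypothetical, and resetting on the local calendar day is an assumption, since this guide does not say when OpenRouter's daily window rolls over.

```python
from datetime import date


class DailyBudget:
    """Client-side counter that refuses requests once the daily quota is spent."""

    def __init__(self, limit=200, today=date.today):
        self.limit = limit
        self._today = today      # injectable clock, useful for testing
        self._day = today()
        self._used = 0

    def _roll_over(self):
        # Reset the counter when the (assumed) calendar-day window changes.
        now = self._today()
        if now != self._day:
            self._day, self._used = now, 0

    def try_acquire(self):
        """Reserve one request; return False if today's quota is exhausted."""
        self._roll_over()
        if self._used >= self.limit:
            return False
        self._used += 1
        return True

    @property
    def remaining(self):
        self._roll_over()
        return self.limit - self._used


budget = DailyBudget(limit=200)
if budget.try_acquire():
    pass  # safe to call client.chat.completions.create(...) here
```

A local counter only approximates the server's accounting, so code should still handle an HTTP 429 response when the server reports the limit first.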
Tool Compatibility — Configure OpenRouter with Your AI Tools
OpenRouter works with popular AI coding tools. Here's how to configure each one:
Claude Code Anthropic CLI coding agent Compatible
# Claude Code works via OpenRouter's Anthropic-compatible API.
# Note: Only paid Anthropic Claude models are supported (e.g. claude-sonnet-4.6, claude-opus-4).
# Browse available Claude models at: https://openrouter.ai/models?q=anthropic
# Add to ~/.zshrc or ~/.bashrc
export OPENROUTER_API_KEY="<your-openrouter-api-key>" # Get at https://openrouter.ai/settings/keys
export ANTHROPIC_BASE_URL="https://openrouter.ai/api"
export ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY"
export ANTHROPIC_API_KEY="" # Must be explicitly empty to avoid conflicts
# Optional: pin specific models for each role
# export ANTHROPIC_DEFAULT_SONNET_MODEL="anthropic/claude-sonnet-4.6"
# export ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4.5"
# Then simply run: claude
See full Claude Code configuration guide →
Cursor AI-first code editor Compatible
# Cursor → Settings (⚙️) → Models → Add Model
# Enter the model name exactly as shown, then fill in:
# Override OpenAI Base URL: https://openrouter.ai/api/v1
# OpenAI API Key: <your-api-key> # Get at https://openrouter.ai/workspaces/default/keys
# Click "Verify" to confirm the connection, then enable the model.
#
# Model name to add: nex-agi/nex-n2-pro:free
See full Cursor configuration guide →
Codex OpenAI CLI coding agent Compatible
# Add to ~/.zshrc or ~/.bashrc
export OPENAI_BASE_URL="https://openrouter.ai/api/v1"
export OPENAI_API_KEY="<your-api-key>" # Get at https://openrouter.ai/workspaces/default/keys
# Then run:
codex --model "nex-agi/nex-n2-pro:free"
See full Codex configuration guide →
Gemini CLI Google Gemini CLI tool Compatible
# ~/.gemini/settings.json
{
"apiKey": "<your-api-key>",
"model": "nex-agi/nex-n2-pro:free"
}
# Get API key at https://openrouter.ai/workspaces/default/keys
See full Gemini CLI configuration guide →
OpenCode Open-source AI coding agent Compatible
// ~/.config/opencode/opencode.json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "free-llm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Free LLM",
      "options": {
        "baseURL": "https://openrouter.ai/api/v1",
        "apiKey": "<your-api-key>"
      },
      "models": {
        "nex-agi/nex-n2-pro:free": { "name": "nex-agi/nex-n2-pro:free" }
      }
    }
  }
}
// Get API key at https://openrouter.ai/workspaces/default/keys
See full OpenCode configuration guide →
Hermes
AI coding agent
Compatible
# Step 1: Edit config.yaml
# Windows: C:\Users\<you>\AppData\Local\hermes\config.yaml
# macOS/Linux: ~/.config/hermes/config.yaml
model:
  default: nex-agi/nex-n2-pro:free
  provider: custom
  base_url: ${CUSTOM_BASE_URL}
  api_key: ${CUSTOM_API_KEY}
model_aliases:
  nex-agi/nex-n2-pro:free:
    model: "nex-agi/nex-n2-pro:free"
    provider: "custom"
# Step 2: Edit .env (same directory as config.yaml)
# Windows: C:\Users\<you>\AppData\Local\hermes\.env
# macOS/Linux: ~/.config/hermes/.env
# ========================
# Custom API (OpenAI-compatible)
# ========================
# Get your key at https://openrouter.ai/workspaces/default/keys
CUSTOM_API_KEY=<your-api-key>
CUSTOM_BASE_URL=https://openrouter.ai/api/v1
See full Hermes configuration guide →
OpenClaw
AI coding agent (messaging gateway)
Compatible
// ~/.openclaw/openclaw.json (JSON5 format)
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "nex-agi/nex-n2-pro:free",
      },
    },
  },
  "models": {
    "providers": {
      // Option A: Built-in provider (OpenAI, Anthropic, Google…)
      // Just add apiKey; OpenClaw handles the baseUrl automatically
      // "openai": { "apiKey": "<your-api-key>" },
      // Option B: Custom OpenAI-compatible base URL (e.g. OpenRouter, NVIDIA)
      "free-llm": {
        "baseUrl": "https://openrouter.ai/api/v1",
        "apiKey": "<your-api-key>", // Get at https://openrouter.ai/workspaces/default/keys
        "api": "openai-completions", // openai-completions | anthropic-messages | …
        "models": [
          { "id": "nex-agi/nex-n2-pro:free", "name": "nex-agi/nex-n2-pro:free" },
        ],
      },
    },
  },
}
// Apply: openclaw gateway restart
// Verify: openclaw doctor --fix
See full OpenClaw configuration guide →
OpenHuman
Personal AI super intelligence (desktop agent)
Compatible
# config.toml — OpenHuman workspace config
# Edit via Settings → AI & Skills → Local AI, or directly in the file.
#
# OpenHuman defaults to its built-in subscription backend.
# Set inference_url + api_key below to route to a free third-party API.
inference_url = "https://openrouter.ai/api/v1"
api_key = "<your-api-key>" # Get at https://openrouter.ai/workspaces/default/keys
default_model = "nex-agi/nex-n2-pro:free"
# Optional: pin hints to specific models
# [model_routing]
# reasoning = "nex-agi/nex-n2-pro:free"
# fast = "nex-agi/nex-n2-pro:free"
# Verify: check Settings → AI & Skills for connection status
See full OpenHuman configuration guide →
Use Cases
What OpenRouter's free models are best for, based on aggregated model capabilities:
Limitations & Caveats
- 200 RPD on free tier (1,000 with $10 lifetime credit)
- Free models may change without notice
- Some free models have shorter context windows than paid variants
Frequently Asked Questions
What happens when I hit the 200 RPD limit on OpenRouter?
You'll get a 429 rate limit error. You can either wait for the daily reset (midnight UTC) or make a one-time $10 lifetime top-up to get 1,000 RPD permanently.
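As a rough illustration of the "wait for the daily reset" option, the sketch below computes how long a client would pause after a 429. It assumes the 429 came from the daily cap and that the reset is at midnight UTC as stated above; the function names are hypothetical and not part of any OpenRouter SDK.

```python
from datetime import datetime, timedelta, timezone


def seconds_until_utc_midnight(now: datetime) -> float:
    """Seconds from `now` until the next 00:00 UTC.

    `now` should be timezone-aware; a naive datetime would be
    interpreted as local time by astimezone().
    """
    now_utc = now.astimezone(timezone.utc)
    next_midnight = (now_utc + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now_utc).total_seconds()


def wait_after_status(status_code: int, now: datetime) -> float:
    """Return the pause before retrying: 0.0 unless the status is 429.

    Assumption: a 429 here means the daily request cap was reached.
    """
    if status_code != 429:
        return 0.0
    return seconds_until_utc_midnight(now)
```

A caller could pass the result to `time.sleep()`, or surface it to the user instead of blocking for hours.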
Do free models on OpenRouter cost anything?
No — models marked with :free in their ID are completely free to use within the rate limit. OpenRouter subsidizes free models through paid model margins.
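The `:free` marker can be checked mechanically. The sketch below is a minimal filter over a list of model IDs; it only tests the suffix convention described above and does not query OpenRouter, so the helper names and the sample list are illustrative assumptions.

```python
FREE_SUFFIX = ":free"


def is_free_model(model_id: str) -> bool:
    """True if the model ID carries the :free suffix."""
    return model_id.endswith(FREE_SUFFIX)


def free_models(model_ids: list[str]) -> list[str]:
    """Keep only the IDs marked :free, preserving input order."""
    return [m for m in model_ids if is_free_model(m)]


if __name__ == "__main__":
    # Sample IDs taken from this guide; the second one is a paid model.
    sample = ["nex-agi/nex-n2-pro:free", "anthropic/claude-sonnet-4.6"]
    print(free_models(sample))
```

A check like this could guard against accidentally sending requests to a paid model when only free usage is intended.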
Can I use OpenRouter with Claude Code?
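Yes. Set ANTHROPIC_BASE_URL to https://openrouter.ai/api, set ANTHROPIC_AUTH_TOKEN to your OpenRouter API key, and leave ANTHROPIC_API_KEY explicitly empty. Claude Code will then route its requests through OpenRouter. Note: free models won't work with Claude Code, because it requires Anthropic Claude models, which are paid on OpenRouter.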
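For scripts that launch Claude Code rather than relying on shell profile exports, the same three variables can be set on a copy of the process environment. This is a minimal sketch using the variable names from the configuration above; the helper name `claude_code_env` is hypothetical.

```python
import os


def claude_code_env(openrouter_key: str) -> dict[str, str]:
    """Return a copy of the current environment with the OpenRouter
    routing variables for Claude Code applied."""
    env = dict(os.environ)
    env["ANTHROPIC_BASE_URL"] = "https://openrouter.ai/api"
    env["ANTHROPIC_AUTH_TOKEN"] = openrouter_key
    # Explicitly empty so a stale Anthropic key does not take precedence.
    env["ANTHROPIC_API_KEY"] = ""
    return env
```

The returned dictionary could be passed as `env=` to `subprocess.run(["claude"], ...)`, assuming the `claude` binary is on the PATH.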