How to Get a Free OpenRouter API Key (2026)

39 free models available, no credit card required.

Overview

Single API key for 300+ models from every major lab.

OpenRouter is a unified API gateway that routes requests to models from OpenAI, Anthropic, Google, Meta, Mistral, and dozens of other providers. With a single API key and one base URL, developers can switch between models without changing code. The free tier (marked :free) offers 200 requests/day — or 1,000/day with a $10 lifetime top-up.

35+ permanently free models
OpenAI-compatible endpoint
200 RPD free, 1,000 RPD with $10 lifetime credit
One key for 300+ models
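A client-side counter can help an application stay under the daily request limits quoted above. The sketch below is illustrative only: `DailyBudget` is a hypothetical helper, the 200 and 1,000 figures are copied from this guide, and the reset at the local calendar day is an assumption (the server-side reset time may differ).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

# Daily limits as quoted in this guide; confirm current values before relying on them.
FREE_RPD = 200
TOPPED_UP_RPD = 1000


@dataclass
class DailyBudget:
    """Hypothetical client-side counter for free-tier requests."""
    limit: int = FREE_RPD
    used: int = 0
    day: date = field(default_factory=date.today)

    def try_spend(self, today: Optional[date] = None) -> bool:
        """Record one request; return False once today's budget is used up."""
        today = today or date.today()
        if today != self.day:
            # Assumption: the budget resets when the calendar day changes.
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True
```

Calling `try_spend()` before each API request, and skipping or queueing the request when it returns `False`, avoids sending calls that would likely be rejected.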

API Compatibility: OpenAI SDK-compatible (Chat Completions)
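Because every model sits behind the same endpoint, switching models is a matter of changing the model string. One way to use that is a fallback loop that tries several model IDs in order. This is a minimal sketch: `first_successful` is a hypothetical helper, and `call` stands in for whatever function actually sends the request (for example a wrapper around `client.chat.completions.create`).

```python
from typing import Callable, Iterable, Tuple


def first_successful(models: Iterable[str], call: Callable[[str], str]) -> Tuple[str, str]:
    """Try each model ID in order and return (model_id, reply) for the first success."""
    errors = []
    for model in models:
        try:
            return model, call(model)
        except Exception as exc:  # e.g. rate limit or model unavailable
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

With the OpenAI client, `call` could be `lambda m: client.chat.completions.create(model=m, messages=msgs).choices[0].message.content`, where `client` and `msgs` are defined as in the per-model examples in this guide.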

Quick Start Guide

1. Sign up at openrouter.ai. No credit card is required; a $10 lifetime top-up raises the daily request limit 5× (from 200 to 1,000).
2. Go to the Keys page.
3. Create a new API key.
4. Find free models: look for the :free suffix in the model name, or browse the list below.
5. Configure the OpenAI client with the base URL https://openrouter.ai/api/v1.
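The suffix check from step 4 can be sketched in a few lines. The helper names `is_free` and `free_only` are hypothetical, and the candidate IDs are taken from the examples in this guide. Note that the check is conservative: some models listed here as free do not carry the suffix in their ID.

```python
FREE_SUFFIX = ":free"


def is_free(model_id: str) -> bool:
    """Return True when the model ID carries the :free suffix."""
    return model_id.endswith(FREE_SUFFIX)


def free_only(model_ids):
    """Keep only the IDs that carry the :free suffix, preserving order."""
    return [m for m in model_ids if is_free(m)]


candidates = [
    "nex-agi/nex-n2-pro:free",
    "minimax/minimax-m3",
    "google/gemma-4-31b-it:free",
]
print(free_only(candidates))
```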

All Free OpenRouter Models — Context Windows & Rate Limits

Model	Context	Max Output	Modality	Rate Limit	Released	Status
Nex AGI: Nex-N2-Pro (free)	262K	262K	text, image	See provider page	Jun 8, 2026	Online
NVIDIA: Nemotron 3.5 Content Safety (free)	128K	8K	text, image	See provider page	Jun 4, 2026	Online
NVIDIA: Nemotron 3 Ultra (free)	1.0M	66K	text	See provider page	Jun 4, 2026	Online
MiniMax: MiniMax M3	1.0M	512K	text, image	See provider page	May 31, 2026	Online
inclusionAI: Ring-2.6-1T	262K	66K	text	See provider page	May 8, 2026	Online
Owl Alpha	1.0M	262K	text	See provider page	Apr 28, 2026	Online
NVIDIA: Nemotron 3 Nano Omni (free)	256K	66K	text, image, audio	See provider page	Apr 28, 2026	Online
Poolside: Laguna XS.2 (free)	262K	33K	text	See provider page	Apr 28, 2026	Online
Poolside: Laguna M.1 (free)	262K	33K	text	See provider page	Apr 28, 2026	Online
DeepSeek: DeepSeek V4 Flash	1.0M	131K	text	See provider page	Apr 24, 2026	Online
MoonshotAI: Kimi K2.6 (free)	262K	8K	text, image	See provider page	Apr 20, 2026	Online
Z.ai: GLM 5.1	203K	8K	text	See provider page	Apr 7, 2026	Online
Google: Gemma 4 26B A4B (free)	262K	33K	text, image	See provider page	Apr 3, 2026	Online
Google: Gemma 4 31B (free)	262K	33K	text, image	See provider page	Apr 2, 2026	Online
Arcee AI: Trinity Large Thinking	262K	262K	text, reasoning	See provider page	Apr 1, 2026	Online
Google: Lyria 3 Pro Preview	1.0M	66K	text, image	See provider page	Mar 30, 2026	Online
Google: Lyria 3 Clip Preview	1.0M	66K	text, image	See provider page	Mar 30, 2026	Online
NVIDIA: Nemotron 3 Super (free)	1.0M	262K	text	See provider page	Mar 11, 2026	Online
MiniMax: MiniMax M2.5	205K	197K	text	See provider page	Feb 12, 2026	Online
Free Models Router	200K	8K	text, image	See provider page	Feb 1, 2026	Online
LiquidAI: LFM2.5-1.2B-Thinking (free)	33K	8K	text, reasoning	See provider page	Jan 20, 2026	Online
LiquidAI: LFM2.5-1.2B-Instruct (free)	33K	8K	text	See provider page	Jan 20, 2026	Online
NVIDIA: Nemotron 3 Nano 30B A3B (free)	256K	8K	text	See provider page	Dec 14, 2025	Online
OpenAI: gpt-oss-safeguard-20b	131K	66K	text	See provider page	Oct 29, 2025	Online
NVIDIA: Nemotron Nano 12B 2 VL (free)	128K	128K	text, image	See provider page	Oct 28, 2025	Online
Qwen: Qwen3 Next 80B A3B Instruct (free)	262K	8K	text	See provider page	Sep 11, 2025	Online
NVIDIA: Nemotron Nano 9B V2 (free)	128K	8K	text	See provider page	Sep 5, 2025	Online
OpenAI: gpt-oss-120b (free)	131K	131K	text	See provider page	Aug 5, 2025	Online
OpenAI: gpt-oss-20b (free)	131K	8K	text	See provider page	Aug 5, 2025	Online
Z.ai: GLM 4.5 Air (free)	131K	96K	text	See provider page	Jul 25, 2025	Online
Qwen: Qwen3 Coder 480B A35B (free)	1.0M	262K	text, code	See provider page	Feb 4, 2026	Online
Venice: Uncensored (free)	33K	8K	text	See provider page	—	Online
Meta: Llama 3.3 70B Instruct (free)	131K	8K	text	See provider page	Dec 6, 2024	Online
Meta: Llama 3.2 3B Instruct (free)	131K	8K	text	See provider page	Sep 25, 2024	Online
Nous: Hermes 3 405B Instruct (free)	131K	8K	text	See provider page	Aug 16, 2024	Online
Baidu Qianfan: CoBuddy	131K	65K	text, code	See provider page	—	Unavailable
NVIDIA: Llama Nemotron Embed VL 1B V2 (free)	131K	8K	text, image, embeddings	See provider page	Feb 25, 2026	Unavailable
Sourceful: Riverflow V2.5 Fast (free)	8K	8K	text, image	See provider page	Jun 4, 2026	Unavailable
Sourceful: Riverflow V2.5 Pro (free)	8K	8K	text, image	See provider page	Jun 4, 2026	Unavailable
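The Context and Max Output columns can be used to check whether a planned request fits a model before sending it. The sketch below copies a few rows from the table; `parse_tokens` and `fits` are hypothetical helpers. It assumes that "K" means 1,000 tokens (the real limits may be powers of two, such as 262,144) and that the context window must hold the prompt and the output together.

```python
def parse_tokens(label: str) -> int:
    """Convert a table label such as '262K' or '1.0M' to an approximate token count."""
    units = {"K": 1_000, "M": 1_000_000}
    return int(float(label[:-1]) * units[label[-1]])


# (context, max output) copied from a few rows of the table above.
MODELS = {
    "nvidia/nemotron-3-ultra-550b-a55b:free": ("1.0M", "66K"),
    "google/gemma-4-31b-it:free": ("262K", "33K"),
    "poolside/laguna-m.1:free": ("262K", "33K"),
    "nex-agi/nex-n2-pro:free": ("262K", "262K"),
}


def fits(model_id: str, prompt_tokens: int, output_tokens: int) -> bool:
    """Check a request against the listed limits (assumption: prompt + output share the context)."""
    context, max_out = (parse_tokens(v) for v in MODELS[model_id])
    return output_tokens <= max_out and prompt_tokens + output_tokens <= context
```

For example, `fits("google/gemma-4-31b-it:free", 200_000, 30_000)` passes both checks, while a 250,000-token prompt with the same output budget exceeds the listed context.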

Nex AGI: Nex-N2-Pro (free) text, image

Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active parameters out of 397B total. Built on the Qwen3.5 architecture, it accepts text and image input and produces text output, and supports reasoning, function calling, and structured outputs. It is designed for coding, tool use, deep research, and long-horizon agentic workflows, unifying planning, code implementation, debugging, and iteration into a single execution loop.

Context: 262K
Max Output: 262K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nex-agi/nex-n2-pro:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nex-agi/nex-n2-pro:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nex-agi/nex-n2-pro:free","messages":[{"role":"user","content":"Hello"}]}'
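For models that accept image input, the OpenAI Chat Completions format lets a user message carry a list of content parts instead of a plain string. The sketch below builds such a message; `image_message` is a hypothetical helper, the URL is a placeholder, and whether a given free model accepts this payload through OpenRouter is an assumption to verify on its provider page.

```python
def image_message(prompt: str, image_url: str) -> dict:
    """Build a user message with one text part and one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# Placeholder URL for illustration only.
messages = [image_message("Describe this picture.", "https://example.com/cat.png")]
```

The resulting `messages` list can be passed in place of the plain-text `messages` argument in the Python example above.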

NVIDIA: Nemotron 3.5 Content Safety (free) text, image

NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrail model from NVIDIA, fine-tuned from Google Gemma-3-4B. It moderates both inputs to and responses from LLMs and VLMs, accepting text and image input and returning text output: a safe/unsafe classification for the user prompt and the response, safety category labels, and an optional reasoning trace. It covers 12 languages with a context window of up to 128K tokens. It is suited for prompt and response moderation, content classification, safety pipelines, and enterprise AI guardrails with policy enforcement, and includes a togglable reasoning mode. It is part of the NVIDIA Nemotron family of open models for agentic AI.

Context: 128K
Max Output: 8K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat, Reasoning

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nvidia/nemotron-3.5-content-safety:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nvidia/nemotron-3.5-content-safety:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-3.5-content-safety:free","messages":[{"role":"user","content":"Hello"}]}'

NVIDIA: Nemotron 3 Ultra (free) text

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it supports text input and output with a context window of up to 1M tokens. It is suited for long-running agentic workflows, including agent orchestration, coding agents, deep research, and complex enterprise tasks. It is particularly strong at multi-step reasoning and planning, with high-throughput inference designed for high-volume agent pipelines. It is part of the NVIDIA Nemotron family of open models for agentic AI.

Context: 1.0M
Max Output: 66K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat, Reasoning

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nvidia/nemotron-3-ultra-550b-a55b:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nvidia/nemotron-3-ultra-550b-a55b:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-3-ultra-550b-a55b:free","messages":[{"role":"user","content":"Hello"}]}'

MiniMax: MiniMax M3 text, image

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding, and tool use. It is built on MiniMax Sparse Attention (MSA), which replaces full attention with KV-block selection to cut per-token compute at long context — roughly 1/20 the cost of the previous generation at 1M tokens, with substantially faster prefill and decode while retaining quality across most tasks. Trained as a native multimodal model on interleaved data and tuned for multi-turn, production-like collaboration via an interactive user-simulator framework, the model is oriented toward sustained, multi-step tasks rather than single-turn execution.

Context: 1.0M
Max Output: 512K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="minimax/minimax-m3",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "minimax/minimax-m3",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"minimax/minimax-m3","messages":[{"role":"user","content":"Hello"}]}'

inclusionAI: Ring-2.6-1T text

inclusionAI Ring-2.6-1T is a 1-trillion-parameter thinking model with 63B active parameters (MoE), available free on OpenRouter. Optimized for coding agents, tool use, and long-horizon task execution. Features adaptive reasoning with high and xhigh modes that dynamically adjust the reasoning budget based on task complexity — delivering stronger performance with lower token overhead in tool-heavy, multi-turn agent workflows. Strong results on PinchBench, ClawEval, TAU2-Bench, and GAIA2-search. 262K context window. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 262K
Max Output: 66K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="inclusionai/ring-2.6-1t",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "inclusionai/ring-2.6-1t",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"inclusionai/ring-2.6-1t","messages":[{"role":"user","content":"Hello"}]}'

Owl Alpha text

Owl Alpha is a high-performance foundation model designed for agentic workloads, available free on OpenRouter. Features a 1M context window — one of the largest of any model. Natively handles tool use, long-context tasks, code generation, automated workflows, and complex instruction execution. Compatible with Claude Code, OpenClaw, and other mainstream productivity tools. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD (or 1,000 with $10 lifetime credit). Note: prompts and completions may be logged by the provider for model improvement.

Context: 1.0M
Max Output: 262K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="openrouter/owl-alpha",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "openrouter/owl-alpha",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openrouter/owl-alpha","messages":[{"role":"user","content":"Hello"}]}'

NVIDIA: Nemotron 3 Nano Omni (free) text, image, audio

NVIDIA Nemotron 3 Nano Omni 30B A3B Reasoning is an open multimodal model with 30B total parameters (3B active via MoE), available free on OpenRouter. Accepts text, image, video, and audio input — built as a perception and context sub-agent for enterprise agent systems. Uses a Hybrid MoE Transformer-Mamba architecture with Conv3D video layers and Efficient Video Sampling (EVS) for ~2x higher throughput and 2.5x lower compute on video tasks vs. separate vision+speech pipelines. Up to 300K context, 16K reasoning budget. OpenAI-compatible. Free tier: 200 RPD.

Context: 256K
Max Output: 66K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat, Vision, Reasoning

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free","messages":[{"role":"user","content":"Hello"}]}'

Poolside: Laguna XS.2 (free) text

Poolside Laguna XS.2 is the second-generation efficient coding agent model from Poolside, available free on OpenRouter. Combines tool calling and reasoning capabilities with a compact footprint for software engineering and agentic coding workflows. 131K context window, up to 8K output. Quantized to fp8 for fast, cost-efficient inference. Released under Apache 2.0. Text/code-focused. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 262K
Max Output: 33K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="poolside/laguna-xs.2:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "poolside/laguna-xs.2:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"poolside/laguna-xs.2:free","messages":[{"role":"user","content":"Hello"}]}'

Poolside: Laguna M.1 (free) text

Poolside Laguna M.1 is the flagship coding agent model from Poolside, available free on OpenRouter. Designed for complex software engineering tasks with agentic coding workflows — supports tool calling and reasoning. Features a 131K context window with up to 8K output tokens. Quantized to fp8 for efficient inference. Users must agree to Poolside's End User License Agreement. Text/code-focused. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 262K
Max Output: 33K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="poolside/laguna-m.1:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "poolside/laguna-m.1:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"poolside/laguna-m.1:free","messages":[{"role":"user","content":"Hello"}]}'

DeepSeek: DeepSeek V4 Flash text

DeepSeek V4 Flash is a Mixture-of-Experts model with 284B total parameters (13B active per token), available free on OpenRouter. It features a 1M context window with hybrid attention for efficient long-context processing. Supports configurable reasoning effort (high/xhigh levels). Strong performance on coding, reasoning, and agent workflows. Model weights available on Hugging Face. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD (or 1,000 with $10 lifetime credit).

Context: 1.0M
Max Output: 131K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="deepseek/deepseek-v4-flash",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "deepseek/deepseek-v4-flash",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek/deepseek-v4-flash","messages":[{"role":"user","content":"Hello"}]}'

MoonshotAI: Kimi K2.6 (free) text, image

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and can convert prompts and visual inputs into production-ready interfaces. Its agent swarm architecture scales to hundreds of parallel sub-agents for autonomous task decomposition, delivering documents, websites, and spreadsheets in a single run without human oversight.

Context: 262K
Max Output: 8K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="moonshotai/kimi-k2.6:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "moonshotai/kimi-k2.6:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"moonshotai/kimi-k2.6:free","messages":[{"role":"user","content":"Hello"}]}'

Z.ai: GLM 5.1 text

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on a single task for more than 8 hours, autonomously planning, executing, and improving itself throughout the process, ultimately delivering complete, engineering-grade results.

Context: 203K
Max Output: 8K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="z-ai/glm-5.1",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "z-ai/glm-5.1",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"z-ai/glm-5.1","messages":[{"role":"user","content":"Hello"}]}'

Google: Gemma 4 26B A4B (free) text, image

Google Gemma 4 26B is a Mixture-of-Experts vision-language model with 25.2B total parameters (3.8B active per token), available free on OpenRouter. It supports multimodal input — text, images, and video (up to 60s at 1fps) — with a 262K context window. Includes native function calling, configurable thinking/reasoning mode, and structured output support. Delivers near-31B quality at a fraction of the compute. Apache 2.0 license. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 262K
Max Output: 33K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="google/gemma-4-26b-a4b-it:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "google/gemma-4-26b-a4b-it:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/gemma-4-26b-a4b-it:free","messages":[{"role":"user","content":"Hello"}]}'

Google: Gemma 4 31B (free) text, image

Google Gemma 4 31B is a dense 30.7B-parameter multimodal model from Google DeepMind, available free on OpenRouter. Supports text and image input with a 262K context window. Strong on coding, reasoning, and document understanding tasks. Includes native function calling, configurable thinking/reasoning mode, and multilingual support across 140+ languages. Apache 2.0 license. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 262K
Max Output: 33K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="google/gemma-4-31b-it:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "google/gemma-4-31b-it:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/gemma-4-31b-it:free","messages":[{"role":"user","content":"Hello"}]}'

Arcee AI: Trinity Large Thinking text, reasoning

Arcee AI Trinity Large Thinking is a powerful open-source reasoning model available free on OpenRouter. With a 262K context window, it is optimized for agentic workflows and performs well on PinchBench and reasoning benchmarks. Best results come from preserving interleaved thinking (chain-of-thought) tokens. Model weights are open on Hugging Face. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD (or 1,000 with $10 lifetime credit).

Context: 262K
Max Output: 262K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat, Reasoning

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="arcee-ai/trinity-large-thinking",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "arcee-ai/trinity-large-thinking",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"arcee-ai/trinity-large-thinking","messages":[{"role":"user","content":"Hello"}]}'

Google: Lyria 3 Pro Preview text, image

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48 kHz stereo audio from text prompts or from images. These models deliver structural coherence, including vocals, timed lyrics, and full instrumental arrangements. Lyria 3 Pro can generate full-length songs with verses, choruses, and bridges.

Context: 1.0M
Max Output: 66K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="google/lyria-3-pro-preview",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "google/lyria-3-pro-preview",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/lyria-3-pro-preview","messages":[{"role":"user","content":"Hello"}]}'

Google: Lyria 3 Clip Preview text, image

30-second clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48 kHz stereo audio from text prompts or from images. These models deliver structural coherence, including vocals, timed lyrics, and full instrumental arrangements. Lyria 3 Clip can generate short clips, loops, and previews.

Context: 1.0M
Max Output: 66K
Rate Limit: See provider page
Credit Card: Not required
OpenAI Compatible: Yes
Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="google/lyria-3-clip-preview",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "google/lyria-3-clip-preview",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"google/lyria-3-clip-preview","messages":[{"role":"user","content":"Hello"}]}'

NVIDIA: Nemotron 3 Super (free) text

NVIDIA Nemotron 3 Super 120B A12B is an open hybrid MoE model with 120B total parameters (12B active), available free on OpenRouter. Built on a hybrid Mamba-Transformer Mixture-of-Experts architecture with multi-token prediction (MTP), it delivers over 50% higher token generation speed compared to leading open models. Designed for multi-agent applications — long-term agent coherence, cross-document reasoning, and multi-step task planning. Trained with multi-environment RL across 10+ environments. Latent MoE calls 4 experts for the cost of one. Up to 1M context window. Fully open: weights, datasets, and recipes. OpenAI-compatible. Free tier: 200 RPD.

Context: 1.0M

Max Output: 262K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat, Reasoning

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nvidia/nemotron-3-super-120b-a12b:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nvidia/nemotron-3-super-120b-a12b:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-3-super-120b-a12b:free","messages":[{"role":"user","content":"Hello"}]}'

MiniMax: MiniMax M2.5 text

MiniMax M2.5 is a SOTA large language model designed for real-world productivity, available free on OpenRouter. It builds on M2.1's coding strengths and extends into general office tasks — handling Word, Excel, and PowerPoint files, switching between software environments, and working across teams. Achieved 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp. 197K context window. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 205K

Max Output: 197K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="minimax/minimax-m2.5",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "minimax/minimax-m2.5",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"minimax/minimax-m2.5","messages":[{"role":"user","content":"Hello"}]}'

Free Models Router text, image

OpenRouter Free Models Router dynamically selects from 25 free models at random, smartly filtering for models that support the features needed for each request — image understanding, tool calling, and structured outputs. Rather than locking into a single model, it routes to the best available free model at request time. 200K context window. OpenAI-compatible via OpenRouter. Free tier: 200 RPD (or 1,000 with $10 lifetime credit).

Context: 200K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="openrouter/free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "openrouter/free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openrouter/free","messages":[{"role":"user","content":"Hello"}]}'
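Because openrouter/free picks a model per request, it can help to log which model actually answered. The sketch below assumes the reply follows the OpenAI Chat Completions shape and that its top-level model field carries the ID of the model that served the request. The sample response is hand-written for illustration, and served_model and first_reply are hypothetical helper names.

```python
import json

# Hypothetical response body shaped like an OpenAI Chat Completions reply.
# The model ID is illustrative, not captured from a real call.
SAMPLE_RESPONSE = json.loads("""
{
  "id": "gen-example",
  "model": "meta-llama/llama-3.3-70b-instruct:free",
  "choices": [
    {"index": 0, "message": {"role": "assistant", "content": "Hi there!"}}
  ]
}
""")


def served_model(response: dict) -> str:
    """Return the ID of the model that produced the reply, or 'unknown'."""
    return response.get("model", "unknown")


def first_reply(response: dict) -> str:
    """Return the text of the first choice, or an empty string if absent."""
    choices = response.get("choices") or []
    if not choices:
        return ""
    return choices[0].get("message", {}).get("content") or ""


print(served_model(SAMPLE_RESPONSE), "->", first_reply(SAMPLE_RESPONSE))
```

Logging the served model alongside each reply makes it easier to explain differences in answer quality between otherwise identical requests.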

LiquidAI: LFM2.5-1.2B-Thinking (free) text, reasoning

Liquid LFM 2.5 1.2B Thinking is a compact reasoning model optimized for agentic tasks, data extraction, and RAG, available free on OpenRouter. Despite its small 1.2B parameter footprint, it produces high-quality chain-of-thought responses and runs comfortably on edge devices. 33K context window, supporting long-context reasoning in a lightweight package. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 33K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat, Reasoning

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="liquid/lfm-2.5-1.2b-thinking:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "liquid/lfm-2.5-1.2b-thinking:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"liquid/lfm-2.5-1.2b-thinking:free","messages":[{"role":"user","content":"Hello"}]}'
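With a 33K context window and 8K max output, long prompts can crowd out the reply. The sketch below is a rough pre-flight check. It assumes the prompt and the reserved output must both fit inside the context window, and it estimates tokens at about 4 characters each, which is only a heuristic (a real tokenizer will give different counts). The constants are rounded from the listing, and the function names are hypothetical.

```python
CONTEXT_TOKENS = 33_000     # "Context 33K" from the listing (rounded)
MAX_OUTPUT_TOKENS = 8_000   # "Max Output 8K" from the listing (rounded)
CHARS_PER_TOKEN = 4         # assumption: rough average for English text


def estimate_tokens(text: str) -> int:
    """Very rough token estimate; a real tokenizer will differ."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def fits_in_context(prompt: str, max_tokens: int = MAX_OUTPUT_TOKENS) -> bool:
    """True if the estimated prompt plus the reserved output fits the window."""
    return estimate_tokens(prompt) + max_tokens <= CONTEXT_TOKENS


print(fits_in_context("Summarize this paragraph in one sentence."))
```

If the check fails, trimming the prompt or lowering the requested output length are the two obvious levers.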

LiquidAI: LFM2.5-1.2B-Instruct (free) text

Liquid LFM 2.5 1.2B Instruct is a compact, high-performance instruction-tuned model designed for fast on-device AI, available free on OpenRouter. With a 1.2B parameter footprint and 33K context window, it delivers strong chat quality for edge deployment scenarios. Built for efficient inference across a broad range of runtimes. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 33K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="liquid/lfm-2.5-1.2b-instruct:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "liquid/lfm-2.5-1.2b-instruct:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"liquid/lfm-2.5-1.2b-instruct:free","messages":[{"role":"user","content":"Hello"}]}'

NVIDIA: Nemotron 3 Nano 30B A3B (free) text

NVIDIA Nemotron 3 Nano 30B A3B is a Mixture-of-Experts language model with 30B total parameters (3B active), available free on OpenRouter. Designed for highest compute efficiency and accuracy in agentic AI systems. Features a 262K context window — exceptionally long for its size class. Fully open with weights, datasets, and recipes under the NVIDIA Open License for custom deployment. Also available directly on NVIDIA NIM. OpenAI-compatible. Free tier: 200 RPD.

Context: 256K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat, Reasoning

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nvidia/nemotron-3-nano-30b-a3b:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nvidia/nemotron-3-nano-30b-a3b:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-3-nano-30b-a3b:free","messages":[{"role":"user","content":"Hello"}]}'

OpenAI: gpt-oss-safeguard-20b text

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers lower latency for safety tasks like content classification, LLM filtering, and trust & safety labeling. Learn more about this model in OpenAI's gpt-oss-safeguard user guide: https://cookbook.openai.com/articles/gpt-oss-safeguard-guide

Context: 131K

Max Output: 66K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat, Coding

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="openai/gpt-oss-safeguard-20b",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "openai/gpt-oss-safeguard-20b",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-oss-safeguard-20b","messages":[{"role":"user","content":"Hello"}]}'

NVIDIA: Nemotron Nano 12B 2 VL (free) text, image

NVIDIA Nemotron Nano 12B V2 VL is a 12B-parameter multimodal reasoning model available free on OpenRouter. Uses a hybrid Transformer-Mamba architecture for better throughput and lower latency. Handles text and multi-image document inputs with 128K context. Optimized for optical character recognition (OCR), chart reasoning, and multimodal comprehension. Supports long-form video via Efficient Video Sampling (EVS). Achieves ~74 average across MMMU, MathVista, AI2D, OCRBench, ChartQA, DocVQA, and Video-MME. Open-weights under NVIDIA license. Also available on NVIDIA NIM. OpenAI-compatible. Free tier: 200 RPD.

Context: 128K

Max Output: 128K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat, Reasoning

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nvidia/nemotron-nano-12b-v2-vl:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nvidia/nemotron-nano-12b-v2-vl:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-nano-12b-v2-vl:free","messages":[{"role":"user","content":"Hello"}]}'
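The "Hello" snippets above send text only, while this model is described as handling multi-image document inputs. The sketch below shows one way to build such a request, assuming the endpoint accepts the OpenAI-style content-parts format (a text part followed by image_url parts). The URLs are placeholders, and build_multi_image_message is a hypothetical helper.

```python
def build_multi_image_message(question: str, image_urls: list) -> dict:
    """Build one user message with a text part followed by image parts."""
    content = [{"type": "text", "text": question}]
    for url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}


# Placeholder URLs; replace with real, publicly reachable document images.
message = build_multi_image_message(
    "Compare the totals on these two invoices.",
    ["https://example.com/invoice-1.png", "https://example.com/invoice-2.png"],
)

payload = {
    "model": "nvidia/nemotron-nano-12b-v2-vl:free",
    "messages": [message],
}
print(len(message["content"]), "content parts")
```

The resulting messages list can be passed to client.chat.completions.create in place of the plain-text message used in the Python snippet.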

Qwen: Qwen3 Next 80B A3B Instruct (free) text

Qwen3 Next 80B A3B Instruct is a Mixture-of-Experts chat model from Alibaba's Qwen team, available free on OpenRouter. With 80B total parameters (3B active), it is optimized for fast, stable responses without chain-of-thought traces — returning only final answers. Designed for production settings where deterministic, instruction-following outputs are preferred. Handles reasoning, code generation, knowledge QA, and multilingual tasks across a 262K context window. Supports RAG, tool use, and agentic workflows. Model weights on Hugging Face. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 262K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="qwen/qwen3-next-80b-a3b-instruct:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "qwen/qwen3-next-80b-a3b-instruct:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen/qwen3-next-80b-a3b-instruct:free","messages":[{"role":"user","content":"Hello"}]}'

NVIDIA: Nemotron Nano 9B V2 (free) text

NVIDIA Nemotron Nano 9B V2 is a 9B-parameter language model trained from scratch by NVIDIA, available free on OpenRouter. Designed as a unified model for both reasoning and non-reasoning tasks — it generates a reasoning trace before delivering a final answer, with the ability to skip intermediate reasoning via system prompt configuration. This makes it a single model that handles both chain-of-thought and direct-response tasks. 131K context window. Text-only. Also available directly on NVIDIA NIM. OpenAI-compatible. Free tier: 200 RPD.

Context: 128K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat, Reasoning

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nvidia/nemotron-nano-9b-v2:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nvidia/nemotron-nano-9b-v2:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/nemotron-nano-9b-v2:free","messages":[{"role":"user","content":"Hello"}]}'
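The description says the reasoning trace can be skipped through system prompt configuration. The sketch below shows that toggle, assuming the control strings are /think and /no_think. That is an assumption drawn from NVIDIA's usage notes for this model family, so confirm it on the model page before relying on it. build_messages is a hypothetical helper.

```python
def build_messages(user_text: str, reasoning: bool) -> list:
    """Prefix the conversation with a system prompt that toggles reasoning."""
    # Assumed control strings; verify against the model card.
    control = "/think" if reasoning else "/no_think"
    return [
        {"role": "system", "content": control},
        {"role": "user", "content": user_text},
    ]


direct = build_messages("What is the capital of France?", reasoning=False)
traced = build_messages("Is 1001 a prime number?", reasoning=True)
print(direct[0]["content"], traced[0]["content"])
```

Either list can be passed as the messages argument in the Python snippet above, so one model ID covers both direct-response and chain-of-thought calls.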

OpenAI: gpt-oss-120b (free) text

GPT-OSS 120B is OpenAI's open-weight 117B-parameter Mixture-of-Experts model with 5.1B active parameters per forward pass, available free on OpenRouter (also on Groq, Cerebras, and Cloudflare). Supports configurable reasoning depth, full chain-of-thought access, and native tool use — function calling, browsing, and structured outputs. Optimized to run on a single H100 GPU with native MXFP4 quantization. Built for reasoning-heavy and agentic tasks. 131K context window. Text-only. OpenAI-compatible. Free tier: 200 RPD on OpenRouter.

Context: 131K

Max Output: 131K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat, Coding

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="openai/gpt-oss-120b:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "openai/gpt-oss-120b:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-oss-120b:free","messages":[{"role":"user","content":"Hello"}]}'
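Configurable reasoning depth is typically set per request. The sketch below builds such a request body, assuming OpenRouter accepts a reasoning object with an effort field taking low, medium, or high. Treat the field name and values as an assumption to check against the API reference. build_reasoning_payload is a hypothetical helper.

```python
import json

ALLOWED_EFFORTS = ("low", "medium", "high")


def build_reasoning_payload(prompt: str, effort: str = "medium") -> dict:
    """Return a Chat Completions body that requests a given reasoning depth."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {ALLOWED_EFFORTS}, got {effort!r}")
    return {
        "model": "openai/gpt-oss-120b:free",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning": {"effort": effort},  # assumed OpenRouter request field
    }


payload = build_reasoning_payload("Plan a three-step database migration.", effort="high")
print(json.dumps(payload, indent=2))
```

With the OpenAI Python SDK, a non-standard field like this is usually passed through the extra_body argument rather than as a named parameter.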

OpenAI: gpt-oss-20b (free) text

GPT-OSS 20B is OpenAI's open-weight 21B-parameter Mixture-of-Experts model with 3.6B active per forward pass, available free on OpenRouter (also on Groq, Cerebras, and Cloudflare). Supports reasoning level configuration, fine-tuning, and agentic capabilities — function calling, tool use, and structured outputs. Trained in OpenAI's Harmony response format. Released under Apache 2.0. The low active parameter count enables lower-latency inference on consumer or single-GPU hardware. 131K context window. Text-only. OpenAI-compatible. Free tier: 200 RPD on OpenRouter.

Context: 131K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat, Coding

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="openai/gpt-oss-20b:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "openai/gpt-oss-20b:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-oss-20b:free","messages":[{"role":"user","content":"Hello"}]}'
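Function calling takes two pieces: a tool schema sent with the request, and local code that runs whatever call the model returns. The sketch below assumes the OpenAI-style tools format. get_weather is a made-up tool, and sample_call is a hand-written stand-in for a tool call from an assistant message, not real model output.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; returns canned data for the example."""
    return f"Sunny in {city}"


# Tool schema to send in the request's "tools" field.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool call and return the 'tool' message to send back."""
    fn = tool_call["function"]
    args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
    result = LOCAL_TOOLS[fn["name"]](**args)
    return {"role": "tool", "tool_call_id": tool_call["id"], "content": result}


# Hand-written example of a tool call as it might appear in an assistant message.
sample_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Oslo\"}"},
}
print(run_tool_call(sample_call))
```

The returned tool message is appended to the conversation and sent in a follow-up request so the model can write its final answer.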

Z.ai: GLM 4.5 Air (free) text

GLM-4.5 Air is the lightweight Mixture-of-Experts variant of Zhipu AI's (Z.ai) flagship GLM model family, available free on OpenRouter. Purpose-built for agent-centric applications with 131K context window. Supports hybrid inference modes — a thinking mode for reasoning and tool use, plus a non-thinking mode for real-time interaction, toggled via a reasoning.enabled parameter. Compact MoE architecture for efficient deployment. Text-only. Also available directly from Z AI's platform. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 131K

Max Output: 96K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="z-ai/glm-4.5-air:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "z-ai/glm-4.5-air:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"z-ai/glm-4.5-air:free","messages":[{"role":"user","content":"Hello"}]}'
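The thinking and non-thinking modes are described as toggled via a reasoning.enabled parameter. The sketch below builds a request body with that toggle, assuming the parameter is sent as a nested reasoning object in the JSON body. build_glm_payload is a hypothetical helper.

```python
import json


def build_glm_payload(prompt: str, thinking: bool) -> dict:
    """Return a Chat Completions body with GLM's thinking mode on or off."""
    return {
        "model": "z-ai/glm-4.5-air:free",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning": {"enabled": thinking},  # assumed nesting of reasoning.enabled
    }


fast = build_glm_payload("Translate 'good morning' to French.", thinking=False)
deep = build_glm_payload("Outline a plan to refactor a 10-file module.", thinking=True)
print(json.dumps(fast, indent=2))
```

Non-thinking mode suits real-time interaction, while thinking mode trades latency for reasoning and tool use.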

Qwen: Qwen3 Coder 480B A35B (free) text, code

Qwen3 Coder 480B A35B is a Mixture-of-Experts code generation model from Alibaba's Qwen team, available free on OpenRouter. With 480B total parameters and 35B active per forward pass (8 of 160 experts), it is optimized for agentic coding tasks — function calling, tool use, and long-context reasoning over repositories. 262K context window. One of the most capable free coding models on OpenRouter. Model weights available on Hugging Face. Text/code. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 1.0M

Max Output: 262K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat, Coding

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="qwen/qwen3-coder:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "qwen/qwen3-coder:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen/qwen3-coder:free","messages":[{"role":"user","content":"Hello"}]}'
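Several Mixture-of-Experts listings quote both total and active parameter counts. This sketch works out what share of each model is active per forward pass, using only the figures quoted in the descriptions above. The dictionary keys are shortened labels chosen for this example.

```python
# (total, active) parameter counts in billions, as quoted in the listings.
MODELS = {
    "Nemotron 3 Super": (120, 12),
    "Nemotron 3 Nano 30B A3B": (30, 3),
    "Qwen3 Next 80B A3B": (80, 3),
    "gpt-oss-120b": (117, 5.1),
    "gpt-oss-20b": (21, 3.6),
    "Qwen3 Coder 480B A35B": (480, 35),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters used per forward pass."""
    return active_b / total_b


# Print models from the smallest to the largest active share.
for name, (total, active) in sorted(
    MODELS.items(), key=lambda kv: active_fraction(*kv[1])
):
    print(f"{name}: {active}B of {total}B active ({active_fraction(total, active):.1%})")
```

A lower active share generally means less compute per token relative to model size, though it says nothing by itself about output quality.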

Venice: Uncensored (free) text

Venice: Uncensored is a free text model on OpenRouter suited to chat tasks. It accepts up to 32,768 input tokens and produces up to 8,192 output tokens. It is OpenAI-compatible and does not require a credit card, which suits developers working within free-tier rate limits.

Context: 33K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="cognitivecomputations/dolphin-mistral-24b-venice-edition:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "cognitivecomputations/dolphin-mistral-24b-venice-edition:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"cognitivecomputations/dolphin-mistral-24b-venice-edition:free","messages":[{"role":"user","content":"Hello"}]}'

Meta: Llama 3.3 70B Instruct (free) text

Meta Llama 3.3 70B Instruct is Meta's flagship open-weight multilingual dialogue model, available free on OpenRouter (also widely hosted on Groq, NVIDIA NIM, Cerebras, and Cloudflare). With a 131K context window, it outperforms many open and closed chat models on common industry benchmarks. Supports 8 languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. Text-only. Model weights available on Hugging Face. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 131K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "meta-llama/llama-3.3-70b-instruct:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"meta-llama/llama-3.3-70b-instruct:free","messages":[{"role":"user","content":"Hello"}]}'

Meta: Llama 3.2 3B Instruct (free) text

Meta Llama 3.2 3B Instruct is a 3-billion-parameter multilingual model available free on OpenRouter. Trained on 9 trillion tokens, it excels in instruction-following, complex reasoning, and tool use across 8 languages (English, Spanish, Hindi, and 5 others). 131K context window. One of the smaller models in Meta's Llama 3.2 family, optimized for accuracy and efficiency in text generation. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 131K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="meta-llama/llama-3.2-3b-instruct:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "meta-llama/llama-3.2-3b-instruct:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"meta-llama/llama-3.2-3b-instruct:free","messages":[{"role":"user","content":"Hello"}]}'

Nous: Hermes 3 405B Instruct (free) text

Nous Research Hermes 3 405B is a frontier-level, full-parameter finetune of Llama 3.1 405B, available free on OpenRouter. It builds on Hermes 2 with advanced agentic capabilities, roleplaying, multi-turn conversation, and long-context coherence. Features powerful and reliable function calling, structured output capabilities, and improved code generation. Emphasizes user alignment — providing strong steering control to the end user. 131K context window. Text-only. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 131K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="nousresearch/hermes-3-llama-3.1-405b:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "nousresearch/hermes-3-llama-3.1-405b:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nousresearch/hermes-3-llama-3.1-405b:free","messages":[{"role":"user","content":"Hello"}]}'

Baidu Qianfan: CoBuddy text, code

Baidu CoBuddy is a code generation model optimized for coding tasks and AI agent workflows, available free on OpenRouter. It features a 131K context window with up to 65K output tokens — one of the largest output windows among free coding models. Runs on fp8 quantization for high throughput and low latency. Supports native tool calling and reasoning. OpenAI-compatible via OpenRouter. Free tier: 200 RPD (or 1,000 with $10 lifetime credit).

Context: 131K

Max Output: 65K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Coding, Chat, Reasoning

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="baidu/cobuddy",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "baidu/cobuddy",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"baidu/cobuddy","messages":[{"role":"user","content":"Hello"}]}'
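The free tier allows 200 requests per day, or 1,000 with the $10 lifetime credit, so a client-side counter can stop a script before it hits the limit. The sketch below is a minimal version. It assumes the quota resets on the local calendar date, which may not match how the server counts, so the server remains the source of truth. DailyBudget is a hypothetical class name.

```python
from datetime import date
from typing import Optional

FREE_RPD = 200          # free-tier requests per day, per the listing
TOPPED_UP_RPD = 1_000   # with the $10 lifetime credit, per the listing


class DailyBudget:
    """Client-side request counter that resets when the date changes."""

    def __init__(self, limit: int = FREE_RPD):
        self.limit = limit
        self.day: Optional[date] = None
        self.used = 0

    def try_spend(self, today: Optional[date] = None) -> bool:
        """Record one request; return False if today's budget is exhausted."""
        today = today or date.today()
        if today != self.day:       # new day: reset the counter
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


budget = DailyBudget(limit=FREE_RPD)
if budget.try_spend():
    print("OK to send request", budget.used, "of", budget.limit)
```

Calling try_spend before each API request, and skipping the call when it returns False, keeps a batch job inside the daily allowance.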

NVIDIA: Llama Nemotron Embed VL 1B V2 (free) text, image, embeddings

NVIDIA Llama Nemotron Embed VL 1B V2 is a compact multimodal retrieval model available free on OpenRouter. Optimized for multimodal question-answering retrieval — it embeds documents as image, text, or both, retrievable via text query. Supports images containing text, tables, charts, and infographics. 131K context window, 1B parameters. Not a chat model — purpose-built for embedding and retrieval pipelines. OpenAI-compatible via OpenRouter. Free tier: 200 RPD.

Context: 131K

Max Output: 8K

Rate Limit: See provider page

Credit Card: Not required

OpenAI Compatible: Yes

Best For: Embedding, Retrieval

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
# Embedding model: call the embeddings endpoint rather than chat completions
response = client.embeddings.create(
    model="nvidia/llama-nemotron-embed-vl-1b-v2-20260224:free",
    input="Hello"
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
// Embedding model: call the embeddings endpoint rather than chat completions
const response = await client.embeddings.create({
  model: "nvidia/llama-nemotron-embed-vl-1b-v2-20260224:free",
  input: "Hello"
});

cURL

curl https://openrouter.ai/api/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/llama-nemotron-embed-vl-1b-v2-20260224:free","input":"Hello"}'
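In a retrieval pipeline, the embedding vectors are compared rather than read. The sketch below ranks documents against a query by cosine similarity. The three-dimensional vectors are toy values invented for the example (real embeddings have many more dimensions), and cosine and rank are hypothetical helper names.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec: list, doc_vecs: dict) -> list:
    """Return document IDs ordered from most to least similar to the query."""
    return sorted(doc_vecs, key=lambda k: cosine(query_vec, doc_vecs[k]), reverse=True)


# Toy vectors standing in for embeddings returned by the API.
query = [1.0, 0.0, 0.0]
docs = {
    "chart.png": [0.9, 0.1, 0.0],
    "memo.txt": [0.0, 1.0, 0.0],
    "table.png": [0.5, 0.5, 0.0],
}
print(rank(query, docs))
```

In practice the query vector comes from embedding the text query, and the document vectors from embedding each page or image once and storing the result.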

Sourceful: Riverflow V2.5 Fast (free) text, image

Riverflow V2.5 Fast is the speed-optimized variant of Sourceful's Riverflow 2.5 lineup, best for production deployments and latency-critical workflows. The Riverflow 2.5 series is a unified text-to-image and image-to-image family that treats generation as a production workflow, using an integrated reasoning model to plan multi-step edits and judge candidates before accepting a result. Reasoning effort is controllable via the reasoning parameter (low/medium/high): higher levels do more editing passes and apply a stricter internal judge, while lower levels return faster for early exploration. It generates at 1K and 2K resolution (no 4K) and accepts up to 4 input images for editing. Pricing is dynamic: cost is finalized per job at completion based on billable processing, so it scales with reasoning effort, resolution, and editing complexity rather than a fixed per-image rate.

Additional features (via image_config):
- Custom font rendering via font_inputs (max 2) to match brand lettering, spacing, and weight
- Custom scoring via scoring_prompt and scoring_rubric, so the reasoning model evaluates and steers each candidate against the criteria you care about
- Background control via background_mode (original, transparent, solid) and background_hex_color

See the image generation docs for details: https://openrouter.ai/docs/features/multimodal/image-generation

Note: Sourceful imposes a 4.5MB request size limit, so it is highly recommended to pass image URLs instead of Base64 data.

Context 8K

Max Output 8K

Rate Limit See provider page

Credit Card Not required

OpenAI Compatible Yes

Best For Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="sourceful/riverflow-v2.5-fast-20260605:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "sourceful/riverflow-v2.5-fast-20260605:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"sourceful/riverflow-v2.5-fast-20260605:free","messages":[{"role":"user","content":"Hello"}]}'
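
The "Hello" requests above only confirm that the key and model ID work. The following is a minimal sketch of a request body for an actual image job, not a confirmed schema: it assumes the chat completions endpoint accepts a modalities field for image output and a reasoning object with an effort value, and it takes the image_config key from the description above. Check the linked image generation docs before relying on any of these shapes. build_image_request is a hypothetical helper name.

```python
import json

ALLOWED_EFFORT = {"low", "medium", "high"}  # Fast variant, per the description above
MAX_INPUT_IMAGES = 4                         # Fast variant accepts up to 4 input images


def build_image_request(prompt, image_urls=(), effort="low", background_mode=None):
    """Build a chat-completions body for an image job (hypothetical helper)."""
    if effort not in ALLOWED_EFFORT:
        raise ValueError(f"effort must be one of {sorted(ALLOWED_EFFORT)}")
    if len(image_urls) > MAX_INPUT_IMAGES:
        raise ValueError(f"at most {MAX_INPUT_IMAGES} input images")

    # Pass image URLs rather than Base64 data to stay under the request size limit.
    content = [{"type": "text", "text": prompt}]
    content += [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]

    body = {
        "model": "sourceful/riverflow-v2.5-fast-20260605:free",
        "messages": [{"role": "user", "content": content}],
        "modalities": ["image", "text"],  # assumption: verify in the image generation docs
        "reasoning": {"effort": effort},  # assumption: the exact shape may differ
    }
    if background_mode is not None:
        # Key name taken from the model description; nesting is an assumption.
        body["image_config"] = {"background_mode": background_mode}
    return body


print(json.dumps(build_image_request("A red kettle on a white table"), indent=2))
```

The resulting dictionary can be sent as the JSON body of the same cURL request shown above, in place of the "Hello" payload.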

Sourceful: Riverflow V2.5 Pro (free) textimage

Riverflow V2.5 Pro is the most powerful variant of Sourceful's Riverflow 2.5 lineup, best for top-tier control and quality-sensitive outputs. The Riverflow 2.5 series is a unified text-to-image and image-to-image family that treats generation as a production workflow, using an integrated reasoning model to plan multi-step edits and judge candidates before accepting a result.

Reasoning effort is controllable via the reasoning parameter (low/medium/high/xhigh) - higher levels do more editing passes and apply a stricter internal judge, with xhigh suited to batch runs that need high repeatability. It generates at 1K, 2K, and 4K resolution and accepts up to 10 input images for editing. Pricing is dynamic: cost is finalized per job at completion based on billable processing, so it scales with reasoning effort, resolution, and editing complexity rather than a fixed per-image rate.

Additional features (via image_config):
- Custom font rendering via font_inputs (max 2) to match brand lettering, spacing, and weight
- Custom scoring via scoring_prompt and scoring_rubric, so the reasoning model evaluates and steers each candidate against the criteria you care about
- Background control via background_mode (original, transparent, solid) and background_hex_color

See the image generation docs for details: https://openrouter.ai/docs/features/multimodal/image-generation

Note: Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.

Context 8K

Max Output 8K

Rate Limit See provider page

Credit Card Not required

OpenAI Compatible Yes

Best For Chat

Python

from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY"  # Get at https://openrouter.ai/workspaces/default/keys
)
response = client.chat.completions.create(
    model="sourceful/riverflow-v2.5-pro-20260605:free",
    messages=[{"role": "user", "content": "Hello"}]
)

JavaScript

import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.API_KEY  // Get at https://openrouter.ai/workspaces/default/keys
});
const response = await client.chat.completions.create({
  model: "sourceful/riverflow-v2.5-pro-20260605:free",
  messages: [{ role: "user", content: "Hello" }]
});

cURL

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"sourceful/riverflow-v2.5-pro-20260605:free","messages":[{"role":"user","content":"Hello"}]}'
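
The 4.5MB request limit noted in the description is easy to hit with inline images, because Base64 encoding turns every 3 raw bytes into 4 characters. A rough sketch of that arithmetic follows. It assumes "4.5MB" means 4,500,000 bytes and reserves an arbitrary 2,048 bytes for the surrounding JSON; both figures are assumptions, and fits_inline is a hypothetical helper.

```python
import math

REQUEST_LIMIT_BYTES = 4_500_000  # assumption: "4.5MB" read as decimal megabytes


def base64_length(raw_bytes: int) -> int:
    """Length in characters of the padded Base64 encoding of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


def fits_inline(image_sizes, overhead_bytes=2048):
    """True if the images, once Base64-encoded, fit in one request body."""
    total = sum(base64_length(n) for n in image_sizes) + overhead_bytes
    return total <= REQUEST_LIMIT_BYTES


print(base64_length(3_000_000))      # 4000000
print(fits_inline([3_000_000]))      # True
print(fits_inline([2_000_000] * 2))  # False
```

Under these assumptions, two 2 MB images already exceed the limit when sent inline, which is why passing URLs is the safer default for multi-image edits.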

Free Tier Pricing & Rate Limits

Credit Card Not required

Free Tier Permanently free

Context Range 8K – 1.0M

Total Models 39 free

API Compatibility OpenAI SDK-compatible (Chat Completions)

Tool Compatibility — Configure OpenRouter with Your AI Tools

OpenRouter works with popular AI coding tools. Here's how to configure each one:

Claude Code Anthropic CLI coding agent Compatible

# Claude Code works via OpenRouter's Anthropic-compatible API.
# Note: Only paid Anthropic Claude models are supported (e.g. claude-sonnet-4.6, claude-opus-4).
# Browse available Claude models at: https://openrouter.ai/models?q=anthropic

# Add to ~/.zshrc or ~/.bashrc
export OPENROUTER_API_KEY="<your-openrouter-api-key>"  # Get at https://openrouter.ai/settings/keys
export ANTHROPIC_BASE_URL="https://openrouter.ai/api"
export ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY"
export ANTHROPIC_API_KEY=""  # Must be explicitly empty to avoid conflicts

# Optional: pin specific models for each role
# export ANTHROPIC_DEFAULT_SONNET_MODEL="anthropic/claude-sonnet-4.6"
# export ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4.5"

# Then simply run: claude

See full Claude Code configuration guide →

Cursor AI-first code editor Compatible

# Cursor → Settings (⚙️) → Models → Add Model
# Enter the model name exactly as shown, then fill in:
#   Override OpenAI Base URL: https://openrouter.ai/api/v1
#   OpenAI API Key: <your-api-key>   # Get at https://openrouter.ai/workspaces/default/keys
# Click "Verify" to confirm the connection, then enable the model.
#
# Model name to add: nex-agi/nex-n2-pro:free

See full Cursor configuration guide →

Codex OpenAI CLI coding agent Compatible

# Add to ~/.zshrc or ~/.bashrc
export OPENAI_BASE_URL="https://openrouter.ai/api/v1"
export OPENAI_API_KEY="<your-api-key>"  # Get at https://openrouter.ai/workspaces/default/keys

# Then run:
codex --model "nex-agi/nex-n2-pro:free"

See full Codex configuration guide →

Gemini CLI Google Gemini CLI tool Compatible

# ~/.gemini/settings.json
{
  "apiKey": "<your-api-key>",
  "model": "nex-agi/nex-n2-pro:free"
}
# Get API key at https://openrouter.ai/workspaces/default/keys

See full Gemini CLI configuration guide →

OpenCode Open-source AI coding agent Compatible

// ~/.config/opencode/opencode.json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "free-llm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Free LLM",
      "options": {
        "baseURL": "https://openrouter.ai/api/v1",
        "apiKey": "<your-api-key>"
      },
      "models": {
        "nex-agi/nex-n2-pro:free": { "name": "nex-agi/nex-n2-pro:free" }
      }
    }
  }
}
// Get API key at https://openrouter.ai/workspaces/default/keys

See full OpenCode configuration guide →
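
To register several free models at once, the OpenCode config above can be generated rather than edited by hand. This is a sketch that only reproduces the structure shown above for a list of model IDs; opencode_config is a hypothetical helper, and the provider key "free-llm" is simply the one used in the example.

```python
import json


def opencode_config(model_ids, api_key="<your-api-key>"):
    """Return an OpenCode config dict mirroring the example above."""
    return {
        "$schema": "https://opencode.ai/config.json",
        "provider": {
            "free-llm": {
                "npm": "@ai-sdk/openai-compatible",
                "name": "Free LLM",
                "options": {
                    "baseURL": "https://openrouter.ai/api/v1",
                    "apiKey": api_key,
                },
                # One entry per model ID, using the ID as the display name.
                "models": {m: {"name": m} for m in model_ids},
            }
        },
    }


print(json.dumps(opencode_config(["nex-agi/nex-n2-pro:free"]), indent=2))
```

Redirect the output to ~/.config/opencode/opencode.json once the placeholder key has been replaced.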

Hermes AI coding agent Compatible

# Step 1 — Edit config.yaml
# Windows: C:\Users\<you>\AppData\Local\hermes\config.yaml
# macOS/Linux: ~/.config/hermes/config.yaml

model:
  default: nex-agi/nex-n2-pro:free
  provider: custom
  base_url: ${CUSTOM_BASE_URL}
  api_key: ${CUSTOM_API_KEY}
  model_aliases:
    nex-agi/nex-n2-pro:free:
      model: "nex-agi/nex-n2-pro:free"
      provider: "custom"

# Step 2 — Edit .env (same directory as config.yaml)
# Windows: C:\Users\<you>\AppData\Local\hermes\.env
# macOS/Linux: ~/.config/hermes/.env

# ========================
# Custom API (OpenAI-compatible)
# ========================
CUSTOM_API_KEY=<your-api-key>        # Get at https://openrouter.ai/workspaces/default/keys
CUSTOM_BASE_URL=https://openrouter.ai/api/v1

See full Hermes configuration guide →

OpenClaw AI coding agent (messaging gateway) Compatible

// ~/.openclaw/openclaw.json  (JSON5 format)
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "nex-agi/nex-n2-pro:free",
      },
    },
  },
  "models": {
    "providers": {
      // Option A — Built-in provider (OpenAI, Anthropic, Google…)
      // Just add apiKey; OpenClaw handles the baseUrl automatically
      // "openai": { "apiKey": "<your-api-key>" },

      // Option B — Custom OpenAI-compatible base URL (e.g. OpenRouter, NVIDIA)
      "free-llm": {
        "baseUrl": "https://openrouter.ai/api/v1",
        "apiKey": "<your-api-key>",  // Get at https://openrouter.ai/workspaces/default/keys
        "api": "openai-completions", // openai-completions | anthropic-messages | …
        "models": [
          { "id": "nex-agi/nex-n2-pro:free", "name": "nex-agi/nex-n2-pro:free" },
        ],
      },
    },
  },
}
// Apply: openclaw gateway restart
// Verify: openclaw doctor --fix

See full OpenClaw configuration guide →

OpenHuman Personal AI super intelligence (desktop agent) Compatible

# config.toml — OpenHuman workspace config
# Edit via Settings → AI & Skills → Local AI, or directly in the file.
#
# OpenHuman defaults to its built-in subscription backend.
# Set inference_url + api_key below to route to a free third-party API.

inference_url = "https://openrouter.ai/api/v1"
api_key = "<your-api-key>"  # Get at https://openrouter.ai/workspaces/default/keys
default_model = "nex-agi/nex-n2-pro:free"

# Optional: pin hints to specific models
# [model_routing]
# reasoning = "nex-agi/nex-n2-pro:free"
# fast = "nex-agi/nex-n2-pro:free"

# Verify: check Settings → AI & Skills for connection status

See full OpenHuman configuration guide →

Use Cases

What OpenRouter's free models are best for, based on aggregated model capabilities:

Chat: 39 models
Reasoning: 11 models
Coding: 5 models
Vision: 1 model
Embedding: 1 model

Limitations & Caveats

200 RPD on free tier (1,000 with $10 lifetime credit)
Free models may change without notice
Some free models have shorter context windows than paid variants
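
Because free models may change without notice, it can help to keep an ordered list of candidates and fall back when one stops responding. Below is a minimal sketch of that pattern; first_success is a hypothetical helper, the model IDs are placeholders, and call stands in for whatever function sends the actual request.

```python
def first_success(models, call):
    """Try each model ID in order; return (model, result) from the first that works."""
    errors = {}
    for model in models:
        try:
            return model, call(model)
        except Exception as exc:  # in real code, catch the SDK's specific error types
            errors[model] = exc
    raise RuntimeError(f"all models failed: {errors}")


# Stand-in for a real API call: the first placeholder model is "withdrawn".
def fake_call(model):
    if model == "example/model-a:free":
        raise LookupError("model not found")
    return f"reply from {model}"


print(first_success(["example/model-a:free", "example/model-b:free"], fake_call))
# ('example/model-b:free', 'reply from example/model-b:free')
```

Each failed attempt still counts as a request, so a long fallback list can use up the daily quota faster.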

Frequently Asked Questions

What happens when I hit the 200 RPD limit on OpenRouter?

You'll get a 429 rate limit error. You can either wait for the daily reset (midnight UTC) or make a one-time $10 top-up, which raises the limit to 1,000 requests per day (RPD) permanently.
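
If the daily quota resets at midnight UTC as stated above, a client that receives a 429 can compute how long to pause before retrying. A small sketch follows; seconds_until_utc_midnight is a hypothetical helper, and the reset time is taken from this article rather than verified against the API.

```python
from datetime import datetime, timedelta, timezone


def seconds_until_utc_midnight(now=None):
    """Seconds from now (a timezone-aware datetime) until the next 00:00 UTC."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()


example = datetime(2026, 6, 8, 23, 30, tzinfo=timezone.utc)
print(seconds_until_utc_midnight(example))  # 1800.0
```

A caller would typically sleep for this duration, plus a small margin, before sending the next request.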

Do free models on OpenRouter cost anything?

No — models marked with :free in their ID are completely free to use within the rate limit. OpenRouter subsidizes free models through paid model margins.
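
Since the :free suffix is what identifies a free model, the current list can be derived by filtering model IDs. The sketch below assumes the model listing is available at https://openrouter.ai/api/v1/models and returns a JSON object with a data array whose entries carry an id field; treat both as assumptions to confirm against the API reference. The filter runs on a small hand-written sample so it works offline.

```python
import json
import urllib.request

MODELS_URL = "https://openrouter.ai/api/v1/models"  # assumption: public model listing


def fetch_models(url=MODELS_URL):
    """Download and parse the model listing (not called in this example)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def free_model_ids(payload):
    """Return the sorted IDs that end with the :free suffix."""
    return sorted(
        m["id"] for m in payload.get("data", []) if m.get("id", "").endswith(":free")
    )


sample = {"data": [{"id": "nex-agi/nex-n2-pro:free"}, {"id": "anthropic/claude-sonnet-4.6"}]}
print(free_model_ids(sample))  # ['nex-agi/nex-n2-pro:free']
```

To work with live data, pass the result of fetch_models() to free_model_ids instead of the sample.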

Can I use OpenRouter with Claude Code?

Yes. Set ANTHROPIC_BASE_URL to https://openrouter.ai/api, set ANTHROPIC_AUTH_TOKEN to your OpenRouter API key, and leave ANTHROPIC_API_KEY explicitly empty. Claude Code will then route through OpenRouter. Note: free models won't work with Claude Code, because it requires Anthropic Claude models specifically.

Other Free LLM API Providers

AI21 Labs

Aion Labs

Alibaba Cloud Model Studio

Cohere

DeepSeek

Google Gemini