What Is inclusionAI?
inclusionAI is a new model provider focused on AI agents. Their thesis: agent workflows demand a different kind of model — one optimized not just for single-turn Q&A, but for multi-step planning, tool-calling chains, and long-running autonomous tasks. Their model family has three members, all accessible via OpenRouter:
- Ring-2.6-1T — Thinking/reasoning model, 1T total params, 63B active. This is the flagship.
- Ling-2.6-1T — Instant/instruct model, same scale but optimized for speed over depth.
- Ling-2.6-flash — 104B total, 7.4B active. Fast, efficient, for lightweight agent tasks.
Ring is free on OpenRouter. Ling and Ling-flash are paid. This article focuses on Ring.
Ring-2.6-1T Specs
| Spec | Value |
|---|---|
| Total parameters | 1,000B (1 trillion) |
| Active parameters | 63B (MoE, ~6% activated per token) |
| Architecture | Mixture-of-Experts with adaptive reasoning |
| Context window | 262,144 tokens |
| Max output | 65,536 tokens |
| Reasoning modes | High, XHigh (adaptive effort based on task complexity) |
| Modality | Text in, text out |
| Released | May 8, 2026 |
| API format | OpenAI-compatible (via OpenRouter) |
| Price | Free (OpenRouter free tier) |
| Rate limit | OpenRouter free tier limits (varies by demand) |
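Because the API is OpenAI-compatible, any OpenAI-compatible client pointed at OpenRouter's base URL can call Ring. Here's a minimal Python sketch using the free slug from the configs later in this article (swap in your own key):

```python
# Minimal chat completion against Ring via OpenRouter's OpenAI-compatible API.
# Assumes the `openai` Python package (v1.x); the model slug matches the one
# used in the config snippets below.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-YOUR_OPENROUTER_KEY",
)

response = client.chat.completions.create(
    model="inclusionai/ring-2.6-1t:free",
    messages=[{"role": "user", "content": "Explain MoE routing in two sentences."}],
)
print(response.choices[0].message.content)
```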
What Makes Ring Different?
Ring was built for agent workflows — not chat. That means it was optimized against benchmarks that measure tool use, multi-step planning, and autonomous task execution:
- PinchBench — Measures how well models handle real-world coding tasks with multiple files, dependencies, and iterative fixes.
- ClawEval — Evaluates long-horizon agent performance: can the model stay coherent across 50+ tool calls?
- TAU2-Bench — Tests tool-augmented understanding: retrieving data, calling APIs, and reasoning over structured outputs.
- GAIA2-search — Multi-step research and information synthesis benchmark.
The key innovation is adaptive reasoning. Instead of burning the same compute on every request, Ring dynamically allocates its reasoning budget based on task complexity. A simple "what's 2+2?" doesn't waste reasoning tokens; a complex debugging session across five files gets the full XHigh treatment.
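If you want to nudge that budget yourself, OpenRouter exposes a unified reasoning field on chat requests (effort levels like "high"). Whether Ring maps that field onto its High/XHigh modes, rather than deciding purely on its own, is an assumption on our part; treat this as a sketch:

```python
# Sketch: requesting higher reasoning effort via OpenRouter's `reasoning`
# request extension. Passed through `extra_body` because it is not part of
# the standard OpenAI schema. Whether Ring honors it (vs. choosing its own
# adaptive budget) is an assumption, not documented behavior.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-YOUR_OPENROUTER_KEY",
)

response = client.chat.completions.create(
    model="inclusionai/ring-2.6-1t:free",
    messages=[{"role": "user", "content": "Find the bug: def avg(xs): return sum(xs) / len(xs)"}],
    extra_body={"reasoning": {"effort": "high"}},  # assumed mapping to Ring's modes
)
print(response.choices[0].message.content)
```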
Ring vs. Other Free Thinking Models
| Model | Params (active) | Context | Free? | Best For |
|---|---|---|---|---|
| Ring-2.6-1T | 1T (63B) | 262K | Yes | Coding agents, multi-step tool use |
| DeepSeek V4 Pro | 1.6T (49B) | 1M | Yes | General reasoning, long context |
| MiniMax M2.7 | — | 260K | Yes | Multimodal, coding |
| GLM 5.1 | — | 200K | Yes | 8h+ autonomous work |
| DeepSeek R1 | 671B (37B) | 128K | Yes | Math, science reasoning |
Ring's sweet spot is coding agents. If you're using Claude Code, OpenClaw, OpenCode, or Hermes and need deep reasoning across multi-file edits with tool calls, Ring is purpose-built for that workflow. DeepSeek V4 Pro has a larger context window; GLM 5.1 is better for truly autonomous 8-hour sessions. But for agentic coding at speed — Ring delivers.
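To make that concrete, here's what a single tool-call round trip looks like in the OpenAI-compatible format these agents speak under the hood. The read_file tool and its one-line local implementation are hypothetical illustrations; the request and response shapes are the standard ones:

```python
# Sketch of one tool-calling round trip with Ring via OpenRouter.
# `read_file` is a made-up example tool; real coding agents register many.
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-YOUR_OPENROUTER_KEY",
)

tools = [{
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a file from the working directory",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

messages = [{"role": "user", "content": "Why does utils.py raise a KeyError on import?"}]
resp = client.chat.completions.create(
    model="inclusionai/ring-2.6-1t:free", messages=messages, tools=tools
)

msg = resp.choices[0].message
if msg.tool_calls:
    call = msg.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = open(args["path"]).read()  # run the (hypothetical) tool locally
    # Feed the tool result back so the model can continue reasoning.
    messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": result}]
    followup = client.chat.completions.create(
        model="inclusionai/ring-2.6-1t:free", messages=messages, tools=tools
    )
    print(followup.choices[0].message.content)
else:
    print(msg.content)
```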
How to Get an API Key
Ring-2.6-1T is available through OpenRouter. Getting a key takes under 2 minutes:
- Go to openrouter.ai/keys and sign up (email or GitHub).
- Click "Create Key". Copy it.
- No credit card required for free models.
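Before wiring the slug into a tool, you can sanity-check that it's live on OpenRouter's public model index (no key needed for this endpoint). A small sketch, assuming the :free variant shows up as its own entry in the listing:

```python
# Quick check that the free Ring slug appears in OpenRouter's model listing.
import requests

models = requests.get("https://openrouter.ai/api/v1/models", timeout=30).json()["data"]
slug = "inclusionai/ring-2.6-1t:free"
print("listed" if any(m["id"] == slug for m in models) else "not listed")
```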
You can also browse Ring on our model directory: inclusionAI: Ring-2.6-1T details →
Config Snippets
Ring uses the OpenRouter endpoint — OpenAI-compatible format. Here's the config for each tool:
Claude Code (cc)
# ~/.bashrc or ~/.zshrc
export ANTHROPIC_BASE_URL="https://openrouter.ai/api/v1"
export ANTHROPIC_AUTH_TOKEN="sk-or-v1-YOUR_OPENROUTER_KEY"
export ANTHROPIC_MODEL="inclusionai/ring-2.6-1t:free"
Cursor
Settings → Models → Add Model:
- Model name: inclusionai/ring-2.6-1t:free
- Base URL: https://openrouter.ai/api/v1
- API Key: Your OpenRouter key
Codex CLI
export OPENAI_BASE_URL="https://openrouter.ai/api/v1"
export OPENAI_API_KEY="sk-or-v1-YOUR_OPENROUTER_KEY"
export CODEX_DEFAULT_MODEL="inclusionai/ring-2.6-1t:free"
OpenCode / Hermes / OpenClaw
# All accept OpenAI-compatible configs
OPENAI_BASE_URL="https://openrouter.ai/api/v1"
OPENAI_API_KEY="sk-or-v1-YOUR_OPENROUTER_KEY"
# Set model to: inclusionai/ring-2.6-1t:free
For more tool-specific configs (Gemini CLI, Kilo Code, etc.), use our Config Generator — select your tool, pick OpenRouter, and copy the ready-to-use snippet.
Rate Limits & Practical Usage
OpenRouter's free tier doesn't publish fixed RPM numbers — limits vary based on overall demand and model load. In practice:
- Expect 20-30 RPM during off-peak hours for free models.
- Free models may queue during peak demand (paid models get priority routing).
- Ring is a reasoning model, so individual requests may take longer (10-30s for complex tasks). This means rate limits are less of a bottleneck than with instant models — you're not firing 30 requests per minute anyway.
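Still, if you're driving Ring from a script or an agent loop, a simple retry with backoff keeps occasional 429s from killing the run. A minimal sketch using the OpenAI SDK's RateLimitError:

```python
# Retry with exponential backoff for free-tier rate limits (HTTP 429).
import time
from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-YOUR_OPENROUTER_KEY",
)

def ask_ring(messages, retries=5):
    for attempt in range(retries):
        try:
            return client.chat.completions.create(
                model="inclusionai/ring-2.6-1t:free", messages=messages
            )
        except RateLimitError:
            time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ... between attempts
    raise RuntimeError("still rate-limited after retries")
```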
When to Use Ring (and When Not To)
Use Ring when:
- You're running coding agents that need sophisticated tool use and multi-step reasoning.
- You're debugging across multiple files and need the model to understand the full codebase context.
- You need strong performance on structured coding benchmarks (SWE-bench, etc.).
- You want cutting-edge agent architecture without paying for it.
Skip Ring when:
- You need fast, single-turn chat responses — use Ling-2.6-1T (paid) or Groq's Llama 3.3 70B (free, faster).
- You need a 1M context window — use Gemini 2.5 Flash.
- You need guaranteed throughput — the OpenRouter free tier has no SLA; paid usage is credit-based, pay-as-you-go.
View Ring-2.6-1T on our model directory →
Or browse all free models across 18 providers.