Free LLM API Models — Browse & Filter 129+ Models
Data refreshed May 14, 2026 — open source, verified via API daily
| Provider | Model | Context | Max Output | Rate Limit | Released | Weekly Tokens | Status | Details |
|---|---|---|---|---|---|---|---|---|
| OpenRouter | inclusionAI: Ring-2.6-1T (free) | 262K | 66K | See provider page | May 8, 2026 | 436.2B | Online | Details | |
| OpenRouter | Baidu Qianfan: CoBuddy (free) | 131K | 66K | See provider page | May 6, 2026 | 22.1B | Online | Details | |
| OpenRouter | Owl Alpha | 1.0M | 262K | See provider page | Apr 28, 2026 | 589.7B | Online | Details | |
| OpenRouter | NVIDIA: Nemotron 3 Nano Omni (free) | 256K | 66K | See provider page | Apr 28, 2026 | 17.3B | Online | Details | |
| OpenRouter | Poolside: Laguna XS.2 (free) | 131K | 8K | See provider page | Apr 28, 2026 | 35.2B | Online | Details | |
| OpenRouter | Poolside: Laguna M.1 (free) | 131K | 8K | See provider page | Apr 28, 2026 | 233.5B | Online | Details | |
| OpenRouter | DeepSeek: DeepSeek V4 Flash (free) | 256K | 256K | See provider page | Apr 24, 2026 | 3.1B | Online | Details | |
| OpenRouter | Baidu: Qianfan-OCR-Fast (free) | 66K | 29K | See provider page | Apr 20, 2026 | 503.6M | Online | Details | |
| OpenRouter | Google: Gemma 4 26B A4B (free) | 262K | 33K | See provider page | Apr 3, 2026 | 5.9B | Online | Details | |
| OpenRouter | Google: Gemma 4 31B (free) | 262K | 33K | See provider page | Apr 2, 2026 | 13.3B | Online | Details | |
| OpenRouter | Arcee AI: Trinity Large Thinking (free) | 262K | 80K | See provider page | Apr 1, 2026 | 11.2B | Online | Details | |
| OpenRouter | Google: Lyria 3 Pro Preview | 1.0M | 66K | See provider page | Mar 30, 2026 | 9.8M | Online | Details | |
| OpenRouter | Google: Lyria 3 Clip Preview | 1.0M | 66K | See provider page | Mar 30, 2026 | 4.2M | Online | Details | |
| OpenRouter | NVIDIA: Nemotron 3 Super (free) | 262K | 262K | See provider page | Mar 11, 2026 | 631.3B | Online | Details | |
| OpenRouter | NVIDIA: Llama Nemotron Embed VL 1B V2 (free) | 131K | 8K | See provider page | Feb 25, 2026 | — | Online | Details | |
| OpenRouter | MiniMax: MiniMax M2.5 (free) | 197K | 8K | See provider page | Feb 12, 2026 | 59.3B | Online | Details | |
| OpenRouter | Qwen: Qwen3 Coder 480B A35B (free) | 262K | 262K | See provider page | Feb 4, 2026 | 1.6B | Online | Details | |
| OpenRouter | Free Models Router | 200K | 8K | See provider page | Feb 1, 2026 | — | Online | Details | |
| OpenRouter | LiquidAI: LFM2.5-1.2B-Thinking (free) | 33K | 8K | See provider page | Jan 20, 2026 | 978.5M | Online | Details | |
| OpenRouter | LiquidAI: LFM2.5-1.2B-Instruct (free) | 33K | 8K | See provider page | Jan 20, 2026 | 536.6M | Online | Details | |
| OpenRouter | NVIDIA: Nemotron 3 Nano 30B A3B (free) | 256K | 8K | See provider page | Dec 14, 2025 | 39.6B | Online | Details | |
| OpenRouter | NVIDIA: Nemotron Nano 12B 2 VL (free) | 128K | 128K | See provider page | Oct 28, 2025 | 13.3B | Online | Details | |
| OpenRouter | Qwen: Qwen3 Next 80B A3B Instruct (free) | 262K | 8K | See provider page | Sep 11, 2025 | 1.0B | Online | Details | |
| OpenRouter | NVIDIA: Nemotron Nano 9B V2 (free) | 128K | 8K | See provider page | Sep 5, 2025 | 12.5B | Online | Details | |
| OpenRouter | OpenAI: gpt-oss-120b (free) | 131K | 131K | See provider page | Aug 5, 2025 | 143.4B | Online | Details | |
| OpenRouter | OpenAI: gpt-oss-20b (free) | 131K | 8K | See provider page | Aug 5, 2025 | 30.7B | Online | Details | |
| OpenRouter | Z.ai: GLM 4.5 Air (free) | 131K | 96K | See provider page | Jul 25, 2025 | 78.0B | Online | Details | |
| OpenRouter | Meta: Llama 3.3 70B Instruct (free) | 66K | 8K | See provider page | Dec 6, 2024 | 993.7M | Online | Details | |
| OpenRouter | Meta: Llama 3.2 3B Instruct (free) | 131K | 8K | See provider page | Sep 25, 2024 | 43.9M | Online | Details | |
| OpenRouter | Nous: Hermes 3 405B Instruct (free) | 131K | 8K | See provider page | Aug 16, 2024 | 45.8M | Online | Details | |
| Cloudflare Workers AI | @cf/meta/llama-4-scout-17b-16e-instruct | 10.0M | 131K | 10K neurons/day (shared) | — | — | Online | Details | |
| GitHub Models | gpt-4.1 | 1.0M | 32K | 10 RPM, 50 RPD | — | — | Online | Details | |
| GitHub Models | gpt-4.1-mini | 1.0M | 32K | 15 RPM, 150 RPD | — | — | Online | Details | |
| GitHub Models | Llama-4-Scout-17B-16E | 512K | 4K | 15 RPM, 150 RPD | — | — | Online | Details | |
| Cohere | Command A (111B) | 256K | 4K | 20 RPM | — | — | Online | Details | |
| Mistral AI | Mistral Small 4 | 256K | 256K | ~1 RPS, 500K TPM | — | — | Online | Details | |
| Mistral AI | Mistral Large 3 | 256K | 256K | ~1 RPS, 500K TPM | — | — | Online | Details | |
| Mistral AI | Codestral | 256K | 256K | ~1 RPS, 500K TPM | — | — | Online | Details | |
| Cloudflare Workers AI | @cf/google/gemma-4-26b-a4b-it | 256K | 131K | 10K neurons/day (shared) | — | — | Online | Details | |
| GitHub Models | Llama-4-Maverick-17B-128E | 256K | 4K | 10 RPM, 50 RPD | — | — | Online | Details | |
| Z AI (Zhipu AI) | GLM-4.7-Flash | 200K | 128K | 1 concurrent request | — | — | Online | Details | |
| GitHub Models | o3-mini | 200K | 100K | 10 RPM, 50 RPD | — | — | Online | Details | |
| GitHub Models | o4-mini | 200K | 100K | 10 RPM, 50 RPD | — | — | Online | Details | |
| Cohere | Embed 4 | 131K | 131K | 2,000 inputs/min | — | — | Online | Details | |
| Cohere | Rerank 3.5 | 131K | 131K | 10 RPM | — | — | Online | Details | |
| Cerebras | qwen-3-235b-a22b-instruct-2507 | 131K | 8K | 30 RPM, 14,400 RPD, 1M TPD | — | — | Online | Details | |
| Cloudflare Workers AI | @cf/meta/llama-3.3-70b-instruct-fp8-fast | 131K | 131K | 10K neurons/day (shared) | — | — | Online | Details | |
| Cloudflare Workers AI | @cf/meta/llama-3.1-8b-instruct-fp8-fast | 131K | 131K | 10K neurons/day (shared) | — | — | Online | Details | |
| Cloudflare Workers AI | @cf/meta/llama-3.2-11b-vision-instruct | 131K | 131K | 10K neurons/day (shared) | — | — | Online | Details | |
| GitHub Models | Meta-Llama-3.3-70B | 131K | 4K | 15 RPM, 150 RPD | — | — | Online | Details | |
| Cohere | Command R+ | 128K | 4K | 20 RPM | — | — | Online | Details | |
| Cohere | Command R7B | 128K | 4K | 20 RPM | — | — | Online | Details | |
| Mistral AI | Mistral Medium 3 | 128K | 128K | ~1 RPS, 500K TPM | — | — | Online | Details | |
| Mistral AI | Mistral Nemo (12B) | 128K | 128K | ~1 RPS, 500K TPM | — | — | Online | Details | |
| Mistral AI | Pixtral Large | 128K | 128K | ~1 RPS, 500K TPM | — | — | Online | Details | |
| Z AI (Zhipu AI) | GLM-4.5-Flash | 128K | 8K | 1 concurrent request | — | — | Online | Details | |
| Z AI (Zhipu AI) | GLM-4.6V-Flash | 128K | 4K | 1 concurrent request | — | — | Online | Details | |
| Cerebras | llama3.1-8b | 128K | 8K | 30 RPM, 14,400 RPD, 1M TPD | — | — | Online | Details | |
| Cerebras | gpt-oss-120b | 128K | 8K | 30 RPM, 14,400 RPD, 1M TPD | — | — | Online | Details | |
| Cerebras | zai-glm-4.7 | 128K | 8K | 10 RPM, 100 RPD, 1M TPD | — | — | Online | Details | |
| Cloudflare Workers AI | @cf/mistralai/mistral-small-3.1-24b-instruct | 128K | 131K | 10K neurons/day (shared) | — | — | Online | Details | |
| GitHub Models | gpt-4o | 128K | 16K | 10 RPM, 50 RPD | — | — | Online | Details | |
| GitHub Models | Mistral-Small-3.1 | 128K | 4K | 15 RPM, 150 RPD | — | — | Online | Details | |
| GitHub Models | DeepSeek-R1 | 64K | 8K | 15 RPM, 150 RPD | — | — | Online | Details | |
| Cloudflare Workers AI | @cf/qwen/qwq-32b | 32K | 131K | 10K neurons/day (shared) | — | — | Online | Details | |
| Cloudflare Workers AI | @cf/deepseek-ai/deepseek-r1-distill-qwen-32b | 32K | 131K | 10K neurons/day (shared) | — | — | Online | Details | |
| NVIDIA NIM | deepseek-ai/deepseek-v4-pro | 1.0M | 384K | Up to 40 RPM | — | — | Online | Details | |
| Google Gemini | Gemini 2.5 Flash | 1.0M | 65K | 10 RPM, 250 RPD | — | — | Online | Details | |
| Google Gemini | Gemini 2.5 Flash-Lite | 1.0M | 65K | 15 RPM, 1,000 RPD | — | — | Online | Details | |
| NVIDIA NIM | qwen/qwen3.5-122b-a10b | 262K | 66K | Up to 40 RPM | — | — | Online | Details | |
| NVIDIA NIM | qwen/qwen3.5-397b-a17b | 262K | 66K | Up to 40 RPM | — | — | Online | Details | |
| NVIDIA NIM | stepfun-ai/step-3.5-flash | 262K | 66K | Up to 40 RPM | — | — | Online | Details | |
| NVIDIA NIM | moonshotai/kimi-k2.6 | 262K | 262K | Up to 40 RPM | — | — | Online | Details | |
| Groq | kimi-k2-instruct | 262K | 262K | 30 RPM, 14,400 RPD | — | — | Online | Details | |
| Kilo Code | nvidia/nemotron-3-super-120b-a12b:free | 262K | 32K | ~200 req/hr | — | — | Online | Details | |
| OVHcloud AI Endpoints | Qwen3-Coder-30B-A3B-Instruct | 262K | 32K | 2 RPM (anonymous) | — | — | Online | Details | |
| NVIDIA NIM | deepseek-ai/deepseek-v4-flash | 256K | 256K | Up to 40 RPM | — | — | Online | Details | |
| NVIDIA NIM | z-ai/glm-5.1 | 203K | 66K | Up to 40 RPM | — | — | Online | Details | |
| NVIDIA NIM | minimaxai/minimax-m2.7 | 197K | 131K | Up to 40 RPM | — | — | Online | Details | |
| NVIDIA NIM | meta/llama-guard-4-12b | 164K | 16K | Up to 40 RPM | — | — | Online | Details | |
| Groq | whisper-large-v3 | 131K | 131K | 20 RPM, 2,000 RPD | — | — | Online | Details | |
| Groq | whisper-large-v3-turbo | 131K | 131K | 20 RPM, 2,000 RPD | — | — | Online | Details | |
| Kilo Code | bytedance-seed/dola-seed-2.0-pro:free | 131K | 131K | ~200 req/hr | — | — | Online | Details | |
| Kilo Code | x-ai/grok-code-fast-1:optimized:free | 131K | 131K | ~200 req/hr | — | — | Online | Details | |
| Kilo Code | arcee-ai/trinity-large-thinking:free | 131K | 131K | ~200 req/hr | — | — | Online | Details | |
| LLM7.io | deepseek-r1-0528 | 131K | 131K | 30 RPM (120 with token) | — | — | Online | Details | |
| LLM7.io | deepseek-v3-0324 | 131K | 131K | 30 RPM (120 with token) | — | — | Online | Details | |
| LLM7.io | gpt-4o-mini | 131K | 131K | 30 RPM (120 with token) | — | — | Online | Details | |
| LLM7.io | qwen2.5-coder-32b | 131K | 131K | 30 RPM (120 with token) | — | — | Online | Details | |
| ModelScope | Qwen/Qwen3.5-35B-A3B | 131K | 131K | 2,000 RPD total; <=500 RPD/model (dynamic) | — | — | Online | Details | |
| ModelScope | Qwen/Qwen3.5-27B | 131K | 131K | 2,000 RPD total; <=500 RPD/model (dynamic) | — | — | Online | Details | |
| ModelScope | Qwen/Qwen-Image | 131K | 131K | 2,000 RPD total; model/AIGC-specific caps | — | — | Online | Details | |
| SiliconFlow | deepseek-ai/DeepSeek-OCR | 131K | 8K | 1,000 RPM, 50K TPM | — | — | Online | Details | |
| SiliconFlow | Abbreviation | 131K | 8K | See provider page | — | — | Online | Details | |
| NVIDIA NIM | meta/llama-3.1-70b-instruct | 131K | 16K | Up to 40 RPM | — | — | Online | Details | |
| NVIDIA NIM | meta/llama-3.2-11b-vision-instruct | 131K | 16K | Up to 40 RPM | — | — | Online | Details | |
| NVIDIA NIM | meta/llama-3.2-3b-instruct | 131K | 8K | Up to 40 RPM | — | — | Online | Details | |
| NVIDIA NIM | nvidia/llama-3.3-nemotron-super-49b-v1.5 | 131K | 16K | Up to 40 RPM | — | — | Online | Details | |
| Groq | llama-3.3-70b-versatile | 131K | 32K | 30 RPM, 14,400 RPD | — | — | Online | Details | |
| Groq | llama-3.1-8b-instant | 131K | 131K | 30 RPM, 14,400 RPD | — | — | Online | Details | |
| Groq | llama-4-scout-17b-16e-instruct | 131K | 8K | 30 RPM, 14,400 RPD | — | — | Online | Details | |
| Groq | llama-4-maverick-17b-128e-instruct | 131K | 8K | 15 RPM, 500 RPD | — | — | Online | Details | |
| Groq | qwen3-32b | 131K | 131K | 30 RPM, 14,400 RPD | — | — | Online | Details | |
| Groq | deepseek-r1-distill-70b | 131K | 8K | 30 RPM, 14,400 RPD | — | — | Online | Details | |
| Hugging Face | Qwen2.5-7B-Instruct | 131K | 4K | ~1,000 RPD | — | — | Online | Details | |
| OVHcloud AI Endpoints | Meta-Llama-3_3-70B-Instruct | 131K | 4K | 2 RPM (anonymous) | — | — | Online | Details | |
| OVHcloud AI Endpoints | DeepSeek-R1-Distill-Llama-70B | 131K | 32K | 2 RPM (anonymous) | — | — | Online | Details | |
| SiliconFlow | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 131K | 131K | 1,000 RPM, 50K TPM | — | — | Online | Details | |
| Hugging Face | Meta-Llama-3.1-8B-Instruct | 128K | 4K | ~1,000 RPD | — | — | Online | Details | |
| Hugging Face | Phi-3.5-mini-instruct | 128K | 4K | ~1,000 RPD | — | — | Online | Details | |
| Ollama Cloud | llama3.1:cloud | 128K | 131K | Session/weekly limits (unpublished) | — | — | Online | Details | |
| Ollama Cloud | deepseek-r1:cloud | 128K | 131K | Session/weekly limits (unpublished) | — | — | Online | Details | |
| Ollama Cloud | qwen2.5:cloud | 128K | 131K | Session/weekly limits (unpublished) | — | — | Online | Details | |
| OVHcloud AI Endpoints | Qwen2.5-VL-72B-Instruct | 128K | 8K | 2 RPM (anonymous) | — | — | Online | Details | |
| OVHcloud AI Endpoints | Mistral-Nemo-Instruct-2407 | 128K | 4K | 2 RPM (anonymous) | — | — | Online | Details | |
| SiliconFlow | THUDM/GLM-4.1V-9B-Thinking | 66K | 66K | 1,000 RPM, 50K TPM | — | — | Online | Details | |
| NVIDIA NIM | meta/llama-3.2-1b-instruct | 60K | 8K | Up to 40 RPM | — | — | Online | Details | |
| SiliconFlow | deepseek-ai/DeepSeek-R1-0528-Qwen3-8B | 33K | 16K | 1,000 RPM, 50K TPM | — | — | Online | Details | |
| OpenRouter | Venice: Uncensored (free) | 33K | 8K | See provider page | — | — | Online | Details | |
| Hugging Face | Mistral-7B-Instruct-v0.3 | 32K | 4K | ~1,000 RPD | — | — | Online | Details | |
| Hugging Face | Mixtral-8x7B-Instruct-v0.1 | 32K | 4K | ~1,000 RPD | — | — | Online | Details | |
| LLM7.io | mistral-small-3.1-24b | 32K | 131K | 30 RPM (120 with token) | — | — | Online | Details | |
| Ollama Cloud | mistral:cloud | 32K | 131K | Session/weekly limits (unpublished) | — | — | Online | Details | |
| OVHcloud AI Endpoints | Qwen3Guard-Gen-8B | 32K | 4K | 2 RPM (anonymous) | — | — | Online | Details | |
| OVHcloud AI Endpoints | Qwen3Guard-Gen-0.6B | 32K | 4K | 2 RPM (anonymous) | — | — | Online | Details | |
| SiliconFlow | THUDM/glm-4-9b-chat | 32K | 32K | 1,000 RPM, 50K TPM | — | — | Online | Details | |
| Ollama Cloud | gemma2:cloud | 8K | 131K | Session/weekly limits (unpublished) | — | — | Online | Details | |
| NVIDIA NIM | mistralai/mistral-large-2-instruct | 131K | 8K | Up to 40 RPM | — | — | Unavailable | Details | |
| NVIDIA NIM | nvidia/llama-3.1-nemotron-ultra-253b-v1 | 131K | 8K | Up to 40 RPM | — | — | Unavailable | Details |
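Most free tiers above enforce per-minute or per-day request caps (e.g. "30 RPM, 14,400 RPD"), so clients should expect occasional rate-limit responses. A minimal client-side backoff sketch; the exception class and parameter names are illustrative, not any provider's SDK:

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for an HTTP 429 response from a free-tier endpoint."""


def call_with_backoff(call, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry `call` with exponential backoff plus jitter when the
    provider signals a rate limit. Delays are capped at 60s; tune
    base_delay to the provider's published RPM limit."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            # 1s, 2s, 4s, ... plus up to 1s of jitter, capped at 60s
            sleep(min(base_delay * 2 ** attempt + random.random(), 60.0))
```

The `sleep` parameter is injected so the retry logic can be tested without real delays.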
How to Use Free LLM API Resources
- Pick a model — Click any model name to see details, rate limits, and API key signup link.
- Get your API key — Sign up on the provider's website (most require no credit card).
- Copy the config — Go to the Config Generator, pick your tool and backend, copy the ready-to-use snippet.
- Test it — Use the Playground to test your API key before integrating.
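The steps above boil down to one OpenAI-compatible POST request. A minimal sketch using only the Python standard library, assuming an OpenRouter key in `OPENROUTER_API_KEY`; the model slug is an assumption, so verify the exact identifier on the provider page:

```python
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"
# Assumed slug for a free-tier model from the table; check the provider page.
MODEL = "meta-llama/llama-3.3-70b-instruct:free"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble an OpenAI-compatible chat-completion request."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    key = os.environ.get("OPENROUTER_API_KEY")
    if key:
        with urllib.request.urlopen(build_request("Say hello", key)) as resp:
            print(json.loads(resp.read())["choices"][0]["message"]["content"])
    else:
        print("Set OPENROUTER_API_KEY to send the request.")
```

The same request shape works against any of the OpenAI-compatible providers listed above once you swap the base URL and model identifier.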
New to LLM terminology? See the 📖 Glossary: 22 terms explained in plain English.