How to Get a Free Cloudflare Workers AI API Key (2026)
8 free models available — no credit card required. Get your Cloudflare Workers AI API key → Test free models →
Cloudflare Workers AI FreeLLM Score
A usable option, though it may have noticeable restrictions or older models.
What is Cloudflare Workers AI?
Edge AI inference — 10,000 neurons/day, 50+ models.
Cloudflare Workers AI runs open-weight models directly on Cloudflare's global edge network. The free tier allocates 10,000 Neurons (compute units) per day, supporting 50+ models including Llama, Mistral, Gemma, DeepSeek, and Qwen. Unlike other providers, billing is based on Neurons rather than tokens, making it hard to predict exact request counts. Ideal for low-latency edge deployments.
- 50+ models on the free tier
- 10,000 Neurons/day
- Global edge network for low latency
- Text, image, audio, and embedding models
API Compatibility: OpenAI SDK-compatible (via REST)
How to Get a Cloudflare Workers AI API Key
- 1 Sign up at dash.cloudflare.com Free account. No credit card.
- 2 Go to Workers & Pages → AI
- 3 Create an API token with Workers AI permissions
- 4 Pick a model Llama 3.2 3B and Mistral 7B are reliable choices.
- 5 Configure OpenAI client Base URL: https://api.cloudflare.com/client/v4/accounts/YOUR_ACCOUNT_ID/ai/run
All Free Cloudflare Workers AI Models — Context Windows & Rate Limits
| Model | Context | Max Output | Modality | Rate Limit | Released | Status |
|---|---|---|---|---|---|---|
| @cf/google/gemma-4-26b-a4b-it | 256K | 131K | 10K neurons/day (shared) | Apr 2, 2026 | Online | |
| @cf/openai/gpt-oss-120b | 128K | 131K | 10K neurons/day (shared) | — | Online | |
| @cf/zhipuai/glm-4.7-flash | 131K | 131K | 10K neurons/day (shared) | — | Online | |
| @cf/meta/llama-4-scout-17b-16e-instruct | 10.0M | 131K | 10K neurons/day (shared) | — | Online | |
| @cf/moonshotai/kimi-k2.7-code | 262K | 131K | 10K neurons/day (shared) | — | Online | |
| @cf/deepseek-ai/deepseek-r1-distill-qwen-32b | 32K | 131K | 10K neurons/day (shared) | Jan 20, 2025 | Online | |
| @cf/mistralai/mistral-small-3.1-24b-instruct | 128K | 131K | 10K neurons/day (shared) | Mar 17, 2025 | Online | |
| @cf/meta/llama-3.3-70b-instruct-fp8-fast | 131K | 131K | 10K neurons/day (shared) | Dec 6, 2024 | Online |
Cloudflare Workers AI Free Tier Limits & Pricing
Cloudflare Workers AI API Setup Tutorial & Tools
Cloudflare Workers AI is fully compatible with popular AI coding assistants like Cursor, Claude Code, and more. To see step-by-step API configuration instructions for your favorite tool, please visit our Global Configuration Guide →
Use Cases
What Cloudflare Workers AI's free models are best for, based on aggregated model capabilities:
Limitations & Caveats
- Neurons billing is opaque — hard to predict exact request counts
- Model availability varies by Cloudflare region
- 10,000 Neurons/day shared across all models
Frequently Asked Questions
How many requests is 10,000 Neurons on Cloudflare Workers AI?
It depends on the model and prompt length. For Llama 3.2 3B with a 500-token prompt, ~10,000 Neurons ≈ 200-400 requests/day. Larger models consume more Neurons per request. Monitor your usage in the Cloudflare dashboard.
Do I need a Cloudflare Workers plan to use Workers AI?
No — the free Workers plan includes 10,000 Neurons/day for AI inference. You don't need to deploy any Workers code; just use the AI API endpoint directly.
Is Cloudflare Workers AI good for production?
The free tier is great for prototyping and low-volume apps. For production, the paid tier offers higher limits and SLAs. The edge network provides low global latency, which is a unique advantage.