Cloudflare Workers AI logo How to Get a Free Cloudflare Workers AI API Key (2026)

8 free models available — no credit card required. Get your Cloudflare Workers AI API key → Test free models →

💡
Need help setting up Cloudflare Workers AI?
Read our step-by-step tutorial on getting your free API key and 10,000 daily Neurons →

Cloudflare Workers AI FreeLLM Score

63
👍 Good Option — Notable for stable service

A usable option, though it may have noticeable restrictions or older models.

🎁
Generosity Free limits
70/100
🌍
Accessibility Signup ease
75/100
📚
Breadth Model variety
55/100
Reliability Uptime
80/100
🔌
Compatibility Tool support
35/100
🧠
Quality Benchmarks
65/100

How we score →

What is Cloudflare Workers AI?

Edge AI inference — 10,000 neurons/day, 50+ models.

Cloudflare Workers AI runs open-weight models directly on Cloudflare's global edge network. The free tier allocates 10,000 Neurons (compute units) per day, supporting 50+ models including Llama, Mistral, Gemma, DeepSeek, and Qwen. Unlike other providers, billing is based on Neurons rather than tokens, making it hard to predict exact request counts. Ideal for low-latency edge deployments.

  • 50+ models on the free tier
  • 10,000 Neurons/day
  • Global edge network for low latency
  • Text, image, audio, and embedding models

API Compatibility: OpenAI SDK-compatible (via REST)

How to Get a Cloudflare Workers AI API Key

  1. 1
    Sign up at dash.cloudflare.com Free account. No credit card.
  2. 2
    Go to Workers & Pages → AI
  3. 3
    Create an API token with Workers AI permissions
  4. 4
    Pick a model Llama 3.2 3B and Mistral 7B are reliable choices.
  5. 5
    Configure OpenAI client Base URL: https://api.cloudflare.com/client/v4/accounts/YOUR_ACCOUNT_ID/ai/run

All Free Cloudflare Workers AI Models — Context Windows & Rate Limits

Model Context Max Output Modality Rate Limit Released Status
@cf/google/gemma-4-26b-a4b-it 256K 131K text 10K neurons/day (shared) Apr 2, 2026 Online
@cf/openai/gpt-oss-120b 128K 131K text 10K neurons/day (shared) Online
@cf/zhipuai/glm-4.7-flash 131K 131K text 10K neurons/day (shared) Online
@cf/meta/llama-4-scout-17b-16e-instruct 10.0M 131K text 10K neurons/day (shared) Online
@cf/moonshotai/kimi-k2.7-code 262K 131K textcode 10K neurons/day (shared) Online
@cf/deepseek-ai/deepseek-r1-distill-qwen-32b 32K 131K textreasoning 10K neurons/day (shared) Jan 20, 2025 Online
@cf/mistralai/mistral-small-3.1-24b-instruct 128K 131K text 10K neurons/day (shared) Mar 17, 2025 Online
@cf/meta/llama-3.3-70b-instruct-fp8-fast 131K 131K text 10K neurons/day (shared) Dec 6, 2024 Online

Cloudflare Workers AI Free Tier Limits & Pricing

Credit Card Not required
Free Tier Permanently free
Context Range 32K – 10.0M
Total Models 8 free
Rate Limits 10K neurons/day (shared)
API Compatibility OpenAI SDK-compatible (via REST)

Cloudflare Workers AI API Setup Tutorial & Tools

Cloudflare Workers AI is fully compatible with popular AI coding assistants like Cursor, Claude Code, and more. To see step-by-step API configuration instructions for your favorite tool, please visit our Global Configuration Guide →

Use Cases

What Cloudflare Workers AI's free models are best for, based on aggregated model capabilities:

Chat 8 models Coding 2 models Reasoning 1 model

Limitations & Caveats

  • Neurons billing is opaque — hard to predict exact request counts
  • Model availability varies by Cloudflare region
  • 10,000 Neurons/day shared across all models

Frequently Asked Questions

How many requests is 10,000 Neurons on Cloudflare Workers AI?

It depends on the model and prompt length. For Llama 3.2 3B with a 500-token prompt, ~10,000 Neurons ≈ 200-400 requests/day. Larger models consume more Neurons per request. Monitor your usage in the Cloudflare dashboard.

Do I need a Cloudflare Workers plan to use Workers AI?

No — the free Workers plan includes 10,000 Neurons/day for AI inference. You don't need to deploy any Workers code; just use the AI API endpoint directly.

Is Cloudflare Workers AI good for production?

The free tier is great for prototyping and low-volume apps. For production, the paid tier offers higher limits and SLAs. The edge network provides low global latency, which is a unique advantage.

See our FAQ for common questions about free LLM APIs