@cf/meta/llama-3.3-70b-instruct-fp8-fast — Free API
Created by Meta ⭐ Score: 42cloudflare-workers-ai/cf-meta-llama-3-3-70b-instruct-fp8-fast What is @cf/meta/llama-3.3-70b-instruct-fp8-fast?
Llama 3.3 70B Instruct runs on Cloudflare Workers AI's global edge network, bringing Meta's 70B-parameter flagship to every Cloudflare data center worldwide. Deployed at the edge, it offers significantly lower latency than centralized API providers — requests are routed to the nearest Cloudflare PoP rather than a single-region GPU cluster. The free tier allocates 10,000 Neurons (compute units) per day across all models on your account, so available capacity depends on your other Workers AI usage. The API uses a Cloudflare-specific REST format rather than OpenAI SDK compatibility, and requires a Cloudflare account ID in the endpoint URL — lightweight setup for existing Cloudflare users, but an extra step for everyone else.
@cf/meta/llama-3.3-70b-instruct-fp8-fast API Code Example
Paste your API key and run. See the config generator for Claude Code, Cursor, and more tools.
Other Free Models from Cloudflare Workers AI
@cf/meta/llama-4-scout-17b-16e-instruct
10.0M ctx · No card
@cf/openai/gpt-oss-120b
128K ctx · No card
@cf/moonshotai/kimi-k2.7-code
262K ctx · No card
@cf/google/gemma-4-26b-a4b-it
256K ctx · No card
@cf/zhipuai/glm-4.7-flash
131K ctx · No card
@cf/mistralai/mistral-small-3.1-24b-instruct
128K ctx · No card
More About Cloudflare Workers AI
How to get an API key, rate limits, platform limitations, and tool configuration — everything you need to set up Cloudflare Workers AI as a free LLM API backend.
View Cloudflare Workers AI full guide →