@cf/meta/llama-3.3-70b-instruct-fp8-fast — Free API

Created by Meta ⭐ Score: 42
cloudflare-workers-ai/cf-meta-llama-3-3-70b-instruct-fp8-fast
🛠️ Function Calling, JSON Mode, chat

What is @cf/meta/llama-3.3-70b-instruct-fp8-fast?

Llama 3.3 70B Instruct runs on Cloudflare Workers AI, which serves Meta's 70B-parameter model from Cloudflare's global edge network. Requests are routed to a nearby Cloudflare location rather than a single-region GPU cluster, which can reduce network latency compared with centralized API providers. The free tier allocates 10,000 Neurons (compute units) per day, shared across all models on your account, so available capacity depends on your other Workers AI usage. The native API uses a Cloudflare-specific REST format with the account ID and model name in the endpoint URL; an OpenAI-compatible endpoint is also available under /ai/v1 on the same account path. Either way you need a Cloudflare account ID and an API token: lightweight setup for existing Cloudflare users, but an extra step for everyone else.
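
As a rough illustration of the native REST format described above, here is a minimal Python sketch using only the standard library. The helper names (build_run_request, extract_text, run_chat) are made up for this example. The response shape (a JSON envelope with "success", "errors", and the text under result.response) is an assumption based on Cloudflare's usual v4 API envelope and may differ between models.

```python
import json
import urllib.request

API_ROOT = "https://api.cloudflare.com/client/v4"
MODEL = "@cf/meta/llama-3.3-70b-instruct-fp8-fast"


def build_run_request(account_id, api_token, messages, model=MODEL):
    """Build (but do not send) a POST request for the native /ai/run endpoint."""
    # The account ID and the model name are both part of the URL path.
    url = f"{API_ROOT}/accounts/{account_id}/ai/run/{model}"
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def extract_text(envelope):
    """Return the generated text; assumes it lives under result.response."""
    if not envelope.get("success", False):
        raise RuntimeError(f"Workers AI call failed: {envelope.get('errors')}")
    return envelope["result"]["response"]


def run_chat(account_id, api_token, messages):
    """Send the request over the network and return the reply text."""
    request = build_run_request(account_id, api_token, messages)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return extract_text(json.load(resp))
```

Calling run_chat("YOUR_ACCOUNT_ID", "YOUR_API_KEY", [{"role": "user", "content": "Hello!"}]) would issue one request against your daily Neuron allowance. Building the request and parsing the response are kept separate so each can be checked without network access.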

Model ID
@cf/meta/llama-3.3-70b-instruct-fp8-fast
Base URL
https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run

@cf/meta/llama-3.3-70b-instruct-fp8-fast API Code Example

Replace {account_id} with your Cloudflare account ID and YOUR_API_KEY with a Workers AI API token, then run. See the config generator for Claude Code, Cursor, and more tools.

Python

from openai import OpenAI

# The OpenAI-compatible endpoint lives under /ai/v1 (not /ai/run).
# Replace {account_id} with your Cloudflare account ID.
client = OpenAI(
    base_url="https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    # Workers AI expects its own model name here
    model="@cf/meta/llama-3.3-70b-instruct-fp8-fast",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

JavaScript

import OpenAI from "openai";

// The OpenAI-compatible endpoint lives under /ai/v1 (not /ai/run).
// Replace {account_id} with your Cloudflare account ID.
const openai = new OpenAI({
  baseURL: "https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1",
  apiKey: "YOUR_API_KEY",
});

const completion = await openai.chat.completions.create({
  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);

cURL

curl https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
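
The examples above hard-code the account ID and API key. One common alternative is to read both from environment variables, sketched below in Python. The variable names CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_API_TOKEN and the helper openai_client_kwargs are assumptions chosen for illustration. The /ai/v1 suffix is the path Cloudflare documents for its OpenAI-compatible endpoint.

```python
import os

API_ROOT = "https://api.cloudflare.com/client/v4"


def openai_client_kwargs(env=None):
    """Return base_url/api_key keyword arguments for the OpenAI SDK.

    Values come from environment variables; the variable names are
    illustrative, not mandated by Cloudflare or the SDK.
    """
    env = os.environ if env is None else env
    required = ("CLOUDFLARE_ACCOUNT_ID", "CLOUDFLARE_API_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        # OpenAI-compatible endpoint for this account
        "base_url": f"{API_ROOT}/accounts/{env['CLOUDFLARE_ACCOUNT_ID']}/ai/v1",
        "api_key": env["CLOUDFLARE_API_TOKEN"],
    }
```

With the openai package installed, OpenAI(**openai_client_kwargs()) would then stand in for the hard-coded constructor call, keeping credentials out of source files.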

More About Cloudflare Workers AI

How to get an API key, rate limits, platform limitations, and tool configuration — everything you need to set up Cloudflare Workers AI as a free LLM API backend.

View Cloudflare Workers AI full guide →