llama-4-scout-17b-16e-instruct — Free API
⭐ Score: 51groq/llama-4-scout-17b-16e-instruct What is llama-4-scout-17b-16e-instruct?
Llama 4 Scout 17B on Groq runs Meta's latest MoE generation model with Groq's ultra-fast LPU inference. The Scout variant uses 16 active experts to deliver broad capability in a compact 17B active footprint, with 8K output per request. Combined with Groq's sub-200ms time-to-first-token, it offers a responsive experience for interactive chat and agent workflows. Rate limits are 14,400 requests per day at 30 RPM — sufficient for sustained prototyping and light production use. OpenAI SDK compatible; registration required but no credit card needed.
llama-4-scout-17b-16e-instruct API Code Example
Paste your API key and run. See the config generator for Claude Code, Cursor, and more tools.
Other Free Models from Groq
More About Groq
How to get an API key, rate limits, platform limitations, and tool configuration — everything you need to set up Groq as a free LLM API backend.
View Groq full guide →