GitHub Models: Llama-4-Maverick-17B-128E

llama-4-maverick-17b-128e
Released Jun 17, 2025 · 256K context · Free
chat

API Integration

Integrate Llama-4-Maverick-17B-128E into your applications using the following code snippets.

Python (OpenAI)
from openai import OpenAI

client = OpenAI(
    base_url="https://models.inference.ai.azure.com",
    api_key="YOUR_API_KEY"
)

response = client.chat.completions.create(
    model="llama-4-maverick-17b-128e",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
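Under the hood, the SDK call above POSTs a JSON body to the chat completions endpoint. A minimal sketch of that request body, built with the standard library (the sampling parameters shown are optional and their values here are purely illustrative):

```python
import json

# Sketch of the JSON body sent to the chat completions endpoint.
# Field names follow the OpenAI chat completions schema used by the SDK above.
payload = {
    "model": "llama-4-maverick-17b-128e",
    "messages": [{"role": "user", "content": "Hello!"}],
    # Optional sampling parameters (illustrative values, not defaults):
    "temperature": 0.7,
    "max_tokens": 256,
}

body = json.dumps(payload)
print(body)
```

Any HTTP client can send this body directly if you prefer not to use the SDK.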
JavaScript (OpenAI)
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://models.inference.ai.azure.com",
  apiKey: "YOUR_API_KEY",
});

const completion = await openai.chat.completions.create({
  model: "llama-4-maverick-17b-128e",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);
cURL
curl "https://models.inference.ai.azure.com/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "llama-4-maverick-17b-128e",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
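The endpoint used by the SDK snippets above responds with a JSON body in the OpenAI chat completions format. A small sketch of extracting the assistant's reply with the standard library; the sample response below is illustrative, and a real response also carries fields such as id and usage:

```python
import json

# Illustrative response body in the OpenAI chat completions shape
# (trimmed; a real response includes id, usage, finish_reason, etc.).
raw = '''
{
  "choices": [
    {"index": 0, "message": {"role": "assistant", "content": "Hi there!"}}
  ]
}
'''

# The reply text lives at choices[0].message.content,
# matching the final print statements in the SDK snippets.
reply = json.loads(raw)["choices"][0]["message"]["content"]
print(reply)
```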