nvidia/llama-3.3-nemotron-super-49b-v1.5 — Free API

⭐ Score: 50 Verified
nvidia-nim/nvidia-llama-3-3-nemotron-super-49b-v1-5
🛠️ Capabilities: Function Calling, JSON Mode, chat, reasoning

What is nvidia/llama-3.3-nemotron-super-49b-v1.5?

NVIDIA Nemotron Super 49B is a mid-size model from NVIDIA, available free on NVIDIA NIM at up to 40 requests per minute (RPM) with no daily token cap. It is built on the Llama 3.3 architecture with NVIDIA's training enhancements and offers balanced performance for general-purpose tasks. The API is OpenAI-compatible. Access requires a free NVIDIA Developer Program membership and phone verification.
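To stay under the 40 RPM figure quoted above, a client can space its own requests. The following is a minimal sketch, not NVIDIA's method: the page does not say how the limit is enforced server-side, so a simple fixed-interval throttle is assumed here, and the class name `MinIntervalThrottle` is hypothetical.

```python
import time


class MinIntervalThrottle:
    """Client-side throttle: spaces calls at least 60/rpm seconds apart.

    Assumption: a fixed minimum interval is enough to respect the limit.
    The server's actual accounting (sliding window, bursts) is not documented here.
    """

    def __init__(self, rpm=40, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / rpm      # 1.5 s between requests at 40 RPM
        self._clock = clock             # injectable for testing
        self._sleep = sleep             # injectable for testing
        self._next_allowed = None       # earliest time the next call may start

    def wait(self):
        """Block until the next request slot is free; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay
```

Calling `throttle.wait()` immediately before each `client.chat.completions.create(...)` call keeps a single process at or below the configured rate. Several processes sharing one API key would need shared state, which this sketch does not cover.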

Model ID: nvidia/llama-3.3-nemotron-super-49b-v1.5
Base URL: https://integrate.api.nvidia.com/v1

nvidia/llama-3.3-nemotron-super-49b-v1.5 API Code Example

Replace YOUR_API_KEY with your own key and run any of the examples. The config generator covers Claude Code, Cursor, and other tools.

Python

# Requires the openai package (pip install openai).
from openai import OpenAI

# Point the OpenAI client at the NVIDIA NIM endpoint.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="YOUR_API_KEY"
)

# Send a single-turn chat request and print the reply text.
response = client.chat.completions.create(
    model="nvidia/llama-3.3-nemotron-super-49b-v1.5",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

JavaScript

// Requires the openai package (npm install openai).
// Run as an ES module, since the example uses top-level await.
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://integrate.api.nvidia.com/v1",
  apiKey: "YOUR_API_KEY",
});

const completion = await openai.chat.completions.create({
  model: "nvidia/llama-3.3-nemotron-super-49b-v1.5",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);

cURL

curl https://integrate.api.nvidia.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "nvidia/llama-3.3-nemotron-super-49b-v1.5",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
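
The page lists function calling among this model's capabilities. The sketch below shows how such a request might be assembled using the OpenAI-style `tools` field. It assumes the NIM endpoint accepts the standard `tools` and `tool_choice` parameters for this model, which the page does not confirm in detail. The `get_weather` tool and both helper functions are hypothetical and exist only for illustration.

```python
import json

MODEL_ID = "nvidia/llama-3.3-nemotron-super-49b-v1.5"


def build_tool_request(user_text):
    """Return keyword arguments for client.chat.completions.create()."""
    # Hypothetical tool definition; not part of any NVIDIA API.
    get_weather = {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_text}],
        "tools": [get_weather],
        # Assumption: the endpoint honors "auto" as in the OpenAI API.
        "tool_choice": "auto",
    }


def parse_tool_arguments(arguments_json):
    """Decode the JSON string carried in a tool call's function.arguments."""
    return json.loads(arguments_json)
```

With the Python client from the example above, the request would be sent as `client.chat.completions.create(**build_tool_request("Weather in Oslo?"))`. If the model decides to call the tool, the OpenAI SDK exposes the calls on `response.choices[0].message.tool_calls`, and each call's `function.arguments` string can be passed to `parse_tool_arguments`.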

More About NVIDIA NIM

The NVIDIA NIM full guide covers how to get an API key, rate limits, platform limitations, and tool configuration: everything you need to set up NVIDIA NIM as a free LLM API backend.