OpenRouter: NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

This model is no longer free. See free alternatives.
openrouter/nvidia-llama-3.3-nemotron-super-49b-v1.5
Released Oct 10, 2025 · 131K context ·
chat reasoning

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and multi-turn chat, followed by multiple RL stages; Reward-aware Preference Optimization (RPO) for alignment, RL with Verifiable Rewards (RLVR) for step-wise reasoning, and iterative DPO to refine tool-use behavior. A distillation-driven Neural Architecture Search (“Puzzle”) replaces some attention blocks and varies FFN widths to shrink memory footprint and improve throughput, enabling single-GPU (H100/H200) deployment while preserving instruction following and CoT quality. In internal evaluations (NeMo-Skills, up to 16 runs, temp = 0.6, top_p = 0.95), the model reports strong reasoning/coding results, e.g., MATH500 pass@1 = 97.4, AIME-2024 = 87.5, AIME-2025 = 82.71, GPQA = 71.97, LiveCodeBench (24.10–25.02) = 73.58, and MMLU-Pro (CoT) = 79.53. The model targets practical inference efficiency (high tokens/s, reduced VRAM) with Transformers/vLLM support and explicit “reasoning on/off” modes (chat-first defaults, greedy recommended when disabled). Suitable for building agents, assistants, and long-context retrieval systems where balanced accuracy-to-cost and reliable tool use matter.

Try NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Test this model directly in the playground.

Click to start testing in Playground...

One-Click Config

Optimized configs for your favorite AI tools.

Claude Code

# Claude Code works via OpenRouter's Anthropic-compatible API.
# Note: Only paid Anthropic Claude models are supported (e.g. claude-sonnet-4.6, claude-opus-4).
# Browse available Claude models at: https://openrouter.ai/models?q=anthropic

# Add to ~/.zshrc or ~/.bashrc
export OPENROUTER_API_KEY="<your-openrouter-api-key>"  # Get at https://openrouter.ai/settings/keys
export ANTHROPIC_BASE_URL="https://openrouter.ai/api"
export ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY"
export ANTHROPIC_API_KEY=""  # Must be explicitly empty to avoid conflicts

# Optional: pin specific models for each role
# export ANTHROPIC_DEFAULT_SONNET_MODEL="anthropic/claude-sonnet-4.6"
# export ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4.5"

# Then simply run: claude

Cursor

# Cursor → Settings (⚙️) → Models → Add Model
# Enter the model name exactly as shown, then fill in:
#   Override OpenAI Base URL: https://openrouter.ai/api/v1
#   OpenAI API Key: <your-api-key>   # Get at https://openrouter.ai/workspaces/default/keys
# Click "Verify" to confirm the connection, then enable the model.
#
# Model name to add: NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Codex

# Add to ~/.zshrc or ~/.bashrc
export OPENAI_BASE_URL="https://openrouter.ai/api/v1"
export OPENAI_API_KEY="<your-api-key>"  # Get at https://openrouter.ai/workspaces/default/keys

# Then run:
codex --model "NVIDIA: Llama 3.3 Nemotron Super 49B V1.5"

Gemini CLI

# ~/.gemini/settings.json
{
  "apiKey": "<your-api-key>",
  "model": "NVIDIA: Llama 3.3 Nemotron Super 49B V1.5"
}
# Get API key at https://openrouter.ai/workspaces/default/keys

OpenCode

// ~/.config/opencode/opencode.json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "free-llm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Free LLM",
      "options": {
        "baseURL": "https://openrouter.ai/api/v1",
        "apiKey": "<your-api-key>"
      },
      "models": {
        "NVIDIA: Llama 3.3 Nemotron Super 49B V1.5": { "name": "NVIDIA: Llama 3.3 Nemotron Super 49B V1.5" }
      }
    }
  }
}
// Get API key at https://openrouter.ai/workspaces/default/keys

Hermes

# Step 1 — Edit config.yaml
# Windows: C:\Users\<you>\AppData\Local\hermes\config.yaml
# macOS/Linux: ~/.config/hermes/config.yaml

model:
  default: NVIDIA: Llama 3.3 Nemotron Super 49B V1.5
  provider: custom
  base_url: ${CUSTOM_BASE_URL}
  api_key: ${CUSTOM_API_KEY}
  model_aliases:
    NVIDIA: Llama 3.3 Nemotron Super 49B V1.5:
      model: "NVIDIA: Llama 3.3 Nemotron Super 49B V1.5"
      provider: "custom"

# Step 2 — Edit .env (same directory as config.yaml)
# Windows: C:\Users\<you>\AppData\Local\hermes\.env
# macOS/Linux: ~/.config/hermes/.env

# ========================
# Custom API (OpenAI-compatible)
# ========================
CUSTOM_API_KEY=<your-api-key>        # Get at https://openrouter.ai/workspaces/default/keys
CUSTOM_BASE_URL=https://openrouter.ai/api/v1

OpenClaw

// ~/.openclaw/openclaw.json  (JSON5 format)
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "NVIDIA: Llama 3.3 Nemotron Super 49B V1.5",
      },
    },
  },
  "models": {
    "providers": {
      // Option A — Built-in provider (OpenAI, Anthropic, Google…)
      // Just add apiKey; OpenClaw handles the baseUrl automatically
      // "openai": { "apiKey": "<your-api-key>" },

      // Option B — Custom OpenAI-compatible base URL (e.g. OpenRouter, NVIDIA)
      "free-llm": {
        "baseUrl": "https://openrouter.ai/api/v1",
        "apiKey": "<your-api-key>",  // Get at https://openrouter.ai/workspaces/default/keys
        "api": "openai-completions", // openai-completions | anthropic-messages | …
        "models": [
          { "id": "NVIDIA: Llama 3.3 Nemotron Super 49B V1.5", "name": "NVIDIA: Llama 3.3 Nemotron Super 49B V1.5" },
        ],
      },
    },
  },
}
// Apply: openclaw gateway restart
// Verify: openclaw doctor --fix

FAQ

Is it really free?

No, NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 was previously free but has since transitioned to a paid model. Browse our free model directory for alternatives.

How to use it with Cursor?

Go to Cursor Settings > Models, add a custom model named "NVIDIA: Llama 3.3 Nemotron Super 49B V1.5", and set the Base URL to "https://openrouter.ai/api/v1".