NVIDIA NIM logo How to Get a Free NVIDIA NIM API Key (2026)

17 free models available — no credit card required. Get your NVIDIA NIM API key → Test free models →

NVIDIA NIM FreeLLM Score

74
Solid Choice — Strong in great tool compatibility

A solid choice for most developers with balanced limits and model quality.

🎁
Generosity Free limits
65/100
🌍
Accessibility Signup ease
65/100
📚
Breadth Model variety
70/100
Reliability Uptime
45/100
🔌
Compatibility Tool support
100/100
🧠
Quality Benchmarks
100/100

How we score →

What is NVIDIA NIM?

100+ open models from NVIDIA — no credit card, 40 RPM.

NVIDIA NIM (NVIDIA Inference Microservices) provides API access to 100+ open-weight models hosted on NVIDIA infrastructure. The free tier is available to all NVIDIA Developer Program members (free sign-up) with a limit of ~40 requests/minute. Models include Llama, Mistral, DeepSeek-R1, Nemotron, and domain-specific variants. All endpoints are OpenAI-compatible.

  • 100+ open models available
  • No daily token cap
  • ~40 RPM free tier
  • No credit card required

API Compatibility: OpenAI SDK-compatible (Chat Completions)

How to Get a NVIDIA NIM API Key

  1. 1
    Sign up at build.nvidia.com Free NVIDIA Developer account. No credit card.
  2. 2
    Go to Settings → API Keys
  3. 3
    Generate an API key
  4. 4
    Browse available models 100+ open models. Nemotron Super 49B recommended.
  5. 5
    Configure OpenAI client Base URL: https://integrate.api.nvidia.com/v1

All Free NVIDIA NIM Models — Context Windows & Rate Limits

Model Context Max Output Modality Rate Limit Released Status
minimaxai/minimax-m3 1.0M 512K textimage Up to 40 RPM Jun 1, 2026 Online
moonshotai/kimi-k2.6 262K 262K textimage Up to 40 RPM Apr 20, 2026 Online
z-ai/glm-5.1 203K 8K text Up to 40 RPM Apr 7, 2026 Online
stepfun-ai/step-3.7-flash 256K 256K textimage Up to 40 RPM May 29, 2026 Online
deepseek-ai/deepseek-v4-pro 1.0M 384K text Up to 40 RPM Apr 24, 2026 Online
minimaxai/minimax-m2.7 205K 131K text Up to 40 RPM Mar 18, 2026 Online
deepseek-ai/deepseek-v4-flash 1.0M 66K text Up to 40 RPM Apr 24, 2026 Online
stepfun-ai/step-3.5-flash 262K 16K text Up to 40 RPM Feb 2, 2026 Online
qwen/qwen3.5-397b-a17b 256K 8K textimage Up to 40 RPM Feb 16, 2026 Online
qwen/qwen3.5-122b-a10b 262K 262K textimage Up to 40 RPM Feb 24, 2026 Online
nvidia/nemotron-3.5-content-safety 128K 8K textimage Up to 40 RPM Jun 4, 2026 Online
nvidia/llama-3.3-nemotron-super-49b-v1.5 131K 16K text Up to 40 RPM Oct 10, 2025 Online
meta/llama-3.2-3b-instruct 131K 8K text Up to 40 RPM Sep 25, 2024 Online
meta/llama-3.1-70b-instruct 131K 16K text Up to 40 RPM Jul 23, 2024 Online
meta/llama-3.2-11b-vision-instruct 131K 16K textimage Up to 40 RPM Sep 25, 2024 Online
meta/llama-3.2-1b-instruct 131K 60K text Up to 40 RPM Sep 25, 2024 Online
meta/llama-guard-4-12b 164K 16K textimage Up to 40 RPM Apr 30, 2025 Online

NVIDIA NIM Free Tier Limits & Pricing

Credit Card Not required
Free Tier Permanently free
Context Range 128K – 1.0M
Total Models 17 free
Rate Limits Up to 40 RPM
API Compatibility OpenAI SDK-compatible (Chat Completions)

NVIDIA NIM API Setup Tutorial & Tools

NVIDIA NIM is fully compatible with popular AI coding assistants like Cursor, Claude Code, and more. To see step-by-step API configuration instructions for your favorite tool, please visit our Global Configuration Guide →

Use Cases

What NVIDIA NIM's free models are best for, based on aggregated model capabilities:

Chat 17 models Reasoning 2 models Vision 1 model

Limitations & Caveats

  • ~40 RPM shared across all models, not per-model
  • Some models require additional registration per model family
  • Unavailable models listed in catalog but uncallable with standard key

Frequently Asked Questions

Why can't I call certain models on NVIDIA NIM even though they're listed?

NVIDIA NIM's catalog includes all models, but some require additional per-model-family registration. If you get a 403 error, go to the model's page and click "Try API" to register for that specific model family.

Is the 40 RPM limit shared across all models?

Yes — NVIDIA NIM applies a global ~40 RPM limit to your API key, shared across all model calls. If you're using multiple models in parallel, the combined rate cannot exceed ~40 RPM.

Does NVIDIA NIM require phone verification?

Yes, NVIDIA Developer account signup requires phone number verification. This is a one-time step during account creation.

See our FAQ for common questions about free LLM APIs