How to Get a Free NVIDIA NIM API Key (2026)
17 free models available — no credit card required. Get your NVIDIA NIM API key → Test free models →
NVIDIA NIM FreeLLM Score
A solid choice for most developers with balanced limits and model quality.
What is NVIDIA NIM?
100+ open models from NVIDIA — no credit card, 40 RPM.
NVIDIA NIM (NVIDIA Inference Microservices) provides API access to 100+ open-weight models hosted on NVIDIA infrastructure. The free tier is available to all NVIDIA Developer Program members (free sign-up) with a limit of ~40 requests/minute. Models include Llama, Mistral, DeepSeek-R1, Nemotron, and domain-specific variants. All endpoints are OpenAI-compatible.
- 100+ open models available
- No daily token cap
- ~40 RPM free tier
- No credit card required
API Compatibility: OpenAI SDK-compatible (Chat Completions)
How to Get a NVIDIA NIM API Key
- 1 Sign up at build.nvidia.com Free NVIDIA Developer account. No credit card.
- 2 Go to Settings → API Keys
- 3 Generate an API key
- 4 Browse available models 100+ open models. Nemotron Super 49B recommended.
- 5 Configure OpenAI client Base URL: https://integrate.api.nvidia.com/v1
All Free NVIDIA NIM Models — Context Windows & Rate Limits
| Model | Context | Max Output | Modality | Rate Limit | Released | Status |
|---|---|---|---|---|---|---|
| minimaxai/minimax-m3 | 1.0M | 512K | Up to 40 RPM | Jun 1, 2026 | Online | |
| moonshotai/kimi-k2.6 | 262K | 262K | Up to 40 RPM | Apr 20, 2026 | Online | |
| z-ai/glm-5.1 | 203K | 8K | Up to 40 RPM | Apr 7, 2026 | Online | |
| stepfun-ai/step-3.7-flash | 256K | 256K | Up to 40 RPM | May 29, 2026 | Online | |
| deepseek-ai/deepseek-v4-pro | 1.0M | 384K | Up to 40 RPM | Apr 24, 2026 | Online | |
| minimaxai/minimax-m2.7 | 205K | 131K | Up to 40 RPM | Mar 18, 2026 | Online | |
| deepseek-ai/deepseek-v4-flash | 1.0M | 66K | Up to 40 RPM | Apr 24, 2026 | Online | |
| stepfun-ai/step-3.5-flash | 262K | 16K | Up to 40 RPM | Feb 2, 2026 | Online | |
| qwen/qwen3.5-397b-a17b | 256K | 8K | Up to 40 RPM | Feb 16, 2026 | Online | |
| qwen/qwen3.5-122b-a10b | 262K | 262K | Up to 40 RPM | Feb 24, 2026 | Online | |
| nvidia/nemotron-3.5-content-safety | 128K | 8K | Up to 40 RPM | Jun 4, 2026 | Online | |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | 131K | 16K | Up to 40 RPM | Oct 10, 2025 | Online | |
| meta/llama-3.2-3b-instruct | 131K | 8K | Up to 40 RPM | Sep 25, 2024 | Online | |
| meta/llama-3.1-70b-instruct | 131K | 16K | Up to 40 RPM | Jul 23, 2024 | Online | |
| meta/llama-3.2-11b-vision-instruct | 131K | 16K | Up to 40 RPM | Sep 25, 2024 | Online | |
| meta/llama-3.2-1b-instruct | 131K | 60K | Up to 40 RPM | Sep 25, 2024 | Online | |
| meta/llama-guard-4-12b | 164K | 16K | Up to 40 RPM | Apr 30, 2025 | Online |
NVIDIA NIM Free Tier Limits & Pricing
NVIDIA NIM API Setup Tutorial & Tools
NVIDIA NIM is fully compatible with popular AI coding assistants like Cursor, Claude Code, and more. To see step-by-step API configuration instructions for your favorite tool, please visit our Global Configuration Guide →
Use Cases
What NVIDIA NIM's free models are best for, based on aggregated model capabilities:
Limitations & Caveats
- ~40 RPM shared across all models, not per-model
- Some models require additional registration per model family
- Unavailable models listed in catalog but uncallable with standard key
Frequently Asked Questions
Why can't I call certain models on NVIDIA NIM even though they're listed?
NVIDIA NIM's catalog includes all models, but some require additional per-model-family registration. If you get a 403 error, go to the model's page and click "Try API" to register for that specific model family.
Is the 40 RPM limit shared across all models?
Yes — NVIDIA NIM applies a global ~40 RPM limit to your API key, shared across all model calls. If you're using multiple models in parallel, the combined rate cannot exceed ~40 RPM.
Does NVIDIA NIM require phone verification?
Yes, NVIDIA Developer account signup requires phone number verification. This is a one-time step during account creation.