Nvidia
NVIDIA: Nemotron 3 Ultra
nvidia/nemotron-3-ultra-550b-a55b
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it supports text input and output with a context window of up to 1M tokens. It is suited for long-running agentic workflows, including agent orchestration, coding agents, deep research, and complex enterprise tasks. It is particularly strong at multi-step reasoning and planning, with high-throughput inference designed for high-volume agent pipelines. It is part of the NVIDIA Nemotron family of open models for agentic AI.
- Context window: 1,000,000 tokens
- Input: text
- Output: text
- Pricing: $0.5/M input tokens, $2.5/M output tokens
View on OpenRouter. Model data sourced from OpenRouter.