Nvidia

NVIDIA: Nemotron 3 Ultra

nvidia/nemotron-3-ultra-550b-a55b

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it supports text input and output with a context window of up to 1M tokens. It is suited for long-running agentic workflows, including agent orchestration, coding agents, deep research, and complex enterprise tasks. It is particularly strong at multi-step reasoning and planning, with high-throughput inference designed for high-volume agent pipelines. It is part of the NVIDIA Nemotron family of open models for agentic AI.

  • Context window: 1,000,000 tokens
  • Input: text
  • Output: text
  • Pricing: $0.5/M input tokens, $2.5/M output tokens

View on OpenRouter. Model data sourced from OpenRouter.