
NVIDIA: Parakeet TDT 0.6B v3

nvidia/parakeet-tdt-0.6b-v3

Parakeet TDT 0.6B v3 is NVIDIA's 600M-parameter multilingual speech-to-text model, built on the FastConformer-TDT architecture. Trained on the Granary dataset (more than 670,000 hours of audio), it automatically detects the spoken language across all official EU languages and achieves a 6.34% average word error rate on the Hugging Face Open ASR Leaderboard. It returns transcribed text with punctuation and segment timestamps.
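For readers unfamiliar with the metric, word error rate (WER) is the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that definition, not the leaderboard's scoring code; the function name `word_error_rate` is hypothetical, and it assumes plain whitespace tokenization with no text normalization, which real evaluations usually apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Under this definition, a 6.34% average WER corresponds to roughly six word errors per hundred reference words, averaged over the leaderboard's test sets.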

  • Input: audio
  • Output: transcription

Model data sourced from OpenRouter.