NVIDIA: Parakeet TDT 0.6B v3
nvidia/parakeet-tdt-0.6b-v3
Parakeet TDT 0.6B v3 is NVIDIA's 600M-parameter multilingual speech-to-text model built on the FastConformer-TDT architecture. Trained on the Granary dataset (670,000+ hours of audio), it supports automatic language detection across all official EU languages and achieves a 6.34% average word error rate (WER) on the Hugging Face Open ASR Leaderboard. The model returns transcribed text with punctuation and segment timestamps.
- Input: audio
- Output: transcription
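
For context on the word error rate figure quoted above, here is a minimal sketch of how WER is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model's output, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not part of the model or of any leaderboard tooling. Leaderboards generally normalize text (casing, punctuation) before scoring, which this sketch does not do.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration only. It splits on whitespace
    and applies no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Under this definition, a 6.34% average WER corresponds to roughly 6 word-level errors per 100 reference words, averaged over the benchmark's test sets.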
Model data sourced from OpenRouter.