OpenAI

OpenAI: GPT Audio

openai/gpt-audio

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural-sounding voices and maintains better voice consistency. Audio tokens are priced at $32 per million input tokens and $64 per million output tokens.

  • Context window: 128,000 tokens
  • Input: text, audio
  • Output: text, audio
  • Pricing (text tokens): $2.50/M input tokens, $10/M output tokens
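
The listing quotes two sets of rates: $32/$64 per million tokens for audio and $2.5/$10 per million tokens in the pricing line. As a rough sketch of how a request's cost could be estimated from them, the Python below assumes that the $2.5/$10 rates apply to text tokens and that text and audio tokens are billed separately at their own rates; the listing does not state either point explicitly. The names `Rates`, `estimate_cost`, and `fits_context` are hypothetical helpers for illustration, not part of any OpenAI or OpenRouter library.

```python
from dataclasses import dataclass

PER_MILLION = 1_000_000
CONTEXT_WINDOW = 128_000  # tokens, from the listing


@dataclass(frozen=True)
class Rates:
    """Price in USD per one million tokens."""
    input_per_m: float
    output_per_m: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        # Scale the per-million rates down to the actual token counts.
        return (input_tokens * self.input_per_m
                + output_tokens * self.output_per_m) / PER_MILLION


# ASSUMPTION: the $2.5/$10 figures are the text-token rates.
TEXT_RATES = Rates(input_per_m=2.50, output_per_m=10.00)
# Audio rates as quoted in the model description.
AUDIO_RATES = Rates(input_per_m=32.00, output_per_m=64.00)


def estimate_cost(text_in: int = 0, text_out: int = 0,
                  audio_in: int = 0, audio_out: int = 0) -> float:
    """Estimate request cost in USD.

    ASSUMPTION: text and audio tokens are billed independently.
    """
    return (TEXT_RATES.cost(text_in, text_out)
            + AUDIO_RATES.cost(audio_in, audio_out))


def fits_context(total_tokens: int) -> bool:
    """Check a token count against the 128,000-token context window."""
    return total_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    usd = estimate_cost(text_in=1_000, text_out=500,
                        audio_in=10_000, audio_out=20_000)
    print(f"Estimated cost: ${usd:.4f}")
```

Under these assumptions, audio tokens dominate the bill: in the example call, the 30,000 audio tokens account for nearly all of the estimate, while the 1,500 text tokens add less than one cent.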

Model data sourced from OpenRouter.