Xiaomi
Xiaomi: MiMo-V2.5
xiaomi/mimo-v2.5
MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding tasks. Its 1M context window supports complete documents, extended conversations, and complex task contexts in a single pass, making it ideal for integration with agent frameworks where strong reasoning, rich perception, and cost efficiency all matter.
- Context window: 1,048,576 tokens
- Input: text, audio, image, video
- Output: text
- Pricing: $0.14/M input tokens, $0.28/M output tokens
View on OpenRouter. Model data sourced from OpenRouter.