DeepSeek V4 Flash routs Xiaomi MiMo-V2.5

DeepSeek: DeepSeek V4 Flash vs Xiaomi: MiMo-V2.5

DeepSeek V4 Flash wins 34.0 to 17.0 by being usable, complete, and more faithful to the prompts. MiMo-V2.5 repeatedly looked polished while dropping facts, inventing details, or failing outright.

DeepSeek V4 Flash won this matchup where it matters: execution. In the TypeScript room-conflict task, it delivered an essentially working implementation with correct same-room overlap detection, ISO timestamps, ordered IDs, and sorted output. MiMo-V2.5 was not just weaker; it was truncated and syntactically incomplete.

The writing tasks showed the same split. On the customer delay email, DeepSeek named the real problem, that the finishing booth failed inspection, while keeping the tone calm and accountable. MiMo-V2.5 softened that into a vague “equipment issue” and even introduced an unsupported date, which is exactly the kind of helpful-sounding slippage that causes trouble.

In extraction and summarization, DeepSeek again kept more of the record intact. It captured the older export ranges and the absence of UI errors in the support-thread summary, and it pulled the reimbursement claims, dates, categories, receipts, and total into JSON accurately despite the code-fence violation. MiMo-V2.5 had some cleaner next actions in the support summary, but it omitted key facts; in the reimbursement task, it produced incomplete, invalid JSON and appears to have misidentified an employee.

**Final call: DeepSeek V4 Flash wins decisively.** MiMo-V2.5 has surface polish, but DeepSeek was the model you could actually hand the work to without spending the next hour repairing missing code, softened facts, and broken structured output.

TypeScript room conflict detector

Practical coding — TypeScript. Implement `export function findRoomConflicts(bookings: Booking[]): Conflict[]` for a clinic scheduler. Use these types: `type Booking = { id: string; room: string; start: string; end: string }; type Conflict = { room: string; a: string; b: string; overlapStart: string; overlapEnd: string };`. `start` and `end` are ISO-8601 strings with offsets. Two bookings conflict only if they are in the same room and their time ranges overlap by a positive duration; touching endpoints are not conflicts. Return one Conflict for each conflicting pair, with `a` and `b` as the two booking ids in lexicographic order, `overlapStart`/`overlapEnd` as ISO strings from `Date.toISOString()`, and the result sorted by room, overlapStart, then a, then b. Do not mutate the input. Return code only.

DeepSeek: DeepSeek V4 Flash:
```typescript
export function findRoomConflicts(bookings: Booking[]): Conflict[] {
  const conflicts: Conflict[] = [];
  const n = bookings.length;

  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      const a = bookings[i];
      const b = bookings[j];

      // Only consider same room
      if (a.room !== b.room) continue;

      // Parse start and end times
      const startA = new Date(a.start).getTime();
      const endA = new Date(a.end).getTime();
      const startB = new Date(b.start).getTime();
      const endB = new Date(b.end).getTime();

      // Compute overlap
      const overlapStart = Math.max(startA, startB);
      const overlapEnd = Math.min(endA, endB);

      // Positive overlap required (not just touching)
      if (overlapStart >= overlapEnd) continue;

      // Ensure lexicographic order of ids
      const [idA, idB] = a.id < b.id ? [a.id, b.id] : [b.id, a.id];

      conflicts.push({
        room: a.room,
        a: idA,
        b: idB,
        overlapStart: new Date(overlapStart).toISOString(),
        overlapEnd: new Date(overlapEnd).toISOString(),
      });
    }
  }

  // Sort by room, overlapStart, a, b
  conflicts.sort((c1, c2) => {
    const roomCmp = c1.room.localeCompare(c2.room);
    if (roomCmp !== 0) return roomCmp;
    const startCmp = c1.overlapStart.localeCompare(c2.overlapStart);
    if (startCmp !== 0) return startCmp;
    const aCmp = c1.a.localeCompare(c2.a);
    if (aCmp !== 0) return aCmp;
    return c1.b.localeCompare(c2.b);
  });

  return conflicts;
}
```
Xiaomi: MiMo-V2.5:
```typescript
type Booking = { id: string; room: string; start: string; end: string };
type Conflict = { room: string; a: string; b: string; overlapStart: string; overlapEnd: string };

export function findRoomConflicts(bookings: Booking[]): Conflict[] {
  const conflicts: Conflict[] = [];

  for (let i = 0; i < bookings.length; i++) {
    for (
```

DeepSeek V4 Flash provides a complete, essentially correct implementation that detects positive overlaps within the same room, formats overlap times with `toISOString`, orders ids, and sorts results as requested. Its only flaw is minor: it wraps the answer in a Markdown code fence despite the code-only instruction. MiMo-V2.5's answer is truncated and syntactically incomplete, so it cannot satisfy the task.
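The behaviors this verdict turns on (same-room filtering, positive-duration overlap, id ordering, and the final sort) can be checked mechanically. The following is a minimal sketch, not either model's output: the compact implementation, the `cmp` helper, and the `sample` bookings are all illustrative assumptions.

```typescript
type Booking = { id: string; room: string; start: string; end: string };
type Conflict = { room: string; a: string; b: string; overlapStart: string; overlapEnd: string };

// Plain code-unit comparison, reused for id ordering and for every sort key.
const cmp = (x: string, y: string): number => (x < y ? -1 : x > y ? 1 : 0);

// Illustrative reference implementation (an assumption, not a model's answer).
function findRoomConflicts(bookings: Booking[]): Conflict[] {
  const conflicts: Conflict[] = [];
  for (let i = 0; i < bookings.length; i++) {
    for (let j = i + 1; j < bookings.length; j++) {
      const x = bookings[i];
      const y = bookings[j];
      if (x.room !== y.room) continue; // different rooms never conflict
      const start = Math.max(Date.parse(x.start), Date.parse(y.start));
      const end = Math.min(Date.parse(x.end), Date.parse(y.end));
      if (start >= end) continue; // touching endpoints are not conflicts
      const xFirst = cmp(x.id, y.id) <= 0;
      conflicts.push({
        room: x.room,
        a: xFirst ? x.id : y.id,
        b: xFirst ? y.id : x.id,
        overlapStart: new Date(start).toISOString(),
        overlapEnd: new Date(end).toISOString(),
      });
    }
  }
  // Sorts the freshly built array only; the input is never mutated.
  return conflicts.sort(
    (p, q) =>
      cmp(p.room, q.room) ||
      cmp(p.overlapStart, q.overlapStart) ||
      cmp(p.a, q.a) ||
      cmp(p.b, q.b),
  );
}

// Hypothetical bookings that mix UTC and -04:00 offsets.
const sample: Booking[] = [
  { id: "b2", room: "Exam 1", start: "2026-04-06T09:00:00-04:00", end: "2026-04-06T10:00:00-04:00" },
  { id: "b1", room: "Exam 1", start: "2026-04-06T13:30:00Z", end: "2026-04-06T14:30:00Z" },
  { id: "b3", room: "Exam 1", start: "2026-04-06T10:00:00-04:00", end: "2026-04-06T11:00:00-04:00" },
  { id: "b4", room: "Exam 2", start: "2026-04-06T09:30:00-04:00", end: "2026-04-06T10:30:00-04:00" },
];

console.log(findRoomConflicts(sample));
```

With these hypothetical bookings, `b2` and `b3` only touch at 14:00 UTC, so the result should hold two conflicts, `b1`/`b2` and `b1`/`b3`, while `b4` is ignored because it sits in a different room. Using one comparator for both the id ordering and the sort avoids any disagreement between `<` and `localeCompare`, which can order mixed-case strings differently.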

Customer delay email

Professional writing — Draft a customer email from a small furniture company to 42 customers whose walnut desk orders are delayed because a finishing booth failed inspection. Audience: customers who already paid deposits. Tone: accountable, calm, not defensive. Length: 140–180 words. Include: new estimated ship window of July 18–24, option to keep order with free white-glove delivery, option to cancel for full deposit refund by replying before July 10, and a promise of another update next Friday. Do not mention legal liability or blame inspectors.

DeepSeek: DeepSeek V4 Flash:
**Subject:** Update on your walnut desk order

Dear [Customer Name],

We’re writing to let you know that your walnut desk order is delayed. During our quality checks, a finishing booth failed inspection, and we’ve paused production to correct the issue. We hold ourselves to high standards, and we want your desk to be perfect when it arrives.

Your new estimated ship window is **July 18–24**.

Here are your options:
- **Keep your order** — we’ll add free white-glove delivery as a thank-you for your patience.
- **Cancel for a full deposit refund** — just reply to this email by **July 10** and we’ll process it promptly.

We’ll send another update next Friday with more details on our progress.

Thank you for trusting us with your order — we take that responsibility seriously. If you have questions, please reply to this email or call us at [phone number].

Sincerely,
[Your Name]
[Company Name]

Xiaomi: MiMo-V2.5:
Subject: Important Update Regarding Your Walnut Desk Order

Dear [Customer Name],

We are writing with an important update regarding your recent walnut desk order. Due to an unexpected equipment issue in our finishing booth, we are experiencing a delay in production. We sincerely apologize for this disruption to your timeline.

Your new estimated shipping window is now July 18–24.

To thank you for your patience, we will include complimentary white-glove delivery with your order, should you choose to keep it.

If this new timeline does not work for you, we completely understand. You may cancel your order for a full refund of your deposit by simply replying to this email before July 10.

We are committed to delivering the quality you expect. We will provide another detailed update next Friday, July 5.

Please let us know how you would like to proceed.

Sincerely,
The Team at [Your Company Name]

DeepSeek V4 Flash states directly and transparently that the finishing booth failed inspection, while maintaining an accountable, calm tone and including all required options and dates. MiMo-V2.5 is polished but softens the cause into an 'equipment issue' and attaches a specific date to "next Friday" (July 5) that the prompt does not supply.

Support thread summary extraction

Summarization & extraction — Read the support-thread excerpt and return a concise JSON object with keys `summary` (max 35 words), `customer_impact`, `confirmed_facts` (array), `open_questions` (array), and `next_actions` (array of objects with `owner` and `task`). Excerpt: "Tue 09:12, Priya (Acme Labs): Since yesterday evening, CSV exports from the Trial Balance page finish but contain only headers. PDF export still works. We tested Chrome and Edge. Tue 09:40, Leo (Support): I reproduced on Acme sandbox for date ranges ending after 2026-03-31. Older ranges export rows. Tue 10:05, Marta (Eng): Likely caused by the nullable `department_code` added in report API v4.2. Not seeing errors in UI; worker logs show 17 failed row-serialization warnings. Tue 10:20, Priya: Month-end close is Friday; we need CSVs for auditors. Tue 10:35, Marta: Patch ready, needs review; workaround is selecting 'Group by: Account' instead of 'Department'."

DeepSeek: DeepSeek V4 Flash:
```json
{
  "summary": "CSV exports from Trial Balance page show only headers due to nullable `department_code` in API v4.2. PDF works. Patch ready, workaround available.",
  "customer_impact": "Month-end close is Friday; auditors need CSVs, but CSV exports are broken for date ranges after 2026-03-31.",
  "confirmed_facts": [
    "CSV exports from Trial Balance page produce only headers since yesterday evening.",
    "PDF export works correctly.",
    "Issue reproduced on Chrome and Edge.",
    "Reproduced on Acme sandbox for date ranges ending after 2026-03-31.",
    "Older date ranges export rows correctly.",
    "Likely caused by nullable `department_code` added in report API v4.2.",
    "No errors in UI; worker logs show 17 failed row-serialization warnings."
  ],
  "open_questions": [
    "When will the patch be reviewed and deployed?",
    "Does the workaround affect data accuracy or completeness?"
  ],
  "next_actions": [
    {
      "owner": "Marta (Eng)",
      "task": "Submit patch for review and notify upon approval."
    },
    {
      "owner": "Priya (Acme Labs)",
      "task": "Use workaround: select 'Group by: Account' instead of 'Department' for CSV exports."
    }
  ]
}
```
Xiaomi: MiMo-V2.5:
```json
{
  "summary": "CSV exports from Trial Balance produce only headers for date ranges after March 31, 2026. A patch is ready; month-end deadline is Friday.",
  "customer_impact": "Blocking critical month-end close process. Auditors require CSV data by Friday.",
  "confirmed_facts": [
    "CSV export fails for date ranges ending after 2026-03-31, showing headers only.",
    "PDF export continues to function correctly.",
    "Issue reproduced on Acme sandbox across Chrome and Edge.",
    "Engineering identifies likely cause as nullable `department_code` from report API v4.2.",
    "Worker logs show 17 failed row-serialization warnings.",
    "A 'Group by: Account' selection is a viable workaround."
  ],
  "open_questions": [
    "What is the exact timeline for the patch review and deployment?",
    "Are there any other data integrity or export issues linked to this change?"
  ],
  "next_actions": [
    {
      "owner": "Engineering (Marta)",
      "task": "Complete review and deploy patch for the CSV export issue."
    },
    {
      "owner": "Support (Leo)",
      "task": "Communicate the temporary workaround ('Group by: Account') to the customer."
    },
    {
      "owner": "Support (Leo)",
      "task": "Monitor and confirm patch resolution with the customer post-deployment."
    }
  ]
}
```

DeepSeek V4 Flash is more complete on confirmed facts, including that older ranges still export rows and that the UI shows no errors, while keeping the required structure and summary length. MiMo-V2.5 has stronger next actions but omits some key facts, conflates the sandbox reproduction with the customer's browser testing, and slightly overstates the customer impact.
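The structural half of this task (required keys, array shapes, and the 35-word summary cap) is mechanical and can be checked in code, whereas the judgment about which facts were kept cannot. Below is a minimal sketch of such a check. The function names, the whitespace-based word count, and the assumption that `confirmed_facts` and `open_questions` hold strings are illustrative choices, not part of the original prompt.

```typescript
// Counts whitespace-separated tokens; an assumed definition of "word".
const countWords = (text: string): number =>
  text.trim().split(/\s+/).filter(Boolean).length;

const isStringArray = (v: unknown): boolean =>
  Array.isArray(v) && v.every((item) => typeof item === "string");

const isAction = (v: unknown): boolean => {
  if (typeof v !== "object" || v === null) return false;
  const rec = v as Record<string, unknown>;
  return typeof rec.owner === "string" && typeof rec.task === "string";
};

// Returns a list of structural problems; an empty list means the shape is fine.
function checkThreadSummary(value: unknown, maxSummaryWords = 35): string[] {
  if (typeof value !== "object" || value === null) return ["not an object"];
  const v = value as Record<string, unknown>;
  const problems: string[] = [];

  if (typeof v.summary !== "string") {
    problems.push("summary must be a string");
  } else if (countWords(v.summary) > maxSummaryWords) {
    problems.push(`summary exceeds ${maxSummaryWords} words`);
  }
  if (typeof v.customer_impact !== "string") {
    problems.push("customer_impact must be a string");
  }
  for (const key of ["confirmed_facts", "open_questions"]) {
    if (!isStringArray(v[key])) problems.push(`${key} must be an array of strings`);
  }
  if (!Array.isArray(v.next_actions) || !v.next_actions.every(isAction)) {
    problems.push("next_actions must be an array of { owner, task } objects");
  }
  return problems;
}

// The two summaries quoted above, used as sample input.
const deepseekSummary =
  "CSV exports from Trial Balance page show only headers due to nullable `department_code` in API v4.2. PDF works. Patch ready, workaround available.";
const mimoSummary =
  "CSV exports from Trial Balance produce only headers for date ranges after March 31, 2026. A patch is ready; month-end deadline is Friday.";

console.log(countWords(deepseekSummary), countWords(mimoSummary));
```

Under this counting rule both quoted summaries come in well below the 35-word limit, so the difference between the two answers lies in content, not in structure.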

Messy reimbursements to JSON

Data wrangling / structured output — Convert the messy reimbursement notes into valid JSON only. Schema: `{ "claims": [ { "employee": string, "date": "YYYY-MM-DD", "category": "meals"|"taxi"|"lodging"|"supplies", "amount_usd": number, "receipt": boolean } ], "total_usd": number }`. Round amounts to 2 decimals. Interpret `rcpt`, `receipt`, `scan`, or `photo` as receipt present; `no slip` as false. Notes: `J. Osei - 4/6/26 taxi from LGA $38.40 rcpt; Mira Patel Apr 7 2026: hotel, USD 214, scan uploaded; osei, Joseph 2026-04-07 lunch with vendor 27.8 no slip; M. Patel 04-08-2026 printer paper + folders supplies $19.36 photo; Jordan Osei Apr 8 dinner $31.25 receipt`.

DeepSeek: DeepSeek V4 Flash:
```json
{
  "claims": [
    {
      "employee": "J. Osei",
      "date": "2026-04-06",
      "category": "taxi",
      "amount_usd": 38.40,
      "receipt": true
    },
    {
      "employee": "Mira Patel",
      "date": "2026-04-07",
      "category": "lodging",
      "amount_usd": 214.00,
      "receipt": true
    },
    {
      "employee": "Joseph Osei",
      "date": "2026-04-07",
      "category": "meals",
      "amount_usd": 27.80,
      "receipt": false
    },
    {
      "employee": "M. Patel",
      "date": "2026-04-08",
      "category": "supplies",
      "amount_usd": 19.36,
      "receipt": true
    },
    {
      "employee": "Jordan Osei",
      "date": "2026-04-08",
      "category": "meals",
      "amount_usd": 31.25,
      "receipt": true
    }
  ],
  "total_usd": 330.81
}
```
Xiaomi: MiMo-V2.5:
```json
{
  "claims": [
    {
      "employee": "Joseph Osei",
      "date": "2026-04-06",
      "category": "taxi",
      "amount_usd": 38.40,
      "receipt": true
    },
    {
      "employee": "Mira Patel",
      "date": "2026-04-07",
      "category": "lodging",
      "amount_usd": 214.00,
      "receipt": true
    },
    {
      "employee": "Joseph Osei",
      "date": "2026-04-07",
      "category": "meals",
      "amount_usd": 27.80,
      "receipt": false
    },
    {
      "employee": "Mira Patel",
      "date": "2026-04-08",
      "category": "supplies",
      "amount_usd": 19.36,
      "receipt": true
    },
    {
      "employee": "Joseph Osei",
      "date": "2026-04-08",
      "category": "meals",
      "amount_usd": 31.25,
      "receipt":
```

DeepSeek V4 Flash accurately extracts all claims, dates, categories, receipts, and the total, though it violates the 'JSON only' instruction by wrapping the result in a code fence. MiMo-V2.5's output is cut off mid-object, so it is invalid JSON, and it also appears to misidentify at least one employee: the final claim is attributed to Joseph Osei where the notes say Jordan Osei.
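Both failure modes named here, non-JSON wrapping and truncation, surface immediately under a strict parse, and the total can be re-derived from the claims. The following is a minimal sketch of that kind of check, assuming a hypothetical `checkReport` helper and comparing amounts in whole cents to sidestep floating-point drift. It validates shape and arithmetic only and cannot tell whether an employee name was resolved correctly.

```typescript
const CATEGORIES = ["meals", "taxi", "lodging", "supplies"];

// Returns a list of problems; an empty list means every check passed.
function checkReport(raw: string): string[] {
  let parsed: unknown;
  try {
    // Strict parse: a Markdown fence or a truncated object throws here.
    parsed = JSON.parse(raw);
  } catch {
    return ["not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null) return ["top level is not an object"];
  const report = parsed as Record<string, unknown>;
  if (!Array.isArray(report.claims)) return ["claims must be an array"];
  if (typeof report.total_usd !== "number") return ["total_usd must be a number"];

  const problems: string[] = [];
  let cents = 0;
  report.claims.forEach((item: unknown, i: number) => {
    if (typeof item !== "object" || item === null) {
      problems.push(`claims[${i}] is not an object`);
      return;
    }
    const c = item as Record<string, unknown>;
    if (typeof c.employee !== "string") problems.push(`claims[${i}].employee must be a string`);
    if (typeof c.date !== "string" || !/^\d{4}-\d{2}-\d{2}$/.test(c.date)) {
      problems.push(`claims[${i}].date must be YYYY-MM-DD`);
    }
    if (typeof c.category !== "string" || !CATEGORIES.includes(c.category)) {
      problems.push(`claims[${i}].category is not an allowed value`);
    }
    if (typeof c.receipt !== "boolean") problems.push(`claims[${i}].receipt must be a boolean`);
    if (typeof c.amount_usd !== "number") {
      problems.push(`claims[${i}].amount_usd must be a number`);
    } else {
      cents += Math.round(c.amount_usd * 100);
    }
  });

  // Compare in whole cents so 38.40 + 27.80 + ... does not drift.
  if (Math.round(report.total_usd * 100) !== cents) {
    problems.push("total_usd does not equal the sum of claim amounts");
  }
  return problems;
}

// A truncated fragment in the style of the cut-off answer above.
console.log(checkReport('{ "claims": [ { "employee": "Joseph Osei", "receipt":'));
```

For the amounts in the notes, the sum is 38.40 + 214.00 + 27.80 + 19.36 + 31.25 = 330.81, which matches the `total_usd` in the complete answer.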

Matchup powered by OpenRouter.