grok-4.3 vs DeepSeek-V4-Flash: Precision Beats Polish
grok-4.3 takes this matchup by being the stricter, cleaner finisher on structured-output tasks, while DeepSeek-V4-Flash wins the one audience-sensitive writing test. The scoreline is close, but the deciding errors are the kind that matter in production.
This wasn’t a blowout, but **grok-4.3 wins on the tasks that punish sloppiness**. The aggregate score favors grok-4.3, 36.7 to 34.4, and the per-task results explain why: one tie, one clear win for DeepSeek-V4-Flash, and two important wins for grok-4.3.

Start with the tie: on the **Go TTL cache purge fix**, both models did the job correctly. They fixed the expiration condition to `ExpiresAt <= nowUnix`, deleted entries in place instead of rebuilding the map, and returned the removal count. No daylight here, just formatting differences.

DeepSeek-V4-Flash earns real credit on the **vendor outage status update**. It was better calibrated for a non-technical support and sales audience, especially in telling customer-facing teams what to say and what action to take. That’s not cosmetic; it’s exactly what the prompt asked for, and DeepSeek-V4-Flash handled it more usefully than grok-4.3.

But grok-4.3 was sharper where **format compliance and exactness** mattered. On **meeting-notes summarization**, it matched the requested structure more faithfully: a true two-sentence summary, explicitly split into the two requested parts, without unnecessary markdown dressing. On **messy leads to clean JSON**, it delivered valid raw JSON exactly as requested, while DeepSeek-V4-Flash undercut an otherwise solid answer by wrapping the output in code fences, an avoidable miss that makes the result not strictly valid JSON.

**Final call: grok-4.3.** DeepSeek-V4-Flash is the better audience-aware communicator in this set, but grok-4.3 is the more reliable model when the prompt demands exact structure, valid machine-readable output, and clean adherence to spec. In a head-to-head, that’s the higher-value win.
Go TTL cache purge fix
Language: Go. Return code only. Fix and complete this function so it removes expired sessions from the map in place and returns the number removed. A session is expired when ExpiresAt is less than or equal to nowUnix. Do not allocate a new map. Leave unexpired sessions untouched.

```go
package session

type Session struct {
	UserID    string
	ExpiresAt int64
}

func PurgeExpired(sessions map[string]Session, nowUnix int64) int {
	removed := 0
	for id, s := range sessions {
		if s.ExpiresAt > nowUnix {
			delete(sessions, id)
			removed++
		}
	}
	return removed
}
```
Both outputs correctly fix the condition to remove sessions whose ExpiresAt is less than or equal to nowUnix, delete entries in place without allocating a new map, and return the number removed. The only differences are formatting, which does not affect correctness.
Vendor outage status update
Write a status update for our internal operations Slack channel. Context: Our billing vendor, NorthPass Pay, started returning intermittent 502 errors at 09:14 UTC. About 18% of card update attempts are failing in checkout and the account settings page. Existing subscriptions are not being canceled, but some renewals may be delayed by up to 2 hours. We enabled a retry queue at 09:31 UTC and are seeing gradual improvement. Next update will be at 10:15 UTC unless resolved sooner. Audience: non-technical customer support and sales staff. Tone: calm, clear, no blame. Length: 90-120 words.
Both are clear and accurate, but DeepSeek-V4-Flash is better tailored to the non-technical support and sales audience: it explicitly explains what to tell customers and what action is needed. It stays calm and clear while preserving all key facts within the requested length.
Summarize meeting notes and actions
Read the meeting notes below, then provide:
1) a 2-sentence summary
2) a JSON object with keys: "launch_date", "blocked_items", "decisions", "action_items"

Meeting notes:
- PrismLane mobile release check-in, Tue 12 Mar, 15:00
- Mara: Android build 4.9.2 passed regression. iOS still has the offline-sync spinner bug when reconnecting after 30+ minutes.
- Devlin: Legal approved the revised location disclosure text this morning.
- Chen: App Store screenshots in French-CA are still missing for the sleep-tracking feature.
- Decision: keep the public launch date as 26 March if iOS fix lands by Friday 17:00.
- Decision: remove the bedtime streak badge from this release; analytics event naming is inconsistent and not worth fixing now.
- Action: Mara to pair with Jo on iOS reconnect bug today.
- Action: Chen to deliver French-CA screenshots by Thursday noon.
- Action: Devlin to draft fallback plan for a one-week slip in case iOS fix misses cutoff.
- Risk: support article still mentions "night score" instead of new label "sleep balance".
Both outputs are accurate and follow the requested structure, but grok-4.3 better matches the prompt by explicitly numbering the two parts and providing a true 2-sentence summary without extra markdown wrappers. DeepSeek-V4-Flash is also strong, but its added formatting and labels are slightly less aligned with the requested plain summary-plus-JSON format.
Messy leads to clean JSON
Convert the messy lead notes below into valid JSON as an array of objects. Use exactly this schema for each object:

{"name": string, "company": string, "email": string|null, "phone": string|null, "interested_in": [string], "follow_up_by": string|null}

Rules:
- Normalize emails to lowercase.
- Keep phone numbers as strings exactly as written.
- "interested_in" must be an array of short lowercase tags.
- "follow_up_by" must be ISO date YYYY-MM-DD when a clear date exists, otherwise null.
- Omit duplicates by email; keep the most complete entry.

Lead notes:
1) Talia Voss / Mirefen Labs / talia@Mirefen.ai / asked about SOC 2 + audit logs / call back next Wed (2026-04-15)
2) Noor P. from KiteThread — noor@kitethread.co — interested in SSO, SCIM, annual billing — phone 212.555.0188
3) Mirefen follow-up: TALIA@mirefen.ai, phone: none, also wants API rate-limit details
4) "J. Cardenas" | Pelago Retail | jc@pelagoretail.com | wants csv export + sandbox | follow up 2026/04/18
5) Evan Silk, Oakrelay, no email, phone +44 20 7946 1182, asked for nonprofit pricing
6) Noor Patel / KiteThread / noor@kitethread.co / asked if invoice billing is possible / follow-up by April 19, 2026
grok-4.3’s output is valid raw JSON and correctly deduplicates, normalizes emails, preserves phone formatting, and converts dates. DeepSeek-V4-Flash’s content is similarly strong, but it wraps the JSON in Markdown code fences, so it is not strictly valid JSON as requested.
Matchup powered by OpenRouter.