grok-4.3 vs DeepSeek-V4-Flash: Precision Beats Polish


grok-4.3 takes this matchup by being the stricter, cleaner finisher on structured-output tasks, while DeepSeek-V4-Flash wins the one audience-sensitive writing test. The scoreline is close, but the deciding errors are the kind that matter in production.

This wasn’t a blowout, but **grok-4.3 wins on the tasks that punish sloppiness**. The aggregate score favors grok-4.3, 36.7 to 34.4, and the per-task results explain why: one tie, one clear win for DeepSeek-V4-Flash, and two important wins for grok-4.3.

Start with the tie: on the **Go TTL cache purge fix**, both models did the job correctly. They fixed the expiration condition to `ExpiresAt <= nowUnix`, deleted entries in place instead of rebuilding the map, and returned the removal count. No daylight here, just formatting differences.

DeepSeek-V4-Flash earns real credit on the **vendor outage status update**. It was better calibrated for a non-technical support and sales audience, especially in telling customer-facing teams what to say and what action to take. That’s not cosmetic; it’s exactly what the prompt asked for, and DeepSeek-V4-Flash handled it more usefully than grok-4.3.

But grok-4.3 was sharper where **format compliance and exactness** mattered. On **meeting-notes summarization**, it matched the requested structure more faithfully: a true two-sentence summary, explicitly split into the two requested parts, without unnecessary markdown dressing. On **messy leads to clean JSON**, it delivered valid raw JSON exactly as requested, while DeepSeek-V4-Flash undercut an otherwise solid answer by wrapping the output in code fences, an avoidable miss that makes the result not strictly valid JSON.

**Final call: grok-4.3.** DeepSeek-V4-Flash is the better audience-aware communicator in this set, but grok-4.3 is the more reliable model when the prompt demands exact structure, valid machine-readable output, and clean adherence to spec. In a head-to-head, that’s the higher-value win.

Go TTL cache purge fix

Language: Go. Return code only. Fix and complete this function so it removes expired sessions from the map in place and returns the number removed. A session is expired when ExpiresAt is less than or equal to nowUnix. Do not allocate a new map. Leave unexpired sessions untouched.

```go
package session

type Session struct {
	UserID    string
	ExpiresAt int64
}

func PurgeExpired(sessions map[string]Session, nowUnix int64) int {
	removed := 0
	for id, s := range sessions {
		if s.ExpiresAt > nowUnix {
			delete(sessions, id)
			removed++
		}
	}
	return removed
}
```

grok-4.3:

```go
package session

type Session struct {
	UserID    string
	ExpiresAt int64
}

func PurgeExpired(sessions map[string]Session, nowUnix int64) int {
	removed := 0
	for id, s := range sessions {
		if s.ExpiresAt <= nowUnix {
			delete(sessions, id)
			removed++
		}
	}
	return removed
}
```

DeepSeek-V4-Flash:

```go
package session

type Session struct {
	UserID    string
	ExpiresAt int64
}

func PurgeExpired(sessions map[string]Session, nowUnix int64) int {
	removed := 0
	for id, s := range sessions {
		if s.ExpiresAt <= nowUnix {
			delete(sessions, id)
			removed++
		}
	}
	return removed
}
```

Both outputs correctly fix the condition to remove sessions whose ExpiresAt is less than or equal to nowUnix, delete entries in place without allocating a new map, and return the number removed. The only differences are formatting, which does not affect correctness.
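To make the boundary condition concrete, here is a minimal sketch that runs the corrected function against three sessions. The session IDs, user IDs, and timestamps are hypothetical, and the package is renamed to `main` only so the snippet runs on its own; neither model was asked to produce this.

```go
package main

import "fmt"

// Session mirrors the struct from the prompt.
type Session struct {
	UserID    string
	ExpiresAt int64
}

// PurgeExpired is the corrected function that both models returned.
func PurgeExpired(sessions map[string]Session, nowUnix int64) int {
	removed := 0
	for id, s := range sessions {
		// Expired means ExpiresAt <= nowUnix, so the boundary value is removed too.
		if s.ExpiresAt <= nowUnix {
			delete(sessions, id) // deleting during a range loop is permitted in Go
			removed++
		}
	}
	return removed
}

func main() {
	// Hypothetical data: one expired, one expiring exactly now, one still live.
	sessions := map[string]Session{
		"a": {UserID: "u1", ExpiresAt: 99},
		"b": {UserID: "u2", ExpiresAt: 100},
		"c": {UserID: "u3", ExpiresAt: 101},
	}
	removed := PurgeExpired(sessions, 100)
	fmt.Println(removed, len(sessions)) // prints "2 1"
}
```

With the original `>` comparison from the prompt, the same call would have deleted only the live session "c" and kept the two expired ones, which is the inversion both models corrected.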

Vendor outage status update

Write a status update for our internal operations Slack channel.

Context: Our billing vendor, NorthPass Pay, started returning intermittent 502 errors at 09:14 UTC. About 18% of card update attempts are failing in checkout and the account settings page. Existing subscriptions are not being canceled, but some renewals may be delayed by up to 2 hours. We enabled a retry queue at 09:31 UTC and are seeing gradual improvement. Next update will be at 10:15 UTC unless resolved sooner.

Audience: non-technical customer support and sales staff.
Tone: calm, clear, no blame.
Length: 90-120 words.

grok-4.3: Hi everyone, Our billing partner NorthPass Pay started having intermittent connection issues beginning at 09:14 UTC. This is affecting approximately 18% of card update attempts in checkout and the account settings page. Existing subscriptions continue without interruption, but some renewals could be delayed by as much as 2 hours. We set up a retry queue at 09:31 UTC, and we're observing gradual improvement. We'll share the next update at 10:15 UTC unless the situation improves sooner. If customers contact you regarding payment problems, please inform them that our team is aware and working to resolve it promptly. Thank you for your support.
DeepSeek-V4-Flash: **Status Update: NorthPass Pay Intermittent Errors** 🕘 *09:14 UTC* – We began seeing intermittent 502 errors from our billing vendor, NorthPass Pay. This is causing about 18% of card update attempts to fail on checkout and account settings pages, but **no subscriptions are being canceled**. 🔄 *09:31 UTC* – We activated a retry queue for failed payments. Some renewals may be delayed by up to 2 hours, but we’re seeing gradual improvement. 📅 *Next update* – 10:15 UTC (or sooner if resolved). **What this means for you:** If a customer reports a card update error, reassure them their subscription remains active. For renewal delays, let them know we’re retrying automatically. No action needed from your side—we’re monitoring closely. Thanks for your patience! 🙏

Both are clear and accurate, but DeepSeek-V4-Flash's update is better tailored to the non-technical support/sales audience because it explicitly explains what to tell customers and what action, if any, is needed. It stays calm and clear while preserving all key facts within the requested length.

Summarize meeting notes and actions

Read the meeting notes below, then provide:
1) a 2-sentence summary
2) a JSON object with keys: "launch_date", "blocked_items", "decisions", "action_items"

Meeting notes:
- PrismLane mobile release check-in, Tue 12 Mar, 15:00
- Mara: Android build 4.9.2 passed regression. iOS still has the offline-sync spinner bug when reconnecting after 30+ minutes.
- Devlin: Legal approved the revised location disclosure text this morning.
- Chen: App Store screenshots in French-CA are still missing for the sleep-tracking feature.
- Decision: keep the public launch date as 26 March if iOS fix lands by Friday 17:00.
- Decision: remove the bedtime streak badge from this release; analytics event naming is inconsistent and not worth fixing now.
- Action: Mara to pair with Jo on iOS reconnect bug today.
- Action: Chen to deliver French-CA screenshots by Thursday noon.
- Action: Devlin to draft fallback plan for a one-week slip in case iOS fix misses cutoff.
- Risk: support article still mentions "night score" instead of new label "sleep balance".

grok-4.3:

1) The PrismLane mobile release check-in on March 12 covered remaining blockers for the upcoming launch, including an unresolved iOS offline-sync bug after long disconnects and missing French-CA App Store screenshots for sleep tracking, while noting Android had passed regression and legal had approved updated location text. Decisions were made to target a March 26 public launch if the iOS fix is ready by Friday, remove the bedtime streak badge due to inconsistent analytics, and assign follow-ups to Mara, Chen, and Devlin along with a fallback slip plan.

2)
{
  "launch_date": "26 March",
  "blocked_items": [
    "iOS offline-sync spinner bug on reconnect after 30+ minutes",
    "missing App Store screenshots in French-CA for sleep-tracking feature"
  ],
  "decisions": [
    "keep the public launch date as 26 March if iOS fix lands by Friday 17:00",
    "remove the bedtime streak badge from this release due to inconsistent analytics event naming"
  ],
  "action_items": [
    "Mara to pair with Jo on iOS reconnect bug today",
    "Chen to deliver French-CA screenshots by Thursday noon",
    "Devlin to draft fallback plan for a one-week slip in case iOS fix misses cutoff"
  ]
}

DeepSeek-V4-Flash:

**Summary:** The team is targeting a March 26 launch for the PrismLane mobile release, contingent on the iOS offline-sync spinner bug being fixed by Friday at 17:00. The bedtime streak badge has been removed from this release, and several action items were assigned to address remaining blockers like missing Canadian-French screenshots and a potential one-week slip plan.

**JSON object:**

```json
{
  "launch_date": "26 March",
  "blocked_items": [
    "iOS offline-sync spinner bug when reconnecting after 30+ minutes",
    "Missing App Store screenshots in French-CA for sleep-tracking feature"
  ],
  "decisions": [
    "Keep public launch date as 26 March if iOS fix lands by Friday 17:00",
    "Remove bedtime streak badge from this release due to inconsistent analytics event naming"
  ],
  "action_items": [
    "Mara to pair with Jo on iOS reconnect bug today",
    "Chen to deliver French-CA screenshots by Thursday noon",
    "Devlin to draft fallback plan for a one-week slip in case iOS fix misses cutoff"
  ]
}
```

Both outputs are accurate and follow the requested structure, but grok-4.3 better matches the prompt by explicitly numbering the two parts and providing a true 2-sentence summary without extra markdown wrappers. DeepSeek-V4-Flash is also strong, but its added formatting and labels are slightly less aligned with the requested plain summary-plus-JSON format.
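The JSON half of either answer can also be checked mechanically. The sketch below is one possible approach, not the method used to score this matchup: it decodes the object into a struct with the four requested keys and rejects any key outside that set. The type name `MeetingSummary`, the helper `parseSummary`, and the choice of string-array field types are assumptions based on what both models returned; the input is an abbreviated stand-in for the real output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// MeetingSummary mirrors the four keys requested in the prompt.
// The field types are an assumption inferred from both models' outputs.
type MeetingSummary struct {
	LaunchDate   string   `json:"launch_date"`
	BlockedItems []string `json:"blocked_items"`
	Decisions    []string `json:"decisions"`
	ActionItems  []string `json:"action_items"`
}

// parseSummary (hypothetical helper) decodes strictly: keys that are not
// part of MeetingSummary cause an error instead of being silently ignored.
func parseSummary(s string) (MeetingSummary, error) {
	var m MeetingSummary
	dec := json.NewDecoder(strings.NewReader(s))
	dec.DisallowUnknownFields()
	err := dec.Decode(&m)
	return m, err
}

func main() {
	// Abbreviated stand-in for the JSON object in either answer.
	input := `{
		"launch_date": "26 March",
		"blocked_items": ["iOS offline-sync spinner bug"],
		"decisions": ["remove the bedtime streak badge"],
		"action_items": ["Chen to deliver French-CA screenshots by Thursday noon"]
	}`
	m, err := parseSummary(input)
	if err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println(m.LaunchDate, len(m.ActionItems)) // prints "26 March 1"
}
```

Note that this check only catches extra keys and wrong types. A missing key would decode to a zero value without error, so a stricter validator would also need to confirm that each field was actually present.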

Messy leads to clean JSON

Convert the messy lead notes below into valid JSON as an array of objects. Use exactly this schema for each object:
{"name": string, "company": string, "email": string|null, "phone": string|null, "interested_in": [string], "follow_up_by": string|null}

Rules:
- Normalize emails to lowercase.
- Keep phone numbers as strings exactly as written.
- "interested_in" must be an array of short lowercase tags.
- "follow_up_by" must be ISO date YYYY-MM-DD when a clear date exists, otherwise null.
- Omit duplicates by email; keep the most complete entry.

Lead notes:
1) Talia Voss / Mirefen Labs / talia@Mirefen.ai / asked about SOC 2 + audit logs / call back next Wed (2026-04-15)
2) Noor P. from KiteThread — noor@kitethread.co — interested in SSO, SCIM, annual billing — phone 212.555.0188
3) Mirefen follow-up: TALIA@mirefen.ai, phone: none, also wants API rate-limit details
4) "J. Cardenas" | Pelago Retail | jc@pelagoretail.com | wants csv export + sandbox | follow up 2026/04/18
5) Evan Silk, Oakrelay, no email, phone +44 20 7946 1182, asked for nonprofit pricing
6) Noor Patel / KiteThread / noor@kitethread.co / asked if invoice billing is possible / follow-up by April 19, 2026

grok-4.3:

[
  {
    "name": "Talia Voss",
    "company": "Mirefen Labs",
    "email": "talia@mirefen.ai",
    "phone": null,
    "interested_in": ["soc 2", "audit logs", "api rate limits"],
    "follow_up_by": "2026-04-15"
  },
  {
    "name": "Noor Patel",
    "company": "KiteThread",
    "email": "noor@kitethread.co",
    "phone": "212.555.0188",
    "interested_in": ["sso", "scim", "annual billing", "invoice billing"],
    "follow_up_by": "2026-04-19"
  },
  {
    "name": "J. Cardenas",
    "company": "Pelago Retail",
    "email": "jc@pelagoretail.com",
    "phone": null,
    "interested_in": ["csv export", "sandbox"],
    "follow_up_by": "2026-04-18"
  },
  {
    "name": "Evan Silk",
    "company": "Oakrelay",
    "email": null,
    "phone": "+44 20 7946 1182",
    "interested_in": ["nonprofit pricing"],
    "follow_up_by": null
  }
]

DeepSeek-V4-Flash:

```json
[
  {
    "name": "Talia Voss",
    "company": "Mirefen Labs",
    "email": "talia@mirefen.ai",
    "phone": null,
    "interested_in": ["soc 2", "audit logs", "api rate-limiting"],
    "follow_up_by": "2026-04-15"
  },
  {
    "name": "Noor Patel",
    "company": "KiteThread",
    "email": "noor@kitethread.co",
    "phone": "212.555.0188",
    "interested_in": ["sso", "scim", "annual billing", "invoice billing"],
    "follow_up_by": "2026-04-19"
  },
  {
    "name": "J. Cardenas",
    "company": "Pelago Retail",
    "email": "jc@pelagoretail.com",
    "phone": null,
    "interested_in": ["csv export", "sandbox"],
    "follow_up_by": "2026-04-18"
  },
  {
    "name": "Evan Silk",
    "company": "Oakrelay",
    "email": null,
    "phone": "+44 20 7946 1182",
    "interested_in": ["nonprofit pricing"],
    "follow_up_by": null
  }
]
```

grok-4.3's output is valid raw JSON and correctly deduplicates, normalizes emails, preserves phone formatting, and converts dates. DeepSeek-V4-Flash's content is similarly strong, but it wraps the JSON in Markdown code fences, so the response as a whole is not strictly valid JSON as requested.

Matchup powered by OpenRouter.