AI — Page 6
Models, agents, infra, applied AI.
- SubQ Releases Its 1.1 Small Model Card as Dangel and Whedon Try to Prove Long Context Can Beat RAG
The Subquadratic team says its sparse-attention model hits 98% needle retrieval at 12M tokens, but access remains limited to design partners.
- Genesis AI's Eno robot rejects the humanoid default
Xian Zhou and Theophile Gervet are taking a wheeled, foldable path into physical AI after raising a $105 million seed round.
- Analysis: Why Salesforce is buying Fin for about $3.6B
Salesforce + Fin: packaging customer-service agents for Agentforce
- Moonshot AI's Yang Zhilin Pushes Kimi Deeper Into Coding Agents
Kimi K2.7-Code is a 1T-parameter MoE model with 32B active parameters, a 256K context window and open weights on Hugging Face.
- Anthropic faces class-action claim over Claude Max 20x limits
Karl Kahn says Claude Max 20x delivered six to eight times Pro usage, not the 20x Anthropic advertised.
- Microsoft's GitHub capacity crunch sends it to AWS
AI coding agents have turned GitHub reliability into an infrastructure problem Azure cannot absorb alone on Microsoft's timetable.
- Hermes Agent's new async subagents take aim at the blocking-agent problem
Teknium says the delegate tool can now fan out work without freezing the chat, a practical change for long-running agent workflows.
- Head to head: AnimateDiff Turbo vs Seedance 2 Image to Video
One model mostly gestures at the prompts; the other actually stages them. This matchup isn’t close: Seedance 2 Image to Video wins by turning specific shot language into coherent motion instead of settling for attractive approximation.
- Cartesia packages Sonic-3.5 and Ink-2 into a full voice-agent stack
Karan Goel is using benchmark wins to pitch Cartesia as both the speaking and listening layer for real-time AI agents.
- CrankGPT is a hand-cranked AI project with real edge-computing numbers
CrankGPT runs speech recognition, a small language model, and text-to-speech locally on a Raspberry Pi 5 with no battery or cloud.
- Head to head: Claude Opus 4.8 vs Kimi K2.7 Code
Claude Opus 4.8 wins three of four tasks with sharper regex engineering, more polished prose, and cleaner structured output; Kimi K2.7 Code manages only a tie on the JSON normalization task.
- Head to head: AuraFlow vs Rundiffusion Photo Flux
One model wins on the jobs that punish sloppiness: typography, layout discipline, and prompt-specific product detail. The other lands a moodier single-image hit, but not enough to overcome repeated misses where accuracy actually matters.
- Claude Code user says the coding assistant saved his life by pushing him to the ER for AFib
A 73-year-old developer said he mentioned feeling unwell during a coding task, and Claude Code kept urging immediate care before doctors treated a sudden AFib episode.
- SGLang adds DFlash to make Qwen 3.5 397B-A17B inference up to 4.3x faster
Z Lab, Modal and LMSYS released a DFlash drafter for Qwen's 397B model and benchmarked it above native MTP on 8x B200 GPUs.
- Anthropic's Fable shutdown turns into a trust fight with Washington
The company pulled Fable 5 and Mythos 5 after a June 12 export-control order, then sent technical staff to Washington to repair the relationship.
- NewCore emerges with $66M to make AI agents manageable identities
Zohar Alon's new identity-security startup is betting enterprises will need to govern agents like workers, not service accounts.
- Head to head: AnimateDiff Turbo vs Marey Realism V1.5
One model delivers attractive motion clips; the other actually follows the brief shot by shot. In both tests, Marey Realism V1.5 separates itself by turning prompt details into believable action instead of decorative near-misses.
- Head to head: AuraFlow vs Luma Uni-1 Edit
This matchup wasn’t close once the prompts demanded precise scene construction rather than just attractive images. AuraFlow can look polished, but Luma Uni-1 Edit was the model that actually followed the brief across all three tests.
- Rio 3.5 page says wrong weights were uploaded after Nex-AGI analysis
The updated model card says a base merge of Nex-N2-Pro and Qwen was uploaded by mistake, shifting the dispute from pure attribution to release discipline.
- Zuckerberg's $14 billion AI reset now needs customers
Alexandr Wang's Muse Spark gives Meta a proprietary model; the harder job is proving it can become more than ad infrastructure.
- Pearl's AI mining pitch faces a 112 MW usefulness test
A June preprint claims Pearl's GPU network is doing random matrix math, not verified AI work, challenging Omri Weinstein's core bet.
- PixelRAG makes the case that web RAG should read pixels, not parsed text
Yichuan Wang and collaborators show a screenshot-first retrieval system beating text pipelines, with lower agent token use and a real chunking gap.
- Kimi K2.7 ranks second in ErdosBench's mathematical research test, behind Fable 5 and ahead of GPT-5.5 xhigh
Przemek Chojecki's 14-problem smoke run puts Moonshot's new open-weight model behind Claude Fable-5-max and ahead of GPT-5.5 xhigh.
- Depthfirst turns FFmpeg into a proof point for autonomous security agents
The AI security startup says its agent found 21 FFmpeg zero-days for about $1,000, including an RCE exploit primitive.