Rio de Janeiro ships an open AI model built on Qwen
IplanRIO put Rio 3.5 Open 397B on Hugging Face with MIT licensing, a 1M-token context claim, and self-reported gains over Qwen's base model.
By Ryan Merket
Why it matters
A city government publishing a 397B-class open model shows how Qwen-based post-training is moving frontier-style AI work beyond private labs and cloud platforms.

Rio de Janeiro's municipal IT company, IplanRIO, has released Rio 3.5 Open 397B, a large open-weight AI model post-trained from Alibaba's Qwen 3.5 397B base model.
ELI5: Rio de Janeiro's city IT agency took a huge open AI model from Alibaba's Qwen, tuned it with a reasoning-efficiency method, and published the weights for others to inspect or run. The big question is whether its self-reported gains hold up when independent teams test it.
The release was spotted Saturday by SemiAnalysis (@SemiAnalysis_), which noted the unusual source of the model: not a frontier AI lab, cloud provider, or venture-backed startup, but the city government of Rio de Janeiro. The Hugging Face organization behind the upload is listed as "Prefeitura do Rio de Janeiro (City of Rio de Janeiro)," and the model card says the work was developed by IplanRIO, the municipal technology company responsible for information and communications technology resources for the city.
That makes the release more than another Qwen fine-tune. It is a public-sector agency operating like a model lab: it uses an open Chinese foundation model as the base layer and publishes the result under an MIT license. The model card describes Rio 3.5 Open 397B as a mixture-of-experts model with about 397 billion total parameters and about 17 billion active parameters, a 1,010,000-token context window, multilingual support, and image-text-to-text capability. Hugging Face metadata lists the repository as a Transformers and Safetensors model with Portuguese and English tags and shows the model size as 403 billion parameters, slightly above the card's stated total.
The base model is Qwen/Qwen3.5-397B-A17B, which Qwen's own model card describes as a 397 billion parameter vision-language model with 17 billion activated parameters, native 262,144-token context and extensibility up to 1,010,000 tokens. Qwen's card also says Qwen3.5-Plus is the hosted version corresponding to Qwen3.5-397B-A17B, with managed production features through Alibaba Cloud Model Studio. Rio's release takes that open base and tries to move the public checkpoint closer to the performance profile of the hosted model tier.
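For a sense of scale, the figures quoted from Qwen's card can be turned into two ratios: the share of parameters that is active, and how far the extended context stretches the native one. The snippet below is only back-of-the-envelope arithmetic on the numbers in the paragraph above; the function names are illustrative and do not come from either model card.

```python
# Figures as quoted from the Qwen model card in the article.
TOTAL_PARAMS = 397e9        # total parameters
ACTIVE_PARAMS = 17e9        # activated parameters
NATIVE_CONTEXT = 262_144    # native context length, in tokens
EXTENDED_CONTEXT = 1_010_000  # extended context length, in tokens


def active_fraction(total, active):
    """Share of the model's parameters that is activated."""
    return active / total


def context_extension_factor(native, extended):
    """How many times longer the extended context is than the native one."""
    return extended / native


if __name__ == "__main__":
    print(f"active share: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.1%}")
    print(f"context extension: {context_extension_factor(NATIVE_CONTEXT, EXTENDED_CONTEXT):.2f}x")
```

On those numbers, roughly 4.3% of the parameters are active, and the extended window is about 3.85 times the native one.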
The technical bet is SwiReasoning, a framework from Dachuan Shi, Abedelkadir Asi, Keying Li, Xiangchi Yuan, Leyan Pan, Wenke Lee, and Wen Xiao. The arXiv paper describes SwiReasoning as a training-free method that switches between explicit chain-of-thought reasoning and latent-space reasoning using confidence signals derived from next-token entropy trends. In plain terms, the model does not always have to write out every step of its reasoning path; the framework is designed to let it move between visible token-by-token reasoning and hidden-state reasoning, then return to explicit output when confidence improves.
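The switching idea can be illustrated with a toy controller. This is a deliberately simplified sketch, not the algorithm from the SwiReasoning paper: it assumes a bare rule in which falling next-token entropy (rising confidence) selects explicit reasoning and rising entropy selects latent reasoning. The function names, the starting mode and the example distributions are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def choose_modes(distributions, start_mode="latent"):
    """Toy switcher driven by the entropy trend between consecutive steps.

    Assumed rule (illustrative only): entropy falling -> "explicit",
    entropy rising -> "latent", unchanged entropy -> keep the current mode.
    """
    modes = []
    mode = start_mode
    previous = None
    for probs in distributions:
        current = entropy(probs)
        if previous is not None:
            if current < previous:
                mode = "explicit"   # confidence improving
            elif current > previous:
                mode = "latent"     # confidence dropping
        modes.append(mode)
        previous = current
    return modes


if __name__ == "__main__":
    # Hypothetical next-token distributions over a four-token vocabulary.
    steps = [
        [0.25, 0.25, 0.25, 0.25],
        [0.70, 0.10, 0.10, 0.10],
        [0.40, 0.30, 0.20, 0.10],
        [0.97, 0.01, 0.01, 0.01],
    ]
    print(choose_modes(steps))
```

A real controller would need more than a step-to-step comparison to avoid flipping modes on every small fluctuation; the sketch only shows how an entropy trend can act as a confidence signal.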
IplanRIO's model card says Rio 3.5 Open 397B was "explicitly trained to maximize the efficiency gained via latent reasoning." That is the claim to watch. The strongest open-model work right now is not only about parameter count. It is about whether post-training, inference policy, tool use and long-context behavior can make an already large base model cheaper or more useful in real workloads. Rio's card frames SwiReasoning as an efficiency layer as much as a quality layer: fewer emitted reasoning tokens when the model can search internally, more explicit reasoning when it needs to commit.
The benchmark claims are substantial, but they remain model-card claims. IplanRIO reports Rio 3.5 Open 397B at 70.8 on Terminal-Bench 2.1 versus 52.5 for the Qwen 3.5 397B base model and 70.3 for Qwen 3.7 Plus. It reports 80.2 on SWE-Bench Verified versus 76.2 for the base model, 90.9 on GPQA Diamond versus 88.4, and 93.9 on HMMT 2026 February versus 87.9. The card also lists comparisons against DeepSeek V4 Pro, Kimi-K2.6 and GPT 5.5 across coding, knowledge, math, multilingual, multimodal, agent and economic-value benchmarks.
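The size of each claimed gain over the base model is easier to compare side by side. The snippet below only subtracts the self-reported scores quoted above; the dictionary layout and function name are illustrative, and the numbers carry no more authority than the model card they come from.

```python
# (Rio 3.5 Open 397B, Qwen 3.5 397B base) scores as reported on the model card.
REPORTED = {
    "Terminal-Bench 2.1": (70.8, 52.5),
    "SWE-Bench Verified": (80.2, 76.2),
    "GPQA Diamond": (90.9, 88.4),
    "HMMT 2026 February": (93.9, 87.9),
}


def point_gains(scores):
    """Absolute gain in points over the base model, rounded to one decimal."""
    return {name: round(rio - base, 1) for name, (rio, base) in scores.items()}


if __name__ == "__main__":
    for name, gain in point_gains(REPORTED).items():
        print(f"{name}: +{gain} points")
```

The Terminal-Bench jump of 18.3 points is by far the largest; the other three reported gains are 4.0, 2.5 and 6.0 points.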
Those tables should be read with the usual discipline: they are useful directional data from the release author, not independent validation. The model is large enough that broad third-party replication will be expensive. Hugging Face currently lists Rio 3.5 Open 397B as not deployed by any inference provider, so most developers cannot try it through a one-click hosted endpoint. The model card's vLLM example uses tensor parallelism across eight GPUs and a 1,048,576-token max model length, a configuration that puts practical evaluation beyond hobbyist hardware.
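A rough weight-memory estimate shows why the eight-GPU configuration is out of hobbyist reach. The calculation below assumes 16-bit weights (2 bytes per parameter), which the article's sources do not state, and it ignores the KV cache, activations and runtime overhead, all of which add to the total. The function names are illustrative.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Memory needed for the weights alone, in decimal gigabytes.

    bytes_per_param=2 assumes 16-bit weights; this is an assumption,
    not a figure taken from the model card.
    """
    return num_params * bytes_per_param / 1e9


def per_gpu_gb(total_gb, num_gpus):
    """Even split of the weights across GPUs under tensor parallelism."""
    return total_gb / num_gpus


if __name__ == "__main__":
    for label, params in [("card total (397B)", 397e9), ("Hugging Face size (403B)", 403e9)]:
        total = weight_memory_gb(params)
        print(f"{label}: {total:.0f} GB total, {per_gpu_gb(total, 8):.2f} GB per GPU on 8 GPUs")
```

Under that assumption the weights alone come to roughly 794 to 806 GB, or about 100 GB per GPU across eight devices, before any memory is set aside for a long context.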
Still, the release lands at a specific pressure point in the open-model market. Qwen has become one of the default bases for developers who want frontier-adjacent capability without depending entirely on closed APIs. The question has shifted from "can I get the weights?" to "can I post-train, serve and specialize them well enough to match hosted systems?" Rio 3.5 Open 397B is a public-sector answer to that question: start from Qwen, add a reasoning-control method, publish the resulting weights, and let the community inspect the bet.
The public-sector angle matters because the incentives are different. A startup would use a release like this to sell API capacity, raise a round, or recruit enterprise pilots. A cloud provider would use it to pull workloads onto its infrastructure. IplanRIO's stated institutional role is to administer the city's technology resources, and the Rio model card presents the release as public AI infrastructure rather than a commercial product. That does not make the benchmark claims stronger, but it does make the move harder to slot into the standard AI-lab playbook.
The unanswered question is whether Rio 3.5 Open 397B becomes a real base for builders or a high-profile curiosity. MIT licensing helps. Qwen compatibility helps. Portuguese support gives it a local reason to exist beyond benchmark theater. But a 397B-class model is still a serious serving problem, and without widely available quantizations, hosted endpoints or replicated evals, the practical user base starts with well-funded labs and infrastructure-heavy teams.
For now, Rio de Janeiro has done the part most public agencies never attempt: it put a frontier-scale model artifact in the open, with a clear base model, a disclosed reasoning framework, and benchmark claims specific enough to be tested. The next test is whether independent operators can reproduce the gains once the release moves from a Hugging Face card into real deployments.