Z.ai opens GLM-5.2 to every coding-plan tier

The new flagship adds High and Max reasoning modes and a 1M-context configuration for coding agents such as Claude Code and OpenClaw.


Why it matters

Z.ai is using GLM-5.2 to compete where developers already work: inside coding agents. Access is only part of the test; the rest is whether quota and reliability hold up on real repos.

Illustration: An abstract, expanding AI core (GLM-5.2) connecting with diverse coding agent interfaces.

Z.ai made GLM-5.2 available on Saturday to every GLM Coding Plan user, including Lite, Pro, Max and Team plans, according to a two-post thread on X. The move puts the Beijing AI company's newest flagship model directly inside the subscription channel it has been building for coding agents, instead of holding the model back for a narrower enterprise or API-only rollout.

The announcement is short, but the accompanying developer documentation shows the product decision underneath it. Z.ai's model-switching guide says the GLM Coding Plan now supports GLM-5.2 for all users and lets developers switch to it inside their preferred coding agent. The same guide gives configuration paths for Claude Code and OpenClaw, and says GLM-5.2 supports two thinking-effort levels: High and Max. For coding work, Z.ai recommends Max effort for deeper reasoning and more reliable performance.

That is the point of the launch. Z.ai is not only shipping another model name into an already crowded model picker. It is trying to make GLM a drop-in backend for the tools developers already use to run agents over real codebases. In Claude Code, Z.ai's docs tell users to edit ~/.claude/settings.json and map ANTHROPIC_DEFAULT_SONNET_MODEL and ANTHROPIC_DEFAULT_OPUS_MODEL to glm-5.2[1m]. In OpenClaw, the docs show GLM-5.2 as a provider model with a 1,000,000-token context window and maxTokens set to 131,072.
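The Claude Code mapping described above can be sketched as a small Python helper. This is a minimal sketch rather than Z.ai's documented procedure: it assumes the two variables sit under the `env` key of `~/.claude/settings.json`, and the function names `apply_glm_mapping` and `update_settings_file` are hypothetical. Any other settings Z.ai's docs may require, such as an endpoint or API token, are left out.

```python
import json
from pathlib import Path

# Model name as given in Z.ai's docs; the [1m] suffix selects the 1M-context path.
GLM_MODEL = "glm-5.2[1m]"


def apply_glm_mapping(settings: dict) -> dict:
    """Return a copy of a Claude Code settings dict with both model slots set to GLM-5.2.

    Assumption: the variables belong under the "env" key of settings.json.
    """
    updated = dict(settings)
    env = dict(updated.get("env", {}))
    env["ANTHROPIC_DEFAULT_SONNET_MODEL"] = GLM_MODEL
    env["ANTHROPIC_DEFAULT_OPUS_MODEL"] = GLM_MODEL
    updated["env"] = env
    return updated


def update_settings_file(path: Path) -> None:
    """Read settings.json if it exists, apply the mapping, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(apply_glm_mapping(current), indent=2) + "\n")
```

Calling `update_settings_file(Path.home() / ".claude" / "settings.json")` would apply the mapping in place. Existing keys are preserved, so unrelated settings survive the edit.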

The 1M-context path is the most concrete technical claim Z.ai has documented around the rollout. The docs say users can enable GLM 1M context by adding the [1m] suffix to the model name and setting Claude Code's auto-compact window to 1,000,000. That gives Z.ai a clean pitch to developers working on large repos: route bigger codebase-understanding tasks through GLM-5.2, while keeping cheaper or routine work on older GLM models.

Z.ai's own usage policy makes clear that 'available to all users' does not mean unconstrained usage. Its GLM Coding Plan overview says Lite, Pro and Max plans are governed by 5-hour and weekly limits. It lists estimated caps of about 80 prompts per 5 hours and 400 per week for Lite, 400 per 5 hours and 2,000 per week for Pro, and 1,600 per 5 hours and 8,000 per week for Max. The company also says one prompt may invoke the model 15 to 20 times, so the prompt cap is not the same thing as a raw request cap.
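The gap between a prompt cap and a raw request cap can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above; the dictionary layout and the function name `model_call_range` are illustrative, and the result is an estimate, since Z.ai describes both the caps and the 15-to-20 range as approximate.

```python
# Estimated prompt caps from Z.ai's plan overview: (per 5 hours, per week).
PROMPT_CAPS = {
    "Lite": (80, 400),
    "Pro": (400, 2_000),
    "Max": (1_600, 8_000),
}

# Z.ai says one prompt may invoke the model 15 to 20 times.
CALLS_PER_PROMPT = (15, 20)


def model_call_range(prompts: int) -> tuple[int, int]:
    """Return the (low, high) number of model invocations implied by a prompt cap."""
    low, high = CALLS_PER_PROMPT
    return prompts * low, prompts * high


for plan, (per_5h, per_week) in PROMPT_CAPS.items():
    print(plan, "5h:", model_call_range(per_5h), "week:", model_call_range(per_week))
```

On these numbers, Lite's 80 prompts per 5 hours correspond to roughly 1,200 to 1,600 model invocations, and Max's 8,000 weekly prompts to roughly 120,000 to 160,000.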

The same page adds the caveat that matters for serious users: GLM-5.2 and GLM-5-Turbo are treated as advanced models, and their usage counts against quota at a higher rate: 3x during peak hours and 2x during off-peak hours. Z.ai says a limited-time benefit keeps GLM-5.2 and GLM-5-Turbo at 1x quota during off-peak hours through the end of September. It also recommends using GLM-5.2 for complex tasks and GLM-4.7 for routine tasks to avoid burning through quota too quickly.
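A rough way to read those multipliers is as a divisor on the prompt cap. The sketch below assumes the multiplier applies linearly to every GLM-5.2 prompt and rounds down; Z.ai's actual accounting may differ, and the names `MULTIPLIERS` and `effective_prompts` are hypothetical.

```python
# Quota multipliers for GLM-5.2 and GLM-5-Turbo as described on Z.ai's plan page.
MULTIPLIERS = {
    "peak": 3,
    "off_peak": 2,
    "off_peak_promo": 1,  # limited-time benefit, through the end of September
}


def effective_prompts(cap: int, period: str) -> int:
    """Prompts available if each one consumes `multiplier` units of quota.

    Assumption: linear deduction, rounded down to a whole prompt.
    """
    return cap // MULTIPLIERS[period]


# Example: the Pro plan's estimated cap of 400 prompts per 5 hours.
for period in MULTIPLIERS:
    print(period, effective_prompts(400, period))
```

Under that assumption, a Pro user's 400 prompts per 5 hours shrink to about 133 GLM-5.2 prompts at peak and 200 off-peak, and stay at 400 while the off-peak promotion applies.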

That quota language turns the launch into a segmentation move. Lite users can touch the flagship, which broadens distribution and gives Z.ai more developer feedback. Pro and Max users remain the target for heavier repo-scale tasks. The Team tier named in the X post suggests Z.ai also wants GLM-5.2 to be evaluated in group workflows, where a model's value is measured less by a single benchmark and more by whether it can survive messy, multi-file engineering work without exhausting a team's allowance.

Z.ai's positioning is tied to the company's origins. The company behind the brand, Knowledge Atlas Technology, lists HKEX stock code 02513 on its official site and presents GLM as part of a broader stack covering model APIs, AutoClaw, ChatGLM, AutoGLM, Zread.ai and AMiner. Public company profiles identify Peng Zhang as co-founder and CEO, Juanzi Li as co-founder and non-executive director, and Jie Tang as co-founder and chief scientist. That Tsinghua-rooted lineage has shaped Z.ai's model strategy: push GLM into open and developer-facing channels, then monetize access, tooling and usage at scale.

The competitive backdrop is obvious even though Z.ai does not name it in the thread. Claude Code has become one of the default interfaces for agentic software development, while Cursor, Cline, OpenCode and other coding tools increasingly let developers swap model providers. Z.ai is exploiting that opening. Rather than forcing developers into a proprietary editor, the GLM Coding Plan is framed as a subscription that works inside supported tools, including Claude Code, Cline and OpenCode, with extra MCP access for vision, web search, web reader and Zread.

The announcement does not back the model with fresh third-party benchmark results, and Z.ai's X thread does not disclose API timing, weights, parameter count or evaluation scores for GLM-5.2. The verified news is narrower and more operational: GLM-5.2 is now in the coding subscription, it exposes High and Max reasoning modes, and the docs describe a 1M-context configuration path for coding agents.

For Z.ai, that is still a meaningful release. In developer tools, distribution increasingly runs through the agent shell, not the model lab's homepage. GLM-5.2 reaching every coding-plan tier gives Z.ai a broader test surface and a stronger reason for developers to keep a GLM subscription wired into their daily workflow. The unanswered question is whether the model's reliability, latency and quota economics hold up once developers point it at the kind of long-running refactors that make or break coding-agent subscriptions.
