GLM 4.5: Open-Weights Model Aimed Squarely at Agents and1 and Long Context

Medium · 6 min read · original

GLM 4.5: Open-Weights Model Aimed Squarely at Agents and Long Context | by Fariha Batool | Medium

Sitemap

Open in app

Sign up

Sign in

Get app

Write

Search

Sign up

Sign in

Image 2

GLM 4.5: Open-Weights Model Aimed Squarely at Agents and Long Context

Image 3: Fariha Batool

Fariha Batool

Follow

2 min read

·

Jul 31, 2025

6

1

Listen

Share

On 28 July 2025 Z.ai (the company formerly known as Zhipu AI) released two fully open Mixture-of-Experts (MoE) checkpoints under an MIT licence:

GLM-4.5–355B-A32B — 355 B total parameters, 32 B active per token

GLM-4.5-Air-106B-A12B — cost-sensitive sibling, 106 B total / 12 B active

Both appear on Hugging Face and ModelScope in BF16, FP8 and INT formats; all 52 internal evaluation traces are public.

Why this release is notable

1 .Frontier-level, benchmarked quality

Internal runs across 12 public suites place the flagship model third overall (trailing only OpenAI o3 and Grok 4) and the Air variant sixth. Highlights:

2 Agent-centric architecture

3 Built-in speculative decoding

A new Multi-Token Prediction (MTP) layer drafts several tokens inside a single forward pass, no external “draft” model. The same pass also verifies the draft, yielding lower latency on mixed CPU-GPU rigs.

4 Transparent release artefacts

Weights, evaluation trajectories, and vLLM / SGLang guides ship together. GLM 4.5 is also only the second frontier model to validate the Muon optimiser at massive scale.

A deliberately broad prompt test

I ran the same wide but imprecise prompt against GLM 4.5 and OpenAI o3 to gauge coverage:

Let’s think step by step. Explain Azure Bot Framework and Custom Engine Agents—include App Registration, Redis, Application Insights, Key Vault, Web App, Tunnel Relay… Be very detailed.

GLM 4.5

o3

For documentation or diagram-heavy workflows, that’s a practical win for GLM 4.5. You can reproduce the test here:

Chat with Z.ai — Free AI for Presentations, Writing & Coding

What’s still missing

Bottom line

GLM 4.5 blends large-scale MoE, agent-friendly features and built-in speculative decoding while keeping everything under an open license. If the internal numbers hold up, it could become a go-to base for multilingual coding agents and long-context RAG pipelines. Another push in the East-West race toward truly general-purpose, tool-using LLMs.

Get Fariha Batool’s stories in your inbox

Join Medium for free to get updates from this writer.

Subscribe

Subscribe

Remember me for faster sign in

Have you benchmarked it yet? Comments are most welcome!

6

6

1

Image 4: Fariha Batool

Image 5: Fariha Batool

Follow

Written by Fariha Batool

6 followers

·3 following

AI enthusiast turning generative AI ideas into product features. I write bite‑sized notes on AI tools and breakthroughs, usually with a cup of coffee in hand.

Follow

Responses (1)

Image 6

Write a response

What are your thoughts?

Cancel

Respond

Image 7: Paragutekar

Paragutekar

Aug 1, 2025

Interesting article. GLM 4.5’s agentic architecture and speculative decoding really stand out.

Reply

More from Fariha Batool

Image 8: AI-generated composite of Barack Obama and Donald Trump sitting by a pool.

Image 9: Fariha Batool

Fariha Batool

I tried Gemini 2.5 Flash Image ### I spent some time putting Gemini 2.5 Flash Image Preview through real work. As a straight image generator, it feels excellent. Scenes come…

Aug 27, 2025

Image 10: The AI Agents Are Here, and They’re Already Building Their Own Society

Image 11: Fariha Batool

Fariha Batool

The AI Agents Are Here, and They’re Already Building Their Own Society ### Something extraordinary happened in the last week of January 2026. An open-source AI assistant called OpenClaw garnered over 100,000 GitHub…

Feb 1

1

Image 12: How Mixture‑of‑Experts (MoE) supercharges Today’s Frontier Language Models

Image 13: Fariha Batool

Fariha Batool

How Mixture‑of‑Experts (MoE) supercharges Today’s Frontier Language Models ### Recent public releases such as Kimi K2 (≈ 1 T total parameters, 32 B active) and GLM 4.5 (355 B total, 32 B active) highlight a sharp…

Jul 29, 2025

5

Image 14: From Idea to Research Pack in Under 15 Minutes

Image 15: Fariha Batool

Fariha Batool

From Idea to Research Pack in Under 15 Minutes ### ChatGPT’s new Agent mode doesn’t just answer questions — it opens its own browser and terminal, clicks around, writes files, and then…

Jul 27, 2025

13

See all from Fariha Batool

Recommended from Medium

Image 16: What Claude Design actually changes for designers

Image 17: Bootcamp

In

Bootcamp

by

Fanny

What Claude Design actually changes for designers ### The handoff problem is finally getting solved — and it’s happening faster than most of us expected.

Apr 20

270 6

Image 18: 6 brain images

Image 19: Write A Catalyst

In

Write A Catalyst

by

Dr. Patricia Schmidt 🧠

As a Neuroscientist, I Quit These 5 Morning Habits That Destroy Your Brain ### Most people do #1 within 10 minutes of waking (and it sabotages your entire day)

Jan 14

49K 1013

Image 20: I Failed Uber’s System Design Interview Last Month. Here’s Every Question They Asked.

Image 21: Emily

Emily

I Failed Uber’s System Design Interview Last Month. Here’s Every Question They Asked. ### It was much harder and the rejection email taught me more than any LeetCode grind ever could.

Feb 20

2.2K 49

Image 22: Vibe Coding is Over illustration of three ai generated landing pages with the words IT’S OVER written at the top in large text

Image 23: Michal Malewicz

Michal Malewicz

Vibe Coding is OVER. ### Here’s What Comes Next.

Mar 24

7.8K 303

Image 24: If You Understand These 5 AI Terms, You’re Ahead of 90% of People

Image 25: Towards AI

In

Towards AI

by

Shreyas Naphad

If You Understand These 5 AI Terms, You’re Ahead of 90% of People ### Master the core ideas behind AI without getting lost

Mar 29

15.3K 309

Image 26: AI Agents: Complete Course

Image 27: Data Science Collective

In

Data Science Collective

by

Marina Wyss

AI Agents: Complete Course ### From beginner to intermediate to production.

Dec 6, 2025

6.2K 264

See more recommendations

Help

Status

About

Careers

Press

Blog

Privacy

Rules

Terms

Text to speech