I don't find a closed-source Chinese agent system trustworthy.
It is essentially a black box with full user permissions, meaning you are just handing over your entire system to a Chinese-owned server. With OpenCode and its GLM provider, at least I can monitor which files were read, which were edited, and what commands were executed.
Not to mention that Chinese national security laws legally obligate companies to cooperate with state intelligence and counter-espionage efforts [0]. If you have this installed on a corporate workstation, and your company is large enough, the possibility of them spying on you is not just a risk—it's almost a certainty.
Thank you. It doesn't make sense to me how much people trust our companies so much more than Chinese ones for no reason. This country has an abysmal track record when it comes to respecting its citizen's rights or privacy. Propaganda working as intended I suppose.
It’s not no reason. At a fundamental level I don’t trust the companies any differently. But at a professional level, nobody is going to question my using Claude or OpenAI in a professional capacity - to work on customer projects, analyze their data, etc.
I also consider Microsoft to be the biggest industrial spy in the world, them and google both are no doubt mining everything you type into office / gsuite, all your emails, etc. But nobody bats an eye when you write a word doc about some sensitive matter.
If my customers thought I was feeding their data into a Chinese owned LLM API (which to be clear I’m not), I don’t think it would go over well, and I’d be exposed legally to all sorts of things.
So the reason is risk aversion and desire to participate in US / western commerce. One can debate the actual threat, but why would you ever risk sending your data to a processor perceived as dodgy?
Both are abysmal, but as a US citizen bad behavior from Chinese corporations and government is vastly more limited in how negatively it can impact my life in a practical way than bad behavior from US corporations and government.
I think it’s a real concern. Chinese companies are much more closely tied to the state, as in if you decide to go to China one day they might already have all the data on how you have interacted with their models.
The US is certainly inching in that direction but it’s not like someone from the US government sits at Anthropic’s HQ reading chats from state people of interest.
> all the data on how you have interacted with their models
1) there is a very non-zero chance that the US government also has that data from OpenAI and possibly Anthropic
2) unless you are asking the chinese models to draw up plans to overthrow the chinese government, it's extremely unlikely they would ever care.
while china has a track record of harassing it's own dissident citizens abroad, if you're not chinese and not trying to subvert their government (or are a high-ranking government official yourself), it's kind of silly to suppose they would ever care about you or what you do.
and if you have information they want for their own national development purposes, like EUV engineers, they are much more likely to offer you fabulous amounts of money instead of try to intimidate or threaten it out of you.
As someone who loves using OpenCode w/ local Chinese open source models, this is basically my take on this as well. There's no way I would ever put a piece of proprietary Chinese software that gets full system control on anything important. This is definitely something I would only ever run sandboxed in a lab environment for toy projects, not for serious work. I feel only marginally better about Codex/Claude Code, hence my strong preference for local LLMs w/ OpenCode, but a proprietary approach to Chinese models is a hard no from me dawg.
It's impressive all these companies are getting away with "base usage allowance included" [1] or "standard limits" [2], layering the higher plans as a multiplier of that "base" but never disclosing what it is.
I guess the base is whatever the profit margin needs to be this month.
I'm somewhat surprised that this is not open source (from what I can tell). Compare to Mimo Code https://github.com/XiaomiMiMo/MiMo-Code (which is a CLI, while this is a desktop app).
I don't even know what I would do with a desktop app. I'm running these things in headless VMs, so I can run them with `--dangerously-skip-permissions` or whatever. I don't trust them, even without that flag, on my desktop/laptop.
They might be sending some user requests to Anthropic to gather trading data for their own models. If they do so, perhaps they need to add some tracer to request that they prefer to hide.
Verbally minimising potential threats is not a valid approach to managing risk. We have seen mass misuse of tokens acquired through nefarious means to distill models and enhance training as a way of catching up recently, among other related issues. It is quite appropriate to wonder what else might be going on.
Given that there's such severe concern being expressed by Anthropic about Claude being distilled, and the idea that the harness is part of the the moat, it doesn't seem super surprising that the other side of that would try to also make it harder for them to tell how well they're doing and what their approach is.
It's only a cli because they yanked out the opencode desktop code. (As well as the opencode go/zen model provider)
Edit: my theory is they wanted to mimic being the primary provider in a quick way with a lot of string replace. Though they could have added opencode back as a regular provider.
If you're already used to your TUI coding agent, you don't need the desktop agent. Although it is nice that it is there for folks who prefer the Codex App/Claude App UI approach.
Yeah, I use GLM 5.2 in OpenCode, running in a Docker container with CodeNomad as the web-based GUI. It works perfectly; I can access it from anywhere, and it runs all models (except for Anthropic's subscriptions).
For some tasks, it's better. Opus refuses tasks for me pretty regularly. GLM 5.2 has never refused a task. So for anything security-related or that touches on topics that trigger Opus's safety guardrails, I use GLM 5.2.
OTOH, for anything related to UI design, I use Opus 4.8. It's much better at taking relatively vague descriptions of user interfaces and a mockup of a related UI and combining them into an immaculate design.
For anything else, I tend to run tasks in Opus and then have GLM review them and write a Markdown file with anything it finds. Then I have Opus review the markdown file and fix the issues it agrees with. The reason I usually go with Opus 4.8 first is mainly that it's faster. Opus 4.8 is, on average, about twice as fast as GLM 5.2 running on z'ai's infrastructure for the same task. There's a large variance (sometimes GLM 5.2 is pretty fast and Opus 4.8 is pretty slow), but on average it's a very noticeable difference.
When I run into Anthropic's Quota, I switch to GLM 5.2 rather than Sonnet. I don't think there's much reason to ever use Sonnet for anything if you can use GLM 5.2 instead.
This is all pretty subjective, of course. On average, I think Opus 4.8 is still a better, more reliable, and faster model, but if it went away tomorrow and I only had GLM 5.2, I wouldn't be too sad about it; I'd get things done with GLM 5.2 just fine.
I’ve never had a refusal coding, and in some areas (AI red teaming specifically) I’ve found it quite good at recognizing and discussing “white hat” stuff that in the past I think would have got refusals.
But when there was the Hantavirus thing a while back, I asked it if there was a vaccine under development and got a refusal immediately. I’ve had a few like that. It seems they’ve implemented really poor guardrails on certain topics (CBRN and cyber) that have lots of false positives. But if you actually chat with the model itself it’s quite lucid about what is legitimately dangerous and what is just performative “AI Safety” style refusal.
Yeah, I’ve had Opus (and Fable) perform full security audits on my codebases that would run for 30mins. That’s what I think would have tripped it but went just fine.
Do you guys use it through open router? Do you have any concerns about how the data you send is being intercepted? Not that I trust Anthropic but it’s widely agreed that it’s kosher to use them for commercial work, I can’t see comfortably sending any customer data to openrouter.
Edit- I see down-thread you use z.ai directly. Same concern, aren’t you worried about using it for professional stuff.
Are you micromanaging your GLM costs? It seems the best bang for buck strategy right now is a Opencode Go subscription to get the subsidized rate and then switch to Openrouter's model above and beyond that + make use of a dual model strategy by having GLM 5.2 do planning and Deepseek V4 Flash for implementation.
Nice, lucky! The Opencode Go GLM 5.2 quota gets used up so fast. It's an expensive model. And while impressive for being open weight, it seems slower than Opus and GPT. So I typically only use it after exhausting quotas of discounted GPT5.5 or Opus 4.6^ paid plans.
Yea not touching this with an any-foot pole. They are just keeping up with the Joneses now. There is no reason for this to exist but there IS a reason it is not open source. ;)
Looks quite pretty! Not sure if I want to try that instead of OpenCode, maybe. OpenCode also has a desktop app, I will admit that I like their TUI one better (and honestly more than Claude Code TUI) but whole the desktop version is kinda more basic, it's nice enough: https://opencode.ai/download
That said, it's interesting that they're releasing a bunch of stuff: ZCode, OCR.z.ai, Image.z.ai, Audio.z.ai, AutoClaw and some other stuff that https://chat.z.ai/ links to. That's a lot of stuff for one org to pull off.
Figured I'd try out their Pro coding plan, seems like it doesn't necessarily give me that much quota than Opus (at least given how many tokens are needed for accomplishing a certain task), but GLM 5.2 in of itself seems like a beefier Sonnet model, pretty good.
It gets even wilder when you click on "Join the Linux Beta Group". That leads you to https://www.feishu.cn/download eventually. I have no clue what feishu.cn is and I don't see a language toggle. Sometimes it just seems like chinese companies simply don't want international business.
Does anyone use an agnostic TUI or harness for development tasks that can fairly seamlessly switch between providers?
I'm wanting local context in the spirit of "here are 3 AI providers available, for coding tasks use this one... and for writing prose use this one... and for generating images use this one..." etc.
OpenCode was the first agent harness I used, and I have always like it. You can configure a wide variety of providers, but it's open source and has a number of core contributors.
The other opinionated option is Pi (the Pi agent harness). This is a great lightweight option and also supports a number of providers. You can also use local model servers.
I’ve written a skill for codex and Claude code that designates an orchestrator on the primary worktree and “workers” on N supporting worktrees.
The supporting worktrees are labeled wb1, wb2…wbn. You run either Claude or Codex in tabs for each work tree.
Then you generally work with the orchestrator on the primary worktree. It delegates tasks to the different workers and answers their smaller questions, surfacing results and assisting them with context clearing when needed.
The orchestrator and workers communicate using a simple shared file system under tmp/* and together they can handle a big and varied workload.
The orchestrator knows which AI client is running in any given worktree, so it would be fairly easy to designate which AI should receive what kind of tasks.
I do have some AI TUI specific instructions, for instance codex is primitive at monitoring compared to CC.
I use iterm2, so I’ve also added iterm2 specific python that allows the orchestrator to “kick” a worker by modifying the input and submitting it.
This also allows the orchestrator to monitor and reset its own context when necessary.
All context resets are handled gracefully, and continuation prompt and comms history allows workers and orchestrators to ably restore and continue their work without need to compact.
I use the one that I've been developing since 2023. It's intended to be used in exactly this spirit! Written in Go, has image support (which has yet to be fleshed out).
It supports MCP (unlike Pi), sandboxing (with user-mode networking), and runs efficiently at huge contexts.
have used both pi and opencode for the last 6 months, haven't opened a proprietary harness (cc, codex, cursor) in that same amount of time. right now i'm on pi and i can switch seamlessly between any model across any provider i want, even mid session. can even point them at locally running models.
i think people don't realize how much better life is over on this side, cc and codex rely entirely on vendor lock in imo.
i like the more minimal design of the tui, feels more integrated with my existing terminal workflows. oc always looked a little out of place. i really like pi's extension ecosystem as well.
For GLM Coding Plan subscribers, quota consumed via Coding Plan for GLM-5.2 in ZCode is discounted by the coefficients below — the same usage draws down less quota, roughly 1.5x the effective allowance.
Peak hours (14:00–18:00 daily) 3x -> 2x
Off-peak (remaining 20 hours) 1x -> 0.67x
I wonder whether that is referring to local time, or CST (UTC+8)?
Thanks. Those are some odd hours though, why would evening time be peak hours? Usually (in the western world anyway), 9AM - 12PM would be peak hours. Things normally slow down post-lunch, and be its slowest at close-of-business.
I would very much agree. Even the hand icon, the usage in the text field, and the sidebar style are 1:1 identical to Codex. It's a misleading title - it's not close the Claude Code.
if you're going to try this one out, don't be surprised to get this message repeatedly, like 4 out of 5 prompts you're trying to send, 24/7, this is gonna be your new friend, then you'll learn to write the only prompt that matters: "retry", "retry", "retry"
Here's the message: "Cannot connect to API: write EPIPE"
It's sad to see that the teams that have the most resources that can contribute to development of next-gen harnesses are essentially copying the same exact thing from each other, with no meaningful changes.
And most of the advancement and experimentation happens in some random 0-star github repos.
I've been working on my own private harness for the past 8 months, and I've been collecting ideas from such repos I've stumbled upon.
pi-tmux is one such example (seems to be archived now) which inspired me to use tmux as communication layer and provide visibility of subagents of multiple models in their native harnesses [1].
There's also herdr, which is not 0-stars, but is super interesting but lesser known project [2]. This also has interesting substrates to allow agent coordination.
None of these are harnesses per se, but they're pointing towards clear gaps in existing harnesses. For example, we've known for a while now that compounding knowledge of different class of models achieves better performance. Why is there no harness where this is a native functionality? And there's no harness where subagents are first class citizens both in terms of capabilities and UX.
"Quality" of the harness matters a lot to the user experience, and the construction of the harness will depend on the behavior/quirks of the underlying model. So, if you're using Claude Code, you can expect it to work best with Anthropic models, and expect other model-makers to want you to use the harness they've developed.
It is essentially a black box with full user permissions, meaning you are just handing over your entire system to a Chinese-owned server. With OpenCode and its GLM provider, at least I can monitor which files were read, which were edited, and what commands were executed.
Not to mention that Chinese national security laws legally obligate companies to cooperate with state intelligence and counter-espionage efforts [0]. If you have this installed on a corporate workstation, and your company is large enough, the possibility of them spying on you is not just a risk—it's almost a certainty.
[0]: https://en.wikipedia.org/wiki/National_Intelligence_Law_of_t...
I also consider Microsoft to be the biggest industrial spy in the world, them and google both are no doubt mining everything you type into office / gsuite, all your emails, etc. But nobody bats an eye when you write a word doc about some sensitive matter.
If my customers thought I was feeding their data into a Chinese owned LLM API (which to be clear I’m not), I don’t think it would go over well, and I’d be exposed legally to all sorts of things.
So the reason is risk aversion and desire to participate in US / western commerce. One can debate the actual threat, but why would you ever risk sending your data to a processor perceived as dodgy?
The US is certainly inching in that direction but it’s not like someone from the US government sits at Anthropic’s HQ reading chats from state people of interest.
1) there is a very non-zero chance that the US government also has that data from OpenAI and possibly Anthropic
2) unless you are asking the chinese models to draw up plans to overthrow the chinese government, it's extremely unlikely they would ever care.
while china has a track record of harassing it's own dissident citizens abroad, if you're not chinese and not trying to subvert their government (or are a high-ranking government official yourself), it's kind of silly to suppose they would ever care about you or what you do.
and if you have information they want for their own national development purposes, like EUV engineers, they are much more likely to offer you fabulous amounts of money instead of try to intimidate or threaten it out of you.
Do you really think the US government doesn't get access or couldn't get access to any of your chats with Claude?
I guess the base is whatever the profit margin needs to be this month.
[1]: https://zcode.z.ai/en#:~:text=Base%20usage%20allowance%20inc...
[2]: https://support.google.com/gemini/answer/16275805?hl=en#:~:t...
Edit: my theory is they wanted to mimic being the primary provider in a quick way with a lot of string replace. Though they could have added opencode back as a regular provider.
If you're already used to your TUI coding agent, you don't need the desktop agent. Although it is nice that it is there for folks who prefer the Codex App/Claude App UI approach.
For some tasks, it's better. Opus refuses tasks for me pretty regularly. GLM 5.2 has never refused a task. So for anything security-related or that touches on topics that trigger Opus's safety guardrails, I use GLM 5.2.
OTOH, for anything related to UI design, I use Opus 4.8. It's much better at taking relatively vague descriptions of user interfaces and a mockup of a related UI and combining them into an immaculate design.
For anything else, I tend to run tasks in Opus and then have GLM review them and write a Markdown file with anything it finds. Then I have Opus review the markdown file and fix the issues it agrees with. The reason I usually go with Opus 4.8 first is mainly that it's faster. Opus 4.8 is, on average, about twice as fast as GLM 5.2 running on z'ai's infrastructure for the same task. There's a large variance (sometimes GLM 5.2 is pretty fast and Opus 4.8 is pretty slow), but on average it's a very noticeable difference.
When I run into Anthropic's Quota, I switch to GLM 5.2 rather than Sonnet. I don't think there's much reason to ever use Sonnet for anything if you can use GLM 5.2 instead.
This is all pretty subjective, of course. On average, I think Opus 4.8 is still a better, more reliable, and faster model, but if it went away tomorrow and I only had GLM 5.2, I wouldn't be too sad about it; I'd get things done with GLM 5.2 just fine.
But when there was the Hantavirus thing a while back, I asked it if there was a vaccine under development and got a refusal immediately. I’ve had a few like that. It seems they’ve implemented really poor guardrails on certain topics (CBRN and cyber) that have lots of false positives. But if you actually chat with the model itself it’s quite lucid about what is legitimately dangerous and what is just performative “AI Safety” style refusal.
Edit- I see down-thread you use z.ai directly. Same concern, aren’t you worried about using it for professional stuff.
That said, it's interesting that they're releasing a bunch of stuff: ZCode, OCR.z.ai, Image.z.ai, Audio.z.ai, AutoClaw and some other stuff that https://chat.z.ai/ links to. That's a lot of stuff for one org to pull off.
Figured I'd try out their Pro coding plan, seems like it doesn't necessarily give me that much quota than Opus (at least given how many tokens are needed for accomplishing a certain task), but GLM 5.2 in of itself seems like a beefier Sonnet model, pretty good.
There's an `EN` link at top right
I'm wanting local context in the spirit of "here are 3 AI providers available, for coding tasks use this one... and for writing prose use this one... and for generating images use this one..." etc.
OpenCode was the first agent harness I used, and I have always like it. You can configure a wide variety of providers, but it's open source and has a number of core contributors.
The other opinionated option is Pi (the Pi agent harness). This is a great lightweight option and also supports a number of providers. You can also use local model servers.
The supporting worktrees are labeled wb1, wb2…wbn. You run either Claude or Codex in tabs for each work tree.
Then you generally work with the orchestrator on the primary worktree. It delegates tasks to the different workers and answers their smaller questions, surfacing results and assisting them with context clearing when needed.
The orchestrator and workers communicate using a simple shared file system under tmp/* and together they can handle a big and varied workload.
The orchestrator knows which AI client is running in any given worktree, so it would be fairly easy to designate which AI should receive what kind of tasks.
I do have some AI TUI specific instructions, for instance codex is primitive at monitoring compared to CC.
I use iterm2, so I’ve also added iterm2 specific python that allows the orchestrator to “kick” a worker by modifying the input and submitting it.
This also allows the orchestrator to monitor and reset its own context when necessary.
All context resets are handled gracefully, and continuation prompt and comms history allows workers and orchestrators to ably restore and continue their work without need to compact.
https://goose-docs.ai/
It supports MCP (unlike Pi), sandboxing (with user-mode networking), and runs efficiently at huge contexts.
https://codeberg.org/mlow/lmcli
(The screenshot in the folder is a little bit out of date, but is still representative of the overall look)
https://github.com/charmbracelet/crush
i think people don't realize how much better life is over on this side, cc and codex rely entirely on vendor lock in imo.
I don't think I understand the token/cost implications of this feature
https://zcode.z.ai/en/docs/welcome
> Explanation and Recommendations Regarding Usage for Plan-Supported Models
> Note: Peak hours are from 14:00 to 18:00 daily (UTC+8).
https://docs.z.ai/devpack/overview
Here's the message: "Cannot connect to API: write EPIPE"
And most of the advancement and experimentation happens in some random 0-star github repos.
pi-tmux is one such example (seems to be archived now) which inspired me to use tmux as communication layer and provide visibility of subagents of multiple models in their native harnesses [1].
There's also herdr, which is not 0-stars, but is super interesting but lesser known project [2]. This also has interesting substrates to allow agent coordination.
None of these are harnesses per se, but they're pointing towards clear gaps in existing harnesses. For example, we've known for a while now that compounding knowledge of different class of models achieves better performance. Why is there no harness where this is a native functionality? And there's no harness where subagents are first class citizens both in terms of capabilities and UX.
[1] https://github.com/offline-ant/pi-tmux
[2] https://github.com/ogulcancelik/herdr
But mostly vendor lock-in, I imagine.
It does have a 1.5x usage promotion for GLM 5.2 on the coding plan so now is a good time to test it...