If Anthropic's compute is fully saturated, then the Claude Code power users do represent an opportunity cost to Anthropic much closer to $5,000 than $500.
Anthropic's models may be similar in parameter size to models on OpenRouter, but none of the others are in the headlines nearly as much (especially recently), so the comparison is extremely flawed.
The argument in this article is like comparing the cost of a Rolex to a random brand of mechanical watch based on gear count.
How confident are you in the Opus 4.6 model size? I've always assumed it was a beefier model with more active params than Qwen397B (17B active on the forward pass)
Yeah, that's a massive assumption they're making. I remember Musk revealed Grok was multiple trillion parameters. I find it likely Opus is larger.
I'm sure Anthropic is making money off the API but I highly doubt it's 90% profit margins.
Even if it's larger, OpenRouter has DeepSeek v3.2 (685B/37B active) at $0.26/0.40 and Kimi K2.5 (1T/32B active) at $0.45/2.25 (mentioned in the post).
Opus 4.6 likely has approximately 100B active parameters. OpenRouter lists the following throughput for Google Vertex:
42 tps for Claude Opus 4.6 https://openrouter.ai/anthropic/claude-opus-4.6
143 tps for GLM 4.7 (32B active parameters) https://openrouter.ai/z-ai/glm-4.7
70 tps for Llama 3.3 70B (dense model) https://openrouter.ai/meta-llama/llama-3.3-70b-instruct
For GLM 4.7, that makes 143 * 32B = 4576B parameters per second, and for Llama 3.3, we get 70 * 70B = 4900B, which makes sense, since denser models are easier to optimize. As a lower bound, we get 4576B / 42 ≈ 109B active parameters for Opus 4.6. (This assumes all three models use the same number of bits per parameter.)
Also curious if any experts can weigh in on this. I would guess in the 1 trillion to 2 trillion range.
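A quick sketch of that arithmetic, under the same stated assumptions (all models quantized to equal bits per parameter, and the serving hardware moving roughly the same "active parameters per second" for each):

```python
# Lower-bound estimate of Opus 4.6 active parameters from serving throughput.
# Throughput figures are the OpenRouter/Google Vertex numbers quoted above;
# the "equal bits per parameter" assumption is the commenter's, not measured.
glm_tps = 143        # tokens/s for GLM 4.7
glm_active_b = 32    # billions of active params per token
llama_tps = 70       # tokens/s for Llama 3.3 70B (dense)
llama_active_b = 70
opus_tps = 42        # tokens/s for Claude Opus 4.6

# Billions of active parameters the hardware pushes through per second
glm_params_per_sec = glm_tps * glm_active_b        # 4576
llama_params_per_sec = llama_tps * llama_active_b  # 4900

# Dividing the MoE figure by Opus's throughput gives the lower bound
opus_active_estimate = glm_params_per_sec / opus_tps
print(round(opus_active_estimate))  # 109 (billion active parameters)
```

Using the denser Llama figure instead (4900 / 42) would push the estimate to about 117B, so the ~100B ballpark is not very sensitive to which baseline you pick.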
Good article! Small suggestions:
1. It would be nice to define terms like RSI or at least link to a definition.
2. I found the graph difficult to read. It uses a computer font made to look hand-drawn, and it's a bit low resolution. With some googling, I'm guessing the words in parentheses are the clouds the model is running on. You could make that a bit clearer.
By the way, one of the charts in the article shows that Opus 4.6 is 10x costlier than Kimi K2.5.
I thought there was no moat in AI? Even being 10x costlier, Anthropic still doesn't have enough compute to meet demand.
Those "AI has no moat" opinions are going to be so wrong so soon.
Claude Code Max obviously doesn't cost 10x more than Kimi. The article even confirms that you can get $5k worth of compute for $200 with Claude Code Max.
So no, Claude would not be getting NEARLY as much usage as it's currently getting if it weren't for the $100/$200 monthly subscription. You're comparing Kimi to the price that most people aren't paying.
This is such a well-written essay. Every line revealed the answer to the immediate question I had just thought of
I can’t get past all the LLM-isms. Do people really not care about AI-slopifying their writing? It’s like learning about bad kerning, you see it everywhere.
I think you're just hallucinating because this does not come across as an AI article
I see quite a few:
“what X actually is”
“the X reality check”
Overuse of “real” and “genuine”:
> The real story is actually in the article. … And the real issue for Cursor … They have real "brand awareness", and they are genuinely better than the cheaper open weights models - for now at least. It's a real conundrum for them.
> … - these are genuinely massive expenses that dwarf inference costs.
This style just screams “Claude” to me.
It was almost certainly at least heavily edited with one. Ignoring the content, every single thing about the structure and style screams LLM.
Name checks out
People care, when they can tell.
Popular content is popular because it is above the threshold for average detection.
In a better world, platforms would empower defenders, by granting skilled human noticers flagging priority, and by adopting basic classifiers like Pangram.
Unfortunately, mainstream platforms have thus far not demonstrated strong interest in banning AI slop. This site in particular has actually taken moderation actions to unflag AI slop, in certain occasions...
I don’t see the usual tells in this essay
Is it fair to say the OpenRouter models aren't subsidized, though? The article makes the case that the companies on there are running a business, but there are free models, and companies with huge AI budgets that want to gather training data and show usage.
These margins are far greater than the ones Dario has indicated during many of his recent podcast appearances.
What did he say?
Nobody gets RSI typing “iterate until tests pass”
Recursive self improvement and Repetitive Strain Injury being the same initialism is really funny to me
Was anyone under the impression that it does? Serious question. I've never heard that, personally.
Ed Zitron made that claim (in particular here: [1]). In the same article he admits he's not a programmer and had to ask someone else to try out Claude Code and ccusage for him. He doesn't have any understanding of how LLMs or caching work. But he's prominent because he's received leaked financial details for Anthropic and OpenAI, e.g. [2]
[1] https://www.wheresyoured.at/anthropic-is-bleeding-out/ [2] https://www.wheresyoured.at/costs/
I mean, the very first paragraph of TFA is describing who is under that impression. Literally the first sentence:
> My LinkedIn and Twitter feeds are full of screenshots from the recent Forbes article on Cursor claiming that Anthropic's $200/month Claude Code Max plan can consume $5,000 in compute.
Twitter.
Ok, but so it does cost Cursor $5k per Cursor power user?? Still seems pretty rough..
Yes, you could turn it around to say that using Anthropic models in Cursor, Copilot, Junie, etc. is 'subsidising' Claude Code users.
$5 = $5
but $5 that I amortize over 7 years might end up being $1.7 maybe if I don't rapidly combust (supply chain risk)
No, to use $5k in Cursor you have to pay $5k.
I wonder how they are defining a power user. How many tokens, what could be the size the code base?
The $5k power user is the one that consistently uses all input and output tokens available under the Max subscription
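For a sense of how a "$5k power user" figure could arise from retail prices, here's a minimal sketch. The token volumes are purely hypothetical, and the per-token rates are assumed Opus-tier API prices, not confirmed figures for Opus 4.6:

```python
# Sketch: retail-API "value" of a heavy month on a $200 Max subscription.
# All numbers below are illustrative assumptions, not measured usage.
INPUT_PRICE_PER_M = 15.0   # $ per 1M input tokens (assumed Opus-tier rate)
OUTPUT_PRICE_PER_M = 75.0  # $ per 1M output tokens (assumed Opus-tier rate)

def retail_value(input_tokens_m: float, output_tokens_m: float) -> float:
    """Dollar value of usage if it were billed at retail API prices."""
    return input_tokens_m * INPUT_PRICE_PER_M + output_tokens_m * OUTPUT_PRICE_PER_M

# A hypothetical agentic-coding month: 250M input tokens, 16.7M output tokens
print(retail_value(250, 16.7))  # 5002.5 -> "$5k of compute" for a $200 plan
```

Note that this is retail-price value, not Anthropic's cost to serve it, which is the distinction the article is drawing; agentic workloads are also heavily cache-read input, which is billed far below the base input rate.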
> I'm fairly confident the Forbes sources are confusing retail API prices with actual compute costs
Aren't they losing money on the retail API pricing, too?
> ... comparisons to artificially low priced Chinese providers...
Yeah, no, this article does not pass the sniff test.
> Aren't they losing money on the retail API pricing, too?
No, they aren't, and probably neither is anyone else offering API pricing. Anthropic's margins on API usage are probably higher than anyone else.