> And there are reasons to even be really bullish about AI’s long-run profitability — most notably, the sheer scale of value that AI could create. Many higher-ups at AI companies expect AI systems to outcompete humans across virtually all economically valuable tasks. If you truly believe that in your heart of hearts, that means potentially capturing trillions of dollars from labor automation. The resulting revenue growth could dwarf development costs even with thin margins and short model lifespans.
We keep seeing estimates like this repeated by AI companies and such. There is something that really irks me with it though, which is that it assumes companies that are replacing labor with LLMs are willing to pay as much as (or at least a significant fraction of) the labor costs they are replacing.
In practice, I haven't seen that to be true anywhere. If Claude Code (for example) can replace 30% of a developer's job, you would expect companies to be willing to pay tens of thousands of dollars per seat for it. Anecdotally at $WORK, we get nickel-and-dimed on dev tools (it's somewhat better for AI tools). I don't expect corporate to suddenly agree to pay Anthropic $50k per developer even if they can lay off a third of us. Will anyone pay enough to realize that "trillions of dollars" of capture?
You’re right, and a big reason they won’t be able to capture the “full” value is because of competition, especially with open-source models. Sure, Claude will probably always be better… but better to the tune of $50k/seat?
LLMs are, ultimately, software. And we’ve had plenty of advances in software. None of them are priced at the market value of the labor they save. That’s just not how the economics work.
We will see what happens when they start raising prices.
Will developers who complain today that $200/month is too much stop using it, or start paying $2,000/month?
> But we can still do an illustrative calculation: let’s conservatively assume that OpenAI started R&D on GPT-5 after o3’s release last April. Then there’d still be four months between then and GPT-5’s release in August,20 during which OpenAI spent around $5 billion on R&D. But that’s still higher than the $3 billion of gross profits. In other words, OpenAI spent more on R&D in the four months preceding GPT-5, than it made in gross profits during GPT-5’s four-month tenure.
These numbers actually look really good. OpenAI's revenue has been increasing 10x year-on-year since 2023, which means spending even 3x the resources to produce another model next year would likely still generate a healthy profit. The newer models are more efficient, so inference costs tend to decrease as well.
As long as models can keep improving, and crucially if that improvement depends on the amount of compute you put into them, OpenAI and the other closed-source AI companies will succeed. If those two key assumptions stop being true, I can definitely see the whole house of cards crumbling as open-source competition eats their lunch.
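To make that back-of-envelope argument concrete, here is a rough sketch using the article's illustrative GPT-5 figures; the 10x revenue growth and 3x R&D growth multipliers are the assumptions from the comment above, not reported numbers.

```python
# Back-of-envelope sketch only; none of these are audited figures.
rd_cost = 5.0          # $B of R&D in the four months before GPT-5 (per the article)
gross_profit = 3.0     # $B of gross profit during GPT-5's four-month tenure

revenue_growth = 10.0  # assumed year-on-year multiplier on gross profit
rd_growth = 3.0        # assumed year-on-year multiplier on R&D spend

for year in range(1, 4):
    rd_cost *= rd_growth
    gross_profit *= revenue_growth
    print(f"year +{year}: R&D ~${rd_cost:.0f}B, gross profit ~${gross_profit:.0f}B, "
          f"surplus ~${gross_profit - rd_cost:.0f}B")

# Under these assumptions the very next generation already covers its own R&D;
# if revenue growth ever slows to match R&D growth, the gap never closes.
```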
But on the other hand every single model is competing in the sense that there's only one world to fill up with slop.
Maybe in a few years LinkedIn will have 10 times more AI slop, but that won't make either LinkedIn or the AI companies 10x more valuable.
I guess my point is that the models, IRL, salt the earth. Since model use became common, people have started using them to cheat on their schoolwork, cheat in job interviews, cheat on their duties to scientific integrity, and generally stop building and sharing communal knowledge. Why wouldn't you cheat and steal instead of improving yourself, when you're told that in the future your skill and intelligence will be utterly and completely negligible compared to that of The Machine?
This is already costing society trillions of dollars in practical damages. OpenAI is one of the world's biggest polluters, but in the past there has never been a need for legal protections against industrial-scale polluting of the well of human knowledge, so we're still catching up. Nevertheless, the unit economics look even this good only because they're dumping the real costs onto society itself.
Framing GPT-5 as a loss because of its short run like that is a bit weird. They say "the R&D that went into GPT-5 likely informs future models like GPT-6", but that really understates what is happening here.
Barring solid evidence otherwise, you would think GPT-5.2 was built largely on GPT-5, enough that the majority of the cost of 5.2 possibly lies in developing GPT-5.
It would be like shipping v1.0 on day one, discovering a bug, and shipping v1.01 the next day, then reporting at the end of the year that v1.0 massively lost money, but you wouldn't believe the profit made on v1.01: it was the single largest return on a single day of development ever seen.
The Uber comparison is funny because Uber burned $32B over 14 years before profitability. OpenAI alone is burning something like $10B/year and growing, so the US AI labs are probably close to 10x Uber's burn rate.
Given that AI lab burn rates are so high that AI capex shows up in nationwide economic stats, this clearly cannot go on for that long.
So what happens first: labs figure out how to get compute costs down by an order of magnitude, they add enough value to raise prices by an order of magnitude (the Uber route), or some labs begin imploding?
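A quick sanity check on those figures, treating all of them as the rough numbers quoted above rather than audited ones:

```python
# Rough comparison of burn rates; the "combined labs" figure is an implication
# of the ~10x claim above, not a reported number.
uber_total_burn_bn = 32          # ~$32B burned over 14 years before profitability
uber_years = 14
uber_burn_per_year = uber_total_burn_bn / uber_years   # ~$2.3B/year

openai_burn_per_year = 10        # ~$10B/year, per the comment above

print(f"Uber: ~${uber_burn_per_year:.1f}B/year")
print(f"OpenAI alone: ~{openai_burn_per_year / uber_burn_per_year:.1f}x Uber's rate")
print(f"10x Uber's rate would be ~${10 * uber_burn_per_year:.0f}B/year across all US labs")
```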
Keep in mind another aspect of the comparison: there wasn't an entire supply-chain spending effect triggered by Uber. That is, you didn't have car companies building new factories to produce 10x more cars, new roads being built, new gas stations being built, etc., the way you have for AI.
It's like the entire economy has taken 1 giant correlated bet here.
What I don't understand is: why would a company pay tens of thousands of dollars a month to Anthropic for Claude when a Chinese LLM is 99% as good, is open-weight, runs on US servers, and costs 5% of the price?
By what metrics are they 99% as good? There are a lot of benchmarks out there. Please share them.
I think the answer lies in the "we actually care a lot about that 1% (which is actually a lot more than 1%)".
Open models have been about 6 to 9 months behind frontier models, and this has been the case since 2024. That is a very long time for this technology at its current rate of development. If fast-takeoff theory is right, this gap should widen (although with Kimi K2.5 it might have actually shortened).
If we consider what typically happens with other technologies, we would expect open models to match others on general intelligence benchmarks in time. Sort of like how every brand of battery-powered drill you find at the store is very similar, despite being head and shoulders better than the best drill from 25 years ago.
> That is a very long time for this technology at its current rate of development.
Yes, as long as that gap stays consistent, there is no problem with building on ~9 months old tech from a business perspective. Heck, many companies are lagging behind tech advancements by decades and are doing fine.
> Sort of like how every brand of battery-powered drill you find at the store is very similar, despite being head and shoulders better than the best drill from 25 years ago.
They all get made in China, mostly all in the same facilities. Designs tend to converge under such conditions. Especially since design is not open loop - you talk to the supplier that will make your drill and the supplier might communicate how they already make drills for others.
I'm still testing myself and cannot make a confident statement yet, but Artificial Analysis is a solid and independent, though admittedly somewhat imperfect, source for a general overview: https://artificialanalysis.ai/
Going purely by Artificial Analysis, Kimi K2.5 is rather competitive on pure output quality, its agentic evals are also close to or beating US-made frontier models, and, lest we forget, the model is far more affordable than said competitors, to the point where it is frankly silly that we are actually comparing them.
For what it's worth, of the models I have been able to test so far, when focusing purely on raw performance (meaning solely task adherence, output quality and agentic capabilities, so discounting price, speed and hosting flexibility), I have personally found the prior Kimi K2 Thinking model to be overall more usable and reliable than Gemini 3 Pro and Flash. On output quality in very specific coding tasks, however, Opus 4.5 was in my testing leaps and bounds better than both the Gemini models and K2 Thinking, though its task adherence was surprisingly less reliable than Haiku 4.5 or K2 Thinking.
Given that it is many times more expensive and in some cases adheres to tasks less reliably, I really cannot say that Opus 4.5 is superior or Kimi K2 Thinking inferior here. The latter is certainly better in my specific usage than any Gemini model, and again, I haven't yet gone through this with K2.5. I try not to presume from the outset that K2.5 is better than K2 Thinking, though even if K2.5 merely stays at the same level of quality and reliability, just with multimodal input, that would make the model very competitive.
Usain Bolt's top speed is about 44.72 km/h. My top speed sprinting is about 25 km/h. That's at least 50% as good. But I'd have a hard time getting paid even half as much as Mr Bolt.
Even someone with marginally less top speed isn’t getting paid half as much as Usain Bolt. Athletes aren’t getting paid per unit of output. This analogy is not analogising.
Yeah, but you'd both be quite suitable to go walk to the grocery store.
I don't think 99% of the best is a good metric.
It is highly dependent on what the best represents.
If you had a 100% chance of not breaking your arm on any given day, what kind of value would you place on it over a 99% chance on any given day? I would imagine it to be pretty high.
The top models are not perfect, so they don't really represent 100% of anything on any scale.
If the best you could do is a 99% chance of not breaking your arm on any given day, then perhaps you might be more stoic about something that is 99% of 99%, which is close enough to 98% that you are 'only' going to double the number of broken arms you get in a year.
I suspect the decision to use AI in production will be calculated more as likelihood of pain than as increased widgets per hour. Recovery from disaster can easily eat any productivity gains.
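A quick sketch of that failure-rate arithmetic:

```python
# 99% of 99% roughly doubles the expected number of incidents per year.
best = 0.99               # the best available: 99% chance of no incident on a given day
near_best = 0.99 * best   # something that is "99% as good as the best"

days = 365
for label, p in [("99%", best), ("99% of 99%", near_best)]:
    print(f"{label}: ~{days * (1 - p):.1f} expected incidents per year")
# ~3.7 vs ~7.3 incidents: about twice the pain for a "1%" difference.
```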
As a heavy Claude Code user, I would like to have that option.
But if it's just 33% as good, I wouldn't bother.
Top LLMs have passed a usability threshold in the past few months. I haven't had the feeling the open models (from any country) have passed it as well.
When they do, we'll have a realistic option of using the best and the most expensive vs the good and cheap. That will be great.
Maybe in 2026.
I tried that but I'm back to paying OpenAI $200/month because the quality was significantly worse on my codebase.
How do they run on US servers? Self-host? That's not going to be cheap while the big AI players hoard resources like memory.
There are many providers (Fireworks, Groq, Cerebras, Google Vertex), some using rather common hardware from Nvidia, others their own solutions focused solely on high-throughput inference. They often tend to be faster, cheaper and/or more reliable than the lab that trained the model [0], simply because there is some competition, unlike with US frontier models, which at best can be hosted by Azure, AWS or GCloud at the same price as the first party.
[0] https://openrouter.ai/moonshotai/kimi-k2-thinking
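As a minimal sketch of what that competition looks like from the buyer's side, here is the footnoted model called through OpenRouter's OpenAI-compatible endpoint (OPENROUTER_API_KEY is assumed to be set; OpenRouter then routes the request across the competing hosts):

```python
import os
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible API, so the stock SDK works as-is.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="moonshotai/kimi-k2-thinking",  # open-weight model, served by multiple hosts
    messages=[{"role": "user", "content": "Summarize the tradeoffs of MoE models in two sentences."}],
)
print(resp.choices[0].message.content)
```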
Aren't there pretty good indications that the Chinese LLMs have been trained on top of the expensive models?
Their cost is not real.
Plus you have things like MCP or agents that are mostly being spearheaded by companies like Anthropic. So if it is "the future" and you believe in it, then you should pay a premium to spearhead it.
You want to bet on the first Boeing, not the cheapest copy of a Wright brothers plane.
(Full disclosure: I don't think it's the future, and I think we are over-leveraging on AI to a degree that is, no pun intended, misanthropic.)
> Aren't there pretty good indications that the Chinese LLMs have been trained on top of the expensive models?
How do you even do that? You can train on glorified chat logs from an expensive model, but that's hardly the same thing. "Model extraction" is ludicrously inefficient.
> Aren't there pretty good indications that the Chinese LLMs have been trained on top of the expensive models?
So what ?
Well it raises an interesting conundrum. Suppose there's a microcontroller that's $5.00 and another that's $0.50. The latter is a clone of the former. Are you better off worrying only about your short term needs, or should you take the long view and direct your business towards the former despite it being more expensive?
Suppose both microcontrollers will be out of date in a week and replaced by far more capable microcontrollers.
The long view is to see the microcontroller as a commodity piece of hardware that is rapidly changing. Now is not the time to go all in on Betamax and take ten-year leases on physical Blockbuster stores when streaming is two weeks away.
AI is possibly the most open technological advance I have experienced. There is no excuse, this time, for skilled operators to be stuck for decades with AWS or some other proprietary blend of vendor lock-in.
Well, the company behind the former microcontroller has gone out of its way to make getting and developing on actual hardware as difficult and expensive as possible, and could reasonably be accused of "suspect financial shenanigans", while the other company will happily sell me the microcontroller for a reasonable price. And sure, they started off cloning the former, but their own stuff is getting really quite good these days.
So really, the argument pretty well makes itself in favour of the $0.50 microcontroller.
That's a very tenuous analogy. Microcontrollers are circuits that are designed. LLMs are circuits that learned using vast amounts of data scraped from the internet, and pirated e-books[1][2][3].
[1]: https://finance.yahoo.com/news/nvidia-accused-trying-cut-dea...
[2]: https://arstechnica.com/tech-policy/2025/12/openai-desperate...
[3]: https://www.businessinsider.com/anthropic-cut-pirated-millio...
You're asking whether businesses will choose to pay a 1000% markup on commodities?
> Aren't there pretty good indications that the Chinese LLMs have been trained on top of the expensive models?
There are pretty good indications that the American LLMs have been trained on top of stolen data.
This is proven. You can prove it yourself easily. Take a novel from your bookshelf, type in any sentence from the novel and ask it what book it's from. Ask it for the next sentence.
This works with every novel I've tried so far in Gemini 3.
My actual prompt was a bit more convoluted than this (involving translation) so you may need to experiment a bit.
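For anyone who wants to reproduce it, a rough sketch of that probe using the google-generativeai SDK; the model ID is a placeholder and the quoted sentence is just an example, so substitute whatever model and novel you have at hand:

```python
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-pro-latest")  # placeholder model ID

sentence = "It was a bright cold day in April, and the clocks were striking thirteen."
probe = (
    f'Which novel is this sentence from: "{sentence}"?\n'
    "Quote the sentence that immediately follows it in the book."
)
print(model.generate_content(probe).text)

# If the model names the book and continues the text verbatim, even for
# obscure passages, the novel was almost certainly in its training data.
```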
This so-called "PC compatible" seems like a cheap copy, give me a real IBM every time.
> Their cost is not real.
They can't even officially account for any Nvidia GPUs they managed to buy outside the official channels.
Original is here (linked in the article):
https://epoch.ai/gradient-updates/can-ai-companies-become-pr...
Thanks, could the link for this post be replaced with the original?
I think the question is: as the technology matures, will the value of the model become more stable, and then what will happen to price?
Compare it to phones or PCs: there was a time when each new version was a huge upgrade over the last, but eventually these AI models are gonna mature and something else is gonna happen.
The only thing you need to know about unit economics is this: https://epoch.ai/data-insights/llm-inference-price-trends
TL;DR: prices have gone down (to achieve the same benchmarks) by anywhere from 9x to 400x.
This should clearly tell you that the margins are high. It would be absurd for OpenAI to constantly run at a loss when prices have gone down by ~50x on average. Instead of being ~50x cheaper, couldn't OpenAI be, like, 45x cheaper and be in profit? What's the difference?
I genuinely don't know why you would need any more proof than this statistic.
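A toy illustration of that point, with entirely made-up numbers: if serving costs fell ~50x but the price fell only 45x, the customer barely notices while the provider's gross margin flips from break-even to positive.

```python
# Illustrative only: arbitrary units, made-up starting point at break-even.
old_price = 50.0                 # price per 1M tokens at launch
old_cost = 50.0                  # assume roughly break-even at launch

new_cost = old_cost / 50         # serving costs drop ~50x
price_at_50x = old_price / 50    # pass the full drop on to users
price_at_45x = old_price / 45    # keep a sliver of the efficiency gain

for label, price in [("50x cheaper", price_at_50x), ("45x cheaper", price_at_45x)]:
    margin = (price - new_cost) / price
    print(f"{label}: price {price:.3f}, cost {new_cost:.3f}, gross margin {margin:.0%}")
```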
Does anyone have a paywall-free link to inference costs for models? The article links to an estimate from The Information, which has a heavy subscription cost and paywalls everything.
This totally glosses over the debacle that was GPT-4.5 (which possibly was GPT-5 too, btw), and the claim that it'll ever outcompete humans also totally depends on whether these systems still require human "steering" or work autonomously.
Frankly, it's real slop.
The discussion about Chinese vs US models misses a key enterprise reality: switching costs are enormous once you've built production systems around a specific API. Companies aren't just buying the model - they're buying reliability, compliance guarantees, and the ecosystem that reduces integration risk. Price matters, but in enterprise AI the "last mile" of trust and operational certainty often justifies significant premiums.
OTOH most model APIs are basically identical to each other. You can switch from one to another using OpenRouter without even altering the code. Furthermore, they aren't reliable (drop rates can be as high as 20%), and compliance "guarantees" are, AFAIK, completely untested. Has anyone used the Copilot compliance guarantees to defend themselves in a copyright infringement suit yet?
I think you are right that trust and operational certainty would justify significant premiums. It would be great if trust and operational certainty were available.
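For what it's worth, here is a minimal sketch of how cheap switching (and falling back when a request gets dropped) is at the API layer; the endpoint and model IDs are examples, not endorsements:

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # any OpenAI-compatible gateway works
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# "Switching" models is just a change of string; fall back when a call fails.
FALLBACK_ORDER = ["anthropic/claude-sonnet-4", "moonshotai/kimi-k2-thinking"]

def ask(prompt: str) -> str:
    last_error = None
    for model in FALLBACK_ORDER:
        try:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                timeout=60,
            )
            return resp.choices[0].message.content
        except Exception as err:   # dropped or overloaded request: try the next model
            last_error = err
    raise RuntimeError("all models failed") from last_error

print(ask("Classify this support ticket as billing, bug, or other: ..."))
```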
Responses API is a commodity.
That's why OpenAI tries to push Assistants API, Agents SDK and ChatGPT Apps which are more of a lock in: https://senkorasic.com/articles/openai-product-strategy-2025
Funny thing is, even OpenAI seems to ignore the Assistants/Apps APIs internally. Codex (the CLI) uses the Responses API.
I think AI has the potential to break this model. It reduces switching costs immensely.