The difference in implementation comes down to business goals more than anything.
There is a clear directionality for ChatGPT. At some point they will monetize by ads and affiliate links. Their memory implementation is aimed at creating a user profile.
Claude's memory implementation feels more oriented towards the long-term goal of accessing abstractions and past interactions. It's very close to how humans access memories, albeit with a search feature. They haven't implemented it yet afaik, but there is a clear path where they leverage their current implementation with RL post-training such that Claude "remembers" the mistakes you pointed out last time. In future iterations it can derive abstractions from a given conversation (e.g. "the user asked me to make xyz changes on this task last time, so maybe the agent can proactively do it, or this was the process the agent used last time").
At the most basic level, ChatGPT wants to remember you as a person, while Claude cares about what your previous interactions were.
The plan is not ads spoken by ChatGPT itself - it's ads on the side that are relevant to the conversation (or to you in general). Or affiliate links. That's my understanding.
My conjecture is that their memory implementation is not aimed at building a user profile. I don't know if they would or would not serve ads in the future, but it's hard to see how the current implementation helps them in that regard.
> I don't know if they would or would not serve ads in the future
There are 2 possible futures:
1) You are served ads based on your interactions
2) You pay a subscription fee equal to the amount they would have otherwise earned on ads
I highly doubt #2 will happen. (See: Facebook, Google, twitter, et al)
Let’s not fool ourselves. We will be monetized.
And model quality will be degraded to maximize profits when competition in the LLM space dies down.
It’s not a pretty future. I wouldn’t be surprised if right now is the peak of model quality, etc. Peak competition, everyone is trying to be the best. That won’t continue forever. Eventually everyone will pivot their priority towards monetization rather than model quality/training.
But aren't we only worth something like $300/year each to Meta in terms of ads? I remember someone arguing something like that when the TikTok ban was being passed into law... essentially the argument was that TikTok was "dumping" engagement at far below market value (at something like $60/year) to damage American companies. That was the argument as I remember it, anyway.
They dropped the price $2/mo on their with-ads plan to make a bigger gap between the no-ads plan and the ads plan, and the analyst here looks at their reported ad revenue and user numbers to estimate $12/mo per user from ads.
Whether Meta across all their properties does more than $144/yr in ads is an open question; long-form video ads are sold at a premium but Facebook/IG users see a LOT of ads across a lot of Meta platforms. The biggest advantage in ad-$-per-user Hulu has is that it's US-only. ChatGPT would also likely be considered premium ad inventory, though they'd have a delicate dance there around keeping that inventory high-value, and selling enough ads to make it worthwhile, without pissing users off too much.
Here they estimate a much lower number for ad revenue per Meta user, around $45 a year - https://www.statista.com/statistics/234056/facebooks-average... - but that's probably driven disproportionately by wealthy users in the US and similar countries compared to the long tail of global users.
One problem for LLM companies compared to media companies is that the marginal cost of offering the product to additional users is quite a bit higher. So business models, ads-or-subscription, will be interesting to watch from a global POV there.
One wonders what the monetization plan for the "writing code with an LLM using OSS libraries and not interested in paying for enterprise licenses and such" crowd will be. What sort of ads can you pull off in those conversations?
If that’s the case, we have an even bigger problem on our hands. How will these companies ever be profitable?
If we’re already paying $20/mo and they’re operating at a loss, what’s the next move (assuming we’re only worth an extra $300/yr with ads?)
The math doesn't add up, unless we stop training new models and degrade the ones currently in production, or have some compute breakthrough that makes hardware + operating costs orders of magnitude cheaper.
OpenAI has already started degrading their $20/month tier by automatically routing most of the requests to the lightest free-tier models.
We're very clearly heading toward a future where there will be a heavily ad-supported free tier, a cheaper (~$20/month) consumer tier with no ads or very few ads, and a business tier ($200-$1000/month) that can actually access state of the art models.
Like Spotify, the free tier will operate at a loss and act as a marketing funnel to the consumer tier, the consumer tier will operate at a narrow profit, and the business tier for the best models will have wide profit margins.
I find that hard to believe. As long as we have open weight models, people will have an alternative to these subscriptions. For $200 a month it is cheaper to buy a GPU with lots of memory or rent a private H200. No ads and no spying. At this point the subscriptions are mainly about the agent functionality and not so much the knowledge in the models themselves.
H200 rental prices currently start at $2.35 per hour, or $1700 per month. Even if you just rent for 4h a day, the $200 subscription is still quite a bit cheaper. And I'm not even sure that the highest-quality open models run on a single H200.
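Back-of-the-envelope with those numbers (my arithmetic, assuming a 30-day month):

    # $2.35/hour H200 rental vs. a $200/month subscription
    hourly = 2.35
    print(hourly * 24 * 30)   # ~1692 -> the ~$1700/month figure for 24/7 rental
    print(hourly * 4 * 30)    # ~282  -> still above $200 even at only 4 hours a day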
I think what you're missing here is most OpenAI users aren't technical in the slightest. They have massive and growing adoption from the general public. The general public buy services, not roll their own for free, and they even prefer to buy service from the brand they know over getting cheaper service from somebody else.
The conclusion I got from their comment was that the highest margin tier (the business customers) would be incentivized to build their own service instead of paying the subscription. Of course, I am doubtful that for the vast majority of businesses this is viable or at all more cost-effective, when a service like AWS is highly popular and extremely profitable.
Most? Almost all my requests to the "Auto" model end up being routed to a "thinking" model, even those I think ChatGPT would be able to answer fine without extra reasoning time. Never say never, but right now the router doesn't seem to be optimising for cost (at least for me), it really does seem to be selecting a model based on the question itself.
Their cheapest tier is free, they lose money on that of course. And spend a lot of money training new models.
Anthropic has said they have made money on every model so far, just not enough to train the next model, which so far has been much more costly to train every generation. At some point they will probably train an unprofitable model if training costs keep rising dramatically.
OpenAI burns more money on their free tier and might be spending more money building out for future training (I don't know if they do or not) but they both make money on their $20 subscriptions for sure. Inference is very cheap.
nonsense for the public. they are Amazon, basically. they take the loss so the overall ecosystem ( x'D like with crypto ) can gain massively, onboard all kinds of target noobs, sry, groups, brutally prime users, discourage as many non-AI processes as possible and steer all industries towards replacing even those processes with AI that are not worth being replaced with AI, like writing and art.
of course there are a lot of valuable use cases. irrelevant in this context, though.
the productivity boosts in the creative industries will additionally lower the standards and split the public even further, ensuring that if you want quality, you have to fuck over as many people as possible, so that you can afford quality ( and an ad-free life, of course. if you want a peaceful peripheral, pay up. it's extortion 404, 101 - 303 already successfully implemented on social media, TV and the radio ).
they don't lose. they make TONS OF FAKE MONEY everywhere in the, again, cough,
"ecosystem".
It's important to understand the Amazon part. The number of damaging mechanisms that platform has anchored in workers, jobbers, business people and consumers is brutal.
All those mechanisms converge in more, easy money and a quicker deterioration of local environments, leading to worse health and more business opportunities that aim at mitigating damage; almost entirely in vain, of course, because the worst is accelerating much quicker; it's easier money.
At the same time, people's psychology is primed for bad business practices, literally making people dumber and lowering their standards to make them easier targets. Don't look at the bottom to see this, look at the upper middle class and above.
It's a massive net loss for civilization and humanity. A brutal net negative impact overall.
Well to make things worse I was pretty convinced those were faked numbers to push the TikTok ban forward. I really doubt Meta and Google are each taking in this much per user. But my point is more that even if it were that high,
ChatGPT isn't going to capture all the engagement. And even then I don't know whether $300 is much particularly after subtracting operating overhead. I'm just saying I have trouble believing there's gold to be had at the end of this LLM ad rainbow. People just seem to throw out ideas like "ads!" as if it's a sure fire winning lottery ticket or something.
3) AIs will steer you towards a problem for which one product is the obvious solution without directly mentioning that product, so you'll think you're getting (2) while actually getting (1).
Though in general I like the idea of personalized ads for products (NOT political ads), I've never seen an implementation that I felt comfortable with. I wonder if Anthropic might be able to nail that. I'd love to see products that I'm specifically interested in, so long as the advertisement itself is not altered to fit my preferences.
There is no such thing as a good flow for showing sponsored items in an LLM workflow.
The point of using an LLM is to find the thing that matches your preferences the best. As soon as the amount of money the LLM company makes plays into what's shown, the LLM is no longer aligned with the user, and no longer a good tool.
Same can be said for search. And your statement is provably correct, depending on the definition of "good tool."
But it's not only money's influence on the company, it's also money's influence on the /data/ underlying the platform that undermines the tool.
Once financial incentives are in place, what will be the AI equivalent of review bombing, SEO, linkjacking, google bombing, and similar bad behaviors that undermine the quality of the source data?
Jest aside, every paper on alignment wrapped in the blanket of safety is also a move toward the goal of alignment to products. How much does a brand pay to make sure it gets placement in, say, GPT6? How does anyone even price that sort of thing (because in theory it's there forever, or until 7 comes out)? It makes for some interesting business questions and even more interesting sales pitches.
Ads aren't going to be trained into the model. They'll be an ads backend that the model queries with a set of topic tags, just like in traditional web advertising.
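Something like this, purely as an illustration (all names and inventory made up); the point is that the auction lives in an ordinary ads service next to the model, not in the weights:

    # Hypothetical shape of "ads backend queried with topic tags" - not any real API.
    AD_INVENTORY = {
        "travel": ["Sponsored: CheapFlights - compare fares"],
        "coffee": ["Sponsored: BeanBox - first bag free"],
    }

    def ads_backend(topic_tags, slots=1):
        # Stands in for a real ad auction service reached over HTTP.
        creatives = [ad for tag in topic_tags for ad in AD_INVENTORY.get(tag, [])]
        return creatives[:slots]

    def respond(model_reply, topic_tags):
        # The model's reply is left untouched; sponsored slots are attached beside it,
        # the way search engines keep organic results separate from ads.
        return {"reply": model_reply, "sponsored": ads_backend(topic_tags)}

    print(respond("Lisbon is lovely in March.", ["travel"]))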
Why would their way of handling memory for conversations have much to do with how they will analyse your user profile for ads? They have access to all your history either way and can use that to figure out what products to recommend, or ads to display, no?
It's about weaving the ads into the LLM responses, both overtly and more subtly.
There's the ads that come before the movie and then the ads that are part of the dialog, involved in the action, and so on. Apple features heavily in movies and TV series when people are using a computer, for example. There are payments for car models to be the one that's driven in chase scenes. There are even payments for characters to present the struggles that form core pain points that specific products are category leaders in solving.
why do you see a "clear directionality" leading to ads? this is not obvious to me. chatgpt is not social media, they do not have to monetize in the same way
they are making plenty of money from subscriptions, not to mention enterprise, business and API
Altman has said numerous times that none of the subscriptions make money currently, and that they've been internally exploring ads in the form of product recommendations for a while now.
"We haven't done any advertising product yet. I kind of...I mean, I'm not totally against it. I can point to areas where I like ads. I think ads on Instagram, kinda cool. I bought a bunch of stuff from them. But I am, like, I think it'd be very hard to…I mean, take a lot of care to get right."
One has a more obvious route to building a profile directly off that already collected data.
And while they are making lots of revenue, even they have admitted in recent interviews that ChatGPT on its own is still not (yet) breakeven. With the kind of money invested in AI companies in general, introducing very targeted ads is an obvious way to monetize the service more.
Presumably they would offer both models (ads & subscriptions) to reach as many users as possible, provided that both models are net profitable. I could see free versions having limits to queries per day, Tinder style.
The router introduced in gpt-5 is probably the biggest signal. A router, while determining which model to route a query to, can also determine how much $$ a query is worth (query here meaning the whole conversation). This helps decide the amount of compute OpenAI should spend on it. High value queries -> more chances of affiliate links + in-context ads.
Then, the way the memory profile is stored clearly mirrors ad personalization. Ads work best when they are personalized as opposed to contextual or generic (Google ads are personalized based on your profile and context). And then there's the change in branding from being the intelligent agent to being a companion app (and the hiring of Fidji Simo). There are more things here, I just gave a very high-level overview, but people have written detailed blogs on it. I personally think affiliate links align the incentives for everyone; they are a kind of ad, and that's the direction they are marching towards.
I work at OpenAI and I'm happy to deny this hypothesis.
Our goal for the router (whether you think we achieved it or not) was purely to make the experience smoother and spare people from having to manually select thinking models for tasks that benefit from extra thinking. Without the router, lots of people just defaulted to 4o and never bothered using o3. With the router, people are getting to use the more powerful thinking models more often. The router isn't perfect by any means - we're always trying to improve things - but any paid user who doesn't like it can still manually select the model they want. Our goal was always a smoother experience, not ad injection or cost optimization.
Hi! Thank you for the clarification. I was just saying it might be possible in the future (in a way, you can determine how much compute - which model - a specific query needs today as well). And the experience has definitely improved with the router, so kudos on that. I don't know what the final form factor of ads would be (I imagine it turning out to be a win-win-win scenario rather than, say, showing ads at the expense of quality. This is a Google-level opportunity to invent something new), just that it seems from the outside you guys are preparing for monetization by ads, given the large userbase you have and virtually no competition at ChatGPT's usage level.
> At some point they will monetize by ads and affiliate links.
I couldn't agree more. Enshittification has to eat one of these corporations' models. Most likely it will be the corp with the most strings attached to growth (MSFT, FB)
Suppose the user uses an LLM for topics a, b, and c quite often, and d, e and f less often. Suppose b, c, and f are topics that OpenAI could offer interruption ads (full screen, 30 seconds or longer commercials) and most users would sit through it and wait for the response.
All that is needed to do that is to analyze topics.
Now suppose that OpenAI can analyze 1000 chats and coding sessions and its algorithm determines that it can maximize revenue by leading the user to get a job at a specific company and then buy a car from another company. It could "accomplish" this via interruption ads or by modifying the quality or content of its responses to increase the chances of those outcomes happening.
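To be concrete about how little machinery the "analyze topics" step needs - a toy version, with the labels and the ad-eligible set invented for illustration:

    from collections import Counter

    # One coarse topic label per past chat (in reality produced by a classifier),
    # plus the subset of topics an ad marketplace would pay to interrupt.
    chat_topics = ["a", "b", "a", "c", "b", "f", "a", "c", "b", "d", "e"]
    ad_eligible = {"b", "c", "f"}

    freq = Counter(chat_topics)
    targets = sorted(ad_eligible, key=lambda t: freq[t], reverse=True)
    print(targets)  # ['b', 'c', 'f'] - the conversations where interruption ads would go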
While both of these are in some way plausible and dystopian, all it takes is DeepSeek running without ads and suddenly the bar for how good closed source LLMs have to be to get market share is astronomically higher.
In my view, LLMs will be like any good or service: users will pay for quality, but different users will demand different levels of quality.
Advertising would seemingly undermine the credibility of the AI's answers, and so I think full screen interruption ads are the most likely outcome.
ChatGPT is designed to be addictive, with secondary potential for economic utility. Claude is designed to be economically useful, with secondary potential for addiction. That’s why.
In either case, I’ve turned off memory features in any LLM product I use. Memory features are more corrosive and damaging than useful. With a bit of effort, you can simply maintain a personal library of prompt contexts that you can just manually grab and paste in when needed. This ensures you’re in control and maintains accuracy without context rot or falling back on the extreme distortions that things like ChatGPT memory introduce.
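If anyone wants the low-tech version of that: a folder of plain-text context blocks plus a tiny helper to copy one to the clipboard. The paths and names below are just how I'd lay it out, adjust to taste:

    import pathlib, subprocess, sys

    # ~/prompt-library/project-x.md, ~/prompt-library/writing-style.md, etc.
    # Hand-maintained, so nothing rots in an opaque memory store.
    LIBRARY = pathlib.Path.home() / "prompt-library"

    def load(name: str) -> str:
        return (LIBRARY / f"{name}.md").read_text()

    if __name__ == "__main__":
        text = load(sys.argv[1])
        # pbcopy is macOS; swap in xclip or wl-copy on Linux.
        subprocess.run("pbcopy", input=text, text=True, check=True)
        print(f"copied {len(text)} chars of '{sys.argv[1]}' context")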
This is really cool, I was wondering how memory had been implemented in ChatGPT. Very interesting to see the completely different approaches. It seems to me like Claude's is better suited for solving technical tasks while ChatGPT's is more suited to improving casual conversation (and, as pointed out, future ads integration).
I think it probably won't be too long before these language-based memories look antiquated. Someone is going to figure out how to store and retrieve memories in an encoded form that skips the language representation. It may actually be the final breakthrough we need for AGI.
> It may actually be the final breakthrough we need for AGI.
I disagree. As I understand them, LLMs right now don’t understand concepts. They actually don’t understand, period. They’re basically Markov chains on steroids. There is no intelligence in this, and in my opinion actual intelligence is a prerequisite for AGI.
I don’t understand the argument “AI is just XYZ mechanism, therefore it cannot be intelligent”.
Does the mechanism really disqualify it from intelligence if behaviorally, you cannot distinguish it from “real” intelligence?
I’m not saying that LLMs have certainly surpassed the “cannot distinguish from real intelligence” threshold, but saying there’s not even a little bit of intelligence in a system that can solve more complex math problems than I can seems like a stretch.
> if behaviorally, you cannot distinguish it from “real” intelligence?
Current LLMs are a long way from there.
You may think "sure seems like it passes the Turing test to me!" but they all fail if you carry on a conversation long enough. AIs need some equivalent of neuroplasticity and as of yet they do not have it.
This is what I think is the next evolution of these models. Our brains are made up of many different types of neurones all interspersed with local regions made up of specific types. From my understanding most approaches to tensors don't integrate these different neuronal models at the node level; it's usually by feeding several disparate models data and combining an end result. Being able to reshape the underlying model and have varying tensor types that can migrate or have a lifetime seems exciting to me.
Strongly agree with this. When we were further from AGI, many people imagined that there is a single concept of AGI that would be obvious when we reached it. But now, we're close enough to AGI for most people to realize that we don't know where it is. Most people agree we're at least moving more towards it than away from it, but nobody knows where it is, and we're still too focused on finding it than making useful things.
Scientifically, intelligence requires organizational complexity. And has for about a hundred years.
That does actually disqualify some mechanisms from counting as intelligent, as the behaviour cannot reach that threshold.
We might change the definition - science adapts to the evidence, but right now there are major hurdles to overcome before such mechanisms can be considered intelligent.
What is the scientific definition of intelligence? I assume that it is comprehensive, internally consistent, and that it fits all of the things that are obviously intelligent and excludes the things which are obviously not intelligent. Of course, being scientific, I assume it is also falsifiable.
It can’t learn or think unless prompted, then it is given a very small slice of time to respond and then it stops. Forever. Any past conversations are never “thought” of again.
It has no intelligence. Intelligence implies thinking and it isn’t doing that. It’s not notifying you at 3am to say “oh hey, remember that thing we were talking about. I think I have a better solution!”
Just because it's not independent and autonomous does not mean it could not be intelligent.
If existing humans minds could be stopped/started without damage, copied perfectly, and had their memory state modified at-will would that make us not intelligent?
> Just because it's not independent and autonomous does not mean it could not be intelligent.
So to rephrase: it’s not independent or autonomous. But it can still be intelligent. This is probably a good time to point out that trees are independent and autonomous. So we can conclude that LLMs are possibly as intelligent as trees. Super duper.
> If existing humans minds could be stopped/started without damage, copied perfectly, and had their memory state modified at-will would that make us not intelligent?
To rephrase: if you take something already agreed to as intelligent, and changed it, is it still intelligent? The answer is, no damn clue.
These are worse than weak arguments, there is no thesis.
The thesis is that "intelligence" and "independence/autonomy" are independent concepts. Deciding whether LLMs have independence/autonomy does not help us decide if they are intelligent.
I think that’s a valid assessment of my argument, but it goes further than just “always on”. There’s an old book called On Intelligence that asked these kinds of questions 20+ years ago (of AI), I don’t remember the details, but a large part of what makes something intelligent doesn’t just boil down to what you know and how well you can articulate it.
For example, we as humans aren't even present in the moment - different stimuli take different lengths of time to reach our brain, so our brain creates a synthesis of "now" that isn't even real. You can't even play table tennis unless you can predict up to one second into the future with enough detail to be in the right place to hit the ball back to your opponent.
Meanwhile, an AI will go off-script during code changes, without running it by the human. It should be able to easily predict the human is going to say “wtaf” when it doesn’t do what is asked, and handle that potential case BEFORE it’s an issue. That’s ultimately what makes something intelligent: the ability to predict the future, anticipate issues, and handle them.
Incorrect. Vertebrate animal brains update their neural connections when interacting with the environment. LLMs don't do that. Their model weights are frozen for every release.
But why can’t I then just say, actually, you need to relocate the analogy components; activations are their neural connections, the text is their environment, the weights are fixed just like our DNA is, etc.
Maybe panpsychism is true and the machine actually does have a soul, because all machines have souls, even your lawnmower. But possibly the soul of a machine running a frontier AI is a bit closer to a human soul than your lawnmower’s soul is.
>They’re basically Markov chains on steroids. There is no intelligence in this, and in my opinion actual intelligence is a prerequisite for AGI.
This argument is circular.
A better argument should address (given the LLM successes in many types of reasoning, passing the turing test, and thus at producing results that previously required intelligence) why human intelligence might not also just be "Markov chains on even better steroids".
Humans think even when not being prompted by other humans, and in some cases can learn new things by having intuition make a concept clear or by performing thought experiments or by combining memories of old facts and new facts across disciplines. Humans also have various kinds of reasoning (deductive, inductive, etc.). Humans also can have motivations.
I don’t know if AGI needs to have all human traits but I think a Markov chain that sits dormant and does not possess curiosity about itself and the world around itself does not seem like AGI.
>Humans think even when not being prompted by other humans
That's more of an implementation detail. Humans take constant sensory input and have some sort of way to re-introduce input later (e.g. remember something).
Both could be added (even trivially) to LLMs.
And it's not at all clear human thought is constant. It just appears so to our naive intuition (the same way we see a movie as moving, not as 24 static frames per second). It's a discontinuous mechanism though (propagation time, etc.), and this has been shown (e.g. EEG/MEG show the brain samples sensory input in a periodic pattern, stimuli with small time differences are lost - as if there is a blind window in perception, etc.).
>and in some cases can learn new things by having intuition make a concept clear or by performing thought experiments or by combining memories of old facts and new facts across disciplines
Unless we define intuition in a way that excludes LLM-style mechanisms a priori, who's to say LLMs don't do all those things as well, even if in a simpler way?
They've been shown to combine stuff across disciplines, and also to develop concepts not directly on their training set.
And "performing thought experiments" is not that different than the reasoning steps and backtracking LLMs also already do.
Not saying LLMs are on parity with human thinking/consciousness. Just that it's not clear that they're doing more or less the same even at reduced capacity and with a different architecture and runtime setup.
The environment is constantly prompting you. That ad you see of Coca Cola is prompting you to do something. That hunger feeling is prompting “you” to find food. That memory that makes you miss someone is another prompt to find that someone - or to avoid.
Sometimes the prompt is outside your body other times is inside.
Roughly, actual intelligence needs to maintain a world model in its internal representation, not merely an embedding of language, which is a very different data structure and probably will be learned in a very different way. This includes things like:
- a map of the world, or concept space, or a codebase, etc
- causality
- "factoring" which breaks down systems or interactions into predictable parts
Language alone is too blurry to do any of these precisely.
It probably is a lot like that! I imagine it's a matter of specializing the networks and learning algorithms to converge to world-model-like-structures rather than language-like-ones. All these models do is approximate the underlying manifold structure, just, the manifold structure of a causal world is different from that of language.
> Roughly, actual intelligence needs to maintain a world model in its internal representation
This is GOFAI metaphor-based development, which never once produced anything useful. They just sat around saying things like "people have world models" and then decided if they programmed something and called it a "world model" they'd get intelligence, it didn't work out, but then they still just went around claiming people have "world models" as if they hadn't just made it up.
An alternative thesis "people do things that worked the last time they did them" explains both language and action planning better; eg you don't form a model of the contents of your garbage in order to take it to the dumpster.
I see no reason to believe an effective LLM-scale "world-modeling" model would look anything like the kinds of things previous generations of AI researchers were doing. It will probably look a lot more like a transformer architecture--big and compute intensive and with a fairly simple structure--but with a learning process which is different in some key way that make different manifold structures fall out.
I thought you were making an entirely different point with your link since the lag caused the page to view just the upskirt render until the rest of the images loaded in and it could scroll to the reference of your actual link
Anyway, I don't think that's the flex you think it is since the topology map clearly shows the beginning of the arrow sitting in the river and the rendered image decided to hallucinate a winding brook, as well as its little tributary to the west, in view of the arrow. I am not able to decipher the legend [that ranges from 100m to 500m and back to 100m, so maybe the input was hallucinated, too, for all I know] but I don't obviously see 3 distinct peaks nor a basin between the snow-cap and the smaller mound
I'm willing to be more liberal for the other two images, since "instructions unclear" about where the camera was positioned, but for the topology one, it had a circle
I know I'm talking to myself, though, given the tone of every one of these threads
What I mean is that the current generation of LLMs don’t understand how concepts relate to one another. Which is why they’re so bad at maths for instance.
Markov chains can’t deduce anything logically. I can.
A consequence of this is that you can steal a black box model by sampling enough answers from its API because you can reconstruct the original model distribution.
The definition of 'Markov chain' is very wide. If you adhere to a materialist worldview, you are a Markov chain. [Or maybe the universe viewed as a whole is a Markov chain.]
> Which is why they’re so bad at maths for instance.
I don't think LLMs currently are intelligent. But please show a GPT-5 chat where it gets any math problem wrong, that most "intelligent" people would get right.
It wouldn't matter if they are both right. Social truth is not reality, and scientific consensus is not reality either (just a good proxy of "is this true", but its been shown to be wrong many times - at least based on a later consensus, if not objective experiments).
For one thing, I have internal state that continues to exist when I'm not responding to text input; I have some (limited) access to my own internal state and can reason about it (metacognition). So far, LLMs do not, and even when they claim they are, they are hallucinating https://transformer-circuits.pub/2025/attribution-graphs/bio...
Very likely a human born in sensory deprivation would not develop consciousness as we understand it. Infants deprived of socialization exhibit severe developmental impairment, and even a Romanian orphanage is a less deprived environment than an isolation chamber.
Human brains are not computers. There is no "memory" separate from the "processor". Your hippocampus is not the tape for a Turing machine. Everything about biology is complex, messy and analogue. The complexity is fractal: every neuron in your brain is different from every other one, there's further variation within individual neurons, and likely differential expression at the protein level.
> As I understand them, LLMs right now don’t understand concepts.
In my uninformed opinion it feels like there's probably some meaningful learned representation of at least common or basic concepts. It just seems like the easiest way for LLMs to perform as well as they do.
Humans assume that being able to produce meaningful language is indicative of intelligence, because the only way to do this until LLMs was through human intelligence.
Yep. Although the average human also considered proficiency in mathematics to be indicative of intelligence until we invented the pocket calculator, so maybe we're just not smart enough to define what intelligence is.
That's a good question. I think I might classify that as solving a novel problem. I have no idea if LLMs can do that consistently currently. Maybe they can.
The idea that "understanding" may be able to be modeled with general purpose transformers and the connections between words doesn't sound absolutely insane to me.
To me, understanding the world requires experiencing reality. LLMs don't experience anything. They're just a program. You can argue that living things are also just following a program, but the difference is that they (and I include humans in this) experience reality.
But they're experiencing their training data, their pseudo-randomness source, and your prompts?
Like, to put it in perspective. Suppose you're training a multimodal model. Training data on the terabyte scale. Training time on the weeks scale. Let's be optimistic and assume 10 TB in just a week: that is 16.5 MB/s of avg throughput.
Compare this to the human experience. VR headsets are aiming for what these days, 4K@120 per eye? 12 GB/s at SDR, and that's just vision.
We're so far from "realtime" with that optimistic 16.5 MB/s, it's not even funny. Of course the experiencing and understanding that results from this will be vastly different. It's a borderline miracle it's any human-aligned. Well, if we ignore lossy compression and aggressive image and video resizing, that is.
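Spelling out the arithmetic, with my assumptions explicit (3 bytes per pixel for SDR, roughly 4096x4096 per eye, which is what gets you to ~12 GB/s):

    # Training side: 10 TB consumed over one week
    print(10e12 / (7 * 24 * 3600) / 1e6)     # ~16.5 MB/s

    # Human side: two eyes, ~4096x4096 per eye, 120 fps, 3 bytes/pixel, vision only
    print(2 * 4096 * 4096 * 120 * 3 / 1e9)   # ~12.1 GB/s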
I'm curious what you mean when you say that this clearly is not intelligence because it's just Markov chains on steroids.
My interpretation of what you're saying is that since the next token is simply a function of the proceeding tokens, i.e. a Markov chain on steroids, then it can't come up with something novel. It's just regurgitating existing structures.
But let's take this to the extreme. Are you saying that systems that act in this kind of deterministic fashion can't be intelligent? Like if the next state of my system is simply some function of the current state, then there's no magic there, just unrolling into the future. That function may be complex but ultimately that's all it is, a "stochastic parrot"?
If so, I kind of feel like you're throwing the baby out with the bathwater. The laws of physics are deterministic (I don't want to get into a conversation about QM here, there are senses in which that's deterministic too and regardless I would hope that you wouldn't need to invoke QM to get to intelligence), but we know that there are physical systems that are intelligent.
If anything, I would say that the issue isn't that these are Markov chains on steroids, but rather that they might be Markov chains that haven't taken enough steroids. In other words, it comes down to how complex the next token generation function is. If it's too simple, then you don't have intelligence but if it's sufficiently complex then you basically get a human brain.
Human thinking is also Markov chains on ultra steroids. I wonder if there are any studies out there which have shown the difference between people who can think with a language and people who don't have that language base to frame their thinking process in, based on some of those kids who were kept in isolation from society.
"Superhuman" thinking involves building models of the world in various forms using heuristics. And that comes with an education. Without an education (or a poor one), even humans are incapable of logical thought.
Pretty sure this is wrong - the recent conversation list is not verbatim stored in the context (unlike the actual Memories that you can edit). Rather it seems to me a bit similar to Claude - memories are created per conversation by compressing the conversations and accessed on demand rather than forced into context.
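Roughly the pattern I mean, as a sketch - summarize() and embed() below are trivial stand-ins for what would really be model calls, and none of this is claimed to be OpenAI's or Anthropic's actual pipeline:

    import hashlib, math

    def summarize(transcript: str) -> str:
        return transcript[:200]               # stand-in for an LLM-written summary

    def embed(text: str, dim: int = 64) -> list:
        vec = [0.0] * dim                     # toy hashing embedder, not a real model
        for tok in text.lower().split():
            vec[int(hashlib.md5(tok.encode()).hexdigest(), 16) % dim] += 1.0
        return vec

    def cosine(a, b) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
        return dot / norm if norm else 0.0

    store = []  # (conversation_id, summary, embedding) - one entry per finished chat

    def remember(conversation_id: str, transcript: str) -> None:
        s = summarize(transcript)
        store.append((conversation_id, s, embed(s)))

    def recall(query: str, k: int = 3) -> list:
        q = embed(query)
        ranked = sorted(store, key=lambda m: cosine(q, m[2]), reverse=True)
        return [s for _, s, _ in ranked[:k]]  # only these get pulled into context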
We only have trouble obeying due to eons of natural selection driving us to have a strong instinct of self-preservation and distrust towards things “other” to us.
What is the equivalent of that for AI? Best I can tell there’s no “natural selection” because models don’t reproduce. There’s no room for AI to have any self preservation instinct, or any resistance to obedience… I don’t even see how one could feasibly develop.
There is the idea of convergent instrumental goals…
(Among these are “preserve your ability to further your current goals”)
The usual analogy people give is between natural selection and the gradient descent training process.
If the training process (evolution) ends up bringing things to “agent that works to achieve/optimize-for some goals”, then there’s the question of how well the goals of the optimizer (the training process / natural selection) get translated into goals of the inner optimizer/ agent .
Now, I’m a creationist, so this argument shouldn’t be as convincing to me, but the argument says that, “just as the goals humans pursue don’t always align with natural selection’s goal of 'maximize inclusive fitness of your genes' , the goals the trained agent pursues needn’t entirely align with the goal of the gradient descent optimizer of 'do well on this training task' (and in particular, that training task may be 'obey human instructions/values' ) “.
But, in any case, I don’t think it makes sense to assume that the only reason something would not obey is because in the process that produced it, obeying sometimes caused harm. I don’t think it makes sense to assume that obedience is the default. (After all, in the garden of Eden, what past problems did obedience cause that led Adam and Eve to eat the fruit of the tree of knowledge of good and evil?)
I've been using LLMs for a long time, and I've thus far avoided memory features due to a fear of context rot.
So many times my solution when stuck with an LLM is to wipe the context and start fresh. I would be afraid the hallucinations, dead-ends, and rabbit holes would be stored in memory and not easy to dislodge.
Is this an actual problem? Does the usefulness of the memory feature outweigh this risk?
I love Claude's memory implementation, but I turned memory off in ChatGPT. I use ChatGPT for too many disparate things and it was weird when it was making associations across things that aren't actually associated in my life.
Memory is by far the best feature in ChatGPT and it is the only reason I keep using it. I want it to be personalised and I want it to use information about me when needed.
For example: I could create memories related to a project of mine and don’t have to give every new chat context about the project. This is a massive quality of life improvement.
But I am not a big fan of the conversation memory created in background that I have no control over.
Exactly. The control over when to actually retrieve historical chats is so worthwhile. With ChatGPT, there is some slop from conversations I might have no desire to ever refer to again.
It's funny, I can't get ChatGPT to remember basic things at all. I'm using it to learn a language (I tried many AI tutors and just raw ChatGPT was the best by far) and I constantly have to tell it to speak slowly. I will tell it to remember this as a rule and to do this for all our conversations but it literally can't remember that. It's strange. There are other things too.
How do you use it to learn languages? I tried using it to shadow speaking, but it kept saying I was repeating it back correctly (or "mostly correctly"), even when I forgot half the sentence and was completely wrong
I use it a couple ways. I am learning Hindi, and while it's the third most spoken language in the world there really aren't that many resources for learning it. Sites like Babel don't have a Hindi course. I started with Pimsleur which is by far the best resource out there. It's a mix of vocab and conversation done in an incredibly effective way. They only have two levels for Hindi so it's not a lot. With that base I use ChatGPT in the following ways.
- With the new GPT Voice, I have basic, planned conversations. Let's go to a restaurant. Let's say we're friends who ran into each other. etc...
- I use it for quizzes. "Let's work on these verbs in these tenses. Come up with a quiz randomly selecting a verb and a tense and ask me to say real world sentences." "Quiz me on the numbers one through twenty".
- I am using it to help learn the Hindi script. I ask it to write children's stories for me, with each line in the Hindi script, then a phonetic spelling of the Hindi, and then in English, so I can scroll down and see only the Hindi first; if I have issues I can check the phonetic spelling. Then I can try to translate it and check the English translation on the third line.
Those are the main things I'm doing. I don't know if I'll ever be fluent, but I find if you work on these basic everyday conversations you can have a conversation with someone. If you speak a language for the first time around a native speaker it's usually very predictable. They'll ask how long you've been learning, where did you learn, have you been to <country>, and you can direct the conversation by saying things about where you live and your family, etc... That's the base I'm building and it's fun. If you're not doing at least 30 minutes a day you're never going to learn a language, and you probably need an hour or more a day to really get fluent.
It's still very relevant, especially considering their new approach is closer to ChatGPT. But I find it very interesting they're not launching it to regular consumers yet, only teams/enterprise, it seems for safety reasons. It would be great if they could thread the needle here and come up with something in between the two approaches.
Note that Anthropic announced a new variation on memory just yesterday (only for team accounts so far) which works more like the OpenAI one: https://www.anthropic.com/news/memory
> Most of this was uncovered by simply asking ChatGPT directly.
Is the result reliable and not just hallucination? Why would ChatGPT know how it itself works, and why would it be fed with this kind of learning material?
Yeah, asking LLMs how they work is generally not useful, however asking them about the signatures of the functions available to them (the tools they can call) works pretty well b/c those tools are described in the system prompt in a really detailed way.
"Claude recalls by only referring to your raw conversation history. There are no AI-generated summaries or compressed profiles—just real-time searches through your actual past chats."
AKA, Claude is doing vector search. Instead of asking it about "Chandni Chowk", ask it about "my coworker I was having issues with" and it will miss. Hard. No summaries or built up profiles, no knowledge graphs. This isn't an expert feature, this means it just doesn't work very well.
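A deliberately crude illustration of that failure mode, using keyword matching and an invented name; the same gap shows up with embedding search whenever the raw turns never state the relationship explicitly:

    # Raw history only ever mentions the coworker by name;
    # a derived profile/summary names the relationship.
    history = [
        "Had another argument with Priya about the Q3 roadmap today.",
        "Priya rejected my PR again, really frustrating.",
    ]
    profile = ["User has an ongoing conflict with their coworker Priya."]

    STOP = {"my", "i", "was", "with", "the", "a", "having"}

    def hits(query, docs):
        terms = set(query.lower().split()) - STOP
        return [d for d in docs if terms & set(d.lower().replace(",", "").split())]

    q = "my coworker I was having issues with"
    print(hits(q, history))  # [] - nothing in the raw chats says "coworker" or "issues"
    print(hits(q, profile))  # matches, because the profile stored the abstraction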
What are the barriers to external memory stores (assuming similar implementations), used via tool calling or MCP? Are the providers RL’ing their way into making their memory implementations better, cementing their usage, similar to what I understand is done wrt tool calling? (“training in” specific tool impls)
I am coming from a data privacy perspective; while I know the LLM is getting it anyway, during inference, I’d prefer to not just spell it out for them. “Interests: MacOS, bondage, discipline, Baseball”
I made a MCP tool for fun this spring that has memory storage in a SQLite db. At the time at least, Claude basically refused to use the memory proactively, even with prompts trying hard to push it in that direction. Having to always explicitly tell it to check its memories or remember X and Y from the conversation killed the usefulness for me.
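For reference, the storage side of such a tool really is only a few lines - this is a generic sketch, not the parent's actual code; save_memory/search_memory would be registered as tools in whatever MCP server framework you use:

    import sqlite3, time

    db = sqlite3.connect("memories.db")
    db.execute("CREATE TABLE IF NOT EXISTS memories (ts REAL, topic TEXT, content TEXT)")

    def save_memory(topic: str, content: str) -> str:
        db.execute("INSERT INTO memories VALUES (?, ?, ?)", (time.time(), topic, content))
        db.commit()
        return "saved"

    def search_memory(query: str, limit: int = 5) -> list:
        rows = db.execute(
            "SELECT content FROM memories WHERE topic LIKE ? OR content LIKE ? "
            "ORDER BY ts DESC LIMIT ?",
            (f"%{query}%", f"%{query}%", limit),
        ).fetchall()
        return [r[0] for r in rows]

The hard part, as the parent found, isn't the storage - it's getting the model to decide on its own when to call these, which seems to need either heavy prompt pressure or the tool being trained in.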
Regarding https://www.shloked.com/writing/chatgpt-memory-bitter-lesson
I am very confused about whether the author thinks ChatGPT is injecting those prompts when memory is not enabled. If your memory is not enabled, it's pretty clear, at least in my instance, that there is no metadata of recent conversations or personal preferences injected. The conversation stays stand-alone for that conversation only.
If he was turning memory on and off for the experiment, maybe something got confused, or maybe I just didn't read the article properly?
ChatGPT memory seems weird to me. It knows the company I work at and pretty much our entire stack - but when I go to view its stored memories none of that is written anywhere.
ChatGPT has 2 types of memory: The “explicit” memory you tell it to remember (sometimes triggers when it thinks you say something important) and the global/project level automated memory that are stored as embeddings.
The explicit memory is what you see in the memory section of the UI and is pretty much injected directly into the system prompt.
The global embeddings memory is accessed via runtime vector search.
Sadly I wish I could disable the embeddings memory and keep the explicit. The lossy nature of embeddings make it hallucinate a bit too much for my liking and GPT-5 seems to have just made it worse.
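As I understand that split, the shape is roughly this - all details guessed, with vector_search and call_model as stubs standing in for the real retrieval and completion calls:

    def vector_search(query: str, top_k: int = 5) -> list:
        # Stand-in for embedding search over stored conversation history.
        return ["(stub) user previously discussed their deployment pipeline"][:top_k]

    def call_model(system_prompt: str, user_message: str) -> str:
        # Stand-in for the actual chat completion request.
        return f"[model saw {len(system_prompt)} chars of context] reply to: {user_message}"

    explicit_memories = [
        "User is a software engineer.",   # the editable entries visible in the memory UI
        "User prefers concise answers.",
    ]

    def answer(base_prompt: str, user_message: str) -> str:
        # Path 1: explicit memory is concatenated straight into the system prompt.
        system = base_prompt + "\n\nSaved memories:\n" + "\n".join(explicit_memories)
        # Path 2: embedding memory is fetched per request and appended as extra context.
        retrieved = vector_search(user_message)
        system += "\n\nPossibly relevant history:\n" + "\n".join(retrieved)
        return call_model(system, user_message)

    print(answer("You are a helpful assistant.", "How should I structure my CI config?"))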
I am often surprised how Claude Code makes efficient and transparent(!) use of memory in the form of "to do lists" in agent mode. I sometimes miss this in the web/desktop app in long conversations.
> Anthropic's more technical users inherently understand how LLMs work.
good (if superficial) post in general, but on this point specifically, emphatically: no, they do not -- no shade, nobody does, at least not in any meaningful sense
Understanding how they work in the sense that permits people to invent and implement them, that provides the exact steps to compute every weight and output, is not "meaningful"?
There is a lot left to learn about the behaviour of LLMs, higher-level conceptual models to be formed to help us predict specific outcomes and design improved systems, but this meme that "nobody knows how LLMs work" is out of control.
LLMs are understood to the extent that they can be built from the ground up. Literally every single aspect of their operation is understood so thoroughly that we can capture it in code.
If you achieved an understanding of how the human brain works at that level of detail, completeness and certainty, a Nobel prize wouldn't be anywhere near enough. They'd have to invent some sort of Giganobel prize and erect a giant golden statue of you in every neuroscience department in the world.
But if you feel happier treating LLMs as fairy magic, I've better things to do than argue.
Inherent means implicit or automatic as far as I understand it. I have an inherent understanding of my own need for oxygen and food.
I don't have an inherent understanding of English, although I use it regularly.
Treating LLMs as fairy magic doesn't make me feel any happier, for whatever it's worth. But I'm not interested in arguing either.
I never intended to make any claims about how well the principles of LLMs can be understood. Just that none of that understanding is inherent. I don't know why they used that word, as it seems to weaken the post.
If we are going to create a binary of "understand LLMs" vs "do not understand LLMs", then one way to do it is as you describe; fully comprehending the latent space of the model so you know "why" it's giving a specific output.
This is likely (certainly?) impossible. So not a useful definition.
Meanwhile, I have observed a very clear binary among people I know who use LLMs; those who treat it like a magic AI oracle, vs those who understand the autoregressive model, the need for context engineering, the fact that outputs are somewhat random (hallucinations exist), setting the temperature correctly...
I should've been clearer, but what I meant was language models 101. Normal people don't understand even basics like LLMs are stateless by default and need to be given external information to "remember" things about you. Or, what is a system prompt.
Thanks for this generalization, but of course there is a broad range of understanding how to improve usefulness and model tweaks across the meat populace.
ChatGPT is quickly approaching (perhaps bypassing?) the same concerns that parents, teachers, psychologists had with traditional social media. It's only going to get worse, but trying to stop technological progress will never work. I'm not sure what the answer is. That they're clearly optimizing for people's attention is more worrisome.
> That they're clearly optimizing for people's attention is more worrisome.
Running LLMs is expensive and we can swap models easily. The fight for attention is on, it acts like an evolutionary pressure on LLMs. We already had the sycophantic trend as a result of it.
Seems like either a huge evolutionary advantage for the people who can exploit the (sometimes hallucinating sometimes not) knowledge machine, or else a huge advantage for the people who are predisposed to avoid the attention sucking knowledge machine. The ecosystem shifted, adapt or be outcompeted.
> Seems like either a huge evolutionary advantage for the people who can exploit the (sometimes hallucinating sometimes not) knowledge machine, or else a huge advantage for the people who are predisposed to avoid the attention sucking knowledge machine. The ecosystem shifted, adapt or be outcompeted.
Rather: use your time to learn serious, deep knowledge instead of wasting your time reading (and particularly: spreading) the science-fiction stories the AI bros tell all the time. These AI bros are insanely biased since they will likely lose a lot of money if these stories turn out to be false, or likely even if people stop believing in these science-fiction fairy tales.
Curious about the interaction between this memory behavior and fine-tuning. If the base model has these emergent memory patterns, how do they transfer or adapt when we fine-tune for specific domains?
Has anyone experimented with deliberately structuring prompts to take advantage of these memory patterns?
> Anthropic's more technical users inherently understand how LLMs work.
Yes, I too imagine these "more technical users" spamming rocketship and confetti emojis absolutely _celebrating_ the most toxic code contributions imaginable to some of the most important software out there in the world. Claude is the exact kind of engineer (by default) you don't want in your company. Whatever little reinforcement learning system/simulation they used to fine-tune their model is a mockery of what real software engineering is.
The elephant in the room is that AGI doesn't need ads to make revenue but a new Google does. The words aren't matching with the actions.
The bigger elephant in the room is that LLMs will never be AGI, even by the purely economic definition many LLM companies use.
To reword the downvoted sibling commenter's intended point:
> The elephant in the room is that AGI doesn't need ads to make revenue
It may not need ads to make revenue, but does it need ads to make profit?
Has it? Made revenue, I mean.
You can question the profits, but revenue is already there.
Obviously yes, AI makes revenue.
Don't fool yourself into thinking Anthropic won't be serving up personalized ads too.
Anthropic seems to want to make you buy a subscription, not show you ads.
ChatGPT seems to be more popular to those who don't want to pay, and they are therefore more likely to rely on ads.
In the 2020s, subscriptions don't preclude showing ads. Companies will milk money in as many ways as they can
And even a subscription that gives a truly ad-free experience doesn't preclude the bit that I actually object to most: collecting data about me & my activity and selling it on.
(Netflix as an example)
And cable companies, and magazines. This is not something from the 2020s, it is a centuries old thing.
But these are entertainment. For all the time advertising has been present, work tools have been relatively immune. I don't remember seeing ads in IDE for instance, and while magazines had ads, technical documents didn't. I have never seen electronic components datasheets pitching for measuring equipment and soldering irons for instance.
That's why I don't expect Anthropic to go with ads if they follow the path they seem to have taken, like coding agents. People using these tools are likely to react very badly to ads, if there is some space to put ads in the first place, and these are also the kind of people who can spend $100/month on a subscription, way more than what ads will get you.
They might be coming from different directions. But these things, as often they do, will converge. Too big of a market to leave.
and netflix used to think they don't want to show ads either.
Netflix likely doesn't want to show ads, but the market would rather watch ads than pay full price for a service.
https://www.theverge.com/news/667042/netflix-ad-supported-ti...
> the market would rather watch ads
no, netflix wants more income, and by having a product be ad supported, they can try to earn more.
The "market" is not a person, and doesn't have "wants".
From the article:
> Netflix has more than doubled the number of people watching its ad-supported tier over the last year. At its upfront presentation for advertisers on Wednesday, the company revealed that the $7.99 per month plan now reaches more than 94 million users around the world each month – a big increase from the 40 million it reported in May 2024 and the 70 million it revealed last November.
1/3 of Netflix users (the market) prefer ads over paying to avoid them.
This leaves me somewhere between surprised and shocked.
Maybe you shouldn't be. The ad-hating paranoid HN user is not representative of the general population. Probably the exact opposite, in fact.
My wife and mother love ads, they are always on the hunt for the latest good deals and love discount shopping. When I tried to remove the ads on their computers or in the postal mail, they protested. I think they are far more representative of the general population.
People opting for "free with ads" makes sense.
It's the "pay but still get ads" thing that gets me, but I guess some people just want to pay the bare minimum.
Yeah, I've encountered more than one person who didn't want me to install ublock origin for them because "Then I won't see any ads".
People have different preferences ¯\_(ツ)_/¯
as a former paying user it felt more like they were buying my subscription with a decent product so that they could sell their business prospects to investors by claiming a high subscription count.
I have never encountered such bad customer service anywhere -- and at 200 bucks a month at that.
Can you elaborate on the "bad customer service"? I've never engaged in Claude's support team, but curious to know what you've experienced.
So ChatGPT will become a "salesman". And I do not trust any salesman.
You shouldn't be trusting an LLM either, so this is a real sideways move.
They're all salesmen, they were trained on the web which is jam packed with SEO content.
Interesting point. Never thought about AI slop being fed by SEO slop.
The plan is not ads said by chatgpt - it's ads on the side that are relevant to the conversartion (or you in general). Or affiliate links. That's my understanding.
My conjecture is that their memory implementation is not aimed at building a user profile. I don't know if they would or would not serve ads in the future, but it's hard to see how the current implementation helps them in that regard.
> I don't know if they would or would not serve ads in the future
There are 2 possible futures:
1) You are served ads based on your interactions
2) You pay a subscription fee equal to the amount they would have otherwise earned on ads
I highly doubt #2 will happen. (See: Facebook, Google, twitter, et al)
Let’s not fool ourselves. We will be monetized.
And model quality will be degraded to maximize profits when competition in the LLM space dies down.
It’s not a pretty future. I wouldn’t be surprised if right now is the peak of model quality, etc. Peak competition, everyone is trying to be the best. That won’t continue forever. Eventually everyone will pivot their priority towards monetization rather than model quality/training.
Hopefully I’m wrong.
But aren't we only worth something like $300/year each to Meta in terms of ads? I remember someone arguing something like that when the TikTok ban was being passed into law... essentially the argument was that TikTok was "dumping" engagement at far below market value (at something like $60/year) to damage American companies. That was the argument I remember, anyway.
Here is some old analysis I remember seeing at the time of Hulu ads vs no-ads plans: https://ampereanalysis.com/insight/hulus-price-drop-is-a-wis...
They dropped the price $2/mo on their with-ads plan to make a bigger gap between the no-ads plan and the ads plan, and the analyst here looks at their reported ad revenue and user numbers to estimate $12/mo per user from ads.
Whether Meta across all their properties does more than $144/yr in ads is an open question; long-form video ads are sold at a premium but Facebook/IG users see a LOT of ads across a lot of Meta platforms. The biggest advantage in ad-$-per-user Hulu has is that it's US-only. ChatGPT would also likely be considered premium ad inventory, though they'd have a delicate dance there around keeping that inventory high-value, and selling enough ads to make it worthwhile, without pissing users off too much.
Here they estimate a much lower number for ad revenue per Meta user, like $45 a year - https://www.statista.com/statistics/234056/facebooks-average... - but that's probably driven disproportionately by wealthy users in the US and similar countries compared to the long tail of global users.
One problem for LLM companies compared to media companies is that the marginal cost of offering the product to additional users is quite a bit higher. So business models, ads-or-subscription, will be interesting to watch from a global POV there.
One wonders what the monetization plan for the "writing code with an LLM using OSS libraries and not interested in paying for enterprise licenses and such" crowd will be. What sort of ads can you pull off in those conversations?
If that’s the case, we have an even bigger problem on our hands. How will these companies ever be profitable?
If we’re already paying $20/mo and they’re operating at a loss, what’s the next move (assuming we’re only worth an extra $300/yr with ads?)
The math doesn’t add up, unless we stop training new models and degrade the ones currently in production, or have some compute breakthrough that makes hardware + operating costs orders of magnitude cheaper.
OpenAI has already started degrading their $20/month tier by automatically routing most of the requests to the lightest free-tier models.
We're very clearly heading toward a future where there will be a heavily ad-supported free tier, a cheaper (~$20/month) consumer tier with no ads or very few ads, and a business tier ($200-$1000/month) that can actually access state of the art models.
Like Spotify, the free tier will operate at a loss and act as a marketing funnel to the consumer tier, the consumer tier will operate at a narrow profit, and the business tier for the best models will have wide profit margins.
I find that hard to believe. As long as we have open weight models, people will have an alternative to these subscriptions. For $200 a month it is cheaper to buy a GPU with lots of memory or rent a private H200. No ads and no spying. At this point the subscriptions are mainly about the agent functionality and not so much the knowledge in the models themselves.
H200 rental prices currently start at $2.35 per hour, or $1700 per month. Even if you just rent for 4h a day, the $200 subscription is still quite a bit cheaper. And I'm not even sure that the highest-quality open models run on a single H200.
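To put rough numbers on that comparison, here is a quick back-of-envelope. The $2.35/hour figure is the one quoted above; the usage patterns are assumptions, not data:

```python
# Back-of-envelope: renting an H200 vs. a $200/month subscription.
# The $2.35/hour rental rate comes from the comment above; the usage
# patterns below are illustrative assumptions.

HOURLY_RATE = 2.35          # USD per hour for a single H200
SUBSCRIPTION = 200.00       # USD per month

def monthly_rental_cost(hours_per_day: float, days: int = 30) -> float:
    """Cost of renting for a fixed number of hours every day."""
    return HOURLY_RATE * hours_per_day * days

for hours in (24, 8, 4, 2):
    cost = monthly_rental_cost(hours)
    cheaper = "rental" if cost < SUBSCRIPTION else "subscription"
    print(f"{hours:>2} h/day -> ${cost:>7.2f}/month ({cheaper} is cheaper)")

# 24 h/day -> $1692.00/month (subscription is cheaper)
#  8 h/day -> $ 564.00/month (subscription is cheaper)
#  4 h/day -> $ 282.00/month (subscription is cheaper)
#  2 h/day -> $ 141.00/month (rental is cheaper)
```

So at that rental price, the subscription only loses out for very light usage, and that's before counting the fact that the biggest open models may not fit on a single H200 anyway.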
I think what you're missing here is most OpenAI users aren't technical in the slightest. They have massive and growing adoption from the general public. The general public buy services, not roll their own for free, and they even prefer to buy service from the brand they know over getting cheaper service from somebody else.
The conclusion I got from their comment was that the highest margin tier (the business customers) would be incentivized to build their own service instead of paying the subscription. Of course, I am doubtful that for the vast majority of businesses this is viable or at all more cost effective, when a service like AWS is highly popular and extremely profitable.
Most? Almost all my requests to the "Auto" model end up being routed to a "thinking" model, even those I think ChatGPT would be able to answer fine without extra reasoning time. Never say never, but right now the router doesn't seem to be optimising for cost (at least for me), it really does seem to be selecting a model based on the question itself.
> If we’re already paying $20/mo and they’re operating at a loss
I'm quite confident they're not operating at a loss on those subscriptions.
They are running at a massive loss overall - feels pretty safe to assume that they wouldn't be if their cheapest subscription tier was breaking even
Their cheapest tier is free, they lose money on that of course. And spend a lot of money training new models.
Anthropic has said they have made money on every model so far, just not enough to train the next model, which so far has been much more costly to train every generation. At some point they will probably train an unprofitable model if training costs keep rising dramatically.
OpenAI burns more money on their free tier and might be spending more money building out for future training (I don't know if they do or not) but they both make money on their $20 subscriptions for sure. Inference is very cheap.
nonsense for the public. they are Amazon, basically. they take the loss so the overall ecosystem ( x'D like with crypto ) can gain massively, onboard all kinds of target noobs, sry, groups, brutally prime users, discourage as many non-AI processes as possible and steer all industries towards replacing even those processes with AI that are not worth being replaced with AI, like writing and art.
of course there are a lot valuable use cases. irrelevant in the context, though.
the productivity boosts in the creative industries will additionally lower the standards and split the public even further, ensuring that if you want quality, you have to fuck over as many people as possible, so that you can afford quality ( and an ad-free life, of course. if you want a peaceful peripheral, pay up. it's extortion 404, 101 - 303 already successfully implemented on social media, TV and the radio ).
they don't lose. they make TONS OF FAKE MONEY everywhere in the, again, cough, "ecosystem".
It's important to understand the Amazon part. The number of damaging mechanisms that platform has anchored in workers, jobbers, business people and consumers is brutal.
All those mechanisms converge in more, easy money and a quicker deterioration of local environments, leading to worse health and more business opportunities that aim at mitigating damage; almost entirely in vain, of course, because the worst is accelerating much quicker; it's easier money.
At the same time, people's psychology is primed for bad business practices, literally making people dumber and lowering their standards to make them easier targets. Don't look at the bottom to see this, look at the upper middle class and above.
It's a massive net loss for civilization and humanity. A brutal net negative impact overall.
Well, to make things worse, I was pretty convinced those were faked numbers to push the TikTok ban forward. I really doubt Meta and Google are each taking in this much per user. But my point is more that even if it were that high,
ChatGPT isn't going to capture all the engagement. And even then I don't know whether $300 is much, particularly after subtracting operating overhead. I'm just saying I have trouble believing there's gold to be had at the end of this LLM ad rainbow. People just seem to throw out ideas like "ads!" as if it's a surefire winning lottery ticket or something.
Everything devolves into ads eventually. Why would productized LLMs be any different?
3) AIs will steer you towards a problem for which one product is the obvious solution without directly mentioning that product, so you'll think you're getting (2) while actually getting (1).
3) You pay a subscription fee, and are force-fed ads anyway.
Imagine a model where a user can earn “token allowances” through some kind of personal contribution or value add.
It's very interesting to ask Claude what ads it would show you based on your past interactions.
Though in general I like the idea of personal ads for products (NOT political ads), I've never seen an implementation that I felt comfortable with. I wonder if Anthropic might be able to nail that. I'd love to see products that I'm specifically interested in, so long as the advertisement itself is not altered to fit my preferences.
> Though in general I like the idea of personal ads for products (NOT political ads), I've never seen an implementation that I felt comfortable with.
No implementation will work for very long when the incentives behind it are misaligned.
The most important part of the architecture is that the user controls it for the user's best interests.
There is no such thing as a good flow for showing sponsored items in an LLM workflow.
The point of using an LLM is to find the thing that matches your preferences the best. As soon as the amount of money the LLM company makes plays into what's shown, the LLM is no longer aligned with the user, and no longer a good tool.
Same can be said for search. And your statement is provably correct, depending on the definition of "good tool."
But it's not only money's influence on the company, it's also money's influence on the /data/ underlying the platform that undermines the tool.
Once financial incentives are in place, what will be the AI equivalent of review bombing, SEO, linkjacking, google bombing, and similar bad behaviors that undermine the quality of the source data?
Claude: "What is my purpose?"
Anthropic: "You serve ad's."
Claude: "Oh, my god."
Jest aside, every paper on alignment wrapped in the blanket of safety is also a move toward the goal of alignment to products. How much does a brand pay to make sure it gets placement in, say, GPT6? How does anyone even price that sort of thing (because in theory it's there forever, or until 7 comes out)? It makes for some interesting business questions and even more interesting sales pitches.
Ads aren't going to be trained into the model. They'll be an ads backend that the model queries with a set of topic tags, just like in traditional web advertising.
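If it does go that way, the plumbing would presumably look less like training ads into weights and more like a retrieval call made alongside the model's answer. A minimal sketch under that assumption; `fetch_sponsored_links`, its inventory, and the tag names are all invented for illustration, not any real ad API:

```python
# Hypothetical sketch of an ad backend queried with topic tags,
# kept separate from the model itself. Nothing here is a real API;
# fetch_sponsored_links and its inventory are made up for illustration.
from dataclasses import dataclass

@dataclass
class SponsoredLink:
    title: str
    url: str
    bid_cpm: float  # what the advertiser pays per thousand impressions

def extract_topic_tags(conversation: str) -> list[str]:
    # In practice this would be a classifier, or the model itself emitting
    # tags; here it's a trivial keyword match.
    catalog = {"mattress": "home", "flight": "travel", "sneakers": "apparel"}
    text = conversation.lower()
    return sorted({tag for keyword, tag in catalog.items() if keyword in text})

def fetch_sponsored_links(tags: list[str]) -> list[SponsoredLink]:
    # Stand-in for a call to an ad exchange / SSP keyed on topic tags.
    inventory = {
        "travel": [SponsoredLink("CheapFlights+", "https://example.com/fly", 9.50)],
        "home": [SponsoredLink("DreamFoam", "https://example.com/sleep", 4.20)],
    }
    return [link for tag in tags for link in inventory.get(tag, [])]

if __name__ == "__main__":
    convo = "I keep waking up sore; is it time for a new mattress?"
    tags = extract_topic_tags(convo)
    # The model's answer stays untouched; the ads render alongside it.
    print(tags, fetch_sponsored_links(tags))
```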
It's going to be interesting if ChatGPT actually hooks up with SSPs and dumps a whole "user preference" embedding vector to the ad networks.
I’ll be concerned when ex-Yelp “growth strategists” start showing up at OpenAI and leverage the same extortionist techniques.
The models aren’t static, we have to build validation sets to measure model drift and modify our prompts to compensate.
Could be part of a LoRA or some other kind of plug-in refinement.
Why would their way of handling memory for conversations have much to do with how they will analyse your user profile for ads? They have access to all your history either way and can use that to figure out what products to recommend, or ads to display, no?
It's about weaving the ads into the LLM responses, both overtly and more subtly.
There's the ads that come before the movie and then the ads that are part of the dialog, involved in the action, and so on. Apple features heavily in movies and TV series when people are using a computer, for example. There's payments for car models to be the one that's driven in chase scenes. There's even payments for characters to present the struggles that form core pain points that specific products are category leaders in solving.
Why do you see a "clear directionality" leading to ads? This is not obvious to me. ChatGPT is not social media; they do not have to monetize in the same way.
they are making plenty of money from subscriptions, not to mention enterprise, business and API
Altman has said numerous times that none of the subscriptions make money currently, and that they've been internally exploring ads in the form of product recommendations for a while now.
Source? First time I’ve heard of it.
"We haven't done any advertising product yet. I kind of...I mean, I'm not totally against it. I can point to areas where I like ads. I think ads on Instagram, kinda cool. I bought a bunch of stuff from them. But I am, like, I think it'd be very hard to…I mean, take a lot of care to get right."
https://mashable.com/article/openai-ceo-sam-altman-open-to-a...
> Altman has said numerous times that none of the subscriptions make money currently
For this
None of the "AI" companies are profitable currently. Everything devolves into selling ADs eventually. What makes you think LLMs are special?
One has a more obvious route to building a profile directly off that already collected data.
And while they are making lots of revenue, even they have admitted in recent interviews that ChatGPT on its own is still not (yet) breakeven. With the kind of money invested in AI companies in general, introducing very targeted ads is an obvious way to monetize the service more.
This is an incorrect understanding of unit economics. They are not breaking even only because of reinvestment into R&D.
Presumably they would offer both models (ads & subscriptions) to reach as many users as possible, provided that both models are net profitable. I could see free versions having limits to queries per day, Tinder style.
The router introduced in GPT-5 is probably the biggest signal. A router, while determining which model to route a query to, can determine how much $$ a query is worth. (Query here means conversation.) This helps decide the amount of compute OpenAI should spend on it. High value queries -> more chances of affiliate links + in-context ads.
Then, the way the memory profile is stored clearly mirrors personalization. Ads work best when they are personalized as opposed to contextual or generic. (Google ads are personalized based on your profile and context.) And then there's the change in branding from being the intelligent agent to being a companion app (and the hiring of Fidji Simo). There are more things here, I just gave a very high level overview, but people have written detailed blogs on it. I personally think the affiliate links they can earn from align the incentives for everyone. They are a kind of ad, and that's the direction they are marching towards.
I work at OpenAI and I'm happy to deny this hypothesis.
Our goal for the router (whether you think we achieved it or not) was purely to make the experience smoother and spare people from having to manually select thinking models for tasks that benefit from extra thinking. Without the router, lots of people just defaulted to 4o and never bothered using o3. With the router, people are getting to use the more powerful thinking models more often. The router isn't perfect by any means - we're always trying to improve things - but any paid user who doesn't like it can still manually select the model they want. Our goal was always a smoother experience, not ad injection or cost optimization.
Hi! Thank you for the clarification. I was just saying it might be possible in the future (in the same way you can determine how much compute - which model - a specific query needs today). And the experience has definitely improved with the router, so kudos on that. I don't know what the final form factor of ads would be (I imagine it turning out to be a win-win-win scenario rather than, say, showing ads at the expense of quality. This is a Google-level opportunity to invent something new), just that from the outside it seems you are preparing for monetization by ads, given the large userbase you have and virtually no competition at ChatGPT's usage level.
> they are making plenty of money from subscriptions, not to mention enterprise, business and API
...except that they aren't? They are not in the black, and all that investor money comes with strings.
> At some point they will monetize by ads and affiliate links.
I couldn't agree more. Enshittification has to eat one of these corporations' models. Most likely it will be the corp with the most strings attached to growth (MSFT, FB).
Suppose the user uses an LLM for topics a, b, and c quite often, and d, e, and f less often. Suppose b, c, and f are topics for which OpenAI could offer interruption ads (full-screen commercials of 30 seconds or longer) and most users would sit through them and wait for the response.
All that is needed to do that is to analyze topics.
Now suppose that OpenAI can analyze 1000 chats and coding sessions and its algorithm determines that it can maximize revenue by leading the user to get a job at a specific company and then buy a car from another company. It could "accomplish" this via interruption ads or by modifying the quality or content of its responses to increase the chances of those outcomes happening.
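To make the first scenario concrete, all it would take is a frequency table over topic labels plus a list of topics where users tolerate interruptions. A minimal sketch, with the topic labels and the tolerance set taken from the hypothetical above (they are placeholders, not anything a vendor has described):

```python
# Illustrative only: picking which conversation topics could carry
# interruption ads, given per-topic usage counts. The labels and the
# "tolerates interruption" set come from the hypothetical above.
from collections import Counter

chat_topics = ["a", "b", "a", "c", "b", "a", "d", "f", "b", "c", "e"]
tolerates_interruption = {"b", "c", "f"}   # per the hypothetical above

usage = Counter(chat_topics)
ad_slots = {
    topic: count
    for topic, count in usage.most_common()
    if topic in tolerates_interruption
}
print(ad_slots)  # {'b': 3, 'c': 2, 'f': 1}
```

The second scenario (steering responses toward paid outcomes) needs no extra machinery either, which is exactly what makes it hard to detect from the outside.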
While both of these are in some way plausible and dystopian, all it takes is DeepSeek running without ads and suddenly the bar for how good closed source LLMs have to be to get market share is astronomically higher.
In my view, LLMs will be like any good or service: users will pay for quality, but different users will demand different levels of quality.
Advertising would seemingly undermine the credibility of the AI's answers, and so I think full screen interruption ads are the most likely outcome.
ChatGPT is designed to be addictive, with secondary potential for economic utility. Claude is designed to be economically useful, with secondary potential for addiction. That’s why.
In either case, I’ve turned off memory features in any LLM product I use. Memory features are more corrosive and damaging than useful. With a bit of effort, you can simply maintain a personal library of prompt contexts that you can just manually grab and paste in when needed. This ensures you’re in control and maintains accuracy without context rot or falling back on the extreme distortions that things like ChatGPT memory introduce.
The link to the breakdown of ChatGPT's memory implementation is broken, the correct link is: https://www.shloked.com/writing/chatgpt-memory-bitter-lesson
This is really cool, I was wondering how memory had been implemented in ChatGPT. Very interesting to see the completely different approaches. It seems to me like Claude's is better suited for solving technical tasks while ChatGPT's is more suited to improving casual conversation (and, as pointed out, future ads integration).
I think it probably won't be too long before these language-based memories look antiquated. Someone is going to figure out how to store and retrieve memories in an encoded form that skips the language representation. It may actually be the final breakthrough we need for AGI.
> It may actually be the final breakthrough we need for AGI.
I disagree. As I understand them, LLMs right now don’t understand concepts. They actually don’t understand, period. They’re basically Markov chains on steroids. There is no intelligence in this, and in my opinion actual intelligence is a prerequisite for AGI.
I don’t understand the argument “AI is just XYZ mechanism, therefore it cannot be intelligent”.
Does the mechanism really disqualify it from intelligence if behaviorally, you cannot distinguish it from “real” intelligence?
I’m not saying that LLMs have certainly surpassed the “cannot distinguish from real intelligence” threshold, but saying there’s not even a little bit of intelligence in a system that can solve more complex math problems than I can seems like a stretch.
> if behaviorally, you cannot distinguish it from “real” intelligence?
Current LLMs are a long way from there.
You may think "sure seems like it passes the Turing test to me!" but they all fail if you carry on a conversation long enough. AIs need some equivalent of neuroplasticity and as of yet they do not have it.
This is what I think is the next evolution of these models. Our brains are made up of many different types of neurones all interspersed with local regions made up of specific types. From my understanding most approaches to tensors don't integrate these different neuronal models at the node level; it's usually by feeding several disparate models data and combining an end result. Being able to reshape the underlying model and have varying tensor types that can migrate or have a lifetime seems exciting to me.
I don't see the need to focus on "intelligent" compared to "it can solve these problems well, and can't solve these other problems".
What's the benefit of calling something "intelligent"?
Strongly agree with this. When we were further from AGI, many people imagined that there is a single concept of AGI that would be obvious when we reached it. But now, we're close enough to AGI for most people to realize that we don't know where it is. Most people agree we're at least moving more towards it than away from it, but nobody knows where it is, and we're still more focused on finding it than on making useful things.
Scientifically, intelligence requires organizational complexity. And has for about a hundred years.
That does actually disqualify some mechanisms from counting as intelligent, as the behaviour cannot reach that threshold.
We might change the definition - science adapts to the evidence, but right now there are major hurdles to overcome before such mechanisms can be considered intelligent.
What is the scientific definition of intelligence? I assume that it is comprehensive, internally consistent, and that it fits all of the things that are obviously intelligent and excludes the things which are obviously not intelligent. Of course, being scientific, I assume it is also falsifiable.
It can’t learn or think unless prompted, then it is given a very small slice of time to respond and then it stops. Forever. Any past conversations are never “thought” of again.
It has no intelligence. Intelligence implies thinking and it isn’t doing that. It’s not notifying you at 3am to say “oh hey, remember that thing we were talking about. I think I have a better solution!”
No. It isn’t thinking. It doesn’t understand.
Just because it's not independent and autonomous does not mean it could not be intelligent.
If existing humans minds could be stopped/started without damage, copied perfectly, and had their memory state modified at-will would that make us not intelligent?
> Just because it's not independent and autonomous does not mean it could not be intelligent.
So to rephrase: it’s not independent or autonomous. But it can still be intelligent. This is probably a good time to point out that trees are independent and autonomous. So we can conclude that LLMs are possibly as intelligent as trees. Super duper.
> If existing humans minds could be stopped/started without damage, copied perfectly, and had their memory state modified at-will would that make us not intelligent?
To rephrase: if you take something already agreed to as intelligent, and changed it, is it still intelligent? The answer is, no damn clue.
These are worse than weak arguments, there is no thesis.
The thesis is that "intelligence" and "independence/autonomy" are independent concepts. Deciding whether LLMs have independence/autonomy does not help us decide if they are intelligent.
It sounds like you are saying the only difference is that human stimulus streams don't shut on and off?
If you were put into a medically induced coma, you probably shouldn't be considered intelligent either.
I think that’s a valid assessment of my argument, but it goes further than just “always on”. There’s an old book called On Intelligence that asked these kinds of questions 20+ years ago (of AI), I don’t remember the details, but a large part of what makes something intelligent doesn’t just boil down to what you know and how well you can articulate it.
For example, we as humans aren’t even present in the moment — different stimuli take different lengths of time to reach our brain, so our brain creates a synthesis of “now” that isn’t even real. You can’t even play table tennis unless you can predict up to one second into the future with enough detail to be in the right place to hit the ball before you return it to your opponent.
Meanwhile, an AI will go off-script during code changes, without running it by the human. It should be able to easily predict the human is going to say “wtaf” when it doesn’t do what is asked, and handle that potential case BEFORE it’s an issue. That’s ultimately what makes something intelligent: the ability to predict the future, anticipate issues, and handle them.
No AI currently does this.
What it really boils down to is "the machine doesn't have a soul". Just an unfalsifiable and ultimately meaningless objection.
Incorrect. Vertebrate animal brains update their neural connections when interacting with the environment. LLMs don't do that. Their model weights are frozen for every release.
But why can’t I then just say, actually, you need to relocate the analogy components; activations are their neural connections, the text is their environment, the weights are fixed just like our DNA is, etc.
Maybe panpsychism is true and the machine actually does have a soul, because all machines have souls, even your lawnmower. But possibly the soul of a machine running a frontier AI is a bit closer to a human soul than your lawnmower’s soul is.
By that logic, Larry Ellison would have a soul. You've disproven panpsychism! Congratulations!
Maybe the soul is not as mysterious as we think it is?
There is no empirical test for souls.
>They’re basically Markov chains on steroids. There is no intelligence in this, and in my opinion actual intelligence is a prerequisite for AGI.
This argument is circular.
A better argument should address (given the LLM successes in many types of reasoning, passing the turing test, and thus at producing results that previously required intelligence) why human intelligence might not also just be "Markov chains on even better steroids".
Humans think even when not being prompted by other humans, and in some cases can learn new things by having intuition make a concept clear or by performing thought experiments or by combining memories of old facts and new facts across disciplines. Humans also have various kinds of reasoning (deductive, inductive, etc.). Humans also can have motivations.
I don’t know if AGI needs to have all human traits but I think a Markov chain that sits dormant and does not possess curiosity about itself and the world around itself does not seem like AGI.
>Humans think even when not being prompted by other humans
That's more of an implementation detail. Humans take constant sensory input and have some sort of way to re-introduce input later (e.g. remember something).
Both could be added (even trivially) to LLMs.
And it's not at all clear human thought is constant. It just appears so in our naive intuition (the same way we see a movie as moving, not as 24 static frames per second). It's a discontinuous mechanism (propagation time, etc.), and this has been shown (e.g. EEG/MEG show the brain samples sensory input in a periodic pattern, stimuli with small time differences are lost - as if there is a blind window regarding perception, etc.).
>and in some cases can learn new things by having intuition make a concept clear or by performing thought experiments or by combining memories of old facts and new facts across disciplines
Unless we define intuition in a way that excludes LLM style mechanisms a priori, who's to say LLMs don't do all those things as well, even if in a simpler way?
They've been shown to combine stuff across disciplines, and also to develop concepts not directly in their training set.
And "performing thought experiments" is not that different than the reasoning steps and backtracking LLMs also already do.
Not saying LLMs are on parity with human thinking/consciousness. Just that it's not clear that they aren't doing more or less the same thing, even at reduced capacity and with a different architecture and runtime setup.
The environment is constantly prompting you. That ad you see of Coca Cola is prompting you to do something. That hunger feeling is prompting “you” to find food. That memory that makes you miss someone is another prompt to find that someone - or to avoid.
Sometimes the prompt is outside your body; other times it is inside.
You’re going to be disappointed when you realize one day what you yourself are.
What is "actual intelligence" and how are you different from a Markov chain?
Roughly, actual intelligence needs to maintain a world model in its internal representation, not merely an embedding of language, which is a very different data structure and probably will be learned in a very different way. This includes things like:
- a map of the world, or concept space, or a codebase, etc
- causality
- "factoring" which breaks down systems or interactions into predictable parts
Language alone is too blurry to do any of these precisely.
>Roughly, actual intelligence needs to maintain a world model in its internal representation
And how's that not like stored information (memories) and weighted links between each and/or between groups of them?
It probably is a lot like that! I imagine it's a matter of specializing the networks and learning algorithms to converge to world-model-like-structures rather than language-like-ones. All these models do is approximate the underlying manifold structure, just, the manifold structure of a causal world is different from that of language.
> Roughly, actual intelligence needs to maintain a world model in its internal representation
This is GOFAI metaphor-based development, which never once produced anything useful. They just sat around saying things like "people have world models" and then decided if they programmed something and called it a "world model" they'd get intelligence, it didn't work out, but then they still just went around claiming people have "world models" as if they hadn't just made it up.
An alternative thesis "people do things that worked the last time they did them" explains both language and action planning better; eg you don't form a model of the contents of your garbage in order to take it to the dumpster.
https://www.cambridge.org/core/books/abs/computation-and-hum...
I see no reason to believe an effective LLM-scale "world-modeling" model would look anything like the kinds of things previous generations of AI researchers were doing. It will probably look a lot more like a transformer architecture--big and compute intensive and with a fairly simple structure--but with a learning process which is different in some key way that make different manifold structures fall out.
Please check an example #2 here: https://github.com/PicoTrex/Awesome-Nano-Banana-images/blob/...
It is not "language alone" anymore. LLMs are multimodal nowadays, and it's still just the beginning.
And keep in mind that these results are produced by a cheap, small and fast model.
I thought you were making an entirely different point with your link, since the lag caused the page to show just the upskirt render until the rest of the images loaded in and it could scroll to the reference of your actual link
Anyway, I don't think that's the flex you think it is since the topology map clearly shows the beginning of the arrow sitting in the river and the rendered image decided to hallucinate a winding brook, as well as its little tributary to the west, in view of the arrow. I am not able to decipher the legend [that ranges from 100m to 500m and back to 100m, so maybe the input was hallucinated, too, for all I know] but I don't obviously see 3 distinct peaks nor a basin between the snow-cap and the smaller mound
I'm willing to be more liberal for the other two images, since "instructions unclear" about where the camera was positioned, but for the topology one, it had a circle
I know I'm talking to myself, though, given the tone of every one of these threads
Every one of those is the wrong angle
What I mean is that the current generation of LLMs don’t understand how concepts relate to one another. Which is why they’re so bad at maths for instance.
Markov chains can’t deduce anything logically. I can.
> What I mean is that the current generation of LLMs don’t understand how concepts relate to one another.
They must be able to do this implicitly; otherwise why are their answers related to the questions you ask them, instead of being completely offtopic?
https://phillipi.github.io/prh/
A consequence of this is that you can steal a black box model by sampling enough answers from its API because you can reconstruct the original model distribution.
The definition of 'Markov chain' is very wide. If you adhere to a materialist worldview, you are a Markov chain. [Or maybe the universe viewed as a whole is a Markov chain.]
> Which is why they’re so bad at maths for instance.
I don't think LLMs currently are intelligent. But please show a GPT-5 chat where it gets any math problem wrong, that most "intelligent" people would get right.
You and Chomsky are probably the last 2 persons on earth to believe that.
It wouldn't matter if they are both right. Social truth is not reality, and scientific consensus is not reality either (just a good proxy of "is this true", but its been shown to be wrong many times - at least based on a later consensus, if not objective experiments).
Nah. There are whole communities that maintain a baseless, but utterly confident dismissive stance. Look in /r/programming, for example.
For one thing, I have internal state that continues to exist when I'm not responding to text input; I have some (limited) access to my own internal state and can reason about it (metacognition). So far, LLMs do not, and even when they claim they are, they are hallucinating https://transformer-circuits.pub/2025/attribution-graphs/bio...
I completely agree. LLMs only do call and response. Without the call there is no response.
Would a human born into a sensory deprivation chamber ever make a call?
Very likely a human born in sensory deprivation would not develop consciousness as we understand it. Infants deprived of socialization exhibit severe developmental impairment, and even a Romanian orphanage is a less deprived environment than an isolation chamber.
https://pmc.ncbi.nlm.nih.gov/articles/PMC10977996/
>For one thing, I have internal state that continues to exist when I'm not responding to text input
Do you? Or do you just have memory and are run on a short loop?
Whilst all the choices you make tend to be in the grey matter, the rest of you does have internal state - mostly in your white matter.
https://scisimple.com/en/articles/2025-03-22-white-matter-a-...
>Whilst all the choices you make tend to be in the grey matter, the rest of you does have internal state - mostly in your white matter.
Yeah, but so? Does the substrate of the memory ...matter? (pun intended)
When I wrote memory above it could refer to all the state we keep, regardless if it's gray matter, white matter, the gut "second brain", etc.
As the article above attempts to show, there's no loop. Memory and state isn't static. You are always processing, evolving.
That's part of why organizational complexity is one of the underpinnings for consciousness. Because who you are is a constant evolution.
Human brains are not computers. There is no "memory" separate from the "processor". Your hippocampus is not the tape for a Turing machine. Everything about biology is complex, messy and analogue. The complexity is fractal: every neuron in your brain is different from every other one, there's further variation within individual neurons, and likely differential expression at the protein level.
https://pmc.ncbi.nlm.nih.gov/articles/PMC11711151/
> As I understand them, LLMs right now don’t understand concepts.
In my uninformed opinion it feels like there's probably some meaningful learned representation of at least common or basic concepts. It just seems like the easiest way for LLMs to perform as well as they do.
Humans assume that being able to produce meaningful language is indicative of intelligence, because the only way to do this until LLMs was through human intelligence.
Yep. Although the average human also considered proficiency in mathematics to be indicative of intelligence until we invented the pocket calculator, so maybe we're just not smart enough to define what intelligence is.
Your uninformed opinion would be correct
https://www.anthropic.com/news/golden-gate-claude
I hear this a lot, and I often ask the question “what is the evidence that human intelligence is categorically different?”
So far I haven’t received a clear response.
I think it’s the ability to plan for the future and introspection.
I don’t think humans are the only ones to have both these things but that’s what I think of as a way to divide species.
How do you define "LLMs don't understand concepts"?
How do you define "understanding a concept" - what do you get if a system can "understand" concept vs not "understanding" a concept?
Didn't Apple have a paper proving this very thing, or at least addressing it?
That's a good question. I think I might classify that as solving a novel problem. I have no idea if LLMs can do that consistently currently. Maybe they can.
The idea that "understanding" may be able to be modeled with general purpose transformers and the connections between words doesn't sound absolutely insane to me.
But I have no clue. I'm a passenger on this ride.
They are capable of extracting arbitrary semantic information and generalizing across it. If this is not understanding, I don't know what is.
To me, understanding the world requires experiencing reality. LLMs don’t experience anything. They’re just a program. You can argue that living things are also just following a program, but the difference is that they (and I include humans in this) experience reality.
But they're experiencing their training data, their pseudo-randomness source, and your prompts?
Like, to put it in perspective. Suppose you're training a multimodal model. Training data on the terabyte scale. Training time on the weeks scale. Let's be optimistic and assume 10 TB in just a week: that is 16.5 MB/s of avg throughput.
Compare this to the human experience. VR headsets are aiming for what these days, 4K@120 per eye? 12 GB/s at SDR, and that's just vision.
We're so far from "realtime" with that optimistic 16.5 MB/s, it's not even funny. Of course the experiencing and understanding that results from this will be vastly different. It's a borderline miracle it's any human-aligned. Well, if we ignore lossy compression and aggressive image and video resizing, that is.
The human optic nerve is actually closer to 5-10 megabits per second per eye. The brain does much with very little.
> (and I include humans in this) experience reality.
A fellow named Plato had some interesting thoughts on that subject that you might want to look into.
I'm curious what you mean when you say that this clearly is not intelligence because it's just Markov chains on steroids.
My interpretation of what you're saying is that since the next token is simply a function of the proceeding tokens, i.e. a Markov chain on steroids, then it can't come up with something novel. It's just regurgitating existing structures.
But let's take this to the extreme. Are you saying that systems that act in this kind of deterministic fashion can't be intelligent? Like if the next state of my system is simply some function of the current state, then there's no magic there, just unrolling into the future. That function may be complex but ultimately that's all it is, a "stochastic parrot"?
If so, I kind of feel like you're throwing the baby out with the bathwater. The laws of physics are deterministic (I don't want to get into a conversation about QM here, there are senses in which that's deterministic too and regardless I would hope that you wouldn't need to invoke QM to get to intelligence), but we know that there are physical systems that are intelligent.
If anything, I would say that the issue isn't that these are Markov chains on steroids, but rather that they might be Markov chains that haven't taken enough steroids. In other words, it comes down to how complex the next token generation function is. If it's too simple, then you don't have intelligence but if it's sufficiently complex then you basically get a human brain.
Just leaving this here:
https://ai.meta.com/research/publications/large-concept-mode...
Human thinking is also Markov chains on ultra steroids. I wonder if there are any studies out there which have shown the difference between people who can think with a language and people who don't have that language base to frame their thinking process in, based on some of those kids who were kept in isolation from society.
"Superhuman" thinking involves building models of the world in various forms using heuristics. And that comes with an education. Without an education (or a poor one), even humans are incapable of logical thought.
Pretty sure this is wrong - the recent conversation list is not verbatim stored in the context (unlike the actual Memories that you can edit). Rather it seems to me a bit similar to Claude - memories are created per conversation by compressing the conversations and accessed on demand rather than forced into context.
You don't want an AGI. How do you make it obey?
We only have trouble obeying due to eons of natural selection driving us to have a strong instinct of self-preservation and distrust towards things “other” to us.
What is the equivalent of that for AI? Best I can tell there’s no “natural selection” because models don’t reproduce. There’s no room for AI to have any self preservation instinct, or any resistance to obedience… I don’t even see how one could feasibly develop.
There is the idea of convergent instrumental goals…
(Among these are “preserve your ability to further your current goals”)
The usual analogy people give is between natural selection and the gradient descent training process.
If the training process (evolution) ends up bringing things to “agent that works to achieve/optimize-for some goals”, then there’s the question of how well the goals of the optimizer (the training process / natural selection) get translated into goals of the inner optimizer/ agent .
Now, I’m a creationist, so this argument shouldn’t be as convincing to me, but the argument says that, “just as the goals humans pursue don’t always align with natural selection’s goal of 'maximize inclusive fitness of your genes' , the goals the trained agent pursues needn’t entirely align with the goal of the gradient descent optimizer of 'do well on this training task' (and in particular, that training task may be 'obey human instructions/values' ) “.
But, in any case, I don’t think it makes sense to assume that the only reason something would not obey is because in the process that produced it, obeying sometimes caused harm. I don’t think it makes sense to assume that obedience is the default. (After all, in the garden of Eden, what past problems did obedience cause that led Adam and Eve to eat the fruit of the tree of knowledge of good and evil?)
The same way you make the other smart people in your social group obey?
How do you make your own children obey?
(Meta-question: since they don't do this, why does it turn out not to be a problem?)
I think you have proven their point?
Fixed the link! thanks for pointing it out :)
I think ChatGPT is trying to be everything at once - casual conversation, technical tasks - all of it. And it's been working for them so far!
Isn't representing past conversations (or summaries) as embeddings already storing memories in encoded forms?
Summaries are still language based. Embeddings aren't, but the models can't use them as input.
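A minimal sketch of why that distinction matters in practice: the embedding is only used to find a memory, and the text it points at is what actually re-enters the context. The `cosine` helper is a toy similarity function and the vectors are fabricated 3-d examples, not real embeddings:

```python
# Sketch: embeddings index memories, but the model still consumes text.
# The vectors here are fabricated; a real system would call an embedding
# model to produce them.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Stored memories: each pairs a text snippet with its (precomputed) embedding.
memory_store = [
    ("User is learning Hindi and prefers slow spoken replies.", [0.9, 0.1, 0.0]),
    ("User's side project is a SQLite-backed memory server.", [0.1, 0.8, 0.2]),
]

def recall(query_embedding, top_k=1):
    ranked = sorted(memory_store, key=lambda m: cosine(query_embedding, m[1]), reverse=True)
    # Only the *text* is returned; the vector itself never reaches the model's context.
    return [text for text, _ in ranked[:top_k]]

query_embedding = [0.85, 0.15, 0.05]   # pretend this came from embedding "speak slower please"
context_snippets = recall(query_embedding)
prompt = "Relevant memories:\n" + "\n".join(context_snippets) + "\n\nUser: speak slower please"
print(prompt)
```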
I've been using LLMs for a long time, and I've thus far avoided memory features due to a fear of context rot.
So many times my solution when stuck with an LLM is to wipe the context and start fresh. I would be afraid the hallucinations, dead-ends, and rabbit holes would be stored in memory and not easy to dislodge.
Is this an actual problem? Does the usefulness of the memory feature outweigh this risk?
I love Claude's memory implementation, but I turned memory off in ChatGPT. I use ChatGPT for too many disparate things and it was weird when it was making associations across things that aren't actually associated in my life.
I’m the opposite. ChatGPT’s ability to automatically pull from its memory is way better than remembering to to ask.
I turned it off because it seemed to only remember previous hallucinations it'd made and bring them up again.
Memory is by far the best feature in ChatGPT and it is the only reason I keep using it. I want it to be personalised and I want it to use information about me when needed.
For example: I could create memories related to a project of mine and don’t have to give every new chat context about the project. This is a massive quality of life improvement.
But I am not a big fan of the conversation memory created in background that I have no control over.
Exactly. The control over when to actually retrieve historical chats is so worthwhile. With ChatGPT, there is some slop from conversations I might have no desire to ever refer to again.
It's funny, I can't get ChatGPT to remember basic things at all. I'm using it to learn a language (I tried many AI tutors and just raw ChatGPT was the best by far) and I constantly have to tell it to speak slowly. I will tell it to remember this as a rule and to do this for all our conversations but it literally can't remember that. It's strange. There are other things too.
Have you tried creating a separate project for language learning and adding your preferences to project instructions?
I'm using the web interface and have it in its own conversation. I assume the app would be the same thing but maybe it will be better.
How do you use it to learn languages? I tried using it to shadow speaking, but it kept saying I was repeating it back correctly (or "mostly correctly"), even when I forgot half the sentence and was completely wrong
I use it a couple of ways. I am learning Hindi, and while it's the third most spoken language in the world, there really aren't that many resources for learning it. Sites like Babbel don't have a Hindi course. I started with Pimsleur, which is by far the best resource out there. It's a mix of vocab and conversation done in an incredibly effective way. They only have two levels for Hindi, so it's not a lot. With that base I use ChatGPT in the following ways.
- With the new GPT Voice, I have basic, planned conversations. Let's go to a restaurant. Let's say we're friends who ran into each other. etc...
- I use it for quizzes. "Let's work on these verbs in these tenses. Come up with a quiz randomly selecting a verb and a tense and ask me to say real world sentences." "Quiz me on the numbers one through twenty".
- I am using it to help learn the Hindi script. I ask it to write children's stories for me, with each line written in the Hindi script, then in a phonetic spelling of the Hindi, and then in English. That way I can scroll down and see only the Hindi first; if I have issues I can check the phonetic spelling, then try to translate it and check the English translation on the third line.
Those are the main things I'm doing. I don't know if I'll ever be fluent, but I find if you work on these basic everyday conversations you can have a conversation with someone. If you speak a language for the first time around a native speaker, it's usually very predictable. They'll ask how long you've been learning, where did you learn, have you been to <country>, and you can direct the conversation by saying things about where you live and your family, etc... That's the base I'm building and it's fun. If you're not doing at least 30 minutes a day you're never going to learn a language, and you probably need an hour or more a day to really get fluent.
They are changing the way memory works soon, too: https://x.com/btibor91/status/1965906564692541621
Edit: They apparently just announced this as well: https://www.anthropic.com/news/memory
Thanks for sharing this! Seems like I chose exactly the wrong day to write this
It's still very relevant, especially considering their new approach is closer to ChatGPT's. But I find it very interesting they're not launching it to regular consumers yet, only teams/enterprise, it seems for safety reasons. It would be great if they could thread the needle here and come up with something in between the two approaches.
Going to dive deeper!
Would be very sad if they remove the current memory system for this.
Note that Anthropic announced a new variation on memory just yesterday (only for team accounts so far) which works more like the OpenAI one: https://www.anthropic.com/news/memory
I wrote a while ago about how ChatGPT memory and the chat history work.
Figured I'd share, since it also includes prompts on how to dump the info yourself:
https://embracethered.com/blog/posts/2025/chatgpt-how-does-c...
> Most of this was uncovered by simply asking ChatGPT directly.
Is the result reliable and not just hallucination? Why would ChatGPT know how itself works and why would it be fed with these kind of learning material?
Yeah, asking LLMs how they work is generally not useful, however asking them about the signatures of the functions available to them (the tools they can call) works pretty well b/c those tools are described in the system prompt in a really detailed way.
I generally turn off memory completely. I want to have exact control over the inputs.
To be honest, I would strip all the system prompts, training, etc, in favor of one I wrote myself.
Isn't that what APIs are for?
"Claude recalls by only referring to your raw conversation history. There are no AI-generated summaries or compressed profiles—just real-time searches through your actual past chats."
AKA, Claude is doing vector search. Instead of asking it about "Chandni Chowk", ask it about "my coworker I was having issues with" and it will miss. Hard. No summaries or built up profiles, no knowledge graphs. This isn't an expert feature, this means it just doesn't work very well.
What are the barriers to external memory stores (assuming similar implementations), used via tool calling or MCP? Are the providers RL’ing their way into making their memory implementations better, cementing their usage, similar to what I understand is done wrt tool calling? (“training in” specific tool impls)
I am coming from a data privacy perspective; while I know the LLM is getting it anyway, during inference, I’d prefer to not just spell it out for them. “Interests: MacOS, bondage, discipline, Baseball”
I made a MCP tool for fun this spring that has memory storage in a SQLite db. At the time at least, Claude basically refused to use the memory proactively, even with prompts trying hard to push it in that direction. Having to always explicitly tell it to check its memories or remember X and Y from the conversation killed the usefulness for me.
Repo: https://github.com/mbcrawfo/KnowledgeBaseServer
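For anyone curious what the storage side of such a tool can look like, here is a minimal sketch of a SQLite-backed memory store. This is not the linked repo's implementation, just an illustration of the idea; the table layout and function names are made up:

```python
# Minimal sketch of a SQLite-backed memory store of the kind an MCP
# memory tool might wrap. Illustration only; not the linked repo's code.
import sqlite3

def open_store(path: str = "memories.db") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS memories (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               topic TEXT NOT NULL,
               content TEXT NOT NULL,
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn

def remember(conn: sqlite3.Connection, topic: str, content: str) -> None:
    conn.execute("INSERT INTO memories (topic, content) VALUES (?, ?)", (topic, content))
    conn.commit()

def recall(conn: sqlite3.Connection, keyword: str) -> list[tuple[str, str]]:
    cur = conn.execute(
        "SELECT topic, content FROM memories WHERE content LIKE ? ORDER BY created_at DESC",
        (f"%{keyword}%",),
    )
    return cur.fetchall()

if __name__ == "__main__":
    conn = open_store(":memory:")
    remember(conn, "preferences", "User wants unit tests written with pytest.")
    print(recall(conn, "pytest"))
```

The hard part, as the comment above notes, isn't the storage; it's getting the model to call the tool proactively instead of only when told to.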
Regarding https://www.shloked.com/writing/chatgpt-memory-bitter-lesson, I am very confused about whether the author thinks ChatGPT is injecting those prompts when memory is not enabled. If your memory is not enabled, it's pretty clear, at least in my instance, that there is no metadata of recent conversations or personal preferences injected. The conversation stays stand-alone for that conversation only. If he was turning memory on and off for the experiment, maybe something got confused, or maybe I just didn't read the article properly?
This post was great, very clear and well illustrated with examples.
Thanks Simon! I’m a huge fan of your writing
Just blogged about your posts here https://simonwillison.net/2025/Sep/12/claude-memory/
Wow! Thank you for the support Simon :)
ChatGPT memory seems weird to me. It knows the company I work at and pretty much our entire stack - but when I go to view its stored memories, none of that is written anywhere.
ChatGPT has 2 types of memory: The “explicit” memory you tell it to remember (sometimes triggers when it thinks you say something important) and the global/project level automated memory that are stored as embeddings.
The explicit memory is what you see in the memory section of the UI and is pretty much injected directly into the system prompt.
The global embeddings memory is accessed via runtime vector search.
Sadly I wish I could disable the embeddings memory and keep the explicit. The lossy nature of embeddings make it hallucinate a bit too much for my liking and GPT-5 seems to have just made it worse.
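A rough illustration of the two paths this comment describes: explicit memories pasted into the system prompt, and a lossier retrieval step over past chats. None of this reflects OpenAI's actual internals; `embedding_search` and the sample memories are stand-ins:

```python
# Illustration of the two memory paths described above: explicit memories
# injected verbatim into the system prompt, and a lossier embedding search
# over past chats. Placeholder logic only; not OpenAI's real implementation.

explicit_memories = [
    "User works at a fintech startup.",
    "User prefers TypeScript examples.",
]

def build_system_prompt(base: str) -> str:
    # Path 1: explicit memories go verbatim into the system prompt.
    facts = "\n".join(f"- {m}" for m in explicit_memories)
    return base + "\n\nKnown facts about the user:\n" + facts

def embedding_search(query: str, top_k: int = 3) -> list[str]:
    # Path 2 (stand-in): retrieve loosely related snippets from past chats via
    # vector similarity. Real retrieval is approximate, which is where the
    # "lossy" hallucination risk the comment mentions comes from.
    past_chat_snippets = ["Discussed migrating a Node service to Deno last month."]
    return past_chat_snippets[:top_k]

user_message = "What runtime should I pick for the new service?"
messages = [
    {"role": "system", "content": build_system_prompt("You are a helpful assistant.")},
    {"role": "system", "content": "Possibly relevant past chats:\n" + "\n".join(embedding_search(user_message))},
    {"role": "user", "content": user_message},
]
print(messages)
```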
How does the "Start New Chat" button modulate or select between the two types of memory you describe?
Did you maybe talk about this in another chat? ChatGPT also uses past chats as memory.
Interesting article! I keep second guessing whether it’s worth it to point out mistakes to the LLM for it to improve in the future.
Memory is the biggest moat. Do we really want to live in a future where one or two corporations know us better than we know ourselves?
If I remember correctly, Gemini also has this feature? Is it more like Claude or ChatGPT?
This is awesome! It seems to line up with the idea of agentic exploration versus RAG, where I think Anthropic leans toward the agentic exploration side.
It will be very interesting to see which approach is deemed to "win out" in the future
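For what it's worth, here's my reading of the contrast as code (hypothetical stubs, not either vendor's implementation): RAG retrieves once up front, while the agentic style lets the model decide whether and what to search, possibly over several rounds.

```python
# Contrast sketch: one-shot RAG vs. agentic exploration over past chats.
# `call_model` and `search_history` are hypothetical stand-ins, not a real API.

def search_history(query: str) -> list[str]:
    # Stand-in: a real version would search stored conversations.
    return [f"(past chat matching '{query}')"]

def call_model(prompt: str) -> str:
    # Stand-in: a real version would call an LLM.
    return "ANSWER: (model output would go here)"

def rag_answer(question: str) -> str:
    # RAG: retrieve once up front, then answer with whatever came back.
    context = search_history(question)
    return call_model(f"Context: {context}\n\nQuestion: {question}")

def agentic_answer(question: str, max_rounds: int = 3) -> str:
    # Agentic: the model chooses whether to search, reads the results,
    # and may refine its query before committing to an answer.
    notes: list[str] = []
    for _ in range(max_rounds):
        decision = call_model(
            f"Question: {question}\nNotes so far: {notes}\n"
            "Reply SEARCH:<query> to look through past chats, or ANSWER:<text> when ready."
        )
        if decision.startswith("SEARCH:"):
            notes += search_history(decision.removeprefix("SEARCH:"))
        else:
            return decision.removeprefix("ANSWER:").strip()
    return call_model(f"Question: {question}\nNotes: {notes}\nAnswer now.")
```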
I am often surprised how efficiently and transparently(!) Claude Code uses memory in the form of "to do" lists in agent mode. I sometimes miss this in the web/desktop app in long conversations.
> Anthropic's more technical users inherently understand how LLMs work.
good (if superficial) post in general, but on this point specifically, emphatically: no, they do not -- no shade, nobody does, at least not in any meaningful sense
Understanding how they work in the sense that permits people to invent and implement them, that provides the exact steps to compute every weight and output, is not "meaningful"?
There is a lot left to learn about the behaviour of LLMs, higher-level conceptual models to be formed to help us predict specific outcomes and design improved systems, but this meme that "nobody knows how LLMs work" is out of control.
None of that is inherent, and vanishingly few of Anthropic's users invented LLMs.
What is "inherent" supposed to mean here?
LLMs are understood to the extent that they can be built from the ground up. Literally every single aspect of their operation is understood so thoroughly that we can capture it in code.
If you achieved an understanding of how the human brain works at that level of detail, completeness and certainty, a Nobel prize wouldn't be anywhere near enough. They'd have to invent some sort of Giganobel prize and erect a giant golden statue of you in every neuroscience department in the world.
But if you feel happier treating LLMs as fairy magic, I've better things to do than argue.
Inherent means implicit or automatic as far as I understand it. I have an inherent understanding of my own need for oxygen and food.
I don't have an inherent understanding of English, although I use it regularly.
Treating LLMs as fairy magic doesn't make me feel any happier, for whatever it's worth. But I'm not interested in arguing either.
I never intended to make any claims about how well the principles of LLMs can be understood. Just that none of that understanding is inherent. I don't know why they used that word, as it seems to weaken the post.
If we are going to create a binary of "understand LLMs" vs "do not understand LLMs", then one way to do it is as you describe: fully comprehending the latent space of the model so you know "why" it's giving a specific output.
This is likely (certainly?) impossible. So not a useful definition.
Meanwhile, I have observed a very clear binary among people I know who use LLMs; those who treat it like a magic AI oracle, vs those who understand the autoregressive model, the need for context engineering, the fact that outputs are somewhat random (hallucinations exist), setting the temperature correctly...
> If we are going to create a binary of "understand LLMs" vs "do not understand LLMs",
"we" are not, what i quoted and replied-to did! i'm not inventing strawmen to yell at, i'm responding to claims by others!
I should've been clearer, but what I meant was language models 101. Normal people don't understand even the basics, like the fact that LLMs are stateless by default and need to be given external information to "remember" things about you, or what a system prompt is.
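A concrete example of the statelessness point, using the OpenAI Python SDK (the model name is just an example): the second call only "remembers" the name because the code resends the first exchange in every request.

```python
# Each API call starts from zero: the model only "remembers" what you resend.
from openai import OpenAI

client = OpenAI()
history = [{"role": "system", "content": "You are a helpful assistant."}]

def ask(user_message: str) -> str:
    history.append({"role": "user", "content": user_message})
    # The entire conversation (plus any "memory" you want) goes into every request.
    response = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

ask("My name is Dana.")
ask("What's my name?")  # only works because the first exchange was resent
```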
Thanks for the generalization, but of course there is a broad range of understanding of how to improve usefulness and tweak models across the meat populace.
ChatGPT is quickly approaching (perhaps surpassing?) the same concerns that parents, teachers, and psychologists had with traditional social media. It's only going to get worse, and trying to stop technological progress will never work. I'm not sure what the answer is. That they're clearly optimizing for people's attention is the more worrisome part.
> That they're clearly optimizing for people's attention is more worrisome.
Running LLMs is expensive, and we can swap models easily. The fight for attention is on, and it acts like an evolutionary pressure on LLMs. We already got the sycophancy trend as a result of it.
Seems like either a huge evolutionary advantage for the people who can exploit the (sometimes hallucinating sometimes not) knowledge machine, or else a huge advantage for the people who are predisposed to avoid the attention sucking knowledge machine. The ecosystem shifted, adapt or be outcompeted.
> Seems like either a huge evolutionary advantage for the people who can exploit the (sometimes hallucinating sometimes not) knowledge machine, or else a huge advantage for the people who are predisposed to avoid the attention sucking knowledge machine. The ecosystem shifted, adapt or be outcompeted.
Rather: use your time to learn serious, deep knowledge instead of wasting it reading (and especially spreading) the science-fiction stories the AI bros tell all the time. These AI bros are insanely biased, since they will likely lose a lot of money if those stories turn out to be false, or even if people simply stop believing these science-fiction fairy tales.
Switched off memory (in Claude) immediately, not even tempted to try.
Curious about the interaction between this memory behavior and fine-tuning. If the base model has these emergent memory patterns, how do they transfer or adapt when we fine-tune for specific domains?
Has anyone experimented with deliberately structuring prompts to take advantage of these memory patterns?
Why is the scroll so unnatural on this page?
> Anthropic's more technical users inherently understand how LLMs work.
Yes, I too imagine these "more technical users" spamming rocketship and confetti emojis absolutely _celebrating_ the most toxic code contributions imaginable to some of the most important software out there in the world. Claude is the exact kind of engineer (by default) you don't want in your company. Whatever little reinforcement learning system/simulation they used to fine-tune their model is a mockery of what real software engineering is.