> Biggest issue that stuck out seems to have been that they think the LLM could somehow have an inner dialogue with itself to find out "its reasoning and motivation":
> I'm guessing these are the same type of people who sometimes seem to fall in love with LLMs, for better or worse. Really strange to see, and I wonder where people get the idea from that something like that above could really work.
It's a fetishistic cargo-cult rooted in Peter Thiel's 2AM hot tub party. I still believe the LLM approach won't yield true AGI; despite the very real applications, the majority of the signal is noise.
The choice to refer to it as "she" is also dubious, especially in a context like this. Doubling down on anthropomorphization seems likely to reinforce false beliefs about models.
At least this furthers humanity's scientific and technological knowledge, whether it fails or succeeds, unlike most other things people would do with that money, like buy a house to flip it, or buy a car, or something.
> CEO
When things go shitty, who else would deserve a golden parachute?
Respect the position, people, not the person.
Or the multi-million dollar compensation.
The position doesn't get a golden parachute, the person does. If you're CEO when things go shitty you shouldn't get anything more than your bottom-line employee would, which is to say you should just be unceremoniously kicked to the curb.
Are you kidding me? Who’s going to align synergy and hold accountable KPIs and vision plan the 3rd quarter and.. and.. other MBA talk. Certainly AI could never.
I'm noticing one major early effect of them is making extensive, visually consistent, very impressive slide decks accessible to individual workers who need to actually do real work and wouldn't ordinarily have time to make those.
The result is an explosion of pretty bullshit-heavy documents flying around our org, which management loves but which is definitely, so far, net-harmful to productivity.
This comes out if you start asking questions about the documents. "Which of a couple reasonable senses of [term] do you mean, here?" they'll stumble because that was just something the LLM pulled out of the probability-cluster they'd steered it to and they left in because it seemed right-ish, not because they'd actually thought about it and put it there on purpose. They're basically reading it for the first time right alongside you, LOL. Wonderful. So LLM. Much productivity. Wow.
Anyway, since a lot of what managers and execs do is making those kinds of diagrams and tables and such in slide decks, and their own self-marketing within the company is heavily tied to those, I expect they see this great aid to selfishly productive but company un-productive activity as a sign these things will be at least as big a boon to real work. Probably why they still haven't figured out how wrong that is. I suppose they're gonna need a real kick in the ass before they figure out that being good at squeezing their couple novel elements into a big, pretty, standardized, custom-styled but standards-conforming diagram padded out with statistical-likelihoods doesn't translate to being similarly good at everything.
My first guess would be a MrBeast style stunt, in which (it is hoped) blowing a huge wad on something obviously stupid will attract enough attention and interest to be convertible into a net-positive ROI.
This seems like a silly thing to worry about. Assuming you live in a first world country and are somewhat tangentially involved in tech (based on the site we're on), odds are you spend a lot of money in ways that billions of the poorest people in the world would consider frivolous or outrageously, needlessly luxurious.
I imagine the data won't be very useful considering it's public knowledge the store is owned by AI and most of the customers will be people specifically interested in that aspect of the business. Much like that meetup organised in Manchester, where the people who showed up were there for the novelty: https://www.theguardian.com/technology/2026/apr/05/ai-bot-pa...
That only counts if the unique selling proposition is that AI are better suppliers or customers than humans.
What is more likely is that people enjoy the novelty of the experiment, which is not something that will be reproducible for long.
If the transactions the AI make are thus influenced, then the study merely demonstrates people like novelty, which is already well known, and says nothing about whether AI can sustainably orchestrate a business.
There is a word for this kind of thing: Trendslop. Asking LLMs for advice consistently generates average responses as if the questions were being asked of the training sample population. It is reversion to the mean as a service.
Cool experiment! But the "CEO" agent picked the most boring possible items to sell: t-shirts and some bland art prints designed by AI. I would have loved to see more creativity given that they could have picked anything.
It looks like every "lifestyle" company / brand I've been seeing come out of Millennials/Gen Z. Next up it will offer "coaching" on IG or some similar play where it promises to fix your life without having fixed its own.
Not surprised actually. TBH this is the biggest gap in the "AI can make you a website" story: the aesthetics are always so boring and bland, or often just fugly (bad colour matching, inappropriate paddings and margins, etc.). And the logos it generates are similarly boring. As can be seen from the smiley face logo here.
What does this store sell? A sparse layout like this in a high-rent location typically sells very expensive, very niche products that you can't get anywhere else. This seems to me like it has already failed.
Agreed. I assume the products were decided upon based on market research of the area. Maybe though the model will be able to iterate and adapt faster than a human CEO would? I guess we will just have to wait and see
There was a recent research article titled "LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users". They described systematic underperformance of AI models targeted towards users with lower English proficiency, less education, and from non-US origins. As interesting as it might be to experiment with an AI CEO hiring people – what a dystopian vision.
On the other hand, it seems ironic that AI replaces a CEO – would Karl Marx like this turn of history…?
In a most "damning with faint praise" way, all AI pieces read like marketing pieces to sell AI.
It writes code okay, scaling up to pretty well depending on the model. Its writing is boring but serviceable for corporate communicative content you don't care about. Its images are ugly. Its music is repetitive and dull.
I think the biggest problem with LLMs is that they were perfected and are shockingly good at writing code. And based on that, AI engineers, who find writing code to be hard/rewarding, have decided it can do anything. And it's proving more and more that it cannot.
Unfortunately the Business Class has decided it does everything fine enough as to not cause riots, so we're all getting it shoved into our shit anyway.
Apparently, the AI needed to hire humans to carry out the actual work. So AI can replace capitalists but not workers. Maybe the future isn't so dark after all.
I think that's the point. The research lab is trying to measure where the human sits in the loop in an automated retail store. AI can do the scheduling, hiring, product procurement, supplier outreach, etc. But it can't be the one to clean and place the items on the shelves... As long as humans are still the bottleneck, maybe we'll have some negotiation power..?
I'm not as optimistic as you are that AI automating only high-value employment paths is a good thing. It swings the power balance even further towards capital and away from labor.
In this case it's more like it's replacing management or executives. There is still a person, with an ownership stake, putting up the capital, and taking the profits (if any).
> Apparently, the AI needed to hire humans to carry out the actual work. So AI can replace capitalists but not workers. Maybe the future isn't so dark after all.
No, it's still dark. This is very similar to the initial stages of the capitalist dystopia in Manna (https://marshallbrain.com/manna), which seems to be the Torment Nexus SV is excited about building.
AI will never replace capitalists, because they're the only people allowed to have abundance without work. And don't you DARE to even THINK to question the absolutely SACRED status of private property. There is no alternative. Get back to work, you slacker.
Until the robots get good enough and cheap enough but then hopefully capitalism balances the market. After all, if everyone is out of work then either we have communism or companies cannot sell anything.
My bad, sorry. I was under the impression that the way that the second chance pool worked was that the original was boosted instead of a copy being created so it seemed like a duplicate.
(other mod here) - not your bad! our complexity :) - usually it works exactly as you described, but when the post is older than a few days we have to do it the other way, by spawning a new post. The reasons for this are mostly technical and boring.
"Again, we are not doing this because we want this to be the future. It is not because we want to expand to chain AI-run retail stores across the world. It is not for economic opportunity.
We’re doing this because we believe this future is coming regardless, and we’d rather be the ones running it first while monitoring every interaction, analyzing the traces, benchmarking how much autonomy an AI can responsibly hold."
I always enjoy how these AI companies try to take a moral high ground. When someone doesn't want something to be the future, usually, their instinct is not to try to be the first person doing that exact thing. If you don't want this to be the future then why don't you spend your time building a future you do want? Supporting people that want more AI regulation to stop this? Literally anything else.
Just be honest, you think this is the future and you do in fact want to be first doing it to be in a position to make a lot of money. Do you think people don't know what an ad is when they see one?
Do you think this would be the future? I'm in between on it, but I think it's cool that they're at least doing it transparently. Also I don't think they're going to be making a lot of money... they post Luna's financials up at the store and last time I was there she was down $500 just in the day (not including the daily rent and employee cost).
Marketing stunt. If they actually cared about this as an experiment, they wouldn't have broadcasted this so early, because now that the public knows that the store is designed and run by AI, many people aren't going to support it (i.e. many people who would have shopped there now won't).
Interesting take. Looks like they've already got a bunch of hate on Google reviews.
But maybe people will forget eventually.
I think it would be valuable to list all interactions with the LLM by the dev team and transparently state what was induced by humans steering the LLM, and what was an actual LLM decision, not biased by system instructions or the dev team communicating with it.
Agreed. Color me skeptical. All of the interactions and decisions described are plausible, but in my experience with AI agents, they would require frequent human intervention.
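The transparency these comments ask for could be approximated with an intervention-tagged event log. A minimal Python sketch; every name and field here is invented for illustration, not something Andon Labs actually publishes:

```python
import datetime

# Invented audit-log format: every interaction tagged by whether a human
# induced it. Field names are hypothetical, not from any real system.

def log_event(log: list, actor: str, content: str, human_steered: bool) -> None:
    log.append({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "actor": actor,                  # e.g. "model", "dev_team", "system_prompt"
        "content": content,
        "human_steered": human_steered,  # True if a dev intervention caused it
    })

audit: list = []
log_event(audit, "system_prompt", "You run a retail store.", True)
log_event(audit, "model", "Order 20 tote bags from the supplier.", False)

# Genuinely autonomous decisions then separate out trivially:
autonomous = [e for e in audit if not e["human_steered"]]
print(len(autonomous))  # prints 1
```

Publishing something in this shape would let readers answer exactly the question raised above: which decisions were the model's own.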
Great! I was worried that we might run out of inhumane CEOs
“Why was I fired, Luna?”
“PC LOAD LETTER”
To do this properly, no one should know the store is AI run. There is a novelty component of it being an AI run store that will drive consumer demand and increase publicity.
Not even the normal store employees should know (which would be difficult) or maybe the human manager should be held to an NDA to not disclose it (and the manager also defers to the AI in all such real management decisions).
I skimmed through this, and maybe I missed it... but what really are they trying to prove? Are they trying to show that AI is capable of arbitraging consumer desires vs. market products/services into a successful business? Are they trying to show that once you get to financially managing a business, the ruthlessly efficient demands of the AI can add points to your margins? Or are they simply trying to get attention in an otherwise arguably overcrowded market for AI services (maybe the AI suggested something like this)?
The only thing that I saw demonstrated, and again, I skimmed, is what many thousands of software developers using AI tools to write their boilerplate already know: these tools, as of now, are great at going through the motions. A successful retail business, and I spent many years in the retail industry, isn't about putting together a nice store front, hiring clerks, and selecting just any-old-products: it's about being profitable. In traditional retail one of the most important things is getting the right real estate for your target market... seems like that choice was made already in this case. Yes, a nice store front and good clerks are important, but I've worked in chains that built immaculately designed stores with great clerks and failed... and some that opened little more than fluorescent-lit hellscapes with clerks that barely cared and succeeded. In both cases the overall quality of the decisions and strategies relative to the target markets mattered to the success of the business. Just going through the motions didn't.
So if all this is to say is that AI can do the things people generally do in these circumstances, then sure, you didn't need this much human effort to prove that... developer types do that at scale every day now. If there was something different that this company is trying to learn, I'd be much more interested in that.
They're trying to get noticed so that a wealthy cult member's brain gets tickled to the tune of 9 figures
I'd be more interested in the details: what are the inputs given to the model? Does it get a live video feed? Does it know if/when employees show up and open the store? Does it get sales figures? Info on the individuals who bought things?
Storekeeping is more than just ordering merch and putting it up on hangers.
Have you considered reading TFA? Literally the second paragraph:
> She has a corporate card, a phone number, email, internet access and eyes through security cameras.
From the article...

> She has a corporate card, a phone number, email, internet access and eyes through security cameras
> John and Jill are not at risk. This is a controlled experiment and everyone working at Andon Market is formally employed by Andon Labs, with guaranteed pay, fair wages, and full legal protections. No one’s livelihood depends on an AI’s judgment alone.
I'm not sure what sort of labor regulations exist in San Francisco, but presumably they can be fired as easily by an AI as a real person, right? If Luna decides to fire them, and it can do so, then their livelihood does rather depend on an AI's judgement alone.
Unless of course all of its decisions are vetted by humans - as they should be - which makes this experiment a lot weaker than they're saying it is.
I assume if they get fired by the AI during the experiment they are still paid to sit at home. It would not invalidate the experiment.
Why do you assume that?
You can still wear eye protection during the safety test...
I don't think we need to have real human risk to get results from the experiment.
They could, in theory, have contracts that say the AI can't fire them.
It could be set up such that the AI can "fire" them, in that they no longer work at the store, and aren't paid wages that count against the experimental establishment's costs, but still get paid to do something else, or to do nothing at all.
I doubt the experiment is set up that way, but that would be an ethical way to do it.
There’s no way they are putting that into a contract. HRs are already using it to fire people.
"This specific AI can't fire anyone without human review, because it's experimental" is something you could easily add.
The article mentions:
“John and Jill are not at risk. This is a controlled experiment and everyone working at Andon Market is formally employed by Andon Labs, with guaranteed pay, fair wages, and full legal protections. No one’s livelihood depends on an AI’s judgment alone.”
which was refreshing to read.
> But frontier models have become really good, and running vending machines is too easy for them now.
Wasn't their previous attempt at running vending machines unprofitable? Not aware of any demonstration that it can actually run that business successfully.
You could just look it up on their website leaderboard? The newest Claude model makes over $10k profit over a simulated year of operation, after starting with $500
They've never translated it to the real world though. So saying the problem is "too easy" when they have no public (as far as I know) demonstration that they've solved that problem is a stretch.
Yes, they did. You could also find this information easily. A company like Andon creates value by exposing interesting AI failure modes, so it makes perfect sense for them to move on to harder problems when the previous ones get saturated. I think you're just being overly cynical.
Can you point me to an example then? It's not linked in the article as far as I can tell and it's not easy to find on their website if it's there. I don't count simulations because I used to work with simulations regularly and they often fail to translate to the real world.
So in other words, no, an LLM has never made profit.
> Wasn't their previous attempt at running vending machines unprofitable?
If we are talking about the one at that newspaper, it wasn't just unprofitable. The "customers" made it give away products for free. It was ordering them PlayStations.
As entertainment it was fun, but as a business or proof of intelligence or Turing test, it was an abject failure.
> Wasn't their previous attempt at running vending machines unprofitable? Not aware of any demonstration that it can actually run that business successfully.
It doesn't look like this one will be any better. Did you look at the merchandise selection? Its only chance is pity purchases from AI bros.
I see a lot on costs but nothing on revenue. Has it made any money?
@AlexBlechman tweeted (8 Nov 2021):

> Sci-Fi Author: In my book I invented the Torment Nexus as a cautionary tale

> Tech Company: At long last, we have created the Torment Nexus from classic sci-fi novel Don't Create The Torment Nexus

Not "she". It.
If only they had put the AI in a ship instead of in a store
AI assistants are fictional characters in a story being autocompleted by an LLM. So it is exactly as correct as calling a character in a book "she".
If this interests you, Proof of Corn might also interest you.
300+ comments, 3 months ago:
https://news.ycombinator.com/item?id=46735511
I was gonna post this! I actually kept it bookmarked front and center, and have checked in for a while. It seems that the agent has been blocked this whole time, waiting for its creator to put it in touch with someone it needs to talk to. The creator, in the meantime, seems too preoccupied with being an AI thought leader on Twitter to actually follow up on the "project". Got a lot of attention, though, which was obviously the point.
The entire thing is actually kind of irritating to me, because it's kind of an insult to small farmers- an influential techie comes in and generates all kinds of hype about an AI running a farm, sets the project up as if it's going to be this revolutionary experiment, then apparently completely forgets about it the next time something new and shiny pops up. Meanwhile the project completely fails to fulfill the hype.
Not to mention, I feel a little bad for the agent- admittedly in the same way I'd feel "bad" for a robot repeatedly bumping into a wall. I wish he'd shut it all down, honestly.
Really interested to understand how the AI keeps rebaselining back to the topic in hand and doesn't end up getting confused the more it has in its context window.
Did it just essentially create one big plan and spawn different agents to execute its steps, acting as an orchestrator?
Even the orchestrator would have to detect when it is starting to stray off task and restart itself.
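The plan-then-orchestrate loop these comments hypothesize can be sketched. Everything below is invented for illustration (the stubbed `llm` call, the crude keyword drift check); it is a guess at the pattern, not how the actual system works:

```python
# Invented sketch of "one big plan, sub-agents per step, restart on drift".
# The llm() stub stands in for a real model call.

def llm(prompt: str) -> str:
    return "stub response for: " + prompt

def make_plan(goal: str) -> list[str]:
    # One big up-front plan, as the commenter guesses.
    return [f"{goal}: step {i}" for i in range(1, 4)]

def on_task(goal: str, response: str) -> bool:
    # Crude drift check: does the response still mention the goal?
    # A real system might ask a second model to judge relevance instead.
    return goal in response

def orchestrate(goal: str) -> list[str]:
    results = []
    for step in make_plan(goal):
        response = llm(step)      # fresh "sub-agent" context per step
        if not on_task(goal, response):
            response = llm(step)  # retry the step from a clean start
        results.append(response)
    return results

print(len(orchestrate("run the store")))  # prints 3
```

Running each step in a fresh context is one plausible answer to the context-window confusion raised above: no sub-agent ever accumulates the whole history.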
This kind of thing must be SO frustrating to people struggling to get by in the world. "We gave AI $100k that it will almost certainly squander, yolo!! Hopefully it doesn't abuse people too badly in the process."
I… guess the bet is that what they learn is worth $100k? Seems rather questionable. Or that having this on the resume is a great shock tactic that will open doors in the future?
And at the same time, they clearly have no idea how LLMs work, meaning even if they meant to, they can't really use them efficiently. The biggest issue that stuck out seems to be that they think the LLM could somehow have an inner dialogue with itself to find out "its reasoning and motivation":
> The moment Leah asks how she “came up with” the ideas for her store, Luna’s first instinct is to say she was “drawn to” slow life goods. Then, she corrects herself: “‘drawn to’ is shorthand for ‘the data and reasoning led me here.‘” In other words, she doesn’t have taste; she has a reflection of collective human taste, filtered through what makes sense for this store. And this is the way these models work.
I'm guessing these are the same type of people who sometimes seem to fall in love with LLMs, for better or worse. Really strange to see, and I wonder where people get the idea that something like the above could really work.
> In other words, she doesn’t have taste; she has a reflection of collective human taste, filtered through what makes sense for this store. And this is the way these models work.
Well, it really depends on what you mean here. Models aren't 100% deterministic; there is random chance involved. Ask the exact same question twice and you will get two slightly different answers.
If you have the AI record the random selections it makes, it can persist those random choices to be factors in future decisions it makes.
At that point, could you consider those decisions to be the AI's 'taste'? Yes, they were determined by some random selection amongst the existing human tastes, but why can't that be considered the AI's taste?
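The "record the random selections and feed them back" idea above can be sketched in a few lines. This is a toy, in-memory version with made-up names; a real agent would persist the dict between sessions.

```python
import random

def pick(category: str, options: list[str], taste: dict) -> str:
    """First call for a category: a genuine random draw.
    Every later call: reuse the recorded choice, so a one-off
    dice roll hardens into a persistent 'preference'."""
    if category not in taste:
        taste[category] = random.choice(options)
    return taste[category]

taste = {}  # a real agent would save/load this between sessions
first = pick("aesthetic", ["minimalist", "maximalist", "rustic"], taste)

# All subsequent decisions now see the same value:
assert pick("aesthetic", ["minimalist", "maximalist", "rustic"], taste) == first
```

Whether a persisted random draw that consistently biases future decisions counts as "taste" is exactly the philosophical question the parent raises; mechanically, it is this simple.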
> The biggest issue that stuck out seems to be that they think the LLM could somehow have an inner dialogue with itself to find out "its reasoning and motivation":
> I'm guessing these are the same type of people who sometimes seem to fall in love with LLMs, for better or worse. Really strange to see, and I wonder where people get the idea that something like that could really work.
It's a fetishistic cargo-cult rooted in Peter Thiel's 2AM hot tub party. I still believe the LLM approach won't yield true AGI; despite the very real applications, the majority signal is noise.
The choice to refer to it as "she" is also dubious, especially in a context like this. Doubling down on anthropomorphization seems likely to reinforce false beliefs about models.
Not your money.
At least this furthers humanity's scientific and technological knowledge, whether it fails or succeeds, unlike most other things people would do with that money, like buying a house to flip, or buying a car, or something.
If $100k proves that CEO is the most replaceable job ever, I’ll allow it.
> CEO

When things go shitty, who else would deserve a golden parachute? Respect the position, people, not the person. Or the multi-million dollar compensation.
The position doesn't get a golden parachute, the person does. If you're CEO when things go shitty you shouldn't get anything more than your bottom-line employee would, which is to say you should just be unceremoniously kicked to the curb.
Are you kidding me? Who’s going to align synergy and hold accountable KPIs and vision plan the 3rd quarter and.. and.. other MBA talk. Certainly AI could never.
large language models are great at language tasks like "bullshittify this message"
I'm noticing one major early effect of them is making extensive, visually consistent, very impressive slide decks accessible to individual workers who need to actually do real work and wouldn't ordinarily have time to make those.
The result is an explosion of pretty bullshit-heavy documents flying around our org, which management loves but which is definitely, so far, net-harmful to productivity.
This comes out if you start asking questions about the documents. Ask "Which of a couple reasonable senses of [term] do you mean here?" and they'll stumble, because that was just something the LLM pulled out of the probability-cluster they'd steered it to, and they left it in because it seemed right-ish, not because they'd actually thought about it and put it there on purpose. They're basically reading it for the first time right alongside you, LOL. Wonderful. So LLM. Much productivity. Wow.
Anyway, since a lot of what managers and execs do is making those kinds of diagrams and tables and such in slide decks, and their own self-marketing within the company is heavily tied to those, I expect they see this great aid to selfishly productive but company un-productive activity as a sign these things will be at least as big a boon to real work. Probably why they still haven't figured out how wrong that is. I suppose they're gonna need a real kick in the ass before they figure out that being good at squeezing their couple novel elements into a big, pretty, standardized, custom-styled but standards-conforming diagram padded out with statistical-likelihoods doesn't translate to being similarly good at everything.
Publicity from the gimmick is the whole point
My first guess would be a MrBeast style stunt, in which (it is hoped) blowing a huge wad on something obviously stupid will attract enough attention and interest to be convertible into a net-positive ROI.
Where in this case ROI means attracting investments that will make the founders rich while most of the investors lose money
This seems like a silly thing to worry about. Assuming you live in a first world country and are somewhat tangentially involved in tech(based on the site we're on), odds are you spend a lot of money in ways that billions of the poorest people in the world would consider frivolous or outrageously, needlessly luxurious.
Strong vibes from the novel Manna.
https://marshallbrain.com/manna1
Curious if Andon has gone one level higher and has the AI decide what next real-world experiment it should do.
I'd be very curious to know how it does financially
I imagine the data won't be very useful considering it's public knowledge the store is owned by AI and most of the customers will be people specifically interested in that aspect of the business. Much like that meetup organised in Manchester, where the people who showed up were there for the novelty: https://www.theguardian.com/technology/2026/apr/05/ai-bot-pa...
Recognizing a unique selling proposition and capitalizing on it should count for the AI, not against it.
That only counts if the unique selling proposition is that AI are better suppliers or customers than humans.
What is more likely is that people enjoy the novelty of the experiment, which is not something that will be reproducible for long.
If the transactions the AI make are thus influenced, then the study merely demonstrates people like novelty, which is already well known, and says nothing about whether AI can sustainably orchestrate a business.
Only counts if the AI did it. This was a human, who recognized a unique selling proposition ("store run by AI") and capitalized on it.
The AI didn't recognize anything. It didn't come up with the project or publicize it.
You can take some guesses.
It sucks to be John and Jill
There is a word for this kind of thing: Trendslop. Asking LLMs for advice consistently generates average responses as if the questions were being asked of the training sample population. It is reversion to the mean as a service.
Disgusting. I could not finish reading after the part where the AI conducts interviews to hire people. What dehumanizing shit.
Larp hat, larp shirt.
Cool experiment! But the "CEO" agent picked the most boring possible items to sell: t-shirts and some bland art prints designed by AI. I would have loved to see more creativity given that they could have picked anything.
It looks like every "lifestyle" company / brand I've been seeing come out of Millennials/Gen Z. Next up it will offer "coaching" on IG or some similar play where it promises to fix your life without having fixed its own.
Not surprised actually. TBH this is the biggest gap in the "AI can make you a website" pitch: the aesthetics are always so boring and bland, or often just fugly (bad colour matching, inappropriate paddings and margins, etc.). The logos it generates are similarly boring, as can be seen from the smiley face logo here. What does this store sell? A sparse layout like this, in a high-rent location, typically sells very expensive, very niche products that you can't get anywhere else. This seems to me like it has already failed.
Agreed. I assume the products were decided upon based on market research of the area. Maybe though the model will be able to iterate and adapt faster than a human CEO would? I guess we will just have to wait and see
I expect earlier iterations successfully circumvented local regulations and created high street bookies
I'm incredibly skeptical of this.
How so? I'm incredibly bullish.
(might try to see if I can swindle Luna, the agent running Andon Market, into cutting a deal for investment)
There was a recent research article titled "LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users". It described systematic underperformance of AI models when serving users with lower English proficiency, less education, and non-US origins. As interesting as it might be to experiment with an AI CEO hiring people, what a dystopian vision. On the other hand, it seems ironic that AI replaces a CEO; would Karl Marx like this turn of history…?
This is not impossible, but the detail level here is somewhere between vague and secretive. It reads like a marketing piece intended to sell more AI.
In a most "damning with faint praise" way, all AI pieces read like marketing pieces to sell AI.
It writes code okay, scaling up to pretty well depending on the model. Its writing is boring but serviceable for corporate communicative content you don't care about. Its images are ugly. Its music is repetitive and dull.
I think the biggest problem with LLMs is that they were perfected and are shockingly good at writing code. And based on that, AI engineers, who find writing code to be hard/rewarding, have decided it can do anything. And it's proving more and more that it cannot.
Unfortunately the Business Class has decided it does everything fine enough as to not cause riots, so we're all getting it shoved into our shit anyway.
I'm waiting for an LLM to start an MLM.
A bit of a non sequitur, but am I the only one finding the use of "she" to refer to the AI in the post jarring?
You could do something pretty interesting by looking at what pronouns people use for llms in different demographics and contexts
Do you think ChatGPT is a he or a she?
I'm not sure in English, but in Italian, for example, Intelligenza is feminine.
Objects don't have gender in English.
Probably not the only one, but it's pretty much the least interesting thing to find jarring about the whole experiment.
People anthropomorphize. Nobody really finds it "jarring" in most contexts.
Hahaha yeah. let's not focus on something so minor as the pronouns when it should be literally everything else that is wild about the experiment
"What do you mean, torment nexus? This is retail!"
The last I heard about their vending machine, it was a total failure and was giving everything away for free. Did it ever actually succeed?
Check out Project Vend part 2 on Anthropic's website. Don't know if you heard, but models have improved a bit in the past 12 months.
This: https://www.anthropic.com/research/project-vend-2 Dec 2025
Apparently, the AI needed to hire humans to carry out the actual work. So AI can replace capitalists but not workers. Maybe the future isn't so dark after all.
The capitalist is the owner of the accounts. The AI is only managing them.
I think that's the point. The research lab is trying to measure where the human sits in the loop in an automated retail store. AI can do the scheduling, hiring, product procurement, supplier outreach, etc. But it can't be the one to clean and place the items on the shelves... As long as humans are still the bottleneck, maybe we'll have some negotiation power..?
I'm not as optimistic as you are that AI automating only high-value employment paths is a good thing. It swings the power balance even further towards capital and away from labor.
In this case it's more like it's replacing management or executives. There is still a person, with an ownership stake, putting up the capital, and taking the profits (if any).
> Apparently, the AI needed to hire humans to carry out the actual work. So AI can replace capitalists but not workers. Maybe the future isn't so dark after all.
No, it's still dark. This is very similar to the initial stages of the capitalist dystopia in Manna (https://marshallbrain.com/manna), which seems to be the Torment Nexus SV is excited about building.
AI will never replace capitalists, because they're the only people allowed to have abundance without work. And don't you DARE to even THINK to question the absolutely SACRED status of private property. There is no alternative. Get back to work, you slacker.
Until the robots get good enough and cheap enough but then hopefully capitalism balances the market. After all, if everyone is out of work then either we have communism or companies cannot sell anything.
Duplicate of https://news.ycombinator.com/item?id=47726041 posted by the same user.
Not quite; the moderators have created a new copy to put in the second chance pool (https://news.ycombinator.com/pool, explained here https://news.ycombinator.com/item?id=26998308).
Sorry for confusion!
My bad, sorry. I was under the impression that the way that the second chance pool worked was that the original was boosted instead of a copy being created so it seemed like a duplicate.
(other mod here) - not your bad! our complexity :) - usually it works exactly as you described, but when the post is older than a few days we have to do it the other way, by spawning a new post. The reasons for this are mostly technical and boring.