AI has a cargo cult problem

(ft.com)

166 points | by cs702 12 hours ago ago

122 comments

  • empath75 11 hours ago

    https://archive.ph/RVTHE

    Un-paywalled version.

  • tra3 10 hours ago

    If I'm tired of one thing related to AI/LLM/chatbots, it's the claims that it's not useful. It 100% is. We have to separate the massive financial machinations from the actual tech.

    Reading this article though, I'm questioning my decision to avoid hosting open source LLMs. Supposedly the performance of Qwen-coder is comparable to the likes of Sonnet 4. If I invest in a homelab that can host something like Qwen3 I'll recoup my costs in about 20 months without having to rely on Anthropic.

    • mynameisash 10 hours ago

      I don't think I've ever seen anyone say they're not useful. Rather, they don't appear to live up to the hype, and they're sure as hell not a panacea.

      I'm pretty bearish on LLMs. I also think they're over-hyped and that the current frenzy will end badly (globally, economically speaking). That said, sure, they're useful. Doesn't mean they're worth it.

      • Agingcoder 10 hours ago

        To some extent it’s not that they don’t live up to the hype - rather that the gains are hard to measure.

        LLMs have spared me hours of research on exotic topics that are actually useful for my day job. However, that’s the whole problem - I don’t know how much.

        If they had a real price (accounting for OpenAI losses, for example), with ChatGPT at $50/month for everyone, OpenAI being profitable, and people actually paying for this, I think things might self-adjust and we’d have some idea.

        Right now, we live in some kind of parallel world.

        • mattlutze 9 hours ago

          > exotic topics [...] I don't know how much

          We also don't know, in situations like this, whether all of or how much of the research is true. As has been regularly and publicly demonstrated [0][1][2], the most capable of these systems still make very fundamental mistakes, misaligned to their goals.

          The LLMs really, really want to be our friend, and production models do exhibit tendencies to intentionally mislead when it's advantageous [3], even if it's against their alignment goals.

          0: https://www.afr.com/companies/professional-services/oversigh...
          1: https://www.nbcnews.com/world/australia/australian-lawyer-so...
          2: https://calmatters.org/economy/technology/2025/09/chatgpt-la...
          3: https://arxiv.org/pdf/2509.18058?

          • autoexec 9 hours ago

            > The LLMs really, really want to be our friend

            They want you to think they are your friend, but they actually want to be your master and steal your personal data. That's what the companies, who want to be masters over both you and the AI, have programmed them to do. LLMs want to gain your confidence, and then your dependence, and then they can control you.

            • dangus 8 hours ago

              This seems hyperbolic to me. Sometimes companies just want to make money.

              Similarly, a SaaS company that would very much prefer you renew your subscription isn’t trying to make you into an Orwellian slave. They’re trying to make a product that makes people want to pay for it.

              100% of paid AI tools include the option to not train on your data, and most free ones do as well. Also, AI doesn’t magically invalidate GDPR.

              • autoexec 5 hours ago

                > Sometimes companies just want to make money.

                Companies never just want money, because more power means more money. Regulatory capture means more money. More control means more money. Polluting the environment and wasting natural resources means more money. Exploiting workers means more money. Their endless lust for money causes them to want all sorts of harmful things. If companies were making billions and nothing was being actively harmed by any of it, no one would care.

                These companies do want your money, but once you're locked in you are no longer the customer. If these AI companies had to depend on the income they get from subscriptions to survive they'd have gone out of business years ago. Instead AI is just shoved down people's throats everywhere they look and the money these companies live off of is coming from investors who are either praying that the AI becomes something it isn't or they're hoping they can help drive up stock value and cash out before the bubble breaks and leave somebody else holding the bag.

                0% of AI tools include the option to not train on my data. They've already stolen it. They've scraped every word and line of code I've ever written that's been transmitted over the internet. It's been trained on photos of my family. It's been trained on the shitty artwork I've sent to my friends. By now it's probably been trained on my medical information and my tax records.

                AI is controlled by some of the most untrustworthy companies and people on earth who have been caught over and over lying to the public and breaking the law. They can promise all day long not to steal anything I voluntarily give them, but I have zero trust in them and there is no outside oversight to ensure that they will do what they say.

                The people behind what passes for AI don't give a shit about you beyond whatever they can take from you. They are absolutely not your friend. AI is incapable of being your friend. It's just a tool for the people who control it.

              • ToucanLoucan 7 hours ago

                > This seems hyperbolic to me. Sometimes companies just want to make money.

                It's not hyperbolic at all. The entire moat is brand lock-in. OpenAI owns the public impression of what AI is (for now), with a strong second place going to Claude among coders specifically. But that doesn't change that ChatGPT can generate code too, and Claude can also write poems. If you can't lock users into good experiences with your LLM product, you have no future in the market, so data retention and flattery are the names of the game.

                The transformer-based LLMs out there can all do what the other ones can do. Some are gated off from certain behaviors, but the gating is superficial at best, and sometimes circumventable with raw input: Twitter bots regularly get tricked into answering silly prompts by people simply requesting that they forget their current instructions.

                And between DeepSeek's incredibly resource-light implementations of solid if limited models, which do largely the same sort of work without massive datacenters full of GPUs, and Apple Intelligence rolling out experiences that run largely on ML-specific hardware in local devices (which immediately, full stop, wins the privacy argument), OpenAI and co. are either getting nervous or in denial. The capex for this stuff, the valuations, and the actual user experiences simply are not cohering.

                If this was indeed the revolution the valley said it was, and the people were lining up to pay prices that reflected the cost of running this tech, then there wouldn't be a debate at all. But that's simply not true: most LLM products are heavily subsidized, a lot of the big players in the space are downsizing what they had planned to build out to power this "future," and a whole lot of people cite their experiences as "fine." That's not a revolution.

          • dangus 8 hours ago

            Despite those mistakes, the utility is undeniable.

            I converted some tooling from bash scripts leveraging the AWS CLI to a Go program leveraging the AWS SDK, improving performance, utility, and reliability.

            I did this in less than two days and I don’t even know how to write Go.

            Yes, it made some mistakes, but I was able to correct them easily. Yes, I needed to have general programming knowledge to correct those mistakes.

            But overall, this project would not exist without AI. I wouldn’t have had the spare time to learn all I needed to learn (mostly boilerplate) and to implement what I wanted to do.

          • marcellus23 7 hours ago

            I wonder whether, in any of those legal cases, the users had web search turned on. We just don't know -- but in my experience, a thinking LLM with web search on has never just hallucinated nonexistent information.

            • andrepd 7 hours ago

              I'm sorry to be so blunt, but this is massive cope, and it's deeply annoying to see it every. fucking. time. the limitations of LLMs are brought up. Every single time, someone says yeah, you didn't use web search / deep thinking / gpt-5-plus-pro-turbo-420B.

              It's absurd. You can trivially spend 2 minutes on ChatGPT and it will hallucinate a factually incorrect answer. Why, why, why always this cope?

              • andybak 6 hours ago

                The 2 minutes thing feels out by one or two orders of magnitude.

                But then - I'm constantly amazed by how everyone's subjective and presumably honest accounts of their experiences with AI differ so wildly.

              • 1718627440 6 hours ago

                Well, I agree with you that LLMs really like to answer with stuff that is not grounded in reality, but I also agree with the parent that grounding them in something else absolutely helps. I let the LLM invent garbage however it feels like, but then tell it to only ever answer by citing valid, existing URLs. Suddenly it generates claims that something doesn't exist, or that it truly doesn't know.

                This really results in zero hallucinations (but then the content is also mostly not generated by an LLM).
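
                A minimal sketch of that kind of check, assuming you post-process the model's answer yourself (the function name, the URL regex, and the "status < 400" pass criterion are my own choices here, nothing standard):

                  import re
                  import requests  # third-party: pip install requests

                  def verify_citations(answer, timeout=5.0):
                      """Map each URL cited in an LLM answer to whether it actually resolves."""
                      urls = re.findall(r'https?://[^\s)\]]+', answer)
                      results = {}
                      for url in urls:
                          try:
                              r = requests.head(url, allow_redirects=True, timeout=timeout)
                              results[url] = r.status_code < 400
                          except requests.RequestException:
                              results[url] = False
                      return results

                  # If any value is False, reject the answer or ask the model to retry.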

          • noosphr 8 hours ago

            >We also don't know, in situations like this, whether all of or how much of the research is true.

            That's perfectly fine since we don't know how much of the original research is true either: https://en.wikipedia.org/wiki/Replication_crisis

            If I waste three months doing a manual literature review on papers which are fraudulent with 100% accuracy have I gained anything compared to doing it with an AI in 20 minutes with 60% accuracy?

            • Jensson 7 hours ago

              > If I waste three months doing a manual literature review on papers which are fraudulent with 100% accuracy have I gained anything compared to doing it with an AI in 20 minutes with 60% accuracy?

              You don't see how adding a 40% error rate on top of that makes things worse? Your 20-minute study there made you less informed, not more. At least the fraudulent papers teach you what the community thinks about the topic, while in your example the AI just misinforms you about the world.

              For example, while reading all those fraudulent papers you will probably discover that they don't add up, and thus figure out that they are fraudulent. The AI study, however, will likely try to connect the data in those papers so that it makes sense (due to how LLMs work: they have seen more examples that connect and make sense than not, so hallucinations go in that direction). Then the studies will not seem as fraudulent as they actually are, and you might even miss the fraud entirely because the AI hallucinates arguments in favor of the studies.

              • noosphr 7 hours ago

                You are assuming my time is free, and then comparing spending effectively unlimited time on something against spending minimal time.

                • Jensson 7 hours ago

                  Uninformed is better than misinformed; it's better to not do that research at all than to have such a high error rate as your example had. AI models often have a much lower error rate than you said there for certain topics, but the 40% error rate in your example firmly puts it where you are better off doing nothing at all than using it for research.

        • alganet 10 hours ago

          > I don’t know how much.

          If you're not willing to measure how it helps you, then it's probably not worth it.

          I would go even further: if the effort of measuring is not feasible, then it's probably not worth it.

          That is more targeted at companies than you specifically, but it also works as an individual reflection.

          In the individual reflection, it works like this: you should think "how can I prove to myself that I'm not being bamboozled?". Once you acquire that proof, it should be easy to share it with others. If it's not, it's probably not a good proof (like an anecdote).

          I already said this, and I'll say it again: record yourself using LLMs. Then watch the recording. Is it that good? Notice that I am removing myself from the equation here: I will not judge how good it is; you're going to do it yourself.

          • wongarsu 8 hours ago

            There is a difference between confirming that something is worth it and quantifying the benefit though. One only requires satisfying a lower bound, the other requires an exact number.

            For example I use a $30/month chatbot subscription for various utility tasks. If I value my time at above $60/hour, I need to save half an hour each month (roughly a minute per working day) to make the investment worth it. That threshold is absolutely met: just with simple googleable questions and light research tasks I save much more than 7 minutes a week.
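
            A back-of-the-envelope version of that break-even arithmetic, using only the figures above:

              # Break-even check for a $30/month subscription at a $60/hour time value.
              subscription = 30.0        # USD per month
              hourly_rate = 60.0         # USD, self-assessed value of one hour
              weeks_per_month = 52 / 12  # ~4.33

              break_even_hours = subscription / hourly_rate              # 0.5 hours/month
              minutes_per_week = break_even_hours * 60 / weeks_per_month
              print(f"Break-even: ~{minutes_per_week:.1f} minutes saved per week")  # ~6.9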

            But how much do I actually save? What exactly is my time actually worth? Those are much more difficult questions to answer.

            • brailsafe 8 hours ago

              For me it's less important how much time I think I save on any discrete task than how much time I net overall, accounting for how long the set of tasks I'd be working on would otherwise have taken had I just done them manually. Right now, that means debits and credits in a ledger of time. Sometimes I gain a lot on tasks I probably otherwise wouldn't have done (but also don't gain much overall by doing), and sometimes I lose a ton of time simply by leaning on a loop of re-doing inaccurate agent work, in a way that's actually more time-intensive than if I had internalized the system in working memory and produced the functionality more slowly.

              If I save an hour, but lose 6, when I'd otherwise have spent 2, then I net -4, but sometimes overall it's positive, so the value is more ambiguous. If my employer didn't pay for the tools, I really don't know whether I would.

              A good price and conservative usage pattern might net more.

            • alganet 7 hours ago

              The user @brailsafe gave an answer that embodies some things I was going to say.

              You're accounting for the time wins, but not for the time losses.

              For a human chat user, that's when the LLM fails to answer or answers wrong. For an LLM coder, that's when context rot creeps in and you have to restart your work, and so on.

              There are people who don't care much if they are being bamboozled for $30/mo, they have nothing to prove nor grand expectations for the thing. To them, cargo culting might be fun and that's what they extract from the bargain.

              I am directing my answers mostly to people, companies or individuals, who have something to prove (evangelists, AI companies, etc). To those, a series of imperceptible small losses that results in debt in the long run is a big problem.

              My suggestion (the recording session) also works as a metaphor. It could be, instead of video, metrics about how contexts are discarded. It is, in that sense, also something they can decide to share or not, and the extent to which they share should be a sign of confidence in their product.

              Makes sense?

          • aeon_ai 9 hours ago

            I just did it.

            You were right.

            It is, in fact, that good.

            • alganet 9 hours ago

              You could have recorded it, found it to be good, and not shared the news, only used it for yourself. But you decided to share only the news, not the recording. That tells me something.

              To be clearer, I can take this argument further. I promise you that if you share the recording that led you to believe that, I will not judge it. In fact, I will do the opposite and focus on people who judge it, trying my best to make the recording look good and pointing out whoever is nitpicking.

      • thatjoeoverthr 8 hours ago

        The thing with the hype is it's always the same hype. "If you can just 3D print another 3D printer ..." "Apps are dead, everything will be AJAX" etc. I no longer believe the hype itself warrants attention or pushback. Let the hype boys raise money. No need to protect naive VCs.

        • bigfishrunning 7 hours ago

          But if the hype boys manage to capture big portions of the market (Microsoft, Amazon, etc...) it starts affecting pensions and retirement accounts. The next few years are gonna be rough because of this hype.

        • watwut 6 hours ago

          > Let the hype boys raise money. No need to protect naive VCs.

          I genuinely 100% believe that the ability of hype boys to raise money is harming the economy and us all. Whatever structural reason there is for it existing, it would be best to end it.

      • QuantumGood 8 hours ago

        Those who claim they're not useful usually tie it to something like "never trust them, because hallucinations", or backtrack when called out ("yes, I should have added details"), or speak of problems outweighing usefulness, hence "not useful", etc. But online, people do make this statement.

      • James_K 10 hours ago

        Something not being useful is distinct from it having no uses. It could well be the case that the use of AI creates more damage than it does good. Many people have found it a useful tool to create the appearance of work where none is happening.

      • tra3 10 hours ago

        Fair enough, I may have conflated "there's an AI bubble" with "AIs aren't useful".

        My employer pays for Claude pro access, and if they stopped paying tomorrow I'd consider paying for it myself. Although, it's much more likely for me to start self-hosting them.

        So that's what it's worth to me, say $2500 USD in hardware over the next 3 years.

        I'd love to hear what your take on this is.

        • brailsafe 8 hours ago

          $2500 is a relatively small investment for any sort of useful tool over 3 years, but that seems very low to me for this specific self-hosting endeavor

      • peteforde 8 hours ago

        My dude, there is a small but weirdly dedicated group of people on this site that are hellbent on demanding "proof" that the wins we've personally gained from using LLMs in an intelligent way are real. It's actually been kind of exhausting, leading me to not weigh in on many threads.

        • hatthew 5 hours ago

          Because there's a lot of evidence that people tend to overestimate/overstate how useful LLMs are.

          Everyone says "I wrote this thing using AI" but most of the time reading the prompt would be just as useful as reading the final product.

          Everyone says "I wrote this large codebase using AI" but most of the time the code is unmaintainable and probably could have been implemented with much less code by a real human, and also the final software isn't actually ready for prod yet.

          Everyone says "I find AI coding very useful" and neglects to mention that they are making small adhoc scripts, or they're in a domain that's mostly boilerplate anyways (e.g. some parts of web dev).

          The one killer application of LLMs seems to be text summarization. Everything else that I have seen is either a niche domain that doesn't apply to the vast majority of people, a final product that is slop and shouldn't been made in the first place, or minor gains that are worthwhile but nowhere near as groundbreaking as people claim.

          To be clear, I think LLMs are useful, and I personally use them regularly. But I've gained at most 5% productivity from them (likely much less). For me, it's exhausting to keep on trying to realize these gains everyone is talking about, while every time I dig into someone claiming to get massive gains I find that the actual impact is highly questionable.

      • llm_nerd 7 hours ago

        >I don't think I've ever seen anyone say they're not useful.

        https://news.ycombinator.com/item?id=45577203

        There are thousands and thousands of comments just like this on this site. I would dare say tens of thousands. They regularly appear in any AI-related discussion.

        I've been involved in many threads on here where devs with Very Important Work announce that none of the AI tools are useful for them or for anyone with Real Problems, and at best they work for copy/paste junior devs who don't know what they're doing and are doing trivial work. This is right after they declare that anyone who isn't building a giant monolithic PHP app just like them is a trend-chaser who is "cargo culting, like some tribe or something".

        >I also think they're over-hyped and that the current frenzy will end badly (global economically speaking)

        In a world where Tesla is a trillion-dollar company based upon vapourware, and the president of the largest economy (for now) is launching shitcoins and taking bribes through crypto, and every Western country saw a massive real-estate ramp-up from unmetered mass migration, and Bitcoin is a $2T "currency" that has literally zero real-world use beyond betting on itself, and sites like Polymarket exist for insiders to scam foolish rube outsiders out of their money, and... Dude, the AI bubble doesn't even remotely measure up.

    • didibus 10 hours ago

      > it's the claims that it's not useful

      I think the reason is that it depends on what impact metrics you want to measure. "Usefulness" is in the eye of the beholder. You have to decide what metric you consider "useful".

      If it's company profit for example, maybe the data shows it's not yet useful and not having impact on profit.

      If it's the level of concentration needed by engineers to code, then you probably can see that metric having improved as less mental effort is needed to accomplish the same thing. If that's the impact you care about, you can consider it "useful".

      Etc.

    • Octoth0rpe 10 hours ago

      > It 100% is [useful]

      It's worth disambiguating between "worth $50b of investment" useful and "worth $1t of investment" useful.

      • pseudosavant 9 hours ago

        For perspective, there are 10 companies with a market cap over $1T. Is the value of LLMs greater than Tesla? Absolutely.

        The problem of course is that plenty of that $1T in investment will go to stupid investments. The people whose investments pan out will be the next generation of Zuckerbergs. The rest will be remembered like MySpace or Webvan.

        • pseudosavant 8 hours ago

        I'll add that MSFT, AAPL, GOOGL, AMZN, and META generated >$450B in net income in the last 4 quarters. It can't be overstated how much profit they can burn on AI without losing money.

        • rz2k 8 hours ago

          To be fair, while the incremental value of each additional year that Tesla remains in existence may not be so great, it did finally change the conventional wisdom about the viability of electric vehicles which will continue to have substantial impact.

          Furthermore, the price of the most recently sold share times the number of shares outstanding does not represent the total R&D or spending required to make Teslas.

      • mattlutze 9 hours ago

        Especially when, as it is currently in vogue to observe, the difference between $50b and $1t is roughly $1t.

    • criemen 9 hours ago

      > Supposedly the performance of Qwen-coder is comparable to the likes of Sonnet 4. If I invest in a homelab that can host something like Qwen3 I'll recoup my costs in about 20 months without having to rely on Anthropic.

      You can always try it via openrouter without investing in the home setup first. That allows you to evaluate whether it hits your quality bar or not, and is much cheaper. It is less fun than self-hosting though.

    • silversmith 10 hours ago

      The issue is that the field is still moving too fast - in 20 months, you might break even on costs, but the LLMs you are able to run might be 20 months behind "state of the art". As long as providers keep selling cheap inference, I'm holding out.

      • ants_everywhere 9 hours ago

        I agree, but also don't underestimate the value of developing a competency in self-hosting a model.

        Dan Luu has a relevant post on this that tracks with my experience https://danluu.com/in-house/

      • tra3 10 hours ago

        That's where I am at too. Also it's not clear what's going to happen with hardware prices. I think there's a huge demand for hardware right now, but it should fall off at some point hopefully.

      • wmf 10 hours ago

        The gap between local models and SOTA is around 6 months and it's either steady or dropping. (Obviously this depends on your benchmark and preferences.)

        • criddell 7 hours ago

          Seriously? So I can run the best models from 2024 at home now?

          For example, what would I need to run Open AI's o1 model from 2024 at home? Are there good guides for setting this up?

          • wmf 6 hours ago

            It's not the same model, but for example GPT-OSS-120B is smarter than o1. The guide is: buy 128 GB of VRAM, then install LM Studio.

            • criddell 5 hours ago

              An NVIDIA 5090 with 128 GB of VRAM is $13k. It doesn’t make any sense to run that at home when you can pay OpenAI $20 / month to use it (it would take more than 50 years to spend $13k at OpenAI this way).

              So technically you might be able to run a six month old model at home, but it would be foolish to do so from a financial point of view.

              Or is there a way to get 128 GB of VRAM for a lot less than that?

              • wmf 5 hours ago

                Ryzen AI Max is $2,000, M4 Max is $3,500, and DGX Spark is $4,000. Still not really economically feasible, but I see it as an insurance policy. And that's for the most expensive local models; smaller models will run on any PC.

      • rz2k 8 hours ago

        Fortunately the models are increasing in efficiency about as fast as they are increasing in performance, so your homelab surprisingly doesn’t become out of date as fast as you might expect. However, I expect there will also be very capable machines like 1TB or 2TB Mac Studio M5 or M6 Ultras within a year or two.

    • mrbungie 10 hours ago

      It's hella useful. I use Cursor several times a week (and I'm not even working as a dev full time rn), and ChatGPT is my daily driver.

      Yet, it's weird to me that we're 3 years into this "revolution" and I can't get a decent slideshow from an LLM without having to practically build a framework for doing so.

      • jacobr1 10 hours ago

        It is a focus, data, and benchmarking problem. If someone comes up with good benchmarks, which means having a good dataset, and gets some publicity around them, they can attract the frontier labs' attention and focus training and optimization effort on making the models better for that benchmark. This is how most of the capabilities we have today became useful. Maybe there is some emergent initial detection of utility, but the refinement comes from labs beating each other on the benchmarks. So we need a slideshow benchmark, and I think we'd see rapid improvement. LLMs are actually OK at building HTML decks; not great, but OK. Enough so that if there were some good objective criteria to tune toward, I think the last-mile kinks (formats, object/text overlaps) would get worked out. The raw content is mainly a function of the core intelligence of the model, so that wouldn't be affected: if you can get it to build a good bullet-point markdown of your presentation today, it would be just as good as a prezo, though maybe not as visually compelling as you'd like. Also, this might need to be an agentic benchmark, to allow for both text and image creation and other considerations like data sourcing, which is why everyone doing this ends up building their own mini framework.

        A ton of the reinforcement-type training work is really just aligning the vague commands a user would give with the capability the model would produce given a much more fleshed-out prompt.

    • mrdependable 9 hours ago

      They are useful, but I find them only slightly more convenient than a Google search. Losing something like GPS on my phone would be a much bigger disruption to my life.

    • arjie 9 hours ago

      I used Qwen3-480B-Coder with Cerebras and it was not very good for my use case. You can run these models online first to see if they will work for you. I recommend you try that first.

    • noosphr 8 hours ago

      I had a hilarious exchange on here where I used an LLM to explain to a poster at length why they fundamentally didn't understand what I said. It did a bang up job. The poster, and a lot of other people, got mad I used AI and they still didn't understand my original post, or the AI explanation.

      LLMs aren't terribly useful to people who fundamentally can't read. When those people can also type very fast you get the current situation.

      • Jensson 7 hours ago

        > I used an LLM to explain to a poster at length why they fundamentally didn't understand what I said. It did a bang up job. The poster still didn't understand my original post.

        It didn't do a bang up job if the poster still didn't understand you, so sorry this example doesn't prove what you think it does.

        You have to measure actual results; your own take will always be biased, so you can't say "I thought it was great but it didn't work" and expect people to be convinced by that.

        Edit: And if that doesn't convince you, why not read what this AI has to say about your post? If you like them so much, you should read this and acknowledge you were wrong, just like you expected those people to: https://chatgpt.com/s/t_68f2ae740f98819183539767b921965b

        • noosphr 5 hours ago

          Claude, explain the fallacy fallacy to someone fond of pointing out fallacies in arguments:

          #### *The Fallacy Fallacy: A Metacognitive Error in Logical Analysis*

          The fallacy fallacy, also known as the argument from fallacy or argumentum ad logicam, represents a second-order logical error wherein one incorrectly infers that a conclusion must be false solely because it has been argued through fallacious reasoning. This metacognitive error constitutes a significant impediment to rigorous philosophical discourse and warrants careful examination.

          #### *Theoretical Framework and Definition*

          Within the domain of informal logic, fallacies constitute "mistakes of reasoning, as opposed to making mistakes that are of a factual nature". The fallacy fallacy emerges when interlocutors conflate the validity of argumentative structure with the truth value of propositional content. Specifically, this error manifests when one advances the following invalid inference pattern:

          1. Argument X contains logical fallacy F
          2. Therefore, the conclusion C of argument X is false

          This inference pattern itself represents a non sequitur, as the presence of fallacious reasoning does not necessarily bear upon the truth or falsity of the conclusion in question.

          #### *Epistemological Implications*

          The commission of the fallacy fallacy reveals a fundamental misunderstanding of the relationship between logical validity and factual accuracy. *Truth values of propositions exist independently of the quality of arguments marshaled in their support*. A proposition may be demonstrably true despite being defended through specious reasoning, just as a false proposition may be supported by formally valid argumentation with false premises.

          Consider the following syllogistic example:

          - Major premise: All mammals are warm-blooded
          - Minor premise: Dogs are mammals because they bark
          - Conclusion: Dogs are warm-blooded

          While the minor premise employs irrelevant reasoning (dogs' classification as mammals is unrelated to their vocalization), the conclusion remains factually correct.

          #### *Methodological Considerations for Critical Analysis*

          Scholars engaged in the identification of logical fallacies must exercise epistemic humility regarding the scope of their critique. As noted in the academic literature, "fallacies are common errors in reasoning that will undermine the logic of your argument", yet this undermining pertains exclusively to the argumentative structure rather than to the ontological status of the conclusion.

          The appropriate scholarly response to encountering fallacious reasoning involves:

          1. *Methodological separation* - Distinguishing between the evaluation of argumentative form and the assessment of propositional content
          2. *Constructive engagement* - Requesting alternative justification rather than dismissing conclusions outright
          3. *Epistemic charity* - Acknowledging that interlocutors may possess valid intuitions despite articulating them through flawed logical frameworks

          #### *Conclusion*

          The fallacy fallacy represents a particularly insidious form of intellectual error, as it masquerades as sophisticated logical analysis while itself committing a fundamental category mistake. Academics and scholars must remain vigilant against this metacognitive trap, recognizing that the identification of fallacious reasoning, while valuable for improving argumentative rigor, does not constitute sufficient grounds for rejecting the truth claims embedded within poorly constructed arguments. The pursuit of truth demands that we evaluate propositions on their merits, independent of the quality of their initial presentation.

    • huevosabio 10 hours ago

      The problem with self-hosting is that it increases the friction of swapping models to use whatever is SOTA or whatever fits your purpose best.

      Also, I've heard from others that the Qwen models are a bit too overfit to the benchmarks, and that their real-life performance is not as impressive as the benchmarks would suggest.

      • acutesoftware 4 hours ago

        Switching models when running locally is fairly easy: as long as you have them downloaded, you can switch them in and out with just a config setting. I can't quite remember, but you may need to rebuild the vectorstore when switching, though.

        LangChain has the embeddings for major providers:

          # `cfg` below is assumed to be the project's config module.
          from langchain_openai import OpenAIEmbeddings
          from langchain_community.vectorstores import FAISS

          def build_vectorstore(docs):
            """
            Create vectorstore from documents using configured embedding model.
            """
            # Choose embedding model
            if cfg.EMBED_MODEL.lower() == "openai":
                embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
            elif cfg.EMBED_MODEL.lower() == "huggingface":
                from langchain_community.embeddings import HuggingFaceEmbeddings
                embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
            elif cfg.EMBED_MODEL.lower() == "nomic-embed-text":
                from langchain_ollama import OllamaEmbeddings
                embeddings = OllamaEmbeddings(model=cfg.EMBED_MODEL)
            else:
                raise ValueError(f"Unknown embedding model: {cfg.EMBED_MODEL}")
            # Rebuild the index with the chosen embeddings (FAISS as one example store)
            return FAISS.from_documents(docs, embeddings)
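
        A hypothetical usage sketch (the stand-in cfg class and the sample document are assumptions for illustration, not part of the original snippet; it also assumes faiss-cpu and sentence-transformers are installed):

          from langchain_core.documents import Document

          class cfg:  # stand-in for the real config module referenced above
              EMBED_MODEL = "huggingface"

          docs = [Document(page_content="hello world")]
          vs = build_vectorstore(docs)
          print(vs.similarity_search("hello", k=1))
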
    • reissbaker 10 hours ago

      Qwen3 Coder unfortunately isn't on par with Sonnet, no matter what the benchmarks say. GLM-4.6 does feel pretty competitive though.

      You'll need a pretty expensive home lab to run it, though... I'd be surprised if you could cover the hardware cost, at long context, with only 20 months of Sonnet usage.

    • Ferret7446 2 hours ago

      Does that include electricity and maintenance costs?

    • ants_everywhere 9 hours ago

      The other thing that's tiring is talking about how AI is a bubble as if that's an indictment of AI.

      Being a bubble is a statement about the value of the stock market, not about the technology. There was a dotcom bubble, but that does not mean the internet wasn't valuable. And if you bought at the top of the dotcom bubble you'd be much wealthier now than you were when you bought. But it would have taken you a significant time to break even.

      • decimalenough 8 hours ago

        If you bought ETFs, yes, but not if you bought Pets.com and Yahoo.

        • ants_everywhere 8 hours ago

          Right, which is a distinction that matters if you have a sensible view of what it means to be in a bubble.

          But many people talking about AI being a bubble aren't trying to figure out which ticker is going to win in the long run, they're trying to convey a belief that AI is bogus altogether.

          There's widespread agreement that nobody knows whether the AI valuations we see are right. What I'm saying is tiring is people who confuse that idea with an indictment of the technology.

    • zmmmmm 7 hours ago

      > If I invest in a homelab that can host something like Qwen3 I'll recoup my costs in about 20 months without having to rely on Anthropic

      For me it's equally that I don't trust any of these service providers to keep maintaining whatever service or model I'm relying on. Imagine if I build an entire process around one, and then the bubble bursts and they either take away what I'm using or start charging outrageous amounts for it.

      I feel we are well into the point where the base technology is useful enough, and all the work is in how you implement and adapt it into your process / workflow. A new model coming out that is 3% better is relatively meaningless compared to figuring out how to better integrate what I already have, which might give me a 20% bump for very little effort.

      So at this point all I really want is stability in the tech so I can optimise everything else. Constant churn of hosted providers thrusting change at me every second day is actively harmful to my productive use of it at this point. Hence I want local models so I can just tune out the noise and focus on getting things done.

    • electroglyph 8 hours ago

      You need at least an RTX 6000 Pro, maybe 2, to run local models on that level. It's probably only worth it if you plan on doing other workloads like finetuning or generating a lot of synthetic data.

    • somewhereoutth 7 hours ago

      > the claims that it's not useful

      There are many credible claims that not only is it not useful, but that it is actually causing serious damage.

    • dvfjsdhgfv 8 hours ago

      > If I'm tired of one thing related to AI/LLM/chatbots, it's the claims that it's not useful.

      That is the best example of a straw-man argument I've seen this year. I enjoy reading discussions on LLMs and have seen a huge number of arguments, some reasonable and some ridiculous, but one thing I haven't seen is someone claiming that LLMs are not useful. We can discuss usefulness for a particular purpose, or the level of their fitness for it, but not the fact that millions of people find LLMs useful enough to pay for them.

    • imiric 10 hours ago

      > If I'm tired of one thing related to AI/LLM/chatbots, it's the claims that it's not useful. It 100% is. We have to separate the massive financial machinations from the actual tech.

      It's indisputable that the tech is and can be very useful, but it's also surrounded by a bubble of grifters and opportunists riding the hype and money train.

      The sooner we start ignoring the "AI", "ASI", "AGI", anthropomorphization, and every other snake oil these people are peddling, the sooner we can focus on practical applications of the tech, which are numerous.

    • satisfice 3 hours ago

      Few people say they are not useful. But when people like me say they aren't reliable or worthy of trust, AI fanboys like to pretend we are saying something else.

    • andrepd 7 hours ago

      It might be "useful" as in "it has a non-zero number of use cases", and still being massively overhyped (orders of magnitude in my opinion).

      I guess there are use cases for it, even if we discount undisputed net negatives like the proliferation of slop online, scam calls, deepfakes, etc. That doesn't mean it provides an amount of utility that justifies pivoting a significant portion of world capital and production towards that end.

      It will never be AGI, by the way. We are way past the inflection point of the logistic curve, so this is more or less what it is.

  • btucker 11 hours ago

    • jjangkke 8 hours ago

      I disagree with labeling AI a cargo cult. Crypto fits the description, but the definition of a cargo cult has to imply some sort of ultimate end in which its followers' expectations are drastically reduced.

      What AI feels like is the early days of the internet. We saw the dot-com bubble, but we ultimately live in the internet age. There is no doubt that the post-bubble world will be very much AI-orientated.

      This is very different from crypto, which isn't by any measure a technological leap, but rather a crowd frenzy aimed at self-enrichment via Ponzi mechanisms.

  • moomin 10 hours ago

    I want to ask ChatGPT to point to a behaviour described in the article that resembles cargo-culting with AI, but I don’t want to waste my future overlord’s time.

  • johnohara 10 hours ago

    Not sure "Cargo Cult" is an apt description. Feynman's description of Cargo Cult Science was predicated on the behavior of islanders building structures in expectation it would summon the planes, cargo, personnel, etc. that used the island during WWII.

    Without a previous experience they would not have built anything.

    There is no previous AI experience behind today's pursuit of the AI grail. In other words, no planes with cargo driving an expectation of success. Instead, the AI pursuit is based upon the probability of success, which is aptly defined as risk.

    A correct analog would be the islanders building a boat and taking the risk of sailing off to far away shores in an attempt to procure the cargo they need.

    • wmf 10 hours ago

      Arguably AI is already "successful" in terms of funding and press coverage and that's what many people are chasing.

    • smogcutter 10 hours ago

      This is a good point as a tangent. “Cargo Cult” is a meaningful phrase for ritualizing a process without understanding it.

      Debasing the phrase makes it less useful and informative.

      It’s a cargo cult usage of “cargo cult”!

    • sails 9 hours ago

      I’m amazed they published it with such a poorly applied analogy.

    • blamestross 9 hours ago

      Yeah, "cargo cult" is abused as a term. Those islanders were smarter than what is happening here.

      We use it dismissively, but "cargo cult" behaviour is entirely reasonable. You know an effect is possible, and you observe novel things correlating with it. You try them to test the causality. It looks silly when you know the lesson already, but it was intelligent and reasonable behaviour the entire way.

      The current situation is bubble denial, not cargo culting. Blaming cargo culting is a mechanism of bubble denial here.

      • o11c 8 hours ago

        It's called "science experiments", except when it produces the null result.

  • nextworddev 10 hours ago

    Yes, this rally seems overextended. But investor sentiment - if anything - has already swung to very negative, which isn't ideal if you want it to crash.

    Bubbles don't pop without indiscriminate euphoria (Private markets are a different story, but VCs are fked anyways). If anything, the prices have reflected less than 20% of Capex projections, so the market clearly thinks OpenAI / Stargate / FAANG's capex plans are BS.

    p.s. if everyone thinks it's a bubble, it generally rallies even more..

    • vonneumannstan 10 hours ago

      >If anything, the prices have reflected less than 20% of Capex projections, so the market clearly thinks OpenAI / Stargate / FAANG's capex plans are BS.

      I'd say if anything the market is massively underestimating the scale of their capex plans. These things are using as much electricity as small cities. They are well past breaking ground; the buildings are going up as we speak.

      https://www.datacenterdynamics.com/en/news/openai-and-oracle...

      https://x.com/sama/status/1947640330318156074/photo/1

      There are dozens of these planned.

      • nextworddev 9 hours ago

        Think we said the same thing

      • andoando 9 hours ago

        Oh jesus. I think AI is useful, but I figure 95%+ of it is used on complete nonsense.

      • lazide 8 hours ago

        Same thing happened with the dot-com boom and bust, except with fiber (later called dark fiber) and datacenters.

        A lot of people lost a lot of money. Post bankruptcy, it also fueled the later tech booms, as now there was a ton of dark fiber waiting to be used at rock bottom prices, and underutilized datacenters and hardware. Google was a major beneficiary.

        • nextworddev 8 hours ago

          think you are mis-quoting some of the poorly written vendor financing articles / linkedin posts.

          the market hasn't priced in Sam Altman's Capex projections, so it's probably akin to 1998 or 1999

          • lazide 2 hours ago

            nope, just forecasting based on lived experience.

  • jerf 11 hours ago

    Cargo cult as a metaphor doesn't work here. That's for when the cargo culters don't understand what is going on, and attempt to imitate the actions without understanding or accuracy. AI investors understand what is going on and understand that this may be a bubble and they may lose their investment. We may disagree with them about the probabilities of such an outcome, perhaps even quite substantially, but that's not the same thing as thinking that if I just write some number-looking-squiggles on a piece of paper and slide it under the door of a building that looks like it has computers on it I will have a pool and a lambo when I get home. That's what "cargo cult" investing would look like.

    The AI investors know what they are doing, by which I mean, if this is every bit the bubble some of us think it is and it pops as viciously as it possibly can and these investors lose everything from top to bottom, if they tried to say "I didn't know that could happen!" I simply wouldn't believe them and neither would anyone else. Of course they know it's possible. They may not believe it is likely, but they are 100% operating from a position of knowledge and understanding and taking actions that have a completely reasonable through-line to successfully achieving their goals. Indeed I'm sure some people have sufficiently cashed out of their positions or diversified them such that they have already completely succeeded; worries about the bubble are worries about a sector and a broad range of people but some individuals can and will come out of this successfully even if it completely detonates in the future. If nothing else the people simply drawing salaries against the bubble, even completely normal non-inflated ones, can be called net winners.

    • alphazard 10 hours ago

      Ironically, the author of TFA is playing the part of the cargo cult. They don't actually understand the cargo cult metaphor, but since it is a popular metaphor, they reference it in naive imitation hoping that people engage with their content.

    • leptons 10 hours ago

      The original "cargo culters" had nothing to lose, so your comment falls apart pretty quickly.

      • jerf 9 hours ago

        My comment about how "cargo culting" is not an appropriate metaphor "falls apart" because you named another way in which the metaphor is not appropriate?

        This is some bold new definition of "falls apart" with which I am not familiar.

  • jasonthorsness 10 hours ago

    Everyone has imperfect information; this isn't a cargo cult situation where it's massively asymmetric. This is more like when you see everyone else running: it's generally a good idea to start running too. But when that heuristic fails, it fails in a pretty spectacular way.

  • gdulli 11 hours ago

    Maybe it's human nature that has a cargo cult problem and AI is just the current flypaper?

  • micromacrofoot 11 hours ago

    Tech has a cargo cult problem

    • blackoil 10 hours ago

      Tech has a winner-takes-all problem. All those billions are chasing trillions in valuation. Many will fail, but some will be ruling (metaphorically) the world.

      • fancyfredbot 6 hours ago

        Right now the AI business does not look like a winner-takes-all situation at all. Everyone is losing money and nobody has any lock-in. Free open-source models are only 6-12 months behind the frontier labs. Seems like a tricky business to me, with no clear route to metaphorical world domination.

        Making stuff for AI companies looks like better business to me!

  • burnt-resistor 3 hours ago

    The problem is the self-reinforcing valuation entanglements that give NVIDIA a market cap of 4.42 teradollars, at least until Meta, Goog, Apple, and Microsoft develop their own custom NPUs or the fragile bubble bursts in some other way.

    There's value here, but probably not as much as the market thinks... yet.

  • llm_nerd 11 hours ago

    The cargo cult metaphor is weak. If an article written in the year of our FSM 2025 describes Melanesian cargo cults to make a point, they're probably just copying a trope from other articles. Cargo culting, if you will, much like Melanesian cargo cults that would wear bamboo earpieces and...

    Is it a gold rush? Absolutely. There is a massive FOMO and everyone is rushing to claim some land, while the biggest profiteers of all are ones selling the shovels and pick axes. It's all going to wash out and in the end a very small number of players will be making money, while everyone else goes bust.

    While many people think the broadly described AI is overhyped, I think people are grossly underestimating how much this changes almost everything. Very few industries will be untouched.

    • rjsw 10 hours ago

      The author is an anthropologist, I think she knows the original meaning of "cargo cult".

      The 'cult' behaviour described in the article is that of building big data centres without knowing how they will make money for the real business of the tech companies doing it. They have all bought AI startups but that doesn't mean that the management of the wider company understands it.

      • llm_nerd 8 hours ago

        >The author is an anthropologist, I think she knows the original meaning of "cargo cult".

        I am perplexed how you thought this refuted or offered any value to what I said. Or are you under the delusion that her being an anthropologist also makes her an expert on AI and the tech industry, ergo ipso facto her metaphor isn't incredibly dumb and ill-suited?

        I never questioned if they knew the "original meaning". Yes, we've all read the meaning countless, countless times, in a million trope-filled blog entries. And indeed, the whole basis of her tosser "article" is some random blog entry that, as millions before have, decided to make everything about Cargo Cults.

        Protip: If you are busy writing a blog entry and you decide to describe some island tribe (it does not actually matter where said tribe is) that had bamboo headsets, delete the entire thing and go do something actually useful.

        It is a profoundly boring story at this point. And in this case, like with many, the metaphor is incredibly stupid and ill-suited. If these businesses were building "data centres" out of mud and drawings of GPUs it would be pertinent, but instead she's describing a gold rush where a lot of players are doing precisely the right thing to try to land-grab an obviously massive and important space (and, in a sense very useful to them, seeing enormous capitalization gains in doing so), and then trying to ham-fist some cliche "story" in.

        Like, if that tribe built functional runways with ATC towers, and then a fleet of cargo planes -- being well funded in the process by outsiders who see how lucrative the cargo business is -- but then it turns out that the cargo business is a bit saturated so it's going to be tough for them to make it profitable on their EBITDA statements, boy, fire up the typewriter, you've got a winner!

        >The 'cult' behaviour described in the article is that of building big data centres without knowing how they will make money

        Utterly nonsensical.

        • decimalenough 8 hours ago

          Melanesian, not Micronesian (or Polynesian as you originally said). I know all those Pacific islands look the same, but it's not the same thing at all.

          • llm_nerd 8 hours ago

            Oh, damn, Melanesian? Well this changes everything! I do remember when Melanesia built those computation centres and it turns out that neighbouring Polynesia went with the newer generation of fabric and upended their business. Truly a great metaphor for so many things!

            Firing up notepad and going to author the next paper that does numbers among the Shakes Fists At Clouds crowd that spend their day tilting at windmills.

    • saltcured 10 hours ago

      Yeah, if "cargo cult" were applied aptly, it would apply more to the folks who are all-in on using LLMs yet not really getting any net productivity boost. Those folks are basically just LARPing a dream world, with no tangible benefit compared to the Old Ways.

    • hansonkd 10 hours ago

      Yeah, not seeing the connection to cargo cults unless AGI already appeared, offered us an incredible bounty of benefits, and then left, so we all created a religion in order to summon it back.

  • waprin 11 hours ago

    edit: made a goal to avoid pointless internet flame wars that I briefly lapsed from

    • stego-tech 11 hours ago

      From the perspective of AI critics like myself, HN is awash in posts showing what folks have done with AI or boosting AI PR pieces, while critics often get flagged and our submissions shunted away from the front page. AI Boosters claim that all this CAPEX will create a Utopia where nobody has to work anymore, economies grow exponentially forever, and societal ills magically disappear through the power of AGI. On the other side, a lot of AI Doomers point out the perils of circular financing, CAPEX investments divorced from reality, underlying societal and civilizational issues that will hinder any potential positive revolution from/by AI, and corporate valuations with no basis other than hype.

      Where commenters like yourself trip themselves up is a staunch refusal to be objective in your observations. Nobody is doubting the excitement of new technologies and their potential, including LLMs; we doubt the validity of the claims of their proponents that these magic boxes will somehow cure all diseases and accelerate human civilization into the galactic sphere through automated R&D and production. When Op-Eds, bloggers, and commenters raise these issues, they’re brow-beaten, insulted, flagged, and shunted away from the front page as fast as humanly possible lest others start asking similar questions. While FT’s Op-Eds aren’t exactly stellar to begin with, and this one is similarly milquetoast at first glance, the questions and concerns raised remain both valid and unaddressed by AI Boosters like yourselves. Specifics are constantly nitpicked in an effort to discredit entire arguments, rather than address the crux of the grievance in a respectable manner; boosters frequently come off like a sleazy Ambulance-Chasing Lawyer on TV discrediting witnesses through bad-faith tactics.

      Rather than bloviate about the glory of machine gods or whine about haters, actually try listening to the points of your opponents and addressing them in a respectful and honest manner instead of trying to find the proverbial weak point in the block tower. You - and many others - continue to willfully miss the forest for the specific tree you dislike within it, and that’s why this particular era in tech continues to devolve into toxicity.

      At the end of the day, there is no possible way short of actual lived outcome for either side to prove their point as objectively correct. Though when one side spends their time hiding and smearing critique from their opponents instead of discussing it in good faith, that does not bode well for their position.

    • JohnMakin 11 hours ago

      Ah, this is a good example!

    • ellg 11 hours ago

      "shocking little amount of discussion"

      are we reading the same website...

    • empiko 11 hours ago

      Fully agreed. The author also can't decide whether AI is a Ponzi scheme, a bubble, or a cargo cult; so let’s just use them all! It's just buzzwords without any real analysis beyond what is generally known about the field.

      • o11c 8 hours ago

        I mean, there's nothing wrong with all 3 potentially applying.

        It can be a Ponzi scheme for specific investors, a bubble in the stock market in general, and a cargo cult for the companies using it.

        One of the comments downthread did give an argument against "cargo cult" applying though - you need actual past successes (of the particular degree, not extrapolated) to assume that your actions will repeat them.

  • ctoth 11 hours ago

    The only cargo cult behavior I see here is Tett's own journalism! She casually drops that same debunked "95% of companies see no AI revenue gains" figure[0] without tracing it to source, performing the ritual of citation while missing the actual mechanism that makes evidence valuable.

    [0] https://aiascendant.com/p/why-95-of-ai-commentary-fails

  • halayli 10 hours ago

    The paper claims that 95% of companies see no AI revenue gains, which seems like an outrageous blanket statement. The truth is likely somewhere in the middle.

    The real issue here is a fundamental statistical and categorical error: the paper lumps all industries, company sizes, and maturity levels under the single umbrella of "companies" and applies one 95% figure across the board. This is misleading and potentially produces false conclusions.

    How can anyone take this paper seriously when it makes such a basic mistake? Different industries have vastly different AI adoption curves, infrastructure requirements, and implementation timelines.

    It's equally concerning that journalists are reporting on this without recognizing or questioning this methodological flaw.

    • Yaina 9 hours ago

      I think what we're seeing, and what the article describes, are company leaders across industries reacting to the AI hype by saying "we need AI too!" not because they've identified a specific problem it can solve, but because they want to appear innovative or cut labor costs.

      Right now, the market values saying you're doing AI more than actually delivering meaningful results.

      Most leaders don't seem to view AI as a practical tool to improve a process, but as a marketing asset. And let’s be honest: we're not talking about the broad field of machine learning here, but mostly about integrating LLMs in some form.

      So coming back to the revenue claims: Greenhouse (the job application platform), for example, now has a button to improve your interview summary. Is it useful? Maybe. Will it drastically increase revenue? Probably not. Does it raise costs? Yes, because behind the scenes they’re likely paying OpenAI processing fees for each request.

      This is emblematic of most AI integrations I've seen: minor customer benefits paired with higher operational costs.

      • bdbdkdksk 9 hours ago

        The Greenhouse example is so crazy - with no additional context about the interview, what possible value could an AI add to a summary of a real event that happened?

        • Yaina 8 hours ago

          It's just an additional button in their WYSIWYG editor. I'm sure it's not much more than a simple prompt telling ChatGPT or whatever to clean up the text for clarity.

      • bongodongobob 8 hours ago

        That's exactly it. We are using AI to reformat all of our documentation... and then we've been told to review the output. No one asked for this, there are no benefits, and it's adding completely unneeded extra work.

        • fragmede 8 hours ago

          Would you feel the same way if they'd hired a human technical writer to generate documentation and you had to review the output?

          • bongodongobob 6 hours ago

            Yes, because our documentation is fine. No one thinks any changes need to be made. Leadership is realizing all their AI hype has no benefit to anything we do so they're scrambling to find projects to use it for.

    • fishmicrowaver 10 hours ago

      It's not clear to me how much companies are even attempting to quantify the value of 'AI'. Having 'AI' is the value. It's similar to the Data Science / Machine Learning craze, where managers decided that we must have ML instead of considering it one capability among many that may or may not be useful for a particular problem.

    • molyss 10 hours ago

      I think it's a bit disingenuous to reduce the article to a single sentence that's in parentheses and links to a widely shared publication about an MIT report. Especially when said article continues with "Don’t get me wrong: I am not denying the extraordinary potential of AI to change aspects of our world, nor that savvy entrepreneurs, companies and investors will win very big. It will — and they will."

      One doesn't have to agree with the original report, but one can't in good faith deny that the whole thing smells of a financial scheme with circular contracts, massive investments for an industry that's currently losing money by the billion and unclear financial upside for most other companies out there.

      I'm not saying AI is useless or that it will never be useful; I'm just saying that there are some legitimate reasons to worry about the amounts of money being poured into it and its potential impact on the economy at large. I believe the article is simply taking a similar stance.