The sigmoids won't save you

(astralcodexten.com)

53 points | by Tomte 7 hours ago

72 comments

  • btilly an hour ago

    Lindy’s Law is an absolute gem, that I'm keeping.

    If we don't understand the fundamental limits to any particular kind of trend, our default assumption should be that it will continue for about as long as it has gone on already.

    We can, in fact, easily put a confidence interval on this. With 90% odds we're not in the first 5% of the trend, or the last 5% of the trend. Therefore it will probably go on between 1/19th longer, and 19 times longer. With a median of as long as it has gone on so far.

    This is deeply counterintuitive. When we expect something to last a finite time, every year it goes on brings us a year closer to when it stops. But every year that it goes on also rightly raises the expectation that it will go on for a year longer still.

    We're looking at a trend. We believe that it will be finite. Our intuition for that is that every year spent, is a year closer to the end. But our expectation becomes that every year spent, means that it will last yet another year more!

    How can we apply that? A simple way is stocks. How long should we expect a rapidly growing company, to continue growing rapidly?
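The interval arithmetic in btilly's comment can be checked with a short sketch. It assumes, as the comment does, that the present moment is equally likely to fall anywhere in the trend's total lifetime; the function name `lindy_interval` and the 10-year example are illustrative, not from the thread.

```python
def lindy_interval(elapsed, confidence=0.90):
    """Return (low, median, high) estimates of the remaining duration.

    Assumption: "now" is uniformly distributed over the trend's total lifetime.
    If a fraction f of the lifetime has elapsed, remaining / elapsed = (1 - f) / f.
    """
    tail = (1.0 - confidence) / 2.0        # 0.05 on each side for 90%
    low = elapsed * tail / (1.0 - tail)    # case f = 0.95: 1/19 of elapsed remains
    high = elapsed * (1.0 - tail) / tail   # case f = 0.05: 19 times elapsed remains
    median = elapsed                       # case f = 0.50: as long again
    return low, median, high


low, median, high = lindy_interval(10.0)
print(f"between {low:.2f} and {high:.0f} more years, median {median:.0f}")
```

For a trend that has run 10 years, this gives roughly half a year to 190 years, which shows how wide the 90% interval is in practice.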

    • jerf an hour ago

      It's an interesting idea, and it may be something that could be mathematically justified, but I do think this is an abuse of Lindy's Law in the absence of such a justification. Per Wikipedia [1]:

      "The Lindy effect applies to non-perishable items, like books, those that do not have an "unavoidable expiration date"."

      And later in the article you can see the mathematical formulation which says the law holds for things with a Pareto distribution [2]. I'd want to see some sort of good analysis that "the life span of exponential growth curves" is drawn from some Pareto distribution. I don't think it's completely out of the question. But I'm also nowhere near confident enough that it is a true statement to casually apply Lindy's Law to it.

      [1]: https://en.wikipedia.org/wiki/Lindy_effect

      [2]: https://en.wikipedia.org/wiki/Pareto_distribution
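The Pareto property jerf refers to can be written out as a short derivation. This is a sketch, assuming the lifetime T follows a Pareto distribution with scale x_m and shape α (symbols introduced here, not taken from the thread):

```latex
P(T > s \mid T > t) = \left(\frac{t}{s}\right)^{\alpha}
  \quad (s \ge t \ge x_m),
\qquad
E[\,T - t \mid T > t\,] = \frac{\alpha t}{\alpha - 1} - t = \frac{t}{\alpha - 1}
  \quad (\alpha > 1).
```

The expected remaining lifetime grows in proportion to the age t, which is the Lindy effect. The "uniformly somewhere in the trend" argument upthread corresponds to α = 1, where the median remaining time equals t but the mean is infinite.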

    • skybrian 32 minutes ago

      You can do that but you're laundering ignorance into precise-seeming mathematics. Better to just say "we're probably somewhere in the middle, not at the beginning or end" and leave it at that. Calling a peak is hard.

    • LPisGood 39 minutes ago

      This is the exact same heuristic used in CPU scheduling.

      We expect fresh processes to terminate quickly and long running processes to last for a while longer.

  • LarsDu88 an hour ago

    I think an interesting thing about recent AI developments is that it's all happening right as we hit the diminishing returns side of another "exponential that's actually a sigmoid", which is Moore's law.

    The naive expectation is that AI will slow down b/c Moore's law is coming to an end, but if you really think about the models and how they are currently implemented in silicon, they are still inefficient as hell.

    At some point someone will build a tensor processing chip that replaces all the digital matmuls with analogue logamp matmuls, or some breakthrough in memristors will start breaking down the barrier between memory and compute.

    With the right level of research funding in hardware, the ceiling for AI can be very high.

    • throwaway27448 an hour ago

      Even at orders of magnitude greater speed, we've still hit diminishing returns for quality of output. We simply haven't found anything like superhuman reasoning ability, just superhuman (potentially) reasoning speed.

      • energy123 24 minutes ago

        It's not that easy to assess diminishing returns with saturated benchmarks where asymptoting to 100% is mathematically baked in. I could point to the number of Erdos proofs being solved by AI going from 0 to many very recently as evidence for acceleration.

      • horsawlarway 26 minutes ago

        Possibly - but we've also seen that spending more tokens on a task can improve the quality of the output (reasoning, CoT, etc).

        So it's not impossible to have things that seem orthogonal, like generation speed or context length, have an impact on quality of result.

    • cyanydeez an hour ago

      they already did put a model into the silicon and it's crazy fast. https://chatjimmy.ai/

      I'm pretty sure there's a 3 year design goal starting this year that'll do that to any of the qwen, deepseek, etc models. There's a lot you could do with sped up models of this quality.

      It might even be bad enough that the real bubble is how much we don't need giant data centers when 80-90% of use cases could just be a silicon chip with a model rather than as you say, bloated SOTA

      • clickety_clack an hour ago

        It would be pretty cool to have interchangeable usb keys with models on them.

  • gm678 2 hours ago

    I don't know what the Y-axis is supposed to be on that Wharton AI capabilities graph, but I am not really convinced that Opus 4.6 has more than double the intelligence/capability/whatever of GPT 5.1 Max.

    • NitpickLawyer 2 hours ago

      IIRC that graph tracks capabilities as time_to_solve a task for humans (i.e. the model can now handle tasks that usually take a human ~8h). Which, depending on what tasks you look at, could be a reasonable finding. I could see Opus 4.6 handling tasks that take ~8h for humans, and that 5.1 couldn't previously handle (with 5.1 being "limited" at 4h tasks let's say). It is a bit arbitrary, but I think this is what they're tracking.

      • jrumbut an hour ago

        Without knowing more about their methodology, it seems like a lot of the recent improvements have involved the AI itself taking time to complete the task.

        At first the models turned a 5 minute task into a 5 second task (by 5 seconds I mean a very short amount of time, not precisely 5 seconds). Then they turned a 15 minute task into a 5 second task.

        Opus 4.6 completes 8 hour tasks all the time but (at least in my experience) it isn't spitting the answer out in 5 seconds anymore. It's using chain of thought and tools and the time to completion is measured in minutes or maybe hours.

        In my experiments with local LLMs, a substantial part of the gap between frontier and local (for everyday use) is in tooling and infrastructure.

        That is why I am sympathetic to the idea we are leveling off. But to bring in the air speed example from the article, I don't think we've reached the equivalent of the ramjet yet. I suspect in the coming years there will be new architectures, new hardware, and new ways to get even more capable models.

      • lukan an hour ago

        "It is a bit arbitrary, but I think this is what they're tracking."

        I don't know if they can get their numbers right this way, but this seems a way more useful metric than theoretical capabilities.

        • cyanydeez an hour ago

          ok, but aren't you just measuring efficiency and not the big I in AGI improvements?

          • jsnell 19 minutes ago

            No? I think you're misunderstanding what is being measured.

            It is purely a test of capabilities (can it do a thing that takes a human $X hours), not efficiency (how fast will it do it).

          • lukan an hour ago

            Yes, but this study was not about that and "just efficiency" is actually what most people are after.

            At least I want AI to solve my problems, not score high on an academic leaderboard.

      • MadxX79 41 minutes ago

        I don't know why people are so impressed by 8h.

        I trained an LLM to write the whole Harry Potter series, and that took JK Rowling like 17 years.

        For my next point on the graph, I'll train the LLM to write the Bible, something that took humans >1500 years.

    • strken an hour ago

      Check out Re-Bench and HCAST.

      The tasks are obviously all of the form "Go do this, and if you get the following output you passed". Setting up a web server apparently takes 15 minutes for a human, which is news to me since I'm able to search for https://gist.github.com/willurd/5720255, find the python one-liner, and copy it within about ten seconds.

      Anyway, this is cool but it does not mean Claude can perform any human tasks that take less than 8 hours and are within its physical capabilities.

    • adw an hour ago

      https://podcasts.apple.com/us/podcast/machine-learning-stree... is a pretty good primer on METR, what it measures, and its limitations.

    • throwaway27448 an hour ago

      > more than double the intelligence/capability/whatever

      I'm curious what people really mean when they say this. Intelligence is famously hard to define, let alone measure; it certainly doesn't scale linearly; it only loosely correlates to real-world qualities that are easy to measure; etc. Are you referring to coding ability or...?

    • myhf an hour ago

      According to this article: whenever someone games a benchmark to make an upward chart on some y-axis, it's YOUR responsibility to prove how and why that trend can't continue indefinitely.

      🙄

      • skybrian an hour ago

        Seems to me that the default is "I don't know what's going to happen" and if you're making a confident prediction, bring evidence.

        Scott makes a Lindy effect argument which is plausible, but don't let that fool you, we still don't know what's going to happen.

      • AnimalMuppet an hour ago

        I'm pretty sure that gaming benchmarks can continue indefinitely.

    • BoredPositron 2 hours ago

      https://metr.org/time-horizons/ on linear scale. Clickbait garbage article as most of his in the last year.

      • afthonos 2 hours ago

        …yeah, that’s where you see the exponential?

  • jsmcgd 24 minutes ago

    > It’s true that birth rates must eventually flatten out and become sigmoid

    All positive growth eventually flattens out and becomes sigmoid, but a lot of phenomena experience negative growth and nose dive. No gentle curve, but a hard kink and perfect flat line at zero. Forever. I think it would be a stretch to categorize that pattern as sigmoid. Predicting a sigmoid pattern for negative growth implies some sort of a soft landing (depending on your definition of soft).

    We can think of many populations that are no longer with us. So just a caution about over applying this reasoning in the negative case.

  • dsign 42 minutes ago

    We did hit the sigmoid's plateau on airplane speed, but the applications of airplane speed are still coming (how fast can a Chinese company airship the PCB you ordered three minutes ago?). I expect the same will happen with LLMs, though I also happen to believe things are just getting started on end capabilities.

  • janalsncm an hour ago

    > What if you don’t fully understand the process? AI forecasters know some things (like how data centers work and how much it costs to build them). But they’re unsure about other things (researchers keep inventing new paradigms of data generation that get over data walls, but for how long?), and other things are entirely opaque (What is intelligence really? Why do scaling laws work? Might they just stop working at some point?) Is there anything you can do here?

    This is the crux of the article. To a large extent continued progress depends on a stable increase in compute, an increase in training data, and an increase in good ideas to squeeze more out of both of them.

    One calculation you could do is a survival function: for each of the above, how long before it is disrupted? For example, China could crack down on AI or invade Taiwan. Or data centers become politically unpopular in the US. Or, we could run out of great ideas. Very hard to predict.
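A minimal sketch of the survival-function idea in janalsncm's comment follows. It assumes the three inputs fail independently at constant rates, which is a strong simplification, and the hazard values are made up purely for illustration; none of these numbers come from the thread or the article.

```python
import math

# Hypothetical hazard rates per year for each input to continued progress.
# These values are invented for illustration only.
HAZARDS = {"compute": 0.05, "training_data": 0.10, "ideas": 0.08}


def survival(years, hazards=HAZARDS):
    """Probability that none of the inputs has been disrupted after `years`.

    Assumes independent, constant hazards, so the combined survival curve is
    exponential with a rate equal to the sum of the individual rates.
    """
    total_rate = sum(hazards.values())
    return math.exp(-total_rate * years)


def median_years(hazards=HAZARDS):
    """Time at which the combined survival probability drops to one half."""
    return math.log(2) / sum(hazards.values())


print(f"P(trend intact after 5 years) = {survival(5):.2f}")
print(f"median time to first disruption = {median_years():.1f} years")
```

The hard part, as the comment says, is not the arithmetic but choosing the rates; the sketch only shows how quickly several modest risks compound.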

  • Brendinooo an hour ago

    > then what is their model?

    My mental model has been 3D computer graphics: doubling the polygon count had huge returns early on but delivered diminishing returns over time.

    Ultimately, you can't make something look more realistic than real.

    I don't know what the future holds, but the answer to the question "can LLMs be more realistic than real" will determine much about whether or not you think the curve will level off soon.

  • OscarCunningham an hour ago

    John D Cook gives more technical details here: "Trying to fit a logistic curve" https://www.johndcook.com/blog/2025/12/20/fit-logistic-curve...

  • philipallstar 2 hours ago

    But they do explain the improvement of AI driving 2017-2021 vs 2022-2026.

  • zkmon an hour ago

    The curve is a smoothed step curve (y=1 if x>1 otherwise 0). Nature doesn't allow any change to happen instantly at any degree of rate of change. The curve is just a manifestation of a change, with exponential smoothing of the sharp corners.

    For example, when a car starts, its speed and acceleration become more than zero. But what about rate of change in higher degrees? It doesn't suddenly change from zero acceleration to non-zero. That means the car has a non-zero derivative at all degrees. In other words, the movement is exponential. The same thing happens in reverse when the car reaches a constant speed.

  • kubb an hour ago

    If the scary AI is so inevitable, why do you feel such an overwhelming need to convince people about that? Surely you can just wait a bit, and they'll see for themselves.

    • mitthrowaway2 an hour ago

      By that reasoning, why even warn people about anything? Why do road construction crews put up signs saying "ROAD CLOSED AHEAD" when you can just drive on and see for yourself?

      • kubb an hour ago

        Indeed, why warn people about real things that exist in the world? That is EXACTLY the same as inciting fear about something imaginary (not even projected).

        • mitthrowaway2 5 minutes ago

          I don't know how else to put this; there seems to be some theory of mind failure at work here.

          In your mind, dangers from AI are imaginary and not even projected, therefore, you don't see any reason to warn about them, because you don't think the dangers are real. You don't believe the road is actually closed up ahead, so you don't think it's necessary to post the sign.

          In Scott's mind, dangers from AI are not a known fact, but are somewhere between highly probable and a near-certainty. In his mind, there are well-grounded justifications for believing this, and he strives to convince skeptics of his reasoning. Because he thinks AI poses dangers to the public, he also believes he should inform people about this, so that we might steer clear.

          It's easy to understand why someone who believes what you believe about AI would of course not warn people about AI. It's also easy to understand why someone who believes what Scott believes about AI would want to warn people about AI. Your contention is with his confidence for being worried about AI, not his reason for wanting to warn people.

    • adleyjulian an hour ago

      1. It's not inevitable. 2. Those that see AI as an existential risk don't generally think it's a guarantee, but if it's say a 5% chance then that's worth addressing/mitigating. 3. That's not what this article was even about.

      • kubb an hour ago

        Sounds like the burden is on you to explain either

          1. If you're not treating my claim as a black box, explain explicitly what is your model of what the article was about? Are you aware, for example of the last paragraph of the article? I think that WAS what the article was about. Do you have specific opinions on e.g. how I went wrong and where my model differs?
          2. If you are treating it as a black box, what's your default expectation based on the law of Nothing Ever Happens?
        
        Just kidding, you don't need to explain anything. A"I" fearmongers should though.

        • adleyjulian 9 minutes ago

          The point of the article is that people are historically bad at predicting when exponential curves plateau, even if they're correct that there will be a plateau.

          This does *not* imply the inevitability of AGI. It does not imply AGI is necessarily bad.

          It does mean that "the capabilities of AI will eventually plateau" offers no meaningful predictive power or relevance to the overall AI discussion.

  • andai 2 hours ago

    Well, curve shape aside, the high watermark might be lower than where it tapers off.

    https://news.ycombinator.com/item?id=46199723

  • patrickmay an hour ago

    Stein's Law: "If something cannot go on forever, it will stop."

    • skybrian 43 minutes ago

      Yes, but figuring out when is the hard part.

  • krupan an hour ago

    News flash: predicting the future is hard

    • energy123 an hour ago

      The individual who is the best at predicting the future is predicting ASI and full labor automation by 2040:

      https://xcancel.com/peterwildeford/status/202963666232244661...

      • dsign 28 minutes ago

        My own bet is end of that decade: somewhere between 2045 and 2050.

        Ofc "full labor automation" has a certain spread of meaning. A sliver of population will always find ways to hold to a job or run one or many businesses. But there will be "enough" labor automation for it to be a social ticking bomb. That, in fact, does not depend on better models nor better AI than we have today. By 2045 there will be a couple of generations that have been outsourcing their thinking to AI for most of their adult lives. Some of them may still work as legal flesh of sorts, but many won't get to be middlemen and will find no job.

        Also, if you could replace your senator today by an untainted version of a frontier model (of today), would you do it? Would it be a better ruler? What are the odds of you not wanting to push that button in the next twenty years, after a few more batches of incompetent and self-serving politicians?

      • Aurornis an hour ago

        > The individual who is the best at predicting the future

        Going to need a big citation for that claim

      • gerikson an hour ago

        Past results are no guarantee of future performance.

      • layer8 an hour ago

        Predicting who will predict the future best is hard.

      • margalabargala an hour ago

        > The individual who is the best at predicting the future

        Lol

  • inglor_cz 2 hours ago

    Hmmm, this is quite an interesting take by Scott.

    Lindy's Law is not actually a law and many exact minds will be provoked by the very name; it also fails spectacularly in certain contexts (e.g. lifetime of a single organism, though not necessarily existence of entire species).

    But at the same time, I am willing to take its invocation in the context of AI somewhat seriously. There is an international arms race with China, which has less compute, but more engineers and scientists. This sort of intellectual arms race does not exhaust itself easily.

    A similar space race in the 1950s and 1960s progressed from first unmanned spaceflight to a moonwalk in a mere 12 years, which is probably less than what it takes to approve a bicycle lane in Chicago now.

    • mitthrowaway2 an hour ago

      It's not a law per se, but there are rules for reasoning under uncertainty to get the most out of what limited knowledge you have, and Lindy's law arises from that. To do better than Lindy's law requires having additional information about the problem beyond just the one data point.

    • krupan an hour ago

      "There is an international arms race with China"

      I keep seeing this. Where did it come from? Has China said that they intend to attack other countries using AI? Have other countries declared that they intend to attack China with AI?

        Also, why does anyone believe that AI could actually be that dangerous, given its inherently unpredictable and unreliable performance? I would be terrified to rely on AI in a life or death situation.

      • aspenmartin an hour ago

        AI in war is like Palantir's whole business model. You have a system that can effectively deal with ambiguity and has superhuman performance on reasoning plus superhuman physical abilities via embodiment…

        Inherent unpredictable and unreliable performance is also quite the feature of human beings as well.

      • dmbche an hour ago
      • inglor_cz an hour ago

        It was a metaphor. I meant, and later clarified, an intellectual arms race.

        BTW your handle is an actual Czech word, minus a diacritic sign ("křupan"), and a bit amusing one. It basically means hillbilly. Not that it matters, just FYI.

        Anyway: AI will be used in military context, and it probably already is. Both for target acquisition and maybe even driving the weapon itself. As of now, the Ukrainians are almost certainly operating some AI-enabled killer drones.

  • itkovian_ an hour ago

    The other thing people don’t understand is that exponential curves are self-similar. The start of an exponential looks like an exponential. People always look at it and think ‘well, that’s it, it’s exponential now, I’ve missed it, it can’t sustain’. Nope.

    Good example of this is number of submissions to neurips/icml/iclr. In 2017 that curve was exponential.
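The self-similarity itkovian_ mentions is a one-line identity. As a sketch, with growth rate k and time shift s as symbols introduced here:

```latex
e^{k(t + s)} = e^{ks} \, e^{kt}
```

Shifting an exponential in time is the same as rescaling its vertical axis by the constant factor e^{ks}, so no point on the curve looks more like "the steep part" than any other once the axis is rescaled.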

  • devmor 2 hours ago

    "Exponentials all tend to become sigmoids but you can't predict exactly when" is a true statement, but I'm not sure it needed an article.

    This doesn't say much, and the author fights their own points a couple times, suggesting that they maybe didn't think through what they wanted to write until they were in the middle of writing it and started realizing their assumptions didn't match what they expected the data to say.

    I really don't get the point of what I just read.

    • aspenmartin an hour ago

      The point is the tiring arguments from AI skeptics saying “things are flattening, they have to” which while technically correct says nothing because no one knows when that will happen and we see no mechanism for this yet. Lindy’s law as a reasonable prediction under total uncertainty is interesting and insightful and a lot of people don’t know about it or why it holds. I did enjoy the reference to this!

  • nathan_compton 2 hours ago

    A lot of words to say "The initial part of a sigmoidal curve is not very informative about the parameters of the sigmoid function in question."
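nathan_compton's one-line summary can be illustrated numerically. The sketch below compares two logistic curves whose ceilings differ by a factor of ten but which share the same early exponential phase; the function `logistic` and all parameter values are hypothetical choices made for illustration.

```python
import math


def logistic(t, ceiling, rate=1.0, start=1.0):
    """Logistic curve that starts out close to start * exp(rate * t)
    and saturates at `ceiling`."""
    growth = start * math.exp(rate * t)
    return growth / (1.0 + growth / ceiling)


# Early window: both curves are still far below either ceiling.
early = [i * 0.1 for i in range(11)]  # t = 0.0, 0.1, ..., 1.0
max_rel_gap = max(
    abs(logistic(t, 1000.0) - logistic(t, 100.0)) / logistic(t, 1000.0)
    for t in early
)

# Late time: each curve has flattened out at its own ceiling.
late_ratio = logistic(20.0, 1000.0) / logistic(20.0, 100.0)

print(f"largest early relative gap: {max_rel_gap:.1%}")
print(f"late ratio of the two curves: {late_ratio:.2f}")
```

With these parameters the two curves stay within a few percent of each other over the early window, yet end up a factor of ten apart, so modest noise in early data is enough to hide the ceiling.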

    • inglor_cz 2 hours ago

      That is true, but I generally enjoy reading a lot of words from Scott, who has a talent for writing.

      The entire plot of the Lord of the Rings could probably be compressed into less than 10 kB of text too.

      Edit: this seems to be a controversial comment, but IMHO a blog of Scott Alexander's type is an art form, not just a communication channel.

      • jeffreyrogers an hour ago

        I find him more interesting when he talks about non-AI topics. Lots of other interesting people are like this too. I'd rather get my knowledge on AI from people who have unique insights into it. Scott has a lot of unique perspectives of his own, but his views on AI are bog-standard for his social group.

        • inglor_cz an hour ago

          Frankly, me too, but he is still smart enough to introduce some grains of original thought even into those bog-standard views.

  • addaon 2 hours ago
  • BoredPositron 2 hours ago

    If you use the log scale you'll see that the time horizon of opus 4.6 was as expected...

    • afthonos 2 hours ago

      As expected by the exponential. The Wharton study was predicting when the exponential would turn into a sigmoid.

    • ReptileMan an hour ago

      Everything is linear on a log log scale with a fat marker.

  • bedobi an hour ago

    Why is this author tolerated on Hacker News? He's not actually knowledgeable about 99% of subjects he posts about.

    • ngriffiths 34 minutes ago

      I think there are many ways someone with his lack of expertise can still be valuable, including:

      - Making connections to other subjects that an expert would miss. The hall of fame of sigmoid predictions is just excellent, I already know I'm going to be reminded of it some time in the future. Very entertaining way to get the point across.

      - Writing about tricky concepts in a very accessible and elegant way, which experts are notoriously bad at doing themselves - they are often optimizing for other specialists.

      - Being able to write with an air of speculation and experimentation with ideas that experts and institutions often can't afford. Experts have to maintain their track record; Scott Alexander can say "lol just double the timeline"

      • bedobi 12 minutes ago

        you do you, I don't come here for superficially informed-looking articles written by people who are in fact not experts, informed or educated, I come here for the real deal

        it doesn't help that sCotT aLexAndEr is also as close as you can come to the modern dressed up version of a eugenicist (again, not based on any actual expertise)

        but I rest my case

    • simianparrot 42 minutes ago

      Because HN is YCombinator which has invested in probably hundreds of «AI» firms by now. Including OpenAI.

      Allowing slop articles like this literally prints them evaluation money.