38 comments

  • wasabinator 2 days ago

    This should be a warning to anyone who feels it's okay to offload their creativity to a subscription service. You always need a local model in some form.

    • avaer a day ago

      You could judge the costs of the AI products you're using by the standard API pricing, not promotional subscription offers.
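
      A rough way to do that math, as a minimal sketch: the token counts and per-token prices below are made-up assumptions standing in for whatever the current rate card and your actual usage look like.

      ```python
      # Back-of-the-envelope comparison of subscription price vs. API-rate cost.
      # All numbers here are illustrative assumptions, not real rate-card prices
      # or real usage figures.
      input_tokens_per_month = 30_000_000   # assumed heavy coding-agent usage
      output_tokens_per_month = 5_000_000

      price_per_m_input = 3.00    # assumed $ per 1M input tokens
      price_per_m_output = 15.00  # assumed $ per 1M output tokens

      api_cost = (
          input_tokens_per_month / 1_000_000 * price_per_m_input
          + output_tokens_per_month / 1_000_000 * price_per_m_output
      )

      subscription_price = 20.00  # the $20/month plan discussed in this thread

      print(f"Cost at API rates:   ${api_cost:,.2f}/month")
      print(f"Subscription price:  ${subscription_price:,.2f}/month")
      print(f"Implied subsidy:     ${api_cost - subscription_price:,.2f}/month")
      ```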

      • glimshe a day ago

        Not even that way, given that the price is still highly subsidized by investors and circular deals.

      • kadoban a day ago

        For me, it's not even necessarily about cost. If they decide to change the product they offer, the old one is gone. I refuse to use anything for personal use that's not at least _available_ as model weights.

    • rurban a day ago

      Local models are not comparable to the FOTA models at all. I know what I'm talking about: I have four local H100s in my server and could run the very best local models. It's night and day. They are unusable and stupid.

      • poisonborz a day ago

        For what do you use the 4 local H100s then?

        • rurban a day ago

          For training our AI model of course. Inference is for the cheaper machines.

      • wasabinator a day ago

        Not all tasks require a frontier model.

      • y0eswddl a day ago

        What is "FOTA"?

        • rurban a day ago

          That's my autocorrect, because I do too much embedded work (firmware-over-the-air updates). It should say "frontier".

      • bravetraveler a day ago

        I get perfectly acceptable results from a Strix Halo PC the size of a shoebox, man. An APU that draws ~150 W, has no discrete GPUs, and costs $0/month. What's more, it doesn't go down every week, limit use, or change the terms on a whim.

        I'll burn/discard 'frontier' tokens (at work) only because they're mandated and they foot the bill. I'd rather resell them; meet the asinine requirement from $EMPLOYER, provide cover for outsourcing to my equipment, and get a return for the hassle.

        TLDR: perhaps you're holding it wrong or haven't tried the latest, as we so often hear. That's a lot of GPU for not much utility.

        • rurban a day ago

          Well, my Python and TypeScript folks are also happy with the simpler local models. But I'm working on more advanced stuff: C/C++ embedded real-time, vision AI, and compilers.

          • bravetraveler a day ago

            Fair point. I treat LLMs like the forgetful junior we often hear about. The things I don't care to do, they (both local and hosted/'frontier') can do: boilerplate, very well described edits, some research/reporting, etc. A lot is riding on 'acceptable'.

            It's easier to spawn another terminal pane or browser tab than to hire a contractor; I just don't find the 'frontier' services/terms compelling.

    • para_parolu a day ago

      There is very little vendor lock-in. We can keep using a subsidized model until it stops being subsidized, then switch to the next subsidized model.

    • wookmaster a day ago

      I've been trying to bring this up at work: you're putting all your intelligence into a service you don't own. What do you do when it's down, or when they quadruple the price?

    • rvz a day ago

      I keep telling them, and they still want to spend money on tokens at the Anthropic casino, even though Anthropic is egregiously price gouging and imposing usage caps so you end up spending more on tokens.

      Sometimes you can't help gamblers who want to keep betting tokens to hit the jackpot on a typical issue that could be handled by local models, or even by reading the documentation.

    • locusofself a day ago

      Are there local models that are anywhere near as good at coding as Opus 4.6?

      • jasonjmcghee a day ago

        People will insist otherwise, but I haven't seen anything close to Sonnet 4.6 that can be run locally.

        • Incipient a day ago

          I don't think anyone can honestly say a huge frontier model is actually going to be matched by something running locally on 64 GB.

          • jasonjmcghee a day ago

            I have read many comments claiming that the various ~30B Qwen3.5 models, the ~30B Gemma 4 models, and now Qwen3.6 are "better than Sonnet".

            I don't know how large Sonnet and Opus are, but the rumor is 1T and 5T parameters respectively.

          • urig a day ago

            You don't have to use the most recent bleeding edge model to succeed. A local FOSS coding agent coupled with a reasonably priced LLM could yield the optimal ROI.

      • kadoban a day ago

        Not really. Qwen 3.5, Gemma, and a couple of others are quite good though, and the quants are _very_ runnable on a good GPU.
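
        A minimal sketch of what "runnable on a good GPU" looks like in practice, assuming llama-cpp-python and a quantized GGUF you've already downloaded; the model filename here is a placeholder, not a specific recommendation.

        ```python
        # Minimal local-inference sketch with llama-cpp-python.
        # The model path is a placeholder: any quantized coding model in GGUF
        # format (e.g. a Qwen or Gemma quant from Hugging Face) works the same way.
        from llama_cpp import Llama

        llm = Llama(
            model_path="./qwen-coder-32b-q4_k_m.gguf",  # assumed local quant file
            n_gpu_layers=-1,  # offload every layer to the GPU
            n_ctx=8192,       # context window; raise it if VRAM allows
        )

        resp = llm.create_chat_completion(
            messages=[
                {"role": "system", "content": "You are a concise coding assistant."},
                {"role": "user", "content": "Write a Python function that reverses a linked list."},
            ],
            max_tokens=512,
        )

        print(resp["choices"][0]["message"]["content"])
        ```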

    • ratg13 a day ago

      This doesn’t affect existing users.

      This is a simple supply and demand curve.

      Higher demand means the price goes up. This has been true since before SaaS and before computers.

      • HumanOstrich a day ago

        Thanks for all the logical fallacies in one comment.

      • avgDev 20 hours ago

        No. It means eventually existing users will be affected too. These companies are deeply in the red.

    • jazz9k a day ago

      The 'local model' is called your brain.

      • mingus88 a day ago

        I’m sorry but that’s just dumb. An LLM is a tool. Your brain is not a substitute for an LLM in the same way your fingers are not a substitute for a wrench.

        The year is 2026, and if you are using your brain on chore work like one-off scripts, refactoring, or boilerplate test code, then you are wasting time and money, and I don't want to work with you.

        Local models are fine for this and can do it in a fraction of the time your brain would take to even get bootstrapped.

        • adithyassekhar a day ago

          The year is 2026 and the average RAM for the most common kind of developer machine (web) is 16 GB, with 8 GB at the lower end. Tell me which model one can run locally on that.

          • porridgeraisin a day ago

            You can use free models in opencode; the Minimax model, whatever it's called, is just free. It's more than enough for these tasks.

  • F7F7F7 a day ago

    He mentions Max as another place where they didn't properly predict plan pricing relative to usage. I'd bet the farm that it's the next to be 'A/B' tested.

  • soontoo a day ago

    This is to curb users like me who incrementally add Claude Code capacity by opening new accounts on $20 plans as the project ramps up.

  • ghstinda a day ago

    They've been folding under government pressure for months; I think they've lost control of their own company. They still have a nice writing voice, but I think Google will be the last man standing when this is all over.

  • muyuu 2 days ago

    Maybe those already on $20 a month plans won't be nerfed much more?

    It's yet another austerity move, pretty much in line with the recent ones.

    • onfir3 a day ago

      Maybe you should adjust your expectations for your $20 monthly subscription. That's far less than what it's worth.

      I've already burned through over $100 in a single day on per-token payment for heavy usage. I don't have any issues with Claude not working or being "nerfed". It just works.
