GPT-5.1 for Developers

(openai.com)

45 points | by tedsanders 4 hours ago

11 comments

  • dweekly an hour ago

    A few hours of playing around and I'm suitably impressed.

    Claude 4.5 Sonnet definitely struggles with Swift 6.2 Concurrency semantics and has several times gotten itself stuck rather badly. Additionally Claude Code has developed a number of bugs, including rapidly re-scrolling the terminal buffer, pegging local CPU to 100%, and consuming vast amounts of RAM. Codex CLI was woefully behind a few months ago and, despite overly conservative out-of-the-box sandbox settings, has quite caught up to Claude Code. (Gemini CLI is an altogether embarrassing experience, but Google did just put a solid PM behind it and 3.0 Pro should be out this month if we're lucky.)

    Codex with 5.1 high managed to thoughtfully paw through the documentation and source code and - with a little help pulling down parts of the Swift Book - managed to correctly resolve the issue.

    I remember that getting the thread manager right was one of the harder parts of my operating systems course as a computer science undergrad; testing threaded programs has always been a challenge. It's a strange circle-of-life moment to realize that what was hard for undergrads also serves as a benchmark for coding agents!

  • __jl__ an hour ago

    The prompt caching change is awesome for any agent. Claude is far behind, with increased costs for caching and manual caching checkpoints. It certainly depends on your application, but prompt caching is also ignored in a lot of cost comparisons.

    • pants2 41 minutes ago

      Though to be fair, thinking tokens are also ignored in a lot of cost comparisons, and in my experience Claude generally uses fewer thinking tokens for the same intelligence.

  • miohtama an hour ago

    > On coding, we’ve worked closely with startups like Cursor, Cognition, Augment Code, Factory, and Warp to improve GPT‑5.1’s coding personality, steerability, and code quality.

    Why no GitHub?

    • conception 8 minutes ago

      Microsoft isn’t a startup, and I suspect OpenAI is working closely with Microsoft already.

  • felixbraun an hour ago

    Already live in Cursor btw

  • kevinkatzke an hour ago

    This got only a single comment and 34 points in 3 hours. Crazy how the dynamics have changed around model releases in just a single year.

    • throwup238 an hour ago

      There was already an announcement post for 5.1 yesterday: https://news.ycombinator.com/item?id=45904551

    • observationist 16 minutes ago

      This is the first low-key, silent feature rollout, treated like "just another software update", with no hype or buzz beforehand. Prior to this point, every other feature release was pumped for weeks or even months with "leaks" from insiders and deliberately getting people amped. I don't know if OpenAI changed marketing tactics, or if they're in a new chapter in some book, but this is a radical shift from what they were doing before.

    • amelius an hour ago

      More of the same, I suppose.

      You have to be called Apple to get rave reviews for that.