Advancing finance with Claude Opus 4.6

(claude.com)

131 points | by da_grift_shift 7 hours ago

19 comments

  • typpo 6 hours ago

    Lately my company has been doing a lot of complex accounting and reporting in spreadsheets. Overall was surprised by how well both GPT and Claude handled some of these extremely tedious tasks. Not uncommon to have an hours-long task compressed to minutes.

    My anecdotal experience is GPT 5.2 Pro is decently ahead of Claude Opus 4.5 in this category when it gets to the tricky stuff, both in presentation and accuracy. The long reasoning seems to help a lot. But, apparently the benchmarks do not agree.

    Edit - noticed OpenAI specifically focuses on finance use cases in their gpt-5.3-codex blog as well https://openai.com/index/introducing-gpt-5-3-codex/

    • someuser54541 3 hours ago

      I feel like I'd be really skeptical of results from a non-deterministic model for something as precise as accounting....

      • nl an hour ago

        The deterministic part (calculations) is done by Excel.

        The non-deterministic part is turning human instructions ("calculate the NPV over 10 years for X given Y") into Excel.

        This is already a non-deterministic process (humans are non-deterministic!). The question is if an AI model can be more reliable than humans, and I can't see any reason why it wouldn't be.

        The correct path is pretty clear, so the logits for following that path are going to be a long way from off-path.

        For something like this the real problem is training the model to use Excel (failures will show up as it being confused about which sheet it is on, trying to use the wrong window, or things like that), not the non-determinism.
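
        To make the split concrete, here is a minimal sketch of the deterministic half: once the instruction has been turned into a rate and a list of cash flows, the NPV arithmetic is fixed. The function names and the sample figures (a 1000 outlay, ten annual inflows of 150, an 8% rate) are made up for illustration and do not come from the article. One detail a translation step can get wrong is timing: Excel's NPV function treats its first value as arriving at the end of period 1, so an up-front outlay has to be added outside the function.

```python
def npv(rate, cashflows):
    """Net present value where cashflows[0] occurs now (t = 0)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))


def npv_end_of_period(rate, cashflows):
    """Same convention as Excel's NPV(): the first flow is discounted one period."""
    return sum(cf / (1 + rate) ** (t + 1) for t, cf in enumerate(cashflows))


# Hypothetical inputs: outlay today, then ten equal annual inflows.
rate = 0.08
outlay = -1000.0
inflows = [150.0] * 10

# Two equivalent ways to express the same project.
direct = npv(rate, [outlay] + inflows)
excel_style = outlay + npv_end_of_period(rate, inflows)

print(round(direct, 2), round(excel_style, 2))
```

        Both expressions give the same number; passing the outlay into the end-of-period version as its first value would discount it by a year and quietly understate the cost.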

      • aaaalone 2 hours ago

        With tool use you do reduce the risks.

        It's not like these models calculate.

      • purplerabbit 2 hours ago

        I know plenty of nondeterministic accountants

      • dr_dshiv 3 hours ago

        Yeah, seriously, I use AI all day every day but that terrifies me.

    • belter 6 hours ago

      Don't use Excel for accounting....

  • bovermyer 7 hours ago

    Based on the article... is this basically just making Claude better at formatting and data presentation, or does it also get better at analysis? I get the impression it's the former.

  • Havoc 3 hours ago

    And then you hand it to your boss who takes a 20 second look at it and asks why you made a projection that assumes massive revenue growth and 3 years of perfectly flat utilities, insurance, G&A - no inflation etc.

    It does look really promising as a skeleton starting point though. Like generate it, delete numbers and populate by hand.

    Not unlike the boilerplate start we saw in AI coding a couple years back
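
    The 20 second look can be partly automated. A rough sketch, assuming the generated projection has been exported as a mapping from line item to yearly values; the function name and all figures are invented for illustration:

```python
def flat_line_items(projection):
    """Return the line items whose projected values never change year to year."""
    return [
        name
        for name, values in projection.items()
        if len(values) > 1 and len(set(values)) == 1
    ]


# Hypothetical three-year projection of the kind described above.
projection = {
    "revenue": [1000.0, 1400.0, 1960.0],
    "utilities": [50.0, 50.0, 50.0],
    "insurance": [30.0, 30.0, 30.0],
    "g_and_a": [200.0, 200.0, 200.0],
}

print(flat_line_items(projection))  # ['utilities', 'insurance', 'g_and_a']
```

    This only catches perfectly flat lines; whether the revenue growth is plausible still needs a human.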

  • warabe 7 hours ago

    It's time to sell hedge fund stocks!! Jokes aside, I took the CFA exam last week and now I'm starting to worry about my career...

    • sdf2erf 6 hours ago

      I wouldn't worry. Unless you are just memorising stuff and don't actually understand anything - then you should.

    • Ntrails 4 hours ago

      You'll use a ton of AI but it won't wipe the humans out. In the end you'll have a compositional change, likely nothing catastrophic imo. In part because there is a buck to stop and Claude ain't got no hands...

  • eggsby 6 hours ago

    Article did not load on my tablet 😅

  • TacticalCoder an hour ago

    > The side-by-side outputs below show how output quality has improved from Claude Opus 4.5 to Opus 4.6.

    Disclaimer: I use AI to code (and I code for finance) and I love Anthropic.

    But: for f-ck's sake, I cannot click on the picture and have it show up in full. It stays at its tiny size, impossible to read the numbers. I had to right-click and "open in a new tab".

    AI is, somehow, definitely still not fully there yet.

  • henning 4 hours ago

    Their chart only goes up to 70.

    • layer8 4 hours ago

      At least it starts at zero.

  • behnamoh 6 hours ago

    Anthropic does anything to keep the Claude hype going; from fearmongering ("AI bad, need government regulations") to wishful thinking ("90% of code will be written by AI by the end of 2025" —Dario) to using Claude in applications it has no business being in (Cowork, accessing all your files, what could go wrong?) to releasing "research" papers every now and then to show how their AI "almost got out" and they stopped it (again, to show their models are "just that good") to prescribing what the society should do to adapt to the new reality to doing worthless surveys on "how AI is reshaping economy, but mostly our AI not others".