AI Can't Read an Investor Deck

(mercor.com)

1 point | by gmays 12 hours ago

2 comments

  • wxw 11 hours ago

    > We tested three frontier models: GPT-5.4, Gemini 3.1 Pro, and Claude Opus 4.6

    In what kind of harness, if any at all? A bare model API call and an agent would perform quite differently. People aren't thinking about regular-old ChatGPT taking jobs; they're thinking about Claude Code/Cowork.

  • Jet_Xu 12 hours ago

    Try DocMason; it was built for agentic analysis of messy office files.

    https://github.com/JetXu-LLM/DocMason

    Demo video: https://youtu.be/Sq3a5qxsLwM