OpenAI's In-House Data Agent

(openai.com)

23 points | by meetpateltech 2 hours ago

5 comments

  • maxchehab 8 minutes ago

    Trust is the hardest part to scale here.

    We're building something similar and found that no matter how good the agent loop is, you still need "canonical metrics" that are human-curated. Otherwise non-technical users (marketing, product managers) are playing a guessing game with high-stakes decisions, and they can't verify the SQL themselves.

    Our approach:

    1. We control the data pipeline and work with a discrete set of data sources where schemas are consistent across customers
    2. We benchmark extensively so the agent uses a verified metric when one exists, falls back to raw SQL when it doesn't, and captures those gaps as "opportunities" for human review

    Over time, most queries hit canonical metrics. The agent becomes less of a SQL generator and more of a smart router from user intent -> verified metric.
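
    A minimal sketch of the routing idea described above, not the actual implementation. Every name here (`MetricRouter`, `Route`, the `canonical` registry, the `generate_sql` callback) is hypothetical, and it assumes the agent has already mapped the user's question to a metric name:

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Route:
    kind: str  # "canonical" or "raw_sql"
    sql: str


@dataclass
class MetricRouter:
    # Hypothetical registry: metric name -> human-reviewed SQL.
    canonical: dict[str, str]
    # Questions with no verified metric, queued for human review.
    opportunities: list[str] = field(default_factory=list)

    def route(
        self,
        question: str,
        intent: str,
        generate_sql: Callable[[str], str],
    ) -> Route:
        """Prefer a verified metric; otherwise fall back to generated SQL."""
        if intent in self.canonical:
            return Route("canonical", self.canonical[intent])
        # No verified metric exists: record the gap, then fall back.
        self.opportunities.append(question)
        return Route("raw_sql", generate_sql(question))
```

    The fallback path is the only place free-form SQL is produced, so the `opportunities` list doubles as a backlog of metrics a human might want to curate next.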

    The "Moving fast without breaking trust" section resonates. Their eval system with golden SQL is essentially the same insight: you need ground truth to catch drift.
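
    One way a golden-SQL check could work, as a rough sketch only: run the agent's query and the human-written golden query against the same data and compare result sets. This is an assumption about the general technique, not a description of OpenAI's eval system; the `events` table, `demo_connection`, and `matches_golden` are made up for illustration:

```python
import sqlite3


def matches_golden(conn: sqlite3.Connection, candidate_sql: str, golden_sql: str) -> bool:
    """Return True if both queries yield the same rows, ignoring row order."""
    candidate = sorted(conn.execute(candidate_sql).fetchall(), key=repr)
    golden = sorted(conn.execute(golden_sql).fetchall(), key=repr)
    return candidate == golden


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory dataset (hypothetical schema) for the check."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INTEGER, day TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [(1, "2024-01-01"), (1, "2024-01-02"), (2, "2024-01-01")],
    )
    return conn
```

    Comparing results rather than SQL text means two differently written queries can both pass, while a query that silently counts events instead of distinct users is flagged.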

    Wrote about the tradeoffs here: https://www.graphed.com/blog/update-2

  • sjsishah 36 minutes ago

    Given my personal experience with various BI systems I think an AI agent like this is the perfect use case. These systems are operating on multiple layers of being wrong as is - layer 1 being your query is likely wrong, layer 2 being how you interpret the data is likely wrong.

    Mix them together and you’re already deep in make believe land, so letting AI take over step 1 seems like a perfect fit.

    I was hoping to read this article and be surprised by how OpenAI was able to solve the reliability problem, but alas.

  • 0xferruccio an hour ago

    At Amplitude we built Moda which is super similar to this.

    Our chief engineer Wade gave an awesome demo to Claire Vo some months back here: https://www.youtube.com/watch?v=9Q9Yrj2RTkg

    I use this basically every day asking all sorts of questions

  • spiderfarmer 3 minutes ago

    I'm more interested in Kimi's In-House Data Agent

  • htrp 10 minutes ago

    data problems are not tech problems but rather org problems