Launch HN: Captain (YC W26) – Automated RAG for Files

(runcaptain.com)

24 points | by CMLewis 2 hours ago

6 comments

vg_head an hour ago

Good looking! I didn't get to watch the video or look at docs in depth, but do the results trace back to the location of the answers in a document? Let's say it finds an answer in a PDF, and I'd like to know where in that PDF the citation is. Is that possible or intended?

CMLewis an hour ago

Great question. We have deterministic page-number citations for PDF results, and exact bounding-box citations are coming very soon.

If you want to check out the Query API response example, here's a link: https://docs.runcaptain.com/api-reference/query/collection-v...

mchusma 40 minutes ago

Having tried this a bit, I really like the single API call for all of it.

I also appreciate the transparent pricing, but I don't have a good sense of the scale of costs. It could be helpful to give some ballpark figures for each of the plans; I'm not sure exactly what I could get out of a plan. My guess, after trying hard to figure it out, is that if I had about 1,000 pages of new/updated content per month, I would pay $295/month with unlimited queries on top of it. Is that roughly correct?

jamiequint an hour ago

This is cool, like qmd as a service with real-time integrations where it matters?

How do you handle more structured data like CSV/XLSX/JSON? It would be cool if it were possible to auto-process links into Markdown (e.g. YouTube, podcasts, arbitrary websites, etc.) a la https://github.com/steipete/summarize (which can pull full text in addition to summarizing).

CMLewis an hour ago

Thanks, we're just starting to optimize for semi-structured data. So far, we've been parsing tables into Markdown and running them through the contextualized embedding model with no overlap, taking advantage of how it strings chunks together. This isn't great for big files, so we're looking into agentic exploration (slow, but good for more structured numerical data) and automated graph creation (promising for more relational data).

Love the auto-process markdown idea, we'll add it to our roadmap :D
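
For readers wondering what "parsing tables into Markdown" with "no overlap" might look like, here is a minimal sketch. It is an illustration only, not Captain's implementation: the function name `table_to_markdown_chunks`, the `rows_per_chunk` parameter, and the choice to repeat the header row in each chunk are all assumptions made for the example. The embedding step is left out entirely.

```python
import csv
import io


def table_to_markdown_chunks(csv_text, rows_per_chunk=50):
    """Split a CSV table into Markdown table chunks with disjoint rows.

    Hypothetical helper: each data row appears in exactly one chunk
    (no overlap). The header is repeated per chunk so each chunk is
    readable on its own; that repetition is an assumption of this sketch.
    """
    # Skip completely blank lines, which csv.reader yields as empty lists.
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
    if not rows:
        return []
    header, body = rows[0], rows[1:]

    def fmt(cells):
        # Escape pipes so cell content cannot break the Markdown table.
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"

    head = [fmt(header), "| " + " | ".join("---" for _ in header) + " |"]

    chunks = []
    for start in range(0, len(body), rows_per_chunk):
        block = body[start:start + rows_per_chunk]
        chunks.append("\n".join(head + [fmt(r) for r in block]))
    return chunks


if __name__ == "__main__":
    for chunk in table_to_markdown_chunks("a,b\n1,2\n3,4\n5,6\n", rows_per_chunk=2):
        print(chunk)
        print()
```

With fixed-size row blocks like this, the number of chunks grows linearly with the number of rows, which is one plausible reason the approach gets unwieldy for big files.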

jzig an hour ago

> spotty RAG