8 comments

  • schnau_software 27 minutes ago

    I want to buy this book! But the price is too high. Can you offer some kind of HN discount?

    • apwheele 25 minutes ago

      You can use `LLMDEVS` for 50% off the epub (that was the coupon I sent to folks on my newsletter).

  • clemailacct1 an hour ago

    I’m always curious why local models aren’t being pushed more for certain types of data the person is handling. Data leakage to a 3rd party LLM is at the top of my list of concerns.

    • apwheele 18 minutes ago

      I am not as concerned about that with API usage as I am with the GUI tools.

      Most of the day gig is structured extraction and agents, which the foundation LLMs are much better at than any of the small models. (And I would not be able to provision the necessary compute for large models given our throughput.)

      I do have evaluating Textract vs the smaller OCR models on the to-do list though (in the book I show using docling, and there are others, like the newer GLM-OCR; a rough sketch is at the end of this comment). Our spend for that on AWS is large enough, and those models are small enough, that I could spin up sufficient resources to meet our demand.

      Part of the reason the book goes through examples with AWS/Google (in addition to OpenAI/Anthropic) is that I suspect many individuals will be stuck with whatever cloud provider their org uses out of the box. So I wanted to have coverage as wide as possible for those folks.
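
      Since docling came up, the basic pattern is just a couple of lines (a minimal sketch, not the exact code from the book; the file name is a placeholder):

      ```python
      # Minimal docling sketch: parse/OCR a PDF and export it as markdown.
      # "scanned_report.pdf" is a placeholder for your own document.
      from docling.document_converter import DocumentConverter

      converter = DocumentConverter()
      result = converter.convert("scanned_report.pdf")

      # Markdown is a convenient format to feed into downstream LLM prompts.
      print(result.document.export_to_markdown())
      ```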

    • iririririr 16 minutes ago

      but they claim your data is private and they will totally not share any of it with their advertising partners!

  • cranberryturkey an hour ago

    Biggest gap I see in most "LLM for practitioners" guides is they skip the evaluation piece. Getting a prompt working on 5 examples is easy — knowing if it actually generalizes across your domain is the hard part. Especially for analysts who are used to statistical rigor, the vibes-based evaluation most LLM tutorials teach feels deeply unsatisfying.

    Does this guide cover systematic eval at all?

    • apwheele an hour ago

      Totally agree it is critical. Each of chapters 4/5/6 has a specific section demonstrating testing. For structured outputs it goes through an example ground truth and calculating accuracy, with a demo comparing Haiku 3 vs 4.5 (rough sketches of all three checks are at the end of this comment).

      For Chapter 5 on RAG, it goes through precision/recall (with the emphasis typically on recall for RAG systems).

      For Chapter 6, I show a demo of LLM as a judge (using structured outputs to give it specific errors to look for) to evaluate a fuzzier objective (writing a report based on table output).
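
      To give a flavor of the Chapter 4 check, the accuracy calculation is conceptually no more than this (a simplified sketch with made-up fields and records, not the book's code):

      ```python
      # Sketch: score model-extracted fields against a hand-labeled ground truth.
      # Field names and records are made up for illustration.
      ground_truth = [
          {"doc_id": 1, "vendor": "Acme Corp", "amount": "103.20"},
          {"doc_id": 2, "vendor": "Globex",    "amount": "48.00"},
      ]
      model_output = [
          {"doc_id": 1, "vendor": "Acme Corp", "amount": "103.20"},
          {"doc_id": 2, "vendor": "Globex",    "amount": "84.00"},  # a miss
      ]

      fields = ["vendor", "amount"]
      n_correct = n_total = 0
      for truth, pred in zip(ground_truth, model_output):
          for f in fields:
              n_total += 1
              n_correct += int(truth[f] == pred[f])

      print(f"Field-level accuracy: {n_correct / n_total:.2%}")  # 75.00% here
      ```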
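
      The Chapter 5 retrieval metrics are similarly simple once you have labeled which documents are relevant for each test query (again just a sketch):

      ```python
      # Sketch: recall@k for one query, given hand-labeled relevant doc ids
      # and the ids the retriever returned (in ranked order).
      def recall_at_k(relevant_ids, retrieved_ids, k):
          top_k = set(retrieved_ids[:k])
          hits = sum(1 for doc_id in relevant_ids if doc_id in top_k)
          return hits / len(relevant_ids) if relevant_ids else 0.0

      # For RAG the emphasis is usually recall: did the needed chunk make it
      # into the context at all? A few extra chunks (lower precision) are
      # typically tolerable.
      print(recall_at_k({"d1", "d7"}, ["d7", "d3", "d9", "d1"], k=3))  # 0.5
      ```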
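
      And the Chapter 6 judge boils down to giving the judge model its own structured output schema. Here is a rough sketch using pydantic (the error categories, prompt, and inputs are illustrative, not the book's exact schema):

      ```python
      # Sketch: LLM-as-judge via a structured output schema. The error
      # categories, prompt, and inputs are illustrative placeholders.
      from pydantic import BaseModel

      table_text = "category, count\nA, 120\nB, 340"  # placeholder table
      report_text = "Category B (340) was most common, followed by A (120)."

      class JudgeResult(BaseModel):
          wrong_numbers: bool          # report cites figures not in the table
          unsupported_claims: bool     # statements the table cannot back up
          missing_key_findings: bool   # important rows/trends left out
          notes: str                   # short free-text rationale

      judge_prompt = (
          "You are grading a short report written from the table below.\n"
          "Flag each error type and explain briefly.\n\n"
          f"TABLE:\n{table_text}\n\nREPORT:\n{report_text}"
      )

      # Pass JudgeResult as the response schema to whichever provider SDK you
      # use (the major APIs all support some form of schema-constrained
      # output), then aggregate the boolean flags over a set of generated
      # reports.
      ```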

  • Schlagbohrer 2 hours ago

    thought it said Large Lagrange Models