3 comments

  • harshnigam 2 days ago

    It doesn't seem to take GPU performance into account when showing the estimates: the H100 and A100 come out the same. Am I doing something wrong?

  • erans 3 days ago

    I also added a Mac version, https://selfhostllm.org/mac/, so you can see which models you can run on your Mac and get an estimated tokens/sec.

  • atmanactive 2 days ago

    Very useful, thanks. I'm missing a reset button though.