Show HN: PreCog AI – Automatic AI Model Selection for Any Task

(precog.ubik.studio)

61 points | by ieuanking 9 months ago

18 comments

Do you find that there are a lot of variations and intricacies in deciding which LLM to use? I find it pretty simple to just assume Sonnet is best at most coding jobs, o1 is best at complicated non-coding tasks, and 4o is fine for simple questions that don't require planning. And well, tbh, that's it; no other LLMs are that interesting.

ieuanking 9 months ago

We found that there is more nuance when it comes to different languages. We also update our leaderboard as new models drop, so the best models are always available through PreCog. It's also nice to not have to tab switch constantly between chatbots and instead just use them in one central place. We added a feature so you can just pick the model you'd like to chat with if you don't want to be automatically matched using the leaderboard rankings (all matched responses explain the reasoning behind the match so you can see why the model was chosen).
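For readers who want a concrete picture of the matching described above, here is a minimal sketch, assuming a per-category leaderboard, an optional manual override, and a short explanation returned with every match. The `LEADERBOARD` data, the `Match` type, and `match_model` are hypothetical names made up for illustration; this is not PreCog's actual code or rankings.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical leaderboard: task category -> models ordered best-first.
# The model names and orderings are placeholders, not real rankings.
LEADERBOARD = {
    "coding": ["model-a", "model-b"],
    "writing": ["model-b", "model-c"],
}


@dataclass
class Match:
    model: str
    reason: str  # human-readable explanation of why this model was chosen


def match_model(category: str, override: Optional[str] = None) -> Match:
    """Return the top-ranked model for a category, unless the user picked one."""
    if override is not None:
        return Match(override, "selected manually by the user")
    ranked = LEADERBOARD.get(category)
    if not ranked:
        raise KeyError(f"no leaderboard for category {category!r}")
    return Match(ranked[0], f"ranked #1 on the {category} leaderboard")
```

Calling `match_model("coding")` returns the top entry along with its reason, while `match_model("coding", override="model-c")` skips the leaderboard entirely.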

cma 9 months ago

I think o1 is best at algorithm-heavy coding tasks, opus at API/language-translation tasks that need breadth of knowledge (it also has better recency via a later knowledge cutoff). But with the latest update opus is pretty close at algorithm-heavy coding.

ieuanking 9 months ago

We have plans to put o1 on the leaderboard - but Opus is there RN! We should have a new Elo ranking soon, and the second the o1 rankings are done it will be on our leaderboard.

swyx 9 months ago

you're doing model routing - any thoughts on https://github.com/lm-sys/RouteLLM and Martian?

hope you don't raise funding before you figure out what they haven't

ieuanking 9 months ago

We have looked at RouteLLM and played with it, but we felt like they were focusing on cost minimization. We think there's more work to be done on routing for the best possible output without the cost constraint. We're also just trying to provide easy ways for people to use tools like RouteLLM that don't require coding knowledge. That being said, we def wanna release a benchmark of our routing in comparison to some of the pre-trained defaults in RouteLLM.
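To make the distinction between the two routing goals concrete, here is a minimal sketch, assuming each candidate model has a single quality score and a per-request cost. The `Candidate` type, the `route` function, and the numbers in `MODELS` are all hypothetical; this is neither RouteLLM's API or algorithm nor PreCog's.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    name: str
    quality: float  # hypothetical score, higher is better
    cost: float     # hypothetical price per request


def route(candidates: Sequence[Candidate], budget: Optional[float] = None) -> Candidate:
    """Pick the highest-quality candidate.

    With budget=None this is quality-first routing (no cost constraint).
    With a budget, only candidates whose cost fits are considered.
    """
    pool = [c for c in candidates if budget is None or c.cost <= budget]
    if not pool:
        raise ValueError("no candidate fits the budget")
    return max(pool, key=lambda c: c.quality)


# Placeholder models with made-up scores and prices.
MODELS = [Candidate("large", 0.9, 10.0), Candidate("small", 0.7, 1.0)]
```

With these placeholder values, `route(MODELS)` picks `large`, while `route(MODELS, budget=2.0)` falls back to `small`; the only difference between the two modes is whether the cost filter is applied.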

swyx 8 months ago

cool. it's a very small distinction in my mind to flip from one to the other. all the best.

dvfjsdhgfv 9 months ago

I entered the query but was redirected to a login form. If you are honestly looking for feedback and not leads, unblock the app temporarily for HN. If you are after leads, you will surely get some if this submission receives enough upvotes, but I wouldn't count on many. These days people are not so keen on leaving their data on random websites anymore.

ieuanking 9 months ago

Just took down the sign-up for anyone who wants to try it out! Thanks again for pointing that out - new to posting on here, super helpful. Hope you get some use out of our project.

ieuanking 9 months ago

Thanks so much for the feedback (new here for sure) - we are working on that rn, updating as fast as possible, gotta rebuild some stuff lmao

okintheory 9 months ago

Ah yes, Minority Report, that story we're so eager to repeat.

ieuanking 9 months ago

fortunately we aren't policing anybody :* (we do love PKD tho) we thought precog perfectly described the function

JasonSage 9 months ago

I think it’s a great name, it really does describe the function perfectly.

I got a huge smile when I saw it.

ieuanking 9 months ago

tysm - I got a huge smile reading this comment.

frmersdog 8 months ago

That doesn't really change the fact that it's exhausting (and worse, "commercially offputting") to be reminded that we're careening towards the worst futures literally imagined. I stayed away from Soylent and I'll probably stay away from this, but thanks for the heads-up. rimshot

ieuanking 8 months ago

As big PKD fans, that definitely flew over our heads a bit. Can def understand that view and why it is commercially exhausting, especially because we agree that we are heading toward some of the worst futures possible; so did PKD. We definitely build with this in mind!

KingFelix 8 months ago

PKD is rad, my username is also a PKD reference. Love the Ubik studio as well!

ieuanking 8 months ago

Truly the GOAT. Love the username, love cyphers ;) So glad you liked the Ubik Studio!