What makes these domain-specific models work when we don’t have good domain models for health care, chemistry, economics and so on?
>we don’t have good domain models for health care, chemistry, economics and so on
Who says we don't?
Examples please?
No, it's really simple to search for domain specific models being used "in production" all over the place
I didn’t find a single one that outperforms a general model.
OK, AlphaFold.
It’s not a large language model.
Can someone explain what one might use this model for? As a developer with a casual interest in biology it would be fun to play with but honestly not sure what I would do
You can get your feet wet with genetic engineering for surprisingly little money.
This guy shows a lot of how it's done: https://www.youtube.com/@thethoughtemporium
Basically you can design/edit/inject custom genes into things and see real results for somewhere in the range of $100-$1000.
Is there something like this in text/readable format?
> In Progress: CodonJEPA
JEPA is going to break the whole industry :D
Can you explain this? I haven't heard of JEPA, and from a quick search it seems to be vision/robotics based?
It’s a self-supervised learning architecture, and it’s pretty much universal. The loss function operates on embeddings rather than on the raw inputs, and there are other smart architectural choices throughout. Worth diving into for a few hours; Yann LeCun gives some interesting talks about it.
https://openreview.net/pdf?id=BZ5a1r-kVsf
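To make "the loss function runs on embeddings" concrete, here is a rough toy sketch in plain Python. It is not taken from the linked paper or from any JEPA codebase: the one-layer `encode`, the linear `predict`, and the momentum-averaged target encoder are all simplifying assumptions, and every name is made up for illustration. The only point is that the error is measured between a predicted embedding and a target embedding, never against the raw input.

```python
import math
import random

def encode(x, w):
    """Toy encoder (assumption): one linear layer followed by tanh."""
    return [math.tanh(sum(x[i] * w[i][j] for i in range(len(x))))
            for j in range(len(w[0]))]

def predict(s, p):
    """Toy predictor (assumption): a single linear map between embeddings."""
    return [sum(s[i] * p[i][j] for i in range(len(s)))
            for j in range(len(p[0]))]

def jepa_loss(x_context, x_target, w_ctx, w_tgt, w_pred):
    """Mean squared error between predicted and actual target embeddings."""
    s_ctx = encode(x_context, w_ctx)   # embedding of the visible part
    s_tgt = encode(x_target, w_tgt)    # embedding of the hidden part
    s_hat = predict(s_ctx, w_pred)     # guess the target embedding from context
    return sum((a - b) ** 2 for a, b in zip(s_hat, s_tgt)) / len(s_tgt)

def ema_update(w_tgt, w_ctx, momentum=0.99):
    """Assumption: the target encoder is a slow moving average of the context encoder."""
    return [[momentum * t + (1.0 - momentum) * c for t, c in zip(rt, rc)]
            for rt, rc in zip(w_tgt, w_ctx)]

random.seed(0)
dim_in, dim_emb = 6, 3
w_ctx = [[random.gauss(0, 0.1) for _ in range(dim_emb)] for _ in range(dim_in)]
w_tgt = [row[:] for row in w_ctx]
w_pred = [[1.0 if i == j else 0.0 for j in range(dim_emb)] for i in range(dim_emb)]

x_context = [random.gauss(0, 1) for _ in range(dim_in)]  # e.g. the unmasked part
x_target = [random.gauss(0, 1) for _ in range(dim_in)]   # e.g. the masked part
loss = jepa_loss(x_context, x_target, w_ctx, w_tgt, w_pred)
print(loss)
```

In a real setup the encoders would be deep networks trained by gradient descent, with no gradient flowing through the target branch; this sketch only shows the forward pass and where the loss lives.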
full article: https://huggingface.co/blog/OpenMed/training-mrna-models-25-...
What makes this dataset or problem worth solving compared to other health datasets? Would the results on this task be broadly useful to health?
What other "datasets" are you talking about? How do you "solve a dataset"?
Distributing the load on this will probably be infinitely more useful than Folding@home.
gray goo of the future