SmolLM2

(simonwillison.net)

84 points | by edward 3 hours ago

19 comments

  • echoangle an hour ago

    Semi-related, but is there a standard way to run this (or other models from Hugging Face) in a Docker container and interact with them through a web API? ChatGPT tells me to write my own FastAPI wrapper, which should work, but is there no pre-made solution for this?
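
    For reference, a minimal sketch of the FastAPI wrapper that route implies, assuming the transformers, fastapi, and uvicorn packages and the HuggingFaceTB/SmolLM2-1.7B-Instruct checkpoint; pre-built servers such as Hugging Face's text-generation-inference Docker image or vLLM cover the same use case without custom code.

        # Hypothetical minimal wrapper: serve SmolLM2 over HTTP with FastAPI.
        from fastapi import FastAPI
        from pydantic import BaseModel
        from transformers import AutoModelForCausalLM, AutoTokenizer

        MODEL_ID = "HuggingFaceTB/SmolLM2-1.7B-Instruct"  # assumed checkpoint
        tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
        model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

        app = FastAPI()

        class Prompt(BaseModel):
            text: str
            max_new_tokens: int = 128

        @app.post("/generate")
        def generate(prompt: Prompt):
            # Wrap the request in the chat template, generate, and strip the prompt tokens.
            messages = [{"role": "user", "content": prompt.text}]
            inputs = tokenizer.apply_chat_template(
                messages, add_generation_prompt=True, return_tensors="pt"
            )
            outputs = model.generate(inputs, max_new_tokens=prompt.max_new_tokens)
            reply = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
            return {"reply": reply}

    Served with "uvicorn app:app" from a Python base image, that gives the Docker-plus-web-API setup described above.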

  • jdthedisciple an hour ago

    Very interesting. According to their X posts, this meme model "SmolLM" beats Meta's new 1B and 3B models across almost all metrics.

    I wonder how this is possible, given that Meta has been in this game for much longer and probably has much more data at its disposal as well.

    • stavros 35 minutes ago

      Usually, that's because they use a groundbreaking ML method called TTDS, or "training on the test dataset".

  • cpa an hour ago

    What’s the context size? I couldn’t find it on the model summary page. Tangential: if it’s not on the model page, does it mean that it’s not that relevant here? If so, why?

    • cloudbonsai an hour ago

      > What’s the context size?

      SmolLM2 supports a context window of up to 8192 tokens.
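
      For what it's worth, a quick way to confirm that locally is to read it off the model config, assuming the transformers package and the HuggingFaceTB/SmolLM2-1.7B-Instruct model id (the other sizes can be checked the same way):

          # Sketch: print the context length advertised in the model's config.
          from transformers import AutoConfig

          config = AutoConfig.from_pretrained("HuggingFaceTB/SmolLM2-1.7B-Instruct")
          print(config.max_position_embeddings)  # should print 8192 per the figure above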

  • kgeist an hour ago

    Does it support anything other than English? Sadly, most open-weights models have no support for languages other than English, which makes them useless for the 75% of the world's population who don't speak English at all.

    Does anyone know of a good lightweight open-weights LLM that supports at least a few major languages (say, the official UN languages)?

  • oulipo an hour ago

    Nice! Do you think they could be fine-tuned to implement a cool thing like https://withaqua.com/, e.g. to teach them to do "inline edits" of what you say?

  • ksri an hour ago

    Is there a way to run this in the browser yet? Transformers.js doesn't seem to support it. Is there another way to run it in the browser?

  • Its_Padar an hour ago

    I wonder how one would fine-tune this.
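
    As a starting point, a rough sketch with TRL's SFTTrainer, assuming the HuggingFaceTB/SmolLM2-1.7B base checkpoint and the trl-lib/Capybara chat dataset as stand-ins; exact arguments vary between TRL versions, so treat this as illustrative rather than a recipe from the SmolLM2 release.

        # Hypothetical supervised fine-tuning run with TRL; model id, dataset,
        # and output directory are illustrative assumptions.
        from datasets import load_dataset
        from trl import SFTConfig, SFTTrainer

        dataset = load_dataset("trl-lib/Capybara", split="train")

        trainer = SFTTrainer(
            model="HuggingFaceTB/SmolLM2-1.7B",          # assumed base checkpoint
            train_dataset=dataset,
            args=SFTConfig(output_dir="smollm2-capybara-sft"),
        )
        trainer.train()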

  • forrestthewoods 42 minutes ago

    Is there a good, small model that can take image inputs? Or are those all still larger?