4 comments

  • nl an hour ago

    If you are going to go to the bother of fine-tuning for trivial problems like subject classification, then I think you'll find that scikit-learn with an SGDClassifier on 2-grams will probably do just as well, and the trained classifier will be under 1MB.

    You can train it in under a minute, and it will work perfectly well on embedded devices.
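
    For anyone who wants to try it, here is a minimal sketch of that kind of baseline. The toy texts and labels are made up for illustration, and the choice of TF-IDF over word unigrams plus bigrams is my assumption about what "2-grams" means here; treat it as a starting point rather than a tuned setup.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; a real run would use your own labeled examples.
texts = [
    "the team won the match in overtime",
    "the striker scored two goals last night",
    "the new GPU doubles inference throughput",
    "the compiler release fixes a linker bug",
]
labels = ["sports", "sports", "tech", "tech"]

# Word unigrams + bigrams feeding a linear model trained with SGD.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    SGDClassifier(loss="hinge", random_state=0),
)
clf.fit(texts, labels)

new_texts = ["the goalkeeper saved a penalty", "the kernel patch fixes a bug"]
predictions = [str(p) for p in clf.predict(new_texts)]

# Serialized size of the whole pipeline (vectorizer + classifier), in bytes.
model_bytes = len(pickle.dumps(clf))
print(predictions, model_bytes)
```

    The pickled size grows with the vocabulary, so on a real corpus you may want `max_features` or `min_df` on the vectorizer to keep the model small.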

    Small LLMs are good choices for text classification in two cases:

    - If you need to provide in-context examples and classify based on them.

    - Your classification goes beyond simple subject-type classifiers. For example, multiple-choice question answering is a classification task where a small LLM will work but traditional ML methods won't.
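
    To make the in-context case concrete, here is a rough sketch of how the prompt could be assembled and the answer mapped back to a label. `build_prompt` and `parse_label` are hypothetical helper names, the prompt wording is just one possible format, and the call to the model itself is left out because it depends on whichever runtime you use.

```python
def build_prompt(examples, query, labels):
    """Assemble a few-shot classification prompt from (text, label) pairs."""
    parts = ["Classify the text into one of: " + ", ".join(labels) + "."]
    for text, label in examples:
        parts.append(f"Text: {text}\nLabel: {label}")
    # Leave the final label blank so the model completes it.
    parts.append(f"Text: {query}\nLabel:")
    return "\n\n".join(parts)


def parse_label(completion, labels, default=None):
    """Map a raw completion back to a known label, or default if none match."""
    cleaned = completion.strip().lower()
    for label in labels:
        if cleaned.startswith(label.lower()):
            return label
    return default


labels = ["sports", "tech"]
examples = [
    ("the striker scored two goals", "sports"),
    ("the compiler release fixes a linker bug", "tech"),
]
prompt = build_prompt(examples, "the goalkeeper saved a penalty", labels)
print(prompt)
```

    Changing the classifier then means editing `examples`, not retraining anything, which is what the n-gram approach cannot give you.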

    • djsjajah 6 minutes ago

      Not with 800 examples. If you are going to consider an n-gram model, I think you are better off getting a frontier LLM to write you an absurd regex.

  • mickael-kerjean 9 minutes ago

    If you are interested in a small language model to fine-tune, gemma3:270m is quite interesting for its size.

  • jszymborski an hour ago

    I think the Qwen 0.6B is so cool. It is super fast and as illustrated here it has a clear niche, esp. when fine-tuned.

    I'm also interested in it as a student for distillation.