Reducing TTFT by CPUMaxxing Tokenization

(crusoe.ai)

4 points | by AlonKejzman 6 hours ago

4 comments

  • 6 hours ago
    [deleted]
  • AlonKejzman 6 hours ago

    I am one of the researchers who worked on this, and I would love to hear your opinions.

  • h011yM011y 6 hours ago

    Does it work on Qwen3.5?

    • AlonKejzman 6 hours ago

      Of course! It actually works out of the box, thanks to its generic design.