4 comments

  • imrozim 19 hours ago

    Interesting approach building this in Rust; the latency argument makes sense for something sitting inline with every LLM request. Curious how it handles multi-turn context, where injection might be spread across messages rather than a single prompt.

  • zippode 3 days ago

    One thing I forgot: see the Isartor benchmarks and the deflection rate for reducing LLM tokens: https://github.com/isartor-ai/Isartor/tree/main/benchmarks
