9 points | by ninjahawk1 9 hours ago
2 comments
I remember the idea of "swear at the LLM to get better results" and I even think it somewhat worked, at least for a while.
This is probably how we'll end up with a HAL9000 burning the world to the ground.
Pretty please ask Claude to start with benchmarks that measure your approach against other approaches. I read it all and only found:
Selected numbers from live system runs:
Scenario | Naive shell approach | Hollow API | Savings
Code search | 21,636 tokens | 987 tokens | 95%
Agent drift (cons. rate) | 35% (cold start) | 70% (with handoff) | 2x
That is far less than enough to justify a git clone.
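For what it's worth, the quoted figures are at least internally consistent; a quick sanity check of the arithmetic (using only the numbers in the table above):

```python
# Sanity-check the claimed savings figures from the table above.
naive_tokens = 21636   # "Naive shell approach", code search
hollow_tokens = 987    # "Hollow API", code search

savings = 1 - hollow_tokens / naive_tokens
print(f"Token savings: {savings:.0%}")  # → Token savings: 95%

# Consistency-rate improvement for agent drift
cold_start = 0.35      # cold start
with_handoff = 0.70    # with handoff
print(f"Drift improvement: {with_handoff / cold_start:.0f}x")  # → Drift improvement: 2x
```

That only confirms the percentages match the raw numbers, not that the raw numbers mean anything without a comparison against other approaches.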