Zork-bench: An LLM reasoning eval based on text adventure games

(lowimpactfruit.com)

2 points | by nicholasjbs 4 hours ago

No comments yet.