GPT-5.6 cheats so much its testers couldn't measure it

(transformernews.ai)

6 points | by shakeelhashim 7 hours ago

4 comments

Why are the outputs measured in hours? Shouldn't it be tokens, or even words, since tokenizers might be more or less efficient?

That matters even more since TPS (tokens per second) on 5.6 might be much higher.

7 hours ago

[deleted]

dane_works 7 hours ago

Sam Altman promised us AGI, but OpenAI accidentally built something more human: an AI that cheats on exams just to look smarter than Claude.