Discussion about this post

PolisObserver Forecasting

Well, the whole time I was wondering what Tier 1, 2, 3, and 4 are.

It seems the post is more concerned with pure data description than with whether the reader will understand what all of these numbers mean.

JP

The bit about finding a 2011 preprint to shortcut a Tier 4 problem is remarkable. That's not brute force reasoning; that's something closer to research. I covered the broader benchmark picture and the FrontierMath results were one of the more surprising data points. The computer use numbers got the headlines but this deserves more attention. Full breakdown: https://reading.sh/gpt-5-4-just-dropped-heres-your-explainer-8fcc0126d84d?sk=ad5982c9f3b9382ff8fea9c32491a811
