4 Comments
Ben Schulz's avatar

How do I submit my solution?

Anatol Wegner, PhD's avatar

"Congratulations to Kevin Barreto and Liam Price, who first elicited a solution from GPT-5.4 Pro." So the breakthrough here is supposed to be that two engineers/mathematicians, together with the mathematician who formulated the problem, miraculously managed in their combined effort to find the right prompt/response sequence that makes GPT-5.4 Pro produce a solution to the problem. We are deep into Clever Hans territory, if you ask me.

Oscar Delaney's avatar

If several existing models were capable of solving this problem, why did we only learn this now? Are the scaffolding and prompting very important? I wonder if there are other problems in this dataset that current models can also solve. Have you checked?

Greg Burnham's avatar

The other models weren't able to one-shot it, which is all we had tried previously. We learned this as soon as we set up our scaffold. It's not so involved, and we'll open-source it once we've finished testing it on all problems. You can see a bit more here for now:

https://x.com/GregHBurnham/status/2036242043412848800?s=20