Mar 23

AI has solved one of the problems in FrontierMath: Open Problems, our benchmark of real research problems that mathematicians have tried and failed to solve.

4 Comments

Ben Schulz

Mar 25

How do I submit my solution?

Anatol Wegner, PhD

Mar 25

"Congratulations to Kevin Barreto and Liam Price, who first elicited a solution from GPT-5.4 Pro" so the breakthrough here is indeed supposed to be that a group of two engineers/mathematicians together with the mathematician who formulated the problem in their combined effort miraculously managed to find the right prompt/response sequence that makes GPT-5.4Pro produce a solution to the problem - we are deep into Clever Hans territory if you ask me.

Oscar Delaney

Mar 24

If several existing models were capable of solving this problem, why did we only learn this now? Are the scaffolding and prompting very important? I wonder if there are other problems in this dataset that current models can also solve. Have you checked?

Reply (1)

Greg Burnham

The other models weren't able to one-shot it, which is all we had tried previously. We learned this as soon as we set up our scaffold. It's not so involved, and we'll open-source it once we've finished testing it on all problems. You can see a bit more here for now:

https://x.com/GregHBurnham/status/2036242043412848800?s=20

Epoch AI

First AI solution on FrontierMath: Open…