Discussion about this post

Parker Whitfill:

FYI, there is a typo: the passage "It’s entirely possible that we see no improvement. It’s perfectly plausible that AlphaProof gets 2-4 problems and LLMs get 1-2 problems. This would be consistent with no progress over prior capabilities, or could just be due to the problems being unusually hard for AI systems" appears twice.

Steeven:

What’s the point of the IMO if you’re not looking at the IMO problem set as given, but only at whether the LLM can solve the problems in a particular way? I also think mere grinding might be enough to solve open problems; even if the LLM isn’t creative in a way that would be particularly impressive, simply getting faster at grinding might not be a dead end.

5 more comments...