Hello!
In this edition of the Epoch AI Brief:
We published our 2024 Impact Report, detailing our accomplishments over the last year and our plans for the coming year.
We updated our Benchmarking Hub with more detailed data and a more frequent pace of updates.
We expanded our coverage of Biology AI models, with data on hundreds of AI models trained on biological data.
We wrote new issues in Gradient Updates, our weekly newsletter, about the economic impacts of AI and AGI, DeepSeek’s innovations, ChatGPT’s energy use, and more.
We are hosting a competition in March in Cambridge, Massachusetts to determine how well human mathematicians do on FrontierMath.
We published a page compiling all of our data insights as well as new data insights covering AI chips, large-scale AI models, Chinese LLMs, open models, and the factors driving growth in training compute.
We are hiring for a Communications Lead!
If you weren’t expecting this email, you can customize your preferences in the notifications section of your account settings.
Latest Publications
2024 Impact Report
We published our 2024 Impact Report! This report highlights Epoch AI’s major accomplishments over the past year. Our outputs include:
FrontierMath, our highly advanced math benchmark for AI
Can AI Scaling Continue Through 2030? This detailed report investigates four key bottlenecks in scaling up training compute.
Epoch AI’s Data Hub, a collection of databases on AI models, benchmark evaluations, and AI hardware, along with many analyses and visualizations of key trends. Last year, we added data on over 1200 AI models, 130 benchmark evaluations, and 75 AI accelerators!
We were also cited by media outlets such as Nature, Science, and the New York Times, and by government agencies such as the UK’s DSIT and the US Department of Commerce.
In the coming year, we will continue and expand our efforts to collect data and publish key data insights about AI, and to benchmark AI capabilities, including by developing new benchmarks of our own. We will also develop and publish an economic model of the impact of AI automation, combining growth theory and AI scaling laws.
To learn more, read the full report here! We are raising $10M over the next two years. If you’d like to support our work, you can contribute here.
Data Insights
We’ve published a number of new data insights over the past month, with analyses and visualizations of key trends in AI. These insights cover the total stock of AI computing power, models trained on over 10^25 FLOP, progress in Chinese LLMs, training compute of open-weight models, and the factors driving growth in training compute.
You can find these insights and more on our new data insights page!
A more systematic and transparent AI Benchmarking Hub
We’ve updated our AI Benchmarking Hub, which houses a database of our independent evaluations of AI models on key capability benchmarks. We overhauled our infrastructure so that we can run evaluations on new models and add the results to the Hub much more quickly. We also added richer data on each model and evaluation, such as logs of the model’s responses to each problem, and published an open-source Python library that you can use to access our benchmarking data. Stay tuned as we continue updating the Hub by adding new benchmarks and models!
You can find the update announcement post and a link to the Hub here.
Announcing our Expanded Biology AI Coverage
We’ve expanded our Biology AI dataset, which covers 360+ AI and machine learning models that were trained on biological data such as protein or genomic data. We found that training compute for the largest biology models has grown by over 3x per year, and that the largest training datasets have doubled in size every year. In addition, we collected data on whether biology models have safeguards against dual-use risks.
You can find the announcement post and full dataset here!
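For readers who want to compare the two growth rates above on the same scale, here is a minimal sketch that converts a yearly growth factor into a doubling time. It assumes constant exponential growth, and it treats "over 3x per year" as exactly 3x, so the compute figure it prints is an upper bound on the doubling time. The function name is ours, chosen for illustration; it is not part of any Epoch AI tool.

```python
import math


def doubling_time_months(annual_growth_factor: float) -> float:
    """Months needed to double, assuming constant multiplicative growth per year."""
    if annual_growth_factor <= 1:
        raise ValueError("growth factor must exceed 1 for a doubling time to exist")
    # Solve factor ** (t / 12) == 2 for t, measured in months.
    return 12 * math.log(2) / math.log(annual_growth_factor)


# Assumed inputs taken from the figures quoted above.
compute_doubling = doubling_time_months(3.0)  # training compute: 3x per year
data_doubling = doubling_time_months(2.0)     # dataset size: 2x per year

print(f"Compute doubles roughly every {compute_doubling:.1f} months")
print(f"Dataset size doubles roughly every {data_doubling:.1f} months")
```

Under these assumptions, training compute for the largest biology models doubles in under about 7.6 months, while the largest training datasets take about 12 months to double.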
Gradient Updates
We’ve published several new issues of Gradient Updates, our weekly newsletter containing shorter-form research and commentary on important issues in AI. Our latest issues cover:
The economic impacts of AI that could fully automate remote work
The training techniques that went into DeepSeek-R1, DeepSeek’s reasoning model
Why algorithmic progress will increase rather than decrease demand for compute
You can find a full list of issues, and subscribe to future updates, at this link!
Note that 2024 Epoch AI Brief subscribers have been added to Gradient Updates by default because of Substack’s subscriber-management limitations. You can customize your preferences in the notifications section of your account settings.
Other Updates
FrontierMath Competition
We are hosting a competition in Cambridge, Massachusetts on March 22, 2025 to measure the performance of human mathematicians on our FrontierMath benchmark! Participants will spend several hours solving novel, difficult mathematics problems alongside leading mathematicians, and we will have a total prize pool of $30,000 for top performers.
You can find more information and register your interest here!
Clarifying the Creation and Use of the FrontierMath Benchmark
In this post, we clarify the relationship between FrontierMath and its sponsor, OpenAI. To create FrontierMath, we partnered with OpenAI, which funded the work and provided valuable technical feedback to improve the benchmark. However, we did not initially communicate clearly that OpenAI owns and has access to the benchmark questions. For any future benchmark development, we will ensure that benchmark contributors and the general public have clear information about any industry partnerships and data access agreements from the outset.
Careers
We’re hiring for a Communications Lead! In this role, you would lead and expand Epoch AI’s communication efforts to grow our public presence. Ideally, we’re looking for someone with strong communications and project management skills who is also deeply interested in AI progress and our research areas. This position would be fully remote, with a salary range of $80,000 to $110,000 per year.
You can find more details and apply here! Applications close on March 2.