
Benchmark Gap #4: Once a single AI model solves >= 95% of miniF2F, MATH, and MMLU STEM, how many months will it be before an AI is listed as a (co) first author on a published math paper?
37 expected
This question is meant to measure the gap between solving the main math-based benchmarks at the time of market creation, and contributing to real world mathematics.
The (co) first author requirement is loose: I will also accept an AI being credited with significant contributions to both deciding what to prove and the actual proof. Merely contributing to the proof is not enough; I am trying to get at "the AI does the work of a mathematician", not "the AI does the work of a proof assistant". I would also accept, for instance, the human author of the paper saying that they would have named the AI as a coauthor if it were human, or that the result could not have been obtained without the AI's assistance.
This question is managed and resolved by Predita.
Related questions
Will AI be better than every human at proving Math theorems by the end of 2030?
37% chance
Will an AI achieve >85% performance on the FrontierMath benchmark before 2028?
66% chance
Benchmark Gap #5: Once a single AI model solves >= 95% of miniF2F, MATH, and MMLU STEM, will it be less than two years before AI models are used as entry-level data science / data analysis / statistics workers?
67% chance
Benchmark Gap #8: Once a single AI gets >= 80% on FrontierMath Tier 4, how long until an AI publishes a math paper?
17
Will an AI co-author a mathematics research paper published in a reputable journal before the end of 2026?
34% chance
Will an AI achieve >80% performance on the FrontierMath benchmark before 2027?
43% chance
Will an AI achieve >85% performance on the FrontierMath benchmark before 2027?
35% chance
Benchmark Gap #3: Once a model achieves superhuman performance on a competitive programming benchmark, will it be less than 2 years before there are "entry level" AI programmers in industry use?
78% chance
Which MATH-AI 23 works will have >50 Google Scholar citations by end of 2026?
Will AI contribute as much as a co-author would today to a real research mathematics paper before Jan 1 2028?
83% chance