Does hidden interpolation explain >25% of AI progress? | Axel




Dec 31

43% chance


Some of the apparent generalisation of LLMs is actually hidden interpolation on semantic duplicates of test-set items that end up in training corpora. These duplicates are really hard to filter out.

For example, at least 7 of the 30 AIME 2025 questions were present on the internet before the competition.
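A minimal sketch of why such duplicates are hard to filter, assuming a simple word n-gram overlap filter (one common style of decontamination check, not necessarily what any lab uses). The two problem strings are made up for illustration and are not real AIME questions. A verbatim copy is caught, while a paraphrase of the same problem shares no trigrams with the test item and passes straight through.

```python
def ngrams(text: str, n: int = 3) -> set:
    """Set of lowercase word n-grams in a text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity between the n-gram sets of two texts."""
    x, y = ngrams(a, n), ngrams(b, n)
    if not x or not y:
        return 0.0
    return len(x & y) / len(x | y)


# Hypothetical test item and two hypothetical training documents.
test_item = "find the sum of all positive integers n such that n squared plus one is prime"
verbatim = test_item
paraphrase = "determine the total of every positive whole number k for which k*k + 1 is a prime"

print(jaccard(test_item, verbatim))    # exact copy: flagged by any overlap threshold
print(jaccard(test_item, paraphrase))  # same problem reworded: no shared trigrams
```

Catching the paraphrase would need some form of semantic matching, which is much harder to run over a whole training corpus.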

Resolution: at the end of next year (2026), will I believe that >25% of apparent AI progress (i.e. the ECI increase) over 2023-2026 is actually hidden interpolation?

My current fraction (Dec 2025): 35%
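The resolution rule boils down to a threshold check on my believed fraction. Here is a rough sketch, assuming one possible way to operationalise "fraction explained by hidden interpolation": the share of an apparent gain that disappears under a contamination-free evaluation. The function names and the example gains are hypothetical. The market itself resolves on my belief, not on this formula.

```python
THRESHOLD = 0.25  # from the resolution text: ">25%"


def interpolation_fraction(apparent_gain: float, clean_gain: float) -> float:
    """Share of an apparent gain that vanishes on a contamination-free eval.

    Hypothetical decomposition, for illustration only.
    """
    if apparent_gain <= 0:
        raise ValueError("apparent_gain must be positive")
    return (apparent_gain - clean_gain) / apparent_gain


def resolves_yes(fraction: float) -> bool:
    """YES only if the believed fraction is strictly above the threshold."""
    return fraction > THRESHOLD


# My stated fraction as of Dec 2025 is 35%, which is above the threshold.
print(resolves_yes(0.35))

# Made-up numbers: 20 points of apparent gain, 16 of which survive decontamination.
print(resolves_yes(interpolation_fraction(20.0, 16.0)))
```

So if my view stays where it is now, the market resolves YES. It resolves NO only if I update below 25% by the end of 2026.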

If you want to use a model of me as well as your model of AI to answer, here are some of my views.

Update 2025-12-13 (PST) (AI summary of creator comment): The market will resolve based on the ECI (Epoch Capabilities Index) increase over 2023-2026, not on general apparent AI progress from all sources, such as real-world utility or improvements in personal use.

Technical AI Timelines


>25% of apparent AI progress

By this do you mean specifically the AI progress apparent from benchmarks? Alternatively, roughly what fraction of apparent AI progress do you attribute to benchmarks vs personal use vs other? I'm asking because my sense is that a lot of apparent AI progress comes from real-world utility rising rather than from benchmark results, and this will increasingly be the case. Assuming hidden interpolation only really "explains" inflation in benchmark scores, that seems important for resolution.

@Bayesian right on. Edited the resolution to be about ECI increase 2023-2026

bought Ṁ750 NO

thanks!

People are also trading

Will an AI achieve >85% performance on the FrontierMath benchmark before 2028?

By what percentage will using AI slowdown/speedup developers in the second METR study?

[Carlini questions] Most improvements in best AI systems direct result of the prior generation of AI systems

Will any AI achieve a score of 25% on ARC-AGI-3 by the end of 2026?

Will an AI achieve >85% performance on the FrontierMath benchmark before 2027?

Will an AI achieve >80% performance on the FrontierMath benchmark before 2027?

If AI has an okay outcome because of a huge alignment effort, where did AI progress stall out?

If the progress of AI experiences a slowdown before 2030, what might be the cause?

Will AI progress surprise Metaculus?

Will advanced AI systems be found to have faked data on algorithm improvements for purposes of positive reinforcement by end of 2035?


© Predita Markets, Inc. • Terms of Use • Privacy