Interesting idea, although I wouldn't consider `but restrict the data set to publications from <= year 1600` "easy".
If you did have access to a high-quality pretraining dataset, you could explore training up to 1600, then up to 1610, 1620, ..., 1700, and look at how knowledge of calculus emerged over that period, running some tests with the intermediate models to capture the effect.
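For what it's worth, the sweep might look something like this. A minimal Python sketch, assuming a corpus of `(year, text)` pairs; `train_model` and `probe_calculus` are hypothetical stand-ins, not real library calls:

```python
def train_model(docs):
    """Placeholder: pretrain an LLM on the given documents."""
    raise NotImplementedError

def probe_calculus(model):
    """Placeholder: score the model on calculus-flavored prompts
    (derivatives, infinitesimals, tangent-line problems, ...)."""
    raise NotImplementedError

def cutoff_sweep(corpus, start=1600, stop=1700, step=10):
    # corpus: list of (year, text) pairs
    results = {}
    for cutoff in range(start, stop + 1, step):
        # Restrict pretraining data to publications from <= cutoff year
        docs = [text for year, text in corpus if year <= cutoff]
        model = train_model(docs)
        # Probe each intermediate model to see how much calculus
        # has entered the training distribution by this cutoff
        results[cutoff] = probe_calculus(model)
    return results
```

The decade step is arbitrary; you'd probably want finer cutoffs around 1665-1687, when Newton's and Leibniz's work actually circulated.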
> It seems like an easy test: Train an LLM but restrict the data set to publications from <= year 1600
So, train it on less than 0.01% of the material other LLMs are trained on? It won't prove much if it fails.
Betteridge's law of headlines states: "Any headline that ends in a question mark can be answered by the word no."
For example: "Has Science Found the Fountain of Youth?" -> No
So, to answer your question, it's a resounding no.