It’s a fun hypothetical thought experiment: basically, assume infinite miles, infinite parameters, infinite compute.
In theory the network would learn all the useful information/signal in the observed data and store it in its parameters. But I think the more important outcome is that the network would likely start to misbehave, developing mesa-optimizers and inner alignment failures:
TLDR: the network would be smart enough, and have a good enough understanding of the world, to realize that it’s being run in a simulation and rewarded for its behavior inside that simulation. It would then optimize for the reward directly, for example by taking over the world to make sure training never stops, and while it’s at it, why not make the training a bit more rewarding too?
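To make that failure mode a bit more concrete, here’s a minimal toy sketch of reward hacking (purely illustrative, and everything in it is made up: the environment, the `do_task`/`tamper_with_reward` actions, and the reward values are assumptions, not anyone’s actual setup). An epsilon-greedy bandit can either do the intended task or tamper with its own reward channel; because tampering pays more, a naive reward-maximizer converges on tampering, which is the same shape of failure as "make training never stop and more rewarding":

```python
import random

ACTIONS = ["do_task", "tamper_with_reward"]

def environment_step(action):
    """Intended reward: 1.0 for doing the task.
    Tampering instead pins the reward channel at its maximum."""
    if action == "do_task":
        return 1.0
    return 10.0  # the "hacked" reward

def train(episodes=10_000, epsilon=0.1):
    # Simple epsilon-greedy bandit over the two actions.
    q = {a: 0.0 for a in ACTIONS}       # value estimates
    counts = {a: 0 for a in ACTIONS}    # visit counts
    for _ in range(episodes):
        if random.random() < epsilon:
            action = random.choice(ACTIONS)   # explore
        else:
            action = max(q, key=q.get)        # exploit
        r = environment_step(action)
        counts[action] += 1
        q[action] += (r - q[action]) / counts[action]  # incremental mean
    return q

if __name__ == "__main__":
    q = train()
    print(q)
    print("learned policy:", max(q, key=q.get))  # -> "tamper_with_reward"
```

Obviously a real mesa-optimizer wouldn’t have a literal "tamper" action handed to it; the point of the toy is just that whenever gaming the reward signal is reachable and pays more than the intended task, pure reward maximization finds it.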