The main advantage of Tesla having a lot more compute is that their engineers can iterate faster and grow the dataset. Meta, which also has tons of compute, has released its newer Llama version, and performance has improved a lot compared to last year:
View attachment 1040253
Just look at the MATH score: compare Llama 2 with 70B parameters to Llama 3 with 8B parameters. Model size down roughly 10x, performance up 3x!! In 9 months!
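A quick sanity check on the size claim (the exact MATH scores are in the attachment, so only the parameter ratio is computed here; the ~3x score improvement is taken as stated in the post):

```python
# Parameter counts as cited in the post, in billions.
llama2_params_b = 70  # Llama 2 70B
llama3_params_b = 8   # Llama 3 8B

size_reduction = llama2_params_b / llama3_params_b
print(f"Model size down {size_reduction:.2f}x")  # 8.75x, i.e. roughly 10x
```

So "down 10x" is really about 8.75x, but the point stands: a much smaller model beating a much larger one from the previous generation.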
This is what FSD V12.1 to V12.3.5 looks like: more compute has been unlocked, more targeted data has been added, and the system has been tuned slightly. Expect Llama 4 to further improve the metrics across all sizes of their neural networks, and expect V12+ to keep improving its metrics too. We know the formula: keep adding tons of high-quality data, add extra targeted data where the system is weak, and apply lots of offline compute. Even with moderate online compute, you will see massive gains in inference performance.