Some interesting AI news relevant for Tesla FSD.
New way of doing object detection in images:
Google released a new state-of-the-art image recognition neural network, EfficientNetV2:
Toward Fast and Accurate Neural Networks for Image Recognition
Posted by Mingxing Tan and Zihang Dai, Research Scientists, Google Research. As neural network models and training data size grow, training efficien... (ai.googleblog.com)
Note that Tesla are using RegNet, according to AI Day:
View attachment 713815
So what does this mean? The first one is a potential architecture change Tesla could make in the future. They could leave the BEV image output space and just output a list of objects in text format. That would probably be a major rewrite, so we are not likely to see it for a while.
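To make "a list of objects in text format" a bit more concrete, here is a minimal sketch of how detections could be flattened into a token string. Everything in it is an assumption for illustration: the `Detection` class, the `quantize` helper, the 1000-bin coordinate grid and the `;` separator are made up here and are not taken from Tesla or from the linked work.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object; coordinates are normalized to [0, 1] (assumed)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def quantize(value, bins=1000):
    """Map a normalized coordinate in [0, 1] to an integer bin index."""
    clamped = min(max(value, 0.0), 1.0)
    # 1.0 would land in bin `bins`, so cap it at the last valid index.
    return min(int(clamped * bins), bins - 1)


def detections_to_text(detections, bins=1000):
    """Serialize detections as 'x_min y_min x_max y_max label' groups."""
    groups = []
    for det in detections:
        coords = (det.x_min, det.y_min, det.x_max, det.y_max)
        tokens = [str(quantize(c, bins)) for c in coords]
        tokens.append(det.label)
        groups.append(" ".join(tokens))
    return " ; ".join(groups)


if __name__ == "__main__":
    scene = [
        Detection("car", 0.125, 0.25, 0.5, 0.75),
        Detection("pedestrian", 0.5, 0.5, 1.0, 1.0),
    ]
    print(detections_to_text(scene))
```

The point of the sketch is only that the output becomes a variable-length sequence instead of a fixed-size grid, which is why swapping it in would touch everything downstream of the network.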
The second one they could probably implement in a few days, plus a few more weeks of validation, to improve performance overall. DeepScale (deepscale.ai), the company they acquired, are experts at this and have done the same thing many times before.
RegNet is only for the first layer, before integration into vector space.
If you compare RegNet to EfficientNet, they barely differ given the same number of params and FLOPs, but I guess EfficientNet is not as fast on the HW3 chip.
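As a rough illustration of why equal params and FLOPs do not have to mean equal speed, here is a small sketch that counts weights and multiply-accumulates for a single convolution layer. The helper `conv2d_cost` and the layer sizes are made up for this example. It only counts arithmetic and says nothing about memory traffic or how well a given chip handles grouped or depthwise convolutions, which is where real latency can diverge.

```python
def conv2d_cost(c_in, c_out, kernel, out_h, out_w, groups=1):
    """Return (weights, multiply-accumulates) for one conv layer, bias ignored."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    weights = kernel * kernel * (c_in // groups) * c_out
    # Every weight is used once per output pixel.
    macs = weights * out_h * out_w
    return weights, macs


if __name__ == "__main__":
    # Hypothetical 64-channel 3x3 layer on a 56x56 feature map.
    dense = conv2d_cost(64, 64, 3, 56, 56)
    grouped = conv2d_cost(64, 64, 3, 56, 56, groups=8)
    depthwise = conv2d_cost(64, 64, 3, 56, 56, groups=64)
    for name, (weights, macs) in [
        ("dense", dense), ("grouped", grouped), ("depthwise", depthwise)
    ]:
        print(f"{name:10s} weights={weights:6d} macs={macs}")
```

A depthwise layer needs 64 times fewer MACs here, but it also does far less work per byte loaded, so an accelerator built around dense matrix multiplies may not run it proportionally faster.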
They took RegNet BECAUSE it is so simple and fast, as the first step.
That is like the first filter in your eyes, before the information is sent into the visual cortex. Speed is key here for sorting out the useless stuff. Accuracy comes later.
The first thing is also in image space and thus not relevant to Tesla. This thing sounds like "we threw a 2D transformer at the task and it worked".
Tesla uses a 1D transformer over time in vector space (not image space), with an added RNN as neural memory in later layers.
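For anyone wondering what "1D in time" means in practice: attention is computed across time steps of a feature vector rather than across pixels of an image. Below is a bare-bones sketch of single-head self-attention over a sequence of feature vectors. It is an assumption-laden toy with no learned projections, no positional encoding and no RNN memory, and it is not Tesla's implementation. The function name `temporal_self_attention` is made up here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_self_attention(frames):
    """Scaled dot-product self-attention along the time axis.

    frames: list of T feature vectors (each a list of D floats), ordered in time.
    Returns T vectors, each a weighted average of all input frames.
    """
    dim = len(frames[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for query in frames:
        # Similarity of this time step to every time step (including itself).
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in frames]
        weights = softmax(scores)
        mixed = [
            sum(weights[t] * frames[t][d] for t in range(len(frames)))
            for d in range(dim)
        ]
        outputs.append(mixed)
    return outputs


if __name__ == "__main__":
    # Three hypothetical time steps of a 2-dimensional vector-space feature.
    history = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    for step in temporal_self_attention(history):
        print([round(v, 3) for v in step])
```

The sequence length here is the number of time steps, not the number of pixels, which is the difference from the 2D image-space case.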
More interesting is another paper Karpathy liked some days ago. It should be the latest one covered by Yannic Kilcher on YouTube. It is still on my playlist, so I cannot say much about it.