
Tesla, TSLA & the Investment World: the Perpetual Investors' Roundtable



Don't really see how this matters. But news is news.
 
The old neural network architecture labels one frame (2D) from a single camera and trains a neural network on the last two frames (2.5D: 2D(x, y) + ~0.5D of time) to predict where objects are in the image (2D), then does some neural network magic to get that into a bird's-eye view (~2.5D).

The new neural network architecture takes a video feed, generates a point cloud of all the static objects (3D) and of the moving objects (4D), and trains a neural network to predict where static objects are (3D) and where dynamic objects will be (4D), based on the current image frames plus recurrent information from the neural network at the previous timestep.
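To make the contrast concrete, here is a toy sketch (not Tesla's code; the constant-velocity update and all names are illustrative assumptions) of the difference between handling each frame in isolation and carrying recurrent state between timesteps:

```python
# Toy contrast: per-frame estimation vs. a recurrent estimator that
# carries state across timesteps so it can predict ahead ("4D").
# Positions are 1D floats for simplicity; real systems use 3D point clouds.

def per_frame_estimate(detection):
    # Old style: each frame handled in isolation, no notion of motion.
    return detection

class RecurrentEstimator:
    # New style: hidden state (position + velocity) updated each frame.
    def __init__(self):
        self.pos = None
        self.vel = 0.0

    def update(self, detection, dt=1.0):
        if self.pos is not None:
            self.vel = (detection - self.pos) / dt  # estimate motion from history
        self.pos = detection
        return self.pos

    def predict(self, dt=1.0):
        # Extrapolate: where will the object be at the next timestep?
        return self.pos + self.vel * dt

est = RecurrentEstimator()
for frame_pos in [0.0, 1.0, 2.0]:  # object moving 1 unit per frame
    est.update(frame_pos)
print(est.predict())  # extrapolates to 3.0
```

The per-frame version can only ever report where the object *is*; the recurrent version, because it remembers the previous timestep, can also say where the object *will be*.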


I think he is referring to how the neural network will internally represent the information. If it needs to think in 4D, it will start to think in 4D. To predict where a moving car will be in 4D space (needed, for example, to predict what the next frame will look like; that's not strictly required, but it's a good way to augment the network), the neural network will find an internal representation for this in vector form. See this video at 22:47
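A minimal illustration of that "internal representation carried across timesteps" idea is a plain recurrent cell, h_t = tanh(W_x·x_t + W_h·h_{t-1}). The weights below are made up for the sketch; a real network learns them from video:

```python
import math

# Minimal recurrent cell: the hidden state h is the network's internal
# vector representation, updated from each new frame's input x.
# w_x and w_h are illustrative constants, not learned weights.

def rnn_step(x, h, w_x=0.5, w_h=0.9):
    return math.tanh(w_x * x + w_h * h)

h = 0.0
for x in [1.0, 1.0, 0.0, 0.0]:  # object visible, then briefly occluded
    h = rnn_step(x, h)
print(h)  # state decays but stays nonzero after the input disappears
```

Because h_{t-1} feeds back into h_t, information about the object persists in the state even when the current frame carries no signal, which is exactly what lets the network reason about motion over time rather than frame by frame.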

Great summary. Small addition: Unless the car itself is stationary, everything around it is in motion, so the temporal model helps with the prediction / recognition of all objects. Prevents stop signs from disappearing and re-appearing, for instance.