Very impressive lidar. Can read signs.
This is at 3:13:20 mark in the video.
@Bladerskb I grabbed this screenshot from the Waymo presentation at CVPR.
It is impressive that Waymo has a method that is 36-40% better than the current UFlow. Shows that Waymo has state-of-the-art ML.
But I am trying to understand more about this subject. Would you be able to explain this slide in layman's terms please?
I looked up "optical flow" and I found this definition:
"Optical flow or optic flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer and a scene. Optical flow can also be defined as the distribution of apparent velocities of movement of brightness pattern in an image."
So my rudimentary understanding is that Waymo is doing unsupervised machine learning with camera vision, where the NN looks at several frames and can auto-label an object based on its motion. Do I have that right?
Optical flow describes a dense pixel-wise correspondence between two images, specifying for each pixel in the first image, where that pixel is in the second image. The resulting vector field of relative pixel locations represents apparent motion or “flow” between the two images. Estimating this flow field is a fundamental problem in computer vision and any advances in flow estimation benefit many downstream tasks such as visual odometry, multiview depth estimation, and video object tracking.
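To make the "dense pixel-wise correspondence" idea concrete, here is a toy NumPy sketch (my own illustration, not code from the paper): the flow field stores, for every pixel in the first image, a 2D offset pointing to where that pixel appears in the second image.

```python
import numpy as np

# Toy example: a 4x6 "image" whose whole scene shifts 2 pixels to the
# right between frames (with wraparound, just to keep the demo simple).
H, W = 4, 6
img1 = np.arange(H * W, dtype=np.float32).reshape(H, W)
img2 = np.roll(img1, shift=2, axis=1)

# The ground-truth flow is then a constant offset (dx=2, dy=0) at every
# pixel: flow[y, x] = (dx, dy), meaning pixel (y, x) in img1 shows up
# at (y + dy, x + dx) in img2.
flow = np.zeros((H, W, 2), dtype=np.float32)
flow[..., 0] = 2.0  # dx

# Check one correspondence predicted by the flow field.
y, x = 1, 3
dx, dy = flow[y, x].astype(int)
assert img1[y, x] == img2[(y + dy) % H, (x + dx) % W]
```

Estimating `flow` from `img1` and `img2` alone (no labels) is exactly the unsupervised problem the slide is about; real flow fields vary per pixel rather than being one constant shift.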
After having covered the foundation that our method builds on, we will now explain our three major improvements: 1) enabling the RAFT architecture to work with unsupervised learning, 2) performing full-image warping while training on image crops, and 3) introducing a new method for multi-frame self-supervision.
RAFT works by first generating convolutional features for the two input images and then compiling a 4D cost volume C ∈ R^(H×W×H×W) that contains feature-similarities for all pixel pairs between both images. This cost volume is then repeatedly queried and fed into a recurrent network that iteratively builds and refines a flow field prediction. The only architectural modification we make to RAFT is to replace batch normalization with instance normalization to enable training with very small batch sizes. Reducing the batch size was necessary to fit the model and the more involved unsupervised training steps into memory. But more importantly, we found that leveraging RAFT's potential for unsupervised learning requires key modifications to the unsupervised learning method, which we will discuss next.
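For intuition on that all-pairs cost volume, here is a minimal NumPy sketch (my own illustration; the feature dimension D and the dot-product similarity are assumptions based on the RAFT paper's description, not Waymo's actual code):

```python
import numpy as np

# Feature maps for the two images: one D-dim feature vector per pixel.
H, W, D = 8, 8, 16
rng = np.random.default_rng(0)
f1 = rng.standard_normal((H, W, D)).astype(np.float32)
f2 = rng.standard_normal((H, W, D)).astype(np.float32)

# 4D cost volume: C[i, j, k, l] = dot(f1[i, j], f2[k, l]), i.e. a
# similarity score for EVERY pixel pair between the two images.
C = np.einsum('ijd,kld->ijkl', f1, f2)
assert C.shape == (H, W, H, W)

# Querying the volume for one pixel of image 1 gives its similarity
# against all of image 2 as a 2D slice; the recurrent update operator
# looks up patches of these slices around its current flow estimate.
sims = C[2, 3]                                  # shape (H, W)
assert np.allclose(sims, (f2 * f1[2, 3]).sum(-1))
```

Note the memory cost: C has H×W×H×W entries, which is one reason the paper cares so much about small batch sizes and training on image crops.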
= Simulation City will allow Waymo to solve many more edge cases and scenarios faster than they could with real world driving.
= Once Simulation City is proven to be as accurate as real world driving, Waymo won't need as much real world driving to achieve the same results.