Yeah. Moved to Waymo thread.
Thanks, and a couple of follow-up questions. Perhaps this should move to the Waymo thread, but oh well:
- Is there any indication, from conferences or interview answers etc., that Waymo has rethought any major part of the AV technical strategy or architecture over the last several years? In the way that e.g. Tesla has evolved their neural network architecture, worked through the problems of fusing disparate sensors, phased in temporal persistence, and stated an intent to move more of the stack to NNs over time? Or like Mobileye with their evolved dual-perception / late-fusion architecture? The thrust being that, over years of development far past the original expectation, one would expect some pretty major rethinking to happen along the way.
Short answer is definitely, I think Waymo has evolved a lot in terms of their sensors, hardware and software stack over time. They've undergone some big changes.
In terms of hardware, they went from a crude lidar on the roof of the first Google car to a much more sophisticated array of long-range and short-range cameras, lidar and radar on the 5th Gen I-Pace, including perimeter sensors for seeing around corners. In fact, just from the 4th Gen in Chandler to the 5th Gen in SF, we see some important changes. For example, the 5th Gen has cameras, lidar and radar in the front bumpers to see around corners, where the 4th Gen only has the lidar and radar. I suspect Waymo encountered cases on the narrow streets in SF, like what we see in FSD beta videos where the car has to creep forward, and realized they needed more/better perimeter sensors than what they were using on the wide streets in Chandler. The sensors have also improved a lot in quality.
In terms of the software stack, they started with ML mostly on lidar data, back when computer vision was still rudimentary. They've since added much more advanced computer vision, plus more ML for prediction and planning. They also developed much more sophisticated sensor fusion as they went from basically 1 lidar on the first Google car to almost 30 sensors now on the I-Pace. Waymo has also evolved in how they represent their perception output and how they interface between perception and prediction. They started with just an image representation but then switched to VectorNet, a NN that converts the map and perception into vectors and polylines. Anguelov says that VectorNet significantly improved their prediction and planning, but that Waymo has not yet completely solved the question of how best to represent perception in a way that helps prediction and planning. He also describes how they are working on joint prediction/planning, meaning a single NN that handles both prediction and planning. So we could see more changes in their stack there.
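To make the VectorNet idea concrete: the published VectorNet approach represents each map element (lane boundary, crosswalk) and each tracked agent's trajectory as a polyline, i.e. an ordered set of small vectors with attributes, which a graph network then reasons over. Here's a rough toy sketch of just that input encoding; all the names and fields below are illustrative assumptions on my part, not Waymo's actual code:

```python
from dataclasses import dataclass
from typing import List, Tuple

# Toy sketch of a VectorNet-style input encoding.
# Field names are illustrative, not Waymo's actual implementation.

@dataclass
class Vector:
    start: Tuple[float, float]   # (x, y) start point of this segment
    end: Tuple[float, float]     # (x, y) end point
    attributes: Tuple            # e.g. element type: lane boundary, agent track, etc.
    polyline_id: int             # groups vectors belonging to the same element

def polyline_from_points(points: List[Tuple[float, float]],
                         attributes: Tuple,
                         polyline_id: int) -> List[Vector]:
    """Break a map element or agent trajectory into consecutive vectors."""
    return [Vector(points[i], points[i + 1], attributes, polyline_id)
            for i in range(len(points) - 1)]

# A lane boundary (3 points -> 2 vectors) and a short agent track (2 points -> 1 vector):
lane = polyline_from_points([(0, 0), (5, 0), (10, 0)], ("lane",), polyline_id=0)
track = polyline_from_points([(2, -1), (4, -1)], ("agent",), polyline_id=1)

print(len(lane), len(track))  # 2 1
```

The point of the representation is that everything — map geometry and moving objects alike — ends up in one uniform vector format, instead of being rasterized into a bird's-eye image, which is what makes it a better interface into prediction and planning.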
I could give a more detailed answer but yeah, I think Waymo has evolved a lot in their FSD since they started.
- Your explanation implies that the last CEO, Krafcik, was too much of a marketer and not bound enough to the engineering reality (probably similar to your and many others' take on Elon). However, at the time of this change a few months back, I seem to recall some people saying that Waymo actually wasn't moving fast enough toward real scaled-up deployment, and was stuck in a money-draining science-fair condition with no end in sight. These two explanations seem rather contradictory. I'm not asking you to reverse your position, just trying to understand if it's your own take or the generally accepted story. Perhaps I'm relying too much on one or two things I heard that weren't representative of mainstream Waymo-watcher consensus.
Well, I just gave you my personal take.
I think the general consensus seemed to be "Waymo has these robotaxis in Chandler. Their FSD looks great. So why haven't they expanded yet?" I don't think Krafcik had a good answer to that. He seemed to kind of say "we have plans, and when the FSD is safe enough, we will expand". Behind the scenes, Krafcik presumably wanted to expand but couldn't, because they were encountering edge cases and did not feel confident enough in the FSD.