
Autonomous Car Progress

LOL. Nobody does end to end NN and you know that. You must mean something else ….

You are misinterpreting my words. I never said that Waymo does end-to-end. I said that Waymo does not hand-code their stack; they use ML throughout it. Using ML everywhere is not the same as end-to-end: it is the difference between using multiple NNs in your stack and using just one NN for everything. Waymo uses multiple NNs for different parts of their stack, so it is not end-to-end, and it is not hand-coded either.
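To make the distinction concrete, here is a toy sketch (the module names and sizes are invented for illustration, not Waymo's actual architecture): every stage is a NN, but no single network maps sensors to control, so it is ML everywhere without being end-to-end.

```python
import torch
import torch.nn as nn

# Hypothetical illustration: three separately trained networks chained at
# inference time. Every stage is ML, but this is NOT end-to-end, because
# no single network maps raw sensors to vehicle control.

class PerceptionNet(nn.Module):   # sensor features -> object features
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(128, 64)
    def forward(self, sensor_feats):
        return self.net(sensor_feats)

class PredictionNet(nn.Module):   # object features -> predicted trajectories
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(64, 32)
    def forward(self, objects):
        return self.net(objects)

class PlannerNet(nn.Module):      # predicted trajectories -> ego plan
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(32, 8)
    def forward(self, trajectories):
        return self.net(trajectories)

# "All ML" stack: three NNs with three interfaces, trained independently.
perception, prediction, planner = PerceptionNet(), PredictionNet(), PlannerNet()
sensor_feats = torch.randn(1, 128)
plan = planner(prediction(perception(sensor_feats)))
```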
 
NN is probabilistic, not deterministic like you say.

Yes, NNs are probabilistic by nature. But a good behavior prediction stack will narrow those probabilities down as much as possible. The planner does not just go "well, the prediction stack is telling me the other car has a 33.3% chance of going right, 33.3% of going left and 33.3% of going straight, I guess I will turn left and hope I guessed right." Also, Waymo has merged their prediction and planner NNs, so the planner takes into account not only what other vehicles are likely to do before the Waymo vehicle makes a decision, but also how the Waymo vehicle's own actions may affect that behavior.
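As a toy illustration of what I mean (the probabilities and costs below are made up), a planner can weigh the whole predicted distribution instead of guessing:

```python
# Toy sketch: the planner weighs a predicted maneuver distribution for another
# car rather than "guessing". Numbers are invented purely for illustration.
predicted = {"left": 0.70, "straight": 0.25, "right": 0.05}

# Hypothetical collision cost of each ego action given the other car's maneuver.
cost = {
    "yield":   {"left": 0.0, "straight": 0.0, "right": 0.0},
    "proceed": {"left": 9.0, "straight": 0.5, "right": 0.1},
}

def expected_cost(action):
    # Expected risk = sum over maneuvers of P(maneuver) * cost(action, maneuver)
    return sum(p * cost[action][m] for m, p in predicted.items())

best = min(cost, key=expected_cost)  # picks the action with lowest expected risk
print(best, {a: expected_cost(a) for a in cost})
```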
 
LOL. Nobody does end to end NN and you know that. You must mean something else ….
This is e2e.
Wayve's CEO shared this clip from their simulation to demonstrate their AI. The simulated Wayve car brakes to avoid hitting the green car that suddenly turned in front of it. He says it was safe, but I don't agree; it was a very close call. You also cannot assume the simulation accurately reflects what would have happened in the real world: a real human driver in the green car might not have behaved the way the sim did. So I don't think we can trust this sim as proof that Wayve would have handled it safely.

Having ML throughout your architecture does not make it e2e. End-to-end means a single NN with sensor input and a control output: sensor data goes in, and what comes out is the control for the vehicle.



Waymo, Cruise, and Tesla use traditional disparate subsystems that apply DL and/or traditional programming in each subsystem. Wayve has a single NN with sensor input and a control output.
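Roughly, e2e looks like this in code (a minimal sketch; the layers are illustrative, not Wayve's actual network):

```python
import torch
import torch.nn as nn

class EndToEndDriver(nn.Module):
    """Illustrative e2e network: raw camera frames in, control out.
    Nothing here reflects Wayve's real architecture."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 2)  # [steering, throttle]

    def forward(self, frames):
        return self.head(self.encoder(frames))

model = EndToEndDriver()
control = model(torch.randn(1, 3, 96, 96))  # one forward pass, no intermediate stages
```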
 
I said that Waymo does not hand-code their stack; they use ML throughout it.
Everyone uses hand-written code to stitch together and invoke different NNs, validate and cross-check NN outputs, override NN outputs in some cases, calculate physical properties such as velocity, interface with vehicle control systems and users, plus dozens of other things.
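A hypothetical sketch of that kind of glue code (the function names and thresholds are invented for illustration):

```python
import numpy as np

def cross_check(camera_dets, lidar_dets, max_gap_m=1.5):
    """Hand-written validation: keep camera detections that a nearby
    lidar detection corroborates."""
    return [c for c in camera_dets
            if any(np.linalg.norm(c - l) < max_gap_m for l in lidar_dets)]

def velocity(positions, dt):
    """Physical property computed by finite differences, not by a NN."""
    return (positions[-1] - positions[-2]) / dt

def clamp_command(steering, limit=0.4):
    """Hand-coded override: never let a NN output exceed actuator limits."""
    return max(-limit, min(limit, steering))

cam = [np.array([2.0, 0.5])]
lid = [np.array([2.3, 0.4])]
positions = [np.array([0.0, 0.0]), np.array([1.2, 0.1])]
print(cross_check(cam, lid), velocity(positions, dt=0.1), clamp_command(0.9))
```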

It is the difference between using multiple NNs in your stack and using just one NN for everything.
I'm not sure there's an actual difference. A single NN has many layers. If you hook up the input of one NN to the output of another do you still have two separate NNs? Or have you simply created one large NN with more layers? I'm sure CS professors have abstract definitions that draw fine lines, but in terms of implementation I don't see much difference, especially on the inference side.
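You can see the ambiguity directly in code; wiring one network's output into another is literally just composition:

```python
import torch
import torch.nn as nn

# Two "separate" networks...
net_a = nn.Sequential(nn.Linear(10, 8), nn.ReLU())
net_b = nn.Sequential(nn.Linear(8, 2))

# ...wired output-to-input. At inference this is indistinguishable from one
# deeper network containing the same layers:
combined = nn.Sequential(net_a, net_b)

x = torch.randn(1, 10)
assert torch.equal(combined(x), net_b(net_a(x)))  # identical computation
```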
 
Everyone uses hand-written code to stitch together and invoke different NNs, validate and cross-check NN outputs, override NN outputs in some cases, calculate physical properties such as velocity, interface with vehicle control systems and users, plus dozens of other things.

Sure, but I am talking about what happens inside the perception, prediction and planning stacks. Back in Sept 2022, Waymo switched to a new next-gen ML planner. So now Waymo uses NNs inside all three stacks: perception, prediction and planning.


I'm not sure there's an actual difference. A single NN has many layers. If you hook up the input of one NN to the output of another do you still have two separate NNs? Or have you simply created one large NN with more layers? I'm sure CS professors have abstract definitions that draw fine lines, but in terms of implementation I don't see much difference, especially on the inference side.

I think if you were to ask an ML engineer, they could point to significant architectural differences between the different approaches.

And I imagine that training the NNs would be very different as well. In the first case, you are training each NN to do a specific task like perception or prediction. In the e2e approach, you are not training the NN to do perception, prediction or planning; rather, you are training it to output the desired control directly from visual input.
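Schematically (toy layers and invented labels, just to show where the supervision attaches):

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()

# Modular training (schematic): each network is supervised with its own labels.
perception = nn.Linear(128, 64)
sensors, labeled_objects = torch.randn(4, 128), torch.randn(4, 64)
perception_loss = mse(perception(sensors), labeled_objects)  # needs object labels

# E2E training (schematic): only the final control is supervised, e.g. by
# imitating a recorded human driver; no intermediate perception or
# prediction labels are involved.
e2e = nn.Linear(128, 2)
human_control = torch.randn(4, 2)  # stand-in for [steering, throttle] logs
e2e_loss = mse(e2e(sensors), human_control)

perception_loss.backward()
e2e_loss.backward()
```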
 
I'm not sure there's an actual difference. A single NN has many layers. If you hook up the input of one NN to the output of another do you still have two separate NNs? Or have you simply created one large NN with more layers? I'm sure CS professors have abstract definitions that draw fine lines, but in terms of implementation I don't see much difference, especially on the inference side.
The difference is:

An e2e neural network is a single network trained to perform the task directly: from the input data, which is the sensor input (camera, radar, lidar, IMU), to the output data, which is the control, without any intermediate steps. The advantage is a simple, single NN that just gives you the control for the vehicle; the disadvantage is that it is hard to diagnose any issue.

The traditional way is disparate NNs, each trained to perform one or more parts of the stack: perception, planning, control, etc. The advantage is that it's easier to diagnose issues when one part breaks, and you can mix and match different NN architectures suited to the different subsystems. The disadvantage is more complexity.
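A toy sketch of that diagnosability difference (stand-in networks, not anyone's real stack):

```python
import torch
import torch.nn as nn

# Tiny stand-ins just to make the contrast concrete.
perception = nn.Linear(16, 8)
prediction = nn.Linear(8, 4)
planner = nn.Linear(4, 2)
end_to_end = nn.Linear(16, 2)

sensors = torch.randn(1, 16)

# Modular: every intermediate output is observable, so a failure can be
# localized (bad detection? bad prediction? bad plan?).
objects = perception(sensors)
trajectories = prediction(objects)
plan = planner(trajectories)
for name, t in [("objects", objects), ("trajectories", trajectories), ("plan", plan)]:
    print(name, t.shape)

# E2E: only the final control is observable; if it is wrong, there is no
# intermediate signal telling you which implicit "stage" failed.
control = end_to_end(sensors)
print("control", control.shape)
```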
 
It must cost an absolute fortune for these companies to move into and operate in new areas: renting or purchasing buildings and maintenance facilities, setting up equipment, and hiring personnel or bringing in personnel from other areas and housing them. I think Waymo and Cruise won't operate far from their maintenance depots. Either these companies will operate only in the metro areas of large cities, or they will have to set up maintenance depots far outside of major cities.
 
Yes, NNs are probabilistic by nature. But a good behavior prediction stack will narrow those probabilities down as much as possible. The planner does not just go "well, the prediction stack is telling me the other car has a 33.3% chance of going right, 33.3% of going left and 33.3% of going straight, I guess I will turn left and hope I guessed right." Also, Waymo has merged their prediction and planner NNs, so the planner takes into account not only what other vehicles are likely to do before the Waymo vehicle makes a decision, but also how the Waymo vehicle's own actions may affect that behavior.
Duh.
This is what I was replying to.

Waymo has good behavior prediction. It knows what other cars will do.
Waymo doesn't "know". It tries to predict... C'mon, I know you are a Waymo "fan", but even you must "know" the difference between "knowing" and "predicting".
 
Waymo doesn't "know". It tries to predict... C'mon, I know you are a Waymo "fan", but even you must "know" the difference between "knowing" and "predicting".

Of course I know the difference. No need to be snarky; being a fan of Waymo has nothing to do with it. Yes, it predicts, but the predictions are very good. That is what I am saying.
 
What does that mean? There isn't a single line of code that is not NN? Link?

No, it does not mean there is no hand-written code anywhere. I mean that perception is done by NNs, prediction is done by NNs and planning is done by NNs. For example, we know that all the perception tasks, like object detection/classification, lidar segmentation and vision/radar fusion, are done by NNs. Waymo has presented on StopNet and other NNs that they use for behavior prediction. We also know that they use a NN for planning. So perception, prediction and planning are all done by NNs.
 
You don't know much about how Waymo does anything because they don't show you what they do. Their talks are very general and speak of various techniques but don't go into detail about what Waymo does.

Since this is the case, I suspect Waymo's approach uses a lot of hand coding / feature engineering, especially in the construction of the HD maps. Can anyone here show us what a Waymo HD map looks like and what goes into creating it?
 
You don't know much about how Waymo does anything because they don't show you what they do. Their talks are very general and speak of various techniques but don't go into detail about what Waymo does.

This is wrong. We know quite a lot. Waymo gives regular updates on what they are doing. And unlike Tesla, Waymo has actually put their latest research out there for anyone to read. You can search by topic and read what Waymo has done in perception, prediction, planning and general ML:


And Anguelov's talks are full of technical details:




Since this is the case, I suspect Waymo's approach uses a lot of hand coding / feature engineering, especially in the construction of the HD maps.

This is completely false. Waymo does not do this; Waymo has even explained that their HD mapping is automated, not hand-coded. You just keep pushing these lies about Waymo. I wish you would stop.
 