
FSD Predictions for 2020

I'm very impressed with the implications of the new visualizations... I wonder if Tesla will also max out the capabilities of HW3 soon. It seems Tesla may need to recognize and render more objects in vector space than they initially anticipated?

These new visualizations and their speed/accuracy wrt lights/signs have me thinking Tesla will release a beta city NoA around June 2020.
 

I don't know if Tesla will max out AP3 soon, since I have very little info on how much of AP3 Tesla is currently using. But "city NOA" will probably take up a significant portion of AP3. As you point out, there is a lot for the camera vision to process and render in vector space. So AP3 will definitely be busy. And yes, I do anticipate we will get beta "city NOA" in the first half of 2020, and then the second half will be spent improving it further.
 
I was enjoying the new visualizations today, and when leaving the grocery store I was turning left onto a two-lane road with a large divider in the middle and limited visibility of traffic coming from the right because of vegetation in the divider. That got me wondering: will the car be able to safely turn left onto main roads where only you have a stop, with traffic coming from both directions? Will it recognize the turn signals of cars turning into the parking lot (or side road) from the main road, from both directions? Is the wide-angle view from the front cameras good enough for this?

Will it know how to safely turn right on a red light?

Seems to me that if they are going to have FSD on city streets, these are pretty important things to be able to do.
 
I was enjoying the new visualizations today...

I'm curious how consistent the visualizations are. Does it ever or often miss stop signs or stop lights? That could give us a clue to how well things are going.

The recent Fridman interview had Musk mentioning the two stages of first getting an accurate "vector space" and then driving policy. I'm going to predict we'll get a really good vector space, but won't complete the driving policy aspect. So it'll stop and go at lights, but may not do turns at intersections, and will have particular difficulty with unprotected left turns. It may do them, but so slowly / so cautiously that it'll annoy me.
 

Yeah, I definitely think beta "city NOA" will struggle at first with being too cautious in turns and such. If you use a full sensor suite including LIDAR, building the vector space is actually easy now. It's taking Tesla longer to get this part done only because they have to do it with just cameras. But driving policy is actually the hard part because you have to figure out what to do with the vector space. So when you need to do an unprotected turn, when do you go? Do you wait for the car to pass before making the unprotected turn or do you go right away? Is that other car going to go straight or is it going to turn off the road? There is a lot of prediction that goes into driving that determines your driving policy. That is why it is hard.
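To make that concrete, here's a toy sketch in Python of the kind of gap-acceptance decision a driving policy has to make for an unprotected turn. Every name and threshold here is my own invention for illustration (not anything Tesla has described): estimate when each oncoming car reaches the conflict point, guess whether it will actually stay on the road, and only go if the worst case leaves enough margin.

```python
# Hypothetical gap-acceptance sketch for an unprotected turn.
# All thresholds and the predicted fields are made up for illustration.

from dataclasses import dataclass

TURN_DURATION_S = 4.0     # assumed time our car needs to clear the intersection
SAFETY_MARGIN_S = 2.0     # assumed extra buffer on top of that

@dataclass
class TrackedCar:
    distance_m: float      # distance to the conflict point
    speed_mps: float       # current speed toward it
    p_turns_off: float     # predicted probability it turns off before reaching us

def time_to_conflict(car: TrackedCar) -> float:
    """Seconds until the oncoming car reaches the conflict point (constant-speed model)."""
    if car.speed_mps <= 0.1:
        return float("inf")
    return car.distance_m / car.speed_mps

def should_go(oncoming: list[TrackedCar]) -> bool:
    """Go only if every car likely to stay on the road arrives late enough."""
    for car in oncoming:
        if car.p_turns_off > 0.9:
            # Prediction says it will almost certainly turn off; ignore it.
            # This is exactly the judgment call that makes policy hard.
            continue
        if time_to_conflict(car) < TURN_DURATION_S + SAFETY_MARGIN_S:
            return False
    return True

# Example: one car 80 m away doing 15 m/s (~5.3 s out) with no sign of turning.
print(should_go([TrackedCar(distance_m=80, speed_mps=15, p_turns_off=0.1)]))  # False
```

The arithmetic is trivial; the hard part is producing a trustworthy value for p_turns_off and coping with cars that speed up or slow down after you commit, which is exactly the prediction problem described above.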
 
I'm curious how consistent the visualizations are. Does it ever or often miss stop signs or stop lights? That could give us a clue to how well things are going.

The recent Fridman interview had Musk mentioning the two stages of first getting an accurate "vector space" and then driving policy. I'm going to predict we'll get a really good vector space, but won't complete the driving policy aspect. So it'll stop and go at lights, but may not do turns at intersections, and will have particular difficulty with unprotected left turns. It may do them, but so slowly / so cautiously that it'll annoy me.
I'm very curious how the stopping at lights will work in terms of driver interaction, given that it will still be a driver assistance system (rather than fully autonomous). Both false positives and false negatives can have fatal consequences. Assume you're driving towards a stop light at 40 or 50 mph. Is the driver now supposed to keep watching the screen to verify that the car is recognizing the correct traffic light and its status, so they can decide whether they need to intervene or not?
 

I see 2 possibilities:

1) The AP nags will be pretty frequent to make sure the driver is paying attention, and the driver will be expected to react before the car reaches the stop light. In other words, if the car does not start braking when it should, the driver should intervene (a rough sketch of what that check could look like follows after this list).

2) Tesla will wait until traffic light response is accurate enough to work as an effectively autonomous feature. Tesla will still expect the driver to pay attention with nags, because Autopilot as a whole is not autonomous, but traffic light response itself will be reliable enough.
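To illustrate option 1, here's a toy sketch of escalating attention checks as the intervention window shrinks. The thresholds, the assumed braking rate, and the function names are all invented for illustration; this is not Tesla's actual design.

```python
# Hypothetical sketch of option 1: escalate attention checks as the car
# approaches a detected stop light, and demand a takeover if the car still
# isn't braking. None of this reflects Tesla's real implementation.

COMFORT_DECEL_MPS2 = 3.4   # assumed comfortable braking rate

def braking_distance_m(speed_mps: float) -> float:
    return speed_mps ** 2 / (2 * COMFORT_DECEL_MPS2)

def nag_level(distance_to_light_m: float, speed_mps: float, car_is_braking: bool) -> str:
    """Escalate from a visual hint to a hard alert as the margin shrinks."""
    margin_m = distance_to_light_m - braking_distance_m(speed_mps)
    if car_is_braking:
        return "none"               # car is already responding to the light
    if margin_m > 60:
        return "visual"             # plenty of room: just show the light on screen
    if margin_m > 20:
        return "chime"              # window closing: audible reminder
    return "take_over_now"          # driver must brake themselves

# 45 mph ~= 20 m/s, light 90 m ahead, car not yet braking:
print(nag_level(90, 20, car_is_braking=False))  # -> "chime"
```

The design question is how early "take_over_now" has to fire for a human to still react in time, which is where the speeds in the quoted post matter.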
 
But driving policy is actually the hard part because you have to figure out what to do with the vector space. So when you need to do an unprotected turn, when do you go? Do you wait for the car to pass before making the unprotected turn or do you go right away? Is that other car going to go straight or is it going to turn off the road? There is a lot of prediction that goes into driving that determines your driving policy. That is why it is hard.

Something else to consider. Karpathy was talking about having a "black box" that takes in sensor input and spits out driving policy. That completely skips a vector space visible to humans, and may require HW4. The advantage that approach has is that you can train the NN with both sensor and driver control input, which Tesla has loads of. The current "conventional" method of having a NN generate vector space requires human-coded driving policy, which can be quite the bottleneck. To make matters worse, different regions have different driving cultures. Will that mean a different driving policy will be required for each area?
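For anyone who wants to picture what "sensor input in, driving policy out" means, here is a bare-bones sketch in Python/PyTorch. The layer sizes, image shape, and training data are all placeholders I made up; the point is only the structure: the network maps raw camera pixels straight to control outputs, trained against what the human driver actually did, with no intermediate human-readable vector space.

```python
# A minimal "black box" end-to-end sketch: camera pixels in, steering/throttle out.
# Purely illustrative; shapes and layer sizes are arbitrary, and this is not a
# description of anything Tesla has shipped.

import torch
import torch.nn as nn

class EndToEndDriver(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)), nn.Flatten(),
            nn.Linear(32 * 4 * 4, 64), nn.ReLU(),
            nn.Linear(64, 2),           # outputs: [steering, acceleration]
        )

    def forward(self, frames):
        return self.net(frames)

model = EndToEndDriver()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# Training pairs camera frames with what the human driver actually did;
# no hand-coded rules and no intermediate, human-readable vector space.
frames = torch.randn(8, 3, 120, 160)        # a batch of camera images (fake data)
human_controls = torch.randn(8, 2)          # logged [steering, acceleration]

optimizer.zero_grad()
loss = loss_fn(model(frames), human_controls)
loss.backward()
optimizer.step()
```

The appeal is exactly what's described above: the training signal is just logged driving. The downside is that when it misbehaves there is no intermediate representation you can inspect or unit-test.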
 
Is the driver now supposed to keep watching the screen to verify that the car is recognizing the correct traffic light and its status, so they can decide whether they need to intervene or not?

It might be nice to have feedback to the driver to let them know the car plans to stop. Hopefully seeing the red light on the visualization is enough, but maybe a message like "Preparing to stop" would help.
 
I'm very curious how the stopping at lights will work in terms of driver interaction, given that it will still be a driver assistance system (rather than fully autonomous). Both false positives and false negatives can have fatal consequences. Assume you're driving towards a stop light at 40 or 50 mph. Is the driver now supposed to keep watching the screen to verify that the car is recognizing the correct traffic light and its status, so they can decide whether they need to intervene or not?
That's going to be a headache, and very likely harder than simply driving yourself. The problem is that you only recognize the need to intervene rather late in the process, which could mean you're intervening too late. You're not driving with the anticipation that normally determines your next action; you're correcting at the last second.
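Some rough numbers on the timing involved, using assumed values (1.5 s driver perception/reaction time and 3.4 m/s^2 comfortable deceleration, roughly the common road-design figures; nothing here is a Tesla spec):

```python
# Back-of-envelope timing for the "intervene at the last second" worry.
# Assumed numbers: 1.5 s driver perception/reaction time and 3.4 m/s^2
# comfortable deceleration; not a Tesla spec.

REACTION_S = 1.5
DECEL_MPS2 = 3.4

def distance_needed_m(speed_mph: float) -> float:
    v = speed_mph * 0.44704                      # mph -> m/s
    return v * REACTION_S + v ** 2 / (2 * DECEL_MPS2)

for mph in (40, 45, 50):
    print(f"{mph} mph: need ~{distance_needed_m(mph):.0f} m to react and stop")
# 40 mph: ~74 m, 45 mph: ~90 m, 50 mph: ~107 m
```

So if the car hasn't started braking by roughly 75 to 110 m out at those speeds, the driver already needs to be acting. Whether that feels like "plenty of time" or "the last second" depends on how far out the light is detected and shown on the screen.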
 
I don't know if Tesla will max out AP3 soon, since I have very little info on how much of AP3 Tesla is currently using. But "city NOA" will probably take up a significant portion of AP3. As you point out, there is a lot for the camera vision to process and render in vector space. So AP3 will definitely be busy.

I'm not an expert on this topic so I could be wrong, but here's my current understanding. I think neural networks can, in principle, be scaled up and down in size almost arbitrarily. So, Tesla may have a version of Hydranet (or whatever the proper name is) that is far more computationally intensive than what HW3 can run. Then they might “squeeze” that bigger network down until it's exactly HW3-sized. (That might be why DeepScale was acquired.) Or they might simply work on a HW3-sized network from the start.

As I understand it, there would be no point using only half of HW3 because you could just double the size of the network and get better accuracy on your NN's predictions.

IIRC, one limitation on scaling up neural network size is that you need to scale up your training datasets along with your network, or else your network will overfit your specific datasets rather than generalize. But particularly with self-supervised learning (which will be accelerated by Dojo), it doesn't seem that will be much of an issue.
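As an aside on what “squeezing” a bigger network down could look like in practice: knowledge distillation is one common technique, where a small "student" network is trained to reproduce the outputs of a much larger "teacher." Whether Tesla or DeepScale actually does it this way is purely my speculation; this sketch just shows the idea with made-up sizes.

```python
# One standard way to "squeeze" a big network into a smaller compute budget:
# knowledge distillation. The small (student) network learns to match the
# big (teacher) network's outputs. Whether Tesla does this is my guess only.

import torch
import torch.nn as nn

teacher = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 10))  # too big for the car
student = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))    # sized for the chip

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

features = torch.randn(32, 512)              # stand-in for camera features
with torch.no_grad():
    targets = teacher(features)              # teacher's predictions become the labels

optimizer.zero_grad()
loss = loss_fn(student(features), targets)   # student learns to imitate the teacher
loss.backward()
optimizer.step()
```

Pruning and quantization are other standard ways to fit a network into a fixed compute budget like HW3's.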

Karpathy was talking about having a "black box" that takes in sensor input and spits out driving policy. That completely skips a vector space visible to humans, and may require HW4. The advantage that approach has is that you can train the NN with both sensor and driver control input, which Tesla has loads of.

When did Karpathy talk about this? What you're describing is end-to-end imitation learning. Nvidia did a demo of this concept a few years back:

“We trained our network to steer the car by having it study human drivers. The network recorded what the driver saw using a camera on the car, and then paired the images with data about the driver’s steering decisions. We logged a lot of driving hours in different environments: on roads with and without lane markings; on country roads and highways; during different times of day with different lighting conditions; in a variety of weather conditions.

The trained network taught itself to drive BB8 without ever receiving a single hand-coded instruction. It learned by observing. And now that we’ve trained the network, it can provide real-time steering commands when it sees new environments. See it in action in the video below.”

Elon briefly mentioned on Autonomy Day (I think in response to a question from Tasha Keeney at ARK Invest) that he expected the system would “eventually” move to pixels in, steering and acceleration out. But I think he meant in the long-term future, not anytime soon.

Karpathy and Elon have also both talked about self-supervised learning, which is a different concept from end-to-end learning but easy to confuse with it. I originally thought Elon's comments on Dojo were about end-to-end learning, but now (thanks to @jimmy_d) I'm pretty sure he was talking about self-supervised learning for computer vision (i.e. creating the vector space representations).
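To make the distinction concrete: in self-supervised learning the "label" comes from the data itself, and no driver controls are involved at all. A classic toy example is predicting the next camera frame from the current one; this is a generic illustration of the concept, not a claim about the specific training tasks Tesla uses.

```python
# Self-supervised learning is about the vision side, not driving policy:
# the "label" comes from the data itself. A common toy example is predicting
# the next camera frame from the current one. Generic illustration only.

import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
predictor = nn.Conv2d(32, 3, 3, padding=1)   # predicts the next frame's pixels

optimizer = torch.optim.Adam(list(encoder.parameters()) + list(predictor.parameters()), lr=1e-4)

frame_t = torch.randn(4, 3, 64, 64)          # fake video frames at time t
frame_t1 = torch.randn(4, 3, 64, 64)         # and at time t+1: the free "label"

optimizer.zero_grad()
loss = nn.functional.mse_loss(predictor(encoder(frame_t)), frame_t1)
loss.backward()
optimizer.step()
# The point: no human labels and no driver controls were needed; the encoder's
# features can then be reused by the networks that build the vector space.
```

The learned features can then feed the supervised vision networks that build the vector space, which is where the Dojo connection would come in.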

The current "conventional" method of having a NN generate vector space requires human-coded driving policy, which can be quite the bottleneck.

Fortunately, this isn't true. You can do “mid-to-mid” imitation learning, in which the imitation network takes the vector space representations as its input rather than pixels. Then its output is a plan/path/trajectory/action. That gets sent to the control software (which is hand-coded) and turned into low-level steering and acceleration commands. In training, the human's plan/path/trajectory/action is paired with the vector space representations and that's the state-action pair (“state” as in world state or environment state), which is the input-output pair for deep supervised learning.
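A minimal sketch of that mid-to-mid setup, with invented sizes and encodings: the input is the vector-space state rather than pixels, the output is a short trajectory, and each training example pairs the scene the human saw with the path the human actually drove.

```python
# Mid-to-mid imitation, roughly as described above: the network never sees pixels.
# Input is the vector-space state (nearby objects, lane geometry, light states),
# output is a short planned trajectory, and the training target is the path the
# human actually drove. Sizes and encodings here are invented for illustration.

import torch
import torch.nn as nn

STATE_DIM = 128        # assumed: flattened encoding of the vector-space scene
HORIZON = 10           # assumed: predict 10 future (x, y) waypoints

planner = nn.Sequential(
    nn.Linear(STATE_DIM, 256), nn.ReLU(),
    nn.Linear(256, HORIZON * 2),           # -> 10 waypoints, 2 coords each
)

optimizer = torch.optim.Adam(planner.parameters(), lr=1e-3)

scene_state = torch.randn(16, STATE_DIM)    # batch of vector-space states
human_path = torch.randn(16, HORIZON * 2)   # the trajectories humans drove

optimizer.zero_grad()
loss = nn.functional.mse_loss(planner(scene_state), human_path)   # state-action pairs
loss.backward()
optimizer.step()
# Hand-coded control software then turns the chosen trajectory into steering and
# acceleration commands, which is the split described in the paragraph above.
```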

Waymo did mid-to-mid imitation learning with their ChauffeurNet research project:

Learning to Drive: Beyond Pure Imitation

I believe Tesla's approach to planning/driving policy is a combination of hand-coded elements and mid-to-mid imitation learned elements.
 
That's going to be a headache, and very likely harder than simply driving yourself. The problem is that you only recognize the need to intervene rather late in the process, which could mean you're intervening too late. You're not driving with the anticipation that normally determines your next action; you're correcting at the last second.
No different than TACC. When I approach stopped cars at a traffic light, I expect the car to slow down, and if it doesn't, I intervene. In the beginning I used to intervene too early because I didn't know exactly how AP would slow down. Now I just watch. In some 9 months of using AP on city and highway roads this way, I've had to intervene once (curving road).

With city NOA I expect the same thing. Around traffic lights & stop signs, I expect to see the car slow down, and if it doesn't, I'll intervene.

So, no, you don't intervene at the last second; you will have plenty of time.