Neural Networks

My theory is that the blind spot visualization UI "snaps" all the vehicle orientations to the predicted path of each respective lane, so all vehicles are always rendered parallel to the lane path rather than oriented on a per-vehicle basis. This works well for forward objects, but for rearward vehicles the lane markers can't always be clearly seen from the repeater or rear cameras, especially in heavy traffic, so the path prediction is low confidence and its output is erratic. You end up with vehicles being rendered at ridiculous orientations.
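For illustration only, here is a minimal sketch of what that kind of snapping could look like; the data structures (a lane path as a list of points, a vehicle with an arc position "s") are assumptions, not anything known about Tesla's renderer.

```python
import math

def lane_heading_at(lane_path, s):
    """Heading (radians) of the predicted lane path near arc position s,
    approximated by finite-differencing two neighbouring path points."""
    i = max(0, min(int(s), len(lane_path) - 2))
    (x0, y0), (x1, y1) = lane_path[i], lane_path[i + 1]
    return math.atan2(y1 - y0, x1 - x0)

def render_heading(vehicle, lane_path):
    """Orientation used to draw a vehicle in the visualization.

    Under the 'snap to lane' theory, every vehicle is simply drawn parallel
    to its lane's predicted path, ignoring any per-vehicle heading estimate.
    When the path itself is low confidence (rear/repeater cameras losing the
    lane lines in traffic), this snapped heading jumps around from frame to
    frame, which would produce exactly the erratic orientations described above.
    """
    return lane_heading_at(lane_path, vehicle["s"])  # "s" is a hypothetical arc position
```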
 


You're probably at least partially correct, but it's not the UI that's doing this. It's more likely the vision NN itself that has learned vehicle orientation in relation to lane lines. I also seem to recall reading here (but don't have a link handy) that Tesla's vision NN takes two sequential frames as input on each iteration, so it likely relies on movement to achieve the best accuracy.
 
From Karpathy's presentation, they definitely take multiple frames (he didn't specify how many), which are used for some tasks like dynamic objects.
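To make the "multiple sequential frames" idea concrete, a generic sketch (not Tesla's actual architecture) would stack consecutive frames along the channel axis so a single forward pass can see motion:

```python
import torch
import torch.nn as nn

class TwoFrameNet(nn.Module):
    """Toy network that takes two consecutive RGB frames stacked along the
    channel axis (3 + 3 = 6 channels), so each forward pass sees how objects
    moved between frames."""
    def __init__(self, out_channels=8):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        self.head = nn.Conv2d(64, out_channels, kernel_size=1)

    def forward(self, frame_prev, frame_curr):
        x = torch.cat([frame_prev, frame_curr], dim=1)  # (N, 6, H, W)
        return self.head(self.backbone(x))

# usage with two consecutive (downscaled) frames from the same camera
prev, curr = torch.rand(1, 3, 120, 160), torch.rand(1, 3, 120, 160)
out = TwoFrameNet()(prev, curr)  # per-pixel outputs, e.g. for dynamic objects
```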
 

There hasn't been evidence that this kind of NN is in production yet, only that it was present in the code at some point for unknown purposes.

I think the observation that car/object visualizations are snapped based on the direction of travel of the surrounding lanes is pretty spot on. You can see some fairly accurate orientations in this relatively older video from verygreen, but the cars rendered on screen in a parking lot or on the street are pretty much never this accurate:

 

Holy Christballs! That video makes me realize the robots are coming for us all.
 
It would need to be a low number like 2 to avoid introducing a lot of extra latency. If they sample at 30 fps, then one frame = 33.3 ms of latency, and at three frames that would be 100 ms of extra latency.

They can use the current frame for object detection and one from X samples ago for motion detection. Then the only latency is in velocity change detection.
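A quick sketch of that buffering scheme (the frame rate, buffer depth, and the detect/motion callables are placeholders): the newest frame is processed immediately, so detection adds no extra latency, and motion is computed against a frame that is already X samples old.

```python
from collections import deque

FPS = 30   # assumed camera rate: one frame every 33.3 ms
X = 3      # compare against the frame from X samples ago

class FramePairFeeder:
    """Keeps the last X frames so the current frame can be processed as soon
    as it arrives, while motion is estimated against an older buffered frame."""
    def __init__(self, depth=X):
        self.buffer = deque(maxlen=depth)

    def step(self, frame, detect, estimate_motion):
        detections = detect(frame)  # current frame only: no added latency
        motion = None
        if len(self.buffer) == self.buffer.maxlen:
            oldest = self.buffer[0]
            # time baseline between the two frames is X / FPS = 100 ms here
            motion = estimate_motion(oldest, frame, dt=X / FPS)
        self.buffer.append(frame)
        return detections, motion
```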
 
Incremental progress in basic image recognition is fairly well understood at this point...

It would be very interesting to gain some insight into Tesla's development approach for _how_ the car drives. Driving policy - not just for other road users, but in general... this is an area that Tesla haven't spoken about (all of Karpathy's talks focus on image recognition and data collection). Teaching AP how to read, react to, and understand the world around it (rather than just recognise it) is the really interesting area we need to see a lot of advancement in before it becomes a viable approach.

Would be super interested in any thoughts from @verygreen on any news in that area...
 

Yes. The latest builds seem to do lane keeping like a neural network that has been trained on driving, i.e. it doesn't appear to be using programmatic rules. It does appear to use rules for deciding actions on things like stop lights, stop signs, etc. So, a hybrid approach: some neural net, some programmatic rules. I suspect this is the best way forward for Tesla.

I think human brains also use a similar hybrid approach. When you are a beginner driver, your cortex is consciously giving inputs into the car and furiously trying to keep the car centered in the lane. Over time, the autonomous parts of the brain start to take over. An experienced driver can be thinking quite complex thoughts (daydreaming) while driving because they aren’t consciously doing lane keeping anymore, that is being done unconsciously.

In a way, Tesla’s off line learning and neural net builds are similar to what the human brain does in this case.

But this approach breaks down for things that have hard and fast rules, like stop sign intersections (4-way, not 4-way, left turn, right turn, etc.). Here, you see that even human brains struggle. How many times have you been at a 4-way (they are common where I live), and four cars have approached their stops, but they don't enter the intersection in the correct order? What you'll often see is a person entering after they have given a single driver the right of way, instead of giving possibly three other drivers the right of way. These drivers are letting their brains drive autonomously without thinking about the rules, and human brains will often shortcut and treat a 4-way like a two-way stop sign intersection, especially when another part of their brain is stressed about getting somewhere fast.

Given that human neural nets can’t properly handle 4 way intersections, I doubt Tesla will figure it out (but I could be wrong). So it’ll probably be a combo of neural net and some interesting programmatic coding.
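For what it's worth, a toy sketch of that kind of combo (everything here is invented for illustration, not Tesla's planner): a learned model proposes the steering command, while hard-coded rules arbitrate things the net can't be trusted with, like 4-way stop right-of-way.

```python
def may_proceed_at_all_way_stop(my_arrival, others, tie_window=0.5):
    """Hard-and-fast right-of-way rule for a 4-way stop.

    `others` is a list of (arrival_time, is_to_my_right) tuples for vehicles
    still waiting. Proceed only if we arrived before everyone else; on a
    near-simultaneous arrival, yield to the vehicle on the right.
    """
    for arrival, is_to_my_right in others:
        if arrival < my_arrival:
            return False                                   # they were there first
        if abs(arrival - my_arrival) < tie_window and is_to_my_right:
            return False                                   # tie: yield to the right
    return True

def drive_step(perception, steering_net):
    """One hybrid control step: the NN handles lane keeping, rules handle stops."""
    steer = steering_net(perception["camera_frames"])      # learned behaviour
    throttle = 0.3
    if perception.get("at_all_way_stop"):
        if not may_proceed_at_all_way_stop(perception["my_arrival"],
                                           perception["other_arrivals"]):
            throttle = 0.0                                 # rule overrides the net
    return steer, throttle
```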
 
No, which is why I said "seems". But you might get clues for this by looking at Karpathy's recent videos. He talks about the various things the different neural nets do, and I believe (but am not certain) that some of those tasks were for driving, not just perception.

They're talking about things planned far in the future. I'm extremely skeptical that they are, or will in the near future, allow what is effectively a black box to make choices about lane following. They need a way to decipher why a NN has given a specific output before they'd be allowed to unleash it on the public.
 
And that is where test cases, millions of miles of validation, visualizations, and heat maps come into play.

As long as it does the right thing, the internal processing doesn't matter, provided there is a high probability of it continuing to do the right thing in the future.
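In concrete terms, "test cases and millions of miles of validation" for a black-box controller could be as simple as a regression suite that replays logged scenarios and flags any output that strays too far from the reference behaviour. The scenario format and threshold below are invented for illustration:

```python
def run_regression(steering_net, scenarios, max_deviation_deg=2.0):
    """Replay logged scenarios through the network and report cases where its
    steering output deviates from the recorded reference by more than allowed.

    `scenarios` maps a scenario name to a list of (camera_frames,
    reference_angle) pairs captured from real drives.
    """
    failures = []
    for name, samples in scenarios.items():
        worst = max(abs(steering_net(frames) - ref) for frames, ref in samples)
        if worst > max_deviation_deg:
            failures.append((name, worst))
    return failures   # an empty list means this build passes the suite
```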
 
So, please explain the internal computing process of any human driver. And/or how current driver's training is any different.

Difference is when Tesla is back in court defending itself over the latest autopilot deaths the response "we don't know how it works or what rules it follows, it's more like a human driver we trained" is just going to make it worse for them.
 
Actually, if taken to court, Tesla would have a billion+ miles of data plus all their training test cases to show the vehicle behaves properly.
"Your honor and members of the jury, here are all the similar cases in which FSD acted in a reasonable manner, including this one. This event did (or did not) show an area the system can improve in (if did) and we have already updated all of our vehicles with that improvement."

Versus, 'yep we hard coded it to do exactly what it did when it did what we got sued for'.

You can't code against life. Nor can you create a system that can fully predict effects of actions.
 
So, please explain the internal computing process of any human driver. And/or how current driver's training is any different.

This is a nonsense argument. I can explain my decision making; there is no comparable way to ask a neural net why it produced a given output. There's an entire field of study devoted to building tools (including other neural nets) that probe neural nets with inputs to try to determine how they respond.
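That probing usually looks something like occlusion sensitivity: systematically blank out parts of the input and measure how much the output moves, which produces exactly the kind of heat map mentioned earlier in the thread. A generic sketch (any image model, nothing Tesla-specific):

```python
import numpy as np

def occlusion_sensitivity(model, image, patch=16):
    """Heat map of how strongly each image region influences the model's
    scalar output, found by zeroing one patch at a time and measuring the
    change. `model` is any callable mapping an HxWx3 array to a float.
    """
    h, w = image.shape[:2]
    baseline = model(image)
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0
            heat[i // patch, j // patch] = abs(model(occluded) - baseline)
    return heat   # larger values = regions the decision depends on most
```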