
My car is learning a lot but.... what? and how?

I'm confused; do different cars learn differently? Or are you implying the entire fleet learns as one?
All the cars learn collectively (although there may be limited cases of individual learning). This isn't 100% certain, but from what I can tell, basically all cars provide data that is used to train a neural network. The updated neural network is then pushed out to all cars with software updates.

In this way, all cars with the same software version respond the same way to the same situation (you won't have cars responding differently based on what each one individually learned).
 
OK, so...

Inference
You can think of 'inference' as the execution of the 'program' you created by training your neural network. In effect, it is almost the same process as training, without the feedback part. In inference, you show the neural net some data, just like in training, but you don't feed the result back to the network (indeed, you usually can't, since you don't have a label for the data). In essence, you are trusting the neural network's conclusion. This is different from how our brains work, in that we learn dynamically from our mistakes and successes, but the neural net is static - it only learns via training (there is a lot of research here, and some day we'll have dynamic learning, but it'll require a lot more processing than our AP2 computers are capable of).

Inference is a lot less compute-intensive than training, so it can easily run on our cars.
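
To make that concrete, here's a minimal toy sketch in plain numpy (nothing to do with Tesla's actual software): training is the forward pass plus feedback from labels, while inference is the forward pass alone.

```python
# Toy illustration of training vs. inference: a tiny fully-connected
# network learns XOR, then is used forward-only.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)                # labels

W1 = rng.normal(size=(2, 8))
W2 = rng.normal(size=(8, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# --- Training: forward pass PLUS feedback (backpropagation) ---
for _ in range(5000):
    h = sigmoid(X @ W1)                        # forward
    out = sigmoid(h @ W2)
    err = y - out                              # compare against labels
    d_out = err * out * (1 - out)              # backward
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 += 0.5 * h.T @ d_out                    # weight updates = "learning"
    W1 += 0.5 * X.T @ d_h

# --- Inference: forward pass only, no labels, no weight updates ---
def infer(x):
    return sigmoid(sigmoid(x @ W1) @ W2)

print(infer(X).round(2))   # we simply trust the network's output
```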

I suspect our cars are doing a bit of 'inference' when we engage Autopilot - there *may* be a trained neural net identifying objects such as lane markings and other cars. What we can't know is whether a neural net is controlling the car based on that, or whether some more traditional system is doing so. I suspect the following, however:

  • TACC is not based on a neural network. The acceleration/braking behavior feels much smoother, so I suspect TACC is currently based on traditional programming.
  • Sign recognition is also not based on a neural network. They're probably running something comparatively simple like OpenCV object detection (traffic signs come in only a small variety and always tend to face the driver).
  • The FSD videos Tesla published bear all the earmarks of neural network object detection, but I can't really draw any conclusions about what is actually doing the car control.
  • The autosteer speed limitation is probably a framerate limitation - a trained neural network can only process frames at a certain rate (rough numbers in the sketch below). From what I know about the AP2 computer, it isn't a hardware limit, so we'll see improvements as the Tesla folks optimize their software.
source: I stayed at a Holiday Inn Express last night.
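
To put rough numbers on that last bullet - purely illustrative, since the real AP2 camera frame rates and network throughput aren't public:

```python
# Back-of-the-envelope: how far the car travels between processed frames.
# Assumed rates for illustration only; real AP2 figures are not public.
def meters_per_inference(speed_mph, inferences_per_second):
    speed_ms = speed_mph * 0.44704          # mph -> m/s
    return speed_ms / inferences_per_second

for fps in (10, 30):
    for mph in (45, 90):
        print(f"{mph} mph at {fps} inferences/s -> "
              f"{meters_per_inference(mph, fps):.1f} m between frames")
```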
 
One last point: the engineers doing all the labeling of our collective video and data have to be very selective: deep learning training sets are typically a few terabytes at most, and 8 cameras multiplied by tens of thousands of cars can produce petabytes. This means they are probably looking for exceptions and anomalies - things that provide novel information for the neural net to train on.

This means they're most interested in situations where AP is enabled but the driver took over, either at the prompting of AP (interesting) or without prompting (super interesting).
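
A hypothetical sketch of that kind of triage; the event fields and priorities are invented for illustration, since nobody outside Tesla knows the real telemetry format:

```python
# Hypothetical triage of fleet telemetry: keep only clips worth labeling.
# All field names are made up for illustration.
from dataclasses import dataclass

@dataclass
class DriveEvent:
    ap_engaged: bool              # was Autopilot active?
    ap_requested_takeover: bool   # did AP prompt the driver?
    driver_took_over: bool
    clip_id: str

def label_priority(e: DriveEvent) -> str:
    if e.ap_engaged and e.driver_took_over and not e.ap_requested_takeover:
        return "high"      # driver overrode AP unprompted: "super interesting"
    if e.ap_engaged and e.driver_took_over:
        return "medium"    # AP asked for help: "interesting"
    return "skip"          # routine driving adds little new information

events = [
    DriveEvent(True, False, True, "clip_001"),
    DriveEvent(True, True, True, "clip_002"),
    DriveEvent(True, False, False, "clip_003"),
]
for e in events:
    print(e.clip_id, label_priority(e))
```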
 
So, a question I asked Mr. Musk on Twitter but didn't hear back on: what do you think is being done to address situations where traffic and direction signals are given by hand? Think construction areas, street crossings, parking lots at theme parks. There is so much variability in the way people give hand signals that sometimes even we human drivers get confused. Is the NN going to be able to handle this with programmed behaviors, or are people going to have to learn a standard set of hand signals to be used across the board (good luck with that)?

Also, I'm fairly confident that the current AP2 hardware will be able to reach level 5. Perhaps not to perfection, but to a level that far exceeds the performance of the average human driver. This is based on my observations of drivers with very limited visual acuity and/or very poor driving skill and judgement still successfully navigating all types of weather and conditions on a daily basis. Honestly, this system can probably already see better than a large number of the humans driving on the road today, plus it has radar to supplement. Just teach it to slow down when it's unsure, something humans don't like to do, and it should be fine. Your automated car won't be in a hurry to kill itself like so many of the human drivers out there who obviously have extremely important places to be - hence the speeding and failure to follow traffic laws.
 
Should we be concerned about EM's DS decision not to include LIDAR in the current AP2 vehicles, while it's included on the vehicles of many other car manufacturers that are trying to create their own self-driving tech?
I'll admit that the LIDAR hw looks horrendous, but doesn't the fact that EM point-blank states it's not required for level 6 autonomy worry any of you?

I'm new to Tesla, as I've had my MS60D for just over a month, but this was a concern that came to me once I started educating myself more and more.

Hope to hear some input from those of you more experienced. Thanks!
 
...LIDAR hw looks horrendous...

You are still thinking of the first generation of LIDAR, but they don't have to look that way nowadays. Instead of one rotating LIDAR, you can have three less obvious units:

[Image: Hyundai autonomous test vehicle with concealed LIDAR units]
 
Should we be concerned about EM's DS decision not to include LIDAR in the current AP2 vehicles...

I still think LIDAR is the gold standard, with a very high price, although it is promised to be real cheap real soon.

I guess soon might be 2020.

In the meantime, we'll have to make do with something cheaper than LIDAR until the promised cheapness arrives.
 
Sorry, I'm just not drinking the Kool-Aid. Yes, Nvidia showed a car that avoided cones, and Tesla now uses Nvidia hardware. I don't make the leap of faith that Teslas now avoid traffic cones. How many posts have we seen of Autopark not avoiding concrete pillars? Hasn't it learnt by now?

Machine learning is AI. It's a rudimentary form of AI, but it is indeed AI.

Unless you think that Tesla has had programmers manually write billions to trillions of lines of code to account for the many billions of miles of road around the earth, this is indeed AI. It's no different from how Google DeepMind's Go bot works, or how Watson can suggest cancer treatments after thousands of medical journals are thrown at it.

Tesla's code simply tells the Autopilot system how to monitor and learn from human drivers, both in shadow mode and outside of shadow mode. It also tells Autopilot how to make use of hundreds of millions of miles of AP data. Put two and two together and you have a rudimentary AI.
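
Purely as an illustration of what a 'shadow mode' comparison could look like - the signals and threshold below are invented, not anything Tesla has documented:

```python
# Hypothetical shadow-mode check: AP computes what it *would* do,
# then compares against what the human actually did.
def shadow_mode_disagrees(ap_steering_deg, human_steering_deg,
                          threshold_deg=5.0):
    """Flag frames where AP's proposed steering differs noticeably
    from the human's actual steering (all numbers illustrative)."""
    return abs(ap_steering_deg - human_steering_deg) > threshold_deg

samples = [(1.2, 1.0), (0.5, 9.5), (-3.0, -2.5)]
flagged = [s for s in samples if shadow_mode_disagrees(*s)]
print(flagged)   # candidate moments worth uploading for review
```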
 
...You can think of 'inference' as the execution of the 'program' you created by training your neural network... source: I stayed at a Holiday Inn Express last night.

Those were two excellent entries, but I just want to clarify a couple of things that people get hung up on as they relate to Tesla.

For an AP2 car, the Tesla Vision software runs on Nvidia hardware, but it's completely separate software from what Nvidia demonstrates. It might share some fundamentals or various software libraries, but what's valid for one can't be assumed for the other. Tesla might use parts of it, or might not at all. Tesla is also trying to avoid being dependent on a specific type of hardware, because they eventually plan on making their own chips for it, similar to what Google did with its TPU chip for TensorFlow.

There are also different kinds of neural networks to use depending on what you're trying to accomplish. What you described is typical for a deep convolutional network, where you use forward and backward propagation during the training phase, and then, when you deploy the network, it only uses forward propagation. An AP car is doing inference to detect lanes and pedestrians, and to determine whether that's a truck or a car in front of you.

It can't learn from its mistakes by itself, but this is where fleet learning comes into play. Let's say you're driving down the road with AP on, and it screws up, so you take over. This action gets recorded, and if enough people correct it in the same place, then fleet learning can take place. I would call this supervised learning, as it likely relies on a human to decide which portion of Autopilot needs to be fixed so it doesn't screw up there again.

Most of fleet learning isn't about AI or deep neural networks. It's really about building highly detailed maps, to the point where the car can handle areas where the lane lines are covered with snow. To do that, it needs maps learned from the fleet.
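
A toy sketch of that idea: many noisy fleet traces of the same stretch of lane averaged into one higher-precision estimate. The numbers and structure are invented for illustration.

```python
# Toy "fleet map" idea: many noisy reports of the same lane centerline,
# averaged into a single higher-precision estimate the car could fall
# back on when the paint is hidden under snow. All numbers are made up.
import numpy as np

rng = np.random.default_rng(1)
true_lane = np.linspace(0.0, 100.0, 50)            # 50 points along a road
traces = [true_lane + rng.normal(0, 0.5, 50)        # each car reports a noisy
          for _ in range(200)]                       # version of the lane

fleet_map = np.mean(traces, axis=0)                  # aggregate across the fleet
print("per-car noise ~0.50 m, fleet-map max error:",
      round(float(np.abs(fleet_map - true_lane).max()), 3), "m")
```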

The problem with using Fleet learning to correct the Vision software is the lack of feedback. The car doesn't know if it read the speed limit incorrectly, and the driver doesn't have time to tell Tesla if it did.

Self-learning neural networks work really well for things like pinball, where they know "Hey, I lost" and can play the game over and over until they're better than a human.
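
That "Hey, I lost" signal is exactly what reinforcement learning uses as a reward. Here's a minimal tabular Q-learning sketch on a toy one-dimensional game - not pinball, and nothing Tesla-specific:

```python
# Minimal tabular Q-learning on a toy 1-D track: reach position 4 (+1 reward),
# fall off at position 0 (-1, i.e. "Hey, I lost"). Illustration only.
import random

random.seed(0)
N_STATES, ACTIONS = 5, (-1, +1)          # positions 0..4, move left/right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1

for episode in range(500):
    s = 2                                 # start in the middle
    while 0 < s < N_STATES - 1:
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda x: Q[(s, x)])
        s2 = s + a
        r = 1.0 if s2 == N_STATES - 1 else (-1.0 if s2 == 0 else 0.0)
        best_next = 0.0 if s2 in (0, N_STATES - 1) \
            else max(Q[(s2, x)] for x in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])  # learn from win/loss
        s = s2

# Learned greedy policy: always move right (toward the reward).
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(1, N_STATES - 1)})
```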

What I find really awesome is that most of this stuff is possible with today's graphics cards. So if you have the right hardware and a lot of time, you can play with this stuff on your own - you don't really need a supercomputer. I use an Ubuntu workstation with an Nvidia Titan X for training, and then an Nvidia TX1 to do inference. The TX1 can be thought of as the car computer, though it's a lot less powerful than what's in an HW2 Tesla. But I'm just trying to make a rover robot, and it's perfect for that.

For training I use Nvidia DIGITS. Lots of people use Caffe directly, but I prefer DIGITS, which is essentially a front end to Caffe. It comes with a lot of tutorials and a solid support base.

NVIDIA DIGITS

For inference deployment I've borrowed heavily from this project.

GitHub - dusty-nv/jetson-inference: Guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and Jetson TX1.
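
For anyone who wants a feel for the inference step without the Jetson toolchain, here's a generic sketch of loading a DIGITS/Caffe-trained model with OpenCV's dnn module instead of the TensorRT path the repo above uses. The file paths, input size and mean values are placeholders you'd swap for your own model's.

```python
# Generic sketch of running inference with a Caffe model trained in DIGITS,
# using OpenCV's dnn module. Paths and preprocessing are placeholders.
import cv2
import numpy as np

net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "snapshot.caffemodel")

image = cv2.imread("test.jpg")                        # any test image
blob = cv2.dnn.blobFromImage(image, scalefactor=1.0,
                             size=(224, 224),         # must match the network
                             mean=(104, 117, 123))    # typical ImageNet means
net.setInput(blob)
scores = net.forward()                                # forward pass only

print("predicted class index:", int(np.argmax(scores)))
```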

For reinforcement learning, I'm trying to wrap my head around this project so I can reuse it for things I want my robot to do. Basically, this is a self-learning type of network.

GitHub - dusty-nv/jetson-reinforcement: Deep reinforcement learning libraries for Jetson and online training
 
You are still thinking of the first generation of LIDAR, but they don't have to look that way nowadays. Instead of one rotating LIDAR, you can have three less obvious units:

[Image: Hyundai autonomous test vehicle with concealed LIDAR units]
The problem is that the first type of LIDAR (the rotating one on top of the car) is the only type that provides something superior, because it allows a 360-degree view of the environment.

Forward-facing LIDAR of the type in your picture does not provide that, and a well-designed radar array (plus forward cameras with multiple fields of view) can accomplish the same thing - and would be even better by working in conditions of poor visibility.

That's why Elon says LIDAR is unnecessary.
 
Ford places 2 LIDAR pucks above side mirrors.

[Image: Ford Fusion Hybrid autonomous development vehicle]
The pucks are what I was talking about as superior. But they have the same problems mentioned: they are ugly and result in poor aerodynamics. Placing them on the mirrors also adds blind spots (mainly, the rear can't be covered completely even with both).

LeddarTech places LIDARs in plain sight at various locations on the car, as part of familiar assemblies/housings, to create 360-degree coverage.

Side mirrors:
[Image: LIDAR integrated into a side-mirror housing]

Headlight assemblies:
[Image: LIDAR integrated into a headlight assembly]

Tail lamp assemblies:
[Image: LIDAR integrated into a tail lamp assembly]

[Image: LeddarTech 360-degree sensor coverage diagram]

It also claims its LIDAR works in inclement weather:
[Image: LeddarTech slide on LIDAR performance in inclement weather]
The 2D lidar facing the rear and the MEMS lidar facing the front offer the same type of functionality as radar does.

The 3D lidar on the two mirrors is what functions similarly to the puck (which can produce point clouds), but the two don't overlap, and they have no visibility to the rear or to much of the side. I'm not convinced there will be a significant advantage to this vs the 5 camera setup Tesla has to cover the same area.
 
...I'm not convinced there will be a huge advantage to this vs the 5 camera setup Tesla has to cover the same area.

Ahhh! That's the thing!

People automatically think that LIDAR companies are anti-radar, anti-camera, anti-sonar.

They advocate adding LIDAR as an additional sensor; they don't encourage removing the sensors you already have.
 
...They advocate adding LIDAR as an additional sensor; they don't encourage removing the sensors you already have...
But the question is: are they necessary (as people seem to think from the promotion of it)? If a camera and radar suite can do the same thing, then they might not be, which is Elon's point. His point was never that it wouldn't be nice to have in addition, just that it isn't necessary to reach level 5 autonomy.
 
...are they necessary...

With the Autopilot system, the human operator is responsible for scanning the road, so there is no need to spend money on LIDAR to do the scanning.

That question was asked again after the fatal Florida Autopilot collision.

Tesla admitted that in this particular scenario, the camera couldn't differentiate the white tractor-trailer from the bright background, so no brake command was issued.

Tesla also said that the radar did detect the tractor-trailer, but the brake command was not issued because the system couldn't tell whether it was just an overhead road sign.

That's when the LIDAR companies say that, with a full fusion of sensors including LIDAR, the fatal Florida Autopilot collision could have been avoided.
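
Purely to illustrate that fusion argument - the logic and inputs below are invented, not how Tesla's (or anyone's) production system works:

```python
# Toy fusion rule illustrating the argument: camera alone was washed out,
# radar alone can't tell a trailer from an overhead sign, but a third
# sensor (e.g. LIDAR reporting an object in the path) could break the tie.
# Entirely hypothetical logic.
def should_brake(camera_sees_obstacle, radar_detects_return,
                 lidar_confirms_object_in_path):
    if camera_sees_obstacle:
        return True
    if radar_detects_return and lidar_confirms_object_in_path:
        return True          # radar return corroborated by a second sensor
    return False             # lone radar return treated as a possible sign

# Approximation of the Florida scenario as described above:
print(should_brake(camera_sees_obstacle=False,
                   radar_detects_return=True,
                   lidar_confirms_object_in_path=True))   # -> True
```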

Sure, it's easy for LIDAR companies to say that without reconstructing the very same scenario.

However, we can look at Google's cars, which have a long history of using LIDAR with very few incidents. Google used to report incidents monthly, and the DMV publishes their annual "disengagement" reports (a disengagement is each time a human intervenes in the automation, such as taking over the steering wheel or the brake/accelerator pedals...).
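
The headline number in those DMV reports is essentially miles driven per disengagement; the arithmetic is trivial (the figures below are placeholders, not actual report numbers):

```python
# Miles per disengagement, the metric highlighted in the DMV reports.
# Placeholder figures, not real report numbers.
def miles_per_disengagement(autonomous_miles, disengagements):
    return autonomous_miles / disengagements

print(round(miles_per_disengagement(400_000, 300)))   # ~1333 miles per intervention
```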

We might have to wait until LIDAR is more common; maybe then a third party can reconstruct the Florida scenario and test it out.
 
Google wants to know everything, including as much as possible about the world of roads and traffic. They're not into selling production cars, and never have been.

That's why they equipped ridiculous-looking, relatively slow-moving toy cars with highly expensive HD-mapping hardware: LIDAR. And drove around for a while, like the Street View cars, to collect data.

LIDAR is the best at creating excellent HD maps. It does so by firing laser light in all directions (360 degrees) and measuring the time it takes for each reflection to return to the car.

The (BIG) downside is the wavelengths these things operate at: enter snow, dust, rain or fog, and Houston, we have a problem.

Tesla is trying to sell cars at an "affordable" price tag. At the same time, they're trying to create a top-notch autonomous system.

Their choice of skipping LIDAR and sticking with GPS, radar, sonar and cameras (the latter being FAR superior to the others at certain tasks, such as lane reading, sign recognition and traffic-light reading) is very understandable.

Understand that all of the sensor types have their pros and cons. No single sensor type is good enough for every task, but the question is: do we need all of them or not?

That's a question of what you're trying to accomplish. And that seems to be Level 5 autonomy. (Someone posted something about "Level 6"... I don't know what he's referring to.)

The SAE "level" system is just a classification of the degree to which the system expects a human to take over. SAE Level 5 means absolutely no human intervention; Level 0 means no automation whatsoever (Level 1 adds only basic driver assistance). The rating is not about sensors or how many of them you have - it's all about human intervention.
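
For reference, here's the ladder as it's usually summarized (the wording of the summaries is mine, not SAE's official text):

```python
# SAE J3016 driving-automation levels, keyed by how much the human
# is expected to intervene (paraphrased summaries).
SAE_LEVELS = {
    0: "No automation - human does everything",
    1: "Driver assistance - one function (steering OR speed) assisted",
    2: "Partial automation - combined steering + speed, human monitors constantly",
    3: "Conditional automation - system drives, human must take over on request",
    4: "High automation - no human fallback needed within a limited domain",
    5: "Full automation - no human intervention anywhere, anytime",
}

for level, meaning in SAE_LEVELS.items():
    print(f"Level {level}: {meaning}")
```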
 
I was driving through the recent California storms, dodging tumbleweeds at one point and reacting to a serious accident a little later while trying to avoid getting into one myself, and I realized that full L5 is probably going to take a very long time.

At one point the road was closed and an officer was diverting traffic, but doing a really bad job of it (I thought he was telling me to stop). I actually consulted my wife. The jump from object recognition, bounding boxes and path planning to "all contingencies and high-level reasoning" is just too vast.

L4 is coming for sure, though.
 
...ridiculous-looking, relatively slow-moving toy cars with highly expensive HD-mapping hardware: LIDAR...

I know what you mean about the Google Koala car.

[Image: Google "Koala" self-driving prototype]


Google's cars before the Koala were regular cars like the Prius and Lexus.

Those do drive at slow city speeds as well as fast freeway speeds.

As the disengagement rate went down, the cars could drive more and more miles before a human needed to intervene:


[Chart: Google autonomous miles driven per disengagement]


They felt comfortable enough with those regular cars that they decided to take it to the next phase: removing all human intervention with a Koala car that has no human controls (steering wheel, brakes, accelerator...).

The law does not allow a car to operate without a human driver present, so they let the Koala drive on its own on private roads only.

And because this would be the first time Google let a car drive itself without a human to take over in an emergency, they limited the speed to 25 mph to reduce the severity of any failure.

Ideally, once the Koala shows a minimal failure rate, it would then be allowed up to freeway speeds - still without human controls, of course.

The problem is that that vision is still quite a few years away, but there is pressure to make a profit now.

Thus Waymo was created, and this time there's no need to wait for perfection, and no need for a car without a steering wheel, brakes and accelerator.

It is now partnering with conventional carmakers, and if the automation makes a mistake, there's always a human to catch it by taking over the steering wheel, brakes and accelerator.
 
I've been wondering about fleet learning too. I understand neural net training (at Tesla) versus inference (in the car), and what I'm hearing is that if a driver intervenes, it somehow feeds back into the training. How? More specifically, what gets captured and sent to Tesla's training system, and how does it get applied to the next training pass? Since it's a camera-based system, it would seem like it would need to upload the last few seconds of video from the cameras, as well as the radar and ultrasonics. Then Tesla would need to decide whether AP2 made a mistake that the driver corrected, or not. And when the system is operating in shadow mode, it's comparing the AP2 path to the driver's path, and you're going to get a lot more deviations (due to personal driving style) than when AP2 is operating and the driver takes over.