Elon: "Feature complete for full self driving this year"

We are not talking about a typical car. We are talking about an early access tester.


Funny enough, Green is an early access tester, so he knows exactly what's happening there too.

In this thread (scroll up a bit for the start) he mentions that the release notes on this EAP release are a bit misleading as to what it's "really" stopping for, and also a bit about what goes into these new AP screenshot captures.

green on Twitter
 
Karpathy presented at another conference at the end of February, but the video of his presentation was released yesterday:


Some key points I picked out:
  • Some clips of Pedestrian AEB in the wild
  • He does actually discuss how relying on vision alone has slowed down their development compared to the LIDAR approach
  • He goes through the "recruitment clip" and discusses a few things the network is labeling
  • Stop sign classification examples, and modifiers. He says they have the world's biggest training set for stop signs with "except right turns" modifiers (tens of thousands of images).
  • Curated unit tests for the march of nines. The example test he gives shows a 99.51% pass rate right now for occluded stop signs (a rough sketch of how I picture such a pass-rate check is at the bottom of this post).
  • 48 neural networks with 1,000 output tensors, taking 70,000 GPU hours to train
  • Working on new detection task for police lights
  • "Occupancy tracker" is the term they're using for their current multi-camera map over space and over time (keeping temporal memory of previous prediction).
  • Autopilot rewrite is throwing out the "occupancy tracker": the multiple views over time are combined by the neural network itself instead of by procedural code. The AP rewrite is much better at predicting intersection edges.
  • At around 22 minutes, Karpathy shows off video of pseudo-LIDAR in action. All done by self-supervised techniques
  • Ends with recruitment push
  • In the Q&A, discusses low-definition maps. (Not "There's a stop-sign precisely here" but "There was a stop-sign somewhere around here...")

Clip of the pseudo-LIDAR:

[Screenshot from the talk: pseudo-LIDAR output]
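On the curated unit tests bullet, here is the rough sketch I mentioned: a fixed, hand-picked set of hard examples whose pass rate is tracked from one network release to the next. To be clear, this is just my own toy illustration of the idea; every name and value in it is made up except the 99.51% figure from the talk.

```python
def toy_detector(image_description):
    """Stand-in for the real stop-sign network: pretend it only finds
    signs that are not described as heavily occluded."""
    return "heavily occluded" not in image_description

def pass_rate(detector, curated_set):
    """curated_set: hand-picked (image_description, expected) pairs,
    all containing stop signs that are occluded in some way."""
    passed = sum(detector(img) == expected for img, expected in curated_set)
    return passed / len(curated_set)

curated_set = [
    ("stop sign behind spring leaves", True),
    ("stop sign half-hidden by a parked truck", True),
    ("heavily occluded stop sign at dusk", True),
]

print(f"pass rate: {pass_rate(toy_detector, curated_set):.2%}")  # 66.67% here
# The idea is that a new network only ships if it doesn't regress on
# curated sets like this one (the real-world example quoted was 99.51%).
```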
 

Yes, it is a very informative video. I always like Karpathy's talks on the subject. I wish he were the official spokesperson for all of Tesla's FSD progress.

Just to be clear, when you say "pseudo-lidar", Tesla is using camera vision to create depth maps similar to the depth maps that you get with lidar. Tesla is not actually using lidar.
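If it helps make "pseudo-lidar" concrete: the trick is to back-project a per-pixel depth map (predicted from camera images) through the camera intrinsics into a 3D point cloud, which then looks like lidar output. A toy numpy sketch with made-up intrinsics, not anything from Tesla's actual stack:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an H x W depth map (metres) into an (N, 3) point
    cloud in the camera frame. fx, fy, cx, cy are camera intrinsics;
    the values used below are purely illustrative."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no valid depth

# Fake 4x4 depth map where everything is 10 m away:
cloud = depth_to_point_cloud(np.full((4, 4), 10.0), fx=500, fy=500, cx=2, cy=2)
print(cloud.shape)  # (16, 3): a tiny lidar-like point cloud from camera depth
```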

Also, Karpathy makes a mistake talking about HD maps at the end. It is nonsense to say that HD maps are for getting the exact distance of leaves on trees. Why would an autonomous car need to know that? HD maps are used for the same reason that Karpathy mentions: to get the position of traffic lights, stop signs etc...

I am also still a bit puzzled by Tesla's philosophy on HD maps. So they are using low-resolution maps to get the approximate location of a stop sign, but reject HD maps? If you are going to use maps to get the location of a stop sign, why not get the exact position of the stop sign?

FYI, we already have a thread to discuss this video: Andrej Karpathy - AI for Full-Self Driving (2020)
 
Just to be clear, when you say "pseudo-lidar", Tesla is using camera vision to create depth maps similar to the depth maps that you get with lidar. Tesla is not actually using lidar.

I should always check the Autonomy forum first! Following that other thread now.

I think "pseudo-LIDAR" is actually a term that's been used in academia for a while: Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D...

With the HD map comment, I think at this point Tesla is just trying to save face. Karpathy did cut the questioner off pretty quickly when he tried to follow up. They're not using HD maps, they're using Tesla™ Maps ;)
 
I think "pseudo-LIDAR" is actually a term that's been used in academia for a while: Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D...

I know. I just wanted to post the clarification to make sure there was no misunderstanding. I did not want someone else to see your post and get confused and mistakenly think that Tesla was using some type of lidar now.

With the HD map comment, I think at this point Tesla is just trying to save face. Karpathy did cut the questioner off pretty quickly when he tried to follow up. They're not using HD maps, they're using Tesla™ Maps ;)

Yes. I think that Tesla will end up using HD maps. But they can't admit it since Elon made a big deal about HD maps being useless. I do wonder if Tesla does end up using lidar at some point to help with their depth perception accuracy, what will Karpathy and Elon say? LOL.
 
Yes. I think that Tesla will end up using HD maps. But they can't admit it since Elon made a big deal about HD maps being useless. I do wonder if Tesla does end up using lidar at some point to help with their depth perception accuracy, what will Karpathy and Elon say? LOL.
I don't want to beat the dead horse of HD vs "non-HD maps" again, but I think there is definitely a difference, though like porn, you'll know it when you see it.

Telling your software "there should be a stop sign somewhere at this intersection" is a hint for the cameras to look really hard and find the damned thing, devoting more resources to the stop sign detecting NN if necessary, and if the cameras DON'T see the stop sign, well we'd better stop anyway for safety. What if the stop sign happens to be occluded today because spring has sprung and the leaves are mostly covering it up? What if there is a FedEx truck parked in front of it?

Telling your software "there *is* a stop sign approaching at 12.935m, 12° to the right, and 2.123m above the ground" isn't any more helpful in this instance unless you're going to blindly act upon it, which I think we can all agree is a bad idea. It simultaneously requires a MUCH more detailed "HD" map of the environment, which means additional resources devoted to the initial data gathering and to constantly updating that highly detailed data.

In the "blocked by a FedEx truck" example, having an HD map that tells you with pinpoint accuracy where the stop sign is located doesn't help your visual software "see it" any better and could have the opposite effect of laser (lolz) focusing its attention on "where it should be" versus "generally speaking, this intersection has a stop sign, and if you don't see it, you should probably stop anyway."

To me, this is akin to "Go down the road a few miles, through 3 traffic lights. At the 4th light with the Wendy's on the corner, make a right go a little ways down and then look for the entrance on the right when it narrows down to 1 lane" versus "Proceed at the following tens of thousands of exact speeds, braking, and steering angles noting the series of 600,000 points in space that were totally there the last time I drove by 3 days ago, making adjustments to all of your accelerate/brake/steering angles as your slight deviations continually cause you to adjust every slight change of viewpoint to the points in space to arrive at your destination."
 
I know. I just wanted to post the clarification to make sure there was no misunderstanding. I did not want someone else to see your post and get confused and mistakenly think that Tesla was using some type of lidar now.

I don't really see how there could be any confusion around that.

[image attachment]


Edit: let the disagrees flow in purely for missing the chance to actually post this originally :)

[image attachment]
 
Autopilot rewrite is throwing out the "occupancy tracker": the multiple views over time are combined by the neural network itself instead of by procedural code. The AP rewrite is much better at predicting intersection edges.
Yeah, the Autopilot rewrite portions of the talk were the most interesting and exciting parts:
[Slide from the talk: BirdsEyeView Network diagram]

This wasn't explicitly presented, but notice how "Moving Objects" and "Road Lines" are making use of the "BirdsEyeView Network" now. From using current Autopilot, it's pretty clear that only the main camera is used for road lines, and that's why sharp curves are hard for Autopilot: it just doesn't see the road lines visible from the fisheye wide and side pillar cameras. Similarly, the "software 1.0" code manually stitching together objects from multiple cameras is what causes a large truck to appear multiple times in the display, because the front camera sees the front of a truck, the side camera sees the body of "another" truck, and the repeater camera sees the trailer of "yet another" truck, whereas the rewrite will see a single truck and make better predictions.
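For the curious, here is a toy sketch of what "the network itself fuses the camera views into a birds-eye view" could look like structurally. This is NOT Tesla's actual architecture (camera count, layer sizes and shapes are all invented); it just illustrates per-camera features being combined inside the network instead of by hand-written stitching code:

```python
import torch
import torch.nn as nn

class BirdsEyeViewFusion(nn.Module):
    """Toy illustration: one shared per-camera feature extractor, then
    fusion across cameras inside the network, then task heads that
    operate on the fused top-down grid. All sizes are invented."""

    def __init__(self, num_cameras=8, feat_dim=64, bev_size=32):
        super().__init__()
        # Shared feature extractor applied to every camera view.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((bev_size, bev_size)),
        )
        # Fusion across cameras happens in the network, not in "1.0" code.
        self.fuse = nn.Conv2d(num_cameras * feat_dim, feat_dim, kernel_size=1)
        # One head per task (road lines, moving objects, ...) would hang off here.
        self.road_lines_head = nn.Conv2d(feat_dim, 1, kernel_size=1)

    def forward(self, images):
        # images: (batch, num_cameras, 3, H, W)
        b, n, c, h, w = images.shape
        feats = self.backbone(images.view(b * n, c, h, w))
        feats = feats.view(b, -1, feats.shape[-2], feats.shape[-1])
        return self.road_lines_head(self.fuse(feats))  # top-down prediction

# Smoke test with fake frames from 8 cameras:
model = BirdsEyeViewFusion()
print(model(torch.rand(1, 8, 3, 96, 96)).shape)  # torch.Size([1, 1, 32, 32])
```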

Another exciting aspect is the understanding of double yellow lines and barriers separating different directions of traffic flow:
[Slide from the talk: traffic-flow direction prediction]
Current Autopilot definitely doesn't use this information, as it will happily drive into oncoming traffic and only panics and alerts the driver when it detects a fast-approaching vehicle and realizes something is wrong. E.g., in this screenshot, if the car were going straight from the thicker purple portion, the light green portion does extend a little bit into where "straight" would be, so current Autopilot might incorrectly treat the light green portion as the continuation of the current lane; with the rewrite, Autopilot would know it should drive towards the "purple" area.
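To illustrate what using flow-direction information could look like, here is a purely made-up toy: pick the drivable region whose predicted traffic-flow direction agrees with where you're headed, and reject regions whose flow points against you (oncoming lanes). None of this is from the talk; it just shows the kind of check the fused birds-eye view makes possible:

```python
def choose_drivable_region(ego_heading_deg, regions):
    """regions: list of (name, predicted_flow_heading_deg) for areas
    visible ahead. Entirely illustrative logic, not Tesla's planner."""
    best, best_err = None, None
    for name, flow_heading in regions:
        # Smallest absolute angle between ego heading and the region's flow.
        err = abs((flow_heading - ego_heading_deg + 180) % 360 - 180)
        if err > 90:   # flow points against us: that's oncoming traffic
            continue
        if best_err is None or err < best_err:
            best, best_err = name, err
    return best

# Ego heading due north (0 deg); the "light green" region curves away
# while the "purple" region continues roughly straight:
print(choose_drivable_region(0, [("light green", 40), ("purple", 2)]))  # purple
```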
 

Thanks for sharing. Yes, the rewrite should be a big improvement.
 
Also you linked a green tweet from over a year ago, which is an eternity ago in Tesla FSD software.
An interesting tidbit from Third Row Tesla Podcast - Tesla's Future 17:56

> You might not be seeing some AutoPilot updates come to your car, but they're always coming to your car. It's always getting better. "Are you saying the car will actually push new neural nets or other kind of shadow mode stuff behind the scene without requiring a full update?" Correct.

Andrej Karpathy confirmed this behavior at ScaledML AI for Full-Self Driving 11:52

> We can train a small detector that detects an occluded stop sign by trees, and then what we do with that detector is we can beam it down to the fleet, and we can ask the fleet -- please apply this detector on top of everything else you're doing, and if this detector scores high then please send us an image. And then the fleet responds with somewhat noisy set, but they boost the amount of examples we have of stop signs that are occluded, and maybe 10% of them are actually occluded stop signs that we get from that stream. This requires no firmware upgrade. This is completely dynamic and can just be done by the team extremely quickly. It's the bread and butter of how we actually get any of these tasks to work. Just accumulating these large data sets in the full tail of that distribution.

So even if a single car isn't uploading much data, Tesla's nearly 1-million-car data collection fleet in aggregate finds the interesting examples that Tesla is actively working on closer to real time, which is important for engineering productivity. This is much faster than the normal software-update feedback cycle, which requires full testing before deployment, and it works because these small detectors don't change any deployed, user-visible functionality.
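For anyone wondering what that looks like mechanically on the car side, here is a toy sketch of the "campaign" trigger idea from the quote above. The function names and the threshold are made up; the point is just that a small detector, pushed without a firmware update, decides which frames get uploaded:

```python
def run_campaign(frames, campaign_detector, score_threshold=0.7, upload=print):
    """frames: camera frames the car is already processing.
    campaign_detector: the small extra model beamed down to the fleet.
    upload: callback that sends a matching frame back for labeling.
    All names and values invented for illustration."""
    for frame in frames:
        if campaign_detector(frame) >= score_threshold:
            upload(frame)  # noisy by design; curation happens server-side

# Toy detector that "fires" on frames described as occluded:
toy_detector = lambda frame: 0.9 if "occluded" in frame else 0.1
run_campaign(["clear stop sign", "occluded stop sign"], toy_detector)
# prints: occluded stop sign
```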
 
Funny enough, Green is an early access tester, so he knows exactly what's happening there too.

In this thread (scroll up a bit for the start) he mentions that the release notes on this EAP release are a bit misleading as to what it's "really" stopping for, and also a bit about what goes into these new AP screenshot captures.

green on Twitter

Is he an early access tester?? I thought he was just able to flip some feature flags on his car that gave him access to features that are disabled for non-early-access cars.
 
> We can train a small detector that detects an occluded stop sign by trees, and then what we do with that detector is we can beam it down to the fleet, and we can ask the fleet -- please apply this detector on top of everything else you're doing, and if this detector scores high then please send us an image. And then the fleet responds with somewhat noisy set, but they boost the amount of examples we have of stop signs that are occluded, and maybe 10% of them are actually occluded stop signs that we get from that stream. This requires no firmware upgrade. This is completely dynamic and can just be done by the team extremely quickly. It's the bread and butter of how we actually get any of these tasks to work. Just accumulating these large data sets in the full tail of that distribution..



That's pretty much exactly how green described the "campaigns" Tesla does in those tweets from a year ago :)

So nothing appears to have changed.

The car isn't constantly "shadowing" the driver and reporting disagreements; it only collects and uploads specific "fixed" or "campaign" events, with "fixed" being always-on things like "upload data if airbags deploy" and "campaigns" being short-term things (he gives a number of examples, like "collect if the backup camera matches a specific labeled object").

Which would be similar to one they'd be doing now for the front camera collecting stop-sign data for example.

And not every car gets every campaign either.


Is he an early access tester?? I thought he was just able to flip some feature flags on his car that gave him access to features that are disabled for non-early-access cars.


He claims he admitted himself to it, but does not detail how.

green on Twitter
 
So nothing appears to have changed.
Clearly you did not listen to what Karpathy said...

The video linked further up from a few days ago by Karpathy.
  • they can ask any subset of the fleet, at any given point and without a firmware update, to test a NN.
  • they do this regularly and that is how they build out their real-life data sets.
I enjoy the reports from green as much as the next guy, but I am going to go with what Karpathy says anytime it contradicts what outsiders might say.
 
This is really disturbing. The fact they can change my car without my consent or knowledge is beyond anything believable. What if the change causes my car to act so differently that I get into an accident, or the update itself causes an accident? All fingers would point at the driver and no one would suspect the car itself. It's one thing to force updates on folks, but this requires trust beyond common sense. I love my OTA updates and my car, but I'm not sure why this isn't causing a huge uproar.
 
The fact they can change my car without my consent or knowledge is beyond anything believable.
Again, it would be good if you actually listened/watched the video.
This is not to make your car drive differently; it is only to gather observed data based on a NN that they asked it to run.
This has NOTHING to do with how the car drives, that (at this moment) still requires a firmware update.
 
Clearly you did not listen to what Karpathy said...

Clearly you didn't listen to what I or green said.



The video linked further up from a few days ago by Karpathy.
  • they can ask any subset of the fleet, at any given point and without a firmware update, to test a NN.
  • they do this regularly and that is how they build out their real-life data sets.
Which is exactly the same thing green said

At no point did he say a targeted campaign required a "firmware update"

At no point did he say they don't do this "regularly"


In fact, he says exactly the OPPOSITE of that: he says THE SAME THINGS KARPATHY DOES.


Again, I'd suggest you actually read Green's thread on how this all works:

green on Twitter

If you can quote something in there that ACTUALLY disagrees with your video, please post it.




So you keep insisting Karpathy said something different from Green, while continuing to quote him not saying a single thing that contradicts anything Green reported.


I am going to go with what Karpathy says anytime it contradicts what outsiders might say.

Which not a single thing you quoted him saying does.


The point was that Shadow Mode, as originally suggested (and as many continue to believe), is NOT constantly reporting all divergence between human behavior and what AP "thinks it should do" even when it's not on.

It ONLY reports specific things Tesla asks it to do, and only from the specific cars Tesla asks to report that info.


Hence all the "I like to drive on AP in weird situations all the time to train the car via shadow mode!" posts are simply nonsense because the car does not work like that.


If your specific car happens to currently have a campaign to look for a SPECIFIC thing, and you happen to encounter that specific thing, you will be helping train the car (well, train the master stuff back at HQ, your own specific car won't change driving behavior without a firmware update).
 