Welcome to Tesla Motors Club

Andrej Karpathy - AI for Full-Self Driving (2020)


Bladerskb

Senior Software Engineer
Good talk. The only thing I disliked is him equating HD maps with knowing the distance to the leaves on a tree, basically saying that a map has to be geometric for it to be HD. That's not true, and he knows that. The map that they have, which is increasing in fidelity, IS an HD map. Having stop signs, traffic lights, stop lines, potholes, parking lots, lane lines, and road-edge semantics mapped is an HD map, and it has always been referred to as such by all the relevant companies.

Heck, there are dozens of HD map companies offering just that, like Lvl5, Carmera, Nvidia, Mobileye's REM, etc.
But to save face and to please Elon, he refers to it as "low def". Google Maps are low-definition maps.
 

Good talk. The only thing I disliked is him equating HD maps with knowing the distance to the leaves on a tree, basically saying that a map has to be geometric for it to be HD. That's not true, and he knows that. The map that they have, which is increasing in fidelity, IS an HD map. Having stop signs, traffic lights, stop lines, potholes, parking lots, lane lines, and road-edge semantics mapped is an HD map, and it has always been referred to as such by all the relevant companies.

Heck, there are dozens of HD map companies offering just that, like Lvl5, Carmera, Nvidia, Mobileye's REM, etc.
But to save face and to please Elon, he refers to it as "low def". Google Maps are low-definition maps.

First, thanks for posting.

I definitely agree with you about HD maps. I do have a follow-up question: can't HD maps also be used for contextual information, like labeling when a stop sign only applies if you are turning right? For example, Karpathy mentions needing to use camera vision to understand when a stop sign only applies in certain situations, but that seems like something that could be labeled on an HD map.

Based on this new talk, how would you assess Tesla's current state of FSD? What I get from the presentation is that Tesla is using existing machine learning techniques and slowly working through all the complexities and nuances of perception, planning and driving policy, trying to get it all done with vision and machine learning alone. It's taking them longer than they thought because of how complex the problem of autonomous driving is, especially with just camera vision.
 
Good talk. The only thing I disliked is him equating HD maps with knowing the distance to the leaves on a tree, basically saying that a map has to be geometric for it to be HD. That's not true, and he knows that.

I don't think that was his meaning (geometric). Only that the precision and accuracy are much higher for the HD map.

Let's see where he might get that idea from...
What about Waymo? Bringing 3D perimeter lidar to partners
Multiple returns per pulse: When the Honeycomb sends out a pulse of light, it doesn't just see the first object the laser beam touches. Instead, it can see up to four different objects in that laser beam's line of sight (e.g., it can see both the foliage in front of a tree branch and the tree branch itself). This gives a rich and more detailed view of the environment, and uncovers objects that might otherwise be missed.
Though that only says they can do it, not that they actually do it or map it.

Maybe he read BMW's page: Autonomous driving: digital measurement of the world
These latest HD maps show a wealth of information, not just roads and routes. Accurate to the centimetre, the environment is contained completely in the many billions of pixels: from the trees at the roadside to the height of the kerb. All of this data is captured and displayed in three-dimensional images.
Maps with centimeter leaf accuracy. They may round down the precision, but that is a per-implementation decision.

Is there an official standard for map definition level and where 'HD' begins?
Lower def
There are two streets that cross
There are two streets that cross with a 2 way stop on the N-S
There are two streets that cross with a 2 way stop on the N-S and the E-W is 4 lane
There are two streets that cross with a 2 way stop on the N-S and the E-W is 4 lane and the stop signs are at location X1,Y1,Z1 and X2,Y2,Z2
There are two streets that cross with a 2 way stop on the N-S and the E-W is 4 lane and the stop signs are at location X1,Y1,Z1 and X2,Y2,Z2 and sign 2 is occluded by a tree
There are two streets that cross with a 2 way stop on the N-S and the E-W is 4 lane and the stop signs are at location X1,Y1,Z1 and X2,Y2,Z2 and sign 2 is occluded by a tree and the curbs follow this spline {...}
Higher def
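As a sketch of how those escalating levels might look in data, here is a toy record where each populated field raises the definition. All field names and values are illustrative, not any vendor's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IntersectionRecord:
    """One intersection; higher definition = more optional layers populated."""
    streets_cross: bool = True                 # lowest def: topology only
    stop_config: Optional[str] = None          # e.g. "2-way stop on N-S"
    lane_counts: Optional[dict] = None         # e.g. {"E-W": 4, "N-S": 2}
    sign_positions: Optional[list] = None      # [(x, y, z), ...] in meters
    occlusions: Optional[list] = None          # e.g. [("sign_2", "tree")]
    curb_spline: Optional[list] = None         # control points for curb geometry

    def definition_level(self) -> int:
        """Count how many optional detail layers this record carries."""
        layers = [self.stop_config, self.lane_counts, self.sign_positions,
                  self.occlusions, self.curb_spline]
        return sum(layer is not None for layer in layers)

# A mid-definition record: stop config, lane counts, and sign positions known.
rec = IntersectionRecord(
    stop_config="2-way stop on N-S",
    lane_counts={"E-W": 4, "N-S": 2},
    sign_positions=[(10.0, 2.5, 2.1), (-10.0, -2.5, 2.1)],
)
print(rec.definition_level())  # 3
```

The point of the ladder is that "HD" is a matter of degree, not a binary cutoff.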
 
First, thanks for posting.

I definitely agree with you about HD maps. I do have a follow-up question: can't HD maps also be used for contextual information, like labeling when a stop sign only applies if you are turning right? For example, Karpathy mentions needing to use camera vision to understand when a stop sign only applies in certain situations, but that seems like something that could be labeled on an HD map.

Yup. That's exactly what others do in their HD maps. They have a semantic layer that says: this stop sign is for this lane, this traffic light is for this lane, and this stop sign is only active under this condition. Of course, the real-time perception is matched against the HD map.
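A minimal sketch of what such a semantic layer could look like, with entirely made-up IDs and condition names:

```python
# Hypothetical semantic-layer entries: each sign is tied to specific lanes
# and to an activation condition, so "stop except right turn" is explicit data.
semantic_layer = {
    "stop_sign_17": {
        "applies_to_lanes": ["lane_3"],
        "condition": "always",
    },
    "stop_sign_18": {
        "applies_to_lanes": ["lane_4"],
        "condition": "except_right_turn",   # exception sign posted below it
    },
}

def sign_applies(sign_id: str, lane: str, maneuver: str) -> bool:
    """Check whether a mapped sign governs this lane and maneuver."""
    entry = semantic_layer[sign_id]
    if lane not in entry["applies_to_lanes"]:
        return False
    if entry["condition"] == "except_right_turn" and maneuver == "right_turn":
        return False
    return True

print(sign_applies("stop_sign_18", "lane_4", "right_turn"))  # False
print(sign_applies("stop_sign_18", "lane_4", "straight"))    # True
```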

Based on this new talk, how would you assess Tesla's current state of FSD? What I get from the presentation is that Tesla is using existing machine learning techniques and slowly working through all the complexities and nuances of perception, planning and driving policy, trying to get it all done with vision and machine learning alone. It's taking them longer than they thought because of how complex the problem of autonomous driving is, especially with just camera vision.

That all the hoopla from Elon and his fans is still pure ludicrousness, as always. Like you said, they are just using existing ML techniques and not doing any new original research. That in itself proves that they will be incapable of getting to L5 with the sensor hardware they have, because we know what's possible with the SOTA of computer vision, which is what they are using. Then when it comes to planning: notice how he says they haven't even started on an AI planner. So there it is.

Also, it's not that it's taking them longer than they thought. I'm sure it's taking them the exact same time that they thought it would take. I don't believe that a single AP engineer believed they would have L5 in 2017, or in 2018, or in 2019, or even in 2020. Remember, every single thing you read about AP and FSD is being filtered by Elon Musk. When you actually watch videos from the AP team, from former director Sterling Anderson, and presentations from Chris to now AJ, you can't glean that from any of their presentations. The first AP engineer we heard from, in a 2018 article in The Information, said Elon Musk was full of it when he said they would have Level 5 in 2019, and that they are not even close.

 
I don't think that was his meaning (geometric). Only that the precision and accuracy are much higher for the HD map.

No, that was his meaning. You can get similar precision and accuracy with or without geometric detail. He's literally saying they don't do HD maps because they don't map the entire curb, trees, and buildings, when an HD map doesn't need that.

Is there an official standard for map definition level and where 'HD' begins?

An HD map is any map that has more detail than a conventional map. It's always been the case. What Tesla currently provides, and plans to provide, in their map IS an HD map. For it not to be an HD map, all these HD map companies would need to go out of business and stop calling what they do HD mapping, because Elon Musk has come and redefined it for them. But if you want to know how HD map companies distinguish an HD map from a conventional map:

How are HD maps different from regular maps?

Regular maps (like Waze and Apple Maps) are routing maps intended to help human drivers figure out how to get from A to B. An HD map is a machine-readable, 3D representation of the static features in the driving environment. For autonomous cars, these "static features" include things like road signs, traffic lights, curbs, and lane lines.
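That distinction can be sketched in code: an HD map tile as a list of typed 3D features versus a routing map as a bare road graph. The schema below is illustrative, not any actual vendor's format:

```python
from dataclasses import dataclass

@dataclass
class StaticFeature:
    """One static feature in a machine-readable 3D HD map (toy schema)."""
    kind: str          # "road_sign", "traffic_light", "curb", "lane_line"
    position: tuple    # (x, y, z) in meters, map frame, cm-level precision
    attributes: dict   # kind-specific data, e.g. sign type or controlled lanes

hd_tile = [
    StaticFeature("road_sign", (12.34, -5.67, 2.10), {"type": "stop"}),
    StaticFeature("traffic_light", (15.02, 0.48, 5.30), {"controls": ["lane_1"]}),
    StaticFeature("curb", (0.00, 3.12, 0.15), {"height_m": 0.15}),
]

# A routing map, by contrast, reduces to a graph of roads for A-to-B navigation.
routing_map = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

signs = [f for f in hd_tile if f.kind == "road_sign"]
print(len(signs))  # 1
```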
 
Like you said, they are just using existing ML techniques and not doing any new original research.

I'm not really sure how you can make this assertion. I think it's pretty clear that the infrastructure Tesla has set up around applying ML theory is novel, including the methods they've developed to collaborate on training different aspects of neural networks. If they weren't doing anything new, why is Karpathy invited to present their work at conferences?
 
That all the hoopla from Elon and his fans is still pure ludicrousness, as always. Like you said, they are just using existing ML techniques and not doing any new original research. That in itself proves that they will be incapable of getting to L5 with the sensor hardware they have, because we know what's possible with the SOTA of computer vision, which is what they are using. Then when it comes to planning: notice how he says they haven't even started on an AI planner. So there it is.

Thanks. I definitely get the sense that Tesla is trying to solve problems that have mostly already been solved, but they are struggling because they are trying to do it all with just cameras, without the lidar and HD maps that make these problems easier to solve reliably. So Tesla is still working on common tasks, like turning at intersections and responding to traffic lights and stop signs, that Waymo could do a decade ago, because Tesla has to figure out how to solve these problems reliably with just camera vision. And like you said, Tesla has not even tackled the other stuff like planning or driving policy yet.

Also, it's not that it's taking them longer than they thought. I'm sure it's taking them the exact same time that they thought it would take. I don't believe that a single AP engineer believed they would have L5 in 2017, or in 2018, or in 2019, or even in 2020. Remember, every single thing you read about AP and FSD is being filtered by Elon Musk. When you actually watch videos from the AP team, from former director Sterling Anderson, and presentations from Chris to now AJ, you can't glean that from any of their presentations. The first AP engineer we heard from, in a 2018 article in The Information, said Elon Musk was full of it when he said they would have Level 5 in 2019, and that they are not even close.

I guess I should rephrase: it is taking them longer than Elon claimed. But you are right; I am sure the engineers on the team had a good idea of what problems would need to be solved and that it would take this long to get to where they are.
 
I'm not really sure how you can make this assertion. I think it's pretty clear that the infrastructure Tesla has set up around applying ML theory is novel, including the methods they've developed to collaborate on training different aspects of neural networks. If they weren't doing anything new, why is Karpathy invited to present their work at conferences?

None of that is new. It's all standard ML workflow. Most of these conferences are set up by the framework company for promotion. It's basically "tell us how you are using our tool and we will help you with recruiting." It's not some novelty contest; it's a popularity contest. The bigger the names they can get to speak, the more people they can convert to using their tool, framework, or software.

For example, the PyTorch conference:
"Hear from Andrej Karpathy on how Tesla is using PyTorch to develop full self-driving capabilities for its vehicles, including AutoPilot and Smart Summon."

Or the Databricks Spark + AI conference (Spark is analytics software).

on and on...

If you want to know about new novel research, look up research papers and watch videos from peer-reviewed ML conferences, or videos about new peer-reviewed papers. For example, CVPR (Conference on Computer Vision and Pattern Recognition), etc.

For example, the video I posted earlier about an actual recent SOTA (state-of-the-art) paper, which even Andrej referred to, is from Adrien Gaidon at the Toyota Research Institute. These are the things you watch if you want to see novel research, not product-based conferences. It's very easy to copy this, go to a product-based conference, and claim "look what we have done," especially when they write a paper telling you how they did it and give you the code.

 
These are the things you watch if you want to see novel research, not product-based conferences. It's very easy to copy this, go to a product-based conference, and claim "look what we have done," especially when they write a paper telling you how they did it and give you the code.

This is almost like claiming that Tesla has done nothing to advance electric vehicles because John Goodenough did all the research to invent the lithium-ion battery. Just because Karpathy isn't currently publishing papers doesn't mean they're not applying research in novel ways that will move the industry forward.
 
Just because Karpathy isn't currently publishing papers doesn't mean they're not applying research in novel ways that will move the industry forward.

So tell me: what is it that they are doing? We have known in real depth what they have been doing since early 2017. In fact, the recent patent Andrej wrote about data collection, we have known about since 2017, right down to the exact detail. The new patent they applied for is, again, standard ML procedure. So tell me, because we have very good insight into what they are doing and can prove that they have just been playing catch-up: what are they doing? What's the novel research? What are they doing that no one else is doing?

Even Andrej himself isn't claiming anything novel. He in fact says this isn't anything new or special, that this is standard for any CV student. Yet the fans can't help themselves.
 
Good talk. The only thing I disliked is him equating HD maps with knowing the distance to the leaves on a tree, basically saying that a map has to be geometric for it to be HD. That's not true, and he knows that. The map that they have, which is increasing in fidelity, IS an HD map. Having stop signs, traffic lights, stop lines, potholes, parking lots, lane lines, and road-edge semantics mapped is an HD map, and it has always been referred to as such by all the relevant companies.

No, he specifically said that he refers to HD maps as having centimeter accuracy, where you can position yourself in the map to the centimeter level. Tesla doesn't do maps like that. He even said their map only shows that a stop sign "should be somewhere around this area". That actually sounds like what you would refer to as "routing maps". Maybe you would prefer they call them MD, or mid-definition, maps?
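For what "positioning yourself in the map to the centimeter level" means in practice, here is a toy 2D sketch (all numbers invented): the car corrects its rough GPS pose by comparing where it observes a mapped landmark against where the map says that landmark is.

```python
# Minimal 2D sketch of map-based localization. The car's rough GPS pose is
# corrected using a landmark whose position the HD map knows precisely.
mapped_sign = (100.00, 50.00)      # landmark position in the map frame (m)
gps_pose = (80.50, 50.30)          # rough GPS estimate of the car (m)
observed_offset = (19.62, -0.28)   # landmark relative to car, from perception

# Where the landmark *would* be if the GPS pose were exact:
predicted = (gps_pose[0] + observed_offset[0], gps_pose[1] + observed_offset[1])

# The residual is the pose error; subtract it to snap onto the map.
residual = (predicted[0] - mapped_sign[0], predicted[1] - mapped_sign[1])
corrected_pose = (gps_pose[0] - residual[0], gps_pose[1] - residual[1])

print(tuple(round(c, 2) for c in corrected_pose))  # (80.38, 50.28)
```

A real system does this with many landmarks and a proper estimator, but the idea is the same: the map's precision is what lets the car localize more tightly than GPS alone.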

I definitely agree with you about HD maps. I do have a follow-up question: can't HD maps also be used for contextual information, like labeling when a stop sign only applies if you are turning right? For example, Karpathy mentions needing to use camera vision to understand when a stop sign only applies in certain situations, but that seems like something that could be labeled on an HD map.

You still don't get it: you can't ever trust HD maps for making driving decisions. Conditions, rules, and signs change every single day. What happens when one day the city decides that "except for right turn" no longer applies to this stop sign and takes the exception sign down? Your car depending on HD maps will now run that stop sign when turning right, since its HD map says it is allowed to, while the Tesla using vision will see that the exception is gone and correctly stop at the sign.
 
So tell me: what is it that they are doing? We have known in real depth what they have been doing since early 2017. In fact, the recent patent Andrej wrote about data collection, we have known about since 2017, right down to the exact detail. The new patent they applied for is, again, standard ML procedure. So tell me, because we have very good insight into what they are doing and can prove that they have just been playing catch-up: what are they doing? What's the novel research? What are they doing that no one else is doing?

Even Andrej himself isn't claiming anything novel. He in fact says this isn't anything new or special, that this is standard for any CV student. Yet the fans can't help themselves.

I don't believe we've seen every aspect of Tesla's internal processes. The fact that former Tesla employees seem to be a hot commodity at autonomy competitors like Zoox leads me to believe they must be doing something the competition sees as worthwhile.

It's fair to say that Tesla makes hyperbolic promises and often takes years to keep them, but I can't take in good faith an argument that Tesla is faking its way through autonomy by copying others.
 
You still don't get it: you can't ever trust HD maps for making driving decisions. Conditions, rules, and signs change every single day. What happens when one day the city decides that "except for right turn" no longer applies to this stop sign and takes the exception sign down? Your car depending on HD maps will now run that stop sign when turning right, since its HD map says it is allowed to, while the Tesla using vision will see that the exception is gone and correctly stop at the sign.

No, you don't get it. Nobody is saying you use only HD maps to do 100% of your driving. That would be ridiculous! Everybody uses HD maps with other sensors (cameras, lidar, radar) working together to do autonomous driving.

Ask yourself a simple question: if HD maps are so useless, why do all the leaders in autonomous driving use them and have better FSD than Tesla? And why is Karpathy admitting in this video that just one feature, like stop signs, is so super hard without HD maps, because there are so many cases and different contexts that change what a stop sign means?
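One toy way to see how a map and a detector can work together rather than compete: treat the map as a prior and the camera as live evidence. The numbers and the naive odds-style combination below are purely illustrative, not anyone's actual fusion algorithm.

```python
# Toy fusion rule: the HD map supplies a prior that a stop sign exists here,
# and the live detector supplies a confidence. Neither source alone decides.
def stop_sign_present(map_prior: float, detector_conf: float) -> bool:
    """Combine a map prior with a detection score (naive odds product, toy numbers)."""
    prior_odds = map_prior / (1.0 - map_prior)
    detect_odds = detector_conf / (1.0 - detector_conf)
    combined_odds = prior_odds * detect_odds
    return combined_odds / (1.0 + combined_odds) > 0.5

# Map says a sign is almost certainly here; detector is unsure (glare, occlusion).
print(stop_sign_present(0.95, 0.40))  # True: the prior carries a weak detection
# Map has gone stale (sign removed); detector strongly sees nothing.
print(stop_sign_present(0.95, 0.02))  # False: fresh perception overrides the map
```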
 
I don't believe we've seen every aspect of Tesla's internal processes.

Tesla and Andrej have confirmed everything verygreen has told us since 2017, and he has almost never been wrong. This, again, is the wishful thinking. Instead of accepting and discussing facts, it's easy to cling to some unseen, unknown hope.

The fact that former Tesla employees seem to be a hot commodity at autonomy competitors like Zoox leads me to believe they must be doing something the competition sees as worthwhile.

You're joking, right? Source? I think you are misconstruing the lawsuit with Zoox (former Tesla employees) about logistics, which has nothing to do with autonomous driving.

It's fair to say that Tesla makes hyperbolic promises and often takes years to keep them, but I can't take in good faith an argument that Tesla is faking its way through autonomy by copying others.

No one is saying they are faking it. We are all pointing to the obvious fact that they are doing exactly what others are doing.
For example, I did this comparison last year:
https://i.redd.it/578gwi8zu5y21.png

If we take a close look at Tesla's road edge network versus what Mobileye had in 2017:

Picture of Tesla's HW3 FSD Road Edge Neural Network

Picture of Mobileye's Road Edge Neural Network from Production Q4 2017

Or let's take a look at what they are doing with pseudo-lidar versus Mobileye's 2020 EyeQ5.

Amnon Shashua: "We use the vidar (pseudo-lidar) data and feed it into the lidar processing stream in order to detect objects. We take the same algorithms that we have for the cars with lidar and radar, and we feed them pseudo-lidar data instead of lidar data."

Andrej: "As many others do, we take pseudo-lidar input and take the techniques that have been developed for lidar processing to achieve object detection."
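The pseudo-lidar idea both quotes describe can be sketched in a few lines: unproject each pixel of a camera depth map through the camera intrinsics into a 3D point cloud, which a lidar-style detection pipeline can then consume. The intrinsics and depth values below are made up for illustration.

```python
# Sketch of pseudo-lidar: turn a per-pixel depth map into a 3D point cloud
# by unprojecting each pixel through the camera intrinsics.
fx, fy = 500.0, 500.0   # focal lengths in pixels (illustrative)
cx, cy = 2.0, 1.0       # principal point, for a tiny 4x2 "image"

depth_map = [
    [10.0, 10.0, 12.0, 12.0],
    [10.0, 11.0, 12.0, 13.0],
]

def depth_to_points(depth):
    """Unproject pixel (u, v) with depth z to camera-frame (x, y, z)."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points

cloud = depth_to_points(depth_map)
print(len(cloud))  # 8 points, one per pixel
# `cloud` can now feed the same object-detection stream a real lidar feeds.
```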
 
Ask yourself a simple question: if HD maps are so useless, why do all the leaders in autonomous driving use HD maps and have better FSD than Tesla? And why is Karpathy admitting in this video that just one feature, like stop signs, is so super hard with vision alone, because there are so many cases and different contexts that change what a stop sign means?

Because HD maps are a crutch to quickly get a demo/proof-of-concept FSD system out. And how do you know their FSD systems are better than Tesla's? (And no, I am not talking about what Tesla has released to people purchasing the FSD option; I mean what they actually have internally.)

And why is Karpathy admitting in this video that just one feature, like stop signs, is so super hard with vision alone, because there are so many cases and different contexts that change what a stop sign means?

No, you don't get it. Nobody is saying you use only HD maps to do 100% of your driving. That would be ridiculous! Everybody uses HD maps with other sensors (cameras, lidar, radar) working together to do autonomous driving.

If you can't rely on the HD map, what is the value of it? So you know you have to have the vision system look extra hard at this location to see what the current signage says? :rolleyes: What is the value of an HD map that tells you there is a stop sign at the corner when the city has decided to remove that stop sign? What do you do when your vision system tells you something different from your HD map? Which do you trust? If you say you have to trust the vision system, then you have just said that the HD map is worthless and not necessary. (Or is only necessary as a crutch until you can get the vision system working properly.)
 
No, he specifically said that he refers to HD maps as having centimeter accuracy, where you can position yourself in the map to the centimeter level. Tesla doesn't do maps like that. He even said their map only shows that a stop sign "should be somewhere around this area".
Right, I would assume the level of detail of the map Tesla is using is of the quality of OpenStreetMap, providing approximate 2D positioning of points, lines, and areas. Here's an example of a stop sign for exiting the Tesla HQ parking lot:
[Screenshot: OpenStreetMap stop sign node at the Tesla HQ parking lot exit]
This map data for the stop sign is a single point on a line that a human approximated from a satellite photo. Clearly the stop sign is not actually placed in the middle of a lane that has no width. An HD map, by contrast, would know how large the stop sign is, how high off the ground it sits, and its 3D position relative to the two crossing roads, down to centimeter accuracy.
 
Tesla and Andrej have confirmed everything verygreen has told us since 2017, and he has almost never been wrong. This, again, is the wishful thinking. Instead of accepting and discussing facts, it's easy to cling to some unseen, unknown hope.

verygreen isn't an employee of Tesla. All he has access to is the hardware in his car and the software Tesla pushes to his vehicle. This gives him insight into their output, but he doesn't have access to any of the internal processes Tesla hasn't disclosed. No one outside of Tesla has access to that information.

You're joking, right? Source? I think you are misconstruing the lawsuit with Zoox (former Tesla employees) about logistics, which has nothing to do with autonomous driving.

Tesla was able to prove that four of their former employees brought logistics information with them to Zoox, but that doesn't mean it's all they took. Zoox is actively performing an audit to ensure none of their former-Tesla hires are using other confidential information. I find it hard to believe that Zoox hired former Tesla employees purely for warehouse logistics when they're both competing in the autonomous vehicle field.

"Zoox says it will pay Tesla an undisclosed amount of money and will perform an audit to 'ensure that no Zoox employees have retained or are using Tesla confidential information.' " from Former Tesla employees brought stolen documents to self-driving startup Zoox

And there seems to be a history of former Tesla employees being sought by autonomy competitors:

"In April the company scooped up former Tesla Vice President David Nistér, who was a key member of Tesla’s autopilot team, which developed semi-autonomous car aspects." With New Hire, Nvidia Targets Mobileye, Report Says - CTech
"The company is suing Sterling Anderson, formerly the Director of Autopilot Programs at Tesla, for allegedly downloading confidential information about the company’s Autopilot program, destroying evidence, and trying to poach former co-workers." Tesla sues former Autopilot director for taking proprietary information and poaching employees
"Apple appears to have stepped up its poaching activities involving Tesla employees over the past few months, luring away manufacturing, security and software engineers, and supply chain experts to work on the "Project Titan" self-driving car initiative and other products, according to a new report." Apple poached 'scores' of Tesla employees in recent months, but not all go to 'Project Titan'

No one is saying they are faking it. We are all pointing to the obvious fact that they are doing exactly what others are doing.
For example, I did this comparison last year:
https://i.redd.it/578gwi8zu5y21.png

It's reasonable to say Tesla is doing what others are doing, but the fact that they're doing it with an active fleet available for data collection, and attempting to do it without lidar, is unique.
 
Right, I would assume the level of detail of the map Tesla is using is of the quality of OpenStreetMap, providing approximate 2D positioning of points, lines, and areas. Here's an example of a stop sign for exiting the Tesla HQ parking lot:
This map data for the stop sign is a single point on a line that a human approximated from a satellite photo. Clearly the stop sign is not actually placed in the middle of a lane that has no width. An HD map, by contrast, would know how large the stop sign is, how high off the ground it sits, and its 3D position relative to the two crossing roads, down to centimeter accuracy.

OpenStreetMap isn't Tesla's map.
 
Because HD maps are a crutch to quickly get a demo/proof-of-concept FSD system out.

That's an Elon talking point to justify why he rejects HD maps. It's BS.

And how do you know their FSD system is better than Tesla's? (And no, I am not talking about what Tesla has released to people purchasing the FSD option; I mean what they actually have internally.)

I think Autonomy Day and these talks by Karpathy give us a good idea of what Tesla has internally, and I don't see anything in them that is better than Waymo. And we know that Waymo has L4 robotaxis that can handle city driving with no safety driver for 13,000 miles with no disengagement. Can Tesla's internal software do that? I doubt it. If it could, Tesla would have released it by now.

If you can't rely on the HD map, what is the value of it?

HD maps are reliable. I said you don't drive with just HD maps, but they are reliable.

So you know you have to have the vision system look extra hard at this location to see what the current signage says? :rolleyes: What is the value of an HD map that tells you there is a stop sign at the corner when the city has decided to remove that stop sign? What do you do when your vision system tells you something different from your HD map? Which do you trust? If you say you have to trust the vision system, then you have just said that the HD map is worthless and not necessary. (Or is only necessary as a crutch until you can get the vision system working properly.)

Ah, the classic "if you need to use vision when the map is wrong, why not just rely on vision and dump maps?" argument.

Because sometimes camera vision is not enough. What happens when you don't have HD maps, just camera vision, and your camera vision fails or makes a mistake? Or do you think camera vision is always perfect all the time?

But this should explain it better than I could:

[Image attachment: 5qg8CrF.png]