
Andrej Karpathy - AI for Full-Self Driving (2020)

For example, I did this comparison last year:
https://i.redd.it/578gwi8zu5y21.png

By the way, Karpathy mentions 48 NNs. But I am thinking that is probably across all 8 cameras, so 48/8 would be 6 distinct NNs.

But based on what we've seen, it seems that Tesla is working on "free space", "road markings", "path prediction", "road signs", "traffic lights" and "road edges" from your list.
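Just to illustrate the arithmetic, here's a minimal Python sketch of how 48 outputs could decompose into per-camera task heads (the camera names and the exact task split are my assumptions, not anything Tesla has confirmed):

```python
# Hypothetical decomposition of "48 NNs" into 6 task heads x 8 cameras.
# Camera names and the task list are assumptions for illustration only.
CAMERAS = ["main", "narrow", "fisheye", "left_pillar",
           "right_pillar", "left_repeater", "right_repeater", "rear"]
TASKS = ["free_space", "road_markings", "path_prediction",
         "road_signs", "traffic_lights", "road_edges"]

heads = [(cam, task) for cam in CAMERAS for task in TASKS]
print(len(heads))  # 8 cameras x 6 tasks = 48
```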
 
By the way, Karpathy mentions 48 NNs. But I am thinking that is probably across all 8 cameras, so 48/8 would be 6 distinct NNs.

But based on what we've seen, it seems that Tesla is working on "free space", "road markings", "path prediction", "road signs", "traffic lights" and "road edges" from your list.

Karpathy said they were working on a classifier for emergency service vehicles in the video; does that count as a sub-category of free space or a classifier unto itself?
 
Karpathy said they were working on a classifier for emergency service vehicles in the video; does that count as a sub-category of free space or a classifier unto itself?

No. I don't think it would be part of "free space". I would imagine that it would just be a modifier to the existing object detection for vehicles. In other words, if the NN detects the emergency lights, it would tag the vehicle as an "emergency vehicle", which would in turn modify the driving policy (i.e. "pull over and let the emergency vehicle pass").
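Something like this hypothetical policy hook, in toy Python (all the names are made up for illustration):

```python
from dataclasses import dataclass

@dataclass
class Detection:
    kind: str               # e.g. "vehicle", "pedestrian"
    emergency_lights: bool  # extra attribute tagged onto an existing detection

def driving_policy(d: Detection) -> str:
    # The emergency flag modifies policy for a vehicle the detector already found.
    if d.kind == "vehicle" and d.emergency_lights:
        return "pull over and let the emergency vehicle pass"
    return "continue normal driving"

print(driving_policy(Detection("vehicle", emergency_lights=True)))
```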

By the way, I am very happy to hear that Tesla is working on emergency vehicles. This is something that Waymo can do and I was wondering when Tesla would get to it, since it is an important part of autonomous driving.
 
Ah, the classic "if you need to use vision when the map is wrong, why not just rely on vision and dump maps?" argument.

Because sometimes camera vision is not enough. What happens when you don't have HD maps, just camera vision, and your camera vision fails or makes a mistake? Or do you think camera vision is always perfect all the time?

But how do you know if the vision system has made a mistake or your HD map is wrong? If you can answer that, you probably don't need the vision system in the first place. Again, what do you propose an FSD car should do when it is approaching an intersection and the vision system says there is a stop sign and the HD map says there isn't? What about when it is in the opposite situation, where the vision system says there is no stop sign and the HD map says there is? How do you break the tie? Do you always trust vision? Do you always trust the HD map? Do you stop, put your flashers on, and phone home for instructions?

Even if you are using HD maps what do you do if your vision system fails? Do you just continue on using nothing but HD maps and LiDAR?

If you have to rely on an HD map, you can only operate in a geo-fenced area that has well-maintained HD mapping (à la Cadillac Super Cruise). Which means by definition you haven't, and can't, achieve Level 5. The furthest you can get is Level 4. (Unless you think someone is really going to create, and maintain, HD maps of everywhere.)
 
But how do you know if the vision system has made a mistake or your HD map is wrong? If you can answer that, you probably don't need the vision system in the first place. Again, what do you propose an FSD car should do when it is approaching an intersection and the vision system says there is a stop sign and the HD map says there isn't? What about when it is in the opposite situation, where the vision system says there is no stop sign and the HD map says there is? How do you break the tie? Do you always trust vision? Do you always trust the HD map? Do you stop, put your flashers on, and phone home for instructions?

It would depend on the specific case. But in general, you would program the driving policy to drive cautiously if the HD maps and the vision did not agree. Also, are other cars stopping where you think a stop sign is? If so, that would confirm that there is a stop sign there. But you would also make sure your HD map is updated so you could trust it.
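As a rough sketch (purely illustrative, not anyone's actual implementation), the tie-break could look like this:

```python
def stop_sign_decision(vision_sees_sign: bool,
                       map_has_sign: bool,
                       other_cars_stopping: bool) -> str:
    """Toy tie-break for when vision and the HD map disagree."""
    if vision_sees_sign == map_has_sign:
        return "stop" if map_has_sign else "proceed"
    # Disagreement: drive cautiously and use fleet behavior as a third vote.
    if other_cars_stopping:
        return "slow down and stop"  # other traffic confirms the sign
    return "proceed cautiously and flag this location for a map update"
```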

Even if you are using HD maps what do you do if your vision system fails? Do you just continue on using nothing but HD maps and LiDAR?

Yes, you would trust your HD maps and lidar since they are correct until your vision comes back. That's the whole point of HD maps: to give you a fall-back when vision temporarily fails. Obviously, if your vision failed permanently, the car would pull over.

Here's the problem with not using HD maps. What if your vision thinks there is a stop sign but there really isn't? If you are just using vision, you have no way of knowing if the vision is correct or not. What should the car do? Should it assume the vision is correct or should it assume it is wrong? With HD maps, you have a fall-back that can confirm if the vision is right or wrong.
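In code terms, the fall-back I am describing would be something like this (the time threshold is an arbitrary assumption on my part):

```python
def drive_mode(vision_ok: bool, seconds_since_vision_loss: float) -> str:
    """Toy sketch of falling back to maps/lidar when vision drops out."""
    PULL_OVER_AFTER_S = 5.0  # hypothetical limit for map/lidar-only driving
    if vision_ok:
        return "vision + HD map + lidar"          # normal operation
    if seconds_since_vision_loss < PULL_OVER_AFTER_S:
        return "HD map + lidar only (degraded)"   # temporary vision outage
    return "pull over safely"                     # vision failed for good
```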

If you have to rely on an HD map, you can only operate in a geo-fenced area that has well-maintained HD mapping (à la Cadillac Super Cruise). Which means by definition you haven't, and can't, achieve Level 5. The furthest you can get is Level 4. (Unless you think someone is really going to create, and maintain, HD maps of everywhere.)

I don't think that is completely true. According to the SAE document, every road in the US would be good enough for L5 (if you are selling your car in the US). An HD map of every road in the US is doable. And you could update it with your fleet or with satellites in near real time. So no, I don't think HD maps would prevent you from doing L5.

But I would add that even L4 would still provide significant value. Your profile says you live in Oregon. If you had a true robotaxi (no steering wheel or pedals, true driverless) that worked on every road in Oregon, it would technically only be L4 but wouldn't that still be useful to you? Do you really need an autonomous car that works on every single road in the entire US? Probably not.

It sounds good to say that HD maps only work for small geofenced areas but camera vision can do generalized L5. But if you just have vision, then you only have one system for driving. There is no redundancy or fall-back. If your vision is not 100% perfect all the time, your autonomous driving will fail. And like I said earlier, with only one system, you won't know if your vision is wrong or not.
 
Yes, you would trust your HD maps and lidar since they are correct until your vision comes back.

How do you know they are correct, and that an area didn't go into construction and have the position and signage changed?

But if you just have vision, then you only have one system for driving. There is no redundancy or fall-back. If your vision is not 100% perfect all the time, your autonomous driving will fail. And like I said earlier, with only one system, you won't know if your vision is wrong or not.

But if you only trust vision when it matches your HD map, as you say we should, then you don't really have a vision system, and you are operating with a single system for driving with nothing to fall back on. If your HD maps are not 100% perfect all the time, which they can't be, your autonomous driving will fail. And like I said earlier, if you already know whether your vision is wrong or not, then you don't need the vision system to begin with.
 
This lecture convinces me that Tesla's approach is the ONLY possible solution to general FSD, if it's ever achieved. Tesla will be the first to achieve general FSD, unless another company is able to create a similar fleet of sensors and trainers.
I agree with this but Tesla must upgrade their sensors, which precludes current vehicles from achieving legit FSD. It will be a very long time before another company can come close to approaching Tesla's fleet advantage.
 
Hilariously, we've all been buying "Full Self Driving" from Tesla since late 2016. I bought it in 2017 - haha!

I am kind of amazed that Tesla have been able to get away with this for so long. I mean, I remember the days when they were getting slapped about with lawsuits because they called the system "autopilot", which implied autonomous capabilities. Thankfully, they quickly cleared up any confusion by calling it FULL SELF DRIVING instead.

Which, I guess, is technically correct: you have to fully drive it yourself.

On a more serious note: anything you hear about in corporate public-facing talks is either total speculation ("what if?") or stuff that's already been done and shipped. I don't think that even Tesla would have the hubris to do a public presentation on some of their internal research that hasn't yet been granted patent status or at least has some kind of wrap around it, for legal and IP reasons.

But, it's all moot anyway. They're going about trying to achieve autonomy in such a weird way; but at least Andrej is having fun with his data labelling engines and Software 2.0 projects! There's some serious IP in that side of things alone...

Right, I'm just popping out to fully self-drive myself over a few roundabouts. See you later!
 
How do you know they are correct, and that an area didn't go into construction and have the position and signage changed?

Tesla uses map data too. How do they know that the map data is correct? If an area went into construction, it would be obvious, no? One of your cars would then upload updated data that updates the HD map for the entire fleet.
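A toy sketch of that fleet update loop (the map structure and field names are made up for illustration):

```python
# Shared fleet map: one car reports a mismatch, everyone gets the new version.
fleet_map = {"version": 41, "stop_signs": {("Main St", "1st Ave"): True}}

def report_observation(intersection, sign_seen: bool):
    """Called when a car's vision disagrees with the current map."""
    if fleet_map["stop_signs"].get(intersection) != sign_seen:
        fleet_map["stop_signs"][intersection] = sign_seen
        fleet_map["version"] += 1  # new version pushed to the whole fleet

report_observation(("Main St", "1st Ave"), sign_seen=False)  # construction removed it
print(fleet_map["version"])  # 42
```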

But if you only trust vision when it matches your HD map, as you say we should, then you don't really have a vision system, and you are operating with a single system for driving with nothing to fall back on. If your HD maps are not 100% perfect all the time, which they can't be, your autonomous driving will fail. And like I said earlier, if you already know whether your vision is wrong or not, then you don't need the vision system to begin with.

Maybe I am not explaining myself correctly. You use BOTH systems together. Why do you think Tesla is using map data for traffic lights and stop signs? Tesla claims it's not HD maps, but it's the same principle. The map data tells the car where traffic lights and stop signs are, so it will tell the vision where to look. The map data also helps improve the reliability of your camera vision. But you also use camera vision to see the stop sign or see the traffic light. The map data can also provide contextual information that you can't get from vision, like "this stop sign is only for the left lane" or "this traffic light is only in effect between 1pm and 5pm".
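Here's a minimal sketch of the principle (the boost weight and threshold are arbitrary numbers I picked for illustration):

```python
def fused_confidence(vision_score: float, map_expects_sign: bool) -> float:
    """Toy fusion: a map prior boosts (or tempers) a raw vision score."""
    prior = 0.2 if map_expects_sign else -0.2  # illustrative weight
    return min(1.0, max(0.0, vision_score + prior))

# A weak detection (0.55) crosses a 0.7 action threshold only when the
# map says a sign should be there.
print(round(fused_confidence(0.55, map_expects_sign=True), 2))   # 0.75
print(round(fused_confidence(0.55, map_expects_sign=False), 2))  # 0.35
```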

Waymo and everybody else use HD maps, and it works. Waymo cars don't blow through stop signs because the HD maps were wrong. And like I said, even Tesla is using their own version of HD maps.
 
Coast to coast or bust :D

Remember the arguments that used to happen in that old thread?

Yeah. It's funny. I used to think a coast to coast demo would be super cool and proof that FSD had arrived. On the surface, a coast to coast demo would appear to be a good test of FSD. It would be a nice sample of US roads, involving different cities and different highways. But now, I realize that a coast to coast demo would be pointless.

For one, 3000 miles without a disengagement is not even close to the safety rate that you would need for reliable robotaxis. Waymo is already four times better. Cruise is also four times better, and they say that their autonomous cars can easily do 3000 miles of city driving without a single incident. So Cruise and Waymo easily pass the coast to coast demo challenge, yet their robotaxis are not quite ready for full deployment yet.
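The arithmetic behind "four times better", using the figures as quoted here rather than official report numbers:

```python
coast_to_coast_miles = 3_000
# "Four times better" as quoted in this post, not an official figure.
implied_miles_per_disengagement = 4 * coast_to_coast_miles
print(implied_miles_per_disengagement)  # 12000 miles between disengagements
```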

Second, a coast to coast demo would be mostly highways, and mostly long stretches of boring highway too. You could probably route the car to stay on highways and avoid most city streets even. The fact that your robotaxi can drive straight on a highway lane for 3000 miles would not be very impressive. So really, it would not test the most difficult parts of FSD, which involve city driving. That is why all companies working on FSD, like Cruise and Waymo, focus all their efforts on solving autonomous driving on city streets. That's the challenge. If you can master city driving, highway driving is a non-issue.

Lastly, it would be more of an endurance test than an FSD test, since cars can't drive 3000 miles without recharging or refueling. So really it would be a test of how easily you can refuel/recharge your car.
 
verygreen isn't an employee of Tesla. All he has access to is the hardware in his car and the software Tesla pushes to his vehicle. This gives him insight into their output, but he doesn't have access to any of the internal processes Tesla hasn't disclosed. No-one outside of Tesla has access to that information.

And yet everything he has said has been right, and he has barely missed anything. He said word for word everything that was discussed in the autonomy day presentation over 2 years ago. That's something for someone who's not an employee. The point is, we know what Tesla uses, so they can't hide it from us. Therefore we know what they have done the last 4 years. And they haven't done any of the fairy tale stuff that their fans have been proclaiming they do.

Tesla was able to prove that four of their former employees brought logistics information with them to Zoox, but that doesn't mean it's all they took. Zoox is actively performing an audit to ensure none of their former-Tesla hires are using other confidential information. I find it hard to believe that Zoox hired former Tesla employees purely for warehouse logistics when they're both competing in the autonomous vehicle field.

"Zoox says it will pay Tesla an undisclosed amount of money and will perform an audit to 'ensure that no Zoox employees have retained or are using Tesla confidential information.' " from Former Tesla employees brought stolen documents to self-driving startup Zoox

And there seems to be a history of former-Tesla employees being sought by autonomy competitors:

"In April the company scooped up former Tesla Vice President David Nistér, who was a key member of Tesla’s autopilot team, which developed semi-autonomous car aspects." With New Hire, Nvidia Targets Mobileye, Report Says - CTech
"The company is suing Sterling Anderson, formerly the Director of Autopilot Programs at Tesla, for allegedly downloading confidential information about the company’s Autopilot program, destroying evidence, and trying to poach former co-workers." Tesla sues former Autopilot director for taking proprietary information and poaching employees
"Apple appears to have stepped up its poaching activities involving Tesla employees over the past few months, luring away manufacturing, security and software engineers, and supply chain experts to work on the "Project Titan" self-driving car initiative and other products, according to a new report." Apple poached 'scores' of Tesla employees in recent months, but not all go to 'Project Titan'

This is complete rubbish.

It's reasonable to say Tesla is doing what others are doing, but doing it with an active fleet available for data collection, and attempting to do it without LIDAR, is unique.

First of all, they are not unique. Second of all, the problem is people running around claiming and preaching that Tesla is 5+ years ahead while doing what others are doing, but doing it up to 2 years later.

This lecture convinces me that Tesla's approach is the ONLY possible solution to general FSD, if it's ever achieved. Tesla will be the first to achieve general FSD, unless another company is able to create a similar fleet of sensors and trainers.

By copying what others are doing...huh? I'm confused.
 
Tesla uses map data too. How do they know that the map data is correct? If an area went into construction, it would be obvious, no? One of your cars would then upload updated data that updates the HD map for the entire fleet.

So now you are trusting the vision system to update the HD map for the fleet? If that is the case, then again you don't need HD maps, because the vision system can create them in real time. What if a malicious person moves signage around? Are you just going to update your fleet map to match it?

For one, 3000 miles without a disengagement is not even close to the safety rate that you would need for reliable robotaxis. Waymo is already four times better. Cruise is also four times better, and they say that their autonomous cars can easily do 3000 miles of city driving without a single incident. So Cruise and Waymo easily pass the coast to coast demo challenge, yet their robotaxis are not quite ready for full deployment yet.

I don't know about Cruise, but that certainly isn't true for Waymo. Waymo requires a route to be pre-HD-mapped, and they haven't HD-mapped a coast to coast route yet, so no, Waymo cannot currently "easily pass the coast to coast" demo challenge. That would be like saying an FSD car can drive 100,000 laps around a closed 1-mile track without a disengagement, so that means it can drive the 100k miles in a different setting with the same results.
 
So now you are trusting the vision system to update the HD map for the fleet? If that is the case, then again you don't need HD maps, because the vision system can create them in real time. What if a malicious person moves signage around? Are you just going to update your fleet map to match it?

What don't you understand? HD maps are no different than what Tesla is doing. Tesla uses map data to tell the vision where to look for a traffic light or stop sign, and to give the vision higher confidence when it sees a traffic light or stop sign. But vision is also used to see the color of a light and to read a stop sign. But Tesla says in the manual that traffic light and stop sign response may not work without map data.

I don't know about Cruise, but that certainly isn't true for Waymo. Waymo requires a route to be pre-HD-mapped, and they haven't HD-mapped a coast to coast route yet, so no, Waymo cannot currently "easily pass the coast to coast" demo challenge. That would be like saying an FSD car can drive 100,000 laps around a closed 1-mile track without a disengagement, so that means it can drive the 100k miles in a different setting with the same results.

I am pretty sure Waymo could do a coast to coast demo without pre-mapping. Or maybe they would pre-map first to increase reliability. In any case, Waymo has the FSD to do a coast to coast demo easily, whether pre-mapped or not pre-mapped.
 
HD maps are reliable. I said you don't drive just with HD maps, but they are reliable.
Unless the scene changes from when they were made.

Yes, you would trust your HD maps and lidar since they are correct until your vision comes back. That's the whole point of HD maps: to give you a fall-back when vision temporarily fails.
So you are trusting maps over vision.

How do they know that the map data is correct? If an area went into construction, it would be obvious, no? One of your cars would then upload updated data that updates the HD map for the entire fleet.
How did that car traverse the scene if the map was wrong? It can only do that if the map was not needed to begin with.

Here's an example:
Say you have an autonomous car, but its stop sign recognition is only 99.9 percent accurate at positive detection. Within that 0.1% failure rate are instances of 50% or more occlusion due to foliage. To compensate, you use mapping data to indicate where stop signs are.
Now, a new stop sign is added that has 50% occlusion due to foliage in some town in the countryside. How does your system handle that?
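In toy code, the scenario above looks like this (the vision score and threshold are illustrative):

```python
def sign_present(p_vision: float, map_has_sign: bool) -> bool:
    """Toy fusion: the map rescues known occluded signs, but a brand-new
    occluded sign is missed until the map gets updated."""
    if map_has_sign:
        return True            # mapped sign: stop even on a weak detection
    return p_vision >= 0.5     # unmapped sign: vision alone must carry it

print(sign_present(0.3, map_has_sign=True))   # True: occluded but mapped
print(sign_present(0.3, map_has_sign=False))  # False: the new occluded sign
```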

Most of the discussion you guys are waging is over my head. What worries me is that Karpathy is working on "STOP signs" and claiming it is very hard. I mean, that is a real basic thing in driving... a STOP sign.
If they are still working on it now, and he says it is really hard, that doesn't give me very much confidence.

Tesla is able to collect data on almost all variations of stop signs, with and without occlusions. When Karpathy says they are hard, I think that says more about how aware Tesla is of the problems L5 cars face than it indicates that they are bad at solving them. You can't test against edge cases (simulated or otherwise) unless you know about them. Handling a general stop sign is simple; handling every stop sign is hard.

HD maps are no different than what Tesla is doing. Tesla uses map data to tell the vision where to look for a traffic light or stop sign, and to give the vision higher confidence when it sees a traffic light or stop sign. But vision is also used to see the color of a light and to read a stop sign. But Tesla says in the manual that traffic light and stop sign response may not work without map data.
Those statements are in conflict: either detection is based on maps, or it isn't.

An HD map is any map that has more details than a conventional map.
I don't agree with that definition, but it explains your previous comments, thanks.
 
So you are trusting maps over vision.

NO. I trust BOTH HD Maps AND Vision!

How did that car traverse the scene if the map was wrong? It can only do that if the map was not needed to begin with.

Look, I get where you are coming from. If the car uses vision to drive when the map is wrong, why does it need maps at all? Maps can't be trusted, so why not just use vision first without maps?

Yes, the car uses vision but there are plenty of instances where it needs a map to help vision:

Case #1: The car's vision thinks it sees a stop sign, but it's really a red sign from a restaurant. So without a map, it would phantom brake for an imaginary stop sign. Could it still traverse that scene? Yes, but phantom braking in the middle of the road for a nonexistent stop sign is not desirable. An HD map would tell the car that there is no sign there, so no phantom braking. The car now traverses that same scene much more smoothly and safely.

Case #2: The car's vision does not see a sign that is blocked by a large tree. The car fails to stop at the stop sign. Again, that is not desirable behavior. An HD map would tell the car that there is a stop sign there, and it could stop accurately at the stop line. Much better behavior.

Case #3: The car sees a stop sign but does not know that the stop sign is only for the left lane. So, the car phantom brakes in the middle of the road. Again, not good behavior. An HD map would tell the car that it should ignore the stop sign since it is not in the left lane. Much better behavior.
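All three cases fit in one toy decision function (the map fields and lane names are made up for illustration):

```python
from typing import Optional

def act(vision_sees_sign: bool, map_entry: Optional[dict], lane: str) -> str:
    """Toy decision covering the three cases; map_entry is None when the
    HD map has no sign at this location."""
    if vision_sees_sign and map_entry is None:
        return "ignore false positive (case 1: restaurant sign)"
    if not vision_sees_sign and map_entry:
        return "stop at the mapped stop line (case 2: occluded sign)"
    if vision_sees_sign and map_entry and lane not in map_entry["applies_to"]:
        return "ignore the sign, it is not for this lane (case 3)"
    return "stop"

print(act(True, None, "right"))                                # case 1
print(act(False, {"applies_to": ["left", "right"]}, "right"))  # case 2
print(act(True, {"applies_to": ["left"]}, "right"))            # case 3
```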

Here's an example:
Say you have an autonomous car, but its stop sign recognition is only 99.9 percent accurate at positive detection. Within that 0.1% failure rate are instances of 50% or more occlusion due to foliage. To compensate, you use mapping data to indicate where stop signs are.
Now, a new stop sign is added that has 50% occlusion due to foliage in some town in the countryside. How does your system handle that?

You have to add the new stop sign to your HD map. Since the new sign has 50% occlusion due to foliage, you already established that vision won't be reliable anyway in that scenario.

Those statements are in conflict: either detection is based on maps, or it isn't.

Well, Green says that for Tesla, detection is map-based.
