Karpathy talk today at CVPR 2021

RobDickinson · Jun 21, 2021

To be honest computer vision has come on a hella lot since 2016.

powertoold · Jun 21, 2021

Knightshade said:
He went on at length about how important it was in many interviews at the time, pointing out the advantages of seeing cars ahead, the ability to see through bad weather, etc.... and between that his "pushing" the team as quoted they decided it could be used as a primary control sensor WITHOUT imaging/vision back then.

Nah, I'm very confident you're misinterpreting this.

Ever since V8, Tesla has done sensor fusion with radar and cameras. Karpathy talks about the underpass / bridge problem in his most recent talk. That means that until 2021, Tesla has used radar as a primary sensor. Obviously, in the vision-only cars, radar is no longer a primary sensor.

What Elon / Tesla is talking about in regards to radar being a "primary control sensor" is that radar dictates the obstructions / objects ahead of it. However, it doesn't know what the objects are. The camera is there to identify the objects and where to drive (lane lines).

Knightshade · Jun 21, 2021

powertoold said:
Nah, I'm very confident you're misinterpreting this.

More confident than Elon was when he said radar would be a primary control sensor for Tesla?

powertoold said:
Ever since V8, Tesla has done sensor fusion with radar and cameras

Then how do you explain the existence of the word "without" in Teslas claim here:

After careful consideration, we now believe it can be used as a primary control sensor without requiring the camera to confirm visual image recognition

powertoold said:
. Karpathy talks about the underpass / bridge problem in his most recent talk. That means that until 2021, Tesla has used radar as a primary sensor.

....well, from 2016 to 2021 at least.... though the intent changed multiple times.

Which is... what I originally pointed out.

So it's weird you keep saying I'm wrong.

My original point was Tesla has changed their strategy 3 times since 2016. And they have.

2016: Radar moves from secondary to primary sensor- and cameras when used are individual frames of individual cameras. HW2 is PLENTY.

2019(ish): Radar remains a primary sensor FOR NOW, fused with a 4D Birds-eye-view stitching of all 8 cameras in real time thanks to HW3.
(note Elon DID say in this period the system would eventually only need vision as primary, but radar would be there as a plus)

2021: Radar is garbage, we're ripping it out, vision only baby!

That's 3 entirely different strategies.

Arguably there's the HW2->2.5 transition in there too, but that was more of a "we need more of the same thing" instead of a genuine direction change like the others..... we also know there'll be a HW4 (that is probably going to be done soon, if it hasn't been delayed) but that's also likely a "we need more of the same" type thing rather than a fundamental change.

powertoold · Jun 21, 2021

Knightshade said:
My original point was Tesla has changed their strategy 3 times since 2016. And they have.

2016: Radar moves from secondary to primary sensor- and cameras when used are individual frames of individual cameras. HW2 is PLENTY.

My main contention was when you replied to following with those articles:

I'd like to see proof tesla said radar would be the primary source of data for self driving.

Here's a simplified version of Tesla's prior approaches:

Before V8: Cameras identified all road objects and radar provided the position / velocity. When camera and radar disagree (camera doesn't identify an object but radar thinks there's an obstruction ahead), then follow the camera.

V8 to 2021: Similar to above, but when camera and radar disagree, follow the radar.

It's as simple as that. Tesla never wanted the V8 to 2021 approach! But since they *had* to do it because of the AP accident, they tried to make the best of radar, until now.

Knightshade · Jun 21, 2021

powertoold said:
My main contention was when you replied to following with those articles:

I'd like to see proof tesla said radar would be the primary source of data for self driving.

Did you miss the very first sentence in the reply where I pointed out that was NOT my actual claim, and the guy I was replying to had changed my words?

If you don't want to take MY word Tesla has had 3 entirely different strategies, will you take Elons?

2016: Working toward radar as primary sensor that can be used for car control without vision confirmation

https://twitter.com/x/status/753843823546028040

2018: Working toward fusion that is vision-focused with Radar as a value add

https://twitter.com/x/status/1057949223445020672

2021: FUSION BAD! VISION GOOD!

https://twitter.com/x/status/1380796939151704071

powertoold · Jun 21, 2021

Knightshade said:
If you don't want to take MY word Tesla has had 3 entirely different strategies, will you take Elons?

I agree with you that Tesla has had 3 different strategies, not entirely different, see my simplified version above.

Again, they never wanted radar to define obstructions. They just did it out of necessity. I don't think this was a flip flop.

RobDickinson · Jun 21, 2021

powertoold said:
My main contention was when you replied to following with those articles:

I'd like to see proof tesla said radar would be the primary source of data for self driving.

Dude that was me not knightshade

stopcrazypp · Jun 21, 2021

MP3Mike said:
They have removed the blog entry but this article references it: Tesla Autopilot Upgrade Will Make Radar A Primary Control Sensor

After careful consideration, we now believe it can be used as a primary control sensor without requiring the camera to confirm visual image recognition. This is a non-trivial and counter-intuitive problem, because of how strange the world looks in radar.

Click to expand...

Yeah, I remember this was the case and the definition is above that it doesn't need camera to confirm (but not that it's the "main" sensor that you depend on, as obviously the cameras still serves that role).

hpiv · Jun 23, 2021

diplomat33 said:
Everybody else is using high quality sensors and doing great with sensor fusion.

Everybody else is limited to carefully pre-mapped and geofenced areas.

How do we know if they are doing great with sensor fusion or not anyway?

hpiv · Jun 23, 2021

BitJam said:
Humans are able to move their heads and they have access to other senses too.

Humans have to move their heads because they only have a very narrow field of vision, unlike cameras. Humans constantly have to move their attention around to different places while cameras allow for constant attention everywhere cameras have coverage. Humans are severely limited compared to cameras, and of course the computers handing the input from these cameras which can react much faster than a human to anything happening out there.

diplomat33 · Jun 23, 2021

hpiv said:
Everybody else is limited to carefully pre-mapped and geofenced areas.

How do we know if they are doing great with sensor fusion or not anyway?

Because their perception is so good. And the reason their perception is so good is in large part because of their sensor fusion. They have been able to fuse data from the maps with the data from camera, lidar and radar to give the car accurate perception. That's a big reason they are able to do reliable autonomous driving with no human in the driver seat.

Knightshade · Jun 23, 2021

diplomat33 said:
Because their perception is so good. And the reason their perception is so good is in large part because of their sensor fusion. They have been able to fuse data from the maps with the data from camera, lidar and radar to give the car accurate perception. That's a big reason they are able to do reliable autonomous driving with no human in the driver seat.

MUCH PERCEPTION! SUCH WOW!

Seriously though- it's a good point that you appear to be entirely ignoring.

"WORKS GREAT IN ONE TINY SPECIFIC HIGHLY HD MAPPED AREA-- AS LONG AS THERE'S NO MAP CHANGES AND A HUMAN REMOTE BACKUP IS AVAILABLE TO HELP IT" is not evidence of awesome fusion or perception.

Works -all over the place- would be.

They have not demonstrated that at all. Indeed the fact they (like tesla) are many years past their original deadlines to do so suggests it's not as good as you seem to think.

diplomat33 · Jun 23, 2021

Knightshade said:
MUCH PERCEPTION! SUCH WOW!

Seriously though- it's a good point that you appear to be entirely ignoring.

"WORKS GREAT IN ONE TINY SPECIFIC HIGHLY HD MAPPED AREA-- AS LONG AS THERE'S NO MAP CHANGES AND A HUMAN REMOTE BACKUP IS AVAILABLE TO HELP IT" is not evidence of awesome fusion or perception.

Works -all over the place- would be.

They have not demonstrated that at all. Indeed the fact they (like tesla) are many years past their original deadlines to do so suggests it's not as good as you seem to think.

The problem in that video was not a problem with perception or sensor fusion! The Waymo perceived the cones just fine. It was a planning issue. The Waymo was not sure how to proceed which was made worse by the fact that it got bad help from the remote operator. It was explained in the statement that Waymo put out after the incident.

But I am not saying that Waymo's perception is 100% perfect all the time. I am disputing Karpathy absurd claim that sensor fusion is unworkable because of all the phantom braking and false positives and therefore vision-only is the only solution. If that were true then Waymo, Cruise, Zoox should be phantom braking every 5 seconds with all the sensors that they have. Clearly that is not the case. Waymo, Cruise, Zoox and others are proof that sensor fusion is viable. Heck, if sensor fusion is so unworkable as Karpathy claims, then how come Waymo is still using sensor fusion after 20M autonomous miles?

It's kinda funny how Karpathy throws in the towel and says sensor fusion doesn't work. Yet, every AV company is still using sensor fusion and deploying robotaxis with sensor fusion.

powertoold · Jun 23, 2021

diplomat33 said:
It's kinda funny how Karpathy throws in the towel and says sensor fusion doesn't work. Yet, every AV company is still using sensor fusion and deploying robotaxis with sensor fusion.

LIDAR has different failure modes compared to cameras as well (reflective trucks, wet reflective roads, reflective walls / tunnels, glass on trucks, etc.). They'll eventually run into the similar local maximums as Tesla.

How do you expect LIDAR to solve the reflective false positive / negative problems? We've actually seen Waymo cars have trouble (non-consequential steering hesitations) when driving next to shiny metallic trucks.

Knightshade · Jun 23, 2021

diplomat33 said:
But I am not saying that Waymo's perception is 100% perfect all the time. I am disputing Karpathy absurd claim that sensor fusion is unworkable because of all the phantom braking and false positives and therefore vision-only is the only solution

That's not at all how I read his remarks.

He was saying that the engineering effort it would require to have great fusion is better spent on vision- since you ultimately need to solve vision to ever have L5.

diplomat33 said:
. If that were true then Waymo, Cruise, Zoox should be phantom braking every 5 seconds with all the sensors that they have.

When you are only operating in a single, small, area with HD maps it's a lot easier to just have the major triggers whitelisted to avoid that.

That's not good sensor fusion, that's having a tiny ODD.

diplomat33 said:
Clearly that is not the case. Waymo, Cruise, Zoox and others are proof that sensor fusion is viable. Heck, if sensor fusion is so unworkable as Karpathy claims, then how come Waymo is still using sensor fusion after 20M autonomous miles?

Again a lot of it is in the same tiny, highly mapped, areas. And 20M miles is a tiny tiny fraction of Teslas miles driven too.

If sensor fusion is so WORKABLE why do they still not offer L4 service outside one tiny area of a perfect weather, simple road, city?

SageBrush · Jun 23, 2021

diplomat33 said:
At this point, Tesla does not really have a choice IMO. They committed early to putting cameras on every car and claiming the hardware was capable of FSD. And they've sold hundreds of thousands of cars now with the hardware. They can't afford to upgrade ...

^^ Wild conjecture, and poorly reasoned. All those 100s of thousands of cars *have* LiDAR. Tesla is deprecating the hardware it already has installed.

I don't find it hard to believe that the computing cost of integrating LiDAR is better spent on Vision. I don't mean $, I mean computer cycles. I think about this way: Say a decision has to made in 0.5 second that works out to C computer cycles available. Tesla can either spend them analyzing only Vision data, or analyzing and integrating two different data streams.

Tesla is choosing to conserve computing cycles for Vision analysis.

Addendum: Knightshade beat me to it.

diplomat33 · Jun 23, 2021

Knightshade said:
That's not at all how I read his remarks.

He was saying that the engineering effort it would require to have great fusion is better spent on vision- since you ultimately need to solve vision to ever have L5.

Karpathy only says that because his approach is vision-only. If all you have is vision, then yes, you need to solve vision to do L5. But there are lot of AV companies who are spending a lot of engineering effort on sensor fusion because they believe sensor fusion is needed to reliably solve L5.

Knightshade said:
If sensor fusion is so WORKABLE why do they still not offer L4 service outside one tiny area of a perfect weather, simple road, city?

Because autonomous driving is about a lot more than just perception. FSD also requires reliable prediction, planning, and safe driving policy. Waymo is working on the whole package. There are prediction and planning problems Waymo needs to solve first before they can expand. They will expand to more areas when they are confident their FSD can do drive safely in those areas with no human interventions.

diplomat33 · Jun 23, 2021

powertoold said:
LIDAR has different failure modes compared to cameras as well (reflective trucks, wet reflective roads, reflective walls / tunnels, glass on trucks, etc.). They'll eventually run into the similar local maximums as Tesla.

How do you expect LIDAR to solve the reflective false positive / negative problems? We've actually seen Waymo cars have trouble (non-consequential steering hesitations) when driving next to shiny metallic trucks.

Did you see Waymo's new lidar? This is the point cloud from the lidar. It is on par with camera vision IMO. It's basically like having night vision.

SageBrush · Jun 23, 2021

linux-works said:
dumbing down your sensor array because making sense of multi inputs is 'hard' - that's just another way of saying 'we give up'.

Sure, but that is not what they said or what they are doing. Integrating multiple inputs is a trade-off Tesla has decided not to accept.

techlogik · Jun 23, 2021

Well, let's not focus so much on TSLA, rather at the 30min mark when the talk starts with the CEO of Wayve, and watch that car using visual, non-HD maps to navigate and drive around London, which is absolutely nuts the examples they show. I think this is exactly what TSLA is moving towards and trying to do.

That example, if TSLA is anything getting near this ability, is on the right track. Vision and computer learning AI using 6 cameras to perform those driving examples. London is nuts, everybody should agree to this. Then other cities they want to move to is really where the systems can learn on the fly and build how to go about dealing with things. TSLA could really excel with this method of vision only if done right.

Karpathy talk today at CVPR 2021

Active Member

Active Member

Well-Known Member

Active Member

Well-Known Member

Active Member

Active Member

Well-Known Member

Member

Member

Average guy who loves autonomous vehicles

Well-Known Member

Average guy who loves autonomous vehicles

Active Member

Well-Known Member

REJECT Fascism

Average guy who loves autonomous vehicles

Average guy who loves autonomous vehicles

REJECT Fascism

Closed

Similar threads