
Tesla has solved perception way better than driving

Elon Musk has often said that the hardest problem is perception, and once you solve that, the rest is easy. I think most of us would agree that deciding what to do is actually the hard part, and it seems we've now hit that wall even with Autopilot and NOA.

Here's a recent example (which I've also experienced). Notice the display shows the lane line is there, yet the car swerves over it, bouncing off the right lane marking as the merge lane closes. Go to time 13:45

Navigate on Autopilot also comes to mind; I think a lot of its shortcomings come down to really dumb driving algorithms.

1. The code for making lane changes leading up to an exit appears to be as simple as: for every mile of distance to the exit, move one lane closer to the exit lane. Each lane change is initiated on this simple timer-like trigger, completely ignoring information about the cars next to you, how heavy the traffic is, etc. Decisions like slowing down or speeding up to pick a gap seem to be beyond it (see the sketch after this list).

2. NOA relies on map data to a fault: if the map says there are 3 lanes on the road but the vision system only sees 2 (because of construction), the car still does dumb things because it thinks there are 3 lanes.

3. In my mind, the reliance on static map data downloaded once a month will never work very well for navigation. They need to stream map data from the cloud and constantly update it with in-car vision and telemetry.
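
To make point 1 concrete, here's a toy sketch (pure speculation on my part; every name and threshold is made up) of the distance-timer trigger NOA seems to use, versus the gap check it seems to be missing:

```python
# Hypothetical sketch -- not Tesla's code. Contrasts the naive
# "one lane per mile" trigger NOA appears to use with a gap check.

def lanes_to_exit_naive(miles_to_exit: float, current_lane: int,
                        exit_lane: int) -> bool:
    """Trigger a lane change purely on distance: if we're N lanes away
    from the exit lane, start changing once we're N miles out."""
    lanes_away = abs(current_lane - exit_lane)
    return miles_to_exit <= lanes_away  # fires regardless of traffic

def safe_gap_available(gap_ahead_m: float, gap_behind_m: float,
                       closing_speed_mps: float) -> bool:
    """What a human checks first: is there actually room, and is the
    car behind closing fast? Thresholds here are invented."""
    return gap_ahead_m > 30 and gap_behind_m > 20 and closing_speed_mps < 3

def should_change_lane(miles_to_exit, current_lane, exit_lane,
                       gap_ahead_m, gap_behind_m, closing_speed_mps) -> bool:
    # The fix isn't complicated: gate the timer on traffic context.
    # If no gap exists, the car should adjust speed to create one
    # instead of just waiting (or swerving).
    if not lanes_to_exit_naive(miles_to_exit, current_lane, exit_lane):
        return False
    return safe_gap_available(gap_ahead_m, gap_behind_m, closing_speed_mps)
```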

Why is Tesla's driving lagging so far behind their vision system now and what can they do to fix it?

Tesla recently implemented a deep NN called Deep Rain to make the wipers smarter; perhaps we need a DeepLaneChange NN to teach the car how to avoid bad lane change decisions.
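
Just to illustrate the idea (this is my own toy sketch, nothing Tesla has announced), a "DeepLaneChange" net could be as simple as a small MLP that scores a candidate lane change from hand-picked traffic features:

```python
# Hypothetical "DeepLaneChange" sketch in PyTorch -- just to illustrate
# the idea, not anything Tesla has shipped. A small MLP scores whether
# a proposed lane change is a good idea given the surrounding traffic.
import torch
import torch.nn as nn

class DeepLaneChange(nn.Module):
    def __init__(self, n_features: int = 12):
        super().__init__()
        # Inputs might be gaps and relative speeds of neighbouring cars,
        # distance to exit, lane count, etc. (all assumed features).
        self.net = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # logit: P(lane change ends well)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.net(x))

# Training signal could come from the fleet: label a change "bad" if the
# human driver aborted it or disengaged right after (my assumption).
model = DeepLaneChange()
features = torch.randn(1, 12)  # one candidate lane change
print(model(features))         # score in (0, 1)
```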
 
I may be wrong, but I don't think the newest perception has any impact on driving yet. I think they rolled out the new perception visualization but left the car driving on the old perception and rules.

Yes, that's my understanding too. I'm really curious to see what happens when the new perception NNs are actually used in Autopilot. Will the dumb behaviours of the sort that @PhaseWhite describes disappear?

The thing that makes me more confident about decision-making than about vision is that training vision is constrained, to a large degree, by the human labour of labelling videos. By contrast, once you've solved vision, a fleet of ~1M cars lets you automatically train on an essentially unlimited amount of data with no labour constraint at all. For billions of dollars, you can train decision-making on a quantity of data that would cost trillions of dollars to hand-label for training vision.
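
To spell out what "automatic labels" means here (my own toy illustration, not Tesla's actual pipeline): once perception is solved, every second of human driving is a free training pair.

```python
# Toy sketch (mine, not Tesla's pipeline) of why decision-making data
# is "free" once perception works: every moment of human driving yields
# a (perceived scene, human action) training pair with no human labeller.
from dataclasses import dataclass

@dataclass
class Frame:
    scene: list[float]      # perception output (lanes, cars, speeds...)
    steering: float         # what the human driver actually did
    accel: float

def harvest_training_pairs(fleet_log: list[Frame]):
    """Imitation-learning dataset straight from fleet logs: the label
    for each scene is simply the human's control input."""
    return [(f.scene, (f.steering, f.accel)) for f in fleet_log]

# ~1M cars * hours of driving per day = billions of labelled examples,
# which is the asymmetry the paragraph above is pointing at.
```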

This approach has worked extraordinarily well in the proof of concept stage:

 
2. NOA relies on map data to a fault: if the map says there are 3 lanes on the road but the vision system only sees 2 (because of construction), the car still does dumb things because it thinks there are 3 lanes.

Yes, I've definitely noticed this. There is an exit I take every time I drive across town to go to the movies. It is a very standard exit lane: a 3rd lane appears and runs parallel to the highway for a few hundred feet with a concrete divider between it and the highway. Then the 3rd lane widens into two lanes, with the left lane becoming the off-ramp onto US40 North and the right lane merging onto US40 South.

But NOA appears to have bad map data or something. My nav said "exit coming up in 1 mile" when my screen said the exit was 1.2 miles away. On 2019.36, NOA would swerve into the exit lane but then signal a lane change back onto the highway. Now, on 2019.40.50, it misses the exit completely. When I manually took the exit and re-engaged NOA, it must have thought I was in the wrong lane, because it wanted to do an auto lane change back onto the highway (and into a concrete divider, I might add).

I am guessing that when the exit lane widens into two lanes, NOA thinks I am in the right lane, and that's why it prompts me to change to the left, since I need to stay in the left exit lane to take US40 North.

But a human does not use map data like this. A human sees the sign that says "exit in 1 mile", so we know to get into the right lane and look for the exit. We see the 3rd lane appear, with the exit sign and the arrow pointing down at that lane, so we change into the exit lane. We see the sign with two arrows indicating the left lane is US40 North and the right lane is US40 South, so we stay left or right in the exit lane to take the corresponding ramp. So I would hope Tesla can implement sign recognition and lane detection to do the same (a rough sketch of what I mean is below).
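
Something like this, to sketch what I mean (completely hypothetical; the types and the sign detector are my own invention):

```python
# Rough sketch of sign-driven exit logic, the way a human does it.
# Entirely hypothetical -- SignReading and its fields are assumptions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SignReading:
    text: str                  # e.g. "EXIT 1 MILE", "US40 NORTH"
    arrow_lane: Optional[int]  # lane index the sign's arrow points at

def plan_exit(signs: list[SignReading], current_lane: int,
              want: str = "US40 NORTH") -> str:
    """Decide lane behaviour from what the signs say, not from a map."""
    for sign in signs:
        if "EXIT 1 MILE" in sign.text:
            return "move right and watch for the exit lane"
        if "EXIT" in sign.text and sign.arrow_lane is not None:
            if current_lane != sign.arrow_lane:
                return f"change into lane {sign.arrow_lane}"
        if want in sign.text and sign.arrow_lane is not None:
            if current_lane != sign.arrow_lane:
                side = "left" if sign.arrow_lane < current_lane else "right"
                return f"stay {side} for {want}"
    return "hold lane"
```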
 
Yes, that's my understanding too. I'm really curious to see what happens when the new perception NNs are actually used in Autopilot. Will the dumb behaviours of the sort that @PhaseWhite describes disappear?

The thing that makes me more confident about decision-making than about vision is that training vision is constrained, to a large degree, by the human labour of labelling videos. By contrast, once you've solved vision, a fleet of ~1M cars lets you automatically train on an essentially unlimited amount of data with no labour constraint at all. For billions of dollars, you can train decision-making on a quantity of data that would cost trillions of dollars to hand-label for training vision.

This approach has worked extraordinarily well in the proof of concept stage:



The solution you're describing sounds like an end-to-end NN for driving policy: images in, driving controls out. I know Nvidia has demoed a proof of concept, but is anyone actually using this approach in production now?
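
For anyone who hasn't seen it, the Nvidia demo ("End to End Learning for Self-Driving Cars", aka PilotNet) was roughly this shape. The sketch below is my own simplified PyTorch version following the layer sizes from the paper, not their actual code:

```python
# My simplified re-creation of Nvidia's PilotNet idea: camera image in,
# steering command out. Layer sizes follow the paper; details may differ.
import torch
import torch.nn as nn

class PilotNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 24, 5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, 5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, 5, stride=2), nn.ReLU(),
            nn.Conv2d(48, 64, 3), nn.ReLU(),
            nn.Conv2d(64, 64, 3), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 1 * 18, 100), nn.ReLU(),
            nn.Linear(100, 50), nn.ReLU(),
            nn.Linear(50, 10), nn.ReLU(),
            nn.Linear(10, 1),  # steering angle
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        return self.head(self.conv(img))

# Trained by imitation: minimize MSE between predicted and human steering.
net = PilotNetSketch()
frame = torch.randn(1, 3, 66, 200)  # the paper used 66x200 crops
print(net(frame).shape)             # -> torch.Size([1, 1])
```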

My main point is that Tesla seems to assume driving control is the easy part, so they appear to be spending very little time on that problem (at least from what we see in NOA and Autopilot), and I think that's a big mistake.
 
My main point is that Tesla seems to assume driving control is the easy part, so they appear to be spending very little time on that problem (at least from what we see in NOA and Autopilot), and I think that's a big mistake.

Elon said that driving control is easier than perception, not that it is easy. That doesn't mean the Tesla engineers feel the same way, of course. But I think the real reason Tesla is so focused on perception is that they have not solved it yet, and perception is a prerequisite for everything else. If your car can't even see correctly, working on driving controls is a moot point. So I don't think Tesla is neglecting driving controls because they think it is easy; rather, they are focused on perception because they need to get it right before they can work on driving controls.