
Hotz: 3 Problems of Autonomous Driving

diplomat33

I found this to be one of the best explanations of the problems of autonomous driving:


Hotz describes the 3 problems of autonomous driving:
1) Static Driving
This is the basic level where the car just navigates a static map with no other moving objects. At this stage, you are developing pathfinding and the controls for steering and braking. Hotz says this stage is now "easy" if you have sensor fusion that includes LIDAR and HD maps.

2) Dynamic Driving
This is stage 2, where you add moving objects (cars, pedestrians). Now your car needs to navigate a path like in static driving but also avoid hitting anything. This stage is more difficult: you need to track and predict where objects will be in real time and make sure you are not on a collision course (a toy sketch of that kind of check is below, after the three stages). When you finish this stage, your autonomous vehicle can drive on public roads, but it will be very robotic and may not react to other drivers' actions correctly.

3) Counterfactual Driving
This is the final stage, where you factor in the sometimes unpredictable nature of human driving. It is the most difficult stage because it requires intelligent thinking. You have to anticipate what other drivers might do. Will that car try to cut me off? You can't solve it with hard-coded rules; a rule that works in one driving scenario may not work in another. But if you can solve this stage of autonomous driving, then you would have a truly L5 autonomous vehicle that could pass for a human driver.
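
To make the dynamic-driving stage a bit more concrete, here is a toy sketch (my own illustration, not Hotz's or anyone's production code) of the kind of check involved: predict where each tracked object will be over the next few seconds with a simple constant-velocity model and flag any planned waypoint that gets too close. All the names and numbers (Track, on_collision_course, the 2 m safety radius) are made up for illustration; real stacks use far richer prediction and planning.

```python
# Toy illustration of the "dynamic driving" stage: given a planned path and
# tracked moving objects, predict object positions with a constant-velocity
# model and flag any planned waypoint that comes too close.
from dataclasses import dataclass

@dataclass
class Track:
    x: float   # current position (m)
    y: float
    vx: float  # estimated velocity (m/s)
    vy: float

def on_collision_course(path, tracks, dt=0.1, safety_radius=2.0):
    """path: list of (x, y) waypoints the ego car will reach at t = i * dt."""
    for i, (px, py) in enumerate(path):
        t = i * dt
        for obj in tracks:
            ox = obj.x + obj.vx * t          # constant-velocity prediction
            oy = obj.y + obj.vy * t
            if (px - ox) ** 2 + (py - oy) ** 2 < safety_radius ** 2:
                return True, t               # too close at time t
    return False, None

# Example: ego driving straight ahead at ~10 m/s, a pedestrian crossing from the right.
ego_path = [(0.0, 1.0 * i) for i in range(50)]
pedestrian = Track(x=5.0, y=25.0, vx=-1.5, vy=0.0)
print(on_collision_course(ego_path, [pedestrian]))
```

Counterfactual driving is exactly the part this toy version misses: the pedestrian here is assumed to keep walking in a straight line, with no model of what they might decide to do.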

Hope this is helpful.
 
I'm not sure robotic driving would be the worst thing...especially if all cars are autonomous. Imagine if I could look at a speed limit sign and KNOW all cars are going that speed (or within 5%)? Or when a turn signal comes on, I KNOW it will blink 3 times before attempting a lane change...the biggest problem would be humans taking advantage of this knowledge (speeding up when turn signal activates because you know the car won't attempt the change for another 2 seconds or so).
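
Just to make that predictability concrete, here's a purely hypothetical sketch (my numbers; no automaker actually specifies this) of a fully deterministic signal-then-change behavior:

```python
# Hypothetical sketch of deterministic lane-change signaling: the car always
# blinks exactly 3 times (~2 s) before starting the maneuver, so every other
# road user can predict exactly when it will move. Numbers are made up.
import time

BLINKS_BEFORE_MOVE = 3
BLINK_PERIOD_S = 0.7   # 3 blinks is roughly 2.1 s of warning

def signal_then_change_lane(execute_lane_change):
    for i in range(BLINKS_BEFORE_MOVE):
        print(f"blink {i + 1}")
        time.sleep(BLINK_PERIOD_S)
    execute_lane_change()  # only after the full, predictable warning period

signal_then_change_lane(lambda: print("changing lanes"))
```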
 
I'm not sure robotic driving would be the worst thing...especially if all cars are autonomous. Imagine if I could look at a speed limit sign and KNOW all cars are going that speed (or within 5%)? Or when a turn signal comes on, I KNOW it will blink 3 times before attempting a lane change...the biggest problem would be humans taking advantage of this knowledge (speeding up when turn signal activates because you know the car won't attempt the change for another 2 seconds or so).

Agree. If all cars were autonomous, it would not matter. The issue stems from trying to mix autonomous cars with human-driven cars, because we know that even if we did deploy autonomous cars now, it would still be a while before all cars were autonomous. There will be that necessary transition phase where both autonomous and non-autonomous cars have to coexist on the same roads.
 
Remind me again why anyone should listen to Geohot?

He started his short life as a hacker, worked very briefly and ineffectively at two large companies, then for some reason decided he was going to make a self-driving car.

His views on ML/AI are fringe, and reductive to the point of outright falsehood. He says so many ridiculous things in this video that the whole thing is comedy gold -- it's so cringe-worthy. And then all those crazy coked-out deep stares at the camera.....

Many auto OEMs make built-in ADAS systems that are comparable to or better than his. Why didn't Lex Fridman host the head of Nissan's or GM's ADAS program instead?

The world is going mad, folks.
 
Remind me again why anyone should listen to Geohot?

He started his short life as a hacker, worked very briefly and ineffectively at two large companies, then for some reason decided he was going to make a self-driving car.
Well, he is a very good hacker. Have you ever used OpenPilot? I haven't, but it sounds pretty impressive considering it's built on a smartphone platform.
His views on ML/AI are fringe, and reductive to the point of outright falsehood. He says so many ridiculous things in this video that the whole thing is comedy gold -- it's so cringe-worthy. And then all those crazy coked-out deep stares at the camera.....
Haha. He is quite a character. I'm very skeptical that his approach will work but it seems like it's worth a try.
Many auto OEMs make built-in ADAS systems that are comparable to or better than his. Why didn't Lex Fridman host the head of Nissan's or GM's ADAS program instead?
He has interviewed the CTO of Cruise.
The world is going mad, folks.
You should listen to his interview with Elon Musk, your head might explode. :p
 
Haha. He is quite a character. I'm very skeptical that his approach will work but it seems like it's worth a try.

He doesn't have a coherent strategy.

He views the problem like a child trying to make fireworks out of **** they found in a recycling bin.

I'm just gonna take some-a-dat imitation learning, plug in some reinforcement learning, maybe decompose some eigenvectors, get some big data, do some k-means clustering, set up some beefy servers and DAMN it's gonna take off like a rocket, you'll see!

His real understanding of the subject is comically limited.
 
He doesn't have a coherent strategy.

He views the problem like a child trying to make fireworks out of **** they found in a recycling bin.

I'm just gonna take some-a-dat imitation learning, plug in some reinforcement learning, maybe decompose some eigenvectors, get some big data, do some k-means clustering, set up some beefy servers and DAMN it's gonna take off like a rocket, you'll see!

His real understanding of the subject is comically limited.

What a helpful and insightful comment. We've all learned something and are now better off because you posted.

Whether he is as godly as you or not, what he has made has been used and enjoyed by many. He admits himself that he is not on the cutting edge of self-driving and doesn't intend to be; that's not what he is doing. He seems like a pretty smart guy, and he adds some interesting perspective in every interview I've seen with him. What he's able to do with very limited resources, starting without a lot of expertise in the area, is not easy and produces something of value.
 
He doesn't have a coherent strategy.

He views the problem like a child trying to make fireworks out of **** they found in a recycling bin.

I'm just gonna take some-a-dat imitation learning, plug in some reinforcement learning, maybe decompose some eigenvectors, get some big data, do some k-means clustering, set up some beefy servers and DAMN it's gonna take off like a rocket, you'll see!

His real understanding of the subject is comically limited.

Given you’ve hinted at your reputation in the field, I’m curious what your thoughts are on the path to a robotaxi future. What are the problems that need to be solved? Outside of Waymo, I find it hard to believe that anyone is remotely close to operating a robotaxi service.

Also, I see that you discounted imitation learning in the planner by citing a paper from Waymo that trained on 60 hours of driving data. I’m curious if you really believe that is enough training data to make such a conclusion, as that seems like a relatively minute amount of data from a back-of-the-napkin estimation (60 hours is only 60 × 60 × 60 = 216,000 non-overlapping one-second examples, most of which are relatively uninteresting).
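
For what it's worth, here is that back-of-the-napkin estimate spelled out (my arithmetic and a made-up "interesting" fraction, not anything taken from the Waymo paper):

```python
# Back-of-the-napkin version of the estimate above: 60 hours of driving,
# chopped into non-overlapping 1-second windows, each treated as one example.
hours = 60
examples = hours * 60 * 60                    # = 60**3 = 216,000
print(f"{examples:,} one-second examples")

# If, say, 99% of that is uneventful cruising (a made-up fraction), only a few
# thousand examples cover interesting maneuvers, which is a small basis for
# writing off imitation learning in the planner.
interesting = int(examples * 0.01)
print(f"~{interesting:,} 'interesting' examples")
```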
 
Also, I see that you discounted imitation learning in the planner by citing a paper from Waymo that trained on 60 hours of driving data. I’m curious if you really believe that is enough training data to make such a conclusion

Imitation learning is hugely data-intensive, and very data-inefficient. The smart money is pursuing much more data-efficient strategies.

It isn’t known — even theoretically — if imitation learning would ever truly solve the problem. You can burn up an entire 1 MW data center looking at the inputs from a single car, and you still can’t solve it. That means it’s at least 3-4 orders of magnitude removed from anything practical on a car. It might be 6-7 orders. Or 10. No one knows. Trying to make imitation learning work today is a waste of time and resources.

Also, practical hardware limits today are < 100M parameters per net, so their capacity is insufficient to gain value from 1B+ examples. Again, it might work one day, but that day isn’t today, and probably won’t happen for another decade. Machine learning has worked this way for a while: many core ideas were figured out 30 years ago, but it took orders of magnitude more data and compute power to make it actually work.

IMO, imitation learning is nothing at all like how humans learn. It’s kinda nuts, really. A billion examples to drive in a parking lot? By the time we have enough compute for imitation learning to actually work, we will have already found much more “human-like” techniques that will obviate imitation learning.

There are many more promising approaches that already work better and cost less. In 2019, imitation learning is mostly a philosophical question. Ask me again in 2029.
 
Imitation learning is hugely data-intensive, and very data-inefficient. The smart money is pursuing much more data-efficient strategies.

It isn’t known — even theoretically — if imitation learning would ever truly solve the problem. You can burn up an entire 1 MW data center looking at the inputs from a single car, and you still can’t solve it. That means it’s at least 3-4 orders of magnitude removed from anything practical on a car. It might be 6-7 orders. Or 10. No one knows. Trying to make imitation learning work today is a waste of time and resources.

Also, practical hardware limits today are < 100M parameters per net, so their capacity is insufficient to gain value from 1B+ examples. Again, it might work one day, but that day isn’t today, and probably won’t happen for another decade. Machine learning has worked this way for a while: many core ideas were figured out 30 years ago, but it took orders of magnitude more data and compute power to make it actually work.

There are many more promising approaches that already work better and cost less. In 2019, imitation learning is mostly a philosophical question. Ask me again in 2029.

What strategies do you consider more data efficient? If we assume a priori that the planner cannot be designed in a programmatic way, which I consider extremely likely, the only current alternative I’ve seen proposed is reinforcement learning in a simulation. RL is not exactly known for being data efficient.

I agree with your observation about the hardware required, but systems like the Cerebras Wafer Scale Engine are starting to come to market, so the required hardware may not be too far away.
 
Imitation learning is hugely data-intensive, and very data-inefficient. The smart money is pursuing much more data-efficient strategies.

It isn’t known — even theoretically — if imitation learning would ever truly solve the problem. You can burn up an entire 1 MW data center looking at the inputs from a single car, and you still can’t solve it. That means it’s at least 3-4 orders of magnitude removed from anything practical on a car. It might be 6-7 orders. Or 10. No one knows. Trying to make imitation learning work today is a waste of time and resources.

Also, practical hardware limits today are < 100M parameters per net, so their capacity is insufficient to gain value from 1B+ examples. Again, it might work one day, but that day isn’t today, and probably won’t happen for another decade. Machine learning has worked this way for a while: many core ideas were figured out 30 years ago, but it took orders of magnitude more data and compute power to make it actually work.

IMO, imitation learning is nothing at all like how humans learn. It’s kinda nuts, really. A billion examples to drive in a parking lot? By the time we have enough compute for imitation learning to actually work, we will have already found much more “human-like” techniques that will obviate imitation learning.

There are many more promising approaches that already work better and cost less. In 2019, imitation learning is mostly a philosophical question. Ask me again in 2029.

Much better, thanks.

You are absolutely correct that this is not how humans learn. Neural nets, both in software and hardware, are really dumbed-down versions of human neural processes. There are much better, or I should say more accurate, models, but they are so computationally heavy that they are only useful in a neuroscience lab, with no real idea of how to turn them into something useful for real-world tasks.

The power consumption raises the obvious issue that we are missing some important pieces before this goes to the next level. A single human brain is powered off of a banana and some crackers and still outperforms all but the most highly trained (and narrowly focused) nets, especially watt for watt.

So, given that human brains are often capable of one-trial learning (though only when the brain already has sufficient context about its environment, so it's not truly out-of-the-box one-trial learning), which is pretty much the best we can hope for, while typical multi-layer neural nets require something on the order of 10k-trial learning (this is a dog...10k images later...ahhhh, a dog!): what does the path look like to go from where we are to where we will be? I'm assuming that the pathway includes some type of neural net, with ever-increasing hardware speed and efficiency, but where is the backbone of the software going? What is the breakthrough that will get us a couple of orders of magnitude more efficient at learning?

I have a hunch that current leaders are going to invest money in better use of available data. If I understand Project Dojo, it sounds like Tesla's attempt to use the same data they already have to reinforce more quickly and over a broader range of contexts, but that's not really the solution; it's just a way to help make the current 'system' more effective.
 
I would also add that we shouldn't box ourselves into 'human-type' learning. It's entirely possible that all this time spent trying to copy (and simplify) organic processes for computer learning is interesting and has some value, but ultimately is not the most direct path to where we want to be. Metaphorically, that is like spending time developing walking cars rather than making an electric motor that just spins to make a car move. To my knowledge there are no organic equivalents to a rotating motor, but I digress.

It might be some off-the-wall physicist who looks at the problem in a completely new and profound way and completely pivots the industry... I think that, though not a new technology, computer learning is still open to some completely game-changing ideas. I say this confidently because we have examples between our ears of something that works better, so we know we are nowhere near the limits of what is feasible. Maybe it's like lithium-ion batteries, where we keep improving what we have incrementally, enough that competing ideas never mature, but I suspect there will be enough money and brains on this that we'll see some big changes over the next 5-10 years.

Having said that, what we have now is just kinda barely good enough to make self-driving cars a reality and only needs incremental improvement. Waymo sounds like they are going 'live' any day now and will slowly march out from there. So, it ain't efficient, but it's the stick we have, so it will be used until there is a better stick.

I would also say that the current hardware limits for mainstream computer learning and NN processing will be growing WAAAAY faster than they have in the past. In the past, the big players largely ignored it and most work was done on regular old data centres and PCs. The near future will be a shift to dedicated hardware, not unlike the direction Tesla is going in with HW3. There is money in it now, and we're still early enough in the technology that we can make huge improvements over a short period. I know Google has been refining its own learning hardware, and NVIDIA is obviously jumping in pretty deeply as well; its year-over-year improvements at the same price point have been pretty significant, and they haven't even really started to leverage any of Intel's next-gen FPGA stuff. So if we get a 2-5x year over year improvement in hardware and a 15% improvement in software...your 100M limit and 6-10 orders of magnitude won't take a decade for instance. Our ability to collect, store, classify and use data (in particular for self-driving but in other areas as well) is also growing quickly. So whereas right now using 1 billion trials is no benefit and we don't have a billion trials to learn from anyway, in 5 years we'll have a trillion trials and be able to use most of it...maybe.
 
I would also add that we shouldn't box ourselves into 'human-type' learning.

computer learning is still open to some completely game-changing ideas

Yeah, let’s not discount sensors that are “not naturalistic” either. The jet engine looks nothing like a bird, but is much better.

In the past, the big players largely ignored it and most work was done on regular old data centres and PCs.

No one has done deep learning on CPUs in a decade.

the direction Tesla is going in with HW3.

Tesla is easily 8 years late to the party.

they haven't even really started to leverage any of Intel's next-gen FPGA stuff

FPGAs are 4-10x less power efficient than ASICs.

So if we get a 2-5x year over year improvement in hardware and a 15% improvement in software...your 100M limit and 6-10 orders of magnitude won't take a decade for instance

Check your math: 5^9 ≈ 2 × 10^6, so even a sustained 5x per year only buys you about six orders of magnitude in nine years, which means your 6-10 orders still take roughly a decade or more. And 5x improvement per year is delusional.
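
Spelling out the compounding math (my arithmetic, not either poster's exact figures): the number of years needed to close a gap of N orders of magnitude at a fixed year-over-year improvement factor is N / log10(factor).

```python
# Years needed to close a gap of N orders of magnitude at a fixed
# year-over-year hardware improvement factor: N / log10(factor).
import math

def years_to_close(orders_of_magnitude, yearly_factor):
    return orders_of_magnitude / math.log10(yearly_factor)

for factor in (2, 5):
    for gap in (6, 10):
        print(f"{gap} orders at {factor}x/year: "
              f"{years_to_close(gap, factor):4.1f} years")
# Even at a sustained 5x/year, 6 orders of magnitude takes ~8.6 years and
# 10 orders takes ~14.3 years, i.e. roughly a decade or more.
```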
 
I think for me the “common sense” question that is hard to wrap my head around in imitation or reinforcement learning is the end-to-end scenario, which kind of best illustrates the problem.

We have of course seen end-to-end solutions that are capable of keeping the car in its lane on a simple road, but the sheer diversity of events, written signs and whatnot in a parking lot or an urban environment is so massive: how could you possibly teach all that to a computer brain? It would have to approach general AI to handle all that in an end-to-end manner where only visual and driving inputs are used to teach the computer.

Think about it: the sudden appearance of a police officer or similar directing traffic by hand, complex written signs dictating parking or lane use, situations with multiple simultaneous players body- or hand-signaling each other to resolve a gridlock, sometimes using the horn or flashing their lights. How could an end-to-end, imitation-learned AI handle all of these... and how would you teach it?

Which then, inevitably, returns to the question of what combination of different technologies you need to make it all work realistically.
 
No one has done deep learning on CPUs in a decade.
Lol, I happen to know that there are plenty of people STILL doing this. Even (or maybe mostly lol) at a research level, brain modelling, for instance, is quite commonly done on CPUs because they are easier for many people to work with and more readily available. I guess that's not really relevant to what the big guys are doing, but CPUs are still used quite a bit even though it makes no sense.


Tesla is easily 8 years late to the party.
Not really. Tesla isn't a research company. How many other companies have released custom hardware on a large scale specific to their learning approach that is in consumer hands? Yes, lots of companies have their own internal-facing hardware, but in user devices? Even the cell phone industry has only in the last year or two started to include 'AI' chips in earnest...and this hardware has gotten better at insane rates because it started as an afterthought and now it's being included right on chip dies (probably close to my 5x per year 'delusional' number over the past ~3 years). Similarly, Tesla is already talking about HW4, which is supposedly another multiple times faster, maybe next year. This isn't because clock speeds or AI tech is getting better; it's because the customer-facing hardware is in its infancy and companies are just learning how to cost-effectively scale/integrate the pieces. Low-hanging fruit, if you will.


FPGAs are 4-10x less power efficient than ASICs.
But they also have unique qualities that can be leveraged to improve overall system efficiency and throughput. An FPGA element that is adaptable to fit a specific build of software, if actually leveraged (which it typically is not, which is why they aren't used as often as they could be), opens some interesting opportunities. I know some people working on Intel's next-gen FPGA stuff, and they would beg to differ about power efficiency when considered at a system level. FPGAs have historically struggled to move from the fringe, but they have undeniable advantages. If you could check a little box in the compiler that said 'optimize for FPGA' maybe things would be different.

NOTHING these days takes a decade...except fusion and fuel cells lol.
 
How many other companies have released custom hardware on a large scale specific to their learning approach that is in consumer hands?

Uhhhh.... Apple, Google, Movidius, Mobileye, Intel, and NVIDIA all have shipping products, in customers' hands, with specialized hardware for deep learning. Maybe you've heard of some of them?

Like I said, Tesla is 8 years late.

But [FPGAs] also have unique qualities that can be leveraged to improve overall system efficiency and throughput.

FPGAs are never faster or more power-efficient than the same logical function in custom silicon. Do you know anything at all about FPGAs?

I know some people working on Intel's next-gen FPGA stuff, and they would beg to differ about power efficiency when considered at a system level.

Okay, show me some documentation.

FPGAs have historically struggled to move from the fringe, but they have undeniable advantages.

FPGAs are not fringe. They are used for V&V of digital designs before fab, and they are used as glue all over the place where custom silicon would be cost-prohibitive in terms of unit economics.

If you could check a little box in the compiler that said 'optimize for FPGA' maybe things would be different.

Are you not aware that FPGAs literally use compilers to translate HDL into the bitstream written into the FPGA's SRAM?

Why the **** am I talking to you, again?
 