Welcome to Tesla Motors Club
Discuss Tesla's Model S, Model 3, Model X, Model Y, Cybertruck, Roadster and More.

AI experts: true full self-driving cars could be decades away because AI is not good enough yet

I think it will take too long to solve all edge cases with just real world driving alone. There are way too many. You said it yourself, there could be as many as 10^37 cases. I think you need BOTH real world driving AND simulations.



Yes, you still need to solve edge cases but with pre-mapping, the car is starting with some pre-knowledge of what the road looks like. So you don't need to start from scratch, teaching the car what the intersection looks like. You don't need to train the car to identify every single intersection since it will already have that information before it drives. You can focus on edge cases with moving objects.

Simulations don't make up for lack of edge cases. That's just a known concept in data science / statistical modeling.

Say you are collecting data where only 2 features matter. Then you can plot all your data points on a 2D plot. Say Waymo collects the green points. That is their known reality. They create simulations that basically fill in all the area (pink). This is useful so that their model doesn't overfit the green points and creates a more stable solution (within the bounds of the green points).

Tesla, by virtue of collecting a lot more data, finds more edge points that are actually weirder and further from the center (norm) than anyone expected. Tesla then adds simulations (blue + pink) to fill in their gaps.

Waymo doesn't even know the brown points existed. The area between the brown and green points is not considered by Waymo. They have no simulation data on it. Ergo, when presented with these conditions in real life, their model will fail.

Because these models aren't smart, they are just enormous interpolation machines.
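The "interpolation machine" point is easy to sketch in a few lines of Python. This is a toy illustration of my own, not anything from Tesla's or Waymo's actual stack: fit a flexible model on data confined to a narrow region, and it does fine inside that region but falls apart on a point far outside it.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Green" points: real data confined to a narrow region (the fleet's experience).
x_train = rng.uniform(-1.0, 1.0, 200)
y_train = np.sin(2 * x_train) + 0.05 * rng.normal(size=200)

# Fit a degree-9 polynomial -- a stand-in for a big interpolation machine.
coeffs = np.polyfit(x_train, y_train, deg=9)

# In-distribution test points interpolate well...
x_in = np.linspace(-0.9, 0.9, 50)
err_in = np.abs(np.polyval(coeffs, x_in) - np.sin(2 * x_in)).max()

# ...but a "brown" point far outside the training region fails badly.
x_out = 4.0
err_out = abs(np.polyval(coeffs, x_out) - np.sin(2 * x_out))

print(f"max in-distribution error:  {err_in:.3f}")
print(f"out-of-distribution error: {err_out:.3f}")
```

Inside the training range the error is tiny; at x = 4 the fitted polynomial blows up, because nothing constrained it out there.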

[Attached diagram: untitled (1).png]


In reality, the numeric space between the green and brown points might be small, but it spans so many different dimensions that the actual "volume" (as in 100-dimensional volume) that is missed could be enormous.
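That high-dimensional "volume" effect can be checked numerically. As a toy illustration (my own construction, not from any AV stack): treat the well-covered region near the norm as a unit ball and the full space of conditions as the enclosing cube, and watch the covered fraction collapse as dimensions are added.

```python
import math

# Fraction of the cube [-1, 1]^d occupied by the inscribed unit ball.
# The ball stands in for the densely covered region near the "norm";
# everything outside it is potential edge-case territory.
coverage = {}
for d in (2, 10, 100):
    ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)  # unit-ball volume
    cube = 2.0 ** d                                     # cube volume
    coverage[d] = ball / cube
    print(f"d = {d:3d}: covered fraction = {coverage[d]:.2e}")
```

In 2D the ball covers about 79% of the cube; by 100 dimensions the covered fraction is on the order of 10^-70. Nearly all of the volume sits in "corners" the central data never touches.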

Further, the power of simulations is amplified by having more unique raw data. The blue points provide more power than simply adding more dense pink points in a confined region.

Again, Waymo will never talk about this even though it is a fundamental concept in statistical modeling.
 
Humans drive without all this "edge case" data. Doesn't this approach violate the "first principles" methodology?
It seems like a huge leap to conclude that edge case data will be a substitute for human reasoning.
 

I think we both agree, humans are smarter than these algorithms.

Like I said, all these fancy algorithms are just giant interpolation machines. That's what they're really good at.

Maybe it won't be enough to substitute 100% of human reasoning. But I wouldn't want to operate on an even more constrained dataset like Waymo or Cruise...
 

But I wouldn't want to operate on an even more constrained dataset like Waymo or Cruise...

And yet, with "only" 20M real world miles and 15B simulated miles, Waymo has L4 FSD. So their "constrained dataset", as you put it, was good enough to achieve driverless L4 FSD.

Also, note that the simulated miles are 750x greater than the real miles. So Waymo achieved FSD with mostly simulated miles. Again, I am not suggesting you don't need real miles. You do. But I think some Tesla fans exaggerate the importance of real world miles because the fact is that Waymo did not need billions of real world miles to achieve good FSD. Waymo has achieved L4 FSD that works really well anywhere that is mapped with just a few million real world miles and the rest augmented by simulation.
 


I don't care how many simulated miles they did. That's great marketing. I, and basically every other data scientist I've seen comment on this issue, agree that simulation doesn't replace real edge data. I just showed you a diagram.

Do you understand it?

You could certainly argue that Waymo has enough real world data. It's funny because Waymo's reality actually supports my case. They have certainly been successful, but their deployment of L4 operating in one suburb is the definition of "constrained". If their approach were so robust, it would work anywhere in the U.S. with sufficient accuracy.
 
Could you define edge data? Give an example? And say how Tesla uses it? I guess I don't really know what it is.

Tesla probably has more images of curbs than any company on earth and yet beta FSD (and Smart Summon) still hit curbs. I remember Elon commenting about how difficult it is to detect curbs. Other AV companies have solved this problem by putting sensors that can directly detect curbs instead of relying solely on neural nets.

Waymo doesn't even really know if their vehicles work within the constrained area. If next week one of them mows down 6 pedestrians they would instantly drop to being 100x less safe than human drivers! It seems a bit early to increase the scope of the problem to everywhere in the U.S.
 
I don't care how many simulated miles they did. That's great marketing. I, and basically every other data scientist I've seen comment on this issue, agree that simulation doesn't replace real edge data. I just showed you a diagram.

Do you understand it?

Yes, I understand your diagram. I am not questioning your diagram. I am questioning your conclusion. You say a lot of real world data is king and Tesla has more real world data than Waymo. So if you are right, Tesla should have solved way more FSD than Waymo? Yet, we see the opposite. We see that Waymo has solved more FSD than Tesla.

You could certainly argue that Waymo has enough real world data.

Yes, Waymo has enough real world data. That's my point. You don't need billions of real miles to do FSD.

They have certainly been successful, but their deployment of L4 operating in 1 suburb is the definition of "constrained".

Waymo is not constrained to just Chandler. That's just where they launched a public driverless ride-hailing service. Waymo has FSD that works in many cities across the US. Proof is that they reported millions of autonomous miles in CA. They also have autonomous test cars in Orlando, Detroit, and other cities. And Waymo has autonomous trucks driving I-45 in TX. So, Waymo Driver is far from constrained to just one suburb.

If their approach was so robust, it would work anywhere in the U.S with sufficient accuracy.

Certainly, the accuracy might not be good enough everywhere for driverless everywhere yet, but the Waymo driver does work anywhere in the US. Proof is that Waymo has test cars driving autonomously in cities across the US.

And Tesla's FSD Beta does not work reliably everywhere in the US with sufficient accuracy either or Tesla would be able to remove driver supervision already. So, it is not like Tesla's FSD beta is that robust yet either. The fact is that everybody is working on solving FSD. Nobody is quite there yet.
 
Could you define edge data? Give an example? And say how Tesla uses it? I guess I don't really know what it is.

Weird cases previously unseen or even thought of (e.g., not in simulations).

My argument is not that Tesla is currently taking advantage of all this extra edge case data. They are stuck somewhere else... They pick out useful data to train any one of their perception modules, like stop signs. But I don't think they are at an accuracy level yet where really distinct edge case path planning / driving policy data is needed to improve their control algorithms.

I mean their driving policy is hard coded right now, so I don't expect diverse edge case data to improve it until that goes more into a deep net...

My point has nothing to do with Tesla being better (they certainly currently are not); it is simple pushback against the idea that simulations can make up for a lack of edge case data. They do not. It is a different mechanism. Data scientists know this.
 

You are making a strawman argument that I am saying Tesla is going to have superior FSD tech and I am not saying that. I am simply pushing back against your incorrect narrative that simulations make up for lack of unknown edge case data. They do not.

Your constant support of simulations as the key has nothing to do with your own experience with any of the technologies, but simply what you've read from Waymo and others. Simulations are not creating weird cases that may only happen in Fargo, North Dakota, because Waymo engineers haven't come across those weird cases before. Ergo, if Waymo deployed their car in Fargo right now, it might have some failures that they can't predict. That's why they start out deploying only in specific areas (that they have collected a lot of data in).
 

Why so hostile towards simulations? I am not suggesting that simulation can replace all edge case data, and I am not suggesting that simulations alone can solve everything. Of course you need real world data. I am merely suggesting that simulation can help with some edge cases. For example, I can add a simulation of a broken stop sign, a stop sign missing the "S", a stop sign with glare, etc... Those are some edge cases. No, it won't cover all edge cases, but it will help my data by adding some common ones.

And simulations can be used for driving policy by creating common driving scenarios, like an intersection with a car coming from the right and a pedestrian crossing on the left, to see how the AV would react before putting it in the real world. So I've merely pointed out that simulations are very useful. That's all. I think the fact that Waymo has achieved the FSD that they have with 20M real miles augmented by 15B simulation miles proves that real world data and simulation data are both important.

And yes, Waymo would encounter edge cases that they've never encountered before. Nobody is denying that. Waymo has not solved all edge cases. But I am sure Tesla would also encounter some edge cases that they've not solved yet either.
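For what it's worth, the kind of augmentation described above (glare, damaged signs) is only a few lines in practice. Here is a toy numpy sketch; the transforms and the stand-in "sign" image are invented for illustration and are not any company's actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(2)

def add_glare(img, strength=0.6):
    """Brighten a random patch, mimicking sun glare on a sign."""
    out = img.copy()
    h, w = img.shape[:2]
    y, x = rng.integers(0, h // 2), rng.integers(0, w // 2)
    out[y:y + h // 2, x:x + w // 2] = np.clip(
        out[y:y + h // 2, x:x + w // 2] + strength, 0.0, 1.0)
    return out

def occlude(img, frac=0.3):
    """Black out a horizontal band, mimicking a sticker or broken corner."""
    out = img.copy()
    out[: int(img.shape[0] * frac)] = 0.0
    return out

# Stand-in "stop sign" image: 32x32 RGB with values in [0, 1].
sign = rng.uniform(0.2, 0.8, size=(32, 32, 3))
augmented = [add_glare(sign), occlude(sign)]
```

Each transform yields a plausible-looking variant of a real sample, which is exactly why this style of simulation helps with known failure modes even though it cannot conjure unknown ones.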
 
I mean their driving policy is hard coded right now, so I don't expect diverse edge case data to improve it until that goes more into a deep net...

Yeah this is my impression as well. I just don't think doing driving policy with a neural net (as they exist) will work. Sure, you can try to make modules that predict other road users behavior (both Tesla and Waymo have said they do this).
I will say that, if they can safely use their customers as testers, Tesla has far more testing capability than anyone else. However when I watch beta FSD videos it doesn't seem like they need more testers, they are still finding plenty of problems to fix. Waymo, with only about ten million miles of testing, is certainly missing edge cases and they'll need to test a lot more to prove safety. I suspect that their system will be safe enough at handling these edge cases by requesting remote assistance but we'll have to wait and see what happens.
 
You are making a strawman argument that I am saying Tesla is going to have superior FSD tech and I am not saying that. I am simply pushing back against your incorrect narrative that simulations make up for lack of unknown edge case data. They do not.

He's the Ambassador from Waymo. Expect nothing but glowing words and you'll no longer be peeved at what is better posted on a Waymo fan site. :rolleyes:
 
Silicon Valley is immensely boring and you can't make up the stuff you see in the real world.

Here's a guy on a bicycle, riding no-footed, with a white piece of cardboard in front (!!??), coming the opposite way. Don't know if this should be classified as just another cyclist. :oops:🤨
[Photo attachment: Bike_rider.jpg]




 
Why so hostile towards simulations?

I'm not. I'm hostile to the idea that increasing simulations reduces the need for unique data. So when someone talks about how many simulated miles Waymo does... well, I don't really care how many simulated miles. Anyone can do however many simulated miles they want to. They just increase the number of trials they want generated and hit go. It's a dumb metric. People will increase simulated miles to the point that their models stop improving.

We can trust that basically everyone will do this appropriately because all these companies have a lot of machine learning engineers / data scientists.

I am triggered because I am a data scientist and a signal processing engineer and algorithm developer, so I am very familiar with all these tools and processes. Simpler algorithms that can be hand coded are also easier to make simulations for (because the range of possible outcomes is simpler), but they are also then less dependent on simulation data.

Data (notice it's not simulation) scientists almost always use simulations as part of their training. But no one brags about their simulations when they know they are missing a lot of real data and they know their simulations don't cover everything. That's what's annoying.
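One way to make the "different mechanism" point concrete: simulated data generated by recombining what you've already seen stays inside the envelope of what you've already seen. A toy numpy sketch, my own construction and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Green" points: 200 real samples over 2 features.
real = rng.normal(0.0, 1.0, size=(200, 2))

# "Pink" simulated points: random convex combinations (blends) of
# pairs of real points -- they densify the covered region...
i = rng.integers(0, len(real), 5000)
j = rng.integers(0, len(real), 5000)
t = rng.uniform(0.0, 1.0, size=(5000, 1))
sim = t * real[i] + (1 - t) * real[j]

# ...but they can never step outside the real data's envelope: every
# simulated coordinate stays within the observed per-feature min/max.
inside = bool((sim.min(0) >= real.min(0)).all() and (sim.max(0) <= real.max(0)).all())
print("simulated points escape the real-data envelope:", not inside)

# A "brown" edge case sits far outside; no amount of blending reaches it.
brown = np.array([0.0, 30.0])
closest_sim = np.linalg.norm(sim - brown, axis=1).min()
print(f"nearest simulated point to the brown case: {closest_sim:.1f} units away")
```

Adding a hundred times more blended points changes nothing; the envelope is fixed by the real samples. Only new real data (the blue and brown points) moves it.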
 
Yeah this is my impression as well. I just don't think doing driving policy with a neural net (as they exist) will work. Sure, you can try to make modules that predict other road users behavior (both Tesla and Waymo have said they do this).

I don't know if it will work or not. If they can't figure out the right architectures that could even potentially learn robust enough control policies, then the value of that edge data reduces a lot.
 

I only mentioned Waymo's simulated miles because the simulated miles are one important part of what Waymo uses to build their FSD software. If you only mention Waymo's real world miles, you are not giving the whole picture of how Waymo is building their FSD software. But I am advocating for real world data + simulation. I am certainly not saying that you don't need real world data.
 
He's the Ambassador from Waymo. Expect nothing but glowing words and you'll no longer be peeved at what is better posted on a Waymo fan site. :rolleyes:

I try to share facts, as best I know them, and interesting news about all AV companies because I am interested in all FSD, not just Tesla's FSD. Maybe I am naive but I would hope that people would be interested in learning about FSD from different sources, not just Tesla.

Yes, I post good things about Waymo but that is only because Waymo is the leader in FSD. And I do try to correct misinformation about Waymo because misinformation annoys me. And if we are really interested in truth, then we should be against misinformation about Waymo, just as much as we are against misinformation about Tesla.
 
Wall Street Journal has an interesting article today about self-driving cars:


Basically, some AI experts are arguing that AI is not good enough yet for true full self-driving cars. They point out that our best self-driving cars still need some help, like with HD maps and remote operators. So they think we will see limited self-driving cars, like we are seeing now with Waymo and others, but true full self-driving cars, that can drive anywhere with no human assistance, are still decades away. They explain that current AI is good at seeing patterns but is not good at extrapolation:



Other experts, including at Waymo and Aurora, argue that you don't need to "solve AI" in order to have true full self-driving:



-----------

My take: I think the article is right about the current challenges with AI and that we probably won't see true L5 in the short term. However, I think the article is probably wrong about it taking decades to "solve FSD". We tend to underestimate the speed of technological progress. Just look at how quickly computers have evolved. I think it is very possible that we could see some big AI breakthroughs in, say, 5 years that help us achieve better FSD sooner than "decades". We might also find clever engineering ways to "solve FSD" without solving AI, as Rajkumar suggests. After all, we've solved a lot of tough engineering problems already without "solving AI". So I think self-driving tech will get better and better and we will see more self-driving cars on the roads in the years to come. I am optimistic that it won't take decades to "solve FSD".

I would also argue that limited L4 self-driving may be good enough for now, at least for the short term. Sure, true generalized L5 self-driving cars, with human-like intelligence, would be the holy grail, but I don't think it is necessary. After all, the goal is to achieve self-driving cars that are safe and reliable and serve a useful application like ride-hailing. Does it really matter how we achieve that goal, as long as we achieve it? If it takes some geofencing, HD maps, etc., to achieve the goal, so what?
I'm a machine learning guy running a company with 75 people. They are right about machine learning when it comes to knowledge representation. Today's algorithms even outperform humans when it comes to perception, i.e. detecting visually what elements are in the world, and to some degree they can extrapolate their movements.
Knowledge representation would mean that you have an idea of all the things a human, a car, a cyclist and a dog can do in any kind of traffic situation. This is still hard even given the amount of visual information that is collected across all Teslas driving around. However, this doesn't have to be solved at the machine learning (deep learning) level. We may use symbolic or graph-based methods for achieving this.
IMO, HD maps don't help here; the idea of precise maps increasing knowledge is just wrong, similar to increasing the megapixels of the on-board cameras. To the contrary, the car has to have an idea of a building and other elements in the world to make predictions (for example, understanding the processes at a road block / building site).
Another way of getting insights about interactions may come from wireless signals, which is what the phone companies (like Vodafone and Google) do. This information is not real time, but it may help to predict things that are not immediately perceivable, like the end of a traffic jam on the autobahn around the corner.
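To illustrate what a symbolic/graph-based layer could look like, here is a toy sketch of my own invention; every name in it is hypothetical, not any production schema. The point is that possible behaviours per road-user type are stored and queried explicitly, rather than re-learned from pixels for every scene:

```python
# Toy symbolic knowledge base: what each road-user type can do.
# All type and action names are hypothetical, for illustration only.
CAPABILITIES = {
    "pedestrian": {"wait", "cross", "walk_along_road"},
    "cyclist":    {"wait", "cross", "ride_along_road", "merge_into_lane"},
    "dog":        {"wait", "dart_into_road"},
    "car":        {"proceed", "yield", "turn", "reverse"},
}

def plausible_actions(agent_type: str) -> set:
    """Look up what an agent could do next, independent of perception."""
    return set(CAPABILITIES.get(agent_type, set()))

# A planner can then reason over explicit possibilities: a dog near the
# curb *can* dart into the road, so slow down pre-emptively.
print(sorted(plausible_actions("dog")))
```

Perception still has to classify the agent, but the "what could it do" knowledge lives in a structure that can be audited and extended without retraining a deep net.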