I guess you have never been in a downtown area.
Yes, but that isn't relevant to the conversation. As a tech, were you allowed to go straight? Because your crippled Robotaxi couldn't (without possibly breaking the law).
Adding LIDAR to this setup, today, immediately bumps that to 85 on this scale. You get to leapfrog vision overnight and rely on this incredible dot cloud to navigate the world. But you can’t get past 85 with (year 2020’s) vision capabilities + LIDAR.
LIDAR’s additive abilities END where they are today, more or less, because it can’t see signs or markings or light colors or any of the things we need to read to navigate the world safely, and it is fundamentally incapable of ever doing so. LIDAR lets you cut in line, but not to the front: only to 85. Not good enough for the end goal, but it's some immediate and impressive progress.
Now what? LIDAR plus 2020-vision got you to the 85 mark and you’re stuck. You cannot opt out of lane markings and signs and speed limits and light colors and all this stuff. That will never be optional. It must be extremely reliably solved and LIDAR can’t help. You’ll never get to 100 with this setup.
So then what? Without solving vision to the degree that it can do all this on its own you never get to 100.
So you solve vision to say 95. LIDAR is still back there at 85 and it’s not going to contribute towards the finish line because we already baked its capabilities into the 85%-capable system. We’ve passed LIDAR’s limits. So we’re at 95 on the scale with vision, and LIDAR is done contributing in a meaningful manner, and vision is doing all the heavy lifting in nearly every scenario. If we don’t pass 95 then no L5 for anyone and LIDAR isn’t going to help. It’s already accounted for.
Now what? How do you start creeping past 99? It ain’t LIDAR. It’s vision. And to do this vision must have already surpassed LIDAR’s early leapfrog on the timeline. We don’t need LIDAR anymore because we fundamentally can’t need it to solve this problem. We need something else.
While I remain skeptical that this problem will ever be solved I am solidly convinced that LIDAR will not be contributing past its leapfrog point. I’m convinced that only enormous data ingestion feeding neural net development (plus radar and ultrasonics) will ultimately get us to the 100 mark on this scale. And I’m convinced that Tesla is the only company with the fleet and tools deployed to pull this off, if it’s possible at all.
Here's a bit of a fun video for all of the LIDAR fans:
I will concede that some of these quotes are taken out of context, especially the Mobileye folks, who are still using LIDAR on their primarily vision-based system, but it's still an entertaining watch.
Except that you don't need as much CV computing power if you are using lidar. For example, you don't need CV to do "pseudo-lidar" if you are using lidar. So we don't know what the actual numbers will be for each variable in your equation.
So if you had a camera system with a 99.99% success rate,
and then you add a lidar/radar system with a 99.99% success rate,
this would effectively result in a system with a 99.999999% success rate.
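The arithmetic behind that claim can be sketched in a few lines of Python. The 99.99% figures are the hypothetical numbers from the post, and the formula assumes the two sensors fail independently:

```python
# Combined reliability of two redundant sensors, assuming their
# failures are statistically independent (a strong assumption).
def combined_reliability(r1: float, r2: float) -> float:
    """Probability that at least one of the two sensors succeeds."""
    return 1.0 - (1.0 - r1) * (1.0 - r2)

camera = 0.9999  # hypothetical 99.99% success rate
lidar = 0.9999   # hypothetical 99.99% success rate
print(combined_reliability(camera, lidar))  # ~0.99999999 (eight nines)
```

The whole result hinges on that independence assumption: the system only fails when both sensors fail at the same time, on the same task.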
You're taking a very "this is a BIG black box" approach to this. The next evolution (who knows how long it will take) for every sensored AV (camera with or without lidar, radar, etc.) will be advancements in training of huge, deep neural nets based on video segments.
This will be a state-of-the-art engineering accomplishment. No one knows how long it will take, how much compute it will take, what advancements in architectures will be needed, or how much inference compute will be needed.
You're taking a very "this is a BIG black box" approach to this.
That black box has been worked on for over a decade and has been broken down to smaller black boxes as some of the unknowns become knowns.
For instance, there will NOT be an FSD solution without radar of some sort.
This is because radar does way better than humans in inclement weather. And as radar gets better, that benefit will only grow.
During Autonomy Day and since then, Karpathy and even Elon have both outlined the framework and approach that they at Tesla are taking.
While it is still accurate that "we do not know how long it will take", I wouldn't classify it all as one big black box.
My position is that vision is needed but that it is unclear how accurate and reliable it can be. "Solving vision" to the needed 9's may take a long time, if it is even possible. So I am arguing for the "Waymo approach": you solve vision, but only to 99.99%, which is much easier, and you combine this vision with radar and lidar. Radar will help in cases where vision is poor, like inclement weather. In good weather, lidar will provide a "second opinion" on distance calculation, object detection, object classification, lane detection, etc.
Sorry, but this argument does not work. You say that you can solve vision to 99.99% and solve lidar to 99.99% and then you can multiply the error rates and get 99.999999% reliability. But this assumes independent failure modes for the different sensors. Two simple cases where this will not work are traffic lights and stop signs. Suppose you only had two sensors in front of the car: 1 camera and 1 lidar. If the camera fails, lidar may be able to detect a traffic light, but it would not know its state, so your car would need to stop at the light. Note that an HD map would not tell you the current state of the traffic light. Lidar may be able to detect a sign but it would not be able to interpret it. Again, if you do not know what a sign says, how can you be sure you can keep driving? The car must stop, unless you trust an HD Map, provided the HD map is available, and was updated recently.
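The traffic-light case shows where the multiplication breaks down. As a toy illustration (hypothetical numbers): for the sub-task of reading a light's state, lidar's success rate is effectively zero, so the redundancy formula collapses back to the camera's rate alone:

```python
# Toy model: probability of correctly reading a traffic light's state.
# Lidar can detect the light fixture, but it cannot see color, so its
# success rate on the state-reading sub-task is 0. Redundancy only
# helps when each sensor can actually perform the task in question.
def state_read_reliability(camera_ok: float, lidar_ok: float) -> float:
    return 1.0 - (1.0 - camera_ok) * (1.0 - lidar_ok)

camera = 0.9999    # hypothetical camera success rate
lidar_color = 0.0  # lidar contributes nothing to color/state
print(state_read_reliability(camera, lidar_color))  # collapses to ~0.9999
```

For tasks like this, the combined system is no more reliable than the camera by itself, no matter how good the lidar is.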
Other cases where cameras and lidars fail in different ways:
- Detection of objects with low reflectivity, such as black cars or certain clothing: cameras can do this just fine in daylight; lidars will have trouble
- Classification of smaller objects and phenomena: debris like tires, plastic bags; car exhaust: cameras have enough resolution and color recognition to recognize this as benign/not benign; a lidar will give you some points from it and you need to decide whether it's a real obstacle
- Detection/classification of nearby objects: If you mount a lidar on top of a car, you'll have huge blind spots in the immediate vicinity of the car, where you would need to fall back to cameras, radar, and ultrasonics. Or, if you would like to, again, add thousands of dollars to the cost of a car, you would add multiple extra lidars to deal with this.
- Classification of objects far away (> 100 m): The much lower resolution of lidar (~3-5% of a typical ADAS camera's resolution vertically) starts becoming a bigger problem at farther distances. Choose 100 m on this page and see (or rather, don't see) what details that one of the best lidars on the market, that needs to stick out at the top of the car, captures.
In blinding light, a car with lidar and cameras will also fail. It won't see the traffic light colors.
The camera-only FSD car would completely fail and would not be able to handle the intersection safely.
I don't think this line of argumentation works for or against LIDAR. As long as either vehicle can retain some information about their surroundings, a blinded autonomous vehicle can remember where the side of the road is and safely pull over.
Human drivers are temporarily blinded all the time, and we don't instantly crash because of our ability to recall our immediate surroundings.
Not any more... https://twitter.com/ray4tesla/status/1277843175294373888
The Byton K-byte is scheduled to be released next year.
Once you perfect vision, there's no need for lidar.
The people advocating for lidar + camera aren't considering the complexity of deciding "when" and "how" to hand over between vision and lidar when the two disagree. It's a problem rife with hyperparameters and micromanagement, and it only becomes more problematic over time.