Experts ONLY: comment on whether vision+forward radar is sufficient for Level 5?

We have a number of threads polluted with the opinions/rants/speculations of laymen (myself included). How about a thread inviting comment from *only* experts? If you are any of the following:
  • A grad student or researcher in AI/computer vision/computer science
  • Someone with a BS in computer science
  • A degree in computational or theoretical physics or math
  • A software engineer actually employed working with AI/neural networks/artificial vision/autonomous driving
  • An electrical engineer
  • Someone with domain expertise working with radar
Please comment and give us your thoughts.

If you are *NOT* a software engineer/researcher, please be quiet. For example, if you are a philosophy major like me working in finance who once took Pascal as a 12-year-old and has watched some YouTube videos with the CEO of Mobileye - SHUT THE HELL UP ON THIS THREAD. If you are a plastic surgeon who double majored in neuroscience - SHUT THE HELL UP ON THIS THREAD. If you took a 10-week coding bootcamp to learn to use Ruby to get a job building web apps 'cause your creative writing AA wasn't working hard enough for your financial future - SHUT THE HELL UP ON THIS THREAD.

Okay, experts please comment away. :)
 
A software engineer actually employed working with AI/neural networks/artificial vision/autonomous driving
That is the only category from your list that I believe is qualified to respond to the question you pose in your thread title. The other five categories you list are not qualified, in my non-expert opinion.

Therefore, it is highly unlikely that you will see any "expert" replies to your post. This is the internet, after all...
 
Yes, complete stereoscopic vision alone would be enough if the resolution were high enough and the processing powerful enough. Adding radar would just be a bonus, but it can be used to offset less-than-perfect processing or vision.
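
To put a rough number on the "resolution high enough" part - a minimal sketch, with assumed camera numbers rather than anyone's actual specs: depth from a stereo pair comes straight from the disparity, and the ranging error grows roughly with the square of distance, so pixel resolution directly limits how far out stereo vision can range reliably.

Code:
# Minimal stereo depth-from-disparity sketch; all numbers below are assumptions.
# Depth Z = f * B / d, with f = focal length in pixels, B = baseline in metres, d = disparity in pixels.

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance to a point given its disparity between the two cameras."""
    return focal_px * baseline_m / disparity_px

def depth_error(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """Approximate ranging error from a one-pixel disparity mistake: dZ ~ Z^2 / (f*B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px

f_px = 1200.0   # assumed focal length, pixels
base = 0.3      # assumed camera baseline, metres

for z in (10, 50, 100):
    print(f"At {z} m, a 1 px disparity error costs about {depth_error(f_px, base, z):.1f} m of depth accuracy")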

Also, I think this thread is rather rude to discount the opinion of folks you don't consider experts. In addition, most of the qualifications you list for being an 'expert' really have nothing to do with the problem statement at all. And the 'experts' themselves do not agree: Tesla clearly believes it is possible with its current hardware suite, and others have said it is not.
 
Yes, complete stereoscopic vision alone would be enough if the resolution were high enough and the processing powerful enough. Adding radar would just be a bonus, but it can be used to offset less-than-perfect processing or vision.

Wait - what field do you work in? :calisnow closes eyes and hopes for a computer scientist or engineer working in AI:

Also, I think this thread is rather rude to discount the opinion of folks you don't consider experts.

Agreed.


In addition, most of the qualifications you list for being an 'expert' really have nothing to do with the problem statement at all.

This is why I need an expert, clearly.

And the 'experts' themselves do not agree: Tesla clearly believes it is possible with its current hardware suite, and others have said it is not.

Do you believe others really think it is not, or are they simply covering their asses PR-wise, because laymen are clearly pacified by the presence of more sensors?
 
For starters, SAE autonomous driving grades need to be stated and understood up front:
SAE J3016 Automated Driving Standards
Level: 5
Name: Full Automation
Narrative Definition: the full-time performance by an automated driving system of all aspects of the dynamic driving task under all roadway and environmental conditions that can be managed by a human driver.
Execution of Steering and Acceleration/Deceleration: System
Monitoring of Driving Environment: System
Fallback Performance of Dynamic Driving Task: System
System Capability (Driving Modes): System

Definition of driving mode: Driving mode is a type of driving scenario with characteristic dynamic driving task requirements (e.g., expressway merging, high speed cruising, low speed traffic jam, closed-campus operations, etc.).

The above are all SAE definitions, verbatim.

In short, L5 requires full 360-degree situational awareness without requiring human intervention. For example, lane changes cannot be executed without keeping track of cars, motorbikes, bicycles, or pedestrians on the rear flanks. Front visibility alone is not sufficient.

In my opinion, up to L3 is quite straightforward to implement; AP2 should be able to accomplish it. The jump from L3 to L4 is very tricky, not just technologically but in terms of regulatory oversight. L4 to L5 is even harder.
 
For starters, SAE autonomous driving grades need to be stated and understood up front:
SAE J3016 Automated Driving Standards
Level: 5
Name: Full Automation
Narrative Definition: the full-time performance by an automated driving system of all aspects of the dynamic driving task under all roadway and environmental conditions that can be managed by a human driver.
Execution of Steering and Acceleration/Deceleration: System
Monitoring of Driving Environment: System
Fallback Performance of Dynamic Driving Task: System
System Capability (Driving Modes): System

In short, L5 requires full 360-degree situational awareness without requiring human intervention. For example, lane changes cannot be executed without keeping track of cars, motorbikes, bicycles, or pedestrians on the rear flanks. Front visibility alone is not sufficient.
Please, god please no. Please not another derailment where people begin talking/bickering about autonomy levels. god no please. We have so much of that. What we do not have is much informed discussion by domain experts on whether vision is sufficient to drive as safely as a human being and why/why not.
 
I have a BS in Comp Sci from University of Missouri, so based on your criteria, I qualify as an expert. :p

I would say, if vision alone is good enough for a human, then it should be good enough for software, with enough resolution and processing power. Of course, it may be easier with more input sources.

Of course, this leaves open the questions of how much code it will take, how long it will take to develop, exactly how much processing power is needed to run that code, and whether that's reasonable for a car without using too much energy. All of this remains unanswered. If your question was meant to include those factors, I don't think anyone can say yet. If your question was simply whether it's technically possible regardless of those factors, then I would say yes.
 
Hello,

Computer science BS + MS here. Not my main profession, but I took computer vision and AI courses and did projects at university.

Long story short, no I don't think it's possible to achieve Level 5 using current hardware.

Elon says we will be able to sleep in the car in 2 years. If that happens, I'll eat my hat.
 
Fine. You want a domain expert - me - not to post? I won't. Good luck defining L5 to suit yourself.

Please understand what he truly expects: a rational discussion of something he doesn't understand, framed loosely enough that any response that doesn't suit his agenda allows for a full tantrum reminiscent of DJ Trump. Cue Mr. A. Jackson, please...
 
A question for the experts. How can the Tesla evaluate cross traffic? It can't turn its head as a human would. It can't assess auditory cues. Which camera or sensor detects the person/car who runs the red at 50 mph, and stops the Tesla from proceeding into the intersection despite the light being green for the Tesla?
 
I'm on the internet, so that makes me an expert on most things. Whatever sensor system is employed, there will still be a need for a predictive AI engine to guess people's behaviors - for example, when they will sit in your blind spot or, as in SoCal, accelerate into it from a distance as soon as you indicate a lane change...
 
Nothing gives me the right, but I'm going to speak up for the dozens (or however many) experts at Tesla, and for Elon Musk, who obviously do think this is possible. If they don't pull it off, they'll have quite the PR mess on their hands.
 
AI expert here, though more in the natural language processing area, but I can extrapolate to Autopilot.

Short answer - Not possible.

Long answer -
You need four ingredients to make the L5 autopilot pizza:
- a) 360-degree situational awareness
- b) Fleet learning
- c) Data crunching, both in real time (like your head does) and learned (like your head does)
- d) A computer and a power source to support all this.

360-degree situational awareness
Compare it with your head: you have stereoscopic vision, but it's mounted on an axis (your neck). The AP cameras are not, and radar does not have enough resolution.
Cameras are vastly inferior to eyes, except that we can make cameras good along a single dimension.
Military drone cameras cost millions, and they get around the whole problem with brute force (very clear lenses, huge apertures, massive CCDs) ~ but then they produce so much data. A car with a 90 kWh battery can't run a computer powerful enough to process all that in real time.
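
Just to put a rough, made-up number on "so much data" - camera count, resolution, and frame rate below are assumptions for illustration, not any particular car's specs:

Code:
# Back-of-envelope raw video data rate for a hypothetical surround-camera rig.
# Every parameter here is an assumption, chosen only to show the order of magnitude.

cameras = 8                  # assumed number of cameras
width, height = 1280, 960    # assumed resolution per camera, pixels
bytes_per_px = 3             # 8-bit RGB
fps = 30                     # assumed frame rate

bytes_per_second = cameras * width * height * bytes_per_px * fps
print(f"Raw pixel stream: {bytes_per_second / 1e9:.2f} GB/s "
      f"({bytes_per_second * 3600 / 1e12:.1f} TB/hour before any processing)")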

Fleet Learning
I don't think they are taking advantage of their 'high resolution maps' yet, just given how the system behaves so far - and that is the 'learning' bit that your head is so good at and Tesla just isn't. And I don't think they will be able to take advantage of the high-rez maps either, not to the extent they'd like you to believe, simply because the car has neither enough data storage, nor enough computational power, nor a power source to support that kind of computation. Basically, that "fleet learning" thing is oversold way beyond what Tesla can realistically ever deliver. However, there's no way to measure the success of that ;-) so they can get away with the bluff. Story of corporate America: the judge (you) is dumber than the criminals (Tesla).

Anyway, what they CAN do is improve their algorithms based on data. I don't think they are doing that particularly well yet either, i.e. they are not considering every car's data. That may not actually be necessary for what they are trying to achieve for now.

What they cannot do: you drive 15 minutes before me and swerve around a pothole, and my Tesla magically learns from your experience without a programmer in freakmont writing a line of code - BS! Not happening.

Data crunching, both in real time (like your head does) and learned (like your head does)
Data crunching in real time ~ oh crap, that dickhead just swerved into my lane 'cuz he was texting and driving.
Learned ~ it's Friday night, better be extra careful of Honda Civics with Coke-can exhausts and underbody lights.

With AI, we could extrapolate missing bits of info, but this won't happen, for a few reasons:
-- Human beings do not possess the ability to write software that complex, not to that level, on that hardware, on a mobile platform.
-- Neither do we have the necessary hardware to process all that in a small enough, power-efficient enough package to mount on a car.
-- Your head is a computer with stereoscopic forward vision, but it has a ridiculous amount of computational power and far superior algorithms to what Tesla is writing (okay, that was below the belt, snicker).
-- Your head cannot do L5 under all situations either. Can you drive in fog? In a downpour? In a snowstorm?

Now, with AP2 hardware, can we do it? Sorry, nope!
We have the necessary 'cameras' and 'sensors', but not the necessary software or the hardware to run that software.
However, we can get pretty damn close, 90% of the time.
Tesla is taking the logical path here: using Linux/GCC/graphics-card acceleration repurposed for AP computation. Basically they are throwing brute-force computation at the problem and reacting as fast as they can to give you the illusion of FSD... which is good enough for the majority of situations.

A computer and a power source to support all this.
So the 'learned' portion is some bonehead in freakmont learning for the computer and writing out code. But real-time learning replacing that bonehead - well, it's possible, one day. But we need vastly superior hardware to what we have today. If we had it, countries wouldn't be racing each other to build the best supercomputer possible, while not really explaining what the hell they use them for.

Summary:
With AP2 and the current hardware, Tesla will be able to give you the illusion of FSD that will actually work in 80-90% of situations, which is plenty good and fairly impressive. However, it's not as sci-fi as Elon is trying to sell you. And yes, it's a shame that no other automaker can figure this out. PS: This is the internet, so I guess I pulled off being an expert pretty well, no?
 
I would include people who worked on the Tomahawk Missile program - they have over 2000 successful strikes in a 3D environment.

Key elements of guidance for the THM:

  • IGS - Inertial Guidance System
  • Tercom - Terrain Contour Matching
  • GPS - Global Positioning System
  • DSMAC - Digital Scene Matching Area Correlation

The IGS is a standard acceleration-based system that can roughly keep track of where the missile is located from the accelerations it detects in the missile's motion. Tercom uses an on-board 3-D database of the terrain the missile will be flying over: the system "sees" the terrain below with its radar and matches it against the 3-D map stored in memory. The Tercom system is responsible for a cruise missile's ability to "hug the ground" during flight. The GPS system uses the military's network of GPS satellites and an onboard GPS receiver to determine its position with very high accuracy.
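
If it helps to see the Tercom idea concretely, here's a toy sketch (entirely illustrative, with made-up data - the real system works on 2-D radar maps and is far more sophisticated): slide the measured altitude profile along the stored terrain map and take the offset with the smallest mismatch.

Code:
import numpy as np

# Toy TERCOM-style terrain contour matching: find where a measured altitude
# profile best lines up with a stored 1-D terrain elevation map.

rng = np.random.default_rng(0)
stored_map = np.cumsum(rng.normal(0, 5, 500))   # pretend terrain elevations along the flight track
true_offset = 213
measured = stored_map[true_offset:true_offset + 50] + rng.normal(0, 1, 50)  # noisy radar altimeter profile

def best_match(stored, profile):
    """Return the offset into the stored map with minimum squared error against the profile."""
    n = len(profile)
    errors = [np.sum((stored[i:i + n] - profile) ** 2) for i in range(len(stored) - n)]
    return int(np.argmin(errors))

print("Estimated position offset:", best_match(stored_map, measured))  # expect ~213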

Once it is close to the target, the missile switches to a "terminal guidance system" to choose the point of impact. The point of impact could be pre-programmed via the GPS or Tercom system. The DSMAC system uses a camera and an image correlator to find the target, and is especially useful if the target is moving. A cruise missile can also be equipped with thermal imaging or illumination sensors (as used in smart bombs).
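
And the DSMAC "image correlator" part, again as a toy sketch with made-up data: find where a stored reference patch best matches the live camera frame using normalized correlation.

Code:
import numpy as np

# Toy DSMAC-style image correlation: locate a reference patch (stored scene)
# inside a larger camera frame by normalized cross-correlation.

rng = np.random.default_rng(1)
frame = rng.random((120, 160))           # pretend camera frame
template = frame[40:60, 70:95].copy()    # reference scene captured before the mission

def locate(frame, template):
    th, tw = template.shape
    t = (template - template.mean()) / template.std()
    best_score, best_rc = -np.inf, (0, 0)
    for r in range(frame.shape[0] - th):
        for c in range(frame.shape[1] - tw):
            w = frame[r:r + th, c:c + tw]
            score = np.sum(((w - w.mean()) / (w.std() + 1e-9)) * t)
            if score > best_score:
                best_score, best_rc = score, (r, c)
    return best_rc

print("Template found at row/col:", locate(frame, template))  # expect (40, 70)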
 