
Tesla Autopilot HW3

Good to know. I do wonder if it is worth buying FSD now, since the only new features will be city driving with traffic lights. Of course, buying FSD now will also get us the AP3 chip, which may improve NOA, Advanced Summon and Auto Park. Plus, getting FSD will get us the future updates that improve city driving.

Well for those functions to work well, you need HW3...which you only get if you have FSD. It'll be interesting to see how Tesla plays this one out.
 
No. Lidar is much more than that. It vastly simplifies object recognition and tracking. Instead of needing an AI to do image recognition, you can use well-tested and long-established algorithms with lidar. It's far more than simply a reduction in CPU load.

What kind of algorithms can process LIDAR point cloud data but can't process a point cloud generated from stereo images?
 
Lidar has a few advantages for object recognition.

- Longer range. At distances where a camera image will be just a bunch of indistinct pixels, lidar will be able to resolve a usable point cloud.

Only if you configure it to scan a particular spot at higher resolution, which then results in less frequent repaints of the overall scene. You can do something similar with a camera, too, provided you have a zoom lens and the ability to pan and tilt it. But in practice, if something is far enough away that it is just a bunch of indistinct pixels, it also probably doesn't matter yet. :)


- Temporal super-sampling. It's easier to combine point clouds from multiple samples to enhance object recognition. It's very difficult to do with cameras in a useful way.

Actually, I've seen a decent number of papers about multi-image super-resolution. And one approach for getting depth information from a single camera is to take two consecutive shots while moving towards or away from the object in question, then use the parallax differences to estimate its position. So there's a lot you can do with information from cameras across multiple shots.
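
For what it's worth, here is a toy sketch of that motion-parallax idea, under assumptions that aren't from the thread: a pinhole camera moving straight ahead by a known distance between the two shots, a tracked point that sits off the optical axis, and image coordinates measured from the principal point.

```python
# Hypothetical sketch of depth-from-forward-motion parallax (pinhole model).
# Assumptions (mine, not the poster's): the camera translates straight ahead
# by a known distance between the two shots, and the tracked point sits off
# the optical axis so its image offset actually changes.

def depth_from_forward_motion(x1: float, x2: float, travel: float) -> float:
    """Distance to the point at the time of the first shot.

    x1, x2 -- the point's horizontal offset from the principal point in
              shot 1 and shot 2 (same units, e.g. pixels)
    travel -- how far the camera moved toward the scene between shots (metres)
    """
    if abs(x2 - x1) < 1e-9:
        raise ValueError("no measurable parallax (point may be on the optical axis)")
    return travel * x2 / (x2 - x1)

# Example: a feature at 100 px drifts to 105 px after the car advances 1.5 m,
# implying the point was roughly 31.5 m away when the first shot was taken.
print(depth_from_forward_motion(100.0, 105.0, 1.5))
```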


- No need for multiple front-facing FOV cameras; the limitations of static optics do not apply.

Nobody, and I mean absolutely nobody is trying to do self-driving without multiple front-facing cameras. For one thing, it is not possible to determine the color of a traffic light with LIDAR, nor recognize turn signals to know when to slow down to let people in, nor read speed limit signs to know that the road is under construction and has a lower limit, etc. And if you only have one front-facing camera, then you have no redundancy, which means when it fails, your car becomes a death trap.

So no, LIDAR does not remove the need for multiple front-facing cameras.


- Removes the need for AI to do depth perception because it samples depth directly.

There are a number of approaches for computing depth from stereo images taken a fixed distance apart, and AFAIK, most of them involve no AI at all.
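
As a concrete example, OpenCV's plain block matcher does exactly this with no neural network involved. A minimal sketch, where the file names, focal length and baseline are made-up placeholders:

```python
# Minimal sketch of classical stereo depth with OpenCV block matching (no AI).
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # placeholder file names
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels

FOCAL_PX = 1000.0   # assumed focal length in pixels (from calibration)
BASELINE_M = 0.12   # assumed distance between the two cameras, metres

# Classic pinhole relation: depth = focal * baseline / disparity
depth = np.where(disparity > 0, FOCAL_PX * BASELINE_M / disparity, np.inf)
print("median depth of valid pixels (m):", np.median(depth[np.isfinite(depth)]))
```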


- Point clouds make it easier to estimate which direction an object is facing than an AI trying to infer it from images. AI tends to work on flat image recognition, e.g. the back of a vehicle, with the ability to handle transforms such as the vehicle being at an angle; estimating the facing direction would take significantly more effort.

If I understand the comment correctly, you can do that on a single still image without stereo, with no AI at all, by using basic edge detection algorithms to estimate the average slope of the bottom edges of the bumper and the visible side of the car, then performing some basic math on that data. The hard part is actually recognizing that you're looking at a car to begin with, and thus need to know which way it is facing.
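
To sketch just the "basic math" half of that (the geometry, not the car detection), under assumptions that are mine rather than the poster's: a flat road, a level camera of known height and focal length, and pixel coordinates measured from the principal point with y increasing downward. Back-project two points on the car's lower edge to the ground plane and read off its heading.

```python
# Toy sketch of the geometry step only (assumes the car has already been
# detected and two points on its lower edge picked out).
import math

def ground_point(u: float, v: float, focal_px: float, cam_height_m: float):
    """Intersect a pixel's viewing ray with the ground plane."""
    if v <= 0:
        raise ValueError("point is at or above the horizon, not on the ground")
    x = cam_height_m * u / v          # lateral offset, metres
    z = cam_height_m * focal_px / v   # forward distance, metres
    return x, z

def heading_from_edge(p1, p2, focal_px=1000.0, cam_height_m=1.4):
    """Angle of the car's lower side edge relative to our direction of travel."""
    x1, z1 = ground_point(*p1, focal_px, cam_height_m)
    x2, z2 = ground_point(*p2, focal_px, cam_height_m)
    return math.degrees(math.atan2(x2 - x1, z2 - z1))

# Two (made-up) points picked off the detected bottom edge of the car's side:
print(heading_from_edge((-80, 120), (40, 60)))   # roughly 9 degrees off our axis
```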

And, of course, if you care about whether it's the front or the back, that's still just basic image recognition. Are there headlights? If so, it is facing you. :)

But the more interesting data is knowing which way it is moving. Cameras work just fine for that, too, obviously, because once you know the object's distance from parallax, you can compare how quickly the object is getting bigger compared with the ground at a similar distance and determine its relative velocity. It is just a lot more computationally complex to do it that way.
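
A back-of-the-envelope version of that, assuming the initial distance came from some other cue (parallax, stereo, whatever) and that the lead car's true width doesn't change between frames:

```python
# Hypothetical sketch: closing speed from how fast an object grows in the image.

def closing_speed(z1_m: float, w1_px: float, w2_px: float, dt_s: float) -> float:
    """Relative speed in m/s; positive means the gap is shrinking."""
    # Same physical width, so distance scales inversely with apparent width.
    z2_m = z1_m * w1_px / w2_px
    return (z1_m - z2_m) / dt_s

# Example: a car known to be 40 m ahead grows from 50 px to 52 px wide over
# 0.5 s, i.e. we are closing on it at roughly 3 m/s.
print(closing_speed(40.0, 50.0, 52.0, 0.5))
```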


- Lidar works just as well in poor light, including low sun. Besides not requiring ambient light (and indeed filtering it out), it can see things like a person wearing black against a black background, because depth measurement shows they are separate from the background.

LIDAR may offer some advantages in low light. Then again, cars have headlights, and if you're driving too fast to make out a person in your headlights, then unless you're on the freeway where no pedestrians are allowed, you're driving too fast.
 
Nobody, and I mean absolutely nobody is trying to do self-driving without multiple front-facing cameras. For one thing, it is not possible to determine the color of a traffic light with LIDAR, nor recognize turn signals to know when to slow down to let people in, nor read speed limit signs to know that the road is under construction and has a lower limit, etc. And if you only have one front-facing camera, then you have no redundancy, which means when it fails, your car becomes a death trap.

So no, LIDAR does not remove the need for multiple front-facing cameras.

While I completely agree multiple forward cameras will be used in any urban self-driving car, Audi actually does Level 3 (read a book) on the highway with just one forward-facing camera used for autonomous driving (the parking/night-vision cameras are not). The redundancy is in the radars and Lidar.
 
While I completely agree multiple forward cameras will be used in any urban self-driving car, Audi actually does Level 3 (read a book) on the highway with just one forward-facing camera used for autonomous driving (the parking/night-vision cameras are not). The redundancy is in the radars and Lidar.

To be fair, the Audi system isn't really Level 3 and is actually pretty janky even in the situations where it is supposed to work.
 
To be fair, the Audi system isn't really Level 3 and is actually pretty janky even in the situations where it is supposed to work.

Audi's system is Level 3, but it is true, as @HolyGrail says, that it has not been released yet and that it is a limited-scenario product. I'm just noting that redundancy, at least in limited settings, can come from places other than multiple cameras. In this case Audi has three forward-facing radars, Lidar and a single camera doing the job.

Of course any urban self-driving car will have multiple forward cameras...
 
Level 3 is also ill-defined. Its usefulness very much depends on how much advance-warning time for a takeover the manufacturer guarantees. If it is 3 seconds, it is as useful as Autopilot. I highly doubt they would guarantee more; maybe someone has numbers?
 
Level 3 is also ill-defined. Its usefulness very much depends on how much advance-warning time for a takeover the manufacturer guarantees. If it is 3 seconds, it is as useful as Autopilot. I highly doubt they would guarantee more; maybe someone has numbers?

Audi's Level 3 Traffic-jam Pilot gives the driver 10 seconds to take control, if the pre-release information turns out to be correct. Even after that there is a progressive process to alert the driver, eventually turning on the hazards and stopping the car, with the car remaining in control the whole time. After it has stopped, in-lane, it will turn on all the lights, unlock the doors and call for help.
 
As a typical usable forecasting horizon is ~1-2 seconds, I highly doubt that. Their media center communication, "the driver has about 10 seconds to respond, depending on the situation", leaves plenty of room for interpretation and for lawyers :)
 
As a typical usable forecasting horizon is ~1-2 seconds, I highly doubt that. Their media center communication, "the driver has about 10 seconds to respond, depending on the situation", leaves plenty of room for interpretation and for lawyers :)

As long as it is an unreleased product, doubt is certainly okay, but there is no ambiguity about the 10 seconds in the pre-release testing by journalists, for example. Many journalists have tested it, so it really was 10 seconds in the pre-release product, and it specifically allows watching TV, for example (it turns the video on when in that mode). In this regard it is nothing like Autopilot.
 
Haha journalist demos?
On one handpicked piece of route they have driven a thousand times and optimised for?
Do I need to say more?

All I'm saying is that there is no ambiguity: 10 seconds is the time the driver has to react before the system starts an orderly process of extra alerts and, eventually, stopping. It is perfectly okay to take that with a grain of salt as long as the product is unreleased, but the 3 seconds you came up with is not based on anything seen of Audi's Traffic-jam Pilot; the number has been 10 seconds ever since the Audi A8 launch.
 
Only if you configure it to scan a particular spot at higher resolution, which then results in less frequent repaints of the overall scene. You can do something similar with a camera, too, provided you have a zoom lens and the ability to pan and tilt it. But in practice, if something is far enough away that it is just a bunch of indistinct pixels, it also probably doesn't matter yet. :)
Indeed, cameras have the advantage that anything within the pixel box will be sensed (although sub-pixel-sized features will be averaged with the rest). Lidar only senses an area the size of the beam (a point).
 
Well for those functions to work well, you need HW3...which you only get if you have FSD. It'll be interesting to see how Tesla plays this one out.
Wild guess but I'd bet HW3 is ready to be included in new builds, hence the new pricing. Retrofits should start happening once there's something new that requires HW3 in older 2.5/2.0 cars.
 
Wild guess but I'd bet HW3 is ready to be included in new builds, hence the new pricing. Retrofits should start happening once there's something new that requires HW3 in older 2.5/2.0 cars.

Based on the new website descriptions of "coming later this year", we won't see those HW3 upgrades anytime soon...

TeslaTime for "coming later this year"??? Well, "3 months maybe, 6 months definitely" is now years old... what does "later this year" mean in reality?
 
Based on the new website descriptions of "coming later this year", we won't see those HW3 upgrades anytime soon...

TeslaTime for "coming later this year"??? Well, "3 months maybe, 6 months definitely" is now years old... what does "later this year" mean in reality?

They can run the current AP builds on HW3. So if they're able to make them by end of Q1 (around now), like they said, I can't see a reason why they would put in 2.5 which would need to be upgraded later for people who get FSD.
 
Kind of frustrating that some of the delays have come from trying so hard to push this heavy software onto underpowered hardware. If we'd had HW3 in the era of HW2, who knows; the software may have been a little less troublesome with HW3's features and capability.

Yes yes, you have other factors in all equations, but I assume this would be a huge one.
 
Kind of frustrating that some of the delays have come from trying so hard to push this heavy software onto underpowered hardware.

Actually, I sort of don't think that's the case. The SoC in the new hardware is three years old, which means they've been working on HW3 for a good three years now. I doubt they've been working too hard to cram extra NN functionality into the existing hardware.
  • For the first several months in which I owned my Model X, AP 2.5 felt like it was in maintenance mode.
  • In version 8, they released a port of their neural nets to a new software platform, which massively improved performance on the old hardware. Though at a glance, that might sound like trying to push heavy SW onto incapable hardware, it's really way more than that, because:
    • The new software platform will also improve HW3 performance.
    • The improved performance on HW2/2.5 makes it possible to log data from the non-front cameras, which is a requirement for training the bigger models for HW3.
  • In version 9, they didn't change any of the NN bits much, I don't think, but bolted on the NoAP parts on the side. Those parts can almost certainly sit on top of either of the two neural net setups.
So basically, it feels like very little effort was actually spent on throwaway pieces, like training neural nets for the existing hardware, and most of the effort was spent on getting ready for the new NN on HW3.

But that's just my gut feeling based on what I see from the outside. I won't be sure until we see how well the new NN on HW3 performs. :)
 
Audi's Level 3 "ready" system still has no regulatory approval in the EU, and it works only up to 60 km/h and only where the driving directions are physically separated.

One of the lessons here is that a Level 3 system isn’t necessarily more advanced than a Level 2 system. For example, imagine if Tesla is really able to release a Level 2 system that can handle all driving, including urban driving, but requires human supervision. That’s more advanced than Audi’s Level 3 system.

Imagine that a corporate headquarters used an autonomous shuttle to take people from Building A to Building B, at 10 miles per hour along a dedicated access road paved solely for this purpose. That would be a Level 4 system. But it wouldn’t have to deal with other vehicles, pedestrians, cyclists, animals, signs, traffic lights, lane lines, or construction. So it would be a lot less impressive than a Level 2 system that could deal with those things.

Similarly, a car could hypothetically be Level 4 on select freeways, but nowhere else.

A few times when Jim Chanos appeared on TV to talk about his Tesla short thesis, he argued that since Audi is at Level 3 and Tesla is at Level 2, Tesla is behind Audi. I think this is a facile comparison.

Lex Fridman at MIT argues that we should throw out the whole Level 1-5 schema and just make it a binary classification: a human is involved, or no human is involved. The car is either partially autonomous, or fully autonomous. I agree this is more useful than the SAE levels.
 