Full Self Driving Removed - Not Even “Off Menu”

The biggest challenge for Tesla in getting AutoSteer working on AP2 was replacing Mobileye's object recognition with Tesla's own software.

This now works pretty well for tracking lane lines. The software still seems to struggle to track the relative locations of surrounding vehicles, as demonstrated by the "dancing" cars shown on the dashboard display.

The faster HW3 processor should help considerably, allowing the software to more accurately project where surrounding vehicles are located and their relative speed and direction, and to handle the growing number of rules the AI engine will be processing to mimic safe human driving.

Getting back to the OP's original post: if Tesla is close to deploying HW3, they can save some $$$ (important right now as they push to maintain profitability) by not giving away the HW3 upgrade to any more vehicles, and when they do make the upgrade available, it will cost more for AP 2.5 vehicles to upgrade.
 
Having an order of magnitude more performance will make a big difference in the car's ability to handle turns at useful speeds by reducing latency.

It's not a latency problem; latency would only be an issue if there were multiple pipelined processors and the chain of calculations lasted more than one frame time. Even a 10 Hz rate is only 8.8 feet per frame at 60 MPH, and they are running 20-30 frames per second, I think.
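
As a quick sanity check on those numbers, here's a minimal sketch (plain Python, nothing Tesla-specific) of how far the car travels during one frame period at a few frame rates:

```python
# How far the car moves during one camera frame at a given speed.
# 60 MPH is 88 ft/s, so a 10 Hz loop sees the world move 8.8 ft per frame.

MPH_TO_FT_PER_S = 5280 / 3600  # 1 mph = ~1.4667 ft/s

def feet_per_frame(speed_mph: float, frame_rate_hz: float) -> float:
    return speed_mph * MPH_TO_FT_PER_S / frame_rate_hz

for hz in (10, 20, 30):
    print(f"{hz:>2} Hz at 60 MPH -> {feet_per_frame(60, hz):.2f} ft/frame")
# 10 Hz -> 8.80 ft, 20 Hz -> 4.40 ft, 30 Hz -> 2.93 ft
```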

There is one main core doing all the calculations before the next frame (which does limit the maximum frame rate; however, the NN gets pairs of frames, so the time shift is actually helpful for motion estimation). The new HW allows it to do more calculations in that time, providing more and higher-quality image-recognition data to the driving NN, which will also be improved.
 
The current neural net processes 200 frames per second, and AFAIK, that's divided by 8 cameras, for a total of 25 FPS per camera. But that's not the whole story.

Think about the pipeline this way. In the worst-case scenario, the camera takes 1/30th of a second to take a shot. Assuming the computer takes 1/25th of a second to process the shot, that means from the time it starts capturing a frame to the time it finishes processing that frame of video, the oldest parts of the image were captured about .07333 seconds ago (the exposure time plus the processing time), or a little over 1/14th of a second.

That comes out to about 6.45 feet worth of latency at 60 MPH, which is huge. If the computer turns slightly too tightly or not tightly enough, that's far enough to be horizontally mispositioned by several inches pretty easily, if not more. And not only must the computer correct that error, it must overcorrect, because it has already been wrong for long enough to be dangerous, and then it must guess when to stop overcorrecting. This is why the original neural net, reportedly at ~10 FPS, IIRC, was unusable at corners and ping-ponged constantly, while the current one, at 25 FPS, is only just meh under those circumstances.

IMO, an ideal processing loop would run at 120 FPS end-to-end. Tesla's cameras, unfortunately, only support 60 FPS, which IMO is really kind of bare minimum performance. The new HW3 will be fast enough to process 250 FPS per camera, though, which when combined with the camera taking a 60th of a second, would translate to essentially 1/48.5th of a second latency between the start of capturing the frame and the end of processing — less than a third the end-to-end latency of the current setup.
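
To make that arithmetic easy to check, here's a minimal sketch (plain Python; the exposure and processing times are just the figures assumed in this thread, not published Tesla specs) comparing the worst-case end-to-end latency of the current setup with the HW3 estimate:

```python
# Worst-case end-to-end latency = exposure time + processing time,
# converted to distance traveled at 60 MPH (88 ft/s).

FT_PER_S_AT_60_MPH = 88.0

def end_to_end_latency(exposure_s: float, processing_s: float) -> float:
    return exposure_s + processing_s

current = end_to_end_latency(1 / 30, 1 / 25)   # ~0.0733 s
hw3     = end_to_end_latency(1 / 60, 1 / 250)  # ~0.0207 s, roughly 1/48th of a second

print(f"current: {current:.4f} s -> {current * FT_PER_S_AT_60_MPH:.2f} ft")  # ~6.45 ft
print(f"HW3:     {hw3:.4f} s -> {hw3 * FT_PER_S_AT_60_MPH:.2f} ft")          # ~1.82 ft
print(f"HW3 is {hw3 / current:.0%} of current latency")                      # ~28%, under a third
```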

Bear in mind that one of the big advantages of LIDAR over cameras is latency. With LIDAR, the data flow is continuous, though it still takes maybe 1/20th of a second to paint the entire scene. But during that time, you're getting updates to points on a continuous basis, so you get at least some indication of whether the car is actually doing what it expects. With 30 FPS cameras, you start out with a 1/30th of a second penalty right off the bat. This is, IMO, likely to be a problem. I'm fully expecting them to crank up the frame rate to the full 60 FPS with HW3, simply because it will likely make a big difference in terms of the car's ability to precisely execute maneuvers like turns.

AP2 has 40 times the performance of the AP1, yet it is not much better.

AP1 has only a single camera instead of 8. So in terms of frame rate per camera, AP2 has only about 5x the performance of AP1. Additionally, AP1 has a much simpler neural net that does a lot less. So that's not really equivalent to running a similar neural net setup on much faster hardware. :)

HW3 will be fast enough to crank the cameras up to 60 FPS, which will, as I said, make a huge difference in end-to-end latency. *And* the processing latency will be less.
 

Even if the camera required a full frame time to push the data out (which isn't what HW3 addresses; if the cameras can do 60 fps, then their latency is better than 33 ms), road paths are a fairly continuous function, so if you know the processing delay you can adjust the control loop response for it, like leading the target. However, if it runs more like a straight PID steering loop, then yeah, latency will get you.
The car also has the high-speed accelerometer data from the SRS that could help augment path tracking...
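
To illustrate the "leading the target" idea, here's a minimal sketch (hypothetical Python, not anything from Tesla's actual stack) of dead-reckoning the stale camera measurement forward by the known pipeline delay, using speed and yaw rate (e.g. from the IMU/SRS sensors), before computing a simple proportional steering command:

```python
# Sketch: compensate a steering loop for a known, fixed pipeline delay by
# propagating the stale measurement forward, then controlling on the
# *predicted* error instead of the old one. All gains and state fields are
# made up for illustration.
import math
from dataclasses import dataclass

@dataclass
class State:
    lateral_error_ft: float   # offset from lane center when the frame was captured
    heading_err_rad: float    # heading error relative to the lane
    yaw_rate_rad_s: float     # from the high-rate IMU
    speed_ft_s: float

def predict_forward(s: State, delay_s: float) -> State:
    """Dead-reckon the measurement forward by the processing delay."""
    heading = s.heading_err_rad + s.yaw_rate_rad_s * delay_s
    lateral = s.lateral_error_ft + s.speed_ft_s * math.sin(s.heading_err_rad) * delay_s
    return State(lateral, heading, s.yaw_rate_rad_s, s.speed_ft_s)

def steering_command(s: State, delay_s: float, kp: float = 0.05, kh: float = 0.8) -> float:
    """Proportional steering on the delay-compensated error."""
    p = predict_forward(s, delay_s)
    return -(kp * p.lateral_error_ft + kh * p.heading_err_rad)

# Example: measurement is ~73 ms old (the worst case discussed above).
stale = State(lateral_error_ft=0.5, heading_err_rad=0.02, yaw_rate_rad_s=0.05, speed_ft_s=88.0)
print(steering_command(stale, delay_s=0.0733))
```

The point is just that a fixed, known delay can be predicted around; it's the unmodeled part of the delay that a plain PID loop can't hide.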
 
True. The latency shouldn't be what introduces the misbehavior I've seen, but at least my experience early on (pre-9) was that it usually did the right thing, just so much too late that it nearly caused accidents. And the 9.x behavior feels like it is doing the same things, just not quite as late.
 
Additionally, AP1 has a much simpler neural net that does a lot less. So that's not really equivalent to running a similar neural net setup on much faster hardware.

While it is true that the Mobileye EyeQ3 that powers AP1 is highly efficient... it is not true that it does a lot less in itself.

It recognizes far more object types than Tesla Vision does even today (as shipping in production cars, anyway); speed signs are one example. For eight cameras one would of course need several such chips, but for a camera or two it does more than Tesla's own software currently does.

In other words Tesla’s current NN implementation is much less efficient than MobilEye’s computer vision is.