Welcome to Tesla Motors Club

Neural Networks

Surprised you didn't mention the issue from about 51s where the car doesn't see the traffic queue on the overpass, and he has to brake hard to avoid an accident.

Makes me wonder if this is an example where the HW2.0 suite is less capable than the HW2.5 suite.

Saw that. But it's not worth mentioning, since it was par for the course for AP, and it looks like it happens on 2.5 too.

 
On the topic of vehicles and objects being misidentified and jumping around, I think it might also be helpful for folks to recognize that even an imperceptible difference to us could result in wild differences for the neural net:
Attacking Machine Learning with Adversarial Examples

(I’m not saying this is a case of bad “adversarial” data, but just an illustration of how difficult a real world environment is to identify digitally using frame-by-frame pixel analysis from a video feed.)
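To make the "imperceptible difference" point concrete, here is a minimal FGSM-style sketch using a toy linear classifier as a stand-in for a neural net (the model, dimensions, and epsilon are all illustrative assumptions, not anything from Tesla's stack):

```python
import numpy as np

# Toy linear "classifier": class = sign(w . x). This stands in for a
# neural net purely to illustrate the adversarial-example mechanism.
rng = np.random.default_rng(0)
dim = 1000
w = rng.normal(size=dim)          # model weights
x = rng.normal(size=dim)          # a "clean" input
score = w @ x

# FGSM-style perturbation: step each coordinate by at most eps in the
# direction that pushes the score toward the opposite class.
eps = 0.25
x_adv = x - eps * np.sign(w) * np.sign(score)

# No coordinate moved by more than eps, yet the predicted class flips,
# because the tiny per-coordinate nudges all align with the gradient.
print(np.sign(w @ x), "->", np.sign(w @ x_adv))
```

The same principle scales up: in a high-dimensional input (like a camera frame), thousands of individually invisible pixel changes can add up to a large change in the network's output.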

also this in the real world.
How To Fool A Neural Network
 
I just created a new forum intended for deep, detailed technical discussions like some of the discussions in this thread. It’s also a safe haven away from trolling, toxicity, incivility, rudeness, and acrimony.

It will be smaller than TMC, but I think it could also be a better place for these sorts of discussions.

https://gradientdescent.co

How can it not be against forum rules to include a link to your own forum in your signature?
 
What is that thick line in the center supposed to indicate? In some videos I see it clash with detected lane markings (indicated in red), so it cannot be the planned path.

I do not think that word ("cannot") means what you think it means. :p Anybody with much direct experience with Autopilot has observed behavior which conflicts with reality and common sense on multiple occasions. :D

But more seriously, I think that is the "predicted path", perhaps an early attempt to replace some of the "software 1.0" (traditional hand-coded logic) that currently does motion planning and control with Karpathy's "software 2.0" (deep learning). You can see it still has some way to go. If I'm not mistaken, hand-coded software logic is still in control of path planning and execution, and deep learning is used only for perception, not for planning/control.
 
While Tesla and others work on refining their vision neural nets, I think they need to work on a few other nets to get self-driving working. In particular, they don't seem to be doing other-vehicle prediction, i.e. based on the behavior of a vehicle (and ideally, surrounding vehicles), you should have a "mental" model of what those vehicles are likely to do in the near future. This would be really useful for when people merge into your lane, but also in lots of urban driving scenarios.
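Even the simplest version of such a "mental" model is just short-horizon extrapolation. Here is a hypothetical, minimal sketch (constant-velocity prediction; real systems use far richer models conditioned on maps and surrounding traffic):

```python
import numpy as np

def predict_trajectory(p_prev, p_now, dt, horizon_steps):
    """Extrapolate future 2D positions assuming constant velocity.

    p_prev, p_now: positions observed dt seconds apart (meters).
    Returns a list of predicted positions at each future step.
    """
    p_prev = np.asarray(p_prev, dtype=float)
    p_now = np.asarray(p_now, dtype=float)
    v = (p_now - p_prev) / dt                      # estimated velocity
    return [p_now + v * dt * k for k in range(1, horizon_steps + 1)]

# A car seen at x=0 then x=1 m after 0.1 s is doing 10 m/s;
# one second later we expect it near x=11 m.
future = predict_trajectory([0.0, 0.0], [1.0, 0.0], dt=0.1, horizon_steps=10)
```

A merge-intent signal would fall out of the same idea applied to lateral motion: a sustained lateral velocity toward your lane predicts a future position inside it.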
 
Prediction is probably the biggest outstanding problem in L3+ systems, but there's no evidence that Tesla has even started on this. The big boys have been working on prediction for many years using a variety of methods. Tesla is still struggling with perception, and they seem to be using all of HW2's compute capacity just for perception. They've already said that HW3 is required for FSD -- presumably they are hoping to have enough capacity to do both perception and prediction on the HW3 chip. But as with all of Tesla's AP promises, their statements about future functionality and schedules should probably be ignored. My guess is they will need all of HW3 for perception alone, which means prediction will remain an unsolved problem for Tesla.
 
Why would you need to be finished with perception to get started on driving policy/agent behavior prediction?
You can get this started in simulation.

Sure, you will eventually need to combine the two to put it into a working product, so that's still far out for sure.

But what difference does it make whether the computer recognizes a patrol car stopping another car in an Unreal Engine scripted environment or from live vehicle sensor data?

Of course, this would only ever ship in fleetwide firmware once there are results and the control handling is mature enough, so we see nothing of it in current firmwares.
That doesn't mean there is zero development going on; we simply have no information either way.
 
Some interesting numbers:

Traffic sign recognition: neural networks vs. humans

Question for @jimmy_d: if AKnet_V9 has 5x as many parameters as the previous version of AKnet, then how many parameters did the old AKnet have relative to GoogLeNet? GoogLeNet has about 7 million parameters. If the old AKnet is the same, then AKnet_V9 has 35 million parameters.

In a previous post, you said Tesla increased the parameter count "more than 2x". If the old AKnet had 7 million × 2.5 = 17.5 million parameters, then AKnet_V9 would have 17.5 million × 5 = 87.5 million.

That doesn’t seem that big compared to these other neural networks that have 100 million, or 150 million, or 860 million parameters. And if I’m not mistaken, they all take less than 20 GFLOPs to run, so you could run one for each camera with less than 160 GFLOPs. Drive PX2 AutoCruise has 4 TFLOPs or 4,000 GFLOPs, so why is HW3 necessary if AKnet_v9 has under 100 million parameters?
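The back-of-envelope arithmetic above can be written out explicitly. All of these are the thread's guesses, not confirmed figures for Tesla's networks; only the GoogLeNet count is a published number:

```python
# Published figure for GoogLeNet (Inception v1).
googlenet_params = 7_000_000

# "More than 2x" GoogLeNet, guessed here as 2.5x (assumption from the thread).
old_aknet_params = googlenet_params * 2.5      # 17.5 million

# jimmy_d's reported 5x growth from old AKnet to AKnet_V9.
aknet_v9_params = old_aknet_params * 5         # 87.5 million

print(f"old AKnet ~ {old_aknet_params:,.0f} params")
print(f"AKnet_V9 ~ {aknet_v9_params:,.0f} params")
```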
 
I can offer a potential answer to my own question about compute: runtime. It might take 20 GFLOPs to recognize one object at a time in one image at a time in an unhurried fashion. But if you need to recognize dozens of objects per image in 240 images per second (8 cameras x 30 fps), then you need more gigaflops.

Still curious about approximately how many parameters AKnet_v9 might have.
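Running the throughput numbers from the post above actually supports the runtime answer: at the assumed per-frame cost, the full camera suite already exceeds the quoted Drive PX2 AutoCruise figure. These are all illustrative assumptions, not measured values:

```python
# Illustrative throughput arithmetic (assumed figures from the thread).
cameras = 8
fps = 30
gflop_per_frame = 20                 # assumed cost of one forward pass per frame

frames_per_second = cameras * fps                     # 240 frames/s
gflops_needed = frames_per_second * gflop_per_frame   # 4800 GFLOP/s = 4.8 TFLOP/s

px2_autocruise_tflops = 4.0          # quoted Drive PX2 AutoCruise figure
print(f"need ~{gflops_needed / 1000} TFLOP/s vs {px2_autocruise_tflops} TFLOP/s available")
```

So even before any per-object overhead or headroom for planning, 4.8 TFLOP/s would already exceed the 4 TFLOP/s quoted for PX2 AutoCruise.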
 
What does Karpathy mean in this video at 11:45 when he says “Build a single model”?


At 12:28, he re-states it: “Train 1 model to solve all tasks”.

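The usual reading of "train 1 model to solve all tasks" is a multi-task network: a shared backbone computes features once, and small per-task heads branch off it. Here is a hedged NumPy sketch of that shape; the task names and sizes are invented for illustration, not Tesla's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class MultiTaskNet:
    """Shared backbone + one lightweight head per task."""

    def __init__(self, in_dim, hidden_dim, task_dims):
        # Backbone weights, shared by every task.
        self.W_shared = rng.normal(scale=0.1, size=(in_dim, hidden_dim))
        # One linear head per task (names/sizes are hypothetical).
        self.heads = {name: rng.normal(scale=0.1, size=(hidden_dim, d))
                      for name, d in task_dims.items()}

    def forward(self, x):
        h = relu(x @ self.W_shared)    # shared features, computed once
        return {name: h @ W for name, W in self.heads.items()}

net = MultiTaskNet(in_dim=64, hidden_dim=32,
                   task_dims={"lanes": 4, "objects": 10, "signs": 6})
outs = net.forward(np.ones(64))
```

The appeal is efficiency and transfer: the expensive feature extraction runs once per frame instead of once per task, and features learned for one task can help the others.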