I'm worried the E2E approach will get us very far very fast, but will ultimately be a dead end. Machine learning has a degree of randomness, which works fine for text and image generation, but not for making driving decisions. And as you said, diagnosing errors is difficult, if not impossible, with machine learning. I don't think this foundation is solid enough to be trusted for humanless driving.
I don't think it's a dead-end, but I do think it's going to progress a lot slower than Elon thinks. Also, I think randomness is actually a good thing in this context, because it allows the model to "consider" more unusual possibilities than it might otherwise, and occasionally a very unusual hypothesis will turn out to be the correct one. Since it's recalculating many times per second, it's unlikely that a random "mistake" will propagate for very long, but it's important for the network to be able to randomly "think outside the box" to find non-obvious solutions.
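To illustrate why I don't think a random "mistake" propagates for long, here's a purely toy sketch (the tick rate and gain are numbers I made up, not anything from Tesla's stack):

```python
import random

# Toy illustration: a planner re-samples a steering target every tick.
# A single "weird" sample barely moves the car, because the next tick
# re-plans from the car's current state.
random.seed(0)

lane_offset = 0.0          # metres from lane centre
TICK_HZ = 36               # hypothetical re-planning rate
STEER_GAIN = 0.05          # fraction of the error corrected per tick

for tick in range(2 * TICK_HZ):   # two seconds of driving
    # Usually the sampled target is near centre; occasionally it's an outlier.
    target = random.gauss(0.0, 0.02)
    if tick == 30:
        target = 1.5       # one-off "outside the box" hypothesis
    lane_offset += STEER_GAIN * (target - lane_offset)

print(f"offset after 2 s: {lane_offset:.3f} m")  # ends up back near zero
```

The occasional wild sample lets the network entertain unusual hypotheses, but because the state is re-evaluated every tick, one bad draw gets washed out almost immediately.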
I think Tesla's previous path was the correct albeit more difficult one. Driving should be solvable with heuristics, supplemented with AI for the parts where there is a very large solution space (e.g. identifying objects from images and path finding). Driving is complex; they just needed more time to pin down all of the rules.
The problem with human-coded heuristics is that they are unavoidably "brittle", and simplify the real world too much. The real world is edge case upon edge case upon edge case, and our brains are uncannily good at making things seem drastically simpler to our conscious minds than they actually are. Properly solving the driving task with heuristics would involve literally trillions of rules and rule combinations; it is fundamentally unsolvable that way, even when combined with neural networks for sub-tasks. See: The Bitter Lesson.
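A quick back-of-the-envelope on where the "trillions" intuition comes from (the factor count is arbitrary, just to show the growth):

```python
# If each driving situation is described by N independent yes/no factors
# (wet road? cyclist nearby? faded lane lines? sun glare? ...), the number
# of distinct situations a hand-written rule set must cover grows as 2**N.
for n_factors in (10, 20, 30, 40):
    print(f"{n_factors} binary factors -> {2**n_factors:,} combinations")
# 40 factors -> 1,099,511,627,776 combinations (over a trillion)
```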
Ironically the problem of autonomous driving would be a lot simpler if all cars had the hardware. Then the cars could communicate their position and intentions precisely with each other. There would still be the issue of outside events to deal with, however, such as animals or debris entering the road. Maybe at some point they will start building the hardware into new cars in preparation for an autonomous-only transition date.
Lucid is taking this approach. Their cars have 32 external sensors (14 cameras, 5 radars, lidar, 12 ultrasonics), which sets them up extremely well for autonomy once the software and compute hardware is ready. I wish Tesla were more committed to computer upgradeability within their fleet; I would love to know that my 2022 Model Y (with HW3) might be upgradeable to the HW5 computer when it becomes available in a couple years. (Elon said late 2025.) Likewise, I would be far more likely to upgrade to a HW5 Model S in a couple years if I had assurances that it would be forward-compatible with, say, HW7. (Just the computer; not necessarily the entire sensor suite.) Cars don't wear out in 2-3 years like cell phones, and it would add tremendous value to be able to keep them current (compute-wise) for substantially longer. It does potentially add a lot more configurations for Tesla to support, but even this is bounded and manageable if Tesla were to restrict the upgrades to, say, two generations. This would still be enough to mostly cover the 8-10 year life of the typical car.
 
There's an important point here, which is that real-world driving is quite a lot about having the general intelligence to know when and how to bend the rules, and equally important, when not to. Mechanistic adherence to the rules will always feel un-humanly robotic. There's also a social aspect where you're going to probably give a human Uber driver the benefit of the doubt for doing something slightly out-of-bounds (like doing a 3-point turn using a private driveway), while with a non-human Robotaxi such fudging is more apt to be reported and penalized. Until the car has the ability to literally explain its actions (and I do believe KITT-style conversational FSD will one day be a thing), there's going to be a double standard here.
For now, until laws catch up with technology, robotaxis are okay:

An NBC Bay Area investigation reveals autonomous vehicles in California cannot be cited for moving traffic violations since transportation laws require tickets to be issued to actual humans.


If your L3 or higher system bends/breaks the traffic laws (rolls a stop, cuts a yellow too close and runs a red, changes lanes over a solid white line, speeds, etc.), it cannot be cited. Our L2 FSD obviously can be cited, since the driver would get the ticket.
However, things could change soon, and then I think robotaxis will not be able to bend/break the rules:

The two bills cover different aspects of autonomous vehicle legislation. Senator Dave Cortese’s bill SB-915 empowers California cities to write their own individual regulations relating to autonomous vehicles, and Representative Phil Ting’s AB-1777 aims to make the driverless car companies liable for moving violations that their cars commit on California roads. The Autonomous Vehicle Industry Association, which represents most major autonomous vehicle companies, unsurprisingly opposes both bills meant to hold its members accountable for the actions of their products.
 
For now, until laws catch up with technology, robotaxis are okay:

If your L3 or higher system bends/breaks the traffic laws (rolls a stop, cuts a yellow too close and runs a red, changes lanes over a solid white line, speeds, etc.), it cannot be cited. Our L2 FSD obviously can be cited, since the driver would get the ticket.
However, things could change soon, and then I think robotaxis will not be able to bend/break the rules:
Pedantically true that L3+ cars cannot currently be cited on an individual case-by-case basis for bending/breaking the rules. However, Tesla is not yet L3, and Tesla itself has effectively been "cited" by NHTSA for rolling through stop signs, causing it to have to retrain the network to not do that. (This will presumably continue to apply to any systematic rule-breaking NHTSA catches them doing, especially as they approach L3/L4.) So Tesla's FSD systems will end up having to follow the rulebook much more strictly and mechanistically than any human reasonably would. And until NHTSA acknowledges that "rules are made to be broken" within the bounds of reasonableness and safety, it will be very difficult to get autonomous cars to drive in a humanlike way.
 
Tried 12.3.6 from work to home. 29 miles, some street, mostly freeway. For the most part it did well. It did something weird on the freeway exit: it shot over toward the far-right lane, then overcorrected, bouncing back and forth in the lane. Terrible visual aid below.

[attached sketch of the lane bounce]
 
Because it’s likely all BS.
This is not like putting out an update to Spotify guys.

When you make a product that can easily kill scores and scores of people, thorough safety testing is extremely important. Do you think updates to Boeing or Airbus’s flight control software are just pushed out when some developer thinks they’re ready?

You curate the training data, then train the model. Then you have to validate it against your unit tests and do lots of internal testing (real-world driving by a closed team of employee drivers). From that you determine the problem areas, curate more data, and retrain the model. Repeat until you have the problem areas addressed, then open to a wider team of employee drivers.

Assuming that all goes well, only then can you trickle out to the public.

The best way to do that is to be refining one version (12.4) while you’re still earlier in the process for the next version (12.5).
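Roughly the loop I mean, sketched as a runnable toy (nothing here is Tesla's actual pipeline; every function is a stand-in):

```python
import random

# Toy model of the curate -> train -> validate -> widen-rollout loop above.
random.seed(1)

def train(clips):                  # stand-in: "quality" grows with curated data
    return {"quality": min(1.0, 0.5 + 0.1 * len(clips))}

def closed_fleet_testing(model):   # stand-in: returns remaining problem areas
    n_problems = 0 if model["quality"] >= 0.9 else random.randint(1, 3)
    return [f"problem_area_{i}" for i in range(n_problems)]

clips = ["initial_curated_batch"]
while True:
    model = train(clips)
    problems = closed_fleet_testing(model)
    print(f"quality={model['quality']:.2f}, open problems={problems}")
    if not problems:
        break
    clips += [f"clips_targeting_{p}" for p in problems]   # curate more data

print("expand to the wider employee fleet, then trickle out to the public")
```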
 
Lucid is taking this approach. Their cars have 32 external sensors (14 cameras, 5 radars, lidar, 12 ultrasonics), which sets them up extremely well for autonomy once the software and compute hardware is ready.
That’s one reason why Lucid is losing a quarter million dollars on every car they sell.

In addition to the hardware, you now need significantly more compute just to process all this data. Tesla has already shown that cameras alone are enough.

Lucid will not survive long enough to see any meaningful self-developed autonomy systems in their car.
 
Machine learning has a degree of randomness, which works fine for text and image generation, but not for making driving decisions.
Text and image generation are the opposite of vehicle control systems. The former take a small amount of information and expand it into a large amount of information. The latter does the reverse. That "randomness" that you're talking about is essential when creating all that new information, but has no role in vehicle control. Something like FSD is taking the randomness of the world and distilling it down to a small set of vehicle control outputs. That's why @Ben W speaks of the brittleness of heuristics - the input data is too subtle for heuristics to detect all the needed patterns. Training serves the purpose of emergently identifying the important patterns in the randomness of the world so that proper control outputs can be generated. In other words, it self-organizes into its own heuristics, but without explicitly laying out human-readable heuristics.

In short, neural networks are far more promising than heuristics ever were, and I'd say that it is heuristics that give the impression of quick progress while actually representing a dead end. V1 through V11 demonstrated that.

I think the challenge before Tesla now is curation of the training data. If they can figure out how to collect the exact training data that they need to produce the control behavior(s) that they want, then they should be able to take this system to the limits of the hardware, wherever that is.
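To put rough numbers on that compression point above (camera count, resolution, and frame rate below are generic placeholders, not exact Tesla specs):

```python
# A large, noisy input stream gets distilled into a tiny control output.
cameras = 8
height, width, channels = 960, 1280, 3
fps = 36

values_in_per_second = cameras * height * width * channels * fps
controls_out_per_second = 2 * fps            # e.g. a steering and an accel target

print(f"input values/sec:  {values_in_per_second:,}")      # ~1.06 billion
print(f"output values/sec: {controls_out_per_second:,}")   # 72
print(f"ratio: ~{values_in_per_second // controls_out_per_second:,} : 1")
```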
 
I'm hopeful that future updates will come more quickly and more regularly in the new E2E model. My thought process is that Tesla will simply be targeting the slightly weak spots in the current version via adding targeted training videos to the overall net that "smooth out" the rough edges identified.

In theory, that would leave the majority of the things that the system already does pretty well alone, while targeting improvements in problem areas. And testing the new version wouldn't have to be a 100% re-test of every aspect, just a check to see if the "rough spots" have been improved via the targeted training videos added to the net.
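Something like this toy check is what I'm imagining (scenario names, scores, and the tolerance are all made up):

```python
# Targeted regression testing: re-score the known "rough spot" scenarios and
# make sure nothing else slipped, rather than re-validating everything.
previous_scores = {"unprotected_left": 0.72, "roundabout": 0.80, "stop_sign": 0.95}
new_scores      = {"unprotected_left": 0.88, "roundabout": 0.91, "stop_sign": 0.94}

rough_spots = ["unprotected_left", "roundabout"]   # what the new clips targeted
TOLERANCE = 0.02                                   # allowed slip elsewhere

improved = all(new_scores[s] > previous_scores[s] for s in rough_spots)
no_regressions = all(new_scores[s] >= previous_scores[s] - TOLERANCE
                     for s in previous_scores)

print("rough spots improved:", improved)
print("no regressions elsewhere:", no_regressions)
print("ship it" if improved and no_regressions else "back to training")
```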

I'm no deep learning expert, but this is my hope...
 
It’s not just the slow stop sign routine for me. I disengage a lot because it waits too long before slowing and then has to use the brakes. I prefer a super smooth and efficient stop with regen only. Seems like calculating a perfectly timed regen-only stop is something a computer would excel at, but apparently not. Or, in the case of end-to-end AI, it’s probably just mimicking the cruddy stopping behavior of most humans… accelerate as far as possible and then slam on the brakes at the last moment.
That's what I mean by fit & finish. In software you always have to first get it to work and then iterate on the finer aspects. I really consider FSD something that is still being developed rather than being fine-tuned. It is not yet "feature complete".

But that can't be the reason people drop out of FSD - because, as you note, most humans actually drive with that "cruddy stopping behavior… accelerate as far as possible and then slam on the brakes at the last moment." ;)
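For what it's worth, the "perfectly timed regen-only stop" from the quote really is just constant-deceleration kinematics; a rough worked example (the comfortable regen decel figure is a guess, not a measured Tesla value):

```python
MPH_TO_MS = 0.44704

speed = 45 * MPH_TO_MS            # approaching a stop sign at 45 mph
regen_decel = 1.5                 # m/s^2, assumed comfortable regen-only limit

start_braking_at = speed**2 / (2 * regen_decel)   # d = v^2 / (2a)
time_to_stop = speed / regen_decel

print(f"lift off ~{start_braking_at:.0f} m (~{start_braking_at*3.28:.0f} ft) "
      f"before the line; the stop takes ~{time_to_stop:.1f} s")
# -> roughly 135 m (~440 ft) out, over about 13 seconds
```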
 
Text and image generation are the opposite of vehicle control systems. The former take a small amount of information and expand it into a large amount of information. The latter does the reverse. That "randomness" that you're talking about is essential when creating all that new information, but has no role in vehicle control. Something like FSD is taking the randomness of the world and distilling it down to a small set of vehicle control outputs. That's why @Ben W speaks of the brittleness of heuristics - the input data is too subtle for heuristics to detect all the needed patterns. Training serves the purpose of emergently identifying the important patterns in the randomness of the world so that proper control outputs can be generated. In other words, it self-organizes into its own heuristics, but without explicitly laying out human-readable heuristics.

In short, neural networks are far more promising than heuristics ever were, and I'd say that it is heuristics that give the impression of quick progress while actually representing a dead end. V1 through V11 demonstrated that.

I think the challenge before Tesla now is curation of the training data. If they can figure out how to collect the exact training data that they need to produce the control behavior(s) that they want, then they should be able to take this system to the limits of the hardware, wherever that is.
For me, the vehicle control nets are nowhere close to the refined heuristics of other production vehicles.
 
I don’t know of any driver who stops like FSD does. And I have never/rarely seen any other driver on the road stop that way.

It’s some weird training error problem they should be able to simulate and then figure out why it is happening. Or one of their other training inputs or guardrails is overriding the correct control, probably for safety.

They either know what the problem is or they are not bothering to simulate it.

TBH, I've been thinking about this issue over the past few days, and I wonder if it's not actually a training, data, simulation, etc. issue at all.

I wonder if the AI model has such a limited amount of recall that it essentially can't blend between driving and braking smoothly that far out. So instead it's basically an on/off state of driving (proceeding) vs. stopping, which leads to it reacting late and then braking "harshly," as I would put it, to stop in time.
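To put toy numbers on that hunch (the horizons are hypothetical; this just shows how fast the required deceleration grows if the model only reacts once the stop is "close"):

```python
MPH_TO_MS = 0.44704
speed = 45 * MPH_TO_MS                      # m/s

for horizon_s in (12, 8, 5, 3):
    reaction_distance = speed * horizon_s   # where braking can first begin
    required_decel = speed**2 / (2 * reaction_distance)
    print(f"{horizon_s:>2} s horizon -> needs {required_decel:.1f} m/s^2")
# 12 s horizon: ~0.8 m/s^2 (gentle regen)   3 s horizon: ~3.4 m/s^2 (harsh)
```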
 
True. Making it even more desperate and self-defeating to release to the masses.
I'm not sure. As we say "shipping is a feature". You can wait endlessly for things to be "perfect" or release to get feedback.

As we are seeing, you can always release to the masses and let them self-select as to whether they want to use it or not. That way you get the widest participation and feedback.

ps : Funny thing as I was going through the airport yesterday. None of the "presence sensing" paper towel dispensers in the restrooms worked. You had to really struggle to get the hot-air dryers to start. So much for releasing products to the public only when they're perfect ;)
 
Nobody (approx) drives this way.
Hmmm .... probably an exaggeration - but what OP wants is definitely not how the majority drive. OP wants a "limo" experience, which BTW is exactly what I want as well - at least in the "chill" profile. But alas we are not there (yet?).

On that note I should say my drive yesterday from the airport was a head scratcher. FSD refused to keep up with traffic, driving 10 below the speed limit (50 on a 60 freeway) and 15 below the set limit at times. Feels like V12 behavior rather than V11.

Of course it also drove too fast for my liking right next to the barriers and extensive roadwork that was going on, esp. on curvy roads - so I had to drive that stretch of 10 miles or so manually.
 
That’s one reason why Lucid is losing a quarter million dollars on every car they sell.

In addition to the hardware, you now need significantly more compute just to process all this data. Tesla has already shown that cameras alone are enough.

Lucid will not survive long enough to see any meaningful self-developed autonomy systems in their car.
If Lucid used Tesla's sensor suite instead, they would still be losing $249k on every car they sell (although the naive calculation that yields the $250k loss figure is highly misleading for a company in Lucid's phase of growth, and does not accurately reflect marginal cost). Tesla "lost" a similar amount on their early Roadsters, and their early Model S's in 2012, even with no sensors at all. The cost of the off-the-shelf sensors is a drop in the bucket compared to other factors.
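To make the "misleading" part concrete with round, made-up numbers (not Lucid's actual financials):

```python
# Why "operating loss divided by deliveries" overstates per-unit losses
# for a low-volume company. All figures are illustrative only.
quarterly_operating_loss = 700_000_000      # includes R&D, factory ramp, overhead
deliveries = 2_000

revenue_per_car = 90_000
variable_cost_per_car = 165_000             # parts + labor per unit (illustrative)

naive_loss_per_car = quarterly_operating_loss / deliveries
marginal_loss_per_car = variable_cost_per_car - revenue_per_car

print(f"naive 'loss per car':  ${naive_loss_per_car:,.0f}")    # $350,000
print(f"marginal loss per car: ${marginal_loss_per_car:,.0f}") # $75,000
# The headline number is dominated by fixed costs that don't scale with volume.
```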

Tesla has already shown that just cameras is enough for mediocre-quality L2. (The extremely wide ODD is impressive, and the quality on highways is foreseeably approaching L3 level, but the quality within the trickier parts of the ODD is still mediocre, compared to a skilled human driver.) Tesla has emphatically not yet shown that pure vision will be enough for L4/Robotaxi.

Counterintuitively, adding lidar dramatically reduces the compute requirements; it doesn't increase it. That's because a tremendous amount of value (constructing ground-truth 3D maps) is obtained instantly and for free by lidar, whereas it requires a huge amount of computation (and lag) when done by pure vision. The same is true to a lesser degree for radar and ultrasonics.
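A rough way to see that asymmetry (the counts and per-pixel cost below are generic placeholders, not any specific sensor or network):

```python
import math

def lidar_point(range_m, azimuth_rad, elevation_rad):
    """A lidar return is already a 3D point: just trigonometry, essentially free."""
    x = range_m * math.cos(elevation_rad) * math.cos(azimuth_rad)
    y = range_m * math.cos(elevation_rad) * math.sin(azimuth_rad)
    z = range_m * math.sin(elevation_rad)
    return x, y, z

lidar_points_per_frame = 200_000            # a handful of arithmetic ops each
camera_pixels_per_frame = 8 * 960 * 1280    # each needs an inferred depth
ops_per_pixel_depth = 10_000                # assumed cost of a depth network, per pixel

print(f"lidar:  ~{lidar_points_per_frame * 10:,} ops/frame")
print(f"vision: ~{camera_pixels_per_frame * ops_per_pixel_depth:,} ops/frame")
```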

Lucid has a daunting uphill climb ahead of them, but there are many aspects of their product that are extremely impressive, and I do hope they survive.
 
Ironically the problem of autonomous driving would be a lot simpler if all cars had the hardware. Then the cars could communicate their position and intentions precisely with each other. There would still be the issue of outside events to deal with, however, such as animals or debris entering the road. Maybe at some point they will start building the hardware into new cars in preparation for an autonomous-only transition date.

Totally agree that this is the way to go. Our vision-only cars can only see as far as the camera hardware allows, which is still far less than what the human eye can do. For example, we can see a red light coming and start slowing for it gently far earlier than FSD can. Another one: on a fast-rolling freeway you can see cars up to half a mile ahead starting to slow down, the tail lights building up, and prepare for a hard stop. Vision only with today's cameras can't do that. However, if the lead cars were communicating their sensor and status updates, then followers could take a proactive stance. This is the way to solve late-response issues.
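Something like this is what I have in mind (the message format is made up for illustration; a real deployment would use something like the standardized V2X basic safety messages):

```python
from dataclasses import dataclass

@dataclass
class StatusMessage:
    vehicle_id: str
    position_m: float        # distance along the road, simplified to 1-D
    speed_mps: float
    hard_braking: bool

def follower_reaction(messages, my_position_m, lookahead_m=800.0):
    """Start slowing if any car well beyond camera range reports hard braking."""
    for msg in messages:
        ahead_by = msg.position_m - my_position_m
        if 0 < ahead_by < lookahead_m and msg.hard_braking:
            return f"pre-emptively slowing: {msg.vehicle_id} braking {ahead_by:.0f} m ahead"
    return "maintain speed"

broadcasts = [StatusMessage("lead_car", position_m=650.0, speed_mps=8.0, hard_braking=True)]
print(follower_reaction(broadcasts, my_position_m=0.0))
```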
 