FSD Beta 10.69

No. Read again where you misunderstood the point.
No, I read it. It's pretty straightforward that you haven't looked anywhere other than Tesla.
List out all the self-driving companies and find me one that believes perception is harder than prediction and planning, or that driving policy isn't multiple orders of magnitude harder. Perception for SDCs is largely solved; you hear that sentiment from a lot of SDC companies. The open problems are in prediction and planning (planning being even harder).

Here Oliver (CEO of Voyage, now at Cruise) explains it.

Here's Huawei's Head of AD Research putting it bluntly: “What a lot of people don’t realize… is that planning and control has a greater impact on MPI than perception.”

MPI = Miles per intervention.
You'll hear the same sentiment in any tech talk from Waymo, Mobileye, etc.

The architectural challenges of implementing driving policy are essentially nil.
This is simply not true. Every SDC company disagrees with this; every SDC company considers it an open problem. In fact, most SDC companies largely do the same thing when it comes to perception, but you won't find a single SDC company doing the same thing when it comes to planning (driving policy). It's all vastly different. Also, the balance of how much ML versus how many heuristics to use in driving policy is constantly shifting as things get redesigned. For you to call this nil shows that you haven't peeked outside your Tesla bubble.
Once you know what you want the car to do for an edge case in driving policy, the implementation is trivial.
Thus there is no limitation other than the time it takes to learn from real-world edge cases, define the requirements, and logic out the heuristic decisions.
You need to venture outside your Tesla bubble.
 
One specific example of a particularly non-trivial driving-policy problem, even with decent perception, is detecting whether a stopped vehicle in your lane is parked or waiting for vehicles or an obstruction ahead of it. In my daily commute, roughly three times a day FSD tries to use the oncoming travel lane to pass a queue of cars at a stop sign, school bus, or traffic light. Every time it happens I find myself trying to work out a simple logical statement to describe how it should have known better, but the more I think about it the more I appreciate the difficulty of making that choice.
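To make the difficulty concrete, here is a minimal sketch, in Python, of the kind of heuristic you'd be tempted to write (every field name and threshold here is invented for illustration, not anyone's actual stack):

```python
from dataclasses import dataclass

@dataclass
class StoppedVehicle:
    # Hypothetical perception outputs; every field name is invented.
    brake_lights_on: bool
    hazard_lights_on: bool
    distance_to_curb_m: float   # how close it sits to the curb
    gap_ahead_m: float          # free space in front of it
    near_intersection: bool

def is_parked(v: StoppedVehicle) -> bool:
    """Naive guess at whether a stopped lead vehicle is parked
    (maybe safe to pass) or waiting in a queue (do not pass)."""
    if v.brake_lights_on:
        return False            # foot on the brake: probably waiting
    if v.near_intersection and v.gap_ahead_m < 8.0:
        return False            # likely queued at a light or stop sign
    if v.hazard_lights_on:
        return True             # hazards: often parked... but not on a school bus
    if v.distance_to_curb_m < 0.5 and v.gap_ahead_m > 15.0:
        return True             # hugging the curb with open road ahead
    return False                # unsure: treat as waiting (conservative)

print(is_parked(StoppedVehicle(False, True, 0.3, 20.0, False)))  # True
```

Every branch above already has counterexamples (hazards on a school bus, a queue that starts mid-block, a parked car sitting away from the curb), which is exactly the point: the code is easy, knowing what the code should say is the hard part.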
 
Just move that determination to the vector space and the problem becomes trivial. You just check the vector space "parked" attribute. :p
I'm curious whether getting FSD beta makes people think self-driving is harder than they thought or if there are some who think it's easier.
@MrTemple, do you have FSD beta?
 
I think it's worth distinguishing between the kind of driving policy that leads to technically safe behavior, and the kind of driving policy that leads to "natural" driving behavior that confuses other drivers less, feels safer, and by consequence leads to fewer accidents.

Once perception is robust, driving safely in isolation is relatively trivial. Take, for example, some of Dirty Tesla's recent videos where the navigation tries to take the car into a barrier in the middle of a roundabout. Technically, it perceived the barrier and avoided hitting it. But it avoided it by stopping nearly dead in the middle of the roundabout. If all other drivers were equally perceptive, that would be a safe maneuver. But it feels unsafe and awkward because it's not how a human would respond, and if another driver isn't paying full attention it could lead to being rear-ended.
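One way to frame that distinction: planners commonly score candidate trajectories with a weighted cost function, and "technically safe but awkward" is what you get when the weights only encode collision avoidance. A toy sketch, with all terms and numbers made up for illustration:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    # Illustrative aggregate features of one planned trajectory.
    collision_risk: float        # probability of contact
    hard_brake_exposure: float   # how abruptly we stop in traffic
    max_jerk: float              # comfort proxy
    time_to_goal: float          # progress proxy, seconds
    surprise: float              # deviation from typical human behavior

def cost(c: Candidate, w: dict) -> float:
    """Lower is better."""
    return (w["collision"] * c.collision_risk
            + w["rear_end"] * c.hard_brake_exposure
            + w["comfort"] * c.max_jerk
            + w["progress"] * c.time_to_goal
            + w["surprise"] * c.surprise)

stop_dead   = Candidate(0.00, 0.9, 0.8, 30.0, 0.9)  # stops in the roundabout
smooth_exit = Candidate(0.01, 0.1, 0.3, 12.0, 0.1)  # drives on like a human

safety_only = {"collision": 100, "rear_end": 0, "comfort": 0, "progress": 0, "surprise": 0}
balanced    = {"collision": 100, "rear_end": 5, "comfort": 2, "progress": 0.1, "surprise": 5}

print(cost(stop_dead, safety_only), cost(smooth_exit, safety_only))  # 0.0 1.0
print(cost(stop_dead, balanced), cost(smooth_exit, balanced))        # 13.6 3.8
```

Under the safety-only weights the dead stop scores best; once rear-end exposure and "surprise" are penalized, the human-like exit wins. Choosing and tuning those weights is driving policy.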
 
Thanks, got it.

Assuming they have the vector space worked out, that makes sense. Big assumption.

Simple to code does not mean the effort to implement is trivial. What to implement is always the question.

Once you have the answer, yeah it's easy to do it the next time, you already know what to do.

So still not sure what the point is. But yeah, ok, we agree on where the hard part is.

(sorry, I checked out for 2 days, thought I caught back up, must not have).
The original response was basically ‘the hardware can’t do it’.

The point is that the vector-space conversion is already working.

The NN training of edge cases takes a while, but this doesn’t require more compute or memory (this is the thing that makes Neural Networks unique and fascinating).

Actually, if you watch Ashok’s talk from June about the Occupancy Networks, they are MASSIVELY optimizing both the compute required and its efficacy (some of the first of these new nets are what went into 10.69, hence its huge improvements already on VERY new nets).
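For readers unfamiliar with the term: an occupancy network outputs, roughly, a 3D grid of "is this chunk of space occupied" probabilities that downstream code can query cheaply. A toy sketch of how a planner might consume one (grid size, resolution, and threshold are all invented; this is not Tesla's actual interface):

```python
import numpy as np

# Toy stand-in for an occupancy network's output: a voxel grid of
# occupancy probabilities around the car (x forward, y left, z up).
RES_M = 0.5                                   # half-meter voxels, illustrative
grid = np.zeros((200, 200, 16), dtype=np.float32)
grid[120:124, 98:102, 0:4] = 0.97             # a pretend obstacle ~10 m ahead

def occupied(x_m, y_m, z_m, thresh=0.5):
    """Query the grid at a point in the car's frame (origin at grid center)."""
    i = int(x_m / RES_M) + grid.shape[0] // 2
    j = int(y_m / RES_M) + grid.shape[1] // 2
    k = int(z_m / RES_M)
    if not (0 <= i < grid.shape[0] and 0 <= j < grid.shape[1] and 0 <= k < grid.shape[2]):
        return False                          # outside the grid: unknown, assume free here
    return grid[i, j, k] >= thresh

# Sweep the planned path and flag the first occupied voxel.
print(any(occupied(x, 0.0, 0.5) for x in np.arange(0.0, 20.0, RES_M)))  # True
```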

I’ve said all along that every little bit of smart, safe driver policy will take a while to define. But the exact implementation is trivial. There’s no hardware/software architectural hurdle there.

Fortunately for the ‘what do we build’ question, we have an exact example of ‘what to do’ in replicating safe human driving.

So the question isn’t so much how to decide what is safe, but how to replicate what is safe.

Still not fast or easy to copy the existing safe driver policy, but MUCH easier than if we had no example to copy.
 
You’re not demonstrating that you know the difference between deciding a policy and executing that policy via software implementation. (Maybe because you missed that my point was in response to the claimed software/hardware capability, which is MORE than able to handle nuanced policy decisions 100x more complex than your example, once the state is known in the system’s vector space.)

You are describing a great example of why it’s hard for a human to DECIDE and define on paper what the car should do.

That takes time and nuanced study of how safe humans make their decision.

It will take a ton of iterative attempts, simulated and also tested in real-world cases.

But THE KEY DISTINCTION is that it is absolutely trivial to implement in software every single iterative attempt at a policy for that.

Do you see the difference?

Implementation of complex decisions like you describe is not limited by the software or hardware architecture. Not in the very slightest bit.

Creating a robust driver policy for that situation (assuming it is captured in vector space) is HARD, because it requires us to understand existing safe human driving policy and define it.

The translation of that logic, once defined in a requirement, into code running in the car is absolutely trivial for your example.
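To illustrate what "trivial to implement" means here: suppose the hard-won requirement for the stopped-lead-vehicle case were finally written down. The transcription into code is then mechanical. A hedged sketch, with an invented requirement and invented state fields:

```python
from types import SimpleNamespace as NS

def should_pass_stopped_vehicle(state) -> bool:
    # Direct transcription of a hypothetical, already-decided requirement:
    # "Pass only if the lead vehicle is classified parked with >95% confidence,
    #  the oncoming lane is clear for at least 12 seconds, it is not a school
    #  bus, and we are not in an intersection queue."
    return (state.lead_vehicle.parked_confidence > 0.95
            and state.oncoming_lane.clear_time_s >= 12.0
            and not state.lead_vehicle.is_school_bus
            and not state.in_intersection_queue)

# Invented example state, standing in for the vector-space snapshot:
state = NS(lead_vehicle=NS(parked_confidence=0.98, is_school_bus=False),
           oncoming_lane=NS(clear_time_s=15.0),
           in_intersection_queue=False)
print(should_pass_stopped_vehicle(state))  # True
```

Each policy iteration is a few lines like this; the months go into deciding what those lines should say and validating them.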

And of course there is still lots to do in capturing the world into vector space, but this is not constrained by the software or hardware architecture (see Ashok’s June talk, where they made MASSIVE compute-efficiency improvements with v1.0 of the Occupancy Networks in 10.69).
 
Yeah, you came late to the party and completely missed the point where I said capturing the world into the vector-space is the hard part.
Capturing the world is not difficult; a map is prior ground truth. Representing the sensor data and the prior in vector space is not difficult. A lidar is a sensor that represents data directly in 3D space; vector space is just representing the data in 3D world coordinates. If you think that is difficult, imagine fusing data from three different sensor types into a single vector space (3D space). Now that must be impossible, right? Yet you have cars with 30 to 50 sensors on the road driving autonomously.
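The "lidar is already in 3D" point is easiest to see in code: dropping sensor returns into world coordinates is one standard homogeneous transform. A minimal sketch (the pose numbers are made up):

```python
import numpy as np

def to_world(points_sensor: np.ndarray, pose: np.ndarray) -> np.ndarray:
    """points_sensor: (N, 3) lidar returns in the sensor frame.
    pose: (4, 4) homogeneous transform from sensor frame to world frame."""
    n = points_sensor.shape[0]
    homog = np.hstack([points_sensor, np.ones((n, 1))])   # (N, 4)
    return (pose @ homog.T).T[:, :3]                      # back to (N, 3)

# Sensor mounted 2 m up, rotated 90 degrees about z, car at world (100, 50):
c, s = np.cos(np.pi / 2), np.sin(np.pi / 2)
pose = np.array([[c, -s, 0, 100.0],
                 [s,  c, 0,  50.0],
                 [0,  0, 1,   2.0],
                 [0,  0, 0,   1.0]])
pts = np.array([[10.0, 0.0, 0.0]])           # one return 10 m ahead of the sensor
print(to_world(pts, pose))                   # ~[[100., 60., 2.]]
```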
The vector-space problem is heinously difficult to implement. Three years ago 95% of experts in the industry would have said it was impossible with the sensors Tesla is using.
Three years ago, Waymo debuted VectorNet, which fuses map and sensor data into a vectorized (2D/3D coordinate) representation for cheaper compute. Do you have any sources to support your assertion that, three years ago, 95% of experts found sensor fusion and representing the data in 3D space impossible?
But it is working better than anybody could have guessed at this point in its incredibly short development lifecycle (since the clean-slate NN architecture rewrite).
They are building on the shoulders of giants. They didn't start from knowing nothing; they started from the research papers and the decades of research and development contributed to this field.

If you look at every ADS company out there operating on the road, you will see that the problem they all have in common is largely driving policy.
 
Weird that in all of your quotes you are VERY careful not to include where I repeatedly described the vector-space problem of doing it with the cameras/sensors used on Teslas. And then you go on to describe how it's not hard on platforms using entirely different sensors. 🤔

Really, really weird you did that. 🙄
 
Take the example of the 5 stopped cars in the driving lane, waiting for something unknown (we don't know what, whether the driver behind them is human or FSD).

Can we ever solve that for FSD? Sometimes I half pull out and check for the obstruction; if it's a car parking, I pull back in and wait for traffic.

Once it was a row of valet-parked cars, and a guy at the front of the line waved me around.

Once it was a dog running around confused in the street, with drivers from both directions out of their cars trying to help.

Is a robotaxi (once driverless) going to have remote intervention for absolutely stuck situations like this?
 
If you know the problem and the solution, it can be automated.
 

Sometimes stuck is stuck. Do you go around 5 cars stopped for geese (it happens around here)? Do you go around them if they stopped because of a crash? Sometimes there's a very good reason why the cars in front of you are stopped.

You have to apply judgment about whether it's something you want to get around or something you should wait for. I would guess that driving policy would say "don't break any laws," but it'll probably pass a stopped car over a double yellow eventually, and if that's the wrong choice then it may have to face the consequences of that choice. At some point it may also have to learn how to reverse or smush over to let an emergency vehicle pass. Or just stop and wait until the way looks clear.

If the robotaxi company has installed remote assistance, that might help. Maybe they'd offer actual remote driving, but Waymo doesn't do that; presumably their policy doesn't allow it. If a car is absolutely 'refusing' to move because of some kind of neural-net meltdown, it'll probably have to be deactivated and towed away.
 
My experience with the "stopped car" insanity was being stopped at a red traffic light, which was correctly visualized on the screen. The car then suddenly signaled left and tried to move into the oncoming traffic lane. Obviously I stopped it.
Personally, I'd rather have a screen prompt like NoA has for lane changes.
Kind of like a Bing! "Overtake parked car?" with the default set to "No, you stupid car, it's a red traffic light." :rolleyes:
 
If you know the problem and the solution, it can be automated.
A functional definition of intelligence is the ability to solve a problem that you haven't encountered (or encountered similar examples of) before. Real-world driving contains far too many such problems to automate all of their solutions through enumeration.

To give a concrete example: my wife was recently on a four-lane highway where the northbound lanes split into two single-lane paths due to construction. She arbitrarily took the left path; it was walled in with concrete barriers, so all the cars were forced to go single-file. After a minute or so of driving, all the cars slowed to a stop. Several cars ahead of her, there was an accident that had completely shut down her lane, but the other lane in the same direction was still wide open. After waiting an hour at a standstill, a bunch of drivers agreed to coordinate reversing direction and backing out of the lane to before the fork, then take the other lane, in order to get around the accident. Explain to me how FSD could be programmed to understand and deal with this situation all by itself?
 

Don't assume completed FSD has to mean driverless. In fact, I think most believe driverless operation, outside of specific robotaxi services, is far away. Your example is exactly the kind of case the approach outlined below handles, and there will be many more like it.
One item I missed in the original post: the phased implementation could start with limited-access highways, where FSD, once the single stack is done, would be fairly close to this capability.

I realize this is not a popular perspective, but it's going to be a long time before FSD can handle 100% of driving without driver assistance.
I continue to believe the greater revenue driver for the average owner is not driverless cars but simply the capability for FSD to handle 99.5% of driving, with handoff of edge cases to the driver in a non-emergency manner. In other words, instantaneous takeover is not required. For example, you come up on a construction project with flagmen/police officers directing traffic, or you need to reroute because of a traffic accident. So let's assume the driver has 30 seconds to respond. Tesla could then ask regulators to allow drivers to text, watch videos, or otherwise be distracted until and/or unless they must take over. No sleeping. The first carmaker who provides this will not be able to meet demand. Sure, robotaxi is important, but not for the average owner. I would much rather see Tesla approach FSD as a phased implementation. I'm not looking for the holy grail.