Another bitch about Beta's ability to read human intent. It seems that when coming to a pedestrian crossing with people standing near it, Beta will usually slow or stop even if they are not crossing. We humans can look at a person, "read" their eyes and body posture, and make a good determination of intent. Beta can't do this and I just don't see how it will be able to for MANY years. Here today it clearly missed people INTENDING to cross, which is the opposite of what it usually does and the MOST dangerous. You can clearly see by watching that they FULLY intended to cross. Beta just can't see what people intend to do very well.

 
Another bitch about Beta's ability to read human intent. It seems that when coming to a pedestrian crossing with people standing near it, Beta will usually slow or stop even if they are not crossing. We humans can look at a person, "read" their eyes and body posture, and make a good determination of intent. Beta can't do this and I just don't see how it will be able to for MANY years. Here today it clearly missed people INTENDING to cross, which is the opposite of what it usually does and the MOST dangerous. You can clearly see by watching that they FULLY intended to cross. Beta just can't see what people intend to do very well.

Reading human intent is something humans get wrong half the time. Not sure how you expect FSD to get it right.
 
You can clearly see by watching that they FULLY intended to cross. Beta just can't see what people intend to do very well.
Is that perhaps due to the resolution of the cameras? We humans point our eyes at the pedestrians, and apply foveal resolution to interpret what's about to happen. I often slow while I figure it out, and one of my pet peeves is pedestrians who stand close to the curb, but don't intend to cross. When I am a pedestrian, I try to signal with body language, but even that can be hard when walking with a dog. Anyway, I do wonder if the camera resolution is sufficient. Same question with tail lights on the freeway: I start to slow well before AP does. Perhaps AP and FSD know too well how quickly the car can stop or turn, and delay action till the last moment. Or, perhaps, it just has latency in its perception. Or, perhaps, the camera resolution is just not great. Who knows....
 
Is that perhaps due to the resolution of the cameras? We humans point our eyes at the pedestrians, and apply foveal resolution to interpret what's about to happen..... Who knows....
I have always thought and wondered this. The cameras probably need to be at least 4K, and even that is low compared to human vision, especially at distance. Also, the front does have 3 cameras, including a dedicated telephoto, which should help. But this doesn't address things like the B-pillar view, which is more of a standard-to-wide-angle lens, and the even BIGGER problem IMO is not having them on movable gimbals like we do. Being able to reposition the view can go a LONG way to improving perception. Anyone who has ever used LCD rear view mirrors (especially side) will have experienced this.
 
I have always thought and wondered this. The cameras probably need to be at least 4K, and even that is low compared to human vision, especially at distance. Also, the front does have 3 cameras, including a dedicated telephoto, which should help. But this doesn't address things like the B-pillar view, which is more of a standard-to-wide-angle lens, and the even BIGGER problem IMO is not having them on movable gimbals like we do. Being able to reposition the view can go a LONG way to improving perception. Anyone who has ever used LCD rear view mirrors (especially side) will have experienced this.
It's more complex than just camera resolution, and just bumping up the pixel count doesn't necessarily fix anything (and can cripple performance of the NNs). Human eyes are actually pretty bad at raw pixel gathering; it's the brain that is amazingly good at extracting information from the noisy data. In many ways the cameras are already better than the eye .. they have high resolution across the entire FOV, they see all around the car at the same time, and with different depths (with no delays to re-focus or move the head/eye). The big difference here is the vastly more sophisticated neural processing that the brain can do.

Remember that what we "see" isn't what our eyes show us, it's what our brain interprets as what it thinks is there. See this, for example.
 
It's more complex than just camera resolution, and just bumping up the pixel count doesn't necessarily fix anything (and can cripple performance of the NNs). Human eyes are actually pretty bad at raw pixel gathering; it's the brain that is amazingly good at extracting information from the noisy data. In many ways the cameras are already better than the eye .. they have high resolution across the entire FOV, they see all around the car at the same time, and with different depths (with no delays to re-focus or move the head/eye). The big difference here is the vastly more sophisticated neural processing that the brain can do.
And, interestingly, it's not that the neural networks in our skulls are anywhere near as fast as silicon. It's that those neural networks are massively parallel. And if that's not wacky enough, they're really slow, as in too slow, by far, to be able to catch a ball, say. Or to avoid a lion. So, a huge part of our neural network is its built-in predictive capability, which runs all the time. Those whose predictive capabilities weren't up to snuff got eaten.
As one might imagine, this plays hob with our sense of time. There are reasons why, when one first looks at a ticking clock, the first tick or two seems to run at double time, then settles down: brain algorithms at work. It's a wonder we can keep our collective balance.
 
Is that perhaps due to the resolution of the cameras? We humans point our eyes at the pedestrians, and apply foveal resolution to interpret what's about to happen. I often slow while I figure it out, and one of my pet peeves is pedestrians who stand close to the curb, but don't intend to cross. When I am a pedestrian, I try to signal with body language, but even that can be hard when walking with a dog. Anyway, I do wonder if the camera resolution is sufficient. Same question with tail lights on the freeway: I start to slow well before AP does. Perhaps AP and FSD know too well how quickly the car can stop or turn, and delay action till the last moment. Or, perhaps, it just has latency in its perception. Or, perhaps, the camera resolution is just not great. Who knows....

Deducing intention is a complex guessing game that higher-resolution cameras can help with, but still won't make perfect. Yesterday I jogged to a median as a Zoox was approaching down the other half of the road, and it stopped in the middle of the intersection to yield to me. I did my best to slow down in an obvious manner and angle myself towards the median to show my intention of yielding, but it still sat and waited for a few seconds before proceeding.

To my eyes, the FSD beta looked OK in that case. There didn't seem to be any cars behind the OP, and only 1-2 cars in the opposite direction coming through at the same time, so as a human driver I would've interpreted it as the people waiting for the last couple of cars to go by before they crossed.
 
Here's an interesting experience I had driving south on I-405 in the Seal Beach area of California. There is major construction going on in the area, widening the freeway. As such, they've shifted the lanes to the right in certain areas.

I was on Autopilot, at 75MPH. As the lanes shifted, I trusted AP enough to take my eyes off the road and watch the screen. I was zoomed in enough on the map to see street level details. I noticed that my arrow was now off the freeway (as the lanes had shifted in this construction zone). There was an exit coming up, which was a small exit lane that left the freeway and paralleled it. Since the arrow was off the freeway due to the shift, it snapped onto that exit road.

So, mapping thought I was on the exit road, even though I was still in the #2 lane of the freeway. Realizing this, I put my foot over the accelerator and had my finger on the right scroll wheel, ready for what I knew would happen. Sure enough, as I approached the end of the exit road on the map, the car suddenly switched from 75MPH to 40MPH and started to brake pretty aggressively. But I immediately put my foot down and kept the speed up, while scrolling the wheel back up to 75MPH.

So - there must be mapping speed limit data, where the car thought it was on a different road than the freeway and adjusted speed to 40MPH based on that map data.

I would also imagine that some people who experience sudden speed limit changes on freeways/highways may be having a GPS issue where the car thinks it's not on the freeway anymore and is on a side road, then sets the speed accordingly.

In this case, any car from any manufacturer would likely suffer the same issue if they were using mapping data for speed limits, and the GPS thinks the car is on a different road due to lane shifts (construction) or errors in GPS.
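
To make the failure mode concrete, here's a minimal sketch of naive map-matching (my own illustration, with made-up road names and offsets; this is not Tesla's actual matcher): snapping the GPS fix to the nearest road centerline means a construction lane shift, or a few meters of GPS error, can attach the car to a parallel exit road and pull in that road's 40MPH limit.

```python
# Hypothetical road database: (name, lateral centerline offset in meters, limit in mph).
# A real matcher would use polylines, heading, and route history, not a single offset.
ROADS = [
    ("I-405 mainline", 0.0, 75),
    ("parallel exit road", 9.0, 40),
]

def snap_to_road(gps_lateral_offset_m):
    """Naively snap a GPS fix to the nearest road centerline."""
    return min(ROADS, key=lambda road: abs(road[1] - gps_lateral_offset_m))

print(snap_to_road(1.0))  # ('I-405 mainline', 0.0, 75) -- normal case
# Construction zone: lanes shifted right ~6 m, so the fix is now closer
# to the exit road's centerline -> wrong road, wrong speed limit.
print(snap_to_road(6.0))  # ('parallel exit road', 9.0, 40)
```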
 
In this case, any car from any manufacturer would likely suffer the same issue

Yes, but a Tesla with HW3 should be able to tell it is on a freeway and what an appropriate speed is, based on surrounding traffic, signs, etc. If I have not been driving and I wake up and open my eyes, I can immediately tell I am on a freeway. So the car can too. So it should not do this anymore, since this is going to be ready for wide release by the end of the year.

It’s an easy problem. The car just has to know it is on a freeway from the environment and behave accordingly.

I would also imagine that some people who experience sudden speed limit changes on freeways/highways may be having a GPS issue

Yes there are a variety of posts around here from a bug a while back where the GPS position was substantially shifted and it led to very poor AP behavior.
 
Yes, but a Tesla with HW3 should be able to tell it is on a freeway and what an appropriate speed is, based on surrounding traffic, signs, etc. If I have not been driving and I wake up and open my eyes, I can immediately tell I am on a freeway. So the car can too. So it should not do this anymore, since this is going to be ready for wide release by the end of the year.

It’s an easy problem. The car just has to know it is on a freeway from the environment and behave accordingly.


FWIW I've seen actual speed limits as low as 35 mph in construction zones on freeways (interstates even).

So I'm not sure "just knowing it's a freeway" is enough to know that thinking it's 40 mph is wrong.
 
FWIW I've seen actual speed limits as low as 35 mph in construction zones on freeways (interstates even).

So I'm not sure "just knowing it's a freeway" is enough to know that thinking it's 40 mph is wrong.
Yeah, that's why I said it needs to be able to look at signs! In construction zones there is NEVER a problem with knowing what the speed is - there's always a convenient, timely sign. Just like a human (better, in fact, since it'll always be looking for signs, and humans sometimes forget what they saw or miss the sign).

Anyway this is an easy problem for something that is very confidently going to be wide release shortly.

Most likely it’ll just be another limitation tossed into the “it’s just L2” junk pile though.
 
Yeah, that's why I said it needs to be able to look at signs! In construction zones there is NEVER a problem with knowing what the speed is - there's always a convenient, timely sign. Just like a human (better, in fact, since it'll always be looking for signs, and humans sometimes forget what they saw or miss the sign).

Anyway this is an easy problem for something that is very confidently going to be wide release shortly.

Most likely it’ll just be another limitation tossed into the “it’s just L2” junk pile though.
There is no question that a lot of progress can be made by training the NN to read and interpret a much wider variety of signs. Programs like Google Lens, and even some of the extensions to my smartphone camera app, have had this capability for some time. I'm frankly a little mystified why this aspect of FSD isn't farther along; it seems like low hanging fruit compared to some of the other less widely-useful improvements they've worked on.

Learn to read [many more] signs and reconcile the information with map data. Also learn to recognize (label) the type of road being driven and/or temporary barriers & detours, and likewise compare and reconcile that information with the map data. The correct outcome, for a system that is explicitly not dependent on constantly-updated HD maps, is that the confident Vision Information would be allowed to override the always-doubtful map, and thus continue smoothly along the self-adjusted routing.
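
As a toy illustration of that reconcile-and-override rule (my own sketch, with invented names and a made-up confidence threshold, not anything from Tesla's stack): let a confidently-perceived road label win, and fall back to the map only when vision is unsure.

```python
from dataclasses import dataclass

VISION_TRUST_THRESHOLD = 0.8  # assumed cutoff, purely illustrative

@dataclass
class RoadObservation:
    map_label: str          # e.g. "residential", from possibly-stale map data
    vision_label: str       # e.g. "limited-access freeway", from the camera NN
    vision_confidence: float

def reconcile(obs: RoadObservation) -> str:
    # Confident vision overrides the always-doubtful map; a shaky
    # perception should not be allowed to override it.
    if obs.vision_confidence >= VISION_TRUST_THRESHOLD:
        return obs.vision_label
    return obs.map_label

print(reconcile(RoadObservation("residential", "limited-access freeway", 0.95)))  # vision wins
print(reconcile(RoadObservation("residential", "limited-access freeway", 0.55)))  # map retained
```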
 
Another area where GPS reliance causes issues is where one road runs over another. Ultimately, if a maker is going to implement an automated system (like TACC, AP, etc.), it needs to be able to accurately interpret the inputs and filter out noise, and from what we've seen, Tesla regularly falls short in this area.

Relying on GPS data is fraught with potential errors. Not only does your GPS data need to be consistently accurate to within a few meters, but your map data has to be up to date. If either one of those is off, the system fails.

Nominally, a visual system (like humans use) would be better; the problem with things like speed limits is that signs are often obscured - think of driving in the left lane with a truck in the right lane. What should a system do if the map data disagrees with the visual data? How does it know which is correct?
 
Another area where GPS reliance causes issues is where one road runs over another. Ultimately, if a maker is going to implement an automated system (like TACC, AP, etc.), it needs to be able to accurately interpret the inputs and filter out noise, and from what we've seen, Tesla regularly falls short in this area.

Relying on GPS data is fraught with potential errors. Not only does your GPS data need to be consistently accurate to within a few meters, but your map data has to be up to date. If either one of those is off, the system fails.

Nominally, a visual system (like humans use) would be better; the problem with things like speed limits is that signs are often obscured - think of driving in the left lane with a truck in the right lane. What should a system do if the map data disagrees with the visual data? How does it know which is correct?
The car should do what good humans do: use visual data, but then adjust to the local conditions.

EDIT: you can also use map data. If I'm driving using GPS and my speed feels wrong for the road, I'll check the speed the GPS says. But I can't start with GPS, since I _know_ that it's sometimes wrong.
 
...What should a system do if the map data disagrees with the visual data? How does it know which is correct?
This is what I was trying to address in my post above:
...compare and reconcile that [visual] information with the map data. The correct outcome, for a system that is explicitly not dependent on constantly-updated HD maps, is that the confident Vision information would be allowed to override the always-doubtful map...
Of course this is only the guiding principle and not an unambiguous solution to each scenario.

But basically, the map or navigation plan should not cause the car to do something that is unexpected or inconsistent with the current road environment (e.g. accelerating to highway speed while on a road that "labels" as residential, non-limited-access or under-construction).

Probably most importantly, resolving any such conflict or uncertainty should not produce an abrupt maneuver unless, of course, the accident-avoidance behavior has been triggered. From FSDb user reports and my own experience with AP/FSD, the most dangerous and unsettling behavior comes from overly abrupt slowing, lane position "correction" or turn-path adjustment, particularly when it results in sudden changes to relative speed or buffer distance to other vehicles.

So the software's priority ordering when resolving uncertainty should be:
  • VRU accident avoidance
  • Non-VRU accident avoidance
  • Compliance to visually-perceived traffic control: barriers, signs, traffic directors, school buses, emergency vehicles
  • Apparent accidents or other unexpected road-user positioning/behavior
    • Ranked below temporary traffic-control directives because an accident, tie-up or public-event response should favor organized directives over independent decisions of path and driving behavior.
  • Road markings and proximate signs
  • Speed and direction of proximate traffic, subject to a weighting/voting system that won't take cues from a minority of reckless or confused drivers
  • Map information for both routing and traffic control, as available
  • GPS-based direction hunting to destination, in the absence of map routing information
  • If no destination is programmed (a perfectly valid FSD mode IMO), generally follow the flow of the current road until directed to turn or exit. However the rules and discussion of this behavior are beyond this particular post.
In the list above, note that the map ranks fairly low. This doesn't imply that it isn't much used, but that all kinds of unusual or conflicting information take precedence. Most of the time, those conflicts aren't present and the map is followed.

Also, pretty much only the first two items justify highly abrupt correction behavior. Compliance with the remaining priorities should involve more relaxed and human-like adaptation to changing conditions (that doesn't mean I think I solved Phantom Braking just by saying this; there's obviously the tricky problem of falsely triggered accident-avoidance maneuvers. But map information alone should not be justification to slam on the brakes, nor to accelerate to a speed inconsistent with the visually-perceived class of road).
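
To make that ordering concrete, here's a minimal arbitration sketch (my own illustration; the source names, ranks, and actions are assumptions, not anything from Tesla's software): each cue source gets a rank, the highest-priority active cue wins, and only the two accident-avoidance ranks are allowed to command an abrupt maneuver.

```python
# Ranks follow the priority list above; lower number = higher priority.
PRIORITY = {
    "vru_avoidance": 0,
    "non_vru_avoidance": 1,
    "visual_traffic_control": 2,   # barriers, signs, traffic directors
    "unexpected_road_users": 3,
    "road_markings_and_signs": 4,
    "proximate_traffic_flow": 5,
    "map_data": 6,
    "gps_direction_hunting": 7,
}

ABRUPT_ALLOWED = {"vru_avoidance", "non_vru_avoidance"}

def arbitrate(active_cues):
    """Pick the highest-priority active cue.

    active_cues: dict mapping source name -> proposed action string.
    Returns (source, action, allowed maneuver style).
    """
    source = min(active_cues, key=PRIORITY.get)
    style = "abrupt allowed" if source in ABRUPT_ALLOWED else "smooth only"
    return source, active_cues[source], style

# Map says slow to 40, but posted signage says 65: signage outranks the
# map, and the adjustment must be smooth -- no slamming on the brakes.
print(arbitrate({
    "map_data": "slow to 40 mph",
    "road_markings_and_signs": "hold 65 mph per posted sign",
}))
```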
 
This is what I was trying to address in my post above:

Of course this is only the guiding principle and not an unambiguous solution to each scenario.

But basically, the map or navigation plan should not cause the car to do something that is unexpected or inconsistent with the current road environment (e.g. accelerating to highway speed while on a road that "labels" as residential, non-limited-access or under-construction).

Probably most importantly, resolving any such conflict or uncertainty should not produce an abrupt maneuver unless, of course, the accident-avoidance behavior has been triggered. From FSDb user reports and my own experience with AP/FSD, the most dangerous and unsettling behavior comes from overly abrupt slowing, lane position "correction" or turn-path adjustment, particularly when it results in sudden changes to relative speed or buffer distance to other vehicles.

So the software's priority ordering when resolving uncertainty should be:
  • VRU accident avoidance
  • Non-VRU accident avoidance
  • Compliance to visually-perceived traffic control: barriers, signs, traffic directors, school buses, emergency vehicles
  • Apparent accidents or other unexpected road-user positioning/behavior
    • Ranked below temporary traffic-control directives because an accident, tie-up or public-event response should favor organized directives over independent decisions of path and driving behavior.
  • Road markings and proximate signs
  • Speed and direction of proximate traffic, subject to a weighting/voting system that won't take cues from a minority of reckless or confused drivers
  • Map information for both routing and traffic control, as available
  • GPS-based direction hunting to destination, in the absence of map routing information
  • If no destination is programmed (a perfectly valid FSD mode IMO), generally follow the flow of the current road until directed to turn or exit. However the rules and discussion of this behavior are beyond this particular post.
In the list above, note that the map ranks fairly low. This doesn't imply that it isn't much used, but that all kinds of unusual or conflicting information take precedence. Most of the time, those conflicts aren't present and the map is followed.

Also, pretty much only the first two items justify highly abrupt correction behavior. Compliance with the remaining priorities should involve more relaxed and human-like adaptation to changing conditions (that doesn't mean I think I solved Phantom Braking just by saying this; there's obviously the tricky problem of falsely triggered accident-avoidance maneuvers. But map information alone should not be justification to slam on the brakes, nor to accelerate to a speed inconsistent with the visually-perceived class of road).
I agree - mapping data should be used for navigation (the "Navigate on Autopilot" feature), but otherwise visual data should be used. There are a few problems that need to be worked out:

1) If GPS/mapping data shows that you are on a side street when you are actually on a freeway (either due to lane shifts in construction zones or incorrect GPS data), and navigation says you need to turn right, how does the car handle that turn when it's on the freeway? Would it just ignore the navigation data and miss the turn, or would it try to move over inappropriately on the freeway? Now you have mapping sending data to the NN telling it where to go, and the NN is not seeing the path. I recall a YouTube video of someone on FSD Beta where the mapping data was totally incorrect. It looked like a GPS error, as the car was not showing where it should have been on the map. The navigation data said to turn left, and the car just started to turn left into essentially nothing - there was no left turn to be had - I think it was a driveway instead of the street it was supposed to turn on. An example I can think of for humans would be a passenger in your car giving you directions, and while you're driving, they suddenly yell "Turn left, NOW!!!" And you, as the driver, don't see where you're supposed to turn, and end up missing the turn or driving awkwardly.

2) If there are no signs available, how does the car know what speed to drive at? There are huge, and I mean miles-long, stretches of I-405 in Orange County where there are no speed limit signs. I think I counted over 15 miles with no speed limit signs. So if mapping told the car it's on a side road, and the car visually overrides that, knowing it's on the freeway, but then checks for speed - what speed does it use? Mapping is telling it 40MPH, and there are no freeway signs for miles, so does it just keep the last speed that it detected? And vice-versa: if it's on a side road, and mapping suddenly says it's on a highway, but there are no speed limit signs, how does it know what speed to keep going? Each state has its own vehicle code telling us what the default speed limit is when signs are not present. In California the speed limit on an unposted (or unknown to the driver) freeway/highway is 55MPH. 25MPH in business districts / school zones / etc. 15MPH in alleys, blind intersections, and blind railroad crossings. Perhaps there should be a setting in the cars for default speed limits on highways/freeways, city streets, and residentials. We could enter 55MPH, 45MPH, and 25MPH in those settings and the car, absent speed limit signs, would use those speeds. Or it would have to defer to mapping data, which leads us back to #1.
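
That per-class fallback could look something like this (a sketch of the proposed setting, using the California defaults and the 55/45/25 values suggested above; the class names are illustrative assumptions):

```python
# User-configurable fallback speed limits by road class, consulted only
# when no sign has been seen and the map's road match is distrusted.
DEFAULT_LIMITS_MPH = {
    "freeway": 55,        # CA default for unposted freeways/highways
    "city_street": 45,    # value suggested above
    "residential": 25,    # business districts / school zones
    "alley": 15,          # alleys, blind intersections, blind crossings
}

def fallback_limit(road_class, last_posted_limit=None):
    """Prefer a recently seen posted sign; otherwise use the per-class
    default, with a conservative catch-all for unknown classes."""
    if last_posted_limit is not None:
        return last_posted_limit
    return DEFAULT_LIMITS_MPH.get(road_class, 25)

print(fallback_limit("freeway"))                        # 55: no sign for miles
print(fallback_limit("freeway", last_posted_limit=65))  # 65: sign seen earlier
```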
 
The correct outcome, for a system that is explicitly not dependent on constantly-updated HD maps, is that the confident Vision Information would be allowed to override the always-doubtful map, and thus continue smoothly along the self-adjusted routing.
The trouble is it's not quite as clear-cut as that. Based on the few hints we've had so far, it's pretty clear that the Vision system uses map data as an input to the NN .. so if the Vision system isn't sure (to use a too-crude example) whether the road ahead has 2 or 3 lanes, it can use map information to resolve the ambiguity. Of course it's much more complex than that, but it means that when map data and vision data differ, the car may reach a point where its confidence level is too low to continue. Sure, as the vision system improves, Tesla can weight it more heavily than map data, but you are still going to face times when the two are so dissonant that the car cannot progress by itself (which seems pretty sensible for an L2 system).

There is also safety to be considered. If the vision system sees a 50 mph road sign with 60% confidence, but the map system has a 40 mph limit, which should the car choose? If the car saw a 25 mph sign with the same 60% confidence, which should it choose? So it's not just a matter of correctness, it's a matter of what is the safest course of action.
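
One way to frame that safety bias as code (my own sketch, with an invented confidence threshold; nothing here reflects Tesla's actual policy): when the sources disagree and the sign read is shaky, choose the lower, safer speed rather than the "most likely correct" one.

```python
VISION_CONFIDENCE_FLOOR = 0.9  # assumed threshold, purely illustrative

def choose_speed(map_mph, sign_mph, sign_confidence):
    """Pick a speed limit when a vision-read sign and map data disagree."""
    if sign_confidence >= VISION_CONFIDENCE_FLOOR:
        return sign_mph             # trust a confidently-read sign
    return min(map_mph, sign_mph)   # otherwise err on the slow side

# The 50 mph sign at 60% confidence vs. a 40 mph map limit: stay at 40.
print(choose_speed(map_mph=40, sign_mph=50, sign_confidence=0.6))  # 40
# A 25 mph sign at 60% confidence vs. a 40 mph map limit: slow to 25.
print(choose_speed(map_mph=40, sign_mph=25, sign_confidence=0.6))  # 25
```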
 
The map system apparently does not include stop signs. Signs on small, wooded roads in moderately hilly areas can be hard to see, and the vision system can miss them, or see them late enough that the car doesn't respond and just runs through the stop sign (which is what happened today). This has to change.