Welcome to Tesla Motors Club
Discuss Tesla's Model S, Model 3, Model X, Model Y, Cybertruck, Roadster and More.
Really confused by these failures to yield for pedestrians at crosswalks, or late yields:


[Attachment: 12.2.1 pedestrians disengagement.jpg]


Maybe the hat on the pedestrian standing in the road just slightly outside the marked crosswalk, or the face mask? Distracted/confused by so many unusual decorations (the 11.x visualization later seems to show a red stop sign for a red lantern)? Maybe Tesla should have a data-collection trigger for when bystanders glare / "WTF?"

Separately, interesting behavior when she drops the max set speed to 1 mph to avoid bottoming out and the visualization still draws the full 2-second path at the speed it would otherwise have taken on AUTO. Seems like another example of post-processing heuristics that limit what end-to-end is allowed to do. There's also decision wobble, with end-to-end seemingly confused about why it's only going 1 mph, alternating between a long path and a short path: "I wanted to go, but for some reason I'm slow, so I guess I need to slow down, but it's actually clear, so I should just go?"
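For what it's worth, the "post-processing heuristics" idea above is easy to sketch. This is purely hypothetical code (none of these names are Tesla's actual API): a clamp applied after the end-to-end planner would explain why the visualization still draws the full AUTO-speed path while the car only does 1 mph.

```python
# Hypothetical sketch: a post-processing clamp around an end-to-end planner.
# Names (clamp_plan, max_set_speed_mps) are illustrative, not Tesla's real code.

def clamp_plan(planned_speeds_mps, max_set_speed_mps):
    """Limit each point of the planner's speed profile to the driver's max set speed.

    The planner still produces its full (e.g. 2-second) profile; the clamp is
    applied afterwards, so a visualization fed from the raw plan would draw
    the unclamped path -- consistent with what the video shows.
    """
    return [min(v, max_set_speed_mps) for v in planned_speeds_mps]

# The planner "wants" to accelerate to ~8 m/s over the next 2 seconds...
raw_plan = [2.0, 4.0, 6.0, 8.0]
# ...but the driver set the max speed to 1 mph (~0.45 m/s).
executed = clamp_plan(raw_plan, 0.45)
print(executed)  # [0.45, 0.45, 0.45, 0.45]
```

If something like this exists, it would also account for the wobble: the net keeps planning as if it could go, while the clamp silently holds it back.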
 
Really confused by these failures to yield for pedestrians at crosswalks, or late yields: […]
V12 needs to be more careful. That's all. 😂
 
Many seem to be in a rush to "me too" the new version, only to complain about what it can't do the next day. There are 10,000-plus pages of hurry-to-complain posts. I personally say let them monkey around with it as long as they need to give us a moderately stable foundation to start with. Then we can complain about the non-chauffeur driving style etc. from that point.
I'd be happy to help monkey around with V12.
 
A question for any knowledgeable AI folks here. Elon once described the older version as a whole bunch of NNs plus 300,000 lines of C code just for lane choosing. (Apply a grain of salt.)

If all that is now a single end-to-end neural net, might that take significantly longer to train? How does training time vary with NN size? Is it linear? Does the entire net need to be retrained from scratch for every iteration? Intuition suggests that vastly more iterations might be needed to train a larger and deeper net.
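A rough back-of-envelope for the scaling question: a widely used rule of thumb for dense networks is ~6 FLOPs per parameter per training example token (forward plus backward pass), so compute per pass grows roughly linearly with model size and with data. This is a generic approximation, not a figure for any actual FSD net, and real wall-clock time can grow faster because larger nets typically also want more data and more iterations.

```python
def training_flops(params, tokens, passes=1):
    """Rule-of-thumb total training compute for a dense network:
    ~6 FLOPs per parameter per training token (forward + backward),
    times the number of passes (epochs) over the data.
    A common scaling estimate for transformer-style models; it is an
    approximation, not specific to Tesla's networks."""
    return 6 * params * tokens * passes

small = training_flops(params=1e9,  tokens=1e12)   # 1B-param net
large = training_flops(params=1e10, tokens=1e12)   # 10x larger net
print(large / small)  # 10.0 -- compute per pass grows linearly with size
```

Whether every iteration retrains from scratch or fine-tunes from a previous checkpoint makes a big difference to the `passes` term, and that detail isn't public.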

I heard a talk by a Caltech AI researcher who is working on medical applications, such as CAT scan reading. He bemoaned the vast cost of computer time to train the systems, costs which researchers find hard to cover. So far only huge tech companies have the budgets to invest, while life-saving applications are lagging, even when we know they can do the job. FSD is an example of massive expenditure on an uncertain outcome.
"I heard a talk by a Caltech AI researcher who is working on medical applications, such as CAT scan reading. He bemoaned the vast cost of computer time to train the systems […]"

Doesn't the Dojo computer used to train FSD have the bandwidth to do the training?
 
For those of you bummed to not have v12, you might feel better knowing that I would rather be on v11.4.9 right now. I'm honored to be selected, but

1. The speed stuttering is annoying.
2. I feel I need to be extra extra vigilant.
I'm putting 11.4.9 in a FedEx box to send to you as I type. I'll even pay the shipping from your end. Box it up and send it on. Make sure NOT to send your wipers🤣
 
Doesn't the Dojo computer used to train FSD have the bandwidth to do the training?
That is more or less my question. We heard about Dojo, then a year later heard that maybe they would soon start to use it. Are they now?

My understanding is that NN training requires multiple passes through the training data, just as we do when learning. Tesla has collected an unimaginable amount of video to use in training. We know how boring it is to watch even a few FSD YouTube videos. Even with warp-factor-Dojo bandwidth, scanning a billion miles of video for the multiple iterations AI training requires might take a while.

I'm just wondering if maybe, compared with editing a few lines and regression testing the C code, tweaking a large NN system may actually slow down the revision cycles.
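The "multiple passes" point above is the standard epoch structure of gradient-descent training, and it does mean total data scanned = epochs × dataset size. A toy illustration (generic SGD on a one-variable least-squares problem, nothing Tesla-specific):

```python
# Toy illustration of epoch-based training: every epoch is one full pass
# over the dataset, so total data scanned = epochs * dataset size.

dataset = [(x, 3.0 * x) for x in range(1, 6)]  # learn y = 3x
w, lr, epochs = 0.0, 0.01, 50

for _ in range(epochs):
    for x, y in dataset:            # one pass over all training examples
        grad = 2 * (w * x - y) * x  # d/dw of the squared error (w*x - y)^2
        w -= lr * grad

print(round(w, 3))                  # converges toward 3.0
samples_scanned = epochs * len(dataset)
print(samples_scanned)              # 250 examples touched in total
```

Scale the same loop to billions of video clips and dozens of passes and the data-movement cost alone is enormous, which is why the bandwidth question matters.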
 
That is more or less my question. […] Tesla has collected an unimaginable amount of video to use in training. […]
Vaguely connected trivia: I saw a thing the other day that estimated that by 2030 Tesla will have the ability to capture 16,000 miles of driving data a second from their fleet.
 
I feel I need to be extra extra vigilant
Yeah, it seems like 12.x might not be as good as 11.x at avoiding hitting stuff, in that it might incorrectly assume vehicles will continue moving (34:33):


[Attachment: 12.2.1 stopping disengagement.jpg]


Ultrasonics seem to show 20" after disengagement, so maybe end-to-end would have adjusted, but 11.x probably would have been too hesitant to get that close in the first place. More generally, 12.x might have learned to make more aggressive maneuvers without picking up on the signals of when those actions would be inappropriate, and now we can't rely on 11.x's perception and C++ control heuristics that more consistently avoided collisions / ensured safety.
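To make the "C++ control heuristics" idea concrete, here is a hypothetical sketch (in Python for brevity; every name here is made up) of the kind of hand-written override being described: a check that runs independently of the learned planner and forces braking when a tracked obstacle is inside a minimum standoff distance.

```python
# Hypothetical safety-override heuristic. None of these names or thresholds
# are Tesla's; this only illustrates the general pattern being discussed.

FOLLOW_DISTANCE_M = 2.0  # assumed minimum standoff (20 inches is ~0.5 m)

def safe_accel(planner_accel, obstacle_distance_m, closing_speed_mps):
    """Let the learned planner drive, but independently command braking
    when an obstacle is inside the standoff distance and not receding."""
    if obstacle_distance_m < FOLLOW_DISTANCE_M and closing_speed_mps >= 0:
        return min(planner_accel, -3.0)  # force at least moderate braking
    return planner_accel

# Planner wants to creep forward (+0.5 m/s^2) at 0.5 m (~20") from a
# stopped car it wrongly assumed would keep moving:
print(safe_accel(0.5, 0.5, 0.0))   # -3.0: heuristic overrides with braking
print(safe_accel(0.5, 10.0, 0.0))  # 0.5: road clear, planner untouched
```

A pure end-to-end stack either has to learn this behavior from data or keep some wrapper like it, and the thread's worry is that 12.x may have neither in every case.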
 
Yeah, it seems like 12.x might not be as good as 11.x at avoiding hitting stuff […]
The fact that not even all the usual YT influencers have gotten access tells me they are not confident. An increased likelihood of collisions seems to be the problem.

End-to-end is a tough nut to crack and is probably much more difficult to debug and fix.
 
Yeah, it seems like 12.x might not be as good as 11.x at avoiding hitting stuff […]

I think HW3 is worse than HW4 for V12
 
Really confused by these failures to yield for pedestrians at crosswalks, or late yields: […]

Yep, v12 isn't reliably detecting, acknowledging, and/or responding correctly. And many of these seem to be pretty run-of-the-mill scenarios.
 
Yep, v12 isn't reliably detecting, acknowledging, and/or responding correctly. And many of these seem to be pretty run-of-the-mill scenarios.
It kinda seems that V12 acts a lot like a typical AI model such as ChatGPT. Most of the time it's impressive and seems human-like in its abilities and perceptions. Other times it seems so far off the reality track and obtuse in its actions while maintaining an aura of confidence.
 
It kinda seems that V12 acts a lot like a typical AI model such as ChatGPT. […]

I posted earlier in this thread that the video-in, controls-out approach means the car's driving performance is similar in both simple and complex situations, because the NN is processing the same video bitstream input.

So we have videos and examples of V12 doing amazing stuff but also failing at simple stuff.
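The "video in, controls out" framing can be sketched as a single function signature: the same learned mapping handles every scene, with no separate code path for "hard" cases. Everything below is illustrative shape only, not Tesla's architecture.

```python
# Sketch of "video in, controls out": one learned function maps camera frames
# directly to a control vector, with no scenario-specific code paths.
from dataclasses import dataclass
from typing import Sequence

@dataclass
class Controls:
    steering: float      # e.g. normalized to [-1, 1]
    acceleration: float  # m/s^2

def drive(frames: Sequence[bytes]) -> Controls:
    """Stand-in for an end-to-end network: whether the scene is an empty
    highway or a crowded festival street, the input is the same video
    bitstream and the output is the same control vector -- which is why
    performance doesn't cleanly split into 'simple' vs 'complex' modes."""
    # ...network inference would go here; return a neutral command as a stub
    return Controls(steering=0.0, acceleration=0.0)

print(drive([b"frame0", b"frame1"]))
```

Under this framing, "simple" failures and "amazing" successes come from the same path, which matches the mixed results people are posting.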
 