Stopping at traffic lights and stop signs now in "shadow mode"

You were stating that a system working in Las Vegas is geofenced, but one working in the whole US is not.

I understand that technically "no geofence" should mean the entire world, but I am not using the term that way. I am only looking at the US as "no geofence". I am basically not counting the rest of the world, since I am only looking at the area that affects where I drive.

For my daily driving, "no geofence" just means the entire US.

I understand you, and it is an easy mistake to make (especially living in the US, where Tesla has more presence), but please understand that it is really important to be very clear with the message. Either Tesla hardwires the US rules into its FSD, or it makes FSD really work without being geofenced.

This goal (and not the technology) is what drives Tesla to a much more interesting future than Waymo's.

Yes, I understand that Tesla wants FSD to work in the entire world but for me, the US is the only part that relates to my driving needs.
 
Once North America is complete, it won't take a lot to do the rest of the left-hand-drive world. The right-hand-drive world will take a bit longer. I don't agree this is geofencing. You might as well say that because it won't work on the Moon or Mars, it's geofenced.
 
If Tesla's claim is that they are making a system that is not geofenced, then it is land that matters, not people.

If Tesla makes something "useful" just for US owners, it is a geofenced system, like it or not. It would be exactly the same as what others are doing, just increasing (a bit) the scale of the geofenced zone (from roughly 1% to roughly 6% of the world's land).
Because land doesn't drive. People do. Just Vegas (or the suburbs of Phoenix) is a very tiny % of trips. Just like land doesn't vote, people do.

BTW, why are you bringing a tangential topic into this thread? If you want to talk about V2I, start your own thread. Please don't derail this one.

ps : That's why we really need moderators in this forum. I nominate @diplomat33 ;)
 
According to the autonomy investor day presentation, Tesla uses the following procedure for developing and rolling out a new AP/FSD feature:
1) Learn from humans
Tesla starts with data from human driving to develop the software feature.
2) Shadow Mode
Tesla uploads the software feature to test how it would work in the real world. Tesla uses the feedback to fine tune the software.
3) Early Access
When Tesla feels good about the software, they release it to actual owners in active mode and continue to get feedback on how it works in the real world.
4) Wide roll out
Tesla releases the feature to all cars.

So right now, the feature is at stage 2, which means early access should be the next stage.

It is worth noting that it was about one month between shadow mode for NOA and the wide release of NOA with confirmation, and about six months between NOA shadow mode and the wide release of NOA without confirmation. If that timeline holds, we could see "traffic light stop" in a few months.

It would be good for us to get an understanding of how the dev process really works for new features. Here is what I think. Let's take the stop sign as an example.

A. Initial development
- Collect lots and lots of images with stop signs. They already have a lot of images without stop signs, so they can use those as negative examples.
- Label the images. I've no idea how many images might be needed. Thousands? Hundreds of thousands? Millions? This is the most laborious and time-consuming part of the development.
- Create a new NN "task" which outputs presence/absence of a stop sign and the distance to the sign (a minimal sketch of such a two-output head follows this list).
- Train the NN with labelled images.
- Write procedural code (heuristics / software 1.0) to use the NN task's output to stop the car at the right place.
- Iterate on collecting images, labeling, and training/optimizing the NN
- Once the initial quality bar is met, include it in the dev build

B. Test on internal fleet and iteratively fix procedural bugs and optimize NN

C. Include in shadow mode. Whenever shadow mode's decision is different from what the driver does, send the data (see the divergence sketch at the end of this post).

D. Analyse the shadow mode data, include new scenarios where the NN + software doesn't work properly, and optimize and fix the NN/software. This is also a very laborious, time-consuming phase. If shadow mode is operating on even 100k cars, they could send a million data points every week. How do you analyze them and pick the edge cases to include? Having a lot of data is good, but it's only the start; lots of hard work, time, and resources are needed to use that data properly.

E. Include in early release and fix/optimize as in C/D.

F. Release widely to fleet.
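The presentation didn't show what such an NN "task" looks like internally. Purely as an illustration of the two-output head described in step A (stop-sign presence plus distance), here is a minimal PyTorch sketch; the backbone, layer sizes, and loss weighting are all assumptions, not Tesla's actual network:

```python
# Minimal sketch (not Tesla's architecture): a shared vision backbone with two
# heads -- one classifies stop-sign presence, one regresses distance to the sign.
import torch
import torch.nn as nn

class StopSignTask(nn.Module):
    def __init__(self):
        super().__init__()
        # Stand-in backbone; the real system shares features across many tasks.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.presence_head = nn.Linear(32, 1)  # logit: is a stop sign present?
        self.distance_head = nn.Linear(32, 1)  # regressed distance (e.g. meters)

    def forward(self, img):
        feats = self.backbone(img)
        return self.presence_head(feats), self.distance_head(feats)

def loss_fn(presence_logit, dist_pred, present, dist_label):
    # Classification loss always applies; the distance loss is masked so it
    # only counts on images that actually contain a stop sign.
    cls = nn.functional.binary_cross_entropy_with_logits(presence_logit, present)
    reg = (nn.functional.smooth_l1_loss(dist_pred, dist_label, reduction="none")
           * present).mean()
    return cls + reg
```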

As you can see, it is a laborious and time-consuming process. That is why each new feature takes months. As I wrote elsewhere, Tesla tries to get a particular feature to six 9s before releasing it to the fleet (that doesn't mean it is really six 9s, because as they release widely they will find more bugs and edge cases).
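We don't know how the on-car comparison in step C is actually implemented. Conceptually, though, "send the data whenever shadow mode differs from the driver" can be as simple as the following sketch; the frame fields, threshold, and upload queue are all hypothetical, not Tesla's code:

```python
# Hypothetical shadow-mode divergence check (step C above). It only illustrates
# the idea of comparing what the feature *would* do against what the driver did.
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: float
    shadow_brake_request: float  # deceleration the shadow feature wanted (m/s^2)
    driver_brake: float          # deceleration the driver actually applied (m/s^2)
    snapshot: bytes              # camera clip / sensor data around this moment

DIVERGENCE_THRESHOLD = 1.5  # m/s^2 of disagreement worth reporting (made up)

def check_divergence(frame: Frame, upload_queue: list) -> None:
    """Queue a snapshot whenever shadow mode and the driver clearly disagree."""
    if abs(frame.shadow_brake_request - frame.driver_brake) > DIVERGENCE_THRESHOLD:
        # e.g. shadow mode wanted to stop for a "stop sign" the driver ignored,
        # or vice versa -- exactly the data worth sending back for step D.
        upload_queue.append((frame.timestamp, frame.snapshot))
```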
 

Yep. I think that is a very accurate description of Tesla's dev process. And considering that Tesla has a relatively small dev team, they have accomplished a lot. And I do think this approach will get Tesla to true FSD. It will just take time.
 
A lot of time, as you can see. Having a lot of automation and tools to pick out the edge cases from the millions of disengagement data points that come in weekly is the key (a hypothetical sketch of such triage is at the end of this post).

1,000 people working 8 hours a day, each labeling an image every 10 minutes, will take about a month to label 1 million images. This will cost about $1M (at $5/hour in India; roughly 4x that in the US), so it's the time that is more critical here than the money. But you can't arbitrarily scale people; 1,000 people is already a lot.
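For what it's worth, that arithmetic holds up. Spelled out with the same assumed inputs (10 minutes per image, 8-hour days, $5/hour):

```python
# Back-of-the-envelope check of the labeling estimate above (all inputs assumed).
images          = 1_000_000
minutes_per_img = 10
labelers        = 1_000
hours_per_day   = 8
rate_usd_hr     = 5  # rough offshore rate; roughly 4x that in the US

total_hours = images * minutes_per_img / 60             # ~166,667 person-hours
days        = total_hours / (labelers * hours_per_day)  # ~21 working days, ~1 month
cost_usd    = total_hours * rate_usd_hr                 # ~$833k, i.e. roughly $1M

print(f"{days:.1f} working days, ${cost_usd:,.0f}")
```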

If something like the stop sign feature needs 1 million images labeled, it would take a month to label them initially. I've no idea how much effort it would be to analyze the shadow mode data and make use of that data to optimize the feature. It could be several months.

They probably work on multiple features in parallel, so they can get a few important features out every 6 months ...
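None of that tooling is public. Purely as an illustration of what "automation to pick out the edge cases" could mean, a triage step might bucket incoming divergence events by a coarse scenario signature and cap how many near-duplicates go to the labelers; every field name and threshold below is invented:

```python
# Hypothetical triage sketch: from millions of weekly divergence events, keep
# only a capped sample per coarse "scenario bucket" so labelers see variety
# rather than a million near-identical suburban stop signs.
from collections import defaultdict

MAX_PER_BUCKET = 50  # arbitrary cap per scenario signature

def bucket_key(event: dict) -> tuple:
    # Coarse signature: rounded location, speed band, day/night. All made up.
    return (round(event["lat"], 3), round(event["lon"], 3),
            int(event["speed_mph"] // 10), event["is_night"])

def select_for_labeling(events: list) -> list:
    buckets = defaultdict(list)
    for ev in events:
        key = bucket_key(ev)
        if len(buckets[key]) < MAX_PER_BUCKET:
            buckets[key].append(ev)
    return [ev for bucket in buckets.values() for ev in bucket]
```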
 
The stop sign case seems like the perfect application of "shadow mode". It should be possible to detect where nearly every stop sign in the country is fairly quickly by analyzing all the data from all the Teslas on the road. You just have to look at places where many different cars have behaved as if at a stop sign. Then you can take image data from those places and train the NN.
My guess is that if they do release this feature to the fleet (I kind of doubt they will), it will automatically stop at stop signs but it won't go without driver confirmation. Starting up again at stop signs is a much trickier problem.
 
That is easier said than done. Think about what is actually needed to "analyze all the data from all the Teslas on the road". We are talking millions of pieces of data every day. That is why I said it depends on the tools they have to analyze the data; figuring out what "behaved as if at a stop sign" actually means in practice is not easy. People could stop for traffic lights, pedestrians, other cars - lots and lots of things.

Of course, they probably label traffic lights and stop signs together. But I don't think it can be done "rather quickly" … anyway, what do you mean by that: hours, days, weeks, months?
 
You can run code on every car to detect stop sign behavior (stop, wait a second, go). Transmit all the locations where this behavior occurs back to the mothership. Sure, people stop for all sorts of reasons, but if you have 99% of your fleet stopping at exactly the same spot (and of course you look at the distribution of stop times to make sure it's not a stoplight), then you've found a stop sign. The stop signs around here probably see hundreds of Teslas a day, so it wouldn't take long.
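How (or whether) Tesla does that aggregation isn't known. As an illustration of the idea in this post - cluster reported stops by location and use the spread of stop durations to rule out traffic lights - a server-side sketch could look like this; the grid size, thresholds, and field names are all made up:

```python
# Illustrative "mothership"-side aggregation: cluster reported stop events by
# location and use the spread of stop durations to separate stop signs
# (consistently short stops) from traffic lights (highly variable waits).
from collections import defaultdict
from statistics import median, pstdev

def grid_cell(lat: float, lon: float, size: float = 0.0002) -> tuple:
    """Snap a coordinate to a coarse grid cell (~20 m); the size is arbitrary."""
    return (round(lat / size), round(lon / size))

def find_probable_stop_signs(stop_events: list, min_cars: int = 50) -> list:
    cells = defaultdict(list)
    for ev in stop_events:  # ev: {"lat": ..., "lon": ..., "stop_seconds": ...}
        cells[grid_cell(ev["lat"], ev["lon"])].append(ev["stop_seconds"])

    probable_signs = []
    for cell, durations in cells.items():
        if len(durations) < min_cars:
            continue  # not enough fleet evidence at this spot yet
        if median(durations) < 5 and pstdev(durations) < 3:
            probable_signs.append(cell)  # short, consistent stops -> likely a sign
        # long or highly variable waits look more like a light or congestion
    return probable_signs
```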
 
You can run code on every car to detect stop sign behavior (stop, wait a second, go).
This is the part I'm not sure about. It's not clear to me that they can run arbitrary code to catch these scenarios. It would be interesting to think about how they got data for the "cut-in" scenario, which they confirmed was done using shadow mode.

If they can run arbitrary code, it becomes slightly easier. They have to look for patterns of driving where you have stop, wait for x seconds, and drive. Then the car has to look for an image (or short video) from x seconds before the stop. Maybe here they can ignore rolling stops. They probably need slightly different criteria for busy stop signs (because the waiting time will be longer). Then they would need a good geographical spread, including the EU and Asia if they are training globally. Then they need stop signs that are partly occluded by other vehicles and by foliage - maybe even different types of foliage (by season, too!). They have to look for different angles and heights.

Now, in shadow mode, it gets trickier, because they need to eliminate rolling stops as a divergent behavior.
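Nobody outside Tesla knows what the real trigger looks like. As a thought experiment, a "stop, wait, go" detector over the recent speed trace that grabs a frame from x seconds before the stop, and skips rolling stops, could be sketched like this; all thresholds and field names are invented:

```python
# Thought-experiment trigger (not Tesla's): scan the recent speed trace for a
# stop -> wait -> go pattern and return a camera frame from a few seconds
# before the stop, ignoring rolling stops that never reached ~0 mph.
from typing import Optional

LOOKBACK_S  = 3.0  # grab the frame this many seconds before the stop
MIN_WAIT_S  = 1.0  # minimum standstill to count as a real stop
STOPPED_MPH = 0.5  # below this speed we treat the car as stopped

def stop_sign_trigger(trace: list) -> Optional[bytes]:
    """trace: time-ordered samples like {"t": seconds, "speed_mph": v, "frame": bytes}."""
    stop_start = None
    for i, s in enumerate(trace):
        if s["speed_mph"] < STOPPED_MPH:
            if stop_start is None:
                stop_start = s["t"]
        elif stop_start is not None:
            if s["t"] - stop_start >= MIN_WAIT_S:
                # Full stop followed by driving off: return the frame from
                # LOOKBACK_S before the stop, which should still show the sign.
                target_t = stop_start - LOOKBACK_S
                earlier = [x for x in trace[:i] if x["t"] <= target_t]
                return earlier[-1]["frame"] if earlier else None
            stop_start = None  # too brief to be a real stop; keep scanning
    return None
```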
 
Then they need stop signs that are partly occluded by other vehicles and by foliage - maybe even different types of foliage (by season, too!). They have to look for different angles and heights.
You've got to look for other clues (road markings, back side of stop sign on other side, "4 way" text on stop sign in other direction, etc.). There are also intersections with no stop signs in any direction by design.
 
V2I is utopia. If one location malfunctions, it's all worthless; then vision has to take over. You would also have to retrofit the entire world, or it's worthless.
Traffic lights are out of order all the time. Here they are either off or blinking yellow, and then the normal yield rules take over.

It's a "nice to have" to confirm the traffic light when it works.
 
You've got to look for other clues (road markings, back side of stop sign on other side, "4 way" text on stop sign in other direction, etc.). There are also intersections with no stop signs in any direction by design.
Yes.

Anyway, back to my point: these things are neither simple nor quick.

We could see some of these features take a year to be mature enough to be released to the wide fleet.
 
I was being a little flip. But I thought they described how they got the data for their "cut-in" scenarios at autonomy day?

I'll have to listen again to be sure, but it was not detailed enough to figure out whether they looked for certain images or certain dynamic actions. The hackers say the "trigger" just looks for some kind of "blob" … ?

PS: It's a bit more complicated.

green on Twitter
 
Good luck with that

[Image: the traffic-light sculpture at Canary Wharf, London]


Yes, I know that one is a sculpture, but these sequences aren't

[Images: clusters of real traffic lights]
 