Welcome to Tesla Motors Club
Discuss Tesla's Model S, Model 3, Model X, Model Y, Cybertruck, Roadster and More.

Are we underestimating what features Tesla could leverage with the cameras + AI

Not sure if this has been discussed before (I'm sure it has)

But let's just watch the first few seconds of this Figure AI robot demo and then think about what Tesla could add to the car by leveraging a realtime "Tesla camera swarm" in a similar way.


But instead of you asking the car...the car polls itself constantly for info and shares it.

For instance someone pulls into an AMC theater parking lot
The computer notices the lot is full and shares that info

You then later ask the computer
"Navigate to the AMC theater"
it responds
"It was reported 10 minutes ago that the parking lot was full"

"Navigate to the restaurant on main street"
it responds
"It was reported 5 minutes ago that there may be a parade going down the road today"

"Navigate to the town high school"
it responds
"5 minutes ago many people were leaving the building"

"Navigate to Six flags"
it responds
"It's currently raining there...and the traffic is a mess"

"Navigate to my work"
it responds
"Will have to take an alternate route because the snowplows have not cleared the road yet"

"Navigate to the airport"
it responds
"There are lots of construction cones in the road...merging 3 lanes into 1...you may be late"

"Navigate to Dad's house"
it responds
"The street lights are out of order and blinking at the intersection of A & B"
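The flow above could be sketched in a few lines. Everything here is hypothetical (the in-memory store, the function names, the message format), just to make the idea concrete: one car shares what it saw, another car gets the report back when it asks to navigate.

```python
import time

# In-memory stand-in for a hypothetical fleet-wide report service.
reports = {}

def share_observation(destination, note):
    """A car posts what it just saw, keyed by destination."""
    reports[destination] = {"note": note, "timestamp": time.time()}

def ask_navigate(destination):
    """Before routing, check for a recent report about the destination."""
    report = reports.get(destination)
    if report is None:
        return f"Navigating to {destination}."
    minutes_ago = int((time.time() - report["timestamp"]) / 60)
    return f"It was reported {minutes_ago} minutes ago that {report['note']}"

# One car notices the lot is full and shares it...
share_observation("AMC theater", "the parking lot was full")
# ...another car asks to navigate there later.
print(ask_navigate("AMC theater"))
```

In a real system the store would be a server-side service and the reports would need timestamps, locations, and expiry, but the round trip is the same shape.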

Imagine a TV commercial:
The owner parks on a dark street and falls asleep...with the doors unlocked.
The car sees a dog walk by...and does nothing.
The car sees a person walking up...and locks the doors.
The person grabs the door handle, finds it locked, and runs away.
The owner wakes up and asks what happened.
It replies "You fell asleep and the doors were unlocked...and some stranger approached...so I locked them...they grabbed the locked handle but then ran away."
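The commercial's logic is really just a couple of rules sitting on top of whatever the vision stack detects. A toy sketch (the detection labels and function name are assumptions, not real Tesla outputs):

```python
# Hypothetical Sentry-style reaction rules. The detector labels
# ("dog", "person") are assumed outputs of the car's vision stack.
def sentry_react(detection, owner_asleep, doors_locked):
    """Decide what the car should do about something approaching."""
    if detection == "person" and owner_asleep and not doors_locked:
        return "lock_doors"
    return "do_nothing"

# A dog walks by: nothing happens.
print(sentry_react("dog", owner_asleep=True, doors_locked=False))
# A stranger approaches while the owner sleeps with doors unlocked: lock them.
print(sentry_react("person", owner_asleep=True, doors_locked=False))
```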


I think Tesla should leverage their cameras to create an ecosystem people become so used to that they won't switch to another car brand...like Apple.
Plus, like Apple, they control both the hardware and software, so they can be the gatekeeper. They can add/remove/modify software at will.
 
Some of what you suggest seems obvious. It shouldn't take much compute to recognise roadworks, for example, and simply upload some metadata about a lane being closed. Whether that's really necessary is questionable though - we already have live traffic data that provides the information we need to avoid a road and re-route.

The parking lot being full is something we don't currently have. In general, anything the car sees that could be useful information for other cars to have access to could be a good thing, but I think we would have to be telling the car heuristically what kind of information we are looking for (did someone say potholes?). We only have a small pipe to the internet - cellular data. We can't be uploading video constantly to be processed by a higher power - so we need to restrict ourselves to sharing the onboard compute power.

What kind of hardware is that OpenAI robot running, and is the processing happening onboard or upstream in some supercomputer?

It got it wrong anyway. The drying rack was not an appropriate place for the plate and cup - they were already dry. They belonged in a cupboard. Well, the cup did. The plate needed to be washed first because it had had trash on it. If the robot had access to an LLM it should have known that.

As for the person asleep in the car with the doors unlocked - I'm not even sure I want that kind of AI in my car. It's not normal for me to sleep in my car with the doors unlocked, so it's quite possible that I'm actually having a stroke and the person approaching the car is trying to help. Thanks for nothing, HAL...
 
we already have live traffic data that provides the information we need to avoid a road and re-route.

Can't guarantee all countries have that.

The parking lot being full is something we don't currently have. In general, anything the car sees that could be useful information for other cars to have access to, could be a good thing,

Yes, this is about gathering data nobody else has.

We only have a small pipe to the internet - cellular data. We can't be uploading video constantly to be processed by a higher power

Depends upon the local speed. It may not necessarily be as small as you think. Obviously for FSD training the car uploads camera clips, and I can also watch YouTube easily all day on my center screen. Plus the car can stream music constantly on demand as you drive.

The drying rack was not an appropriate place for the plate and cup - they were already dry.
This is just a training/semantic issue: using racks to dry dishes is probably not how the testers typically use them, since they likely have mechanical dishwashers at home. The rack was being used for storage, not drying.

As for the person asleep in the car with the doors unlocked - I'm not even sure I want that kind of AI in my car. It's not normal for me to sleep in my car with the doors unlocked, so it's quite possible that I'm actually having a stroke and the person approaching the car is trying to help. Thanks for nothing, HAL...
It's just an example of an advancement to Sentry Mode.

Other things to think about when leveraging the "Tesla camera swarm":
Since we can see all the views of our cameras in realtime on the phone app, I wonder in what scenarios it would be useful to see the views from other cars.
 
Tesla is already doing something rudimentary along these lines combining real-time supercharger availability with the number of vehicles en route to a specific supercharger.
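The supercharger logic could be as simple as subtracting inbound cars from open stalls before recommending a site. A toy sketch (the numbers and field names are made up):

```python
def effective_free_stalls(free_now, en_route):
    """Stalls likely free on arrival: what's open minus who's already coming."""
    return max(0, free_now - en_route)

def pick_supercharger(sites):
    """Pick the site with the most stalls likely free on arrival."""
    return max(sites, key=lambda s: effective_free_stalls(s["free"], s["en_route"]))

sites = [
    {"name": "Downtown", "free": 4, "en_route": 6},  # looks open, but 6 cars inbound
    {"name": "Highway",  "free": 2, "en_route": 0},
]
print(pick_supercharger(sites)["name"])  # -> Highway
```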

Also, just yesterday, the local news stations showed a segment about a fourth traffic light color to be activated based upon data from autonomous vehicles of any make that can report back to an aggregating source. The concepts are in this article published in March:


As for limitations of bandwidth and centralized processing power, I would expect video capture that is processed within the vehicle to be converted to text messages that meet an as-yet-to-be-determined messaging standard. That way, any car with cameras and a means of processing and communication can participate in the automated crowdsourcing of location and video data.
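For example, such a report could be a tiny JSON message rather than a video stream. The schema name and fields below are invented (no such standard exists yet), purely to show the scale of the payload:

```python
import json
import time

# A made-up example of a standardized crowd-sourced road report:
# the vehicle processes video onboard and uploads only a tiny text message.
def make_report(event_type, lat, lon, detail):
    return json.dumps({
        "schema": "crowd-road-report/0.1",  # hypothetical standard version
        "event": event_type,                # e.g. "lane_closed", "lot_full"
        "lat": lat,
        "lon": lon,
        "detail": detail,
        "ts": int(time.time()),             # report timestamp (epoch seconds)
    })

msg = make_report("lane_closed", 40.7128, -74.0060, "cones merging 3 lanes into 1")
print(len(msg.encode()), "bytes")  # a few hundred bytes instead of a video stream
```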
 
Depends upon the local speed. It may not necessarily be as small as you think. Obviously for FSD training the cameras are uploading clips from cars and I can also watch youtube easily all day on my center screen. Plus the car can stream music constantly on demand as you drive.
Not the same thing. LTE/NR networks are optimised for download, not upload - the download bandwidth is often way higher than the upload. Elon's own V12 live demo on Twitter in Palo Alto had really poor video quality - probably because of limited upload bandwidth over LTE/NR.

Yes, the car uploads clips for training purposes, but it doesn't do it live - it uploads when you park your car in the evening with WiFi access. I've occasionally had my car upload 2GB overnight.

Therefore I believe video processing would need to be done in the vehicle, with small data message uploads.
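Some back-of-envelope numbers (illustrative assumptions, not measurements) show why small data messages win over continuous video upload:

```python
# Back-of-envelope comparison: continuous video upload vs. small
# in-car-processed text reports. All numbers are assumptions.
video_kbps = 2_000       # assume ~2 Mbps for one compressed camera stream
report_bytes = 300       # assume one metadata message is ~300 bytes
reports_per_hour = 12    # assume one report every 5 minutes

video_mb_per_hour = video_kbps * 3600 / 8 / 1000          # kbit/s -> MB/hour
report_mb_per_hour = report_bytes * reports_per_hour / 1_000_000

print(f"video:   {video_mb_per_hour:.0f} MB/hour")        # 900 MB/hour
print(f"reports: {report_mb_per_hour:.4f} MB/hour")       # 0.0036 MB/hour
```

Even with generous assumptions, the text reports are five orders of magnitude smaller than streaming one camera.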

Live traffic data: The great thing about live traffic data is it is not manufacturer-specific. Mobile phones upload that data in real time (I believe both Apple and Android do it) - so there is an enormous data source. Any other proposed data uploaded by cars (e.g. full car parks) would be better if there was an open standard which all manufacturers would use. That way if a BMW reports that the car park is full, then Teslas are also aware.
 
There is still a bit of a time lag in response time, and it also states the obvious without offering any alternative answers or any personality, like humor. The voice needs to be smoother and have some appropriate pauses: "The parking lot is really busy this time of day, appears no available spots... did you still want to go, or... we can go somewhere else?"
 
Does the FSD computer + infotainment computer have enough extra compute for this?

How computationally heavy is the NN vs the hardcoded approach?

I'll be honest, I haven't looked into how Tesla handles their neural network, but I do have a little bit of neural network experience.

There are two sides to the computational requirements: training vs replay. On the training side you need as much compute as you can get your hands on, so you can either reduce the amount of time each iteration takes or run more iterations in parallel. On the replay side of things you really only need what's required to execute one iteration.

Think of it like how much time it takes to create a cookbook vs how much time it takes to read a recipe from that cookbook.
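Here's the cookbook idea in toy form: a tiny neuron learning the AND function. Training takes thousands of passes over the data (writing the cookbook); using the trained neuron takes one cheap pass per query (reading a recipe):

```python
import math
import random

# Toy illustration of training cost vs. replay (inference) cost.
random.seed(0)
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = [random.gauss(0, 1), random.gauss(0, 1)]
b = 0.0

def forward(x):
    """The cheap part: one pass per query."""
    return 1 / (1 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))

# The expensive part: thousands of passes over the data.
for _ in range(5000):
    for x, target in data:
        err = forward(x) - target
        w[0] -= 0.5 * err * x[0]
        w[1] -= 0.5 * err * x[1]
        b   -= 0.5 * err

print([round(forward(x)) for x, _ in data])  # -> [0, 0, 0, 1]
```

Real networks scale both sides up enormously, but the asymmetry stays: training runs the loop millions of times on big hardware, while the deployed model only ever runs `forward`.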

Since none of those actions are latency sensitive, there's no need for the car to do any heavy lifting beyond encoding a video stream and blasting it off to the servers.

If you would like to better understand what I'm talking about, you can get a neural network to play Super Mario World fairly easily. The download and instructions can be found here, and a video explanation of how it works here.
 