
Tesla AI Day - 2021

I wonder if part of the reason Autopilot has performed worse in poor weather is the windshield wiper intermittently blocking the view. Sure, in this case there was snow as well, but having a quarter of the view blocked didn't help either:

[Attached image: windshield wiper.jpg]


Hopefully vision-only will go fleet-wide before the next winter for us. In the past, Autopilot would suddenly cancel out because radar thought it was blocked even when it wasn't actively snowing.
 
Another question I had that maybe you all can help me understand. They spent a lot of time talking about the simulations and how much effort they are putting into making them photorealistic (ray tracing, etc.). Is that because the way they train with the simulation is basically to feed it into the NN, letting the cameras watch the simulator?
 
Another question I had that maybe you all can help me understand. They spent a lot of time talking about the simulations and how much effort they are putting into making them photorealistic (ray tracing, etc.). Is that because the way they train with the simulation is basically to feed it into the NN, letting the cameras watch the simulator?
I'm sure others will give you a far more detailed answer, but what I heard was that they always use real-life data first, and on certain occasions they augment the NN's training data with information from the simulation.
 
Another question I had that maybe you all can help me understand. They spent a lot of time talking about the simulations and how much effort they are putting into making them photorealistic (ray tracing, etc.). Is that because the way they train with the simulation is basically to feed it into the NN, letting the cameras watch the simulator?

Yes, I think so. And you want the car to perform the same in the simulation as it would in the same scenario in the real world. Otherwise, you can't use the simulation to know how it would actually perform in the real world. So you want the visual input to be as close to the real world as possible so that your perception will respond the same as in the real world. Hope that makes sense.
 
  • Like
Reactions: Big Earl
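To make that concrete, here's a rough Python sketch of the idea (my own illustration, not Tesla's actual pipeline; perception_net is just a stand-in): the simulator's rendered frames go through exactly the same perception code path as real camera frames, so any gap in photorealism shows up directly as a gap in behavior.

```python
import numpy as np

def perception_net(frames: np.ndarray) -> np.ndarray:
    """Stand-in for the vision stack: 8 camera frames in, some features out."""
    # A real system would run the trained neural network here.
    return frames.mean(axis=(1, 2, 3))

# 8 cameras, small frames just for illustration
real_frames = np.random.rand(8, 96, 128, 3)   # captured by the car
sim_frames = np.random.rand(8, 96, 128, 3)    # rendered by the simulator

# The key point: one identical code path for both sources. If the renders
# aren't photorealistic, sim outputs stop predicting real-world outputs.
real_out = perception_net(real_frames)
sim_out = perception_net(sim_frames)
print(real_out.shape, sim_out.shape)
```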
Yes, I think so. And you want the car to perform the same in the simulation as it would in the same scenario in the real world. Otherwise, you can't use the simulation to know how it would actually perform in the real world. So you want the visual input to be as close to the real world as possible so that your perception will respond the same as in the real world. Hope that makes sense.
Thanks. I guess I figured there would be a more direct way to connect the sim to the NN instead of going through the whole visualization stack, but maybe exercising the whole pipeline top to bottom is best.
 
  • Like
Reactions: diplomat33
Thanks. I guess I figured there would be a more direct way to connect the sim to the NN instead of going through the whole visualization stack, but maybe exercising the whole pipeline top to bottom is best.

Yes, it is indeed the best and only way to do it right. They could bypass the initial perception layers since they know what is being shown (they generated it, after all), but that could cause unknown problems. To a large degree, neural nets are black boxes, so if you shortcut things, you literally don't know what real-world effect it is going to have.

And they don't just generate pristine video either. They use a special neural network to add real world noise to the video. Just like a real car bounces around a bit and you get dust on the windshield, etc., they want to incorporate that into the video stream of any simulation. So the simulation video stream will be as noisy and imperfect as a real world video stream.
 
  • Like
Reactions: T@oo and Ruffles
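As a loose sketch of that noise step (an assumption on my part: Tesla said they use a neural network for this, whereas the function below just hand-codes a few simple artifacts), the idea is to degrade the pristine render before it ever reaches the perception network:

```python
import numpy as np

def add_camera_artifacts(frame: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Degrade a clean render with sensor noise, slight jitter, and vignetting."""
    noisy = frame + rng.normal(0.0, 0.02, frame.shape)         # sensor noise
    noisy = np.roll(noisy, int(rng.integers(-2, 3)), axis=1)   # small mount/body jitter
    h, w = frame.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    vignette = 1.0 - 0.3 * (((xx - w / 2) / w) ** 2 + ((yy - h / 2) / h) ** 2)
    return np.clip(noisy * vignette[..., None], 0.0, 1.0)

rng = np.random.default_rng(0)
clean_render = np.random.rand(96, 128, 3)   # stand-in for a ray-traced simulator frame
sim_frame = add_camera_artifacts(clean_render, rng)
print(sim_frame.shape)
```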
I'm certain that the 8 cameras were placed there as a best guess at the time as to what might be needed. They did give Tesla 360° coverage with some overlap. However, as the system has evolved, they have turned out to have large blind spots depending on the position of occluding objects. There are also blind spots close to the car at low heights. This hampers FSD as well as bird's-eye view and self-parking.

Their main problem is a stubborn refusal to admit this. If they gave up and created a new camera system, I'll bet they would move ahead very rapidly. It would, however, mean a huge problem for their existing cars on the road. That is what is wrong with their approach, IMO. It's sad that they are so advanced in methods yet so stuck with the existing hardware.

Where would you put the cameras?

Tesla now has a vision system that stitches together the 8 cameras, so it seems to work fine. No matter what camera system you are using, you'd want to stitch all the cameras together. What problems is Tesla Vision having?
 
  • Like
Reactions: T@oo
It's called V2X (vehicle-to-everything), and I'm 100% a supporter of that kind of tech. I think it will HAVE to be part of the safety system that makes Level 5 actually work. I'll even say this: Level 5 won't have enough nines until that closed feedback loop is, well, closed. Having something other than the car give you a reality check is a must-have in my book.

Watch V2X and see whether vendors embrace it. Maybe other countries will try it first. It really is a key technology, for so many reasons.

Interestingly enough, Tesla IS working on this for The Boring Company. Tesla vehicles in the Las Vegas tunnels will all be talking to a mother-ship controller so that system-wide routing and planning can be done. As usual, Tesla will probably have a good system finished and working while the "standards bodies" are still at the working-paper stage.
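Purely for illustration, here's a hypothetical Python sketch of what a V2X-style hazard broadcast might look like; the message fields and the FleetController class are invented here and don't correspond to any actual V2X standard (DSRC/C-V2X) or to Tesla's or The Boring Company's real system:

```python
from dataclasses import dataclass
import time

@dataclass
class HazardReport:
    sender_id: str
    lat: float
    lon: float
    hazard_type: str      # e.g. "fallen_cyclist", "stalled_vehicle"
    timestamp: float

class FleetController:
    """Toy 'mother ship': collects hazard reports and relays them to nearby cars."""
    def __init__(self):
        self.recent = []

    def receive(self, report):
        self.recent.append(report)

    def hazards_near(self, lat, lon, radius_deg=0.01):
        return [r for r in self.recent
                if abs(r.lat - lat) < radius_deg and abs(r.lon - lon) < radius_deg]

controller = FleetController()
controller.receive(HazardReport("car_42", 36.1147, -115.1728, "fallen_cyclist", time.time()))
print(controller.hazards_near(36.115, -115.173))
```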
 
The presentations by Karpathy and his team were fantastic and super informative. Maybe AI Day should have stopped after Dojo. End on a high note, as George Costanza would say. The Tesla Bot, especially the intro with the guy dancing in spandex, was a jump-the-shark moment, IMO. It was a bit too far. The Tesla Bot sounds like classic Elon hype. The reality is probably that it will take longer than expected to actually work as advertised, as we've seen with FSD.

Yes, another thing for Tesla haters to hang their hat on. But frankly, I'm getting phlegmatic about it all. Tesla haters are always going to hate Tesla no matter what it does. The flip side is that the Tesla Bot keeps Tesla looking cool to young people and, most importantly of all, makes it a cool place for AI and mechanical engineers to want to work. In the end, that's Tesla's ultimate secret sauce: its very talented employees.
 
I'm likely missing something obvious, but I wonder about the NN's focus on the short time horizon of an individual car. Isn't there a great opportunity for each Tesla to at least share its surrounding 3D 'image' with nearby Teslas?

For example, one Tesla could tell an oncoming Tesla "Watch for and avoid the child who's fallen off their bicycle at location x + y".

While that's an interesting long-term idea, be aware that the NN absolutely doesn't focus on a single car's short time horizon. First, it "remembers" various world states going back many seconds (15?) and many dozens (hundreds?) of meters travelled. It uses that past state information to inform both current perception and path planning.

And then the system applies path planning predictions to every single autonomous agent it perceives. So it tries to guess what each other car and pedestrian around it will do.
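Paraphrasing that in code (a loose sketch of the idea from the presentation, not Tesla's architecture; the queue length, the fusion function, and the constant-velocity prediction are all my own simplifications):

```python
from collections import deque
import numpy as np

FEATURE_DIM = 256
feature_queue = deque(maxlen=15)   # cache of recent feature states (time/space indexed in the real thing)

def fuse_history(queue):
    """Stand-in for the video/temporal module: fuse cached states into one vector."""
    return np.mean(np.stack(list(queue)), axis=0)

def predict_agent_paths(fused, agents):
    """Stand-in for per-agent prediction. A real system would condition on `fused`;
    here each agent is just extrapolated at constant velocity."""
    return {a["id"]: [(a["x"] + a["vx"] * t, a["y"] + a["vy"] * t) for t in (0.5, 1.0, 2.0)]
            for a in agents}

# Each frame: push new features, fuse the history, predict every agent around the car.
for _ in range(20):
    feature_queue.append(np.random.rand(FEATURE_DIM))

agents = [{"id": "car_1", "x": 10.0, "y": 3.0, "vx": -2.0, "vy": 0.0},
          {"id": "ped_1", "x": 5.0, "y": -1.5, "vx": 0.0, "vy": 1.0}]
print(predict_agent_paths(fuse_history(feature_queue), agents))
```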
 
I'm surprised they did not have a basic prototype on stage. It would not have to be fully functional, but at least something real. "Pajama Man" was indeed cringeworthy, and it makes the Tesla Bot look like vaporware (which at this stage I guess it kind of is).

Well, they had a mockup of what it'll look like on stage. I would bet that's as far as they've gotten. I'm looking forward to v1.0 in five years or so.
 
  • Like
Reactions: diplomat33
Is the Monte Carlo search part really a neural network? Or is it just able to run on the AI chip?

Not NN. Path planning and car control are still done via complex C++ code. While they want to get to end-to-end car control via the NN, it sounds like that's still a future project. It sounded to me like they are so close to being able to release vision FSD in the US that they are gunning for that before they explore other things. Also, having Dojo online will let them do more, faster.
 
  • Like
Reactions: emmz0r
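To give a feel for the sample-and-score idea, here's a heavily reduced sketch (a random-shooting stand-in rather than a true Monte Carlo tree search, and certainly not Tesla's planner; the kinematic model, cost function, and constants are all made up):

```python
import math
import random

def rollout(state, actions, dt=0.25):
    """Simple kinematic rollout: state = (x, y, heading, speed)."""
    x, y, th, v = state
    for steer, accel in actions:
        v = max(0.0, v + accel * dt)
        th += steer * dt
        x += v * math.cos(th) * dt
        y += v * math.sin(th) * dt
    return (x, y, th, v)

def cost(final_state, goal=(30.0, 0.0)):
    """Made-up cost: get near the goal point while keeping roughly 12 m/s."""
    gx, gy = goal
    x, y, _, v = final_state
    return math.hypot(x - gx, y - gy) + 0.1 * abs(v - 12.0)

def plan(start, n_samples=500, horizon=8):
    """Sample candidate action sequences and keep the cheapest one."""
    best, best_cost = None, float("inf")
    for _ in range(n_samples):
        actions = [(random.uniform(-0.3, 0.3), random.uniform(-2.0, 2.0))
                   for _ in range(horizon)]
        c = cost(rollout(start, actions))
        if c < best_cost:
            best, best_cost = actions, c
    return best, best_cost

print(plan((0.0, 0.0, 0.0, 10.0))[1])
```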
Where would you put the cameras?

Tesla now has a vision system that stitches together the 8 cameras, so it seems to work fine. No matter what camera system you are using, you'd want to stitch all the cameras together. What problems is Tesla Vision having?
Even if you fully stitched the cameras on a car sitting all by itself in the middle of an open lot, there are blind spots down low and up above. Then there are situations where the car can't see around other cars, walls, and bushes. Having 360° vision in an open lot does not mean you can see past something that is right beside you. More sensors are needed in different positions. The ultrasonics have coverage difficulties as well.

The cars are hitting objects in the road and in driveways, and FSD appears to proceed unsafely in situations where its vision is blocked. Put in enough sensors to drive safely; how is up to them.
 
  • Disagree
Reactions: T@oo
Had a 15 kW heat sink, if I remember correctly.

Right, 15 kW, which is insane, BTW. A typical high-end home gaming rig would be hard-pressed to reject 0.5 kW.

Like the AI chip presentation, this was not geared so much toward the general public, who might have thought AI Day was going to be like a vehicle announcement; it was instead a hiring presentation. I could see some of the public and parts of the press not finding it what they expected and maybe being disappointed. I thought it went well and liked seeing the progress. I guess the Tesla Bot was more of a surprise.

I thought the presentation was great. I am going to re-listen to it at 1/2 speed, lol, and make a summary.
 
It sounded to me like they are so close to being able to release vision FSD in the US that they are gunning for that before they explore other things.

Based on what I've seen at AI Day, it seems Tesla is trying out all sorts of state-of-the-art and in-house approaches to achieve FSD. They can afford to do this because they've set up a vast testing infrastructure, so any good idea can be explored and tested.

That's why they were so excited to show their neural rendering of the simulation, which they had only achieved the night prior.
 
  • Like
Reactions: rxlawdude
When they were showing how they stitch the cameras, did you get a sense of whether they stitch using the objects in view, or stitch the cameras first and then determine the objects?

In other words, if you know exactly how two cameras are placed you can overlap them with the same transformation every time. Then you can look at the panorama. But you would need to rely on the cameras being fixed in position and fully calibrated to each other.

Or, if you are just matching two adjacent photos, you have to stitch on objects until you get a panorama.

In the first case, if there were an unknown object straddling multiple cameras, you'd still be able to see it as-is, because the stitching process would just merge the views regardless of what is there. In the second case, if the object were flat and featureless, you'd have a difficult time determining how to stitch it.

It's amazing how they can determine depth though. That's neat stuff.
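A small sketch of the "known extrinsics" case (my own illustration with made-up intrinsics and poses, assuming ideal pinhole cameras): with calibrated poses, projecting any known 3D point into each camera is a fixed transform, so no per-frame feature matching is needed. The catch is that for arbitrary scene content you still need depth to know where each pixel's 3D point actually sits, which is why the depth estimation matters.

```python
import numpy as np

def project(point_3d, R, t, K):
    """Project a 3D world point into pixel coordinates for one camera."""
    p_cam = R @ point_3d + t          # world -> camera frame (known extrinsics)
    p_img = K @ p_cam                 # camera frame -> image plane (known intrinsics)
    return p_img[:2] / p_img[2]

# Made-up intrinsics shared by both cameras
K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 480.0],
              [0.0, 0.0, 1.0]])

# Two cameras with known (calibrated) poses: same orientation, 0.5 m apart laterally
R = np.eye(3)
t_left, t_right = np.array([0.25, 0.0, 0.0]), np.array([-0.25, 0.0, 0.0])

point_3d = np.array([1.0, 0.5, 10.0])   # a point 10 m ahead of the car
print(project(point_3d, R, t_left, K))   # fixed mapping, no feature matching needed
print(project(point_3d, R, t_right, K))
```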
 
But Tesla was faced with the problem of different cameras seeing the same large object and thinking they were two different objects.
Two different objects? It was one longer object, per my example above. Based on your comments, it is unclear how closely you watched the presentation.

 
  • Helpful
Reactions: diplomat33
Two different objects? It was one longer object, per my example above. Based on your comments, it is unclear how closely you watched the presentation.


I watched it closely. I was thinking of my experience with AP, where objects would flicker as they passed from one camera to another. But thanks for the correction.
 
In other words, if you know exactly how two cameras are placed you can overlap them with the same transformation every time. Then you can look at the panorama. But you would need to rely on the cameras being fixed in position and fully calibrated to each other.
That's only true if you already have a depth map of the scene. It's clear when you think about an object between the two cameras: each camera would see a different side of it.
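Quick arithmetic to back that up (standard stereo geometry, nothing Tesla-specific; the focal length and baseline below are made-up numbers): the pixel shift between two views depends on depth, so a single fixed stitching transform can only be exactly right for one assumed distance.

```python
# Disparity between two views: shift (px) = focal_px * baseline_m / depth_m
focal_px = 800.0      # assumed focal length in pixels
baseline_m = 0.5      # assumed spacing between the two cameras

for depth_m in (2.0, 10.0, 50.0):
    disparity_px = focal_px * baseline_m / depth_m
    print(f"object at {depth_m:>4} m -> {disparity_px:.1f} px shift between cameras")
```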
 
If Tesla Vision is 10 m or more from a large, mostly featureless continuous wall, or from a large Gaussian-noise-like surface, how can it determine distance to an object that resists stitching when the ultrasonics can't be used?

For example, if you drove down a gravel alley beside a grey building at night. Or a mirrored building with a wet asphalt road. A road covered with leaves, with no curbs visible. A flat, new-snow-covered road next to snow-covered cars and snow piles.

Can vision do everything at 70 mph without any range-finding sensors?