
Neural Networks

In the world of YouTube, Instagram, and blogs, is it really a challenge to get behind a microphone?

And no, I don't think you should mislead a demographic by spreading false narratives and information just to pander to what that demographic wants to hear. Jimmy's podcast is a Tesla fan's wetest dream! (Yes, I know "wetest" is not a word, but in this case it is!)

Propagating that "neural network" means a camera-only approach.
Propagating that only Tesla uses NNs.
Propagating wrong info about how NNs are used.

That's some mighty reality distortion field. I absolutely think he did more harm than good.
You of all people know that three years ago the buzzing threads around here were "I can't decide if I want my Model 3 to deliver itself; 1,000 miles is too much." Don't you miss those Tesla mythology days?

9 out of 10 FSD posts on TMC, /r/teslamotors, and Electrek were "Tesla, unlike others, is using neural networks."
It took every ounce of energy in my body to drain the swamp. Now the narrative is "Tesla is using a big, complex neural network, unlike the others' simple NNs," which is better than the former, although still hilariously wrong, since their networks can't even detect general objects, debris, traffic lights, traffic signs, overhead road signs, road markings, barriers/guardrails, curbs, or cones, and are hilariously inaccurate and inefficient.

But now, all of a sudden, Jimmy wants to plunge us back into the myth age.
This is why, from now on, I will keep my TV channel stuck on verygreen; I goofed by changing it.
At least he won't try to waterboard me with Kool-Aid.

Eww... all the mouthwash in the world still can't get the taste of Kool-Aid out of my mouth, even after just 30 minutes of forced drinking.

Heck, give me the v9 model and I will provide you with the most unbiased, in-depth analysis.
 

Jimmy, can you provide Blader the metadata you are looking at? Would be cool to see his interpretation of it added to the mix.
 
Yeah, great idea. Reminds me of the old Star Trek movie (The Voyage Home) where Chekov walks around San Francisco asking pedestrians in his Russian accent where the "nuclear wessels" are... I provided a press of the Ignore button. I suggest everyone else do the same.
 
If you prefer to be in an echo chamber, sure, hit that Ignore button on Blader. I would rather see if he can back up his assertions with technical analysis a la Jimmy's posts.

I have learned literally nothing from him, and I'm skeptical (though less so by the day) of Tesla's dependence on vision + radar, so I'm all ears for alternative solutions. All noise.

As a layperson I have learned a tremendous amount from jimmy and verygreen on this thread, amongst others.
 

Agreed, so far I don't see any useful information. I am curious whether he could add something if he had access to the software similar to what Jimmy and verygreen have. It would be interesting to see if his position changes at all after getting hands-on. I'm skeptical of his ability to provide "the most unbiased, in-depth analysis," but I think it would be fair to give him a chance at it, and it could end up very illuminating. Worst case, we just get a troll post.
 
I have a NN question. The answer is probably not specific to Tesla’s NN, so anyone who knows how this works can chime in.

What form are the outputs of the NN in? In particular, are the bounding boxes (of cars and pedestrians) given as the XY coordinates of two corners? There can be a variable number of bounding boxes; how is that handled? And how is the type of object (car, truck, pedestrian) given?

Is the green driving path simply given as a rectangular pixel grid with the same resolution as the input?
 


As with many things in life, the answer is "it depends." Some networks will output just the category of an object in an image, some will output bounding boxes, and some will output a picture or overlay. Those are just a few examples. The person who develops the NN decides what they want it to output.

For example, I'm working on one right now where the output is a list of pixels in a picture that are part of a certain object. That list is then turned into a graphic overlay that highlights/masks those areas on the original image.
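To make that last case concrete, here is a minimal sketch (assuming numpy, a 480x640 frame, and a made-up pixel list; the actual output format is whatever the developer chose):

import numpy as np

# Hypothetical output: a list of (row, col) coordinates for the pixels
# the network assigned to the object, for a 480x640 input frame.
object_pixels = [(100, 200), (100, 201), (101, 200)]   # ...and so on

# Turn the pixel list into a boolean mask; this mask is what later gets
# drawn as the highlight/overlay on the original image.
mask = np.zeros((480, 640), dtype=bool)
rows, cols = zip(*object_pixels)
mask[list(rows), list(cols)] = True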
 
I’m asking what form the output is in for a NN that does output the things I mentioned.
 
Does anyone have experience with or knowledge of using GPUs on AWS? I recently read that Google Brain used ~45,000 GPU hours of AutoML/Neural Architecture Search to create a neural network architecture called NASNet. I looked up AWS pricing for GPUs and it looks like it's 40 cents per hour. So, 45,000 GPU hours would only be $18,000.

Even for a small startup, that seems reasonable. For a big company, you could afford to pay much more. If you wanted to use 1000x as much computation as Google Brain, if you wanted to use 45 million GPU hours, it would cost $18 million. For Tesla, that feels like a drop in the bucket.

Does that sound right? This cost seems ridiculously low.

Interesting also when you look at Efficient Neural Architecture Search (ENAS), which is attempting to bring the computation cost of Neural Architecture Search (NAS) down by 1000x. If ENAS can achieve results as good as the NAS that Google Brain used, then with 45 million GPU hours you could do 1,000,000x as much search as Google Brain. Crazy.

Say Tesla really wanted to go nuts with AutoML and spend $180 million. That wouldn't be unfeasible; Tesla could still stay profitable and cash flow positive if it spent another $180 million on R&D in one quarter. With regular NAS, it could do 10,000x as much search as it took to find NASNet. With working ENAS, it could do 10,000,000x more.

Unless I'm getting the actual AWS pricing wrong. So please let me know!
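To make the arithmetic easy to check, here it is spelled out (just a sketch of my own numbers; the $0.40/GPU-hour figure is the assumption that might be wrong):

# Rough cost arithmetic for Neural Architecture Search on rented GPUs.
# Assumes the $0.40/GPU-hour figure above, which may not match actual
# AWS pricing for ML-class instances.
price_per_gpu_hour = 0.40            # USD, assumed
nasnet_gpu_hours = 45_000            # roughly what Google Brain reportedly used

print(nasnet_gpu_hours * price_per_gpu_hour)            # 18000.0   -> ~$18k
print(1_000 * nasnet_gpu_hours * price_per_gpu_hour)    # 18000000.0 -> ~$18M

# If ENAS really cuts NAS compute by ~1000x, the same ~$18M budget would
# buy roughly 1,000 * 1,000 = 1,000,000x the effective search.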

Incidentally, this is why I want mad scientist Elon to stay in control of Tesla. Or at least for Tesla to have a Board that gives Elon the freedom to run the company. I have the feeling that many of the Boards of public companies (outside of the tech world, at least) would lack the imagination to approve of this kind of spending on a mad science project. Yeah, Elon is crazy, and I goddamn hope he stays that way.
 
I’m asking what form the output is in for a NN that does output the things I mentioned.

An example of an algorithm that does bounding boxes is YOLO. You can see "What is object detection? Introduction to YOLO algorithm" from Appsilon Data Science for an overview of it. The article talks about x, y, height, width, classification, and confidence level. An example of the output could be something like (0.85, 310, 240, 50, 100, 4), which means "85% confident that a car (if 4 means car) is at coordinate 310,240 with a height of 50 pixels and a width of 100 pixels". Actually the output would be multiple entries like that - not just one.
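As a minimal sketch of how you might consume detections in that shape (the tuple layout and class IDs here are made up to mirror the example above; real YOLO emits a fixed-size grid of candidates that gets filtered down):

# Each detection: (confidence, x, y, height, width, class_id),
# mirroring the (0.85, 310, 240, 50, 100, 4) example above.
CLASS_NAMES = {2: "pedestrian", 3: "truck", 4: "car"}    # hypothetical mapping

detections = [
    (0.85, 310, 240, 50, 100, 4),
    (0.40, 500, 180, 80, 60, 2),
    (0.91, 120, 300, 45, 110, 4),
]

# A variable number of boxes is handled by always emitting a fixed-size
# set of candidates and keeping only those above a confidence threshold
# (plus non-max suppression to drop near-duplicates, omitted here).
for conf, x, y, h, w, cls in detections:
    if conf < 0.5:
        continue
    print(f"{CLASS_NAMES.get(cls, 'unknown')} at ({x},{y}), {w}x{h}px, {conf:.0%} confident")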

A page that's a bit more technical but shows an example of how the driveable area can be highlighted is from the Nvidia site: Fast INT8 Inference for Autonomous Vehicles with TensorRT 3 | NVIDIA Developer Blog. Figure 3 shows the 19 classes they used and Figure 5 shows them overlaid on a sample scene - dark purple for road, yellow for traffic signs, etc. That example is like the one jimmy_d described in the Tesla NN, where they started with an existing image model (VGG16) and built onto it. If I recall correctly, jimmy_d said Tesla's is based on Inception instead of VGG16, but it's the same basic idea. So the output is a pixel-by-pixel mapping of what each pixel is. The Nvidia example is a 512×1024 image, so the output could be a 512×1024 array where each entry is the number (1-19) of the class for that pixel.

When you say 'green driving path' I'm assuming you mean from the verygreen/damianXVI clips of driving in Paris, etc. So that would be just taking the dark purple section from the Nvidia example and overlaying it (i.e. alpha compositing so you can see through it) on the original image but with green coloring.
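A rough sketch of that overlay step, assuming a per-pixel class map like the Nvidia example (the road class ID here is made up):

import numpy as np

ROAD_CLASS = 1          # hypothetical ID for "drivable road" in the class list
ALPHA = 0.4             # how see-through the green tint is

def overlay_drivable_path(image, class_map):
    # image: HxWx3 uint8 camera frame; class_map: HxW int array of class IDs.
    out = image.astype(np.float32)
    road = class_map == ROAD_CLASS                 # boolean mask of road pixels
    green = np.array([0.0, 255.0, 0.0])
    out[road] = (1 - ALPHA) * out[road] + ALPHA * green
    return out.astype(np.uint8)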

I have no insight into what Tesla is actually doing. Those are just simple examples of what *could* be done. Maybe jimmy_d or verygreen have actual output samples from Tesla's and can narrow it down more.
 
Does anyone have experience with or knowledge of using GPUs on AWS? I recently read that Google Brain used ~45,000 GPU hours of AutoML/Neural Architecture Search to create a neural network architecture called NASNet. I looked up AWS pricing for GPUs and it looks like it's 40 cents per hour. So, 45,000 GPU hours would only be $18,000.

Even for a small startup, that seems reasonable. For a big company, you could afford to pay much more. If you wanted to use 1000x as much computation as Google Brain, if you wanted to use 45 million GPU hours, it would cost $18 million. For Tesla, that feels like a drop in the bucket.

Does that sound right? This cost seems ridiculously low.

Interesting also when you look at Efficient Neural Architecture Search (ENAS), which is attempting to bring the computation cost of Neural Architecture Search (NAS) down by 1000x. If ENAS can achieve results as good as the NAS that Google Brain used, then with 45 million GPU hours you could do 1,000,000x as much search as Google Brain.

Look at Elastic GPUs - Amazon Web Services, in the section that says "How to Choose". For machine learning they recommend P3 instances, which use NVIDIA Tesla V100 GPUs. The Amazon EC2 P3 – Ideal for Machine Learning and HPC - AWS page lists $1.23 to $24.48 per hour.

NASNet used Nvidia P100s. The P100 is a year older than the V100.
 

Thanks! So helpful.

Funny that there is a chip called the Nvidia Tesla P100. Did we run out of names?

Okay, here's another way to do the math. Google Brain used around 500 Nvidia P100s, the most expensive version of which costs around $10,000. So 500 of them cost $5 million.

If I'm getting my facts right, Google Brain completed its Neural Architecture Search within 4 days. If you were willing to wait 28 days (and why not? what's the rush?), you would only need to buy 1/7th as many GPUs, so 72 would be enough. If you bought 720 GPUs for $7.2 million, you could do 10x as much computation as Google in 28 days. If you bought 7,200 for $72 million, you could do 100x as much.

This is conservative because I rounded some numbers up, and used the price of the older chip when it first came out (I think).

Alternatively, if you use AWS pricing and (again, conservatively) treat one P100-hour as equal to one V100-hour, the cost of 4.5 million GPU hours (100x what Google did) at $3.06/hour would be $13.8 million.
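Spelling that out so it's easy to check (these are just my round numbers from above, not anything official):

# Buy GPUs outright vs. rent from AWS, using the round numbers above.
# Assumptions: 500 P100s x 4 days ~= the NASNet search, $10,000 per P100,
# and $3.06/hour for a V100-hour treated as roughly one P100-hour.
baseline_gpus, baseline_days = 500, 4
gpus_for_28_days = baseline_gpus * baseline_days / 28    # ~71.4, round up to 72

p100_price = 10_000
print(720 * p100_price)        # 7200000  -> 10x the search for $7.2M
print(7_200 * p100_price)      # 72000000 -> 100x the search for $72M

aws_v100_hour = 3.06
print(100 * 45_000 * aws_v100_hour)   # 13770000 -> ~$13.8M for 100x on AWS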

Come to think of it, could you even use AWS for this? Would you need one big custom instance with like 100 GPUs? Can you split up the NAS task between multiple instances? Is this all solved using virtual machines?
 
Agreed, so far I don't see any useful information. I am curious whether he could add something if he had access to the software similar to what Jimmy and verygreen have. It would be interesting to see if his position changes at all after getting hands-on. I'm skeptical of his ability to provide "the most unbiased, in-depth analysis," but I think it would be fair to give him a chance at it, and it could end up very illuminating. Worst case, we just get a troll post.

I have posted lots of helpful input going back to the beginning of this thread and the other "camera" thread.
But here's the thing. I have requested the model many times before and no one has given it to me.
I have also requested the raw video outputs and the vector output that verygreen has, and I haven't gotten that either.

With information like that, you can actually do an in-depth analysis and come to an informed conclusion about the number and percentage of false positives and false negatives, the accuracy of the network compared to a human, and things like how easy it would be and how much data you would need to build a network with the same level of accuracy as Tesla's network, using the same raw video as the test data, of course.
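To be concrete about the kind of tally I mean, here is a minimal sketch, assuming you've already matched the network's per-frame detections against hand-labeled ground truth:

def score_clip(frames):
    # frames: list of (num_ground_truth, num_detected, num_correctly_matched) per frame.
    tp = sum(matched for _, _, matched in frames)
    fp = sum(detected - matched for _, detected, matched in frames)
    fn = sum(truth - matched for truth, _, matched in frames)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"false_positives": fp, "false_negatives": fn,
            "precision": precision, "recall": recall}

# Run the same clips past human labelers (or another stack) and compare
# the numbers side by side.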

There are a lot of things you could do to verify the network.
But an unbiased man is dangerous around these quarters, so it is what it is.
 
You lost me at unbiased.