Welcome to Tesla Motors Club
Discuss Tesla's Model S, Model 3, Model X, Model Y, Cybertruck, Roadster and More.
You are wrong. It has been available to key customers in beta status since summer. You're welcome!
So let's look at this from a business standpoint. Tesla knew 2 years ago or so that they needed massive compute power to train the NNs. What were their options?

1. Try to use off-the-shelf hardware (servers/cloud/GPUs etc) and deploy massive numbers of these.
2. Find someone who has built the kind of custom hardware they need.
3. Build it in-house.

#1 isn't efficient .. CPUs can do it, of course, but at huge power expense (which means $$$$ and time). For #2 there are limited options; pretty much the only choice is Google. #3 means dedicating resources to the design. No doubt there was some "ooh, we want to build that" from the design team (after all, they already had chip-design experience from HW3), but in-house designs always overrun, and have a huge long-term maintenance cost.

So, why didn't they use Google? I think it's pretty obvious. They would be taking (indirectly) a dependency on their #1 major competitor .. Waymo. From a business perspective, this is high-risk. Also, the Tesla mind-set and company culture is vertical integration .. they want to own the entire stack, so they can fine-tune and control it going forward as a competitive edge over less integrated competitors.

I don't know if this will turn out to be a good idea or not, but I'd put money on something like the above being part of their decision-making process.
 
Also, there's no actual evidence Google's solution is "better" (let alone cheaper) for the thing Tesla is actually doing here.

See again: nearly every field in the comparison chart for Google's v4 was blank.

There are CAD GPUs that beat gaming ones on certain benchmarks but run like garbage at specific tasks the gaming GPUs excel at, and vice versa.

Likewise, Google's solution might work better for SOME tasks, but not specifically what Tesla needs, where Dojo outperforms it.

The fact that Tesla is NOT trying to use Google's v4 suggests that is the case too. Given they have mountains of cash and a stated need for more compute for specific tasks, they'd have little reason not to be using it the second it's available, at least until Dojo is ready to go.
 
Karpathy said they are highly protective of their data. Suspect they'd rather not upload their data to Google with a connection to Waymo. Cloud costs will likely be higher than a custom solution over the long run. There is something to be said about controlling your own destiny. I doubt Tesla ever seriously considered a Google Cloud solution.
 
Karpathy said they are highly protective of their data. Suspect they'd rather not upload their data to Google with a connection to Waymo. Cloud costs will likely be higher than a custom solution over the long run. There is something to be said about controlling your own destiny. I doubt Tesla ever seriously considered a Google Cloud solution.
Until Dojo is ready, I'm sure they will be using large clusters of off-the-shelf GPUs in the cloud.
 
Off the shelf GPUs? Yes.

In the cloud? Nope.

They physically built it themselves in-house.

Ah, hadn't seen this. Generally speaking, setting up on-prem datacenters, especially at gigantic scale, is cost-prohibitive. Many large AI companies do ML training in the cloud (on AWS or GCP). But I take it back if Tesla has already publicly stated that they have created everything in-house :D
 
Also, there's no actual evidence Google's solution is "better" (let alone cheaper) for the thing Tesla is actually doing here.

See again: nearly every field in the comparison chart for Google's v4 was blank.

There are CAD GPUs that beat gaming ones on certain benchmarks but run like garbage at specific tasks the gaming GPUs excel at, and vice versa.

Likewise, Google's solution might work better for SOME tasks, but not specifically what Tesla needs, where Dojo outperforms it.

The fact that Tesla is NOT trying to use Google's v4 suggests that is the case too. Given they have mountains of cash and a stated need for more compute for specific tasks, they'd have little reason not to be using it the second it's available, at least until Dojo is ready to go.

This is a myth that has been spreading in the Tesla community. Not surprised you picked it up and ran with it.
Dojo is completely general purpose and not specialized. No one is stupid enough to build a specialized training system, as it would be DOA because of how fast NN architecture is advancing. Any ML expert would tell you that.
We know the type of NN FSD Beta runs. This isn't some hidden secret. It's not some special NN architecture; it's run-of-the-mill.

The "Tesla is doing something special" claim is one among the thousands of other Tesla myths.

Finally, WE actually have MLPerf benchmarks from Google's TPU v4; we have ZERO from Dojo and won't for many years. In fact, Dojo as presented doesn't even EXIST, and won't for several years, if ever.
 
This is a myth that has been spreading in the Tesla community. Not surprised you picked it up and ran with it.
Dojo is completely general purpose and not specialized.

The myth is that you ever make an honest post.

I cited actual real world examples of two GPUs that are specialized.

They can both execute the same tasks, but one is MUCH better at a specific class of tasks than more generalized hardware.

The other is better at a DIFFERENT specific class of tasks than more generalized hardware.


So the idea that Tesla would bother to custom-design a GENERAL PURPOSE chip to primarily handle a SPECIFIC TASK (processing computer vision data) is nonsensical.

In fact the CURRENT GPU cluster Tesla is using is a great example of that. It's a ton of Nvidia A100s.

Which are "GPUs," but they'd be terrible at video games compared to something designed for that task. Just as a gaming GPU would be much worse at the specific task they're using the A100s for.
 
This is a myth that has been spreading in the Tesla community. Not surprised you picked it up and ran with it.
Dojo is completely general purpose and not specialized. No one is stupid enough to build a specialized training system, as it would be DOA because of how fast NN architecture is advancing. Any ML expert would tell you that.
We know the type of NN FSD Beta runs. This isn't some hidden secret. It's not some special NN architecture; it's run-of-the-mill.

The "Tesla is doing something special" claim is one among the thousands of other Tesla myths.

Finally, WE actually have MLPerf benchmarks from Google's TPU v4; we have ZERO from Dojo and won't for many years. In fact, Dojo as presented doesn't even EXIST, and won't for several years, if ever.
It’s always interesting to come across writing like this. A study in intellectual superiority. Thank you for sharing.
 
... Dojo is completely general purpose and not specialized. ...
Are Dojo's 8- and 16-bit floating-point numbers specialized?
Quote:
This standard specifies Tesla arithmetic formats and methods for the new 8-bit and 16-bit binary floating-point arithmetic in computer programming environments for deep learning neural network training.
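To make that concrete, here's a minimal Python sketch of decoding one hypothetical 8-bit layout (1 sign bit, 4 exponent bits, 3 mantissa bits, an E4M3-style split). The bit split and the bias of 7 are illustrative assumptions, not taken from Tesla's actual standard, which makes these parameters configurable.

```python
def decode_fp8_e4m3(byte):
    """Decode an 8-bit float laid out as 1 sign / 4 exponent / 3 mantissa bits.

    An exponent bias of 7 is assumed here for illustration; a configurable
    format like CFloat8 lets software choose the bias.
    """
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> 3) & 0xF        # 4 exponent bits
    mant = byte & 0x7              # 3 mantissa bits
    bias = 7
    if exp == 0:                   # subnormal: no implicit leading 1
        return sign * (mant / 8) * 2.0 ** (1 - bias)
    return sign * (1 + mant / 8) * 2.0 ** (exp - bias)

# 0x38 -> sign 0, exponent 7, mantissa 0 -> 1.0 under this layout
print(decode_fp8_e4m3(0x38))
```

With only 256 possible values, the useful trick is picking where those values sit on the number line, which is exactly what making the format configurable buys you.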
 
The myth is that you ever make an honest post.

I cited actual real world examples of two GPUs that are specialized.

They can both execute the same tasks, but one is MUCH better at a specific class of tasks than more generalized hardware.

The other is better at a DIFFERENT specific class of tasks than more generalized hardware.


So the idea that Tesla would bother to custom-design a GENERAL PURPOSE chip to primarily handle a SPECIFIC TASK (processing computer vision data) is nonsensical.

In fact the CURRENT GPU cluster Tesla is using is a great example of that. It's a ton of Nvidia A100s.

Which are "GPUs," but they'd be terrible at video games compared to something designed for that task. Just as a gaming GPU would be much worse at the specific task they're using the A100s for.

It's not the same. When chip designers talk about a specialized NN training system, they are talking about a system where the NN architecture is built into the hardware. If it's not doing that, then it's not specialized, period.
 
It's not the same. When chip designers talk about a specialized NN training system, they are talking about a system where the NN architecture is built into the hardware. If it's not doing that, then it's not specialized, period.
No way ... things can be "specialized" in the sense that they are aimed at specific vertical applications but are still flexible enough to adapt to differing needs .. for example, a DSP for audio processing etc.
 
Yeah you've got a guy who has no idea WTF he's talking about (again) being caught making that obvious (again) and now trying to move goalposts about what "specialized" and "generalized" hardware actually mean to make it less obvious (again).

At least he's consistent, even if it's consistently wrong :)

BTW, citing DSPs is an excellent additional example of the actual meanings. They can do LOTS of jobs generally, but you can still absolutely build ones that do specific things better than others (and most do so)....
 
At A.I. Day, a reporter asked the hardware design lead if they could patent some of the stuff. He replied, "I'm not sure you can patent a linear algebra processor," implying it is generalized. And yes, you can have generalized and specialized together, although I agree with bladerskb that it is much more general than specialized.

Karpathy tweet about how neural network architectures are becoming more generalized:
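For what "linear algebra processor" means in practice: the dominant work in NN training is dense matrix multiply, i.e. enormous numbers of multiply-accumulate operations. A toy pure-Python sketch of that core operation (real hardware runs this same pattern across thousands of parallel units):

```python
def matmul(a, b):
    """Naive dense matrix multiply: the multiply-accumulate pattern that
    NN training hardware (GPUs, TPUs, Dojo) is built to run at scale."""
    rows, inner, cols = len(a), len(b), len(b[0])
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(inner):
            aik = a[i][k]
            for j in range(cols):
                out[i][j] += aik * b[k][j]  # the multiply-accumulate step
    return out

# A dense layer's forward pass is essentially activations @ weights.
print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19.0, 22.0], [43.0, 50.0]]
```

Specializing hardware for this one pattern while keeping it programmable is why "generalized vs. specialized" is a spectrum, not a binary.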
 
At A.I. Day, a reporter asked the hardware design lead if they could patent some of the stuff. He replied, "I'm not sure you can patent a linear algebra processor," implying it is generalized.

That doesn't imply that at all. It implies you can't patent math.

Tesla is using a couple of specific types of math, and thus designed their processor to SPECIALIZE in those operations.

There are plenty of processors out there, as multiple folks have noted now, that while they CAN run a lot of generalized stuff, are SPECIALIZED to run specific things much, much better than more generic/generalized silicon.

Same thing here.

And it's not like Tesla doesn't already HAVE some patents about specificity either:


According to the patent publication, an embodiment of systems and methods include techniques and systems that are specifically described to determine neural network configurations, which are adapted to a specific platform


But if you want something Dojo-specific that Tesla is creating, here ya go:

Tesla extended precision support by introducing Configurable Float8 (CFloat8), an eight-bit floating-point format. This format reduces the memory storage and bandwidth needed for the weights, activations, and gradient values essential for training increasingly larger networks.


Tesla published a 9 page whitepaper on this (mentioned in link above) and more on it here:

It mentions another standard Google created to help with NN/AI stuff previously, and how Tesla is now doing a couple of their own to take this further.
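The storage/bandwidth claim in that CFloat8 description is simple arithmetic. A back-of-the-envelope sketch (the one-billion-parameter count is an arbitrary illustration, not a Tesla figure):

```python
# Storage needed for network parameters at different precisions.
# The parameter count here is made up purely for illustration.
params = 1_000_000_000
bytes_fp32 = params * 4   # float32: 4 bytes per value
bytes_fp16 = params * 2   # float16 / bfloat16: 2 bytes per value
bytes_fp8 = params * 1    # 8-bit formats like CFloat8: 1 byte per value

print(bytes_fp32 // bytes_fp8)   # 4x less storage (and bandwidth) than fp32
print(bytes_fp32 // bytes_fp16)  # 2x less than fp16
```

Since training throughput at this scale is often bandwidth-bound, halving or quartering the bytes moved per value translates fairly directly into speed.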

Dojo is HW SPECIALIZED for these operations.

That doesn't mean it couldn't run other stuff of course. Just that it's better than anything else at those SPECIALIZED tasks.


Google's v4 may well be faster at OTHER tasks, of course. But why would Tesla care, since those aren't the ones they need to be faster at?