
Dojo discussion

I had a very different takeaway from the discussion, but admittedly I could have misunderstood him. Rather, Jim seemed to imply that Dojo was meant as a more general solution, based on a blue-sky approach to what was possible (i.e., that Dojo would have more than one customer). He was involved at the early stages, but he does not know what Dojo is at this point because of multiple pivots.

Also, I don't believe that he said Lattner is involved in Tenstorrent, Jim's new outfit. Rather, he said that Jim put on a conference last year that both Lattner and Karpathy were involved in.

It is up for debate. My sentiment is that it *could* be used for anything, but it becomes increasingly less efficient the farther it gets from its primary purpose. I'd assert that designing a chip to do more than just one thing will inherently make it less efficient at the primary task.

And it does appear that Chris shows up at a different company on LinkedIn, so you might be right!

MODS: This may be of general interest to genpop, but of course, feel free to kick it to a sub-forum.

Thanks DD! Great summary: a good level of detail combined with clear explanations.
Follow-up question:
In similar projects of this magnitude, what is the ballpark ratio of work spent on data transfer, compression, or bottleneck workarounds, as you describe it, compared to solving the actual problem (FSD)?
Clarification:
Did Keller really say 10x to 1 million times faster/better, or are you using your own background to evaluate the upper bound? Did he compare to the (in-car) FSD version 3 chip's efficiency, or to other ways of doing FSD (competitors)?

Wild-eyed speculation: Perhaps Tenstorrent will build some kind of specialized compute unit which Tesla can use in DOJO? Perhaps made-to-order, Tesla-only. Perhaps not a main or bulk part, but supplementary? I love Tesla's vertical integration, but sometimes having a trusted partner is worth a lot.

General amazement:
I still find it hard to believe that DOJO is generic enough for other tasks. Elon said it could mine bitcoin, and your summary seems to imply a generic quality. To my limited understanding, FSD is (or was) considered so freaking hard that solving it requires specialized hardware - as evidenced by Tesla doing exactly that with their custom car chips.
It also kind of doesn't make sense financially: solving FSD is worth so much money that even making a lot of bitcoins wouldn't really measure up. On the other hand, if solving FSD takes a number of years, a huge amount of compute, and custom chips, having some extra income is useful.

The only way that I can make sense of DOJO being generic is if DOJO is actually a trojan-horse kind of tech for solving AGI!!
Is there a chance that this is actually what Tesla is trying to do? Or am I flying off on a tangent here?
(How does that rhyme with Elon's continued warnings about AI?)

Or, if DOJO does not solve AGI entirely, then at least a large subset of AGI. Or perhaps it doesn't quite solve it, but boosts other known techniques by a significant order of magnitude - a kind of AGI runway. Which in the end may be the missing link for solving AGI?!
Maybe Elon concluded that FSD was close enough to AGI that he might as well solve for AGI. And then get FSD 'for free'.
If that is the case, then solving for a subset of AGI is worth a ... what is the level above f***ton?
Most of us are getting used to Tesla being 10+ startups. I used to think that solving FSD was worth a lot - to mankind, but also to us fans and investors. Solving AGI (or a significant subset)? Damn! That is 'huger than huge'.

If true, then someone should PM Warren Redlich - his craziest estimates are way too conservative!

(AGI: Artificial General Intelligence)

The general speed-up for FSD being as high as a 1M-times multiple is my opinion.

Also, it would be cool if it could be used for AGI, but it is doubtful, as AGI will likely need much more compute and much more memory, since it is going to be fed much more training data than just CV inputs.
 
FWIW, I think this could have stayed in the Tesla thread. It's as relevant to Tesla as an investment as battery form factor discussions are.

I'd be curious to see whether Tesla develops some wearable tech with cameras to collect data about walking around, the way the cars collect data about driving around, so that it can do for human living spaces what it is doing for roads.
But maybe I’m oversimplifying it.
 
Dojo is turning out to be a game changer. Creating this thread to hold some more technical discussions outside of the main investor thread.
Why is Dojo needed if Level 5 is achievable without it?

Do you believe Dojo will use a custom-built processor just for Tesla?

Please go into this now: "Dojo will work seamlessly with their inference chip (the chip that is in the car) to do inference as well as possibly other tasks (I won't go into that now)."
 
I just want to capture some Dojo / Elon twitter stuff: https://twitter.com/elonmusk/status/1224194125844701184

My bold emphasis & italics mark the reply that Elon answered.

My view is that Elon always does things with multiple purposes, all with Mars & beyond in mind (before we get wiped out by magnetic pole switching, asteroids, etc.).

Dojo will be huge; it will do speech, video, text, DNA, medicines, structure design, and much more for Tesla, and be rented out to others.

In my opinion, Dojo will be more profitable than AWS.

===================================

Elon Musk: "At Tesla, using AI to solve self-driving isn't just icing on the cake, it's the cake" -
@lexfridman
Join AI at Tesla! It reports directly to me & we meet/email/text almost every day. My actions, not just words, show how critically I view (benign) AI.



Autopilot AI
Apply now to work on Tesla Autopilot and join our mission to accelerate the world’s transition to sustainable energy.
tesla.com

===================================
Elon Musk @elonmusk
Feb 2, 2020

Tesla will soon have over a million connected vehicles worldwide with sensors & compute needed for full self-driving, which is orders of magnitude more than everyone else combined, giving you the best possible dataset to work with
===================================
Elon Musk@elonmusk
Feb 2, 2020

Our custom 144 TOPS in-vehicle inference computer, where almost every TOP is useable & optimized for NN, far exceeds anything else in volume production, giving you the hardware you need to run sophisticated nets
===================================
Elon Musk@elonmusk
Feb 2, 2020

Dojo, our training supercomputer, will be able to process vast amounts of video training data & efficiently run hypersparse arrays with a vast number of parameters, plenty of memory & ultra-high bandwidth between cores. More on this later.
===================================
Elon Musk@elonmusk
Feb 2, 2020

We are (obviously) also looking for world-class chip designers to join our team, based in both Palo Alto & Austin
===================================
Elon Musk@elonmusk
Feb 3, 2020

Our NN is initially in Python for rapid iteration, then converted to C++/C/raw metal driver code for speed (important!). Also, tons of C++/C engineers needed for vehicle control & entire rest of car. Educational background is irrelevant, but all must pass hardcore coding test.
===================================
Pranay Pathole@PPathole
Feb 3, 2020

Pytorch for building NN?

===================================
Elon Musk@elonmusk Replying to @PPathole
PyTorch is the most frequently used external tool set/library
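
The Python-first, C++-for-speed workflow in those tweets isn't spelled out anywhere public, but a minimal sketch of one common pattern (prototype in PyTorch, trace to TorchScript, hand the artifact to a C++ runtime) might look like this. The network, shapes, and file name below are purely illustrative, not anything Tesla has described.

import torch
import torch.nn as nn

# Toy stand-in for a vision net prototyped in Python (hypothetical model).
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.head = nn.Linear(8, 4)

    def forward(self, x):
        x = torch.relu(self.conv(x))   # feature extraction
        x = x.mean(dim=(2, 3))         # global average pool
        return self.head(x)            # e.g. 4 output classes

model = TinyNet().eval()

# Trace the Python model into a TorchScript artifact that a C++ runtime
# (LibTorch, via torch::jit::load("tinynet_traced.pt")) can execute
# without any Python in the loop.
example = torch.randn(1, 3, 64, 64)
traced = torch.jit.trace(model, example)
traced.save("tinynet_traced.pt")

Whether Tesla actually goes through TorchScript, ONNX, or fully hand-written C++/raw driver code is not public; the point is only that the Python graph becomes a static artifact the in-car stack can run at full speed.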
 
Great foresight with this thread @Discoducky! I also think Dojo will turn out very important in the long run.

From what I've read and come to understand about Tesla's in-house chip making, they've made FPGA prototype chips, but do we have any confirmation that they've moved to the ASIC stage of manufacturing now? In other words, have they landed on a specific hardware design? Do we have any info on the capacity of Dojo in, for example, teraflops, even though such a metric may not be super relevant for a special-purpose supercomputer?
 

I'm wondering if they have a Jim Keller / https://twitter.com/tenstorrent link up

Also, this is interesting but may mean nothing

Careers – Tenstorrent
"Based in Toronto with offices in Austin, Tenstorrent is growing quickly. And, we are proudly backed by top-tier Venture Capital firms including Real Ventures and Eclipse Venture Capital, as well as prominent industry luminaries."

https://eclipse.vc/team/greg-reichow/
"Greg was Tesla’s executive leader of global manufacturing, factory/automation engineering, supply chain, and product excellence. While at Tesla, he led growth from low-volume Roadster production to the fully integrated manufacturing of the Model S and X"
 
Agreed that 1/ this is a *very* important part of Tesla's portfolio of advanced tech, and 2/ this topic should probably stay in the main discussion thread for that reason, especially considering the other irrelevant chatter there.

Just to mention one possibility: Tesla Digital Coins (TDC)


1/ It might be easy/feasible to make some design changes to the next MCU so that it *also* can be very efficient at BTC/crypto mining, with all cars mining while their MCUs are idling.
2/ We know Elon is very cognizant of the archaic financial system, going back to '99 when he founded X.com, aka PayPal's predecessor.
3/ It's very much on Elon's mind these days, re: the BTC and Doge tweets.
4/ With Starlink, crypto mining of TDCs (Tesla Digital Coins) can run unimpeded across the whole globe - actually Mars and the whole galaxy too.
This is a big advantage over Bitcoin (BTC), which is dependent on the internet as we know it functioning across boundaries. Recall that the Chinese control over 50% of BTC mining AND can probably close their digital boundaries if they wanted to.


TLDR: The Tesla fleet of millions of MCUs (TFMM) and Starlink could be orchestrated by Dojo, aka Tesla's Fed, to create a very compelling alternative to Bitcoin: the TDC (Tesla Digital Coin).

Other possible Dojo/TFMM/Starlink uses will likely pop up too, besides the obvious reselling/franchising of FSD to other carmakers. For example, renting it out to do prime factorization faster (for three-letter agencies or whoever), as this is key to cracking encryption.
 
Great foresight with this thread @Discoducky! I also think Dojo will turn out very important in the long run.

From what I've read and come to understand about Tesla's in-house chip making, they've made FPGA prototype chips, but do we have any confirmation that they've moved to the ASIC stage of manufacturing now? In other words, have they landed on a specific hardware design? Do we have any info on the capacity of Dojo in, for example, teraflops, even though such a metric may not be super relevant for a special-purpose supercomputer?
I hope they aren't pursuing FPGA - just not enough memory and throughput, and not optimal for matrix math at scale with FP optimization. Highly doubtful IMO.

I assume they'll shoot for something better than what they could get off the shelf from GPUs, or anything that will be available from GPUs in the next 3 to 5 years. Also, I think it is a safe assumption that the inference chip will have a similar architecture, to ensure a seamless transition from training to inference.

TFLOPS isn't as important as memory footprint at high bandwidth, plus connectivity between compute cores. Assume the speed of light is ideal and not much else matters.
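
To put a number on the "bandwidth over TFLOPS" point, here is a back-of-the-envelope roofline estimate in Python. All the hardware figures are placeholders I made up, not Dojo or HW3 specs.

# Roofline model: attainable throughput is capped by
# min(peak compute, arithmetic intensity * memory bandwidth).
def attainable_tflops(peak_tflops, bandwidth_gbs, flops, bytes_moved):
    intensity = flops / bytes_moved                # FLOPs per byte of memory traffic
    bw_limit = intensity * bandwidth_gbs / 1000.0  # GB/s * FLOP/byte -> TFLOP/s
    return min(peak_tflops, bw_limit)

# Hypothetical accelerator: 100 TFLOP/s peak, 1 TB/s memory bandwidth.
PEAK_TFLOPS, BW_GBS = 100.0, 1000.0

# One fp16 (2 bytes/element) dense layer: (32 x 4096) @ (4096 x 4096).
m, n, k, dtype_bytes = 32, 4096, 4096, 2
flops = 2 * m * n * k                                 # multiply-adds
bytes_moved = dtype_bytes * (m * k + k * n + m * n)   # read A and B, write C once

print(attainable_tflops(PEAK_TFLOPS, BW_GBS, flops, bytes_moved))  # ~31 TFLOP/s

With a small batch, that layer is bandwidth-bound at roughly a third of peak, which is why a raw TFLOPS number alone says little about training throughput, and why memory bandwidth and inter-core connectivity dominate the design.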
 
Someone on Reddit did a nice TLDR.


  • Working with Elon Musk: "He's an incredible person, I'm still trying to map out his superpowers. Incredible intuition even with lack of information. Great judgement. He's a double edged sword because he wants the future yesterday. You have to have a certain attitude to tolerate that. If you can, you will thrive at Tesla."
  • Hardest problem is variability. So many possible problems in the real world.
  • 75% of his time is spent curating data. 25% on algorithms. Machine Learning is Programming where the computer fills in the blanks.
  • Interventions are a great trigger for training the system. The team can detect disagreements too: A stop sign flickers on and off or map data is wrong.
  • "Millions of images easily" on the system. Takes 2-3 weeks to train on new data.
  • The neural networks in cars today do not make road edge predictions based on one image from one camera anymore, but from a bird's-eye view built with data from all 8 cameras. Known as "Software 2.0".
  • Operation Vacation: For the engineers writing code, the goal is for neural networks to improve by themselves (and with the help of labelers), and theoretically the engineers could go on vacation while the system continues to improve. Mentions "it's a half joke, our north star."
  • They do not outsource their data labeling.
  • Dojo: An in-house (vertical integration) chip designed for training mass amounts of data. Active project currently at Tesla.
  • Currently do a lot of manual labeling but looking to train data based on sensors. For example: Running some cars on Lidar/Radar to verify and train visual data.
  • Waymo vs Tesla: Waymo does a ton of HD mapping with many sensors and many humans before a car can enter any area. Tesla is not using HD maps, but "Low definition" maps that simply give "Left turn, Right turn" directions. Tesla relies on inexpensive parts and expensive software to drive on any road you give it. "We don't know to centimeter level accuracy where a curb is. The car needs to look at images and decide where it should be." A much "higher bar, harder to design" problem but a lot less expensive.
  • Implies that Waymo doesn't have enough cars to collect the data to solve FSD but doesn't outright say it. "Scale is incredibly important for dataset curation. I would rather trade sensors for data."
  • Expect exponential improvements in automation everywhere, from cars to drones to warehouses, in just the next few years. The growth in the AI space in the past 4 years has not been close to linear and Andrej is excited to see what the next 4, 10, even 20 years will bring us.
The Robot Brains Podcast
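
The "stop sign flickers on and off" trigger in that summary suggests a simple temporal-consistency check over consecutive frames. A toy sketch of the idea in Python is below; the threshold, frame format, and function names are invented for illustration and are not Tesla's actual data engine.

# Flag clips where a detection flickers between consecutive frames more
# often than a threshold, so the clip can be pulled back for labeling.
def flicker_score(per_frame_detections):
    # per_frame_detections: list of bools, True if the object was detected in that frame.
    flips = sum(1 for a, b in zip(per_frame_detections, per_frame_detections[1:]) if a != b)
    return flips / max(len(per_frame_detections) - 1, 1)

def should_flag_clip(per_frame_detections, threshold=0.2):
    # A stable detection flips at most a couple of times (object enters/leaves view);
    # a flickering one toggles repeatedly, pushing the score well above the threshold.
    return flicker_score(per_frame_detections) > threshold

stop_sign_seen = [True, True, False, True, False, True, True, False, True, True]
print(should_flag_clip(stop_sign_seen))  # True -> candidate for the labeling queue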
 
I'm halfway through and it is fun to listen to. I've seen some notes already from others and the talk seems very interesting until the very end.

Listening to James on Dave's YouTube channel has also been a joy. James really knows his stuff, and it is great to see Dave and him having these in-depth talks.
 
Tangential question for you, @Discoducky, given AI Day was a recruiting event.

Tesla is clearly doing some cutting-edge work and is by all accounts very selective when hiring people for its engineering teams. While the pay may be less than at the FAANGs, the upside on the stock comp is certainly much higher. Plus, the work is more challenging and in pursuit of a great mission.

How does this translate into top-notch AI, software, or hardware engineers from the FAANGs applying to Tesla? Did you hear anything about the event actually moving the needle on that front?