Welcome to Tesla Motors Club

Dojo 2020


Koopa Troopa
Nov 24, 2019
New Donk City
At Autonomy Day, Elon said:

“The car is an inference-optimized computer. We do have a major program at Tesla — which we don’t have enough time to talk about today — called Dojo. That’s a super powerful training computer. The goal of Dojo will be to be able to take in vast amounts of data — at a video level — and do unsupervised [a.k.a. self-supervised] massive training of vast amounts of video with the Dojo computer. But that’s for another day.”
Dojo is for self-supervised learning on video and, therefore, presumably for computer vision.

DeepMind recently demonstrated that you can get better performance on image recognition with 2x to 5x fewer hand-labelled training images when you do self-supervised pre-training beforehand.

Yann LeCun also recently shared his prediction for self-supervised learning on video in 2020. LeCun helped pioneer the field of deep learning and won a Turing Award for that. He's also a computer science professor at NYU and Chief AI Scientist at Facebook. LeCun wrote:

“This suggests that the way forward in AI is what I call self-supervised learning. It’s similar to supervised learning, but instead of training the system to map data examples to a classification, we mask some examples and ask the machine to predict the missing pieces. For instance, we might mask some frames of a video and train the machine to fill in the blanks based on the remaining frames.”
Here's a short clip of LeCun explaining this idea:

The full talk is worth watching.
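
To make the masked-frame idea concrete, here is a toy sketch of my own (not LeCun's code and not anything Tesla has published). It assumes a "video" is just a list of frames, each frame a short list of pixel brightness values, and it uses plain linear interpolation as a stand-in for the neural network. The function names `make_masked_example`, `interpolate_fill`, and `masked_mse` are made up for illustration. The point is only that the training targets come from the clip itself, with no hand labels.

```python
import random

def make_masked_example(video, mask_ratio=0.25, seed=0):
    """Split a clip into (visible inputs, masked targets) with no human labels."""
    rng = random.Random(seed)
    n = len(video)
    # Never mask the first or last frame, so the stand-in model has anchors.
    candidates = list(range(1, n - 1))
    k = min(len(candidates), max(1, int(mask_ratio * n)))
    masked = sorted(rng.sample(candidates, k))
    inputs = [None if i in masked else frame for i, frame in enumerate(video)]
    targets = {i: video[i] for i in masked}
    return inputs, targets

def interpolate_fill(inputs):
    """Stand-in 'model': fill each masked frame from its nearest visible neighbours."""
    filled = list(inputs)
    for i, frame in enumerate(inputs):
        if frame is not None:
            continue
        lo = max(j for j in range(i) if inputs[j] is not None)
        hi = min(j for j in range(i + 1, len(inputs)) if inputs[j] is not None)
        w = (i - lo) / (hi - lo)
        filled[i] = [(1 - w) * a + w * b for a, b in zip(inputs[lo], inputs[hi])]
    return filled

def masked_mse(filled, targets):
    """Mean squared error, measured only on the frames that were hidden."""
    errs = [(p - t) ** 2
            for i, tgt in targets.items()
            for p, t in zip(filled[i], tgt)]
    return sum(errs) / len(errs)

# Toy clip: 8 frames of 4 pixels whose brightness rises linearly over time.
video = [[0.1 * t + 0.01 * p for p in range(4)] for t in range(8)]
inputs, targets = make_masked_example(video)
loss = masked_mse(interpolate_fill(inputs), targets)
print(f"masked frames: {sorted(targets)}, loss: {loss:.3e}")
```

On this deliberately easy clip the interpolation baseline is essentially perfect. A real system would swap in a learned model and minimise the same kind of loss over a huge number of clips.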

LeCun continues:

“This approach has been extremely successful lately in natural language understanding. Models such as BERT, RoBERTa, XLNet, and XLM are trained in a self-supervised manner to predict words missing from a text. Such systems hold records in all the major natural language benchmarks.

In 2020, I expect self-supervised methods to learn features of video and images. Could there be a similar revolution in high-dimensional continuous data like video?

One critical challenge is dealing with uncertainty. Models like BERT can’t tell if a missing word in a sentence is “cat” or “dog,” but they can produce a probability distribution vector. We don’t have a good model of probability distributions for images or video frames. But recent research is coming so close that we’re likely to find it soon.

Suddenly we’ll get really good performance predicting actions in videos with very few training samples, where it wasn’t possible before. That would make the coming year a very exciting time in AI.”
Now I feel like I understand the purpose of Dojo. There is a research hurdle to clear that Dojo won't solve: representing uncertainty in video prediction.
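
Here is a small sketch of why that uncertainty problem is harder for video than for text, under toy assumptions of my own (a three-word vocabulary, four-pixel frames, and made-up scores). For a missing word, a softmax turns scores into a probability over candidates, so "cat" and "dog" can share the probability mass. For a missing frame, a model trained on squared error is pushed toward the average of the possible futures, which is a blur that matches none of them.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mse(pred, actual):
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred)

# Discrete case: the model does not have to pick one word.
vocab = ["cat", "dog", "car"]
word_probs = dict(zip(vocab, softmax([2.0, 2.0, -1.0])))  # made-up scores

# Continuous case: two equally likely next frames (object goes left or right).
future_left = [1.0, 0.0, 0.0, 0.0]
future_right = [0.0, 0.0, 0.0, 1.0]

def expected_mse(pred):
    """Average squared error over the two equally likely futures."""
    return 0.5 * mse(pred, future_left) + 0.5 * mse(pred, future_right)

# Pixel-wise average of the two futures: a "ghost" of the object on both sides.
blurry = [(a + b) / 2 for a, b in zip(future_left, future_right)]

print(word_probs)
print("commit to one future:", expected_mse(future_left))
print("blurry average:      ", expected_mse(blurry))
```

The blurry frame scores better on expected squared error than committing to either real outcome, which is exactly the failure mode a proper probability model over frames would avoid.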

But if it is solved in 2020 as LeCun predicts, then the main constraints on self-supervised pre-training will be data and compute. Tesla has access to plenty of video data, which is cheap to record, upload, and store. It can also use active learning to select which video clips to upload. Then Dojo is intended to provide 10x more useful training compute at a lower cost.
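
For what active learning could look like in the simplest case, here is a sketch that assumes uncertainty sampling: rank clips by the entropy of the model's predicted class probabilities and upload only the most uncertain ones. I don't know how Tesla actually picks clips, so the clip IDs, the probabilities, and the `select_clips_to_upload` helper are all hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in nats; higher means the model is less sure."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_clips_to_upload(clip_predictions, budget):
    """Return the `budget` clip IDs whose predictions are most uncertain."""
    ranked = sorted(
        clip_predictions,
        key=lambda clip_id: entropy(clip_predictions[clip_id]),
        reverse=True,
    )
    return ranked[:budget]

# Hypothetical on-car predictions over three classes for four recorded clips.
clip_predictions = {
    "clip_a": [0.98, 0.01, 0.01],  # confident, little to learn from
    "clip_b": [0.40, 0.35, 0.25],
    "clip_c": [0.70, 0.20, 0.10],
    "clip_d": [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
}

chosen = select_clips_to_upload(clip_predictions, budget=2)
print(chosen)
```

The idea is that upload bandwidth and training compute get spent on the clips the current network finds hardest, rather than on more footage it already handles well.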

There's also the constraint of inference compute. Can HW3 run big enough neural networks in real time? I guess we'll see. DeepScale is supposed to squeeze down neural networks to fit on HW3, and HW4 is already in the works and is supposed to be 3x more powerful.

Maybe with the breakthrough in self-supervised video prediction that LeCun anticipates, adopted by Tesla and accelerated by Dojo, we'll see an order of magnitude increase in Tesla's computer vision performance.


Senior Software Engineer
Oct 24, 2016
Maybe with the breakthrough in self-supervised video prediction that LeCun anticipates, adopted by Tesla and accelerated by Dojo, we'll see an order of magnitude increase in Tesla's computer vision performance.

Elon said that Dojo was for planning and control: "Over time, I would expect that it moves really to just training against video, video in, car and steering and pedals out… that’s what we’re gonna use the dojo system for."

It's basically what they are doing today with their in-house Nvidia chips, just done a bit faster and at greater scale with "Dojo".


Average guy who loves autonomous vehicles
Aug 3, 2017
Terre Haute, IN USA
Elon shares some additional info on Dojo:

