Nvidia's software is often cited as a huge positive. It probably is right now, but it will soon be a liability. Another area where Dojo can leapfrog.
The end game is that, in addition to the CPU/GPU/flash/RAM silicon we have in every computing device, there will be additional silicon for AI inference (i.e. running the neural networks). That will be TPUs, not GPUs. There will be lots of different TPUs, each matched to its intended use, e.g. the Tesla Autopilot computer or the 16-core Neural Engine in Apple's M1 series. If you are a software developer (i.e. a neural network consumer), you're probably going to use an abstraction layer that can talk to all of these devices without hardcoding for any one of them (so that you don't depend on Nvidia CUDA). I'm currently using ONNX; see the sketch after the list below. Nvidia is probably not going to be dominant here because:
- SoC builders like Intel, AMD, and Apple will build that functionality into their SoCs (in fact they already have, to some degree).
- If it's not built into the SoC, there is still the option of a dedicated TPU. Google sells one cheaply (Coral), there are others, and probably an order of magnitude more are under development.
Nvidia sells its own Arm-based SoC products in this market: the Jetson and Xavier lines (NVIDIA Embedded Systems for Next-Gen Autonomous Machines).
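Since I mentioned ONNX above, here's a minimal sketch of what that abstraction layer looks like in practice with ONNX Runtime. The model file name and input name are placeholders for your own model; the point is that the provider list is just a preference order, so nothing is hardcoded to CUDA:

```python
import numpy as np
import onnxruntime as ort

# See which accelerators this machine exposes (CUDA, TensorRT, CoreML, CPU...)
print(ort.get_available_providers())

# Same model file runs on whatever is available; CUDA if present, else CPU.
sess = ort.InferenceSession(
    "model.onnx",  # placeholder path to your exported model
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

x = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = sess.run(None, {"input": x})  # None = return all model outputs
```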
TPUs give more bang for the buck than GPUs.
Then there is the training part of AI. It requires massive data centers, massive amounts of data, and massive amounts of compute. For the moment Nvidia is dominant here, but they charge so much for their H100 GPUs (NVIDIA H100 Tensor Core GPU) that their customers will all look for cheaper alternatives. Tesla even built their own system, Dojo, in about one or two years. It's difficult to imagine that Nvidia will remain without competition here for long.
A new inference load at xAI
Seems like it was a large amount of compute coming online from various sources. Imo this is what Tesla is doing well: being agile. While everyone else struggles to get Nvidia chips, they have already diversified to alternate suppliers, brought work in-house, and gotten things up and running fast. And they are not afraid to make the billion-dollar investment in compute. Meanwhile, where are Toyota, VW, etc. with this? Have we heard anything?
After watching Karpathy's recent interview I gleaned something I think is very important with regard to Dojo.
He was talking about how horribly inefficient today's GPUs are. Even Nvidia's best are power-hungry monsters that just don't do their jobs very well compared to the human brain, which runs on only about 20 watts. It's obvious that when it comes to AI hardware, we are doing it wrong.
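For a sense of scale, here's my own back-of-envelope on the power gap, using the ~700 W TDP of an H100 SXM module against the ~20 W commonly cited for the brain:

```python
# Back-of-envelope power gap (my numbers, not from the interview):
# ~700 W TDP for an H100 SXM module vs. ~20 W commonly cited for the brain.
GPU_WATTS = 700
BRAIN_WATTS = 20
print(f"~{GPU_WATTS / BRAIN_WATTS:.0f}x the brain's power draw")  # ~35x
```

And that's one chip against a whole brain, before counting the rest of the rack.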
Clearly, there is unbelievable potential for improvement.
So the Dojo project can be justified on the basis that AI hardware is in its infancy. It's time for exploration. Like the gold rush, most will fail, but the one who digs in the right spot will be richly rewarded.
There's gold in them thar hills!
A tidbit of news about the next generation of Dojo from TSMC at the North American Technology Symposium. The stated timeline is packages of 5.5 times the reticle limit by 2026, which would enable 3.5 times the compute, and full wafer-scale integration by 2027, which enables 40 times the compute.
Expect a Wave of Wafer-Scale Computers: TSMC tech allows for one version now and a more advanced version in 2027 (spectrum.ieee.org)
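For a feel of where those multiples could come from, here's my own back-of-envelope geometry (not from the article), comparing a full 300 mm wafer to a single ~830 mm² lithography reticle:

```python
import math

# My own rough numbers, not TSMC's:
RETICLE_MM2 = 830          # approximate lithography reticle limit
WAFER_DIAMETER_MM = 300

wafer_mm2 = math.pi * (WAFER_DIAMETER_MM / 2) ** 2   # ~70,700 mm^2
print(f"raw area ratio: ~{wafer_mm2 / RETICLE_MM2:.0f}x")  # ~85x

# The quoted 40x compute figure is well below the raw ~85x area ratio,
# presumably because wafer area also goes to interconnect, power
# delivery, memory, and yield margins.
```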
Sounds like Tesla will be bringing a lot of inference chips into their data center. Why? I think it is to do extremely fast validations of new training builds. Instead of running a new build in shadow mode for weeks in their fleet they will now be able to run a new build on a large cluster of inference chips using recorded video or even simulations. This will speed up the validation process by a huge amount.
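A minimal sketch of what that kind of validation replay could look like, with entirely hypothetical names (evaluate_clip, the clip IDs, and the disagreement metric are all stand-ins, not Tesla's actual tooling): fan recorded clips out across many workers standing in for inference nodes, then aggregate a score for the candidate build.

```python
from concurrent.futures import ProcessPoolExecutor
import random

def evaluate_clip(clip_id: int) -> float:
    """Stub: run the candidate build over one recorded clip and return
    its disagreement vs. ground truth. Here it's just a random number."""
    return random.random()

if __name__ == "__main__":
    clip_ids = range(10_000)  # placeholder for the recorded-clip library
    with ProcessPoolExecutor() as pool:  # each worker = one "inference node"
        scores = list(pool.map(evaluate_clip, clip_ids))
    print(f"mean disagreement: {sum(scores) / len(scores):.4f}")
```

The win is embarrassingly parallel: a cluster of cheap inference chips can chew through weeks of shadow-mode driving in hours.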
Correcting myself here. According to an earlier post from Elon, the inference chips are used during the training step to generate the loss number. So they use the actual production inference chips during training as part of the backpropagation loop.
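If I'm reading that right, it's in the spirit of quantization-aware training: compute the loss on a forward pass that behaves like the production inference chip, and let gradients flow back on the training hardware. A minimal PyTorch sketch of that general idea (my illustration, not Tesla's pipeline; the fake-quantization step stands in for the real chip):

```python
import torch
import torch.nn as nn

SCALE = 0.05  # made-up quantization step standing in for the chip's precision

class FakeQuant(torch.autograd.Function):
    """Round activations the way low-precision inference hardware would,
    but pass gradients straight through so backprop still works."""
    @staticmethod
    def forward(ctx, x):
        return torch.round(x / SCALE) * SCALE
    @staticmethod
    def backward(ctx, grad_out):
        return grad_out  # straight-through estimator

model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(32, 64), torch.randint(0, 10, (32,))

logits = FakeQuant.apply(model(x))            # "inference-chip" forward for the loss
loss = nn.functional.cross_entropy(logits, y)
loss.backward()                               # gradients computed on training hardware
opt.step()
print(f"loss: {loss.item():.3f}")
```

The design point: the loss you optimize is the one the production silicon will actually produce, so training and deployment don't drift apart.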