
Vision running on diffusion


Buckminster

Well-Known Member
Aug 29, 2018
10,326
51,390
UK
What is diffusion model in AI?


Diffusion models are deep generative models that work by adding noise (Gaussian noise) to the available training data (also known as the forward diffusion process) and then reversing the process (known as denoising or the reverse diffusion process) to recover the data. The model gradually learns to remove the noise. (31 Mar 2023)
More here
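To make the forward (noising) half of that definition concrete, here is a minimal sketch in plain Python. It uses the standard per-step rule x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * noise. The function names, the linear schedule, and its start/end values are assumptions picked for illustration, not anything Tesla has described. The reverse (denoising) half needs a trained network and is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T rising linearly (illustrative values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_diffuse(x0, betas, rng):
    """Add Gaussian noise step by step; returns [x_0, x_1, ..., x_T]."""
    x = list(x0)
    trajectory = [list(x)]
    for beta in betas:
        keep = math.sqrt(1.0 - beta)   # how much of the previous sample is kept
        noise_scale = math.sqrt(beta)  # how much fresh noise is mixed in
        x = [keep * v + noise_scale * rng.gauss(0.0, 1.0) for v in x]
        trajectory.append(x)
    return trajectory


def signal_fraction(betas):
    """Product of sqrt(1 - beta_t): the share of x_0 left after every step."""
    frac = 1.0
    for beta in betas:
        frac *= math.sqrt(1.0 - beta)
    return frac


rng = random.Random(0)
betas = linear_beta_schedule(1000)
x0 = [1.0, -1.0, 0.5, 0.0]  # stand-in for a tiny piece of training data
trajectory = forward_diffuse(x0, betas, rng)
```

With this schedule almost none of the original signal survives the last step, so the final sample is close to pure Gaussian noise. That is the starting point the learned reverse process denoises from.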
 
My not-super-qualified guess is that they will just replace some of the attention layers in their neural network with a diffusion-based architecture, i.e. layer 3 here. The benefit is that they save some compute, or get better performance for the same compute. I wonder if this can be used for the language of lanes :-/

Tesla-FSD-Beta-AI-Day-2022-10.png
 
Elon is not talking about generative diffusion. Look up jump diffusion or see post in FSD tweets.

Tesla needs to do a better job predicting pedestrian intent.
Diffusion seems to be more compute-efficient than transformers for vision.
From 2019: Develop an effective variant for estimating visibility statuses of objects while tracking them in videos. Dealing with partial or full occlusions is a long standing problem in computer vision but largely remains unsolved. In this work, we cast the above problem as a Markov Decision Process and develop a policy-based jump-diffusion method to jointly track object locations in videos and estimate their visibility statuses. Our method employs a set of jump dynamics to change object’s visibility statuses and a set of diffusion dynamics to track objects in videos. Different from traditional jump-diffusion process that stochastically generates dynamics, we utilize deep policy functions to determine the best dynamic for the present state and learn the optimal policies using reinforcement learning methods. Our method is capable of tracking objects with full or partial occlusions in crowded scenes.
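
For anyone trying to picture the two kinds of dynamics in that abstract, here is a toy sketch of the traditional, purely stochastic jump-diffusion it contrasts itself with: the position drifts by small Gaussian steps (diffusion) while the visibility flag occasionally flips (jump). This is not the paper's policy-based method. The function name, the 1-D position, and the parameter values are all hypothetical.

```python
import random


def simulate_jump_diffusion(num_steps, sigma=0.5, jump_prob=0.1, seed=0):
    """Toy 1-D jump-diffusion; returns a list of (position, visible) pairs."""
    rng = random.Random(seed)
    position = 0.0
    visible = True
    history = []
    for _ in range(num_steps):
        # Diffusion dynamics: a small continuous Gaussian move of the object.
        position += rng.gauss(0.0, sigma)
        # Jump dynamics: an occasional discrete switch of visibility status.
        if rng.random() < jump_prob:
            visible = not visible
        history.append((position, visible))
    return history


track = simulate_jump_diffusion(50)
```

In the paper's version, as the abstract puts it, a learned policy picks which dynamic to apply for the present state instead of drawing it at random as this sketch does.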
 
Minutes 3-7 explain diffusion.
Following on: diffusion could remove or reduce labelling. Douma disagrees.
HW3 might require diffusion because it is more efficient.
Conjecture about Tesla's implementation comes later.
I hadn’t known that Tesla’s HW 3 inference chip was optimized for CNNs rather than transformers, and that HW 4 will be much better here. What I had realized was that I would often see FSD running behind reality, notwithstanding Elon’s claim that the computer reacts faster. It can only react faster if it can perceive faster, and it generally doesn’t do that.
 