Vision running on diffusion

Buckminster · Apr 9, 2023

https://twitter.com/x/status/1644942345383673856

What is diffusion model in AI?

Diffusion models are deep generative models that work by adding noise (Gaussian noise) to the available training data (also known as the forward diffusion process) and then reversing the process (known as denoising or the reverse diffusion process) to recover the data. The model gradually learns to remove the noise. (31 Mar 2023)
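To make the forward/reverse description above concrete, here is a minimal sketch in plain Python. It assumes a DDPM-style linear noise schedule; the names `alpha_bar`, `forward_diffuse`, `recover_x0` and the schedule values are illustrative choices, not anything taken from Tesla or from the quoted snippet. A real model would learn to predict the noise with a neural network; here the true noise stands in for that prediction, so the recovery is exact.

```python
import math
import random

def alpha_bar(betas, t):
    """Cumulative product of (1 - beta_s) for s = 0..t: how much signal survives at step t."""
    prod = 1.0
    for beta in betas[: t + 1]:
        prod *= 1.0 - beta
    return prod

def forward_diffuse(x0, t, betas, rng):
    """Forward process: jump straight to step t by mixing the data with Gaussian noise."""
    a = alpha_bar(betas, t)
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(x0, noise)]
    return xt, noise

def recover_x0(xt, t, betas, predicted_noise):
    """Reverse direction: given a noise estimate, undo the mixing to estimate the data."""
    a = alpha_bar(betas, t)
    return [(x - math.sqrt(1.0 - a) * n) / math.sqrt(a)
            for x, n in zip(xt, predicted_noise)]

rng = random.Random(0)
# Assumed linear schedule from 1e-4 to 0.02 over 1000 steps.
betas = [1e-4 + (0.02 - 1e-4) * i / 999 for i in range(1000)]
x0 = [0.5, -1.0, 2.0, 0.0]          # stand-in for a tiny "image"
xt, noise = forward_diffuse(x0, 500, betas, rng)
# A trained network would predict `noise` from `xt`; we use the true noise here.
x0_hat = recover_x0(xt, 500, betas, noise)
```

With a perfect noise estimate the original data comes back exactly; training a diffusion model amounts to making that estimate good enough at every step.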

More here

heltok · Apr 9, 2023

My not-super-qualified guess is that they will just replace some of the attention layers in their neural network with some diffusion-based architecture, i.e. layer 3 here. The benefit is that they save some compute, or improve performance for the same compute. I wonder if this can be used for the language of lanes :-/

Knightshade · Apr 9, 2023

Curious if HW3 cameras are sufficient for the degree of detail needed for the type of pedestrian prediction he's talking about here vs the higher-res HW4 ones?

DanCar · Apr 9, 2023

Buckminster said:
https://twitter.com/x/status/1644942345383673856

More here

Elon is not talking about generative diffusion. Look up jump diffusion or see post in FSD tweets.

DanCar · Apr 9, 2023

Knightshade said:
Curious if HW3 cameras are sufficient for the degree of detail needed for the type of pedestrian prediction he's talking about here vs the higher-res HW4 ones?

The 3 front-facing cameras should help. If the pedestrians aren't close then the zoomed-in cameras should help.

Supcom · Apr 9, 2023

DanCar said:
The 3 front-facing cameras should help. If the pedestrians aren't close then the zoomed-in cameras should help.

Pedestrians far away are not the issue. It's the ones nearby that need better prediction. You need better prediction for what the pedestrian is going to do in the next couple seconds.

Buckminster · Apr 9, 2023

DanCar said:
Elon is not talking about generative diffusion. Look up jump diffusion or see post in FSD tweets.

DanCar said:
Tesla needs to do a better job predicting pedestrian intent.

https://twitter.com/x/status/1644942345383673856
Diffusion seems to be more compute-efficient than transformers for vision.
From 2019:
https://ieeexplore.ieee.org/ielaam/76/8786781/8425080-aam.pdf
Develop an effective variant for estimating visibility statuses of objects while tracking them in videos. Dealing with partial or full occlusions is a long standing problem in computer vision but largely remains unsolved. In this work, we cast the above problem as a Markov Decision Process and develop a policy-based jump-diffusion method to jointly track object locations in videos and estimate their visibility statuses. Our method employs a set of jump dynamics to change object’s visibility statuses and a set of diffusion dynamics to track objects in videos. Different from traditional jump-diffusion process that stochastically generates dynamics, we utilize deep policy functions to determine the best dynamic for the present state and learn the optimal policies using reinforcement learning methods. Our method is capable of tracking objects with full or partial occlusions in crowded scenes.
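
As a rough illustration of the "traditional" stochastic jump-diffusion process the abstract contrasts itself with, here is a toy sketch in plain Python. It is not the paper's policy-based method: the one-dimensional position, the boolean visibility flag, and the function name `simulate_jump_diffusion` with its parameters are all assumptions made for the example. Jumps flip the discrete visibility status at random; diffusion nudges the continuous position with Gaussian noise.

```python
import random

def simulate_jump_diffusion(steps, jump_prob, sigma, seed=0):
    """Toy 1-D jump-diffusion: returns a list of (position, visible) states."""
    rng = random.Random(seed)
    position, visible = 0.0, True
    trajectory = [(position, visible)]
    for _ in range(steps):
        # Jump dynamics: a discrete change of state (here, visibility toggles).
        if rng.random() < jump_prob:
            visible = not visible
        # Diffusion dynamics: a small continuous random move of the tracked position.
        position += rng.gauss(0.0, sigma)
        trajectory.append((position, visible))
    return trajectory

track = simulate_jump_diffusion(steps=50, jump_prob=0.1, sigma=0.5)
```

In the paper's version, per the abstract, a learned policy picks the jump or diffusion move for the current state instead of drawing it at random as done here.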

Buckminster · Apr 16, 2023

Douma discusses diffusion from 1:40:

Dewg · Apr 16, 2023

Buckminster said:
Douma discusses diffusion from 1:40:

I found this section interesting. I never thought about Tesla creating planner paths for all other vehicles as well as its own. There are times when the car would turn on signals for a second and then turn them off - this could explain why.

Buckminster · Apr 23, 2023

https://twitter.com/x/status/1650056268218810368

Buckminster · May 7, 2023

https://twitter.com/x/status/1655318205236215811

Baumisch · May 8, 2023

It's gonna be a loooong time until v12 then

Bitdepth · May 8, 2023

Dewg said:
I found this section interesting. I never thought about Tesla creating planner paths for all other vehicles as well as its own. There are times when the car would turn on signals for a second and then turn them off - this could explain why.

They talked about it at AI Day 2021. They run the Autopilot planner for other on-road agents.

DanCar · May 8, 2023

Baumisch said:
It's gonna be a loooong time until v12 then

Is it a quality-based decision? Could they release it next week?

Buckminster · Jun 16, 2023

https://twitter.com/x/status/1669753411623956501

Buckminster · Jun 30, 2023

Diffusion is a stepping stone?

https://twitter.com/x/status/1674948179236884481

Buckminster · Jul 3, 2023

3-7 mins explains diffusion.
Following on: diffusion could remove or reduce labelling. Douma disagrees.
HW3 might require diffusion because it is more efficient.
Conjecture about Tesla's implementation comes later.

Cosmacelf · Jul 3, 2023

Buckminster said:
3-7 mins explains diffusion.
Following on: diffusion could remove or reduce labelling. Douma disagrees.
HW3 might require diffusion because it is more efficient.
Conjecture about Tesla's implementation comes later.

I hadn’t known that Tesla's HW3 inference chip was optimized for CNNs rather than transformers, and that HW4 will be much better here. What I had realized was that I would often see FSD running behind reality, notwithstanding Elon’s claim that the computer reacts faster. It can only react faster if it can perceive faster, and it generally doesn’t do that.

Buckminster · Dec 22, 2023

https://twitter.com/i/spaces/1rmxPMwPXXQKN?s=20
1:06 AGI is looking like a combo of diffusion and transformers.
