FSD Beta Videos (and questions for FSD Beta drivers)

Detection of objects is super easy and a well-defined process. Collect samples, train on them, and voilà, it works. The big cloud providers expose standard object-detection APIs that work very well, and there are standard on-device APIs for object detection too.
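For what it's worth, the off-the-shelf version of that pipeline really is short. A minimal sketch using torchvision's pretrained detector (the model choice, image path, and score threshold here are just placeholders, not anything Tesla uses):

```python
# Off-the-shelf object detection with a pretrained torchvision model.
# Illustrative only: model choice, image path, and threshold are placeholders.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

img = to_tensor(Image.open("road_scene.jpg").convert("RGB"))
with torch.no_grad():
    pred = model([img])[0]  # dict with "boxes", "labels", "scores"

for box, label, score in zip(pred["boxes"], pred["labels"], pred["scores"]):
    if score > 0.5:  # keep only confident detections
        print(int(label), float(score), box.tolist())
```

The catch, as the next reply points out, is the label set: a COCO-pretrained model has no "construction gate" class, so anything outside the training distribution simply isn't detected.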
You just have to train for every possible object that could exist in every possible condition. Easy!
And then somehow not have any false positives... maybe not so easy but I'm no expert. I am always surrounded by orange cones.
They really need to solve the whole artificial general intelligence thing.
You just have to train for every possible object that could exist in every possible condition. Easy!
Are you suggesting the trillion samples that Tesla has collected don't amount to much? Need 10x as much to capture common barricades?

Sounds like the "height from video pixels" algo isn't working out so well.
Quote: FSD 10 predicts height from video pixels directly, without needing to classify groups of pixels into objects. In principle, even if a UFO crashed on the road right in front of you, it would still avoid the debris. Some work still needed to tune sensitivity.
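Nobody outside Tesla knows what that network actually looks like, but as described it is dense per-pixel regression rather than object classification. A toy sketch of the distinction (the architecture, sizes, and 15 cm threshold are all invented):

```python
# Toy contrast with classification: a dense head that regresses one height
# value per pixel. Architecture, sizes, and the 15 cm threshold are invented;
# Tesla's actual network is not public.
import torch
import torch.nn as nn

class HeightHead(nn.Module):
    """Per-pixel regression: no object classes, just a height map."""
    def __init__(self, in_ch=64):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, 1, kernel_size=1)  # 1 channel = height in metres

    def forward(self, features):
        return self.conv(features)  # (B, 1, H, W) height map

features = torch.randn(1, 64, 120, 160)   # stand-in for backbone output
height = HeightHead()(features)
obstacle = height.squeeze(1) > 0.15       # anything taller than ~15 cm
print(obstacle.float().mean())            # fraction of pixels flagged
```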
Are you suggesting the trillion samples that Tesla has collected don't amount to much? Need 10x as much to capture common barricades?
Someone still has to label them and I assume there’s some limitation to how many different variations can be recognized due to the size of the neural net. But I guess your point is that if they wanted to reliably recognize that particular type of fencing they could do it. So yeah, the problem isn’t recognizing any individual object, they have all the data they need.
Someone still has to label them and I assume there’s some limitation to how many different variations can be recognized due to the size of the neural net. ...
Yeah, dusk, dawn, night, day all play a role. NNs can be very stupid.
Some thoughts:
  1. Can they make a cloud API fast enough that the size of the local NN is less of an issue? A big NN in the cloud.
  2. Can we create specialized hardware that is slow and fat (big NN), that is economical, and that can handle this type of situation? Flash drives come in terabyte sizes. Could we build specialized hardware with a multi-billion-parameter NN using NAND flash tech?
  3. Would be nice to know the details of the "height from video pixels" algo and why it didn't work here.
  4. Tesla should have an algorithm running that says "there is something up ahead, I don't know what it is, so disengage FSD" (a sketch of this idea follows this list). If Tesla were able to reliably calculate the distance of objects this would be simple; there are lots of different technologies for doing this easily. Google Pixel cameras have been producing depth maps for some time.
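A minimal sketch of what item 4 might look like, assuming a per-pixel depth map and a list of recognized objects are available. Every name and threshold here is hypothetical:

```python
# Hypothetical watchdog for item 4: disengage when something sits in the
# planned path closer than a clearance margin and no recognized object
# accounts for it. All names and thresholds are invented.
import numpy as np
from dataclasses import dataclass

@dataclass
class Detection:
    distance_m: float  # distance reported by the object NN

def watchdog(depth_map, path_mask, detections, min_clear_m=20.0):
    nearest = float(depth_map[path_mask].min())  # closest return on our path
    explained = any(abs(d.distance_m - nearest) < 2.0 for d in detections)
    if nearest < min_clear_m and not explained:
        return "disengage"  # unexplained obstacle ahead
    return "continue"

depth = np.full((120, 160), 60.0)            # mostly open road
depth[60:80, 70:90] = 8.0                    # something 8 m ahead
path = np.zeros_like(depth, dtype=bool)
path[40:120, 60:100] = True                  # pixels on the planned path
print(watchdog(depth, path, detections=[]))  # -> "disengage"
```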
Are you suggesting the trillion samples that Tesla has collected don't amount to much? Need 10x as much to capture common barricades?

Sounds like the "height from video pixels" algo isn't working out so well.
Quote: FSD 10 predicts height from video pixels directly, without needing to classify groups of pixels into objects. In principle, even if a UFO crashed on the road right in front of you, it would still avoid the debris. Some work still needed to tune sensitivity.
I don't think this works very well for the specific type of object described earlier in the conversation ("low-occlusion gates", or in more common terms things like chain-link fences or metal gates). I think even a dedicated NN may have a hard time recognizing that, and the voxel-based approach works far better for solid objects (like typical barriers and pillars).
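A toy illustration of why sparse geometry is hard for a voxel/occupancy approach: a solid wall fills its voxels with returns, while a chain-link gate may leave too few hits per voxel to cross the occupancy threshold. Every number here is invented:

```python
# Toy occupancy grid: count depth returns per voxel and call a voxel
# occupied only above a hit threshold. A dense wall easily clears the
# threshold; sparse chain-link returns mostly do not. Numbers invented.
import numpy as np

def occupied_voxels(points, voxel_size=0.2, min_hits=5):
    """points: (N, 3) array of 3D returns in metres."""
    keys, counts = np.unique(np.floor(points / voxel_size).astype(int),
                             axis=0, return_counts=True)
    return {tuple(k) for k, c in zip(keys, counts) if c >= min_hits}

rng = np.random.default_rng(0)
wall = rng.uniform([0, 0, 0], [0.2, 2, 1], size=(2000, 3))   # dense returns
fence = rng.uniform([0, 0, 0], [0.2, 2, 1], size=(40, 3))    # sparse returns
print(len(occupied_voxels(wall)), "voxels vs", len(occupied_voxels(fence)))
```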
Yeah, dusk, dawn, night, day all play a role. NNs can be very stupid.
Some thoughts:
  1. Can they make a cloud API fast enough that the size of the local NN is less of an issue? A big NN in the cloud.
  2. Can we create specialized hardware that is slow and fat (big NN), that is economical, and that can handle this type of situation? Flash drives come in terabyte sizes. Could we build specialized hardware with a multi-billion-parameter NN using NAND flash tech?
  3. Would be nice to know the details of the "height from video pixels" algo and why it didn't work here.
  4. Tesla should have an algorithm running that says "there is something up ahead, I don't know what it is, so disengage FSD." If Tesla were able to reliably calculate the distance of objects this would be simple; there are lots of different technologies for doing this easily. Google Pixel cameras have been producing depth maps for some time.
I doubt you could get the latency, and no way could you get the reliability, over cellular.
No idea, but the NN chips keep getting bigger. HW3 is ancient now.
I don't think we know if they're actually using the "Vidar" NN. I think not; probably too many false positives.
One thing about FSD Beta is that it’s optimized for making YouTube videos. People want to see videos of it handling difficult situations with zero interventions or entertaining failures. No one wants to watch videos of the car stopping and saying “I don’t know what to do.”
Yeah, dusk, dawn, night, day all play a role. NNs can be very stupid.
Some thoughts:
  1. Can they make a cloud API fast enough that the size of the local NN is less of an issue? A big NN in the cloud.

No. Latency, even if you had perfect connectivity (which nobody does), would be fatally bad.

  2. Can we create specialized hardware that is slow and fat (big NN), that is economical, and that can handle this type of situation? Flash drives come in terabyte sizes. Could we build specialized hardware with a multi-billion-parameter NN using NAND flash tech?

....what?

Flash drives are storage, not compute.

They need more compute to run larger NNs.

That's what HW3 was: it runs significantly larger NNs than HW2.x was capable of.

HW4 will be another jump in this capability. ENOUGH of a jump? Nobody knows. Including Tesla.

Until you solve the problem, nobody knows how much is "enough."

  If Tesla were able to reliably calculate the distance of objects this would be simple.

It can, and does.

How do you think TACC knows to adjust speeds for cars ahead of you? By knowing their distance and the relative speeds of both vehicles.
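That distance-plus-relative-speed logic is classic adaptive cruise control. A minimal constant-time-gap sketch (the gains and the 1.5 s gap are invented, not Tesla's):

```python
# Minimal constant-time-gap cruise controller: acceleration command from
# distance and relative speed. Gains and the 1.5 s gap are invented.
def acc_command(ego_speed, lead_distance, lead_speed,
                time_gap=1.5, kp_gap=0.4, kp_speed=0.8):
    """Speeds in m/s, distance in m; returns acceleration in m/s^2."""
    desired_gap = time_gap * ego_speed       # keep ~1.5 s behind the lead
    gap_error = lead_distance - desired_gap
    closing_speed = lead_speed - ego_speed   # negative when catching up
    return kp_gap * gap_error + kp_speed * closing_speed

# Closing on a slower car 35 m ahead -> firm braking (negative) command.
print(acc_command(ego_speed=30.0, lead_distance=35.0, lead_speed=27.0))
```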
No. Latency, even if you had perfect connectivity (which nobody does), would be fatally bad.

Let's see your calculations related to this specific incident of not seeing the barricade in a slow-moving car.

....what?

Flash drives are storage, not compute.
(moderator edit) See embedded DRAM or controllers on NAND flash.

It can, and does.
In the context of the situation we are discussing, not stopping for the barricade, your statement is wrong.
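Since the post asks for calculations, here is a rough back-of-envelope for how far a slow-moving car travels during one cloud round trip. All speeds and latencies here are assumptions, not measurements:

```python
# How far the car travels while waiting for a cloud round trip.
# All speeds and latencies are assumptions, not measurements.
def metres_during_latency(speed_mph, latency_ms):
    return speed_mph * 0.447 * latency_ms / 1000.0  # mph -> m/s, ms -> s

for latency_ms in (100, 500, 2000):  # decent LTE, congested LTE, dead zone
    print(f"{latency_ms} ms -> {metres_during_latency(10, latency_ms):.2f} m")
```

At 10 mph even half a second only costs a couple of metres; the counterargument below is about the tail of that latency distribution and dead zones, not the average.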
Yeah, dusk, dawn, night, day all play a role. NNs can be very stupid.
Some thoughts:
  1. Can they make a cloud API fast enough that the size of the local NN is less of an issue? A big NN in the cloud.
  2. Can we create specialized hardware that is slow and fat (big NN), that is economical, and that can handle this type of situation? Flash drives come in terabyte sizes. Could we build specialized hardware with a multi-billion-parameter NN using NAND flash tech?
  3. Would be nice to know the details of the "height from video pixels" algo and why it didn't work here.
  4. Tesla should have an algorithm running that says "there is something up ahead, I don't know what it is, so disengage FSD." If Tesla were able to reliably calculate the distance of objects this would be simple; there are lots of different technologies for doing this easily. Google Pixel cameras have been producing depth maps for some time.
1. No, since the cloud has indeterminate latency (even if the average were good enough, and it isn't, the 95th percentile would be WAY off). And the bandwidth would be massive. Do you know how much data the car is processing every second?
2. We already have specialized hardware in HW3, and no, it doesn't scale like flash etc. Nor can it, since the topologies are vastly different.
3. Tesla rarely if ever talks about its algorithms, for obvious reasons.
4. Depth maps use time of flight or focus pixels, which only work at short distances and/or take too long.

You can't just wave your hands and say "the car needs to do X, it's easy, I read about it on the Interweb." The car needs to handle things in hard real time, with predictable, highly reliable results. Sure, Google has depth maps, which work most of the time on close-in subjects. But if they fail to work (and they do), all that happens is you get a fuzzy photo. So what? If the car's depth maps fail, you get a crash.
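For scale on the bandwidth point, a rough estimate of the raw camera data rate a cloud NN would need streamed to it. The camera count, resolution, frame rate, and bit depth below are approximate HW3 figures and may be off:

```python
# Rough raw data rate for streaming all cameras to a cloud NN.
# Camera count, resolution, frame rate, and bit depth are approximate
# HW3 figures; treat every number as an assumption.
cameras = 8
width, height = 1280, 960
fps = 36
bytes_per_pixel = 1.5  # ~12-bit raw

bits_per_second = cameras * width * height * fps * bytes_per_pixel * 8
print(f"{bits_per_second / 1e9:.1f} Gbit/s raw")  # ~4 Gbit/s vs ~0.05 Gbit/s LTE uplink
```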
Let's see your calculations related to this specific incident of not seeing the barricade in a slow-moving car.

(moderator edit)

(moderator edit) See embedded DRAM or controllers on NAND flash.
(moderator edit) None of them run the type of code needed here. It's like saying your fancy coffee maker has a processor, so if you buy enough coffee makers you can run Cyberpunk on them. Your claims make no sense at all and demonstrate a vast lack of understanding of how anything involved works.

In-memory processing DOES have some uses for NN training, but that's not done in the car anyway, and Tesla has vastly better resources available (with more on the way via Dojo) for that task.



In the context of the situation we are discussing, not stopping for the barricade, your statement is wrong.

Except, it's not.

The problem wasn't knowing the distance to the object.
1. No, since the cloud has indeterminate latency (even if the average were good enough, and it isn't, the 95th percentile would be WAY off). And the bandwidth would be massive. Do you know how much data the car is processing every second?
Do you know the context of a slow-moving car moving towards a barricade?

2. We already have specialized hardware in HW3, and no, it doesn't scale like flash etc. Nor can it, since the topologies are vastly different.
(moderator edit)

3. Tesla rarely if ever talks about its algorithms, for obvious reasons.
They are actually more open about their algorithms than anyone else.

4. Depth maps use time of flight or focus pixels, which only work at short distances and/or take too long.

You can't just wave your hands and say "the car needs to do X, it's easy, I read about it on the Interweb." The car needs to handle things in hard real time, with predictable, highly reliable results. Sure, Google has depth maps, which work most of the time on close-in subjects. But if they fail to work (and they do), all that happens is you get a fuzzy photo. So what? If the car's depth maps fail, you get a crash.
No, you have multiple algorithms running; if they don't agree, then you stop and figure out which is the most reliable. There is no single algo that Tesla is using that is reliable. By your logic we will forever be crashing.
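A minimal sketch of that "multiple algorithms must agree" idea: fuse independent distance estimates and fall back to caution when they diverge. The estimator list and thresholds are hypothetical:

```python
# Sketch of "multiple algorithms must agree": fuse independent distance
# estimates and fall back to caution when they diverge. The estimator
# list and thresholds are hypothetical.
def fused_distance(estimates_m, max_spread_m=3.0):
    """estimates_m: distances from independent methods, e.g. monocular
    depth, the object NN, and motion parallax. Returns (metres, trusted)."""
    spread = max(estimates_m) - min(estimates_m)
    if spread > max_spread_m:
        return min(estimates_m), False  # disagreement: assume the worst case
    return sum(estimates_m) / len(estimates_m), True

print(fused_distance([42.0, 40.5, 41.2]))  # agreement -> trust the mean
print(fused_distance([42.0, 12.0, 41.2]))  # disagreement -> slow down / stop
```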
Let's see your calculations related to this specific incident of not seeing the barricade in a slow-moving car.


Your ignorance is showing. See embedded DRAM or controllers on NAND flash.


In the context of the situation we are discussing, not stopping for the barricade, your statement is wrong.
As far as using cloud NNs, the delta between the car's bandwidth/latency needs for cloud NNs and what's actually available to the car is so big it's not worth doing the math. Seriously.

Not sure about ignorance here, but why do you think you can extrapolate from DRAM or embedded flash controllers to NNs? Do you understand the architecture of NNs? Or even DRAM?