Referencing this post:
Seeing the world in autopilot, part deux
I’m posting this here because it’s mostly about neural networks. I’ll add a link over there pointing to this post.
“Seeing the world in autopilot, part deux” observations:
First, some background:
Based on taking apart a set of AP2 binaries earlier this year, I came up with a general structure for how data flows through the system, and I was able to identify some intermediate processing products from the nature of the data structures, how they were being used, and the names of variables.
This general structure has a group of advanced CNNs processing the output from each of the seven navigation cameras (excluding the backup camera), followed by a second set of networks that I called post-processing networks. The camera networks identify and localize several classes of objects in the field of view of all of the cameras; among the types of objects detected appear to be vehicles, traffic signals, and lane markings. The second layer of networks generates outputs focused on identifying and understanding the shape of lanes, assigning vehicles to lanes, predicting whether other vehicles are moving, stopped, or parked, and identifying physical landmarks (mainly poles and maybe the corners of buildings).
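To make that structure concrete, here’s a minimal sketch of the two-stage data flow as I understand it. Every name in it (run_camera_network, CameraDetections, and so on) is my own invention for illustration - nothing here is recovered from the binaries.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class CameraDetections:
    """Stage 1 output: one frame, one camera, no cross-frame identity."""
    camera_id: str                       # e.g. "main", "narrow", "pillar_left"
    vehicles: List[dict] = field(default_factory=list)
    traffic_signals: List[dict] = field(default_factory=list)
    lane_markings: List[dict] = field(default_factory=list)

def run_camera_network(camera_id: str, frame) -> CameraDetections:
    """Stage 1: a per-camera CNN that sees only a single frame."""
    return CameraDetections(camera_id)   # stub

def run_post_processing(stage1: List[CameraDetections], prev_state: dict) -> dict:
    """Stage 2: the only place where outputs from all cameras (and prior
    state) can be combined into lane shape, lane assignments, motion
    states, and landmarks."""
    return {"lanes": [], "objects": [], "landmarks": []}   # stub

def tick(frames: Dict[str, object], prev_state: dict) -> dict:
    stage1 = [run_camera_network(cid, f) for cid, f in frames.items()]
    return run_post_processing(stage1, prev_state)
```

The important property is that stage 1 is stateless and per-camera, while stage 2 is where multi-camera and multi-frame context can live.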
All of this was from looking at the code. I had started that analysis hoping to find a way to understand the system’s capabilities, but without being able to see the code in action, what I could come up with was pretty limited.
Now we see some beautiful output from the efforts of @verygreen and @DamianXVI, which extends my earlier observations by giving us examples of what comes out of the network.
So I’m going to interpret what’s happening in @verygreen’s video here in light of what I’ve seen in the code.
First - the annotations here seem to be output from the second layer of networks, not from the primary camera networks. There are various ways to show this, but a simple one is this: vehicle IDs in the video persist from frame to frame. It’s not possible for the camera networks to do that, because they only process one frame at a time and have no knowledge of other frames or of any machine state beyond a single frame of camera output. Downstream networks have to correlate the output from successive frames of camera network output in order to allow the ID to persist.
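To see why persistent IDs imply a stateful downstream layer, here’s a toy tracker sketch: per-frame detections only acquire stable IDs by being matched against the previous frame’s tracks. I’m not claiming this greedy IoU matcher is what AP does - only that *some* cross-frame state like this has to live downstream of the camera networks.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def assign_ids(prev_tracks, detections, next_id, thresh=0.3):
    """Greedy cross-frame matching: each new detection inherits the ID of
    the best-overlapping track from the previous frame, else a fresh ID.
    The per-frame camera networks cannot do this -- only a stateful
    downstream consumer of their output can."""
    tracks, used = {}, set()
    for det in detections:
        best_id, best = None, thresh
        for tid, box in prev_tracks.items():
            score = iou(det, box)
            if tid not in used and score > best:
                best_id, best = tid, score
        if best_id is None:
            best_id, next_id = next_id, next_id + 1
        used.add(best_id)
        tracks[best_id] = det
    return tracks, next_id
```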
The major categories of annotation are: predicted lane boundaries and boundary type, vehicles, trucks, pedestrians, motorcycles, bicycles, and drivable space.
Lane boundaries predict the left and right edges of the acceptable driving lane that the AP vehicle currently occupies, with color coding indicating whether a lane boundary separates oncoming or same-direction traffic. At junctions where a turn might optionally occur, multiple lane boundaries are identified, representing the edges of the driving lane for each of the optional paths; I never saw more than two options at once. Options are shown for the occupied lane, but otherwise the only lane boundaries predicted are the far boundaries of adjacent lanes.
It’s notable that the lane boundaries aren’t just identifying pavement markings or curbs. The boundaries are present even when pavement markings are absent, and both the left and right lane boundaries appear even when only one of them is easily identifiable from what’s in the camera view. Aside from lane markings, AP seems to use the presence and state of other vehicles, and the presence of obstacles, to predict lane boundaries.
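One plausible way to represent what we’re seeing, sketched as a data structure - the names and fields here are my guesses from the video, not anything from the code:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple

class Separates(Enum):
    ONCOMING = "oncoming"          # boundary against opposing traffic
    SAME_DIRECTION = "same"        # boundary against a same-direction lane

@dataclass
class LaneBoundary:
    polyline: List[Tuple[float, float]]   # points in the vehicle frame, meters
    separates: Separates
    # Present even with no visible paint: inferred from markings, curbs,
    # other vehicles, or obstacles.

@dataclass
class EgoLaneOption:
    """One candidate driving lane at a junction; at most two options
    seem to appear at once in the video."""
    left: LaneBoundary
    right: LaneBoundary
```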
Objects all seem to carry a confidence value in percent, which probably represents the system’s confidence in its object class prediction. Identified objects optionally carry several attributes, including a lane assignment, a motion state (moving, stopped, or stationary), distance, and relative velocity. Objects also seem to be labeled as to whether AP has a corresponding radar return associated with them. Notably, lane assignment includes distinguishing whether a lane is a parking lane (off-road) or not - a distinction that requires a lot of context. Lane assignment also seems to have a lot of states, covering not just whether the object is in your lane, to the left, or to the right, but also whether it’s straddling lanes, plus something labeled IMM - which might mean “immediately adjacent”.
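Pulling those attributes together, here’s one hypothetical way to organize a tracked object as shown in the overlay. The enum values - especially the reading of IMM - are guesses:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class MotionState(Enum):
    MOVING = "moving"
    STOPPED = "stopped"          # halted but expected to move (e.g. at a light)
    STATIONARY = "stationary"    # parked / not expected to move

class LaneAssignment(Enum):
    IN_LANE = "in_lane"
    LEFT = "left"
    RIGHT = "right"
    STRADDLING = "straddling"
    PARKING = "parking"          # off-road parking lane
    IMM = "imm"                  # "immediately adjacent"? -- a guess

@dataclass
class TrackedObject:
    object_id: int
    object_class: str                    # vehicle, truck, pedestrian, ...
    class_confidence_pct: float
    lane: Optional[LaneAssignment] = None
    motion: Optional[MotionState] = None
    distance_m: Optional[float] = None
    rel_velocity_mps: Optional[float] = None
    has_radar_return: bool = False
```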
Drivable space represents unobstructed area that the AP vehicle has physical access to, bounded by edge markings that indicate the kind of obstacle limiting the drivable space at that section of the edge. Vehicles and pedestrians are obstacles in a different class from other sorts of barriers. While traffic cones, bollards, and fencing aren’t called out as discretely identified objects, it’s clear that AP is seeing them and recognizing them functionally, because it adjusts the drivable space according to their presence.
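That suggests a representation along these lines - again hypothetical, with boundary segments typed by the class of obstacle limiting them:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple

class EdgeType(Enum):
    VEHICLE = "vehicle"
    PEDESTRIAN = "pedestrian"
    BARRIER = "barrier"    # cones, bollards, fencing, curbs, ...

@dataclass
class DrivableEdge:
    polyline: List[Tuple[float, float]]   # boundary segment, vehicle frame, meters
    limited_by: EdgeType

# Drivable space: a closed region whose boundary is a list of typed edges.
DrivableSpace = List[DrivableEdge]
```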
And finally we have that beautiful, beautiful path prediction arrow in orange. I love this element because it probably gives us the most abstracted and subtle insight into what AP is ‘thinking’ as it moves around the world. I’ll make the strong claim that the path prediction is the output of a neural network, because it behaves probabilistically, seems to be affected by the full context of a scene, lacks hysteresis, and presents a continuous selection space; a human-written heuristic is unlikely to show this behavior. From the shape of the path prediction we can see that AP is making a nuanced prediction of the road shape extending out at least a couple of hundred meters and - and this is really amazing to me - is able to usefully predict the rising/falling shape of the road ahead and predict the probable path of road sections *which it cannot see*. So it estimates that a road around a blind curve will continue curving, and it estimates the direction a road takes over a blind rise even when the road shape leading up to the rise is pretty complicated. This latter is probably the critical capability that solved the disastrous ‘cresting hill’ failure, which finally went away when 2018.10.4 shipped.
So some interesting things we know from this:
- AP2 estimates distance to objects and their relative velocity from vision alone, even when no radar return is available for an object; radar thus looks like a fully redundant backup for the vision capabilities. Whether that comes from stereo vision processing or from scale estimates is still open (see the scale-based sketch after this list), but there’s clearly a useful degree of distance estimation being extracted even for items with no radar return signal.
- The FOV of the camera seems to be quite a bit wider than the FOV of the radar - objects lose their radar signal at the edges of the camera FOV. This seems to be a view from the main camera; if the wide-angle camera has comparable recognition accuracy, then the useful FOV of the vision system is going to be enormously larger than that of the radar.
- AP2 identifies vehicles even when they are substantially occluded, and at considerable distances.
- Strange backgrounds seem to confound identification as much as occlusion does - maybe more. Cyclists seen against a background of traffic have much lower confidence than cyclists with a background of pavement or buildings.
- Radar still seems to be bad at seeing cross traffic - though it seems like vision makes up for that pretty well.
- At a minimum we can see that radar and vision are being fused, since single objects are given both radar and vision attributes (see the association sketch after this list). Is forward sonar also being fused? There doesn’t seem to be any good evidence of that here.
- This video doesn’t rule out the possibility that high-definition maps are being used for driving, but it also doesn’t present any evidence to support it.
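On the scale-estimate possibility from the first bullet above: with a pinhole camera model, any object class of roughly known physical size yields a usable monocular range estimate. A minimal sketch - the focal length and vehicle height below are illustrative numbers, not AP2’s:

```python
def distance_from_scale(focal_px: float, real_height_m: float,
                        bbox_height_px: float) -> float:
    """Pinhole-camera range estimate: an object of known physical size
    subtends fewer pixels the farther away it is, so d = f * H / h."""
    return focal_px * real_height_m / bbox_height_px

# A car roughly 1.5 m tall spanning 25 px, with a focal length of ~1400 px:
print(distance_from_scale(1400.0, 1.5, 25.0))   # -> 84.0 (meters)
```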
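And on fusion: the simplest way a single object could end up carrying both radar and vision attributes is a nearest-neighbor association step like the toy version below. The gate value is arbitrary, and this is not a claim about Tesla’s actual fusion logic:

```python
import math

def associate_radar(vision_objects, radar_returns, gate_m=3.0):
    """Attach each radar return to the closest vision object within a
    distance gate; unmatched objects keep vision-only estimates.
    vision_objects: {object_id: (x, y)} positions in the vehicle frame.
    radar_returns:  [{"position": (x, y), ...}, ...]"""
    fused = {obj_id: None for obj_id in vision_objects}
    for ret in radar_returns:
        best_id, best_d = None, gate_m
        for obj_id, pos in vision_objects.items():
            d = math.dist(pos, ret["position"])
            if fused[obj_id] is None and d < best_d:
                best_id, best_d = obj_id, d
        if best_id is not None:
            fused[best_id] = ret
    return fused
```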
I’d be really interested to know if there’s any evidence of navigation data being fed into AP2 for use in the driving task. And it would also be interesting to know if there’s any evidence of AP gathering data to be used to create HD maps. A lot of groups seem to be relying on HD maps as a critical part of their driver assistance systems (comma.ai, Cruise, Waymo), but so far I haven’t seen any evidence that Tesla is actually doing that - aside from some claims from a few years ago.