
Seeing the world in autopilot, part deux

Actually, the 3D bounding box might be the most important thing in object detection. A 2D bounding box (which includes shapes) doesn't include important information that you need, for example the precise orientation of a car, with which you can predict its actions. This is
That's debatable. Cars have this tendency to back out too, and then have you never seen cars moving sideways? ;)

vital in a dense environment, during parking, in a parking lot. Is this car trying to pull out or not? What precise direction is it pulling toward? With just the information needed to produce a 2D bounding box, you don't see it. But it is reflected in the information needed to produce a 3D bounding box: is this the left side of a car, might the door pop out, is the door currently open, is this car trying to complete a turn,
You might have noticed different colors at the edges of the driving space around objects. They seem to mean different things (one color per value, but we do not know what the values mean; perhaps some mean "this is a car front/side/whatever"?).

is this car coming at me or... the list goes on. As Mobileye's Amnon says, the 2D bounding box is irrelevant.
The 2D bounding box might just be a debug aid for a different team, for all we know, and not used by the actual driving algorithm, which works on the interpreted data in the drivable space. That's why I say the 3D bounding box, while neat eye candy, might have the same data expressed in other ways.
There's radar return to tell you things and then of course there's relative speed.

While it's impressive that distances and speed are done visually, Mobileye has been doing this since EyeQ3. There's no tangible difference between, at least, the firmware you have unraveled and what EyeQ3 has been doing since its 2014 production date.
There is a huge difference on hilly roads. HUGE. Also I have no way of extracting data like this from eyeq3 ;)

I'm just laying out that the lack of 3D boxes and the instability of the detection show their weakness compared to Mobileye.
Well, instability is definitely there, no argument about it.

A 3D bounding box, whether you are using lidar or a camera, gives you the precise orientation of an object, from which you can deduce its moment-to-moment decisions. This is very important in dense traffic or a parking lot.
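To make the orientation point concrete: here is a minimal sketch in plain Python (nothing from the actual firmware; every name below is invented) of the one extra field a 3D box carries, the yaw angle, which is what lets you guess where a car is about to go.

from dataclasses import dataclass
import math

@dataclass
class Box2D:
    # image-space box: position and extent only, no heading
    x: float
    y: float
    w: float
    h: float

@dataclass
class Box3D:
    # position in the ego frame plus physical size and heading (yaw)
    x: float
    y: float
    z: float
    length: float
    width: float
    height: float
    yaw: float  # radians; 0 = pointing the same way we are

def predicted_heading(box):
    # unit vector of where the car's nose points -- only possible with yaw
    return (math.cos(box.yaw), math.sin(box.yaw))

# a car angled out of a parking spot: the 3D box says it is about to cut
# across our path, the 2D box only says "object over there"
pulling_out = Box3D(x=8.0, y=2.0, z=0.0, length=4.5, width=1.8, height=1.5,
                    yaw=math.radians(150))
print(predicted_heading(pulling_out))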

Imagine if you painted the inside of the 2D boxes with a solid color and then looked at it from a human perspective. Notice how you lose all indication of what's going on in the scene: you know only that there are objects in front of you, with no idea of the scenario, orientation, or state they are in.

oh wait, you already did that.

Like I said in the comment, more information is available, I just did not want to update the old tool. There's still speed and direction and lane and overlap information and all that.

Notice how you don't know whether the car in front of you is turning or where it is turning to, or even whether it is going in the same direction as you. You're basically driving blind.
The relative speed tells us whether it's going in the same direction or not. And there are other ways to tell if the car is turning, like whether it stays within the turning lane's bounding lines?
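For what it's worth, the "same direction or not" part is just arithmetic on the radar's relative speed; a toy sketch under the assumption that positive absolute speed means travelling our way (all names are mine, not from the firmware):

def target_motion(ego_speed_mps, relative_speed_mps):
    # absolute speed of the other car = our speed + its radar relative speed;
    # positive means it moves the same way we do, negative means oncoming/backing
    absolute = ego_speed_mps + relative_speed_mps
    if abs(absolute) < 0.5:          # below ~0.5 m/s call it stationary
        return "stationary"
    return "same direction" if absolute > 0 else "oncoming (or reversing)"

# e.g. we do 25 m/s, radar says the car ahead closes at -3 m/s -> it does 22 m/s
print(target_motion(25.0, -3.0))   # same direction
print(target_motion(25.0, -25.0))  # stationary
print(target_motion(25.0, -40.0))  # oncoming (or reversing)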

This is even more evident when you are driving on dense surface streets. A car could be in front of you at an intersection, in the adjacent turning lane, preparing to turn into the parallel lane next to you. With just 2D bounding box info, you won't know if that car is driving forward parallel to you, driving forward adjacent to you, or making a turn.
Just forget about the 2D bounding box, ok? Let's assume all the really useful info is reflected via the driving space border around the obstacle.

Ask yourself: can you drive in dense traffic with the above view superimposed on your view? The answer is absolutely not. You need to know the orientation of cars.
Nah, not really. I need to know the direction of their movement, though.

If only driving were just about not hitting things, and not about prediction: knowing and planning around the actions, or perceived actions, of other drivers.
There's little indication any such planning is actually going on in this mostly current firmware, though, so this point is kind of moot.
 
Does this port exist on the non-dev versions? Do you think plugging in an SSD will be necessary for the V9 dash-cam feature, or is there sufficient high-speed storage for there to be a usable dash cam without one? (I think a dash cam with a rolling 30-second buffer that is saved if airbags deploy is better than nothing, but I'd probably keep my existing dash cam if that is the limit of the functionality.)
Yeah, the port is standard on all APE units, but currently it's not used on non-dev units. I suspect they won't open the port for the V9 dashcam but rather let you grab short clips (think 10 seconds?) on an "as needed" basis; at least all the current developments point in this direction.
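For the rolling-buffer idea mentioned above, the usual pattern is just a ring buffer that gets dumped when an event fires; a minimal sketch of that pattern (not anything seen in the firmware, and the trigger and filenames here are purely illustrative):

from collections import deque

FPS = 30
BUFFER_SECONDS = 30  # the rolling window described above

class RollingDashcam:
    def __init__(self):
        # keep only the last N frames; older frames fall off automatically
        self.frames = deque(maxlen=FPS * BUFFER_SECONDS)

    def on_frame(self, jpeg_bytes):
        self.frames.append(jpeg_bytes)

    def on_event(self, path):
        # dump the current window when a trigger (e.g. airbag deployment) fires
        with open(path, "wb") as f:
            for frame in self.frames:
                f.write(frame)

cam = RollingDashcam()
cam.on_frame(b"\xff\xd8...frame...")   # would be called 30x/s by the capture loop
cam.on_event("/tmp/incident_clip.mjpeg")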

How much of the metadata does this represent? I.e. was there enough as-yet-uninterpreted metadata for street signs and road markings such as give-way lines and pedestrian crossings to be recognised, or is this something that will likely need the AP3 hardware?
In the part we are looking at there's no space for additional details like that. But it's clear some road markings have other effects; e.g. an arrow on the pavement usually triggers the left-bending lines, as if for a turning lane, even if there are no visible lane markings at all.
 
3:55 – even on "highways" the gore area is apparently considered driveable? While technically true, it's probably not something that should be attempted.
4:08 – gore zone surrounded by bollards is correctly showing up as undriveable.


I suspect, based on watching some of this, that the green is used for "irrespective of legality, I could drive here", and a purple line seems to be somehow involved in rating the level of confidence in the border of a legal operation area, beyond which it may be technically safe to pull over in an emergency, even if you can't drive there normally:


[Three screenshots attached (2018-09-25)]
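If that reading is right, the per-edge values might decode to something like the toy mapping below. To be clear, the class names, numeric codes, and the policy function are all invented; only the green "could drive here regardless of legality" vs. emergency-border distinction is suggested by the captures.

from enum import Enum

class SpaceClass(Enum):
    # invented labels for what the colored edge values might mean;
    # the real encoding in the firmware is unknown
    LEGAL_DRIVABLE = 1       # normal operating area
    PHYSICALLY_DRIVABLE = 2  # green: "irrespective of legality, I could drive here"
    EMERGENCY_ONLY = 3       # purple-ish border: pull over in an emergency only
    BLOCKED = 4              # bollards, barriers, solid obstacles

def may_enter(space, emergency=False):
    # toy policy sketch: stay in the legal area normally, fall back to
    # emergency-only space when needed, never plan into blocked space
    if space is SpaceClass.LEGAL_DRIVABLE:
        return True
    if space is SpaceClass.EMERGENCY_ONLY:
        return emergency
    return False  # PHYSICALLY_DRIVABLE (gore area) and BLOCKED are avoided

print(may_enter(SpaceClass.PHYSICALLY_DRIVABLE))             # False
print(may_enter(SpaceClass.EMERGENCY_ONLY, emergency=True))  # True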
 
@verygreen can you comment on whether this type of AP vision is constantly running in our cars (i.e. "shadow mode"), or does it only run while AP is actually engaged? (I realize you probably manually engaged this recording to extract your videos.)

I'm curious how legitimate "shadow mode" is in reality, and how Tesla is evaluating AP against actual driver inputs and using that to improve driving.
 
Yeah, the detections are run 100% of the time. Apparently some of this state can be matched against the "triggers" to cause snapshots to be generated and sent back to Tesla. That said, I do not think it actually compares this model to the actual driving input at this time.
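Conceptually the trigger mechanism would look something like the sketch below: conditions evaluated against the live detection state, with a snapshot queued for upload when one matches. The condition names and fields are made up; only the general shape is implied by what's in the firmware.

# hypothetical sketch of trigger-driven snapshotting; all names are invented
triggers = {
    "hard_braking": lambda s: s["decel_mps2"] > 6.0,
    "aeb_event":    lambda s: s["aeb_fired"],
    "lane_discard": lambda s: s["lane_confidence"] < 0.2,
}

upload_queue = []

def evaluate(state):
    # run on every detection frame; queue a snapshot when any trigger matches
    for name, condition in triggers.items():
        if condition(state):
            upload_queue.append({"trigger": name, "state": state})

evaluate({"decel_mps2": 7.5, "aeb_fired": False, "lane_confidence": 0.9})
print(upload_queue[0]["trigger"])  # hard_braking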
 
Purely anecdotal, but I figure this is as interesting a place to note it as anywhere else: The other day I was driving and saw that the light at the upcoming intersection was yellow. Out of an abundance of caution (I didn't see it change and didn't know how long it had been yellow) I went to max-effort braking and stopped a few feet over the line (while the light was still yellow, hilariously. Overcaution isn't always good!), then backed up to be a good boy and be where I should have been. That night, I had significant upload activity to the Hermes-snapshot host at Tesla. Coincidence? Maybe.

Also find it interesting that they named a host after the patron god of the roads and travelers :p
 
I've had my S for 8 months now and have seen AP2.5 improve over that time. In the video ... it's interesting how a car ahead changing from "my-lane" to "overlap left" to "Imm-Left" is shown. The latter transition happens when the car enters the "Imm Left" lane.

Autopilot's reaction is always to wait until the lane is clear before accelerating ... almost always doing it "late" ... I'd probably accelerate some amount of time before a car ahead changes position from "overlap left" to "imm left" ... that is, when the intent is clearly known and a path to pass is clearly opening up.
 

Yeah, I found the transitional state being recognized to be interesting as well. I think it'll be a while before overlap is accepted as a state where the car can accelerate, but still cool to see.
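The overlay labels suggest a small set of lane-assignment states; here is a hedged sketch of how the "when to accelerate" question maps onto them. The state names come from the overlay, but the policy function is just my illustration of the conservative vs. more human-like behavior discussed above.

from enum import Enum, auto

class LaneAssignment(Enum):
    MY_LANE = auto()
    OVERLAP_LEFT = auto()    # straddling the line, cutting out
    IMM_LEFT = auto()        # fully in the immediately-left lane
    OVERLAP_RIGHT = auto()
    IMM_RIGHT = auto()

def clear_to_accelerate(lead_assignment, trust_overlap=False):
    # today's (conservative) behavior waits for IMM_LEFT/IMM_RIGHT; a human
    # would often start accelerating already at OVERLAP_*, once intent is clear
    if lead_assignment in (LaneAssignment.IMM_LEFT, LaneAssignment.IMM_RIGHT):
        return True
    if trust_overlap and lead_assignment in (LaneAssignment.OVERLAP_LEFT,
                                             LaneAssignment.OVERLAP_RIGHT):
        return True
    return False

print(clear_to_accelerate(LaneAssignment.OVERLAP_LEFT))                      # False
print(clear_to_accelerate(LaneAssignment.OVERLAP_LEFT, trust_overlap=True))  # True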
 
and the instability of the detection shows their weakness compared to mobileye.

Well, instability is definitely there, no argument about it.

We don't know if the data that has been tapped into is the final stage of perception processing. Or do we?

This data could be unstable because it is the raw output from the detection / NN system before it is processed by tracking / prediction algorithms. That's what it looks like to me.
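If that hypothesis is right, a downstream tracking stage would be doing something like the (grossly simplified) smoothing below before anything reaches control; this is only meant to illustrate why raw per-frame output can look jittery while the signal actually consumed is stable.

class SmoothedTrack:
    # tiny exponential smoother standing in for a real tracking/prediction
    # stage (a Kalman filter or similar); alpha is picked arbitrarily
    def __init__(self, alpha=0.2):
        self.alpha = alpha
        self.estimate = None

    def update(self, raw_distance_m):
        if self.estimate is None:
            self.estimate = raw_distance_m
        else:
            self.estimate += self.alpha * (raw_distance_m - self.estimate)
        return self.estimate

track = SmoothedTrack()
# jittery per-frame NN distances vs. what a tracker would hand to control
for raw in [30.1, 28.7, 31.0, 29.5, 30.4]:
    print(round(track.update(raw), 2))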
 
Did you see the IC display? It's similarly chaotic, so this does appear to be the final version. The data is clearly already processed by other systems - see the predicted path thingie - which must be dependent on the earlier detected lanes.

IC display?

I realize the predicted path must be dependent on the lanes, but is it not possible that the data stream it saves includes a mix of outputs: some early/raw, some final processed data, and some in between?


It seems unlikely to me that Tesla AP does steering control based on the lanes and projected path in this video; it's way too shaky. I feel there has to be more processing of this data downstream, even before the lower-level control software. So I'd guess the same is true for the object data.
 
It's kind of interesting to me that they have all this information about drivable areas, yet the current system won't let you change lanes into or out of an HOV lane in Washington, because we use a solid white line and AP treats those as the edge of the road, no matter what. It's amazing that they can tell the difference between dashed and solid lines, but disappointing that they can't tell the difference between a white and a yellow line, or that an HOV lane is an acceptable place to go since it's fully bounded by two lane lines. Pre-2018.10 I could go in/out of HOV lanes, but it knew the outer side of the HOV lane was the edge of the road, so this is a regression.
 
Perhaps they are using data from the map to ensure that they do not allow crossing into lanes like this.

If they are to enable automatic lane changing... they will need to implement this capability anyway, likely by using map data.
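A hedged sketch of what such a map-assisted check could look like: combine the camera's line classification with map metadata, so a solid white HOV boundary is treated as crossable only where the map says crossing is allowed. All field names here are hypothetical.

from dataclasses import dataclass

@dataclass
class LineObservation:
    color: str    # "white" or "yellow", from vision
    style: str    # "dashed" or "solid", from vision

@dataclass
class MapHint:
    is_hov_boundary: bool
    crossing_allowed_here: bool   # e.g. WA allows HOV entry/exit across solid white

def lane_change_permitted(line, hint):
    if line.style == "dashed":
        return True
    # solid line: only cross when the map says this stretch allows it
    return hint.is_hov_boundary and hint.crossing_allowed_here

solid_white = LineObservation(color="white", style="solid")
print(lane_change_permitted(solid_white, MapHint(True, True)))    # True
print(lane_change_permitted(solid_white, MapHint(False, False)))  # False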