powertoold
Active Member
10.69.3 is the beginning of the end for LIDAR approaches!
> I would guess the Autopilot Vision team was hoping/expecting that the "simple" approach of adding more training data with "road markings" would be picked up by the general vision component which outputs a rich visual representation from the 8 camera views. I think it did somewhat work but also led to some odd behaviors when an arrow was partially or only briefly visible, as it could look like some other similar but actually different situation.

I'm hoping this will help solve what I see as my biggest source of error: FSD will sometimes move into a turn lane when no turn is imminent. If it wasn't paying attention to road markings, that would help explain the problem.
> And if you are going to need to add road markings anyway, then why not add it from the start? To me, it makes more sense to add road markings and other things to have that robust system from the start rather than have an unreliable system that eventually becomes more reliable when you finally add what you should have added sooner.

It is a matter of time. They wanted to start testing their code and refining it before it is feature complete. They haven't released it yet, so in my book they are still in the "adding from the start" phase.
Employee rollout today, should see wide rollout next week.
I like your optimism.
Definitely optimistic, but historically the public version went out a week after the employees received it.
This is where I don't quite agree with Tesla's design philosophy. They seem to start with the bare bones and then work up based on what they need. I would prefer they just start with what's needed for a robust system from the beginning. Sure, it would be great if vision-only could intuit the direction of a lane without road markings, but that is not going to happen 100% of the time. And if you are going to need to add road markings anyway, then why not add them from the start? To me, it makes more sense to add road markings and other things to have that robust system from the start, rather than have an unreliable system that eventually becomes more reliable when you finally add what you should have added sooner.
I believe the "Vector Lanes grammar" part is specifically referring to how each "word" is constructed (point indices, topology types, fork/merge references, spline coefficients), but the overall approach of treating this as a language task, with these words forming a sentence, was introduced back with 10.11. Some parts were already improved with 10.69.2's "Increased recall of forking lanes by 36% by having topological tokens participate in the attention operations of the autoregressive decoder…" That suggests they did have this in there after all, though it's unclear whether they have a grammar for intersections, as discussed at AI Day.
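For illustration only, here is a rough sketch of what a "sentence" of such lane words might look like. The field names and values are my guesses based on the AI Day description (point index, topology type, fork/merge reference, spline coefficients), not Tesla's actual schema:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class LaneToken:
    point_index: int                   # control point on a coarse spatial grid
    topology: str                      # e.g. "start", "continue", "fork", "merge"
    fork_merge_ref: Optional[int]      # which earlier token a fork/merge attaches to
    spline_coeffs: Tuple[float, ...]   # coarse geometry between control points

# A small "sentence": one lane that forks into a second lane.
# An autoregressive decoder would emit these one at a time, attending
# over all previously emitted tokens (including the topology tokens,
# per the 10.69.2 release note).
sentence = [
    LaneToken(101, "start", None, (0.0, 1.0)),
    LaneToken(154, "continue", None, (0.1, 0.9)),
    LaneToken(210, "fork", 1, (0.3, 0.7)),    # forks off the token at index 1
    LaneToken(215, "continue", None, (0.2, 0.8)),
]
```

The "grammar" then constrains which token can legally follow which, the same way a language model's decoding can be constrained to valid syntax.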
> but these lanes predictions are happening all the time well before an intersection, but it's usually not as complex with lanes just continuing straight

Yes. And all I am saying is that it is not 100% clear how they are handling intersections and whether they have rolled out all the grammar there. Maybe it all comes together with the rest of the vector lanes grammar, I am not sure. But it's a more complicated scenario than simple forking and merging, so maybe they are still relying on other methods.
> Humans don't drive like this, when you drive you don't first build a point cloud around you... It's always going to drive like a robot if they take this approach.

Humans do understand occlusions (an offshoot of occupancy), and I would argue we certainly do effectively build a point cloud of some sort. No one really knows what humans do, of course, but you have to at least find a way to emulate those basic capabilities.
> Same with occupancy networks, especially after railing against lidar, they go and build a lidar simulator using vision.

Presumably they railed against lidar because they claimed they could perform the same task as lidar equally well in most circumstances with vision, which is what they are trying to do. It is certainly difficult.
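Mechanically, "a lidar simulator using vision" is the pseudo-lidar idea: unproject a network-predicted depth map through the camera model to get a 3D point cloud. A minimal sketch, with invented intrinsics and a constant depth map standing in for a real prediction:

```python
import numpy as np

# Toy pinhole intrinsics (focal lengths and principal point are made up).
fx = fy = 500.0
cx, cy = 64.0, 48.0
depth = np.full((96, 128), 10.0)   # pretend the network says every pixel is 10 m away

v, u = np.mgrid[0:96, 0:128]       # pixel row/column coordinates
z = depth
x = (u - cx) * z / fx              # standard pinhole unprojection
y = (v - cy) * z / fy
points = np.stack([x, y, z], axis=-1).reshape(-1, 3)  # N x 3 point cloud
```

The hard part, of course, is not the unprojection but predicting depth accurately enough from cameras alone.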
> All the car needs to do is drive, just ask the network to tell you where humans would have driven, or where humans making a left turn would have driven, etc. Network should go from perception straight to plan.

You are advocating an end-to-end learning network. Various companies have commented on this - including MobilEye, who say this doesn't work.
> Building hand-selected feature detectors is always going to be a stopgap, throwaway measure to get the thing out the door. It's just going to result in tons of complexity and bugs.

Special feature detectors do add complexity, and that goes against the lesson of AlphaGo's hand-crafted heuristic features being replaced by AlphaZero's "stack more layers" approach. But doing so might have benefits other than just getting things out the door sooner. Maybe Tesla did see that their existing network architecture could learn lanes correctly in the much larger networks used by the offline autolabeler, but practically that network can't run within the time constraints needed for driving.
> Is hand-tuning a policy around gore zones and lane markers and traffic cones going to accrue to Optimus? If you solve driving in the right way, you lead straight into cooking and cleaning robots.

> Special feature detectors do add complexity, and that goes against the learnings of AlphaGo's hand-crafted heuristic features replaced with AlphaZero's "stack more layers" approach. But doing so might have benefits other than just getting things out the door sooner. Maybe Tesla did see that their existing network architecture could learn lanes correctly in their much larger networks used by the offline autolabeler, but practically that network can't run in the time constraints needed for driving.

At least Tesla's approach doesn't prevent them from going end-to-end in the future, unlike some other companies. So potentially the custom detection won't be needed in the future, but at least for now, with the existing compute budget, this is what Tesla needs to do.
> You are advocating an end-to-end learning network. Various companies have commented on this - including MobilEye, who say this doesn't work.

Of course driving is going to be end-to-end; that's how everything eventually ends up. If you want to build a detector for cars in 2022, you don't make a wheel detector, a hood detector, a door detector, and a brake light detector. That's how it worked a decade ago. Today you just ask the network to tell you where entire cars are.
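The contrast between the two eras can be sketched as an interface difference. Everything below is a canned stand-in (no real detection happens); the point is that the old style needs hand-written glue to assemble parts into cars, while the modern style asks one learned model for whole cars directly:

```python
# Circa-2012 style: separate hand-built part detectors plus brittle glue.
def wheel_detector(image):
    return [(10, 40, 8, 8), (42, 40, 8, 8)]   # (x, y, w, h) boxes

def hood_detector(image):
    return [(12, 22, 40, 10)]

def assemble_car(wheels, hoods):
    # hand-written heuristic: two wheels under a hood => one car box
    if len(wheels) >= 2 and hoods:
        x0 = min(b[0] for b in wheels + hoods)
        y0 = min(b[1] for b in hoods)
        x1 = max(b[0] + b[2] for b in wheels + hoods)
        y1 = max(b[1] + b[3] for b in wheels)
        return [(x0, y0, x1 - x0, y1 - y0)]
    return []

# Modern style: one network asked directly for whole cars
# (canned output here; a real model would be trained end to end).
def car_detector(image):
    return [(10, 22, 42, 26)]

image = None  # placeholder frame
old_style = assemble_car(wheel_detector(image), hood_detector(image))
new_style = car_detector(image)
```

Both paths produce the same box on this toy frame, but the old one carries all its fragility in `assemble_car`, which is exactly the kind of hand-tuned code the end-to-end argument wants to delete.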
Haven't you learned by now never to assume dates with FSD.

2022.36.15 Tesla Update Debuts FSD Beta 10.69.3 for Employees - TeslaNorth.com
Tesla on Monday evening released its 2022.36.15 software update, which includes the anticipated Full Self-Driving (FSD) beta version 10.69.3 for employees. According to @Teslascope, this FSD beta 10.69.3 update is being released to Tesla employees in Canada and the USA, based on vehicles in its...
teslanorth.com
Looks like 10.69.3 is utilizing the 2022.36.* stack. Good to hear, as I'm currently on 2022.28.*
> Employee rollout today, should see wide rollout next week.

> Definitely optimistic but historically the public version went out a week after the employees received it.

We know this will be a major release, so one could logically expect a longer employee testing period.
> I’m surprised they didn’t have this already - the computer already rendered the arrows on the display so clearly they were identifying them. If they took the time to identify them why not actually use the data?

I would guess the Autopilot Vision team was hoping/expecting that the "simple" approach of adding more training data with "road markings" would be picked up by the general vision component which outputs a rich visual representation from the 8 camera views. I think it did somewhat work but also led to some odd behaviors when an arrow was partially or only briefly visible, as it could look like some other similar but actually different situation.
Having an explicit module for road markings should provide crisp input signals (as well as a strong "negative" signal when there isn't actually a road marking) feeding into the language component that actually predicts lanes and their topology, resulting in the release note's "improves lane topology error at intersections by 38.9%."
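As a hedged sketch of that idea (shapes, class counts, and names are all invented, not Tesla's architecture): an explicit marking head with a dedicated "no marking" class can emit normalized per-cell probabilities, which get concatenated onto the general visual features the lane/topology predictor consumes:

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 80, 80                                # BEV grid size (invented)
visual_feats = rng.random((H, W, 128))       # general vision representation
marking_logits = rng.random((H, W, 5))       # 4 arrow classes + explicit "none"

# Softmax turns the logits into crisp, normalized per-cell signals;
# the "none" class carries the strong negative signal mentioned above.
e = np.exp(marking_logits - marking_logits.max(axis=-1, keepdims=True))
marking_probs = e / e.sum(axis=-1, keepdims=True)

# Concatenate onto the features fed to the lane/topology predictor.
lane_head_input = np.concatenate([visual_feats, marking_probs], axis=-1)
```

The win over hoping the general representation "picks up" markings is that the downstream predictor now receives an unambiguous, calibrated channel instead of having to rediscover arrows from raw features.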
I’m surprised they didn’t have this already - the computer already rendered the arrows on the display so clearly they were identifying them. If they took the time to identify them why not actually use the data?
> Everything Tesla deploys has to work in a large variety of edge cases, USA/Canada wide. Something that may seem obvious in most situations may not work in that final 1% of cases, and Tesla has to solve even the last 1% of cases before deploying this stuff.
>
> That's why even something "simple" like traffic controls / lights took a long time, because as we know, there are swinging traffic lights, lights that are broken, lights pointed in odd directions, what lights correspond to which lanes, etc. etc.

I have definitely noticed our Canadian friends tend to have more interventions, according to them, the testers, than we do.
> Of course driving is going to be end-to-end, that's how everything eventually ends up. If you want to build a detector for cars in 2022, you don't make a wheel detector, and a hood detector, and a door detector, and a brake light detector. That's how it worked a decade ago. Today you just ask the network to tell you where entire cars are.

This is like the "holy grail" of FSD.