Wiki Consumer AV - Status Tracking Thread

This is my main concern as well. ME seems to be betting that they can pull off a low cost L4 without it. And you might be able to do a very poor L4 without heavy ML but all the evidence suggests you need SOTA NN and heavy ML in prediction and planning if you want to achieve high performance/reliability in L4. I do hope ME will change their mind on that before it is too late.



I am curious, how many compute TOPS do you think Waymo uses? Over or under 1000? My guess is over 1000, probably closer to 2000.
Seven years ago they were using 100 TOPS.
Today they are definitely in the hundreds of TOPS, and maybe up to 1,000 TOPS.
But I don't see them being much over 1k, and no one knows for sure.

 
  • Like
Reactions: diplomat33


Thanks for the info. The reason I guessed 1,000+ TOPS is that I know Waymo has a lot of sensor data from cameras, lidar and radar, as well as a lot of cutting-edge NNs for perception, prediction and planning. I figured all that probably requires a lot of compute, although I know Waymo has been working to reduce the compute of their lidar perception. At ECCV 2022, Anguelov presented their work on Range Sparse Net, a new lidar model whose compute does not scale with range and is therefore less compute-intensive.
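
As a rough illustration of why sparsity helps (a toy Python sketch with made-up numbers, not Waymo's actual model): a dense voxel grid makes you touch every cell, while a sparse approach only touches the cells that actually contain lidar points.

```python
import numpy as np

# Toy comparison: dense voxel grid vs. sparse (occupied-only) processing.
# All numbers are made up for illustration; this is not Waymo's model.
grid = (512, 512, 40)                    # hypothetical voxel grid (x, y, z)
points = np.random.rand(150_000, 3)      # fake lidar returns in [0, 1)^3

# Voxelize: find which cells actually contain at least one point.
occupied = np.unique((points * np.array(grid)).astype(np.int64), axis=0)

dense_cells = np.prod(grid)              # cells a dense pass would sweep over
sparse_cells = len(occupied)             # cells a sparse pass actually visits
print(f"dense: {dense_cells:,} cells, sparse: {sparse_cells:,} cells "
      f"(~{dense_cells / sparse_cells:.0f}x fewer)")
```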

 
  • Like
Reactions: Bladerskb
This is my main concern as well. ME seems to be betting that they can pull off a low cost L4 without it. And you might be able to do a very poor L4 without heavy ML but all the evidence suggests you need SOTA NN and heavy ML in prediction and planning if you want to achieve high performance/reliability in L4. I do hope ME will change their mind on that before it is too late.



I am curious, how many compute TOPS do you think Waymo uses? Over or under 1000? My guess is over 1000, probably closer to 2000.
The majority of compute is sensor processing. The more sensing, the more compute. Higher-res sensors, even more compute. It really doesn't make any sense to compare them, in my opinion.
 

I was really just curious what Bladerskb thought about how many TOPS Waymo uses since I have not heard much discussion on that.

I agree we should not compare ME to Waymo since they have very different approaches and business models. ME's business model is selling ADAS to OEMs, with the hope of eventually providing L3 and L4 products to OEMs as a way of making more money. Since consumer cars have very narrow profit margins, cost is essential. So consumer cars need a low-cost, low-compute solution. You cannot put the expensive sensor suite of a Waymo on a consumer car. Hence, it makes sense for ME to focus on providing as much autonomy as possible with the smallest sensor suite and lowest compute possible. And incidentally, this is why Tesla is also trying to "solve FSD" with a vision-only approach. Waymo does not have these same constraints since they are doing robotaxis only, so they can put more sensors and more compute on their cars.

To your point, yes, if you have more sensors and higher-res sensors, processing the data will require more compute. But I think it also really depends on the software. I am not an expert, but I imagine that some software approaches are more compute-intensive than others. So, for example, you might have a vision NN that is not very efficient and requires more compute, or another vision NN that is more efficient and requires less compute, on basically the same sensor.
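
As a back-of-the-envelope example (the layer sizes are my own made-up numbers, not any vendor's network), here is how two different convolution designs compare on the exact same input:

```python
# Rough multiply-accumulate (MAC) count for one conv layer on the same input.
# Layer sizes are made up; real networks stack many such layers.
H, W = 960, 540             # hypothetical feature-map resolution
c_in, c_out, k = 64, 64, 3  # channels in/out, kernel size

standard_macs = H * W * c_in * c_out * k * k
# Depthwise-separable: a per-channel kxk conv followed by a 1x1 pointwise conv.
separable_macs = H * W * c_in * k * k + H * W * c_in * c_out

print(f"standard conv:  {standard_macs / 1e9:.2f} GMACs per frame")
print(f"separable conv: {separable_macs / 1e9:.2f} GMACs per frame "
      f"({standard_macs / separable_macs:.1f}x less)")
```

Same sensor, same resolution, roughly 8x fewer multiply-accumulates just from the architecture choice.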

And I don't think we should ignore the compute requirements of the prediction and planning stacks either. ME has talked about how there are "brute force" approaches to behavior prediction where the NN models every possible path that every object in your scene might take. These models are very compute-intensive because the number of possible futures grows very quickly. Or you might do behavior prediction by only modeling "reasonable futures," which requires less compute since the number of futures is much lower. ME proposes a different model that does not predict exact paths at all but instead only predicts the intent of other vehicles and uses their RSS safety model to ensure a safe path. This model requires even less compute since it does not need to model paths for objects. My point being that different approaches to behavior prediction can require more or less compute. The same is true for driving policy or planning: an approach where you use 300,000 lines of code for your driving policy will have different compute requirements than an approach that uses a NN for driving policy.
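
To put toy numbers on that blow-up (all figures invented, nothing Mobileye or Waymo has published), compare enumerating every path combination versus predicting one intent per agent:

```python
# Toy count of how many futures a "brute force" predictor would have to score
# versus an intent-only predictor. All numbers are invented for illustration.
agents = 10          # vehicles/pedestrians in the scene
maneuvers = 5        # candidate maneuvers per agent per planning step
horizon = 4          # planning steps ahead

brute_force_futures = (maneuvers ** horizon) ** agents  # every path combination
intent_only_futures = maneuvers * agents                # one intent guess per agent

print(f"brute force: {brute_force_futures:.3e} futures to score")
print(f"intent only: {intent_only_futures} futures to score")
```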
 
Xpeng's XNGP expands to cover all roads in China:

Xpeng (NYSE: XPEV) today announced that its XNGP (Xpeng Navigation Guided Pilot) feature is now available on all roads in China, after expanding the coverage of the ADAS (Advanced Driver Assistance System) to more than 200 cities early last month.

"Unlimited XNGP, available all over the country, on any road," Xpeng said on Weibo today.

The first users with smart driving experience already have access to the XNGP with expanded coverage, and the feature will be pushed out to more users, Xpeng said.

XNGP is Xpeng's ADAS similar to Tesla's (NASDAQ: TSLA) FSD (Full Self-Driving).

The electric vehicle (EV) maker's goal is for the XNGP to provide driver assistance in all scenarios, including highways, city roads, internal campus roads, and parking lots.

The feature was initially available only on highways, then known as Highway NGP. Xpeng then gradually extended its coverage to urban areas, known as City NGP.

In September 2022, Xpeng began opening up the XNGP feature in Guangzhou on a pilot basis. By June 2023, the feature's coverage in urban areas had expanded to five cities: Guangzhou, Shenzhen, Foshan, Shanghai and Beijing.

Xpeng announced the expansion of XNGP coverage to 52 cities on December 29, 2023, and increased that number to 243 on January 2 this year.

On January 30, He Xiaopeng, Xpeng's chairman and CEO, said the company will launch point-to-point XNGP capabilities in China in 2024, covering internal roads and parking lot scenarios in addition to highways and city roads.

 
According to this, the disengagement rate of XNGP on highways is 1 per 2,000 km (1,242 miles).

Currently the system covers 569,000 kilometers of roads, and actual vehicle mileage has exceeded 3.7 million kilometers. XPeng says the accident rate is only a tenth of the level of human driving. With the highway element, the number of takeovers is down to 1 in every 2,000 km. Furthermore, the average speed for vehicles using the highway part of NGP has increased by 13%. The user penetration rate for the highway NGP element is 94.7%, while for the city portion it is 83.2%, with 40% of overall mileage being undertaken by the system.

 
  • Funny
Reactions: AlanSubie4Life
First promo video of Mobileye SuperVision on city streets
 
  • Like
Reactions: diplomat33

Thank you so much for sharing. Nice to finally see some footage of the in-production SV on city streets.

One thing that really stands out to me is how busy and congested Chinese cities are. That's a lot of vehicles and pedestrians often getting in the path of the vehicle, and it looked like there was a little rain too. It is not the easiest driving environment. Obviously, it is a promo video, but SV does seem to do pretty well navigating around everything. I did notice, though, that at the 1:55 mark the car stops awkwardly as it almost hits a motorcyclist.

Do you know the sensors on the car? I am just curious if the car has a forward radar or lidar helping SV avoid collisions or if it is pure vision. Thanks.

I look forward to seeing more videos of city driving as SV rolls out to more cars.
 
Do you know the sensors on the car? I am just curious if the car has a forward radar or lidar helping SV avoid collisions or if it is pure vision. Thanks.
11 cameras and one forward radar.

The majority of compute is sensor processing. The more sensing, the more compute. Higher-res sensors, even more compute. It really doesn't make any sense to compare them, in my opinion.
It actually might be the smallest part of it. What really matters is the resolution the model runs at, the complexity of the model's architecture, and the FPS you run the model at. Mobileye, for example, uses older architectures like CNNs that are a lot more efficient, cropped-down resolutions, very low FPS, and also traditional computer vision, which is very efficient.

However, running SOTA complex architecture models like Waymo does, at full resolution and very high FPS, requires very high compute.
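
A crude way to see how those factors multiply (all numbers invented; the per-frame costs, FPS, and the 11- and 29-camera stream counts are just illustrative assumptions): total compute is roughly per-frame model cost times frame rate times number of streams.

```python
# Back-of-the-envelope compute budget: per-frame cost x FPS x camera streams.
# All figures are hypothetical, chosen only to show how the factors multiply.
def tflops(gflops_per_frame: float, fps: float, streams: int) -> float:
    return gflops_per_frame * fps * streams / 1000.0  # GFLOP/s -> TFLOP/s

lean = tflops(gflops_per_frame=20, fps=10, streams=11)    # cropped, low-FPS CNN-style
heavy = tflops(gflops_per_frame=400, fps=30, streams=29)  # full-res, high-FPS model

print(f"lean pipeline:  ~{lean:.1f} TFLOP/s")
print(f"heavy pipeline: ~{heavy:.1f} TFLOP/s (~{heavy / lean:.0f}x more)")
```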
 
One thing that really stands out to me is how busy and congested Chinese cities are. That's a lot of vehicles and pedestrians often getting in the path of the vehicle, and it looked like there was a little rain too. It is not the easiest driving environment. Obviously, it is a promo video, but SV does seem to do pretty well navigating around everything. I did notice, though, that at the 1:55 mark the car stops awkwardly as it almost hits a motorcyclist.
Are you serious? There is more congestion with traffic and pedestrians (and construction) in a suburb of Seattle like Bellevue than what you see in that video!

But, yes, thanks for the video @Bladerskb. What I'd really like to see are independent, longer videos by consumers, like we see here with FSD.
 
It actually might be the smallest part of it. What really matters is the resolution the model runs at, the complexity of the model's architecture, and the FPS you run the model at. Mobileye, for example, uses older architectures like CNNs that are a lot more efficient, cropped-down resolutions, very low FPS, and also traditional computer vision, which is very efficient.

However, running SOTA complex architecture models like Waymo does, at full resolution and very high FPS, requires very high compute.
Sorry, but I do not understand technical terms like "SOTA complex architecture" ;) "running SOTA complex architecture models like Waymo does" means literally nothing. Is it considered to be "SOTA" because it has many sensors at high res, or SOTA because it's complex? Or is it SOTA and complex because it isn't optimised, or what's the idea?

I stand by my previous statement, i.e., many high-res inputs into the nets are what drives compute cost more than anything else. Processing 25+ high-res feeds is very costly even if it's just merging them and getting the BEV, AFAIK.
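
Quick arithmetic on that point (camera count, resolution and FPS are assumed round numbers, not a published spec): the raw pixel rate of that many feeds is enormous before any network even runs.

```python
# Raw input bandwidth of many camera feeds, before any network runs.
# Camera count, resolution and FPS are assumed values for illustration only.
cameras = 25
megapixels_per_camera = 2.0
fps = 30

pixels_per_second = cameras * megapixels_per_camera * 1e6 * fps
print(f"~{pixels_per_second / 1e9:.1f} gigapixels/s just to ingest and fuse")
```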
 
You are both saying the same things but in different ways.

Sensor processing is the most computationally expensive part, but with high-resolution sensors, the type of architecture and software you need to process the volume of data coming in in real time also plays a role in how much compute you require.

You can always throw more compute at a problem, but then you're consuming more power, which is limited on an autonomous vehicle. Everyone is finding ways to optimize their architecture to do more with less compute.
 
  • Like
Reactions: diplomat33
Zeekr rolled out NZP (Mobileye SuperVision) in 36 cities for both the Zeekr 001 and 009:


And here is a video of Zeekr's LCC+ feature showcasing its ability to avoid road obstacles. My understanding is that LCC+ is an enhanced lane-centering feature that can also steer around obstacles in the lane.

Is this highway only or urban too, i.e. is this like FSD or NOA?

Point-to-point automated navigation & adaptive cruise control