I think radar (and lidar, similarly) is good at gauging the distance (and speed) of an object. Its response depends on how strongly the material absorbs that particular wavelength. At radar frequencies, what I've heard is that metal objects present a much bigger reflection than organic objects. I suspect this is why overhead freeway signs are such a problem: they throw out a large false signal. The AI algorithm may be ignoring large radar returns from stationary objects to avoid false alarms from the signs.
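As a rough illustration of that trade-off, here's a minimal Python sketch of one plausible stationary-return filter. Everything in it (names, thresholds, numbers) is hypothetical; it's not anything Tesla has published, just the shape of the logic I'm describing:

```python
from dataclasses import dataclass

@dataclass
class RadarReturn:
    range_m: float          # distance to the reflector
    doppler_mps: float      # closing speed measured via Doppler shift
    rcs_dbsm: float         # radar cross-section (metal signs read high)

def is_stationary(ret: RadarReturn, ego_speed_mps: float,
                  tol_mps: float = 1.0) -> bool:
    """A reflector closing at exactly our own speed is not moving."""
    return abs(ret.doppler_mps - ego_speed_mps) < tol_mps

def filter_returns(returns: list[RadarReturn], ego_speed_mps: float):
    # Dropping every stationary return suppresses overhead-sign false
    # alarms, but it also drops a stopped car in the lane -- which is
    # exactly the failure mode discussed above.
    return [r for r in returns if not is_stationary(r, ego_speed_mps)]

if __name__ == "__main__":
    ego = 26.8  # ~60 mph in m/s
    sign = RadarReturn(range_m=130, doppler_mps=26.8, rcs_dbsm=20)  # overhead sign
    car = RadarReturn(range_m=80, doppler_mps=15.0, rcs_dbsm=10)    # slower car ahead
    print(filter_returns([sign, car], ego))  # only the moving car survives
```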
Radar also doesn't have the spatial resolution of a camera, since it works at much longer wavelengths than light. When the car is traveling at freeway speed, telling whether a large stationary metal object ahead is at ground level (a car) or 20 ft overhead (a sign) is not trivial. 60 mph is 88 ft/sec, so something 5 sec ahead is ~440 ft away; a 20 ft difference in vertical placement at that distance could be hard for a radar that lacks the necessary angular resolution. Lidar, on the other hand, with its higher resolution, would do much better in this specific case.
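The geometry is easy to check. A quick worked calculation (the comment about typical radar hardware is my assumption, not a spec for any particular unit):

```python
import math

# How small an elevation angle separates a sign 20 ft overhead
# from a car at ground level, 440 ft out?
range_ft = 88 * 5        # 60 mph = 88 ft/s, looking 5 s ahead -> 440 ft
height_ft = 20           # overhead sign clearance

elevation_deg = math.degrees(math.atan2(height_ft, range_ft))
print(f"{elevation_deg:.1f} deg")  # ~2.6 degrees

# Many automotive radars of that era had coarse or no elevation
# resolution (my assumption about typical hardware), so a ~2.6 degree
# separation can sit below what the sensor can distinguish.
```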
IMO what is needed is for camera image processing to catch up and be able to recognize stationary objects. This is also difficult. With one camera (mono vision), you can't tell the distance (and therefore the speed) of an unknown object from a single frame using geometry alone. By comparing frames at different times, the change in the object's apparent size can tell you the closing speed. But a lot of things factor in, such as lateral motion, and a big difficulty is that the image processing needs to recognize an object in one frame and "remember" the same object in the next.
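To show what that frame-to-frame comparison can give you, here's a hedged sketch of the classic time-to-contact estimate: the expansion rate of the object's image yields the time to impact even while the absolute distance stays unknown. Names and numbers are illustrative only:

```python
def time_to_contact(size_prev_px: float, size_curr_px: float,
                    dt_s: float) -> float:
    """Classic tau estimate: tau = s / (ds/dt), with image size s in pixels.

    Assumes the object was matched ("remembered") across both frames and
    is roughly on our path; lateral motion breaks the approximation.
    """
    growth_rate = (size_curr_px - size_prev_px) / dt_s
    if growth_rate <= 0:
        return float("inf")  # not expanding -> not closing on it
    return size_curr_px / growth_rate

# A stopped car whose bounding box grows from 40 px to 42 px over 0.1 s:
print(time_to_contact(40.0, 42.0, 0.1))  # ~2.1 s to contact
```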
With two cameras and stereo vision, distance can be computed directly, and therefore speed. The difficulty here is the computing power needed to do it in real time. I trust Tesla when they say they can get there with the hardware they have now, but it will require a lot of development work.
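A minimal sketch of the stereo geometry, assuming a simple pinhole model; the focal length and baseline below are made-up illustrative values, not Tesla camera specs:

```python
def depth_from_disparity(disparity_px: float, focal_px: float,
                         baseline_m: float) -> float:
    """Pinhole stereo model: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("object must appear shifted between the two views")
    return focal_px * baseline_m / disparity_px

# Hypothetical rig: 1000 px focal length, cameras 30 cm apart.
# A 3 px disparity then puts the object at 100 m.
print(depth_from_disparity(3.0, 1000.0, 0.30))  # 100.0 m
```

The formula itself is cheap; the expensive part is finding that 3 px match for every pixel of every frame at highway speed (dense stereo matching), which is where the real-time computation cost comes from.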
What about three radars, and training on the image directly ahead in the car's area of interest? I know: that's a hardware change, and the computation issue gets amplified. But if they have "trained" the old system to look ahead of the most immediate car, why not go further?
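For what extra radar units could buy geometrically, here's a toy trilateration sketch in a 2D plan view: three sensors at known mounting points each measure range to the same target, and the intersection pins down its position. The mounting positions and target are invented for illustration, and this says nothing about how Tesla would actually fuse or train on such data:

```python
import numpy as np

def trilaterate(sensors: np.ndarray, ranges: np.ndarray) -> np.ndarray:
    """Solve for the 2D target position by linearizing the range equations."""
    p0, r0 = sensors[0], ranges[0]
    # Subtracting the first range equation from the others removes the
    # quadratic terms, leaving a linear system A @ [x, y] = b.
    A = 2 * (sensors[1:] - p0)
    b = (r0**2 - ranges[1:]**2
         + np.sum(sensors[1:]**2, axis=1) - np.sum(p0**2))
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Two bumper-corner sensors plus one set back ~1 m (non-colinear on
# purpose: colinear sensors leave an ambiguity). Units are meters.
sensors = np.array([[-0.8, 0.0], [0.8, 0.0], [0.0, -1.0]])
target = np.array([-2.0, 100.0])   # 100 m ahead, 2 m left
ranges = np.linalg.norm(sensors - target, axis=1)
print(trilaterate(sensors, ranges))  # ~[-2.0, 100.0]
```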