>>Will the Australian Model 3 be capable of operating under the new FSD, i.e. only camera, no radar?<<
Technically, yes.
The human brain uses two eyes to help it assess depth or distance. It's possibly why no creatures have only one eye. The only exceptions I can think of are pirates and the Tesla camera system, where each camera does not see anything the other camera sees, so each camera is one eye. It's interesting that technology can overcome this depth-perception limitation, but maybe that's why we have phantom braking and panic braking when cars are a long way off.

I mean, it's fairly simple logic:
Humans seem to drive OK with 2 eyes and a few mirrors.
Your Tesla has 8 cameras, far better positioned than the human eye.
Doesn't the radar see under or through the car in front? Mine has reacted many times to a fast-stopping car two cars ahead that I cannot see. It seems, too, that the cameras' night performance is way down compared to their daytime performance, and way below our own eyes' night vision. I'm unconvinced the removal of radar is a positive step towards better AutoPilot performance. I suspect it's more about supply chain issues, shortages and reducing cost.
Yep, there is a good YouTube video of a Tesla emergency warning/braking when the radar picks up an incident ahead of the car in front.
I just can't see, in this example, how the camera alone at this distance would have picked up the deceleration of the vehicle ahead of the vehicle in front.
>>Only exception I can think of is pirates and the Tesla camera system, where each camera does not see anything the other camera sees<<
Other than the three cameras at the front, and the two pillar cameras that also face forward.
>>The human eye uses two eyes to help the brain assess depth or distance.<<
Wrong. Human eyes can only use binocular cues from roughly 5 cm to 5 m. This is because human eyes are only around 7 cm apart, making stereoscopic triangulation of objects more than 5 metres away imperceptible given our eyes' angular resolution. The vast majority of depth perception (especially when driving) comes from relative size and motion, which is why you can, in theory, drive just fine with one eye closed.
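To put rough numbers on this, here is a back-of-envelope sketch. The ~7 cm eye baseline is from the post above; the ~1 arcminute angular resolution figure and the standard stereoacuity approximation (depth uncertainty ≈ d²·θ/b) are my assumptions, not from the thread:

```python
import math

BASELINE_M = 0.07                       # ~7 cm between the eyes (from the post)
RESOLUTION_RAD = math.radians(1 / 60)   # ~1 arcminute; assumed, not from the post

def depth_uncertainty(distance_m: float) -> float:
    """Smallest resolvable depth difference (metres) at a given distance,
    using the small-angle stereoacuity approximation d^2 * theta / b."""
    return distance_m ** 2 * RESOLUTION_RAD / BASELINE_M

for d in (0.5, 5.0, 50.0):
    # ~1 mm at 0.5 m, ~0.10 m at 5 m, ~10 m at 50 m: binocular depth
    # cues become useless at typical highway following distances.
    print(f"at {d:5.1f} m, stereo depth resolution is ~{depth_uncertainty(d):.3f} m")
```

The ~10 m uncertainty at 50 m is why the "two eyes for distance" argument doesn't apply at driving ranges: at those distances, both humans and cameras rely on relative size and motion instead.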
>>It's why possibly no creatures only have one eye.<<
Most animals on earth have eyes that do not operate in stereoscopic mode (i.e. binocular vision), so no, stereo vision is not the reason no animals are born with fewer than two eyes. Ducks and rabbits, for example, have eyes whose fields of view do not overlap, and thus have to move their heads rapidly to determine distance via motion. Also, in the same way humans do, they have an intrinsic understanding of how large certain objects should be, and by seeing those objects at a certain apparent size in their field of view they can infer distance.
>>Only exception I can think of is pirates and the Tesla camera system, where each camera does not see anything the other camera sees, so each camera is one eye. It's interesting that technology can overcome this depth limitation, but maybe that's why we have phantom braking and panic braking when cars are a long way off.<<
Again, you are completely and utterly incorrect. The cameras do overlap quite considerably. However, that doesn't bring much useful depth information (i.e. geometric pixel triangulation) compared to a trained neural net that works in much the same way humans do. The depth is just an emergent property, with no explicit calculations done at all.
>>I just can't see, in this example, how the camera alone at this distance would have picked up the deceleration of the vehicle ahead of the vehicle in front.<<
I see many people making this incorrect assumption. The biggest thing people need to remember when evaluating a vision-based system is that the neural net input has far more data than a YouTube video.
Let me explain. The video you are watching is from a BlackVue dashcam. This is a consumer-grade camera with an RGB pixel sensor and automatic aperture control, which in essence means the video is processed so that it looks good to a human viewer. Fine details that are required for automated driving tasks are lost, as they are not visually appealing or necessary for our brains. Furthermore, the dynamic range of the sensor is limited to a large degree in order to keep the compressed H.264/H.265 video compatible with SDR playback (i.e. 100 nits of brightness).
In other words, the visual data stream in the YouTube video is missing most of the data available to the Tesla FSD computer. To elaborate: the AutoPilot cameras capture RCCB video, which essentially means capturing two colour channels in order to prioritise luminance data (i.e. changes in brightness, which reveal edges and therefore objects). Moreover, the FSD computer input is capable of 12 bits per channel, plus a dynamic-range pre-processor that conditions the sensor data to carry the highest possible signal for the neural net in situations where there is, say, bright sunlight and a shadowy overpass. To a human viewer this image would look terribly washed out and devoid of detail (as you would need to map all that information down to 8-bit RGB pixels at 100-nit SDR), but all the data necessary for perception is there in the raw stream, even though you cannot see it on a traditional monitor.
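A toy illustration of the bit-depth point: two scene luminances that a 12-bit sensor keeps distinct can collapse to the same value after naive 8-bit requantisation. The numbers are made up for illustration, and this is just the simplest possible linear mapping, not anyone's actual camera pipeline:

```python
def to_8bit(raw_12bit: int) -> int:
    """Naive linear requantisation of a 12-bit sample (0..4095) down to
    8 bits (0..255) by dropping the 4 least-significant bits."""
    return raw_12bit >> 4

# Two distinct 12-bit codes from a dark underpass (illustrative values):
shadow_a, shadow_b = 130, 140
print(to_8bit(shadow_a), to_8bit(shadow_b))  # both collapse to 8
```

Any edge between those two shadow levels survives in the 12-bit stream the neural net sees, but is gone from the 8-bit video a human watches on YouTube.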
What about occluded objects, I hear you ask? Well, this is where having multiple cameras with multiple vantage points, operating at 10x human processing speed, helps. A neural net can be trained to perform object detection and classification on partially occluded objects with super-human ability (given the right training corpus). The reason is that one of the fundamental principles of convolutional neural nets is the ability to detect fine edges and pass these features down the layers of abstraction until you end up with a higher-level label like car, dog, person, etc. If even a single frame (captured 36 times per second, i.e. every ~28 ms) from any one of the cameras (which cover different angles than your two eyeballs) yields a detection, that prediction can be added to the environmental entities at each timestamp via a function called an orthographic feature transform. The Bird's Eye View network Tesla has developed can then stitch these entities together in a 3D world to more accurately determine whether there is indeed a car in front of your lead car and, more importantly, what trajectory that car is on (including being stopped). The BEV is the missing piece that allows a radar-less sensor suite, and I believe even basic AutoPilot will use the BEV in the fullness of time.
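A minimal sketch of that stitching idea, stripped of everything learned: per-camera detections at the same timestamp are projected into one shared top-down frame, so one car glimpsed partially by two cameras becomes a single entity. The grid-snapping projection here is a stand-in I made up; the real system uses a learned orthographic feature transform, not this:

```python
from collections import defaultdict

def fuse(detections):
    """detections: list of (timestamp_ms, camera, (x_m, y_m), label).
    Returns {timestamp: [(grid_cell, label, first_camera_to_see_it)]}."""
    world = defaultdict(list)
    for t, cam, (x, y), label in detections:
        # Snap to a 1 m grid so overlapping views of one object merge.
        cell = (round(x), round(y))
        if all((c, l) != (cell, label) for c, l, _ in world[t]):
            world[t].append((cell, label, cam))
    return world

# The same partially occluded car, seen by two forward cameras
# with slightly different position estimates (illustrative numbers):
frames = [
    (0, "main",   (49.6,  0.2), "car"),
    (0, "narrow", (50.3, -0.1), "car"),
]
fused = fuse(frames)
print(len(fused[0]))  # 1 — one entity in the top-down world, not two
```

The point is only the data flow: per-camera, per-frame detections accumulate into one timestamped world model, which is what lets a single lucky frame from any camera register the car ahead of the lead car.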
Secondly, the AP suite has one long range forward facing camera. Its field of view would be similar to this:
View attachment 670419
Incidentally, this frame is captured about a second before the AP forward-collision alarm sounds. It's obvious even in this low-resolution image that there is a car with its brake lights on directly in front of the lead car. The FSD computer will see up to 250 m ahead in great detail, so things like cars going around slight bends to reveal their sides, objects visible through windshields, etc. will all bubble up to the BEV representation of the world. Compare this to the radar equivalent, where radar returns need to be heuristically filtered to determine which return is the car in front of the lead car. This is far more error-prone and temperamental. Ever heard of phantom braking? That is basically AutoPilot trying to resolve a disagreement between what the radar says and what the vision sees.
As an aside, keep in mind that the brake lights themselves are emitting electromagnetic radiation at a specific wavelength. With a training corpus annotating brake lights, a neural net can detect brake lights to a superhuman level, accurately determine relative distance and, given a second frame, estimate velocity/acceleration/trajectory. This is a huge freebie when it comes to detection, and is very close in effect to vehicle-to-vehicle communication, just done via visible-light emitters rather than radio emitters.
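The two-frame estimate mentioned above is just a finite difference: given a distance to the detected brake light in two consecutive frames, closing speed falls out directly. The frame interval assumes the ~36 fps figure quoted earlier; the distances are illustrative numbers of mine:

```python
FRAME_DT_S = 1 / 36  # ~28 ms between frames, per the 36 fps figure above

def relative_velocity(d0_m: float, d1_m: float, dt_s: float = FRAME_DT_S) -> float:
    """Closing speed in m/s; positive means the gap is shrinking."""
    return (d0_m - d1_m) / dt_s

# Lead car's lead car 60 m ahead, gap shrinking 0.5 m per frame:
print(relative_velocity(60.0, 59.5))  # 18.0 m/s closing
```

A third frame gives acceleration the same way, which is all that is needed to tell "braking hard" from "cruising" without any radar return.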
TL;DR: There is no fundamental reason Vision-Only AutoPilot could not perform as well as radar sensor fusion. In fact, I believe it will eventually perform far better. The AP cameras have a better field of view and 10x more data, and can operate at 10x the speed of a human. This, coupled with specialised neural nets, will allow AutoPilot to have super-human perception and reaction times.
One of the things I’ve wondered about in the new software is whether they are going to include object persistence in the algorithm.
>>HW3 runs a neural network<<
GrimRe - are you saying the onboard MCU is working as a neural net?
>>HW3 runs a neural network<<
Is that actually IN the MCU or OTA? I would have thought a NN would be too much for the car to handle. Presumably, too, HW3 will have to be retrofitted?
>>Does anyone have HW3 at present?<<
Not sure if you are trolling or not… but yes, every Model 3 does, as well as every Model S/X since April 2019.