Back in the day some programs used self-modifying code (it's been too long so I don't recall for certain, but I believe the game Doom was one example), but they didn't learn either. It was used because dynamically modifying the code paths improved execution efficiency. That came at the cost of complexity, but that is a longer story. In any case, these days CPUs and operating systems will generally disallow modification of code in memory, though there are details and caveats beyond the scope here. My point is that -- in practice -- programs use static code that does not modify itself. Execution changes because parameters change, not because the code itself changes. And, yes, in some cases it can be difficult to separate the two.
But Tesla's code for autopilot is not self-modifying. It is a static binary that only changes when a new version is compiled. That new version takes into account improved data sets ("training"), and users only see an actual change when they download a new version.
So why do people see things that they interpret as the car "learning"? Well, aside from the widespread misinformation about neural networks, there is the fact of how neural networks work. When I write a program it is, essentially, deterministic. If I'm writing a game I can use pseudo-random numbers to vary things up, but the code itself will always give the same results for the same inputs.
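To make "same inputs, same outputs" concrete, here is a minimal sketch in Python (my own toy example, nothing to do with Tesla's code): even the "randomness" in a conventional program comes from a pseudo-random generator, so fixing the seed reproduces the run exactly.

    import random

    def simulate_game_round(seed: int) -> list[int]:
        # A conventional program: any "randomness" comes from a
        # pseudo-random generator, so the same seed (input) always
        # produces the same sequence of dice rolls (output).
        rng = random.Random(seed)
        return [rng.randint(1, 6) for _ in range(5)]

    # Same input -> same output, every time the program runs.
    assert simulate_game_round(42) == simulate_game_round(42)
    print(simulate_game_round(42))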
But neural networks don't quite work like that. For starters, you are incredibly unlikely to actually be feeding in the same inputs. Driving down the same road you will have slightly different speeds, the sun will be in a different position, there will be different cars, numbers of cars, pedestrians, etc., all conspiring to give a different input from the cameras despite it being the same road.
Which takes us to the core of neural network programming: rather than being deterministic, it is probabilistic. That's a high-level view, not an implementation-level injection of randomness. The idea is that, given a particular situation (say, identifying a set of pixels as representing a car), you want the neural net to arrive at the correct result 99.999% of the time. You take your data set and you "train" your model. I think it would be more accurate to call it "iterative compilation" (though I didn't make the terms up), in part because the process is repeated until a goal is reached. In unbiased (or "physical") rendering this is done until you are satisfied with the image quality. For neural networks you do it until you hit the accuracy level you want. (Of course, sometimes your model and data set aren't up to achieving it, so you have to collect more data or improve your labeling or improve your model or ..., but that's a detail.)
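To make the "iterate until you hit the target" loop concrete, here is a deliberately tiny sketch: a perceptron on made-up one-dimensional data, nothing like a real vision network. The data, the model, and the 95% target are all invented for illustration; the point is only the shape of the loop, which keeps going until the accuracy goal (or a cap on iterations) is reached.

    import random
    random.seed(0)

    # Toy labeled data set: the "correct answer" is 1 when x > 0.5.
    xs = [random.random() for _ in range(200)]
    ys = [1 if x > 0.5 else 0 for x in xs]

    w, b = 0.0, 0.0            # a tiny one-weight "model"
    lr = 0.1                   # learning rate
    target_accuracy = 0.95     # train until we hit this goal

    def predict(x):
        return 1 if w * x + b > 0 else 0

    def accuracy():
        return sum(predict(x) == y for x, y in zip(xs, ys)) / len(xs)

    epoch = 0
    while accuracy() < target_accuracy and epoch < 1000:
        for x, y in zip(xs, ys):
            error = y - predict(x)     # perceptron-style update
            w += lr * error * x
            b += lr * error
        epoch += 1

    print(f"stopped after {epoch} epochs at accuracy {accuracy():.2%}")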
The point of all of this is that, at the end of the day, there is a probability that the car will perform as desired with autopilot in any given situation. If that goes from 60% to 90% you are likely to notice an improvement (about 75% fewer undesired outcomes), but you might not. If it goes from 99% to 99.1% it has improved, but you may very well not notice (about 10% fewer undesired outcomes, and they were already scarce to start with). And, if you experience something in the remaining 0.9%, you will have a worse outcome.
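For what it's worth, the percentages in parentheses come from comparing failure rates rather than success rates; a trivial calculation (mine, not anything from Tesla) shows why a 99% to 99.1% change is nearly invisible:

    def relative_failure_reduction(old_success, new_success):
        # Improvement measured as the fraction of undesired outcomes eliminated.
        old_fail, new_fail = 1 - old_success, 1 - new_success
        return (old_fail - new_fail) / old_fail

    print(relative_failure_reduction(0.60, 0.90))    # 0.75  -> ~75% fewer failures
    print(relative_failure_reduction(0.99, 0.991))   # ~0.10 -> ~10% fewer failures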
To get an independent, black-box sense of how autopilot performs, you not only need to use the same stretch of road, but you need to ensure the same time of day, the same environment (other cars, etc.), and that you engage it with your car in the same position and speed, and then run it thousands of times to get a statistically meaningful sample. At the end of the day Tesla is the only one who could even remotely do this kind of real-world validation of their accuracy targets.
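To give a feel for why "thousands of times" is not an exaggeration, here is a back-of-the-envelope calculation (my own, using the standard normal approximation for a binomial proportion) of how precisely n identical runs pin down a success rate near 99%:

    import math

    def margin_of_error(p, n, z=1.96):
        # Approximate 95% confidence half-width for an observed success
        # rate p over n independent, identical runs (normal approximation).
        return z * math.sqrt(p * (1 - p) / n)

    for n in (100, 1_000, 10_000):
        print(n, round(margin_of_error(0.99, n), 4))
    # 100    -> ~0.0195 (about +/- 2%: can't tell 99% from 97% or 100%)
    # 1000   -> ~0.0062
    # 10000  -> ~0.0020 (only now can you resolve 99.0% vs 99.5%)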