If Tesla is training both towards good examples and away from bad ones, it should be relatively straightforward to label portions of a clip as "this portion is good" and "this portion is bad" and train accordingly. Even if training uses only the good parts, the video leading up to a good section would still be part of the history context, and the earlier bad controls would simply be skipped as training targets. If the desire is to train end-to-end on human driving, it would seem natural not to train control outputs on the portions where Autopilot is active.
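To make the labeling idea concrete, here is a rough sketch of per-frame weighting. Everything in it is my own assumption for illustration (the `Frame` fields, the label strings, the squared-error loss); nothing here is known to match Tesla's actual pipeline. Every frame stays in the clip as context, but only good human-driven frames act as imitation targets.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    # All fields are hypothetical, chosen only for this illustration.
    human_steering: float    # recorded control value for this frame
    autopilot_active: bool   # True if the car, not the human, was driving
    label: str               # "good", "bad", or "unlabeled"


def control_weight(frame: Frame) -> float:
    """+1.0 = train towards, -1.0 = train away from, 0.0 = context only."""
    if frame.autopilot_active:
        return 0.0  # not human driving, so never a control target
    if frame.label == "good":
        return 1.0
    if frame.label == "bad":
        return -1.0
    return 0.0


def imitation_loss(frames: List[Frame], predictions: List[float]) -> float:
    """Mean squared error over the good, human-driven frames only.

    Frames weighted -1.0 would need a separate "away from" loss term,
    which is not shown here.
    """
    errors = [
        (pred - frame.human_steering) ** 2
        for frame, pred in zip(frames, predictions)
        if control_weight(frame) > 0.0
    ]
    return sum(errors) / len(errors) if errors else 0.0
```

With this kind of mask, a clip that starts with bad controls and ends with a good maneuver still contributes: the early frames feed the history, and only the later frames contribute to the loss.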
Triggers can be set up to capture video of varying duration, starting some customizable time after a disengagement or other condition, so somebody disengaging 11.x ahead of needing to quickly switch multiple lanes after a turn might not need a special trigger. Sure, the timing could be off and miss the full clip of a "perfect" example, but if any portion of it is still good enough, that portion can be used for training, and hopefully, across the full fleet, there should be enough partial and full examples.
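As a back-of-the-envelope sketch of that timing point, a trigger can be thought of as producing a capture window, and whatever part of it overlaps the good maneuver is the usable portion. The function names, parameters, and numbers below are made up for illustration and are not actual trigger settings.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def clip_window(trigger_time: float, delay: float, duration: float) -> Interval:
    """Capture window that starts `delay` seconds after the trigger fires."""
    start = trigger_time + delay
    return (start, start + duration)


def usable_portion(clip: Interval, good: Interval) -> Optional[Interval]:
    """Overlap between the captured clip and the good maneuver, if any."""
    start = max(clip[0], good[0])
    end = min(clip[1], good[1])
    return (start, end) if start < end else None


# Hypothetical numbers: disengagement at t=100 s, capture starts 2 s later
# and runs for 10 s, while the good maneuver spans 105 s to 120 s.
clip = clip_window(100.0, 2.0, 10.0)
partial = usable_portion(clip, (105.0, 120.0))
```

Here the clip misses the end of the maneuver, but the overlapping part is still a partial example rather than a wasted capture.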