The system definitely didn't roll every stop, probably because something leading up to the non-rolled stops reduced the system's confidence in whatever criteria Tesla used to initiate a roll. But I think the feature was there to improve the experience: by rolling through stops where it safely could, the car avoided the excessive hesitation, and users would perceive that as better than hesitating at every single stop.
I believe a lack of motion on either end can be a problem because of the way the system's algorithms use images to gauge position, motion, speed, direction, etc. The car needs to determine whether any nearby objects are headed towards it and whether it's safe to proceed, and the car's own motion can be correlated with the visual data to more quickly determine the motion of everything around it. If the car is sitting still, it doesn't have that reference point and has to process more visual data before it's confident enough to hit the accelerator.
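To make that concrete, here's a toy numpy sketch (made-up numbers, not Tesla's actual pipeline) of why ego-motion works as a reference point: if you can predict the pixel motion your own movement should cause, whatever is left over in the observed motion belongs to the object itself.

```python
import numpy as np

# Observed pixel motion of a tracked object between two frames
# (hypothetical values, measured from the images).
observed_flow = np.array([4.0, 0.0])      # px/frame

# Pixel motion we'd expect at that spot from the car's own movement
# alone, predicted from wheel odometry / IMU plus an estimated depth.
ego_induced_flow = np.array([3.0, 0.0])   # px/frame

# The residual is motion attributable to the object itself.
object_flow = observed_flow - ego_induced_flow
is_moving = np.linalg.norm(object_flow) > 0.5  # threshold in px/frame

print(object_flow, is_moving)  # [1. 0.] True -> it's moving, not just parallax
```

With the car at a standstill there's no ego-induced flow to correlate against, so (if this speculation is right) the system has to accumulate more frames before it can be confident about what's moving and how fast.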
This paper describes something that sounds a lot like what Tesla would be using. On page 2, where it discusses the KITTI MOD dataset and its use of trained networks and labelling for object detection, I think it's also describing the use of odometry and GPS/IMU (inertial) data as part of an inexpensive way of calculating motion by accounting for the motion of the camera itself.
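For reference, the ego-induced flow that the sketch above subtracts can be computed from exactly those odometry/IMU signals plus a depth estimate. This is the classic instantaneous rigid-motion flow model from the computer-vision literature, written out as a sketch; it's the textbook equation, not code lifted from the paper:

```python
import numpy as np

def ego_induced_flow(x, y, Z, t, w):
    """Optical flow at normalized image coordinates (x, y) with scene
    depth Z, caused purely by camera translation t = (tx, ty, tz) and
    small rotation w = (wx, wy, wz), e.g. taken from odometry and the
    IMU. Standard Longuet-Higgins & Prazdny motion-field equations."""
    tx, ty, tz = t
    wx, wy, wz = w
    u = (tz * x - tx) / Z + (x * y * wx - (1 + x**2) * wy + y * wz)
    v = (tz * y - ty) / Z + ((1 + y**2) * wx - x * y * wy - x * wz)
    return np.array([u, v])

# Example: camera driving straight forward at 1 unit/frame, no rotation.
print(ego_induced_flow(x=0.2, y=0.0, Z=10.0,
                       t=(0.0, 0.0, 1.0), w=(0.0, 0.0, 0.0)))
# -> [0.02 0.  ]  (static points drift outward from the focus of expansion)
```

Note the 1/Z in the translational term: a nearby static object produces much larger ego-induced flow than a distant one, which is why this inexpensive approach also needs a rough depth estimate, and why it yields nothing useful when t and w are both zero, i.e. when the car is stopped.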
And this isn't to say that autonomous systems can't detect moving objects while their own vehicle is at a standstill; it's that having another reference point could accelerate the pace of decision-making.
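A toy way to see that "faster decision" effect (purely illustrative numbers, standard inverse-variance fusion, nothing from any real stack): an extra independent cue contributes information up front, so the estimate crosses a confidence threshold in fewer frames.

```python
import math

def frames_to_confidence(frame_sigma, target_sigma, prior_sigma=None):
    """Count frames until the fused estimate's standard deviation drops
    below target_sigma. Each frame contributes 1/frame_sigma^2 of
    information; an optional extra cue (e.g. an ego-motion-derived
    prior) contributes 1/prior_sigma^2 up front. All numbers made up."""
    info = 0.0 if prior_sigma is None else 1.0 / prior_sigma**2
    frames = 0
    while info == 0.0 or 1.0 / math.sqrt(info) > target_sigma:
        info += 1.0 / frame_sigma**2
        frames += 1
    return frames

print(frames_to_confidence(1.0, 0.3))                   # vision alone: 12
print(frames_to_confidence(1.0, 0.3, prior_sigma=0.5))  # with extra cue: 8
```

Same camera, same per-frame noise; the extra reference point just means fewer frames of sitting there before the car is confident enough to move.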
This is all just speculation, though; I'm not an expert in this tech, and there are currently very few people in the world who are.
It could even be that the volume of training data in an area contributed to whether or not a stop would be rolled, which kinda puts a new spin on the "California roll".