I'm not sure what you think "vision" is, or what's been solved there, to think it's now just a question of intelligence. Vision is still easily fooled.
We spent some time looking at Watson, the IBM intelligent thing. They're trying to make out it has AGI characteristics, but it's more like a set of aggregated ANI services wrapped into something that's more than a single use case. Look at the Jeopardy game thing: it was trained for the show. Give the same system a Pointless question and it wouldn't have a clue where to start, nor could it naturally comprehend the rules of the show or how to answer without being trained, something a human can get from a description of the rules and 3 or 4 example questions and answers.
All existing AI approaches are at their core statistical models, with layers of convolution and filters that transform the input to give a different lens on the situation, the filters being totally deterministic. Chat bots are still pretty dumb and are narrow AI. Yes, they're better than they were, but that's because we train them to look for other things. It's like image recognition: you can train it to see a face, dog, cat, elephant, speed sign, car, etc., and the next level is the same model seeing a smiling face, happy face, angry dog, barking dog, etc.
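To make the "totally deterministic" point concrete, here's a rough sketch of what a single filter does. The kernel values and the toy pixel row are made up for illustration, and it uses the sliding-window form (strictly, cross-correlation) that deep learning libraries usually call convolution. Same input and same filter give the same output every time; the only "learning" is in choosing the kernel numbers.

```python
def convolve1d(signal, kernel):
    """Slide a fixed kernel across the signal (no padding, no kernel flip)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

# Hypothetical edge-detecting kernel and a toy row of pixel intensities.
EDGE_KERNEL = [-1, 0, 1]
row = [0, 0, 0, 10, 10, 10]

first = convolve1d(row, EDGE_KERNEL)
second = convolve1d(row, EDGE_KERNEL)

# The filter responds where the intensity jumps, and does so identically each run.
print(first)            # [0, 10, 10, 0]
print(first == second)  # True
```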
No system has passed the Turing test and the Google "It's starting to think for itself" thing is really just someone getting a little carried away.
Anyone selling AGI at the moment is pretty much selling snake oil, and anyone buying it is probably an idiot, given the number of use cases where simpler applications can deliver significant benefits.
But back to the vision aspect: the old adage of rubbish in, rubbish out applies, and if all you've got feeding the system is half a dozen low-resolution cameras then no matter how good the AI, it's compromised. It's like telling Lewis Hamilton to drive without any tactile feedback through his hands, his body, his hearing, etc. The more input you provide, the better the understanding of the environment, and that's why taking away any sensor is a retrograde step.
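A back-of-envelope way to see why more sensors help: if you combine independent, unbiased estimates by inverse-variance weighting, the fused uncertainty is never worse than the best single sensor. This is only a sketch under those assumptions, and the camera and radar numbers are invented for illustration, not taken from any real system.

```python
def fuse(estimates):
    """Combine (value, variance) pairs by inverse-variance weighting.

    Assumes the sensor errors are independent and unbiased.
    """
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total

# Hypothetical distance readings in metres: (estimate, variance).
camera = (10.0, 4.0)   # noisy
radar = (12.0, 1.0)    # more precise

value, variance = fuse([camera, radar])
print(value, variance)
```

With these made-up numbers the fused variance comes out lower than the radar's on its own, so dropping the camera would throw away information even though it's the worse sensor.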