Improving Computer Vision with Ideas from Human Vision


NSF-funded researchers at the University of California have significantly improved the recognition capabilities of computer vision systems by giving them “common sense”. Human vision still far outperforms any computer vision system. One reason is that humans use knowledge of where certain objects generally appear in an image (the sky is usually at the top) and of which types of objects are generally seen together (boats are often seen with water). Professor Belongie and graduate students Andrew Rabinovich, Andrea Vedaldi, Carolina Galleguillos, and Eric Wiewiora have given computer vision systems a similar ability.
Their system consists of three stages. First, the image is segmented into different parts using standard computer vision methods. Next, plausible labels are generated for each part. Finally, the most compatible set of labels is selected.
The first figure shows an example in which the computer vision system originally takes the small oblong yellow object to be a lemon, but after recognizing a tennis racket, a person, and a tennis court, it infers that the yellow object is actually a tennis ball. The second figure shows another example, in which a boat initially thought to be a cow is correctly relabeled as a boat because it is sitting in what has been recognized as water.
The system can acquire its common-sense knowledge by learning which objects tend to co-occur in a set of training images, or by using another source of contextual information such as Google Sets (a service that returns related words when you type in a few words).
Andrew Rabinovich, Carolina Galleguillos, and Eric Wiewiora are all NSF IGERT (Integrative Graduate Education and Research Traineeship) fellows in the Vision and Learning in Humans and Machines Traineeship program at UCSD, run by Professors Virginia de Sa and Garrison Cottrell.
This work is an excellent example of using knowledge of human vision and sophisticated machine learning methods to improve computer vision algorithms. It was presented at the International Conference on Computer Vision 2007.
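The idea of learning co-occurrence from training images and then selecting the most compatible set of labels can be illustrated with a small Python sketch. This is a toy under stated assumptions, not the researchers' implementation: the function names (learn_cooccurrence, labeling_score, best_labeling), the scoring rule (summed recognizer scores plus weighted pairwise co-occurrence), the exhaustive search, and all numbers are hypothetical and chosen only to mirror the tennis-ball example.

```python
from collections import Counter
from itertools import combinations, product


def learn_cooccurrence(training_label_sets):
    """Fraction of training images in which each pair of labels appears together."""
    counts = Counter()
    for labels in training_label_sets:
        for a, b in combinations(sorted(labels), 2):
            counts[frozenset((a, b))] += 1
    total = len(training_label_sets)
    return {pair: n / total for pair, n in counts.items()}


def labeling_score(labels, candidates, cooccur, weight):
    """Recognizer scores of the chosen labels plus weighted pairwise context."""
    appearance = sum(candidates[i][label] for i, label in enumerate(labels))
    context = sum(
        cooccur.get(frozenset((a, b)), 0.0) for a, b in combinations(labels, 2)
    )
    return appearance + weight * context


def best_labeling(candidates, cooccur, weight=1.0):
    """Try every combination of candidate labels (feasible only for toy inputs)."""
    options = [list(scores) for scores in candidates]
    return max(
        product(*options),
        key=lambda labels: labeling_score(labels, candidates, cooccur, weight),
    )


# Hypothetical training data: the set of object labels present in each image.
training = [
    {"person", "tennis racket", "tennis ball", "tennis court"},
    {"person", "tennis racket", "tennis court"},
    {"tennis ball", "tennis racket"},
    {"lemon", "table"},
    {"person", "road"},
]

# Hypothetical recognizer output: one dict of label -> score per image segment.
candidates = [
    {"lemon": 0.6, "tennis ball": 0.4},   # small yellow object
    {"tennis racket": 0.9},
    {"person": 0.8},
    {"tennis court": 0.7, "road": 0.3},
]

cooccur = learn_cooccurrence(training)
print(best_labeling(candidates, cooccur, weight=0.0))
# ('lemon', 'tennis racket', 'person', 'tennis court')
print(best_labeling(candidates, cooccur, weight=1.0))
# ('tennis ball', 'tennis racket', 'person', 'tennis court')
```

With the context weight set to zero, the yellow object keeps its highest-scoring individual label, lemon. Once co-occurrence is taken into account, the labeling that includes tennis ball scores higher overall because it fits the racket, person, and court found elsewhere in the image. Exhaustive search grows exponentially with the number of segments, so a real system would need a more scalable inference method.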

Address Goals

This activity produced a significantly better computer vision recognition system. Improved computer vision systems could have a huge impact in a variety of applications. They can also be an important part of research infrastructure, for example by allowing automatic monitoring.