Boffins give computers 'common sense'

Computer scientists use Google Labs widget in object recognition software

A little-known Google Labs widget has enabled researchers from UC San Diego and UCLA to add "common sense" to computers.

The computer scientists have given an automated image-labelling system the ability to use context to help identify objects in photographs.

For example, if a conventional automated object identifier has scanned an image and identified 'person', 'tennis racket', 'tennis court' and 'lemon', the new post-processing context check will re-label 'lemon' as 'tennis ball'.

"We think that our paper is the first to bring external semantic context to the problem of object recognition," said computer science professor Serge Belongie from UC San Diego.

The researchers showed that Google Sets can be used to provide external contextual information to automated object identifiers.

Google Sets generates lists of related items or objects from just a few examples. If a user types in 'John', 'Paul' and 'George', it will return the words 'Ringo', 'Beatles' and 'John Lennon'.

Similarly, entering 'neon' and 'argon' will generate a list of the other noble gases: 'helium', 'krypton', 'xenon' and 'radon'.
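
To make the behaviour concrete, here is a minimal sketch of this kind of set expansion. Google Sets was a web demo rather than a documented API, so the KNOWN_SETS table and expand_set function below are hypothetical stand-ins that simply hard-code a few of the sets mentioned in this article.

```python
# Hypothetical stand-in for Google Sets: expands a few seed terms into a
# list of related items. The real service had no public API, so this toy
# version looks the seeds up in a hand-built table of known sets.
KNOWN_SETS = [
    {"john", "paul", "george", "ringo", "beatles"},
    {"helium", "neon", "argon", "krypton", "xenon", "radon"},
    {"person", "tennis racket", "tennis court", "tennis ball"},
]

def expand_set(seeds):
    """Return items related to the seeds, excluding the seeds themselves."""
    seeds = {s.lower() for s in seeds}
    related = set()
    for known in KNOWN_SETS:
        if seeds & known:  # any overlap counts as a match
            related |= known - seeds
    return sorted(related)

print(expand_set(["neon", "argon"]))
# ['helium', 'krypton', 'radon', 'xenon']
```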

"In some ways, Google Sets is a proxy for common sense," explained Professor Belongie.

"In our paper we showed that you can use this common sense to provide contextual information that improves the accuracy of automated image labelling systems."

The image labelling system is a three-step process. Firstly, an automated system splits the image into different regions using image segmentation.

In the tennis example, image segmentation separates the person, the court, the racket and the yellow sphere.
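
The article does not say which segmentation algorithm the researchers used, so the sketch below illustrates the general idea with a deliberately crude stand-in: k-means clustering on pixel colours. The function name segment_by_colour and all of its parameters are invented for this example.

```python
import numpy as np

def segment_by_colour(image, k=4, iters=10, seed=0):
    """Crude k-means colour clustering as a stand-in for a real
    segmentation step; actual systems use far stronger methods."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3).astype(float)
    rng = np.random.default_rng(seed)
    centres = pixels[rng.choice(len(pixels), k, replace=False)]
    for _ in range(iters):
        # assign each pixel to its nearest colour centre
        dists = np.linalg.norm(pixels[:, None] - centres[None], axis=2)
        labels = dists.argmin(axis=1)
        # move each centre to the mean of its assigned pixels
        for c in range(k):
            if (labels == c).any():
                centres[c] = pixels[labels == c].mean(axis=0)
    return labels.reshape(h, w)  # one region id per pixel

# toy 2x2 "image": two dark pixels, two bright ones -> two regions
img = np.array([[[0, 0, 0], [10, 10, 10]],
                [[250, 250, 250], [240, 240, 240]]], dtype=np.uint8)
print(segment_by_colour(img, k=2))
```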

Next, an automated system provides a ranked list of probable labels for each of these image regions.
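
Again purely as an illustration: a region's appearance can be scored against a set of label prototypes to produce a ranked list. The PROTOTYPES table (a mean colour per label) and the ranked_labels function below are invented for this sketch; a real recogniser would use far richer features than mean colour.

```python
import numpy as np

# Hypothetical appearance model: one mean-colour prototype per label.
PROTOTYPES = {
    "lemon":        np.array([230, 220, 60]),
    "tennis ball":  np.array([220, 230, 70]),
    "person":       np.array([180, 140, 120]),
    "tennis court": np.array([60, 120, 80]),
}

def ranked_labels(region_colour):
    """Rank candidate labels by how close the region's mean colour is
    to each prototype (a stand-in for a real recogniser's scores)."""
    scores = {lab: -np.linalg.norm(region_colour - proto)
              for lab, proto in PROTOTYPES.items()}
    return sorted(scores, key=scores.get, reverse=True)

yellow_sphere = np.array([228, 222, 62])
print(ranked_labels(yellow_sphere))
# 'lemon' edges out 'tennis ball' on appearance alone -- exactly the
# kind of near-miss that the context step is meant to correct.
```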

Finally, the system adds a dose of context by processing every possible combination of labels for the image and picking the combination that maximises the contextual agreement among the labelled objects.

It is during this step that Google Sets can be used as a source of context that helps the system turn a 'lemon' into a 'tennis ball'.

In this case, these 'semantic context constraints' helped the system to disambiguate between visually similar objects.
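
A toy version of this final step, with the caveat that every name and number here (candidates, CONTEXT_SET, the 0.2 context weight) is invented for illustration: enumerate each combination of candidate labels and pick the one that best trades appearance scores off against pairwise contextual agreement.

```python
from itertools import product

# Candidate (label, appearance score) lists per region, as a step-two
# recogniser might produce them; the numbers are illustrative only.
candidates = [
    [("person", 0.90)],
    [("tennis racket", 0.80)],
    [("tennis court", 0.85)],
    [("lemon", 0.60), ("tennis ball", 0.55)],  # near-tie on appearance
]

# Pairwise context: labels agree if they co-occur in a common-sense set
# (the role Google Sets plays in the paper's system).
CONTEXT_SET = {"person", "tennis racket", "tennis court", "tennis ball"}

def context_score(labels):
    """Count label pairs that belong to the same common-sense set."""
    pairs = [(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]
    return sum(1 for a, b in pairs if a in CONTEXT_SET and b in CONTEXT_SET)

best = max(
    product(*candidates),
    key=lambda combo: sum(s for _, s in combo)
    + 0.2 * context_score([lab for lab, _ in combo]),
)
print([lab for lab, _ in best])
# ['person', 'tennis racket', 'tennis court', 'tennis ball']
```

Brute-force enumeration is only feasible for a handful of regions; the article does not describe how the researchers actually search the space of label combinations.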

The Objects in Context paper will be presented today at the 11th IEEE International Conference on Computer Vision in Rio de Janeiro.