Robot Object Recognition
The human visual system is extremely powerful. It is selective enough to let us distinguish even very similar objects, like the faces of identical twins. Some studies suggest that the human visual system can discriminate among at least tens of thousands of different object categories. Compared to this ability, even the most sophisticated computer system falters. Humans recognize a multitude of objects in images with little effort, even when the objects are seen from different viewpoints, at different sizes and scales, translated or rotated, or partially obstructed from view. This task is still a challenge for robot object recognition and computer vision systems in general.
In principle, it is relatively easy to build a computer system that is highly selective: the computer could simply memorize all the pixels in several training images. However, such a system would lack any power to generalize, much like Funes the Memorious, the fictional character created by the Argentine writer Jorge Luis Borges, who had a vast memory and no ability to generalize. He could not recognize a face after even the most minute change in it, and even slightly transformed objects were completely new and different objects to him. Similarly, a computer that merely memorizes pixels could take note of an object at any time, but it would not be able to keep track of the object once it changes.
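To make the pixel-memorizing idea concrete, here is a minimal sketch in Python. The function names `memorize` and `recall` and the tiny 2x2 "images" are made up for illustration and do not come from any real vision system. The sketch only shows that exact pixel lookup is perfectly selective yet fails after a one-pixel change.

```python
# Illustrative sketch only: "memorize" and "recall" are hypothetical names,
# and images are plain nested lists of grayscale values.

def memorize(training):
    """Store every training image verbatim, keyed by its exact pixels."""
    return {tuple(map(tuple, image)): label for image, label in training}

def recall(memory, image):
    """Return the stored label only if every single pixel matches."""
    return memory.get(tuple(map(tuple, image)), "unknown")

face = [[10, 20],
        [30, 40]]
memory = memorize([(face, "Alice")])

# The same face with one pixel changed by a single gray level.
altered = [[10, 20],
           [30, 41]]

print(recall(memory, face))     # exact copy of a training image
print(recall(memory, altered))  # minutely changed image is not recognized
```

The lookup recognizes the training image itself but treats the altered copy as a completely new object, which is the Funes problem in miniature.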
As such, although modern computers perform many complex tasks much faster and more precisely than humans, in other areas such as pattern recognition a three-year-old can outperform the most sophisticated algorithms available today. Pattern recognition is one of the foundations of genuine intelligence, which is the ability to learn, to adapt and to extrapolate. Interpreting sensory information and transforming it into meaningful signals is crucial in everyday life, which is probably why the human brain has the remarkable ability to recognize visual patterns in a robust and selective manner. Perhaps when we understand how our neurons achieve these remarkable properties, it will be possible to translate this knowledge into algorithms for better machine vision and pattern recognition.
Robot object recognition is concerned with determining the identity of an object observed in an image from a set of known labels. Although object recognition in computer vision, the task of finding a given object in an image or video sequence, is still a difficult problem in robotics, there have been great advances in recent years. Object recognition is one of the most fascinating abilities that humans possess without effort, and translating it into a machine ability has been studied for more than four decades. Significant efforts have been made to develop representation schemes and algorithms aimed at recognizing generic objects in images taken under different imaging conditions (e.g. viewpoint, illumination, and occlusion). Within a limited scope of distinct objects like handwritten digits, fingerprints, faces, and road signs, there has been substantial success.
Still, it is a daunting task to develop robot object recognition systems that match the cognitive capabilities of human beings, or systems that are able to tell the specific identity of an object being observed.
Central to robot object recognition systems is how the characteristics of an object that stay consistent across images taken under different lighting and from different positions are extracted and recognized. To work, algorithms adopt certain representations or models, either in 2D or 3D, that capture these characteristics and make it possible to tell objects apart. The recognition process, which can be generative or discriminative, is then carried out by matching the test image against the object representations or models stored in a database.
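The matching step described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method of any particular system: the representation is a coarse intensity histogram (chosen here only because it does not change when an object shifts position), matching is a nearest-neighbor search, and the names `histogram`, `distance`, `recognize` and the sample labels are invented for the example.

```python
import math

# Toy sketch: images are nested lists of grayscale values in 0..255.
# The histogram representation and all names here are assumptions made
# for illustration, not taken from any real recognition system.

def histogram(image, bins=4, max_value=255):
    """Represent an image by its normalized intensity histogram."""
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            index = min(pixel * bins // (max_value + 1), bins - 1)
            counts[index] += 1
            total += 1
    return [count / total for count in counts]

def distance(a, b):
    """Euclidean distance between two representations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize(test_image, database):
    """Return the label whose stored representation is closest."""
    features = histogram(test_image)
    return min(database, key=lambda label: distance(features, database[label]))

dark_square = [[0, 0, 0, 0],
               [0, 200, 200, 0],
               [0, 200, 200, 0],
               [0, 0, 0, 0]]
bright_square = [[255, 255, 255, 255],
                 [255, 100, 100, 255],
                 [255, 100, 100, 255],
                 [255, 255, 255, 255]]

# The "database" of stored object representations.
database = {
    "dark_square": histogram(dark_square),
    "bright_square": histogram(bright_square),
}

# The same object as dark_square, translated to the top-left corner.
shifted = [[200, 200, 0, 0],
           [200, 200, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]

print(recognize(shifted, database))
```

Because the histogram ignores where pixels are located, the translated square still matches its stored model. Real systems use far richer representations, but the structure (extract a representation, then compare it against stored ones) is the same.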
As more reliable representation schemes and recognition algorithms are developed, progress continues to be made towards recognizing objects even under variations in viewpoint and illumination and under partial occlusion. While research continues to look for more robust ways of recognizing generic objects, there are several object recognition systems already available for hobbyists and robot enthusiasts today.
Skilligent Robot Vision System is a software component which implements powerful object recognition and object tracking algorithms. The system is specifically designed for robotics applications, including visual object recognition and tracking, image stabilization, visual-based servoing, human-to-machine interaction and visual-augmented navigation. The system keeps digital object representations in an indexed structure which is optimized for fast searches as the software scans a video stream coming from a camera. It can use multiple images of the same object taken from different views, which effectively removes the restriction (~30-45 degrees) on the maximum change of the angle of view. Its Multi-View Object Recognition feature enables the software to reliably recognize landmark objects from various points of view.
RoboRealm offers a simplified application for use in computer vision, image analysis, and robotic vision systems. It features an easy point-and-click interface and requires only an inexpensive USB webcam and a PC to add machine vision to robotic projects. RoboRealm has compiled several image processing functions into a Windows-based application that can be used with a webcam, TV tuner, IP camera, etc. The system can then be used to see a robot's environment, so that the user can process the acquired image, analyze what needs to be done and send the needed signals to the robot's motors and servos.
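The acquire, process, analyze and send cycle can be illustrated with a small Python sketch. This is not RoboRealm's API. The functions `bright_centroid_column` and `steering_command`, the threshold, and the command strings are all hypothetical, and a frame is assumed to be a nested list of grayscale values with the target appearing as bright pixels.

```python
# Hypothetical sketch of a see-analyze-act step. Nothing here reflects
# RoboRealm's actual interface; names and thresholds are assumptions.

def bright_centroid_column(frame, threshold=128):
    """Process: find the mean column of pixels at or above the threshold."""
    columns = [x for row in frame
               for x, pixel in enumerate(row) if pixel >= threshold]
    if not columns:
        return None
    return sum(columns) / len(columns)

def steering_command(frame, threshold=128, dead_zone=0.5):
    """Analyze: turn the target's position into a command for the motors."""
    column = bright_centroid_column(frame, threshold)
    if column is None:
        return "search"            # no target visible in this frame
    center = (len(frame[0]) - 1) / 2
    if column < center - dead_zone:
        return "left"
    if column > center + dead_zone:
        return "right"
    return "forward"

# A frame where the bright target sits at the left edge of the image.
frame = [[255, 0, 0, 0, 0],
         [255, 0, 0, 0, 0]]
print(steering_command(frame))
```

In a real robot the returned command would be translated into signals for the motors and servos, and the function would be called once per frame of the camera's video stream.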
Other object recognition software is available, ranging from simple tools to packages like Imagu, which performs geometric and topological detection to facilitate advanced object recognition and segmentation. Already there are software solutions that claim to be able to accurately and reliably “identify numerous object classes in numerous environments by employing carefully selected and highly customizable algorithmic building-blocks,” among others. Impressive, but I’d say it will take a few more decades for robot object recognition to even come close to matching the speed and skill of the human brain when it comes to visual intelligence. Nice to know we humans can still do some things better.