Computer vision is a field of Artificial Intelligence (AI) that aims to enable computers to see and process images in the same way that humans do. As you can imagine, this is not an easy task. Think about it: plugging a camera into your computer doesn't mean that it can see… The potential computer vision holds is huge! Throughout the rest of this post, we will explore how computer vision works.
So, how does computer vision work?
As we’ve mentioned, getting a computer to see an image, process it and understand exactly what that image is showing is a difficult task. To explain how computer vision is achieved, we’ve broken the process down into three parts:
- Image acquisition;
- Image processing;
- Analysis of the image.
Following this process, a computer will be able to decide exactly what an image is representing. The flowchart below shows this visually.
We will now take a look at each of these phases.
The first step in computer vision is acquiring the image. This is the process of turning the world around us into binary data, the ones and zeros that are interpreted as digital images. To you and me, that means taking a picture. We do this using various cameras, from the one in your smartphone to webcams and digital cameras. Once the computer has the data, it can process the image.
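To make "an image is just data" concrete, here is a toy sketch in Python. The 4×4 grid of numbers below is an invented example, not real camera output, but a real sensor produces the same kind of grid, just with millions of pixels:

```python
# A digital grayscale image is simply a grid of brightness values:
# 0 is black, 255 is white. This tiny 4x4 "image" is a checkerboard.
image = [
    [0,   0,   255, 255],
    [0,   0,   255, 255],
    [255, 255, 0,   0],
    [255, 255, 0,   0],
]

width = len(image[0])
height = len(image)
print(f"{width}x{height} image; pixel at row 0, column 2 has brightness {image[0][2]}")
```

Everything the later steps do, from edge detection to recognition, is arithmetic on grids like this one.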
The second step in computer vision is processing the information contained within the image. To do this, many different (and complicated) algorithms are applied to the image. We won’t go into detail about how each algorithm works, as they are quite advanced, but we will take a look at what some of these algorithms detect.
Edges, for example. The appropriately named edge detection algorithm allows computers to find the edges of objects within an image. From this, the computer can begin to understand what the image shows. An easy way to get your head around this is to think of a face. If the computer can detect the edges of features such as your nose, mouth, eyes and chin, it has an almost sketch-like outline of what the image contains. This is one way in which a computer can begin to ‘see’ an image from the real world.
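The core idea behind edge detection is simple: an edge is a place where brightness changes sharply between neighbouring pixels. The sketch below is a deliberately minimal version of that idea (real detectors such as Sobel or Canny are far more sophisticated), using an invented threshold of 100:

```python
def detect_edges(image, threshold=100):
    """Mark a pixel as an edge (1) if its brightness differs sharply
    from the pixel to its right or the pixel below it.
    A toy simplification of real operators like Sobel or Canny."""
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            right = image[y][x + 1] if x + 1 < w else image[y][x]
            below = image[y + 1][x] if y + 1 < h else image[y][x]
            if abs(image[y][x] - right) > threshold or abs(image[y][x] - below) > threshold:
                edges[y][x] = 1
    return edges

# A bright square on a dark background: edges appear at its boundary.
image = [
    [0, 0,   0,   0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0, 0,   0,   0],
]
for row in detect_edges(image):
    print(row)
```

The output marks only the pixels along the boundary of the bright square, which is exactly the sketch-like outline the post describes.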
Another algorithm used in computer vision is segmentation. This means dividing the image into regions to simplify it, so the information is easier to analyse. It is usually used to locate the boundaries of objects, so it is similar to edge detection but goes a step further by grouping the boundaries into separate sections. The easiest way to understand this is to picture a basket of apples. The computer uses segmentation to find the boundary of each apple. Once the computer has this information, it can establish that there are multiple round objects in the image, and further analysis can take place.
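A minimal sketch of the apples example: threshold the image into foreground and background, then count the connected foreground regions with a flood fill. The pixel values and threshold here are made up, and real segmentation methods are far more robust, but the grouping idea is the same:

```python
def segment_and_count(image, threshold=128):
    """Threshold the image into foreground/background, then count
    connected foreground regions with a flood fill. A toy version
    of what a segmentation algorithm does."""
    h, w = len(image), len(image[0])
    mask = [[1 if image[y][x] > threshold else 0 for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    regions = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                regions += 1                      # found a new object
                stack = [(y, x)]
                while stack:                      # flood-fill its pixels
                    cy, cx = stack.pop()
                    if 0 <= cy < h and 0 <= cx < w and mask[cy][cx] and not seen[cy][cx]:
                        seen[cy][cx] = True
                        stack += [(cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)]
    return regions

# Two bright "apples" on a dark background.
image = [
    [0, 200, 0, 0,   0],
    [0, 200, 0, 200, 200],
    [0, 0,   0, 200, 200],
]
print(segment_and_count(image))  # prints 2
```

Counting the regions tells the computer "there are two separate objects here", which is the information the analysis step builds on.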
The final step in the computer vision flowchart involves analysing the data collected in the previous steps. The computer can then ‘look’ at the image and make a decision about what it contains. One example of this high-level analysis is object recognition. Thinking about the information collected so far, at this point the computer knows the boundaries of the objects within the image and has a sketch-like outline of any faces that are present. That is already a pretty good understanding of what the image shows. However, it can be taken even further. Using object recognition, the computer can match the object boundaries it has found against the outlines of objects it already knows.
Say, for example, a computer knows the outlines of cars, bikes, road signs and people because it has ‘seen’ them before. The computer can also detect faces using the edge detection algorithm we mentioned earlier. The image below represents what the computer can interpret from this information; in other words, it can decide what the image represents.
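The matching step can be sketched very crudely: compare a detected shape against each stored outline and pick the best fit. The "templates" and shapes below are invented 2×2 masks purely for illustration; real recognition systems use far richer features than raw pixel overlap:

```python
def match_score(shape, template):
    """Fraction of pixels on which a detected shape mask agrees with a
    stored template. A crude stand-in for real object recognition."""
    total = agree = 0
    for row_a, row_b in zip(shape, template):
        for a, b in zip(row_a, row_b):
            total += 1
            agree += (a == b)
    return agree / total

def recognise(shape, templates):
    """Return the label of the best-matching known outline."""
    return max(templates, key=lambda label: match_score(shape, templates[label]))

# Hypothetical outlines the computer has 'seen' before.
templates = {
    "square":   [[1, 1], [1, 1]],
    "diagonal": [[1, 0], [0, 1]],
}
detected = [[1, 1], [1, 0]]  # a boundary mask from the earlier steps
print(recognise(detected, templates))  # prints "square"
```

This is the moment the flowchart ends at: raw pixels have become boundaries, boundaries have become shapes, and shapes have been matched to names.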
Hopefully, you now have a basic understanding of how computer vision enables computers to ‘see’ and understand images and their surroundings. As you can probably imagine, the potential computer vision holds for the modern world is massive! In fact, computer vision has already been integrated into society. Keep your eyes peeled for an upcoming blog on computer vision technology in our everyday lives.