What Does Machine Vision Really Mean and How Can We Make the Most of It?
Data analysis leads to valuable conclusions, and even predictions. Data can be gathered in countless ways, and one of them is simply looking at something and receiving visual information. For computers, however, this is far more complicated. Machine vision is a rapidly developing field, so in this post we look at its current use cases and emerging trends.
What does Machine Vision even mean?
Machine vision is a process which utilizes visual information as a data source. Upon receiving the data, the machine runs an analysis and arrives at a decision point, which often leads to a mechanical solution. One example would be the barrier at the parking exit that lifts automatically after the computer checks the information coming from the payment machine against the license plate number it reads with the CCTV.
How does Machine Vision work?
It looks straightforward: all we need is a camera image and analytical software, and from there it's business as usual, with the hardware and software solutions depending on the task.
However, it's a little more complicated than that, because humanity hasn't exactly solved the puzzle of 'vision' yet. Science doesn't have a comprehensive answer to how we humans create mental images from the visual information coming from our eyes.
Since we don't fully know how vision works, it's not easy to predict how an algorithm will behave when it faces visual information that is trivial for us humans.
One example: we humans can create concepts out of verbal descriptions. If someone explains to us what a certain cat or oil rig looks like, we can pick that image out of a stack quite easily.
Machines can't do that. To learn what a cat looks like, a machine needs to look at, say, a thousand images labeled 'cat'. It finds the recurring patterns in the chaotic swirl of pixels: the things we recognize as 'ears', a 'tail', the peculiarities in the faces of many different cats. For the machine, these are certain arrangements of colors and geometric shapes that return again and again. Other arrangements do not recur so often, so they are ignored, deemed irrelevant to the 'cat' label.
After doing this on a sufficient number of examples, it can identify whether a new image contains a cat or not.
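The pattern-finding idea above can be sketched in a few lines. This is a deliberately tiny stand-in for real training: each "image" is reduced to a hypothetical feature vector, the model averages the vectors seen per label, and new inputs get the label of the nearest average. The feature names and values are invented for illustration.

```python
# Toy sketch of label-driven pattern learning: average the feature
# vectors seen for each label, then classify new vectors by the
# nearest average. Features and labels are invented for illustration.

def train(examples):
    # examples: list of (features, label) pairs
    sums, counts = {}, {}
    for feats, label in examples:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, v in enumerate(feats):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lbl: [v / counts[lbl] for v in acc] for lbl, acc in sums.items()}

def classify(model, feats):
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda lbl: dist(model[lbl], feats))

# Hypothetical features: [ear_pointiness, tail_length, wheel_count]
training = [
    ([0.9, 0.8, 0.0], "cat"),
    ([0.8, 0.9, 0.0], "cat"),
    ([0.0, 0.0, 4.0], "car"),
    ([0.1, 0.0, 4.0], "car"),
]
model = train(training)
print(classify(model, [0.85, 0.7, 0.0]))  # → cat
```

Real systems learn far richer features from raw pixels, but the principle is the same: recurring arrangements that co-occur with a label become the label's signature.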
Elements of a Machine Vision system
When we say ‘machine vision’, we mean the whole process, until the usage goal is reached. This is why the system needs to have these elements:
1. Image Creation - usually a camera, with any necessary extras (microscope, ultrasound, LIDAR, etc.)
2. Signal Processing - a video card or similar tool used for processing the image data
3. Software - the programs used for data evaluation, the heart of the whole process set up by experts
4. Communication - the diagnostics UI that reports results to operators or other systems
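The four elements above form a pipeline, which we can sketch as one stage per function. Every stage here is a stand-in: the fake frame, the normalization step, and the pass/fail rule are all illustrative, not a real camera or driver API.

```python
# Minimal sketch of the four elements as a pipeline; all values and
# logic are illustrative stand-ins.

def create_image():              # 1. Image Creation (camera stand-in)
    return [[0, 255], [255, 0]]  # a tiny fake grayscale frame

def process_signal(frame):       # 2. Signal Processing (normalize pixels)
    return [[px / 255 for px in row] for row in frame]

def evaluate(frame):             # 3. Software (decision logic)
    brightness = sum(sum(row) for row in frame) / 4
    return "pass" if brightness >= 0.5 else "fail"

def report(result):              # 4. Communication (diagnostics UI stand-in)
    return f"inspection result: {result}"

print(report(evaluate(process_signal(create_image()))))
```

In a production system each stage is a serious engineering effort of its own; the sketch only shows how the data flows from capture to communicated decision.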
Why is machine vision important today?
Machine Learning technologies have largely been enabled by two factors: the sheer amount of data generated and the growth of computational capacity. These are great conditions for building the kind of high-level software we call 'AI solutions'.
Just think about the development of phone cameras and the raw expansion of megapixel counts over the last 10 years. Identification capabilities are closing in on 100% accuracy, and in many cases machines react faster and more accurately to visual cues than humans do.
What is Machine Vision used for?
Camera imagery has been used in industrial settings for decades now, but the last few years have brought a new level of sophistication.
These are the major areas where Machine Vision is utilized:
Quality Assurance
One of the typical use cases in manufacturing is checking the quality of the product. Is the welding seam within the parameters? Is the bore diameter exactly what it should be? Does the soda or medicinal drink have just the right color?
These are simple “yes or no” questions that can be answered with the analysis of the camera image.
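A color check of this kind can be sketched as comparing the average color of a sampled region against a calibrated target. The target RGB value, tolerance, and pixel samples below are all hypothetical; a real line would calibrate them against known-good product.

```python
# Sketch of a "yes or no" colour check; target, tolerance and pixel
# samples are invented for illustration.

TARGET = (180, 90, 30)   # expected drink colour (hypothetical)
TOLERANCE = 20           # max allowed per-channel deviation

def color_ok(pixels):
    n = len(pixels)
    avg = tuple(sum(p[c] for p in pixels) / n for c in range(3))
    return all(abs(avg[c] - TARGET[c]) <= TOLERANCE for c in range(3))

good_batch = [(182, 88, 29), (179, 92, 31), (181, 90, 30)]
bad_batch  = [(120, 140, 60), (118, 138, 62), (121, 141, 59)]
print(color_ok(good_batch), color_ok(bad_batch))  # → True False
```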
Modeling
Realistic models of buildings or objects can be reconstructed in 3D virtual space, in a largely automated way.
Positioning
Is the box angled correctly on the conveyor belt? Does the robot arm place the part exactly where it should?
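The "is the box angled correctly" question reduces to estimating an angle from detected reference points and accepting a small deviation. A minimal sketch, assuming the vision system has already located two corners of the box (the coordinates and tolerance are illustrative):

```python
# Sketch of a positioning check: estimate the box's tilt from two
# detected corners and accept a small deviation. Inputs are illustrative.
import math

def box_angle(corner_a, corner_b):
    dx = corner_b[0] - corner_a[0]
    dy = corner_b[1] - corner_a[1]
    return math.degrees(math.atan2(dy, dx))

def aligned(corner_a, corner_b, max_deg=2.0):
    return abs(box_angle(corner_a, corner_b)) <= max_deg

print(aligned((0, 0), (100, 1)))   # ~0.6° tilt → True
print(aligned((0, 0), (100, 12)))  # ~6.8° tilt → False
```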
Phenomena Detection
Scanning for medical conditions, finding structural faults in buildings, detecting insulation inefficacies.
Visual Similarity Detection
Reading QR codes and barcodes by matching their known visual patterns.
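Once the camera has matched the bar pattern and decoded it into a digit string, a checksum confirms the read was correct. As a sketch, here is the standard EAN-13 check-digit rule (alternating weights 1 and 3 over the first twelve digits); the sample code is a published example number, not one of ours:

```python
# Validate an EAN-13 barcode string using the standard check-digit rule:
# weight the first 12 digits alternately by 1 and 3, and the check digit
# must bring the total to a multiple of 10.

def ean13_valid(code):
    if len(code) != 13 or not code.isdigit():
        return False
    digits = [int(c) for c in code]
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]

print(ean13_valid("4006381333931"))  # → True
print(ean13_valid("4006381333932"))  # → False
```

This is why a misread almost never slips through: a single wrong digit breaks the checksum, and the system simply rescans.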
Area detection
Goal kick or corner? The video referee can make such calls on its own. More and more traditional sports are experimenting with its deployment on the field, even cricket and baseball.
Counting
Traffic measurements, step counters in the office or in the supermarket, heatmaps created from the walking patterns.
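A footfall heatmap of the kind mentioned above can be sketched by binning camera-derived positions into a coarse grid and counting visits per cell. The positions and cell size below are invented:

```python
# Sketch of a footfall heatmap: bin (x, y) positions into a coarse grid
# and count how often each cell is visited. Positions are mock data.
from collections import Counter

def heatmap(positions, cell=10):
    return Counter((x // cell, y // cell) for x, y in positions)

visits = [(3, 4), (7, 2), (12, 4), (13, 6), (14, 5), (25, 30)]
hm = heatmap(visits)
print(hm.most_common(1))  # → [((1, 0), 3)]
```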
Object recognition
What kind of sign is on the traffic post? Recognizing pedestrians, checking distances dynamically. Self-driving technology is maybe the widest known application of machine vision, for obvious reasons.
Analyzing an image is a bit like solving a mystery. The process comes down to asking the right questions:
1. Which category does the object in the image belong to? (Classification)
2. What type of object is visible in the image? (Identification)
3. Is the object visible in the image? (Verification)
4. Where are the objects in the image? (Detection)
5. Which pixels are part of the object in the image? (Segmentation)
6. Which objects are visible, and where? (Recognition)
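The questions differ mainly in the shape of the answer. A sketch on a tiny binary image of how verification, detection, and segmentation produce a boolean, a bounding box, and a pixel mask respectively (the image and threshold are illustrative):

```python
# Three of the questions, answered on a toy 4x4 image: verification
# returns a boolean, detection a bounding box, segmentation a mask.

IMAGE = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]

def verify(img, thresh=5):                 # "Is an object visible?"
    return any(px > thresh for row in img for px in row)

def detect(img, thresh=5):                 # "Where is it?" → (x0, y0, x1, y1)
    coords = [(x, y) for y, row in enumerate(img)
                     for x, px in enumerate(row) if px > thresh]
    xs, ys = [c[0] for c in coords], [c[1] for c in coords]
    return (min(xs), min(ys), max(xs), max(ys))

def segment(img, thresh=5):                # "Which pixels?" → binary mask
    return [[1 if px > thresh else 0 for px in row] for row in img]

print(verify(IMAGE))   # → True
print(detect(IMAGE))   # → (1, 1, 2, 2)
```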
These are the questions most current software works with: the ones that let supermarket managers know a shelf is empty and should be refilled, or that make automated payment possible. Others detect hidden contraband, security risks in a building's structure, or traffic situations that can lead to jams. There are programs that automatically assess damage for insurance companies.
Quality assurance can achieve much more than filtering out problematic parts if the software has Machine Learning capabilities. A great industrial use case is predictive maintenance, which helps complete the necessary maintenance tasks at just the right time. All the system needs is data about normal running parameters, so it can intervene when it detects an anomaly. You can read more about Predictive Maintenance and other Industry 4.0 solutions here, or, to understand how this fits into the broader AI journey, check our detailed exploration.
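The anomaly-detection idea behind predictive maintenance can be sketched very simply: learn the normal range of a running parameter from healthy data, then flag readings outside it. The vibration readings and the 3-sigma rule below are illustrative, not from a real deployment:

```python
# Sketch of predictive-maintenance anomaly detection: fit a normal range
# from healthy readings, flag values outside it. Data is mock data.
import statistics

def fit_normal_range(readings, k=3):
    mean = statistics.fmean(readings)
    sd = statistics.stdev(readings)
    return (mean - k * sd, mean + k * sd)

def is_anomaly(value, bounds):
    lo, hi = bounds
    return not (lo <= value <= hi)

normal = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 10.0, 9.7]
bounds = fit_normal_range(normal)
print(is_anomaly(10.1, bounds), is_anomaly(14.5, bounds))  # → False True
```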
A Lexunit Case Study: Stacking the Hay (Photo Segmentation)
As an example, we present the steps of a Machine Vision project we did for a client.
We were contacted by a global insurance company with a well-defined problem. They had gathered a vast, unstructured archive of images that is growing day by day. They needed a way to categorize these images by the single vehicle featured in each of them.
How can you identify a car? For starters, of course, you’ve got the license plate.
The first step was to make the system capable of deciding whether a license plate is visible in the image. If there is one, the system identifies it and records the data.
The next step was to segment the rest of the images by whether they contain the same plate. If the plate is not visible, the system finds the plate-bearing image that the photo most closely resembles.
With this method, the system was able to group all the photos by the car featured in them.
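The grouping logic described above can be sketched as follows: photos with a readable plate anchor a group, and photos without one join the group of their most similar anchored photo. The feature vectors, similarity measure, and plate reads are mock data standing in for the real models:

```python
# Sketch of grouping photos by vehicle: plate-bearing photos anchor
# groups, plateless photos join their most similar anchor. Mock data.

def similarity(a, b):
    return -sum((x - y) ** 2 for x, y in zip(a, b))  # higher = more alike

def group_photos(photos):
    # photos: list of dicts with "features" and optional "plate"
    groups = {}
    anchored = [p for p in photos if p.get("plate")]
    for p in anchored:
        groups.setdefault(p["plate"], []).append(p)
    for p in photos:
        if not p.get("plate"):
            best = max(anchored,
                       key=lambda a: similarity(a["features"], p["features"]))
            groups[best["plate"]].append(p)
    return groups

photos = [
    {"plate": "ABC-123", "features": [0.9, 0.1]},
    {"plate": "XYZ-789", "features": [0.1, 0.8]},
    {"plate": None,      "features": [0.85, 0.15]},  # same car, plate hidden
]
groups = group_photos(photos)
print(sorted(len(v) for v in groups.values()))  # → [1, 2]
```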
A Lexunit Case Study: Apples and Oranges (Automated Data Entry for Bookkeepers)
SzamlAI is, in a way, a Machine Vision project.
This software recognizes any kind of printed invoice and digitally captures all the data on it, no matter how it is formatted.
You can read more about how it all came together, here.
Machine Vision tech is developing rapidly. Hardware is getting stronger, and liquid lenses might bring another big push on the imaging side. And image-based decision-making is squarely within the expertise of AI developers.