Cameras for Machine Vision Technology
Manufacturing is about getting it right each and every time for the customer. Getting it right means understanding the technology and the tradeoffs in the engineering. We know a lot about cameras, because camera selection can make or break a vision system. We work with vendors we’ve grown to trust and know what questions to ask any vendor when considering the best cameras for a vision system. Some applications can be solved with the least expensive cameras, but others require more in cameras, software, lighting or lensing. This article will help you understand the choices available to you in cameras and how they work.
Basic Operation Principles:
For our purposes, a camera is a system that measures electro-magnetic radiation. Most cameras will measure visible spectrum radiation (light), and do so with an array of imaging elements (pixels). This function allows us to accurately measure how much light is present and where it is coming from.
Eyes function similarly, which is why visible spectrum cameras make images look similar to what the eyes see. Both are photosensitive in the visible spectrum, and so both devices create roughly the same signal. Infrared or IR cameras do the same, except with heat, and that is why the images produced are much different than what the eyes see.
All camera sensors are photosensitive, meaning that they convert radiation that hits them into a charge, via the photo-electric effect. If you shine light of a certain wavelength at certain materials, this excites and frees some electrons, developing a charge. A camera is thus essentially a small solar panel. But instead of simply converting light to electricity, it seeks to measure that electricity and determine how much light hit the sensor.
Visible spectrum cameras have silicon-based sensors, because silicon is photosensitive in the visible light spectrum. When light hits a silicon sensor, some of that light energy is absorbed, freeing electrons and building an electric charge. Similarly, Mercury Cadmium Telluride (MCT) is sensitive to electro-magnetic radiation in the 3 µm to 12 µm range. When radiated heat hits MCT, charge builds in the same way, so MCT-based sensors can be used to build certain types of infrared (IR) cameras.
A camera sensor has many independent pixels that accumulate charge over the course of exposure time. At the end of exposure time, circuitry inside the camera drains the charge, amplifies it, and converts it to a numeric measurement (digital signal), which can be stored or sent to another component. Ample light means many photons hit the sensor, which also means full “buckets” of charge exist, and high pixel values can be recorded.
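The readout chain described above can be sketched in a few lines. This is an idealized, noise-free illustration, not how any particular camera is implemented: the function name read_pixel and the default values (50% QE, a 10,000 e- bucket, 8-bit output) are assumptions chosen only for the example.

```python
def read_pixel(photons, qe=0.5, saturation_e=10_000, bit_depth=8):
    """Turn a photon count into a digital pixel value (idealized, no noise).

    All defaults are illustrative assumptions, not values from a real camera.
    """
    electrons = photons * qe                    # photo-electric conversion
    electrons = min(electrons, saturation_e)    # the "bucket" cannot overflow
    max_code = 2 ** bit_depth - 1               # largest digital value, 255 for 8 bits
    return round(electrons * max_code / saturation_e)

if __name__ == "__main__":
    for photons in (0, 4_000, 20_000, 50_000):
        print(photons, "photons ->", read_pixel(photons))
```

Note that in this sketch 20,000 and 50,000 photons produce the same value: once the bucket is full, extra light is simply not measured.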
Real World Implementations:
Cameras are fiendishly difficult to build. It’s all well and good to describe a light-measuring device with 5 million pixels, but building one presents enormous engineering challenges. What distinguishes camera vendor A from camera vendor B is how well they meet these challenges and eliminate error in the measurements. A crucial part of building any imaging or vision system is getting good data. Poor (or uninformed) camera selection impedes this and will doom any project.
When we choose cameras for our vision systems, we always consider the following features:
Roughly speaking, there are two types of camera sensors: CMOS and CCD. The distinction comes in how each type drains the charge it accumulates. These days, CMOS sensors have dramatically improved because manufacturers can print such small, precise circuits, so CMOS is now largely the sensor of choice. CMOS sensors drain charge from each pixel individually. The charges are piped to ADCs (analog to digital converters) and read that way. The potential implications of this design include high speed and lower photosensitivity. CCD sensors, a less common choice, drain charge from 1 or more taps at corners of the sensor. The potential implications of this design are slower readout speeds, but more photosensitivity, and “blooming”, where bright spots in an image can wash out a larger area.
Imperfections in Imaging:
In an ideal world, there would be a material that converted 100% of light at selected wavelengths to electrical charge. Unfortunately no such material exists. Furthermore, no material with a high QE (quantum efficiency) has the same spectral response as the human eye. Silicon actually responds to a wider spectrum than the eye does.
In practice, glass in the lens cuts the UV component and an IR-cut filter can reduce the near infrared component before it hits the sensor to better replicate the human eye. Typical silicon sensors have QEs of anywhere between 25% and 60%. Manufacturers should be able to provide the QE response curve for a camera upon request.
Full Well Capacity, Saturation Capacity, Pixel Size and SNR (Signal to Noise Ratio):
Statistically speaking, there will always be variance in how much charge builds on a given pixel during a given exposure. Take two pictures of the exact same scene with the exact same lighting, and you won’t read the same charge accumulation each time. First, photons can be thought of like rain. Even if it’s raining at a given rate, it’s not necessarily true that the same number of drops hits a specific cobblestone every second. Similarly, a given pixel gets a different number of photons every exposure even if conditions are exactly the same.
Second, quantum efficiency is probabilistic for each photon. So if about 10,000 photons hit a pixel with a QE of 50%, the first exposure may generate 4,900 electrons and the second may generate 5,100.
These factors combine to create signal-to-noise ratio. Before we’ve even talked about error, there is simply a point beyond which we can’t measure. To tilt the field in our favor though, we want to be taking measurements in a large enough electron bucket that randomness doesn’t play a large role. We want large signal and little noise.
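To put rough numbers on this randomness, here is a minimal sketch. It assumes photon arrivals follow Poisson statistics (the usual model for the "rain" picture) and that an average of 10,000 photons reaches the pixel per exposure; that photon count is an assumption chosen so that a 50% QE gives about 5,000 electrons. The function name electron_stats is hypothetical.

```python
import math

def electron_stats(mean_photons, qe):
    """Mean and standard deviation of collected electrons.

    Assumes Poisson photon arrival; keeping each photon with probability
    `qe` leaves a Poisson count, so the variance equals the mean.
    """
    mean_e = mean_photons * qe
    std_e = math.sqrt(mean_e)
    return mean_e, std_e

if __name__ == "__main__":
    mean_e, std_e = electron_stats(10_000, 0.5)
    print(f"mean electrons: {mean_e:.0f}")
    print(f"typical spread: +/- {std_e:.1f} electrons")
    print(f"signal / noise: {mean_e / std_e:.1f}")
```

Under these assumptions the typical spread is roughly 70 electrons around 5,000, so readings of 4,900 and 5,100 from two identical exposures are unremarkable.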
Full Well Capacities on typical machine vision cameras range from about 5,000 e- to 50,000 e-. However, reaching the full well capacity is difficult because when the pixel nears its maximum capacity, electrons begin to repel each other and some blooming can occur. Therefore we typically consider a saturation capacity that we can realistically reach without introducing these problems. The theoretical max SNR we can achieve is equal to the square root of the saturation capacity.
SNR is often expressed in bits or decibels. To get bits we simply take log2(SNR). To get decibels we take 20 * log10(SNR). 1 bit equals 6.02 dB. Sometimes you’ll see both measures. Bear in mind that because both bits and dB are logarithmic scales, 1 additional bit means the SNR is 2 times better, and 20 dB is equivalent to 10 times better.
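The relationships above are easy to check numerically. The sketch below uses only the formulas given in the text (max SNR as the square root of saturation capacity, bits as log2, decibels as 20 * log10); the function names and the three example capacities are illustrative assumptions.

```python
import math

def max_snr(saturation_e):
    """Theoretical best SNR for a given saturation capacity in electrons."""
    return math.sqrt(saturation_e)

def snr_bits(snr):
    return math.log2(snr)

def snr_db(snr):
    return 20 * math.log10(snr)

if __name__ == "__main__":
    for capacity in (5_000, 10_000, 50_000):
        snr = max_snr(capacity)
        print(f"{capacity:>6} e-: SNR {snr:6.1f} "
              f"= {snr_bits(snr):.2f} bits = {snr_db(snr):.1f} dB")
```

A tenfold increase in saturation capacity improves the maximum SNR by only a factor of about 3.2 (10 dB), which is why deeper wells are valuable but show diminishing returns.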
But isn’t higher resolution always better?
How can a lower resolution camera possibly be more expensive? The answer lies in a larger sensor, meaning larger pixels, deeper wells, and thus better SNR per the math described above.
Maximum SNR and Measured SNR:
In the real world we can never actually achieve the theoretical max SNR.
The plot below measures how closely a mid-grade CCD approaches the theoretical SNR limit at different levels of lighting. The graph has already excluded considerations of quantum efficiency because we’re only considering the photons that were absorbed and thus freed electrons. The diagonal in the graph represents maximal theoretical SNR, as Sqrt(photons absorbed)=SNR all along that line. Many different units of this mid-grade CCD were tested and are shown as different green lines. The red line is the theoretical curve that fits the observed data. It represents the theoretical average unit for this camera model. A perfect camera would show up as a straight line along the diagonal, always achieving the best possible SNR. (Obviously, no such thing exists that money can buy.) A poorly performing camera would not get very close to its theoretical SNR and would be far to the right of, and far below, the diagonal. Note how SNR increases with illumination. Well-lit images enable us to take the best measurements.
Dark Noise and Dynamic Range:
On each pixel in a sensor there are a certain number of background electrons that will be read even if no photons hit that pixel. Electrons move around according to the laws of physics regardless of what we do, so it is impossible to ever completely empty a pixel well. On a CCD, dark noise is typically 8 to 25 electrons. On a CMOS we typically see 15 to 110.
Dynamic Range is simply full / empty. It is the ratio of the maximum value to the minimum value. Higher dynamic ranges mean a camera can measure a greater range of illumination in a given image. If my camera has a saturation capacity of 10,000 e- and a dark noise of 100 e-, then in one image my camera can only effectively distinguish between items whose brightnesses are within a factor of 100 of each other. If my camera has a QE of 50%, this means that anything emitting under 200 photons per exposure is effectively indistinguishable from dark noise. Similarly, any object emitting more than 20,000 photons per exposure will overwhelm the well capacity. Therefore in one exposure I can only effectively measure light levels between 200 and 20,000 photons.
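The arithmetic in this example can be written out as a short sketch. The function names are hypothetical, and the inputs are the numbers from the paragraph above (10,000 e- saturation capacity, 100 e- dark noise, 50% QE).

```python
def dynamic_range(saturation_e, dark_noise_e):
    """Ratio of the fullest usable signal to the dark-noise floor."""
    return saturation_e / dark_noise_e

def photon_window(saturation_e, dark_noise_e, qe):
    """Lowest and highest photon counts per exposure that can be told apart."""
    return dark_noise_e / qe, saturation_e / qe

if __name__ == "__main__":
    print("dynamic range:", dynamic_range(10_000, 100))
    low, high = photon_window(10_000, 100, 0.5)
    print(f"measurable window: {low:.0f} to {high:.0f} photons per exposure")
```

Dividing by QE converts electron limits back into photon limits, because only a fraction of the arriving photons is converted to charge.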
We can adjust this range by changing exposure time. If we are reading mostly white pixels (the image is washing out), we can cut the exposure time. At very short exposure times, however, an image can become more or less black. An item emitting 20,000 photons per second, for example, delivers 200 photons per exposure at a 10 ms exposure, which yields about 100 electrons, indistinguishable from dark noise. Going to a half second exposure improves this, collecting 10,000 photons, but such a long exposure would create motion blur even for slowly moving objects.
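As a rough planning aid, the exposure trade-off can be computed directly. This sketch assumes a constant photon rate and the 50% QE used in the example; the function names and the 1,000 e- target in the last line are hypothetical choices, not recommendations.

```python
def electrons_collected(photon_rate_per_s, exposure_s, qe=0.5):
    """Expected electrons for a constant photon rate and exposure time."""
    return photon_rate_per_s * exposure_s * qe

def exposure_for_target(target_e, photon_rate_per_s, qe=0.5):
    """Exposure time in seconds needed to collect `target_e` electrons on average."""
    return target_e / (photon_rate_per_s * qe)

if __name__ == "__main__":
    rate = 20_000  # photons per second, as in the example above
    for exposure_s in (0.010, 0.5):
        electrons = electrons_collected(rate, exposure_s)
        print(f"{exposure_s * 1000:6.0f} ms exposure: about {electrons:.0f} electrons")
    print(f"exposure for 1,000 e-: {exposure_for_target(1_000, rate) * 1000:.0f} ms")
```

If the required exposure is longer than the motion in the scene allows, the usual alternative is to add light rather than time.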
Other sources of noise:
ADC (Analog to Digital Conversion) Noise:
Not all ADCs are alike. Because they are effectively the yardstick in the camera, it’s very important that they are accurate and consistent. Manufacturers will typically calibrate ADCs under uniform lighting conditions to make sure that the ADC on one side is not giving lower values than the ADC on the other.
Photo Response Non-Uniformity (PRNU):
Not all pixels are created equal, and not all pixels respond linearly to increasing light. PRNU measures this variation from pixel to pixel. As we move to higher quality sensors this number will fall. 1% is very good.
Some pixels will simply not work. In the process of fabricating over 1 million identical pixel wells on a 1 mega-pixel sensor there is bound to be an error or two. Even at six-sigma precision, we should expect 3.4 defective pixels per million. As a camera ages, pixels can fail. This is one reason we generally recommend overshooting slightly on resolution for machine vision applications. Any system that intends to rely on the value of a single pixel is setting itself up to fail.
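One common way to avoid depending on a single pixel is to read a small neighborhood and take its median, which a lone dead or stuck pixel cannot drag far. The sketch below is an illustration of that general idea rather than a description of any specific vision system; the function name robust_value and the 3x3 sample image are made up for the example.

```python
from statistics import median

def robust_value(image, row, col, radius=1):
    """Median of the pixels around (row, col), clipped at the image borders.

    `image` is a list of equal-length rows of numeric pixel values.
    """
    rows = range(max(0, row - radius), min(len(image), row + radius + 1))
    cols = range(max(0, col - radius), min(len(image[0]), col + radius + 1))
    return median(image[r][c] for r in rows for c in cols)

if __name__ == "__main__":
    # Hypothetical 3x3 patch of a uniform gray target with a dead center pixel.
    patch = [
        [118, 121, 119],
        [120,   0, 122],
        [121, 119, 120],
    ]
    print("raw center pixel:  ", patch[1][1])
    print("neighborhood value:", robust_value(patch, 1, 1))
```

This is also where the extra resolution pays off: a feature that spans several pixels can still be measured when one of them fails.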
Thermal Noise (a component of Dark Noise):
Besides light, heat can also excite silicon atoms, causing them to release electrons. This factor is one component of dark noise and is fairly constant. At room temperature it’s generally a few electrons per pixel per second, so it’s not big enough to justify a cooled camera. However, when we image very dark areas over very long exposure times (e.g. astronomy), this becomes a big problem. It can also be a problem in exceedingly hot environments like a steel plant. For machine vision applications in these situations, we typically design cooling enclosures to minimize thermal noise as much as possible.
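A quick calculation shows why exposure time matters so much here. The sketch assumes thermal electrons accumulate at a constant rate, and the rate of 3 electrons per pixel per second is an assumed stand-in for "a few"; real values depend on the sensor and its temperature.

```python
def thermal_electrons(rate_e_per_s, exposure_s):
    """Thermally generated electrons per pixel, assuming a constant rate."""
    return rate_e_per_s * exposure_s

if __name__ == "__main__":
    rate = 3  # electrons per pixel per second (assumed room-temperature value)
    for label, exposure_s in (("10 ms", 0.010), ("1 s", 1), ("10 min", 600)):
        print(f"{label:>6} exposure: about {thermal_electrons(rate, exposure_s):g} electrons")
```

At typical machine vision exposure times the thermal contribution is a small fraction of an electron, while over minutes it can reach a sizable share of a 5,000 e- to 50,000 e- well.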