NIR (near-infrared) spectroscopy is a versatile technology that is widely used to provide reliable and accurate analysis of a wide range of sample types. The ability to provide accurate results relies on two factors:
As described in an earlier blog post, NIR is an analytical technique based on measuring differences in how NIR energy is transmitted through or reflected by a sample, depending on the sample's chemical composition. NIR is therefore a secondary technique: a primary reference method is required to create an NIR calibration. The calibration itself is a mathematical correlation between the raw NIR data from samples and the chemical constituent or property of interest.
The first task in developing an NIR calibration is to collect samples. The samples should be representative of the future unknown samples to be measured across all areas of potential variability, including constituent range(s), origin, and seasonal variation. Collecting the right samples is often the most difficult step in creating an NIR calibration.
For full-spectrum instruments such as photodiode array (PDA), Fourier transform (FT), and scanning monochromator instruments, a sufficient number of samples is required to properly account for sample variability across the many data points these instruments collect. A provisional calibration normally needs at least 50 samples, but good, robust calibrations are usually built on more than 100 samples, depending on the product type, and some are based on thousands of samples.
A first collection of selected samples is then sent for reference analysis. This might be Dumas or Kjeldahl for protein, Soxhlet extraction for fats and oils, or any other approved reference method for the constituent(s) of interest. It is important that the reference data be of high quality, with minimal error, because the accuracy and performance of the NIR calibration are limited by the reference error.
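The effect of reference error can be illustrated with a small simulation. The sketch below is not taken from any particular instrument or software. It assumes NumPy is available, and the error sizes (0.10 for the NIR model, 0.20 for the reference lab), the protein range, and all variable names are made-up values for illustration. It also assumes the two error sources are independent, in which case the error seen when NIR predictions are compared with reference values combines both.

```python
import numpy as np

rng = np.random.default_rng(42)          # fixed seed so the run is repeatable

n_samples = 20_000
true_protein = rng.uniform(8.0, 18.0, n_samples)   # hypothetical true values, %

model_error_sd = 0.10      # assumed error of the NIR model itself
reference_error_sd = 0.20  # assumed error of the reference lab method

# What the reference lab reports and what the NIR calibration predicts.
reference_values = true_protein + rng.normal(0.0, reference_error_sd, n_samples)
nir_predictions = true_protein + rng.normal(0.0, model_error_sd, n_samples)


def rmse(a, b):
    """Root mean square difference between two arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Error as it would appear in a validation against the reference method.
observed_error = rmse(nir_predictions, reference_values)
# Independent errors add in quadrature.
expected_error = float(np.hypot(model_error_sd, reference_error_sd))

print(f"observed vs reference:       {observed_error:.3f}")
print(f"sqrt(model^2 + reference^2): {expected_error:.3f}")
```

In this toy setup the observed error comes out at roughly 0.22 even though the model itself is only off by 0.10, so a noisy reference method makes the calibration look worse than it is, on top of making it harder to fit in the first place.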
Once the reference data are obtained, they are paired with the raw sample spectra and the two data sets are regressed against each other, often using partial least squares (PLS) regression. The output is a linear equation that can be applied to future unknown samples to predict the constituents of interest. A diagram is below:
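To make the regression step concrete, here is a minimal Python sketch of a single-constituent PLS fit (the NIPALS variant, often called PLS1) written with NumPy. It is an illustration under stated assumptions, not the algorithm used by any specific NIR software: the function name `fit_pls1`, the synthetic two-band spectra, the band positions, and the concentration ranges are all invented for the example, and the data are noise-free so that two components fit exactly.

```python
import numpy as np


def fit_pls1(X, y, n_components):
    """Fit PLS1 by NIPALS; return (intercept, coefficients) of a linear equation."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean           # mean-centred working copies
    n_vars = X.shape[1]
    R = np.zeros((n_vars, n_components))      # rotated weights: scores = Xc @ R
    P = np.zeros((n_vars, n_components))      # spectral loadings
    q = np.zeros(n_components)                # loadings for the reference values
    for a in range(n_components):
        w = Xr.T @ yr                         # direction of maximum covariance
        w /= np.linalg.norm(w)
        t = Xr @ w                            # scores for this component
        tt = t @ t
        P[:, a] = Xr.T @ t / tt
        q[a] = yr @ t / tt
        r = w.copy()                          # express the weight in terms of
        for j in range(a):                    # the original (undeflated) spectra
            r -= (P[:, j] @ w) * R[:, j]
        R[:, a] = r
        Xr = Xr - np.outer(t, P[:, a])        # deflate spectra
        yr = yr - q[a] * t                    # deflate reference values
    coef = R @ q
    intercept = y_mean - x_mean @ coef
    return intercept, coef


# Synthetic calibration set (hypothetical): two absorption bands whose heights
# scale with two constituents. Values are for illustration only.
rng = np.random.default_rng(0)
wavelengths = np.linspace(1100, 2500, 141)                     # nm
band_protein = np.exp(-((wavelengths - 2180) / 40.0) ** 2)
band_moisture = np.exp(-((wavelengths - 1940) / 50.0) ** 2)
protein = rng.uniform(8, 18, size=60)                          # "reference data"
moisture = rng.uniform(9, 15, size=60)
spectra = np.outer(protein, band_protein) + np.outer(moisture, band_moisture)

intercept, coef = fit_pls1(spectra, protein, n_components=2)

# The calibration is just: predicted = intercept + spectrum . coef
predicted = intercept + spectra @ coef
print("largest calibration residual:", np.max(np.abs(predicted - protein)))
```

Real spectra are noisy and contain many more sources of variation, so in practice the number of components is usually chosen by cross-validation rather than fixed in advance as it is here.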
From this point on, the calibration above can be used on unknown samples to predict the protein content. Outlier indicators can be used to check that the new samples being analyzed are similar to the samples used to create the calibration, which helps ensure that the results are accurate and reliable.
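The text does not say which outlier indicators are meant. As one possible example, the sketch below computes a Mahalanobis distance on principal component scores, which measures how far a new spectrum lies from the centre of the calibration set. Everything in it is an assumption made for illustration: the function names, the synthetic two-band spectra, and the cut-off of 3.0, which would need to be set for the application at hand.

```python
import numpy as np


def fit_outlier_model(cal_spectra, n_components):
    """Summarise the calibration spectra by their mean and principal components."""
    X = np.asarray(cal_spectra, dtype=float)
    mean = X.mean(axis=0)
    # SVD of the centred spectra gives the principal component loadings.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    loadings = Vt[:n_components].T
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)   # variance per score
    return mean, loadings, score_var


def mahalanobis_distance(spectrum, model):
    """Distance of one spectrum from the calibration centre, in score space."""
    mean, loadings, score_var = model
    scores = (np.asarray(spectrum, dtype=float) - mean) @ loadings
    # Principal component scores are uncorrelated, so each is scaled separately.
    return float(np.sqrt(np.sum(scores ** 2 / score_var)))


# Synthetic calibration set (hypothetical values, illustration only).
rng = np.random.default_rng(1)
wavelengths = np.linspace(1100, 2500, 141)                  # nm
band_a = np.exp(-((wavelengths - 2180) / 40.0) ** 2)
band_b = np.exp(-((wavelengths - 1940) / 50.0) ** 2)
protein = rng.uniform(8, 18, 60)
moisture = rng.uniform(9, 15, 60)
cal_spectra = np.outer(protein, band_a) + np.outer(moisture, band_b)

model = fit_outlier_model(cal_spectra, n_components=2)

typical = 13.0 * band_a + 12.0 * band_b    # inside the calibration range
unusual = 40.0 * band_a + 12.0 * band_b    # far outside the calibration range

THRESHOLD = 3.0                            # hypothetical cut-off
for name, spectrum in [("typical", typical), ("unusual", unusual)]:
    d = mahalanobis_distance(spectrum, model)
    flag = "outlier" if d > THRESHOLD else "ok"
    print(f"{name}: distance = {d:.2f} ({flag})")
```

A sample flagged this way is not necessarily wrong, but its predicted value falls outside what the calibration samples covered and should be treated with caution or confirmed with the reference method.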