By Bill Mahanna
I had the pleasure of attending a presentation by John Shenk, one of the early pioneers in the application of near-infrared (NIR) technology to the analysis of forage and grains.
Shenk spoke at the Feb. 13-14 joint conference of the NIRS Forage & Feed Consortium, FeedAC and the National Forage Testing Assn. (NFTA; Feedstuffs, April 14).
Listening to Shenk and Paolo Berzaghi (NIRSC technical consultant) discuss NIR calibration development, and watching them interact with the laboratory managers and technicians in attendance, made me wish more nutritionists had been in the audience so they could have walked away with a sense of confidence in the NIR values being generated by commercial and university laboratories.
What is often forgotten about NIR-predicted values is that NIR is a secondary method based on a regression against a primary (or reference) method. Consequently, the NIR value can never be more accurate than the primary reference method. We all need to remember the limitations and laboratory errors associated with methods such as neutral detergent fiber (NDF) so that we are not unfairly blaming NIR spectroscopy (NIRS) prediction models for errors associated with the original reference chemistry.
NIRS is not new
NIRS has been discussed in the literature since 1939, but it was not until 1968 that Karl Norris and co-workers with the Instrumentation Research Lab of the U.S. Department of Agriculture first applied the technology to agricultural products.
They observed that cereal grains exhibited specific absorption bands in the NIR region and suggested that NIR instruments could be used to measure grain protein, oil and moisture. Research in 1976 demonstrated that absorption of other specific wavelengths was correlated with chemical analysis of forages (Shenk, 2001).
Shenk and his research team utilized a custom-designed spectro-computer system in 1977 to provide a rapid and accurate analysis of forage quality. Early in 1978, their group developed a portable instrument for use in a mobile van to deliver nutrient analysis of forages directly on farm and at hay auctions. This evolved into the use of university extension mobile NIR vans in Pennsylvania, Minnesota, Wisconsin and Illinois.
In 1978, the USDA NIRS Forage Network was founded to develop and test computer software to advance the science of NIRS grain and forage testing. By 1983, several commercial companies had begun marketing NIR instruments and software packages for forage and feed analysis (Shenk, 2001).
How NIRS works
NIRS is based on the interaction of physical matter with light in the near-infrared spectral region (700-2,500 nm). Sample preparation and presentation to the NIR instrument vary widely. Though dried, finely ground samples are often employed, whole grains or fresh, unground samples also can be scanned. Instruments can be stationary in a laboratory or mobile (e.g., on board a silage chopper).
Monochromatic light produced by an NIR instrument interacts with plant material in a number of ways, including reflection, refraction, absorption and diffraction. Vibrations of hydrogen bonded with carbon, nitrogen or oxygen cause the molecular "excitement" responsible for absorption of specific amounts of radiation at specific wavelengths. This allows labs to relate specific chemical bond vibrations (generating specific spectra) to the concentration of a specific feed component (e.g., starch).
Spectroscopy is possible because molecules react the same way each time they are exposed to the same radiation.
NIR instruments are much less sensitive in quantifying individual inorganic elements (e.g., calcium, phosphorus or magnesium) or mixtures (e.g., ash) because they are measuring the influence of these "contaminating materials" on the covalent bonds.
Building a 'calibration'
The individual laboratory or consortium that develops a prediction model uses software packages to perform the mathematical calculations necessary to associate the NIR spectra of reference samples with the reference chemistry of those reference samples. This mathematical process is called "chemometrics." The mathematical equations developed are termed "prediction models," although they are also called "calibrations."
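As a rough illustration of what "associating spectra with reference chemistry" means, the sketch below fits a straight line relating absorbance at a single wavelength to a wet chemistry value. It is a deliberate simplification: commercial chemometric software works with many wavelengths at once and more elaborate regression techniques. The function names and every number here are hypothetical, chosen only to show the idea.

```python
# Minimal sketch: fit a one-wavelength linear "prediction model" by
# ordinary least squares. All names and numbers are made-up illustrations,
# not values from any real calibration.

def fit_calibration(absorbances, reference_values):
    """Return (slope, intercept) relating absorbance to reference chemistry."""
    n = len(absorbances)
    mean_x = sum(absorbances) / n
    mean_y = sum(reference_values) / n
    sxx = sum((x - mean_x) ** 2 for x in absorbances)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(absorbances, reference_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, absorbance):
    """Apply the fitted model to the absorbance of an unknown sample."""
    slope, intercept = model
    return slope * absorbance + intercept


# Hypothetical reference samples: absorbance at one wavelength, paired with
# crude protein (% of dry matter) from the reference method.
absorbances = [0.40, 0.45, 0.50, 0.55, 0.60]
crude_protein = [7.9, 8.6, 9.0, 9.6, 10.1]

model = fit_calibration(absorbances, crude_protein)
print("Predicted crude protein:", round(predict(model, 0.52), 2))
```

Because the line is fitted to the reference values, any error in the wet chemistry is carried directly into the model, which is why the prediction cannot be better than the reference method.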
The robustness of an NIR prediction model is, in part, determined by the size and representative nature of the calibration population samples that will be analyzed by reference methods. The sample population should represent the full diversity of plant materials to be scanned.
For instance, if the goal is to develop a prediction model for crude protein in corn grain, then samples of corn from diverse genetic and environmental backgrounds need to be included in the population to be analyzed by the chosen reference method in a lab with high NFTA performance statistics. When a particular analytical methodology may not exist (e.g., for prediction of ethanol yield from corn fermentation), laboratories may develop an entirely new reference method.
Numerous samples should be scanned by NIR and assayed by wet chemistry procedures to obtain good calibration statistics. A "proof-of-concept" model will utilize 50-60 samples; fully developed prediction models require at least 80-100 samples, and the number can run into the thousands depending upon the error terms associated with each analyte. The final number of samples required depends upon the analytical and spectral diversity within the reference samples selected for developing the prediction model (Sevenich, 2008).
To reduce total error, it is desirable to have multiple replicates of the sample analyzed by the reference method and scanned multiple times with the specific NIR instrument. If the calibration set is being developed for a dried, ground sample NIR instrument, then drying conditions must be standardized. Spectra production is quite sensitive to differences in sample particle size and shape. As a direct result, consistent sample preparation (e.g., grinding) is critical.
One of the largest sources of error in NIR predictions between labs that use the same calibration is a result of differences in how the labs prepare the sample (e.g., different type or worn grinders; Berzaghi, 2008).
To develop robust NIRS prediction models and valid results, laboratories must:
Questions for the lab
As users of NIR-predicted values, we should all feel comfortable asking our chosen analytical partners questions about NIR prediction model and wet chemistry statistics. A better understanding of standard errors or confidence intervals for lab values can help instill confidence in how to use these values, in much the same way that other statistics (e.g., P-values) help us determine the confidence we put in research trial summaries.
Here is a list of statistics that reputable NIR laboratories will be able to discuss (Ruser, 2007; Allen, 2008; Sevenich, 2008; Owens, 2008):
To characterize reference methods, specific categories (loose, moderate and tight) can be used. Digestibility, with an SD of about two units, is an example of a loose fit. NDF as a percentage of dry matter with an SD of 1.0-1.5 is an example of a moderate fit. Crude protein as a percentage of dry matter with an SD of 0.3-0.5 is an example of a tight fit.
When the reference method is imprecise, the precision of predicting composition of unknown samples also will be imprecise. This also will be reflected as greater NIR SEP and lower R2 values (Sapienza, 2008).
The Table illustrates a sliding scale of how "robustness" or "goodness of fit" of an NIRS prediction model varies with SD of the reference method. The categories that describe the goodness of fit of the prediction models are favorable, moderately favorable and unfavorable (Sapienza, 2008).
The size of SEP generally varies directly with SD of the reference method. A reference method must have a low SD if NIR is expected to provide useful information or be a stand-alone analytical method. An example of a favorable prediction model would be predicting crude protein as a percent of dry matter with an SEP of 0.3-0.5 and an R2 of 0.95.
In contrast, an example of an unfavorable prediction model might be NDF as a percentage of dry matter with an SEP of two to three and an R2 of 0.80 (Sapienza, 2008).
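For readers who want to see where SEP and R2 come from, the sketch below computes both from paired reference and NIR-predicted values. The article does not spell out the formulas, so this assumes one common convention: SEP as the bias-corrected standard deviation of the prediction differences, and R2 as the squared correlation between predicted and reference values. A laboratory may use a different definition, so it is worth asking. The function name and data are hypothetical.

```python
import math

# Minimal sketch, assuming SEP is the bias-corrected standard deviation of
# (predicted - reference) and R2 is the squared Pearson correlation.
# The sample values are made up for illustration only.

def sep_and_r2(reference, predicted):
    """Return (bias, SEP, R2) for paired reference and predicted values."""
    n = len(reference)
    diffs = [p - r for p, r in zip(predicted, reference)]
    bias = sum(diffs) / n
    sep = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))

    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    srr = sum((r - mean_r) ** 2 for r in reference)
    spp = sum((p - mean_p) ** 2 for p in predicted)
    srp = sum((r - mean_r) * (p - mean_p)
              for r, p in zip(reference, predicted))
    r2 = srp ** 2 / (srr * spp)
    return bias, sep, r2


# Hypothetical validation set: crude protein, % of dry matter.
reference = [8.0, 9.0, 10.0, 11.0, 12.0]   # wet chemistry
predicted = [8.2, 8.9, 10.3, 10.8, 12.3]   # NIR prediction

bias, sep, r2 = sep_and_r2(reference, predicted)
print("bias:", round(bias, 3), "SEP:", round(sep, 3), "R2:", round(r2, 3))
```

A validation set of five samples is far too small for real work; it is used here only to keep the arithmetic easy to follow.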
The bottom line
NIR analysis as an analytical technique has a long and credible history. NIR is a secondary method that never can be more accurate than the reference method upon which it is based. Statistically robust prediction models allow for a rapid and repeatable assay procedure for nutritional values that helps the livestock industry detect and manage variability in composition among and within feedstuffs.
The cost effectiveness of NIR analysis allows the total analytical error (sampling and laboratory) to be reduced because a larger number of subsamples or sequential samples can be assayed with a limited analytical budget than is possible using the more expensive wet chemistry approaches.
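The arithmetic behind this point can be sketched as follows. The example assumes that sampling error and laboratory error are independent, that each subsample is assayed once, and that the result of interest is the average across subsamples. The budget, per-sample costs and SD values are entirely hypothetical and are not drawn from any laboratory's price list or performance data.

```python
import math

# Minimal sketch: compare the standard error of a mean value when a fixed
# budget is spent on wet chemistry versus NIR. Every number is hypothetical.

def se_of_mean(sampling_sd, lab_sd, n_samples):
    """Standard error of the mean of n_samples subsamples, each assayed
    once, assuming independent sampling and laboratory errors."""
    return math.sqrt((sampling_sd ** 2 + lab_sd ** 2) / n_samples)


budget = 120.0         # hypothetical analytical budget
wet_cost = 40.0        # hypothetical cost per wet chemistry assay
nir_cost = 12.0        # hypothetical cost per NIR assay

sampling_sd = 2.0      # hypothetical variation among subsamples
wet_lab_sd = 0.5       # hypothetical wet chemistry laboratory error
nir_lab_sd = 0.8       # hypothetical NIR error (larger than wet chemistry)

n_wet = int(budget // wet_cost)   # subsamples affordable by wet chemistry
n_nir = int(budget // nir_cost)   # subsamples affordable by NIR

se_wet = se_of_mean(sampling_sd, wet_lab_sd, n_wet)
se_nir = se_of_mean(sampling_sd, nir_lab_sd, n_nir)

print("wet chemistry:", n_wet, "samples, SE", round(se_wet, 2))
print("NIR:", n_nir, "samples, SE", round(se_nir, 2))
```

Under these assumed numbers, the NIR average has the smaller standard error even though each individual NIR value is less precise, because sampling variation dominates and more subsamples average it down.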
To enhance trust, nutritionists, producers and laboratories are encouraged to communicate more fully and openly so that NIR prediction model and wet chemistry statistics are understood more clearly.
This article was originally published in the June 2008 issue of Feedstuffs and is reproduced with permission.