Variable selection procedures before partial least squares regression enhance the accuracy of milk fatty acid composition predicted by mid-infrared spectroscopy

GOTTARDO, PAOLO; PENASA, MAURO; DE MARCHI, MASSIMO
2016

Abstract

Mid-infrared spectroscopy is a high-throughput technique that allows milk quality traits to be predicted on a large scale. The prediction accuracy achievable with partial least squares (PLS) regression is usually high for fatty acids (FA) that are abundant in milk, but it decreases for FA present in low concentrations. Two variable selection methods, uninformative variable elimination and a genetic algorithm, were each combined with PLS regression in the present study to investigate their effect on the accuracy of prediction equations for the milk FA profile, expressed either relative to total identified FA or as a concentration in milk. For FA expressed relative to total identified FA, the coefficient of determination of cross-validation from PLS alone was low (0.25) for polyunsaturated FA and medium (0.70) for saturated FA. With PLS alone, the coefficient of determination increased to 0.54 and 0.95 for polyunsaturated and saturated FA, respectively, when FA were expressed on a milk basis. Applying either algorithm before PLS regression improved the accuracy of prediction, especially for FA that are usually difficult to predict; for example, the improvement relative to PLS regression alone ranged from 9 to 80%. In general, FA were better predicted when their concentrations were expressed on a milk basis. These results might favor the use of prediction equations in the dairy industry for genetic purposes and payment systems.
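
To make the reported figures easier to compare, the following minimal sketch computes a coefficient of determination and the relative change between two such values. It is an illustration only, not the authors' procedure: the function names are hypothetical, the definition 1 - SS_res / SS_tot is an assumption (the abstract does not state which formula was used), and no PLS model or variable selection is fitted here.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot.

    `observed` are reference values and `predicted` are the values a model
    produced for the same samples (for example, held-out cross-validation
    predictions). Both names are illustrative.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


def relative_change_pct(baseline, new):
    """Percentage change of `new` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (new - baseline) / baseline


if __name__ == "__main__":
    # Values quoted in the abstract for PLS alone:
    # total-identified-FA basis versus milk basis.
    print(f"polyunsaturated FA: {relative_change_pct(0.25, 0.54):.0f}%")
    print(f"saturated FA: {relative_change_pct(0.70, 0.95):.0f}%")
```

Under these assumptions, moving from the total-identified-FA basis to the milk basis corresponds to a relative increase of about 116% for polyunsaturated FA and about 36% for saturated FA. These figures describe the change of basis only. They are distinct from the 9 to 80% improvement attributed to variable selection, whose exact definition the abstract does not give.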
Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3204297