Predicting milk protein fractions using infrared spectroscopy and a gradient boosting machine for breeding purposes in Holstein cattle

Bisutti, V; Vanzin, A; Pegolo, S; Toscano, A; Schiavon, S; Tagliapietra, F; Gallo, L; Cecchinato, A (each listed as Member of the Collaboration Group)
2023

Abstract

In recent years, increasing attention has been focused on the genetic evaluation of protein fractions in cow milk with the aim of improving milk quality and technological characteristics. In this context, advances in high-throughput phenotyping by Fourier transform infrared (FTIR) spectroscopy offer the opportunity for large-scale, efficient measurement of novel traits that can be exploited in breeding programs as indicator traits. We took milk samples from 2,558 Holstein cows belonging to 38 herds in northern Italy, operating under different production systems. Fourier transform infrared spectra were collected on the same day as milk sampling and stored for subsequent analysis. Two sets of data (i.e., phenotypes and FTIR spectra) collected in 2 different sampling periods (2013 and 2019-2020) were compiled. The following traits were assessed using HPLC: true protein, major casein fractions [alpha S1-casein (CN), alpha S2-CN, beta-CN, kappa-CN, and glycosylated kappa-CN], and major whey proteins (beta-lactoglobulin and alpha-lactalbumin), all of which were measured both in grams per liter (g/L) and as a proportion of total nitrogen (% N). The FTIR predictions were calculated using the gradient boosting machine technique and tested by 3 different cross-validation (CRV) methods.
We used the following CRV scenarios: (1) random 10-fold CRV, which randomly split the whole data set into 10 folds of equal size (9 folds for training and 1 fold for validation); (2) herd/date-out CRV, which assigned 80% of the herd/date groups to the training set and the remaining 20%, independent of the training set, to the validation set; (3) forward/backward CRV, which split the data set into training and validation sets according to the year of milk sampling (FTIR and gold standard data assessed in 2013 or 2019-2020), using the "old" and "new" databases for training and validation and vice versa, with independence between them; and (4) the CRV for genetic parameters (CRV-gen), in which animals without pedigree information were assigned to a fixed training population and animals with pedigree information were split into 5 folds, of which 1 fold was assigned to the fixed training population and 4 folds were assigned to the validation set (independent of the training set). The measures and predictions from CRV-gen were used to infer the genetic parameters for the gold standard laboratory measurements (i.e., proteins assessed with HPLC) and the FTIR-based predictions from a bi-trait animal model using single-step genomic BLUP. We found that the prediction accuracies of the gradient boosting machine equations differed according to the way in which the proteins were expressed, with higher accuracy when they were expressed in g/L than when they were expressed as % N in all CRV scenarios. Concerning the reproducibility of the equations over the different years, the results showed no relevant differences in predictive ability between using "old" data as the training set and "new" data as the validation set and vice versa. Comparing the additive genetic variance estimates for milk protein fractions between the FTIR predictions and the HPLC measures, we found reductions of -19.7% for milk protein fractions expressed in g/L, and -21.19% for those expressed as % N.
Although we found reductions in the heritability estimates, they were small, with values ranging from -1.9 to -7.25% for g/L, and -1.6 to -7.9% for % N. The posterior distributions of the additive genetic correlations (ra) between the FTIR predictions and the laboratory measurements were generally high (>0.8), even when the milk protein fractions were expressed as % N. Our results show the potential of using FTIR predictions in breeding programs as indicator traits for the selection of animals to enhance milk protein fraction contents. We expect acceptable responses to selection due to the high genetic correlations between HPLC measurements and FTIR predictions.
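The splitting logic of the first three CRV scenarios described in the abstract can be sketched as follows. This is only a minimal illustration using the Python standard library, not the authors' pipeline: the function names, the toy herd labels, the period labels, and the random seed are all assumptions made for the example, and the gradient boosting model and the CRV-gen scenario are left out.

```python
import random


def random_kfold(n_samples, k=10, seed=0):
    """Random k-fold: shuffle sample indices and deal them into k folds.

    Fold sizes differ by at most one sample when n_samples is not a
    multiple of k. Returns a list of (train_indices, valid_indices).
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, valid in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, valid))
    return splits


def herd_date_out(groups, valid_fraction=0.2, seed=0):
    """Hold out whole herd/date groups so that no group is shared
    between the training and validation sets."""
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    n_valid = max(1, round(valid_fraction * len(unique)))
    valid_groups = set(unique[:n_valid])
    train = [i for i, g in enumerate(groups) if g not in valid_groups]
    valid = [i for i, g in enumerate(groups) if g in valid_groups]
    return train, valid


def forward_backward(periods, old="2013", new="2019-2020"):
    """Forward: train on the old period, validate on the new one.
    Backward: the reverse. Returns both (train, valid) pairs."""
    old_idx = [i for i, p in enumerate(periods) if p == old]
    new_idx = [i for i, p in enumerate(periods) if p == new]
    return [(old_idx, new_idx), (new_idx, old_idx)]


# Toy data (hypothetical): 10 herd/date groups with 10 samples each,
# the first 5 groups sampled in 2013 and the rest in 2019-2020.
records = [
    (f"herd{h:02d}", "2013" if h < 5 else "2019-2020")
    for h in range(10)
    for _ in range(10)
]
groups = [r[0] for r in records]
periods = [r[1] for r in records]

kfold_splits = random_kfold(len(records), k=10)
hdo_train, hdo_valid = herd_date_out(groups, valid_fraction=0.2)
fb_splits = forward_backward(periods)
```

In each case a model would be fitted on the training indices and evaluated on the validation indices. Holding out whole groups or whole periods, rather than individual samples, is what keeps the validation set independent of the training set in scenarios (2) and (3).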
Files in this record:
File: 2023 Mota et al JDS Predicting milk protein fraction using infrared spectroscopy and a gradient boosting machine for breeding purposes in Holstein cattle JDS 2023.pdf
Access: open access
Type: Published (publisher's version)
License: Creative Commons
Size: 5.56 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3473435