Predicting Insemination Outcome in Holstein Dairy Cattle using Deep Learning

Alishahi, Mohammad; Ravakhah, Mahdi

doi:10.22067/ijasr.2024.89072.1212

Document Type: Research Article

Authors

Department of Computer Engineering, Research Center of Smart Distribution Networks, Mashhad Branch, Islamic Azad University, Mashhad, Iran

Abstract

Introduction: A predictive model built with machine learning can help dairy farmers better understand the performance potential of their animals. It can support decisions on herd management, culling and replacement selection, nutrition, and reproduction. Predicting insemination outcomes provides valuable insight for improving reproductive performance, breeding programs, milk production, and overall herd efficiency. Integrating predictive models into existing animal husbandry systems increases their practical value as decision support tools for farmers. With a tool that estimates the reproductive success of their animals, farmers can optimize production and breeding strategies and improve herd management practices to raise reproductive efficiency and profitability. In this study
Materials and Methods: This study used data from Helal Agro-Industry Co., a commercial dairy farm affiliated with the Iranian Red Crescent Investment Company. Commercial dairy herds in this region consist primarily of Holstein-Friesian cattle. The farm keeps its records in the Modiran Farmer software, which stores its data in a Microsoft SQL Server database. The database contains tables covering reproduction, milking, health, genetics, and general animal characteristics, with records spanning January 1994 through May 2023. We ran a SQL query against the database to build a dataset of insemination records and their corresponding features. For each insemination record, we retrieved 25 features covering milking, reproduction, management factors, health, and the insemination result. After extraction, the data were pre-processed to make them suitable for the proposed models. We evaluated three models: Long Short-Term Memory (LSTM), Multi-Layer Perceptron (MLP), and XGBoost. The set of distinct cow IDs was partitioned into three subsets: 70% for training, 10% for validation, and 20% for testing. Because a cow's successive insemination cycles are temporally dependent, we stacked each cow's cycles into sequences that the LSTM model can process, using the cow IDs in each subset to generate that subset's sequences. A data augmentation step generated all possible sub-sequences of each cow's inseminations. The sequences were then aligned and stacked to a constant length of 20. In total, about 168,000 training sequences, 23,000 validation sequences, and 46,000 test sequences were generated. We tuned the hyperparameters of each model and, after finalizing the architectures, trained the models on the prepared datasets.
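The cow-level split and the sequence construction described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, "all possible sub-sequences" is read here as every chronological prefix of a cow's insemination history, and reaching the constant length of 20 is assumed to mean keeping the most recent records and left-padding with zero vectors.

```python
import random

SEQ_LEN = 20  # constant sequence length stated in the abstract


def split_cow_ids(cow_ids, seed=0):
    """Shuffle the unique cow IDs and split them 70/10/20 (train/val/test).

    Splitting by cow rather than by record keeps all of one cow's
    inseminations in a single subset.
    """
    ids = sorted(set(cow_ids))
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * 7 // 10
    n_val = len(ids) // 10
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def prefix_subsequences(records):
    """Yield every prefix of one cow's chronologically ordered records.

    Assumption: each prefix is one augmented sample whose last record is
    the insemination being predicted.
    """
    for end in range(1, len(records) + 1):
        yield records[:end]


def pad_to_length(seq, n_features, length=SEQ_LEN):
    """Keep the most recent `length` records and left-pad with zero vectors.

    Assumption: zero padding on the left; the abstract does not say how
    the constant length was reached.
    """
    seq = seq[-length:]
    padding = [[0.0] * n_features for _ in range(length - len(seq))]
    return padding + seq
```

With this reading, a cow with three recorded inseminations contributes three sequences, each of length 20 after padding. Generating sequences only after the IDs are split means no cow appears in more than one subset.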
Results and Discussion: Our experiments show that the proposed LSTM model significantly improved prediction accuracy compared with the MLP and XGBoost models. The LSTM model, built from three consecutive LSTM layers, achieved the best performance on every evaluation metric, averaged over 10 training runs. LSTM networks are designed to handle long-range temporal dependencies: their memory cells retain important information over time, which makes them well suited to sequential data such as time series. Their gate mechanisms let the network filter out irrelevant information while retaining what matters, so it can learn complex dependencies between successive records. XGBoost and MLP, by contrast, are not designed for temporal dependencies; they focus mainly on direct interactions between features, and their performance on this type of data is more limited. As deep models, LSTM networks also extract higher-level features from the data, which carry richer information for classification and ultimately improve accuracy. Although XGBoost-based models are known for their precision, they are less adept at extracting such features. Finally, the memory structure of the LSTM helps it cope with fluctuations and unexpected variations in sequential data and separate critical information from noise, so it performs better when the data are noisy.
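To make the gate mechanism concrete, here is a minimal sketch of one time step of a standard LSTM cell, reduced to a single unit with scalar input and state. It follows the textbook formulation and is not the three-layer model used in the study; the function name and the `params` layout are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell.

    `params` maps each gate name to a tuple
    (input weight, recurrent weight, bias).
    """
    def pre(name):
        w, u, b = params[name]
        return w * x + u * h_prev + b

    f = sigmoid(pre("forget"))       # share of the old memory to keep
    i = sigmoid(pre("input"))        # share of the new candidate to write
    g = math.tanh(pre("candidate"))  # candidate memory content
    o = sigmoid(pre("output"))       # share of the memory to expose
    c = f * c_prev + i * g           # updated memory cell
    h = o * math.tanh(c)             # updated hidden state
    return h, c
```

When the forget gate is close to 1 and the input gate close to 0, the cell carries its memory forward almost unchanged regardless of the current input. This is the sense in which the gates retain essential information and filter out the rest.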
Conclusion: We presented and compared several models for predicting the outcome of artificial insemination in dairy cattle. Such predictions can help farmers improve performance, increase fertility, and reduce costs. On the stated evaluation criteria, the proposed LSTM model performed best, followed by the XGBoost-based classifier, which in turn outperformed the MLP.

Main Subjects

Animal Physiology and biotechnology

©2023 The author(s). This is an open access article distributed under Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source.


Iranian Journal of Animal Science Research

Volume 16, Issue 4 - Serial Number 60
December 2025
Pages 529-541
