Benchmarking between item based collaborative filtering algorithm and genomic best linear unbiased prediction (GBLUP) model in terms of prediction accuracy for wheat and maize//Estudio comparativo en términos de capacidad predictiva para datos de trigo y maíz entre el algoritmo de filtrado colaborativo y el modelo genómico mejor predictor lineal insesgado (GBLUP)

Authors

  • Osval A. Montesinos-López Faculty of Telematics, University of Colima. Av. Universidad 333, Col. Las Viboras, C.P. 28040, Colima, Mexico https://orcid.org/0000-0003-4464-3385
  • Emeterio Franco-Pérez Faculty of Marketing, University of Colima. Av. Universidad 333, Col. Las Viboras, 28040, Colima, Mexico https://orcid.org/0000-0003-1191-9787
  • Francisco J. Luna-Vázquez Faculty of Telematics, University of Colima. Av. Universidad 333, Col. Las Viboras, C.P. 28040, Colima, Mexico https://orcid.org/0000-0002-8093-7068
  • Josafat Salinas-Ruiz Colegio de Postgraduados Campus Córdoba, Km. 348 Carretera Federal Córdoba-Veracruz, Amatlán de los Reyes, 94946, Mexico https://orcid.org/0000-0001-6754-5177
  • Sara Sandoval-Carrillo Faculty of Telematics, University of Colima. Av. Universidad 333, Col. Las Viboras, C.P. 28040, Colima, Mexico https://orcid.org/0000-0002-8316-1314
  • Marco Alberto Valenzo Jiménez Universidad Michoacana de San Nicolás de Hidalgo (UMSNH), Avenida Francisco J. Mujica S/N Ciudad Universitaria C.P. 58030, Morelia, Michoacan, Mexico https://orcid.org/0000-0001-6155-5948
  • Jaime Cuervas University of Quintana Roo, Chetumal, Quintana Roo, Blvd. Bahia s / n, Del Bosque, 77019 Chetumal, Mexico
  • Pedro C. Santana-Mancilla Faculty of Telematics, University of Colima. Av. Universidad 333, Col. Las Viboras, C.P. 28040, Colima, Mexico https://orcid.org/0000-0002-4184-0116

DOI:

https://doi.org/10.18633/biotecnia.v22i2.1255

Keywords:

GBLUP, Item Based Collaborative Filtering, Genomic Selection, Comparison, Prediction accuracy

Abstract

Aim/background: Given the growing demand for food, new methodologies are needed to improve genomic selection (GS) and obtain more productive plant varieties; there is empirical evidence that GS is revolutionizing plant breeding for food production around the world. Methods: Because prediction models play a key role in GS, Montesinos-López et al. (2018) proposed the item-based collaborative filtering (IBCF) algorithm for genomic prediction. In this paper we compare the IBCF algorithm with the most popular genomic prediction model, the Genomic Best Linear Unbiased Prediction (GBLUP). Results: We found that the GBLUP is superior to the IBCF algorithm in prediction accuracy, but the IBCF is competitive because it produced very similar predictions, with the large advantage that it is extremely efficient in terms of implementation time. Conclusions: The GBLUP is better than the IBCF algorithm in prediction accuracy, but the IBCF is more than 400 times more efficient than the GBLUP model in terms of implementation time. Limitations: The main limitation of the study is that it was performed in univariate terms, and it is possible that the IBCF will perform better with multivariate data.

ABSTRACT (Spanish version)

Aim/background: In view of the growing demand for food, new methodologies are needed to improve genomic selection (GS) and obtain more productive plant varieties in less time, and there is evidence that GS is revolutionizing plant breeding, which will help increase food production worldwide. Methods: Since prediction models play a key role in GS, Montesinos-López et al. (2018) proposed the item-based collaborative filtering (IBCF) algorithm for genomic prediction. For this reason, in this article we compare the IBCF algorithm with the most popular genomic prediction model, the genomic best linear unbiased predictor (GBLUP). Results: We found that the GBLUP is superior to the IBCF model in predictive capacity, but the IBCF is competitive with the GBLUP model since it produced very similar predictions, with the advantage that it is efficient in terms of implementation time. Conclusions: We found that the GBLUP is better than the IBCF algorithm, but the IBCF is 400 times more efficient than the GBLUP model in terms of implementation time. Limitations: The main limitation of the study is that it was performed in univariate terms, and it is possible that the IBCF will perform better with multivariate data.
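To make the IBCF idea in the abstract concrete, the following is a minimal sketch of generic, textbook item-based collaborative filtering, not the authors' implementation or the IBCF.MTME package API. It assumes a small table whose rows are lines (genotypes) and whose columns are "items" (for example, environments), with values already standardized; the function names `cosine_similarity` and `predict_ibcf` and the toy data are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity over positions where both columns are observed."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    dot = sum(x * y for x, y in pairs)
    norm_a = math.sqrt(sum(x * x for x, _ in pairs))
    norm_b = math.sqrt(sum(y * y for _, y in pairs))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_ibcf(table, row, col):
    """Predict table[row][col] as a similarity-weighted average of the
    values observed in the same row for the other columns (items)."""
    target = [r[col] for r in table]
    numerator = 0.0
    denominator = 0.0
    for j in range(len(table[0])):
        if j == col or table[row][j] is None:
            continue
        other = [r[j] for r in table]
        sim = cosine_similarity(target, other)
        numerator += sim * table[row][j]
        denominator += abs(sim)
    return numerator / denominator if denominator > 0 else None


# Hypothetical toy data: 4 lines x 3 items, None marks the value to predict.
table = [
    [1.0, 0.8, -0.5],
    [-1.0, -0.9, 0.4],
    [0.5, 0.6, -0.2],
    [0.2, None, -0.1],
]
prediction = predict_ibcf(table, row=3, col=1)
print(prediction)
```

The prediction needs only column-to-column similarities and one weighted average per missing cell, with no iterative model fitting, which is in line with the short implementation time the abstract reports for IBCF.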

References

Crossa, J., De Los Campos, G., Pérez, P., Gianola, D., Burgueño, J., Araus, J. L., Braun, H. J. 2010. Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics, 186(2), 713–724. https://doi.org/10.1534/genetics.110.118521

Crossa, J., Jarquín, D., Franco, J., Pérez-Rodríguez, P., Burgueño, J., Saint-Pierre, C., … Singh, S. 2016. Genomic Prediction of Gene Bank Wheat Landraces. G3: Genes|Genomes|Genetics, 6(7), 1819–1834. https://doi.org/10.1534/g3.116.029637

Cuevas, J., Granato, I., Fritsche-Neto, R., Montesinos-Lopez, O. A., Burgueño, J., Sousa, M. B. e., & Crossa, J. 2018. Genomic-Enabled Prediction Kernel Models with Random Intercepts for Multi-environment Trials. G3: Genes|Genomes|Genetics, 8(4), g3.300454.2017. https://doi.org/10.1534/g3.117.300454

De los Campos, G., & Pérez-Rodríguez, P. 2016. BGLR: Bayesian Generalized Linear Regression. CRAN. Retrieved from https://cran.r-project.org/web/packages/BGLR/index.html

Montesinos-López, O. A., Montesinos-López, A., & Crossa, J. 2018. IBCF.MTME: Item Based Collaborative Filtering for Multi-Trait and Multi-Environment Data. CRAN. https://doi.org/10.13140/RG.2.2.30286.97605

Montesinos-López, O. A., Montesinos-López, A., Crossa, J., & Pérez-Rodríguez, P. 2018. GFR: Genomic Functional Regression. Retrieved from https://github.com/frahik/GFR

Meuwissen, T. H. E., Hayes, B. J., & Goddard, M. E. 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics, 157(4), 1819–1829. PMID: 11290733

Montesinos-López, O. A., Montesinos-López, A., Crossa, J., Montesinos-López, J. C., Luna-Vázquez, F. J., Salinas, J.,… Buenrostro-Mariscal, R. 2017. A Variational Bayes Genomic-Enabled Prediction Model with Genotype × Environment Interaction. G3: Genes|Genomes|Genetics, 7(8), g3.117.041202. https://doi.org/10.1534/g3.117.041202

Montesinos-López, O. A., Montesinos-López, A., Crossa, J., Montesinos-López, J. C., Mota-Sanchez, D., Estrada-González, F., Juliana, P. 2018. Prediction of Multiple-Trait and Multiple-Environment Genomic Data Using Recommender Systems. G3: Genes|Genomes|Genetics, 8(1), 131–147. Retrieved from http://www.g3journal.org/content/8/1/131.abstract

Montesinos-López, O. A., Montesinos-López, A., Crossa, J., Toledo, F. H., Pérez-Hernández, O., Eskridge, K. M., & Rutkoski, J. 2016. A Genomic Bayesian Multi-trait and Multi-environment Model. G3: Genes|Genomes|Genetics, 6(9), 2725–2744. https://doi.org/10.1534/g3.116.032359

Mota, R. R., Silva, F. F. e, Guimarães, S. E. F., Hayes, B., Fortes, M. R. S., Kelly, M. J.,… Moore, S. 2018. Benchmarking Bayesian genome enabled-prediction models for age at first calving in Nellore cows. Livestock Science, 211, 75–79. https://doi.org/10.1016/j.livsci.2018.03.009

R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria. Retrieved from https://www.r-project.org/

VanRaden, P. M. 2008. Efficient methods to compute genomic predictions. Journal of Dairy Science, 91(11), 4414–4423.

Wei, S., Ye, N., Zhang, S., Huang, X., & Zhu, J. 2012. Item-based collaborative filtering recommendation algorithm combining item category with interestingness measure. In 2012 International Conference on Computer Science and Service System (pp. 2038–2041). IEEE. https://doi.org/10.1109/CSSS.2012.507

Published

2020-03-21

Issue

Section

Articles