GWAS-assisted genomic prediction of cadmium accumulation in maize kernel with machine learning and linear statistical methods

Authors: Yan HL, Guo HY, Xu WX, Dai CH, Kimani W, Xie JY, Zhang HZF, Li T, Wang F, Yu YJ, Ma M, Hao ZF*, He ZY*
Impact factor: 14.224
Journal: Journal of Hazardous Materials
Year: 2023
Volume: 441  Issue:  Pages: 129929

Abstract:

The production and use of many heavy metal-containing materials almost inevitably release cadmium (Cd) into the environment, generating Cd pollutants with adverse impacts on food and human health. Developing a reliable method for evaluating Cd concentration in food crops could be an effective approach to toxicity prediction and pollution control. Here, we exploited the genotype-to-phenotype relationship of maize kernel Cd accumulation at the whole-genome level, and developed genome-wide association study (GWAS)-assisted genomic-enabled prediction (GP) models using machine learning and linear statistical methods. In benchmark tests, marker density and training population were the key parameters determining GP baseline precision. With optimized parameters, three statistical methods, Bayes A, ridge regression–best linear unbiased prediction (rrBLUP) and random forest (RF), showed the highest prediction accuracy (Bayes A, 0.83; rrBLUP, 0.89; RF, 0.75) with 100 iterations of cross-validation. In a field trial, GP models with rrBLUP performed better than those with Bayes A and RF, with a higher GP accuracy (rMG) and a lower mean absolute error. Integrating GP with GWAS can be implemented as an effective strategy for accurate evaluation of Cd concentration, which could provide useful guidelines for accelerating the selection and breeding cycle of low-Cd food crops and for addressing the environmental Cd contamination problem.
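
To make the evaluation scheme in the abstract more concrete, the following is a minimal Python sketch of a ridge-regression marker model scored by repeated random cross-validation, reporting the mean correlation between predicted and observed values and the mean absolute error. It is not the authors' pipeline. The simulated genotypes, the fixed shrinkage parameter `lam`, the 80/20 split, and all function names (`simulate`, `ridge_marker_effects`, `predict`, `repeated_cv`) are assumptions made for illustration. In particular, rrBLUP normally estimates the shrinkage from variance components, whereas this sketch fixes it by hand, and no GWAS-based marker selection is performed.

```python
import numpy as np


def simulate(n_lines=200, n_markers=500, n_causal=20, noise_sd=0.5, seed=1):
    """Simulate 0/1/2 genotypes and a trait driven by a few causal markers.

    Purely synthetic data; it does not reproduce the maize panel in the paper.
    """
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)
    beta = np.zeros(n_markers)
    causal = rng.choice(n_markers, n_causal, replace=False)
    beta[causal] = rng.normal(0.0, 1.0, n_causal)
    y = X @ beta + rng.normal(0.0, noise_sd, n_lines)
    return X, y


def ridge_marker_effects(X, y, lam):
    """Estimate shrunken marker effects by ridge regression.

    Uses the n x n dual form, beta = Xc' (Xc Xc' + lam I)^-1 (y - mean(y)),
    which is cheaper than the p x p form when markers outnumber lines.
    """
    mu = y.mean()
    col_means = X.mean(axis=0)
    Xc = X - col_means
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y - mu)
    return mu, col_means, Xc.T @ alpha


def predict(X, mu, col_means, beta):
    """Predict trait values for new genotypes from the fitted effects."""
    return mu + (X - col_means) @ beta


def repeated_cv(X, y, lam=1.0, train_frac=0.8, n_iter=100, seed=0):
    """Repeated random train/test splits.

    Returns the mean Pearson correlation between predicted and observed
    values and the mean absolute error, averaged over all iterations.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_frac * n))
    accs, maes = [], []
    for _ in range(n_iter):
        perm = rng.permutation(n)
        tr, te = perm[:n_train], perm[n_train:]
        mu, col_means, beta = ridge_marker_effects(X[tr], y[tr], lam)
        pred = predict(X[te], mu, col_means, beta)
        accs.append(np.corrcoef(pred, y[te])[0, 1])
        maes.append(np.mean(np.abs(pred - y[te])))
    return float(np.mean(accs)), float(np.mean(maes))
```

Calling `repeated_cv(*simulate())` returns the two averaged metrics for the synthetic data. The values depend entirely on the simulation settings and are not comparable to the accuracies reported in the abstract.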

Full text: https://www.sciencedirect.com/science/article/pii/S030438942201723X?via%3Dihub