Statistical sampling of missing environmental variables improves biophysical genomic prediction
  • Abdulqader Jighly,
  • Thabo Thayalakumaran,
  • Surya Kant,
  • Joe Panozzo,
  • Rajat Aggarwal,
  • David Hessel,
  • Kerrie Forrest L,
  • Frank Technow,
  • Radu Totir,
  • Mike Goddard,
  • Jennie Pryce,
  • Matthew Hayden,
  • Jesse Munkvold,
  • Garry J. O'Leary
Abdulqader Jighly
AgriBio Centre For AgriBioscience

Corresponding Author: abdulqader.jighly@agriculture.vic.gov.au

Thabo Thayalakumaran
AgriBio Centre For AgriBioscience
Surya Kant
Agriculture Victoria
Joe Panozzo
Agriculture Victoria
Rajat Aggarwal
Corteva Agriscience Johnston Global Business Center
David Hessel
Corteva Agriscience Wamego KS USA
Kerrie Forrest L
AgriBio Centre For AgriBioscience
Frank Technow
Corteva Agriscience Johnston Global Business Center
Radu Totir
Corteva Agriscience Johnston Global Business Center
Mike Goddard
AgriBio Centre For AgriBioscience
Jennie Pryce
AgriBio Centre For AgriBioscience
Matthew Hayden
AgriBio Centre For AgriBioscience
Jesse Munkvold
Corteva Agriscience Johnston Global Business Center
Garry J. O'Leary
Agriculture Victoria

Abstract

Since the invention of whole genome prediction (WGP) more than two decades ago, breeding programs have established extensive reference populations that are cultivated under diverse environmental conditions. The introduction of the CGM-WGP model, which integrates crop growth models (CGM) with WGP, has expanded the applications of WGP to the prediction of unphenotyped traits in untested environments, including future climates. However, unlike WGP, CGMs require multiple seasonal environmental records, which makes CGM-WGP less accurate when applied to historical reference populations that lack crucial environmental inputs. Here, we investigated the ability of CGM-WGP to approximate missing environmental variables to improve prediction accuracy. Two environmental variables in a wheat CGM, initial soil water content (InitlSoilWCont) and initial nitrate profile, were sampled from different normal distributions, separately or jointly, in each iteration within the CGM-WGP algorithm. Our results showed that sampling InitlSoilWCont alone gave the best results and improved the prediction accuracy of grain number by 0.07, yield by 0.06 and protein content by 0.03. When the sampled InitlSoilWCont values were used as an input for the traditional CGM, the average narrow-sense heritability of the genotype-specific parameters (GSPs) improved by 0.05, with GNSlope, PreAnthRes and VernSen showing the greatest improvements. Moreover, the root mean square error for grain number and yield was reduced by about 7% for CGM and 31% for CGM-WGP when using the sampled InitlSoilWCont values. Our results demonstrate the advantage of sampling missing environmental variables in CGM-WGP to improve prediction accuracy and increase the size of the reference population by enabling the utilisation of historical data that lack environmental records.
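
The idea of drawing a missing environmental input from a normal distribution at every iteration can be illustrated with a small toy sketch. This is not the authors' CGM-WGP implementation: the linear `toy_crop_model`, the helper names, the prior and noise values, and the choice of an independence Metropolis-Hastings step (with the normal prior as the proposal) are all assumptions made for illustration. The sketch only shows how observed yields across genotypes might inform an unrecorded initial soil water value.

```python
import math
import random


def toy_crop_model(init_soil_water, gsp):
    """Hypothetical stand-in for a CGM: yield scales linearly with initial soil water."""
    return gsp * init_soil_water


def simulate_yields(true_water, gsps, noise_sd, seed=0):
    """Generate noisy toy yields for a set of genotype-specific parameters (GSPs)."""
    rng = random.Random(seed)
    return [toy_crop_model(true_water, g) + rng.gauss(0.0, noise_sd) for g in gsps]


def log_likelihood(observed, predicted, sigma):
    """Gaussian log-likelihood of the observed yields, up to an additive constant."""
    return -sum((o - p) ** 2 for o, p in zip(observed, predicted)) / (2 * sigma ** 2)


def sample_missing_input(observed, gsps, prior_mean, prior_sd, sigma,
                         n_iter=5000, burn_in=1000, seed=1):
    """Estimate a missing environmental input by sampling it in every iteration.

    Each iteration proposes a value from the normal prior (an assumption of this
    sketch). Because the proposal equals the prior, the Metropolis-Hastings
    acceptance ratio reduces to the likelihood ratio.
    Returns the posterior mean of the retained samples.
    """
    rng = random.Random(seed)
    current = prior_mean
    current_ll = log_likelihood(
        observed, [toy_crop_model(current, g) for g in gsps], sigma)
    kept = []
    for it in range(n_iter):
        proposal = rng.gauss(prior_mean, prior_sd)
        proposal_ll = log_likelihood(
            observed, [toy_crop_model(proposal, g) for g in gsps], sigma)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_ll - current_ll:
            current, current_ll = proposal, proposal_ll
        if it >= burn_in:
            kept.append(current)
    return sum(kept) / len(kept)


# Toy usage: the true initial soil water (120) was never recorded, and the
# prior is centred on a different value (100).
gsps = [0.02 + 0.001 * i for i in range(20)]
observed = simulate_yields(120.0, gsps, noise_sd=0.3)
estimate = sample_missing_input(observed, gsps, prior_mean=100.0,
                                prior_sd=20.0, sigma=0.3)
print(f"Sampled estimate of the missing input: {estimate:.1f}")
```

In a full CGM-WGP setting the GSPs would themselves be unknown and sampled alongside the environmental input; here they are held fixed to keep the sketch short.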