Using Random Forests to identify SNP associated with leg defects in broiler chicken: impact of correcting for population structures — ASN Events

Using Random Forests to identify SNP associated with leg defects in broiler chicken: impact of correcting for population structures (#19)

Yutao Li 1, Andrew George 2, Rachel Hawken 3, Robyn Sapp 3, Sigrid Lehnert 1, Antonio Reverter 1
  1. CSIRO Agricultural Flagship, St Lucia, QLD, Australia
  2. CSIRO Digital Productivity, Brisbane, QLD, Australia
  3. Cobb-Vantress, Siloam Springs, Arkansas, USA

The machine learning method Random Forests (RF) has been shown to be effective in genome-wide association studies (GWAS). However, population structure (PS) that is left uncorrected may cause spurious results in an RF analysis. In this study, we examined the impact of correcting for PS on the RF analysis of leg defect data from a commercial poultry population of 826 chickens genotyped for 44,129 SNP markers. The results show that correcting for PS led to: 1) a significant improvement in the estimates of SNP variable importance values; 2) a significant reduction in the false positives identified in the uncorrected data; and 3) stronger evidence for a set of SNP associated with the leg defect phenotype.
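
The abstract does not state how PS was corrected, so the following is only a minimal sketch of the general idea, under assumptions that are not from the study: subpopulation labels are taken as known, and the correction is a simple within-group mean-centering of both genotype and phenotype. The toy data, the function names and the use of a plain correlation in place of an RF importance score are all hypothetical. It illustrates how a SNP whose allele frequency differs between subpopulations can appear associated with a phenotype until the group effect is removed.

```python
from statistics import mean

def center_within_groups(values, groups):
    """Subtract each group's mean from the values of its members."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for v, g in zip(values, groups)]

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

# Hypothetical toy data: two subpopulations (A, B) of four birds each.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
# Genotype coded as allele count (0/1/2); the allele is more common in B.
snp = [0, 0, 1, 1, 1, 1, 2, 2]
# Phenotype score is higher in B overall, but unrelated to the SNP within a group.
pheno = [1.0, 2.0, 2.0, 1.0, 4.0, 5.0, 5.0, 4.0]

# Uncorrected: the group difference alone induces an apparent association.
raw_r = pearson(snp, pheno)

# Corrected (assumed scheme): remove group means from both variables first.
snp_adj = center_within_groups(snp, groups)
pheno_adj = center_within_groups(pheno, groups)
corrected_r = pearson(snp_adj, pheno_adj)

print(f"uncorrected r = {raw_r:.3f}, corrected r = {corrected_r:.3f}")
```

In this toy case the uncorrected correlation is clearly positive while the corrected one vanishes, which mirrors the kind of false positive described above. In a real analysis the structure would more likely be inferred from the genotypes themselves (for example through principal components or a relationship matrix), and the adjusted data would then be passed to the RF.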
