Imputation accuracy measurement and post-imputation quality in imputed SNP genotypes for dairy cattle (#21)
Imputation of genotypes is a cost-effective method for generating genotypes at untyped loci and allows data from different genotyping panels and platforms to be combined. Imputation accuracy can be defined in a number of ways, each offering a different means of distinguishing well-imputed from poorly imputed SNP. The aims of this study were to compare different measures of imputation accuracy in low-density panel data and to determine how well the allelic R2 measure reported by BEAGLE performs as a post-imputation filtering tool across minor allele frequency (MAF) classes. Genotypes for 28793 New Zealand dairy cows from a low-density BeadChip were used in the study. For 17593 animals, 9166 of 16512 SNP were masked and imputed using BEAGLE version 4.0. Imputation accuracy was high for SNP with MAF ≥ 0.005, but variable for low MAF (< 0.005) SNP. Genotypic concordance was poorly correlated with allelic R2 for low MAF SNP, whilst the other imputation accuracy measures were highly correlated with allelic R2 (> 0.81) across all MAF classifications. Results showed that post-imputation filtering based on allelic R2 is an effective approach for removing poorly imputed SNP, including those of low MAF.
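The divergence between genotypic concordance and correlation-based measures at low MAF can be illustrated with a minimal sketch. The functions, variable names and genotype data below are hypothetical and are not taken from the study. The squared correlation computed here is an empirical value between masked (true) and imputed allele counts; it is only a stand-in for BEAGLE's allelic R2, which BEAGLE estimates from posterior genotype probabilities without access to the true genotypes.

```python
from statistics import mean


def minor_allele_frequency(genotypes):
    """MAF from genotypes coded as counts (0, 1, 2) of one allele."""
    p = sum(genotypes) / (2 * len(genotypes))
    return min(p, 1 - p)


def genotype_concordance(true_g, imputed_g):
    """Proportion of animals whose imputed genotype equals the true genotype."""
    matches = sum(t == i for t, i in zip(true_g, imputed_g))
    return matches / len(true_g)


def squared_correlation(true_g, imputed_g):
    """Squared Pearson correlation between true and imputed allele counts.

    Returns None when either vector has no variance (correlation undefined).
    """
    mt, mi = mean(true_g), mean(imputed_g)
    cov = sum((t - mt) * (i - mi) for t, i in zip(true_g, imputed_g))
    var_t = sum((t - mt) ** 2 for t in true_g)
    var_i = sum((i - mi) ** 2 for i in imputed_g)
    if var_t == 0 or var_i == 0:
        return None
    return cov ** 2 / (var_t * var_i)


# Hypothetical rare SNP in 1000 animals: 4 true heterozygotes, the rest
# homozygous for the major allele (assumed data, for illustration only).
true_g = [1, 1, 1, 1] + [0] * 996
# Assumed imputation result: one heterozygote recovered, three missed,
# and one false heterozygote call.
imputed_g = [1, 0, 0, 0] + [1] + [0] * 995

maf = minor_allele_frequency(true_g)
concordance = genotype_concordance(true_g, imputed_g)
r2 = squared_correlation(true_g, imputed_g)

print(f"MAF = {maf:.4f}")
print(f"genotype concordance = {concordance:.3f}")
print(f"squared correlation = {r2:.3f}")
```

In this made-up case almost every animal is homozygous for the major allele, so concordance stays close to 1 even though most carriers of the minor allele are imputed incorrectly, while the squared correlation is low. This is one plausible reason a concordance-based measure may track allelic R2 poorly for low MAF SNP.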