Genomic prediction using sequence data in a multibreed context (#69)
A key factor for a successful genomic selection scheme is the ability to accurately predict genomic breeding values (GEBV), which relies heavily on the availability of a large reference population (Goddard, 2009). Numerically small dairy cattle populations are therefore at a disadvantage in achieving future genetic gain relative to breeds with large reference populations. One obvious solution is to join reference populations across breeds. The general experience is that joining populations of the same or closely related breeds increases the reliability of GEBV, whereas joining distantly related or unrelated breeds does not (Lund et al., 2014).
Sequence data can potentially increase the reliability of multibreed genomic prediction because it contains variants in high linkage disequilibrium (LD) with the causative variants. Sequence data does, however, also contain a large number of variants in low LD with the causative mutations, which may add noise to the predictions. The objective of this study was to use sequence variants to increase the reliability of multibreed prediction, and it was carried out in two parts. In part 1, a simulation study based on real sequence data was carried out. This simulation used the regression of genomic relationships at causative mutations on genomic relationships at prediction markers to measure the loss in prediction reliability caused by using markers in imperfect LD with QTL (de los Campos et al., 2013). The simulation indicated that sequence data has the potential to improve multibreed prediction, but only when the prediction variants lie very close to the causative mutations. In part 2, we tested different models and approaches for multibreed genomic prediction, focusing on the small dairy cattle breeds Danish Jersey and Danish Red when joined with Nordic and French Holsteins. Here, sequence variants selected from a multibreed GWAS for production traits were used as prediction markers in a separate genomic component to capture covariances across breeds. Different models and variant selection strategies were compared. Large increases in reliability, up to 0.10, were observed for multibreed prediction using QTL variants compared to within-breed prediction using only 50K markers. Our results show that using a selected subset of sequence variants can result in large increases in reliability in multibreed genomic prediction, even for small and distantly related breeds, but careful selection of the variants is essential.
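The regression used in part 1 can be illustrated with a small sketch. It is not the authors' implementation: it assumes a VanRaden-style genomic relationship matrix, uses randomly simulated and mutually independent genotypes instead of real sequence data, and all function and variable names are hypothetical. The slope of QTL relationships on marker relationships is 1 when the markers are the QTL themselves and falls toward 0 as LD between markers and QTL weakens.

```python
import numpy as np

def genomic_relationship(genotypes):
    """VanRaden-style G from an (individuals x variants) matrix of 0/1/2 counts.

    Assumption: this scaling is chosen for illustration; the abstract does
    not state which relationship matrix was used.
    """
    p = genotypes.mean(axis=0) / 2.0            # observed allele frequencies
    centered = genotypes - 2.0 * p              # centre each variant
    scale = 2.0 * np.sum(p * (1.0 - p))         # expected variance summed over variants
    return centered @ centered.T / scale

def relationship_regression(g_qtl, g_markers):
    """Slope of off-diagonal G_qtl elements regressed on G_markers elements."""
    idx = np.triu_indices_from(g_qtl, k=1)      # each pair of individuals once
    y = g_qtl[idx]
    x = g_markers[idx]
    xc = x - x.mean()
    return float(np.sum(xc * (y - y.mean())) / np.sum(xc * xc))

# Toy data (hypothetical): independent variants, so markers carry no LD with QTL.
rng = np.random.default_rng(0)
n_ind, n_var = 200, 500
freqs = rng.uniform(0.1, 0.9, n_var)
genotypes = rng.binomial(2, freqs, size=(n_ind, n_var)).astype(float)

qtl_cols = np.arange(0, n_var, 10)                         # every 10th variant is a "QTL"
marker_cols = np.setdiff1d(np.arange(n_var), qtl_cols)     # the rest are markers

g_qtl = genomic_relationship(genotypes[:, qtl_cols])
g_markers = genomic_relationship(genotypes[:, marker_cols])

slope_same = relationship_regression(g_qtl, g_qtl)         # markers identical to QTL
slope_markers = relationship_regression(g_qtl, g_markers)  # markers unlinked to QTL

print(f"slope, markers = QTL:      {slope_same:.3f}")
print(f"slope, unlinked markers:   {slope_markers:.3f}")
```

With real sequence data the marker set would be in partial LD with the QTL, and the slope would be expected to fall between these two extremes; selecting variants closer to the causative mutations should move it toward 1.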