Data compression: a new way to infer genomic relationship matrices and highlight regions of interest in commercial lines of broiler chicken (#93)
To date, genomic prediction of breeding values in livestock has relied on correlation-based measures of animal relatedness, derived from shared patterns of genome-wide single nucleotide polymorphism (SNP) genotypes and captured by the genomic relationship matrix (GRM). However, it is not clear whether correlation is the best way of quantifying those shared patterns. Here, we continue our exploration of whether relationship matrices can be built on the concept of compression efficiency from Information Theory. Drawing on 4 commercial broiler lines, 2 used to select genetically superior roosters for growth and efficiency (male lines) and 2 used to select genetically superior hens for reproductive performance (female lines), we found that data compression clustered the male lines together but separately from the female lines. Further, a sliding-window version of the approach identified different gene regions apparently selected in male versus female lines. In the male lines, two prominent regions harboured IGF-1 (Chromosome 1) and one of its cognate receptors, INSR (Chromosome 28), as putative selection signatures. These regions were not identified in the female lines; instead, different regions harbouring the reproductive hormone receptor GNRHR (Chromosome 10) and the folate metabolism gene FOLH1 (Chromosome 1) were awarded extreme scores.
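To make the idea of a compression-based relationship measure concrete, the following is a minimal sketch only. The abstract does not state which compressor or distance formula was used, so the choice of zlib, the normalized compression distance (NCD), the 0/1/2 genotype coding, and all function names here are assumptions for illustration. The genotypes are simulated, not real broiler data.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: small for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def compression_distance_matrix(genotypes: list[bytes]) -> list[list[float]]:
    """Pairwise distance matrix; averaged over both concatenation orders to force symmetry."""
    n = len(genotypes)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            d = 0.5 * (ncd(genotypes[i], genotypes[j]) + ncd(genotypes[j], genotypes[i]))
            matrix[i][j] = matrix[j][i] = d
    return matrix


def simulate_genotypes(n_snps: int, rng: random.Random) -> bytes:
    """Hypothetical helper: random SNP genotypes coded as the characters 0, 1, 2."""
    return bytes(rng.choice(b"012") for _ in range(n_snps))


def mutate(genotype: bytes, rate: float, rng: random.Random) -> bytes:
    """Hypothetical helper: redraw each genotype with probability `rate`."""
    return bytes(rng.choice(b"012") if rng.random() < rate else g for g in genotype)


# Simulated example: a founder, a close relative, and an unrelated animal.
rng = random.Random(0)
founder = simulate_genotypes(2000, rng)
relative = mutate(founder, 0.01, rng)
unrelated = simulate_genotypes(2000, rng)
distances = compression_distance_matrix([founder, relative, unrelated])
```

In this sketch a smaller distance indicates more shared genotype pattern, so the founder and its relative should sit closer than the founder and the unrelated animal. Note that zlib only finds repeats within a 32 KB window, so genome-wide SNP strings would need a different compressor or a windowed scheme.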