Complex Population Structure and Haplotype Patterns in the Western European Honey Bee from Sequencing a Large Panel of Haploid Drones

Abstract

Honey bee subspecies originate from specific geographical areas in Africa, Europe and the Middle East. Beekeepers interested in specific phenotypes have imported genetic material to regions outside the bees’ original range for use in either pure lines or controlled crosses. Moreover, imported drones are present in the environment and mate naturally with queens from the local subspecies. The resulting admixture complicates population genetics analyses, and population stratification can be a major problem for association studies. To better understand Western European honey bee populations, we produced a whole-genome sequence and single nucleotide polymorphism (SNP) genotype data set from 870 haploid drones and demonstrate its utility by identifying nine genetic backgrounds and various degrees of admixture in a subset of 629 samples. Of the nine backgrounds, five correspond to subspecies, two to isolated island populations and two to managed populations. We also highlight several large haplotype blocks, some of which coincide with the position of centromeres. The largest is 3.6 Mb long and represents 21% of chromosome 11, with two major haplotypes corresponding to the two dominant genetic backgrounds identified. This large, naturally phased data set is available as a single VCF file that can serve as a reference for subsequent population genomics studies in the honey bee, with uses such as (i) selecting individuals of verified homogeneous genetic backgrounds as references, (ii) imputing genotypes from a lower-density data set generated by an SNP chip or by low-pass sequencing, or (iii) selecting SNPs compatible with the requirements of genotyping chips.