Evaluation of SNP filtering criteria for kinship estimation
We next examined the SNP filtering parameters using a total of 20 individuals of the F0 and F1 generations, among which some kinships were known from the rearing conditions. The known kinships among the 20 individuals included three parent–offspring relationships, one FS, and a maximum of five HS. We regarded parameter sets under which all known kinships were correctly reproduced as suitable for accurately estimating unknown kinships.
The filtering parameters considered were [1] minimum depth (MIN_DP), 5; [2] maximum depth (MAX_DP), 30–200; [3] minimum mean depth (MIN_MEAN_DP), 15; [4] minimum genotyping quality (MIN_GQ), 20–30; [5] call rate (CR), 0.7–1.0; [6] minor allele frequency (MAF), 0.01–0.1; [7] deviation from Hardy–Weinberg equilibrium (HWE), 0.00001–0.01; [8] heterozygosity (HET), 0.6–0.8; and [9] linkage disequilibrium (LD), 0.1–0.3. Where a range follows a parameter, it gives the minimum and maximum values considered. The parameters and their values were chosen based on several previous studies (Barnes & Breen, 2010; Chen et al., 2017; Dou et al., 2017; Miyagawa et al., 2008; Nishida et al., 2008; O’Leary et al., 2018; Roshyara et al., 2014).
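As an illustration of how such a parameter space can be enumerated, the following minimal Python sketch builds every combination of the endpoint values listed above. It is not the procedure used in this study: the names PARAM_RANGES and parameter_combinations are hypothetical, and only the stated minimum and maximum of each range are included because the intermediate values examined are not given here.

```python
from itertools import product

# Endpoint values taken from the text. Intermediate grid points are not
# stated there, so none are assumed here.
PARAM_RANGES = {
    "MIN_DP": (5,),
    "MAX_DP": (30, 200),
    "MIN_MEAN_DP": (15,),
    "MIN_GQ": (20, 30),
    "CR": (0.7, 1.0),
    "MAF": (0.01, 0.1),
    "HWE": (0.00001, 0.01),
    "HET": (0.6, 0.8),
    "LD": (0.1, 0.3),
}


def parameter_combinations(ranges):
    """Yield one dict per combination of the candidate values."""
    names = list(ranges)
    for values in product(*(ranges[name] for name in names)):
        yield dict(zip(names, values))


combos = list(parameter_combinations(PARAM_RANGES))
print(len(combos))  # number of endpoint-only combinations
print(combos[0])    # first combination (all minimum values)
```

With seven parameters taking two values each and two fixed parameters, this endpoint-only grid contains 128 combinations; adding intermediate values to any tuple enlarges the grid multiplicatively.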
We used COLONY 2.0 (Jones & Wang, 2010) and Sequoia (Huisman, 2017) to estimate kinships. First, we varied the values of each of the nine filtering parameters listed above, along with the allele dropout rate and error rate settings in COLONY 2.0, to identify multiple candidate combinations that accurately reproduced the known kinships. We then identified, among these candidates, the combinations for which Sequoia also produced correct estimates. Because observations from laboratory experiments suggest that T. ichikawai practices “continuous” polygamy (Watanabe, 1994a, 1994b), we assumed random mating in COLONY 2.0. The estimation accuracy was set to very high, and the run length was set to 3 out of a range of 1–4. VCFtools (Danecek et al., 2011), PLINK (Purcell et al., 2007), and SnpEff (Cingolani et al., 2012) were used for SNP filtering.
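The two-stage selection described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' pipeline: it does not run or parse COLONY 2.0 or Sequoia, and it assumes their estimates have already been converted into dictionaries mapping a pair of individuals to a relationship label (PO, FS, or HS). All function names, parameter-set labels, and individual IDs are hypothetical.

```python
def canonical_pair(a, b):
    """Order-independent key for a pair of individuals."""
    return tuple(sorted((a, b)))


def reproduces_known(known, estimated):
    """True if every known pair has the same label in the estimates."""
    est = {canonical_pair(*pair): rel for pair, rel in estimated.items()}
    return all(
        est.get(canonical_pair(*pair)) == rel for pair, rel in known.items()
    )


def select_candidates(known, colony_results, sequoia_results):
    """Keep parameter sets that pass with COLONY, then also with Sequoia."""
    colony_ok = [
        label
        for label, est in colony_results.items()
        if reproduces_known(known, est)
    ]
    return [
        label
        for label in colony_ok
        if label in sequoia_results
        and reproduces_known(known, sequoia_results[label])
    ]


# Toy data with hypothetical IDs; not results from the study.
known = {
    ("parent_1", "offspring_1"): "PO",
    ("offspring_1", "offspring_2"): "FS",
}
colony_results = {
    "set_A": {("offspring_1", "parent_1"): "PO",
              ("offspring_1", "offspring_2"): "FS"},
    "set_B": {("parent_1", "offspring_1"): "PO",
              ("offspring_1", "offspring_2"): "HS"},  # FS misassigned
    "set_C": {("parent_1", "offspring_1"): "PO",
              ("offspring_2", "offspring_1"): "FS"},
}
sequoia_results = {
    "set_A": {("parent_1", "offspring_1"): "PO",
              ("offspring_1", "offspring_2"): "FS"},
    "set_C": {("parent_1", "offspring_1"): "PO"},  # FS pair not recovered
}

selected = select_candidates(known, colony_results, sequoia_results)
print(selected)
```

In this sketch a pair missing from the estimates counts as a failure, which matches the criterion that all known kinships must be reproduced. Pairs are treated as unordered, so the direction of a parent–offspring relationship is not checked.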