Genome-wide SNP heterozygosity estimation and functional SNP categorization
To assess the genetic diversity of the T. ichikawai genome, we estimated the genome-wide SNP heterozygosity of the reference individual. First, stLFR barcode-trimmed MGISEQ reads were mapped to the reference genome using NextGenMap (Sedlazeck et al., 2013), and a binary alignment/map (BAM) file was generated. The BAM file was sorted with SAMtools version 1.7 (Li, H. et al., 2009). Next, local realignment around INDELs in the sorted BAM file was performed with GATK v3.8.1 (McKenna et al., 2010). A genomic variant call format (GVCF) file for the reference individual was then generated with GATK HaplotypeCaller using the options -hets 0.001 and -indelHeterozygosity 0.001. Finally, SNPs of the reference individual were genotyped with the GATK GenotypeGVCFs tool, which produced an output variant call format (VCF) file. The genotyped SNPs were then filtered with the GATK VariantFiltration tool, retaining variants that met all of the following cutoffs: MQ > 30.00, SOR < 4.000, QD > 2.00, FS < 60.000, MQRankSum > –20.000, and –10.000 < ReadPosRankSum < 10.000.
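The filtering criteria can be illustrated with a minimal Python sketch that checks the INFO annotations of one VCF record against the cutoffs listed above. It is not the GATK implementation and was not part of the analysis. The names CUTOFFS, parse_info, and passes_filters are hypothetical, and the choice to let a record pass when an annotation is absent (for example, the rank-sum annotations at sites without both alleles observed) is an assumption of this sketch.

```python
# Cutoffs as listed in the text, stored as (exclusive lower bound, exclusive upper bound).
# None means the annotation has no bound on that side.
CUTOFFS = {
    "MQ": (30.0, None),
    "SOR": (None, 4.0),
    "QD": (2.0, None),
    "FS": (None, 60.0),
    "MQRankSum": (-20.0, None),
    "ReadPosRankSum": (-10.0, 10.0),
}


def parse_info(info_field):
    """Parse a VCF INFO string such as 'MQ=60.0;QD=25.1' into a dict of floats."""
    values = {}
    for entry in info_field.split(";"):
        key, sep, raw = entry.partition("=")
        if not sep:
            continue  # flag-type entries (e.g. 'DB') carry no value
        try:
            values[key] = float(raw)
        except ValueError:
            continue  # non-numeric or multi-valued entries are ignored here
    return values


def passes_filters(info_field):
    """Return True if every annotation present lies strictly inside its cutoff range."""
    values = parse_info(info_field)
    for key, (lower, upper) in CUTOFFS.items():
        if key not in values:
            continue  # assumption of this sketch: a missing annotation does not fail
        value = values[key]
        if lower is not None and not value > lower:
            return False
        if upper is not None and not value < upper:
            return False
    return True
```

For example, passes_filters("MQ=60.0;SOR=0.8;QD=25.1;FS=1.2") evaluates to True, whereas a record with MQ=25.0 is rejected because it does not exceed the MQ cutoff of 30.00.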