Genome assembly and chromosome anchoring
To obtain a high-quality reference genome for P. tomentosa, we sequenced and assembled the genome of GM15
using a combination of PacBio, Hi-C and Illumina sequencing. The genome size
was estimated to be ~800 Mb by K-mer analysis (Fig. 1e,
Table S3). A total of ~54 Gb (~70×
coverage) of PacBio data was assembled to generate a primary draft
assembly. To obtain a chromosome-scale assembly, Hi-C reads (430 million
reads, 65 Gb, ~80× coverage) were mapped to the
primary draft assembly to construct Hi-C linkage information. Finally,
a fine Hi-C interaction map was constructed, which confirmed that
potential misjoins had been corrected in the final assembly. A total
of 38 chromosome-scale pseudomolecules were successfully anchored (Fig.
2, Table S4), yielding a diploid genome assembly of 740.2 Mb. The 38
chromosome-scale pseudomolecules covered 92.1% of the estimated 800 Mb
genome (Table 1). The contig N50 and scaffold N50 reached 0.96
Mb and 17.13 Mb, respectively, and the longest contig and scaffold were
5.47 Mb and 46.68 Mb, respectively (Table 1).
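
As a rough check on the figures above, sequencing depth can be taken as total sequenced bases divided by the estimated genome size, and N50 as the length of the sequence at which the cumulative length, summed from the longest sequence downward, first reaches half of the assembly total. The Python sketch below illustrates both calculations. The function names and the toy contig lengths are hypothetical and are not part of the assembly pipeline used in this study.

```python
from typing import Iterable


def sequencing_depth(total_bases: float, genome_size: float) -> float:
    """Approximate fold coverage: sequenced bases / estimated genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)  # longest first
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that brings the cumulative sum to at least
        # half of the total length defines N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Depth implied by the data volumes and the ~800 Mb K-mer estimate in the text.
pacbio_depth = sequencing_depth(54e9, 800e6)  # PacBio: ~54 Gb
hic_depth = sequencing_depth(65e9, 800e6)     # Hi-C: 65 Gb

# Hypothetical toy contig lengths (arbitrary units), for illustration only.
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])

print(f"PacBio depth ~{pacbio_depth:.1f}x, Hi-C depth ~{hic_depth:.1f}x")
print(f"Toy N50: {toy_n50}")
```

With the values reported here, the computed depths are about 67.5× and 81.3×, consistent with the rounded ~70× and ~80× coverage given in the text.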