Enrichment Using Distant Reference
The R gene libraries presented in this study were developed by enriching Silphium integrifolium DNA for R genes using baits developed from Helianthus annuus, a model organism with well-developed genomic
resources. Despite an estimated divergence age of between 22.5
(Meireles et al., 2020) and 33.5 million years ago (Zhang et al., 2022),
the baits were able to enrich libraries to contain a median of 63% of
reads originating from R genes, representing a 36-fold enrichment over
WGS. For comparison, a previous study by Andolfo et al. (2014)
successfully enriched Solanum lycopersicum (common tomato) DNA with
baits designed from Solanum tuberosum (common potato), a congener
estimated by TimeTree to have diverged 6.7 Ma (Kumar et al., 2017). This
study demonstrates the economical promise of RenSeq for studying the
immune systems of non-model plants under a variety of ecological and
evolutionary pressures. It also showcases the screening of crop wild
relatives that are of agronomic interest, such as S. integrifolium,
for disease resistance genes that might enable more
robust response to pathogens that pose a challenge for the domestication
of the plant.
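The relationship between the on-target read fraction and the fold enrichment quoted above can be made explicit with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes fold enrichment is defined as the ratio of the R gene read fraction in the enriched library to that in WGS. Under that assumption, a 63% on-target fraction and a 36-fold enrichment imply a WGS baseline of roughly 1.75% of reads, a value derived here rather than reported in the text. Because the reported figures are medians across libraries, the per-library ratios need not match this back-calculation exactly.

```python
def fold_enrichment(enriched_fraction: float, baseline_fraction: float) -> float:
    """Ratio of on-target read fractions (enriched library vs. WGS baseline)."""
    if not 0 < baseline_fraction <= 1:
        raise ValueError("baseline_fraction must be in (0, 1]")
    if not 0 <= enriched_fraction <= 1:
        raise ValueError("enriched_fraction must be in [0, 1]")
    return enriched_fraction / baseline_fraction


def implied_baseline(enriched_fraction: float, fold: float) -> float:
    """Back-calculate the WGS on-target fraction implied by a fold enrichment."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return enriched_fraction / fold


# Values taken from the text: median 63% of reads on target, 36-fold over WGS.
baseline = implied_baseline(0.63, 36)
print(f"Implied WGS on-target fraction: {baseline:.2%}")
print(f"Round-trip fold enrichment: {fold_enrichment(0.63, baseline):.1f}")
```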
The draft reference genome assembly contained many more R genes than
the enriched libraries (873 compared to ~400-600), a difference that
can be accounted for by two factors.
First, the draft genome is assembled from an F1 hybrid between two
different species, S. integrifolium and S. perfoliatum.
While we expect overlap in many of the R genes due to homology, our
count is likely an overestimate of the true number contained within the
haploid genome of S. integrifolium due to the nature of R genes
as a rapidly diversifying gene family. Second, RenSeq enrichment likely
captures a different (and smaller) subset of the R genes than the WGS
PacBio sequencing we employed for the draft assembly.