R Gene Annotations
We annotated R genes in each assembly (the Illumina RenSeq libraries,
the PacBio RenSeq libraries, and the draft genome) using NLR-Annotator
(Steuernagel et al., 2020), which searches for common R gene peptide
motifs in protein-coding sequence.
The annotations were performed with the “-a” flag, which outputs the
amino acid sequence of the nucleotide-binding domain (NBARC domain). We
used the NBARC sequences from the draft genome assembly to infer a
maximum likelihood tree in RAxML v8.2.11 (Stamatakis, 2006; Stamatakis,
2014) under the peptide substitution model “PROTGAMMAAUTO” with 100
rapid bootstrap replicates.
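As a rough illustration of the tree inference step, the sketch below assembles a RAxML 8 command line with the model and bootstrap count stated above. It assumes the combined rapid-bootstrap and best-tree search mode ("-f a") and an already aligned set of NBARC sequences. The executable name, alignment file, run name, and random seed are hypothetical placeholders, not values taken from this study.

```python
import shlex


def build_raxml_command(alignment, run_name, seed=12345, n_bootstraps=100,
                        executable="raxmlHPC"):
    """Return an argument list for a rapid-bootstrap ML analysis in RAxML 8.

    The seed, file names, and executable name are illustrative assumptions.
    """
    return [
        executable,
        "-f", "a",                # rapid bootstraps plus best-scoring ML tree (assumed mode)
        "-m", "PROTGAMMAAUTO",    # substitution model named in the text
        "-p", str(seed),          # seed for the parsimony starting trees
        "-x", str(seed),          # seed that switches on rapid bootstrapping
        "-#", str(n_bootstraps),  # number of bootstrap replicates
        "-s", alignment,          # aligned NBARC sequences (hypothetical file)
        "-n", run_name,           # suffix RAxML uses for its output files
    ]


cmd = build_raxml_command("nbarc_alignment.phy", "nbarc_tree")
# Print a shell-safe version of the command for inspection or logging.
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) on a machine where RAxML is installed; printing it first makes the exact settings easy to record.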
The two broad, non-overlapping classes of R genes that we analyzed in
this study, TNLs (containing a Toll/interleukin-1 receptor-like
N-terminal domain) and CNLs (containing a coiled-coil N-terminal domain),
were
identified from the annotations based on diagnostic motifs. To identify
TNLs, we parsed the annotations for motifs 13, 15, and 18 (as defined in
Table S1 of Jupe et al., 2013). For CNLs, we searched for motifs 2, 6,
16, and 17. Each set of motifs occurs only in its respective class of
N-terminal domain.
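The motif-based classification described above can be sketched as a simple set lookup. This is a minimal illustration, not the script used in the study. It assumes that each annotated locus comes with a comma-separated list of motif numbers, that the presence of any one diagnostic motif is enough to assign a class, and that loci carrying motifs from both sets are flagged rather than assigned. The function names and example loci are invented for illustration.

```python
# Diagnostic motif numbers as given in the text (Jupe et al., 2013, Table S1).
TNL_MOTIFS = frozenset({13, 15, 18})
CNL_MOTIFS = frozenset({2, 6, 16, 17})


def parse_motif_list(field):
    """Turn a comma-separated motif field such as '1,6,4,5' into a set of ints."""
    return {int(token) for token in field.split(",") if token.strip()}


def classify_nlr(motifs):
    """Assign a class from the motifs found in one annotated locus.

    Assumption: a single diagnostic motif is sufficient for assignment.
    """
    motifs = set(motifs)
    has_tnl = bool(motifs & TNL_MOTIFS)
    has_cnl = bool(motifs & CNL_MOTIFS)
    if has_tnl and has_cnl:
        return "ambiguous"      # motifs from both sets; needs manual review
    if has_tnl:
        return "TNL"
    if has_cnl:
        return "CNL"
    return "unclassified"       # no N-terminal diagnostic motif detected


# Made-up motif lists, used only to exercise the function.
examples = {
    "locus_a": "17,16,1,6,4,5,10,3,12,2",
    "locus_b": "18,15,13,1,4,5,10,3,12",
    "locus_c": "1,4,5,10,3",
}
for name, field in examples.items():
    print(name, classify_nlr(parse_motif_list(field)))
```

Requiring all motifs of a set instead of any one would only change the two intersection tests to subset tests (for example, TNL_MOTIFS <= motifs).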