DISCUSSION
This study aimed to identify loci and possible candidate genes
associated with the color phenotype of tamar stage (dry) date fruit. In
our association study, we used genotypic and image data of 188 QC
filtered dry date fruit samples. Using the FarmCPU GWAS method and R/B
color phenotype, we identified six significant SNPs from the GWAS result
(based on the FDR adjusted p-value cut-off) associated with the color
phenotype (Table 2). These SNPs span linkage groups LG3, LG4, LG5, and
LG10 and one unplaced scaffold, MU008982. Among these, the highly
significant SNP LG4s65268794 is linked to the previously discovered VIR
gene responsible for yellow or red fresh fruit color. The remaining five
SNPs, which are new loci from this study, are also significantly
associated with fruit color (Wilcoxon test), though likely either only
at the dry fruit stage or with finer color detail in fresh fruit (Figure
3a). Possible candidate genes were mapped from the regions surrounding
significant SNPs. As expected from previous studies (K. M. Hazzouri et
al., 2015; Khaled M. Hazzouri et al., 2019), the R2R3-MYB transcription
factor gene is present on LG4 and located 16 kb away from the
significant SNP LG4s65268794. We confirmed that this previously
identified VIR genotype also has a major effect on dry fruit color in
our samples, likely stemming from the starting color (yellow or red) at
the fresh fruit stage (Supplementary Figure 10).
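The FDR-adjusted cut-off used to select the significant SNPs can be
illustrated with a minimal sketch. The correction procedure is not named
in this section, so the Benjamini-Hochberg procedure is assumed here; the
function names, the SNP ids in the usage note, and the 0.05 default are
hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used as the FDR procedure; the text above only
    says "FDR adjusted p-value".
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_snps(snp_p_values, fdr_cutoff=0.05):
    """Return SNP ids whose adjusted p-value is at or below the cut-off.

    snp_p_values: dict mapping a SNP id to its raw GWAS p-value
    (hypothetical layout). fdr_cutoff is an illustrative default.
    """
    ids = list(snp_p_values)
    adjusted = benjamini_hochberg([snp_p_values[s] for s in ids])
    return [s for s, q in zip(ids, adjusted) if q <= fdr_cutoff]
```

For example, significant_snps({"snpA": 0.005, "snpB": 0.5, "snpC": 0.01})
keeps snpA and snpC, because both adjusted values fall below the cut-off
while snpB does not.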
Beyond the fruit color classification provided by the genotype of the
R2R3-MYB transcription factor gene, we investigated how the genotypic
variation of other SNPs (Table 2) is associated with the color phenotype
of dry fruit. That is, do these SNPs make a further genetic contribution
to the dry date fruit color phenotype? To that end, we assessed our
SNP associations within the homozygous VIR wild type and homozygous
VIRIM groups. The results revealed that the newly identified SNPs,
excluding SNP LG4s19036701, could further resolve light and dark fruit
color on top of the classification by the VIR genotype (Figures 3b and
3c). For some SNPs, the low number of samples in the two groups (wild
type group = 25 and VIRIM group = 61) might explain the lack of a
statistically significant association.
However, the overall picture is that the newly identified loci are
associated with the color phenotype and can distinguish the dry fruit
color beyond simply the color observed in fresh fruit. This association
may relate to genetic control during the fruit ripening process.
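The stratified assessment described above, testing each SNP separately
within the two homozygous VIR groups, can be sketched as follows. This is
an illustration under stated assumptions, not the analysis code used in
the study: the sample layout, the field names, the grouping of genotypes
into reference homozygotes versus carriers, and the minimum group size
are all hypothetical. The rank-sum test uses the normal approximation
with tie correction and no continuity correction.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Returns (U statistic for x, p-value) using the normal approximation
    with a tie correction and no continuity correction.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i
        midrank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1..j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values tied: no evidence of a shift
    z = (u - mean_u) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2))


def stratified_rank_sum(samples, snp_id, min_group=2):
    """Test one SNP within each VIR genotype group.

    samples: list of dicts with hypothetical keys 'vir' (group label),
    'genotypes' (SNP id -> alternate allele count) and 'rb_ratio'
    (the color phenotype). Groups too small to test map to None.
    """
    results = {}
    for vir_group in sorted({s["vir"] for s in samples}):
        subset = [s for s in samples if s["vir"] == vir_group]
        # Assumption: compare reference homozygotes against carriers.
        ref = [s["rb_ratio"] for s in subset if s["genotypes"][snp_id] == 0]
        alt = [s["rb_ratio"] for s in subset if s["genotypes"][snp_id] != 0]
        if len(ref) < min_group or len(alt) < min_group:
            results[vir_group] = None
        else:
            results[vir_group] = rank_sum_test(ref, alt)
    return results
```

Splitting by VIR genotype first removes the large effect of the fresh
fruit color, so any remaining association reflects variation within a
group. The cost is smaller group sizes, which is consistent with the loss
of significance noted above for some SNPs.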
Based on a literature search and Blast2GO gene ontology analysis, six genes
were identified in the regions surrounding significant SNPs (Table 2)
that play a role in fruit ripening and pigmentation in other plants
(Table 3). RNA-Seq analysis reveals that many of these genes are
differentially expressed early or late in the development stages of both
light (Khalas) and dark (Kenezi) color fruit (Figure 5). Among the genes
identified, the Ethylene-responsive transcription factor-12 gene and the
RING U-Box superfamily protein gene are present in the candidate genomic
region of SNP LG10s12886617 (LG10). The Ethylene-responsive transcription
factor-12 gene is similar to the DORNROSCHEN-like protein in Arabidopsis
thaliana. It contains the AP2 domain and has a significant role in the
ethylene-activated signalling pathway and cytokinin signalling pathway
(Das et al., 2012; Phukan, Jeena, Tripathi, & Shukla, 2017). In
Arabidopsis, cytokinin signalling increases the sugar-induced
anthocyanin biosynthesis (Das et al., 2012). The candidate region from
LG10 also contains an uncharacterised protein (gene id:
PDK50.r1.LG10G00073880) that contains a Myb DNA-binding 3 domain
(Ambawat, Sharma, Yadav, & Yadav, 2013). The protochlorophyllide
reductase gene is present within the region surrounding SNP LG3s906369.
This gene plays a vital role in the chlorophyll biosynthetic pathway
(Garrone, Archipowa, Zipfel, Hermann, & Dietzek, 2015; Yamazaki,
Nomata, & Fujita, 2006).
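The mapping of candidate genes from the regions surrounding significant
SNPs can be illustrated with a small sketch. The window size is not
stated in this section, so window_bp is a hypothetical parameter, and the
gene coordinates in the usage note are invented for illustration only.
All genes are assumed to lie on the same linkage group as the SNP.

```python
def genes_near_snp(snp_pos, genes, window_bp):
    """Return (gene_id, distance_bp) pairs within window_bp of a SNP.

    genes: iterable of (gene_id, start, end) tuples on the same linkage
    group as the SNP (hypothetical layout). Distance is 0 when the SNP
    falls inside the gene; results are sorted nearest first.
    """
    hits = []
    for gene_id, start, end in genes:
        if start <= snp_pos <= end:
            distance = 0
        else:
            distance = min(abs(snp_pos - start), abs(snp_pos - end))
        if distance <= window_bp:
            hits.append((gene_id, distance))
    return sorted(hits, key=lambda hit: hit[1])
```

With a SNP at position 100000 and a 50000 bp window, a gene spanning
116000 to 118000 is reported at a distance of 16000 bp, a gene spanning
90000 to 101000 at distance 0, and a gene starting at 300000 is excluded.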
SNPs that were filtered out due to high FDR yet remained near the top of
our list also identified regions with many genes related to fruit
ripening and pigmentation (Kaler, Gillman, Beissinger, & Purcell, 2020;
Y. M. Zhang, Jia, & Dunwell, 2019). We used an unadjusted p-value of
10e-7 as the cut-off for identifying this list of significant SNPs
(Supplementary Table 2); however, other SNPs may fall just below the
significance threshold given the sample numbers used here. The
candidate region of SNP LG13s8766984 (LG13) contains an AP2-like
ethylene-responsive transcription factor. The region around SNP
LG5s4683788 (LG5) contains genes for 4-coumarate:CoA ligase and
4-coumarate:CoA ligase 3, a Myb family transcription factor family
protein, an AP2/B3-like transcriptional factor family protein, and
chalcone-flavanone isomerase. Chalcone isomerase is a critical
enzyme for anthocyanin biosynthesis (J. H. Kang et al., 2014; Sun et
al., 2019). 4-Coumarate:CoA ligase is a key enzyme in
phenylpropanoid metabolism in plants (Y. Li, Kim, Pysh, & Chapple,
2015; C. H. Wang et al., 2016). A metabolome study of dates detected an
enrichment of phenylpropanoids in the early development of the fruit
(Diboun et al., 2015). Our gene expression analysis shows that the
4-coumarate:CoA ligase genes (gene ids: PDK50.r1.LG5G00393990 and
PDK50.r1.LG5G00393890) are highly expressed early in dark color fruit
compared with light color fruit, peaking at 45-75 days post pollination
(Supplementary Figure 13). These genes may provide candidates for
further study if larger sample numbers reveal them to indeed be
significantly associated with fruit color.
By combining the genotypic data of extensively diverse samples collected
from 14 countries and the color phenotype of dry fruit (tamar stage
fruit), we successfully performed a GWAS using the FarmCPU method. We
identified multiple significant loci and possible candidate genes
associated with the color variation of fruit. The association of the new
SNPs with the color of dry date fruit will help add resolution to our
understanding of genetic control of commercially important phenotypes in
this fruit crop.