Figure 4. Fruit color differences within the homozygous VIRIM sample
group (yellow fresh fruit color) when categorized by the genotypes of
SNP LG3s906369. Fruits were grouped by whether they are homozygous for
the (a) ALT or (b) REF allele of SNP LG3s906369.
The results show that the lightness of the color increases (the R/B value
increases) when the sample is homozygous for the ALT allele of SNP LG3s906369.
Candidate genes and SNP annotation.
We considered genes to be potential candidates when they lay within ±150 kb
of a significantly associated SNP (a window of 300 kb in total). Although
this window is broad given the average LD decay, we wanted to capture
linked genes that may lie beyond the typical LD decay distance. In total,
117 genes were present across the potential regions around the significant
SNPs from the GWAS of the R/B color phenotype (Supplementary file 3).
The putative gene functions were assigned by similarity using Blast2GO
software. The Blast2GO and literature search results show that many
genes related to fruit ripening and pigmentation are present around the
regions of significant SNPs (Table 3), and many lie within 50 kb
of the significant SNPs. The gene expression analysis using the RNA-seq
data for Kenezi (dark brown color) and Khalas (light brown color) fruit
varieties shows that many genes from the potential candidate regions
were expressed in fruit at various days post-pollination (dpp)
(Figure 5). The R2R3 transcription factor gene from LG4 was expressed late
in development (dpp 105, 120 and 135) in both the dark and light brown
fruit cultivars, with reduced expression in the light-colored variety
compared with the dark one. The RING/U-box superfamily protein gene from
LG10 was expressed at dpp 105 to 120 in the light fruit variety (peaking
at dpp 120) compared with the dark fruit variety. Other genes, such as
Protochlorophyllide reductase (LG3) and the Basic helix-loop-helix (BHLH)
DNA-binding superfamily gene (LG3), were expressed early in development
(dpp 45 and 75) and showed reduced expression late in development
(dpp 105, 120 and 135) in both the light and dark brown fruit varieties.
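The ±150 kb window selection described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the record types, the function name `genes_near_snps`, and all coordinates are hypothetical, and counting a gene as "within" the window whenever any part of it overlaps the window is an assumption.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    name: str
    lg: str   # linkage group, e.g. "LG10"
    pos: int  # 1-based position in bp


class Gene(NamedTuple):
    name: str
    lg: str
    start: int
    end: int


def genes_near_snps(snps, genes, flank=150_000):
    """Map each SNP name to the genes overlapping [pos - flank, pos + flank]."""
    hits = {}
    for snp in snps:
        lo, hi = max(0, snp.pos - flank), snp.pos + flank
        # Assumption: any overlap with the window counts as "within" it.
        hits[snp.name] = [
            g.name
            for g in genes
            if g.lg == snp.lg and g.end >= lo and g.start <= hi
        ]
    return hits


# Hypothetical example data, not taken from the study.
snps = [Snp("snp_A", "LG10", 1_000_000)]
genes = [
    Gene("gene_inside", "LG10", 900_000, 905_000),
    Gene("gene_edge", "LG10", 1_149_000, 1_152_000),   # straddles the window edge
    Gene("gene_far", "LG10", 1_200_000, 1_204_000),    # beyond +150 kb
    Gene("gene_other_lg", "LG3", 1_000_000, 1_003_000),
]

candidates = genes_near_snps(snps, genes)
print(candidates)
```

Requiring the whole gene to fall inside the window would be a stricter alternative; the choice mainly affects genes at the window edges.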
We conducted structural variation analysis on all potential regions from
all LGs. The analysis showed a 5 kb deletion in the candidate region of
SNP LG10s12886617 on LG10 (Supplementary Figure 12). Although this 5 kb
deletion is located 800 bp from the pentatricopeptide
repeat-containing protein gene and 7 kb from SNP LG10s12886617,
the association between its presence/absence and the GWAS SNP was not
strong enough to warrant further investigation. The SNPs and INDELs from all potential
regions were annotated using SnpEff software and filtered based on an LD R²
value ≥ 0.6 and a putative impact of High, Moderate and
Modifier. After filtering, a total of 34 SNPs remained across the
potential regions: 12 non-synonymous variants, one frameshift variant,
and 21 three-prime and five-prime UTR variants
(Supplementary file 4). The SNP list has only one SNP (LG10s12771512)
from the gene list mentioned in Table 2. SNP LG10s12771512 is within the
Ethylene-responsive transcription factor 12 gene on LG10. SIFT analysis
of the effect of the amino acid substitution on protein function showed
that the SNP is putatively deleterious, possibly affecting protein
function.
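The variant filtering step can be sketched as follows, assuming each annotated variant has already been reduced to a record holding its LD R² with the associated GWAS SNP and its SnpEff impact category. The field names (`r2`, `impact`, `effect`), the function name `filter_variants`, and the example records are hypothetical and do not reproduce the study's data or scripts.

```python
from collections import Counter

# Impact categories retained in the text; SnpEff also defines LOW.
KEEP_IMPACTS = {"HIGH", "MODERATE", "MODIFIER"}


def filter_variants(variants, min_r2=0.6, impacts=KEEP_IMPACTS):
    """Keep variants with LD R² >= min_r2 and an impact in `impacts`."""
    return [
        v for v in variants
        if v["r2"] >= min_r2 and v["impact"] in impacts
    ]


# Hypothetical annotated variants, for illustration only.
variants = [
    {"id": "var1", "r2": 0.82, "impact": "MODERATE", "effect": "missense_variant"},
    {"id": "var2", "r2": 0.60, "impact": "MODIFIER", "effect": "3_prime_UTR_variant"},
    {"id": "var3", "r2": 0.91, "impact": "LOW", "effect": "synonymous_variant"},
    {"id": "var4", "r2": 0.35, "impact": "HIGH", "effect": "frameshift_variant"},
]

kept = filter_variants(variants)
effect_counts = Counter(v["effect"] for v in kept)  # tally by effect type
print([v["id"] for v in kept], dict(effect_counts))
```

In this toy input, var3 is dropped for its LOW impact and var4 for its weak LD, which mirrors the two filtering criteria applied jointly.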
Table 3: List of genes detected around the regions of significant SNPs
from the GWAS of the R/B fruit color phenotype. Genes
were selected if they have a putatively significant role in fruit
ripening and pigmentation. A region of ±150 kb on both sides of each
significant SNP (300 kb in total) was considered as a potential region
for identifying possible candidate genes.