Figure 2. GWAS analysis of Tamar stage fruit color (R/B) in date palm. (a) Decay of linkage disequilibrium with physical distance in the GWAS mapping population. The half-decay distance is 25.9 kb. (b, c) GWAS of date palm fruit color (n = 188): (b) Q-Q plot and (c) Manhattan plot using the LD-pruned SNP set (3,541,727 SNPs) for all linkage groups (LG) and unplaced scaffolds.
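The half-decay distance reported in panel (a) can be illustrated with a minimal sketch. It assumes the common definition (the distance at which mean pairwise r² falls to half of its maximum) and linear interpolation between distance bins; the method actually used to obtain 25.9 kb is not stated here, and the function name and toy values are hypothetical.

```python
def half_decay_distance(distances_kb, mean_r2):
    """Distance (kb) at which mean r^2 first drops to half of its maximum.

    distances_kb: bin midpoints in kb, sorted in ascending order.
    mean_r2: mean pairwise r^2 in each bin (same length as distances_kb).
    Returns None if r^2 never falls to half of its maximum.
    """
    if len(distances_kb) != len(mean_r2) or not mean_r2:
        raise ValueError("inputs must be non-empty and of equal length")
    peak = max(mean_r2)
    half = peak / 2.0
    start = mean_r2.index(peak)
    for i in range(start + 1, len(mean_r2)):
        if mean_r2[i] <= half:
            x0, y0 = distances_kb[i - 1], mean_r2[i - 1]
            x1, y1 = distances_kb[i], mean_r2[i]
            if y0 == y1:  # flat segment, nothing to interpolate
                return x1
            # Linear interpolation between the two bins bracketing the half-maximum.
            return x0 + (y0 - half) * (x1 - x0) / (y0 - y1)
    return None


if __name__ == "__main__":
    # Toy values for illustration only, not data from this study.
    toy_distances = [1, 10, 20, 40, 80]
    toy_r2 = [0.8, 0.6, 0.45, 0.35, 0.2]
    print(half_decay_distance(toy_distances, toy_r2))
```

Interpolating between bins keeps the estimate from snapping to a bin edge, but the result still depends on the bin width chosen for the decay curve.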
Association study of Tamar stage fruit color
Genotyping of the date palm samples with GATK HaplotypeCaller produced a filtered call set of 31,373,292 SNPs. After applying quality control (QC) filters, we retained 10,183,834 SNPs across 188 samples. The association study was performed for all fruit color phenotypes (Red, Green, Blue, R/B, R/G and G/B) using the FarmCPU GWAS method. To keep FarmCPU computationally efficient, we removed LD-correlated SNPs from the GWAS panel; LD pruning with PLINK left 3,541,727 SNPs. The GWAS results of all phenotypes except the ‘Blue’ color channel phenotype showed overlapping significant SNPs. After manual analysis, we chose to focus on the GWAS results of the R/B phenotype for the reasons mentioned above. The Q-Q plot of the FarmCPU genome-wide association results for the R/B phenotype shows a sharp deviation from the expected P-value distribution in the tail region and a lambda value of 1.02 (Figure 2b), indicating good control of both false positives and false negatives. At the FDR-adjusted P-value cut-off, 6 SNPs are significantly associated with the R/B phenotype (Table 2 and Figure 2c). These SNPs lie on multiple linkage groups (LG3, LG4, LG5 and LG10) and on one unplaced scaffold, MU008982.1. A SNP on LG4 (LG4s65268794) has a highly significant P-value (10e-12) and is shared among the GWAS results of the Red (R), R/B and R/G phenotypes (Figure 2c, Supplementary Figures 4 and 7). SNP LG10s12886617 on LG10 is shared between the R/B and G/B phenotypes (Supplementary Figure 8). The association result for R/G also shows a significant SNP on LG10 (LG10s12504803), which lies very close to LG10s12886617. The significant SNPs from the R/B association results were used to identify the loci and possible candidate genes associated with the color phenotype of Tamar stage fruit.
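Two of the quantities used above, the Q-Q plot lambda and the FDR-adjusted significance cut-off, can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in this study: lambda is taken to be the standard genomic inflation factor (median observed 1-df chi-square divided by its expected median), and the FDR adjustment is taken to be the Benjamini-Hochberg procedure, which the text does not name explicitly. Function names and toy inputs are hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Lambda: median observed 1-df chi-square over its expected median under the null."""
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    nd = NormalDist()
    # A two-sided P-value p corresponds to a 1-df chi-square statistic z**2,
    # where z is the standard normal quantile at p / 2.
    observed = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.25) ** 2  # median of chi-square(1), about 0.455
    return median(observed) / expected_median


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P-values (assumed FDR method), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_snps(snp_ids, p_values, alpha=0.05):
    """SNP IDs whose adjusted P-value is at or below alpha (0.05, as in Table 2)."""
    return [s for s, q in zip(snp_ids, bh_adjust(p_values)) if q <= alpha]


if __name__ == "__main__":
    # Toy inputs for illustration only, not results from this study.
    toy_ids = ["snp_a", "snp_b", "snp_c", "snp_d"]
    toy_p = [0.001, 0.2, 0.04, 0.9]
    print(genomic_inflation(toy_p))
    print(significant_snps(toy_ids, toy_p))
```

A lambda close to 1, such as the 1.02 reported for R/B, means the bulk of the test statistics follows the null distribution, so the departure in the tail of the Q-Q plot reflects a small number of strongly associated SNPs rather than genome-wide inflation.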
Table 2. List of significant SNPs associated with the R/B phenotype from the FarmCPU GWAS result. An FDR-adjusted P-value of 0.05 (5%) was used as the cut-off for identifying significant SNPs associated with the phenotype.