3 Results
3.1 Detection and isolation of PDCoV
Of the 314 samples suspected of PDCoV infection, 42 tested positive, a
positive rate of 13.4%. Two complete PDCoV genomes were obtained in this
study and deposited in GenBank under accession numbers MH715491 and
MT263013.
Moreover, one PDCoV strain (MT263013), which could be stably passaged on
the LLC-PK cell line, was successfully isolated and confirmed by an
indirect immunofluorescence assay (IFA) and reverse
transcription-polymerase chain reaction (RT-PCR) (Fig. 1).
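The positive rate reported above follows directly from the sample
counts. As a minimal sketch, the arithmetic can be reproduced as shown
here. The Wilson score interval is an illustrative addition that is not
reported in this study, and the function name `wilson_interval` is
hypothetical.

```python
from math import sqrt


def wilson_interval(positives, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0 or not 0 <= positives <= total:
        raise ValueError("need total > 0 and 0 <= positives <= total")
    p = positives / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Counts reported in Section 3.1.
positives, total = 42, 314
rate = positives / total
low, high = wilson_interval(positives, total)

print(f"positive rate: {rate:.1%}")
# The interval below is an illustration only; it is not a result of the study.
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```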
3.2 Distribution and phylogenetic analysis of PDCoV
PDCoV was first detected in Hong Kong, China, in 2012 and caused
outbreaks in the United States in 2014. To date, PDCoV has been reported
in more than 11 countries, indicating that it has spread worldwide. In
China, pig farms in 14 provinces have been infected with PDCoV (Fig. 2).
The ML and BI trees inferred from the complete PDCoV genomes were almost
identical (Fig. 3). The full genomes were classified into three major
lineages: the Southeast Asia (SEA) lineage, including Thailand, Vietnam
and Laos; the America lineage, including the USA, Peru, and Japan and
South Korea (JSK); and the China (CHN) lineage. We also observed that
MW685622 and MW685624 from Ayiti were highly similar to KY065120 from
Tianjin, China (99.8%), and that MW685623 from Ayiti was highly similar
to KR150443 from Arkansas, USA (98.9%).
Bayesian skyline plot analysis revealed that the estimated effective
population size rose from 15 to 45 between 1989 and 2010 and, after a
brief fluctuation, remained at around 70 until 2019 (Fig. 4a).
Root-to-tip regression in TempEst (version 1.5.3) revealed clock-like
structure in the spike gene (n = 130, correlation coefficient = 0.56,
R² = 0.32), indicating a temporal signal strong enough to estimate
time-calibrated phylogenies using molecular clock models (Supplementary
Fig. S1). Similar to the full genome, spike
genes were also classified into three major lineages in the maximum
clade credibility (MCC) tree (Fig. 4b). Our reconstruction indicated
that the virus spread from the CHN lineage; interestingly, however, the
BSSVS analysis pointed to the SEA lineage as the origin. In the MCC
tree, the probabilities that PDCoV originated from the CHN lineage
(49.05%) and from the SEA lineage (48.45%) were similar. The BSSVS
analysis showed that PDCoV spread from Southeast Asia to the USA, Ayiti,
Peru and China with high BF values and posterior probabilities (Fig. 5a,
Supplementary Table S5), whereas spread from China was supported only by
low BF values and posterior probabilities. Taken together, these two
results indicate that PDCoV originated in Asia, more likely in Southeast
Asia, consistent with the results of Xiao’s phylogenetic analysis (Ye et
al., 2020).
The phylogeographic inference indicated that PDCoV might have originated
around June 1989 (95% highest posterior density: June 1982–January
1996).
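The root-to-tip regression mentioned above can be illustrated with an
ordinary least-squares fit of root-to-tip distance against sampling
date. The following is a minimal sketch, not the TempEst implementation:
the function name and the toy dates and distances are hypothetical, and
the real analysis used 130 spike gene sequences. The slope approximates
the evolutionary rate, the x-intercept approximates the time of the
root, and R² is the square of the correlation coefficient.

```python
from math import sqrt


def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance on sampling date.

    Returns the slope (rate), the x-intercept (estimated root date, or
    None when the slope is not positive), r, and R^2.
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need two or more paired observations")
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    syy = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and distances must both vary")
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    r = sxy / sqrt(sxx * syy)
    root_date = -intercept / slope if slope > 0 else None
    return {"rate": slope, "root_date": root_date, "r": r, "r_squared": r * r}


# Hypothetical toy data: decimal sampling years and root-to-tip distances
# (substitutions per site). These are NOT values from this study.
toy_dates = [2012.0, 2014.5, 2016.0, 2018.5]
toy_distances = [0.010, 0.013, 0.014, 0.018]
fit = root_to_tip_regression(toy_dates, toy_distances)
print(fit)
```

A positive slope with a moderate correlation is what is usually read as
sufficient temporal signal; the thresholds for "sufficient" are a
judgment call and are not defined by this sketch.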
3.3 The spread of PDCoV
The worldwide spatial dispersal network of PDCoV was reconstructed. We
analyzed only the transmission routes with BF values exceeding 3 and
posterior probabilities exceeding 0.5 (Ye et al., 2020). There were six
discrete sampling locations and nine significant transmission routes
(Fig. 6). SEA and JSK were the major sources of PDCoV export. SEA was
linked to four locations: Ayiti (BF = 68.10, migration rate = 0.924),
JSK (BF = 9.48, migration rate = 0.938), the USA (BF = 6.97, migration
rate = 0.936), and China (BF = 38425, migration rate = 1.345). JSK was
connected to four locations: Peru (BF = 54.13, migration rate = 0.913),
the USA (BF = 4.62, migration rate = 0.927), SEA (BF = 9.93, migration
rate = 0.958) and Ayiti (BF = 70.21, migration rate = 0.933). In
addition, there was a transmission route from the USA to JSK (BF = 7.52,
migration rate = 3.553) (Fig. 5b, Supplementary Table S5). China and
Southeast Asia are adjacent, about 2000 km apart, and have relatively
close exchanges, which may explain why the route between them had the
highest BF value and a high migration rate.
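The routes listed above can be tabulated to check the counts stated in
the text. This is a small sketch using only the BF values and migration
rates reported in this section; posterior probabilities are not given in
the text, so only the BF threshold is applied here. The variable names
are illustrative.

```python
from collections import Counter

# (source, destination, Bayes factor, migration rate) as reported in the text.
ROUTES = [
    ("SEA", "Ayiti", 68.10, 0.924),
    ("SEA", "JSK", 9.48, 0.938),
    ("SEA", "USA", 6.97, 0.936),
    ("SEA", "China", 38425.0, 1.345),
    ("JSK", "Peru", 54.13, 0.913),
    ("JSK", "USA", 4.62, 0.927),
    ("JSK", "SEA", 9.93, 0.958),
    ("JSK", "Ayiti", 70.21, 0.933),
    ("USA", "JSK", 7.52, 3.553),
]
BF_THRESHOLD = 3.0

# Keep routes whose Bayes factor exceeds the threshold used in the text.
supported = [route for route in ROUTES if route[2] > BF_THRESHOLD]

# Number of supported outgoing routes per source location.
exports = Counter(src for src, _dst, _bf, _rate in supported)

# All locations that appear as a source or a destination.
locations = {loc for src, dst, _bf, _rate in ROUTES for loc in (src, dst)}

highest_bf = max(supported, key=lambda route: route[2])
highest_rate = max(supported, key=lambda route: route[3])

print(len(supported), "supported routes among", len(locations), "locations")
print("outgoing routes per source:", dict(exports))
print("highest BF:", highest_bf[:2], "highest migration rate:", highest_rate[:2])
```

Note that the route with the highest BF value (SEA to China) is not the
route with the highest migration rate (USA to JSK).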
Remarkably, we observed a strong signal of viral dissemination from the
USA to JSK, even though the two regions are approximately 11200 km
apart, suggesting that PDCoV in JSK may have spread from the USA,
consistent with the MCC tree. In Ayiti, pigs were reintroduced from
North American populations, mainly from the USA and, to a small extent,
Canada (Alexander, 1992), and there was also a link between Ayiti and
JSK. Infections in Ayiti were likely associated with the importation of
pigs from the USA.
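For readers unfamiliar with how BF values relate to posterior
probabilities in BSSVS, the sketch below shows one common formulation:
the posterior odds that a rate indicator is switched on, divided by the
prior odds. It assumes the widely used truncated Poisson prior with mean
ln 2 and an offset of K - 1 for K locations, as applied by tools such as
SpreaD3. The text does not state how the BF values here were computed,
so the prior, the asymmetric default, and the function names are all
assumptions.

```python
from math import log


def bssvs_bayes_factor(posterior_prob, n_locations, symmetric=False,
                       poisson_mean=log(2)):
    """Bayes factor for one BSSVS rate indicator (assumed prior, see lead-in)."""
    if not 0 < posterior_prob < 1:
        raise ValueError("posterior_prob must lie strictly between 0 and 1")
    n_rates = n_locations * (n_locations - 1)
    if symmetric:
        n_rates //= 2
    # Prior expectation that any single rate is included in the model.
    prior_prob = (poisson_mean + n_locations - 1) / n_rates
    if not 0 < prior_prob < 1:
        raise ValueError("too few locations for this prior")
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = posterior_prob / (1 - posterior_prob)
    return posterior_odds / prior_odds


def is_supported(posterior_prob, n_locations, bf_cutoff=3.0, pp_cutoff=0.5):
    """Apply the two cutoffs used in the text: BF > 3 and posterior > 0.5."""
    bf = bssvs_bayes_factor(posterior_prob, n_locations)
    return bf > bf_cutoff and posterior_prob > pp_cutoff


# Six discrete locations, as in this study; the posterior value is hypothetical.
example_bf = bssvs_bayes_factor(0.9, n_locations=6)
print(round(example_bf, 2), is_supported(0.9, n_locations=6))
```

Under these assumptions, a very large BF such as the one reported for
the route from SEA to China corresponds to a posterior probability very
close to 1.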
3.4 Protein structure analysis
Two sequences, LC216914 and LC216915 in GenBank, came from
nasopharyngeal samples collected from pigs, suggesting that PDCoV may be
able to cause respiratory infections in pigs (Woo et al., 2017). One
sequence, MK248485 in GenBank, came from a sample collected from
chickens, suggesting that PDCoV can infect chickens (Boley et al.,
2020). Three sequences, MW685622, MW685623, and MW685624 in GenBank,
came from blood samples collected from children, suggesting that PDCoV
has the potential to infect humans (Lednicky et al., 2021). Comparison
of these six sequences with all other sequences revealed similar changes
at seven amino acid sites. Only residues 52 to 1017 of the S protein
structure are shown in Fig. 7, because the regions at the beginning and
end of the protein are hydrophobic and can adversely affect protein
solubility (Lednicky et al., 2021).
Residue 38 was mutated from P to L, which may affect the secondary
structure of the protein, because proline is an imino acid whose
backbone nitrogen cannot donate an intra-chain hydrogen bond and which
tends to promote β-turn formation. Residue 40 was mutated from R to S,
which reduces steric hindrance and enhances hydrophilicity. An
N-glycosylation site lies between residues 41 and 44, so the deletion of
residue 45 may affect glycosylation (Lednicky et al., 2021). The
substitutions between A and V at residues 137 and 551 may eliminate
specific van der Waals contacts, potentially enhancing protein
flexibility and the dynamic movement of S1. Moreover, this change may
represent a common mechanism that enhances dynamic movements,
accelerating virus membrane fusion events and transmission (Lednicky et
al., 2021; Thompson et al., 2021). The mutation at residue 670 altered
steric hindrance. Protein phosphorylation occurs mainly on tyrosine,
serine, and threonine residues in the peptide chain. Residue 689 was
mutated from S to A, resulting in the loss of a phosphorylation site
(Fig. 8 and Table 1).
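Two of the sequence-level arguments above can be checked mechanically:
whether an N-glycosylation sequon (N-X-S/T, where X is not proline) is
present before and after a change, and whether a substitution removes a
residue that can be phosphorylated (S, T or Y). The following is a
minimal sketch; the function names and the toy peptides are
hypothetical and do not reproduce the PDCoV S protein sequence.

```python
import re

# N-X-S/T with X != P; the lookahead lets overlapping motifs be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")
PHOSPHO_RESIDUES = set("STY")


def find_sequons(sequence):
    """Return 1-based positions of the N in each N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


def loses_phosphosite(reference_residue, variant_residue):
    """True if a phosphorylatable residue is replaced by one that is not."""
    return (reference_residue.upper() in PHOSPHO_RESIDUES
            and variant_residue.upper() not in PHOSPHO_RESIDUES)


# Hypothetical toy peptides, for illustration only.
toy_reference = "ANASG"   # contains the sequon N-A-S starting at position 2
toy_deletion = "ANAG"     # the same peptide with the serine deleted

print(find_sequons(toy_reference), find_sequons(toy_deletion))
print(loses_phosphosite("S", "A"))  # serine to alanine, as at residue 689
```

A motif scan of this kind only flags candidate sites; whether a site is
actually glycosylated or phosphorylated depends on structural context
that the sketch does not model.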