Newly recovered partial sequences of the novel coronavirus may help shed light on the early spread of the virus. A scientist uncovered 34 samples of the SARS-CoV-2 virus from early in the initial epidemic in Wuhan that had been mysteriously deleted from an online database, according to a pre-print analysis published on bioRxiv.
Jesse Bloom, the author of the analysis, is the principal investigator at the Bloom Lab at the Fred Hutch research institute, which studies the evolution of proteins and viruses. Bloom is also an investigator at the Howard Hughes Medical Institute.
While the virus was originally thought to have jumped from animals to humans at Wuhan's Huanan Seafood Market, that theory was largely dismissed after many early cases were found to have no connection to the market. Some studies have found cases from as early as September 2019.
While it is still unclear how the SARS-CoV-2 virus began spreading among humans, the virus is believed to be descended originally from coronaviruses from bats.
The problem, Bloom explained, is that one would expect earlier samples of the virus to be more similar to the bat coronaviruses than later samples, but this is not the case. The sequences collected from cases linked to the seafood market differ more from the bat coronaviruses than sequences collected at later dates outside Wuhan do, with the market samples containing three extra mutations relative to the samples collected later.
Very few sequenced virus samples are available from the Wuhan epidemic, apart from a dozen samples collected in late December 2019 from patients connected to the Huanan Seafood Market.
In the pre-print analysis, Bloom explained that the lack of information may be partially due to an order issued to unauthorized Chinese labs to destroy all coronavirus samples from early in the outbreak.
Bloom first noticed the gap when he saw that data listed in a study on early mutational events of the virus was missing from the National Institutes of Health's Sequence Read Archive (SRA). Data can only be deleted from the SRA by an email request.
Bloom later found that the sequencing project was removed from the China National GeneBank as well, shortly after the data was removed from the SRA.
The deleted sequences Bloom found partly fill the gap between the samples collected from the Huanan Seafood Market and the possible progenitor viruses found in bats. The 13 partially reconstructed sequences are more similar to the bat coronaviruses than the samples from the seafood market are.
Bloom stressed in the pre-print analysis that this suggests that the sequences from the market are "not representative of the viruses that were circulating in Wuhan in late December of 2019 and early January of 2020."
The scientist bemoaned the fact that the sequences were deleted, stressing that it "clearly would have been more scientifically informative to fully sequence the samples rather than surreptitiously delete the partial sequences."
"There is no plausible scientific reason for the deletion," wrote Bloom, explaining that the paper the sequences were linked to had no corrections, that it stated that human subjects gave their approval, and that the sequencing shows no evidence of contamination.
"It therefore seems likely the sequences were deleted to obscure their existence. Particularly in light of the directive that labs destroy early samples and multiple orders requiring approval of publications on COVID-19, this suggests a less than wholehearted effort to trace early spread of the epidemic," wrote Bloom in the pre-print analysis.
"A careful re-evaluation of other archived forms of scientific communication, reporting, and data could shed additional light on the early emergence of the virus," advised Bloom, adding that it may be possible to obtain more information about the early spread of SARS-CoV-2, even if on-the-ground investigations face difficulties.
The study was prepared in May 2020 by the Lawrence Livermore National Laboratory in California and was referred to by the State Department when it conducted an inquiry into the pandemic's origins during the final months of the Trump administration, the report said.
US intelligence agencies are considering two likely scenarios - that the virus resulted from a laboratory accident or that it emerged from human contact with an infected animal - but they have not come to a conclusion, he said.