A year and a day ago, the genetic sequence of the virus that has since spread across the world was shared. Though we were yet to appreciate the effect that the virus would come to have on our lives, this was already the moment at which science started to fight back. In this new series of graphics, made with the Royal Society of Chemistry, we’ll be highlighting the key scientific milestones that have brought us treatments, vaccines, and more.
Like our own genetic sequence, the genetic sequence of a virus contains all the instructions it needs to function: the code for the proteins that help the virus invade our bodies’ cells, and the steps for it to make copies of itself so it can infect new hosts. Whereas our genetic information is encoded in the double-stranded DNA molecule, that of SARS-CoV-2 is encoded in its single-stranded relative, RNA.
Genetic material is built up from nucleotides. Just as the alphabet’s letters build up words and sentences, nucleotides build up the genetic code. The difference is that, whereas the English alphabet has 26 letters, the nucleotide alphabet has just four, each distinguished by the base it carries: adenine, guanine, cytosine and thymine. These are given the letters A, G, C and T for short. In RNA, thymine is replaced by uracil (designated by the letter U).
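To make the two alphabets concrete, here is a minimal Python sketch that rewrites a DNA-style sequence in RNA letters by swapping every T for a U. The function name `dna_to_rna` and the short example string are made up for illustration; real sequence files often contain extra symbols (such as N for an unknown base) that this sketch deliberately rejects.

```python
DNA_ALPHABET = set("ACGT")

def dna_to_rna(dna):
    """Rewrite a DNA-style sequence in the RNA alphabet by swapping T for U."""
    dna = dna.upper()
    unknown = set(dna) - DNA_ALPHABET
    if unknown:
        # Anything outside A, C, G, T is treated as an error in this sketch.
        raise ValueError(f"unexpected letters: {sorted(unknown)}")
    return dna.replace("T", "U")

# An arbitrary example string, not taken from any real genome.
example = "ATTAAAGG"
print(dna_to_rna(example))  # AUUAAAGG
```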
Determining the genetic sequence involves working out the order of As, Gs, Cs and Us that make it up. This isn’t as straightforward as it sounds when you realise that the genetic sequence of SARS-CoV-2 is approximately 30,000 nucleotides long. Scientists have developed a number of techniques to crack the code – the one used for the initial sequencing of SARS-CoV-2 is known as ‘sequencing by synthesis’.
As the name suggests, sequencing by synthesis involves working out the genetic sequence by building it. First, the virus’s RNA sequence is converted to DNA. This DNA is purified, broken up into smaller sections to be sequenced, then copied thousands of times over. Each smaller section of DNA is ‘unzipped’ to be copied, splitting it into two complementary strands.
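The fragmenting and ‘unzipping’ steps can be loosely pictured in Python. This is a sketch under simplifying assumptions: the helper names `fragment` and `complement`, the fixed fragment size and the toy sequence are all hypothetical, real fragments vary in length, and the sketch ignores the fact that the two strands of DNA run in opposite directions.

```python
# Base-pairing rules in DNA: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the strand that pairs, base by base, with the one given."""
    return "".join(PAIRS[base] for base in strand)

def fragment(sequence, size):
    """Cut a sequence into consecutive pieces of at most `size` letters."""
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]

# A made-up 14-letter sequence standing in for the copied DNA.
dna = "ATGCGTACGTTAGC"
for piece in fragment(dna, 5):
    # Each piece 'unzips' into itself and its complementary partner.
    print(piece, complement(piece))
```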
During sequencing, a polymerase enzyme builds up the genetic sequence one nucleotide at a time, using the unzipped DNA as a template for the copy it constructs. Each nucleotide has a ‘terminator’ molecule attached to it, which hits the pause button every time a new nucleotide is added to the chain. This terminator also has a fluorescent tag attached, which can be detected by a special camera. The light given off by the fluorescent tag indicates which nucleotide was added to the chain.
After this, the terminator and tag are removed, unpausing the copying reaction, which begins again. Another nucleotide is added. Pause. The nucleotide added is recorded. Unpause. And so on, until all of the almost 30,000 nucleotides have been recorded.
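The add, pause, record, unpause cycle described above maps naturally onto a loop. The Python sketch below is a toy simulation rather than how a real sequencing instrument is programmed: the colour assigned to each nucleotide is an assumption made for illustration, the function name `sequence_by_synthesis` is hypothetical, and strand direction is ignored.

```python
# Base-pairing rules the polymerase follows when copying the template.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical tag colours, one per nucleotide (assumed for this sketch).
TAG_COLOUR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOUR_TO_BASE = {colour: base for base, colour in TAG_COLOUR.items()}

def sequence_by_synthesis(template):
    """Simulate reading a template strand one nucleotide per cycle."""
    recorded = []
    for template_base in template:
        # The polymerase adds the complementary nucleotide; its terminator
        # pauses the reaction so that only one nucleotide is added.
        added = PAIRS[template_base]
        # The camera sees only the colour of the fluorescent tag.
        colour = TAG_COLOUR[added]
        # The colour tells us which nucleotide was added.
        recorded.append(COLOUR_TO_BASE[colour])
        # Terminator and tag are removed here, and the next cycle begins.
    return "".join(recorded)

# A made-up six-letter template strand.
print(sequence_by_synthesis("TACGGA"))  # ATGCCT
```

The loop records the newly built strand, which is the complement of the template it was copied from.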
This method might sound time-consuming, recording each nucleotide one by one, but it’s actually pretty quick, and can determine the genetic sequence in a matter of hours. In fact, the SARS-CoV-2 virus was isolated only a few days before the draft genetic sequence was published. The sequence was published by a Chinese-led research consortium as open access, meaning that other scientists across the world could make use of it immediately.
We’ll look more closely at some of the scientific advances it enabled as we go through this series, but it’s worth highlighting a few pertinent ones now. For example, the recently approved Moderna vaccine utilises part of the RNA code of SARS-CoV-2 – and was designed just days after the genetic sequence was released.
There’s also been a lot of concern and discussion recently about newly emerging variants of the SARS-CoV-2 virus. Knowledge of these variants is made possible by genetic sequencing, which lets us compare current sequences with this original sequence and understand how changes might affect the virus.
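Comparing a newer sequence with the original can be pictured as lining the two up and listing the positions where the letters differ. The Python sketch below assumes the two sequences are already the same length; real comparisons first need an alignment step to cope with inserted or deleted nucleotides. The function name `point_differences` and both sequences are invented for illustration.

```python
def point_differences(reference, sample):
    """List (position, reference letter, sample letter) for each mismatch.

    Positions are counted from 1, as is usual when describing genomes.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base)
        in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]

# Two made-up RNA snippets that differ at a single position.
reference = "AUGUUUGUUUUU"
sample = "AUGUUCGUUUUU"
print(point_differences(reference, sample))  # [(6, 'U', 'C')]
```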
Back on January 10, 2020, none of us were anticipating the pandemic to come. But one of the key scientific tools to help us emerge from it was already in place.
This graphic was developed in partnership with the Royal Society of Chemistry.
The graphic in this article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.