Basic notions about DNA
Let’s first review some general notions. The genome is the complete set of genetic material of an organism or species, including all of its genes. It is encoded in deoxyribonucleic acid (DNA), which is made up of four types of subunits called nucleotides, each carrying one of four nitrogenous bases: adenine, thymine, cytosine, or guanine. Each DNA sequence that codes for a protein is called a gene.
DNA is a double strand in the form of a helix. At each position, the two strands carry a pair of complementary nitrogenous bases: adenine is complementary to thymine, and cytosine is complementary to guanine. Therefore, if we know that there is a cytosine at a given point in one of the two strands, we know that there is a guanine at that same point in the complementary strand. This is important for understanding the sequencing method developed by Sanger.
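The pairing rule can be written as a simple lookup. The following is a minimal Python sketch, not part of Sanger's method itself; the function name `complement` and the example sequence are made up for illustration, and the strand is read position by position without considering strand direction.

```python
# Base pairing: adenine <-> thymine, cytosine <-> guanine.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base at each position of the complementary strand."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


template = "ATGCCGTA"  # hypothetical example sequence
print(complement(template))  # TACGGCAT
```

Knowing one strand is therefore enough to write down the other, which is why reading the newly synthesized strand also reveals the template.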
Frederick Sanger was a British biochemist who received the 1958 Nobel Prize in Chemistry for his research on the structure of proteins and, in 1980, shared a second Nobel Prize in Chemistry for his work on DNA sequencing. His first findings on DNA sequencing were published in 1975, followed in 1977 by the chain-termination method described here, and they marked the beginning of numerous genetic investigations.
The Sanger sequencing method starts from a single DNA strand, previously separated from its complementary strand. DNA synthesis is carried out by the polymerase enzyme (which we discuss in this article about PCR tests), which is added to the DNA strand that you want to sequence. In the presence of a template DNA strand, a short sequence from which synthesis starts (called a “primer”), and nucleotides, polymerase is capable of synthesizing the complementary DNA strand.
How do we obtain the sequence of the DNA strand being synthesized? By stopping DNA synthesis in such a way that we know which nucleotide was the last one added to the chain, and then putting the many incomplete chains of different lengths in order.
To do this, DNA synthesis is carried out in four separate reactions. Each reaction receives all four nucleotides plus one dideoxynucleotide (of adenine, thymine, cytosine, or guanine). Dideoxynucleotides lack a chemical group (the 3' hydroxyl) that is needed to attach the next nucleotide, so when one is added to the chain, synthesis stops and the strand remains incomplete.
In each of the four reactions, the dideoxynucleotide is added at a concentration 100 times lower than that of the corresponding nucleotide. This produces DNA strands of many different lengths, without making most strands so short that sequencing becomes too tedious a process.
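To see why a low dideoxynucleotide concentration yields a spread of fragment lengths, one of the four reactions can be simulated. This is only an illustrative Python sketch: it assumes that the chance of incorporating the dideoxynucleotide at a matching position is simply proportional to its share of the pool (1 in 101 for a 1:100 ratio), which is a simplification of the real enzyme kinetics. The function name `run_reaction`, its parameters, and the example strand are hypothetical.

```python
import random


def run_reaction(new_strand, dd_base, n_molecules=10000, ratio=100, seed=0):
    """Simulate one reaction and return the set of fragment lengths produced.

    new_strand: the strand the polymerase would synthesize in full.
    dd_base:    the base supplied as a dideoxynucleotide in this reaction.
    ratio:      normal nucleotide : dideoxynucleotide concentration ratio.
    """
    rng = random.Random(seed)
    p_stop = 1 / (ratio + 1)  # assumed chance of picking the terminator
    lengths = set()
    for _ in range(n_molecules):
        for position, base in enumerate(new_strand, start=1):
            # Synthesis can only stop where the dideoxy base would be added.
            if base == dd_base and rng.random() < p_stop:
                lengths.add(position)
                break
    return lengths


strand = "TACGGCAT" * 5  # hypothetical 40-base strand
fragments = run_reaction(strand, "A")
print(sorted(fragments))
```

Every length returned corresponds to a position where the strand has the chosen base. With a much higher dideoxynucleotide concentration (a small `ratio`), almost all molecules would stop at the first few matching positions and the later part of the sequence would go unread.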
Once the reactions have finished, the four samples are loaded onto an acrylamide gel, which separates molecules of different sizes when an electric current is applied (electrophoresis). This sorts the incomplete DNA strands by length, and since the final base of each strand is known from the reaction it came from, the nucleotide sequence can be read off. It is, of course, the sequence complementary to the strand that was used as a template for DNA synthesis.
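Reading the gel amounts to sorting all fragments by length and noting which reaction each one came from. The sketch below is a hypothetical Python illustration: the `lanes` data is invented, fragment lengths count only newly synthesized bases (the primer is ignored), and every position is assumed to be represented by at least one fragment.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def read_gel(lanes):
    """Rebuild the synthesized strand from fragment lengths per reaction.

    lanes maps each dideoxy base to the fragment lengths seen in its lane.
    """
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            # A fragment of this length ended with this base.
            base_at_length[length] = base
    # Shortest fragment first, as on the gel.
    return "".join(base_at_length[n] for n in sorted(base_at_length))


# Invented example: fragment lengths observed in each of the four lanes.
lanes = {"A": [2, 7], "T": [1, 8], "C": [3, 6], "G": [4, 5]}

synthesized = read_gel(lanes)
template = "".join(COMPLEMENT[base] for base in synthesized)
print(synthesized)  # TACGGCAT
print(template)     # ATGCCGTA
```

The last line applies the complementarity rule to recover the template strand from the strand that was actually read.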
This is how Sanger originally developed his sequencing method, which he had to run manually without any automation. Technological advances have since made it possible to carry out sequencing by this method much more quickly and efficiently. For example, labeling each of the four dideoxynucleotides with a different fluorescent dye allows sequencing with a single reaction instead of four.
The original procedure allowed about 80 nucleotides to be sequenced at a time. It was a major advance over previous technologies, but sequencing entire genes could still be tedious. The first completely sequenced DNA genome, that of the bacteriophage Phi-X174, was obtained by Sanger's group in 1977. The genome of this bacteriophage is arranged in 11 genes and has a total of 5,386 nitrogenous bases.