Nucleic acid secondary structure

The secondary structure of a nucleic acid molecule refers to the base-pairing interactions within a single molecule or set of interacting molecules, and can be represented as a list of the bases that are paired.[1] The secondary structures of biological DNAs and RNAs tend to differ: biological DNA mostly exists as fully base-paired double helices, while biological RNA is single-stranded and often forms complicated base-pairing interactions, owing to its increased ability to form hydrogen bonds stemming from the extra hydroxyl group in the ribose sugar.
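As a minimal sketch of the "list of paired bases" representation, the snippet below stores a structure as (i, j) index pairs and renders it in dot-bracket text, a widely used convention that the article itself does not mention. The function name, the 0-based indexing, and the 12-nucleotide hairpin are illustrative assumptions, not taken from the cited work.

```python
def pairs_to_dot_bracket(length, pairs):
    """Render a list of (i, j) base pairs (0-based, i < j) as dot-bracket text.

    Unpaired positions become '.', the 5' partner '(' and the 3' partner ')'.
    Only nested (non-crossing) structures are unambiguous in this notation.
    """
    structure = ["."] * length
    for i, j in pairs:
        if not (0 <= i < j < length):
            raise ValueError(f"invalid pair ({i}, {j}) for length {length}")
        if structure[i] != "." or structure[j] != ".":
            raise ValueError(f"a base in pair ({i}, {j}) is already paired")
        structure[i] = "("
        structure[j] = ")"
    return "".join(structure)


# Hypothetical hairpin: a 4-bp stem closing a 4-nt loop (UUCG).
sequence = "GGGAUUCGUCCC"
pairs = [(0, 11), (1, 10), (2, 9), (3, 8)]

print(sequence)
print(pairs_to_dot_bracket(len(sequence), pairs))  # ((((....))))
```

A pair list is convenient for computation, while the dot-bracket string is easier to read alongside the sequence.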

In a non-biological context, secondary structure is a vital consideration in the rational design of nucleic acid structures for DNA nanotechnology and DNA computing, since the pattern of base pairing ultimately determines the overall structure of the molecules.


Fundamental concepts

Base pairing

Top, an AT base pair demonstrating two intermolecular hydrogen bonds; bottom, a GC base pair demonstrating three intermolecular hydrogen bonds.

In molecular biology, two nucleotides on opposite complementary DNA or RNA strands that are connected via hydrogen bonds are called a base pair (often abbreviated bp). In canonical Watson-Crick base pairing in DNA, adenine (A) forms a base pair with thymine (T), and guanine (G) forms a base pair with cytosine (C). In RNA, thymine is replaced by uracil (U). Alternate hydrogen bonding patterns, such as the wobble base pair and Hoogsteen base pair, also occur, particularly in RNA, giving rise to complex and functional tertiary structures. Importantly, pairing is the mechanism by which codons on messenger RNA molecules are recognized by anticodons on transfer RNA during protein translation. Some DNA- or RNA-binding enzymes can recognize specific base pairing patterns that identify particular regulatory regions of genes.

Hydrogen bonding is the chemical mechanism that underlies the base-pairing rules described above. Appropriate geometrical correspondence of hydrogen bond donors and acceptors allows only the "right" pairs to form stably. DNA with high GC-content is more stable than DNA with low GC-content, but contrary to popular belief, the hydrogen bonds do not stabilize the DNA significantly and stabilization is mainly due to stacking interactions.[2]

The larger nucleobases, adenine and guanine, are members of a class of doubly ringed chemical structures called purines; the smaller nucleobases, cytosine and thymine (and uracil), are members of a class of singly ringed chemical structures called pyrimidines. Purines are only complementary with pyrimidines: pyrimidine-pyrimidine pairings are energetically unfavorable because the molecules are too far apart for hydrogen bonding to be established; purine-purine pairings are energetically unfavorable because the molecules are too close, leading to overlap repulsion. The only other possible pairings are GT and AC; these pairings are mismatches because the pattern of hydrogen donors and acceptors does not correspond. The GU wobble base pair, with two hydrogen bonds, does occur fairly often in RNA.
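The pairing rules in this section can be summarized in a small lookup, sketched below. The function name and the category labels are illustrative assumptions. For brevity the sketch treats DNA and RNA letters together, so it does not check that T and U are used in the right kind of molecule.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}

# Unordered pairs, so ("A", "T") and ("T", "A") are treated alike.
WATSON_CRICK = {frozenset("AT"), frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def classify_pair(x, y):
    """Classify two nucleobases according to the rules described above."""
    x, y = x.upper(), y.upper()
    for base in (x, y):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"unknown nucleobase: {base!r}")
    pair = frozenset((x, y))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    if x in PURINES and y in PURINES:
        return "purine-purine"          # too close: overlap repulsion
    if x in PYRIMIDINES and y in PYRIMIDINES:
        return "pyrimidine-pyrimidine"  # too far apart to hydrogen bond
    return "mismatch"                   # e.g. GT or AC: donors/acceptors differ


for a, b in [("A", "T"), ("G", "C"), ("G", "U"), ("G", "T"), ("A", "G"), ("C", "U")]:
    print(a, b, classify_pair(a, b))
```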

Nucleic acid hybridization

Two complementary regions of nucleic acid molecules will bind and form a double helical structure held together by base pairs.

Melting stability of base steps (B DNA)[3]

Step            Melting ΔG (kcal mol−1)
T A             -0.12
T G or C A      -0.78
C G             -1.44
A G or C T      -1.29
A A or T T      -1.04
A T             -1.27
G A or T C      -1.66
C C or G G      -1.97
A C or G T      -2.04
G C             -2.70

Hybridization is the process of complementary base pairs binding to form a double helix. Melting is the process by which the interactions between the strands of the double helix are broken, separating the two nucleic acid strands. These bonds are weak and are easily separated by gentle heating, enzymes, or physical force. Melting occurs preferentially at certain points in the nucleic acid.[4] T- and A-rich sequences are more easily melted than C- and G-rich regions. Particular base steps are also susceptible to DNA melting, particularly T A and T G base steps.[5] These mechanical features are reflected by the use of sequences such as TATAA at the start of many genes to assist RNA polymerase in melting the DNA for transcription.
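To illustrate how the base-step values in the table above relate to local melting, the sketch below sums the tabulated ΔG over the consecutive steps of one strand and reports the least stabilizing step. It is a rough illustration only: it assumes the table values can simply be added and ignores helix initiation, end effects, and salt conditions, so it is not a full melting-temperature model. The function names and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Melting dG per base step (kcal/mol), copied from the table above.
# Each step is listed once; its reverse complement has the same value.
UNIQUE_STEPS = {
    "TA": -0.12, "TG": -0.78, "CG": -1.44, "AG": -1.29, "AA": -1.04,
    "AT": -1.27, "GA": -1.66, "CC": -1.97, "AC": -2.04, "GC": -2.70,
}
STEP_DG = {}
for step, dg in UNIQUE_STEPS.items():
    STEP_DG[step] = dg
    STEP_DG[reverse_complement(step)] = dg  # e.g. TG <-> CA


def summed_step_dg(seq):
    """Sum the tabulated dG over all consecutive base steps of seq."""
    seq = seq.upper()
    return sum(STEP_DG[seq[i:i + 2]] for i in range(len(seq) - 1))


def weakest_step(seq):
    """Return (index, step, dG) for the least stabilizing (least negative) step."""
    seq = seq.upper()
    steps = [(i, seq[i:i + 2]) for i in range(len(seq) - 1)]
    i, step = max(steps, key=lambda item: STEP_DG[item[1]])
    return i, step, STEP_DG[step]


example = "GCTATAAGC"  # hypothetical sequence containing a TATAA stretch
print(round(summed_step_dg(example), 2))
print(weakest_step(example))
```

Under these assumptions the T A steps inside the TATAA stretch stand out as the weakest points, consistent with the text.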

Strand separation by gentle heating, as used in PCR, is simple, provided that the molecules have fewer than about 10,000 base pairs (10 kilobase pairs, or 10 kbp). The intertwining of the DNA strands makes long segments difficult to separate. The cell avoids this problem by allowing its DNA-melting enzymes (helicases) to work concurrently with topoisomerases, which can chemically cleave the phosphate backbone of one of the strands so that it can swivel around the other. Helicases unwind the strands to facilitate the advance of sequence-reading enzymes such as DNA polymerase.

Secondary structure motifs

Nucleic acid secondary structure is generally divided into helices (contiguous base pairs), and various kinds of loops (unpaired nucleotides surrounded by helices). Frequently these elements, or combinations of them, can be further classified, for example, tetraloops, pseudoknots, and stem-loops.

Double helix

The double helix is an important tertiary structure in nucleic acid molecules that is intimately connected with the molecule's secondary structure. A double helix is formed by regions of many consecutive base pairs.

The DNA double helix is a right-handed spiral polymer of nucleic acids, held together by base pairing between nucleotides. A single turn of the helix comprises about ten nucleotides, and contains a major groove and a minor groove, the major groove being wider than the minor groove.[6] Given the difference in widths of the two grooves, many proteins that bind to DNA do so through the wider major groove.[7] Many double-helical forms are possible; for DNA the three biologically relevant forms are A-DNA, B-DNA, and Z-DNA, while RNA double helices have structures similar to the A form of DNA.

Stem-loop structures

An example of an RNA stem-loop secondary structure

The secondary structure of nucleic acid molecules can often be uniquely decomposed into stems and loops. The stem-loop structure in which a base-paired helix ends in a short unpaired loop is extremely common and is a building block for larger structural motifs such as cloverleaf structures, which are four-helix junctions such as those found in transfer RNA. Internal loops (a short series of unpaired bases in a longer paired helix) and bulges (regions in which one strand of a helix has "extra" inserted bases with no counterparts in the opposite strand) are also frequent.
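The decomposition into stems and loops can be sketched for nested structures as follows. The sketch assumes the structure is given in dot-bracket notation (a common convention not introduced in this article), and the function names and example structure are hypothetical. It groups stacked base pairs into stems and reports the closing pairs of hairpin loops.

```python
def dot_bracket_to_pairs(structure):
    """Parse a nested dot-bracket string into a sorted list of (i, j) pairs."""
    stack, pairs = [], []
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % j)
            pairs.append((stack.pop(), j))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def find_stems(pairs):
    """Group pairs into stems: runs of stacked pairs (i, j), (i+1, j-1), ..."""
    pair_set = set(pairs)
    stems = []
    for i, j in sorted(pairs):
        if (i - 1, j + 1) in pair_set:
            continue  # not the outermost pair of its stem
        stem = [(i, j)]
        while (stem[-1][0] + 1, stem[-1][1] - 1) in pair_set:
            stem.append((stem[-1][0] + 1, stem[-1][1] - 1))
        stems.append(stem)
    return stems


def hairpin_closing_pairs(pairs):
    """Return pairs (i, j) that enclose only unpaired bases (hairpin loops)."""
    return [(i, j) for i, j in sorted(pairs)
            if not any(i < k < j for k, _ in pairs)]


# Hypothetical structure: two stems separated by a 1x1 internal loop,
# with the inner stem closing a 3-nt hairpin loop.
structure = "((.((...)).))"
pairs = dot_bracket_to_pairs(structure)
print(find_stems(pairs))             # [[(0, 12), (1, 11)], [(3, 9), (4, 8)]]
print(hairpin_closing_pairs(pairs))  # [(4, 8)]
```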

There are many secondary structure elements of functional importance to biological RNAs; some famous examples are the Rho-independent terminator stem-loops and the tRNA cloverleaf. Determining the secondary structure of RNA molecules is an active area of research. Approaches include both experimental and computational methods (see also the List of RNA structure prediction software).


Pseudoknots

This example of a naturally occurring pseudoknot is found in the RNA component of human telomerase. Sequence from Chen and Greider.[8]

A pseudoknot is an RNA secondary structure containing at least two stem-loop structures in which half of one stem is intercalated between the two halves of another stem. Pseudoknots fold into knot-shaped three-dimensional conformations but are not true topological knots. The base pairing in pseudoknots is not well nested; that is, base pairs occur that "overlap" one another in sequence position. This makes the presence of general pseudoknots in RNA sequences impossible to predict by the standard method of dynamic programming, which uses a recursive scoring system to identify paired stems and consequently cannot detect non-nested base pairs. Limited subclasses of pseudoknots can be predicted using dynamic programs such as that of Rivas and Eddy.[9] Newer structure prediction techniques such as stochastic context-free grammars also do not take pseudoknots into account.
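The "not well nested" condition can be stated precisely: two base pairs (i, j) and (k, l) cross when i < k < j < l. A minimal sketch of a check for such crossings is shown below; the function names and the example pair lists are illustrative assumptions.

```python
from itertools import combinations


def crossing_pairs(pairs):
    """Return all couples of base pairs (i, j), (k, l) with i < k < j < l."""
    normalized = sorted(tuple(sorted(p)) for p in pairs)
    crossings = []
    for (i, j), (k, l) in combinations(normalized, 2):
        # Sorting guarantees i <= k, so only this ordering needs checking.
        if i < k < j < l:
            crossings.append(((i, j), (k, l)))
    return crossings


def is_pseudoknotted(pairs):
    """True if at least two base pairs cross, i.e. the structure is not nested."""
    return bool(crossing_pairs(pairs))


nested = [(0, 11), (1, 10), (2, 9)]         # a plain hairpin
knotted = [(0, 8), (1, 7), (4, 12), (5, 11)]  # second stem starts inside the first loop

print(is_pseudoknotted(nested))   # False
print(is_pseudoknotted(knotted))  # True
```

The quadratic scan over all couples of pairs is adequate for illustration; it is not meant to be efficient for very long sequences.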

Several important biological processes rely on RNA molecules that form pseudoknots. For example, the RNA component of human telomerase contains a pseudoknot that is critical for activity.[8] Though DNA can also form pseudoknots, they are generally not present in biological DNA.

Secondary structure prediction

One application of bioinformatics uses predicted RNA secondary structures in searching a genome for noncoding but functional forms of RNA. For example, microRNAs have canonical long stem-loop structures interrupted by small internal loops. A general method of calculating probable RNA secondary structure is dynamic programming, although this has the disadvantage that it cannot detect pseudoknots or other cases in which base pairs are not fully nested. Other methods are based on stochastic context-free grammars. A web server that implements a type of dynamic programming is Mfold.
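As an illustration of the dynamic programming idea, the sketch below implements a Nussinov-style recursion that simply maximizes the number of nested base pairs. This is a deliberately simplified stand-in: it is not the energy-based scoring that practical tools use, and it is not a description of how Mfold works. The function names, the `min_loop` parameter (minimum number of unpaired bases in a hairpin loop), and the example sequence are assumptions made for the example.

```python
ALLOWED = {frozenset("AU"), frozenset("GC"), frozenset("GU")}


def can_pair(x, y):
    """Watson-Crick or GU wobble pairing for RNA letters."""
    return frozenset((x, y)) in ALLOWED


def nussinov(seq, min_loop=3):
    """Return (max number of nested pairs, one optimal list of (i, j) pairs)."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0, []
    # best[i][j] = max number of pairs within seq[i..j]; 0 for short spans.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                 # case 1: j is unpaired
            for k in range(i, j - min_loop):       # case 2: j pairs with k
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Traceback: recover one structure achieving the optimal score.
    pairs, stack = [], [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if can_pair(seq[k], seq[j]):
                left = best[i][k - 1] if k > i else 0
                if left + 1 + best[k + 1][j - 1] == best[i][j]:
                    pairs.append((k, j))
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return best[0][n - 1], sorted(pairs)


print(nussinov("GGGAAAACCC"))  # (3, [(0, 9), (1, 8), (2, 7)])
```

Because each subproblem covers a contiguous interval of the sequence, the recursion can only combine nested substructures, which is why crossing (pseudoknotted) pairs are out of its reach. When several structures tie for the best score, the traceback returns just one of them.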

For many RNA molecules, the secondary structure is highly important to the correct function of the RNA, often more so than the actual sequence. This fact aids in the analysis of non-coding RNA, sometimes termed "RNA genes". RNA secondary structure can be predicted with some accuracy by computer, and many bioinformatics applications use some notion of secondary structure in the analysis of RNA.

References


  1. Dirks, Robert M.; Lin, Milo; Winfree, Erik; Pierce, Niles A. (2004). "Paradigms for computational nucleic acid design". Nucleic Acids Research 32 (4): 1392–1403. doi:10.1093/nar/gkh291. PMC 390280. PMID 14990744.
  2. Yakovchuk, Peter; Protozanova, Ekaterina; Frank-Kamenetskii, Maxim D. (2006). "Base-stacking and base-pairing contributions into thermal stability of the DNA double helix". Nucleic Acids Research 34 (2): 564–574. doi:10.1093/nar/gkj454. PMID 16449200.
  3. Protozanova E, Yakovchuk P, Frank-Kamenetskii MD (2004). "Stacked–Unstacked Equilibrium at the Nick Site of DNA". J Mol Biol 342 (3): 775–785. doi:10.1016/j.jmb.2004.07.075. PMID 15342236.
  4. Breslauer KJ, Frank R, Blöcker H, Marky LA (1986). "Predicting DNA duplex stability from the base sequence". PNAS 83 (11): 3746–3750. doi:10.1073/pnas.83.11.3746. PMC 323600. PMID 3459152.
  5. Owczarzy, Richard (2008-08-28). "DNA melting temperature - How to calculate it?". High-throughput DNA biophysics. Retrieved 2008-10-02.
  6. Alberts et al. (1994). The Molecular Biology of the Cell. New York: Garland Science. ISBN 978-0815341055.
  7. Pabo C, Sauer R (1984). "Protein-DNA recognition". Annu Rev Biochem 53: 293–321. doi:10.1146/ PMID 6236744.
  8. Chen JL, Greider CW (2005). "Functional analysis of the pseudoknot structure in human telomerase RNA". Proc Natl Acad Sci USA 102 (23): 8080–5. doi:10.1073/pnas.0502259102.
  9. Rivas E, Eddy SR (1999). "A dynamic programming algorithm for RNA structure prediction including pseudoknots". J Mol Biol 285: 2053.

Wikimedia Foundation. 2010.
