Exons, Introns, Codons, & their equivalents
Three common technical terms in
molecular genetics, exon,
intron, and codon, have specific
technical definitions, but are often misused in
hurried or shorthand presentations. The main thing to
remember is that exons and
introns are features of DNA,
whereas codons are
features of mRNA. Homologous sequences
in the other type of nucleic acid need to be called something
else; otherwise there is a danger that the roles of DNA and
RNA in the Central Dogma ("DNA makes
RNA makes Protein")
will be confused.
By definition, exons are
sequences in a protein-coding gene region
of a double-stranded DNA molecule (dsDNA)
that are expressed as protein, and introns are
intervening sequences that are not so
expressed. The exons and introns are typically shown as
the single-stranded sequences of the Sense Strand
of the dsDNA, written 5'-3', left to
right.
Transcription of the
complementary Template Strand produces a heterogeneous
nuclear RNA (hnRNA) that is identical (co-linear)
in 5'-3' orientation and base sequences to the DNA
Sense Strand, with the substitution of U for
T. The RNA sequences equivalent to the DNA
exons and introns are sometimes themselves referred
to as "exons" and "introns"; however, this
is technically incorrect and confuses their functional
role in transcription and translation with exons and
introns as gene sequences in DNA. The RNA sequences
equivalent to DNA exons and introns can instead be
referred to as "exon transcripts" and "intron
transcripts," respectively, or as exon and intron
"equivalents."
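The relationship between the two DNA strands and the hnRNA can be checked with a short Python sketch. The sequence and the function names below are made up for illustration, and the sketch assumes an unambiguous sequence written 5'-3' using only A, C, G, and T.

```python
# Watson-Crick complements for DNA bases.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# RNA base that pairs with each DNA Template Strand base during transcription.
RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_strand(sense):
    """Return the Template Strand (written 5'-3') for a Sense Strand written 5'-3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(sense))


def transcribe(template):
    """Return the RNA (written 5'-3') made from a Template Strand written 5'-3'."""
    return "".join(RNA_PARTNER[base] for base in reversed(template))


sense = "ATGGCCTAA"               # hypothetical Sense Strand, 5'-3'
template = template_strand(sense)  # complementary strand, also written 5'-3'
rna = transcribe(template)

print(template)
print(rna)
# The transcript matches the Sense Strand with U substituted for T.
print(rna == sense.replace("T", "U"))
```

The transcript is built from the Template Strand only, yet it reads the same as the Sense Strand, which is why the Sense Strand is the one conventionally shown.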
Processing of the hnRNA to mRNA
involves excision ('splicing out') of the
intron transcripts and ligation of the remaining exon transcripts.
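As a rough illustration of this processing step, the sketch below removes one intron transcript from a toy hnRNA and joins what is left. The sequence, the coordinates, and the `splice` helper are all hypothetical; real splicing depends on splice-site signals that are not modeled here.

```python
def splice(hnrna, intron_spans):
    """Excise intron transcripts and join the remaining exon transcripts.

    intron_spans is a list of (start, end) pairs, 0-based and half-open,
    giving the assumed positions of intron transcripts in the hnRNA.
    """
    pieces = []
    position = 0
    for start, end in sorted(intron_spans):
        pieces.append(hnrna[position:start])  # exon transcript before this intron
        position = end                        # skip over the intron transcript
    pieces.append(hnrna[position:])           # final exon transcript
    return "".join(pieces)


exon1_transcript = "AUGGCC"
intron_transcript = "GUAAGUUUCAG"
exon2_transcript = "GAUUAA"
hnrna = exon1_transcript + intron_transcript + exon2_transcript

# The intron transcript occupies positions 6 to 17 of this toy hnRNA.
mrna = splice(hnrna, [(6, 17)])
print(mrna)
```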
Once the final mRNA is formed, translation
is the process of reading (as amino acids) a series
of three-base sequences called codons. Codons
are read according to the Genetic Code, which is
an RNA code. Because the mRNA region is
equivalent to a DNA exon, the same
three-base series can be identified in the Sense
Strand (substituting T for U).
The three-base DNA motifs are sometimes called "codons";
however, this is again technically incorrect and confuses
the information content of Genes with
the function of RNA in the Genetic
Code. The DNA equivalents to codons can be
referred to as 'triplets.'
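The correspondence between mRNA codons and DNA triplets can be sketched in a few lines of Python. The sequence is a made-up example, and `split_in_threes` is a hypothetical helper that assumes the reading frame starts at the first base.

```python
def split_in_threes(sequence):
    """Split a sequence into consecutive three-base units, dropping any incomplete tail."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


mrna = "AUGGCCGAUUAA"                 # hypothetical mRNA, 5'-3'
sense_exons = mrna.replace("U", "T")  # equivalent Sense Strand exon sequence

codons = split_in_threes(mrna)           # three-base units in the mRNA
triplets = split_in_threes(sense_exons)  # their DNA equivalents

print(codons)
print(triplets)
# Each triplet matches its codon once T is replaced by U.
print([t.replace("T", "U") for t in triplets] == codons)
```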
In bioinformatics, the
64 triplets are sometimes presented as a "translation
table" that can be used directly with the DNA
Sense Strand sequence to infer the protein sequence.
This is practical, except that "translation" here
means 'deciphering of coded information,' which is not
the same as the molecular process of mRNA translation.
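A minimal sketch of such a table in Python follows. It assumes the standard genetic code, written with DNA triplets in T, C, A, G order, with '*' marking stop triplets. The names `TRIPLET_TABLE` and `infer_protein` are illustrative and do not come from any particular bioinformatics package.

```python
from itertools import product

BASES = "TCAG"
# One-letter amino acids for the 64 triplets in TCAG order (standard code); '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair each of the 64 DNA triplets with its amino acid.
TRIPLET_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def infer_protein(sense_strand):
    """Infer a protein sequence directly from a DNA Sense Strand (exons only, 5'-3').

    Reading starts at the first base and ends at the first stop triplet.
    """
    protein = []
    for i in range(0, len(sense_strand) - 2, 3):
        amino_acid = TRIPLET_TABLE[sense_strand[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(len(TRIPLET_TABLE))
print(infer_protein("ATGGCCGATTAA"))
```

The lookup works on the DNA sequence without ever constructing an mRNA, which is exactly why it is a deciphering shortcut rather than a model of the molecular process.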
There's an app for that: see SM
Carr, HT Wareham & D Craig. 2014. A web
application for generation of DNA sequence
exemplars with open and closed reading frames in
genetics and bioinformatics education. CBE –
Life Sciences Education 13, 373-374, which
reviews this and includes an app that renders dsDNA as
protein sequences.
Figure & Text ©
2024 by Steven M. Carr