The Genetic Code


The Central Dogma: DNA makes RNA makes protein

 

In principle: The DNA genotype does not produce the phenotype directly
   
A DNA gene contains the information necessary for the production of proteins,
        which is expressed biochemically through an intermediate molecule, RNA,
        which functions as a Genetic Code


The Genetic Code ...
    is an RNA code
    specifies amino acids that make up proteins
         Protein expression leads directly (or indirectly) to the phenotype

    Allows logical inference of the protein product directly from DNA:
        see next section, and lab exercise
    was "cracked" before the details of translation were understood:
         we can talk about the Code before describing RNA translation

Alternative alleles of genes arise by mutation
     which alters the DNA sequence of genes  
         which may cause amino acid substitutions in proteins
             which may affect the function of those proteins
     Most genes are highly polymorphic



The Genetic Code is ... 

        a messenger RNA (mRNA) code
            i.e., the code is written in RNA
            DNA is a coding molecule,
                    but not  the 'genetic code' in the biochemical sense

        in 64 triplets (codons) : 61 for amino acids + 3 'stops' [iG1 7.19]
               mRNA codons are read 5' → 3'
               20 amino acids:  note 1- & 3-letter abbreviations
                                              [more on amino acids & proteins in next section]
                For example,

      5' - AUG UUC CCC AAG GGU UGA - 3'
           met phe pro lys gly  *
            M   F   P   K   G   *
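
Reading a message codon by codon can be sketched in a few lines of Python. This is a minimal illustration, not part of the course material: the names BASES, AMINO_ACIDS, CODON_TABLE, and translate are hypothetical, the table is the standard nuclear code in 1-letter abbreviations ('*' marks a stop), and the message is assumed to be the six-codon example met-phe-pro-lys-gly-stop.

```python
from itertools import product

# Standard (nuclear) genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
# One 1-letter amino acid symbol per codon; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string 5'->3', triplet by triplet, stopping at a stop codon."""
    mrna = mrna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


print(translate("AUG UUC CCC AAG GGU UGA"))  # MFPKG*
```

The table is built from a 64-character string rather than typed out codon by codon, which keeps the sketch short; any trailing bases that do not fill a complete triplet are ignored.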

       Degenerate: most amino acids are encoded by more than one codon
            first two positions are critical: third position can "wobble" [see next section]
                  if third can be either puRine (R: A or G), or either pYrimidine (Y: C or U)
                      two-fold degeneracy
                  if third can be any base 
                      four-fold degeneracy
                  Leucine (leu) has six-fold degeneracy with six codons in unusual arrangement

 

            amino acid(s)      # codons / amino acid

            trp, met           1 each
            ser, arg, leu      6 each
            ile                3
            14 others          2 or 4 each
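
The codon counts tabulated above can be checked by counting entries in the code table. The following is a sketch under the same assumptions as any hand-built table: the names are hypothetical, and the 64-character string encodes the standard nuclear code in 1-letter abbreviations.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard (nuclear) genetic code; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons that specify each amino acid (stop codons excluded).
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# Group the amino acids by their degree of degeneracy.
by_degeneracy = defaultdict(list)
for aa, n in sorted(codons_per_aa.items()):
    by_degeneracy[n].append(aa)

for n in sorted(by_degeneracy):
    print(n, "codon(s):", " ".join(by_degeneracy[n]))
```

Because each codon is a dictionary key with exactly one value, the table is unambiguous by construction, while the counts show that the reverse mapping (amino acid to codon) is one-to-many.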

     Unambiguous: any one triplet codes for only one amino acid
                but not vice versa, because of wobble

        'Always' begins with a 'start' or 'initiator' codon:  AUG

        'Always' ends with a 'stop' or 'terminator' codon:  UAG, UAA, or UGA

     Universal (with some important exceptions)
            Five Kingdoms (animals, plants, algae, fungi, & monera)
                        use the same codes for nuclear DNA (nucDNA)

                Organelles (chloroplasts & mitochondria) have separate genomes:
                cpDNA & mitochondrial DNA codes are evolutionarily modified
                   e.g., UGA codes for trp in vertebrate mtDNA code  [iG1 7.Table 2]
                             Stop codons may be formed by addition of "A"s to transcript
                             Lab exercises use mtDNA, so the mtDNA code is important
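
A modified code can be represented as the standard table plus a few reassigned codons. The sketch below is illustrative only: the names are hypothetical, and only the UGA = trp reassignment is stated in these notes. The other three changes (AUA = met, AGA and AGG = stop) are taken from the standard vertebrate mitochondrial table (NCBI translation table 2) and should be checked against the course table.

```python
from itertools import product

# Standard (nuclear) genetic code; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Vertebrate mtDNA code as a set of overrides on the standard code.
# UGA -> trp is from the notes; the rest are assumed from NCBI table 2.
VERTEBRATE_MT = {**STANDARD, "UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}


def translate(mrna, table):
    """Translate 5'->3' with the given codon table, stopping at a stop codon."""
    mrna = mrna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = table[mrna[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


message = "AUG UGA AUA AGA"  # made-up message chosen to hit the reassigned codons
print(translate(message, STANDARD))       # M*
print(translate(message, VERTEBRATE_MT))  # MWM*
```

The same message terminates after one amino acid under the nuclear code but reads through UGA under the mtDNA code, which is why the choice of table matters when working with mtDNA sequences.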


Alteration & Variation in the Genetic Code: Mutations & SNPs

    Mutations - interchanges of one base type for another
        transitions   - alternative pyrimidines [ C ↔ T ]  or purines [ A ↔ G ]
        transversions - purine ↔ pyrimidine [ C / T ↔ A / G ]
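
The transition / transversion distinction reduces to asking whether the two bases belong to the same chemical class. A minimal sketch, with hypothetical names and accepting either T (DNA) or U (RNA) as a pyrimidine:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}  # T in DNA, U in RNA


def classify_substitution(old, new):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    old, new = old.upper(), new.upper()
    for base in (old, new):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a nucleotide: {base!r}")
    if old == new:
        raise ValueError("no substitution: bases are identical")
    # Same class (purine<->purine or pyrimidine<->pyrimidine) is a transition.
    same_class = (old in PURINES) == (new in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("C", "T"))  # transition
print(classify_substitution("A", "C"))  # transversion
```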

     Recognized in individuals & populations as SNPs ("snips": single nucleotide polymorphisms)
                [SNPs, Mutations, & Mutants: a note on terminology & some lessons from history]

        Alternative nucleotide sequences of a gene correspond to alternative alleles
             or: a single gene occurs in variant forms (alleles)

  Single-base mutations
        Consequences of exon SNPs depend on position in triplet

            3rd position
                 typically a silent mutation - if position "wobbles", no change to amino acid
                 sometimes a mis-sense mutation - results in different amino acids

           2nd position - always a missense mutation
           1st position - almost always a missense replacement
                                      [Leu codons are major exception]
            stop codon mutations may occur at any position: coding → non-coding triplet
                non-sense (termination) mutations terminate polypeptides prematurely
                HOMEWORK #8: Identify all codons one step away from a termination codon

        mutations in non-coding DNA have variable effects
               Ex.: mutations in promoter regions
                       mutations at intron / exon splice junctions
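
The dependence of a coding SNP's consequence on its position in the triplet can be explored by tallying every possible single-base change to every sense codon. This is a sketch under the standard nuclear code, with hypothetical names; it reports counts only.

```python
from collections import Counter
from itertools import product

# Standard (nuclear) genetic code; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def consequence(codon, mutant):
    """Classify a change from one sense codon to another triplet."""
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if after == before:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"


# tally[position] counts the outcomes of all single-base changes at that position.
tally = {pos: Counter() for pos in (1, 2, 3)}
for codon, aa in CODON_TABLE.items():
    if aa == "*":
        continue  # start from the 61 sense codons only
    for pos in (1, 2, 3):
        for base in BASES:
            if base != codon[pos - 1]:
                mutant = codon[:pos - 1] + base + codon[pos:]
                tally[pos][consequence(codon, mutant)] += 1

for pos in (1, 2, 3):
    print("position", pos, dict(tally[pos]))
```

Each position receives 61 x 3 = 183 possible changes. Silent changes are concentrated at the 3rd position, none occur at the 2nd, and the few at the 1st position involve the leu and arg codons.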

Mis-sense mutations in DNA cause substitutions in protein
   Proteins do not mutate!
      Consequences depend on position of substitution in polypeptide
        none:  substitution not in active site or binding site
        minor: substitution by an amino acid of the same chemical type (conservative substitution)
             Allozymes are enzymes arising from allelic variation of enzyme genes
                    [see Lab Exercise #2]
        major: substitution affects structure / function (non-conservative substitution)
             Ex.: Glu → Val in beta-globin produces Sickle-cell hemoglobin (HbS)
                         HOMEWORK #9: What is the DNA mutation involved?

Insertion / Deletion ("indel") mutations
        gain or loss of one or more nucleotides alters the reading frames
        frameshift mutations  (examples)
              single & double nucleotide indels → downstream amino acids change
                    non-sense mutation eventually (quickly) produced
              triplet indel - insertion / deletion of single amino acid
                   typically milder consequences
                   multiple triplet insertions produce major effects
                       Ex.: CGG repeats in "Fragile X" Syndrome
         length mutations - very large indels (10^2 ~ 10^6 bps)
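
The contrast between a frameshift and a triplet indel can be seen by deleting bases from a short message and retranslating. The sketch below uses hypothetical names, the standard nuclear code, and an assumed six-codon message (met-phe-pro-lys-gly-stop) chosen for illustration.

```python
from itertools import product

# Standard (nuclear) genetic code; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate 5'->3' in the first reading frame, stopping at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


def delete(mrna, start, length):
    """Return the message with `length` bases removed, beginning at index `start`."""
    return mrna[:start] + mrna[start + length:]


message = "AUGUUCCCCAAGGGUUGA"

print(translate(message))                # MFPKG*  original
print(translate(delete(message, 3, 1)))  # MSPRV   single-base deletion: frameshift
print(translate(delete(message, 3, 3)))  # MPKG*   triplet deletion: one amino acid lost
```

In this short message the single-base deletion changes every downstream amino acid and also shifts the original stop codon out of frame, whereas the triplet deletion removes one amino acid and leaves the rest of the protein intact.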


Genes are highly polymorphic (w/ multiple alleles) wrt their SNP variation
       
[Concept of "wild type" allele is erroneous]

        Phenylalanine Hydroxylase (PAH) (OMIM citation 261600)
             has 14 exons, encodes 2.4kb mRNA for 452 amino acid protein

        Among 68 alleles that affect enzymatic activity of PAH  [GenBank List]
                68% mis-sense SNPs (many produce Phenylketonuria (PKU))
                13% non-sense SNPs (premature termination)
                  9% indel SNPs (from a single base, through 1~5 triplets, to a whole exon)
                10% splice-site SNPs (including most common variant allele)
              
        Most allelic variants of the PAH locus are 3rd position silent:
                no effect on PAH expression
                & therefore undetected



Homework #10:
     (1) "What is a Gene?" Write a one-paragraph essay that distinguishes Gene, Allele, and Locus
     (2) Critique the following statements:
            "PAH is the gene for Phenylketonuria (PKU)."
            "PKU is a genetic disease caused by absence of the PAH gene."


Text material ©2016 by Steven M. Carr