Alignment Problem Set Answer Key


Video clips

PLEASE NOTE that these video clips are very crude - not even rough drafts, more like prototypes. They were recorded in one take, in an hour between classes, in poor lighting, and aren't edited or polished in any way. But because I've had (predictably) some questions about how to get started with alignments, I'm posting these anyway. Perhaps I can make some nice ones soon, but more likely I'll focus on getting some clips made for the tree construction & interpretation problem sets.


  1. Align the following two sequences: Sequence A = GGGCUUCCGGCCACA and Sequence B = GCGCUUCCGGGCGCA
         
    No gaps are required; these just line up!
         Sequence A = GGGCUUCCGGCCACA
         Sequence B = GCGCUUCCGGGCGCA
  2. Now add the following sequence to this alignment: Sequence C = GUGCUUCGGACGC
         
    The first gap could go in either of two places;
    either is as good as the other.
    Sequence A = GGGCUUCCGGCCACA
    Sequence B = GCGCUUCCGGGCGCA
    Sequence C = GUGCUUC-GGACGC-   OR
    Sequence C = GUGCUU-CGGACGC-
  3. Now add the following sequence to this alignment: Sequence D = GGCUUACGGUCACA
           
    This one is easy, but this additional sequence shows which of the
    two good alignments of sequence C is best.
                  
    Sequence A = GGGCUUCCGGCCACA
    Sequence B = GCGCUUCCGGGCGCA
    Sequence C = GUGCUUC-GGACGC-
    Sequence D = G-GCUUACGGUCACA
  4. Align the following sequences:
               
       Sequence A : GGAGCAGUCCGUGGAUC
       Sequence B : UAGGAGCAGCCGUGGAUC      
       Sequence C : GGAGCAGGCCGCGGUACC
          
    This one's trickier because they don't start at the same place.  
    Why do you think the middle gaps work best where they are?                      
    Sequence A : --GGAGCAGUCCGUGG-AUC
    Sequence B : UAGGAGCAG-CCGUGG-AUC
    Sequence C : --GGAGCAGGCCGCGGUACC
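
One way to see why gap placement matters is to count the columns in which every sequence agrees. The sketch below does this for the alignment above; the helper name conservation_line is hypothetical and the "all rows identical, no gaps" rule is just one simple assumption about what counts as a conserved column.

```python
def conservation_line(rows, gap="-"):
    """Return a string with '*' under each column where all rows share one base."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all alignment rows must have the same length")
    marks = []
    for column in zip(*rows):
        # A column is conserved when it holds a single distinct symbol
        # and that symbol is not the gap character.
        conserved = len(set(column)) == 1 and column[0] != gap
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Alignment rows copied from the answer to problem 4.
alignment = {
    "Sequence A": "--GGAGCAGUCCGUGG-AUC",
    "Sequence B": "UAGGAGCAG-CCGUGG-AUC",
    "Sequence C": "--GGAGCAGGCCGCGGUACC",
}

for name, row in alignment.items():
    print(f"{name} : {row}")
line = conservation_line(list(alignment.values()))
print(" " * len("Sequence A : ") + line)
print("fully conserved columns:", line.count("*"))
```

Trying a different gap position and rerunning the count is a quick way to compare candidate alignments.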
  5. Align the following sequences:
         
         Sequence A : CUCGAGUUAACCCGGCACCCG
         Sequence B : GCUCGGGUUAACACGGACCCG
         Sequence C : UCGAGCCAACUCGGACCCG
         
         Don't be fooled by the easy ones!          
    Sequence A : -CUCGAGUUAACCCGGCACCCG
    Sequence B : GCUCGGGUUAACACGG-ACCCG
    Sequence C : --UCGAGCCAACUCGG-ACCCG
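
A quick sanity check for any hand-made alignment: every row must be the same length, and removing the gaps from a row must give back the original sequence. The sketch below applies both checks to the answer above; the function name check_alignment is hypothetical.

```python
def check_alignment(originals, aligned, gap="-"):
    """Return True if rows are equal length and degap to the original sequences."""
    lengths = {len(row) for row in aligned.values()}
    if len(lengths) != 1:
        return False  # rows of different lengths cannot form columns
    return all(
        aligned[name].replace(gap, "") == sequence
        for name, sequence in originals.items()
    )


# Sequences and aligned rows copied from problem 5.
originals = {
    "A": "CUCGAGUUAACCCGGCACCCG",
    "B": "GCUCGGGUUAACACGGACCCG",
    "C": "UCGAGCCAACUCGGACCCG",
}
aligned = {
    "A": "-CUCGAGUUAACCCGGCACCCG",
    "B": "GCUCGGGUUAACACGG-ACCCG",
    "C": "--UCGAGCCAACUCGG-ACCCG",
}

print("alignment is consistent:", check_alignment(originals, aligned))
```

This only confirms that no bases were dropped, added, or changed while inserting gaps; it says nothing about whether the gaps are in the best places.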
  6. Align the following sequences (note that these are in Fasta format, commonly used for the electronic transfer of sequence data):
             
             >tRNA-A 
             GGGCUCAUAGCUCAGCGGUAGAGUGCCUCCUUUGCAAGGAGGAUGCCCUGGGUUCGAAUCCCAGUGAGUCCA
             >tRNA-B 
             GGGCUCAUCGCUCAGCGGUAGAGUGCCUCCCUUGCAAGGAGGAUGCCCUGGGUUCGAAUCCCAGUGAGUCCA
             >tRNA-C 
             GGGCUCGUAGCUCAGCGGGAGAGCGCCGCCUUUGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCA
             >tRNA-D 
             GGGCUCGUAGCUCAGCGGGAGAGCGCCGCCUUCGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCA
             >tRNA-E 
             GGGCCGGUAGCUCAGUCUGGUAGAGCGUCGCCUUGGCAUGGCGAAGGCCGGGGUUCAAAUCCCCACCGGU
             
             Long alignments are no different than short ones.          
    tRNA-A GGGCUCAUAGCUCAGC--GGUAGAGUGCCUCCUUUGCAAGGAGGAUGCCCUGGGUUCGAAUCCCAGUGAGUCCA
    tRNA-B GGGCUCAUCGCUCAGC--GGUAGAGUGCCUCCCUUGCAAGGAGGAUGCCCUGGGUUCGAAUCCCAGUGAGUCCA
    tRNA-C GGGCUCGUAGCUCAGC--GGGAGAGCGCCGCCUUUGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCA
    tRNA-D GGGCUCGUAGCUCAGC--GGGAGAGCGCCGCCUUCGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCA
    tRNA-E GGGCCGGUAGCUCAGUCUGGUAGAGCGUCGCCUUGGCAUGGCGAAGGCC-GGGGUUCAAAUCCCCACCGGU---
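
Since the sequences in this problem arrive in Fasta format, here is a minimal sketch of reading that format by hand. The function name parse_fasta is hypothetical, and the parser assumes the simple layout used above (a ">" name line followed by one or more sequence lines). Only two of the five records are included to keep the example short.

```python
def parse_fasta(text):
    """Parse Fasta-formatted text into a dict of {name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()  # header line: everything after '>'
            records[name] = ""
        elif name is not None:
            records[name] += line  # sequence may span several lines
    return records


fasta_text = """
>tRNA-A
GGGCUCAUAGCUCAGCGGUAGAGUGCCUCCUUUGCAAGGAGGAUGCCCUGGGUUCGAAUCCCAGUGAGUCCA
>tRNA-E
GGGCCGGUAGCUCAGUCUGGUAGAGCGUCGCCUUGGCAUGGCGAAGGCCGGGGUUCAAAUCCCCACCGGU
"""

records = parse_fasta(fasta_text)
for name, sequence in records.items():
    print(name, len(sequence))
```

Comparing the printed lengths is a useful first step: sequences of different lengths cannot line up without gaps somewhere.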
  7. Draw the secondary structures of the sequences in this alignment:
                                
        
          A A                 ( ( ( ( - - - - ) ) ) ) - -
        G     A       Seq V   G G G G G A U A C U U C U A
         C - G        Seq W   A U G C U U C G G C A U U A
         U - A        Seq X   G U U U U U U - A A G C U A
         G - C        Seq Y   G G C - C U U G - G C C - A       
         G - C U A    Seq Z   U U U U U U U U A A A A A A
                              | | | |         | | | |
                              | | | +---------+ | | |
                              | | +-------------+ | |
                              | +-----------------+ |
                              +---------------------+

    Since you know which bases pair with which in the reference, you also know it for all of them:

      A U           U C            U                          U U
    G     A       U     G       U     U         U U         U     U
     G - C         C - G         U - A        C     G        U - A
     G - U         G - C         U - A         C - G         U - A
     G - U         U - A         U - G         G - C         U - A
     G - C U A     A - U U A     G - C U A     G - C A       U - A A A
     Seq V         Seq W         Seq X         Seq Y         Seq Z
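
The same bookkeeping can be sketched in code: match each "(" in the structure mask with its ")" to get the paired columns, then read off the bases in those columns for each sequence. The function name pair_columns is hypothetical, and the sketch assumes a simple mask with no crossing pairs, as in this problem. A pair is skipped when either column holds a gap.

```python
def pair_columns(mask):
    """Match '(' with ')' in a structure mask and return (i, j) column pairs."""
    stack, pairs = [], []
    for j, symbol in enumerate(mask):
        if symbol == "(":
            stack.append(j)  # remember the opening column
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced mask: extra ')'")
            pairs.append((stack.pop(), j))  # pairs with the latest open '('
    if stack:
        raise ValueError("unbalanced mask: extra '('")
    return sorted(pairs)


# Mask and rows copied from problem 7, with the spaces between columns removed.
mask = "((((----))))--"
alignment = {
    "Seq V": "GGGGGAUACUUCUA",
    "Seq W": "AUGCUUCGGCAUUA",
    "Seq X": "GUUUUUU-AAGCUA",
    "Seq Y": "GGC-CUUG-GCC-A",
    "Seq Z": "UUUUUUUUAAAAAA",
}

for name, row in alignment.items():
    pairs = [
        f"{row[i]}-{row[j]}"
        for i, j in pair_columns(mask)
        if row[i] != "-" and row[j] != "-"
    ]
    # Pairs are listed from the outermost (bottom of the stem) inward.
    print(name, " ".join(pairs))
```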
  8. Create an alignment of the following RNA structures:
             
             This is just the reverse of the previous problem.
                      
                                       A C           U U
         U C           U C C         U     G       C     G
       U     G       G       A        G - C         G - C         U
        C - G          G - C          C - G         C - G      U     U 
        U - G          C - G               A        U - A       U - A
        C - G          A - U          A - U         C - G       U - A
        C - G U      A G - C U A    A G - C U A     C - G U     C - G U A
    
    
             - ( ( ( ( ( - - - - - ) ) ) - ) ) - -
         #1  - C C U C - - U U C G - G G - G G U -
         #2  A G A C G - G U C C A - C G - U C U A
         #3  A G A C G - - U A C G - C G A U C U A
         #4  - C C U C G - C U U G C G A - G U U -
         #5  - C U U - - - U U U - - - A - A G U A
               | | | | |           | | |   | |
               | | | | +-----------+ | |   | |
               | | | +---------------+ |   | |
               | | +-------------------+   | |
               | +-------------------------+ |
               +-----------------------------+ 
  9. Add the following RNA structure to the preexisting alignment:
        U C   
      U     G      U                  ( ( ( ( - - - - ) ) ) ) - - ( ( ( - - - - ) ) ) - - -
       C - G    U     U       Seq W   C U U C G A G A G A A G - G C U C U U C G G A G U A U
       U - G     U - A        Seq X   C C C - U G C - - G G G U G C G C U U C - G C G U A -
       C - G     U - A        Seq Y   U C C - C U U C - G G G - A C U C C U U G G G G U G C
       C - G U A C - G U A    Seq Z   G A A G G A G A U U U U U G G G G U U U A C C C U A G
                              Seq ?   C C U C U U C G G G G G U A C U U U U U - A A G U A -
                                      | | | |         | | | |     | | |         | | |
                                      | | | +---------+ | | |     | | +---------+ | |
                                      | | +-------------+ | |     | +-------------+ |
                                      | +-----------------+ |     +-----------------+
                                      +---------------------+   
Use the basepairs to guide where this sequence goes - it's easy!