Section 2: Phylogenetic analysis

Phylogenetic analysis is the examination of natural (phyletic, evolutionary) relationships amongst a group of organisms. In practice, this usually means creating some kind of phylogenetic graph showing these relationships.

Molecular phylogenetic analysis is the use of macromolecular structure (usually sequences) to reconstruct the phylogenetic relationships between organisms. In other words, the extent of difference between homologous DNA, RNA, or protein sequences in different organisms is used as a measure of how much these organisms have diverged from one another in evolutionary history.
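As a rough illustration of "extent of difference" as a measure of divergence, the sketch below computes the simplest such measure: the fraction of aligned positions at which two sequences differ (often called the p-distance). The sequences and organism names are made up for the example, and the function name `p_distance` is just a label chosen here. Real analyses start from a proper alignment and usually apply a correction for multiple substitutions at the same site.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of comparable aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore alignment columns where either sequence has a gap ('-').
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable positions")
    mismatches = sum(1 for a, b in pairs if a != b)
    return mismatches / len(pairs)


# Hypothetical, already-aligned sequences (invented for illustration only).
aligned = {
    "organism_A": "ACGTACGTAC",
    "organism_B": "ACGTACGTTC",
    "organism_C": "ACGAACTTTC",
}

# Pairwise distances: smaller values mean more similar sequences.
distances = {
    (x, y): p_distance(aligned[x], aligned[y])
    for x, y in combinations(aligned, 2)
}

for (x, y), d in distances.items():
    print(f"{x} vs {y}: {d:.2f}")
```

In this toy data set, organism_A and organism_B differ at the fewest positions, so on this measure alone they would be treated as the most closely related pair.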

The typical scenario in which a phylogenetic analysis is needed is the phylogenetic characterization of a novel organism: for example, determining its phylogenetic placement (“phylotype”) in order to make predictions about its unknown properties. This might be a clinical isolate of a potential pathogen, an organism that carries out some useful biochemistry, an organism that seems to be abundant in an interesting environment, or … well, anything.

We'll go through how a phylogenetic analysis is performed in great detail, then go over trees: how they work, what they mean, and some of the problems with them. There are four purposes for doing this:

  1. to demonstrate that phylogenetic trees are not magic (they are straightforward transformations of sequences into a graph showing how they are related by similarity), so that you can interpret them appropriately,

  2. to show you how this process is done so that you can do it yourself,

  3. to introduce you to some of the complications and alternative processes, so that you can recognize the importance of different approaches and parameters, and

  4. to make you aware of some of the most important pitfalls in the process, so that you can think critically about phylogenetic trees.