WIBR Bioinformatics and Research Computing

Comparative Protein Analysis (2005)

URL: /education/bioinfo2005/proteins


Session 1: Phylogenetic Trees and Multiple Sequence Alignments      

"Bootstrapping involves taking random samples of positions from the alignment. If the alignment has N positions, each bootstrap sample consists of a random sample of N positions, taken with replacement, i.e., in any given sample, some sites may be sampled several times, others not at all. Then, with each sample of sites, you calculate a distance matrix as usual and draw a tree. If the data very strongly support just one tree, the sample trees will be very similar to each other and to the original tree, drawn without bootstrapping. However, if parts of the tree are not well supported, then the sample trees will vary considerably in how they represent these parts. In practice, you should use a very large number of bootstrap replicates (we will use 100). For each grouping on the tree, you record the number of times this grouping occurs in the sample trees. For a group to be considered 'significant' at the 95% level (or P <= 0.05 in statistical terms) you expect the grouping to show up in >95% of the sample trees. If this happens, then you can say that the grouping is significant, given the data set and the method used to draw the tree."
(Ref: Molecular Phylogeny at Langara College)
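
The procedure quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by any particular phylogeny package: the quote does not say how distances are computed or how the tree is drawn, so the sketch assumes a simple p-distance (fraction of differing sites) and UPGMA clustering. The function names and the toy sequences are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations

def bootstrap_columns(alignment, rng):
    """Resample N alignment columns with replacement (one bootstrap sample)."""
    n = len(next(iter(alignment.values())))
    cols = [rng.randrange(n) for _ in range(n)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

def p_distance(a, b):
    """Assumed distance measure: fraction of aligned positions that differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def upgma_clades(alignment):
    """Cluster with UPGMA (an assumption) and return the groupings found."""
    clusters = [frozenset([name]) for name in alignment]

    def dist(c1, c2):
        # UPGMA: average distance over all pairs of members
        total = sum(p_distance(alignment[a], alignment[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clades = set()
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merged = c1 | c2
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
        if len(clusters) > 1:  # skip the trivial group containing every sequence
            clades.add(merged)
    return clades

def bootstrap_support(alignment, replicates=100, seed=0):
    """Fraction of sample trees in which each grouping of the original tree occurs."""
    rng = random.Random(seed)
    original = upgma_clades(alignment)
    counts = Counter()
    for _ in range(replicates):
        sample = bootstrap_columns(alignment, rng)
        counts.update(upgma_clades(sample) & original)
    return {clade: counts[clade] / replicates for clade in original}

# Toy, made-up alignment purely for illustration
toy = {"A": "MKTAYIAK", "B": "MKTAYIAR", "C": "MRSGYLAK", "D": "MRSGYLGK"}
for clade, support in bootstrap_support(toy).items():
    print(sorted(clade), f"{support:.0%}")
```

Each printed percentage is the share of the 100 sample trees that contain the grouping, which is the number the quoted passage compares against the 95% threshold.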

Session 2: Protein Domain Identification and Classification      

Session 3: Protein Structure Prediction and Comparison      

Questions or comments?   latek@wi.mit.edu
Bioinformatics at Whitehead Institute