Getting To Know Your Protein
Exercise I
Bioinformatics for Biologists 2005
In this exercise, you will use an unknown sequence to prepare multiple sequence alignment and phylogenetic tree figures. By the end of the exercise, you should be able to search for homologous sequences, align them, build a phylogenetic tree, and produce manuscript-quality figures of your results. Follow the steps below in order, using either the applications installed on your computer or the web-based alternatives noted in brackets. If you have difficulty with any step, please ask for assistance.
Step 1 – Find homologous sequences
I. BLAST the following sequence against the non-redundant protein database using the blastp program at:
http://www.ncbi.nlm.nih.gov/BLAST/
Selected BLAST results (identifier and description, bit score, E-value):
gi|21979456|gb|AAM09075.1| raptor [Homo sapiens] >gi|220949...   2614   0.0
gi|30061325|ref|NP_083174.1| raptor [Mus musculus] >gi|4657...   2542   0.0
gi|54035208|gb|AAH84088.1| LOC495002 protein [Xenopus laevis]    2412   0.0
gi|7242961|dbj|BAA92541.1| KIAA1303 protein [Homo sapiens]       2193   0.0
gi|34875607|ref|XP_213539.2| similar to p150 target of rapa...   1557   0.0
gi|50745382|ref|XP_426232.1| PREDICTED: similar to p150 tar...   1332   0.0
gi|12855312|dbj|BAB30288.1| unnamed protein product [Mus mu...   1237   0.0
gi|24640048|ref|NP_572294.1| CG4320-PA [Drosophila melanoga...   1023   0.0
gi|31711792|gb|AAP68252.1| At3g08850 [Arabidopsis thaliana]...   932    0.0
gi|47214942|emb|CAG10764.1| unnamed protein product [Tetrao...   924    0.0
gi|55236253|gb|EAL39258.1| ENSANGP00000026347 [Anopheles ga...   923    0.0
gi|6403497|gb|AAF07837.1| unknown protein [Arabidopsis thal...   920    0.0
II. What is this sequence?
Answer: RAPTOR
Does it have any characteristic domains?
Answer: WD40 repeats
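If you want to pull the accession numbers out of the hit list without retyping them, the summary lines can be parsed with a short script. The following is a minimal sketch in Python, assuming the gi-style line layout shown in the results above (identifier, description, bit score, E-value). The names HIT_RE and parse_hit are made up for this illustration and are not part of BLAST; newer BLAST output no longer includes gi numbers, so the pattern would need adjusting.

```python
import re

# Pattern for one hit line in the gi-style summary format shown above:
# gi|<number>|<database>|<accession>| <description> <bit score> <E-value>
HIT_RE = re.compile(
    r"gi\|(?P<gi>\d+)\|(?P<db>\w+)\|(?P<acc>[\w.]+)\|\s*"
    r"(?P<desc>.*?)\s+(?P<score>\d+)\s+(?P<evalue>\S+)\s*$"
)


def parse_hit(line):
    """Return a dict for one BLAST summary line, or None if it does not match."""
    m = HIT_RE.match(line.strip())
    if m is None:
        return None
    raw_evalue = m["evalue"]
    # BLAST sometimes prints very small E-values as "e-100"; float() needs "1e-100".
    if raw_evalue.startswith("e"):
        raw_evalue = "1" + raw_evalue
    return {
        "accession": m["acc"],
        "description": m["desc"],
        "score": int(m["score"]),
        "evalue": float(raw_evalue),
    }
```

Calling parse_hit on each line of a saved hit list and collecting the "accession" values gives you the ID list needed in Step 2.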
Step 2 – Create a FASTA file containing homologous sequences
I. Compile the accession numbers for the following sequences into one text file, one ID per line: Drosophila, Mouse, Rat, Human.
AAM09075.1
NP_083174.1
XP_213539.2
NP_572294.1
II. Use Batch Entrez at NCBI to retrieve all of the sequences corresponding to the accession numbers as a single FASTA-formatted file. Save this file to your computer.
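The same retrieval can also be scripted. The sketch below is one possible approach, not part of the original exercise: it assumes NCBI's E-utilities EFetch service with the protein database and FASTA output, and uses only the Python standard library. The function names efetch_url and fetch_fasta are hypothetical helpers. Records from 2005, especially predicted (XP_) entries, may since have been updated or withdrawn, so a request is not guaranteed to return all four sequences.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities EFetch service (assumed here as a
# scripted alternative to the Batch Entrez web form).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Accession numbers compiled in Step 2, part I.
ACCESSIONS = ["AAM09075.1", "NP_083174.1", "XP_213539.2", "NP_572294.1"]


def efetch_url(accessions):
    """Build an EFetch URL requesting protein records as plain-text FASTA."""
    params = {
        "db": "protein",
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


def fetch_fasta(accessions):
    """Download the records (requires network access) and return FASTA text."""
    with urlopen(efetch_url(accessions)) as response:
        return response.read().decode("utf-8")
```

To save the result, write the string returned by fetch_fasta(ACCESSIONS) to a file such as raptor.fasta (a file name chosen here only as an example) and load that file in Step 3.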
Step 3 – Align Sequences with ClustalX
I. Start the ClustalX application, then FILE->LOAD SEQUENCES.
II. ALIGNMENT->DO COMPLETE ALIGNMENT.
III. FILE->WRITE ALIGNMENT AS POSTSCRIPT.
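As a quick check on an alignment, you can compute the percent identity between two aligned sequences. The sketch below is an illustration only and is not how ClustalX scores alignments internally: it assumes both sequences come from the same alignment (so they have equal length), that gaps are written as "-", and it ignores any column where either sequence has a gap. The function name percent_identity and the toy sequences are invented for this example.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of ungapped aligned columns in which the two residues match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        # Skip columns where either sequence has a gap.
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (not taken from the raptor alignment).
identity = percent_identity("MKV-LA", "MKVQLG")  # 4 of 5 ungapped columns match
```

Closely related sequences, such as the human and mouse hits, would be expected to score much higher than distant ones, which is the same signal the tree in Step 4 is built from.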
[ Alternatively, you can create an alignment with the web tool at: http://www.ebi.ac.uk/clustalw/ ]
Step 4 – Create Phylogenetic Tree
I. In ClustalX, TREES->Draw N-J Tree. This writes the neighbor-joining tree to a .ph file.
II. In TreeView, OPEN your .ph file
• Notice the options to create different shape trees.
III. PRINT->SAVE AS PDF
[ Alternatively, you can build trees with your alignment at http://www.ebi.ac.uk/clustalw/ ]
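The .ph file is plain text in the parenthesized Newick tree format, so you can inspect it outside TreeView. The sketch below is a minimal illustration, assuming simple unquoted sequence names each followed by a colon and a branch length; it lists only the leaves and is not a full Newick parser. The function name leaf_branch_lengths and the example tree are made up for this purpose.

```python
import re

# A leaf is a name that directly follows "(" or "," and is followed by
# ":<branch length>". Internal nodes follow ")" and are therefore skipped.
LEAF_RE = re.compile(r"[(,]\s*([^(),:;\s]+):([0-9.eE+-]+)")


def leaf_branch_lengths(newick_text):
    """Map each leaf name in a Newick string to its branch length."""
    return {name: float(length) for name, length in LEAF_RE.findall(newick_text)}


# Toy tree with four leaves (branch lengths are invented).
example = "(\nA:0.1,\nB:0.2,\n(C:0.05,D:0.06):0.3);"
leaves = leaf_branch_lengths(example)
```

To try it on your own tree, read the contents of your .ph file into a string and pass that string to leaf_branch_lengths. Short branch lengths between two leaves under the same parent indicate closely related sequences.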
[ NOTE: From here to the end, we assume you are using your desktop applications. ]
Step 5 – Manage Postscript Files
I. OPEN the alignment PostScript file with Acrobat Distiller.
• Note that you can view each page individually.
II. Extract and save each page separately, with new names.
• For each page, enter the page number to extract and save with a unique name.
Step 6 – Annotate Figure
I. OPEN PDF in Adobe Illustrator
II. FILE->EXPORT, save as a TIFF
Step 7 – Create PowerPoint Presentation
I. NEW PRESENTATION
II. Choose a blank slide.
III. INSERT->PICTURE->FROM FILE, select your TIFF or PDF file
IV. Rotate the image 90 degrees CCW.
V. Label your slide as you see fit.