>Human haptoglobin alpha(2FS)-beta protein
MSALGAVIALLLWGQLFAVDSGNDVTDIADDGCPKPPEIAHGYVEHSVRYQ
CKNYYKLRTEGDGVYTLNDKKQWINKAVGDKLPECEADDGCPKPPEIAHGY
VEHSVRYQCKNYYKLRTEGDGVYTLNNEKQWINKAVGDKLPECEAVCGKPK
NPANPVQRILGGHLDAKGSFPWQAKMVSHHNLTTGATLINEQWLLTTAKNL
FLNHSENATAKDIAPTLTLYVGKKQLVEIEKVVLHPNYSQVDIGLIKLKQK
VSVNERVMPICLPSKDYAEVGRVGYVSGWGRNANFKFTDHLKYVMLPVADQ
DQCIRHYEGSTVPEKKTPKSPVGVQPILNEHTFCAGMSKYQEDTCYGDAGS
AFAVHDLEEDTWYATGILSFDKSCAVAEYGVYVKVTSIQDWVQKTIAEN
>Drosophila melanogaster Odorant receptor 85e (Or85e)
MASLQFHGNVDADIRYDISLDPARESNLFRLLMGLQLANGTKPSPRLPKW
WPKRLEMIGKVLPKAYCSMVIFTSLHLGVLFTKTTLDVLPTGELQAITDA
LTMTIIYFFTGYGTIYWCLRSRRLLAYMEHMNREYRHHSLAGVTFVSSHA
AFRMSRNFTVVWIMSCLLGVISWGVSPLMLGIRMLPLQCWYPFDALGPGT
YTAVYATQLFGQIMVGMTFGFGGSLFVTLSLLLLGQFDVLYCSLKNLDAH
TKLLGGESVNGLSSLQEELLLGDSKRELNQYVLLQEHPTDLLRLSAGRKC
PDQGNAFHNALVECIRLHRFILHCSQELENLFSPYCLVKSLQITFQLCLL
VFVGVSGTREVLRIVNQLQYLGLTIFELLMFTYCGELLSRHSIRSGDAFW
RGAWWKHAHFIRQDILIFLVNSRRAVHVTAGKFYVMDVNRLRSVITQAFS
FLTLLQKLAAKKTESEL
>Drosophila melanogaster Odorant receptor 23a (Or23a)
MKLSETLKIDYFRVQLNAWRICGALDLSEGRYWSWSMLLCILVYLPTPMLL
RGVYSFEDPVENNFSLSLTVTSLSNLMKFCMYVAQLTKMVEVQSLIGQLDA
RVSGESQSERHRNMTEHLLRMSKLFQITYAVVFIIAAVPFVFETELSLPMP
MWFPFDWKNSMVAYIGALVFQEIGYVFQIMQCFAADSFPPLVLYLISEQCQ
LLILRISEIGYGYKTLEENEQDLVNCIRDQNALYRLLDVTKSLVSYPMMVQ
FMVIGINIAITLFVLIFYVETLYDRIYYLCFLLGITVQTYPLCYYGTMVQE
SFAELHYAVFCSNWVDQSASYRGHMLILAERTKRMQLLLAGNLVPIHLSTY
VACWKGAYSFFTLMADRDGLGS
We can use the UCSC BLAT tool. BLAT can quickly find genomic sequences of 95% or greater similarity by keeping an index of the entire genome in memory. Go to the UCSC Genome Bioinformatics website and choose Blat from the left frame to open the BLAT search page. Paste the raw or FASTA-formatted sequence obtained in the last question into the large text box, choose the Human genome and the July 2003 assembly, set Query type to DNA, and press the submit button.
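Since BLAT accepts either raw sequence or a FASTA record, it can help to check the record before pasting it. The following is a minimal Python sketch, not part of BLAT itself. The function name `split_fasta_record` and the example record are hypothetical; the sketch only illustrates separating the header line from a whitespace-free sequence.

```python
def split_fasta_record(text):
    """Split one FASTA record (or raw sequence) into (header, sequence).

    Hypothetical helper for illustration. Anything that is not a letter
    (spaces, digits from numbered blocks) is dropped from the sequence.
    """
    lines = [line.strip() for line in text.strip().splitlines()]
    lines = [line for line in lines if line]

    header = None
    if lines and lines[0].startswith(">"):
        # The header is the first line, without the leading ">".
        header = lines[0][1:].strip()
        lines = lines[1:]

    sequence = "".join(ch for line in lines for ch in line if ch.isalpha())
    return header, sequence


# Made-up record, not one of the sequences in this document.
example = """>example_query hypothetical fragment
acgtacgtac ggttaaccgg
21 ttgacc
"""
header, sequence = split_fasta_record(example)
print(header)
print(sequence)
```

If the header comes back as `None`, the input was treated as raw sequence, which BLAT also accepts.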
There are 3 hits for NM_004635.2. The first one, on chromosome 3, is the best of the three because of the dramatic differences in the SCORE, the length of the alignment (only 10 bases are missed, as seen by comparing the query START and END with QSIZE), and the percent IDENTITY. To obtain more information on the first hit, click the details link. The details page has three parts: the NM_004635 sequence, the genomic sequence, and the alignment of NM_004635 to the genomic sequence. MATCHING BASES between the cDNA and the genomic sequence are shown in upper case and darker blue; gaps are in lower case and black. Light blue upper-case bases mark the BOUNDARIES of the aligned regions on either side of a gap and are often splice sites.
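The comparison of START, END and QSIZE can be written out as simple arithmetic. The Python sketch below is illustrative only: it assumes START and END are 1-based and inclusive as shown on the results page, the function name `query_coverage` is hypothetical, and the numbers are made up rather than the actual values reported for NM_004635.2.

```python
def query_coverage(q_start, q_end, q_size):
    """Return (aligned_span, unaligned_bases, fraction_covered) for one hit.

    Assumes q_start and q_end are 1-based and inclusive. The span runs from
    START to END, so it may include internal gaps in the alignment.
    """
    if not (1 <= q_start <= q_end <= q_size):
        raise ValueError("expected 1 <= q_start <= q_end <= q_size")
    aligned_span = q_end - q_start + 1
    unaligned = q_size - aligned_span
    return aligned_span, unaligned, aligned_span / q_size


# Illustrative numbers only, chosen so that 10 query bases fall outside the hit.
span, missed, fraction = query_coverage(q_start=1, q_end=1990, q_size=2000)
print(span, missed, round(fraction, 3))
```

A hit that spans nearly the whole query with high identity is more likely to be the true genomic locus than a short partial match.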
>LostWorld DinoDNA from the book The Lost World
gaattccgga agcgagcaag agataagtcc tggcatcaga tacagttgga gataaggacg gacgtgtggc agctcccgca gaggattcac tggaagtgca ttacctatcc catgggagcc atggagttcg tggcgctggg ggggccggat gcgggctccc ccactccgtt ccctgatgaa gccggagcct tcctggggct gggggggggc gagaggacgg aggcgggggg gctgctggcc tcctaccccc cctcaggccg cgtgtccctg gtgccgtggg cagacacggg tactttgggg accccccagt gggtgccgcc cgccacccaa atggagcccc cccactacct ggagctgctg caaccccccc ggggcagccc cccccatccc tcctccgggc ccctactgcc actcagcagc gggcccccac cctgcgaggc ccgtgagtgc gtcatggcca ggaagaactg cggagcgacg gcaacgccgc tgtggcgccg ggacggcacc gggcattacc tgtgcaactg ggcctcagcc tgcgggctct accaccgcct caacggccag aaccgcccgc tcatccgccc caaaaagcgc ctgcgggtga gtaagcgcgc aggcacagtg tgcagccacg agcgtgaaaa ctgccagaca tccaccacca ctctgtggcg tcgcagcccc atgggggacc ccgtctgcaa caacattcac gcctgcggcc tctactacaa actgcaccaa gtgaaccgcc ccctcacgat gcgcaaagac ggaatccaaa cccgaaaccg caaagtttcc tccaagggta aaaagcggcg ccccccgggg gggggaaacc cctccgccac cgcgggaggg ggcgctccta tggggggagg gggggacccc tctatgcccc ccccgccgcc ccccccggcc gccgcccccc ctcaaagcga cgctctgtac gctctcggcc ccgtggtcct ttcgggccat tttctgccct ttggaaactc cggagggttt tttggggggg gggcgggggg ttacacggcc cccccggggc tgagcccgca gatttaaata ataactctga cgtgggcaag tgggccttgc tgagaagaca gtgtaacata ataatttgca cctcggcaat tgcagagggt cgatctccac tttggacaca acagggctac tcggtaggac cagataagca ctttgctccc tggactgaaa aagaaaggat ttatctgttt gcttcttgct gacaaatccc tgtgaaaggt aaaagtcgga cacagcaatc gattatttct cgcctgtgtg aaattactgt gaatattgta aatatatata tatatatata tatatctgta tagaacagcc tcggaggcgg catggaccca gcgtagatca tgctggattt gtactgccgg aattc
Problem 1. dottup
Problem 2. stretcher and then water
Problem 3. bl2seq -i filename1 -j filename2 -p blastn
Problem 4. N/A
Problem 5. blastall -p blastx -i dino.txt -d nr -o dino.out
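The command lines given for Problems 3 and 5 could also be assembled from a script. The Python sketch below assumes the legacy NCBI tools `bl2seq` and `blastall` are installed and on the PATH. The helper names are hypothetical, and only the flags that appear in the answers above are used. The commands are built and printed here, not executed.

```python
import shlex


def bl2seq_command(first_file, second_file, program="blastn"):
    """Build the argument list for a pairwise bl2seq comparison."""
    return ["bl2seq", "-i", first_file, "-j", second_file, "-p", program]


def blastall_command(program, query_file, database, output_file):
    """Build the argument list for a blastall database search."""
    return [
        "blastall",
        "-p", program,
        "-i", query_file,
        "-d", database,
        "-o", output_file,
    ]


# Print the commands exactly as they would be typed in a shell.
# shlex.join requires Python 3.8 or later.
print(shlex.join(bl2seq_command("filename1", "filename2")))
print(shlex.join(blastall_command("blastx", "dino.txt", "nr", "dino.out")))
```

To actually run one of them, the list could be passed to `subprocess.run(cmd, check=True)`, provided the tools and the `nr` database are available locally.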