ANSWERS TO HOMEWORK 2

    1. The description line (the one that starts with ">") differs between the two FASTA files.
    2. By default, the blast2 program uses the DUST program, as mentioned in class, to mask segments of the query sequence that have low compositional complexity. That is why you see runs of n in the query sequence but not in the target sequence. In this case, it is better to disable the filter. With the filter disabled, NM_004635.3 is identical to NM_004635.2 except that NM_003645.2 has 9 additional bases at the 5' end.
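
The two observations above (different description lines, and sequences that match except for extra bases at the 5' end) can also be checked without BLAST. The following is only a minimal sketch in Python: the helper names read_fasta and extra_5prime are made up for illustration, and the two records are short toy sequences, not the real RefSeq entries.

```python
def read_fasta(text):
    """Return (description, sequence) for the first record in FASTA text."""
    description = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if description is not None:
                break  # stop at the start of a second record
            description = line[1:]
        elif description is not None:
            chunks.append(line.upper())
    return description, "".join(chunks)


def extra_5prime(longer, shorter):
    """Count the extra 5' bases of `longer` if it ends with `shorter`, else None."""
    if longer.endswith(shorter):
        return len(longer) - len(shorter)
    return None


# Toy records (hypothetical): the old version carries 9 extra bases at the 5' end.
old_fasta = ">toy_record.2 old version\nACGTACGTAGATTACA\nGATTACA\n"
new_fasta = ">toy_record.3 new version\nGATTACAGATTACA\n"

old_desc, old_seq = read_fasta(old_fasta)
new_desc, new_seq = read_fasta(new_fasta)

print("description lines differ:", old_desc != new_desc)
print("extra 5' bases in old version:", extra_5prime(old_seq, new_seq))
```

A plain string comparison like this does no masking, so it shows the same thing as the BLAST run with the filter disabled. It only works when one sequence is an exact suffix of the other; anything with mismatches or gaps still needs an alignment.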

  1. From the BLAST result, you will see that the first 9 bases of NM_003645.2 are not mapped to the genome.