MEME version 3.0 (Release date: 2004/07/26 08:17:15)
For further information on how to interpret these results or to get a copy of the MEME software please access http://meme.sdsc.edu.
This file may be used as input to the MAST algorithm for searching sequence databases for matches to groups of motifs. MAST is available for interactive use and downloading at http://meme.sdsc.edu.
If you use this program in your research, please cite:
Timothy L. Bailey and Charles Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, California, 1994.
DATAFILE= SNT2_YPD.fsa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
iYNL182C                 1.0000    726  iYBL075C                 1.0000    301
iYIL160C                 1.0000    384  iYPR183W                 1.0000    462
iYCR090C-1               1.0000    789  iYAL039C-0               1.0000   1150
iYPR157W                 1.0000    610  iYOL117W                 1.0000    326
iYLR149C                 1.0000    652  iYJL093C                 1.0000    592
iYBR143C                 1.0000   1044  iYLR176C                 1.0000    802
iYPR104C                 1.0000    567  iYBR138C                 1.0000    345
iYHR138C                 1.0000    479  iYKL172W                 1.0000    462
iYJR152W                 1.0000   1512  iYCR090C-0               1.0000    770
iYLR228C-1               1.0000    857  iYDR261C-1               1.0000    888
This information can also be useful in the event you wish to report a problem with the MEME software.

command: meme SNT2_YPD.fsa -dna -nmotifs 5 -minw 7 -maxw 11 -revcomp

model:  mod=         zoops    nmotifs=         5    evt=           inf
object function=  E-value of product of p-values
width:  minw=            7    maxw=           11    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       20    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           13718    N=              20
strands: + -
sample: seed=            0    seqfrac=         1

Letter frequencies in dataset:
A 0.318 C 0.182 G 0.182 T 0.318
Background letter frequencies (from dataset with add-one prior applied):
A 0.318 C 0.182 G 0.182 T 0.318
BL   MOTIF 1 width=11 seqs=18
iYHR138C    (  134) CGGCGCTACCA  1
iYAL039C-0  (  282) CGGCGCTACCA  1
iYPR183W    (  238) CGGCGCTACCA  1
iYIL160C    (  212) CGGCGCTACCA  1
iYBR138C    (  204) CGGCGCTAGCA  1
iYBR143C    (  203) CGGCGCTATCA  1
iYJL093C    (  268) CGGCGCTATCA  1
iYCR090C-1  (  564) CGGCGCTATCA  1
iYBL075C    (  267) CGGCGCTATCA  1
iYLR176C    (  222) TGGCGCTACCA  1
iYLR149C    (  253) TGGCGCTACCA  1
iYPR104C    (  421) TGGCGCTATCA  1
iYPR157W    (  332) TGGCGCTATCA  1
iYNL182C    (  203) TGGCGCTATCA  1
iYKL172W    (  196) CGGCGCTAGGG  1
iYDR261C-1  (  307) TGTCGCGACCA  1
iYJR152W    (  776) CCGTGCTAGCA  1
iYCR090C-0  (  567) TGTCACTAGCA  1
//

log-odds matrix: alength= 4 w= 11 n= 13518 bayes= 9.68522 E= 7.5e-025
 -1081    175  -1081     29
 -1081   -171    237  -1081
 -1081  -1081    229   -152
 -1081    237  -1081   -251
  -251  -1081    237  -1081
 -1081    246  -1081  -1081
 -1081  -1081   -171    157
   165  -1081  -1081  -1081
 -1081    109     29     29
 -1081    237   -171  -1081
   157  -1081   -171  -1081

letter-probability matrix: alength= 4 w= 11 nsites= 18 E= 7.5e-025
 0.000000  0.611111  0.000000  0.388889
 0.000000  0.055556  0.944444  0.000000
 0.000000  0.000000  0.888889  0.111111
 0.000000  0.944444  0.000000  0.055556
 0.055556  0.000000  0.944444  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.055556  0.944444
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.388889  0.222222  0.388889
 0.000000  0.944444  0.055556  0.000000
 0.944444  0.000000  0.055556  0.000000
Time 24.71 secs.
BL   MOTIF 2 width=11 seqs=20
iYIL160C    (  260) TTTCTTTTTTC  1
iYCR090C-0  (  166) TTTTTTTTTTC  1
iYJR152W    (  243) TTTTTTTTTTC  1
iYPR157W    (  469) TTTTTTTTTTC  1
iYCR090C-1  (  313) TTTTTTTTTTC  1
iYPR183W    (   12) TTTTTTTTTTC  1
iYAL039C-0  (  855) TTTCTTTTTCC  1
iYLR176C    (  244) TTTCTTCTTTC  1
iYDR261C-1  (    1) TTTTTTCTTTC  1
iYHR138C    (  242) TTTCTTTTTTT  1
iYJL093C    (  514) TTTCTTTTGCC  1
iYNL182C    (  707) TTTCTTTTTTT  1
iYBR143C    (    8) TTTCCTTTGTC  1
iYOL117W    (  271) TTTCTTTTCTT  1
iYPR104C    (  120) TTTATTTTCTC  1
iYLR149C    (  532) TTTTTTTTTCT  1
iYLR228C-1  (   29) TTTTTTTTTAC  1
iYKL172W    (  156) TTTATTTTCCC  1
iYBR138C    (  299) TTTTTTCTTTA  1
iYBL075C    (   16) TTTTCTTTGTA  1
//

log-odds matrix: alength= 4 w= 11 n= 13518 bayes= 9.39853 E= 2.5e+000
 -1097  -1097  -1097    165
 -1097  -1097  -1097    165
 -1097  -1097  -1097    165
  -167    114  -1097     65
 -1097    -86  -1097    150
 -1097  -1097  -1097    165
 -1097    -28  -1097    142
 -1097  -1097  -1097    165
 -1097    -28    -28    114
  -266     14  -1097    124
  -167    194  -1097    -67

letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 2.5e+000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
 0.100000  0.400000  0.000000  0.500000
 0.000000  0.100000  0.000000  0.900000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.150000  0.000000  0.850000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.150000  0.150000  0.700000
 0.050000  0.200000  0.000000  0.750000
 0.100000  0.700000  0.000000  0.200000
Time 54.94 secs.
BL   MOTIF 3 width=11 seqs=20
iYNL182C    (  295) GGGGGAGAGGG  1
iYKL172W    (  324) GGGGACGGAGG  1
iYBR143C    (  964) GGGGAAGAGAA  1
iYLR149C    (  582) GGGGAAGAGAA  1
iYAL039C-0  (  221) GGGGGCGGAAG  1
iYPR183W    (  333) GGGGAAGAAAA  1
iYLR176C    (  623) GGGGAAGGGGA  1
iYHR138C    (   78) GGGGGCAAAAG  1
iYBR138C    (  258) GGGGACAGGGG  1
iYIL160C    (  142) GAGGGAGAAAG  1
iYCR090C-1  (  607) GAGGGAAAAAG  1
iYOL117W    (  140) GTGGAAGGAAA  1
iYCR090C-0  (  714) GGTGAAAAGGG  1
iYJR152W    (  897) GAGGAAAAAGA  1
iYPR157W    (  358) GTGGAAAAAAA  1
iYJL093C    (  500) GAGGGACAGGA  1
iYDR261C-1  (  546) GATGGCGAAGA  1
iYBL075C    (  160) GGTGACCAGAA  1
iYPR104C    (  308) GGTGACGGAAT  1
iYLR228C-1  (  694) GTGCAAAAAAG  1
//

log-odds matrix: alength= 4 w= 11 n= 13518 bayes= 9.65041 E= 2.9e+002
 -1097  -1097    246  -1097
   -35  -1097    172   -108
 -1097  -1097    214    -67
 -1097   -186    238  -1097
   103  -1097     94  -1097
   103     94  -1097  -1097
    14    -86    159  -1097
   114  -1097     72  -1097
    92  -1097    114  -1097
    92  -1097    114  -1097
    65  -1097    131   -266

letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 2.9e+002
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.600000  0.150000
 0.000000  0.000000  0.800000  0.200000
 0.000000  0.050000  0.950000  0.000000
 0.650000  0.000000  0.350000  0.000000
 0.650000  0.350000  0.000000  0.000000
 0.350000  0.100000  0.550000  0.000000
 0.700000  0.000000  0.300000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.600000  0.000000  0.400000  0.000000
 0.500000  0.000000  0.450000  0.050000
Time 85.97 secs.
BL   MOTIF 4 width=9 seqs=19
iYIL160C    (  103) CTTCTTTCT  1
iYLR228C-1  (  553) CTTTTTTCT  1
iYBR138C    (  118) CTTTTTTCT  1
iYPR157W    (  403) CTTTTTTCT  1
iYAL039C-0  (  150) CTTTTTTCT  1
iYPR104C    (  293) CTCCTTTCT  1
iYJL093C    (   41) CTCCTTTCT  1
iYHR138C    (  114) CTCTTTTCT  1
iYLR149C    (  182) CTCTTTTCT  1
iYCR090C-0  (  401) CTTCTGTCT  1
iYBR143C    (  898) CTTCTGTCT  1
iYOL117W    (   64) CTTCTTTGT  1
iYPR183W    (  265) CTTTTTTGT  1
iYDR261C-1  (  453) CTCCGTTCT  1
iYLR176C    (  338) CTTGTTTCT  1
iYJR152W    (  357) CTTTTTTTT  1
iYKL172W    (  368) CTTCTTACT  1
iYNL182C    (  365) CTTTGTTGT  1
iYCR090C-1  (  290) CTTTTGTTT  1
//

log-odds matrix: alength= 4 w= 9 n= 13558 bayes= 9.41732 E= 3.9e+005
 -1089    246  -1089  -1089
 -1089  -1089  -1089    165
 -1089     53  -1089    121
 -1089    121   -179     73
 -1089  -1089    -79    149
 -1089  -1089    -21    140
  -259  -1089  -1089    157
 -1089    202    -21   -159
 -1089  -1089  -1089    165

letter-probability matrix: alength= 4 w= 9 nsites= 19 E= 3.9e+005
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.263158  0.000000  0.736842
 0.000000  0.421053  0.052632  0.526316
 0.000000  0.000000  0.105263  0.894737
 0.000000  0.000000  0.157895  0.842105
 0.052632  0.000000  0.000000  0.947368
 0.000000  0.736842  0.157895  0.105263
 0.000000  0.000000  0.000000  1.000000
Time 116.52 secs.
BL   MOTIF 5 width=11 seqs=2
iYAL039C-0  (  700) GTGTGGCCGAC  1
iYCR090C-1  (  415) GTGTGGCCGAC  1
//

log-odds matrix: alength= 4 w= 11 n= 13518 bayes= 12.7224 E= 5.1e+005
  -765   -765    245   -765
  -765   -765   -765    165
  -765   -765    245   -765
  -765   -765   -765    165
  -765   -765    245   -765
  -765   -765    245   -765
  -765    245   -765   -765
  -765    245   -765   -765
  -765   -765    245   -765
   165   -765   -765   -765
  -765    245   -765   -765

letter-probability matrix: alength= 4 w= 11 nsites= 2 E= 5.1e+005
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
Time 145.58 secs.
CPU: ncc005
MOTIFS
For each motif that it discovers in the training set, MEME prints the following information:
Summing the information content of each position in the motif gives the total information content (IC) of the motif, shown in parentheses to the left of the diagram. The total IC is approximately equal to the log likelihood ratio divided by the product of the number of occurrences and ln(2). It measures how useful the motif is for database searches: as a rule, a motif must contain at least log_2(N) bits of information, where N is the number of sequences in the database being searched. For example, to search a database of 100,000 sequences effectively for occurrences of a single motif, the motif should have an IC of at least 16.6 bits. Motifs with lower information content are still useful when a family of sequences shares more than one motif, since they can be combined in multiple-motif searches (using MAST).
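The rule of thumb above can be checked numerically. The following is a minimal Python sketch, not MEME's own code: it assumes that the information content of a position is the relative entropy (in bits) of that position's letter probabilities against the background letter frequencies, and MEME's internal computation may differ in detail. The function names are hypothetical; the matrix and background values are copied from MOTIF 1 and the background line of this file.

```python
import math

# Background letter frequencies copied from this file (ALPHABET order A, C, G, T).
BACKGROUND = {"A": 0.318, "C": 0.182, "G": 0.182, "T": 0.318}

# Letter-probability matrix of MOTIF 1; one row per motif position.
MOTIF_1 = [
    [0.000000, 0.611111, 0.000000, 0.388889],
    [0.000000, 0.055556, 0.944444, 0.000000],
    [0.000000, 0.000000, 0.888889, 0.111111],
    [0.000000, 0.944444, 0.000000, 0.055556],
    [0.055556, 0.000000, 0.944444, 0.000000],
    [0.000000, 1.000000, 0.000000, 0.000000],
    [0.000000, 0.000000, 0.055556, 0.944444],
    [1.000000, 0.000000, 0.000000, 0.000000],
    [0.000000, 0.388889, 0.222222, 0.388889],
    [0.000000, 0.944444, 0.055556, 0.000000],
    [0.944444, 0.000000, 0.055556, 0.000000],
]


def position_ic(row, background, alphabet="ACGT"):
    """Relative entropy in bits of one position (assumed definition of IC)."""
    return sum(
        p * math.log2(p / background[letter])
        for letter, p in zip(alphabet, row)
        if p > 0  # a zero probability contributes nothing to the sum
    )


def total_ic(matrix, background):
    """Sum of the per-position information content over the whole motif."""
    return sum(position_ic(row, background) for row in matrix)


def min_bits_for_database(n_sequences):
    """The log_2(N) threshold quoted in the text."""
    return math.log2(n_sequences)


ic = total_ic(MOTIF_1, BACKGROUND)
print(f"MOTIF 1 total IC (relative-entropy estimate): {ic:.1f} bits")
print(f"threshold for 100,000 sequences: {min_bits_for_database(100_000):.1f} bits")
```

Comparing the two printed values shows whether a motif clears the single-motif threshold for a database of a given size under this assumed definition.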
Multilevel TTATGTGAACGACGTCACACT
consensus  AA T A G A GA AA
sequence   T C TT T
You can convert these blocks to PSSMs (position-specific scoring matrices), LOGOS (color representations of the motifs), or phylogeny trees, and search them against a database of other blocks, by pasting everything from the "BL" line to the "//" line (inclusive) into the Multiple Alignment Processor. If you include the -print_fasta switch on the command line, MEME prints the motif sites in FASTA format instead of BLOCKS format.
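For local processing, a block can also be read programmatically. The sketch below is an illustration only: it assumes each site line has the layout "name ( start) SITE weight" as seen in this file, and the helper names `parse_block` and `column_frequencies` are hypothetical, not part of MEME or the Multiple Alignment Processor. It extracts the sites between the "BL" line and the "//" line and tabulates the observed letter frequencies per column.

```python
import re

# One site line: sequence name, "(start)", the site itself, and a weight.
SITE_RE = re.compile(r"^(\S+)\s+\(\s*(\d+)\)\s+([A-Za-z]+)\s+(\S+)$")

BLOCK_TEXT = """\
BL   MOTIF 5 width=11 seqs=2
iYAL039C-0  (  700) GTGTGGCCGAC  1
iYCR090C-1  (  415) GTGTGGCCGAC  1
//
"""


def parse_block(text):
    """Parse one block, from the 'BL' line to the '//' line inclusive."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("BL"):
        raise ValueError("block must start with a 'BL' line")
    if lines[-1] != "//":
        raise ValueError("block must end with a '//' line")
    header = dict(re.findall(r"(\w+)=(\d+)", lines[0]))
    sites = []
    for ln in lines[1:-1]:
        match = SITE_RE.match(ln)
        if match is None:
            raise ValueError(f"unrecognized site line: {ln!r}")
        name, start, site, _weight = match.groups()
        sites.append((name, int(start), site))
    return {"width": int(header["width"]), "seqs": int(header["seqs"]), "sites": sites}


def column_frequencies(sites, alphabet="ACGT"):
    """Observed letter frequencies for each motif column."""
    width = len(sites[0][2])
    freqs = []
    for i in range(width):
        column = [site[i] for _, _, site in sites]
        freqs.append([column.count(letter) / len(column) for letter in alphabet])
    return freqs


block = parse_block(BLOCK_TEXT)
for row in column_frequencies(block["sites"]):
    print(" ".join(f"{value:.6f}" for value in row))
```

For MOTIF 5 the rows printed this way can be compared directly against the letter-probability matrix reported for that motif.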
Note: Earlier versions of MEME gave the posterior probabilities--the probability after applying a prior on letter frequencies--rather than the observed frequencies. These versions of MEME also gave the number of possible positions for the motif rather than the actual number of occurrences. The output from these earlier versions of MEME can be distinguished by "n=" rather than "nsites=" in the line preceding the matrix.
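The distinction can be tested mechanically on the header line of a letter-probability matrix. The following Python sketch uses hypothetical helper names, and the old-style header in it is a made-up illustration of the "n=" form, not output copied from an earlier MEME release. The check applies only to the letter-probability matrix header; the log-odds matrix header in this file also carries an "n=" field.

```python
import re


def parse_matrix_header(line):
    """Split a matrix header into its label and its 'key= value' fields."""
    label, _, rest = line.partition(":")
    fields = dict(re.findall(r"(\w+)=\s*(\S+)", rest))
    return label.strip(), fields


def has_observed_frequencies(line):
    """True if a letter-probability header uses 'nsites=' (observed frequencies)."""
    label, fields = parse_matrix_header(line)
    if label != "letter-probability matrix":
        raise ValueError("expected a letter-probability matrix header")
    return "nsites" in fields


# Header copied from MOTIF 1 in this file.
current = "letter-probability matrix: alength= 4 w= 11 nsites= 18 E= 7.5e-025"
# Hypothetical header in the older style described in the note.
older = "letter-probability matrix: alength= 4 w= 11 n= 13518 E= 7.5e-025"

print(has_observed_frequencies(current))
print(has_observed_frequencies(older))
```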