Microarray analysis exercises 2

Use the Excel "TTEST" function on the "ttest" sheet. Ex: =TTEST(norm_filt!B2:C2,norm_filt!D2:E2,2,3)
The "TTEST" function takes four arguments: the first array, the second array, the number of tails, the type.
The number of tails is 2, since one tissue can have an expression that is lower or higher than the other.
The type of t test is 3, which refers to two-sample unequal variance.
This formula returns an error, however, for genes where all expression values have been floored to 1 (so both means are 1 and both variances are 0).
To prevent this error, we want to check if both means are 1 by using the IF and AND statements:

The "AND" statement checks whether all of the conditions between the parentheses are true. Ex: =AND(means!B2=1,means!C2=1)
If the "AND" statement is true, return 1 as the p-value; otherwise perform the t test.
Combine the IF, AND, and TTEST functions: =IF(AND(means!B2=1,means!C2=1),1,TTEST(norm_filt!B2:C2,norm_filt!D2:E2,2,3))
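
To check the spreadsheet result outside Excel, the guarded test can be sketched in Python with only the standard library. This is a sketch under stated assumptions, not Excel's implementation: the function names are hypothetical, and the two-tailed p-value is obtained by numerically integrating the Student t density with Welch-Satterthwaite degrees of freedom, so small numerical differences from TTEST are possible.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def welch_p_value(a, b, steps=2000):
    """Two-tailed p-value of the unequal-variance (Welch) t test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    # Simpson's rule for the area under the density between 0 and |t|
    upper = abs(t)
    h = upper / steps  # steps must be even
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


def guarded_ttest(a, b, floor=1.0):
    """Mirror of =IF(AND(mean_a=1, mean_b=1), 1, TTEST(a, b, 2, 3))."""
    if mean(a) == floor and mean(b) == floor:
        return 1.0  # all values floored: report "no difference"
    return welch_p_value(a, b)


# Made-up replicate values, for illustration only
print(guarded_ttest([1.0, 1.0], [1.0, 1.0]))  # guard branch, no division by zero
print(guarded_ttest([1.0, 2.0], [3.0, 4.0]))
```

As in the spreadsheet formula, the guard only covers the case where both means equal the floor value; other zero-variance inputs would still fail.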

Compile the mean expression values in the four tissues for all genes that show a significant change in expression (to use later for clustering).

On the "selected" sheet use the Excel "VLOOKUP" function. Ex: =VLOOKUP($A2,means!$A$2:$E$12627,2,FALSE)
The "VLOOKUP" function takes 4 arguments: the value to search for, the table to search (containing the value to search for in the first column), the column number from which the matching value is returned, "FALSE" (to indicate that you want an exact match rather than the closest match).
The "table to search" is the 5 columns (1 column of gene IDs + 4 columns for the 4 tissues) of mean expression values.
Note that the range of the table to search must be fixed (with a "$" before each column letter and row number), so that it does not shift when the formula is copied to other cells.
Note that this command, when copied into lots of cells, can take the computer a while to perform.
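
The exact-match lookup that VLOOKUP performs can be sketched in Python as follows. This is only an illustration: the table contents are made up (apart from the probe set ID 32615_at, all IDs and numbers are hypothetical), and the helper name vlookup is not a library function. Building a dictionary index once avoids rescanning the table for every cell, which is the step that makes the spreadsheet version slow.

```python
# Hypothetical "means" table: gene ID followed by 4 tissue means.
# All numbers and the example_* IDs are made up for illustration.
means = [
    ["32615_at", 120.0, 95.5, 410.2, 1.0],
    ["example_1_at", 1.0, 1.0, 36.4, 52.8],
    ["example_2_at", 88.1, 90.3, 85.0, 91.7],
]

# Index the table by its first column so each lookup is a single dict access.
index = {row[0]: row for row in means}


def vlookup(key, index, col_index):
    """Exact-match lookup; col_index is 1-based, as in Excel's VLOOKUP."""
    row = index.get(key)
    if row is None:
        return None  # Excel would display #N/A here
    return row[col_index - 1]


# Genes judged significant (hypothetical selection)
selected_ids = ["example_1_at", "32615_at"]
selected = [[gid] + [vlookup(gid, index, col) for col in range(2, 6)]
            for gid in selected_ids]

for row in selected:
    print("\t".join(str(value) for value in row))
```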
Save this sheet as a text file by either one of these methods:

Use any or all of these data sets. The third dataset, being across more tissues, may be the most interesting.
1. a pre-processed set of expression values (not ratios)
2. a full set of expression ratios (transformed to log base 2), with values compared to the mean across all tissues
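
As a rough illustration of how the values in a ratio file relate to the expression values, the sketch below converts one gene's expression values into log base 2 ratios relative to that gene's mean across tissues. It assumes the arithmetic mean of the untransformed values is used as the reference, which the text does not state; the function name and the input numbers are hypothetical.

```python
import math


def log2_ratios(values):
    """Log2 of each value divided by the mean of all values for that gene."""
    reference = sum(values) / len(values)  # assumed: arithmetic mean across tissues
    return [math.log2(v / reference) for v in values]


# Made-up expression values for one gene in four tissues
print(log2_ratios([1.0, 2.0, 4.0, 1.0]))
```

A value of 0 then means "equal to the mean across tissues", while +1 and -1 mean twofold above and twofold below it.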
Open Cluster 3.0, a clustering application that works on all operating systems. It's an enhanced version of the Eisen clustering program. See the manual for more information about the program.
File > Open and select your file of expression data (one of the files in Part V.1).
Note that there are some filtering and normalization functions on the tabs "Filter Data" and "Adjust Data", but we've already performed these steps.
Try Hierarchical clustering using the default settings.

Go to the "Hierarchical" tab and check "Cluster" under Genes.
Click on "Centroid Linkage" (or "Average Linkage") to use a clustering algorithm that is not sensitive to outliers.
When clustering is complete, this is indicated at the bottom of the window.
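
To make the idea behind average linkage concrete, here is a minimal sketch of agglomerative clustering in Python. It is not Cluster 3.0's implementation: it assumes Euclidean distance (Cluster 3.0 offers several similarity metrics), the function names are hypothetical, and the profiles are made up. At each step the two clusters with the smallest average pairwise distance between their members are merged.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Return the merge steps as (members_a, members_b, distance) tuples."""
    clusters = [(i,) for i in range(len(profiles))]  # start with one gene per cluster
    merges = []

    def dist(c1, c2):
        # Average of all pairwise distances between members of the two clusters
        total = sum(euclidean(profiles[i], profiles[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Made-up profiles for four genes measured in two tissues
profiles = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 7.0]]
for members_a, members_b, distance in average_linkage(profiles):
    print(members_a, members_b, round(distance, 3))
```

The sequence of merges is the information that a tree file encodes: which nodes were joined, and at what distance.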
Cluster 3.0 generates several files during clustering:

The .cdt file (containing the re-ordered expression data) will be read by JavaTreeView.
For hierarchical clustering, .gtr and .atr files describe the structure of the gene and/or array trees.
For k-means clustering, the .kgg file lists the genes in each of the clusters.

Note the new column GWEIGHT (for gene weight) and the new row EWEIGHT (for experiment weight).
You may modify these weights for future clustering (to give more weight, for example, to certain arrays).

Open and view your initial (pre-clustered) text file.
Open and view your final (clustered) file (with a .cdt extension).
Try selecting a region of the data to get a more detailed view.
Try Settings > Pixel Settings and adjust the contrast to get the most informative view for your data.
Note that if you used expression values (rather than ratios), you'll only see two colors and those between them.
Try clustering across Genes and Arrays (tissues) to analyze tissue relatedness.
If you wish to use the web link feature, go to Settings > Url Settings and use the link https://www.affymetrix.com/analysis/netaffx/fullrecord.affx?pk=HG-U95AV2:HEADER (so probeset 32615_at is linked to the Affymetrix NetAffx page https://www.affymetrix.com/analysis/netaffx/fullrecord.affx?pk=HG-U95AV2:32615_at). This Affymetrix site requires free registration but provides a lot of good data.
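
The substitution that the web link feature performs can be illustrated in a few lines of Python: the placeholder HEADER in the URL template is replaced by the probe set ID of the selected gene. The function name probe_url is hypothetical; the template and the example ID are the ones given above.

```python
TEMPLATE = "https://www.affymetrix.com/analysis/netaffx/fullrecord.affx?pk=HG-U95AV2:HEADER"


def probe_url(probe_id, template=TEMPLATE):
    """Fill the HEADER placeholder of the URL template with a probe set ID."""
    return template.replace("HEADER", probe_id)


print(probe_url("32615_at"))
```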

Follow the same steps as you did with hierarchical clustering above, but after opening the file, go to the k-Means tab, check "Organize genes" and click on Execute.
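
For comparison with hierarchical clustering, the basic k-means loop can be sketched as below. This is a simplified illustration and not Cluster 3.0's implementation: it assumes squared Euclidean distance and takes the starting centroids as an explicit argument so that the result is reproducible, whereas real k-means runs typically start from random assignments. The function name and the data are made up.

```python
def kmeans(points, centroids, iterations=20):
    """Assign points to the nearest centroid, then recompute centroids; repeat."""
    centroids = [list(c) for c in centroids]
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            # Index of the centroid with the smallest squared distance to p
            nearest = min(
                range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centroids[k])),
            )
            groups[nearest].append(p)
        # New centroid = column-wise mean of the group (unchanged if the group is empty)
        new_centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else list(c)
            for g, c in zip(groups, centroids)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, groups


# Made-up profiles for four genes in two tissues, split into k = 2 clusters
points = [[1.0, 1.0], [1.0, 2.0], [8.0, 8.0], [9.0, 8.0]]
centroids, groups = kmeans(points, [[0.0, 0.0], [10.0, 10.0]])
print(centroids)
print(groups)
```

Unlike the hierarchical tree, the output is just a list of genes per cluster, which is the kind of information the .kgg file records.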

Optional: While in JavaTreeView, try Export > Export to Postscript and save all or part of your figure. This will produce an image of optimal resolution. Otherwise, you may wish to export to GIF or bitmap (which are easier to handle in Photoshop, but lower resolution).
Optional: Open the heatmap in Illustrator or Photoshop.