# Hands-on exercises
# Lines starting with "#" are comments; they explain the commands or show the output of the commands.

# STEP 0: Make a work directory and copy input files into it
# a. In your browser, open the page: http://barc.wi.mit.edu/hot_topics/
#    You can copy and paste the commands from that page as we need them.
# b. Log into tak. See handouts.
# c. Go to the BaRC training folder:
cd /nfs/BaRC_training
# Create a folder with your login name using the mkdir command:
mkdir your_login_name
# [Note: Replace your_login_name with your tak login name]
# Go to the directory that you just created:
cd your_login_name
# Check where you are:
pwd
# Copy the input files and the commands file:
cp /nfs/BaRC_Public/Hot_Topics/ChIPseq_2018/inputs/* .
cp /nfs/BaRC_Public/Hot_Topics/ChIPseq_2018/ChIPseq_2018.commands.txt .

# Quality control and read mapping
# Refer to the previous Hot_Topics: http://barc.wi.mit.edu/education/hot_topics/ChIPseq_2017/ChIPseq_2017.commands.txt

# HANDS-ON 1
# Strand cross-correlation analysis: look at the profile of the mapped reads
# We are going to use a different set of bam files:
#   Brca1_chr11.bam
#   control_chr11.bam
#############
bsub run_spp.R -c=Brca1_chr11.bam -savp -out=Brca1_chr11_run_spp.out
bsub run_spp.R -c=control_chr11.bam -savp -out=control_chr11_run_spp.out

# HANDS-ON 2
# PEAK CALLING: MACS
# Input files:
#   Brca1_chr11.bam
#   control_chr11.bam
##############
bsub macs2 callpeak -t Brca1_chr11.bam -c control_chr11.bam --name Brca1_chr11 -f BAM -g hs --nomodel -B --extsize 135

# HANDS-ON 3
# View peaks with the IGV genome browser
#################
# Make a bedgraph file to be visualized in IGV
# bedGraph format has four columns of data:
#   chrom chromStart chromEnd dataValue
# The last column, dataValue, sets the signal of the region to be displayed in a genome browser.
# The command below first filters out the description lines starting with "#", then uses fold_enrichment (8th column) as the dataValue.
grep -v "#" Brca1_chr11_peaks.xls | grep -v start | tail --lines=+2 | cut -f1-3,8 > Brca1_chr11.bedgraph

# Convert the .bdg files to bigwig (.bw) so they are easier to visualize in IGV
bedGraphToBigWig Brca1_chr11_treat_pileup.bdg /nfs/genomes/human_gp_feb_09_no_random/anno/chromInfo.txt Brca1_chr11_treat_pileup.bw
bedGraphToBigWig Brca1_chr11_control_lambda.bdg /nfs/genomes/human_gp_feb_09_no_random/anno/chromInfo.txt Brca1_chr11_control_lambda.bw

# View peaks with the IGV genome browser.
# If you haven't installed IGV, you can download it for free from the Broad website (http://software.broadinstitute.org/software/igv/).
# Choose the "Human hg19" genome build
# Load files with "File" -> "Load From File"; choose the following files:
#   bam files: Brca1_chr11.bam and control_chr11.bam
#   peak file: Brca1_chr11.bedgraph
#   peak summit file: Brca1_chr11_summits.bed
#   wiggle files: Brca1_chr11_treat_pileup.bw and Brca1_chr11_control_lambda.bw
# Type "FADD" in the textbox
# Notice that the maximum data ranges for the two wiggle files are different.
# You can adjust the maximum level by right-clicking the track and selecting "Set Data Range"
# In this case, because the control bam file has ~2 million reads while Brca1 has ~1 million, the upper limit for control should be twice that for Brca1. You can set 200 for control and 100 for Brca1.

# HANDS-ON 4:
# Identify the transcription factors that regulate a gene in a certain cell type
##############
# Which transcription factors regulate the Synaptophysin (SYP) gene in human brain?
# Go to the ENCODE site: https://www.encodeproject.org/
# -> Click on "Matrix" under "Data" in the top panel
# -> Type "brain" in the textbox under "Experiment Matrix"
# -> Narrow down with the left panels:
#      under "Organism" select "Homo sapiens";
#      under "Organ" select "brain";
#      under "Genome assembly(visualization)" select the current genome build, "GRCh38"
# -> In the main (right) panel:
#      under "Assay" choose "ChIP-seq"
#      under "Target of assay" choose "transcription factor"
# After you have narrowed down the samples, click the "Visualize" button
# -> Choose the "UCSC" browser to view the tracks
# At the top of the UCSC Genome Browser, type "SYP" and click on "go".
# -> Click on the "ENCODE ChIP-seq" track; under "Select view" set "Optimal IDR thresholded peaks" to pack and hide the others.
# -> The GENCODE or RefSeq track shows that SYP is on the negative strand. Zoom out with "3x" next to "Zoom out" in the top panel, and highlight the promoter region around the TSS. For easier visualization, you can also drag the GENCODE or RefSeq gene track up or down to bring it close to the transcription factor tracks.
# -> The browser suggests that 10 TFs bind to this region, most of them in one cell line. There is strong evidence that REST and TAF1 regulate SYP because they bind to the promoter in both the SK-N-SH and PFSK-1 cell lines.
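
# Optional sketch (not part of the original exercise): the promoter check above is done
# by eye in the UCSC browser. If you had downloaded peak files in BED format, the same
# "which factors have a peak in the promoter window" question could be asked on the
# command line. Everything below is an assumption for illustration: the file names,
# the coordinates, and the window size are hypothetical toy values, not real ENCODE
# data and not the real SYP locus.

```shell
# Toy data only: hypothetical peak files and coordinates.
workdir=$(mktemp -d)

# BED is tab-delimited: chrom, 0-based start, end, name
printf 'chrX\t1000\t1400\tREST_peak1\nchrX\t9000\t9300\tREST_peak2\n' > "$workdir/REST.bed"
printf 'chrX\t1250\t1600\tTAF1_peak1\n' > "$workdir/TAF1.bed"
printf 'chrX\t5000\t5200\tOTHER_peak1\n' > "$workdir/OTHER.bed"

# For a gene on the negative strand, the TSS is the gene's largest coordinate.
tss=1300     # hypothetical TSS position
flank=500    # hypothetical distance to look on each side of the TSS
win_start=$((tss - flank))
win_end=$((tss + flank))

# Print the name of every factor with at least one peak overlapping the window.
# Two half-open intervals overlap when start < win_end and end > win_start.
promoter_factors() {
    for bed in "$workdir"/*.bed; do
        if awk -F'\t' -v c="chrX" -v s="$win_start" -v e="$win_end" \
            '$1 == c && $2 + 0 < e + 0 && $3 + 0 > s + 0 { found = 1 } END { exit !found }' "$bed"; then
            basename "$bed" .bed
        fi
    done
}

promoter_factors
```

# With the toy files above, only the factors whose peaks fall inside the 800-1800 window
# are listed. For real data, replace the toy files with your own BED files and set the
# chromosome, TSS, and flank to values read from the genome browser.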