Skip to content
Snippets Groups Projects
Commit a8013a64 authored by Laura Cook's avatar Laura Cook
Browse files

edit input/output folders to run peak calling pipeline on subsampled BAM files

parent 835656b1
No related branches found
No related tags found
No related merge requests found
...@@ -551,38 +551,21 @@ Overlap replicate peaks for H3K27ac. ...@@ -551,38 +551,21 @@ Overlap replicate peaks for H3K27ac.
# 10. Consensus peaks and grouping into putative enhancers and promoters # 10. Consensus peaks and grouping into putative enhancers and promoters
Separate overlapped replicate peak files:
```
less H3K27ac_overlap.narrowPeak| grep "A-3" > H3K27ac_overlap_repA.narrowPeak
less H3K27ac_overlap.narrowPeak| grep "B-3" > H3K27ac_overlap_repB.narrowPeak
less H3K4me3_overlap.narrowPeak| grep "A-2" > H3K4me3_overlap_repA.narrowPeak
less H3K4me3_overlap.narrowPeak| grep "B-2" > H3K4me3_overlap_repB.narrowPeak
```
Keep only the intersected region:
```
bedtools intersect -a H3K27ac_overlap_repA.narrowPeak -b H3K27ac_overlap_repB.narrowPeak > H3K27ac_consensus.narrowPeak
bedtools intersect -a H3K4me3_overlap_repA.narrowPeak -b H3K4me3_overlap_repB.narrowPeak > H3K4me3_consensus.narrowPeak
```
Find uniquely H3K4me3 sites (i.e. peaks that don't overlap with H3K27ac): Find uniquely H3K4me3 sites (i.e. peaks that don't overlap with H3K27ac):
``` ```
bedtools intersect -v -a macs2/H3K4me3_consensus.narrowPeak -b macs2/H3K27ac_consensus.narrowPeak > H3K4me3_only.narrowPeak bedtools intersect -v -a H3K4me3_overlap.narrowPeak -b H3K27ac_overlap.narrowPeak > H3K4me3_only.narrowPeak
``` ```
Find unique H3K27ac sites (i.e. peaks that don't overlap with H3K4me3): Find unique H3K27ac sites (i.e. peaks that don't overlap with H3K4me3):
``` ```
bedtools intersect -v macs2/H3K27ac_consensus.narrowPeak -b macs2/H3K4me3_consensus.narrowPeak > H3K27ac_only.narrowPeak bedtools intersect -v -a H3K27ac_overlap.narrowPeak -b H3K4me3_overlap.narrowPeak > H3K27ac_only.narrowPeak
``` ```
Find peaks common between H3K27ac & H3K4me3 with a reciprocal overlap of at least 50%. Find peaks common between H3K27ac & H3K4me3 with a reciprocal overlap of at least 50%.
``` ```
bedtools intersect -f 0.5 -r -a macs2/H3K4me3_consensus.narrowPeak -b macs2/H3K27ac_consensus.narrowPeak > H3K4me3_and_H3K27ac.narrowPeak bedtools intersect -f 0.5 -r -a H3K4me3_overlap.narrowPeak -b H3K27ac_overlap.narrowPeak > H3K4me3_and_H3K27ac.narrowPeak
``` ```
# 11. Find overlap with TWARs # 11. Find overlap with TWARs
...@@ -598,38 +581,38 @@ blastn -task blastn -num_threads 4 -db dunnart_pseudochr_vs_mSarHar1.11_v1.fasta ...@@ -598,38 +581,38 @@ blastn -task blastn -num_threads 4 -db dunnart_pseudochr_vs_mSarHar1.11_v1.fasta
## Intersect with H3K4me3 and H3K27ac ## Intersect with H3K4me3 and H3K27ac
``` ```
intersectBed -a H3K4me3_consensus.narrowPeak -b dunnart_TWARs.bed > twar_H3K4me3_overlap.bed intersectBed -wb -a H3K4me3_overlap.narrowPeak -b dunnart_TWARs.bed > twar_H3K4me3_overlap.bed
``` ```
``` ```
intersectBed -a H3K27ac_consensus.narrowPeak -b dunnart_TWARs.bed > twar_H3K27ac_overlap.bed intersectBed -a H3K27ac_overlap.narrowPeak -b dunnart_TWARs.bed > twar_H3K27ac_overlap.bed
``` ```
```
intersectBed -a H3K4me3_only.narrowPeak -b dunnart_TWARs.bed > twar_H3K4me3_only_overlap.bed
```
# 12. HOMER annotate peaks ```
intersectBed -a H3K27ac_only.narrowPeak -b dunnart_TWARs.bed > twar_H3K27ac_only_overlap.bed
```
```
intersectBed -a H3K4me3_and_H3K27ac.narrowPeak -b dunnart_TWARs.bed > twar_H3K27ac_and_H3K4me3_overlap.bed
```
### How Basic Annotation Works
The process of annotating peaks/regions is divided into two primary parts. The first determines the distance to the nearest TSS and assigns the peak to that gene. The second determines the genomic annotation of the region occupied by the center of the peak/region.
### Distance to the nearest TSS # 12. ChIPseeker annotate peaks
By default, `annotatePeaks.pl` loads a file in the "/path-to-homer/data/genomes/<genome>/<genome>.tss" that contains the positions of RefSeq transcription start sites. It uses these positions to determine the closest TSS, reporting the distance (negative values mean upstream of the TSS, positive values mean downstream), and various annotation information linked to locus including alternative identifiers (unigene, entrez gene, ensembl, gene symbol etc.). This information is also used to link gene-specific information (see below) to a peak/region, such as gene expression.
### Genomic Annotation
To annotate the location of a given peak in terms of important genomic features, `annotatePeaks.pl` calls a separate program (assignGenomeAnnotation) to efficiently assign peaks to one of millions of possible annotations genome wide. Two types of output are provided. The first is "Basic Annotation" that includes whether a peak is in the TSS (transcription start site), TTS (transcription termination site), Exon (Coding), 5' UTR Exon, 3' UTR Exon, Intronic, or Intergenic, which are common annotations that many researchers are interested in. A second round of "Detailed Annotation" also includes more detailed annotation, also considering repeat elements and CpG islands. Since some annotation overlap, a priority is assign based on the following (in case of ties it's random [i.e. if there are two overlapping repeat element annotations]):
* TSS (by default defined from -1kb to +100bp) # Calling peaks after normalising to 10M reads
* TTS (by default defined from -100 bp to +1kb)
* CDS Exons
* 5' UTR Exons
* 3' UTR Exons
* CpG Islands
* Repeats
* Introns
* Intergenic
Although HOMER doesn't allow you to explicitly change the definition of the region that is the TSS (-1kb to +100bp), you can "do it yourself" by sorting the annotation output in EXCEL by the "Distance to nearest TSS" column, and selecting those within the range you are interested in. This is so we can compare to mouse peaks.
Based on https://davemcg.github.io/post/easy-bam-downsampling/ script for doing this.
```
bash subsample.sh
```
# 13. ChIPseeker annotate peaks Run script in the directory with the files you want to subsample.
This diff is collapsed.
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment