Normalised to reads per genomic content (RPGC), i.e. scaled to 1x average coverage
Produces a coverage file
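As a rough illustration of what 1x normalisation means, the sketch below computes the factor that rescales raw coverage to an average of 1x. It is a minimal sketch and not deepTools code: the function name and the example numbers are hypothetical, and it assumes that the current coverage is (mapped reads × fragment length) / effective genome size.

```python
def rpgc_scale_factor(mapped_reads: int, fragment_length: int,
                      effective_genome_size: int) -> float:
    """Return the factor that rescales raw coverage to an average of 1x."""
    # Average coverage of the sample before normalisation.
    current_coverage = mapped_reads * fragment_length / effective_genome_size
    # Multiplying every bin by this factor gives an average coverage of 1x.
    return 1.0 / current_coverage


# Hypothetical values, chosen only for illustration.
factor = rpgc_scale_factor(
    mapped_reads=20_000_000,
    fragment_length=200,
    effective_genome_size=2_000_000_000,
)
print(factor)
```

With these made-up numbers the sample has 2x coverage before normalisation, so every bin is halved.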
...
@@ -111,17 +71,7 @@ The bigWig format is an indexed binary format useful for dense, continuous data
- `smoothLength`: defines a window, larger than the `binSize`, over which the number of reads is averaged. This produces a smoother, more continuous plot.
- `centerReads`: reads are centered with respect to the fragment length as specified by `extendReads`. This option is useful for getting a sharper signal around enriched regions.
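To make the effect of `smoothLength` concrete, here is a minimal sketch of averaging each bin with its neighbours. It is one plausible reading of the option and not the deepTools implementation: `smooth_bins` is a hypothetical helper, and it assumes the window is centred on each bin and truncated at the ends.

```python
def smooth_bins(counts, bin_size, smooth_length):
    """Average each bin over a centred window of roughly smooth_length bases."""
    # Number of bins covered by the smoothing window (at least one).
    n_bins = max(1, smooth_length // bin_size)
    half = n_bins // 2
    smoothed = []
    for i in range(len(counts)):
        # Truncate the window at the ends of the region.
        lo = max(0, i - half)
        hi = min(len(counts), i + half + 1)
        window = counts[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical read counts for five 10 bp bins with one isolated spike.
raw = [0, 0, 9, 0, 0]
smoothed = smooth_bins(raw, bin_size=10, smooth_length=30)
print(smoothed)
```

The isolated spike is spread over the three bins around it, which is why the resulting track looks more continuous.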
# phantomPeakQuals
### rule deeptools_fingerprint:
### rule deeptools_plotCoverage:
### rule deeptools_bamPEFragmentSize:
# 6. phantomPeakQuals
Information from: https://docs.google.com/document/d/1lG_Rd7fnYgRpSIqrIfuVlAz2dW1VaSQThzk836Db99c/edit
...
@@ -154,10 +104,8 @@ RSC; RSC>0.8 (0 = no signal; <1 low quality ChIP; >1 high enrichment
Quality tag based on thresholded RSC (codes: -2: veryLow, -1: Low, 0: Medium, 1: High, 2: veryHigh)
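The sketch below shows how NSC, RSC and the quality tag relate to the three cross-correlation values (at the fragment length, at the read-length "phantom" peak, and the minimum). The function names and example values are hypothetical. The RSC cut-offs (0.25, 0.5, 1.0, 1.5) are my understanding of what phantompeakqualtools uses and should be treated as an assumption to check against the installed version.

```python
def nsc(cc_fragment: float, cc_min: float) -> float:
    """Normalised strand cross-correlation coefficient."""
    return cc_fragment / cc_min


def rsc(cc_fragment: float, cc_phantom: float, cc_min: float) -> float:
    """Relative strand cross-correlation coefficient."""
    return (cc_fragment - cc_min) / (cc_phantom - cc_min)


def quality_tag(rsc_value: float) -> int:
    """Map RSC to a tag from -2 (veryLow) to 2 (veryHigh)."""
    # Thresholds are an assumption, see the note above.
    if rsc_value < 0.25:
        return -2
    if rsc_value < 0.5:
        return -1
    if rsc_value < 1.0:
        return 0
    if rsc_value < 1.5:
        return 1
    return 2


# Hypothetical cross-correlation values, for illustration only.
example_rsc = rsc(cc_fragment=0.5, cc_phantom=0.375, cc_min=0.25)
example_tag = quality_tag(example_rsc)
print(example_rsc, example_tag)
```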
### rule phantomPeakQuals:
# 7. Call peaks (MACS2)
# Call peaks (MACS2)
__Input file options__
...
@@ -196,34 +144,7 @@ I've left all the shifting model and peak calling arguments as default
- `peaks.xls`: a tabular file containing information about the called peaks, including pileup and fold enrichment.
- `summits.bed`: the summit location of every peak. This file is recommended for finding motifs at the binding sites.
# Create consensus peaksets for replicates
## Compare peaks to ENCODE peaks
With p < 0.01 I call only ~85,000 peaks, whereas ENCODE calls ~125,000.
Compare p-values, q-values and peak lengths between the two lists.
Average Peak Length:
For each sample, plot the ENCODE p-values on the x-axis and the p-values of my calls on the y-axis.
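A minimal sketch of how the comparison values could be pulled out of two narrowPeak files. It assumes the standard narrowPeak column order, where column 8 is -log10(p-value) and column 9 is -log10(q-value). The helper names and the two example records are made up for illustration.

```python
def read_narrowpeak(lines):
    """Parse narrowPeak lines into dicts with coordinates and significance."""
    peaks = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        peaks.append({
            "chrom": fields[0],
            "start": int(fields[1]),
            "end": int(fields[2]),
            "neg_log10_p": float(fields[7]),  # column 8
            "neg_log10_q": float(fields[8]),  # column 9
        })
    return peaks


def mean_peak_length(peaks):
    """Average peak length in base pairs."""
    return sum(p["end"] - p["start"] for p in peaks) / len(peaks)


# Made-up records standing in for one of the two peak lists.
example_lines = [
    "chr1\t100\t300\tpeak1\t50\t.\t5.0\t4.2\t2.1\t80",
    "chr1\t1000\t1400\tpeak2\t90\t.\t8.0\t7.5\t5.0\t150",
]
example_peaks = read_narrowpeak(example_lines)
print(mean_peak_length(example_peaks))
print([p["neg_log10_p"] for p in example_peaks])
```

Running the same functions on the ENCODE list and on my calls gives the average peak lengths and the p-value vectors needed for the scatter plot. Matching peaks between the two lists is a separate step not shown here.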
# 8. Peak QC
### rule get_narrow_peak_counts_for_multiqc:
### rule bamToBed:
Converts the BAM file to a tagAlign file, which is used to calculate the FRiP QC metric (fraction of reads in peaks).
### rule frip:
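FRiP is the number of reads overlapping a called peak divided by the total number of reads. The sketch below illustrates the calculation on in-memory intervals. It is not the rule's actual command: `frip` is a hypothetical helper, it counts any overlap of at least 1 bp, and it uses half-open BED-style coordinates.

```python
def frip(reads, peaks):
    """Fraction of reads overlapping at least one peak.

    reads and peaks are lists of (chrom, start, end) tuples.
    """
    # Group peaks by chromosome so each read is only compared to relevant peaks.
    peaks_by_chrom = {}
    for chrom, start, end in peaks:
        peaks_by_chrom.setdefault(chrom, []).append((start, end))

    reads_in_peaks = 0
    for chrom, start, end in reads:
        # Two half-open intervals overlap if each starts before the other ends.
        if any(start < p_end and p_start < end
               for p_start, p_end in peaks_by_chrom.get(chrom, [])):
            reads_in_peaks += 1
    return reads_in_peaks / len(reads)


# Made-up intervals, for illustration only.
example_reads = [
    ("chr1", 100, 150),
    ("chr1", 500, 550),
    ("chr2", 100, 150),
    ("chr1", 290, 340),
]
example_peak_set = [("chr1", 120, 300)]
example_frip = frip(example_reads, example_peak_set)
print(example_frip)
```

The linear scan over peaks is only suitable for small examples. For real data an interval tool such as bedtools is the practical choice.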
# 9. Create consensus peaksets for replicates
Edited version of ENCODE `overlap_peaks.py` - recommended for histone marks.
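A minimal sketch of the overlap idea, under stated assumptions: a pooled peak is kept if it overlaps a peak in each replicate, where the overlap must cover at least 50% of either peak. That 50% criterion is my understanding of the original ENCODE script, and the edited version used here may differ. All names and intervals below are hypothetical.

```python
def overlaps_enough(a, b, min_frac=0.5):
    """True if a and b overlap by at least min_frac of either peak's length."""
    if a[0] != b[0]:
        return False  # different chromosomes
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return False
    return (overlap / (a[2] - a[1]) >= min_frac
            or overlap / (b[2] - b[1]) >= min_frac)


def consensus_peaks(pooled, rep1, rep2):
    """Keep pooled peaks supported by both replicates."""
    return [
        peak for peak in pooled
        if any(overlaps_enough(peak, r) for r in rep1)
        and any(overlaps_enough(peak, r) for r in rep2)
    ]


# Made-up peaks as (chrom, start, end), for illustration only.
pooled = [("chr1", 100, 200), ("chr1", 500, 600)]
rep1 = [("chr1", 120, 220), ("chr1", 590, 700)]
rep2 = [("chr1", 90, 160)]
consensus = consensus_peaks(pooled, rep1, rep2)
print(consensus)
```

The second pooled peak is dropped because its 10 bp overlap with the replicate 1 peak is below 50% of both peaks, and it has no counterpart in replicate 2.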