From 4fec3164c0b8c29cf2905d78e45063ed55917a02 Mon Sep 17 00:00:00 2001
From: Laura Cook <l.cook2@student.unimelb.edu.au>
Date: Fri, 24 Jul 2020 17:07:19 +1000
Subject: [PATCH] added section on phantomPeakquals

---
 dunnart/README.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/dunnart/README.md b/dunnart/README.md
index 7d10509..bbfdd6c 100644
--- a/dunnart/README.md
+++ b/dunnart/README.md
@@ -371,6 +371,14 @@ ChIP-seq Standards:

 # 6. phantomPeakQuals

+Information from: https://docs.google.com/document/d/1lG_Rd7fnYgRpSIqrIfuVlAz2dW1VaSQThzk836Db99c/edit
+
+This set of programs operates on mapped Illumina single-end read datasets in tagAlign or BAM format. Because my data is paired-end, I need to use only the forward read.
+
+A high-quality ChIP-seq experiment will produce significant clustering of enriched DNA sequence tags/reads at locations bound by the protein of interest; the expectation is that we can observe a bimodal enrichment of reads (sequence tags) on both the forward and the reverse strands.
+
+Cross-correlation analysis is done on a filtered (but not deduplicated) and subsampled BAM. The fastq is trimmed specially for this analysis: the Read1 fastq is first trimmed to 50 bp using trimfastq.py (last modified 2017/11/08, https://github.com/ENCODE-DCC/chip-seq-pipeline2/blob/master/src/trimfastq.py) and then mapped separately as single-end (SE). Reads are filtered but duplicates are not removed. Finally, 15 million reads are randomly sampled and used for cross-correlation analysis.
+
 ### rule phantomPeakQuals:

-- 
GitLab
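
The strand cross-correlation idea described in the added README text can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not how phantompeakqualtools computes its metrics: the function names (`pearson`, `strand_cross_correlation`), the single toy chromosome, and the read positions are all hypothetical. The inputs are the 5' end positions of forward-strand and reverse-strand reads, and the shift with the highest correlation is taken as a rough estimate of fragment length.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length count vectors (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def strand_cross_correlation(fwd_ends, rev_ends, chrom_len, max_shift):
    """Correlate forward-strand counts at x with reverse-strand counts at x + shift.

    fwd_ends / rev_ends are 0-based 5' end positions on one chromosome
    (hypothetical inputs; a real pipeline would read them from a BAM/tagAlign).
    Returns a list cc where cc[s] is the correlation at shift s.
    """
    fwd = [0] * chrom_len
    rev = [0] * chrom_len
    for p in fwd_ends:
        fwd[p] += 1
    for p in rev_ends:
        rev[p] += 1
    return [pearson(fwd[: chrom_len - s], rev[s:]) for s in range(max_shift + 1)]


# Toy data: five binding sites, idealised 150 bp fragments centred on each site.
sites = [500, 1200, 2100, 3300, 4400]
fragment_length = 150
fwd_ends = [s - fragment_length // 2 for s in sites]
rev_ends = [s + fragment_length // 2 for s in sites]

cc = strand_cross_correlation(fwd_ends, rev_ends, chrom_len=5000, max_shift=300)
estimated_fragment_length = max(range(len(cc)), key=cc.__getitem__)
print(estimated_fragment_length)
```

In real data the curve typically also shows a smaller "phantom" peak near the read length, which is why the comparison between the two peaks is informative about enrichment quality.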