From 4fec3164c0b8c29cf2905d78e45063ed55917a02 Mon Sep 17 00:00:00 2001
From: Laura Cook <l.cook2@student.unimelb.edu.au>
Date: Fri, 24 Jul 2020 17:07:19 +1000
Subject: [PATCH] added section on phantomPeakquals

---
 dunnart/README.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/dunnart/README.md b/dunnart/README.md
index 7d10509..bbfdd6c 100644
--- a/dunnart/README.md
+++ b/dunnart/README.md
@@ -371,6 +371,14 @@ ChIP-seq Standards:
 
 #  6. phantomPeakQuals
 
+Information from: https://docs.google.com/document/d/1lG_Rd7fnYgRpSIqrIfuVlAz2dW1VaSQThzk836Db99c/edit
+
+This set of programs operates on mapped Illumina single-end read datasets in tagAlign or BAM format. Because my data is paired-end, I need to use only the forward read (read 1).
+
+A high-quality ChIP-seq experiment will produce significant clustering of enriched DNA sequence tags/reads at locations bound by the protein of interest; the expectation is that we can observe a bimodal enrichment of reads (sequence tags) on both the forward and the reverse strands.
+
+Cross-correlation analysis is run on a filtered (but not de-duplicated) and subsampled BAM, and the reads are trimmed specially for it. The read 1 fastq is first trimmed to 50 bp using trimfastq.py (last modified 2017/11/08, https://github.com/ENCODE-DCC/chip-seq-pipeline2/blob/master/src/trimfastq.py) and then mapped separately as single-end (SE). The mapped reads are filtered, but duplicates are not removed. Finally, 15 million reads are randomly sampled and used for the cross-correlation analysis.
+
 
 ### rule phantomPeakQuals:
 
-- 
GitLab