I have a question. I am working on ChIP-seq data from tumors that carry a viral infection. We performed the IP with a human-specific antibody for our gene of interest. The single-end reads from the HiSeq were aligned to the human genome using BWA, which worked fine and gave me some probable binding sites after peak calling. Now I am trying to find out what happened to the viral factors, so I aligned the reads to the viral genome (around 10K) using Bowtie. Here is an excerpt from the Bowtie SAM file. Only 0.30% of the reads are uniquely mapped.
QNAME FLAG RNAME POS MAPQ CIGAR MRNM MPOS ISIZE SEQ QUAL OPT
@HD VN:1.0 SO:unsorted
@SQ SN:AF148805 LN:137969
@PG ID:Bowtie VN:0.12.7 CL:"bowtie -q -p 8 -S -n 2 -e 70 -l 28 --maxbts 800 -y -k 1 -a --best --phred33-quals /tmp/3006527.cyberstar.psu.edu/tmp5nKzJC/tmpb50_zP /galaxy/main_pool/pool3/files/005/338/dataset_5338393.dat"
HWI-ST550_0201:3:1101:1671:2197#ACAGTG/1 4 * 0 0 * * 0 0 AAAATTCAGGCTCTCTATTTCACAGTTCATTAGTTCATTCGTTTACTGTG CCCFFFFFHHHHHJGIJJJJHIIJJJIGIHIIIJJGIJJJJJJJIJIJII XM:i:0
HWI-ST550_0201:3:1101:1678:2241#ACAGTG/1 4 * 0 0 * * 0 0 AGTGGTGTTTAATATAGTTTTGGGTATTTTTAACTAAAAATCATTGTTAT ?@@B?2AD?D<<CAE4AGHIF9CEG+AFDHID3C?9?CDFC**:?9*B9D XM:i:0
HWI-ST550_0201:3:1101:1626:2216#ACAGTG/1 4 * 0 0 * * 0 0 GTTGCGGGAGAAGCCAAACGCGGCGAGTCTTGCTAAAGCCGTCGCCGTAG BBCFFFFFFHHHF>GGGHCGEHIGGAE=CDFACEEEEDDDBDD;BB57<? XM:i:0
HWI-ST550_0201:3:1101:1580:2218#ACAGTG/1 4 * 0 0 * * 0 0 ACAGAAATGGCATCAAGAGACCTTGATTACAAGGATATGAATCTCTTAAG CCCFFFFFHHGHHIIJJIJJJJJJJDIJJJIIIJIJJJJIJJIJIJJIJI XM:i:0
HWI-ST550_0201:3:1101:1779:2214#ACAGTG/1 4 * 0 0 * * 0 0 CCAATCTCTGCTACAGTTTGTTTCCCTCAATTTCTAATTACTTTAAAAAG CC@FFFFFHHDHDFGHEGIJIIJJJJGIGJJJJJIIJJEIIEHGJIGJJI XM:i:0
_________________________________________________________
How should I select only the reads that map uniquely to the viral genome?
Why do I have so few uniquely mapped reads? Is there any way to increase the unique mapping rate? What is the best strategy for aligning to the viral genome in this case: should I align all reads to the viral genome, or first align to human and then align the unmapped reads to the viral genome? I also tried BWA, which gave around 0.29% unique alignment.
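To make the first question concrete, here is a minimal Python sketch of one way to count mapped and unmapped records in a SAM file like the excerpt above. It relies only on the SAM FLAG field (bit 0x4 means the read is unmapped) and the MAPQ column. The function name `summarize_sam`, the `min_mapq` parameter, and the file name in the usage example are illustrative assumptions. How MAPQ relates to uniqueness depends on the aligner, so the threshold is a placeholder to adjust, not a definition of "unique".

```python
import io


def summarize_sam(handle, min_mapq=0):
    """Count SAM records by mapping status.

    handle   -- any iterable of SAM text lines (open file, io.StringIO, ...)
    min_mapq -- hypothetical cutoff; its meaning depends on the aligner
    """
    counts = {"total": 0, "unmapped": 0, "mapped": 0, "mapped_min_mapq": 0}
    for line in handle:
        # Header lines (@HD, @SQ, @PG, ...) start with '@'.
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A valid alignment record has at least 11 tab-separated columns.
        if len(fields) < 11:
            continue
        flag = int(fields[1])
        mapq = int(fields[4])
        counts["total"] += 1
        if flag & 0x4:
            # FLAG bit 0x4 set: the read has no reported alignment.
            counts["unmapped"] += 1
        else:
            counts["mapped"] += 1
            if mapq >= min_mapq:
                counts["mapped_min_mapq"] += 1
    return counts


if __name__ == "__main__":
    # "viral_alignment.sam" is a placeholder path, not a file from the post.
    with open("viral_alignment.sam") as sam_file:
        print(summarize_sam(sam_file, min_mapq=1))
```

All five records shown in the excerpt have FLAG 4, so this count would classify each of them as unmapped. That matches the `*` in their RNAME and CIGAR columns.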