Hey,
I recently got several ChIP-seq datasets back from our collaborators. On analysing them I was quite surprised to find that the results were poor, given that the antibodies all work in ChIP and these ChIPs in particular definitely worked: they were QC'd as thoroughly as possible, including enrichment at known binding sites.
Anyway, it was suggested that I run the raw sequencing files through FastQC, which I did. I had already noticed a high level of read duplication during the analysis, and sure enough FastQC picked up on this as well. All of the libraries fail miserably on this parameter.
My question is, where does this high level of read duplication come from? Surely it has to be from the PCR amplification in the library prep protocol (I did 18 cycles). Should I expect much better results if I used fewer rounds of PCR, say 12 or 14?
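To make clear what I mean by "duplication level", here is a rough sketch of the kind of number I am talking about: the fraction of reads whose exact sequence has already been seen in the file. This is only an illustration and not FastQC's actual method. The function names and the `max_reads` parameter are made up for the example, and it only counts exact sequence matches, not alignment-based duplicates.

```python
from collections import Counter
from itertools import islice


def read_sequences(fastq_lines):
    """Yield the sequence line (2nd of every 4 lines) from FASTQ text lines."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip()


def duplication_fraction(sequences, max_reads=None):
    """Return the fraction of reads that repeat an already-seen sequence.

    max_reads (hypothetical parameter) limits how many reads are examined;
    None means use all of them.
    """
    counts = Counter(islice(sequences, max_reads))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Each distinct sequence counts once; every further copy is a duplicate.
    return 1.0 - len(counts) / total


# Tiny in-memory FASTQ example: 4 reads, 2 distinct sequences.
example = [
    "@r1", "ACGT", "+", "IIII",
    "@r2", "ACGT", "+", "IIII",
    "@r3", "TTTT", "+", "IIII",
    "@r4", "ACGT", "+", "IIII",
]
fraction = duplication_fraction(read_sequences(example))
```

On a real file the same function could be fed the lines of an opened FASTQ file instead of the in-memory list.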
Thanks
Optimus