Hi everyone
I'm a graduate student who has just started doing some NGS for my thesis project.
Most of the problems I've had so far I could solve by searching here on SEQanswers, but I think I now have a situation where I need some help.
I did a 2x150 PE HiSeq run, pooling 3 different populations of Drosophila. I did a reference-based assembly with bwa (yada yada), and in the end I got pretty good coverage: for chromosome 2L at least, the average was about 70X.
This is really good, but I think it's a little overkill for me. Running the FASTQ files through FastQC showed a duplication level of around 25% for the library, so I'm starting to think that I'm not really "learning" anything new from the extra depth and that much of the sequencing is being wasted.
I'm on a very limited budget, and my dilemma is whether I can pool more samples (maybe 4 or even 5) in a sequencing run so that I can sequence more populations.
With this in mind, I'm trying to mimic a situation where I had initially pooled 4 or 5 populations by decreasing the number of reads in my current FASTQ files.
So that was a long way of asking: how can I randomly delete a significant proportion of paired reads from my initial FASTQ files?
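To make the question concrete, here is a minimal Python sketch of the kind of subsampling I have in mind. It is only an illustration under some assumptions: the two FASTQ files list the mates in the same order, every record is exactly 4 lines (no wrapped sequences), and the same total yield would be split evenly across the pool, so going from 3 to 4 or 5 populations means keeping roughly 3/4 or 3/5 of the reads (about 52X or 42X instead of 70X on 2L). The function name `subsample_pairs` and the file names are made up for the example.

```python
import gzip
import itertools
import random


def open_fastq(path, mode="rt"):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, mode) if path.endswith(".gz") else open(path, mode)


def read_records(handle):
    """Yield FASTQ records as lists of 4 lines (assumes unwrapped records)."""
    while True:
        record = list(itertools.islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_pairs(r1_in, r2_in, r1_out, r2_out, fraction, seed=0):
    """Keep each read pair with probability `fraction`.

    One random draw is made per pair, so both mates are always kept or
    dropped together and the two output files stay in sync.
    Returns (pairs_kept, pairs_seen).
    """
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    kept = total = 0
    with open_fastq(r1_in) as f1, open_fastq(r2_in) as f2, \
            open_fastq(r1_out, "wt") as o1, open_fastq(r2_out, "wt") as o2:
        pairs = itertools.zip_longest(read_records(f1), read_records(f2))
        for rec1, rec2 in pairs:
            if rec1 is None or rec2 is None:
                raise ValueError("R1 and R2 have different numbers of reads")
            total += 1
            if rng.random() < fraction:
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
    return kept, total


if __name__ == "__main__":
    # Hypothetical file names; 3/5 mimics pooling 5 populations instead of 3.
    kept, total = subsample_pairs(
        "pool_R1.fastq.gz", "pool_R2.fastq.gz",
        "sub_R1.fastq.gz", "sub_R2.fastq.gz",
        fraction=3 / 5, seed=42,
    )
    print(f"kept {kept} of {total} pairs")
```

Because each pair is kept independently, the number of pairs retained is only approximately `fraction` of the total; if an exact count is needed, a different scheme (for example reservoir sampling) would be required.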
Thanks again for reading this far!