Hi,
I have an Illumina MiSeq data set (32GB size genome, 300bp reads). Read quality degrades towards the 3' end in both R1 & R2, more so in R2. I want to align the reads to the reference using BWA-MEM and then proceed to variant calling with the GATK pipeline.
I decided to quality-trim the poor quality bases. I used Trimmomatic with a sliding window of size 5 and average quality 20, and discarded reads shorter than 70bp. Are these parameters too stringent?
FastQC reports for the raw and trimmed reads are attached.
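To make clear what I mean by these settings, here is a rough Python sketch of how I understand the sliding-window trim followed by the length filter. It is a simplified illustration only, not Trimmomatic's actual code: the function names `sliding_window_trim` and `trim_read` are my own, and dropping reads shorter than the window is an assumption.

```python
def sliding_window_trim(quals, window=5, min_avg=20):
    """Return how many leading bases to keep.

    Scans from the 5' end and cuts at the start of the first window
    whose average quality falls below min_avg.
    """
    if len(quals) < window:
        return 0  # assumption: reads shorter than the window are dropped
    for start in range(len(quals) - window + 1):
        # Compare sums instead of averages to avoid float division.
        if sum(quals[start:start + window]) < window * min_avg:
            return start
    return len(quals)


def trim_read(seq, quals, window=5, min_avg=20, min_len=70):
    """Trim one read; return None if it ends up shorter than min_len."""
    keep = sliding_window_trim(quals, window, min_avg)
    if keep < min_len:
        return None
    return seq[:keep], quals[:keep]


# Hypothetical read: 100 good bases (Q30) followed by 20 poor ones (Q10).
example_quals = [30] * 100 + [10] * 20
example_seq = "A" * 120
trimmed = trim_read(example_seq, example_quals)
```

With these made-up qualities the read is cut once the window starts to overlap too many Q10 bases, and it survives the 70bp filter; a read whose good stretch is shorter than 70bp would be discarded entirely.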
After trimming, Trimmomatic retained 82% of the reads as pairs for both R1 & R2. The unpaired outputs were 8% and 1% for R1 & R2 respectively. In this case, is it OK to disregard the unpaired sets in the mapping step?
Based on my raw data, is it advisable to skip trimming and move straight on to mapping?
How can I verify that my mapping is satisfactory? Would you recommend any tool to check mapping quality?
I would appreciate comments on these issues.
Thanks
Best Regards
Rangika