Hello everybody,
this is my first attempt at trimming Illumina (paired-end) reads on the Unix command line.
If I understand correctly, demultiplexing was already done by the sequencing service.
I guess this also means that the adapters have already been removed.
What I want to do is trim the reads by quality.
I checked the data with FastQC and want to get rid of reads with a quality below 20.
I tried trim_galore with
trim_galore ../name1.fastq -q 20 --paired > trim_name.fastq
which gives me:
"No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default)
Please provide an even number of input files for paired-end FastQ trimming! Aborting ..." <- I understand this line
But I don't really know how to find out how my data are encoded.
The data look like this:
@NS500339:99:H3H52AFXX:1:11101:5599:1027 2:N:0:GTGAAA
NNNCTGGTTATAGGTACATTGAGCAACTGACTGAAATGCCTCAAAATGTTCTTTACGATGCCATTGGGATATATCAACGGTGGTATATCCAGTGATTTTTTTCTCCATTTTAGCTTCCTTAGCTCCTGAAAATCTCGANANCTCNNAAAA
+
###/AEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEE<6/<EEEEEEEEAAE6EEE/EEEEEEEEEEEAAEEEEE/6EE<AEEEEAEA<EEE/EEE/AAAE#<#<A/##/<<6
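To make the encoding question concrete, here is a minimal Python sketch that guesses the quality offset from the range of characters in a quality line. The function name `guess_phred_offset` and the cut-off values are illustrative assumptions, not part of trim_galore or FastQC: characters below `;` (ASCII 59) should not occur in the older +64 encodings, and characters above `J` (ASCII 74) are unusual for +33 data. One read is a very small sample, so the result is only a hint.

```python
# Quality line copied from the read shown above.
QUAL = "###/AEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEE<6/<EEEEEEEEAAE6EEE/EEEEEEEEEEEAAEEEEE/6EE<AEEEEAEA<EEE/EEE/AAAE#<#<A/##/<<6"


def guess_phred_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from the range of quality characters.

    Hypothetical helper for illustration; returns None when the sample is ambiguous.
    """
    codes = [ord(ch) for line in quality_lines for ch in line]
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters below ';' (ASCII 59) should not occur in the +64 encodings.
        return 33
    if highest > 74:
        # Characters above 'J' (ASCII 74) are unusual for +33 data, so +64 is likely.
        return 64
    return None  # cannot tell from this sample alone


offset = guess_phred_offset([QUAL])
print("guessed offset:", offset)
if offset is not None:
    # Convert each character to its Phred score using the guessed offset.
    scores = [ord(ch) - offset for ch in QUAL]
    print("min score:", min(scores), "max score:", max(scores))
```

In practice one would feed every fourth line of the FASTQ file into the function rather than a single read.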
I also tried the FASTX-Toolkit with the following command:
fastq_quality_filter -q 20 -i ../name.fastq -v -o trimmed_name.fastq
The program runs and reports:
Minimum percentage: 0
Input: 4564772 reads.
Output: 4564772 reads.
discarded 0 (0%) low-quality reads.
but if I check the output again with FastQC, there are still reads with a quality below 20.
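A possible reading of the "Minimum percentage: 0" line, sketched in Python: as far as the FASTX-Toolkit documentation describes it, fastq_quality_filter keeps or discards whole reads depending on what percentage of their bases reach the `-q` threshold, rather than trimming individual bases. The sketch below only imitates that rule; the function names and the +33 offset are assumptions for illustration, not the tool's actual code.

```python
# Quality line copied from the read shown earlier in this post.
QUAL = "###/AEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEE<6/<EEEEEEEEAAE6EEE/EEEEEEEEEEEAAEEEEE/6EE<AEEEEAEA<EEE/EEE/AAAE#<#<A/##/<<6"
OFFSET = 33  # assumption: Sanger / Phred+33 encoding


def percent_at_or_above(quality, min_quality, offset=OFFSET):
    """Percentage of bases in one read whose Phred score is >= min_quality."""
    scores = [ord(ch) - offset for ch in quality]
    passing = sum(1 for score in scores if score >= min_quality)
    return 100.0 * passing / len(scores)


def keep_read(quality, min_quality, min_percent):
    """Keep the whole read if enough of its bases reach the threshold."""
    return percent_at_or_above(quality, min_quality) >= min_percent


print("percent of bases with Q >= 20:", round(percent_at_or_above(QUAL, 20), 1))
print("kept with min_percent=0:  ", keep_read(QUAL, 20, 0))
print("kept with min_percent=100:", keep_read(QUAL, 20, 100))
```

Under this rule a minimum percentage of 0 keeps every read, which would match the "discarded 0 (0%)" report. Kept reads still contain their low-quality bases, so FastQC would continue to show them.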
Could someone please help me with one of these programs?
Thanks a lot, Alex