Hi everybody,
my name is Fabian Roger and I am a PhD student in Ecology at the University of Gothenburg.
I am at a stage where I need to submit sequences to ENA, but I don't have the files in the right format. All I have are 6 FASTQ files (3 forward, 3 reverse) and a list of sequence names that assigns each read to its corresponding sample (144 samples in total).
In order to get the files into the right format I need to do two things:
1) demultiplex the FASTQ files into separate samples (split in forward and reverse)
2) check if there are any non-biological sequences remaining.
I tried to do 1) with the filterbyname.sh script from BBMap but I can't get it to work. Are there any comparably fast alternatives that could do this?
And would you have a good suggestion how to check 2)?
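To make clearer what I mean by 1), here is a minimal Python sketch of the splitting I am after. It is only an illustration under assumptions: I assume the name list is a tab-separated file with the read name in the first column and the sample name in the second, that the FASTQ files are uncompressed with four lines per record, and that the read name is the header text up to the first space. The function names and the output naming scheme are made up for this example.

```python
import os
from contextlib import ExitStack


def load_read_to_sample(mapping_path):
    """Load the read-name -> sample table (assumed: two tab-separated columns)."""
    read_to_sample = {}
    with open(mapping_path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            read_name, sample = line.split("\t")[:2]
            read_to_sample[read_name] = sample
    return read_to_sample


def fastq_records(path):
    """Yield (read_name, record_text) from an uncompressed 4-line FASTQ file."""
    with open(path) as handle:
        while True:
            header = handle.readline()
            if not header.strip():
                break  # end of file (or a trailing blank line)
            rest = [handle.readline() for _ in range(3)]
            # Read name: header without the leading '@', up to the first whitespace.
            name = header[1:].split()[0]
            yield name, header + "".join(rest)


def demultiplex(fastq_path, read_to_sample, out_dir, suffix):
    """Write one <sample>_<suffix>.fastq per sample; return reads written per sample."""
    os.makedirs(out_dir, exist_ok=True)
    counts = {}
    with ExitStack() as stack:
        handles = {}  # one open output file per sample, closed together at the end
        for name, record in fastq_records(fastq_path):
            sample = read_to_sample.get(name)
            if sample is None:
                continue  # read not assigned to any sample
            if sample not in handles:
                path = os.path.join(out_dir, f"{sample}_{suffix}.fastq")
                handles[sample] = stack.enter_context(open(path, "w"))
            handles[sample].write(record)
            counts[sample] = counts.get(sample, 0) + 1
    return counts
```

The idea would be to call `demultiplex` once per input file, with suffix `R1` for the forward files and `R2` for the reverse files, using different output directories per run so that files are not overwritten. The whole name table is held in memory, which may or may not be acceptable depending on the number of reads.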
Information about the reads:
We amplified environmental samples (freshwater) with the 341F-806R bacterial 16S primers and sequenced them with 2x250 bp on the Illumina MiSeq platform.
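For 2), one rough check I could imagine is to count how many reads still begin with the primer sequence. The sketch below is only an illustration: the function names are made up, the primer string has to be supplied from the lab protocol (I do not hard-code the 341F or 806R sequences here), and it assumes uncompressed 4-line FASTQ with the primer, if present, sitting at the very start of the read with no spacer in front and no mismatches. Degenerate bases are expanded using the standard IUPAC nucleotide codes.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a (possibly degenerate) primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def fraction_starting_with_primer(fastq_path, primer, max_reads=10000):
    """Fraction of the first max_reads reads whose 5' end matches the primer exactly."""
    pattern = primer_regex(primer)
    total = 0
    hits = 0
    with open(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 != 1:
                continue  # only the second line of each record holds the sequence
            total += 1
            if pattern.match(line.strip().upper()):
                hits += 1
            if total >= max_reads:
                break
    return hits / total if total else 0.0
```

A fraction close to 1 for the forward primer on the forward reads (and the reverse primer on the reverse reads) would suggest the primers have not been trimmed; a fraction close to 0 would suggest they have, or that they sit at an offset this simple check does not look for. It does not detect adapters or other non-biological sequence.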
Any help would be greatly appreciated!
I attach two small files from my data.