I'm trying to use TopHat2 in Galaxy to map paired-end reads, but the drop-down menu for selecting the input files doesn't list any of the files I imported (the files look fine in FastQC). It only recognizes them after I run FASTQ Groomer.
I don't know which platform was used to sequence these, so I don't know which input type to choose in FASTQ Groomer. I tried Illumina 1.3-1.7 and then, separately, Sanger/Illumina. But when I used TopHat to map either set of groomed files, the mapping results were terrible. For Illumina 1.3, it gave me 94.7% discordant alignments. For Sanger/Illumina, it gave me 0% mapped reads. I assume the problem is the input type I'm choosing for the conversion? The data have been used before for RNA-seq DGE analysis, so I'm assuming they're fine.
My question: how can I tell from the original FASTQ file which input type to select in FASTQ Groomer? Any other helpful information is welcome.
Original FASTQ file (first record):
GWZHISEQ02:321YMKACXX:4:1101:1856:1996 1:N:0:ATCACG
CACGATGATGGCCTTCGACGGCAAGTACGACTTCCCCCTGGACATCAGCGA
+
@@CFDDFFHHHHHJJHJIIIJDIJJDGHIIJJJIJJJJJIJIJJJGJJJHH
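One way to check the encoding directly is to look at the range of ASCII codes in the quality lines. Below is a minimal sketch of that idea, not a definitive tool: the function names `quality_range` and `guess_offset` are hypothetical, the thresholds (59 and 74) are the commonly quoted bounds for Phred+64 and Illumina 1.8+ Phred+33 data, and it assumes plain four-line records with no wrapped sequences.

```python
from itertools import islice

def quality_range(fastq_lines, max_reads=10000):
    """Return (min, max) ASCII codes seen in the quality lines.

    Assumes strict 4-line FASTQ records (header, sequence, '+', quality).
    """
    lo, hi = 255, 0
    # The quality string is the 4th line of each record: indices 3, 7, 11, ...
    for qual in islice(fastq_lines, 3, max_reads * 4, 4):
        codes = [ord(c) for c in qual.strip()]
        if codes:
            lo, hi = min(lo, min(codes)), max(hi, max(codes))
    return lo, hi

def guess_offset(lo, hi):
    """Heuristic guess of the quality offset from the observed code range."""
    if lo < 59:
        return "phred33"    # codes below ';' cannot occur in +64 encodings
    if hi > 74:
        return "phred64"    # codes above 'J' are not expected in Illumina 1.8+ data
    return "ambiguous"      # range fits both; scan more reads

# The single record shown above, as a list of lines
record = [
    "GWZHISEQ02:321YMKACXX:4:1101:1856:1996 1:N:0:ATCACG\n",
    "CACGATGATGGCCTTCGACGGCAAGTACGACTTCCCCCTGGACATCAGCGA\n",
    "+\n",
    "@@CFDDFFHHHHHJJHJIIIJDIJJDGHIIJJJIJJJJJIJIJJJGJJJHH\n",
]
lo, hi = quality_range(record)
print(lo, hi, guess_offset(lo, hi))  # 64 74 ambiguous
```

On this one read the range alone is inconclusive, so the check would need to run over many reads, for example by passing an open file handle instead of the list. Separately, the `1:N:0:ATCACG` suffix in the header looks like the CASAVA 1.8+ read-name style, which would point toward Phred+33 (Sanger/Illumina 1.8+), but that is an inference from the header format rather than something confirmed for these files.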