Hi all.
I've recently installed TopHat and the test files ran without problems, thus I assume the installation went OK.
Now, with my own data, things do not seem to go so smoothly. I ran a subset (1000000 sequences) of my paired-end Illumina GA2 reads as a test. I don't get any junctions (which I also wouldn't expect with only 1000000 reads on a mammalian genome), but it surprised me that the accepted_hits.sam file is empty. If I understand correctly, this file should contain the position and sequence of the reads aligned to the genome? Since I thought the problem could be caused by a wrong fastq format, I also aligned my subset with bowtie against my reference genome. This seems to go OK. The reason for my suspicion is that TopHat reports a seed length of 52bp, but my sequences are 51bp.
So, does anyone have any idea what is going wrong, and is it somehow possible to control the seed length in tophat (as in bowtie with the -l option)?
Regards, Ole
Some information:
example of my fastq format:
@HWI-EA332:5:13596#0/2
GCTGATCCGGGACTGCCGGCCTGTGAGGCTGCCCACCTGCGCGGCGGGGGC
+HWI-EA332:5:13596#0/2
`aa__]ZHZ_]\]V[]NXX_[FJFSJTY]R\\]VWHZFQ][JOWMZ\[_BB
The tophat screen:
[Wed Sep 30 09:29:55 2009] Preparing output location ./tophat_out/
[Wed Sep 30 09:29:55 2009] Checking for Bowtie index files
[Wed Sep 30 09:29:55 2009] Checking for reference FASTA file
[Wed Sep 30 09:29:55 2009] Checking for Bowtie
Bowtie version: 0.10.1.0
[Wed Sep 30 09:29:55 2009] Checking reads
seed length: 52bp
format: fastq
quality scale: --solexa1.3-quals
[Wed Sep 30 09:30:20 2009] Mapping reads against RefGenome with Bowtie
[Wed Sep 30 09:34:15 2009] Joining segment hits
Splitting reads into 2 segments
[Wed Sep 30 09:34:23 2009] Mapping reads against RefGenome with Bowtie
[Wed Sep 30 09:39:36 2009] Mapping reads against RefGenome with Bowtie
[Wed Sep 30 09:44:53 2009] Mapping reads against RefGenome with Bowtie
[Wed Sep 30 09:48:42 2009] Joining segment hits
Splitting reads into 2 segments
[Wed Sep 30 09:48:49 2009] Mapping reads against RefGenome with Bowtie
[Wed Sep 30 09:54:02 2009] Mapping reads against RefGenome with Bowtie
[Wed Sep 30 09:59:22 2009] Searching for junctions via segment mapping
Warning: junction database is empty!
[Wed Sep 30 10:01:08 2009] Joining segment hits
[Wed Sep 30 10:01:08 2009] Joining segment hits
[Wed Sep 30 10:01:08 2009] Reporting output tracks
-----------------------------------------------
Run complete [00:31:12 elapsed]
My command:
./tophat --solexa1.3-quals RefGenome part10_1.ma.fq part10_2.ma.fq
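For reference, here is a minimal Python sketch of how the read lengths in a fastq file could be tallied, to compare against the seed length TopHat reports. It is not part of TopHat or bowtie; the function name read_length_histogram is made up for illustration, and it assumes strict four-line fastq records. The record below is the example read shown above.

```python
from collections import Counter
import io


def read_length_histogram(handle):
    """Tally sequence lengths in a four-line-per-record FASTQ stream.

    Returns (lengths, mismatched): a Counter of sequence lengths and the
    number of records whose quality string length differs from the sequence.
    """
    lengths = Counter()
    mismatched = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of input
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\r\n")
        lengths[len(seq)] += 1
        if len(seq) != len(qual):
            mismatched += 1
    return lengths, mismatched


# The example record from the post, as an in-memory file.
# A raw string keeps the backslashes in the quality line literal.
record = (
    "@HWI-EA332:5:13596#0/2\n"
    "GCTGATCCGGGACTGCCGGCCTGTGAGGCTGCCCACCTGCGCGGCGGGGGC\n"
    "+HWI-EA332:5:13596#0/2\n"
    r"`aa__]ZHZ_]\]V[]NXX_[FJFSJTY]R\\]VWHZFQ][JOWMZ\[_BB" "\n"
)

lengths, mismatched = read_length_histogram(io.StringIO(record))
print(dict(lengths), mismatched)
```

To run it on a real file, the StringIO object can be replaced with open("part10_1.ma.fq"). Stripping "\r\n" means Windows-style line endings are not counted as part of the read.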