Dear all,
This is my first thread here, so hello everyone!

I am new to the field, so I hope my questions are not too trivial and may interest some of you.
We have RNA-seq data consisting of 36 bp single-end reads. I am interested in obtaining quantitative expression data that are comparable between different experimental conditions, and also in discovering new transcripts/exons. So far I have tried the bowtie/tophat/cufflinks pipeline, using the default settings of each program plus the options -g 1 -G /path/to/gff/file as a starting point.
However, I am wondering whether I can improve the mapping by using blat instead of, or in addition to, bowtie. I have access to a computing grid and will not have to run the pipeline every day (as long as we believe the mapping is as comprehensive as possible), so the processing time of blat is theoretically not too much of a problem for me.
Does anyone know how blat performs compared to bowtie? I know, for instance, that Erange uses blat on the unmappable reads to increase the mapping coverage. Tophat is efficient for abundant transcripts, but its ability to map exon-exon reads from low-abundance transcripts drops significantly, if I am not mistaken. This should not be a problem for blat, however.
Is it possible to use the output of blat to generate exon-exon models using tophat or something else? The purpose here is also to be able to use cufflinks to compute the RPKM.
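To be clear about what I mean by RPKM: reads per kilobase of transcript per million mapped reads. Below is a minimal sketch of that textbook definition only; it is not how cufflinks estimates abundance internally, and the function name and the numbers are made up for illustration.

```python
def rpkm(reads_on_transcript, transcript_length_bp, total_mapped_reads):
    """Textbook RPKM: reads per kilobase of transcript per million mapped reads.

    All three arguments are hypothetical inputs for illustration; in practice
    they would come from the alignment output.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (reads per million)
    return reads_on_transcript * 1e9 / (transcript_length_bp * total_mapped_reads)


# Made-up example: 500 reads on a 2 kb transcript, 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))  # 25.0
```

Since the denominator is the total number of mapped reads, anything that changes how many reads map (for example, rescuing unmapped reads with blat) also shifts the values, which is why I would like the mapping to be as complete as possible before comparing conditions.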
A last set of questions, more related to tophat itself: tophat covers 80% of the exons modeled by Erange. Is this solely due to the coverage limit of the tophat algorithm for low-RPKM transcripts? Is it possible to increase the coverage of known exons by providing the gff file to tophat? Would this increase the coverage of mapped exon-exon junctions compared to Erange?
Thanks if you read this message in its entirety!
Have a nice day, all of you

Cheers
Olivier