I have strand-specific RNA-seq data; that is, the query strand equals the strand of transcription. In the SAM output file from TopHat, the XS tag is included for spliced reads. Since I know the strand of each read from the query-strand bit of the FLAG field, I went ahead and added the XS tag to the unspliced reads as well. The Cufflinks readme indicates that this strand information is used when building transcripts, so I was expecting all of the resulting assemblies to have the strand set in the GTF file. This appears not to be the case: the strand was only set for assemblies built from spliced reads. The XS tags I added appear to be correct, so it seems to be a feature that Cufflinks does not output the strand for unspliced assemblies. Is this correct?
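For reference, the tagging step I describe can be sketched roughly as below. This is a minimal illustration rather than the exact script I ran: the function name `add_xs_tag` is made up for this post, and it assumes single-end reads where the read strand (FLAG bit 0x10) directly gives the transcription strand. Paired-end data or a library protocol with the opposite orientation would need different logic.

```python
def add_xs_tag(sam_line):
    """Append an XS:A strand tag to a SAM alignment line that lacks one.

    Assumption: single-end, strand-specific reads where the query strand
    equals the transcription strand. Returns the line without a trailing
    newline.
    """
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        return line  # header line, leave untouched
    fields = line.split("\t")
    flag = int(fields[1])  # FLAG is the second mandatory SAM column
    if flag & 0x4:
        return line  # unmapped read, no strand to assign
    # Optional tags start at column 12; keep any XS tag already present
    if any(field.startswith("XS:A:") for field in fields[11:]):
        return line
    strand = "-" if flag & 0x10 else "+"  # 0x10 = read mapped to reverse strand
    return line + "\tXS:A:" + strand
```

Each line of the SAM file is passed through the function and written back out with a newline appended; reads that already carry an XS tag (the spliced ones) are left unchanged.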
Also, I notice that the -F parameter appears to control two settings, --min-isoform-fraction and --pre-mrna-fraction. Ideally, I would not want to limit genuine splice variants (i.e. novel splice sites) based on abundance, but I would like to filter out isoforms with retained introns. Given that the -F parameter controls both options, this doesn't seem possible.