Hi there,
I have been using Cufflinks to predict transcripts, but Cufflinks does not provide strand information for them. Could someone please suggest how to solve this?
This is how I run the programs:
# Mapping of reads by tophat
tophat --num-threads 7 --solexa-quals --output-dir Transcript_Mapped/Eggs --butterfly-search /export/data/alignment_references/bowtie/danRer7 Raw_Reads/Eggs_fastq
# Construction of transcripts
cufflinks --output-dir Eggs/ --num-threads 4 --frag-bias-correct /export/data/alignment_references/bowtie/danRer7.fa --multi-read-correct --upper-quartile-norm /home/chirag/Projects/danRer7_RNASeq/Project_FM009/Transcript_Mapped/Eggs/accepted_hits.bam
# Merge transcripts
cuffmerge --num-threads 6 -s /export/data/alignment_references/bowtie/danRer7.fa -o Version-1 assemblies.txt
Transcript.gtf and Merge.gtf are then converted to .bed:
chr1 1906 4188 CUFF.1.1 1000 . 0 0 0 1 2282, 0,
chr1 39115 39565 CUFF.2.1 1000 . 0 0 0 1 450, 0,
chr1 39666 39981 CUFF.3.1 1000 . 0 0 0 1 315, 0,
chr1 1 745 CUFF.4.1 1000 . 0 0 0 1 744, 0,
chr1 36295 36812 CUFF.5.1 1000 . 0 0 0 1 517, 0,
chr1 4343 35664 CUFF.6.1 1000 - 0 0 0 4 829,86,65,858, 0,6063,6921,30463,
chr1 4343 27499 CUFF.6.2 1000 - 0 0 0 4 829,86,65,837, 0,6063,6921,22319,
We can see that when a transcript is unspliced, Cufflinks is unable to predict its strand.
Even if I look at the GTF file, there is no strand information.
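To make the pattern easier to check on the full file, here is a minimal Python sketch that tallies the strand column separately for single-block (unspliced) and multi-block (spliced) rows. It assumes whitespace-separated 12-column BED rows like the ones pasted above; the function name `strand_by_block_count` is only illustrative and not part of any tool.

```python
from collections import Counter

def strand_by_block_count(bed_lines):
    """Tally strand values for unspliced (1 block) vs spliced (>1 block) BED12 rows."""
    tally = {"unspliced": Counter(), "spliced": Counter()}
    for line in bed_lines:
        # Skip header-style lines and anything that is not a full BED12 row.
        if line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 12:
            continue
        strand = fields[5]            # column 6: "+", "-" or "."
        block_count = int(fields[9])  # column 10: number of blocks (exons)
        key = "unspliced" if block_count == 1 else "spliced"
        tally[key][strand] += 1
    return tally

# The seven rows shown in this post.
rows = [
    "chr1 1906 4188 CUFF.1.1 1000 . 0 0 0 1 2282, 0,",
    "chr1 39115 39565 CUFF.2.1 1000 . 0 0 0 1 450, 0,",
    "chr1 39666 39981 CUFF.3.1 1000 . 0 0 0 1 315, 0,",
    "chr1 1 745 CUFF.4.1 1000 . 0 0 0 1 744, 0,",
    "chr1 36295 36812 CUFF.5.1 1000 . 0 0 0 1 517, 0,",
    "chr1 4343 35664 CUFF.6.1 1000 - 0 0 0 4 829,86,65,858, 0,6063,6921,30463,",
    "chr1 4343 27499 CUFF.6.2 1000 - 0 0 0 4 829,86,65,837, 0,6063,6921,22319,",
]

result = strand_by_block_count(rows)
for kind, counts in result.items():
    print(kind, dict(counts))
```

On these seven rows, all five single-block transcripts have strand "." and both four-block transcripts have strand "-". To run it on a whole file, pass an open file handle instead of the `rows` list.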
The RNA-Seq reads I used are 76 bp long, from Illumina, and are not strand-specific.
As far as I am aware, Cufflinks should be able to predict the strand even when the RNA-Seq data is not strand-specific. I don't think low coverage is the cause, since we have around 250 million reads per sample.
Could you please suggest how this can be solved?
Are there any parameters that I have missed while building the transcripts?
Thanks in advance for your help!
Regards,
Chirag