I am working with unstranded 36 bp single-end RNA-Seq reads from several experimental conditions, which have been aligned to my reference genome using TopHat. I have a few questions regarding Cufflinks:
1. What is the difference between the library types 'fr-unstranded' and 'ff-unstranded'? I left Cufflinks on the default 'fr-unstranded', just as I did for TopHat. Was this correct? I'm asking because I ran Cufflinks with both 'fr-unstranded' and 'ff-unstranded' and got different FPKM values.
2. My 36 bp reads are recognized as 40 bp single-end reads by Cufflinks. Any ideas why this could be? Should I be concerned?
3. How does the fragment length distribution affect FPKM values for single-end reads? I didn't specify the mean fragment length (-m) or standard deviation (-s) because my reads are single-end and the libraries were all size-selected at ~200 bp on a gel, so Cufflinks defaulted to '-m 200' and '-s 80'. If I rerun Cufflinks with the correct fragment length of 81 bp (200 bp minus 119 bp of primer sequence), my FPKM values change. Any ideas why? Does Cufflinks perhaps need this information to correct for under-represented 5' and 3' ends of transcripts?
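To make question 3 concrete, here is a minimal sketch of how I imagine the mean fragment length could feed into FPKM. It assumes a simplified effective length of (transcript length - mean fragment length + 1); I don't know that Cufflinks computes it exactly this way, and the function names and all numbers below are hypothetical, not taken from my data.

```python
def effective_length(transcript_len, mean_frag_len):
    """Simplified effective length: the number of positions at which a
    fragment of the mean length could start within the transcript.
    Assumption for illustration only, not Cufflinks' actual model."""
    return max(transcript_len - mean_frag_len + 1, 1)


def fpkm(fragment_count, transcript_len, mean_frag_len, total_mapped):
    """Fragments per kilobase of (effective) transcript per million mapped."""
    eff_len = effective_length(transcript_len, mean_frag_len)
    return fragment_count * 1e9 / (eff_len * total_mapped)


if __name__ == "__main__":
    # Hypothetical transcript: 1000 bp, 100 fragments, 10 million mapped reads.
    for mean_frag_len in (200, 81):
        value = fpkm(100, 1000, mean_frag_len, 10_000_000)
        print(f"-m {mean_frag_len}: FPKM = {value:.2f}")
```

Under this assumption, a shorter mean fragment length gives a longer effective length and therefore a lower FPKM for the same counts, with the largest effect on short transcripts. Is that roughly what is happening?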
Thanks for your help.