**fanli** · 03-04-2016, 02:41 PM

According to the TopHat manual, you should use fr-firststrand.

Also see:

https://github.com/griffithlab/rnaseq_tutorial/blob/master/manuscript/supplementary_tables/supplementary_table_5.md

**adrian** · 12-29-2016, 02:48 PM

Dear Group,

I have never run into errors in a TopHat2/Cufflinks analysis before, and I cannot find a proper solution for this one. I am posting in the hope of getting some help.

I have FASTQ files from an RNA-Seq experiment (1x50 bp, strand-specific dUTP, single-end reads).

1. I aligned the files using tophat2:

tophat -p 16 --library-type fr-firststrand -G /HumanGenome/Grch37/Homo_sapiens/Ensembl/GRCh37/Annotation/Archives/archive-2015-07-17-14-31-42/Genes/genes.gtf -o myDir /HumanGenome/Grch37/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome fastq1.fastq

I get an accepted_hits.bam file along with the other output files, and I do not see any errors.

2. Next I ran Cufflinks without any problems and successfully generated the transcripts.gtf file:

cufflinks -p 16 --library-type fr-firststrand -o myDirCL accepted_hits.bam

3. Next I ran cuffmerge on all the GTF files, which are listed in gtfList:

cuffmerge -p 16 -g /HumanGenome/Grch37/Homo_sapiens/Ensembl/GRCh37/Annotation/Archives/archive-2015-07-17-14-31-42/Genes/genes.gtf -s /HumanGenome/Grch37/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome.fa gtfList

4. Cuffdiff fails:
cuffdiff -p 16 --library-type fr-firststrand -o mycfDiff /HumanGenome/Grch37/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome.fa -u merged.gtf samp1a.bam,samp1b.bam samp2a.bam,samp2b.bam

Warning: Could not connect to update server to verify current version. Please check at the Cufflinks website (http://cufflinks.cbcb.umd.edu).
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[16:06:53] Loading reference annotation.

Right after reporting "Loading reference annotation", the pipeline fails.
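The "invalid BAM binary header" message suggests that some file cuffdiff read as a BAM does not decode as one. A minimal sketch for checking each input, assuming `gzip`, `od`, and `tr` are available; the helper name `is_bam` is hypothetical. It may also be worth checking the argument order: as far as I can tell from the cuffdiff usage (an assumption, not confirmed in this thread), a reference FASTA is normally given as the value of `-b`/`--frag-bias-correct`, so a bare `genome.fa` could be read as the first positional argument and shift `merged.gtf` into a BAM slot.

```shell
# is_bam FILE: succeed if FILE decompresses as gzip/BGZF and the
# decompressed stream starts with the BAM magic bytes "BAM\1".
# (Hypothetical helper; it only inspects the first four bytes.)
is_bam() {
    magic=$(gzip -dc "$1" 2>/dev/null | od -An -v -tx1 -N4 | tr -d ' \n')
    [ "$magic" = "42414d01" ]
}

# Report on every file given on the command line.
for f in "$@"; do
    if is_bam "$f"; then
        echo "$f: looks like a BAM file"
    else
        echo "$f: does NOT look like a BAM file"
    fi
done
```

Saved as a script, this could be run on each file that cuffdiff receives after its options, one file per argument (split the comma-separated replicate lists first).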

I compared this with previous analyses. My successful cuffdiff runs never reported an absent EOF marker.

I validated the BAM files with Picard's ValidateSamFile, and I do not see any difference between the BAM files from a successful cuffdiff analysis and those from the failed one.

Also, a BAM file is a gzip file, and I can see the end of the BAM file from the failed run:

tail accepted_hits.bam | hexdump -C
00000980 54 84 0f 1b e3 be fa f6 50 67 5d b5 92 9c 16 24 |T.......Pg]....$|
00000990 b6 9e 54 68 40 ff 07 3c e5 ef 1d 2f 8e 00 00 1f |..Th@..<.../....|
000009a0 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 1b |...........BC...|
000009b0 00 03 00 00 00 00 00 00 00 00 00 |...........|
000009bb
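The last 28 bytes of this dump match the BGZF end-of-file block described in the SAM/BAM specification, which suggests that accepted_hits.bam itself is not truncated. A small sketch for running the same check on every BAM without reading hexdumps by eye, assuming `tail`, `od`, and `tr` are available; the helper name `has_bgzf_eof` is hypothetical.

```shell
# The 28-byte BGZF end-of-file block from the SAM/BAM specification,
# written as hex in 4-byte groups for readability.
BGZF_EOF="1f8b0804""00000000""00ff0600""42430200""1b000300""00000000""00000000"

# has_bgzf_eof FILE: succeed if the last 28 bytes of FILE are the EOF block.
# (Hypothetical helper name.)
has_bgzf_eof() {
    last=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \n')
    [ "$last" = "$BGZF_EOF" ]
}

# Report on every file given on the command line.
for f in "$@"; do
    if has_bgzf_eof "$f"; then
        echo "$f: EOF marker present"
    else
        echo "$f: EOF marker absent"
    fi
done
```

If every BAM passes this check, the "EOF marker is absent" warning is more likely to come from a non-BAM file being opened as a BAM than from a truncated alignment.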

Can anyone suggest what could be wrong here?

I am also not sure whether I am running TopHat correctly for single-end, 50 bp, strand-specific reads.
There are 4 fastq files for each sample.

Header of fastq file:

file 1: @SN930:564:H3Y5YBCXY:1:1101:1228:2226 1:N:0:AGTCAA
file 2: @SN930:565:H3Y5YBCXY:1:1101:1263:2151 1:N:0:AGTCAAC
file 3: @SN930:564:H3Y5YBCXY:2:1101:1433:2070 1:N:0:AGTCAA
file 4: @SN930:565:H3Y5YBCXY:2:1101:1282:2078 1:N:0:AGTCAAC
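For single-end data split across several FASTQ files per sample, TopHat accepts the read files as one comma-separated list. A minimal sketch of building that list; the helper name `join_fastqs` and the file names in the commented example are hypothetical, and the index path is left as a placeholder.

```shell
# join_fastqs FILE...: print the arguments as one comma-separated list.
# (Hypothetical helper name.)
join_fastqs() {
    old_ifs=$IFS
    IFS=,
    printf '%s\n' "$*"   # "$*" joins the arguments with the first character of IFS
    IFS=$old_ifs
}

# Example with hypothetical file names for one sample:
# reads=$(join_fastqs s1_L1_a.fastq s1_L1_b.fastq s1_L2_a.fastq s1_L2_b.fastq)
# tophat -p 16 --library-type fr-firststrand -G genes.gtf -o myDir <bowtie2_index_base> "$reads"
```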

Thanks a lot.
Adrian

strand specific mayhem for Tophat and Cufflinks