
**mknut** · 05-22-2013, 02:55 AM

From the manual:

Usage: tophat [options]* <index_base> <reads1_1[,...,readsN_1]> [reads1_2,...readsN_2]

<reads1_1[,...,readsN_1]> A comma-separated list of files containing reads in FASTQ or FASTA format. When running TopHat with paired-end reads, this should be the *_1 ("left") set of files.
<[reads1_2,...readsN_2]> A comma-separated list of files containing reads in FASTQ or FASTA format. Only used when running TopHat with paired-end reads, and contains the *_2 ("right") set of files. The *_2 files MUST appear in the same order as the *_1 files.

So for example:

Code:

tophat -o outputDirectoryName -r 50 --library-type fr-unstranded /path/to/genome/reference /path/to/first/read/read_1.fastq /path/to/second/read/read_2.fastq
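The comma-separated form from the manual also covers a sample that is split over several files per mate. Below is a minimal sketch of assembling the two lists in matching order. `join_by_comma` and the lane file names are hypothetical, and the command is only printed, not run:

```shell
# Hypothetical helper: join all arguments with commas (a b c -> a,b,c).
join_by_comma() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Hypothetical lane files; the *_2 list must follow the order of the *_1 list.
left=$(join_by_comma lane1_1.fastq lane2_1.fastq)
right=$(join_by_comma lane1_2.fastq lane2_2.fastq)

# Print the command so it can be inspected before it is actually run.
echo tophat -o outputDirectoryName -r 50 --library-type fr-unstranded \
    /path/to/genome/reference "$left" "$right"
```

Dropping the `echo` would run TopHat with the same options as in the single-file example above.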

Having had a quick look at the TopHat source code (prep_reads.cpp), it looks like TopHat processes reads sequentially, i.e. the first read in read_1.fastq has to be the mate of the first read in read_2.fastq. TopHat does not seem to do any automatic mate matching, at least none that I can see. I think that if you merge the reads into one file and submit it twice, as both read1 and read2, you will lose synchronisation between the mates, if it works at all. In any case, the method above is the one suggested in the manual, so I would stick to it.
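Since mates appear to be paired purely by position, it may be worth confirming that the two files are in the same order before starting a run. Below is a minimal sketch of such a check in POSIX shell with awk. It is not something TopHat itself provides: `check_mate_order` is a hypothetical helper. It assumes uncompressed, strict 4-line FASTQ records whose read names match once a trailing /1 or /2, or anything after the first space, is dropped.

```shell
# Hypothetical helper, not part of TopHat.
# Usage: check_mate_order reads_1.fastq reads_2.fastq
# Returns 0 if every record in the first file has its mate at the same
# position in the second file, 1 otherwise.
check_mate_order() {
    awk -v mates="$2" '
        # Reduce a FASTQ header line to the bare read name.
        function read_name(h) {
            sub(/[ \t].*$/, "", h)   # drop any comment after the name
            sub(/\/[12]$/, "", h)    # drop a trailing /1 or /2
            return h
        }
        BEGIN { bad = 0; n = 0 }
        NR % 4 == 1 {
            n++
            # Read the header at the same position in the mate file.
            if ((getline other < mates) <= 0) {
                printf "record %d: no mate for %s\n", n, $0
                bad = 1
                exit
            }
            if (read_name($0) != read_name(other)) {
                printf "record %d: %s does not match %s\n", n, $0, other
                bad = 1
                exit
            }
            # Skip the sequence, separator and quality lines of the mate.
            for (i = 0; i < 3; i++) getline skipped < mates
        }
        END {
            # Leftover records in the mate file also mean the files disagree.
            if (!bad && (getline other < mates) > 0) {
                printf "extra records in %s starting at %s\n", mates, other
                bad = 1
            }
            exit bad
        }
    ' "$1"
}
```

Called as `check_mate_order read_1.fastq read_2.fastq`, it stops at the first position where the names disagree, so a non-zero exit status suggests the files should be re-paired before they go to TopHat.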

To reiterate:

Should I enter the separate file lists for R1 and R2 fastq files?

Yes

**annavilborg** · 05-22-2013, 05:06 PM

Thank you! And thanks for explaining how tophat does this, great to get the background. I'll go ahead and run my samples as separate R1 and R2 files.

Input Paired-ended data to Tophat
