  • Must paired-end reads be in the same order in the two files for TopHat?

    Hello,

    Will TopHat work if R1.fastq and R2.fastq have their reads in different order?

    What if R1.fastq has some reads whose R2 mate did not make it past QC, and vice versa (R2.fastq has some reads whose R1 mate did not make it past QC)?

    I could only find the following information about how it uses paired-end information after mapping each mate independently, but I can't find anything on how it matches mates to each other across the two files.

    TopHat maps left and right reads separately using Bowtie, that is, it doesn't use Bowtie's pair searching like --fr, --rf, --ff. Using the mapped reads, TopHat finds pairs if the two reads of a pair are on different strands (it ignores them if they are on the same strand) and the inner distance is within the user-specified range.
    From: http://seqanswers.com/forums/showpos...49&postcount=2

    Thanks for your input.
    Last edited by friducha; 02-12-2015, 05:22 PM.

  • #2
    Reads always need to be in the same order. It is best to use tools like BBDuk that keep pairs together for quality control; doing QC on the two files independently will only cause problems. I did write a tool for fixing that situation, though, repair.sh.
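
    As a rough way to check whether two files are already in the same order, here is a minimal sketch. It is not repair.sh and does not fix anything; it only compares read IDs position by position. The function names `read_ids` and `same_order` are made up for illustration, and the sketch assumes plain (uncompressed) four-line FASTQ records whose IDs match once a trailing /1 or /2 and any comment after whitespace are removed.

```shell
#!/bin/sh
# Hypothetical helper: print the read ID from every FASTQ header line
# (lines 1, 5, 9, ...). Assumes four-line records with no wrapped sequences.
read_ids() {
    awk 'NR % 4 == 1 {
        id = $1                   # keep only the text before the first whitespace
        sub(/^@/, "", id)         # drop the leading "@"
        sub(/\/[12]$/, "", id)    # drop a trailing /1 or /2 mate suffix
        print id
    }' "$1"
}

# Hypothetical helper: exit status 0 only if both files list the same
# read IDs in the same order; non-zero otherwise.
same_order() {
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 2
    read_ids "$1" > "$tmp1"
    read_ids "$2" > "$tmp2"
    cmp -s "$tmp1" "$tmp2"
    status=$?
    rm -f "$tmp1" "$tmp2"
    return "$status"
}
```

    Called as `same_order R1.fastq R2.fastq`, it succeeds only when every record in R1 lines up with its mate in R2. A missing mate in either file, or any reordering, makes it fail.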

    • #3
      Just for reference, there are quite a few tools that respect paired data, such as Prinseq and Trimmomatic. Custom pipelines often don't take pairing into account. Pairfq is a lighter-weight approach to re-pairing reads, which makes it easy to incorporate into a pipeline. In most cases, the following is all you need to install it:

      Code:
      curl -L git.io/pairfq_lite > pairfq_lite
      chmod +x pairfq_lite
      ./pairfq_lite -h
      The last command just prints the usage, which is explained on the wiki and in the inline documentation available at the command line.
