  • Must paired-end reads be in the same order in the two files for TopHat?

    Hello,

    Will TopHat work if R1.fastq and R2.fastq have their reads in different order?

    What if R1.fastq has some reads whose R2 mate did not make it past QC, and vice versa (R2.fastq has some reads whose R1 mate did not make it past QC)?

    I could only find the information below, which describes how TopHat uses paired-end information after mapping each mate independently. I can't find anything on how it matches mates to each other across the two files.

    TopHat maps left and right reads separately using Bowtie, that is, it doesn't use Bowtie's pair searching like --fr, --rf, --ff. Using the mapped reads, TopHat finds pairs if the two reads of a pair are on different strand (it ignores if they are on the same strand) and the inner distance is within user specified range.
    From: http://seqanswers.com/forums/showpos...49&postcount=2

    Thanks for your input.
    Last edited by friducha; 02-12-2015, 05:22 PM.

  • #2
    Reads always need to be in the same order. It is best to use tools like BBDuk that keep pairs together for quality control; doing QC on the two files independently will only cause problems. I did write a tool for fixing that situation, though, repair.sh.
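
As a rough way to check whether two files are still in sync before running TopHat, the sketch below compares read IDs record by record. It is only an illustration, not part of BBDuk or repair.sh: the function names `read_ids` and `first_mismatch` are made up here, and it assumes plain 4-line FASTQ records whose mates share an ID once an optional /1 or /2 suffix and any trailing comment are removed.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read ID of each record, assuming 4 lines per FASTQ record."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # Only the first line of every 4-line record is a header.
            if line_number % 4 != 0 or not line.strip():
                continue
            # Keep the ID up to the first whitespace, without the leading '@'.
            name = line[1:].split()[0]
            # Drop an old-style /1 or /2 mate suffix so mates compare equal.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(r1_path, r2_path):
    """Return (record_index, r1_id, r2_id) for the first out-of-sync record.

    Returns None when both files list the same IDs in the same order.
    A missing mate at the end of the shorter file is reported as None.
    """
    id_pairs = zip_longest(read_ids(r1_path), read_ids(r2_path))
    for index, (id1, id2) in enumerate(id_pairs):
        if id1 != id2:
            return index, id1, id2
    return None
```

Calling `first_mismatch("R1.fastq", "R2.fastq")` (file names taken from the question above) would return None if the files are in the same order, or the zero-based record index and the two differing IDs otherwise.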



    • #3
      Just for reference, there are quite a few tools that respect paired data, such as Prinseq and Trimmomatic. Often, custom pipelines don't take this into account. Also, Pairfq is a lighter approach to pairing reads, which makes it really easy to incorporate into a pipeline. In most cases, the following is all you need to install it:

      Code:
      curl -L git.io/pairfq_lite > pairfq_lite
      chmod +x pairfq_lite
      ./pairfq_lite -h
      The last command just prints the usage, which is explained on the wiki and in the inline documentation available at the command line.
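
To illustrate what "pairing reads" means in general, here is a minimal sketch of the idea: index one file by read ID, walk the other, and separate matched pairs from orphaned reads. This is not how Pairfq or repair.sh is implemented; the names `parse_fastq` and `repair` are hypothetical, and the sketch assumes 4-line FASTQ records, unique read IDs, and enough memory to hold all of R2.

```python
def parse_fastq(lines):
    """Yield (read_id, record) pairs from FASTQ lines, 4 lines per record."""
    record = []
    for line in lines:
        if not line.strip():
            continue  # ignore stray blank lines
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            # ID is the header up to the first whitespace, without '@'.
            name = record[0][1:].split()[0]
            # Drop an old-style /1 or /2 suffix so mates share one key.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name, tuple(record)
            record = []


def repair(r1_lines, r2_lines):
    """Return (pairs, singletons) from two possibly unsynchronized inputs.

    pairs keeps R1 order; singletons holds reads whose mate is missing.
    """
    r2_by_id = dict(parse_fastq(r2_lines))  # whole R2 file held in memory
    pairs, singletons = [], []
    for name, rec1 in parse_fastq(r1_lines):
        rec2 = r2_by_id.pop(name, None)
        if rec2 is None:
            singletons.append(rec1)  # R1 read without an R2 mate
        else:
            pairs.append((rec1, rec2))
    singletons.extend(r2_by_id.values())  # R2 reads without an R1 mate
    return pairs, singletons
```

Writing the first and second element of each pair to two new files, in the same loop, would give inputs that are in matching order; the singletons could go to a third file or be dropped.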
