  • Tophat2 high discordant alignments

    Hi,

    I am mapping paired-end RNA-seq data using tophat2, but the alignment summary shows a very high discordant alignment rate. The only tophat options I am specifying are -p 16 and -o "DIR". Below is the output from tophat2:


    PHP Code:
    Left reads:
              Input     :  88556961
               Mapped   :  76938162 (86.9% of input)
                of these:  20429665 (26.6%) have multiple alignments (622137 have >20)
    Right reads:
              Input     :  88556961
               Mapped   :  75252663 (85.0% of input)
                of these:  20114304 (26.7%) have multiple alignments (621700 have >20)
    Unpaired reads:
              Input     :     68008
               Mapped   :     56927 (83.7% of input)
                of these:      8045 (14.1%) have multiple alignments (9 have >20)
    85.9% overall read mapping rate.

    Aligned pairs:  65389463
         of these:  18622479 (28.5%) have multiple alignments
                    61775607 (94.5%) are discordant alignments
     4.1% concordant pair alignment rate.

    The flagstat output I get is also below:

    PHP Code:
    341377625 + 0 in total (QC-passed reads + QC-failed reads)
    189129873 + 0 secondary
    0 + 0 supplimentary
    0 + 0 duplicates
    341377625 + 0 mapped (100.00%:-nan%)
    152190825 + 0 paired in sequencing
    76938162 + 0 read1
    75252663 + 0 read2
    263998 + 0 properly paired (0.17%:-nan%)
    130778926 + 0 with itself and mate mapped
    21411899 + 0 singletons (14.07%:-nan%)
    116104620 + 0 with mate mapped to a different chr
    80799382 + 0 with mate mapped to a different chr (mapQ>=5)

    I am using cutadapt to remove adapters and remove low quality reads and that is running fine. But then when I pass the paired files onto tophat, the results don't seem good. From what I have read, it is to do with the mate pairs no longer being in sync in the two fastq files. Is there a way around this and to get the number of discordant alignments down?

    I have tried aligning the fastq files with tophat2 without passing the files through cutadapt first and the alignment is fine and there is a very low discordant alignment rate, so I'm guessing the fastq files are good, but something is happening after the cutadapt step.
    Just as a note, I am not using Galaxy for the analysis.

    Thanks
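
One way to test the out-of-sync hypothesis directly is to compare read names record by record across the two trimmed files. The following is a minimal sketch, not part of any of the tools mentioned in this thread. It assumes plain 4-line FASTQ records (no wrapped sequence lines), optionally gzip-compressed, and read names that match between mates once an optional trailing /1 or /2 is removed. The function names are hypothetical.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read name of each 4-line FASTQ record, minus any /1 or /2 suffix."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 0:  # header line of each record
                fields = line[1:].split()  # drop '@', ignore the comment part
                name = fields[0] if fields else ""
                if name.endswith(("/1", "/2")):
                    name = name[:-2]
                yield name


def first_mismatch(r1_path, r2_path):
    """Return (record_index, r1_name, r2_name) for the first out-of-sync
    record, or None if both files list the same names in the same order."""
    pairs = zip_longest(read_ids(r1_path), read_ids(r2_path))
    for index, (id1, id2) in enumerate(pairs):
        if id1 != id2:  # also catches one file being shorter (None)
            return index, id1, id2
    return None
```

Calling `first_mismatch("trimmed_1.fastq", "trimmed_2.fastq")` (placeholder file names) returns `None` when the files are in sync. Any other result gives the zero-based record index at which the two files diverge.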

  • #2
    Originally posted by ea11
    From what I have read, it is to do with the mate pairs no longer being in sync in the two fastq files. Is there a way around this and to get the number of discordant alignments down?
    Thanks
    Use a paired-end-aware trimmer such as Trimmomatic or BBDuk (from BBMap), which keeps the paired-end files in sync after trimming.

    That said, if you are happy with the cutadapt results and just want to fix the PE read order you can do so by using repair.sh from BBMap (paired end reads in two files example): http://seqanswers.com/forums/showpos...0&postcount=45
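
To illustrate what "fixing the PE read order" means, here is a rough Python sketch of re-pairing two FASTQ files by read name. It is not how repair.sh is implemented and is no substitute for it. It assumes uncompressed 4-line FASTQ records and mate names that match after removing an optional /1 or /2 suffix. It holds every R2 record in memory, so it is only practical for small files, and it silently drops reads whose mate is missing rather than writing them to a singleton file. All function names are hypothetical.

```python
def read_records(path):
    """Yield each FASTQ record as a list of its four lines."""
    with open(path) as handle:
        while True:
            record = [handle.readline() for _ in range(4)]
            if not record[0]:  # end of file
                return
            yield record


def pair_key(header):
    """Read name without the leading '@', the comment, or a /1 or /2 suffix."""
    fields = header[1:].split()
    name = fields[0] if fields else ""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def repair_pairs(r1_in, r2_in, r1_out, r2_out):
    """Write only reads present in both inputs, in R1 order.

    Returns the number of pairs written.
    """
    # Index every R2 record by name (memory-hungry for large files).
    mates = {pair_key(rec[0]): rec for rec in read_records(r2_in)}
    kept = 0
    with open(r1_out, "w") as out1, open(r2_out, "w") as out2:
        for rec in read_records(r1_in):
            mate = mates.get(pair_key(rec[0]))
            if mate is not None:  # skip reads that lost their mate
                out1.writelines(rec)
                out2.writelines(mate)
                kept += 1
    return kept
```

After a call such as `repair_pairs("trimmed_1.fastq", "trimmed_2.fastq", "fixed_1.fastq", "fixed_2.fastq")` (placeholder file names), the two output files contain the same read names in the same order, which is what a paired-end aligner expects.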



    • #3
      Thanks for the reply. I thought cutadapt did that with the -p option to specify paired-end data.
      I shall give BBDuk a try and see the results. I was not happy with the results of trimmomatic on my data, so staying away from that trimmer for now.

      Thanks



      • #4
        Just checking. You are not switching the R1/R2 files when you use them as input for tophat by mistake? That will produce discordant results for obvious reasons.



        • #5
          Nope I am not. R1 files are before the R2 files in the script.



          • #6
            BBMap will do spliced alignments so after you use BBDuk you may want to give BBMap a try on the side while you do your TopHat2 runs.



            • #7
              Thanks, I shall have a read and see what the results look like with BBDuk/BBMap while my tophat jobs are running
