
  • Tophat2 high discordant alignments

    Hi,

    I am mapping paired-end RNA-seq data using tophat2, but the alignment summary shows a very high discordant alignment rate. The only tophat options I am specifying are -p 16 and -o "DIR". Below is the output from tophat2:


    PHP Code:
    Left reads:
              Input     :  88556961
               Mapped   :  76938162 (86.9% of input)
                of these:  20429665 (26.6%) have multiple alignments (622137 have >20)
    Right reads:
              Input     :  88556961
               Mapped   :  75252663 (85.0% of input)
                of these:  20114304 (26.7%) have multiple alignments (621700 have >20)
    Unpaired reads:
              Input     :     68008
               Mapped   :     56927 (83.7% of input)
                of these:      8045 (14.1%) have multiple alignments (9 have >20)
    85.9% overall read mapping rate.

    Aligned pairs:  65389463
         of these:  18622479 (28.5%) have multiple alignments
                    61775607 (94.5%) are discordant alignments
     4.1% concordant pair alignment rate
    The flagstat output I get is also below:

    PHP Code:
    341377625 + 0 in total (QC-passed reads + QC-failed reads)
    189129873 + 0 secondary
    0 + 0 supplimentary
    0 + 0 duplicates
    341377625 + 0 mapped (100.00%:-nan%)
    152190825 + 0 paired in sequencing
    76938162 + 0 read1
    75252663 + 0 read2
    263998 + 0 properly paired (0.17%:-nan%)
    130778926 + 0 with itself and mate mapped
    21411899 + 0 singletons (14.07%:-nan%)
    116104620 + 0 with mate mapped to a different chr
    80799382 + 0 with mate mapped to a different chr (mapQ>=5)
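
    As a sanity check, the percentages in both reports can be reproduced from the raw counts. The short Python sketch below is only arithmetic on the numbers quoted above; the formula for the concordant rate (aligned pairs minus discordant pairs, divided by input pairs) is an assumption that happens to match the reported 4.1%.

```python
# Counts copied from the TopHat2 summary and the samtools flagstat output above.
input_pairs = 88_556_961            # reads in each of the left/right inputs
aligned_pairs = 65_389_463
discordant_pairs = 61_775_607
paired_reads = 152_190_825          # flagstat "paired in sequencing"
properly_paired_reads = 263_998

# Share of aligned pairs that TopHat2 flags as discordant.
discordant_pct = 100 * discordant_pairs / aligned_pairs

# Assumption: concordant pairs = aligned pairs - discordant pairs,
# expressed relative to the number of input pairs.
concordant_pct = 100 * (aligned_pairs - discordant_pairs) / input_pairs

# flagstat's "properly paired" percentage is relative to paired reads.
properly_paired_pct = 100 * properly_paired_reads / paired_reads

print(f"discordant:      {discordant_pct:.1f}%")
print(f"concordant:      {concordant_pct:.1f}%")
print(f"properly paired: {properly_paired_pct:.2f}%")
```

    Both tools therefore agree that almost no pairs align the way a synchronized paired-end library would.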

    I am using cutadapt to remove adapters and low-quality reads, and that is running fine. But when I pass the paired files on to tophat, the results don't seem good. From what I have read, it is to do with the mate pairs no longer being in sync in the two fastq files. Is there a way around this and to get the number of discordant alignments down?

    I have tried aligning the fastq files with tophat2 without passing the files through cutadapt first and the alignment is fine and there is a very low discordant alignment rate, so I'm guessing the fastq files are good, but something is happening after the cutadapt step.
    Just as a note, I am not using Galaxy for the analysis.
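
    One way to confirm the "out of sync" suspicion before re-running anything is to walk both trimmed files in parallel and compare read names. The following is a minimal Python sketch, assuming plain 4-line FASTQ records (optionally gzipped) and read names that differ at most by a trailing /1 or /2; the function names are made up for this example.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read name of every record in a 4-line-per-record FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 != 0 or not line.strip():
                continue  # only header lines carry the read name
            name = line.strip()[1:].split()[0]  # drop '@' and any comment field
            if name.endswith(("/1", "/2")):
                name = name[:-2]  # drop old-style mate suffix
            yield name


def first_mismatch(r1_path, r2_path):
    """Return (record_index, r1_name, r2_name) for the first pair whose names
    differ, or None if both files list the same reads in the same order."""
    pairs = zip_longest(read_ids(r1_path), read_ids(r2_path))
    for index, (name1, name2) in enumerate(pairs):
        if name1 != name2:
            return index, name1, name2
    return None
```

    Calling first_mismatch("trimmed_R1.fastq", "trimmed_R2.fastq") (hypothetical file names) returns None when the files are in sync; a name of None in the result means one file ran out of reads before the other.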

    Thanks

  • #2
    Originally posted by ea11
    From what I have read, it is to do with the mate pairs no longer being in sync in the two fastq files. Is there a way around this and to get the number of discordant alignments down?
    Thanks
    Use a paired-end-aware trimmer such as Trimmomatic or BBDuk (from BBMap), which keeps the paired-end files in sync after trimming.

    That said, if you are happy with the cutadapt results and just want to fix the PE read order you can do so by using repair.sh from BBMap (paired end reads in two files example): http://seqanswers.com/forums/showpos...0&postcount=45
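
    To illustrate what "fixing the PE read order" means, here is a small Python sketch of re-pairing by read name. It is not repair.sh and does not reproduce its behavior or options; it is an in-memory toy that assumes 4-line FASTQ records and names differing only by a trailing /1 or /2, so it is suitable for small test files only. The function names are hypothetical.

```python
def parse_fastq(lines):
    """Group FASTQ lines into 4-line records keyed by read name (order kept)."""
    lines = [line.rstrip("\n") for line in lines]
    records = {}
    for start in range(0, len(lines) - 3, 4):
        header = lines[start]
        if not header.startswith("@"):
            raise ValueError(f"expected a FASTQ header, got: {header!r}")
        name = header[1:].split()[0]          # drop '@' and any comment field
        if name.endswith(("/1", "/2")):
            name = name[:-2]                  # drop old-style mate suffix
        records[name] = lines[start:start + 4]
    return records


def repair(r1_lines, r2_lines):
    """Return (paired_r1, paired_r2, singletons).

    Paired records appear in R1 order, with the same read at the same
    position in both lists; reads without a mate end up in singletons."""
    r1 = parse_fastq(r1_lines)
    r2 = parse_fastq(r2_lines)
    shared = [name for name in r1 if name in r2]
    paired_r1 = [r1[name] for name in shared]
    paired_r2 = [r2[name] for name in shared]
    singletons = [r1[n] for n in r1 if n not in r2]
    singletons += [r2[n] for n in r2 if n not in r1]
    return paired_r1, paired_r2, singletons
```

    For real data sets of this size (tens of millions of reads per file) a dedicated tool is the practical choice; the sketch only shows the bookkeeping involved.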



    • #3
      Thanks for the reply. I thought cutadapt did that with the -p option to specify paired-end data.
      I shall give BBDuk a try and see the results. I was not happy with the results of trimmomatic on my data, so I am staying away from that trimmer for now.

      Thanks



      • #4
        Just checking. You are not switching the R1/R2 files when you use them as input for tophat by mistake? That will produce discordant results for obvious reasons.
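
        A quick way to double-check which file holds which mate is to look at the first header of each file. The Python sketch below assumes one of the two common Illumina header styles (a trailing /1 or /2, or a comment field starting with "1:" or "2:"); the function name is made up for this example, and headers in any other style simply return None.

```python
def mate_number(header):
    """Return 1 or 2 if a FASTQ header line identifies its mate, else None."""
    fields = header.strip().split()
    if not fields:
        return None
    name = fields[0]
    # Old-style names: @read_name/1 or @read_name/2
    if name.endswith("/1"):
        return 1
    if name.endswith("/2"):
        return 2
    # Newer Illumina style: @instrument:run:... 1:N:0:INDEX
    if len(fields) > 1 and fields[1][:2] in ("1:", "2:"):
        return int(fields[1][0])
    return None
```

        If the first header of the file passed as R1 reports mate 2, the inputs are swapped.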



        • #5
          Nope I am not. R1 files are before the R2 files in the script.



          • #6
            BBMap will do spliced alignments, so after you use BBDuk you may want to give BBMap a try on the side while you do your TopHat2 runs.



            • #7
              Thanks, I shall have a read and see what the results look like with BBDuk/BBMap while my tophat jobs are running.
