  • Aligning PE reads of different lengths with TopHat

    Hello!
    I want to know if TopHat 1.2.0 can work with paired-end reads of different lengths without problems.
    Can I modify -r/--mate-inner-dist to get correct output?

    Example: I have a sample with an insert size of 200bp, a first read of 75bp and a second read of 50bp (shortened because quality decreases).
    Can I set the -r option to 200-(75-50)? Is that correct?

    Are there other TopHat options that I must consider?

    Thanks in advance

    Valeria
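
For reference, a minimal sketch of the -r arithmetic in the example above. It assumes the 200bp "insert size" is the length of the whole fragment including both reads, and it follows the TopHat manual's definition of -r/--mate-inner-dist as the expected (mean) inner distance between mates, i.e. the fragment length minus both read lengths. The function name `mate_inner_dist` is hypothetical and used only for illustration.

```python
def mate_inner_dist(fragment_len: int, read1_len: int, read2_len: int) -> int:
    """Expected inner distance between mates.

    Assumption: fragment_len covers both reads plus the unsequenced gap
    between them (no adapters).
    """
    return fragment_len - read1_len - read2_len


# Values from the question: 200bp fragment, 75bp and 50bp reads.
inner = mate_inner_dist(200, 75, 50)
print(inner)

# The expression proposed in the question, for comparison.
proposed = 200 - (75 - 50)
print(proposed)
```

Under that assumption the value to pass would be 200 - 75 - 50 rather than 200 - (75 - 50). If the 200bp figure instead refers only to the gap between the reads, no subtraction is needed.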

  • #2
    The TopHat manual explicitly warns against this. The recommended procedure is to align the different read sizes in separate runs and merge the resulting BAM files downstream.

    Are you really truncating reads based on quality? If so, you will be stuck breaking them into separate files by length.

    An option I would consider, but have not tried, is to set the quality values to 0 instead of actually trimming the data. I'm not sure whether TopHat would actually ignore those bases if that is done.
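
A minimal sketch of the masking idea above, untested against TopHat. It assumes Phred+33 quality encoding (so Phred 0 is the character `!`) and a fixed cut position; the helper name `mask_tail_quality` is hypothetical. Whether TopHat then ignores the masked bases remains an open question, as noted.

```python
def mask_tail_quality(qual: str, keep: int, offset: int = 33) -> str:
    """Keep the first `keep` quality characters and set the rest to Phred 0.

    Assumption: `offset` is the ASCII offset of the encoding
    (33 for Phred+33; older Illumina pipelines used 64).
    The read sequence itself is left untouched, so its length does not change.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    masked_len = max(len(qual) - keep, 0)
    return qual[:keep] + chr(offset) * masked_len


# Example: an 8-base quality string, masking everything after base 5.
masked = mask_tail_quality("IIIIIIII", 5)
print(masked)
```

Because the sequence line is not shortened, all reads keep the same length, which is the point of masking instead of trimming.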

    • #3
      So I am also interested in this. To be exact, the manual specifically warns against using different "types" of reads:

      NOTE: TopHat can align reads that are up to 1024 bp, and it handles paired end reads, but we do not recommend mixing several "types" of reads in the same TopHat run. For example, mixing 100bp single end reads and 2x27bp paired ends into the same TopHat run will give bad results.
      Their example only illustrates that it's bad to mix paired and unpaired reads. It doesn't mention using a PE library that has had non-uniform quality trimming. Does anyone have knowledge of or experience with this specific case?

      Gus
      In science, "fact" can only mean "confirmed to such a degree that it would be perverse to withhold provisional assent." I suppose that apples might start to rise tomorrow, but the possibility does not merit equal time in physics classrooms.
      --Stephen Jay Gould


      • #4
        Has anybody tried using different read lengths for R1 and R2 when aligning PE data? Are there any cases TopHat may not be able to handle?

        Due to quality issues I had to trim the last 50bp of R2 in a 100bp PE run, but the R1 reads are still 100bp.

        Thanks


        • #5
          Originally posted by selen:
          Anybody tried to use different length reads for R1 and R2 for aligning their PE data? Any concerns tophat may not be able to handle?
          That'll work fine. Reads of unequal length are common after trimming.
