
  • Tophat chromosome position greater than chromosome size

    I used TopHat to map paired-end RNA-seq reads. Some of the records in the resulting .sam file do not make sense. Please see the examples below:

    Example line 1:
    NB500923:20:H7WCJBGXX:1:12109:13961:14748 385 chr1 11646 3 151M chr22 114359002 0 CTTTTGGATTTTTGCCAGTCTAACAGGTGAAGCCCTGGAGATTCTTATTAGTGATTTGGGCTGGGGCCTGGCCATGTGTATTTTTTTAAATTTCCACTGATGATTTTGCTGCATGGCCGGTGTTGAGAATGACTGCGCAAATTTGCCGGAT AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE<EEEEEEEEEEEEEEEEEEEEE/AEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEAEA/EAA<EEEEEEAEA/<AEE XA:i:0 MD:Z:151 NM:i:0 NH:i:2 CC:Z:chr15 CP:i:102519374 HI:i:0

    The line above reports the mate's start position on chr22 as 114359002; however, chr22 is only 51304566 bp long.

    Example line 2:
    NB500923:20:H7WCJBGXX:1:23107:14442:20318 323 chr1 11696 0 151M chrUn_GL000249 155257832 0 GTGATTTGGGCTGGGGCCTGGCCATGTGTATTTTTTTAAATTTCCACTGATGATTTTGCTGCATGGCCGGTGTTGAGAATGACTGTGCAAATTTGCCGGATTTCCTTCGCTGTTCCTGCATGTAGTTTAAACGAGATTGCCAGCACCGGGT AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEAEEEEEAEEEEEEE<EEEEEEEEEEEEEEEAAEEA<EEEEEE XA:i:2 MD:Z:85C21T43 NM:i:2 NH:i:8 CC:Z:= CP:i:11696 HI:i:4

    Here the mate's start position on chrUn_GL000249 is reported as 155257832; however, chrUn_GL000249 is only 38502 bp long.

    Can anyone tell me how this could happen, and how to fix the file?

    Thanks,

  • #2
    Can anyone help?



    • #3
      Looks like this is a bug in the version of tophat2 that you're using. If you're using the most recent version, then please report this to the authors so they can fix it.
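
To gauge how many records are affected before reporting the problem, the check described in the first post (position versus chromosome size) can be sketched as below. This is a minimal sketch, not part of TopHat or samtools: the function name `find_out_of_range` is hypothetical, and it assumes a tab-separated SAM stream whose header contains `@SQ` lines with `SN` and `LN` fields. It only flags suspicious records; it does not repair them.

```python
def find_out_of_range(sam_lines):
    """Yield (qname, field, reference, position, reference_length) for every
    record whose POS or PNEXT lies beyond the end of its reference sequence.

    Hypothetical helper: assumes tab-separated SAM text that includes the
    @SQ header lines (SN = sequence name, LN = sequence length).
    """
    lengths = {}
    for line in sam_lines:
        line = line.rstrip("\n")
        if line.startswith("@"):
            # Collect reference lengths from the header.
            if line.startswith("@SQ"):
                tags = dict(
                    field.split(":", 1)
                    for field in line.split("\t")[1:]
                    if ":" in field
                )
                if "SN" in tags and "LN" in tags:
                    lengths[tags["SN"]] = int(tags["LN"])
            continue

        cols = line.split("\t")
        if len(cols) < 11:
            continue  # not a complete alignment record

        qname, rname, pos = cols[0], cols[2], int(cols[3])
        rnext, pnext = cols[6], int(cols[7])
        if rnext == "=":
            rnext = rname  # "=" means the mate is on the same reference

        for field, ref, position in (("POS", rname, pos), ("PNEXT", rnext, pnext)):
            # Unknown references ("*" or missing from the header) are skipped.
            if ref in lengths and position > lengths[ref]:
                yield qname, field, ref, position, lengths[ref]
```

The generator can be fed an open file handle for a SAM file that still has its header, for example the text produced by `samtools view -h` on the BAM output. Each yielded tuple names the read, which field is out of range, and the reference length it was compared against.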
