  • Bug in GAP5: Finding spanning read pairs

    Hi all,

    I've found that when I use Find Read Pairs in GAP5 to find read pairs that span contigs in an assembly, the information given by the Contig Comparator ALWAYS reports both reads of every spanning pair as being in the forwards direction, e.g.

    Read pair:
    From contig DENOVO_c11(#2157181) at 64 reading GAPC_0042_FC:6:104:8598:9213#0/1(#2156633)
    With contig DENOVO_c224(#36154940) at 462 reading GAPC_0042_FC:6:104:8598:9213#0/2(#36162837)
    Direction of first read is forwards
    Direction of second read is forwards
    Length 35

    Read pair:
    From contig DENOVO_c190(#34488635) at 1191 reading GAPC_0042_FC:6:2:17711:1932#0/1(#34493286)
    With contig DENOVO_c354(#38868888) at 2904 reading GAPC_0042_FC:6:2:17711:1932#0/2(#38887671)
    Direction of first read is forwards
    Direction of second read is forwards

    I have three libraries: two paired-end (>--<) and one mate-pair (<---->).
    No matter which library or which type of comparison I choose (end vs end, all vs all, end vs all), the read pairs reported in the "Contig Comparator" are ALWAYS in the forwards forwards direction.
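One way to check the "always forwards forwards" observation across a whole report, rather than by eye, is to tally the direction combinations in the saved Contig Comparator text. The following is a minimal sketch, assuming the output has been copied into a string and that every record follows the layout of the two examples above; the regular expression and the function names `parse_read_pairs` and `tally_directions` are hypothetical helpers, not part of GAP5.

```python
import re
from collections import Counter

# Pattern modelled on the two "Read pair:" records quoted in this post.
# It is an assumption that all records share this exact layout.
PAIR_RE = re.compile(
    r"From contig (?P<contig1>\S+)\(#\d+\) at (?P<pos1>-?\d+) "
    r"reading (?P<read1>\S+)\(#\d+\)\s+"
    r"With contig (?P<contig2>\S+)\(#\d+\) at (?P<pos2>-?\d+) "
    r"reading (?P<read2>\S+)\(#\d+\)\s+"
    r"Direction of first read is (?P<dir1>\w+)\s+"
    r"Direction of second read is (?P<dir2>\w+)"
)


def parse_read_pairs(text):
    """Return one dict per read-pair record found in the report text."""
    pairs = []
    for match in PAIR_RE.finditer(text):
        record = match.groupdict()
        record["pos1"] = int(record["pos1"])
        record["pos2"] = int(record["pos2"])
        pairs.append(record)
    return pairs


def tally_directions(pairs):
    """Count how often each (first, second) direction combination occurs."""
    return Counter((p["dir1"], p["dir2"]) for p in pairs)


# The two records quoted above, used as sample input.
SAMPLE = """
Read pair:
From contig DENOVO_c11(#2157181) at 64 reading GAPC_0042_FC:6:104:8598:9213#0/1(#2156633)
With contig DENOVO_c224(#36154940) at 462 reading GAPC_0042_FC:6:104:8598:9213#0/2(#36162837)
Direction of first read is forwards
Direction of second read is forwards
Length 35

Read pair:
From contig DENOVO_c190(#34488635) at 1191 reading GAPC_0042_FC:6:2:17711:1932#0/1(#34493286)
With contig DENOVO_c354(#38868888) at 2904 reading GAPC_0042_FC:6:2:17711:1932#0/2(#38887671)
Direction of first read is forwards
Direction of second read is forwards
"""

if __name__ == "__main__":
    for combo, count in tally_directions(parse_read_pairs(SAMPLE)).items():
        print(combo, count)
```

If the full report yields only the single `('forwards', 'forwards')` combination, that matches the behaviour described here. The tally only summarises what is reported; it cannot tell which of the reported directions are actually wrong.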

    When I look manually at the two examples above:
    In the first example both directions are correct, but the second read is not at position 462 in the second contig; it is at 1847. Is it a coincidence that 462 is 1/4 of 1847, rounded up?

    In the second example the reads are positioned correctly, but the second read is in the reverse direction in its contig, not forwards.
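The arithmetic behind the 462 versus 1847 observation in the first example can be checked directly. This is only a sketch of the numerical relationship; the helper name `is_quarter_rounded_up` is hypothetical, and the check says nothing about why GAP5 would report such a value.

```python
def is_quarter_rounded_up(reported, actual):
    """True if `reported` equals `actual` divided by 4, rounded up.

    Integer arithmetic is used so no floating-point rounding is involved.
    """
    return reported == -(-actual // 4)


# Values from the first example: reported position 462, observed position 1847.
print(1847 / 4)                          # exact quotient before rounding
print(is_quarter_rounded_up(462, 1847))  # does the reported value fit the pattern?
```

Running the same check over other mismatched pairs would show whether the factor of four is consistent or a one-off.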

    So in my mate-pair library, which I want to use for scaffolding, "Find Read Pairs" finds 85000 spanning read pairs (of 650000 pairs in the library). EVERY spanning pair is reported in the forwards forwards direction.
    At first I thought that only forwards forwards pairs were being detected, but after manually checking 10 of the reported spanning pairs, it seems the problem is in how the direction of the second read is reported: it is wrong in half the cases. Also, in one case the reported position of the second read is incorrect, being 1/4 of the true position.
    I manually found some spanning reverse reverse pairs using the template status, and none of these are reported. So "Find Read Pairs" found no reverse reverse pairs, even though they are present in the assembly when I look for spanning reads manually using the template status colours.

    I've tried to report the problem on the Staden Sourceforge site but have not had a response.

    Regards
    Robert

  • #2
    Could you add a link to your SourceForge report?

    Which input file format are you giving GAP5, and how was it created (in case the error can be traced to the input data)?

    • #3
      The input was a CAF file generated by MIRA; I used tg_index to create the database from it. The CAF file came from a MIRA version 4.04 assembly of 454 Titanium reads and Illumina mate-pair and paired-end reads, but this afternoon I checked assemblies from MIRA versions 3.18 and 4.03 and got the same result.

      The report was essentially the same email, sent to the Staden discussion on Sourceforge via email, as this appeared to be the only way I could find to submit a report about the Staden package.

      I don't think this is related to the input data. It seems to be related to reporting and "Find Read Pairs" as the reads are displayed correctly and read pairs are correct in the template status and in the editors in GAP5.

      • #4
        Originally posted by rwillows
        The report was essentially the same email, sent to the Staden discussion on Sourceforge via email, as this appeared to be the only way I could find to submit a report about the Staden package.
        Their issue tracker is here:


        Also they have a public discussion forum:


        Originally posted by rwillows
        I don't think this is related to the input data. It seems to be related to reporting and "Find Read Pairs" as the reads are displayed correctly and read pairs are correct in the template status and in the editors in GAP5.
        OK - if the read pairs are displayed correctly then it probably isn't the input data.

        • #5
          Thanks.
          I've posted a report.

          • #6
            Originally posted by rwillows
            Thanks.
            I've posted a report.
            Staden Bug 102, http://sourceforge.net/p/staden/bugs/102/
