  • Bug in GAP5: Finding spanning read pairs

    Hi all,

    I've discovered that when Find Read Pairs in GAP5 is used to find read pairs that span contigs in an assembly, every read pair listed in the information the Contig Comparator gives about the spanning pairs is ALWAYS reported as forwards and forwards, e.g.

    Read pair:
    From contig DENOVO_c11(#2157181) at 64 reading GAPC_0042_FC:6:104:8598:9213#0/1(#2156633)
    With contig DENOVO_c224(#36154940) at 462 reading GAPC_0042_FC:6:104:8598:9213#0/2(#36162837)
    Direction of first read is forwards
    Direction of second read is forwards
    Length 35

    Read pair:
    From contig DENOVO_c190(#34488635) at 1191 reading GAPC_0042_FC:6:2:17711:1932#0/1(#34493286)
    With contig DENOVO_c354(#38868888) at 2904 reading GAPC_0042_FC:6:2:17711:1932#0/2(#38887671)
    Direction of first read is forwards
    Direction of second read is forwards

    I have three libraries: two paired-end (>--<) and one mate-pair (<---->).
    No matter which library I choose or which type of comparison I choose (end vs end, all vs all, end vs all), the read pairs reported in the "Contig Comparator" are ALWAYS in the forwards forwards direction.

    When manually looking at the two examples above:
    In the first example both directions are correct, but the position reported for the read in the second contig is not 462 but 1847. Is it a coincidence that 462 is 1/4 of 1847, rounded up?

    In the second example the reads are positioned correctly, but the direction of the second read in the contig is reverse, not forwards.
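
The arithmetic behind the 462 versus 1847 observation can be checked directly. This is only a sketch of that one calculation; the helper name is made up for illustration, and a single match does not show where the discrepancy comes from.

```python
import math


def is_rounded_up_quarter(reported, observed):
    """Return True if `reported` equals `observed` / 4 rounded up."""
    return math.ceil(observed / 4) == reported


# Positions from the first example: 462 reported, 1847 seen in the contig.
print(1847 / 4)                          # 461.75
print(is_rounded_up_quarter(462, 1847))  # True
```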

    So in my library of mate pairs, which I want to use for scaffolding, "find read pairs" finds 85000 spanning read pairs (of 650000 pairs in the library). EVERY SPANNING pair is reported in the forwards forwards direction.
    At first I thought that only forwards forwards pairs were detected, but after manually looking at 10 of the reported spanning pairs, it seems the problem is with the reporting of the direction of the second read. The direction is wrong for the second read in half the cases. Also, in one case the reported position of the second read is incorrect, being 1/4 of the actual position.
    I manually found some spanning reverse reverse pairs using the template status, and none of these get reported. So "find read pairs" found no reverse reverse pairs, even though they are present in the assembly when looking manually for spanning reads using the template status colours.
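
One way to confirm that every reported pair really is forwards forwards is to save the text shown for the spanning read pairs and count the direction combinations. A minimal sketch, assuming the text has been copied into a string in exactly the format quoted above; the function name and the regular expression are illustrative and not part of GAP5, and the wording for a reverse read is assumed to be a single word in the same position.

```python
import re
from collections import Counter

# Matches lines such as "Direction of first read is forwards".
DIRECTION_RE = re.compile(r"Direction of (first|second) read is (\w+)")


def tally_directions(report_text):
    """Count (first, second) direction combinations per reported read pair."""
    counts = Counter()
    current = {}
    for line in report_text.splitlines():
        line = line.strip()
        if line.startswith("Read pair:"):
            current = {}  # a new pair block begins
            continue
        match = DIRECTION_RE.match(line)
        if match:
            current[match.group(1)] = match.group(2)
            if "first" in current and "second" in current:
                counts[(current["first"], current["second"])] += 1
                current = {}
    return counts


# The two pairs quoted in this post.
SAMPLE = """
Read pair:
From contig DENOVO_c11(#2157181) at 64 reading GAPC_0042_FC:6:104:8598:9213#0/1(#2156633)
With contig DENOVO_c224(#36154940) at 462 reading GAPC_0042_FC:6:104:8598:9213#0/2(#36162837)
Direction of first read is forwards
Direction of second read is forwards
Length 35

Read pair:
From contig DENOVO_c190(#34488635) at 1191 reading GAPC_0042_FC:6:2:17711:1932#0/1(#34493286)
With contig DENOVO_c354(#38868888) at 2904 reading GAPC_0042_FC:6:2:17711:1932#0/2(#38887671)
Direction of first read is forwards
Direction of second read is forwards
"""

print(dict(tally_directions(SAMPLE)))  # {('forwards', 'forwards'): 2}
```

If the full report yields only the forwards forwards key across all 85000 pairs, that matches what is described here; any other key would mean some directions are being reported after all.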

    I've tried to report the problem on the Staden Sourceforge site but have not had a response.

    Regards
    Robert

  • #2
    Could you add a link to your SourceForge report?

    Which input file format are you giving GAP5, and how was it created (in case the error can be traced to the input data)?

    • #3
      The input was a CAF file generated by MIRA. I used tg_index to create the database from the CAF file. The CAF file came from a MIRA version 4.04 assembly of 454 Titanium reads and Illumina mate-pair and paired-end reads, but this afternoon I checked assemblies from MIRA versions 3.18 and 4.03, with the same result.

      The report was essentially this same message, sent by email to the Staden discussion list on SourceForge, as this appeared to be the only way I could find to submit a report about the Staden package.

      I don't think this is related to the input data. It seems to be related to reporting and "Find Read Pairs" as the reads are displayed correctly and read pairs are correct in the template status and in the editors in GAP5.

      • #4
        Originally posted by rwillows
        The report was essentially this same message, sent by email to the Staden discussion list on SourceForge, as this appeared to be the only way I could find to submit a report about the Staden package.
        Their issue tracker is here:


        Also they have a public discussion forum:


        Originally posted by rwillows
        I don't think this is related to the input data. It seems to be related to reporting and "Find Read Pairs" as the reads are displayed correctly and read pairs are correct in the template status and in the editors in GAP5.
        OK - if the read pairs are displayed correctly then it probably isn't the input data.

        • #5
          Thanks.
          I've posted a report.

          • #6
            Originally posted by rwillows
            Thanks.
            I've posted a report.
            Staden Bug 102, http://sourceforge.net/p/staden/bugs/102/