  • Where is the read?

    Hello,

    I have "lost" one read in a 454 assembly. The read is in the .sff file used to run the assembly.
    It is also listed in the 454ReadStatus.txt file, which says it became a Singleton:

    nameoftheread Singleton

    It has also been trimmed:
    Accno          Trimpoints Used  Used Trimmed Length  Orig Trimpoints  Orig Trimmed Length  Raw Length
    nameoftheread  5-270            266                  5-270            266                  270

    I have the "allContigThresh" parameter set to 100.

    Does anybody know why this read isn't in the 454AllContigs.fna output?
    Where has it gone?

    Thanks in advance.






    P.S. Here are the bases and quality scores extracted from the .sff file, in case they help:

    Bases: tcagTCGCAAGTGCCACGACCAGAAAGAATTGATGGTGGCGGTTGTTTGCCACACGACCTCTCCCGAAGACTTTGGAGGTAATGCCGACATTGGATCTGGCAAGAACAAAGCTTAACCCTAATTTATTTTATAGCAATAGCAAGGGTTAAGCTTTGTTCTTTCCATATTTGATGTCAGCATTCCCTCCTAAGACTCAGGAATGGATGTGGAAATTGTGCTTCGCAAGAACCTACAGTCGTCGCTTCGGTATGGACAAAGCTTGAAGGTTG
    Quality Scores: 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 33 20 20 23 23 22 22 31 31 33 33 35 35 35 35 35 35 36 35 35 35 35 35 35 35 35 35 31 31 31 33 35 35 35 35 33 33 33 35 35 31 31 31 33 33 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 33 33 33 35 35 35 35 35 35 35 35 35 35 35 35 33 31 31 31 33 35 35 35 35 31 31 31 35 35 31 31 31 31 17 17 17 25 24 24 25 25 25 31 20 20 20 20 35 35 35 35 33 31 31 32 35 35 35 30 30 21 21 21 21 21 33 33 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 33 33 33 35 35 35 35 35 35 35 35 35 35 35 35 33 33 33 35 35 35 35 35 35 35 35 35 35 35 35 35 35 33 33 33 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 35 33 31 31 33 34 34 34 34 34 34 27 27 27 33 34 34 34 34 34 34 34 34 34 34 31 31 31 26 26 26 28 28 28 28 29 28 28 28 28 28 28 28 28 28 22 22 22 22 22 22 22

  • #2
    The answer is there in the 454ReadStatus.txt file. This read is a "Singleton", meaning it was not assembled into any contig; that is why it is not in 454AllContigs.fna. That file does not include singletons.

    If your assembly is typical, you should have many more than just one Singleton remaining. Look at the readStatus section of your 454NewblerMetrics.txt file to see the totals for Assembled, Singleton, and the other read statuses in your assembly.
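As a cross-check on those totals, the per-read statuses can also be tallied directly from 454ReadStatus.txt. The following is a minimal sketch, assuming each data row starts with the read accession followed by its status (as in the "nameoftheread Singleton" line quoted in the question) and that the header row starts with "Accno". The function name and the example rows are made up for illustration.

```python
from collections import Counter


def tally_read_status(lines):
    """Count how many reads fall into each status (Assembled, Singleton, ...)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header row (assumed to start with "Accno").
        if len(fields) < 2 or fields[0] == "Accno":
            continue
        # Assumption: the second whitespace-separated field is the read status.
        counts[fields[1]] += 1
    return counts


# Made-up rows, only to show the assumed shape of the input.
example = [
    "Accno\tRead Status",
    "readA\tAssembled",
    "readB\tSingleton",
    "readC\tAssembled",
]
print(tally_read_status(example))
```

On a real run, the open file handle for 454ReadStatus.txt could be passed in place of `example`.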

    If you want to recover a file of the unassembled (Singleton) reads you could create a text file listing their IDs (taken from the ReadStatus file) and use this list in conjunction with the sfffile and sffinfo tools to extract the singleton reads from the sff file.
    Last edited by kmcarr; 02-14-2011, 09:25 AM. Reason: Added sfffile to the programs needed to extract a subset of reads.
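The first step of that recovery, building the ID list, might look like the sketch below. It assumes the same row layout as the "nameoftheread Singleton" line in the question (accession first, status second). The function names and the output file name are hypothetical.

```python
def singleton_ids(lines):
    """Return the accessions of all reads whose status is Singleton."""
    ids = []
    for line in lines:
        fields = line.split()
        # Assumption: field 0 is the read accession, field 1 is its status.
        if len(fields) >= 2 and fields[1] == "Singleton":
            ids.append(fields[0])
    return ids


def write_id_list(ids, path):
    """Write one read accession per line to a plain text file."""
    with open(path, "w") as out:
        for read_id in ids:
            out.write(read_id + "\n")


# Made-up rows, only to show the assumed shape of the input.
example = [
    "Accno\tRead Status",
    "readA\tAssembled",
    "readB\tSingleton",
]
print(singleton_ids(example))

# On a real run (file names are placeholders):
# with open("454ReadStatus.txt") as handle:
#     write_id_list(singleton_ids(handle), "singleton_ids.txt")
```

The resulting text file is the list that would then be handed to sfffile to pull those reads out of the original .sff; check the tool's own help output for the exact option that takes an accession list.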



    • #3
      numreads=1?

      Thank you for your answer.
      But 454AllContigs.fna contains a lot of "contigs" with numreads=1.
      They look like singletons to me. Aren't they?



      • #4
        Originally posted by drgoettel:
        Thank you for your answer.
        But 454AllContigs.fna contains a lot of "contigs" with numreads=1.
        They look like singletons to me. Aren't they?

        No, those are not classified as singletons. The read that makes up a single-read contig will be listed as Assembled in the 454ReadStatus file.

        Now you may ask: why are some reads singletons while others are assembled as single-read contigs? Not knowing the internals of the gsAssembler, I can't answer definitively, but my guess is that these single-read contigs were originally part of larger contigs and the reads were ripped out during later stages of the assembly for whatever reason. They are then classified as single-read contigs, as opposed to singletons, which are reads that were never included in any contig. Again, this is only a guess.
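To see which contigs are affected, the single-read contigs can be listed from the FASTA headers of 454AllContigs.fna and their reads then looked up in 454ReadStatus.txt. This is only a sketch: it assumes each header carries a `numreads=N` field, as mentioned earlier in the thread, and the contig names and lengths in the example are made up.

```python
import re

# Matches the "numreads=N" field assumed to appear in each contig header.
NUMREADS = re.compile(r"numreads=(\d+)")


def single_read_contigs(fasta_lines):
    """Return the names of contigs whose header reports numreads=1."""
    names = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        match = NUMREADS.search(line)
        if match and int(match.group(1)) == 1:
            # The contig name is the first token after the ">" marker.
            names.append(line[1:].split()[0])
    return names


# Made-up headers and sequences, only to show the assumed layout.
example = [
    ">contig00001  length=1500   numreads=42",
    "ACGT",
    ">contig00002  length=266   numreads=1",
    "TTGA",
]
print(single_read_contigs(example))
```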
