  • newbler problem

    Hi, I found that the 454Isotigs.fna file contains many sequences that are 100% identical but differ in length (i.e. one sequence contains another, shorter one). Isn't this supposed not to happen? I mean, shouldn't they be assembled as one? Thanks ...
    Last edited by bioben; 09-30-2010, 07:49 PM.

  • #2
    I forgot to say that I am trying to assemble ~10 million 454 ESTs and ~1 million Sanger ESTs. I also tried CAP3 and TGICL. Both also output more or less identical sequences in their contigs and singlets files.


    • #3
      Originally posted by bioben
      Hi, I found that the 454Isotigs.fna file contains many sequences that are 100% identical but differ in length (i.e. one sequence contains another, shorter one). Isn't this supposed not to happen? I mean, shouldn't they be assembled as one? Thanks ...
      This is gsAssembler (Newbler) saying that it believes there are two isoforms of the gene, one shorter than the other. Is it correct? That's where your biological expertise comes in. Personally, I would bet a large number of donuts that it's not correct. gsAssembler seems to be overzealous in finding isoforms.
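To see how many isotigs are affected before deciding whether the isoform calls are believable, one option is to list every sequence that is an exact substring of a longer one. The following is only a minimal sketch, not part of Newbler: the function names and the demo records are made up for illustration, and the all-against-all comparison is quadratic, so it is only practical for modest numbers of isotigs.

```python
def read_fasta(lines):
    """Parse FASTA lines into {record id: sequence}, using the first word of each header as the id."""
    records, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def contained_pairs(seqs):
    """Return (shorter_id, longer_id) pairs where the shorter sequence
    occurs exactly inside the longer one on either strand."""
    ordered = sorted(seqs.items(), key=lambda kv: len(kv[1]))
    pairs = []
    for i, (short_id, short_seq) in enumerate(ordered):
        rc = revcomp(short_seq)
        for long_id, long_seq in ordered[i + 1:]:
            if short_seq in long_seq or rc in long_seq:
                pairs.append((short_id, long_id))
    return pairs


# Hypothetical demo records; real input would be the lines of 454Isotigs.fna.
demo = """>isotig00001
ACGTACGTTTGACC
>isotig00002
GTACGTTTG
>isotig00003
TTTTCCCC
""".splitlines()

print(contained_pairs(read_fasta(demo)))
```

On a real assembly the same functions could be fed `open("454Isotigs.fna")`; whether a contained isotig is a true isoform or an artifact still has to be judged biologically.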


      • #4
        Thanks, kmcarr. I think you are right. They are probably splice variants.

        Then what about singlets? I tried to recover them by parsing the 454ReadStatus.txt file. The resulting singlets file also contains many identical reads. To me, these should have been assembled into one sequence and shown up in the isotigs file. Do people usually care about singlets or not? Thanks ...


        • #5
          Originally posted by bioben
          Then what about singlets? I tried to recover them by parsing the 454ReadStatus.txt file. The resulting singlets file also contains many identical reads. To me, these should have been assembled into one sequence and shown up in the isotigs file. Do people usually care about singlets or not? Thanks ...
          I suspect that the singletons are not assembled together simply because they are identical and are therefore treated as technical duplicates. It is hard to build a contig out of copies of a single identical read. If the reads merely overlapped, they could be assembled. Unfortunately, I do not know of a 454 output file that tells you which reads are true singletons and which are duplicate singletons.
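Since no output file separates true singletons from duplicated ones, one way to check is to group the singleton reads by exact sequence yourself. This is a rough sketch under stated assumptions: it assumes 454ReadStatus.txt is tab-separated with the read accession in the first column and the status (for example `Singleton`) in the second, and that the read sequences have already been loaded into a dictionary. The function names and demo data are hypothetical.

```python
from collections import defaultdict


def singleton_ids(status_lines):
    """Collect accessions whose status column reads 'Singleton'.

    Assumes tab-separated columns: accession first, status second.
    A header line is skipped automatically because its second field
    is not 'Singleton'.
    """
    ids = set()
    for line in status_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[1].strip() == "Singleton":
            ids.add(fields[0].strip())
    return ids


def group_identical(reads):
    """Map each sequence shared by two or more reads to the sorted list of read ids."""
    groups = defaultdict(list)
    for read_id, seq in reads.items():
        groups[seq.upper()].append(read_id)
    return {seq: sorted(ids) for seq, ids in groups.items() if len(ids) > 1}


# Made-up demo data standing in for 454ReadStatus.txt and the read sequences.
status = ["Accno\tRead Status", "r1\tSingleton", "r2\tAssembled",
          "r3\tSingleton", "r4\tSingleton"]
reads = {"r1": "ACGT", "r2": "ACGT", "r3": "acgt", "r4": "GGCC"}

singles = singleton_ids(status)
duplicates = group_identical({rid: reads[rid] for rid in singles})
print(duplicates)
```

Reads that end up alone in their group are candidates for true singletons; groups with several members are the exact duplicates discussed above. Near-identical reads that differ by a sequencing error would not be caught by this exact-match grouping.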


          • #6
            Hi bioben,
            I think you should read this thread: Detection of alternative splicing events from 454 output.
            It should answer a lot of your questions.


            • #7
              Originally posted by westerman
              I suspect that the singletons are not assembled together simply because they are identical and are therefore treated as technical duplicates. It is hard to build a contig out of copies of a single identical read. If the reads merely overlapped, they could be assembled. Unfortunately, I do not know of a 454 output file that tells you which reads are true singletons and which are duplicate singletons.
              I don't think so. Singletons are reads from regions poorly covered by emPCR. Also, some reads did overlap originally, but after trimming, or because of sequencing errors, Newbler no longer found the overlap. Set these in 454AssemblyProject.xml before you start the assembly:

              <minimumReadLength>45</minimumReadLength>
              <overlapSeedStep>1</overlapSeedStep>
              <overlapMinMatchLength>60</overlapMinMatchLength>
              <overlapMinMatchIdentity>96</overlapMinMatchIdentity>
              <ripMode>true</ripMode>

              Create a new cDNA assembly; do not re-run it from the current assembly directory because, as far as I can tell, Newbler does not recompute the overlaps there, so not all of the changes will take effect. With these settings I got 50% more assembled contigs than with the loose defaults!
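If you prefer not to edit the file by hand, the five values above can be written with a short script. This is only a sketch: the surrounding layout of 454AssemblyProject.xml is not shown in this thread, so the code assumes the elements already exist somewhere in the file and simply overwrites their text; the toy document and the function name are invented for illustration. Keep a backup of the real project file before trying anything like this.

```python
import xml.etree.ElementTree as ET

# Values taken from the post above.
SETTINGS = {
    "minimumReadLength": "45",
    "overlapSeedStep": "1",
    "overlapMinMatchLength": "60",
    "overlapMinMatchIdentity": "96",
    "ripMode": "true",
}


def apply_settings(xml_text, settings):
    """Overwrite the text of every element whose tag is a key in `settings`.

    Returns the updated XML as a string plus the list of tags that were
    not found (those are left for manual editing, not created).
    """
    root = ET.fromstring(xml_text)
    missing = []
    for tag, value in settings.items():
        matches = list(root.iter(tag))
        if not matches:
            missing.append(tag)
        for element in matches:
            element.text = value
    return ET.tostring(root, encoding="unicode"), missing


# Toy document with a hypothetical layout, NOT the real project file structure.
toy = ("<Project><Config>"
       "<minimumReadLength>20</minimumReadLength>"
       "<ripMode>false</ripMode>"
       "</Config></Project>")

updated, missing = apply_settings(toy, SETTINGS)
print(updated)
print("not found:", missing)
```

Tags reported as not found are deliberately not inserted, because where they belong in the real file is unknown here.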
