  • newbler problem

    Hi, I found that the 454Isotigs.fna file contains many sequences that are 100% identical but differ in length (i.e. one sequence contains another, shorter one). Isn't this not supposed to happen? I mean, shouldn't they be assembled as one? Thanks ...
    Last edited by bioben; 09-30-2010, 07:49 PM.

  • #2
    Forgot to say that I am trying to assemble ~10 million 454 ESTs and ~1 million Sanger ESTs. I also tried CAP3 and TGICL. They all, more or less, output identical sequences in the contigs and singlets files.
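
    To see how many isotigs are affected, one option is to list every sequence that is an exact substring of a longer one. The following is only a minimal sketch: `parse_fasta` and `find_contained` are hypothetical helper names, the comparison is all-against-all (fine for a few thousand isotigs, far too slow for millions of reads), and reverse-complement containment is not checked.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA record name to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def find_contained(records):
    """Return (shorter, longer) pairs where shorter is an exact substring of longer."""
    # Sort longest first so each sequence is only compared against shorter ones.
    names = sorted(records, key=lambda n: len(records[n]), reverse=True)
    pairs = []
    for i, longer in enumerate(names):
        for shorter in names[i + 1:]:
            same_length = len(records[shorter]) == len(records[longer])
            if not same_length and records[shorter] in records[longer]:
                pairs.append((shorter, longer))
    return pairs


if __name__ == "__main__":
    # Toy input; in practice read the contents of 454Isotigs.fna instead.
    example = ">isotig1\nACGTACGTAA\n>isotig2\nGTACG\n>isotig3\nTTTT\n"
    for shorter, longer in find_contained(parse_fasta(example)):
        print(shorter, "is contained in", longer)
```

    Each reported pair is a candidate for manual inspection; whether it reflects a real isoform or an assembly artifact still has to be decided biologically.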

    • #3
      Originally posted by bioben
      Hi, I found that the 454Isotigs.fna file contains many sequences that are 100% identical but differ in length (i.e. one sequence contains another, shorter one). Isn't this not supposed to happen? I mean, shouldn't they be assembled as one? Thanks ...
      This is the gsAssembler (Newbler) saying that it believes there are two isoforms of the gene, one being shorter than the other. Is it correct? That's where your biological expertise comes in. Personally I would bet a large number of donuts that it's not correct. gsAssembler seems to be overzealous in finding isoforms.

      • #4
        Thanks, kmcarr. I think you are right. Probably they are splicing variants.

        Then how about singlets? I tried to recover them by parsing the 454ReadStatus.txt file. The resulting singlets file also contains many identical reads. To me, these should have been assembled together and shown up in the isotigs file. Do people usually care about singlets or not? Thanks ...
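
        The parsing step mentioned above could look roughly like this. It is a sketch under assumptions: it supposes 454ReadStatus.txt is tab-delimited with the read accession in the first column and the read status in the second, and that singletons carry the status string `Singleton`; check both against your own file. `singleton_accessions` is a hypothetical helper name.

```python
def singleton_accessions(status_text):
    """Return accessions whose status column equals 'Singleton'.

    Assumes tab-delimited lines with the accession in column 1 and the
    read status in column 2 (an assumption about 454ReadStatus.txt).
    """
    accessions = []
    for line in status_text.splitlines():
        fields = line.split("\t")
        # The header line and any malformed lines simply fail this test.
        if len(fields) >= 2 and fields[1].strip() == "Singleton":
            accessions.append(fields[0].strip())
    return accessions


if __name__ == "__main__":
    # Toy input with made-up accessions; in practice pass the file contents.
    example = (
        "Accno\tRead Status\n"
        "READ0001\tAssembled\n"
        "READ0002\tSingleton\n"
        "READ0003\tSingleton\n"
    )
    print(singleton_accessions(example))
```

        The returned accessions can then be used to pull the matching reads out of the original read files.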

        • #5
          Originally posted by bioben
          Then how about singlets? I tried to recover them by parsing the 454ReadStatus.txt file. The resulting singlets file also contains many identical reads. To me, these should have been assembled together and shown up in the isotigs file. Do people usually care about singlets or not? Thanks ...
          I suspect that the singletons are not assembled together simply because they are identical and thus considered to be technical duplicates. It is hard to build a contig out of copies of exactly one identical read. If the reads overlap then they could be assembled. Unfortunately, I do not know of a 454 file that describes which reads are true singletons and which are duplicate singletons.
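
          In the absence of such a file, true and duplicate singletons can be told apart by grouping the singleton reads by sequence. This is a minimal sketch, not a Newbler feature: `group_identical` is a hypothetical helper, it only detects exact full-length identity (case-insensitive), and it ignores reverse complements and reads that merely contain one another.

```python
from collections import defaultdict


def group_identical(reads):
    """Split reads into unique ones and groups of exact duplicates.

    reads: dict mapping read name to sequence.
    Returns (unique_names, duplicate_groups), both sorted for stable output.
    """
    by_sequence = defaultdict(list)
    for name, seq in reads.items():
        by_sequence[seq.upper()].append(name)
    unique = sorted(names[0] for names in by_sequence.values() if len(names) == 1)
    duplicates = sorted(sorted(names) for names in by_sequence.values() if len(names) > 1)
    return unique, duplicates


if __name__ == "__main__":
    # Toy input with made-up read names.
    example = {"r1": "ACGT", "r2": "acgt", "r3": "GGGG"}
    unique, duplicates = group_identical(example)
    print("true singletons:", unique)
    print("duplicate groups:", duplicates)
```

          Keeping one representative per duplicate group gives a non-redundant singleton set.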

          • #6
            Hi bioben,
            I think you should read this thread: Detection of alternative splicing events from 454 output.
            It should answer a lot of questions.

            • #7
              Originally posted by westerman
              I suspect that the singletons are not assembled together simply because they are identical and thus considered to be technical duplicates. It is hard to build a contig out of copies of exactly one identical read. If the reads overlap then they could be assembled. Unfortunately, I do not know of a 454 file that describes which reads are true singletons and which are duplicate singletons.
              I don't think so. Singletons are reads from regions poorly covered by emPCR. Also, if some reads did overlap but were trimmed or contained sequencing errors, newbler may not have found the overlap. Set these in 454AssemblyProject.xml before you start the assembly:

              <minimumReadLength>45</minimumReadLength>
              <overlapSeedStep>1</overlapSeedStep>
              <overlapMinMatchLength>60</overlapMinMatchLength>
              <overlapMinMatchIdentity>96</overlapMinMatchIdentity>
              <ripMode>true</ripMode>

              Make a new cDNA assembly; do not re-run it from the current assembly directory, because in my opinion newbler does not re-compute the overlaps and hence not all changes will take effect. With these settings I got 50% more assembled contigs than with the loose defaults!
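
              If you prefer to script the change rather than edit the file by hand, something like the following could work. It is a hedged sketch: the exact layout of 454AssemblyProject.xml is not shown in this thread, so the code only overwrites elements with these tag names wherever they already occur and reports the ones it cannot find. `apply_settings` and the `Project`/`Config` element names in the toy input are hypothetical. Note that ElementTree drops XML comments and the declaration when re-serializing, so keep a backup of the original file.

```python
import xml.etree.ElementTree as ET

# Values taken from the post above.
SETTINGS = {
    "minimumReadLength": "45",
    "overlapSeedStep": "1",
    "overlapMinMatchLength": "60",
    "overlapMinMatchIdentity": "96",
    "ripMode": "true",
}


def apply_settings(xml_text, settings=SETTINGS):
    """Set the text of every element whose tag is a key in settings.

    Returns (new_xml_text, missing_tags). Missing tags are not created,
    because their correct position in the real file is unknown here.
    """
    root = ET.fromstring(xml_text)
    missing = []
    for tag, value in settings.items():
        elements = list(root.iter(tag))
        if not elements:
            missing.append(tag)
        for element in elements:
            element.text = value
    return ET.tostring(root, encoding="unicode"), missing


if __name__ == "__main__":
    # Toy document; the real project file has a different, larger structure.
    example = (
        "<Project><Config>"
        "<minimumReadLength>40</minimumReadLength>"
        "<ripMode>false</ripMode>"
        "</Config></Project>"
    )
    new_xml, missing = apply_settings(example)
    print(new_xml)
    print("not found, add by hand:", missing)
```

              Any tag reported as missing has to be added manually in the appropriate place before starting the new assembly.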
