  • dgg32
    Junior Member
    • Oct 2011
    • 4

    mothur then assembly?

    Hi community.

    I have a question. Has anyone used mothur to process 454 sequence reads and then assembled them? The way I see it, mothur is quite powerful now: it includes the common sequence checks and PyroNoise, which helps correct some of the homopolymer errors in 454 data. In my opinion those errors are a huge problem in my work. So naturally I came to the idea: why not run mothur first and then assemble?

    The problem is that assemblers like MIRA or Newbler need more than a FASTA file to guarantee reasonable assembly quality. mothur, on the other hand, processes the raw SFF data, and when it is done only a FASTA file is generated for downstream use.

    So I would like to know: has anyone previously tried to couple mothur with, say, MIRA? Or has anyone first denoised the raw SFF data and then assembled?

    Thanks.
  • v_kisand
    Member
    • Jan 2009
    • 38

    #2
    Why do you want to skip the quality data? That is never a good idea :-) Homopolymers are a problem in 454 data anyway ...

    Isn't mothur meant for deep sequencing data of similar sequences, i.e. marker genes? Do you have metagenomic sequences from total DNA that you plan to assemble to get somewhat longer contigs?
    I would clean the 454 data of bad-quality reads and perhaps very short ones, check for artificial duplicates, and then assemble with MIRA.

    -veljo
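
A minimal sketch of the clean-up steps veljo describes (drop short reads, drop low-quality reads, drop likely artificial duplicates) might look like the following. Everything here is an assumption made for illustration: the function names, the thresholds (`min_length=100`, `min_mean_quality=20`, `prefix_length=50`) and the "same first 50 bases" duplicate heuristic are hypothetical and are not taken from mothur, MIRA or any post in this thread. Reads are assumed to be already parsed into `(read_id, sequence, qualities)` tuples.

```python
def mean_quality(quals):
    """Return the mean of a list of per-base quality scores (0.0 if empty)."""
    return sum(quals) / len(quals) if quals else 0.0


def clean_reads(reads, min_length=100, min_mean_quality=20, prefix_length=50):
    """Filter reads before assembly.

    reads: iterable of (read_id, sequence, qualities) tuples.
    All thresholds are illustrative assumptions, not recommended values.
    """
    seen_prefixes = set()
    kept = []
    for read_id, seq, quals in reads:
        # Drop very short reads.
        if len(seq) < min_length:
            continue
        # Drop reads whose average quality is low.
        if mean_quality(quals) < min_mean_quality:
            continue
        # Simplified duplicate check: a read whose first prefix_length bases
        # were already seen is treated as an artificial duplicate.
        prefix = seq[:prefix_length].upper()
        if prefix in seen_prefixes:
            continue
        seen_prefixes.add(prefix)
        # Quality values stay attached so they can still go to the assembler.
        kept.append((read_id, seq, quals))
    return kept


if __name__ == "__main__":
    example = [
        ("r1", "ACGT" * 30, [30] * 120),
        ("r2", "ACGT" * 30, [30] * 120),   # same prefix as r1
        ("r3", "ACGT" * 10, [30] * 40),    # too short
        ("r4", "TTGCA" * 30, [10] * 150),  # low mean quality
        ("r5", "GGCAT" * 30, [25] * 150),
    ]
    for read_id, _, _ in clean_reads(example):
        print(read_id)
```

Note that the prefix heuristic cannot tell artificial duplicates from natural ones, so in a metagenome it will also remove some genuine coverage.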


    • dgg32
      Junior Member
      • Oct 2011
      • 4

      #3
      Hello, veljo.

      Thanks for your reply.

      Yes, we have brute-force metagenomic DNA, and the assembly is necessary both for longer contigs and for reducing the workload, since contigs have fewer bases in total than the raw reads.

      In fact, your last suggestion is our current workflow. However, I am not quite satisfied with it, because a mere look at the contig FASTA gives me an unsettling feeling: the "T"s are rarely alone, and it just doesn't look "random". OK, someone asked me, "How do you know the original sequence is more random than this?" But it just doesn't look right to me.

      Therefore I would like to confirm my doubt and, if it turns out to be real, improve the workflow. mothur seems handy: everything is in one place and development is ongoing. So it would be nice if we could use it before the assembly.


      • themerlin
        Member
        • Feb 2010
        • 51

        #4
        I may be wrong, but I think that mothur would de-replicate your data by binning similar sequences. I don't think this is what you would want for assembly as it would mess up the contig coverage. You're probably better off sorting out sequencing errors on the back end after trimming on the front.


        • v_kisand
          Member
          • Jan 2009
          • 38

          #5
          Assembly of metagenomic reads is useful for other purposes, not for reducing workload... :-) Getting full genes, maybe even some operons, and making recruitment maps easier. However, I would not underestimate the power of the data in raw reads: their frequencies can be a semi-quantitative measure of abundance.

          Concerning mothur or any other tool for finding errors in homopolymers: there is no way to find them when the quality value is high. How would you pick out artefacts among a myriad of possible genes? That would be possible only if you had a sound, conserved model sequence to compare against. Half of your metagenomic reads will remain unknown because there are no similar sequences from any organism sequenced so far... 454 and some other NGS technologies have trouble determining the exact number of nucleotides in long homopolymers. Doublets, triplets, etc. should not be a problem, or am I wrong? Perhaps run a simple script over your reads to find the frequencies of Ts, TTs, TTTs and so on.

          In the end, the only option is to compare the same samples with another NGS technology known to have fewer problems with homopolymer detection...
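
The "simple script" for counting Ts, TTs, TTTs and so on could be sketched roughly as below. It tallies homopolymer runs by base and run length, so the same table can be produced for raw reads and for contigs and then compared. The function names and the bare-bones FASTA reader are illustrative assumptions, not part of mothur or MIRA.

```python
from collections import Counter
from itertools import groupby


def homopolymer_run_counts(sequences):
    """Count homopolymer runs, keyed by (base, run_length), over all sequences."""
    counts = Counter()
    for seq in sequences:
        # groupby splits the sequence into maximal runs of identical bases.
        for base, run in groupby(seq.upper()):
            counts[(base, sum(1 for _ in run))] += 1
    return counts


def read_fasta(path):
    """Yield sequences from a FASTA file; header lines are ignored."""
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if chunks:
                    yield "".join(chunks)
                chunks = []
            elif line:
                chunks.append(line)
    if chunks:
        yield "".join(chunks)


if __name__ == "__main__":
    # Toy input; for real data pass read_fasta("your_reads.fasta") instead.
    counts = homopolymer_run_counts(["ATTTGGA", "TTAC"])
    for (base, length), n in sorted(counts.items()):
        print(base, length, n)
```

Running it once on the cleaned reads and once on the contig FASTA gives two run-length distributions per base. A surplus of longer T runs in the contigs relative to the reads would point at the assembly step rather than at the sequencing.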
