  • shaohua.fan
    Member
    • Feb 2009
    • 10

    #61
    Originally posted by usad View Post
    Did you do random trimming, or did you trim them down to the region with the highest information gain (which is what we do)?

    I think it had large genomes in mind. It is really good in RAM consumption and quite ok in thread usage and thus speed. Maybe CLC4 brings some scaffold capabilities?

    Cheers,
    björn
    Hi Björn,
    I totally agree with you that CLC is good at controlling CPU and RAM usage. But I am just wondering why it doesn't support scaffolding, which is important for genome assembly.

    BTW, could you please tell me what the region with the highest information gain is? Right now I am just trimming the reads randomly, but I am thinking of using a sliding window to find a region with fewer homopolymers (for example, one where the homopolymer length is < 4).
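
A minimal sketch of the sliding-window idea described above, assuming plain read strings and a fixed trimmed length. The names `longest_homopolymer` and `first_clean_window`, the window size, and the choice to return the first acceptable window are all illustrative assumptions, not anything taken from CLC or another tool.

```python
from itertools import groupby


def longest_homopolymer(seq):
    """Length of the longest run of one repeated base in seq (0 if empty)."""
    return max((sum(1 for _ in run) for _, run in groupby(seq)), default=0)


def first_clean_window(read, window=36, max_run=3):
    """Return (start, subsequence) for the first window of the given length
    whose longest homopolymer run is <= max_run, or None if there is none.

    max_run=3 corresponds to "homopolymer length < 4" (an assumed cutoff).
    """
    for start in range(len(read) - window + 1):
        sub = read[start:start + window]
        if longest_homopolymer(sub) <= max_run:
            return start, sub
    return None


# Toy example with a short window so the result is easy to check by hand.
print(first_clean_window("AAAAACGTACGTTTTTACGATCGA", window=8, max_run=3))
# (2, 'AAACGTAC')
```

Recomputing the run length for every window is quadratic in the worst case, which should be acceptable for 454-length reads; a rolling update would be the obvious refinement for longer sequences.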


    • usad
      Member
      • Sep 2009
      • 53

      #62
      Hi

      Just the most unambiguous region. So if your whole 454 read aligns to one region and one region only, search for a short region within the read that would also allow placing it at this position only, and not on another contig, and use this as a pseudo-Illumina read.
      But it probably doesn't help too much; it will just give you a few more links. (Maybe simulate first what you can expect, based on your N50/average length or length distribution, and on linker length, quality and number.)
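
A rough sketch of how picking such an unambiguous sub-read could look, under strong simplifying assumptions: contigs are plain strings, matching is exact, and reverse complements are ignored. The function names and the sub-read length are hypothetical; a real pipeline would use an aligner rather than substring search.

```python
def count_occurrences(text, pattern):
    """Count (possibly overlapping) exact occurrences of pattern in text."""
    count = 0
    pos = text.find(pattern)
    while pos != -1:
        count += 1
        pos = text.find(pattern, pos + 1)
    return count


def unique_subread(read, contigs, length=36):
    """Return (start, subsequence) for the first window of the read that
    occurs exactly once across all contigs, or None if no window is unique.

    Assumption: exact forward-strand matches only.
    """
    for start in range(len(read) - length + 1):
        sub = read[start:start + length]
        total = sum(count_occurrences(contig, sub) for contig in contigs)
        if total == 1:
            return start, sub
    return None


# Toy example: the first three windows occur in both contigs,
# the fourth occurs only in the first one.
contigs = ["ACGTACGTTTGACCA", "GGACGTACGTCC"]
print(unique_subread("ACGTACGTTTGA", contigs, length=6))
# (3, 'TACGTT')
```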

      Cheers,
      Björn


      • Nomijill
        Member
        • Sep 2009
        • 25

        #63
        Originally posted by shaohua.fan View Post
        Hi Björn,
        I totally agree with you that CLC is good at controlling CPU and RAM usage. But I am just wondering why it doesn't support scaffolding, which is important for genome assembly.

        BTW, could you please tell me what the region with the highest information gain is? Right now I am just trimming the reads randomly, but I am thinking of using a sliding window to find a region with fewer homopolymers (for example, one where the homopolymer length is < 4).
        Just so everyone using the Genomics Workbench knows: there have been numerous updates to the functionality, much of which is available as a plug-in download. For the de novo assembler there are two significant changes. 1 - The assembler now supports scaffolding of paired-end reads. 2 - You can now set the k-mer value as one of your parameters.

        Happy Sequencing :-)
        Naomi


        • lmilne
          Junior Member
          • Apr 2009
          • 8

          #64
          I am currently assembling about 215 gigabases of sequence data with clc_novo_assemble. Should I expect clc_novo_assemble to print its progress while it is running? Its last output of progress (80%) was two days ago.


          • sklages
            Senior Member
            • May 2008
            • 628

            #65
            Originally posted by lmilne View Post
            I am currently assembling about 215 gigabases of sequence data with clc_novo_assemble. Should I expect clc_novo_assemble to print its progress while it is running? Its last output of progress (80%) was two days ago.
            What does CLC support tell you? They are usually very helpful and responsive.


            • Nomijill
              Member
              • Sep 2009
              • 25

              #66
              CLC bio Support

              Whenever you have questions about the behavior of the CLC bio assemblers, please contact [email protected]. You may also be interested in trying the new version of the assembler, if you have not already. The new algorithm uses the paired-end information for scaffolding.

