
  • Nomijill
    replied
    CLC bio Support

    Please, whenever you have questions about the behavior of the CLC bio assemblers, contact [email protected]. You may also be interested in trying the new version of the assembler, if you have not already. The new algorithm scaffolds the paired end information.



  • sklages
    replied
    Originally posted by lmilne:
    I am currently assembling about 215 gigabases of sequence data with clc_novo_assemble. Should I expect clc_novo_assemble to print its progress while it is running? Its last output of progress (80%) was two days ago.
    What does CLC support tell you? They are usually very helpful and responsive.



  • lmilne
    replied
    I am currently assembling about 215 gigabases of sequence data with clc_novo_assemble. Should I expect clc_novo_assemble to print its progress while it is running? Its last output of progress (80%) was two days ago.



  • Nomijill
    replied
    Originally posted by shaohua.fan:
    Hi, Björn,
    I totally agree with you that CLC is good at CPU and RAM control. But I am just wondering why it doesn't support scaffolding, which is important for genome assembly.

    BTW, could you please tell me what the region with the highest information gain is? Right now I am just trimming the reads randomly, but I am thinking of using a sliding window to find the region with fewer homopolymers (for example, where the homopolymer length is < 4).
    Just so everyone using the Genomics Workbench knows: there have been numerous updates to the functionality, much of which is available as a plug-in download. For the de novo assembler there are two significant changes. 1 - The assembler now supports scaffolding of paired-end reads. 2 - You now have the ability to change the k-mer value as one of your parameters.

    Happy Sequencing :-)
    Naomi



  • usad
    replied
    Hi

    Just the most unambiguous region. So if your whole 454 read aligns to one region and one region only, search for a short region in the read that would also allow placing it at this position only, and not on another contig, and use this as a pseudo-Illumina read.
    But it probably doesn't help too much; it will just give you a few more links. (Maybe simulate first what you can expect, based on your N50/average length or length distribution, linker length, quality and number.)

    Cheers,
    Björn
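A rough sketch of the "unambiguous region" idea, not the actual procedure used by the poster: it scans a read for fixed-length windows that occur exactly once across all contigs, counting both strands. Exact string matching is a simplifying assumption (real 454 reads contain errors, so an aligner would be needed in practice), and the names `revcomp`, `count_hits`, `unique_windows` and the `contigs` dictionary are hypothetical.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_hits(kmer, contigs):
    """Count exact occurrences of kmer (either strand) over all contigs.

    Overlapping occurrences are counted. A set is used for the two
    orientations so that palindromic k-mers are not counted twice.
    """
    total = 0
    for seq in contigs.values():
        for query in {kmer, revcomp(kmer)}:
            pos = seq.find(query)
            while pos != -1:
                total += 1
                pos = seq.find(query, pos + 1)
    return total


def unique_windows(read, contigs, window=36):
    """Return (start, subsequence) for every window of the read that
    matches exactly one position in the contig set."""
    hits = []
    for start in range(len(read) - window + 1):
        sub = read[start:start + window]
        if count_hits(sub, contigs) == 1:
            hits.append((start, sub))
    return hits


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    contigs = {"c1": "ACGTTGCAAGGCT", "c2": "TTTTACGTTTTT"}
    print(unique_windows("ACGTTGCA", contigs, window=6))
```

Any window returned here could serve as the pseudo-Illumina read; with real data the window length would be 36 or 72 bp rather than the toy value above.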



  • shaohua.fan
    replied
    Originally posted by usad:
    Did you do random trimming or did you trim them down to the region with the highest information gain (which is what we do)?

    I think it had large genomes in mind. It is really good in RAM consumption and quite ok in thread usage and thus speed. Maybe CLC4 brings some scaffold capabilities?

    Cheers,
    björn
    Hi, Björn,
    I totally agree with you that CLC is good at CPU and RAM control. But I am just wondering why it doesn't support scaffolding, which is important for genome assembly.

    BTW, could you please tell me what the region with the highest information gain is? Right now I am just trimming the reads randomly, but I am thinking of using a sliding window to find the region with fewer homopolymers (for example, where the homopolymer length is < 4).
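The sliding-window idea could look something like the sketch below. It is only an illustration under assumptions: the window length of 36, the tie-breaking rule (earliest window wins) and the function names `longest_homopolymer` and `best_window` are made up here, and whether a low-homopolymer window actually corresponds to the "highest information gain" is not established in this thread.

```python
from itertools import groupby


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base in seq."""
    return max((len(list(run)) for _, run in groupby(seq)), default=0)


def best_window(read, window=36):
    """Return (start, subsequence) of the window whose longest
    homopolymer run is shortest. Ties go to the earliest window.
    Returns None if the read is shorter than the window."""
    if len(read) < window:
        return None
    starts = range(len(read) - window + 1)
    # min() returns the first start position with the minimal run length.
    start = min(starts, key=lambda s: longest_homopolymer(read[s:s + window]))
    return start, read[start:start + window]


if __name__ == "__main__":
    # Toy read, made up for illustration only.
    start, sub = best_window("TTTTTTACGTACGTGGGGG", window=8)
    print(start, sub, longest_homopolymer(sub))
```

To apply the "< 4" rule from the post, one could additionally discard the read when `longest_homopolymer` of the chosen window is 4 or more.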



  • usad
    replied
    Did you do random trimming or did you trim them down to the region with the highest information gain (which is what we do)?

    I think it had large genomes in mind. It is really good in RAM consumption and quite ok in thread usage and thus speed. Maybe CLC4 brings some scaffold capabilities?

    Cheers,
    björn



  • shaohua.fan
    replied
    Originally posted by usad:
    I didn't know you have 454 data. So what kind of data do you have?

    If it is 99% Illumina and a bit of 454 for scaffolding:
    I reckon SSPACE can also be beaten into submission by giving it fake reads. It works with SOAP at least; you could plainmail Tbolger if you wanted to give that a shot. Or better yet, switch to an assembler/scaffolder that takes all data into account. (I guess that was why you asked the question in the first place :-))


    Cheers,
    björn
    I have tried trimming the long 454 reads (20K PE) to 36 and 72 bp and feeding them to SSPACE as Illumina reads, but the scaffolding quality didn't improve much.

    The reason I asked the CLC people is that we bought CLC because it appeared to be an all-in-one package (de novo genome assembly with hybrid 454 and Illumina data). But the scaffolding function, which is essential for a complicated genome assembly, is not included. I guess CLC expects all their customers to buy CLC and then de novo assemble only viruses or simple bacterial genomes?



  • usad
    replied
    I didn't know you have 454 data. So what kind of data do you have?

    If it is 99% Illumina and a bit of 454 for scaffolding:
    I reckon SSPACE can also be beaten into submission by giving it fake reads. It works with SOAP at least; you could plainmail Tbolger if you wanted to give that a shot. Or better yet, switch to an assembler/scaffolder that takes all data into account. (I guess that was why you asked the question in the first place :-))


    Cheers,
    björn



  • shaohua.fan
    replied
    Originally posted by usad:
    No idea,
    I guess the easiest way to help yourself is using SSPACE, after you got your contigs with CLC.

    Cheers,
    björn
    But SSPACE does not support scaffolding using 454 reads.



  • usad
    replied
    No idea,
    I guess the easiest way to help yourself is using SSPACE, after you got your contigs with CLC.

    Cheers,
    björn



  • shaohua.fan
    replied
    Hi, CLC people,

    I have a question about CLC Genomics Workbench: when will CLC add a scaffolding option to genome assembly? As of the latest version (version 4.7.2), CLC Genomics Workbench still does not support this, yet it is important for genome assembly.

    Thanks



  • Abishai3911
    replied
    Hi,

    I am basically a molecular biologist/biochemist and not a bioinformatician. However, I have been trying to use CLC Genomics Workbench to analyze my 454 data from PCR amplicons. I was able to import the .fna and .qual files into CLC. Now, when I use "Map reads to reference" under "Highthroughput sequencing" for my sequencing reads (121,000 sequences of 310 bases) with a 32 bp reference sequence, the matches it shows are incorrect. For example, I am getting only 97 matches instead of the at least 10,000 matches that are expected. Also, sometimes when the reference sequence is shorter, for example 15 bp, it reports that the match count is zero.

    Can somebody help me with this? Am I doing the mapping correctly?

    Thanks in advance.

    JAG
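One way to cross-check the expected match count outside the Workbench is to count how many reads contain the short reference sequence as an exact substring on either strand. This is a minimal sketch under assumptions, not a description of how CLC maps reads: it requires exact matches (so reads with sequencing errors in that stretch are missed), and the function names and the file name `reads.fna` are hypothetical.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def count_reads_containing(reads, target):
    """Count reads that contain target, or its reverse complement,
    as an exact substring (case-insensitive)."""
    target = target.upper()
    rc = revcomp(target)
    count = 0
    for _, seq in reads:
        seq = seq.upper()
        if target in seq or rc in seq:
            count += 1
    return count


if __name__ == "__main__":
    # Toy records, made up for illustration. With a real file one could use:
    #     with open("reads.fna") as handle:
    #         n = count_reads_containing(parse_fasta(handle), reference)
    demo = [">r1", "AAACGTGG", ">r2", "CCACGTTT", ">r3", "TTTT"]
    print(count_reads_containing(parse_fasta(demo), "ACGTGG"))
```

If this count is close to the expected number while the mapping reports far fewer, that would point to the mapping parameters rather than the data.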



  • NextGenSeq
    replied
    We don't have the Assembly Cell, but on a computer with 16 GB of RAM and 24 GB of data it would take about 6 hours. I've assembled 250 million reads from a HiSeq in ~16 hours. This is for reference assembly. However, de novo assembly takes about the same time.



  • sklages
    replied
    Originally posted by Irsan_Kooi:
    Does anyone have an idea how long it takes to perform a single-end assembly with CLC Assembly Cell 3.2.2 on 24 Gbases of data using a quad-core machine with 16 GB of RAM?

    P.S. I know what they claim on the company website, I just like to hear about experiences of an unbiased user...
    There is probably no correct answer. It may depend on organism, type of library, type of sequence data, quality of sequence data, size of target (genome,transcriptome), type of processors, speed of IO etc. And, .. 16GB of RAM is not too much ... :-)

    Let us know when your assembly has finished and how the quality is ..

    Sven
