De Novo Assembly of a transcriptome


  • jordi
    replied
    Oh, sorry. I found repetitive elements that are reverse transcriptases, located in the 3' UTRs of different genes. How can I differentiate the origin of my BLAST results?


  • jordi
    replied
    Because if you don't have high coverage and the same repetitive elements can appear in different genes, how do I know which protein has been translated? So I would mask these elements.
    Low coverage has been my problem with the standard GS De Novo Assembler: contig lengths of approx. 200 bp and coverage from 4X to 6X.
    Thanks!
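
For anyone following along, here is a minimal sketch of the arithmetic behind a figure like "4X to 6X": average coverage is the total number of read bases placed on a contig divided by the contig length. The function name and the example numbers are made up for illustration and are not taken from any assembler's output.

```python
def mean_coverage(read_lengths, contig_length):
    """Average depth of a contig: total read bases / contig length."""
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    return sum(read_lengths) / contig_length


# Hypothetical example: ten 100 bp reads placed on a 200 bp contig.
print(mean_coverage([100] * 10, 200))  # 5.0
```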


  • Melissa
    replied
    Originally posted by jordi
    Hi all!
    I'm annotating the transcriptome of a non-reference organism, similar to what you are doing. My assembly was made with GS De Novo Assembler, but I got short contigs...
    I'm trying the assembly with Mosaik, but first I have another problem: what about transposable elements? Have you tried WindowMasker? Or RepeatMasker? For an organism without a database for these repetitive elements, which program do you think is better?
    Thanks!

    Why would you worry about transposable/repetitive elements in the transcriptome? The common repeats found in a transcriptome are SSRs and low-complexity regions. I'm not referring to repeats that are several kb long (as in the genome). But if these repeats are transcribed, then yes, you will find them in the transcriptome.
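
A rough sketch of how SSRs could be flagged in assembled contigs with a regular expression. This is only an illustration, not what RepeatMasker or WindowMasker do internally; the function name `find_ssrs` and the thresholds (unit of 2 to 6 bases, at least 4 tandem copies) are assumptions chosen for the example.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, end, unit) for each tandem repeat found in seq.

    A hit is a unit of min_unit..max_unit bases repeated at least
    min_repeats times in a row. Coordinates are 0-based, end-exclusive.
    """
    # Lazy unit match so the shortest repeating unit is reported first.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Hypothetical contig containing an (AC)5 repeat.
print(find_ssrs("TTG" + "AC" * 5 + "GGT"))  # [(3, 13, 'AC')]
```

Regions reported this way could then be soft-masked before BLAST, though whether masking is worthwhile depends on the question being asked.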


  • jordi
    replied
    Hi all!
    I'm annotating the transcriptome of a non-reference organism, similar to what you are doing. My assembly was made with GS De Novo Assembler, but I got short contigs...
    I'm trying the assembly with Mosaik, but first I have another problem: what about transposable elements? Have you tried WindowMasker? Or RepeatMasker? For an organism without a database for these repetitive elements, which program do you think is better?
    Thanks!


  • Melissa
    replied
    A year ago, de novo transcriptome sequencing based solely on the Illumina GAII was a bad idea. With 72 bp PE reads and higher coverage, nothing is impossible now.

    As Rao suggested, EST data will be helpful for the assembly. But the fact is that most organisms of interest don't have comprehensive EST information, and often there is no reference genome/transcriptome available (not even from a related species). You don't know the exact size of the transcriptome, and you have to deal with repeats, paralogous genes and isoforms. It's tricky even to tell whether your assembly went wrong. Like I said, it depends on the purpose of the sequencing. Things are a lot easier if the goal is to discover SNPs. If the results are not satisfying, try alternatives such as sequencing with longer reads.


  • jnfass
    replied
    Though I haven't finished the project (the reads aren't all in yet), I'm doing something similar right now: no reference transcriptome, but looking for SNPs in cDNA reads from two subspecies. The first was sequenced with single-end reads, which resulted in pretty short contigs, and only roughly 1/10 of the total transcriptome was assembled. I'm recommending paired ends for the second sample, so I may have a quantitative answer for you in a couple of weeks.

    The transcriptome may have more unique, assemblable sequence than the genome, but homologous domains will be a problem, and paired ends would definitely help there. That's why I'd guess that a small-insert library should help quite a bit.

    I'd recommend Velvet; it still seems to be the best option out there for Illumina reads. Not sure about simulation ...
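
On the simulation question, one option in the absence of a public dataset is to generate error-free paired-end reads from a known cDNA sequence and feed those to the assembler. A minimal sketch follows, with several assumptions: fragments are sampled uniformly, no sequencing errors or expression-level differences are modelled, and the names `simulate_pairs`, `read_len`, `insert_size` and the 200 bp insert size are made up for illustration (the 72 bp read length is the one mentioned in this thread).

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_pairs(transcript, n_pairs, read_len=72, insert_size=200, seed=0):
    """Sample n_pairs error-free paired-end reads from one transcript.

    Each fragment is insert_size bases long; read 1 is its first
    read_len bases, read 2 is the reverse complement of its last
    read_len bases (the usual inward-facing orientation).
    """
    if read_len > insert_size or insert_size > len(transcript):
        raise ValueError("need read_len <= insert_size <= len(transcript)")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    pairs = []
    for _ in range(n_pairs):
        start = rng.randint(0, len(transcript) - insert_size)
        fragment = transcript[start:start + insert_size]
        pairs.append((fragment[:read_len], reverse_complement(fragment[-read_len:])))
    return pairs


# Hypothetical 500 bp transcript made of random bases.
demo_rng = random.Random(1)
demo_transcript = "".join(demo_rng.choice("ACGT") for _ in range(500))
demo_pairs = simulate_pairs(demo_transcript, n_pairs=10)
print(len(demo_pairs), len(demo_pairs[0][0]), len(demo_pairs[0][1]))
```

Real data will behave worse than this (errors, uneven expression, isoforms), so a simulation like this mainly shows whether read length and insert size are enough to resolve a given transcript in the best case.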


  • Rao
    replied
    Checking for ESTs may help you with the assembly.
    De novo assembly of a transcriptome... what about misassemblies?


  • Neil
    started a topic De Novo Assembly of a transcriptome

    Hi all,
    We are planning to perform an mRNA-seq run on the Illumina GAII platform. We are worried about assembling the transcriptome when we get our data back. Most of the RNA-seq papers I have read assemble to a reference genome/transcriptome, and we don't have either of these! Has anyone out there assembled cDNA short reads de novo? If so, are paired reads as important as they are for genome assembly?
    Is there an example database of paired-end mRNA-seq short reads that I can download to simulate an assembly?
    Also, what software would you recommend for this?
    Hope someone can help.
    Best regards,
    Neil