454 statistics
  • 454 statistics

    Hi all!
    Three years ago we did a 454 run on transcriptome data and, using Newbler release 1.1.03.24, computed some statistics such as the mean number of assembled reads per contig and the number of contigs with only 2 reads. At the time we assumed that a read could be assigned to only a single contig.
    Today we used the 2.6 release of the software, and the number of reads we get from 454Allcontigs.fna (the "numreads=" field) is larger than the total number of assembled reads. That's because a read can be assigned to multiple contigs, isn't it (as the contigs are "exons")? If so, what kind of statistics do you advise for comparing the two sets of data?
    From newblerMetrics I got that 83.48% of reads were assembled, but I want to derive that value from the assembled contigs file, where I've seen entries like:
    >contig00030 length=1 numreads=48 gene=isogroup00001 status=ig_thresh
    t
    >contig00031 length=6 numreads=4495 gene=isogroup00001 status=ig_thresh
    CACTTC
    >contig00032 length=3 numreads=61 gene=isogroup00001 status=ig_thresh
    GgA
    >contig00033 length=3 numreads=345 gene=isogroup00001 status=ig_thresh
    gtA
    >contig00034 length=2 numreads=2030 gene=isogroup00001 status=ig_thresh
    TA
    >contig00035 length=1 numreads=1914 gene=isogroup00001 status=ig_thresh
    A


    I hope I was clear enough!
    Thanks in advance.
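
For reference, the per-contig statistics described in this post can be read straight from the FASTA headers. The following is a minimal sketch in Python, assuming every header follows the `>name length=N numreads=N ...` layout shown in the excerpt; the function names are hypothetical, and the input is the six records quoted above, not a full assembly. Since a read may contribute to more than one contig, the sum of `numreads` should not be treated as a count of distinct reads.

```python
import re

# Assumed header layout, taken from the excerpt in the post:
# >contig00030 length=1 numreads=48 gene=isogroup00001 status=ig_thresh
HEADER_RE = re.compile(r"^>(\S+)\s+length=(\d+)\s+numreads=(\d+)")


def contig_read_counts(fasta_lines):
    """Yield (contig_name, length, numreads) for each FASTA header line."""
    for line in fasta_lines:
        match = HEADER_RE.match(line.strip())
        if match:
            yield match.group(1), int(match.group(2)), int(match.group(3))


def summarize(fasta_lines):
    """Return simple per-contig statistics based on the numreads field."""
    counts = [numreads for _, _, numreads in contig_read_counts(fasta_lines)]
    return {
        "contigs": len(counts),
        "numreads_sum": sum(counts),
        "mean_numreads": sum(counts) / len(counts) if counts else 0.0,
        "contigs_with_2_reads": sum(1 for n in counts if n == 2),
    }


# The six records quoted in the post.
EXCERPT = """\
>contig00030 length=1 numreads=48 gene=isogroup00001 status=ig_thresh
t
>contig00031 length=6 numreads=4495 gene=isogroup00001 status=ig_thresh
CACTTC
>contig00032 length=3 numreads=61 gene=isogroup00001 status=ig_thresh
GgA
>contig00033 length=3 numreads=345 gene=isogroup00001 status=ig_thresh
gtA
>contig00034 length=2 numreads=2030 gene=isogroup00001 status=ig_thresh
TA
>contig00035 length=1 numreads=1914 gene=isogroup00001 status=ig_thresh
A
"""

stats = summarize(EXCERPT.splitlines())
print(stats)
```

For a real run, the same function could be fed an open file handle instead of `EXCERPT.splitlines()`.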

  • #2
    Parsing the 454ReadStatus.txt file may be the best solution.
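
A rough sketch of what that parsing could look like in Python. It assumes the file is tab-delimited with one header row, the read accession in the first column and the read status in the second; that layout is an assumption here and should be checked against the actual file. The sample rows, read names and function names are made up for illustration.

```python
import csv
import io
from collections import Counter


def tally_read_status(handle):
    """Count reads per status in a tab-delimited read status table.

    Assumes one header row, accession in column 1, status in column 2.
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the assumed header row
    tally = Counter()
    for row in reader:
        if len(row) >= 2:
            tally[row[1]] += 1
    return tally


def percent_of(tally, statuses):
    """Percentage of all reads whose status is in `statuses`."""
    total = sum(tally.values())
    if total == 0:
        return 0.0
    return 100.0 * sum(tally[s] for s in statuses) / total


# Hypothetical sample rows; real files will have many more reads.
SAMPLE_ROWS = [
    ["Accno", "Read Status", "5' Contig", "5' Position", "5' Strand",
     "3' Contig", "3' Position", "3' Strand"],
    ["read0001", "Assembled", "contig00001", "1", "+", "contig00001", "250", "+"],
    ["read0002", "Assembled", "contig00001", "40", "+", "contig00002", "12", "+"],
    ["read0003", "PartiallyAssembled", "contig00002", "5", "-", "contig00002", "90", "-"],
    ["read0004", "Singleton"],
    ["read0005", "Repeat"],
]
sample_text = "\n".join("\t".join(row) for row in SAMPLE_ROWS)

tally = tally_read_status(io.StringIO(sample_text))
print(dict(tally))
print(percent_of(tally, ["Assembled"]))
```

With a real file, `open("454ReadStatus.txt", newline="")` would replace the `io.StringIO` object. Because each read appears once in this table, the percentages are based on distinct reads.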



    • #3
      Thank you westerman. You are right, that seems to be the right place to look, but which kinds of reads count as assembled? I mean: in the 454NewblerMetrics file you have assembled reads, partial reads, singletons, repeat reads, outliers and tooshort (these appear in the latest software release, I think). I've read the flxlex blog (http://contig.wordpress.com/2010/03/...rics-txt-file/) and it seems that Assembled plus Partial plus Repeat should be the number of aligned reads...
      However, what does "numreads=" mean in the 454Allcontigs.fna file?



      • #4
        Originally posted by jordi
        However, what does "numreads=" mean in the 454Allcontigs.fna file?
        I am going to guess here, because I am deep into looking at other problems at the moment (although it should be easy to find out with a bit of digging), that 'numreads=' is the count of all reads that contribute to the contig, whether or not a read maps uniquely to that contig.



        • #5
          Originally posted by westerman
          I am going to guess here, because I am deep into looking at other problems at the moment (although it should be easy to find out with a bit of digging), that 'numreads=' is the count of all reads that contribute to the contig, whether or not a read maps uniquely to that contig.
          As far as I know, this is correct.
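
If that reading of 'numreads=' holds, a toy example shows why the per-contig counts add up to more than the number of assembled reads. This is only an illustration in Python: the read names and the read-to-contig mapping are invented, and the function name is hypothetical.

```python
from collections import Counter


def per_contig_counts(read_to_contigs):
    """Count, for each contig, the reads that contribute to it.

    A read that contributes to several contigs is counted once per contig.
    """
    counts = Counter()
    for contigs in read_to_contigs.values():
        for contig in set(contigs):
            counts[contig] += 1
    return counts


# Invented mapping: readB spans two contigs (e.g. two "exons").
reads = {
    "readA": ["contig00030"],
    "readB": ["contig00030", "contig00031"],
    "readC": ["contig00031"],
}

counts = per_contig_counts(reads)
summed_numreads = sum(counts.values())  # readB is counted twice here
distinct_reads = len(reads)

print(dict(counts))
print(summed_numreads, distinct_reads)
```

Under this assumption, the sum over contigs exceeds the distinct read count by one for every extra contig a read touches, so comparisons between the two software releases are probably safer when based on distinct reads.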

