Hi all!
Three years ago we did a 454 run on transcriptome data and, using Newbler release 1.1.03.24, computed some statistics such as the mean number of assembled reads per contig, the number of contigs with only 2 reads, and so on. At the time we assumed that a read could be assigned to only a single contig.
Today we've used the 2.6 release, and the number of reads we get from 454Allcontigs.fna (the "numreads=" field) is larger than the total number of assembled reads. Is that because a read can now be assigned to multiple contigs (as the contigs are "exons")? If so, what kind of statistics do you advise for comparing the two sets of data?
From newblerMetrics I see that 83.48% of the reads were assembled, but I would like to derive such a value from the assembled contigs file, where I've seen entries like:
>contig00030 length=1 numreads=48 gene=isogroup00001 status=ig_thresh
t
>contig00031 length=6 numreads=4495 gene=isogroup00001 status=ig_thresh
CACTTC
>contig00032 length=3 numreads=61 gene=isogroup00001 status=ig_thresh
GgA
>contig00033 length=3 numreads=345 gene=isogroup00001 status=ig_thresh
gtA
>contig00034 length=2 numreads=2030 gene=isogroup00001 status=ig_thresh
TA
>contig00035 length=1 numreads=1914 gene=isogroup00001 status=ig_thresh
A
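A minimal Python sketch of how the "numreads=" values in headers like these could be tallied, to make the comparison concrete. The function names (`parse_numreads`, `summarize`) and the regular expression are illustrative assumptions, not part of Newbler; the header layout is taken only from the lines shown above. Note that if a read really can contribute to several contigs, the sum below counts such reads more than once, so it is not directly comparable to the assembled-read total in newblerMetrics.

```python
import re

# Assumed header layout, based only on the example lines above:
# >contigNNNNN length=L numreads=N gene=... status=...
HEADER_RE = re.compile(r"^>(\S+)\s+length=(\d+)\s+numreads=(\d+)")


def parse_numreads(lines):
    """Return {contig_name: numreads} for every matching FASTA header line."""
    counts = {}
    for line in lines:
        match = HEADER_RE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(3))
    return counts


def summarize(counts):
    """Per-contig summary of numreads (hypothetical helper for illustration)."""
    values = list(counts.values())
    return {
        "contigs": len(values),
        # May exceed the number of assembled reads if reads are shared.
        "sum_numreads": sum(values),
        "mean_numreads": sum(values) / len(values) if values else 0.0,
        "contigs_with_2_reads": sum(1 for v in values if v == 2),
    }


example = """\
>contig00030 length=1 numreads=48 gene=isogroup00001 status=ig_thresh
t
>contig00031 length=6 numreads=4495 gene=isogroup00001 status=ig_thresh
CACTTC
>contig00032 length=3 numreads=61 gene=isogroup00001 status=ig_thresh
GgA
>contig00033 length=3 numreads=345 gene=isogroup00001 status=ig_thresh
gtA
>contig00034 length=2 numreads=2030 gene=isogroup00001 status=ig_thresh
TA
>contig00035 length=1 numreads=1914 gene=isogroup00001 status=ig_thresh
A
""".splitlines()

stats = summarize(parse_numreads(example))
print(stats)
```

To run it on a whole file, the same functions could be fed `open("454Allcontigs.fna")` instead of the `example` list.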
I hope I was clear enough!
Thanks in advance.