  • Assessing a 454 run?

    Hi all,

    We recently got data from one eighth of a 454 run. The read length shows
    the typical distribution that I have seen at various meetings (see
    attached image). However, how should I go about assessing the 'overall
    quality' of the run (if such a clear cut concept exists)...

    So far I have plotted the distribution of quality per base and the
    distribution of mean quality per read. Of course the qualities will
    never be 'perfect', but without any experience or any other reference,
    I don't know what kind of distributions I should be looking for. For
    example, we see about 20% of all bases with a quality score below 20.
    Is that a) as good as we are likely to get, b) not bad, or c) woah! ask
    for 20% of your money back ;-)
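
For anyone wanting to reproduce these summaries, here is a minimal sketch of the per-base and per-read statistics described above. It assumes the qualities are in a FASTA-style .qual file (a ">" header line followed by space-separated integer Phred scores); the function names `parse_qual` and `summarize` are made up for illustration and are not part of any 454 software. It also reports the mean error probability implied by the scores, using the standard Phred relation P = 10^(-Q/10), so Q20 corresponds to a 1% chance that the base call is wrong.

```python
def parse_qual(text):
    """Yield (read_id, [scores]) from FASTA-style .qual text (assumed format)."""
    read_id, scores = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if read_id is not None:
                yield read_id, scores
            read_id, scores = line[1:].split()[0], []
        else:
            scores.extend(int(tok) for tok in line.split())
    if read_id is not None:
        yield read_id, scores


def summarize(records, threshold=20):
    """Return (fraction of bases below threshold,
               mean quality per read,
               mean Phred-implied error probability per base)."""
    all_scores = []
    read_means = {}
    for read_id, scores in records:
        if not scores:
            continue  # skip empty reads rather than divide by zero
        all_scores.extend(scores)
        read_means[read_id] = sum(scores) / len(scores)
    if not all_scores:
        raise ValueError("no quality scores found")
    frac_low = sum(1 for q in all_scores if q < threshold) / len(all_scores)
    # Phred definition: error probability = 10 ** (-Q / 10)
    implied_error = sum(10 ** (-q / 10) for q in all_scores) / len(all_scores)
    return frac_low, read_means, implied_error


if __name__ == "__main__":
    # Tiny made-up example, not real 454 data.
    example = ">read1\n40 40 30 10\n>read2\n20 19 35 35 40 40\n"
    frac_low, read_means, implied_error = summarize(parse_qual(example))
    print("fraction of bases below Q20:", frac_low)
    print("mean quality per read:", read_means)
    print("Phred-implied error rate:", implied_error)
```

The implied error rate is only as good as the quality calibration, so it is a rough yardstick rather than a measured error rate.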

    It would be great to get any feedback from experienced forum members.


    Note that we do not have a reference genome to align the reads to, but
    we do have a reasonable coverage of the chloroplast DNA, and a
    reference for that (estimated 2-4 % chloroplast contamination by read,
    giving approximately 10x coverage). What is a good tool to identify
    SNPs between our read data and that reference? (If I can first
    identify the SNPs, I can then estimate the per base error rate using
    the reference).
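
The error-rate idea above can be sketched as follows. This is only an illustration under stated assumptions: each alignment is given as a 0-based reference start plus two equal-length gapped strings ("-" marks a gap), the SNP positions have already been called by some other tool, and the error rate is defined as errors per counted alignment column. The function name `error_rate` and the input layout are hypothetical, not the output format of MAQ or any other aligner. Insertions and deletions are counted as errors because 454 reads are prone to indels in homopolymer runs.

```python
def error_rate(alignments, snp_positions=frozenset()):
    """Estimate the per-base error rate from gapped pairwise alignments.

    alignments: iterable of (ref_start, ref_aln, read_aln), where ref_start
    is the 0-based reference coordinate of the first reference base and the
    two strings are the same length, with '-' marking gaps (assumed layout).
    snp_positions: 0-based reference positions to ignore as true variants.
    """
    errors = 0
    compared = 0
    for ref_start, ref_aln, read_aln in alignments:
        if len(ref_aln) != len(read_aln):
            raise ValueError("aligned strings must have equal length")
        pos = ref_start  # reference coordinate of the next reference base
        for r, q in zip(ref_aln.upper(), read_aln.upper()):
            if r == "-":
                # Insertion in the read: an error that consumes no reference base.
                errors += 1
                compared += 1
                continue
            if pos not in snp_positions:
                compared += 1
                if q != r:  # mismatch, or deletion when q == '-'
                    errors += 1
            pos += 1
    if compared == 0:
        raise ValueError("no alignment columns to compare")
    return errors / compared


if __name__ == "__main__":
    # Made-up alignments: one mismatch at position 4, one insertion, one deletion.
    alns = [(0, "ACGTACGT", "ACGTTCGT"), (2, "GTA-CGT", "GTAACG-")]
    print("all differences counted:", error_rate(alns))
    print("position 4 treated as a SNP:", error_rate(alns, {4}))
```

With only about 10x coverage, a sequencing error and a real SNP can be hard to tell apart at any single site, so the estimate depends heavily on how strictly the SNPs were called.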

    (Actually, I found I can do this with MAQ, but I'll leave the question in, in case there are alternative suggestions.)


    Thanks very much for any information,
    Dan.

    Homepage: Dan Bolser
    MetaBase the database of biological databases.

  • #2
    You might try the tools from Gabor Marth's lab (http://bioinformatics.bc.edu/marthlab/Main_Page): use Mosaik to align the reads to the reference, and GigaBayes (an evolution of their polyBayes tool) to call SNPs from that alignment. As I recall, it gives you better control over whether you're looking for SNPs between homozygous or heterozygous individuals, many individuals, etc., and its algorithms have sound statistical underpinnings.

    ~Joe
