  • Assessing a 454 run?

    Hi all,

    We recently got data from 1/8th of a 454 run. The read lengths show
    the typical distribution that I have seen at various meetings (see
    attached image). However, how should I go about assessing the 'overall
    quality' of the run (if such a clear-cut concept exists)?

    So far I have plotted the distribution of quality per base and the
    distribution of mean quality per read. Of course the qualities will
    never be 'perfect', but without any experience or any other reference,
    I don't know what kind of distributions I should be looking for. For
    example, we see about 20% of all bases with a quality score below 20.
    Is that a) as good as we are likely to get, b) not bad, or c) woah!
    ask for 20% of your money back ;-)
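
For anyone wanting to reproduce the numbers described above, here is a minimal sketch of how the fraction of bases below Q20 and the mean quality per read might be computed. It assumes the scores are in a FASTA-style .qual file (a `>` header line followed by space-separated integer scores) and that they are Phred-scaled, so that an error probability can be taken as 10^(-Q/10). The function names `parse_qual` and `summarize`, and the two example reads, are made up for illustration and are not part of any 454 software.

```python
from io import StringIO


def parse_qual(handle):
    """Yield (read_id, [int scores]) from a FASTA-style .qual file."""
    read_id, scores = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if read_id is not None:
                yield read_id, scores
            # Keep only the first token of the header as the read id.
            read_id, scores = line[1:].split()[0], []
        else:
            # Scores for one read may wrap over several lines.
            scores.extend(int(tok) for tok in line.split())
    if read_id is not None:
        yield read_id, scores


def summarize(reads, threshold=20):
    """Summarize base qualities; assumes Phred-scaled scores."""
    total = low = 0
    expected_errors = 0.0
    mean_per_read = {}
    for read_id, scores in reads:
        if not scores:
            continue
        total += len(scores)
        low += sum(1 for q in scores if q < threshold)
        # Phred definition: P(error) = 10 ** (-Q / 10)
        expected_errors += sum(10 ** (-q / 10) for q in scores)
        mean_per_read[read_id] = sum(scores) / len(scores)
    if total == 0:
        raise ValueError("no quality scores found")
    return {
        "fraction_below_threshold": low / total,
        "expected_error_rate": expected_errors / total,
        "mean_per_read": mean_per_read,
    }


# Hypothetical two-read example, not real data.
example = StringIO(">read1 length=4\n40 40 30 10\n>read2 length=5\n20 20 15 35\n25\n")
stats = summarize(parse_qual(example))
print(stats["fraction_below_threshold"])
print(stats["expected_error_rate"])
print(stats["mean_per_read"])
```

The expected error rate is only what the quality scores themselves predict. Whether a given fraction below Q20 is good or bad for a 454 run still needs a comparison against other runs or against a reference.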

    It would be great to get any feedback from the experienced people on the forum.


    Note that we do not have a reference genome to align the reads to, but
    we do have reasonable coverage of the chloroplast DNA, and a
    reference for that (an estimated 2-4% of reads are chloroplast,
    giving approximately 10x coverage). What is a good tool to identify
    SNPs between our read data and that reference? (If I can first
    identify the SNPs, I can then estimate the per base error rate using
    the reference).

    (Actually, I found I can do this with MAQ, but I'll leave the question in, in case there are alternative suggestions.)


    Thanks very much for any information,
    Dan.

    Homepage: Dan Bolser
    MetaBase the database of biological databases.

  • #2
    You might try the tools from Gabor Marth's lab (http://bioinformatics.bc.edu/marthlab/Main_Page): use Mosaik to align the reads to the reference, and GigaBayes (an evolution of their polyBayes tool) to call SNPs from that alignment. In my recollection, it gives you better control over whether you're looking for SNPs between homozygous or heterozygous individuals, many individuals, etc., and its algorithms have sound statistical underpinnings.

    ~Joe
