  • Find SNP in 454HCDiffs.txt

    Hey there,

    I mapped my reads against a reference consisting of the isotigs from the de novo assembly of the same reads. I'm now wondering whether the following approach is really sufficient to detect SNPs in 454HCDiffs.txt:
    - get the summary line of each diff: grep '>' 454HCDiffs.txt
    - check that the start and end positions are identical (a SNP must start and end at the same position in the reference)
    - check that neither the ref nucleotide nor the var nucleotide consists only of a gap
    - check that the var nucleotide has length 1
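The four checks above can be sketched in Python roughly as follows. This is only a sketch under assumptions not stated in the thread: that summary lines are tab-separated with the columns accession, start, end, ref nucleotide, var nucleotide first, and that a gap is written as `-`. The function names are hypothetical; please check the column layout against your own file and the manual.

```python
GAP = "-"  # assumed gap symbol in the ref/var nucleotide columns


def parse_summary(line):
    """Return (accno, start, end, ref_nuc, var_nuc), or None for non-data lines.

    Assumes a tab-separated '>' summary line whose first five columns are
    accession, start position, end position, ref nucleotide, var nucleotide.
    """
    if not line.startswith(">"):
        return None  # alignment or blank line, not a summary line
    fields = line[1:].rstrip("\r\n").split("\t")
    if len(fields) < 5:
        return None
    accno, start, end, ref_nuc, var_nuc = (f.strip() for f in fields[:5])
    try:
        return accno, int(start), int(end), ref_nuc, var_nuc
    except ValueError:
        return None  # column-header lines have non-numeric positions


def is_putative_snp(start, end, ref_nuc, var_nuc):
    """Apply the checks from the post to one parsed summary line."""
    return (
        start == end                   # same position in the reference
        and ref_nuc.strip(GAP) != ""   # ref is not only a gap
        and var_nuc.strip(GAP) != ""   # var is not only a gap
        and len(var_nuc) == 1          # var is a single nucleotide
    )


def putative_snps(lines):
    """Collect the parsed summary lines that pass all checks."""
    hits = []
    for line in lines:
        rec = parse_summary(line)
        if rec is not None and is_putative_snp(*rec[1:]):
            hits.append(rec)
    return hits
```

It could be used as `putative_snps(open("454HCDiffs.txt"))`, which takes the place of the grep step because non-summary lines are skipped.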

    Regards,
    Thomas
    Last edited by dschika; 01-18-2011, 03:46 AM.

  • #2
    Yes, that approach will give you a list of putative SNPs/SNVs.

    But you will want to do further filtering (e.g. on read depth, quality) to get a more trusted set of SNPs.



    • #3
      Thanks for your quick reply!

      I thought the 454HCDiffs.txt file would be sufficient on its own because its differences are already High Confidence, which means (please see the manual for full details):
      - there must be at least 3 non-duplicate reads with the difference
      - there must be forward and reverse reads with the difference, unless there are at least 5 reads with quality score over 20
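As a sketch, the two conditions as paraphrased above can be written as a single predicate. This encodes only the summary given in this post, not the full rule set from the manual; the function and parameter names are hypothetical, and the read counts are assumed to have been obtained elsewhere.

```python
def passes_high_confidence(nondup_var_reads, fwd_var_reads, rev_var_reads,
                           q20_var_reads):
    """Check the two High Confidence conditions as summarised in the post.

    nondup_var_reads: non-duplicate reads showing the difference
    fwd_var_reads / rev_var_reads: forward / reverse reads showing it
    q20_var_reads: reads showing it with quality score over 20
    """
    enough_reads = nondup_var_reads >= 3
    both_strands = fwd_var_reads > 0 and rev_var_reads > 0
    quality_exception = q20_var_reads >= 5  # waives the strand requirement
    return enough_reads and (both_strands or quality_exception)
```

For example, a difference seen in six forward-only reads, five of them with quality over 20, would pass, while the same difference with only four such reads would not.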

      Do you think those filtering criteria are still too lenient? Can you perhaps suggest some other values?

      Btw: I added another step in my first post.



      • #4
        It will depend on a number of factors. For example, if you have greater coverage, you might want to set the read depth cut-off higher. It will also depend on the quality of your reference genome, which might itself contain errors. You need to take a view depending on what you are trying to do.
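A read depth cut-off of this kind could be applied as in the sketch below. The record layout (an identifier paired with a total depth) and the function name are assumptions for illustration only, and the threshold is deliberately left as a parameter because no specific value is recommended here.

```python
def filter_by_depth(records, min_depth):
    """Keep (snp_id, total_depth) records whose depth is at least min_depth."""
    return [(snp_id, depth) for snp_id, depth in records if depth >= min_depth]


# Hypothetical records: (identifier, total read depth at the position)
candidates = [("isotig00001:120", 25), ("isotig00007:33", 4), ("isotig00012:9", 11)]
kept = filter_by_depth(candidates, min_depth=10)
```

Raising `min_depth` for a deeply covered data set simply shrinks the list to the better-supported positions.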
