  • adrian
    Member
    • Oct 2009
    • 90

    insert size

    Dear group:
    I need some help understanding the concept of insert size.

    I have targeted exome sequencing data generated with a paired-end approach (76 bp reads). There are lots of duplicates in the SAM file, so I should run rmdup with the insert size specified correctly.

    I was told by the technician that the insert size is between 150 and 300 bp.

    When I look at the 9th field of the SAM file, which is the inferred insert size, I see lots of values ranging from 0 to 100,000.

    Since the experiment was done with an insert size of 150-300 bp, but the insert sizes inferred by BWA span a wide range, what number should I use with rmdup? Heng Li recommends always using the correct insert size. Given the 150-300 bp range from the technician and the much wider range of inferred sizes in the SAM file, which insert size should I select to remove duplicates and call SNPs?

    Or should I split the reads into sets that fall within certain insert-size ranges and call SNPs in each bin?

    Also, what does an inferred insert size of 0 mean, and what does 345,039 mean?

    thanks
    Adrian
  • krobison
    Senior Member
    • Nov 2007
    • 734

    #2
    Many, if not all, of the very large insert sizes are probably due to reads falsely aligned to repeats (LINEs, SINEs, satellites, etc.); take a few of them and check whether both reads actually align to non-repeat DNA.

    Also, take a sample from your SAM file (ideally random) & look at the size distribution -- you may see a long tail of weird sizes, but you'll probably see most of the counts in a distribution around what the technician said. It's a worthwhile check on the library in any case, & it's easy to generate the data table with a little bit of Perl (or even UNIX shell commands).
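
The size-distribution check described above could be sketched roughly as follows. This is Python rather than Perl or shell, and it assumes a plain-text (uncompressed) SAM file with the inferred insert size in column 9. The function name `insert_size_histogram` and its parameters (`bin_width`, `sample_rate`, `seed`) are made up for illustration and do not come from any existing tool. Only positive values are counted so that each pair is tallied once; zero and negative values are skipped.

```python
import random
from collections import Counter


def insert_size_histogram(sam_lines, bin_width=50, sample_rate=1.0, seed=0):
    """Bin positive inferred insert sizes (SAM column 9) from an iterable of lines.

    Returns a dict mapping the lower edge of each bin to the number of pairs in it.
    All parameter names here are illustrative assumptions.
    """
    rng = random.Random(seed)
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete alignment record
        tlen = int(fields[8])  # column 9: inferred insert size
        if tlen <= 0:
            continue  # skip 0 and the negative mate so each pair counts once
        if sample_rate < 1.0 and rng.random() >= sample_rate:
            continue  # random subsample of the remaining pairs
        counts[(tlen // bin_width) * bin_width] += 1
    return dict(sorted(counts.items()))


def print_histogram(path, bin_width=50, sample_rate=1.0):
    """Print a tab-separated table: bin range, then pair count."""
    with open(path) as handle:
        hist = insert_size_histogram(handle, bin_width, sample_rate)
    for start, n in hist.items():
        print(f"{start}-{start + bin_width - 1}\t{n}")
```

Calling `print_histogram("sample.sam", sample_rate=0.01)` (the file name is a placeholder) would print one row per bin for roughly 1% of the pairs. If the library is as the technician described, most counts should land in the bins around 150-300, with a thin tail of much larger values.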
