  • maria.b
    Member
    • Sep 2009
    • 14

    mapping parameters for SV/CNV discovery

    Hi!

    I'm beginning to work on SV/CNV discovery. I have found many different methods for this. Most of them take mapped reads as input, but there is little documentation of the mapping parameters that should be used.

    For methods that analyse paired-end mapping abnormalities, it seems that we must report all possible hits and perform single-end alignment even if we have paired-end reads.
    For methods that analyse split-read alignments, it seems that ungapped alignment is required and all hits must be reported.
    Very few programs mention duplicate reads (must we remove them?) or masking the reference genome (should we mask the reference before or after mapping?).

    I'm not sure that there is one single mapping procedure that can be used for all SV/CNV methods, but hearing your point of view may inspire me! So what software do you use for mapping and SV/CNV discovery, and what parameters do you use for the mapping?

    I will begin my tests with BWA ungapped, reporting all possible hits (-n 600 -N 600), and I will see whether the results differ from those obtained with the default parameters.

    Best regards

    Maria
  • stefanoberri
    Member
    • Jan 2010
    • 35

    #2
    Hi.

    I think the answer to your question depends on the coverage you have.

    We have some experience with low coverage CNV detection in tumour samples. For us paired end is not useful (the only benefit is that we can map a few more reads, but it is not worth the extra cost/time) and we only use uniquely mapped reads.

    If you take the ratio of reads in a test and a control, that smooths out a lot of the biases (mainly mappability problems). GC correction is also very important for some samples. I suspect the parameters used for the alignment are not so crucial, as long as they are the same for test and control.
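
The test/control ratio described above can be sketched in a few lines of Python. This is only an illustration, not the poster's actual pipeline: the function name `log2_ratios`, the fixed-bin read counts, the pseudocount and the median centring are all assumptions, and GC correction is left out.

```python
import math
import statistics


def log2_ratios(test_counts, control_counts, pseudocount=0.5):
    """Return per-bin log2(test/control), centred on the median bin.

    test_counts and control_counts are uniquely mapped read counts for the
    same genomic bins (hypothetical inputs). The pseudocount avoids division
    by zero in empty bins. Median centring removes the overall difference in
    sequencing depth between the two samples.
    """
    if len(test_counts) != len(control_counts):
        raise ValueError("test and control must cover the same bins")
    raw = [
        math.log2((t + pseudocount) / (c + pseudocount))
        for t, c in zip(test_counts, control_counts)
    ]
    centre = statistics.median(raw)
    return [r - centre for r in raw]


if __name__ == "__main__":
    # The test sample has half the depth of the control; the third bin is
    # relatively enriched, so it is the only one that stands out after centring.
    print(log2_ratios([100, 100, 200, 100], [200, 200, 200, 200], pseudocount=0))
```

Because the same mappability bias affects a bin in both samples, it cancels in the ratio, which is why the alignment settings matter less as long as test and control are mapped identically.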


    • ryanmcg
      Junior Member
      • Dec 2012
      • 6

      #3
      I am also very much interested in these questions. I have high-coverage (~150x) paired-end sequence of the yeast genome. Using default parameters in BWA, I seem to be missing most SV data. So far I have tried RetroSeq to map insertions of specific elements. I can find some retrotransposons at their reference locations, but not all of them, and nothing novel. I also have certain gene constructs that I inserted in the lab, and I cannot find these insertions in the data; again, I find only the endogenous loci.

      Could someone explain a little bit more about the following BWA parameters, or suggest other things to change?

      bwa aln -e INT Maximum number of gap extensions, -1 for k-difference mode (disallowing long gaps) [-1]

      bwa aln -R INT Proceed with suboptimal alignments if there are no more than INT equally best hits. This option only affects paired-end mapping. Increasing this threshold helps to improve the pairing accuracy at the cost of speed, especially for short reads (~32bp).

      bwa sampe -o INT Maximum occurrences of a read for pairing. A read with more occurrences will be treated as a single-end read. Reducing this parameter helps faster pairing. [100000]

      bwa sampe -n INT Maximum number of alignments to output in the XA tag for reads paired properly. If a read has more than INT hits, the XA tag will not be written. [3]

      bwa sampe -N INT Maximum number of alignments to output in the XA tag for disconcordant read pairs (excluding singletons). If a read has more than INT hits, the XA tag will not be written. [10]
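
One way to see what changing -n and -N actually does is to count the alternative hits written to the XA tag of each SAM record. The sketch below assumes the XA layout given in the BWA manual, `XA:Z:(chr,pos,CIGAR,NM;)*`, with a strand sign in front of the position; the function name `parse_xa` and the example tag are made up for illustration.

```python
def parse_xa(xa_field):
    """Parse a BWA XA tag into a list of alternative hits.

    Expects the optional SAM field as text, e.g.
    'XA:Z:chrI,+1500,36M,0;chrIV,-98765,36M,1;'.
    Each hit is returned as a dict with the reference name, strand,
    1-based position, CIGAR string and edit distance (NM).
    """
    prefix = "XA:Z:"
    if not xa_field.startswith(prefix):
        raise ValueError("not an XA tag: %r" % xa_field)
    hits = []
    for entry in xa_field[len(prefix):].split(";"):
        if not entry:
            continue  # skip the empty piece after the trailing ';'
        # Split from the right so a comma in the reference name is tolerated.
        chrom, pos, cigar, nm = entry.rsplit(",", 3)
        hits.append({
            "chrom": chrom,
            "strand": pos[0],
            "pos": int(pos[1:]),
            "cigar": cigar,
            "nm": int(nm),
        })
    return hits


if __name__ == "__main__":
    for hit in parse_xa("XA:Z:chrI,+1500,36M,0;chrIV,-98765,36M,1;"):
        print(hit)
```

A read with more hits than the -n or -N limit gets no XA tag at all, so a missing tag can mean either a unique read or a highly repetitive one.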
