  • vinay052003
    Member
    • Jan 2010
    • 59

    SVDetect v/s BreakDancer

    Hi,
    Has anyone had any experience with SVDetect and BreakDancer? I need some thoughts on their comparison in terms of performance and detection accuracy. I am mainly interested in detecting large INDELs, inversions and translocations using paired-end whole genome data.

    Thanks.
  • HLA
    Junior Member
    • Aug 2010
    • 4

    #2
    I have just tried both of these on a whole-genome-sequenced sample with a balanced translocation identified through cytogenetics. We did single-lane HiSeq2000 paired-end sequencing of ~600bp fragments, read length 100bp, average coverage 8x. Aligned with BWA, and removed duplicates with Picard.

    We knew roughly where to look for the break points, and with IGV's coloured display of reads where pairs map to different chromosomes, we were able to quickly identify the breakpoint, with 7 reads providing evidence (mapping either side of the breakpoint), and further 3+1 reads mapping across the breakpoints, so we got to the actual base from WGS. But could we identify this translocation if we didn't know where to look?
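
    A minimal sketch of how pairs whose mates map to different chromosomes could be pulled out of SAM text genome-wide, assuming the standard SAM column layout (RNAME in column 3, RNEXT in column 7, where "=" means "same chromosome as the read"). The function name and the example records are made up for illustration. This is not what IGV, BreakDancer or SVDetect do internally, and the pairs would still need clustering before anything could be called a translocation.

```python
def interchromosomal_pairs(sam_lines):
    """Yield (qname, rname, pos, rnext, pnext) for reads whose mate maps to another chromosome."""
    for line in sam_lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        rnext, pnext = fields[6], int(fields[7])
        if not flag & 0x1:                # read is not part of a pair
            continue
        if flag & 0x4 or flag & 0x8:      # read or mate unmapped
            continue
        if flag & 0x400:                  # marked as a duplicate
            continue
        if rnext in ("=", "*") or rnext == rname:
            continue                      # mate on the same chromosome, or unknown
        yield qname, rname, pos, rnext, pnext


# Toy records (hypothetical): r1 is interchromosomal, r2 is a normal pair,
# r3 has an unmapped mate.
EXAMPLE_SAM = [
    "@SQ\tSN:chr1\tLN:100000",
    "r1\t97\tchr1\t100\t60\t100M\tchr2\t500\t0\t*\t*",
    "r2\t99\tchr1\t200\t60\t100M\t=\t700\t600\t*\t*",
    "r3\t73\tchr1\t300\t60\t100M\t*\t0\t0\t*\t*",
]

for hit in interchromosomal_pairs(EXAMPLE_SAM):
    print(hit)
```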

    I first tried BreakDancer and got strange results (posted a question on seqanswers but no replies), lots of interchromosomal calls with the actual translocation looking no more real than ~100 false positives, and the output saying only 3 reads providing support.

    Then tried SVDetect, and I'm much happier with the results and their format. Still got lots of FP calls but the actual translocation got called correctly, output is much simpler than BreakDancer and listed that 7 read pairs map across it.

    Suggestions to filter out FPs for inter-chromosomal translocations (after selecting only those with the highest score), based on my observations after manually checking some of them in IGV:
    - those where multiple types of rearrangements map in close proximity
    - those mapping to centromeres
    - those with too large number of reads providing evidence (in my case say >14, or twice the average coverage)
    - those where start-end distance is considerably different between chr1 and chr2, which suggests mismapping of reads. This tells me I should probably remove any reads mapping to multiple locations from the bam file first, before running SVDetect.
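
    The four filters above can be sketched roughly as follows. The record layout (a dict with `type`, `chr1`, `start1`, `end1`, `chr2`, `start2`, `end2`, `nb_pairs`), the 1 kb proximity window and the 2x span ratio are all assumptions made for illustration, not SVDetect's actual output format or recommended cut-offs. Only the "twice the average coverage" limit comes from my own observations.

```python
def overlaps(chrom, start, end, regions):
    """True if chrom:start-end overlaps any (chrom, start, end) region."""
    return any(c == chrom and start <= r_end and r_start <= end
               for c, r_start, r_end in regions)


def ends(call):
    """Both breakpoint intervals of a call as (chrom, start, end) tuples."""
    return [(call["chr1"], call["start1"], call["end1"]),
            (call["chr2"], call["start2"], call["end2"])]


def keep_call(call, all_calls, centromeres, mean_coverage,
              window=1000, max_span_ratio=2.0):
    # Too many supporting pairs (more than twice the average coverage).
    if call["nb_pairs"] > 2 * mean_coverage:
        return False
    # Either end falls in a centromere.
    if any(overlaps(c, s, e, centromeres) for c, s, e in ends(call)):
        return False
    # Start-end distance very different between the two ends.
    span1 = max(call["end1"] - call["start1"], 1)
    span2 = max(call["end2"] - call["start2"], 1)
    if max(span1, span2) / min(span1, span2) > max_span_ratio:
        return False
    # A different type of rearrangement lies within `window` bp of either end.
    for other in all_calls:
        if other is call or other["type"] == call["type"]:
            continue
        for c, s, e in ends(call):
            if overlaps(c, s - window, e + window, ends(other)):
                return False
    return True


def make(name, sv_type, c1, s1, e1, c2, s2, e2, nb_pairs):
    return {"id": name, "type": sv_type, "chr1": c1, "start1": s1, "end1": e1,
            "chr2": c2, "start2": s2, "end2": e2, "nb_pairs": nb_pairs}


# Toy calls (hypothetical values).
calls = [
    make("good", "TRANSLOC", "chr1", 1000, 1500, "chr2", 5000, 5450, 7),
    make("too_many", "TRANSLOC", "chr3", 1000, 1500, "chr4", 1000, 1500, 40),
    make("centromeric", "TRANSLOC", "chr1", 150000, 150500, "chr5", 1000, 1500, 7),
    make("lopsided", "TRANSLOC", "chr6", 1000, 1500, "chr7", 1000, 6000, 7),
    make("clustered", "TRANSLOC", "chr8", 1000, 1500, "chr9", 1000, 1500, 7),
    make("nearby_inv", "INVERSION", "chr8", 1800, 2300, "chr8", 9000, 9500, 7),
]
centromeres = [("chr1", 100000, 200000)]

kept = [c["id"] for c in calls if keep_call(c, calls, centromeres, mean_coverage=8)]
print(kept)
```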

    Perhaps not so many should have the perfect score?

    These are just some of my initial observations for a particular type of structural aberration, and I'd also be happy to hear from anyone else using these programs.

    • ralonso
      Member
      • Feb 2012
      • 10

      #3
      Hello,

      I have tried BreakDancer and SVDetect, and now I am trying to filter out false positives from my results. I am following the steps you suggested, HLA, but I have some questions that you may be able to help with. For the moment I am only interested in deletions. My questions:
      1. "(after selecting only those with the highest score)": what counts as a high score? My scores range from 0.8 to 1, and since many calls have a score of 1, I think I will keep just those. Is that OK?
      2. "those where multiple types of rearrangements map in close proximity": do you mean, for instance, a deletion and an insertion in close proximity?
      3. "those with too large number of reads providing evidence (in my case say >14, or twice the average coverage)": my coverage is 55, so do you mean I should filter out SVs with nb_pairs (in the sv.text file) > 110? My values range from 2 to 95.
      4. "those where start-end distance is considerably different between chr1 and chr2, which suggests mismapping of reads. This tells me I should probably remove any reads mapping to multiple locations from the bam file first, before running SVDetect.": sorry for being a bit picky, but what counts as considerably different?

      I was thinking of merging the results from BreakDancer and SVDetect. There is also this tool, http://svmerge.sourceforge.net/, which is useful for copy number as well.
      By the way, do you have experience with any other tools, such as GASV or Hydra?
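
      To make the merging idea concrete, here is a minimal sketch that keeps only the deletions from one caller that are also supported by the other, using reciprocal overlap. This is a generic approach and not how SVMerge works. The `(chrom, start, end)` tuples, the function names and the 50% threshold are assumptions for illustration.

```python
def reciprocal_overlap(a, b):
    """Smallest fraction of either interval covered by their intersection.

    a and b are (chrom, start, end) tuples with end > start.
    """
    if a[0] != b[0]:
        return 0.0
    inter = min(a[2], b[2]) - max(a[1], b[1])
    if inter <= 0:
        return 0.0
    return min(inter / (a[2] - a[1]), inter / (b[2] - b[1]))


def concordant_calls(calls_a, calls_b, min_overlap=0.5):
    """Calls from calls_a that reciprocally overlap at least one call in calls_b."""
    return [a for a in calls_a
            if any(reciprocal_overlap(a, b) >= min_overlap for b in calls_b)]


# Toy deletion calls (hypothetical coordinates) from two callers.
caller_one = [("chr1", 1000, 2000), ("chr1", 5000, 6000), ("chr2", 1000, 2000)]
caller_two = [("chr1", 1100, 2100), ("chr1", 5900, 8000), ("chr3", 1000, 2000)]

shared = concordant_calls(caller_one, caller_two)
print(shared)
```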

      Thanks in advance!

      • HLA
        Junior Member
        • Aug 2010
        • 4

        #4
        Hi ralonso,

        my suggested filters were for balanced inter-chromosomal translocations only, and I couldn't be very specific as I had a single sample only and just looked at the distributions of various attributes given in the output.

        You seem to be looking for deletions, and have much higher coverage than I do. In that case you need to look for read pairs with insert sizes larger than expected, probably starting with the calls supported by the highest number of reads, and possibly the largest deletions. If you are working on a human sample and looking for something disease-causing then obviously look for the ones not in DGV. You should also have quite a few reads spanning the deletion itself (rather than read pairs either side of it), so you may be able to use one of the coverage-based methods, if these two programs don't already take it into account?
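
        As a rough illustration of "insert sizes larger than expected", the sketch below flags pairs whose absolute insert size (TLEN in SAM) lies far above the library median. The median plus 5 MADs cut-off, the function name and the toy values are assumptions of mine, not the rule either program uses.

```python
import statistics


def long_insert_pairs(pairs, n_mads=5.0):
    """Return (name, tlen) pairs whose |tlen| is well above the library median.

    pairs: iterable of (read_name, tlen); tlen == 0 (no insert size) is ignored.
    """
    usable = [(name, tlen) for name, tlen in pairs if tlen != 0]
    sizes = [abs(tlen) for _, tlen in usable]
    median = statistics.median(sizes)
    # Median absolute deviation: robust to the outliers we are looking for.
    mad = statistics.median([abs(s - median) for s in sizes])
    cutoff = median + n_mads * max(mad, 1)
    return [(name, tlen) for name, tlen in usable if abs(tlen) > cutoff]


# Toy insert sizes (hypothetical) from a ~600bp library; p6 would be
# consistent with a deletion of roughly 2.6kb between the two mates.
example_pairs = [("p1", 600), ("p2", -590), ("p3", 610),
                 ("p4", 605), ("p5", 595), ("p6", 3200)]

flagged = long_insert_pairs(example_pairs)
print(flagged)
```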

        As it happens next week we'll do whole genome sequencing of a patient to confirm and accurately map a ~100kb deletion identified by CNV analysis of exome sequence data. So I can run SVDetect on this sample and post my observations again regarding filtering deletion calls. Though my coverage will again be ~8x, not sure if this means I'll get more or fewer calls than if I had 50x coverage.

        • ralonso
          Member
          • Feb 2012
          • 10

          #5
          Hi HLA,

          thanks for your reply. I am not working on human, but on a plant. We would be very glad if you posted your results. On the other hand, have you tried gasv (https://code.google.com/p/gasv/)? They are doing quite well, with a focus on false positives too, which is very important in my case since I have a lot of deletions.

          thanks!!

          • HLA
            Junior Member
            • Aug 2010
            • 4

            #6
            I haven't tried gasv yet, but I'd be happy to give it a go. I have to say we mainly work on exome and targeted sequencing experiments, and have so far only used low-ish coverage whole genome sequencing (whatever we get from a single HiSeq2000 lane) on a couple of samples to map structural variant breakpoints when we already know roughly where to look. So my current testing of various software is only out of curiosity, for potential future experiments when we might go straight to WGS for structural variant detection.

            • tonio100680
              Member
              • Apr 2010
              • 25

              #7
              Hi all,

              I am currently looking for software to detect CNVs. I am using targeted re-sequencing data (approximately twenty genes). I have tested CONTRA, but I would like to know your opinion. What do you think is the best software to detect all types of CNV (deletion, duplication, inversion and translocation)?

              Thank you in advance for your help
              Last edited by tonio100680; 10-30-2012, 12:36 AM.

                • binlangman
                  Member
                  • Dec 2013
                  • 11

                  #9
                  Problem about using SVDetect

                  When I used the Perl script BAM_preprocessingPairs.pl to extract anomalously mapped mate-pair/paired-end reads, I got two files, **.ab.bam and **.norm.bam, both of which were empty. Why didn't the files contain anything?
                  Thanks!

                  • YazBraimah
                    Junior Member
                    • Jul 2012
                    • 7

                    #10
                    Originally posted by binlangman
                    When I used the Perl script BAM_preprocessingPairs.pl to extract anomalously mapped mate-pair/paired-end reads, I got two files, **.ab.bam and **.norm.bam, both of which were empty. Why didn't the files contain anything?
                    Thanks!
                    You should check the stdout output for figures such as the number of correctly mapped reads, anomalously mapped reads, etc.
