
**tez** · 09-05-2011, 11:37 PM

I have now also seen that there is a "-r" option for setting the minimum number of read-pairs required to call an SV.

There isn't much mention of this in the manual, but looking through the source code I see it is set to 2, which would explain the huge number of results, poor run time and memory usage.

Does anyone have any experience with this parameter? Our data is supposed to be at ~30x depth. I am now giving it a try at min_read_pair=10, and I'll let you know how it goes.

Cheers

**aquinom85** · 12-12-2011, 07:10 AM

How did things turn out after changing the -r setting? I'm looking into BreakDancer too, but there is no FAQ and it's rather hard to get a clear picture of the software's limitations. Do you know if BreakDancer calls samples jointly, or do you have to run it on each sample and then cross-validate the results?

**tez** · 12-12-2011, 01:24 PM

Hello,

The results did not look good at all. Basically it called about 10,000 structural variations in the "normal" sample, and about 1,300 in the "tumour" sample.

The only way I could get these results was to run BreakDancer with the -r 10 option, breaking each whole genome down into chromosomes and running each chromosome separately. Even then it was still a 3-4 day process, running them all in parallel on a fairly powerful cluster.
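For anyone wanting to try the same per-chromosome split, here is a minimal Python sketch that only builds the command lines, one per chromosome. The `-r` option is the one discussed above; the binary name `breakdancer-max`, the `-o` single-chromosome flag, the chromosome names and the `sample.cfg` path are assumptions for illustration, so check them against the help output of your own BreakDancer install before submitting anything to a cluster.

```python
from typing import List


def per_chromosome_commands(
    config_path: str,
    chromosomes: List[str],
    min_read_pairs: int = 10,
    binary: str = "breakdancer-max",  # assumed binary name
    chrom_flag: str = "-o",           # assumed single-chromosome flag
) -> List[List[str]]:
    """Build one argument list per chromosome; nothing is executed here."""
    commands = []
    for chrom in chromosomes:
        commands.append(
            [binary, chrom_flag, chrom, "-r", str(min_read_pairs), config_path]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical chromosome names; use the names from your BAM header.
    chroms = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]
    for cmd in per_chromosome_commands("sample.cfg", chroms):
        print(" ".join(cmd))
```

Each printed line can be handed to whatever scheduler the cluster uses. Note that splitting by chromosome presumably means read pairs spanning two chromosomes are not considered within any single job.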

Looks like the biggest issue is data quality. The alignment / mapping was not done by us, and it looks like it may contain quite a lot of noise. So we are now experimenting with different ways to "clean" up the data.

Cheers

**P-Richmond** · 01-12-2012, 02:01 PM

Any luck in "cleaning up the data"? I have a similar problem, but I'm working in S. cerevisiae and keep running into artifacts of the alignments I'm using: read pairs that map to familial genes (genes with very high sequence identity on different chromosomes).

One possible methodology would be to generate reads from a perfect genome, then run through breakdancer and call that the noise model. I have a system in place for this read generation if you are interested in trying that. Then by simply creating an intersect with the calls from your data, you could produce a set that is more likely to be structural variations that aren't simply artifacts of the alignment or the underlying sequence.
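A rough Python sketch of the comparison step, assuming both call sets have already been parsed into simple (chromosome, start, end, type) records. The `Call` record, the `slop` tolerance and the rule "same chromosome, same type, overlapping interval" are illustrative choices, not anything BreakDancer defines. The function returns both groups, since calls that also show up in the simulated "perfect genome" run are presumably the ones to treat as suspect.

```python
from typing import List, NamedTuple, Tuple


class Call(NamedTuple):
    """Hypothetical minimal SV record (intra-chromosomal only)."""
    chrom: str
    start: int
    end: int
    sv_type: str


def overlaps(a: Call, b: Call, slop: int = 0) -> bool:
    """True if two calls share chromosome and type and their intervals
    overlap, allowing `slop` bases of breakpoint uncertainty."""
    return (
        a.chrom == b.chrom
        and a.sv_type == b.sv_type
        and a.start <= b.end + slop
        and b.start <= a.end + slop
    )


def split_by_noise(
    calls: List[Call], noise: List[Call], slop: int = 0
) -> Tuple[List[Call], List[Call]]:
    """Return (calls also seen in the noise model, calls not seen in it)."""
    in_noise, not_in_noise = [], []
    for call in calls:
        if any(overlaps(call, n, slop) for n in noise):
            in_noise.append(call)
        else:
            not_in_noise.append(call)
    return in_noise, not_in_noise
```

The nested loop is quadratic, which should be fine for a yeast-sized call set; tens of thousands of human calls would be better served by sorting the calls or using an interval tree.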

-Phil

**aquinom85** · 01-20-2012, 07:46 AM

I just ran BreakDancer on one human genome sample and got 29,500 SVs called. In my naive opinion this seems outrageously high, so I think I'll try raising the -r value. Does anyone know what a normal number of SVs is for a human genome, for comparison? Also, how should the confidence score be interpreted in general?
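One way to explore stricter thresholds without re-running everything is to post-filter an existing output file. The Python sketch below assumes tab-separated output with `#`-prefixed header lines, the confidence score in the ninth column and the supporting read-pair count in the tenth; those column positions are assumptions, so check them against the header of your own output. The default cutoffs are placeholders, not recommendations.

```python
from typing import Iterable, List


def filter_calls(
    lines: Iterable[str],
    min_reads: int = 10,
    min_score: float = 0.0,
    score_col: int = 8,   # assumed 0-based column of the confidence score
    reads_col: int = 9,   # assumed 0-based column of supporting read pairs
) -> List[str]:
    """Keep calls with enough supporting read pairs and a high enough score."""
    kept = []
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(score_col, reads_col):
            continue  # malformed or truncated line
        if (
            int(fields[reads_col]) >= min_reads
            and float(fields[score_col]) >= min_score
        ):
            kept.append(line.rstrip("\n"))
    return kept
```

Counting how many calls survive at a few different `min_reads` and `min_score` values gives a quick feel for how sensitive the total is to each threshold.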


Structural variation detection using BreakDancer on Whole Genome SOLiD data
