**MattB** · 03-30-2010, 07:33 AM

There is a setting in SOAPdenovo that I thought had some influence on this, used when you run 'SOAPdenovo contig' separately.

-M mergeLevel(default 1,min 0, max 3): the strength of merging similar sequences during contiging

However, when I experimented with different values, it made no difference to the contig assembly results. I'm not sure whether it affected the 'consensus' base; probably not.

If you search for 'bubbles' in the ABySS, Velvet and CLC documentation, you will find a lot more detail on how they deal with SNPs.

**Boonie** · 04-01-2010, 12:35 PM

A 454 - SSAHA approach

Just to add to the conversation: I pooled genomic DNA from 18 individuals, cut it with a 4-base cutter, and sequenced a 15bp size fraction with two full runs of 454 reads (250bp). I assembled them with gsAssembler, which produced an average of 20 reads per contig. Then I mapped the individual reads back to the contig consensus sequences using SSAHA2 and used the SSAHA_pipeline to call SNPs. It worked pretty well: I wound up with about 8000 SNPs I could believe in, and the validation rate was about 95%. The predicted allele frequency was strongly correlated (>0.8) with the real allele frequency in the donors. My goal was just basic SNP discovery in a novel species, and it fit the bill.

Caveats: beware of minor allele frequencies near 0.5, which could arise from alignment of reads from duplicated loci; screen out short tandem repeats, because STR allelic differences in the alignment can cause false positive SNPs; loci with only 4 mapped reads (minimum 2 reads per allele) may be useful, but don't count on them.
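The caveats above can be sketched as a simple post-filter on pooled SNP calls. This is only an illustration, not part of the SSAHA_pipeline: the `PooledSite` record, the `classify` function and the `paralog_maf=0.45` cutoff are assumptions made up for the example, while the 2-reads-per-allele and 4-read figures come from the post.

```python
from dataclasses import dataclass


@dataclass
class PooledSite:
    """Hypothetical record for one candidate SNP in a pooled alignment."""
    contig: str
    pos: int
    ref_reads: int        # reads supporting the contig consensus base
    alt_reads: int        # reads supporting the alternative base
    in_str: bool = False  # True if the site lies in a short tandem repeat


def minor_allele_freq(site):
    """Fraction of reads carrying the less common allele."""
    total = site.ref_reads + site.alt_reads
    return min(site.ref_reads, site.alt_reads) / total if total else 0.0


def classify(site, min_per_allele=2, low_depth=4, paralog_maf=0.45):
    """Apply the caveats in order; paralog_maf is an assumed cutoff."""
    if site.in_str:
        return "reject_str"
    if min(site.ref_reads, site.alt_reads) < min_per_allele:
        return "reject_low_support"
    if site.ref_reads + site.alt_reads <= low_depth:
        return "flag_low_depth"
    if minor_allele_freq(site) >= paralog_maf:
        return "flag_possible_paralog"
    return "keep"
```

Low depth is checked before the 0.5 test because a 4-read locus with 2 reads per allele always sits at exactly 0.5 and says little about paralogy either way.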

**pierre350d** · 09-28-2010, 11:38 PM

A piece of information,

We developed a tool called kisSnp that takes two sets of unassembled raw short reads and compares them to find SNPs between the two sets.
It outputs the SNPs with small flanking regions.
It uses little memory and runs quickly.

All info and download can be found on the dedicated website: http://alcovna.genouest.org/

Enjoy! (Remarks and comments are welcome.)

**lletourn** · 09-29-2010, 03:42 AM

I checked your site quickly, it's very interesting.

I do have a question though: without a reference, won't you be missing all the homozygous variations?

Also, you need long enough reads to generate flanks, no? Anything shorter than 50, or even 75, wouldn't be long enough.

Or am I missing something?

**pierre350d** · 09-29-2010, 07:07 AM

With the current version we detect only SNPs between individuals. It compares two sets of reads, focusing on small substitutions that may be those SNPs.

We are currently working on an intra-individual version that will make it possible to detect the heterozygous SNPs of a single individual.

This can be done without a reference genome, if the coverage is sufficient.
Reads of length 50 to 75 are indeed long enough.

Pierre
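For readers wondering how two read sets can be compared without a reference at all, here is a toy sketch of the general idea. It is not kisSnp's algorithm: the function names and the `k=7` default are assumptions for illustration, and it ignores reverse complements, sequencing errors and coverage thresholds. It looks for k-mers unique to each set that share both flanks and differ only at the middle base.

```python
from collections import defaultdict


def kmers(reads, k):
    """Return the set of all k-mers found in a list of reads."""
    out = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            out.add(read[i:i + k])
    return out


def candidate_substitutions(reads_a, reads_b, k=7):
    """Pair set-specific k-mers that differ only at their middle base.

    Returns tuples (left_flank, base_in_a, base_in_b, right_flank).
    """
    mid = k // 2
    kmers_a, kmers_b = kmers(reads_a, k), kmers(reads_b, k)
    only_a = kmers_a - kmers_b
    only_b = kmers_b - kmers_a

    # Index the k-mers unique to set B by their two flanks.
    by_flanks = defaultdict(list)
    for km in only_b:
        by_flanks[(km[:mid], km[mid + 1:])].append(km[mid])

    hits = []
    for km in sorted(only_a):
        flanks = (km[:mid], km[mid + 1:])
        for base_b in sorted(by_flanks.get(flanks, [])):
            hits.append((flanks[0], km[mid], base_b, flanks[1]))
    return hits
```

Calling `candidate_substitutions(["ACGTACGGTCA"], ["ACGTATGGTCA"])` reports the C/T difference together with its 3-base flanks. The flank length is tied to `k` here, which is also why very short reads leave little usable context.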

**ybfu** · 12-06-2010, 12:40 PM

DIAL by Dr. Ratan for SNP without reference genome

Hi, Everyone:

I am trying to use DIAL, without success and for no obvious reason, even though I am following the instructions exactly. So I am wondering whether anyone in our community is using DIAL to call SNPs and could share some experience. I contacted Dr. Ratan at Penn State, but got no response. Any comments on DIAL?

I have a 454 sequencing run of 8 barcoded samples and got an individual .sff file for each. When I run DIAL and add each .sff file, it sometimes works and sometimes doesn't. I tested it with the supplied data: the add step worked, but update did not (it returns to the $ prompt without an error, but ps shows no such task running).

**natstreet** · 12-06-2010, 01:21 PM

What version of newbler are you using? I tried DIAL and it would very specifically only work with v2.0 and nothing later.

**ybfu** · 12-06-2010, 01:30 PM

I did try it with version 2.0 by changing the newbler path in my .profile. What I got when I ran DIAL add was: Errors: unable to open sff file. SRR000375.sff (which is one of the test sff files).

**arthurmelo** · 01-12-2016, 06:15 AM

Hi everybody, I would like to introduce and share GBS-SNP-CROP: a reference-optional pipeline for SNP discovery and plant germplasm characterization using variable length, paired-end genotyping-by-sequencing data.
Recently published in BMC Bioinformatics, this methodology could be useful for population genomic studies in model and non-model organisms, whether or not a reference genome is available.

Please see the GBS-SNP-CROP GitHub page for more details and the user manual:

https://github.com/halelab/GBS-SNP-CROP.git

GBS SNP Calling Reference Optional Pipeline. Contribute to halelab/GBS-SNP-CROP development by creating an account on GitHub.

Best regards,
Arthur Melo
