SNP calling on 454 data
-
We also use the HCDiffs file in combination with our own downstream analysis, where we annotate the data with known SNPs and other useful information. This seems to work fine as long as you have sufficient coverage and there aren't too many variants close to each other. With lower coverage you start getting more false positives, and you also start missing variants. We once compared the HCDiffs output of version 1.0 of the mapper software against a SNP array, and that didn't look good, as we were missing quite a few variants.
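A comparison like the SNP array check mentioned above can be sketched in a few lines of Python. This is only an illustration: it assumes both call sets have already been reduced to `(chromosome, position)` pairs, and the function name `compare_to_array` is made up for this example.

```python
def compare_to_array(called, array_variants):
    """Compare sequencing-based calls with variants seen on a SNP array.

    Both arguments are iterables of (chromosome, position) tuples.
    `array_variants` should hold only positions where the array reports a
    non-reference genotype (an assumption of this sketch).
    """
    called = set(called)
    array_variants = set(array_variants)

    found = called & array_variants    # array variants that were also called
    missed = array_variants - called   # array variants absent from the calls
    extra = called - array_variants    # calls with no matching array variant

    # Fraction of array variants recovered; undefined for an empty array set.
    sensitivity = len(found) / len(array_variants) if array_variants else float("nan")
    return {
        "found": found,
        "missed": missed,
        "extra": extra,
        "sensitivity": sensitivity,
    }


if __name__ == "__main__":
    calls = [("chr1", 10), ("chr1", 20)]
    array = [("chr1", 10), ("chr1", 30), ("chr1", 40)]
    result = compare_to_array(calls, array)
    print(sorted(result["missed"]), round(result["sensitivity"], 2))
```

Calls in `extra` are not necessarily false positives, since an array only assays a fixed set of positions.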
-
De novo contig assembly from capture array
Originally posted by Layla:
This is quite a tricky process, especially without the support of bioinformaticians. The downstream analysis is much more complex than carrying out the capture array itself. The HCDiffs file does seem very promising for extracting useful information for SNPs.
Tim, could you please say what you mean when you say that you parse the HCDiffs file for "differences with >85% agreement", and also the kind of validation you do? I am also fairly certain that a lot of our indels will be false positives.
(Thank you)
Has anybody attempted de novo contig assembly from their capture array data?
Layla
Have you found anyone who has done the contig assembly? I'm curious...
-
Thank you Tim, I realized what you meant 2 seconds after I had posted the question! Yes, I have been focusing on that file and using diffs > 75% agreement. Cheers, Layla
-
Originally posted by Layla: Tim, could you please say what you mean when you say that you parse the HCDiffs file for "differences with >85% agreement", and also the kind of validation you do? I am also fairly certain that a lot of our indels will be false positives.
Layla
-
Capture and beyond
This is quite a tricky process, especially without the support of bioinformaticians. The downstream analysis is much more complex than carrying out the capture array itself. The HCDiffs file does seem very promising for extracting useful information for SNPs.
Tim, could you please say what you mean when you say that you parse the HCDiffs file for "differences with >85% agreement", and also the kind of validation you do? I am also fairly certain that a lot of our indels will be false positives.
(Thank you)
Has anybody attempted de novo contig assembly from their capture array data?
Layla
-
SNP calling for 454 data
You can use the NextGENe software to call SNPs from 454 data. The software links the calls to the dbSNP database if a GenBank-format reference is provided. SoftGenetics may provide a demo so you can try NextGENe on your own data.
josliu
-
I can see no correlation in the differences called by Newbler runMapper that we validated (which are generally high-quality calls), but I don't think our sample size is large enough. We have noted trends in the raw runMapper output for calls that fall below our cutoff filter: for example, a large number of 1 bp insertions and deletions have <25-fold read coverage and <50% concordance.
tim
-
timread,
Could you give some parameters on read depth for the false vs. true positives? Or do you find no correlation?
Thanks
Tom
-
We are primarily looking for SNPs in bacterial genomes (i.e., no heterozygotes). For a first look we parse the HCDiffs file for differences with >85% agreement. We then proceed to validation. Most of the single-base insertions and deletions turn out to be false positives.
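For anyone wanting to try this kind of ">85% agreement" first pass, here is a minimal Python sketch. It assumes that each variant summary line in the HCDiffs file starts with `>` and is tab-delimited, with reference name, start, end, reference base(s), variant base(s), total depth and variant frequency in that order. That layout is an assumption and should be checked against your own mapper output. `Diff`, `parse_summary_lines` and `high_agreement` are hypothetical names used only for illustration.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Diff(NamedTuple):
    ref: str        # reference sequence name
    start: int      # first reference position of the difference
    end: int        # last reference position of the difference
    ref_nuc: str    # reference base(s); "-" for an insertion
    var_nuc: str    # variant base(s); "-" for a deletion
    depth: int      # total read depth at the site
    var_pct: float  # percentage of reads supporting the variant


def parse_summary_lines(lines: Iterable[str]) -> Iterator[Diff]:
    """Yield one Diff per '>' summary line (assumed column order, see above)."""
    for line in lines:
        if not line.startswith(">"):
            continue  # skip alignment text and blank lines
        fields = line[1:].rstrip("\n").split("\t")
        if len(fields) < 7:
            continue
        try:
            start, end, depth = int(fields[1]), int(fields[2]), int(fields[5])
            pct = float(fields[6].rstrip("%"))
        except ValueError:
            continue  # column-header rows also start with '>'
        yield Diff(fields[0], start, end, fields[3], fields[4], depth, pct)


def high_agreement(diffs: Iterable[Diff], min_pct: float = 85.0) -> List[Diff]:
    """Keep differences whose variant percentage is above min_pct."""
    return [d for d in diffs if d.var_pct > min_pct]


if __name__ == "__main__":
    example = [
        ">Reference\tStart\tEnd\tRef\tVar\tTotal\tVar\n",
        ">contig1\t100\t100\tA\tG\t40\t95%\n",
        ">contig1\t250\t250\tC\t-\t12\t42%\n",
        "alignment text that is ignored\n",
    ]
    for d in high_agreement(parse_summary_lines(example)):
        print(d.ref, d.start, d.ref_nuc, d.var_nuc, d.var_pct)
```

Single-base indels can be picked out afterwards by checking for `"-"` in `ref_nuc` or `var_nuc`, which is useful given how many of them fail validation.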
-
Thanks Tom, that was helpful.
Is anyone else looking for SNPs in 454 data? I heard that a brute-force BLAST approach with no gaps also works! Lots of try-it-out-yourself.
-
We are working with this. We mostly use the HCDiffs file with a lot of post-processing. The key thing we look at is read depth: HCDiffs uses a depth of 3 (2 one way, 1 the other), but I would say 5 is a better minimum, and 15 if you are looking for hets. We also filter for known SNPs using the dbSNP track from the UCSC database, and check whether a variant is in an exon (also from UCSC), since most people I am working with are looking at NimbleGen capture experiments, primarily focused on exons. If you are looking outside exons, conservation score appears somewhat useful.
Don't know if that helps at all.
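The depth and annotation filters described above could look roughly like this in Python. It is a sketch under several assumptions: calls are already available as `(chrom, pos, depth)` tuples, known SNPs have been loaded into a dictionary keyed by `(chrom, pos)`, and exons are sorted, non-overlapping, inclusive `(start, end)` intervals per chromosome. How those are exported from the UCSC tables is not shown, and `annotate_calls` is a made-up name.

```python
from bisect import bisect_right


def annotate_calls(calls, known_snps, exons_by_chrom, expect_hets=False):
    """Drop low-depth calls and annotate the rest.

    calls          -- iterable of (chrom, pos, depth) tuples
    known_snps     -- dict mapping (chrom, pos) to a known SNP identifier
    exons_by_chrom -- dict mapping chrom to a sorted list of non-overlapping,
                      inclusive (start, end) exon intervals
    expect_hets    -- use the stricter depth minimum suggested for hets
    """
    min_depth = 15 if expect_hets else 5
    # Precompute interval starts once per chromosome for binary search.
    starts_by_chrom = {c: [s for s, _ in ex] for c, ex in exons_by_chrom.items()}

    annotated = []
    for chrom, pos, depth in calls:
        if depth < min_depth:
            continue
        exons = exons_by_chrom.get(chrom, [])
        starts = starts_by_chrom.get(chrom, [])
        # Last exon starting at or before pos, if any.
        i = bisect_right(starts, pos) - 1
        exonic = i >= 0 and exons[i][0] <= pos <= exons[i][1]
        annotated.append({
            "chrom": chrom,
            "pos": pos,
            "depth": depth,
            "known_id": known_snps.get((chrom, pos)),  # None if novel
            "exonic": exonic,
        })
    return annotated


if __name__ == "__main__":
    calls = [("chr1", 150, 20), ("chr1", 500, 4), ("chr1", 900, 8)]
    known = {("chr1", 150): "rs_example"}  # placeholder ID, not a real rs#
    exons = {"chr1": [(100, 200), (800, 850)]}
    for row in annotate_calls(calls, known, exons):
        print(row)
```

Keeping the known-SNP and exon flags as annotations, rather than dropping rows, leaves the choice of what to filter to whoever is looking at the results.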
-
I'd hoped we were working on similar things, but it seems not. Your problem seems to be more about recognizing novel SNPs, which is substantially different from my need to recognize named SNPs.
Specifically, I need to turn the PGP10 exome FASTA into a series of dbSNP rs#s and report observed genotypes. Results will be tab-delimited and look something like
Since this is about recognizing named entities, I'd like to extend it to also recognize non-SNP features such as Huntington's, and possibly CNVs.
Sorry I can't be more helpful, but if anyone has code or advice on either topic I'm interested in both.
-
SNP calling on 454 data
Does anyone have ideas on how to make variation calls on 454 re-sequencing data?
Perhaps using the AllDiffs or HCDiffs files from the GSMapper software, or some other tools? I believe some downstream analysis is needed after the Marth lab's MOSAIK tool in order to get variation positions and % calls for A, C, G and T.
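As a rough illustration of the "% calls for A, C, G and T" idea, the Python sketch below computes base percentages for a single reference position. It assumes the bases of all reads covering that position have already been extracted from the alignment (that step depends on the aligner and is not shown), and the function name `base_percentages` is made up for this example.

```python
from collections import Counter

BASES = ("A", "C", "G", "T")


def base_percentages(bases):
    """Return the percentage of A, C, G and T among the reads at one position.

    `bases` is an iterable of single-character base calls. Anything that is
    not A, C, G or T (for example N or a gap character) is ignored.
    """
    counts = Counter(b.upper() for b in bases if b.upper() in set(BASES))
    total = sum(counts.values())
    if total == 0:
        return {b: 0.0 for b in BASES}
    return {b: 100.0 * counts[b] / total for b in BASES}


if __name__ == "__main__":
    # Four reads cover this position: three show A, one shows G.
    print(base_percentages(["A", "A", "A", "G"]))
```

A position where the reference base is well below 100% is then a candidate variant, subject to whatever depth and agreement cutoffs you choose.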