Diginorm Algorithm
-
No short-term plans to make use of phred scores; no short-term plans on releasing the new approaches. The end-trimming problems are fairly easily solved by using a high C, like C=20 or C=50, so it's not a blocker for anyone; and for now we're trying to focus on getting the next version out. Plus pubs.
-
Thank you Adina and Titus. That does make a lot of sense now. Can I ask if the new implementation makes use of the phred scores? And do you have an estimate of when it will be released?
-
Let's see if I can give some intuition too...
Suppose you have an undersampled region (like the terminal end of a contig, or a low-abundance splice variant) next to a bunch of highly sampled regions. Then if you had a completely correct read that crossed both the highly sampled and the undersampled region, but contained more of the highly sampled region, the median would be high, and the read would be discarded. So it really has to do with high sampling right next to low sampling -- basically what Adina said about repeats.
We know how to deal with this properly and have a prototype implementation, but it isn't really ready for use yet.
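To put numbers on the scenario described above, here is a minimal sketch. The per-k-mer abundances (100 and 2), the read layout, and the cutoff of 20 are made-up illustrative values, and `is_discarded` is a hypothetical helper, not part of any diginorm implementation.

```python
from statistics import median

CUTOFF = 20  # coverage cutoff C; illustrative value only


def is_discarded(kmer_abundances, cutoff=CUTOFF):
    """Drop a read once its median k-mer abundance reaches the cutoff."""
    return median(kmer_abundances) >= cutoff


# Abundance of each k-mer along a read, left to right (assumed values).
# Read A: mostly inside the highly sampled region, tail in the undersampled one.
mostly_high = [100] * 8 + [2] * 5
# Read B: mostly inside the undersampled region.
mostly_low = [100] * 5 + [2] * 8

print(median(mostly_high), is_discarded(mostly_high))  # 100 True
print(median(mostly_low), is_discarded(mostly_low))    # 2 False
```

Read A is error-free, yet it is dropped because more than half of its k-mers come from the highly sampled side, and its five low-abundance k-mers are lost with it. Read B survives because the undersampled k-mers dominate its median.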
-
Hi,
You are correct in that diginorm will retain low-abundance reads, where abundance is estimated as the median abundance of all k-mers in the read. If you were to rank order all the k-mers in a read by their observed abundance in the dataset, the read's estimated abundance would be the median value. Thus, a read is discarded based on the median of its k-mer abundance distribution (not necessarily the abundance of its terminal k-mers). The k-mer length and read length affect how sensitive the median estimate is (as described in the paper) to, e.g., sequencing errors typically found at the ends of Illumina reads.
Diginorm would discard reads pertaining to terminal k-mers if there was, for example, a repetitive region in a read that was observed in high abundance in the dataset. In this case, the distribution of k-mer abundances across the entire read is likely even (due to repeats), and the abundance of the terminal k-mer is more likely to be the median abundance of the read.
Hope this helps!
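As a rough illustration of the median estimate described in this post, the sketch below counts k-mers exactly with a `Counter` and takes the median over one read. The sequences, k=4, and the helper names `kmers` and `median_kmer_abundance` are assumptions made for the example; reverse complements are ignored for simplicity.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """All overlapping k-mers of seq, left to right."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def median_kmer_abundance(read, counts, k):
    """Median of the dataset-wide counts of the read's k-mers."""
    return median(counts[km] for km in kmers(read, k))


K = 4
correct = "ACGTTGCAAGGCTTAC"      # toy read with 13 distinct 4-mers
with_end_error = "ACGTTGCAAGGCTTAG"  # same read, last base miscalled

# Toy dataset: the correct read observed 10 times.
counts = Counter()
for read in [correct] * 10:
    counts.update(kmers(read, K))

print(median_kmer_abundance(correct, counts, K))         # 10
print(median_kmer_abundance(with_end_error, counts, K))  # 10
```

The miscalled last base creates a single unseen k-mer (count 0), but the other twelve k-mers still have count 10, so the median does not move. A longer k or a shorter read would let the same error touch a larger share of the read's k-mers.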
-
Diginorm Algorithm
Hello,
I am having trouble understanding a point made in the Diginorm paper:
They say that Diginorm discards some terminal kmer and low-abundance isoform information but I am wondering why this is?
According to the description of the algorithm, Diginorm estimates read coverage by using the median abundance of kmers for each read and discards the read if the median abundance is above some cutoff level. This should mean that any low abundance reads would be retained. If this is true, under what situations would it discard reads pertaining to terminal kmers and low-abundance isoforms?
I suspect I am missing something here and it would be very helpful to get some outside views to get me out of this mind trap.
Thank you!
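For readers following the question, the procedure described in it can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses an exact `Counter` where a real tool would likely use a memory-efficient approximate counter, it ignores reverse complements, and the function names, toy reads, k=4, and cutoff of 3 are all invented for the example. Reads are kept while their median k-mer count is below the cutoff, and only kept reads update the counts.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """All overlapping k-mers of seq, left to right."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize(reads, k, cutoff):
    """Single pass over reads; keep a read only if its median k-mer count
    among previously kept reads is below the cutoff."""
    counts = Counter()  # exact counts; stand-in for an approximate counter
    kept = []
    for read in reads:
        kms = kmers(read, k)
        if not kms:
            continue  # read shorter than k has no k-mers to score
        if median(counts[km] for km in kms) < cutoff:
            kept.append(read)
            counts.update(kms)  # only kept reads contribute to the counts
    return kept


# Toy input: one sequence seen 10 times, another seen once.
reads = ["ACGTTGCAAGGCTTAC"] * 10 + ["TTGACCGATAGCATGC"]
kept = normalize(reads, k=4, cutoff=3)
print(len(kept))  # 4
```

With a cutoff of 3, the first three copies of the abundant read are kept and the remaining seven are dropped, while the single low-abundance read is retained. That matches the intuition in the question that low-abundance reads survive; the losses discussed in the replies arise when low- and high-abundance k-mers share one read.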