What's considered to be the best tool for removing human sequences from large sets of metagenomic next-gen reads? We tried BMTagger at default values on a set of ~200 million 100 nt Illumina reads, and it left in a lot of reads that hit human seqs with high confidence in a subsequent blastn search vs. the NCBI nt database.
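For anyone wanting to put a number on how many human reads slipped through, here is a minimal sketch that tallies them from blastn tabular output. It assumes the search was run with `-outfmt "6 qseqid sseqid pident evalue bitscore staxids"` (this column order is my assumption, not what the poster used), that hits for each query are listed best first, and that "high confidence" means an e-value at or below a hypothetical cutoff `max_evalue`. The function name `human_hit_ids` is also just illustrative.

```python
import csv
from typing import Iterable, Set

HUMAN_TAXID = "9606"  # NCBI Taxonomy ID for Homo sapiens


def human_hit_ids(rows: Iterable[str], max_evalue: float = 1e-20) -> Set[str]:
    """Return query IDs whose first-listed hit is human and passes the cutoff.

    Assumed columns (tab-separated):
    qseqid, sseqid, pident, evalue, bitscore, staxids
    """
    seen: Set[str] = set()
    human: Set[str] = set()
    for fields in csv.reader(rows, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        qseqid, evalue, staxids = fields[0], float(fields[3]), fields[5]
        if qseqid in seen:
            continue  # only the first (assumed best) hit per query is considered
        seen.add(qseqid)
        # staxids can hold several taxids separated by ';'
        if evalue <= max_evalue and HUMAN_TAXID in staxids.split(";"):
            human.add(qseqid)
    return human
```

Typical use would be `human_hit_ids(open("hits.tsv"))`, where `hits.tsv` is a placeholder file name. Dividing the size of the returned set by the number of reads searched gives a rough leak rate for the filter.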
Yes, it's far from good. But that's how many were left in our metagenomic set after filtering out short reads, duplicate reads, and (via BMTagger) human reads. So we'd like to try a better human read remover, to help ensure that the final read set for downstream analysis (e.g. blastn) is all nonhuman. And smaller.
Originally posted by ssully: "Yes, it's far from good. But that's how many were left in our metagenomic set after filtering out short reads, duplicate reads, and (via BMTagger) human reads. So we'd like to try a better human read remover, to help ensure that the final read set for downstream analysis (e.g. blastn) is all nonhuman. And smaller."
p.s. If you have human reads, you probably have other contaminants too, such as bacteria from human skin, among other things. Keep that in mind, especially if your contamination rate is high.
Last edited by rhinoceros; 08-29-2013, 12:51 PM.
We don't want to do assembly, because our main goal is to interrogate the diversity of taxa in our samples. We've done quality score filtering, length filtering, adapter trimming, and duplicate removal. More vigorous quality trimming may be detrimental to uncovering diversity, according to this study.
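As a rough illustration of two of the steps listed above (length filtering and duplicate removal), here is a minimal sketch. It is not the pipeline actually used in this thread: the `min_len` threshold is a made-up value, "duplicate" is assumed to mean an exact, case-insensitive match on the read sequence, and the function name `filter_reads` is hypothetical. Sequences are stored as 16-byte digests rather than full strings to keep memory down on large read sets.

```python
import hashlib
from typing import Iterable, Iterator, Set, Tuple

# One FASTQ record: header line, sequence, '+' line, quality string
Record = Tuple[str, str, str, str]


def filter_reads(records: Iterable[Record], min_len: int = 50) -> Iterator[Record]:
    """Drop reads shorter than min_len, then drop exact sequence duplicates.

    The first occurrence of each sequence is kept; later copies are skipped.
    """
    seen: Set[bytes] = set()
    for rec in records:
        seq = rec[1].upper()
        if len(seq) < min_len:
            continue  # length filter
        # Store a compact digest of the sequence instead of the sequence itself
        digest = hashlib.blake2b(seq.encode("ascii"), digest_size=16).digest()
        if digest in seen:
            continue  # exact duplicate of a read already emitted
        seen.add(digest)
        yield rec
```

Because the function is a generator, it can be chained in front of any later step without holding all records in memory. The set of digests still grows with the number of unique reads.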
We are studying a surface microbiome that humans interact with, so we don't mind skin bacteria; we want to catalog those, as well as any eukaryotic seqs. We don't even 'mind' the human sequences; it's just that their numbers make the seq files very large, so we want to split them out and treat the human and nonhuman sets separately.
Last edited by ssully; 08-29-2013, 01:49 PM.
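The split described here can be sketched independently of whichever tool flags the human reads. The sketch below assumes you already have a set of read IDs called human by some tool (BMTagger, a mapper, or a BLAST tally), that the FASTQ is unwrapped and complete (exactly four lines per read), and that a read ID is the header text between `@` and the first whitespace. All function names are hypothetical. It streams records to two output handles so nothing but the ID set has to sit in memory.

```python
from typing import Iterable, Iterator, Set, TextIO, Tuple

# One FASTQ record: header line, sequence, '+' line, quality string
Record = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield 4-line FASTQ records (assumes unwrapped, complete records)."""
    it = (line.rstrip("\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        seq, plus, qual = next(it), next(it), next(it)
        yield header, seq, plus, qual


def read_id(header: str) -> str:
    """Read ID: header text after '@', up to the first whitespace."""
    return header[1:].split()[0]


def split_fastq(
    lines: Iterable[str],
    human_ids: Set[str],
    human_out: TextIO,
    nonhuman_out: TextIO,
) -> Tuple[int, int]:
    """Write each record to human_out or nonhuman_out; return (n_human, n_nonhuman)."""
    n_human = n_nonhuman = 0
    for rec in read_fastq(lines):
        if read_id(rec[0]) in human_ids:
            human_out.write("\n".join(rec) + "\n")
            n_human += 1
        else:
            nonhuman_out.write("\n".join(rec) + "\n")
            n_nonhuman += 1
    return n_human, n_nonhuman
```

In practice the three handles would be opened files, e.g. `split_fastq(open("reads.fq"), ids, open("human.fq", "w"), open("nonhuman.fq", "w"))`, with placeholder file names. For paired-end data the ID convention (for example a trailing `/1` or `/2`) would need to match whatever the flagging tool reports.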