If you are looking for a web tool, I can't really offer any suggestions (hopefully someone else can). You can run BBMap locally, though, which will return alignments along with their percent identity. And rather than aligning to bacterial genomes, you can just align to 16S using one of the datasets mentioned here. However, 16S sequences in public databases are sometimes not full-length, or are too long, so the coordinates will be misleading. You may wish to first filter out the ones that seem anomalous, for example like this:
reformat.sh in=16S.fasta out=filtered.fasta minlen=1440 maxlen=1640
...which is what I did previously when trying to get rid of bad sequences. The exact length limits I derived empirically from looking at length distributions (using readlength.sh); possibly a tighter band would be better since you are interested in finding specific coordinates.
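If you would rather inspect the length distribution and apply the same kind of length filter without BBTools, the idea can be sketched in a few lines of Python. This is only an illustration, not what reformat.sh or readlength.sh do internally; the function names and the 50 bp bin size are made up for the example, and the inclusive 1440-1640 bounds are simply copied from the command above.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def length_histogram(records, bin_size=50):
    """Count sequences per length bin, keyed by the lower edge of the bin."""
    return Counter((len(seq) // bin_size) * bin_size for _, seq in records)


def filter_by_length(records, min_len=1440, max_len=1640):
    """Keep records whose length is within [min_len, max_len], inclusive."""
    return [(h, s) for h, s in records if min_len <= len(s) <= max_len]


# Toy input: one plausible full-length sequence and two fragments.
toy = [">a", "ACGT" * 400, ">b", "ACGT" * 100, ">c", "AC", "GT"]
records = list(read_fasta(toy))
kept = filter_by_length(records)
```

For a real file you would pass an open file handle to read_fasta, look at length_histogram to see where the main peak sits, and then tighten min_len and max_len around it.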
-
Also, do you know if there is a publication or information somewhere that gives the rough coordinates of the variable regions within the gene?
-
Hi Brian,
Yes, basically. I am involved with a metagenomics study where I have 200-250 bp reads from next-generation sequencing, derived from different regions of the 16S gene, i.e. from 6 different primers (the primer sequences are unknown because they come from a commercial kit and the company informed us that they are proprietary). For example, one fastq file that I have contains ~170K reads, all from different regions of the 16S gene. I would like to be able to blast my reads against a database of bacteria so that in return I get each read aligned somewhere on the 16S gene along with its coordinates; that way I will know where in the gene each read in my file is derived from.
Does this make sense? The HOMD database (www.homd.org) allows one to blast a total of only 3000 reads. I am looking for a different tool that will allow me to blast all of my reads and will return the alignments, and percent identity along with where the read aligned along the gene.
Jen
-
This is slightly tangential to your question, but I have a neat tool that will locate a region in a 16S if you have the full-length 16S and primer sequences for the region. It's actually designed for cutting out the sub-regions, but you can just look at the sam file to get the coordinates.
msa.sh in=16S.fasta query=ACTGACTG out=1.sam
msa.sh in=16S.fasta query=ACTGACTG out=2.sam
Those sam files will indicate, for each 16s sequence in the input file, the best alignment of the query sequence (which should be your left or right primer sequence for that subregion). Then you can cut out the regions like this:
cutprimers.sh in=16S.fasta out=V4.fasta sam1=1.sam sam2=2.sam
These are in BBTools.
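For reading coordinates out of a sam file, a minimal Python sketch could look like the following. It relies only on the standard SAM columns (POS is the 1-based leftmost aligned position, and the CIGAR string says how many reference bases the alignment covers), not on anything specific to msa.sh. The function names are made up for the example, and I have not checked which column msa.sh uses for the 16S sequence name versus the primer, so look at a few lines of your own output first.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance along the reference sequence.
REF_CONSUMING = set("MDN=X")


def reference_span(pos, cigar):
    """Return the 1-based inclusive (start, end) covered on the reference."""
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar)
                  if op in REF_CONSUMING)
    return pos, pos + ref_len - 1


def aligned_spans(sam_lines):
    """Yield (qname, rname, start, end) for each mapped SAM record."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        pos, cigar = int(fields[3]), fields[5]
        if flag & 4 or cigar == "*":
            continue  # unmapped
        start, end = reference_span(pos, cigar)
        yield qname, rname, start, end


# Toy records with invented names and positions.
toy_sam = [
    "@HD\tVN:1.4",
    "q1\t0\tr1\t100\t60\t8M\t*\t0\t0\tACTGACTG\t*",
    "q2\t4\t*\t0\t0\t*\t*\t0\t0\tACTGACTG\t*",
]
spans = list(aligned_spans(toy_sam))
```

With a real file you would pass the open file handle instead of toy_sam and collect the start and end for each sequence.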
If you want to align sequences to bacteria, I suggest RefSeq Bacteria (just download and concatenate all of the *.fna.gz files). As far as tools go for BLASTing, you can use BLAST, of course. But I'm not entirely sure what you want. Can you clarify the question?
-
16S gene genomic coordinates?
Hello,
I have a couple of questions.
Does anyone know where I can get the true genomic coordinates of the 9 different variable regions in the 16S gene?
I have these based on my own inference from some published figures of the gene but would like to know if they are correct:
V1 ~ 80-120
V2 ~ 170-200
V3 ~ 420-500
V4 ~ 610-700
V5 ~ 820-950
V6 ~ 960-1100
V7 ~ 1150-1200
V8 ~ 1220-1300
V9 ~ 1450-1500
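To illustrate how a table like this would be used once reads have alignment coordinates, here is a minimal Python sketch. The intervals are just the rough, unverified estimates listed above, the function name is made up for the example, and the coordinates are only meaningful relative to whichever single reference 16S sequence the reads were aligned to.

```python
# Rough estimates copied from the list above; NOT verified coordinates.
APPROX_REGIONS = {
    "V1": (80, 120),
    "V2": (170, 200),
    "V3": (420, 500),
    "V4": (610, 700),
    "V5": (820, 950),
    "V6": (960, 1100),
    "V7": (1150, 1200),
    "V8": (1220, 1300),
    "V9": (1450, 1500),
}


def overlapping_regions(start, end, regions=APPROX_REGIONS):
    """Return (name, overlap_bp) for every region that overlaps the
    1-based inclusive interval [start, end]."""
    hits = []
    for name, (r_start, r_end) in regions.items():
        overlap = min(end, r_end) - max(start, r_start) + 1
        if overlap > 0:
            hits.append((name, overlap))
    return hits


# A hypothetical read aligned to positions 400-640 of the reference.
example = overlapping_regions(400, 640)
```

A read of 200-250 bp will often touch two neighbouring regions, so the overlap length is returned to help decide which region it mostly covers.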
Also, does anyone know if there is a tool that I could use where I could blast roughly 100K reads against a database of bacteria and get back the region where my sequences aligned?
thank you,
Jen