Hello Members,
Thanks for your extensive support.
I'm completely new to 16S rRNA amplicon sequencing. I come from this background.
I've been working through the Illumina QIIME tutorial and have had some success and a couple of issues. That said, I am left with a gap in my broad understanding of what has been done and why.
I worked through the Illumina overview tutorial provided on the webpage. It has multiple samples; let's say we're working with only one sample.
My question is about the outputs. This page of QIIME shows a few reads like:
>PC.634_1 FLP3FBN01ELBSX orig_bc=ACAGAGTCGGCT new_bc=ACAGAGTCGGCT bc_diffs=0
CTGGGCCGTGTCTCAGTCCCAATGTGGCCGTTTACCCTCTCAGGCCGGCTACGCATCATCGCC....
>PC.634_2 FLP3FBN01EG8AX orig_bc=ACAGAGTCGGCT new_bc=ACAGAGTCGGCT bc_diffs=0
TTGGACCGTGTCTCAGTTCCAATGTGGGGGCCTTCCTCTCAGAACCCCTATCCATCGAAGGCTT....
>PC.354_3 FLP3FBN01EEWKD orig_bc=AGCACGAGCCTA new_bc=AGCACGAGCCTA bc_diffs=0
TTGGGCCGTGTCTCAGTCCCAATGTGGCCGATCAGTCTCTTAACTCGGCTATGCATCATTGCCTT....
I grepped for L2S155 in seqs.fna (the output after processing the provided data) and got a count of 5066.
Here are a few of the lines that grep returned:
>L2S155_2984 HWI-EAS440_0386:1:30:8089:15545#0/1 orig_bc=ACGATGCGACCA new_bc=ACGATGCGACCA bc_diffs=0
>L2S155_5013 HWI-EAS440_0386:1:32:10532:9113#0/1 orig_bc=ACGATGCGACCA new_bc=ACGATGCGACCA bc_diffs=0
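For reference, the count above can also be reproduced with a minimal Python sketch. It assumes seqs.fna uses the `>SampleID_N ...` header layout shown in the lines above; the function name `count_reads_per_sample` and the placeholder sequences are my own and not part of QIIME.

```python
from collections import Counter
import io


def count_reads_per_sample(fasta_lines):
    """Count FASTA records per sample ID.

    Assumes each header looks like '>SampleID_N ...', so the sample ID is
    the first header token with its final '_N' suffix removed.
    """
    counts = Counter()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        fields = line[1:].split()
        if not fields:
            continue  # bare '>' with no label
        read_label = fields[0]                     # e.g. "L2S155_2984"
        sample_id = read_label.rsplit("_", 1)[0]   # e.g. "L2S155"
        counts[sample_id] += 1
    return counts


# In-memory stand-in for seqs.fna; the sequences are made-up placeholders.
example = io.StringIO(
    ">L2S155_2984 HWI-EAS440_0386:1:30:8089:15545#0/1 orig_bc=ACGATGCGACCA new_bc=ACGATGCGACCA bc_diffs=0\n"
    "ACGTACGT\n"
    ">L2S155_5013 HWI-EAS440_0386:1:32:10532:9113#0/1 orig_bc=ACGATGCGACCA new_bc=ACGATGCGACCA bc_diffs=0\n"
    "ACGTAC\n"
    ">PC.634_1 FLP3FBN01ELBSX orig_bc=ACAGAGTCGGCT new_bc=ACAGAGTCGGCT bc_diffs=0\n"
    "ACGT\n"
)
print(count_reads_per_sample(example)["L2S155"])  # prints 2
```

On the real file, the same function can be given an open file handle, e.g. `with open("seqs.fna") as fh: count_reads_per_sample(fh)`. Unlike a plain grep, this only counts header lines whose sample ID matches exactly.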
Questions:
1- What do the sequences for L2S155 mean?
The 16S gene is ~1,500 bases.
If I add up the sequences from L2S155, they would be longer than the 16S gene, and __mostly__ V3 and/or V4 are sequenced to understand the microbial diversity of a sample.
2- How do the above contigs summarize the V3/V4 region that has been amplified?
3- When assembling a genome, DNA is sheared into smaller pieces, the reads are processed to remove errors and poor-quality data, and then tools assemble the genome of the organism.
What's going on here with respect to these individual contigs?
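To make the arithmetic behind question 1 concrete, here is a small sketch of what I mean by "adding up" the sequences of one sample. It assumes the same `>SampleID_N ...` header layout as seqs.fna; the function name `read_length_summary` is hypothetical and the toy sequences are invented, so it only illustrates the calculation, not the tutorial's actual numbers.

```python
def read_length_summary(fasta_lines, sample_id):
    """Return read count, summed length, and longest read for one sample."""
    lengths = []
    current = None  # length of the record being read; None while skipping
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)  # close the previous matching record
            fields = line[1:].split()
            label = fields[0] if fields else ""
            matches = label.rsplit("_", 1)[0] == sample_id
            current = 0 if matches else None
        elif current is not None:
            current += len(line)  # sequences may span several lines
    if current is not None:
        lengths.append(current)  # close the final record
    return {
        "reads": len(lengths),
        "total_bases": sum(lengths),
        "longest_read": max(lengths, default=0),
    }


# Toy records (invented sequences), two of them for sample L2S155.
toy = [
    ">L2S155_1 toy",
    "ACGTACGT",
    ">L2S155_2 toy",
    "ACGT",
    "AC",
    ">PC.634_1 toy",
    "AAAA",
]
print(read_length_summary(toy, "L2S155"))
```

With thousands of reads per sample, `total_bases` will be far larger than a single 16S gene even though `longest_read` stays short, which is the mismatch I am trying to understand.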
Sorry for the naive questions.
Kindly provide some insights.
Thank you.