I am analysing an environmental barcoding 454 dataset. As this is the first run I have completed, I individually identified all species in all samples prior to bulk PCR and 454 sequencing, so I know what should be in each sample. While it has worked well, the dataset has one quirk: I get 1-4 sequence reads in some samples that represent species that should not be in the sample. I thought they might represent gut contents of my species (as guts were included in the 454 extractions), but these reads appear sporadically and are not always associated with all replicates of a sample. I used a universal tailed design (with two PCR steps to attach the MID), so I am not sure how it could happen, but is it possible for a small number of reads to be associated with the wrong MID? Has anyone else noticed this? Any ideas?
-
I saw a paper in press that describes this source of error, but I forget the title and journal. It describes something along the lines of MID swapping, or errors in the MIDs, that either create new MID combinations not employed in the study or mimic existing MID combinations. I have seen the same thing in our dual-MID data, i.e. we get a bunch of dual-MID combinations that were not employed, at very low sequencing depths.
Just ignore them, in my opinion.
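For anyone who wants to apply that "ignore them" rule consistently, here is a minimal sketch of a per-sample read-count cutoff. The function name, the species labels, the counts, and the `min_reads=5` threshold are all made up for illustration (the threshold is chosen only because the stray reads in the original question number 1-4); a sensible cutoff depends on your own sequencing depth.

```python
def filter_low_count_reads(sample_counts, min_reads=5):
    """Keep only taxa with at least min_reads reads in this sample.

    sample_counts: dict mapping taxon name -> read count for one sample.
    min_reads: hypothetical cutoff, not a recommended value.
    """
    return {taxon: n for taxon, n in sample_counts.items() if n >= min_reads}


# Made-up counts for a single sample: two expected species plus stray reads.
counts = {"species_A": 812, "species_B": 430, "species_C": 3, "species_D": 1}
kept = filter_low_count_reads(counts)
print(kept)
```

Note that a fixed cutoff will also discard genuinely rare species, so it is worth checking it against samples of known composition first.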
-
One other thing that could explain your observations is aerosol contamination with PCR products, or contamination during DNA extraction.
DNA contamination may occur if you are using a robot for extractions where the same pipets are only rinsed between samples. For all next-gen library preps you should follow strict pre-PCR/post-PCR separation, i.e. separate rooms, barrier filter tips, etc.
-
We consistently see something similar (8+ 454 Jr. runs).
In multiplexed sequencing of samples that each have unique MIDs on both the 5' and 3' ends, we get reads in which the MIDs are crossed between samples.
(Samples were prepared on separate days, using pre- and post-PCR rooms.)
This occurs in about 1% of total reads.
Most likely it happens during emulsion PCR, through cross-priming by unincorporated primers (no matter how good your sample prep is).
Hope this is informative.
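A rough way to put a number on this in your own run is to tally every observed (5' MID, 3' MID) pair and see what fraction falls outside the combinations you actually used. This is only a sketch: it assumes the MIDs have already been identified for each read, and the function name, the MID labels, and the read counts are invented for the example.

```python
from collections import Counter


def summarize_mid_pairs(read_pairs, expected_pairs):
    """Count (5' MID, 3' MID) pairs and report the unexpected ones.

    read_pairs: iterable of (five_prime_mid, three_prime_mid), one per read.
    expected_pairs: set of pairs that were actually used in the run.
    Returns (dict of unexpected pair -> count, fraction of reads crossed).
    """
    counts = Counter(read_pairs)
    total = sum(counts.values())
    unexpected = {pair: n for pair, n in counts.items()
                  if pair not in expected_pairs}
    crossed = sum(unexpected.values())
    rate = crossed / total if total else 0.0
    return unexpected, rate


# Made-up run: two samples, each tagged with the same MID on both ends.
expected = {("MID1", "MID1"), ("MID2", "MID2")}
reads = ([("MID1", "MID1")] * 60
         + [("MID2", "MID2")] * 39
         + [("MID1", "MID2")] * 1)
unexpected, rate = summarize_mid_pairs(reads, expected)
print(unexpected, rate)
```

With dual MIDs the crossed reads are at least detectable this way; with a single MID per read, a swapped read looks like a legitimate read from another sample.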
-
Just out of curiosity, I used sfftools to split one of my earlier SFF files from a multiplexed run. In addition to the expected barcodes 1 to 5, I found reads with unused barcodes 6-12 at rates of a few percent each. Interestingly, the tubes with barcodes 6-12 had not yet been opened at the time the libraries were prepared, and I am the only one in the entire institute who uses 454 sequencing. So the possibility of any sort of contamination is out of the question. Either the reagents were already cross-contaminated or there has to be another reason for the emergence of reads with unexpected barcodes.
-
Originally posted by yaximik:
Just out of curiosity, I used sfftools to split one of my earlier SFF files from a multiplexed run. In addition to the expected barcodes 1 to 5, I found reads with unused barcodes 6-12 at rates of a few percent each. Interestingly, the tubes with barcodes 6-12 had not yet been opened at the time the libraries were prepared, and I am the only one in the entire institute who uses 454 sequencing. So the possibility of any sort of contamination is out of the question. Either the reagents were already cross-contaminated or there has to be another reason for the emergence of reads with unexpected barcodes.
The paper I linked to earlier discusses this.