Originally posted by biodiverse:
Thank you so much for your reply. I've attached both my sample sheet and my Illumina adapter list with the barcodes. I didn't prepare the libraries with a kit but it was a standard prep.
Could you please explain how the reads and barcodes are oriented? I have searched the literature for this but am clearly misunderstanding something.
Code:
Index1 p7 end primer showing index read (i7) primer annealed
Index (i7) Read Sequencing Primer
                               <-CACTGACCTCAAGTCTGCACACGAGAAGGCTAG-5'
5'-CAAGCAGAAGACGGCATACGAGATatcacgGTGACTGGAGTTCAGACGTGT-3'
(Obviously your full library fragment will extend beyond the 3' end of the Index1 primer. Truncated for clarity.)
Index1 will be read as CGTGAT
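For anyone who wants to check the orientation in code: the index read in the diagram above is the reverse complement of the index as written in the P7 primer. The following is a minimal Python sketch of that relationship. The helper name `reverse_complement` is only illustrative and is not taken from any Illumina tool. It also shows that the CAAAAG/CTTTTG pair from the original question is related in the same way.

```python
# Illustrative helper (hypothetical name): reverse-complement a DNA index.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


# Index as written (lower case) in the P7 primer shown above.
print(reverse_complement("atcacg").upper())  # CGTGAT

# Barcode as listed in the primer order in the original question.
print(reverse_complement("CAAAAG"))  # CTTTTG
```

So if the primer order lists CAAAAG within the P7 primer, the index read would be expected to report CTTTTG, which is what the DemultiplexSummary showed.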
-
Code:
Index 1 p7 end primer with i7 index read primer annealed
Index (i7) Read Sequencing Primer
                               <-CACTGACCTCAAGTCTGCACACGAGAAGGCTAG-5'
5'-CAAGCAGAAGACGGCATACGAGATatcacgGTGACTGGAGTTCAGACGTGT-3'
Index 1 will be read as CGTGAT
(Obviously your actual library molecules would extend out from the 3' end of the Index 1 primer, further complementary to the i7 index read sequencing primer.)
-
Thank you so much for your reply. I've attached both my sample sheet and my Illumina adapter list with the barcodes. I didn't prepare the libraries with a kit but it was a standard prep.
Could you please explain how the reads and barcodes are oriented? I have searched the literature for this but am clearly misunderstanding something.
-
First things first: assuming that your libraries follow the standard Illumina design, your barcodes are NOT part of R1. In the standard Illumina library design and run configuration, the index read is completely separate from R1 and R2. Unless your sequencing provider also gave you the index read file (it would have something like "I1" in place of R1 or R2 in the file name), you don't have the information needed to demultiplex your reads.
Can you provide a copy of the SampleSheet.csv file that was used for the MiSeq run? That will show the indexes actually used by the MiSeq to demultiplex your data. It sounds very likely that the SampleSheet.csv had the indexes entered in the wrong orientation.
To determine the proper orientation we would need to see an example of your completed adapter sequences, or know what kit was used to construct your libraries.
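If you do get the index read file, the idea can be sketched roughly as below. This is a minimal Python sketch, not how the MiSeq software or any particular tool does it. It assumes the R1 and I1 FASTQ files list the same reads in the same order, that indexes are given exactly as the instrument reads them (e.g. CTTTTG rather than CAAAAG), and that one mismatch is tolerated. The function names, sample names and the three test records are all made up for illustration.

```python
import io
from collections import defaultdict
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from an open FASTQ text handle."""
    while True:
        block = list(islice(handle, 4))
        if len(block) < 4:
            return
        header, seq, _plus, qual = (line.rstrip("\n") for line in block)
        yield header, seq, qual


def hamming(a, b):
    """Count mismatching positions; unequal lengths are scored as fully mismatched."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_seq, sample_indexes, max_mismatches=1):
    """Return the single sample within max_mismatches, or None if none or ambiguous."""
    hits = [name for name, idx in sample_indexes.items()
            if hamming(index_seq, idx) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


def demultiplex(r1_handle, i1_handle, sample_indexes, max_mismatches=1):
    """Bin R1 records by the sample assigned from the matching I1 record.

    Assumes both files are in the same read order; zip() stops silently
    at the shorter file, so check the record counts separately.
    """
    bins = defaultdict(list)
    for r1, i1 in zip(read_fastq(r1_handle), read_fastq(i1_handle)):
        sample = assign_sample(i1[1], sample_indexes, max_mismatches)
        bins[sample or "Undetermined"].append(r1)
    return bins


# Made-up records purely to exercise the functions (not real run data).
r1 = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nTTTTCCCC\n+\nIIIIIIII\n"
    "@read3\nGGGGAAAA\n+\nIIIIIIII\n"
)
i1 = io.StringIO(
    "@read1\nCTTTTG\n+\nIIIIII\n"   # exact match to sampleA
    "@read2\nCTATTG\n+\nIIIIII\n"   # one mismatch to sampleA
    "@read3\nGGGGGG\n+\nIIIIII\n"   # matches nothing
)
samples = {"sampleA": "CTTTTG", "sampleB": "CGTGAT"}  # hypothetical sample names
bins = demultiplex(r1, i1, samples)
for sample, records in bins.items():
    print(sample, len(records))
```

Running the same comparison with the indexes reverse-complemented is a quick way to see whether the sample sheet orientation is the problem: if nearly everything moves out of Undetermined, the indexes were entered the wrong way round.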
-
Discrepancies in Demultiplexing programs - help request
I have Illumina PE data from a MiSeq run and am finding large discrepancies in the number of reads recovered while demultiplexing.
Originally, the fastq files provided with the run had very few reads per individual and a large (>4Gb) Undetermined file. I investigated the DemultiplexSummary provided with the run and found that the top 30 indexes found were in fact my indexes but for some reason (??) were not de-multiplexed correctly.
Using the FASTX barcode splitter (Hannon Lab), allowing a 1-nucleotide mismatch, I was able to recover a small number of additional reads from the Undetermined file, but not nearly the number stated in the summary.
I then imported the raw data into Geneious and, after trimming the adapters with BBDuk, demultiplexed and found significantly more reads, but still only approximately half of the number that should be present.
Some Numbers:
Barcode CAAAAG/CTTTTG
DemultiplexSummary: 1,119,171 reads
Fastq file unaltered from run: 1,428 reads
Fastx Barcode Splitter (on undetermined file): 118,031 reads
Geneious: 574,498 reads
My questions are:
1 - Why are the different programs giving such disparate results?
2 - Am I misunderstanding the orientation of the barcodes in the reads and thus perhaps searching for them incorrectly? It is my understanding that the "adapter - barcode - read" order should have the barcode as the first 6 bases in the R1 read after adapter trimming (in my example CAAAAG). R2 should not have(?) the barcode, or at least I should not have to search for a barcode in R2, as I have paired the data? I recovered the reads in Geneious using CTTTTG, as that was the index listed in the DemultiplexSummary, but my barcode as listed in my primer order was CAAAAG, so I am concerned that I am misunderstanding a fundamental piece of the puzzle here.
3 - Most importantly - how do I recover the 1 million+ reads?!