Jon: This appears to be a single sample even though the barcode read is included as a separate file in the SRA archive. See the corresponding ENA record (http://www.ebi.ac.uk/ena/data/view/SRR343051).
In short, demultiplexing is not needed for this sample. You can use the _1 and _3 files as the R1/R2 read pair.
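One way to confirm that the barcode file is dominated by a single barcode is to tally its sequences. This is only a minimal Python sketch, not part of any tool mentioned in this thread: the function name `count_barcodes` is made up for illustration, and it assumes plain four-line FASTQ records like those in the fastq-dump output posted in the original question.

```python
from collections import Counter
from itertools import islice


def count_barcodes(fastq_path, limit=None):
    """Tally the sequences in a FASTQ file (here, the barcode reads).

    Assumes unwrapped four-line records; `limit` optionally caps the
    number of records inspected.
    """
    counts = Counter()
    with open(fastq_path) as handle:
        # The sequence is the 2nd line of every 4-line FASTQ record.
        seq_lines = islice(handle, 1, None, 4)
        if limit is not None:
            seq_lines = islice(seq_lines, limit)
        for line in seq_lines:
            counts[line.strip()] += 1
    return counts
```

For example, `count_barcodes("SRR343051_2.fastq").most_common(5)` would list the five most frequent barcodes. One dominant barcode alongside a scatter of rare variants would be consistent with a single sample plus sequencing errors in the index read.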
-
Originally posted by GenoMax: Hopefully you have the information about barcode <--> sample.
Try this script for demultiplexing: http://qiime.org/scripts/split_libraries_fastq.html
Thanks
-
@Jon B: You have not used the following fastq-dump option:
-F | --origfmt Defline contains only original sequence name.
-
Hopefully you have the information about barcode <--> sample.
Try this script for demultiplexing: http://qiime.org/scripts/split_libraries_fastq.html
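If the barcode <--> sample table is available, the idea of splitting by a separate barcode file while keeping pairs together can be sketched roughly as follows. This is an illustrative Python sketch and not the QIIME script linked above: the function names, the `unassigned` bucket, and the output file naming are all made up here. It assumes the three fastq-dump files list spots in the same order, use unwrapped four-line records, and carry headers of the form `@SRR343051.1.2 ...` as in the original question.

```python
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as lists of four lines (assumes unwrapped records)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield record


def spot_id(header):
    """Drop '@' and the trailing read number: '@SRR343051.1.2 ...' -> 'SRR343051.1'."""
    return header.split()[0][1:].rsplit(".", 1)[0]


def demultiplex(r1_path, index_path, r2_path, barcode_to_sample, out_prefix):
    """Split a read pair into per-sample files; returns pairs written per sample."""
    outputs = {}  # sample -> (R1 handle, R2 handle)
    counts = {}
    try:
        with open(r1_path) as r1, open(index_path) as idx, open(r2_path) as r2:
            # zip stops at the shortest file, so unequal lengths go unnoticed here.
            for rec1, rec_i, rec2 in zip(read_fastq(r1), read_fastq(idx), read_fastq(r2)):
                if not (spot_id(rec1[0]) == spot_id(rec_i[0]) == spot_id(rec2[0])):
                    raise ValueError(f"files out of sync at {rec1[0].strip()}")
                # Reads whose barcode is not in the mapping go to 'unassigned'.
                sample = barcode_to_sample.get(rec_i[1].strip(), "unassigned")
                if sample not in outputs:
                    outputs[sample] = (
                        open(f"{out_prefix}{sample}_R1.fastq", "w"),
                        open(f"{out_prefix}{sample}_R2.fastq", "w"),
                    )
                outputs[sample][0].writelines(rec1)
                outputs[sample][1].writelines(rec2)
                counts[sample] = counts.get(sample, 0) + 1
    finally:
        for out1, out2 in outputs.values():
            out1.close()
            out2.close()
    return counts
```

A call might look like `demultiplex("SRR343051_1.fastq", "SRR343051_2.fastq", "SRR343051_3.fastq", {"TTGAGCCT": "sampleA"}, "demux_")`, where the sample name is a placeholder. Because both mates are written in the same loop iteration, the R1 and R2 output files stay in matching order. Exact barcode matching is used, so barcodes with a sequencing error end up in the `unassigned` files.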
-
Illumina paired-end sra data in three separate files - what next?
Hi,
I have used fastq-dump to split paired-end Illumina data. I get three files: one for each read of the pair and one file with barcodes. This is transcriptome data and I want to do de novo assembly. I have two questions:
First, the SRA website where I got the data mentions only one barcode, while the barcodes file contains several different ones. Should I only use the sequences with the barcode given on the website?
Second, how can I split the files according to the different barcodes while keeping the pairs? I looked at the fastx toolkit and the qiime split_libraries, but I don't think my Illumina barcodes are included in the sequences themselves.
Examples of the files:
Code:
-bash-4.1$ head SRR343051_1.fastq
@SRR343051.1.1 B0A05ABXX110604:3:1101:18610:1087 length=101
NTCTTCTTGCGTACGCATTTGGACTTAATCCTAATCTTGGATTTGTTTCTTCTAAATATGTACCAATCACAATGCTTGAATCTCTTATTATAATATATTTA
+SRR343051.1.1 B0A05ABXX110604:3:1101:18610:1087 length=101
#####################################################################################################
@SRR343051.2.1 B0A05ABXX110604:3:1101:14471:1088 length=101
NCGAAGGGCAATGTAATAAAGTTTATTATTATGTGTGTACAATGCAAAAAAAAGGGACTCGACTCTAATCCTGGTCGAAGCACAGGGCAAGACCACCAATG
+SRR343051.2.1 B0A05ABXX110604:3:1101:14471:1088 length=101
#####################################################################################################
@SRR343051.3.1 B0A05ABXX110604:3:1101:20187:1088 length=101
NATCATAATCTTCAATTTTCAAATTACTCTTGTTGCCTTTGGAAAGATCGTTAGTTTTCGGGTCTTTTATATTTTACTATTGCTTTATACTTGTTTTCACT
-bash-4.1$ head SRR343051_2.fastq
@SRR343051.1.2 B0A05ABXX110604:3:1101:18610:1087 length=8
TTGAGCCT
+SRR343051.1.2 B0A05ABXX110604:3:1101:18610:1087 length=8
CCCFFFFF
@SRR343051.2.2 B0A05ABXX110604:3:1101:14471:1088 length=8
TTGAGCCT
+SRR343051.2.2 B0A05ABXX110604:3:1101:14471:1088 length=8
CCCFFFFF
@SRR343051.3.2 B0A05ABXX110604:3:1101:20187:1088 length=8
TTGAGCCT
-bash-4.1$ head SRR343051_3.fastq
@SRR343051.1.3 B0A05ABXX110604:3:1101:18610:1087 length=101
GAGAAAATAAAATATGAGAAAATAGTAAAGAAGAAATTAACTGATATAATTACAGAAGAGAATGAATAATTGAAACAATTAAAAAATCATTAAATGAAGAT
+SRR343051.1.3 B0A05ABXX110604:3:1101:18610:1087 length=101
CCCFFFFFGHHHHJJJIJIJJIJJJHJIJJJJJJJJJJJJJJJJJJJJHIGIIIIGHHIJIJJJJJJIJJJJJEGIIJJJJGFHHFFCEEEECCDDDCCCC
@SRR343051.2.3 B0A05ABXX110604:3:1101:14471:1088 length=101
CTGATGGTGTACGTTGAACTTGGTCTGGTGGTGCTGATTCTGAGCAACAGTCTGCGTCGCGCCGCCTCCTTCTTCCTGATTCTCTCGCTGGCCGTGTCGCT
+SRR343051.2.3 B0A05ABXX110604:3:1101:14471:1088 length=101
BCCFFFFDHHHHHJJIIGIJJJJHIJJIIJJFHIJJIJJJJIIJJJJJJJJIIJJIGIJJHFFDDDBDDDDDDDDDDDCDDDDCDD<BD39??&09B?9A<
@SRR343051.3.3 B0A05ABXX110604:3:1101:20187:1088 length=101
AGGTGATTCATCATCTTCAAAATATTAATAAAAAGTATATTAATATAAAGACAATTATATATCGAAAGTGAATAGTACTGTGAAGGAAAGTAGGAAATATT