  • JonB
    replied
    Originally posted by GenoMax
    Jon: This appears to be a single sample even though the barcode read is included as a separate file in the SRA archive. See the corresponding ENA record (http://www.ebi.ac.uk/ena/data/view/SRR343051).

    In short, demultiplexing is not needed for this sample. You can use the _1 and _3 files as the R1/R2 read pair.
    Thank you!


  • GenoMax
    replied
    Jon: This appears to be a single sample even though the barcode read is included as a separate file in the SRA archive. See the corresponding ENA record (http://www.ebi.ac.uk/ena/data/view/SRR343051).

    In short, demultiplexing is not needed for this sample. You can use the _1 and _3 files as the R1/R2 read pair.
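
Before using the _1 and _3 files as a pair, it may be worth confirming that they list the same spots in the same order. A minimal Python sketch, assuming four-line FASTQ records and deflines shaped like the ones in the original post, where the second whitespace-separated field is the instrument read name. The function names are made up for illustration:

```python
from itertools import zip_longest


def original_names(lines):
    """Yield the instrument read name (second defline field) of each record.

    Assumes exactly four lines per record, so headers sit at lines 0, 4, 8, ...
    """
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield line.split()[1]


def first_mismatch(r1_lines, r2_lines):
    """Return the 1-based index of the first record whose names differ.

    Returns None when both inputs hold the same names in the same order.
    A length difference counts as a mismatch at the first missing record.
    """
    pairs = zip_longest(original_names(r1_lines), original_names(r2_lines))
    for n, (name1, name2) in enumerate(pairs, start=1):
        if name1 != name2:
            return n
    return None
```

Called on open handles for SRR343051_1.fastq and SRR343051_3.fastq, `first_mismatch` returns None if every record lines up.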


  • JonB
    replied
    Originally posted by GenoMax
    Hopefully you have the information about barcode <--> sample.

    Try this script for demultiplexing: http://qiime.org/scripts/split_libraries_fastq.html
    GenoMax, do you mind telling me how I could use this script? I looked at it before, but I don't understand how it assigns my reads to files based on the barcodes, or how it deals with the two reads of each pair. Can I still use it on my data with the pairs in separate files?

    Thanks
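
The general idea behind demultiplexing with a separate index file can be sketched in a few lines of Python. This is not how split_libraries_fastq.py is implemented, only an illustration: the three files are read in lockstep, so record N of the barcode file decides where record N of both read files goes, and the pair stays together. It assumes four-line FASTQ records in the same order in all three files. The function names and the barcode-to-sample mapping are hypothetical.

```python
from itertools import islice


def fastq_records(lines):
    """Yield each FASTQ record as a list of four stripped lines."""
    lines = iter(lines)
    while True:
        record = list(islice(lines, 4))
        if len(record) < 4:
            return
        yield [line.rstrip("\n") for line in record]


def demultiplex(r1_lines, index_lines, r2_lines, sample_for_barcode):
    """Yield (sample, r1_record, r2_record) for every read pair.

    The barcode is the sequence line of the matching index record.
    Barcodes missing from sample_for_barcode are labelled "unassigned".
    """
    triples = zip(fastq_records(r1_lines),
                  fastq_records(index_lines),
                  fastq_records(r2_lines))
    for rec1, index_rec, rec2 in triples:
        sample = sample_for_barcode.get(index_rec[1], "unassigned")
        yield sample, rec1, rec2
```

In practice each yielded pair would be appended to a per-sample R1 and R2 output file, which keeps the two files for any sample in the same order.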


  • JonB
    replied
    Originally posted by GenoMax
    @Jon B: You have not used the

    -F | --origfmt Defline contains only original sequence name.
    option with fastq-dump so you have the SRR* in the names. Just keep that in mind.
    Thanks! I didn't see that option.


  • GenoMax
    replied
    @Jon B: You have not used the

    -F | --origfmt Defline contains only original sequence name.
    option with fastq-dump so you have the SRR* in the names. Just keep that in mind.
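
If the files have already been dumped without that option, the original names can still be recovered from the second field of each defline. A rough Python sketch, assuming four-line records and deflines like `@SRR343051.1.1 B0A05ABXX110604:3:1101:18610:1087 length=101`. `strip_sra_prefix` is a hypothetical helper, and its output is not guaranteed to match what fastq-dump -F writes byte for byte:

```python
def strip_sra_prefix(lines):
    """Rewrite deflines so they carry only the original read name.

    Works by line position (four lines per record) rather than by the
    leading character, because quality strings may also start with
    '@' or '+'.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        position = i % 4
        if position == 0:
            # "@SRR343051.1.1 NAME length=101" -> "@NAME"
            yield "@" + line.split()[1]
        elif position == 2:
            # The separator line may legally be a bare "+".
            yield "+"
        else:
            yield line
```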


  • GenoMax
    replied
    Hopefully you have the information about barcode <--> sample.

    Try this script for demultiplexing: http://qiime.org/scripts/split_libraries_fastq.html


  • Illumina paired-end sra data in three separate files - what next?

    Hi,

    I used fastq-dump to split paired-end Illumina data and got three files: one for each read of the pair and one containing barcodes. This is transcriptome data, and I want to do a de novo assembly. I have two questions:

    First, the SRA website where I got the data mentions only one barcode, while the barcodes file contains several different ones. Should I use only the sequences with the barcode given on the website?

    Second, how can I split the files by barcode while keeping the pairs together? I looked at the FASTX-Toolkit and QIIME's split_libraries, but I don't think my Illumina barcodes are included in the sequences themselves.

    Examples of the files:

    Code:
    -bash-4.1$ head SRR343051_1.fastq 
    @SRR343051.1.1 B0A05ABXX110604:3:1101:18610:1087 length=101
    NTCTTCTTGCGTACGCATTTGGACTTAATCCTAATCTTGGATTTGTTTCTTCTAAATATGTACCAATCACAATGCTTGAATCTCTTATTATAATATATTTA
    +SRR343051.1.1 B0A05ABXX110604:3:1101:18610:1087 length=101
    #####################################################################################################
    @SRR343051.2.1 B0A05ABXX110604:3:1101:14471:1088 length=101
    NCGAAGGGCAATGTAATAAAGTTTATTATTATGTGTGTACAATGCAAAAAAAAGGGACTCGACTCTAATCCTGGTCGAAGCACAGGGCAAGACCACCAATG
    +SRR343051.2.1 B0A05ABXX110604:3:1101:14471:1088 length=101
    #####################################################################################################
    @SRR343051.3.1 B0A05ABXX110604:3:1101:20187:1088 length=101
    NATCATAATCTTCAATTTTCAAATTACTCTTGTTGCCTTTGGAAAGATCGTTAGTTTTCGGGTCTTTTATATTTTACTATTGCTTTATACTTGTTTTCACT
    
    -bash-4.1$ head SRR343051_2.fastq 
    @SRR343051.1.2 B0A05ABXX110604:3:1101:18610:1087 length=8
    TTGAGCCT
    +SRR343051.1.2 B0A05ABXX110604:3:1101:18610:1087 length=8
    CCCFFFFF
    @SRR343051.2.2 B0A05ABXX110604:3:1101:14471:1088 length=8
    TTGAGCCT
    +SRR343051.2.2 B0A05ABXX110604:3:1101:14471:1088 length=8
    CCCFFFFF
    @SRR343051.3.2 B0A05ABXX110604:3:1101:20187:1088 length=8
    TTGAGCCT
    
    -bash-4.1$ head SRR343051_3.fastq 
    @SRR343051.1.3 B0A05ABXX110604:3:1101:18610:1087 length=101
    GAGAAAATAAAATATGAGAAAATAGTAAAGAAGAAATTAACTGATATAATTACAGAAGAGAATGAATAATTGAAACAATTAAAAAATCATTAAATGAAGAT
    +SRR343051.1.3 B0A05ABXX110604:3:1101:18610:1087 length=101
    CCCFFFFFGHHHHJJJIJIJJIJJJHJIJJJJJJJJJJJJJJJJJJJJHIGIIIIGHHIJIJJJJJJIJJJJJEGIIJJJJGFHHFFCEEEECCDDDCCCC
    @SRR343051.2.3 B0A05ABXX110604:3:1101:14471:1088 length=101
    CTGATGGTGTACGTTGAACTTGGTCTGGTGGTGCTGATTCTGAGCAACAGTCTGCGTCGCGCCGCCTCCTTCTTCCTGATTCTCTCGCTGGCCGTGTCGCT
    +SRR343051.2.3 B0A05ABXX110604:3:1101:14471:1088 length=101
    BCCFFFFDHHHHHJJIIGIJJJJHIJJIIJJFHIJJIJJJJIIJJJJJJJJIIJJIGIJJHFFDDDBDDDDDDDDDDDCDDDDCDD<BD39??&09B?9A<
    @SRR343051.3.3 B0A05ABXX110604:3:1101:20187:1088 length=101
    AGGTGATTCATCATCTTCAAAATATTAATAAAAAGTATATTAATATAAAGACAATTATATATCGAAAGTGAATAGTACTGTGAAGGAAAGTAGGAAATATT
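
One way to see how many distinct barcodes the _2 file really contains is to tally its sequence lines. A minimal Python sketch, assuming four-line FASTQ records; `count_barcodes` is an illustrative name:

```python
from collections import Counter


def count_barcodes(lines):
    """Count each distinct sequence in a four-line-per-record FASTQ stream.

    The sequence is the second line of every record, i.e. lines 1, 5, 9, ...
    when counting from zero.
    """
    return Counter(
        line.strip() for i, line in enumerate(lines) if i % 4 == 1
    )
```

Passing an open handle for SRR343051_2.fastq and calling `.most_common(10)` on the result lists the most frequent barcodes. If one barcode accounts for nearly all reads and the others differ from it by a base or two, the extra barcodes are likely sequencing errors rather than separate samples.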
