
  • CLC Genomics Workbench and Illumina demultiplexing

    Hi there!
    After a MiSeq Nextera XT run we got a lot of undetermined data (barcode sequences with one or more mismatches). We would rather not throw away so much data, and we are looking for a way to demultiplex sequences with one or more mismatches in the barcode.
    Does CLC Genomics Workbench have this function? There is an option to process tagged sequences, but can mismatched barcodes be processed?

    Thank you for any answers!

  • #2
    I am not sure CLC can help, since MiSeq Reporter apparently does not add the tags to the "undetermined" reads file it produces. I am going by the info provided by dsobral in a recent thread that is in the list below.

    In cases such as this you will need to demultiplex the MiSeq data using the "bcl2fastq" software that is available here: http://support.illumina.com/download...tware_184.ilmn. If you are not comfortable using command-line tools, you will need to find someone who is reasonably proficient with Linux and has access to a Linux server.

    You will need:

    1. Full data folder from your MiSeq run
    2. Working install of bcl2fastq (in addition to the Illumina link above, see this thread: http://seqanswers.com/forums/showthread.php?t=34844). You can allow up to 2 mismatches per tag read.
    3. A SampleSheet.csv. An example of the one you will need to create to run bcl2fastq is in post #14 in this thread.


    NOTE: If this run was over-clustered (density > 1300-1400 clusters/mm^2 for v3 reagents), the chances of recovering useful data are slim.



    • #3
      Are we talking about an index or a barcode?
      For indices, use CASAVA by Illumina.
      For inline barcodes, use jMHC.



      • #4
        Originally posted by GenoMax View Post
        2. Working install of bcl2fastq (in addition to the Illumina link above, see this thread: http://seqanswers.com/forums/showthread.php?t=34844). You can allow up to 2 mismatches per tag read.
        bcl2fastq, just like CASAVA before it, allows only zero or one mismatch in index recognition.



        • #5
          Originally posted by Etherella View Post
          Hi there!
          After a MiSeq Nextera XT run we got a lot of undetermined data (barcode sequences with one or more mismatches). We would rather not throw away so much data, and we are looking for a way to demultiplex sequences with one or more mismatches in the barcode.
          Does CLC Genomics Workbench have this function? There is an option to process tagged sequences, but can mismatched barcodes be processed?

          Thank you for any answers!
          As GenoMax has already pointed out, it is possible to get the "undetermined indices" when demultiplexing with CASAVA/bcl2fastq (no idea why Illumina does not write the index sequences in the header of the MiSeq undetermined files).

          But maybe it is enough to ask your sequencing provider to rerun the demultiplexing with one mismatch allowed?



          • #6
            To my knowledge, CLC does demultiplexing only for in-line barcodes, not for barcodes in separate barcode reads. CLC assumes that such demultiplexing is done by the Illumina system software. It is relatively easy to demultiplex with scripts that tolerate one mismatch (examples are already mentioned) or more (there certainly are better options, but we have a quick and dirty script if desired).
            Last edited by luc; 02-25-2014, 01:55 PM.
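As a rough illustration of what mismatch-tolerant matching involves (this is not luc's script, and the sample names and barcode sequences below are made up), a minimal Python sketch could compare each index read against the known barcodes by Hamming distance and accept only a unique best match:

```python
from typing import Dict, Optional


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read: str, barcodes: Dict[str, str],
                  max_mismatches: int = 1) -> Optional[str]:
    """Return the sample with the unique closest barcode, or None.

    None is returned when no barcode is within max_mismatches, or when
    two barcodes tie for the best distance (ambiguous assignment).
    """
    hits = sorted(
        (hamming(index_read, barcode), sample)
        for sample, barcode in barcodes.items()
    )
    hits = [hit for hit in hits if hit[0] <= max_mismatches]
    if not hits:
        return None
    if len(hits) > 1 and hits[0][0] == hits[1][0]:
        return None
    return hits[0][1]


# Hypothetical barcodes, for illustration only.
BARCODES = {"sampleA": "ACGTACGT", "sampleB": "TTGCAAGC"}
```

How many mismatches can be tolerated safely depends on the barcode set: if the smallest pairwise Hamming distance between any two barcodes is d, at most floor((d - 1) / 2) mismatches can be corrected without risk of ambiguity.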



            • #7
              Hi,

              Does anyone have a Perl or Python script that can pull out the forward reads from the R1 file and the corresponding reverse reads from the R2 file? CLC Workbench does give paired sequences, but as luc mentioned, it looks for inline barcodes.
              I want a script that works similarly and tolerates some mismatches. I would also expect it to look for the barcode in separate barcode reads.

              Many thanks
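A minimal Python sketch of what such a script might look like is below. It assumes (none of this comes from the thread) that the index reads are available as their own uncompressed FASTQ file with records in the same order as R1 and R2, and that the barcode is compared against the start of each index read. All function names, file names and sample names are hypothetical.

```python
import os


def read_fastq(path):
    """Yield (header, sequence, separator, quality) from an uncompressed FASTQ."""
    with open(path) as handle:
        while True:
            record = [handle.readline().rstrip("\n") for _ in range(4)]
            if not record[0]:  # end of file
                return
            yield tuple(record)


def match_sample(index_seq, barcodes, max_mismatches):
    """Return the only sample within max_mismatches, else 'undetermined'."""
    hits = []
    for sample, barcode in barcodes.items():
        if len(index_seq) < len(barcode):
            continue
        # zip stops at the barcode length, so only the read prefix is compared.
        distance = sum(x != y for x, y in zip(index_seq, barcode))
        if distance <= max_mismatches:
            hits.append(sample)
    return hits[0] if len(hits) == 1 else "undetermined"


def demultiplex_pairs(r1_path, r2_path, index_path, barcodes, out_dir,
                      max_mismatches=1):
    """Write each R1/R2 pair to per-sample files; return pair counts per sample."""
    os.makedirs(out_dir, exist_ok=True)
    handles, counts = {}, {}
    try:
        # Assumes the three files are in the same order; zip silently stops
        # at the shortest file, so check the record counts separately.
        for r1, r2, idx in zip(read_fastq(r1_path), read_fastq(r2_path),
                               read_fastq(index_path)):
            sample = match_sample(idx[1], barcodes, max_mismatches)
            if sample not in handles:
                handles[sample] = (
                    open(os.path.join(out_dir, f"{sample}_R1.fastq"), "w"),
                    open(os.path.join(out_dir, f"{sample}_R2.fastq"), "w"),
                )
            for handle, record in zip(handles[sample], (r1, r2)):
                handle.write("\n".join(record) + "\n")
            counts[sample] = counts.get(sample, 0) + 1
    finally:
        for pair in handles.values():
            for handle in pair:
                handle.close()
    return counts
```

A call might look like `demultiplex_pairs("R1.fastq", "R2.fastq", "I1.fastq", {"sampleA": "ACGTACGT"}, "demux_out")`. Reads whose index matches no barcode, or more than one, go to the `undetermined` files so that nothing is discarded.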



              • #8
                Hi Bioinform,

                The allPrep-8.py script from the barcode-tools set will do what you want and more.
                With the "-D" option it will only demultiplex (and not do adapter or quality trimming).



                • #9
                  More options:
                  jMHC
                  fastx_barcode_splitter

                  But I have a question: is there any software that also detects insertions/deletions in the barcodes? I want to use something to repair Ion Torrent barcodes, but the software above only detects mismatches.
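One general way to tolerate insertions and deletions as well as substitutions is to score barcodes by edit (Levenshtein) distance instead of counting mismatches position by position. The Python sketch below illustrates that idea only; it is not taken from any of the tools named above, and the function names, sample names and barcodes are hypothetical.

```python
from typing import Dict, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, base_a in enumerate(a, start=1):
        current = [i]
        for j, base_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion from a
                current[j - 1] + 1,                    # insertion into a
                previous[j - 1] + (base_a != base_b),  # match or substitution
            ))
        previous = current
    return previous[-1]


def closest_barcode(observed: str, barcodes: Dict[str, str],
                    max_edits: int = 1) -> Optional[str]:
    """Return the sample with the unique best barcode within max_edits, else None."""
    scored = sorted(
        (edit_distance(observed, barcode), sample)
        for sample, barcode in barcodes.items()
    )
    scored = [item for item in scored if item[0] <= max_edits]
    if not scored:
        return None
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None  # tie between two barcodes: ambiguous
    return scored[0][1]
```

One caveat: with in-line barcodes an indel also shifts where the barcode ends and the insert begins, so a real tool would need to decide how many leading bases to trim after matching. This sketch only scores a barcode sequence that has already been extracted.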
