  • Demultiplexing and CASAVA 1.7

    Hello,
    I am new to bioinformatics. I need a tool (script) to demultiplex
    sequences from the GAIIx before running CASAVA (1.7). We do not use the index provided by Illumina. Thank you in advance for your help.

  • #2
    Originally posted by tonio100680
    Hello,
    I am new to bioinformatics. I need a tool (script) to demultiplex
    sequences from the GAIIx before running CASAVA (1.7). We do not use the index provided by Illumina. Thank you in advance for your help.
    If you don't use an index read, CASAVA is of no use for demultiplexing.
    Have a look at 'sabre' (https://github.com/ucdavis-bioinformatics/sabre) or FastX-Toolkit (http://hannonlab.cshl.edu/fastx_tool...splitter_usage) amongst others.

    hth, Sven



    • #3
      You could try Novobarcode. It's included in the Novoalign download at www.novocraft.com. It is free to use and no license is required.

      Colin



      • #4
        Thank you for the help !

        After conversion, my files are in QSEQ format. To use Novobarcode, do I have to convert them to FASTQ? If so, do you have a simple solution?



        • #5
          You might want to read the following thread as a starting point: http://seqanswers.com/forums/showthread.php?t=1801

          hth, Sven



          • #6
            I really am a bioinformatics novice! I just want a simple converter... I'm panicking, and the biologists keep pressing me.



            • #7
              'bfast', as mentioned in the thread, has its own converter,
              BFAST facilitates the fast and accurate mapping of short reads to reference sequences, where mapping billions of short reads with variants is of…


              Download the archive, untar it, and look in the 'scripts' directory, there you'll find a perl script called 'ill2fastq.pl'. I never used it, but it should do the job.
              There are probably a lot more tools... maybe you could have a look at GALAXY, but I am not sure if they provide qseq-to-fastq conversion.
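
For readers who only want to see what a qseq-to-FASTQ conversion involves, here is a minimal sketch in Python (it is not one of the tools named in this thread). It assumes the usual 11 tab-separated qseq columns (machine, run, lane, tile, x, y, index, read number, sequence, quality, filter flag) and quality strings in the Illumina offset-64 encoding. The function names, the read-name layout and the `phred64_to_33` switch are illustrative choices, not part of any existing program.

```python
def qseq_to_fastq(line, phred64_to_33=False):
    """Convert one tab-separated qseq line into a 4-line FASTQ record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError(f"expected 11 qseq fields, got {len(fields)}")
    machine, run, lane, tile, x, y, index, read_no, seq, qual, _flag = fields

    # qseq writes uncalled bases as '.', FASTQ tools expect 'N'.
    seq = seq.replace(".", "N")

    if phred64_to_33:
        # Optional re-encoding from offset 64 to offset 33 (Sanger style).
        qual = "".join(chr(ord(c) - 31) for c in qual)

    # Read name layout is an assumption modelled on pre-1.8 Illumina headers.
    name = f"{machine}_{run}:{lane}:{tile}:{x}:{y}#{index}/{read_no}"
    return f"@{name}\n{seq}\n+\n{qual}\n"


def convert(in_handle, out_handle, phred64_to_33=False):
    """Stream every non-empty qseq line from in_handle to out_handle as FASTQ."""
    for line in in_handle:
        if line.strip():
            out_handle.write(qseq_to_fastq(line, phred64_to_33))
```

To use it on a file, open the qseq file for reading and the FASTQ file for writing and pass both handles to `convert`. Leave `phred64_to_33` off if the downstream tool expects the original Illumina quality encoding.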

              When I read your original post again, it seems you want to use CASAVA 1.7 for mapping? Be aware that 1.7 needs qseq format for input files, however the fresh 1.8 takes fastq as input ...

              hth, Sven



              • #8
                Originally posted by tonio100680
                Thank you for the help !

                After conversion, my files are in QSEQ format. To use Novobarcode, do I have to convert them to FASTQ? If so, do you have a simple solution?
                Recent versions of Novoalign will process qseq files. The latest version will accept 3 read files, with the index tag in its own read file. Output is then qseq.

                Earlier versions could only accept 2 qseq files and would write the demultiplexed files in fastq format. In that case you run novobarcode twice: once for read1 and the index read, then again for read2 and the index read. You can still do this with the latest version if you want qseq to fastq conversion.

                Colin



                • #9
                  That's right, I want to use CASAVA 1.7 because 1.8 is not available in France...
                  I cannot use novobarcode because its output format is not compatible with CASAVA's input format, so I'm looking for a simple alternative demultiplexer...



                  • #10
                    CASAVA 1.8 is already available at iCom. If you have your tags directly attached to your (first) read, you'll probably need to write your own demultiplexer that writes qseq again (this shouldn't be too hard if you know someone familiar with e.g. perl).

                    Or a simple one-liner, assuming the barcode sequence 'ACGTACGT' (not removing it),
                    Code:
                    perl -lane 'print if($F[8]=~/^ACGTACGT/)' SampleABC_qseq.txt > SampleA_NewQseq.txt
                    You could then use the 'qseq-mask' (USE_BASES) option of GERALD to skip these bases.

                    Or just a starting point (not tested thoroughly), a simple script looking for all seqs starting with $barcode and removing it from seq and quals:
                    Code:
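                    #!/usr/bin/env perl
                    # Keep only the qseq reads that start with the given barcode and
                    # trim the barcode from the sequence (column 9) and the
                    # qualities (column 10). Output is qseq again.
                    
                    use warnings;
                    use strict;
                    
                    my $barcode = shift
                        or die "usage: $0 BARCODE qseq_files... > new_qseq.txt\n";
                    my $length  = length($barcode);
                    
                    while (<>) {
                    
                        chomp;
                        my @line = split /\t/;    # qseq columns are tab-separated
                        next unless ($line[8] =~ /^\Q$barcode\E/);
                        
                        # drop the barcode bases and their quality values
                        my $s = substr($line[8], $length);
                        my $q = substr($line[9], $length);
                        
                        print join("\t", @line[0..7], $s, $q, $line[10]), "\n";
                    }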
                    Run it with the barcode sequence as the first argument and the qseq files as the remaining arguments. The script just prints to the terminal, so redirect the output to a new file once you are happy with it.

                    E.g.
                    Code:
                    scriptName.pl ACGTACG *qseq.txt > newQseqFile_ACGTACG.txt
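
If several barcodes have to be separated in one pass while keeping qseq as the output format, the same idea can be sketched as follows. This is an illustrative Python sketch rather than a tested tool: the function name, the sample names and the "unassigned" bucket are hypothetical, and it assumes 11 tab-separated qseq columns with the barcode at the very start of the read and exact matching only.

```python
def demultiplex_qseq(lines, barcodes, trim=True):
    """Group qseq lines by the barcode found at the start of the read.

    barcodes maps barcode sequence -> sample name (hypothetical names).
    Returns a dict mapping each sample name to a list of qseq lines;
    reads without a matching barcode are collected under "unassigned".
    """
    groups = {name: [] for name in barcodes.values()}
    groups["unassigned"] = []

    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 11:
            continue  # skip blank or malformed lines
        seq, qual = fields[8], fields[9]

        for barcode, name in barcodes.items():
            if seq.startswith(barcode):
                if trim:
                    # remove the barcode bases and their quality values
                    fields[8] = seq[len(barcode):]
                    fields[9] = qual[len(barcode):]
                groups[name].append("\t".join(fields))
                break
        else:
            # no barcode matched this read
            groups["unassigned"].append("\t".join(fields))

    return groups
```

Each list can then be written to its own `_qseq.txt` file, one line per read. If one barcode is a prefix of another, the first match in the dictionary wins, so barcodes of equal length are the safer choice.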
                    hth, Sven
                    Last edited by sklages; 06-08-2011, 01:09 AM.



                    • #11
                      btw, as of CASAVA version 1.8 there is a script called "configureQseqToFastq.pl" to convert a whole folder of qseqs to fastq.



                      • #12
                        Originally posted by tonio100680
                        That's right, I want to use CASAVA 1.7 because 1.8 is not available in France...
                        I cannot use novobarcode because its output format is not compatible with CASAVA's input format, so I'm looking for a simple alternative demultiplexer...
                        Let me look to see if I can get novobarcode to do qseq in & out when index tag is embedded in the read.

                        Novobarcode does allow some mismatches in the index tag so it may classify more reads than a perl script.
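
To illustrate why allowing mismatches classifies more reads than an exact prefix match, here is a small Python sketch based on Hamming distance. It is not Novobarcode's algorithm, which is not described in this thread; the function names and the `max_mismatches` default are assumptions made for the example.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(read, barcodes, max_mismatches=1):
    """Return the barcode that best matches the start of the read.

    Returns None when no barcode is within max_mismatches, or when two
    barcodes are equally close (an ambiguous read).
    """
    best, best_dist, tie = None, max_mismatches + 1, False
    for barcode in barcodes:
        prefix = read[:len(barcode)]
        if len(prefix) < len(barcode):
            continue  # read is shorter than this barcode
        dist = hamming(prefix, barcode)
        if dist < best_dist:
            best, best_dist, tie = barcode, dist, False
        elif dist == best_dist and dist <= max_mismatches:
            tie = True
    return None if tie else best
```

With `max_mismatches=0` this behaves like the exact-match scripts earlier in the thread. Raising the limit rescues reads with a sequencing error in the barcode, provided the barcodes differ from each other by more than the allowed number of mismatches.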
                        Last edited by sparks; 06-09-2011, 12:33 AM.



                        • #13
                          I've modded novobarcode so that it can write out QSEQ when input is in QSEQ. If you'd like to try it send an email to support (at) novocraft ....
                          Last edited by sparks; 06-09-2011, 12:32 AM.



                          • #14
                            Originally posted by sklages
                            If you don't use an index read, CASAVA is of no use for demultiplexing.
                            Have a look at 'sabre' (https://github.com/ucdavis-bioinformatics/sabre) or FastX-Toolkit (http://hannonlab.cshl.edu/fastx_tool...splitter_usage) amongst others.

                            hth, Sven

                            We actually use CASAVA v1.7 to demultiplex our in-house designed barcodes routinely. We don't use the Illumina index read kit; we just set our detection cycles out past the barcode and use CASAVA to demultiplex on the barcode. It's just a matter of properly formatting the sample sheet. CASAVA will make a folder for each barcode and demultiplex the _qseq.txt files into each folder; then you can run GERALD to create the .fastq files.

                            Take a look at the demultiplex.pl script in CASAVA 1.7; there is an example of the sample sheet format to use on pg 14 of the manual.
                            Christine Brennan
                            UM DNA Sequencing Core
                            Ann Arbor, MI 48109



                            • #15
                              Originally posted by cbrennan
                              We actually use CASAVA v1.7 to demultiplex our in-house designed barcodes routinely. We don't use the Illumina index read kit; we just set our detection cycles out past the barcode and use CASAVA to demultiplex on the barcode. It's just a matter of properly formatting the sample sheet. CASAVA will make a folder for each barcode and demultiplex the _qseq.txt files into each folder; then you can run GERALD to create the .fastq files.

                              Take a look at the demultiplex.pl script in CASAVA 1.7; there is an example of the sample sheet format to use on pg 14 of the manual.
                              ah, ok. But actually you are using an index .. I interpreted that the OP had indices as part of the construct to be sequenced, just like nimblegen adaptors or so.
