
**sklages** · 05-30-2011, 05:53 AM

Originally posted by tonio100680:

Hello,
I am new to bioinformatics. I need a tool (script) to demultiplex sequences from the GAIIx before running CASAVA (1.7). We do not use the index provided by Illumina. Thank you in advance for your help.

If you don't use an index read, CASAVA is of no use for demultiplexing.
Have a look at 'sabre' (https://github.com/ucdavis-bioinformatics/sabre) or FastX-Toolkit (http://hannonlab.cshl.edu/fastx_tool...splitter_usage) amongst others.

hth, Sven

**sparks** · 06-07-2011, 05:03 PM

You could try Novobarcode. It's included in the Novoalign download at www.novocraft.com. It is free to use and no license is required.

Colin

**tonio100680** · 06-07-2011, 10:16 PM

Thank you for the help !

After conversion, my files are in QSEQ format. To use Novobarcode, do I have to convert them to FASTQ? If so, do you have a simple solution?

**sklages** · 06-07-2011, 10:37 PM

You might want to read the following thread as a starting point: http://seqanswers.com/forums/showthread.php?t=1801

hth, Sven

**tonio100680** · 06-07-2011, 11:21 PM

I really am a bioinformatics novice! I want a simple converter... I'm panicking; the biologists are hounding me.

**sklages** · 06-07-2011, 11:39 PM

'bfast', as mentioned in the thread, has its own converter,

https://sourceforge.net/projects/bfast/files/bfast/

Download the archive, untar it, and look in the 'scripts' directory; there you'll find a Perl script called 'ill2fastq.pl'. I have never used it, but it should do the job.
There are probably a lot more tools... maybe you could have a look at GALAXY, but I am not sure if they provide qseq-to-fastq conversion.
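If none of those tools is at hand, the conversion itself is small. Below is a minimal Python sketch, not taken from bfast or CASAVA; the function name `qseq_to_fastq` and the `offset_shift` parameter are made up for illustration. It assumes the usual 11 tab-separated qseq columns (machine, run, lane, tile, x, y, index, read number, sequence, quality, filter), turns '.' into 'N', and leaves the quality string untouched unless you ask it to shift the ASCII offset (for example -31 to go from Phred+64 to Phred+33).

```python
def qseq_to_fastq(line, offset_shift=0):
    """Convert one tab-delimited qseq record into a four-line FASTQ record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError("expected 11 qseq columns, got %d" % len(fields))
    machine, run, lane, tile, x, y, index, read_no, seq, qual, _filter = fields

    # Read name in the common Illumina style: machine_run:lane:tile:x:y#index/read
    name = "%s_%s:%s:%s:%s:%s#%s/%s" % (machine, run, lane, tile, x, y, index, read_no)

    # qseq marks uncalled bases with '.', FASTQ uses 'N'
    seq = seq.replace(".", "N")

    # Optional: shift the quality encoding (assumption: -31 maps Phred+64 to Phred+33)
    if offset_shift:
        qual = "".join(chr(ord(c) + offset_shift) for c in qual)

    return "@%s\n%s\n+\n%s\n" % (name, seq, qual)


# Hypothetical record, for illustration only
example = "M1\t7\t3\t42\t1021\t1887\t0\t1\tACGT.ACG\thhhhBhhh\t1\n"
print(qseq_to_fastq(example, offset_shift=-31), end="")
```

To convert a whole file, call the function on every line and write the results to a new file. Note that this sketch keeps reads that failed the chastity filter (last column 0); drop them first if your downstream tool expects filtered reads.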

When I read your original post again, it seems you want to use CASAVA 1.7 for mapping? Be aware that 1.7 needs qseq format for input files, however the fresh 1.8 takes fastq as input ...

hth, Sven

**sparks** · 06-07-2011, 11:48 PM

Originally posted by tonio100680:

Thank you for the help !

After conversion, my files are in QSEQ format. To use Novobarcode, do I have to convert them to FASTQ? If so, do you have a simple solution?

Recent versions of Novoalign will process qseq files. The latest version will accept 3 read files, with the index tag in its own read file. Output is then qseq.

Earlier versions could only accept 2 qseq files and would write the demuxed files in fastq format. In that case you run novobarcode twice: once for read1 and the index read, then again for read2 and the index read. You can still do this with the latest version if you want qseq to fastq conversion.

Colin

**tonio100680** · 06-07-2011, 11:48 PM

That's right, I want to use CASAVA 1.7 because 1.8 is not available in France...
I cannot use novobarcode because its output format is not compatible with CASAVA's input format, so I'm looking for a "simple" alternative demultiplexer...

**sklages** · 06-08-2011, 12:29 AM

CASAVA 1.8 is already available at iCom. If your tags are directly attached to your (first) read, you'll probably need to write your own demultiplexer that writes qseq again (shouldn't be too hard if you know someone familiar with e.g. Perl).

Or a simple one-liner, assuming the barcode sequence 'ACGTACGT' (not removing it),

Code:

perl -lane 'print if($F[8]=~/^ACGTACGT/)' SampleABC_qseq.txt > SampleA_NewQseq.txt

You could then use the 'qseq-mask' (USE_BASES) option of GERALD to skip these bases.

Or just a starting point (not tested thoroughly), a simple script looking for all seqs starting with $barcode and removing it from seq and quals:

Code:

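#!/usr/local/bin/perl
# Keep only qseq records whose read starts with the given barcode and
# trim the barcode from both the sequence and the quality string.

use warnings;
use strict;

my $barcode = shift or die "usage: $0 BARCODE qseq_files...\n";
my $length  = length($barcode);

while (<>) {
    chomp;
    my @line = split /\t/;                     # qseq is tab-delimited, 11 columns
    next unless $line[8] =~ /^\Q$barcode\E/;   # column 9 holds the read sequence

    my $s = substr($line[8], $length);         # sequence without the barcode
    my $q = substr($line[9], $length);         # matching quality values

    print join("\t", @line[0..7], $s, $q, $line[10]), "\n";
}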

Run it with the barcode sequence as the first argument and one or more qseq files as the remaining arguments. The script writes to standard output, so it just dumps to the terminal unless you redirect the output to a new file.

E.g.

Code:

scriptName.pl ACGTACG *qseq.txt > newQseqFile_ACGTACG.txt
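
The script above handles one barcode per run. As a rough sketch of doing all barcodes in one pass, here is a Python version; it is an illustration under the same assumptions (barcode at the start of the read in column 9, qualities in column 10), and the function name `demultiplex_qseq` is made up here. It uses exact matching only, and if one barcode is a prefix of another, the first one in the list wins.

```python
from collections import defaultdict


def demultiplex_qseq(lines, barcodes):
    """Group qseq records by the barcode that starts the read, trimming it off.

    Returns a dict mapping barcode -> list of trimmed qseq lines. Records that
    match no barcode are kept untrimmed under the key None.
    """
    bins = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        seq, qual = fields[8], fields[9]

        # First barcode that the read starts with, or None if there is no match
        hit = next((b for b in barcodes if seq.startswith(b)), None)

        if hit is not None:
            # Trim the barcode from the sequence and the quality string alike
            fields[8] = seq[len(hit):]
            fields[9] = qual[len(hit):]

        bins[hit].append("\t".join(fields))
    return bins


# Typical use (file names are placeholders):
#   with open("SampleABC_qseq.txt") as fh:
#       bins = demultiplex_qseq(fh, ["ACGTACGT", "TTGCATGC"])
#   for barcode, records in bins.items():
#       with open("%s_qseq.txt" % (barcode or "unmatched"), "w") as out:
#           out.write("\n".join(records) + "\n")
```

Because the output stays in qseq format with all 11 columns, each per-barcode file should still be usable wherever the original qseq files were.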

hth, Sven

**sklages** · 06-08-2011, 02:04 AM

btw, as of CASAVA version 1.8 there is a script called "configureQseqToFastq.pl" to convert a whole folder of qseqs to fastq.

**sparks** · 06-08-2011, 11:39 PM

Originally posted by tonio100680:

That's right, I want to use CASAVA 1.7 because 1.8 is not available in France...
I cannot use novobarcode because its output format is not compatible with CASAVA's input format, so I'm looking for a "simple" alternative demultiplexer...

Let me look to see if I can get novobarcode to do qseq in & out when index tag is embedded in the read.

Novobarcode does allow some mismatches in the index tag so it may classify more reads than a perl script.
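
To illustrate what allowing mismatches means in practice, here is a small Python sketch. It is not Novobarcode's algorithm, just a plain Hamming-distance comparison; the function names and the `max_mismatches` parameter are assumptions made for this example. A read is assigned to a barcode only if exactly one barcode is closest and within the allowed number of mismatches; ties are left unassigned.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(seq, barcodes, max_mismatches=1):
    """Return the unique closest barcode for the start of seq, or None.

    None is returned when no barcode is within max_mismatches, or when two
    barcodes are equally close (ambiguous assignment).
    """
    # Score every barcode against the read prefix of the same length
    scored = sorted(
        (hamming(seq[:len(b)], b), b) for b in barcodes if len(seq) >= len(b)
    )
    if not scored or scored[0][0] > max_mismatches:
        return None
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None  # tie between two barcodes
    return scored[0][1]


barcodes = ["ACGTACGT", "TTGCATGC"]
print(assign_barcode("ACGAACGTGGGG", barcodes))                    # one mismatch
print(assign_barcode("ACGAACGTGGGG", barcodes, max_mismatches=0))  # exact only
```

An uncalled base ('.' in qseq) simply counts as a mismatch here. How many mismatches are safe depends on how different your barcodes are from one another.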

**sparks** · 06-09-2011, 12:30 AM

I've modded novobarcode so that it can write out QSEQ when input is in QSEQ. If you'd like to try it send an email to support (at) novocraft ....

**cbrennan** · 06-16-2011, 02:11 PM

Originally posted by sklages:

If you don't use an index read, CASAVA is of no use for demultiplexing.
Have a look at 'sabre' (https://github.com/ucdavis-bioinformatics/sabre) or FastX-Toolkit (http://hannonlab.cshl.edu/fastx_tool...splitter_usage) amongst others.

hth, Sven

We actually use CASAVA v1.7 routinely to demultiplex our in-house designed barcodes. We don't use the Illumina index read kit; we just set our detection cycles out past the barcode and let CASAVA demultiplex on it. It's just a matter of formatting the sample sheet properly. CASAVA will make a folder for each barcode and demultiplex the _qseq.txt files into each folder; then you can run GERALD to create the .fastq files.

Take a look at the demultiplex.pl script in CASAVA 1.7; there is an example of the sample sheet format to use on pg 14 of the manual.

**sklages** · 06-16-2011, 10:48 PM

Originally posted by cbrennan:

We actually use CASAVA v1.7 routinely to demultiplex our in-house designed barcodes. We don't use the Illumina index read kit; we just set our detection cycles out past the barcode and let CASAVA demultiplex on it. It's just a matter of formatting the sample sheet properly. CASAVA will make a folder for each barcode and demultiplex the _qseq.txt files into each folder; then you can run GERALD to create the .fastq files.

Take a look at the demultiplex.pl script in CASAVA 1.7; there is an example of the sample sheet format to use on pg 14 of the manual.

Ah, OK. But you are in fact using an index. I had understood that the OP's indices were part of the construct being sequenced, like NimbleGen adaptors or similar.


Demultiplexing and CASAVA 1.7
