**MrRight** · 03-27-2011, 09:10 AM

Would it be possible to specify the error/mismatch value for the index during BCL demultiplexing?

**sklages** · 03-28-2011, 03:27 AM

As this already worked in 1.7, why would Illumina remove this "feature"?
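
For readers wondering what a mismatch value for the index means in practice, here is a minimal sketch of mismatch-tolerant index matching. It is not CASAVA's implementation; the function names, sample names and index sequences are hypothetical and only illustrate the idea of accepting an index read that differs from the expected index in at most a given number of positions.

```python
def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, sample_indexes, max_mismatches=0):
    """Return the sample whose index matches within max_mismatches.

    Returns None when no sample matches or when the match is ambiguous.
    """
    hits = [
        sample
        for sample, index in sample_indexes.items()
        if len(index) == len(index_read)
        and hamming(index_read, index) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


# Hypothetical sample sheet: sample name -> index sequence.
samples = {"sample_A": "ATCACG", "sample_B": "CGATGT"}
print(assign_sample("ATCACT", samples))                    # None (exact match only)
print(assign_sample("ATCACT", samples, max_mismatches=1))  # sample_A
```

Returning None for ambiguous matches is a deliberate choice in this sketch: with a larger mismatch allowance, one index read can fall within range of two samples, and guessing would mix reads between them.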

**visivas** · 04-21-2011, 07:52 AM

It seems that lots of things have changed with v1.8. I wish Illumina would release at least a user guide or an early version of the software. We can discuss all year long and still not get the complete picture of the new version from the release notes alone. Many centers like ours have wrappers around this software for automation.

**sparks** · 05-11-2011, 12:24 AM

Hi,
V1.8 has some extra fields:
<is filtered> is Y if the read is filtered, N otherwise.
<control number> is 0 when none of the control bits are on, otherwise it is an even number.
Does anyone know what these are for?
Is is_filtered analogous to the QSEQ quality flag, and if so, does 'Y' mean high or low quality?

Colin

**caddymob** · 05-11-2011, 11:48 AM

@sparks -- this is just like the 0/1 (fail/pass) flag in the last field of the old qseq files. The problem, to my knowledge, is that all reads are output to the fastq.gz. Y means failed QC. Seems backwards, I know...

Illumina should have a flag in the configureBclToFastq.pl script to either a) exclude reads that do not pass the filter or b) write them into a different fastq.gz. Otherwise you have to unzip and do this filtering via your own scripting, and this is just a waste of time...

One other thing I'll say to Illumina about the format: the pass/fail flag and barcode string in the read header are separated from the read name by a space. Spaces are bad! Shame! Lots of aligners will discard everything after the space.
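
The "own scripting" route can be sketched roughly as below. This is a minimal example, not an Illumina tool: the function names and file paths are hypothetical, and it assumes four-line FASTQ records whose header carries `<read>:<is filtered>:<control number>:<index>` after the space, with `Y` marking a read that failed the filter as described above.

```python
import gzip


def is_filtered(header):
    """Return True if the header marks the read as filtered (failed QC)."""
    parts = header.rstrip("\n").split(" ", 1)
    if len(parts) < 2:
        return False  # no description field, so there is no flag to test
    fields = parts[1].split(":")
    return len(fields) >= 2 and fields[1] == "Y"


def filter_fastq(in_path, out_path):
    """Copy reads not flagged 'Y' from in_path to out_path (both gzipped).

    Returns the number of reads kept.
    """
    kept = 0
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        while True:
            record = [src.readline() for _ in range(4)]
            if not record[0]:
                break  # end of file
            if not is_filtered(record[0]):
                dst.writelines(record)
                kept += 1
    return kept
```

A call such as `filter_fastq("lane5_R1.fastq.gz", "lane5_R1.pf.fastq.gz")` (hypothetical file names) reads and writes gzip directly, so no separate unzip step is needed. For paired-end data, both files of a pair need the same treatment so that read 1 and read 2 stay in sync.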

**maubp** · 05-11-2011, 02:03 PM

Originally posted by caddymob

One other thing I'll say to Illumina about the format: the pass/fail flag and barcode string in the read header are separated from the read name by a space. Spaces are bad! Shame! Lots of aligners will discard everything after the space.

On a similar point, I'd already posted earlier on this thread that I thought removing the forward/reverse suffix (i.e. /1 or /2 at the end of the read name) and sticking this in the read description (after the space) was a bad idea.
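
For tools that still expect the old suffix, a workaround can be sketched as follows. This is only an illustration with a hypothetical helper name; it assumes the read number is the first colon-separated field after the space, as in the 1.8 headers discussed in this thread.

```python
def add_pair_suffix(header):
    """Append /1 or /2 to the read name, taken from the description field."""
    name, _, description = header.rstrip("\n").partition(" ")
    read_number = description.split(":", 1)[0]
    # Only add the suffix when a read number exists and none is present yet.
    if read_number in ("1", "2") and not name.endswith(("/1", "/2")):
        name = f"{name}/{read_number}"
    return f"{name} {description}" if description else name


print(add_pair_suffix("@HWI-ST750:72:B0812ABXX:5:1101:5504:2021 1:N:0:"))
# @HWI-ST750:72:B0812ABXX:5:1101:5504:2021/1 1:N:0:
```

The description is kept after the space, so aligners that truncate at the first space still see a read name ending in /1 or /2.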

**caddymob** · 05-11-2011, 02:04 PM

Originally posted by maubp

On a similar point, I'd already posted earlier on this thread that I thought removing the forward/reverse suffix (i.e. /1 or /2 at the end of the read name) and sticking this in the read description (after the space) was a bad idea.

I missed that, but yes, very good point!

**sparks** · 05-11-2011, 04:42 PM

Originally posted by caddymob

@sparks -- this is just like the 0/1 (fail/pass) flag in the last field of the old qseq files. The problem, to my knowledge, is that all reads are output to the fastq.gz. Y means failed QC. Seems backwards, I know...

Illumina should have a flag in the configureBclToFastq.pl script to either a) exclude reads that do not pass the filter or b) write them into a different fastq.gz. Otherwise you have to unzip and do this filtering via your own scripting, and this is just a waste of time...

One other thing I'll say to Illumina about the format: the pass/fail flag and barcode string in the read header are separated from the read name by a space. Spaces are bad! Shame! Lots of aligners will discard everything after the space.

Thanks for the update, I'll add a function in novoalign to filter the failed reads.

With regard to the barcode sequence, it appears Illumina will have already demultiplexed the reads, so all reads in a file should have the same barcode. Is this correct, or could we get a file with mixed index tags?

Colin

**sparks** · 05-11-2011, 07:31 PM

Does anyone have a few V1.8 fastq records they could share for testing? I'd like to identify a file as V1.8 from the header and parse the is_filtered field. I can fake some records for testing, but real records would be better.

Thanks, Colin
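
One way to recognise a V1.8 header and pull out is_filtered is a regular expression, sketched below. This is not novoalign's code. The layout `@<instrument>:<run>:<flowcell>:<lane>:<tile>:<x>:<y> <read>:<is filtered>:<control number>:<index>` is an assumption based on the field names quoted earlier in this thread and on the example records posted here; the pattern, class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed header layout (see the note above); the index may be empty.
CASAVA18_HEADER = re.compile(
    r"^@(?P<name>[^:\s]+:\d+:[^:\s]+:\d+:\d+:\d+:\d+)"
    r" (?P<read>[12]):(?P<is_filtered>[YN]):(?P<control>\d+):(?P<index>\S*)$"
)


class Casava18Header(NamedTuple):
    name: str
    read: int
    is_filtered: bool
    control_number: int
    index: str


def parse_header(line: str) -> Optional[Casava18Header]:
    """Parse a V1.8-style header line; return None for other formats."""
    match = CASAVA18_HEADER.match(line.rstrip("\r\n"))
    if match is None:
        return None
    return Casava18Header(
        name=match.group("name"),
        read=int(match.group("read")),
        is_filtered=match.group("is_filtered") == "Y",
        control_number=int(match.group("control")),
        index=match.group("index"),
    )


header = parse_header("@HWI-ST750:72:B0812ABXX:5:1101:5504:2021 1:N:0:")
print(header.is_filtered)  # False
print(header.read)         # 1
```

Testing only the first header of a file is usually enough to decide the format, since `parse_header` returns None for older headers that lack the space-separated description.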

**caddymob** · 05-12-2011, 10:30 AM

A couple of test CASAVA 1.8 fastqs with 400 reads for read 1 and read 2 are attached, no QC filtering applied. Hope this helps!

**sparks** · 05-12-2011, 11:32 PM

Originally posted by caddymob

A couple of test CASAVA 1.8 fastqs with 400 reads for read 1 and read 2 are attached, no QC filtering applied. Hope this helps!

Hi Caddymob,

Thanks for that. The reads went through perfectly, though not many aligned against hg36; I guess they are not human.

Novoalign now recognises the 1.8 format and has options to skip, use or QC the is_filtered='Y' reads.

Cheers, Colin

**caddymob** · 05-13-2011, 06:50 PM

Originally posted by sparks

Thanks for that. The reads went through perfectly, though not many aligned against hg36; I guess they are not human.

Correct, they're rat RNA-seq. Glad they worked anyway.

**SeqAnswerSeeker** · 06-28-2011, 07:48 AM

FASTQ quality score above 40

With the new CASAVA version, do base quality scores now include 41 (= J in ASCII)?

@HWI-ST750:72:B0812ABXX:5:1101:5504:2021 1:N:0:
TTGCAGGGTAGGTATAAGAGTTCTTAAAGAAAAGGAAATAGGACAACAATAAGAAGATAAGAAAAATCATTTGGACTTAAATTAGTTACATTGCTAAAGTTTCTC
+
BCCFFFFFCFHHCGHJJJIJHHIJJGJJJIJJJJJJDCGIIJJJJJJJJJJJJGHIJJJJJIJJJJIIJJIHHHHHHFFFFFFFEEEEEEEEDDDDDDDDDEEDD

Just wondering, since so far Phred scores in our raw read data have ranged from 0 to 40 only.
Or is there an additional meaning behind the "J" base quality, like the one used for the stretch of "B"s at the end of reads?

Thanks,
Natalie

**GenoMax** · 06-28-2011, 07:51 AM

See this: http://seqanswers.com/forums/showthread.php?t=12339

**skruglyak** · 06-28-2011, 08:45 AM

Hi Natalie,

There have been some improvements to the chemistry and a refinement of the quality model. As a result, we are now starting to see Q41. There is no additional meaning behind the "J".

Thanks,

Semyon
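
The "J" = 41 mapping can be checked by hand with a short sketch. It assumes the quality line uses a Phred+33 (Sanger-style) ASCII offset, which is consistent with "J" = 41 as stated above; the function names are hypothetical.

```python
def phred33_scores(quality):
    """Decode a quality string into Phred scores, assuming an offset of 33."""
    return [ord(ch) - 33 for ch in quality]


def error_probability(q):
    """Convert a Phred score into the estimated base-call error probability."""
    return 10 ** (-q / 10)


# First 20 quality characters of the record posted above.
quality = "BCCFFFFFCFHHCGHJJJIJ"
scores = phred33_scores(quality)
print(max(scores))  # 41, from 'J' (ASCII 74)
print(min(scores))  # 33, from 'B' (ASCII 66)
print(error_probability(41) < error_probability(40))  # True
```

Under this offset, "B" is an ordinary Q33 call rather than a special marker, so parsers that cap scores at 40 or treat "B" specially may need adjusting.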
