  • MrRight
    Junior Member
    • Mar 2011
    • 2

    #46
    Would it be possible to specify the error/mismatch value for the index during BCL demultiplexing?


    • sklages
      Senior Member
      • May 2008
      • 628

      #47
      As this already worked in 1.7, why should Illumina remove this "feature"?


      • visivas
        Junior Member
        • May 2010
        • 7

        #48
        It seems that lots of things have changed with v1.8. I wish Illumina would release at least a user guide or an early version of the software. We can discuss all year long and still not get the complete picture of the new version from the release notes alone. Many centers like ours have wrappers around this software for automation.


        • sparks
          Senior Member
          • Mar 2008
          • 126

          #49
          Hi,
          V1.8 has some extra fields:
          <is filtered> is Y if the read is filtered, N otherwise.
          <control number> is 0 when none of the control bits are on, otherwise it is an even number.
          Does anyone know what these are for?
          Is is_filtered reminiscent of the QSEQ quality flag, and if so, does 'Y' mean high or low quality?

          Colin
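For anyone scripting around these fields, here is a minimal Python sketch of splitting a V1.8 header into its parts. It assumes the layout `@<instrument>:<run>:<flowcell>:<lane>:<tile>:<x>:<y> <read>:<is filtered>:<control number>:<index sequence>`; only `<is filtered>` and `<control number>` are named in the post above, so the other field names, the `Casava18Header` class and the `parse_casava18_header` function are illustrative assumptions. The sample header is the one quoted later in this thread.

```python
from typing import NamedTuple


class Casava18Header(NamedTuple):
    """Fields of a V1.8-style read header (class and field names are hypothetical)."""
    instrument: str
    run_number: int
    flowcell_id: str
    lane: int
    tile: int
    x_pos: int
    y_pos: int
    read: int
    is_filtered: bool      # True when the flag is 'Y'
    control_number: int    # 0 when none of the control bits are on
    index_sequence: str    # may be empty


def parse_casava18_header(line: str) -> Casava18Header:
    """Split '@name description' on the single space, then each half on ':'."""
    if not line.startswith("@"):
        raise ValueError("not a FASTQ header line")
    name, _, description = line[1:].rstrip("\n").partition(" ")
    # Unpacking raises ValueError if the field counts do not match the assumed layout.
    instrument, run, flowcell, lane, tile, x, y = name.split(":")
    read, filtered, control, index = description.split(":")
    return Casava18Header(
        instrument, int(run), flowcell, int(lane), int(tile), int(x), int(y),
        int(read), filtered == "Y", int(control), index,
    )


header = parse_casava18_header("@HWI-ST750:72:B0812ABXX:5:1101:5504:2021 1:N:0:")
print(header.lane, header.is_filtered, repr(header.index_sequence))
```

The parser is deliberately strict: a header with a different number of fields fails loudly instead of being silently misread.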


          • caddymob
            Member
            • Apr 2009
            • 36

            #50
            @sparks -- this is just like the 0/1 (fail/pass) in the last field of the old qseq files. The problem, to my knowledge, is that all reads are output to the fastq.gz. Y means failed QC. Seems backwards, I know...

            Illumina should have a flag in the configureBclToFastq.pl script to either a) exclude reads that do not pass the filter or b) write them into a different fastq.gz. Otherwise you have to unzip and do this filtering via your own scripting, and this is just a waste of time...

            One other thing I'll say to Illumina about the format: the pass/fail flag and barcode string in the read header are delimited from the read name by a space. Spaces are bad! Shame! Lots of aligners will discard everything after the space.
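The do-it-yourself filtering mentioned above could look roughly like the Python sketch below. It assumes four-line FASTQ records and the meaning of the flag given in this post ('Y' = failed QC); `write_passing_reads` and the file paths are hypothetical names, not part of any Illumina tool.

```python
import gzip


def write_passing_reads(src_path, dst_path):
    """Copy records whose is_filtered flag is not 'Y' from one fastq.gz to another.

    Returns (kept, dropped). Assumes strict four-line FASTQ records.
    """
    kept = dropped = 0
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        while True:
            record = [src.readline() for _ in range(4)]
            if not record[0]:          # end of file
                break
            # The description is everything after the first space in the header.
            description = record[0].rstrip("\n").partition(" ")[2]
            fields = description.split(":")
            if len(fields) >= 2 and fields[1] == "Y":
                dropped += 1           # read failed the filter, skip it
                continue
            dst.writelines(record)
            kept += 1
    return kept, dropped
```

Reading and writing through `gzip.open` avoids the separate unzip step, although the whole file still has to be streamed once.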


            • maubp
              Peter (Biopython etc)
              • Jul 2009
              • 1544

              #51
              Originally posted by caddymob
              One other thing I'll say to Illumina about the format: the pass/fail flag and barcode string in the read header are delimited from the read name by a space. Spaces are bad! Shame! Lots of aligners will discard everything after the space.
              On a similar point, I'd already posted earlier on this thread that I thought removing the forward/reverse suffix (i.e. /1 or /2 at the end of the read name) and sticking this in the read description (after the space) was a bad idea.
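As a workaround for tools that drop everything after the space, the read number can be moved back onto the read name. The sketch below is an assumption-laden illustration: `to_legacy_read_name` is a hypothetical helper, and it only re-attaches /1 or /2; it does not reproduce the full pre-1.8 name format.

```python
def to_legacy_read_name(header: str) -> str:
    """Return the read name with '/1' or '/2' appended, taken from the description."""
    name, _, description = header.rstrip("\n").partition(" ")
    read_number = description.split(":", 1)[0]
    if read_number not in ("1", "2"):
        return name                    # nothing usable after the space, leave name as is
    return f"{name}/{read_number}"


print(to_legacy_read_name("@HWI-ST750:72:B0812ABXX:5:1101:5504:2021 1:N:0:"))
```

Note that this discards the pass/fail flag and the index sequence, so any filtering on those has to happen before the rename.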


              • caddymob
                Member
                • Apr 2009
                • 36

                #52
                Originally posted by maubp
                On a similar point, I'd already posted earlier on this thread that I thought removing the forward/reverse suffix (i.e. /1 or /2 at the end of the read name) and sticking this in the read description (after the space) was a bad idea.
                I missed that, but yes, very good point!


                • sparks
                  Senior Member
                  • Mar 2008
                  • 126

                  #53
                  Originally posted by caddymob
                  @sparks -- this is just like the 0/1 (fail/pass) in the last field of the old qseq files. The problem, to my knowledge, is that all reads are output to the fastq.gz. Y means failed QC. Seems backwards, I know...

                  Illumina should have a flag in the configureBclToFastq.pl script to either a) exclude reads that do not pass the filter or b) write them into a different fastq.gz. Otherwise you have to unzip and do this filtering via your own scripting, and this is just a waste of time...

                  One other thing I'll say to Illumina about the format: the pass/fail flag and barcode string in the read header are delimited from the read name by a space. Spaces are bad! Shame! Lots of aligners will discard everything after the space.
                  Thanks for the update, I'll add a function in novoalign to filter the failed reads.

                  With regard to the barcode sequence, it appears Illumina will have already demux'd the reads, so all reads in a file should have the same barcode. Is this correct, or could we get a file with mixed index tags?

                  Colin
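One way to check a given file for mixed index tags is to tally the last field of each header. The Python sketch below assumes four-line records and that the index sequence is the text after the final colon of the description; `count_index_tags` is a hypothetical name and the records are made up for illustration.

```python
from collections import Counter


def count_index_tags(lines):
    """Count the index sequences found in the header lines of four-line FASTQ records."""
    counts = Counter()
    for i, line in enumerate(lines):
        if i % 4 == 0:                                  # header line of each record
            description = line.rstrip("\n").partition(" ")[2]
            counts[description.rpartition(":")[2]] += 1
    return counts


# Made-up records, for illustration only.
records = [
    "@M:1:FC:1:1:1:1 1:N:0:ACGT\n", "AAAA\n", "+\n", "IIII\n",
    "@M:1:FC:1:1:1:2 1:N:0:TTGA\n", "CCCC\n", "+\n", "IIII\n",
]
tags = count_index_tags(records)
print(len(tags) > 1)   # more than one distinct tag means the file is mixed
```

Any iterable of lines works as input, so an open `gzip.open(path, "rt")` handle can be passed directly.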


                  • sparks
                    Senior Member
                    • Mar 2008
                    • 126

                    #54
                    Does anyone have a few V1.8 fastq records they could share for testing? I'd like to identify a file as V1.8 from the header and parse the is_filtered field. I can fake some records for testing, but real records would be better.

                    Thanks, Colin
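Identifying a file as V1.8 from its first header could be sketched with a regular expression, as below. This is not how novoalign does it; the pattern assumes seven colon-separated name fields, a single space, then `<read>:<Y|N>:<control number>:<index>`. `looks_like_casava18` is a hypothetical name, and the second header is an illustrative older-style name rather than a record from this thread.

```python
import re

# Assumed V1.8 layout: 7 name fields, one space, then read:filter:control:index.
CASAVA18_HEADER = re.compile(
    r"@[^:\s]+:\d+:[^:\s]+:\d+:\d+:\d+:\d+ \d+:[YN]:\d+:[ACGTN]*"
)


def looks_like_casava18(header: str) -> bool:
    """Return True if the whole header line matches the assumed V1.8 layout."""
    return CASAVA18_HEADER.fullmatch(header.rstrip("\n")) is not None


print(looks_like_casava18("@HWI-ST750:72:B0812ABXX:5:1101:5504:2021 1:N:0:"))
print(looks_like_casava18("@HWUSI-EAS100R:6:73:941:1973#0/1"))
```

Checking only the first record is cheap, but a cautious caller might test several headers before committing to a format.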


                    • caddymob
                      Member
                      • Apr 2009
                      • 36

                      #55
                      A couple of test CASAVA 1.8 fastqs with 400 reads for read 1 and read 2 are attached, no QC filtering applied. Hope this helps!
                      Attached Files


                      • sparks
                        Senior Member
                        • Mar 2008
                        • 126

                        #56
                        Originally posted by caddymob
                        A couple of test CASAVA 1.8 fastqs with 400 reads for read 1 and read 2 are attached, no QC filtering applied. Hope this helps!
                        Hi Caddymob,

                        Thanks for that. The reads went through perfectly, though not many aligned against hg36; I guess they are not human.

                        Novoalign now recognises the 1.8 format and has options to skip, use or QC the is_filtered='Y' reads.

                        Cheers, Colin


                        • caddymob
                          Member
                          • Apr 2009
                          • 36

                          #57
                          Originally posted by sparks
                          Thanks for that. The reads went through perfectly, though not many aligned against hg36; I guess they are not human.
                          Correct, they're rat RNA-seq. Glad they worked anyway.


                          • SeqAnswerSeeker
                            Junior Member
                            • Apr 2010
                            • 3

                            #58
                            FASTQ quality score above 40

                            With the new CASAVA version, base quality scores now include 41 (=J in ASCII)?

                            @HWI-ST750:72:B0812ABXX:5:1101:5504:2021 1:N:0:
                            TTGCAGGGTAGGTATAAGAGTTCTTAAAGAAAAGGAAATAGGACAACAATAAGAAGATAAGAAAAATCATTTGGACTTAAATTAGTTACATTGCTAAAGTTTCTC
                            +
                            BCCFFFFFCFHHCGHJJJIJHHIJJGJJJIJJJJJJDCGIIJJJJJJJJJJJJGHIJJJJJIJJJJIIJJIHHHHHHFFFFFFFEEEEEEEEDDDDDDDDDEEDD

                            Just wondering, since so far in our raw read data Phred scores ranged from 0 to 40 only.
                            Or is there an additional meaning behind the "J" base qual, like it was used for the stretch of "B"s at end of reads?

                            Thanks,
                            Natalie


                            • GenoMax
                              Senior Member
                              • Feb 2008
                              • 7142

                              #59
                              Originally posted by SeqAnswerSeeker
                              With the new CASAVA version, base quality scores now include 41 (=J in ASCII)?
                              See this: http://seqanswers.com/forums/showthread.php?t=12339


                              • skruglyak
                                Member
                                • Sep 2010
                                • 44

                                #60
                                Originally posted by SeqAnswerSeeker
                                With the new CASAVA version, base quality scores now include 41 (=J in ASCII)?
                                Hi Natalie,

                                There have been some improvements to the chemistry and a refinement of the quality model. As a result, we are now starting to see Q41. There is no additional meaning behind the "J".

                                Thanks,

                                Semyon
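For reference, the "J" = 41 correspondence follows from the Phred+33 encoding, where each quality character's ASCII code minus 33 gives the score. A minimal Python sketch, with `phred33_scores` as a hypothetical helper name and the quality string copied from Natalie's record:

```python
def phred33_scores(quality: str) -> list:
    """Decode a Phred+33 quality string into integer scores (ASCII code minus 33)."""
    return [ord(char) - 33 for char in quality]


quality = (
    "BCCFFFFFCFHHCGHJJJIJHHIJJGJJJIJJJJJJDCGIIJJJJJJJJJJJJGHIJJJJJIJJJJIIJJIHHHHHH"
    "FFFFFFFEEEEEEEEDDDDDDDDDEEDD"
)
scores = phred33_scores(quality)
print(min(scores), max(scores))   # 'B' is ASCII 66 and 'J' is ASCII 74
```

Downstream tools that hard-code 40 as the maximum score may need their upper bound raised to accept such records.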
