This topic is closed.
  • ScottC
    Senior Member
    • Jan 2008
    • 244

    #46
    I'd be interested in that, and I'd be prepared to submit some data.

    NGSfan: I'm mainly interested in benchmarking our system against others. Because I'm running the machine, I'm always interested in how well it's performing in comparison to other sequencing service providers!

    Comment

    • lparsons
      Member
      • Nov 2008
      • 28

      #47
      Excellent utility Simon. Thank you.

      I'm running into what looks like an old bug, however. I'm using FASTQC version 0.3.1 on a SunOS 5.10 server and I'm getting a HeadlessException. Any tips on solving this?

      Code:
      Exception in thread "main" java.awt.HeadlessException: 
      No X11 DISPLAY variable was set, but this program performed an operation which requires it.
              at sun.java2d.HeadlessGraphicsEnvironment.getDefaultScreenDevice(HeadlessGraphicsEnvironment.java:65)
              at javax.swing.RepaintManager.getVolatileOffscreenBuffer(RepaintManager.java:583)
              at javax.swing.JComponent.paintDoubleBuffered(JComponent.java:4911)
              at javax.swing.JComponent.paint(JComponent.java:996)
              at uk.ac.bbsrc.babraham.FastQC.Graphs.QualityBoxPlot.paint(QualityBoxPlot.java:81)
              at uk.ac.bbsrc.babraham.FastQC.Graphs.QualityBoxPlot.paint(QualityBoxPlot.java:75)
              at uk.ac.bbsrc.babraham.FastQC.Modules.PerBaseQualityScores.makeReport(PerBaseQualityScores.java:184)
              at uk.ac.bbsrc.babraham.FastQC.Report.HTMLReportArchive.<init>(HTMLReportArchive.java:63)
              at uk.ac.bbsrc.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:82)
              at uk.ac.bbsrc.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:28)
              at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:71)
      Last edited by lparsons; 06-03-2010, 08:30 AM.

      Comment

      • simonandrews
        Simon Andrews
        • May 2009
        • 870

        #48
        Originally posted by lparsons
        I'm running into what looks like an old bug, however. I'm using FASTQC version 0.3.1 on a SunOS 5.10 server and I'm getting a HeadlessException. Any tips on solving this?

        Code:
        Exception in thread "main" java.awt.HeadlessException: 
        No X11 DISPLAY variable was set, but this program performed an operation which requires it.
                at sun.java2d.HeadlessGraphicsEnvironment.getDefaultScreenDevice(HeadlessGraphicsEnvironment.java:65)
        That's really strange. It's throwing a HeadlessException from within the HeadlessGraphicsEnvironment! That means that the headless environment is being correctly set (failing to set it was the original bug, which was fixed in an earlier revision).

        At first glance this looks like it has to be a bug in the core java class - especially as it seems to be SunOS specific.

        As a test, can you try setting a DISPLAY environment variable and see if it then works? It may be a redundant check for something which isn't actually required.
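For readers hitting the same trace, here is a minimal sketch of what "headless" means on the Java side. It is not FastQC's code; the class and method names are made up for illustration. It shows how a program can check whether the JVM is headless and draw into an in-memory image, which does not need an X11 DISPLAY.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.GraphicsEnvironment;
import java.awt.image.BufferedImage;

// Hypothetical example class, not part of FastQC.
public class HeadlessRenderSketch {

    // Draws a blue rectangle on a white background into an in-memory image.
    static BufferedImage render(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            g.setColor(Color.BLUE);
            g.fillRect(10, 10, width - 20, height - 20);
        } finally {
            g.dispose(); // release the graphics context
        }
        return image;
    }

    public static void main(String[] args) {
        // Request headless mode before any AWT class initialises the graphics environment.
        System.setProperty("java.awt.headless", "true");
        System.out.println("Headless: " + GraphicsEnvironment.isHeadless());

        BufferedImage image = render(200, 100);
        System.out.println("Rendered " + image.getWidth() + "x" + image.getHeight());
    }
}
```

The trace above fails inside Swing's double-buffered painting path, which asks for a screen device. Drawing straight onto a BufferedImage, as in this sketch, avoids that request.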

        Comment

        • colindaven
          Senior Member
          • Oct 2008
          • 417

          #49
          Nice work Simon, this is a simple and easy to use package.

          Comment

          • antoniou
            Junior Member
            • Oct 2008
            • 7

            #50
            Originally posted by Thomas Doktor
            The qualities look fine so it's not an issue of bad base calling. I think you could be right that the cluster calling and/or sequencing chemistry could explain some of it. Could perhaps explain why certain sequences in the genome are less likely to be sequenced, we often see peaks and valleys in exons in our RNA-seq runs which are most likely explained by sequencing artefacts.
            I have seen the same phenomenon, but only with our mRNA-Seq libraries. Our genomic libraries do not show any biases. Has anyone else experienced this? Could it be an artifact of the Illumina library preparation protocol, maybe at the fragmentation step?

            Comment

            • lletourn
              Member
              • Oct 2009
              • 63

              #51
              The Illumina RNA protocol uses random hexamers to prime cDNA synthesis from the RNA. The thing is, they are not 100% random, so the beginning of the reads looks skewed for base composition, but that's because of the priming.

              For mapping it's no problem. For assembly it might confuse some assemblers. (When assembling I would trim the 5' end of the RNA reads; for mapping I would not.)
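To see the skew described above in your own data, one option is to tabulate base composition per read position. The following is only a rough sketch; the class and method names are invented for illustration and it is not how FastQC computes its per-base plots. Reads are passed in as plain strings.

```java
import java.util.List;

// Hypothetical example class, not part of FastQC.
public class BaseCompositionSketch {

    // Returns fractions[pos][b], where b indexes the bases A, C, G, T in that order.
    static double[][] perPositionComposition(List<String> reads, int positions) {
        double[][] fractions = new double[positions][4];
        int[] totals = new int[positions];

        for (String read : reads) {
            int limit = Math.min(positions, read.length());
            for (int i = 0; i < limit; i++) {
                int b = "ACGT".indexOf(Character.toUpperCase(read.charAt(i)));
                if (b < 0) {
                    continue; // skip N and any other symbol
                }
                fractions[i][b]++;
                totals[i]++;
            }
        }

        // Convert raw counts into fractions per position.
        for (int i = 0; i < positions; i++) {
            if (totals[i] == 0) {
                continue;
            }
            for (int b = 0; b < 4; b++) {
                fractions[i][b] /= totals[i];
            }
        }
        return fractions;
    }

    public static void main(String[] args) {
        List<String> reads = List.of("ACGT", "AGGT", "ATGA", "ACGC");
        double[][] f = perPositionComposition(reads, 4);
        for (int i = 0; i < f.length; i++) {
            System.out.printf("pos %d: A=%.2f C=%.2f G=%.2f T=%.2f%n",
                    i + 1, f[i][0], f[i][1], f[i][2], f[i][3]);
        }
    }
}
```

With real mRNA-Seq reads, a priming bias would show up as uneven fractions over the first few positions that flatten out further into the read.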

              Comment

              • antoniou
                Junior Member
                • Oct 2008
                • 7

                #52
                I just came across a reference to the following article in a different thread.

                It also attributes the biases to random priming.


                Eric

                Comment

                • simonandrews
                  Simon Andrews
                  • May 2009
                  • 870

                  #53
                  FastQC v0.4 released

                  I've just put FastQC v0.4 up on our website.

                  FastQC v0.4 introduces a new analysis module, an easier way to launch the program from the command line and a new output file, as well as fixing a few minor bugs.

                  The new analysis module is the sequence duplication level module. This is a complement to the existing overrepresented sequences module in that it looks at sequences which occur more than once in your data. The new module takes a more global view and says what proportion of all of your sequences occur once, twice, three times etc. In a diverse library most sequences should occur only once. A highly enriched library may have some duplication, but higher levels of duplication may indicate a problem, such as a PCR overamplification.
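As an illustration of the idea only (this is not the module's actual implementation, and the names are hypothetical), the sketch below counts how often each sequence occurs and then reports what fraction of the distinct sequences were seen once, twice, three times and so on. Whether the proportions should be taken over distinct sequences or over all reads is an assumption made here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example class, not part of FastQC.
public class DuplicationLevelSketch {

    // Maps each occurrence count to the fraction of distinct sequences seen that many times.
    static Map<Integer, Double> duplicationLevels(List<String> sequences) {
        // Count occurrences of every sequence.
        Map<String, Integer> counts = new HashMap<>();
        for (String s : sequences) {
            counts.merge(s, 1, Integer::sum);
        }

        // Count how many distinct sequences fall at each duplication level.
        Map<Integer, Integer> distinctPerLevel = new TreeMap<>();
        for (int c : counts.values()) {
            distinctPerLevel.merge(c, 1, Integer::sum);
        }

        // Normalise by the number of distinct sequences.
        Map<Integer, Double> result = new TreeMap<>();
        for (Map.Entry<Integer, Integer> e : distinctPerLevel.entrySet()) {
            result.put(e.getKey(), e.getValue() / (double) counts.size());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> reads = List.of("ACGT", "ACGT", "GGCC", "TTAA", "CCGG");
        System.out.println(duplicationLevels(reads)); // prints {1=0.75, 2=0.25}
    }
}
```

Holding every sequence in a map is fine for a toy input but would be memory hungry on a full lane of reads.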

                  In response to several requests we've also now introduced a new output file into the report. This is a text-based, tab-delimited file which includes all of the data shown in the graphs in the graphical report. This would allow people running pipelines to store the data generated by FastQC and analyse it systematically rather than just taking the pass/fail/warn summary, or reviewing the reports manually.

                  Finally, if you're running fastqc from the command line we've now included a 'fastqc' wrapper script which you can launch directly rather than having to construct a java launch command. You can still pass -Dxxx options through to the program, but for simple analyses you can now simply run:

                  fastqc [some files]

                  ..once you have included the FastQC install directory into your path. More details are in the install document.

                  You can get the new version from:



                  [If you don't see the new version of any page hit control+refresh to force our cache to update]

                  Comment

                  • NGSfan
                    Senior Member
                    • Apr 2009
                    • 181

                    #54
                    Fantastic! I really like the command line ability - really good for pipelines.

                    Also nice that you display the Quality score type (Illumina v#/Sanger) in your output - helps to sort out confusion quickly when going through older data, especially after all of Illumina's schizophrenic quality score changes.

                    Comment

                    • agc
                      Member
                      • May 2010
                      • 26

                      #55
                      I'd like to run FastQC on SOLiD reads. I saw that someone did this using solid2fastq. Is it possible to do it without running solid2fastq? IE, would it work with only the SOLiD 'quals' file?

                      EDIT: After running FastQC on SOLiD files converted to fastq files via solid2fastq, the results file says (under basic statistics):
                      File type Conventional base calls

                      Should it have recognized it as colorspace?

                      Thanks!
                      Last edited by agc; 06-20-2010, 03:59 AM.

                      Comment

                      • mard
                        Member
                        • Jan 2010
                        • 21

                        #56
                        Hi Simon,

                        Thanks for the new features in FastQC v0.4.
                        I just installed v0.4 but got the error below when running it on a fastq file (I had previously run v0.3 on this file with no issues.)

                        Processing sequence.fastq
                        Approx 5% complete for sequence.fastq
                        Exception in thread "AWT-AppKit" Exception in thread "Thread-3" java.lang.OutOfMemoryError: Java heap space

                        Comment

                        • simonandrews
                          Simon Andrews
                          • May 2009
                          • 870

                          #57
                          Originally posted by mard
                          Processing sequence.fastq
                          Approx 5% complete for sequence.fastq
                          Exception in thread "AWT-AppKit" Exception in thread "Thread-3" java.lang.OutOfMemoryError: Java heap space
                          The error is because the program ran out of memory. The new version will use a bit more memory than the previous version since it looks at more sequences for the overrepresented sequence module. I've tested it with up to four 20million+ files open at the same time though and it was OK.

                          Can you let me know the exact command you are using to launch the program. If you're using the full java command you need to ensure that you add the -Xmx250m option to allocate a larger than default memory block to the program. If you use the fastqc wrapper then this should be added automatically.
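If it is unclear whether an -Xmx setting is actually reaching the JVM, a small stand-alone check like the one below can help. It is a hypothetical helper, not part of FastQC. Run it with the same java options as the real command and it reports the heap limit the JVM sees.

```java
// Hypothetical helper class, not part of FastQC.
public class HeapLimitCheck {

    // Maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

For example, `java -Xmx250m HeapLimitCheck` should report a figure close to 250 MB; the value can be slightly lower than the requested size depending on the JVM.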

                          Comment

                          • simonandrews
                            Simon Andrews
                            • May 2009
                            • 870

                            #58
                            Originally posted by agc
                            I'd like to run FastQC on SOLiD reads. I saw that someone did this using solid2fastq. Is it possible to do it without running solid2fastq? IE, would it work with only the SOLiD 'quals' file?
                            It will work with colorspace fastq files - you don't need to convert to base calls. I don't work with SOLID data directly so I'm not sure whether this is produced directly by the pipeline or not. I'm happy to look at other alternatives for SOLID data, but the program is fairly tied to fastq format (ie needs to work with a sequence and an encoded quality string).

                            Originally posted by agc
                            EDIT: After running FastQC on SOLiD files converted to fastq files via solid2fastq, the results file says (under basic statistics):
                            File type Conventional base calls

                            Should it have recognized it as colorspace?
                            It depends on the conversion. If you look in the file you'll either see conventional base calls (something like GATCTCTAGATCTCT) or colorspace calls (something like G1324132431432434312). If you see colorspace calls and the report says conventional calls then can you send me the top few lines of the file and I can see why it's going wrong. It may be that your conversion program converted to base calls already though.

                            If FastQC gets the file type wrong it's normally pretty obvious, since most of the graphs will show very weird results.
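For anyone wanting to check a converted file themselves before sending lines over, the distinction can be sketched roughly as below. This is a simple heuristic written for illustration, not FastQC's own detection logic, and the names are hypothetical. It treats a sequence made only of A/C/G/T/N as conventional base calls, and a leading base followed by digits (or '.') as colorspace.

```java
// Hypothetical example class, not part of FastQC.
public class ReadEncodingSketch {

    enum Encoding { BASE_CALLS, COLORSPACE, UNKNOWN }

    // Guesses the encoding from a single sequence line of a fastq record.
    static Encoding guessEncoding(String seq) {
        if (seq == null || seq.isEmpty()) {
            return Encoding.UNKNOWN;
        }
        String s = seq.toUpperCase();
        if (s.matches("[ACGTN]+")) {
            return Encoding.BASE_CALLS;
        }
        // Assumption: one leading base, then colour digits; '.' marks a missing call.
        // Any digit is accepted so the illustrative string from the post also matches.
        if (s.matches("[ACGT][0-9.]+")) {
            return Encoding.COLORSPACE;
        }
        return Encoding.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(guessEncoding("GATCTCTAGATCTCT"));
        System.out.println(guessEncoding("G1324132431432434312"));
    }
}
```

Checking only the first few records of a file is usually enough for a sanity check like this.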

                            Comment

                            • mard
                              Member
                              • Jan 2010
                              • 21

                              #59
                              Originally posted by simonandrews
                              The error is because the program ran out of memory. The new version will use a bit more memory than the previous version since it looks at more sequences for the overrepresented sequence module. I've tested it with up to four 20million+ files open at the same time though and it was OK.

                              Can you let me know the exact command you are using to launch the program. If you're using the full java command you need to ensure that you add the -Xmx250m option to allocate a larger than default memory block to the program. If you use the fastqc wrapper then this should be added automatically.

                              Thanks for the quick reply Simon.

                              The command I'm using is:

                              Code:
                              java -Xmx250m -classpath /Tools/FastQC/ uk.ac.bbsrc.babraham.FastQC.FastQCApplication sequence.fastq
                              and the sequence.fastq file I'm running it on is 2.9Gb (~17million 75bp reads)

                              Comment

                              • simonandrews
                                Simon Andrews
                                • May 2009
                                • 870

                                #60
                                Originally posted by mard
                                Thanks for the quick reply Simon.

                                The command I'm using is:

                                Code:
                                java -Xmx250m -classpath /Tools/FastQC/ uk.ac.bbsrc.babraham.FastQC.FastQCApplication sequence.fastq
                                and the sequence.fastq file I'm running it on is 2.9Gb (~17million 75bp reads)
                                Maybe it's the longer sequence length which is causing the problem. Can you try changing the -Xmx250m to -Xmx500m and see if that works.

                                Comment
