**ScottC** · 06-02-2010, 03:35 PM

I'd be interested in that, and I'd be prepared to submit some data.

NGSfan: I'm mainly interested in benchmarking our system against others. Because I'm running the machine, I'm always interested in how well it's performing in comparison to other sequencing service providers!

**lparsons** · 06-03-2010, 08:23 AM

Excellent utility Simon. Thank you.

I'm running into what looks like an old bug, however. I'm using FASTQC version 0.3.1 on a SunOS 5.10 server and I'm getting a HeadlessException. Any tips on solving this?

Code:

Exception in thread "main" java.awt.HeadlessException: 
No X11 DISPLAY variable was set, but this program performed an operation which requires it.
        at sun.java2d.HeadlessGraphicsEnvironment.getDefaultScreenDevice(HeadlessGraphicsEnvironment.java:65)
        at javax.swing.RepaintManager.getVolatileOffscreenBuffer(RepaintManager.java:583)
        at javax.swing.JComponent.paintDoubleBuffered(JComponent.java:4911)
        at javax.swing.JComponent.paint(JComponent.java:996)
        at uk.ac.bbsrc.babraham.FastQC.Graphs.QualityBoxPlot.paint(QualityBoxPlot.java:81)
        at uk.ac.bbsrc.babraham.FastQC.Graphs.QualityBoxPlot.paint(QualityBoxPlot.java:75)
        at uk.ac.bbsrc.babraham.FastQC.Modules.PerBaseQualityScores.makeReport(PerBaseQualityScores.java:184)
        at uk.ac.bbsrc.babraham.FastQC.Report.HTMLReportArchive.<init>(HTMLReportArchive.java:63)
        at uk.ac.bbsrc.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:82)
        at uk.ac.bbsrc.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:28)
        at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:71)

**simonandrews** · 06-03-2010, 11:16 AM

Originally posted by lparsons View Post

I'm running into what looks like an old bug, however. I'm using FASTQC version 0.3.1 on a SunOS 5.10 server and I'm getting a HeadlessException. Any tips on solving this?

Code:

Exception in thread "main" java.awt.HeadlessException: 
No X11 DISPLAY variable was set, but this program performed an operation which requires it.
        at sun.java2d.HeadlessGraphicsEnvironment.getDefaultScreenDevice(HeadlessGraphicsEnvironment.java:65)

That's really strange. It's throwing a HeadlessException from within the HeadlessGraphicsEnvironment! That means the headless environment is being set correctly (not setting it was the original bug, which was fixed in an earlier revision).

At first glance this looks like it has to be a bug in the core Java classes, especially as it seems to be SunOS-specific.

As a test, can you try setting a DISPLAY environment variable and see if it then works? It may be a redundant check for something which isn't actually required.

**colindaven** · 06-04-2010, 05:33 AM

Nice work Simon, this is a simple and easy to use package.

**antoniou** · 06-07-2010, 07:07 AM

Originally posted by Thomas Doktor View Post

The qualities look fine so it's not an issue of bad base calling. I think you could be right that the cluster calling and/or sequencing chemistry could explain some of it. Could perhaps explain why certain sequences in the genome are less likely to be sequenced, we often see peaks and valleys in exons in our RNA-seq runs which are most likely explained by sequencing artefacts.

I have seen the same phenomenon, but only with our mRNA-Seq libraries. Our genomic libraries do not show any biases. Has anyone else experienced this? Could it be an artifact of the Illumina library preparation protocol, maybe at the fragmentation step?

**lletourn** · 06-07-2010, 07:26 AM

The Illumina RNA protocol uses random hexamers to prime cDNA synthesis from the RNA. The thing is, they are not 100% random, so the beginning of the reads looks skewed for base composition, but that's because of the priming.

For mapping it's no problem. For assembly it might confuse some assemblers. (When assembling I would trim the 5' end of the RNA reads; for mapping I wouldn't.)
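If you want to see the skew in your own reads, tabulating base composition per read position is enough. Here is a minimal Python sketch, assuming plain four-line FASTQ records. The names `base_composition` and `read_fastq_sequences` and the default of 15 positions are made up for illustration and have nothing to do with FastQC's own code:

```python
from collections import Counter


def base_composition(reads, max_pos=15):
    """Return a list of {base: fraction} dicts, one per read position.

    Only the first `max_pos` positions are examined (an arbitrary
    default), since the priming skew shows up at the start of the read.
    """
    counts = [Counter() for _ in range(max_pos)]
    for read in reads:
        for pos, base in enumerate(read[:max_pos]):
            counts[pos][base.upper()] += 1
    fractions = []
    for counter in counts:
        total = sum(counter.values())
        fractions.append(
            {base: n / total for base, n in counter.items()} if total else {}
        )
    return fractions


def read_fastq_sequences(path):
    """Yield the sequence line of each four-line FASTQ record."""
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:
                yield line.strip()
```

With an unbiased library the fractions should be roughly flat across positions; with random-hexamer-primed RNA libraries the first positions would be expected to differ from the rest of the read.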

**antoniou** · 06-07-2010, 08:01 AM

I just came across a reference to the following article in a different thread.

Biases in Illumina transcriptome sequencing caused by random hexamer priming - PubMed

http://www.ncbi.nlm.nih.gov/pubmed/20395217?dopt=Citation

Generation of cDNA using random hexamer priming induces biases in the nucleotide composition at the beginning of transcriptome sequencing reads from the Illumina Genome Analyzer. The bias is independent of organism and laboratory and impacts the uniformity of the reads along the transcriptome. We pr …

It also attributes the biases to random priming.

Eric

**simonandrews** · 06-18-2010, 02:46 AM

FastQC v0.4 released

I've just put FastQC v0.4 up on our website.

FastQC v0.4 introduces a new analysis module, an easier way to launch the program from the command line and a new output file, as well as fixing a few minor bugs.

The new analysis module is the sequence duplication level module. This is a complement to the existing overrepresented sequences module in that it looks at sequences which occur more than once in your data. The new module takes a more global view and says what proportion of all of your sequences occur once, twice, three times etc. In a diverse library most sequences should occur only once. A highly enriched library may have some duplication, but higher levels of duplication may indicate a problem, such as PCR overamplification.
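To make the idea concrete, here is a rough Python sketch of a duplication-level calculation. It is not the code FastQC uses: it holds every sequence in memory, and it assumes that "proportion of all of your sequences" means the fraction of reads at each duplication level. The function name `duplication_levels` is purely illustrative:

```python
from collections import Counter


def duplication_levels(sequences):
    """Map each occurrence count to the fraction of all reads at that level.

    A result of {1: 0.5, 2: 0.5} means half the reads are unique
    and half belong to sequences seen exactly twice.
    """
    per_sequence = Counter(sequences)
    total_reads = sum(per_sequence.values())
    if total_reads == 0:
        return {}
    reads_at_level = Counter()
    for count in per_sequence.values():
        # A sequence seen `count` times contributes `count` reads to that level.
        reads_at_level[count] += count
    return {level: n / total_reads for level, n in sorted(reads_at_level.items())}
```

In a diverse library nearly all of the weight would sit at level 1; a long tail at higher levels is the pattern that would suggest overamplification.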

In response to several requests we've also now introduced a new output file into the report. This is a text-based, tab-delimited file which includes all of the data shown in the graphs in the graphical report. This would allow people running pipelines to store the data generated by FastQC and analyse it systematically rather than just taking the pass/fail/warn summary, or reviewing the reports manually.
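For pipeline use, reading the tab-delimited file could look something like the Python sketch below. The file layout here is an assumption, not something taken from the v0.4 documentation: it supposes each module starts with a `>>Name<TAB>status` line, ends with `>>END_MODULE`, and marks column headers with `#`. Check a real report before relying on it. `parse_report` is a hypothetical helper name:

```python
def parse_report(text):
    """Split a tab-delimited report into {module: {"status", "header", "rows"}}.

    ASSUMED layout (not confirmed for v0.4): a module starts with
    '>>Name<TAB>status', ends with '>>END_MODULE', '#' lines are
    column headers, and all other lines are tab-separated data rows.
    """
    modules = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith(">>END_MODULE"):
            current = None
        elif line.startswith(">>"):
            name, _, status = line[2:].partition("\t")
            current = {"status": status, "header": [], "rows": []}
            modules[name] = current
        elif current is not None:
            if line.startswith("#"):
                current["header"] = line[1:].split("\t")
            else:
                current["rows"].append(line.split("\t"))
    return modules
```

Keeping the rows as plain lists of strings leaves it to the pipeline to decide which columns to convert to numbers and store.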

Finally, if you're running fastqc from the command line we've now included a 'fastqc' wrapper script which you can launch directly rather than having to construct a java launch command. You can still pass -Dxxx options through to the program, but for simple analyses you can now simply run:

fastqc [some files]

...once you have added the FastQC install directory to your path. More details are in the install document.

You can get the new version from:

http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

[If you don't see the new version of any page hit control+refresh to force our cache to update]

**NGSfan** · 06-18-2010, 10:16 AM

Fantastic! I really like the command line ability - really good for pipelines.

Also nice that you display the Quality score type (Illumina v#/Sanger) in your output - helps to sort out confusion quickly when going through older data, especially after all of Illumina's schizophrenic quality score changes.

**agc** · 06-20-2010, 02:09 AM

I'd like to run FastQC on SOLiD reads. I saw that someone did this using solid2fastq. Is it possible to do it without running solid2fastq? IE, would it work with only the SOLiD 'quals' file?

EDIT: After running FastQC on SOLiD files converted to fastq files via solid2fastq, the results file says (under basic statistics):
File type Conventional base calls

Should it have recognized it as colorspace?

Thanks!

**mard** · 06-20-2010, 05:47 PM

Hi Simon,

Thanks for the new features in FastQC v0.4.
I just installed v0.4 but got the error below when running it on a fastq file (I had previously run v0.3 on this file with no issues.)

Processing sequence.fastq
Approx 5% complete for sequence.fastq
Exception in thread "AWT-AppKit" Exception in thread "Thread-3" java.lang.OutOfMemoryError: Java heap space

**simonandrews** · 06-20-2010, 11:07 PM

Originally posted by mard View Post

Processing sequence.fastq
Approx 5% complete for sequence.fastq
Exception in thread "AWT-AppKit" Exception in thread "Thread-3" java.lang.OutOfMemoryError: Java heap space

The error is because the program ran out of memory. The new version will use a bit more memory than the previous version since it looks at more sequences for the overrepresented sequence module. I've tested it with up to four 20million+ files open at the same time though and it was OK.

Can you let me know the exact command you are using to launch the program? If you're using the full java command you need to ensure that you add the -Xmx250m option to allocate a larger than default memory block to the program. If you use the fastqc wrapper then this should be added automatically.

**simonandrews** · 06-20-2010, 11:16 PM

Originally posted by agc View Post

I'd like to run FastQC on SOLiD reads. I saw that someone did this using solid2fastq. Is it possible to do it without running solid2fastq? IE, would it work with only the SOLiD 'quals' file?

It will work with colorspace fastq files - you don't need to convert to base calls. I don't work with SOLiD data directly so I'm not sure whether this is produced directly by the pipeline or not. I'm happy to look at other alternatives for SOLiD data, but the program is fairly tied to fastq format (i.e. it needs to work with a sequence and an encoded quality string).

Originally posted by agc View Post

EDIT: After running FastQC on SOLiD files converted to fastq files via solid2fastq, the results file says (under basic statistics):
File type Conventional base calls

Should it have recognized it as colorspace?

It depends on the conversion. If you look in the file you'll either see conventional base calls (something like GATCTCTAGATCTCT) or colorspace calls (something like G1324132431432434312). If you see colorspace calls and the report says conventional calls then can you send me the top few lines of the file and I can see why it's going wrong. It may be that your conversion program converted to base calls already though.

If FastQC gets the file type wrong it's normally pretty obvious, since most of the graphs will show very weird results.
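A quick way to check what the conversion produced is to test the sequence line of the first record yourself. The Python sketch below is only a heuristic based on the two examples above, and it is not FastQC's own detection logic. The patterns and the name `guess_encoding` are illustrative:

```python
import re

# Illustrative patterns, not FastQC's own detection logic.
# Colorspace: one leading base followed by digits ('.' allowed for a missing call).
COLORSPACE = re.compile(r"^[ACGTN][0-9.]+$", re.IGNORECASE)
# Conventional base calls: letters only ('.' allowed for a missing call).
BASESPACE = re.compile(r"^[ACGTN.]+$", re.IGNORECASE)


def guess_encoding(sequence):
    """Classify one sequence line as 'colorspace', 'basespace' or 'unknown'."""
    sequence = sequence.strip()
    if COLORSPACE.match(sequence):
        return "colorspace"
    if BASESPACE.match(sequence):
        return "basespace"
    return "unknown"
```

Run it on the second line of the fastq file. If it says colorspace and the report says conventional base calls, that is the case worth sending in.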

**mard** · 06-20-2010, 11:20 PM

Originally posted by simonandrews View Post

The error is because the program ran out of memory. The new version will use a bit more memory than the previous version since it looks at more sequences for the overrepresented sequence module. I've tested it with up to four 20million+ files open at the same time though and it was OK.

Can you let me know the exact command you are using to launch the program? If you're using the full java command you need to ensure that you add the -Xmx250m option to allocate a larger than default memory block to the program. If you use the fastqc wrapper then this should be added automatically.

Thanks for the quick reply Simon.

The command I'm using is:

Code:

java -Xmx250m -classpath /Tools/FastQC/ uk.ac.bbsrc.babraham.FastQC.FastQCApplication sequence.fastq

and the sequence.fastq file I'm running it on is 2.9Gb (~17million 75bp reads)

**simonandrews** · 06-20-2010, 11:31 PM

Originally posted by mard View Post

Thanks for the quick reply Simon.

The command I'm using is:

Code:

java -Xmx250m -classpath /Tools/FastQC/ uk.ac.bbsrc.babraham.FastQC.FastQCApplication sequence.fastq

and the sequence.fastq file I'm running it on is 2.9Gb (~17million 75bp reads)

Maybe it's the longer sequence length which is causing the problem. Can you try changing the -Xmx250m to -Xmx500m and see if that works?
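If you launch FastQC from a script, it may help to keep the heap size as a parameter so it is easy to raise. This is a small Python sketch built only from the command quoted above. The helper name `fastqc_java_command` is hypothetical, the classpath default is simply the one from that command, and the right heap size for a given file is something you would have to find by trial:

```python
def fastqc_java_command(files, heap_mb=500, classpath="/Tools/FastQC/"):
    """Build the argv list for launching FastQC through java directly.

    `heap_mb` becomes the -Xmx value in megabytes; `classpath` should
    point at the FastQC install directory on your own system.
    """
    return [
        "java",
        f"-Xmx{heap_mb}m",
        "-classpath",
        classpath,
        "uk.ac.bbsrc.babraham.FastQC.FastQCApplication",
        *files,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which raises an error if java exits with a non-zero status.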
