I've just put up v0.4.3 on our website which fixes the sequence count problem.
find contaminant sequence
hello Simon,
I use FastQC to evaluate my sequence data.
The last part is the contaminant (overrepresented sequences) section.
Total Sequences 9265299
Sequence length 42
It looks like this:
{
>>Overrepresented sequences fail
#Sequence Count Percentage Possible Source
GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTAGATCGGAAG 119288 1.2874705932317998 Illumina Single End Apapter 2 (96% over 32bp)
GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAAAAAAAA 112538 1.2146181143209733 Illumina Single End Apapter 2 (100% over 33bp)
AATTCGAATATCTGCCGAATGCCGTGTGGACGTAAGCGTGAA 29127 0.3143665412200945 No Hit
GATCGGAAGAGCTGTATGCCGTCTTCTGCTTAGATCGGAAGA 24460 0.2639957976531572 No Hit
AATTCACAGGTGTTCTCCCGTATTGTTGACATGCCAGCGGGT 20305 0.21915104952360417 No Hit
AATTCCCCTTGATTGCAAGGGGAACGAAATAGACAGATCGCT 17190 0.18553097962623763 No Hit
}
How can I find these contaminant sequences in the full data set?
Should I use FastQC, a BioPerl module, or some other algorithm?
Is the quality of this data too poor to use for analysis?
Thank you very much
Comment
-
-
Originally posted by flower6991:
I use FastQC to evaluate my sequence data.
The last part is the contaminant (overrepresented sequences) section.

So this is saying that you have some adapter contamination in your sample. You've probably lost 5-10% of your sequences to this contamination, but there's no reason to think that the rest of it won't be usable.
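To put a rough number on the rows shown in the report, the counts can be summed against the total read count. This is only a sketch: the rows and the total are copied from the post above, the function names are made up for illustration, and treating every row that starts with `GATCGGAAGAGCT` as adapter-derived is an assumption (the fourth row is reported as "No Hit" but shares those first 13 bases with the two adapter hits).

```python
TOTAL_SEQUENCES = 9265299  # "Total Sequences" from the report

# (sequence, count) pairs copied from the overrepresented sequences table.
ROWS = [
    ("GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTAGATCGGAAG", 119288),
    ("GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAAAAAAAA", 112538),
    ("AATTCGAATATCTGCCGAATGCCGTGTGGACGTAAGCGTGAA", 29127),
    ("GATCGGAAGAGCTGTATGCCGTCTTCTGCTTAGATCGGAAGA", 24460),
    ("AATTCACAGGTGTTCTCCCGTATTGTTGACATGCCAGCGGGT", 20305),
    ("AATTCCCCTTGATTGCAAGGGGAACGAAATAGACAGATCGCT", 17190),
]

# Assumption: rows starting with these 13 bases are adapter-derived.
ADAPTER_START = "GATCGGAAGAGCT"


def percentage(count, total=TOTAL_SEQUENCES):
    """Share of all reads in percent (count / total * 100)."""
    return 100.0 * count / total


def listed_share(rows):
    """Percentage of all reads covered by every row in the table."""
    return percentage(sum(count for _, count in rows))


def adapter_like_share(rows, prefix=ADAPTER_START):
    """Percentage of all reads covered by rows that start with the prefix."""
    return percentage(sum(count for seq, count in rows if seq.startswith(prefix)))
```

With these six rows `listed_share(ROWS)` comes out at roughly 3.5% and `adapter_like_share(ROWS)` at roughly 2.8%. Only the rows quoted in the post are counted here, so the true figure for the whole run will differ.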
Originally posted by flower6991:
How can I find these contaminant sequences from all data?
use fastQC or bioperl module? or other algorithms?

FastQC is not intended to be a filter; it merely reports on the state of your data. There are plenty of other tools out there which you can use to remove these contaminants if you need to do that before running the rest of your analyses.

Originally posted by flower6991:
Is this data's quality too poor that we can not use it to analysis ?

There's nothing in this result to suggest that - it simply shows that the data is contaminated. You need to look at the rest of the results as well to assess the overall quality of your data.
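On the question of finding the adapter reads in the full data set: the idea can be sketched in a few lines of Python. This is not something FastQC does, and it is not a replacement for a dedicated trimmer such as cutadapt. It assumes a plain 4-line-per-record FASTQ file, uses exact matching only (no mismatches or indels), and takes as the adapter the 32 bases shared by the two adapter hits in the report. The function names and the `min_overlap` default are invented for the example.

```python
import io

# The 32 bases shared by the two "Illumina Single End" hits in the report.
ADAPTER = "GATCGGAAGAGCTCGTATGCCGTCTTCTGCTT"


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def find_adapter(seq, adapter=ADAPTER, min_overlap=12):
    """Return the index where the adapter starts in seq, or -1 if not found.

    Looks for the whole adapter anywhere in the read, then for an adapter
    prefix of at least min_overlap bases running off the 3' end of the read.
    """
    pos = seq.find(adapter)
    if pos != -1:
        return pos
    first = max(0, len(seq) - len(adapter) + 1)
    for start in range(first, len(seq) - min_overlap + 1):
        if adapter.startswith(seq[start:]):
            return start
    return -1


def scan(handle):
    """Count adapter-containing reads and return the reads with adapter cut off."""
    total = hits = 0
    trimmed = []
    for header, seq, qual in read_fastq(handle):
        total += 1
        pos = find_adapter(seq)
        if pos != -1:
            hits += 1
            seq, qual = seq[:pos], qual[:pos]
        trimmed.append((header, seq, qual))
    return total, hits, trimmed
```

Calling `scan(open("reads.fastq"))` (the file name is a placeholder) gives the number of reads, the number containing adapter, and the trimmed records. Reads that start with the adapter come back empty, so you would normally drop anything below a minimum length afterwards.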
FastQC output shouldn't be taken too literally. Just because you get a red cross against one or more tests doesn't necessarily mean that you should throw your data away. I can think of legitimate reasons why some data sets would fail every single one of the tests - and that's OK. What the program aims to do is to point things out to you ("Did you know that 3 sequences make up 50% of your data?" etc). Beyond that it's really up to you to decide what to do: treat the data as too poor to use, go ahead but bear the FastQC results in mind in your interpretation, or conclude that the warning is spurious for the type of data you're analysing.
For example - every one of our PhiX control lanes now fails QC as assessed by FastQC because the degree of sequence duplication is ridiculously high. This is both a correct and irrelevant result. In a supposedly diverse library this would indicate a real problem, but in a PhiX lane we expect that. You have to judge the results based on your knowledge of the experiment.
Comment
-
-
Fastqc: Version 0.5.0
When I run fastqc from its home directory ~/bin/FastQC, I get this error:
java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication
Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
java version "1.5.0_17"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_17-b04)
Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_17-b04, mixed mode)
Comment
-
-
Originally posted by fabrice:
java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication
Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication

This will be because you have an existing classpath defined and you need to add the new directory to it, rather than replacing it.
If you're running fastqc on a unix system from the command line it's much easier to use the wrapper script which is included in the distribution.
In your case you'd initially need to do:
chmod 755 ~/bin/FastQC/fastqc
..then in future you can do:
~/bin/FastQC/fastqc [your list of files]
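If you do want to build the java command yourself rather than use the wrapper, the "add to the classpath, don't replace it" point can be sketched as below. This is an illustration only, not what the wrapper script actually does: the helper names are hypothetical, and the `-Xmx250m` value and main class name are simply copied from the command quoted above.

```python
import os

# Main class name as it appears in the command quoted above.
MAIN_CLASS = "uk.ac.bbsrc.babraham.FastQC.FastQCApplication"


def extended_classpath(install_dir, existing=None):
    """Append install_dir to an existing classpath instead of replacing it.

    existing defaults to the CLASSPATH environment variable. The "~" is
    expanded here because java itself does not expand it.
    """
    if existing is None:
        existing = os.environ.get("CLASSPATH", "")
    entries = [entry for entry in existing.split(os.pathsep) if entry]
    install_dir = os.path.expanduser(install_dir)
    if install_dir not in entries:
        entries.append(install_dir)
    return os.pathsep.join(entries)


def java_command(install_dir, existing=None):
    """Build the argument list for launching FastQC with the extended classpath."""
    classpath = extended_classpath(install_dir, existing)
    return ["java", "-Xmx250m", "-classpath", classpath, MAIN_CLASS]
```

The resulting list can be handed to `subprocess.run(java_command("~/bin/FastQC"))`. The `-classpath` option overrides the CLASSPATH environment variable, which is why the existing entries are carried over explicitly.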
Comment
-
-
The fastqc script does not work from the command line.
On mac:
java -version
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02-279-10M3065)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01-279, mixed mode)
./fastqc aa.txt
Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
Caused by: java.lang.ClassNotFoundException: uk.ac.bbsrc.babraham.FastQC.FastQCApplication
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
On debian:
java -version
java version "1.5.0"
gij (GNU libgcj) version 4.3.2
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
./fastqc a.txt
Exception in thread "main" java.lang.NoClassDefFoundError: uk.ac.bbsrc.babraham.FastQC.FastQCApplication
at gnu.java.lang.MainThread.run(libgcj.so.90)
Caused by: java.lang.ClassNotFoundException: uk.ac.bbsrc.babraham.FastQC.FastQCApplication not found in gnu.gcj.runtime.SystemClassLoader{urls=[file:./,file:~/bin/FastQC/,file:~/bin/FastQC/], parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}
at java.net.URLClassLoader.findClass(libgcj.so.90)
at java.lang.ClassLoader.loadClass(libgcj.so.90)
at java.lang.ClassLoader.loadClass(libgcj.so.90)
at gnu.java.lang.MainThread.run(libgcj.so.90)
Comment
-
-
On Ubuntu:
java -version
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
Caused by: java.lang.ClassNotFoundException: uk.ac.bbsrc.babraham.FastQC.FastQCApplication
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
Could not find the main class: uk.ac.bbsrc.babraham.FastQC.FastQCApplication. Program will exit.
Comment
-
-
Originally posted by fabrice:
Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication

Could you by any chance have downloaded the source distribution instead of the compiled version? The errors are all saying that java can't find the initial class file, which it should be able to if the classpath is set correctly.
Can you look in uk/ac/bbsrc/babraham/FastQC/ and see if there is a file called FastQCApplication.class? If you see a file called FastQCApplication.java instead, then you've got the source files rather than the binaries.
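The same check can be scripted if you'd rather not look by hand. A small sketch, assuming the package directory layout named above sits directly under the install directory; `distribution_kind` is a made-up helper name.

```python
import os


def distribution_kind(install_dir):
    """Report whether install_dir holds compiled classes or only source files.

    Returns "compiled" if FastQCApplication.class is present, "source" if
    only FastQCApplication.java is present, and "unknown" otherwise.
    """
    package_dir = os.path.join(
        os.path.expanduser(install_dir), "uk", "ac", "bbsrc", "babraham", "FastQC"
    )
    if os.path.isfile(os.path.join(package_dir, "FastQCApplication.class")):
        return "compiled"
    if os.path.isfile(os.path.join(package_dir, "FastQCApplication.java")):
        return "source"
    return "unknown"
```

For example, `distribution_kind("~/bin/FastQC")` returning "source" would mean the source download was unpacked rather than the binaries.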
Comment
-
-
The files are:
Analysis FastQCApplication.java Graphs Modules Resources Sequence
Dialogs FastQCMenuBar.java Help Report Results Statistics
Comment
-
-
FastQC v0.5.1 has been released. This fixes a formatting bug in the text output and a bug in the %GC profile for runs containing reads >100bp.
We've also improved the fitting of the modelled curve to the %GC profile and have added a load more oligos to the contaminants file (thanks to Aaron Statham for providing these).
You can get the new version from:
http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/
[If you don't see the new version of any page hit shift+refresh to force our cache to update]
Comment
-
-
Originally posted by seq_GA:
Hi Simon,
May I know how to use the tool in linux environment? Thanks.

Instructions for installing and running the program on a variety of platforms are in the INSTALL.txt document which comes in the distribution. On linux there is a wrapper script which you can use to run the program, which is probably the easiest way to launch it.
Comment
-
-
Hi Simon,
Thanks for your response. I am trying to use this as part of the pipeline and hence didn't try it through win32 to access the linux server.
I tried as below and please let me know the details.
Code:
FastQC]$ chmod 755 fastqc
FastQC]$ ./fastqc
Exception in thread "main" java.awt.HeadlessException: No X11 DISPLAY variable was set, but this program performed an operation which requires it.
at java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:173)
at java.awt.Window.<init>(Window.java:437)
at java.awt.Frame.<init>(Frame.java:419)
at java.awt.Frame.<init>(Frame.java:384)
at javax.swing.JFrame.<init>(JFrame.java:180)
at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.<init>(FastQCApplication.java:256)
at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:91)
Thanks.
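The exception message itself names the cause: the program tried to open a window and no X11 DISPLAY variable was set in the session. In a pipeline, a guard like the following could catch that before launching anything graphical. It is only a sketch: `can_open_gui` is a hypothetical helper, and it checks that the variable is set, not that the X server behind it is actually reachable.

```python
import os


def can_open_gui(environ=None):
    """Return True if an X11 DISPLAY variable is set and non-empty."""
    env = os.environ if environ is None else environ
    return bool(env.get("DISPLAY"))
```

A pipeline step could call `can_open_gui()` and skip or fail early with a clear message instead of hitting the `HeadlessException` above.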
Comment
-