I've just put up v0.4.3 on our website which fixes the sequence count problem.
-
find contaminant sequence
Hello Simon,
I use FastQC to evaluate my sequence data.
The last part of the report is the contaminant (overrepresented sequences) section.
Total Sequences 9265299
Sequence length 42
It looks like this:
{
>>Overrepresented sequences fail
#Sequence Count Percentage Possible Source
GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTAGATCGGAAG 119288 1.2874705932317998 Illumina Single End Apapter 2 (96% over 32bp)
GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAAAAAAAA 112538 1.2146181143209733 Illumina Single End Apapter 2 (100% over 33bp)
AATTCGAATATCTGCCGAATGCCGTGTGGACGTAAGCGTGAA 29127 0.3143665412200945 No Hit
GATCGGAAGAGCTGTATGCCGTCTTCTGCTTAGATCGGAAGA 24460 0.2639957976531572 No Hit
AATTCACAGGTGTTCTCCCGTATTGTTGACATGCCAGCGGGT 20305 0.21915104952360417 No Hit
AATTCCCCTTGATTGCAAGGGGAACGAAATAGACAGATCGCT 17190 0.18553097962623763 No Hit
}
How can I find these contaminant sequences in the full data set?
Should I use FastQC, a BioPerl module, or some other algorithm?
Is the quality of this data too poor to use for analysis?
Thank you very much.
Comment
-
Originally posted by flower6991:
Is the quality of this data too poor to use for analysis?
FastQC output shouldn't be taken too literally. Just because you get a red cross against one or more tests doesn't necessarily mean that you should throw your data away. I can think of legitimate reasons why some data sets would fail every single one of the tests, and that's OK. What the program aims to do is to point things out to you ("Did you know that 3 sequences make up 50% of your data?" etc). Beyond that it's really up to you to decide what to do: conclude that the data is too poor to use, go ahead but bear the FastQC results in mind in your interpretation, or decide that the warning is spurious for the type of data you're analysing.
For example - every one of our PhiX control lanes now fails QC as assessed by FastQC because the degree of sequence duplication is ridiculously high. This is both a correct and irrelevant result. In a supposedly diverse library this would indicate a real problem, but in a PhiX lane we expect that. You have to judge the results based on your knowledge of the experiment.
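On the earlier question of finding these sequences in the full data set: one rough approach, offered here as a sketch and not as anything FastQC itself does, is to scan the FASTQ file for reads that start with a known adapter sequence. The function name and the file name reads.fastq are hypothetical, and the sketch assumes standard 4-line FASTQ records and exact matching only (no mismatches, no partial adapters).

```shell
# Count reads whose sequence starts with a given prefix.
# Prints: <matching reads> <total reads> <percentage of total>
# Assumes 4-line FASTQ records, so line 2 of each record is the sequence.
count_reads_with_prefix() {
    prefix="$1"
    fastq="$2"
    awk -v p="$prefix" '
        NR % 4 == 2 {
            total++
            # index() returns 1 when the read begins with the prefix
            if (index($0, p) == 1) matches++
        }
        END {
            pct = (total > 0) ? 100 * matches / total : 0
            printf "%d %d %.2f\n", matches, total, pct
        }
    ' "$fastq"
}
```

A call such as `count_reads_with_prefix GATCGGAAGAGCTCGTATGCCGTCTTCTGCTT reads.fastq` would report how many reads begin with the 32 bases shared by the first two overrepresented sequences in the report above. A dedicated adapter-trimming tool is likely a better choice for actually removing them, since it can handle mismatches and adapters that appear part-way through a read.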
Comment
-
FastQC version 0.5.0
When I run FastQC from its install directory, ~/bin/FastQC, I get this error:
java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication
Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
java version "1.5.0_17"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_17-b04)
Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_17-b04, mixed mode)
Comment
-
Originally posted by fabrice:
Fastqc: Version 0.5.0
When I run fastqc in the home directory ~/bin/FastQC, I got this error.
java -Xmx250m -classpath . uk.ac.bbsrc.babraham.FastQC.FastQCApplication
Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
If you're running fastqc on a unix system from the command line it's much easier to use the wrapper script which is included in the distribution.
In your case you'd initially need to do:
chmod 755 ~/bin/FastQC/fastqc
..then in future you can do:
~/bin/FastQC/fastqc [your list of files]
Comment
-
The fastqc script does not work from the command line.
On Mac:
java -version
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02-279-10M3065)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01-279, mixed mode)
./fastqc aa.txt
Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
Caused by: java.lang.ClassNotFoundException: uk.ac.bbsrc.babraham.FastQC.FastQCApplication
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
On Debian:
java -version
java version "1.5.0"
gij (GNU libgcj) version 4.3.2
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
./fastqc a.txt
Exception in thread "main" java.lang.NoClassDefFoundError: uk.ac.bbsrc.babraham.FastQC.FastQCApplication
at gnu.java.lang.MainThread.run(libgcj.so.90)
Caused by: java.lang.ClassNotFoundException: uk.ac.bbsrc.babraham.FastQC.FastQCApplication not found in gnu.gcj.runtime.SystemClassLoader{urls=[file:./,file:~/bin/FastQC/,file:~/bin/FastQC/], parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}
at java.net.URLClassLoader.findClass(libgcj.so.90)
at java.lang.ClassLoader.loadClass(libgcj.so.90)
at java.lang.ClassLoader.loadClass(libgcj.so.90)
at gnu.java.lang.MainThread.run(libgcj.so.90)
Originally posted by simonandrews:
This will be because you have an existing classpath defined and you need to add the new directory to it, rather than replacing it.
If you're running fastqc on a unix system from the command line it's much easier to use the wrapper script which is included in the distribution.
In your case you'd initially need to do:
chmod 755 ~/bin/FastQC/fastqc
..then in future you can do:
~/bin/FastQC/fastqc [your list of files]
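To illustrate the classpath point in the quoted reply, here is a minimal sketch, assuming a POSIX shell and the ~/bin/FastQC install location used in this thread. The helper name prepend_to_classpath is made up for the example.

```shell
# Build a classpath that adds a directory in front of any existing
# CLASSPATH value instead of replacing it.
prepend_to_classpath() {
    dir="$1"
    if [ -n "${CLASSPATH:-}" ]; then
        # An existing classpath is kept, with the new directory first
        printf '%s:%s\n' "$dir" "$CLASSPATH"
    else
        # Nothing defined yet, so the directory is the whole classpath
        printf '%s\n' "$dir"
    fi
}

CLASSPATH="$(prepend_to_classpath "$HOME/bin/FastQC")"
export CLASSPATH
```

With CLASSPATH exported like this, the java command would be run without the `-classpath .` option, because that option overrides the environment variable.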
Comment
-
On Ubuntu:
java -version
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
Caused by: java.lang.ClassNotFoundException: uk.ac.bbsrc.babraham.FastQC.FastQCApplication
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
Could not find the main class: uk.ac.bbsrc.babraham.FastQC.FastQCApplication. Program will exit.
Comment
-
Originally posted by fabrice:
Exception in thread "main" java.lang.NoClassDefFoundError: uk/ac/bbsrc/babraham/FastQC/FastQCApplication
Can you look in uk/ac/bbsrc/babraham/FastQC/ and check whether there is a file called FastQCApplication.class? If you see a file called FastQCApplication.java instead, then you've got the source files rather than the binaries.
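The same check can be scripted. This is only a sketch: the function name check_fastqc_install is hypothetical, and it looks only for the two file names mentioned above.

```shell
# Report whether a FastQC directory contains the compiled main class,
# only its source file, or neither.
check_fastqc_install() {
    pkg_dir="$1/uk/ac/bbsrc/babraham/FastQC"
    if [ -f "$pkg_dir/FastQCApplication.class" ]; then
        echo "compiled"
    elif [ -f "$pkg_dir/FastQCApplication.java" ]; then
        echo "source-only"
    else
        echo "missing"
    fi
}
```

For the install discussed here the call would be `check_fastqc_install ~/bin/FastQC`. A result of "source-only" would point to the source distribution having been downloaded.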
Comment
-
The files are:
Analysis FastQCApplication.java Graphs Modules Resources Sequence
Dialogs FastQCMenuBar.java Help Report Results Statistics
Originally posted by simonandrews:
Could you by any chance have downloaded the source distribution instead of the compiled version? The errors are all saying that java can't find the initial class file, which it should be able to if the classpath is set correctly.
Can you look in uk/ac/bbsrc/babraham/FastQC/ and see if you see a file called FastQCApplication.class. If you see a file called FastQCApplication.java then you've got the source files rather than the binaries.
Comment
-
FastQC v0.5.1 has been released. This fixes a formatting bug in the text output and a bug in the %GC profile for runs containing reads >100bp.
We've also improved the fitting of the modelled curve to the %GC profile and have added a load more oligos to the contaminants file (thanks to Aaron Statham for providing these).
You can get the new version from:
http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/
[If you don't see the new version of any page hit shift+refresh to force our cache to update]
Comment
-
Originally posted by seq_GA:
Hi Simon,
May I know how to use the tool in a Linux environment? Thanks.
Comment
-
Hi Simon,
Thanks for your response. I am trying to use this as part of a pipeline, so I didn't try accessing the Linux server through win32.
I tried the commands below. Please let me know how to proceed.
Code:
FastQC]$ chmod 755 fastqc
FastQC]$ ./fastqc
Exception in thread "main" java.awt.HeadlessException: No X11 DISPLAY variable was set, but this program performed an operation which requires it.
at java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:173)
at java.awt.Window.<init>(Window.java:437)
at java.awt.Frame.<init>(Frame.java:419)
at java.awt.Frame.<init>(Frame.java:384)
at javax.swing.JFrame.<init>(JFrame.java:180)
at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.<init>(FastQCApplication.java:256)
at uk.ac.bbsrc.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:91)
Comment