Hello everybody,
I have been trying Trimmomatic for the first time over the past few days on my paired-end 100 bp read data. I am facing a problem: the 4 output files it creates are empty.
I want to use Trimmomatic with these parameters:
java -jar /Trimmomatic-0.30/trimmomatic-0.30.jar PE GLE7.R1.fastq GLE7.R2.fastq GLE7_paired.R1.fastq GLE7_unpaired.R1.fastq GLE7_paired.R2.fastq GLE7_unpaired.R2.fastq SLIDINGWINDOW:6:20 LEADING:20 TRAILING:20 AVGQUAL:20 MINLEN:36
TrimmomaticPE: Started with arguments: GLE7.R1.fastq GLE7.R2.fastq GLE7_paired.R1.fastq GLE7_unpaired.R1.fastq GLE7_paired.R2.fastq GLE7_unpaired.R2.fastq SLIDINGWINDOW:6:20 LEADING:20 TRAILING:20 AVGQUAL:20 MINLEN:36
Input Read Pairs: 61314232 Both Surviving: 0 (0.00%) Forward Only Surviving: 0 (0.00%) Reverse Only Surviving: 0 (0.00%) Dropped: 61314232 (100.00%)
TrimmomaticPE: Completed successfully
I don't think that my thresholds are too drastic for my data. Attached is the FastQC output.
Do you have any idea of what is happening?
Thank you in advance,
Jane
Last edited by Jane M; 10-31-2013, 04:25 AM.
-
Originally posted by Jane M:
Hello everybody,
I have been trying Trimmomatic for the first time over the past few days on my paired-end 100 bp read data. I am facing a problem: the 4 output files it creates are empty.
I want to use Trimmomatic with these parameters:
If no quality score is specified, phred-64 is the default for historical reasons but is correct only for the older Illumina machines / pipeline versions. If you are using the Illumina HiSeq or MiSeq, you will need to add -phred33. This will be changed to an 'autodetected' quality score in a future version!
Code:
java -jar /Trimmomatic-0.30/trimmomatic-0.30.jar PE -phred33 GLE7.R1.fastq GLE7.R2.fastq GLE7_paired.R1.fastq GLE7_unpaired.R1.fastq GLE7_paired.R2.fastq GLE7_unpaired.R2.fastq SLIDINGWINDOW:6:20 LEADING:20 TRAILING:20 AVGQUAL:20 MINLEN:36
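To see why the wrong offset makes every read fail a Q20 threshold, here is a minimal Python sketch that decodes one quality string under both encodings. The function names and the example quality string are hypothetical and not part of Trimmomatic; the offset guess is only a heuristic that relies on characters below ASCII 59 never occurring in phred+64 data.

```python
def decode_quality(qual, offset):
    """Convert a FASTQ quality string to Phred scores for a given ASCII offset."""
    return [ord(ch) - offset for ch in qual]


def guess_offset(quality_strings):
    """Heuristically guess the ASCII offset (33 or 64).

    Characters below ASCII 59 (';') cannot occur in phred+64 data, so seeing
    one implies phred+33. High-quality-only input can still be misclassified.
    """
    lowest = min(ord(ch) for q in quality_strings for ch in q)
    return 33 if lowest < 59 else 64


# Hypothetical quality string in phred+33 encoding (illustration only).
example = "@@CFFFFFHHHHHJJJ#"
as_33 = decode_quality(example, 33)  # scores when read as phred+33
as_64 = decode_quality(example, 64)  # same characters misread as phred+64
```

Read as phred+64, none of the characters in this string reaches Q20, which matches the symptom of every read pair being dropped.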
-
Originally posted by kmcarr:
Trimmomatic, by default, assumes that FASTQ reads still use the very old ASCII phred+64 encoding for their Q-scores. Here is the quote from the Trimmomatic manual:
Using that default, Trimmomatic believes all your base calls are crap (<Q20). You have to add '-phred33' to your command line to change this default behavior. E.g.
Now that the main problem is solved, there are 3 details I would like to discuss:
- I read in this thread that Trimmomatic should be multi-threaded. I have not found an option to do that. Is it possible?
- I don't clearly understand what keepBothReads does. Could you please explain it to me in other words?
keepBothReads: After read-through has been detected by palindrome mode, and the adapter sequence removed, the reverse read contains the same sequence information as the forward read, albeit in reverse complement. For this reason, the default behaviour is to entirely drop the reverse read.
- Last point: I intend to keep both paired and unpaired reads. I will use tophat2 for alignment, which seems to deal with unpaired reads.
For gene fusion detection, I will use tophat2 --fusion-search. Do you know if it's a good idea to use unpaired reads for fusion detection? Should I set keepBothReads=true?
Thank you,
Jane
-
Originally posted by Jane M:
- I read in this thread that Trimmomatic should be multi-threaded. I have not found an option to do that. Is it possible?
Code:
java -jar <path to trimmomatic.jar> PE -threads <int> -phred33 <inputFiles> <outputFiles> <trimmerParameters>...
- I don't clearly understand what keepBothReads does. Could you please explain it to me in other words?
- Last point: I intend to keep both paired and unpaired reads. I will use tophat2 for alignment, which seems to deal with unpaired reads.
For gene fusion detection, I will use tophat2 --fusion-search. Do you know if it's a good idea to use unpaired reads for fusion detection? Should I set keepBothReads=true?
-
Originally posted by kmcarr:
Use the '-threads <int>' option described in the manual or in the command-line usage message shown when you run trimmomatic with just the '-h' (help) parameter.
Replace <int> with the number of threads you wish to use. In my experience trimmomatic does not scale much beyond 3-4 threads.
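A minimal sketch of assembling that command line with the thread count filled in. The helper name `build_trimmomatic_pe_command` and the jar path in the usage are hypothetical; only the `PE`, `-threads` and `-phred33` arguments and the output-file order (paired R1, unpaired R1, paired R2, unpaired R2) are taken from the commands in this thread.

```python
def build_trimmomatic_pe_command(jar_path, threads, inputs, outputs, steps):
    """Return the argument list for a paired-end Trimmomatic run.

    inputs  : [forward.fastq, reverse.fastq]
    outputs : [paired_R1, unpaired_R1, paired_R2, unpaired_R2]
    steps   : trimming steps such as "SLIDINGWINDOW:6:20"
    """
    if len(inputs) != 2 or len(outputs) != 4:
        raise ValueError("PE mode needs 2 input files and 4 output files")
    if threads < 1:
        raise ValueError("threads must be a positive integer")
    return ["java", "-jar", jar_path, "PE",
            "-threads", str(threads), "-phred33",
            *inputs, *outputs, *steps]


# Hypothetical jar path; file names reuse those from the first post.
cmd = build_trimmomatic_pe_command(
    "trimmomatic-0.30.jar", 4,
    ["GLE7.R1.fastq", "GLE7.R2.fastq"],
    ["GLE7_paired.R1.fastq", "GLE7_unpaired.R1.fastq",
     "GLE7_paired.R2.fastq", "GLE7_unpaired.R2.fastq"],
    ["SLIDINGWINDOW:6:20", "MINLEN:36"],
)
```

The resulting list could be handed to `subprocess.run(cmd, check=True)`; building it as a list avoids shell quoting problems with file names.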
Look at the figure on the top of page 5 of the Trimmomatic Manual. Part D of the figure shows the case where the insert (green) is shorter than the read length, so that the read runs through the insert into the Illumina adapter at the 3' end (red). In such a case, with PE reads, read #2 will completely overlap read #1, as its reverse complement. No additional sequence information is provided by read #2. Trimmomatic's default behavior is to keep read #1 (after trimming the adapter (red) portion) as a singleton and discard read #2, since it is simply redundant information. The '-keepBothReads' option changes the default so that read 1 and read 2 are both kept as paired reads.
I'm not familiar with Tophat fusion or how it would deal with the case of completely overlapping reads so I can't comment.
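The read-through case described above can be illustrated with a small Python sketch. It is a toy simulation rather than Trimmomatic's palindrome algorithm, and the insert and adapter sequences are placeholders chosen for illustration.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_pair(insert, adapter1, adapter2, read_length):
    """Simulate raw paired-end reads from one fragment.

    Read 1 covers the insert and then runs into adapter1; read 2 covers the
    reverse complement of the insert and then runs into adapter2.
    """
    read1 = (insert + adapter1)[:read_length]
    read2 = (reverse_complement(insert) + adapter2)[:read_length]
    return read1, read2


# Placeholder sequences (not real adapters); the insert is shorter than the read.
insert = "ACGGTTCAGT"
read1, read2 = simulate_pair(insert, "GGGGGGGGGG", "CCCCCCCCCC", read_length=16)

# Removing the adapter portion leaves only the insert in each read.
trimmed1 = read1[:len(insert)]
trimmed2 = read2[:len(insert)]
redundant = reverse_complement(trimmed2) == trimmed1
```

Here `redundant` is true: once the adapter is clipped, read 2 carries exactly the same bases as read 1, which is why it is dropped by default.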
-
Hi kmcarr,
I hope you could also help me out
I am having a similar problem here, no idea what is wrong with these reads
Exception processing reads: HISEQ2000:406:H0JYCADXX:2:1101:13467:8659_/1 and HISEQ2000:406:H0JYCADXX:2:1101:13467:8659_/2
java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 58
......
Cheers
-
Originally posted by luiscunhamx:
Hi kmcarr,
I hope you could also help me out
I am having a similar problem here, no idea what is wrong with these reads
Exception processing reads: HISEQ2000:406:H0JYCADXX:2:1101:13467:8659_/1 and HISEQ2000:406:H0JYCADXX:2:1101:13467:8659_/2
java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 58
......
Cheers
Could you tell me the command line used, including the version of trimmomatic, and also the remainder of the stack trace?
Tony.
-
Sorry Tony,
not sure why I did not post it in the first place
here goes the command:
java -classpath /media/scratch/sbilnc/appz/Trimmomatic-0.32/trimmomatic-0.32.jar org.usadellab.trimmomatic.TrimmomaticPE -phred33 w2_1_sufx.fastq w2_2_sufx.fastq trimmomatic_w1_1_sufx.fastq trimmomatic_w1_1_sufx_unpaired.fastq trimmomatic_w1_2_sufx.fastq trimmomatic_w1_2_sufx_unpaired.fastq ILLUMINACLIP:/media/scratch/sbilnc/appz/Trimmomatic-0.32/adapters/TruSeq2-PE.fa:2:30:10 LEADING:3 SLIDINGWINDOW:4:15 MINLEN:100
and you can find the rest in the attached text file.
Thanks in advance
-
Originally posted by luiscunhamx:
Sorry Tony,
not sure why I did not post it in the first place.
Can you also post the reads:
HISEQ2000:406:H0JYCADXX:2:1101:13467:8659_/1 and HISEQ2000:406:H0JYCADXX:2:1101:13467:8659_/2
from the input files? It seems they trigger something, and even though it looks like a bug in Trimmomatic (or at least a lack of graceful failure), I would like to know the trigger so I can handle it better or survive it.
Thanks,
Tony.
-
Hi Tony
Strangely the reads are these
@HISEQ2000:406:H0JYCADXX:2:1101:13467:8659_/1
CA
+
CC
and
@HISEQ2000:406:H0JYCADXX:2:1101:13467:8659_/2
G
+
@
I had no idea that my data contained this kind of error: the data were produced by a HiSeq 2500 rapid run, and since this is the raw data (with suffixes added), I expected all reads to be 150 bp. Nevertheless, do you think it would be a good idea to filter by length before feeding the data to Trimmomatic?
Thanks in advance for your attention,
Luis
-
Originally posted by luiscunhamx:
I had no idea that my data contained this kind of error: the data were produced by a HiSeq 2500 rapid run, and since this is the raw data (with suffixes added), I expected all reads to be 150 bp. Nevertheless, do you think it would be a good idea to filter by length before feeding the data to Trimmomatic?
In any case, Trimmomatic appears to break on such short read pairs; I guess I never tested such a scenario. You could try adding a MINLEN filter as the first step, to prevent the short reads from causing problems.
Thanks,
Tony.
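For the alternative raised in the question, filtering by length before running Trimmomatic, here is a minimal Python sketch that drops a pair when either mate is too short, so the two files stay in sync. It assumes plain 4-line FASTQ records in the same order in both files; the function names, the `pair1` record and the threshold are hypothetical, while the `pair2` sequences are the two short reads posted above.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def filter_pairs_by_length(r1_in, r2_in, r1_out, r2_out, min_len):
    """Write only pairs where both mates have at least min_len bases."""
    kept = dropped = 0
    for rec1, rec2 in zip(read_fastq(r1_in), read_fastq(r2_in)):
        if len(rec1[1]) >= min_len and len(rec2[1]) >= min_len:
            r1_out.write("\n".join(rec1) + "\n")
            r2_out.write("\n".join(rec2) + "\n")
            kept += 1
        else:
            dropped += 1
    return kept, dropped


# In-memory demo: one normal pair and one truncated pair.
r1 = io.StringIO("@pair1/1\nACGTACGT\n+\nIIIIIIII\n@pair2/1\nCA\n+\nCC\n")
r2 = io.StringIO("@pair1/2\nTTGCAAGC\n+\nIIIIIIII\n@pair2/2\nG\n+\n@\n")
out1, out2 = io.StringIO(), io.StringIO()
kept, dropped = filter_pairs_by_length(r1, r2, out1, out2, min_len=5)
```

With real data the four handles would be files opened with `open(...)`; dropping both mates together keeps the paired files aligned record for record.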