Oh - that's an intentional protection from overwriting files. Just delete the output file first or add the "overwrite" flag.
-
high contaminants
Thanks.
Input is being processed as unpaired
Input: 385043 reads 10781204 bases.
Contaminants: 341911 reads (88.80%) 9573508 bases (88.80%)
Result: 43132 reads (11.20%) 1207696 bases (11.20%)
What is the definition of contaminants? The number looks very high.
-
k=16 shows higher contaminants than k=26
zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bbduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_26.txt k=26 fbm
java -ea -Xmx1g -cp /home/zheng/Desktop/bbmap/current/ jgi.BBDukF -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_26.txt k=26 fbm
Executing jgi.BBDukF [-Xmx1g, in=probe48mix25fg_S7_L001_R2_001.fastq, ref=ngs13template.fasta, stats=probe48mix25fg_S7_L001_R2_001_26.txt, k=26, fbm]
No output stream specified. To write to stdout, please specify 'out=stdout.fq' or similar.
Initial:
Memory: free=237m, used=14m
Added 13 kmers; time: 0.023 seconds.
Memory: free=228m, used=23m
Input is being processed as unpaired
Input: 159642 reads 4469976 bases.
Contaminants: 130724 reads (81.89%) 3660272 bases (81.89%)
Result: 28918 reads (18.11%) 809704 bases (18.11%)
Time: 0.197 seconds.
Reads Processed: 159k 811.47k reads/sec
Bases Processed: 4469k 22.72m bases/sec
zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ ^C
zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
bduk.sh: command not found
zheng@zheng-XPS-8500:~/Desktop/bbmap/20140916ngs$ bbduk.sh -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
java -ea -Xmx1g -cp /home/zheng/Desktop/bbmap/current/ jgi.BBDukF -Xmx1g in=probe48mix25fg_S7_L001_R2_001.fastq ref=ngs13template.fasta stats=probe48mix25fg_S7_L001_R2_001_16.txt k=16 fbm
Executing jgi.BBDukF [-Xmx1g, in=probe48mix25fg_S7_L001_R2_001.fastq, ref=ngs13template.fasta, stats=probe48mix25fg_S7_L001_R2_001_16.txt, k=16, fbm]
No output stream specified. To write to stdout, please specify 'out=stdout.fq' or similar.
Initial:
Memory: free=237m, used=14m
Added 143 kmers; time: 0.028 seconds.
Memory: free=228m, used=23m
Input is being processed as unpaired
Input: 159642 reads 4469976 bases.
Contaminants: 151727 reads (95.04%) 4248356 bases (95.04%)
Result: 7915 reads (4.96%) 221620 bases (4.96%)
-
So... that's telling you that you are getting matches between the stuff in your input file (probe48mix25fg_S7_L001_R2_001.fastq) and your reference file (ngs13template.fasta). And a shorter kmer will always find more matches in the presence of error.
probe48mix25fg_S7_L001_R2_001_26.txt will contain a list of which reference sequences were seen, and how many times they were seen.
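To illustrate why a shorter kmer tolerates errors better, here is a minimal toy sketch in Python. It is not BBDuk's implementation: the sequences are made up, reverse complements are ignored, and it assumes a read counts as a match if it shares at least one exact kmer with the reference.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def read_matches(read, ref_kmers, k):
    """True if the read shares at least one exact k-mer with the reference set."""
    return any(read[i:i + k] in ref_kmers for i in range(len(read) - k + 1))


# Hypothetical 26 bp reference sequence (not taken from ngs13template.fasta).
ref = "ACGTTGCAAGCTTGGATCCGATTACA"

# Hypothetical 28 bp read: the reference with one substitution near its
# 3' end (index 22, T -> G), preceded by two extra bases.
read = "TT" + ref[:22] + "G" + ref[23:]

for k in (26, 16):
    hit = read_matches(read, kmers(ref, k), k)
    print(f"k={k}: {len(kmers(ref, k))} reference kmers, read matches: {hit}")
```

With k=26 the only reference kmer spans the error, so the read does not match; with k=16 several kmers lie entirely before the error, so it does. In this toy model a 26 bp sequence gives 1 kmer at k=26 and 11 at k=16, which would be consistent with the "Added 13 kmers" and "Added 143 kmers" lines above if the reference held 13 sequences of 26 bp each, though that is only a guess from the numbers.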
-
"And a shorter kmer will always find more matches in the presence of error."
Here k=16 shows fewer matching sequences than k=26.
for k=16
Input: 159642 reads 4469976 bases.
Contaminants: 151727 reads (95.04%) 4248356 bases (95.04%)
Result: 7915 reads (4.96%) 221620 bases (4.96%)
for k=26
Input: 159642 reads 4469976 bases.
Contaminants: 130724 reads (81.89%) 3660272 bases (81.89%)
Result: 28918 reads (18.11%) 809704 bases (18.11%)
-
In this case, the output is misleading... BBDuk assumes that the ref file is a file of contaminants because that's what I originally designed it for. So "Contaminants" actually means "things that match the reference", and "Result" means the reads that did not match. I may change the wording eventually.
In other words, 95.04% of the reads matched the reference for k=16 and 81.89% did for k=26.
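As a quick sanity check, the percentages can be recomputed from the read counts in the logs above and relabeled as matched and unmatched. This is just a small Python sketch; the function name and labels are my own, not BBDuk output.

```python
def summarize(total_reads, matched_reads):
    """Return (matched %, unmatched %) rounded to two decimals."""
    unmatched_reads = total_reads - matched_reads
    matched_pct = round(100.0 * matched_reads / total_reads, 2)
    unmatched_pct = round(100.0 * unmatched_reads / total_reads, 2)
    return matched_pct, unmatched_pct


# Read counts copied from the "Input" and "Contaminants" lines of the two runs:
# k -> (total reads, reads reported as "Contaminants").
runs = {16: (159642, 151727), 26: (159642, 130724)}

for k, (total, matched) in sorted(runs.items()):
    matched_pct, unmatched_pct = summarize(total, matched)
    print(f"k={k}: matched reference {matched_pct:.2f}%, "
          f"did not match {unmatched_pct:.2f}%")
```

The unmatched counts (7915 and 28918) are the same numbers reported on the "Result" lines.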