Is there a command to output the kmers of each sequence in a multifasta file?
-
Trouble parsing header
Dear BBMap team:
I tried to use filterbytile.sh to remove low-quality reads, but I got an error message saying there was trouble parsing the header. The description of the script, written by Brian Bushnell, says this can happen when the reads have been renamed (as in SRA) and asks users to contact him if the error occurs.
I downloaded the sequencing data (SRA) from NCBI and used fastq-dump to get the FASTQ files. Is there a solution to this?
Thank you very much!
Rose
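To see whether the headers in a fastq-dump output still carry the tile and coordinate fields, a quick check like the one below may help. This is a minimal Python sketch, not part of BBTools: it assumes the standard Illumina (Casava 1.8+) layout `@instrument:run:flowcell:lane:tile:x:y` at the start of the header, assumes plain 4-line FASTQ records, and the function names `parse_tile_coords` and `count_parsable_headers` are hypothetical.

```python
import re

# Assumed Illumina (Casava 1.8+) header layout, anchored at the line start:
# @instrument:run:flowcell:lane:tile:x:y [read:filtered:control:index]
ILLUMINA_HEADER = re.compile(
    r"^@?[^:\s]+:\d+:[^:\s]+:(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+)"
)


def parse_tile_coords(header):
    """Return (lane, tile, x, y) as ints, or None if the header lacks them."""
    match = ILLUMINA_HEADER.match(header.strip())
    if match is None:
        return None
    return tuple(int(match.group(name)) for name in ("lane", "tile", "x", "y"))


def count_parsable_headers(fastq_lines):
    """Return (parsable, total) header counts for an iterable of FASTQ lines.

    Assumes strict 4-line records with no blank lines.
    """
    parsable = total = 0
    for index, line in enumerate(fastq_lines):
        if index % 4 == 0:  # the header is the first line of each record
            total += 1
            if parse_tile_coords(line) is not None:
                parsable += 1
    return parsable, total
```

If most headers come back as unparsable (for example, headers that start with the SRA accession), the position information is not where this layout expects it, which is consistent with the renaming problem described above.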
-
BBsketch alltoall is incomplete
Can I ask a question about bbsketch?
I want to compute the pairwise ANI among many genomes (1000+).
I ran:
Code:
bbsketch.sh perfile genome_folder/*.fasta out=sketch.gz k=31,24 threads=16
comparesketch.sh alltoall sketch.gz k=31,24 prealloc=0.75 format=3 threads=16 out=table.tsv
Code:
Set threads to 16
Loading sketches.
Loaded 1157 sketches in 59.541 seconds.
Total Time: 59.784 seconds.
Code:
Set threads to 16
Loading sketches.
Executing kmer.KmerTableSet [ways=31, tabletype=10, prealloc=0.75]
Initial size set to 45218398
Initial:
Ways=31, initialSize=45218398, prefilter=f, prealloc=0.75
Memory: max=91268m, total=91268m, free=90848m, used=420m
3.713 seconds.
Indexed 2880884 unique and 10513099 total hashcodes.
Loaded 1157 sketches in 8.457 seconds.
Ran 1225005 comparisons in 9.344 seconds.
Total Time: 17.801 seconds.
- Genomes are highly similar.
#Query Ref ANI QSize RefSize QBases RBases QTaxID RTaxID KID WKID SSU
genome1.fasta genome2.fasta 94.223 1984118 1796930 1987598 1797650 -1 -1 24.952 27.523 .
- It is not simply due to the naming: I find neither "genome1 vs genome2" nor "genome2 vs genome1".
Any idea?
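To pin down exactly which comparisons are absent from the table, the expected pairs can be checked against the reported ones. The following is a minimal Python sketch under stated assumptions: the output is tab-separated, the query and reference names sit in the first two columns (as in the `#Query Ref ...` header shown above), and a pair counts as present in either order. The function names are hypothetical and not part of BBTools.

```python
import csv
from itertools import combinations


def reported_pairs(tsv_lines):
    """Collect unordered {query, ref} name pairs from the table rows."""
    pairs = set()
    for row in csv.reader(tsv_lines, delimiter="\t"):
        # Skip blank lines, the '#Query ...' header, and malformed rows.
        if not row or row[0].startswith("#") or len(row) < 2:
            continue
        pairs.add(frozenset((row[0], row[1])))
    return pairs


def missing_pairs(names, tsv_lines):
    """List the (a, b) pairs of names that never appear in the table."""
    seen = reported_pairs(tsv_lines)
    return [
        (a, b)
        for a, b in combinations(sorted(names), 2)
        if frozenset((a, b)) not in seen
    ]
```

Usage could look like `missing_pairs(genome_names, open("table.tsv"))`, where `genome_names` is the list of file names passed to bbsketch.sh. Comparing the missing pairs against their expected similarity may show whether the gaps are random or limited to certain genomes.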
-
I'm trying to use BBMap to find all perfect hits and all hits with an indel of length 1.
Code:
bbmapskimmer.sh in=kmer.fasta out=result.sam ambiguous=all strictmaxindel=1
Is there something that I am doing wrong?
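On the thread's original question of listing the k-mers of each sequence in a multi-FASTA file, one option that does not depend on any particular tool is a short script. The following is a minimal Python sketch, not a BBTools command: the function names are hypothetical, sequences are uppercased, only the first word of each header is kept as the name, and reverse complements are not considered.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def kmers(sequence, k):
    """Return every overlapping k-mer of the sequence, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmers_per_sequence(lines, k):
    """Map each sequence name to the list of its k-mers."""
    return {name: kmers(seq, k) for name, seq in read_fasta(lines)}
```

For a file on disk this could be called as `kmers_per_sequence(open("input.fasta"), 31)`, where `input.fasta` is a placeholder name. Sequences shorter than k yield an empty list.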