I ran BLAST on virus contigs and obtained hits to viruses (strains/isolates) in the database. I would like to calculate the percentage of reads that align to those viruses. Can someone guide me on how to do this?
-
You can download the virus sequences in FASTA format and use e.g. Bowtie2 to align the reads locally. To do that, you first need to build an index from the FASTA file. The log output of Bowtie2 tells you how many reads mapped.
After aligning the reads, you can use samtools to get some statistics (e.g. samtools idxstats).
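The per-virus percentages asked about above can be derived from the `samtools idxstats` output. What follows is a minimal sketch, assuming the usual four tab-separated columns (reference name, reference length, mapped count, unmapped count) and one alignment record per read. The function name `percent_mapped` is made up for illustration.

```shell
# Print, for each reference, the mapped read count and its share of all reads.
# Reads idxstats output on stdin; writes "name <TAB> mapped <TAB> percent".
percent_mapped() {
  LC_ALL=C awk -F'\t' '
    # column 3 = mapped reads, column 4 = unmapped reads
    { mapped[$1] = $3; total += $3 + $4 }
    END {
      if (total == 0) exit
      for (ref in mapped)
        if (ref != "*")
          printf "%s\t%d\t%.2f\n", ref, mapped[ref], 100 * mapped[ref] / total
    }' | sort
}

# Usage (requires samtools and an indexed BAM file):
#   samtools idxstats lib4seq.sorted.bam | percent_mapped
```

The denominator here is every read in the BAM file, including the unplaced reads on the `*` line, so the percentages are relative to all reads that went into the alignment.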
-
1. What format are your BLAST results in (HTML, XML, text)? You may be able to parse that result file if all you want to know is how many sequences hit a "virus".
2. If you ran BLAST locally, do you have a sequence file with all the "virus" sequences available? You can use that file as input for bowtie2 and follow the path @Michael.Ante suggested.
3. Are you comfortable using command-line (e.g. Linux) applications?
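If the BLAST results happen to be in tabular format, parsing them as suggested in point 1 could look like the sketch below. It assumes tab-separated output without comment lines (as produced by `-outfmt 6`), where column 1 is the query ID. The function name and the file name `blast_hits.tsv` are hypothetical.

```shell
# Count distinct query IDs (column 1) that have at least one hit.
count_queries_with_hits() {
  awk -F'\t' '!seen[$1]++ { n++ } END { print n + 0 }' "$1"
}

# Usage:
#   count_queries_with_hits blast_hits.tsv
```

Because the queries here are contigs, this counts contigs with a hit rather than reads. It does not give the read percentage on its own.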
-
Originally posted by kaps:
Thanks Michael,
I have not used Bowtie/samtools before. How do I start off?
You should have a look at the Bowtie2 homepage, where the manual explains in detail how the programs work. At the end of the manual there is a "Lambda phage example", which overlaps quite a bit with your problem. It also has a section on downstream SAMtools steps.
Cheers,
Michael
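As a rough starting point, the steps discussed in this thread can be strung together as below. This is only a sketch under stated assumptions: single-end reads, Bowtie2's default alignment mode, and samtools 1.3 or later (for `samtools sort -o` and direct SAM input). The function name and the file names in the usage line are hypothetical.

```shell
# Build a Bowtie2 index from the virus FASTA, align the reads, then sort and
# index the alignments and print per-reference read counts.
# Arguments: <virus FASTA> <reads FASTQ> <output prefix>
align_to_viruses() {
  ref=$1
  reads=$2
  prefix=$3
  # 1. Build the index; "$prefix.idx" is the index basename.
  bowtie2-build "$ref" "$prefix.idx" || return 1
  # 2. Align single-end reads; Bowtie2 prints its mapping summary to stderr.
  bowtie2 -x "$prefix.idx" -U "$reads" -S "$prefix.sam" || return 1
  # 3. Coordinate-sort and index the alignments.
  samtools sort -o "$prefix.sorted.bam" "$prefix.sam" || return 1
  samtools index "$prefix.sorted.bam" || return 1
  # 4. Report mapped/unmapped counts per reference sequence.
  samtools idxstats "$prefix.sorted.bam"
}

# Usage:
#   align_to_viruses viruses.fasta reads.fastq lib4seq
```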
-
Originally posted by Michael.Ante:
You can download the virus sequences in FASTA format and use e.g. Bowtie2 to align the reads locally. Therefore, you need to build an index first. The log output of Bowtie2 tells you how many reads mapped.
After aligning the reads, you can use samtools to get some statistics (e.g. samtools idxstats).
I am getting the following error:
samtools idxstats lib4seq.sorted.bam
[bam_idxstats] fail to load the index.
What could be the problem?
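`samtools idxstats` reads the BAM index, so this message usually means there is no `.bai` file next to the BAM. A likely fix, assuming the BAM is coordinate-sorted as its name suggests, is to run `samtools index` first. The helper name below is made up for illustration.

```shell
# Create the BAM index if it is missing, then print per-reference counts.
idxstats_with_index() {
  bam=$1
  # samtools index writes "$bam.bai" alongside the BAM file by default.
  [ -f "$bam.bai" ] || samtools index "$bam" || return 1
  samtools idxstats "$bam"
}

# Usage:
#   idxstats_with_index lib4seq.sorted.bam
```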
-
After getting the samtools idxstats output (the number of mapped vs. unmapped reads), is it possible to extract the reads that mapped from the raw read files? How is it done?
-
If you had used the "--un-conc" and "--al-conc" options (http://bowtie-bio.sourceforge.net/bo...output-options), the unmapped and mapped reads would have been written to separate files when you did the alignment.
1. You could repeat the bowtie2 alignment with the above parameters added to your original list (easier) OR
2. Identify the read IDs of the sequences that mapped and use a tool like seqtk to extract the mapped reads (e.g. seqtk subseq in.fq name.lst > out.fq)
Use @Michael.Ante's easy suggestion below.
Last edited by GenoMax; 05-12-2015, 04:34 AM.
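Option 2 could be sketched as follows, assuming single-end reads whose names in the BAM file match the FASTQ headers exactly. The function name and the output file suffixes are hypothetical. `samtools view -F 4` and `seqtk subseq` are used as described in this thread.

```shell
# Collect the names of mapped reads from a BAM file and pull those reads
# out of the original FASTQ file with seqtk.
# Arguments: <BAM> <original FASTQ> <output prefix>
extract_mapped_reads() {
  bam=$1
  fastq=$2
  out=$3
  # -F 4 skips records flagged as unmapped; SAM column 1 is the read name.
  samtools view -F 4 "$bam" | cut -f1 | sort -u > "$out.names.lst" || return 1
  seqtk subseq "$fastq" "$out.names.lst" > "$out.mapped.fq"
}

# Usage:
#   extract_mapped_reads lib4seq.sorted.bam reads.fastq lib4seq
```

With paired-end data the mates may carry `/1` and `/2` suffixes in the FASTQ files but not in the BAM file, in which case the name list would need adjusting before `seqtk subseq` finds them.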
-
You can use samtools view to extract the mapped/unmapped reads by filtering the 'unmapped' flag:
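Code:
# -F 4 excludes reads with the 'unmapped' flag set, keeping only mapped reads
samtools view -F 4 -bh lib4seq.sorted.bam > lib4seq.sorted.mapped.bam
# -f 4 requires the 'unmapped' flag, keeping only unmapped reads
samtools view -f 4 -bh lib4seq.sorted.bam > lib4seq.sorted.unmapped.bam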
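If the mapped reads are needed as FASTQ rather than BAM, the filtered file can be converted back. This is a small sketch assuming single-end reads and samtools 1.3 or later, which provides `samtools fastq`. The function name and the output file name are hypothetical.

```shell
# Convert a BAM file (here, the mapped-only one) back to FASTQ records.
# Arguments: <input BAM> <output FASTQ>
mapped_bam_to_fastq() {
  samtools fastq "$1" > "$2"
}

# Usage:
#   mapped_bam_to_fastq lib4seq.sorted.mapped.bam lib4seq.mapped.fq
```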