It seams to be a problem only when positions are "covered" by gaps in the read. Since it's calling N in the consensus if the position is not covered by a single read.
Stefan
Seqanswers Leaderboard Ad
Collapse
Announcement
Collapse
No announcement yet.
X
-
It seams they changed it a little bit since the pileup function, which also interprets "-" as "A", but didn't fully resolve the problem.
Cheers,
Stefan
Leave a comment:
-
Hi,
Yes, I checked and it calls the reference in case the position is not covered by any read.
I should probably report that as a bug.
Thanks!
Stefan
Leave a comment:
-
So if you replaced the "N" with "C" or "T" or "G" do you get those letters instead of "A"? That would pretty much confirm the nature of this bug / design limitation, and give a good clue about where in the code to start looking.
Reminds me of this tweet from Vince Buffalo,
@vsbuffalo: Any time you only handle A, T, C, or G, the members of IUPAC kill an innocent kitten. We've all done it. #bioinfo
Leave a comment:
-
Hi,
Replacing "N" by "-" seams to help. It looks like it's really reading N's as A's.
Thanks!
Stefan
Leave a comment:
-
I wonder if this is an issue with treating N in the reference as A, or down to the fact that the reads have an A before the gap? Do you have some other examples with different bases used instead of the N's?
Leave a comment:
-
samtools mpileup consensus calls A instead of a gap
Hello,
I'm using samtools with the following command to call the consensus of my mapping:
samtools mpileup -uf Ref SEQ | bcftools view -cg - | vcfutils.pl vcf2fq > SEQ.fq
The problem is that I have sequences that have 1 and some have 2 copies of a short repeat - the reference has 2. If I use the reference with 2 copies I end up with 2 copies in the consensus since it sometimes maps the reads to the first and sometimes to the second copy of the repeat. So I exchanged one copy in the reference with N's., but end up with A's instead of the gap in the consensus.
Example:
Ref AANNNTATTATTA
Read1 AA---TATTATTA
Read2 AA---TATTATTA
it calls AAAAATATTATTA instead of AATATTATTA.
I also get the following error if I use the reference with N's instead of the second copy (that I don't get if I use the reference with 2 copies):
Unknown command "vcf2fq".
I've had the same problem with the previous pileup function!
Any idea how to change that?
Cheers,
StefanTags: None
Latest Articles
Collapse
-
by seqadmin
Innovations in next-generation sequencing technologies and techniques are driving more precise and comprehensive exploration of complex biological systems. Current advancements include improved accessibility for long-read sequencing and significant progress in single-cell and 3D genomics. This article explores some of the most impactful developments in the field over the past year.
Long-Read Sequencing
Long-read sequencing has...-
Channel: Articles
12-02-2024, 01:49 PM -
-
by seqadmin
The field of immunogenetics explores how genetic variations influence immune responses and susceptibility to disease. In a recent SEQanswers webinar, Oscar Rodriguez, Ph.D., Postdoctoral Researcher at the University of Louisville, and Ruben MartÃnez Barricarte, Ph.D., Assistant Professor of Medicine at Vanderbilt University, shared recent advancements in immunogenetics. This article discusses their research on genetic variation in antibody loci, antibody production processes,...-
Channel: Articles
11-06-2024, 07:24 PM -
ad_right_rmr
Collapse
News
Collapse
Topics | Statistics | Last Post | ||
---|---|---|---|---|
Started by seqadmin, 12-02-2024, 09:29 AM
|
0 responses
139 views
0 likes
|
Last Post
by seqadmin
12-02-2024, 09:29 AM
|
||
Started by seqadmin, 12-02-2024, 09:06 AM
|
0 responses
49 views
0 likes
|
Last Post
by seqadmin
12-02-2024, 09:06 AM
|
||
Started by seqadmin, 12-02-2024, 08:03 AM
|
0 responses
38 views
0 likes
|
Last Post
by seqadmin
12-02-2024, 08:03 AM
|
||
Started by seqadmin, 11-22-2024, 07:36 AM
|
0 responses
69 views
0 likes
|
Last Post
by seqadmin
11-22-2024, 07:36 AM
|
Leave a comment: