A former lab member assembled a number of contigs from Illumina reads using SPAdes. While assessing the depth of coverage with Bowtie2, I noticed something odd: there are no Bowtie2 alignments (concordant or discordant) for the largest contig. Can anyone explain this?
-
Originally posted by sewellh: I am mapping the reads used to make the contigs back onto the contigs. I get alignments for all contigs except the largest one. I have BLASTed the contig and it is what I expect.
-
1) Have you verified that all of the contigs have unique, correctly-formatted names?
2) Does the contig look normal to you - high complexity, mainly defined bases, rather than e.g. a homopolymer or mostly-N sequence?
3) Is it possible that this contig is a replicate of other contigs? Even though it's bigger, it could be fully covered by other contigs. So, do any other contigs map to it?
4) Is it highly repetitive such that reads aligning to it might exceed the maximum number of allowed alignments?
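Checks 1 and 2 are easy to script. Here is a minimal sketch in plain Python (no Biopython), assuming an ordinary FASTA file; the function names `read_fasta` and `contig_report` are made up for illustration. It reports duplicated contig names and, for each contig, the fraction of N bases and the fraction taken up by the single most common base (a value near 1.0 would suggest a homopolymer).

```python
from collections import Counter


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the text up to the first whitespace, since that is
            # the part most aligners use as the reference name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def contig_report(lines):
    """Return (duplicate_names, per_contig_stats) for a FASTA input."""
    records = list(read_fasta(lines))
    name_counts = Counter(name for name, _ in records)
    duplicates = sorted(n for n, c in name_counts.items() if c > 1)
    stats = []
    for name, seq in records:
        seq = seq.upper()
        length = len(seq)
        n_fraction = seq.count("N") / length if length else 0.0
        # Share of the most frequent symbol; close to 1.0 means homopolymer-like.
        top_fraction = max(Counter(seq).values()) / length if length else 0.0
        stats.append({
            "name": name,
            "length": length,
            "n_fraction": n_fraction,
            "top_base_fraction": top_fraction,
        })
    return duplicates, stats
```

To run it on a real file, pass an open file handle, e.g. `contig_report(open("contigs.fasta"))`, where `contigs.fasta` is a placeholder for your assembly.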
-
Yes, the contigs have unique and correctly formatted names. But even when I map the reads to the single large contig alone, I get no matches.
It doesn't look like this contig is a replicate of others, but it does contain 3-4 copies of a ~500 nt fragment within itself. Does that mean the contig was assembled incorrectly, or is there something else I should do? I would expect that aligning the raw reads to just this single contig would give at least some alignments.
Update: Using the error-corrected FASTQ files produced by Hammer, I still get no Bowtie alignments to that contig.
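To put a number on how repetitive the contig is internally, exact repeated k-mers can be counted. The sketch below is only illustrative: the function names and the default `k=31` are arbitrary choices, and it looks at the forward strand only, so inverted (reverse-complement) repeats are not detected.

```python
from collections import defaultdict


def repeated_kmers(seq, k=31):
    """Map each k-mer that occurs more than once in seq to its start positions."""
    positions = defaultdict(list)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def repeat_fraction(seq, k=31):
    """Fraction of bases covered by at least one repeated k-mer (forward strand only)."""
    covered = [False] * len(seq)
    for starts in repeated_kmers(seq, k).values():
        for start in starts:
            for j in range(start, start + k):
                covered[j] = True
    return sum(covered) / len(seq) if seq else 0.0
```

A few copies of a ~500 nt fragment in a long contig should give only a small repeat fraction, which on its own would not explain a complete absence of alignments.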
-
Thanks so much for your help. I'll try out BBMap. If you are curious at all to look at the contigs, they're on JGI. The largest is:
>gi|589096183|gb|JARN01000011.1| Dehalococcoidia bacterium DscP2 WGS:JARN01:comHGAPfinal_Contig11_1.11, whole genome shotgun sequence