Does anyone else know how to get samtools tview to display the reference sequence at the top? I only see a string of N where I think the reference sequence should be.
-
Originally posted by Mansequencer:
Hi Rodney and Nilshomer,
I have a similar problem. I do add my reference FASTA file after the BAM file, but all I see are Ns. I don't even see the reads that have aligned.
Any suggestions?
Thanks guys.
-
Originally posted by nntao:
I have several issues with "samtools tview":
g : goto position lets me type a number, but the view does not go anywhere
b : won't toggle
Does anyone have a similar problem, or a solution?
-
samtools tview
Hello,
I am having a similar problem to the one mentioned here earlier. When I use the samtools tview command, I see a string of Ns where the reference genome should be. Only the first 80 bases are present and the rest are all Ns. I tried the g option with different chromosome numbers, but I still see only Ns.
Any help will be highly appreciated.
Here is the command I gave:
samtools tview MH_0001alignedreadssorted.bam danRer6.fa
-
If your reference sequence is correct and indexed, and your BAM file is too, you should see the reference sequence wherever reads aligned. I noticed that if there are no reads for a long stretch of bases, tview doesn't bother to show the reference there and shows only Ns. When I checked with another BAM file that had reads at that location, everything was OK. Before that, I thought the FASTA file was corrupted, which might very well be another reason for seeing only Ns.
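A quick way to rule out the "not indexed" case before opening tview is to look for the index files on disk. This is only a sketch: `check_indexes` is a made-up helper name, and it assumes the usual file names that `samtools faidx` and `samtools index` write next to their inputs.

```shell
# Hypothetical helper: report whether the index files tview relies on are present.
# Usage: check_indexes <sorted.bam> <reference.fa>
check_indexes() {
    bam=$1
    ref=$2
    status=0
    # samtools faidx writes <reference>.fai next to the FASTA file.
    if [ ! -f "$ref.fai" ]; then
        echo "missing FASTA index: run 'samtools faidx $ref'"
        status=1
    fi
    # samtools index writes <file>.bam.bai; some tools write <file>.bai instead.
    if [ ! -f "$bam.bai" ] && [ ! -f "${bam%.bam}.bai" ]; then
        echo "missing BAM index: run 'samtools index $bam'"
        status=1
    fi
    return $status
}
```

For the command posted above, that would be `check_indexes MH_0001alignedreadssorted.bam danRer6.fa`. It only checks that the files exist, not that they are up to date with the BAM or FASTA.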
-
Did you use 'danRer6.fa' in your alignment step? I think the issue here is that the reference used in the alignment step is not the same as 'danRer6.fa'
-
It's probably not a problem with tview, you can get the same issue in mpileup. Double-check that the names in your reference file are exactly the same as in your .sam file, or that there aren't any odd characters that might be confusing things.
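One way to do that double-check from the shell is to compare the sequence names in the FASTA headers with the `SN:` fields of the `@SQ` lines in the SAM header. This is a rough sketch, not an official samtools feature: `compare_ref_names` is a made-up helper, and it assumes a plain-text .sam file and that the sequence name is everything after `>` up to the first whitespace.

```shell
# Hypothetical helper: list reference names that appear in only one of the two files.
# Usage: compare_ref_names <reference.fa> <alignments.sam>
compare_ref_names() {
    fasta=$1
    sam=$2
    tmpdir=$(mktemp -d)
    # FASTA names: text after '>' up to the first whitespace.
    grep '^>' "$fasta" | sed 's/^>//' | awk '{print $1}' | sort -u > "$tmpdir/fasta.names"
    # SAM names: the SN: field of each @SQ header line.
    grep '^@SQ' "$sam" | tr '\t' '\n' | sed -n 's/^SN://p' | sort -u > "$tmpdir/sam.names"
    # comm -3 hides names common to both files.
    # Unindented lines are FASTA-only names; tab-indented lines are SAM-only names.
    comm -3 "$tmpdir/fasta.names" "$tmpdir/sam.names"
    rm -rf "$tmpdir"
}
```

If the two files agree, the function prints nothing. Any output points at a name that needs a closer look, including names with odd characters.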
-
Thank you VeBeKay!! That solved my problem perfectly :-) However, I had to delete it in the reference AND rerun the alignment and subsequent steps. Time-consuming, but 100% effective.
I am surprised this was an issue, as the FASTA reference was created using the samtools faidx region command, which uses a colon to define the region. I tried a faidx region extraction without the colon, and it does not give the correct output. Does anyone know how to use faidx to extract regions to a multi-FASTA file that does not have colons in the output? It seems silly for faidx to require them when tview malfunctions over them.
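As a possible workaround, the region can still be extracted with the colon syntax and the header lines renamed afterwards. The sketch below is only an illustration: `sanitize_fasta_headers` is a made-up helper, and replacing `:` with `_` is an arbitrary choice.

```shell
# Hypothetical helper: rewrite FASTA header lines so that ':' becomes '_'.
# Sequence lines are passed through unchanged.
# Usage: sanitize_fasta_headers <regions.fa> > <regions.clean.fa>
sanitize_fasta_headers() {
    sed '/^>/ s/:/_/g' "$1"
}
```

A header such as `>chr1:100-200` would come out as `>chr1_100-200`. The alignment still has to be rerun against the renamed reference so that the names in the BAM header match.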
-
I am also having this problem, where the first 80 bases are present, then mostly Ns except in a few positions where aligned reads appear. I am just running a test set to try to learn the process, so I took a reference sequence and fragmented it into 500 bp segments with 50 bp overlaps, then aligned the fragments (short_reads.fas) back to the reference.fas. Thus, the full reference should have fragments aligned to it. Here are the commands I used:
$ bwa index reference.fas
$ bwa aln reference.fas short_reads.fas >short_reads.sai
$ bwa samse reference.fas short_reads.sai short_reads.fas >short_reads.sam
$ samtools faidx reference.fas
$ samtools import reference.fas.fai short_reads.sam short_reads.bam
$ samtools sort short_reads.bam short_reads.srt
$ samtools index short_reads.srt.bam
$ samtools tview short_reads.srt.bam reference.fas
I made sure that there were no colons in any sequence titles, but I am not sure I have used all the commands correctly. I would really appreciate any help!