I assigned read group IDs (RGIDs) to four different sets of files, labeling them L1, L2, L3, and L4 to indicate which lane each came from. I was having issues after samtools merge, and I think the problem is that the header of the first in.bam overwrote the other headers (this is in the samtools manual). Basically, I now have orphan reads because they carry an RGID that isn't in the header. All the other RG tags are identical across the BAM files. I was told that I need to add the other IDs to the header of the merged file. So I guess the header needs to say something like RGID: L1, L2, L3, L4 for it to work properly, but I don't think you can use commas. I don't really know how to reformat text files, and on a scale of 1-10 in programming knowledge I would put myself at a 2. Can someone please give me example code for how to edit this header? I will attach the header of my existing file below if that helps.
-
I think the various RGs need to be on separate lines, one @RG line per read group.
You can use a text editor such as gedit, vim, or emacs to get the header you want; no "programming" is involved.
If you did want to program, using the Unix tools sed, grep, and cat in a script could solve the problem.
When you have the header you want, try the samtools "reheader" command.
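For anyone who does want to script it, here is a minimal sketch of the sed/grep/cat approach. It assumes the merged file's header has already been saved to a text file (for example with `samtools view -H`), that it contains exactly one @RG line with ID:L1, and that the other lanes differ only in their ID. The function name `add_read_groups` and the file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: build a new header that has one @RG line per lane.
# Assumption: the input header has a single @RG line, with ID:L1,
# and lanes L2-L4 should be identical to it except for the ID field.
add_read_groups() {
    header_in=$1
    header_out=$2

    # Keep every existing header line (including the L1 @RG line) unchanged.
    cat "$header_in" > "$header_out"

    # Copy the L1 @RG line once per extra lane, swapping only the ID.
    for lane in L2 L3 L4; do
        grep '^@RG' "$header_in" | sed "s/ID:L1/ID:$lane/" >> "$header_out"
    done
}

# Example use (hypothetical file names):
#   add_read_groups header.txt header.fixed.txt
```

Because sed only rewrites the ID field, the tab separators and the other tags on the @RG line are carried over as they are. Check the result in a text editor before using it.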
-
Originally posted by shawpa: I think I wasn't clear when I said I couldn't program very well. I actually meant I couldn't program as well as do what you are describing.
Right now your header has only L1:
Code:
@RG ID:L1 PL:ILLUMINA PU:D0DHVACXX LB:ryan SM:ryan
It needs one @RG line per read group, each on its own line (in the real header the fields are separated by tabs, not spaces):
Code:
@RG ID:L1 PL:ILLUMINA PU:D0DHVACXX LB:ryan SM:ryan1
@RG ID:L2 PL:ILLUMINA PU:D0DHVACXX LB:ryan SM:ryan2
@RG ID:L3 PL:ILLUMINA PU:D0DHVACXX LB:ryan SM:ryan3
@RG ID:L4 PL:ILLUMINA PU:D0DHVACXX LB:ryan SM:ryan4
The method Richard Finney suggests is to use samtools reheader:
Code:
samtools view -H file.bam > header.txt
...edit header.txt using any text editor...
samtools reheader header.txt file.bam > file.fixedheader.bam
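After reheadering, it may be worth checking that no reads are still orphaned. The sketch below is one possible check, not part of samtools: `missing_read_groups` is a hypothetical helper that takes a saved header file as its argument, reads alignment records on standard input, and prints any RG:Z: value that has no matching @RG ID in the header. It assumes an awk that accepts `-` as standard input, and a non-empty header file.

```shell
#!/bin/sh
# Sketch: list read-group IDs used by reads but not declared in the header.
# Usage (assumes samtools is installed; file names are examples):
#   samtools view -H file.fixedheader.bam > header.txt
#   samtools view file.fixedheader.bam | missing_read_groups header.txt
missing_read_groups() {
    header=$1
    awk -F '\t' '
        # First input: the header file. Remember each ID on an @RG line.
        NR == FNR {
            if ($1 == "@RG")
                for (i = 2; i <= NF; i++)
                    if ($i ~ /^ID:/) declared[substr($i, 4)] = 1
            next
        }
        # Second input: alignment records. Optional tags start at column 12.
        {
            for (i = 12; i <= NF; i++)
                if ($i ~ /^RG:Z:/) {
                    id = substr($i, 6)
                    if (!(id in declared) && !(id in reported)) {
                        reported[id] = 1
                        print id
                    }
                }
        }
    ' "$header" -
}
```

If the function prints nothing, every read group carried by the reads is declared in the header. If it prints L2, L3, or L4, those @RG lines are still missing.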