Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • BreakDancer empty cfg (no output from bam2cfg)

    Hi, everyone, I had spent days to do trouble-shooting on bam2cfg.pl,
    breakdancer-1.1_2011_02_21.zip. I had tried some ways to figure out the problem, but without any success. Owing to only one lib I have, I changed the header code of bam2cfg.pl,

    Edited code:

    open(BAM,"samtools view -h $fbam |") || die "unable to open $fbam\n";
    while(<BAM>){
    chomp;
    if(/^\@PG/){ #getting RG=>LIB mapping from the bam header
    my ($id)="bwa";
    my ($lib)="1.2k";
    my ($platform)="Illumina";
    my ($sample)="88LN";
    my ($insertsize)="1200";
    #if(defined $insertsize && $insertsize>0){
    #$lib=$sample . '_'. $lib;
    $libs{$lib}=1;
    $RGlib{$id}=$lib;
    $RGplatform{$id}=$platform;
    #}
    }

    this is the sorted bam file I have:

    @SQ SN:scaffold00001 LN:10500
    @SQ SN:scaffold00002 LN:2281
    @SQ SN:scaffold00003 LN:27085
    @SQ SN:scaffold00004 LN:12161
    @SQ SN:scaffold00005 LN:2206
    .
    .
    .
    @PG ID:bwa PN:bwa VN:0.5.9-r16
    .
    .
    .
    HWI-ST833:6:4:18265:54791#0 145 scaffold00001 207 0 50M scaffold00079 243097 0 TTTACTAAAACCGATTGGNCCCGGACAATATTTCGATGTGGGCCGGCCCT ggggggfggg\[]]][K]B[e`gggggfggWgggggegcggggggggggg XT:A:R NM:i:1 SM:i:0 AM:i:0 X0:i:2 X1:i:2 XM:i:1 XO:i:0 XG:i:0 MD:Z:18T31 XA:Z:scaffold00100,+12080,50M,1;scaffold00042,-1233,50M,2;scaffold00007,-133793,50M,2;
    .
    .
    .

    "perl bam2cfg.pl *bam > jun.cfg"

    At last: samtaols and perl run successfully, but no output from bam2cfg .

    Could anyone help me?

    Thanks in advance!
    Best
    Jun
    Last edited by zhongj; 12-14-2011, 01:50 AM.

  • #2
    Your code does not seem quite right. Are you sure the conditional tests on @PG (program)? The standard version of bam2cfg.pl tests on @RG (read group). Does your BAM header contain a @PG entry? If not, your changes are not getting executed because they are inside an if block that never evaluates to true. Can you just add an RG group to your BAM using samtools reheader? Otherwise, you should probably just set the $libs{$lib}, $RGlib{$id}, and $RGplatform{$id} before you open the BAM.
    David Dooling

    Comment


    • #3
      Hi, ddgenome, thanks for your suggestion, I will have a try.

      Comment


      • #4
        It still do not work. My Bam header do contain a @PG entry. Any one could help me? BreakDancer is too annoying...

        Originally posted by ddgenome View Post
        Your code does not seem quite right. Are you sure the conditional tests on @PG (program)? The standard version of bam2cfg.pl tests on @RG (read group). Does your BAM header contain a @PG entry? If not, your changes are not getting executed because they are inside an if block that never evaluates to true. Can you just add an RG group to your BAM using samtools reheader? Otherwise, you should probably just set the $libs{$lib}, $RGlib{$id}, and $RGplatform{$id} before you open the BAM.

        Comment


        • #5
          It really seems your BAM is not well-formed (at least for sequence analysis). Looking closer at the alignment record from the example BAM you provided, the read does not have an RG tag. So you are creating an RG record, but then no reads are associated with it.
          David Dooling

          Comment


          • #6
            Yes, you are right! But how can I fix my BAM file to adapt bam2cfg.pl? Manually add "@RG" ahead every alignment line? Like the following

            @R 1.2k GHWI-ST833:6:4:18265:54791#0 145 scaffold00001 207 0 50M scaffold00079 243097 0 TTTACTAAAACCGATTGGNCCCGGACAATATTTCGATGTGGGCCGGCCCT ggggggfggg\[]]][K]B[e`gggggfggWgggggegcggggggggggg XT:A:R NM:i:1 SM:i:0 AM:i:0 X0:i:2 X1:i:2 XM:i:1 XO:i:0 XG:i:0 MD:Z:18T31 XA:Z:scaffold00100,+12080,50M,1;scaffold00042,-1233,50M,2;scaffold00007,-133793,50M,2;


            or How to fix bam2cfg.pl to adapt my BAM file?

            Looking forward for your reply!
            Thanks.

            Originally posted by ddgenome View Post
            It really seems your BAM is not well-formed (at least for sequence analysis). Looking closer at the alignment record from the example BAM you provided, the read does not have an RG tag. So you are creating an RG record, but then no reads are associated with it.
            Last edited by zhongj; 12-16-2011, 12:52 AM.

            Comment


            • #7
              You need to add the @RG entry to the header and then add an RG field pointing to that @RG entry for each alignment record. See the SAM/BAM specification for more information.

              David Dooling

              Comment


              • #8
                Hello,

                did you solve the problem? If yes, could you share the solution, please? I have no outpufile as well and I don't know, what is wrong with my bam file.

                My header looks like the following lines:
                @HD VN:1.0 SO:coordinate
                @SQ SN:chr1 LN:249250621
                ....
                @SQ SN:chr22 LN:51304566
                @SQ SN:chrX LN:155270560
                @SQ SN:chrY LN:59373566
                @SQ SN:chrM LN:16571
                @RG ID:sample CN:xxx PL:ILLUMINA SM:sample LB:sample PI:50
                @PG ID:bwa PN:bwa VN:0.5.9-r16

                My reads look like
                readid1 1123 chrM 1 60 101M = 160 260 GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTTCGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACGCTGG @@@FDFADFADFD>FH@FHEIIIIIIIFGGGGIIC@6BGHGCEEHIIIGIIH(B7@CHGGCCEE<CE/909)2:4>8<>B9>@>3@@4>@BB>@?9<9@ RG:Z:sample XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:101
                HWI-ST758_0058:1:1104:17925:175341#CTTGTA 1187
                chrM 1 60 101M = 237 337 GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTTCGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACGCTGG @@??DF?D;=<DBG<?F9E:AFG>HF99E9CCF=@6)?0D>D9??BDGDFFB<F8=CGCFGA@HIH3=:/5;BB<A4@8&055933>@>+39&059@@055 RG:Z:sample XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:101
                HWI-ST758_0058:1:1105:15315:81776#CTTGTA 1123
                The output is of bam2cfg_2.pl is:
                Processing bam:test.bam
                Closing BAM file
                Send TERM signal for 5164
                samtools pid process 5164 is still there...
                invoking kill -9 on 5164 ...
                Closing samtools process : 5164
                Could anyone help me?
                Best regards
                Robby

                Comment


                • #9
                  sorry,I am still working on it

                  Comment


                  • #10
                    Dear zhongj,
                    thanks for your answer. My problem is already solved.

                    I used the following command
                    perl bam2cfg.pl -g -h sample.bam > sample.cfg
                    I received the message mentioned before, but that is not an error message. For the first computations it is enough to use a specified amount of reads. The output of that script should look like:

                    readgroup:xxx platform:ILLUMINA map:/path/to/bam/sample.bam readlen:xxx lib:xxx num:xxx lower:xxx upper:xxx mean:xxx std:xxx SWnormality:-xxx flag:0(xx.xx%)1(x.xx%)18(xx.xx%)2(x.xx%)32(x.xx%)4(x.xx%)8(x.xx%)30001 exe:samtools view
                    After that I proceeded with breakdancer-max1.2 and it worked. At least I received an output and try to analyze and understand that at present.

                    Best regards
                    Robby

                    Comment


                    • #11
                      Hi all,

                      I'm having a similar problem and I think it might be due to the way I'm originally aligning the file. I'm aligning with bowtie in the SAM format (-S), but the header is @PG and there are no @RG tags for the reads.

                      How are you aligning your files to get the Read Group Id?

                      Thanks,
                      Sara

                      Comment


                      • #12
                        Google is your friend.

                        David Dooling

                        Comment


                        • #13
                          Hi slink,

                          if your mappings are already finished you can add the read group with Picard as well. For breakdancer ist is also enough, if the RG is mentioned in the header. So you can simply modify the header manually.

                          Best regards
                          Robby

                          Comment


                          • #14
                            I'm going to bump this thread to mention to mention that BAMs merged with samtools merge will not produce output unless the -c flag is provided to combine @RG headers with colliding IDs - if appropriate.

                            Without the -c flag, samtools merge will create additional @RG IDs in the read mapping that may not be in the header, especially if a header is provided with the -h flag.

                            Comment

                            Latest Articles

                            Collapse

                            • seqadmin
                              Current Approaches to Protein Sequencing
                              by seqadmin


                              Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
                              04-04-2024, 04:25 PM
                            • seqadmin
                              Strategies for Sequencing Challenging Samples
                              by seqadmin


                              Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
                              03-22-2024, 06:39 AM

                            ad_right_rmr

                            Collapse

                            News

                            Collapse

                            Topics Statistics Last Post
                            Started by seqadmin, 04-11-2024, 12:08 PM
                            0 responses
                            34 views
                            0 likes
                            Last Post seqadmin  
                            Started by seqadmin, 04-10-2024, 10:19 PM
                            0 responses
                            37 views
                            0 likes
                            Last Post seqadmin  
                            Started by seqadmin, 04-10-2024, 09:21 AM
                            0 responses
                            31 views
                            0 likes
                            Last Post seqadmin  
                            Started by seqadmin, 04-04-2024, 09:00 AM
                            0 responses
                            53 views
                            0 likes
                            Last Post seqadmin  
                            Working...
                            X