
  • BWA sampe fails - invalid BAM binary header (

    Hi,
    I am trying to use BWA for analysing human whole exome PE data (Illumina HiSeq) for the first time.
    I downloaded the hg19 genome as a reference and indexed it with bwa index.
    Then I aligned each FASTQ file with bwa aln, using a command of the following form:

    nohup bwa aln ~/work_area/hg19_genome/hg19_chromFa.tar.gz SG1177_2_sequence.txt > SG1177_2_aln_sa.sai &

    From the ~10 GB FASTQ files I got ~1.3 GB .sai files.

    But when I try to use sampe to get the SAM files, it doesn't work, and the output file contains the following messages:
    [bam_header_read] invalid BAM binary header (this is not a BAM file).
    [bam_header_read] invalid BAM binary header (this is not a BAM file).
    [bwa_read_seq] the maximum barcode length is 63.
    followed by the list of chromosomes, and then the file ends.

    Here is the sampe command I used:
    nohup bwa sampe ~/work_area/hg19_genome/hg19_chromFa.tar.gz SG1177_1_aln_sa.sai SG1177_2_aln_sa.sai SG1177_1_sequence.txt SG1177_2_sequence.txt &

    What did I do wrong?

    Many thanks in advance!

  • #2
    Hello,

    According to the bwa documentation, the reference should be a FASTA file.


    http://bio-bwa.sourceforge.net/bwa.shtml


    Your reference is hg19_chromFa.tar.gz -- this is a compressed archive likely containing many files.

    Sébastien Boisvert
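
Reply #2 points at the reference argument. To rule that out, one option is to unpack the archive into a single FASTA file and index that instead. This is a minimal sketch, assuming a tar that supports `-O` (GNU tar and bsdtar do) and an archive that contains only FASTA files. The function name `build_reference` and the output name `hg19.fa` are hypothetical.

```shell
# Concatenate every file inside a .tar.gz archive into one FASTA file.
# build_reference is a hypothetical helper name, not part of bwa.
build_reference() {
    archive=$1   # e.g. hg19_chromFa.tar.gz
    fasta=$2     # e.g. hg19.fa (assumed output name)
    # -x extract, -z gunzip, -O write member contents to stdout, -f archive
    tar -xzOf "$archive" > "$fasta"
}

# Possible follow-up, not run here (paths are placeholders):
# build_reference hg19_chromFa.tar.gz hg19.fa
# bwa index -a bwtsw hg19.fa
# bwa aln hg19.fa SG1177_1_sequence.txt > SG1177_1_aln_sa.sai
```

The members are written in the order they appear in the archive, so the chromosome order of the resulting FASTA follows the tarball.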

    (Reply #2 quoted the original post by Lilach in full.)


    • #3
      solution

      Having solved the problem, I am writing the solution here in case somebody else searches for a solution to this problem:

      The problem was not caused by giving hg19_chromFa.tar.gz as the reference file. BWA can work with it.

      The cause of the problem was that I ran the bwa commands in the background (with nohup ... &) or via a PuTTY SSH terminal (which crashed in the middle of the run).
      This command should NOT be executed in the background, so the only way to run it on a remote computer is over VPN. Then run it without nohup ... &.
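
One detail in the sampe command from the first post: it has no `>` redirection. `bwa sampe` writes its SAM output to standard output, and `nohup` appends terminal-bound output to `nohup.out`, so alignments and diagnostic messages can end up mixed in one file. The thread does not confirm whether that is what happened here. The sketch below, assuming a POSIX shell, keeps the two streams apart. The names `run_detached` and `DETACHED_PID` and the output file names are hypothetical.

```shell
# Run a command so that it survives logout, with results and diagnostics
# written to separate files. run_detached and DETACHED_PID are
# hypothetical names used only for this illustration.
run_detached() {
    out_file=$1
    log_file=$2
    shift 2
    # stdout -> out_file, stderr -> log_file, no input from the terminal
    nohup "$@" > "$out_file" 2> "$log_file" < /dev/null &
    DETACHED_PID=$!
}

# Possible use with the read files from the first post (reference and
# output names are placeholders):
# run_detached SG1177.sam SG1177_sampe.log \
#     bwa sampe ref.fa SG1177_1_aln_sa.sai SG1177_2_aln_sa.sai \
#     SG1177_1_sequence.txt SG1177_2_sequence.txt
```

With both streams redirected explicitly, nothing is written to `nohup.out`, and the log file can be inspected on its own while the job runs.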



      • #4
        Yeah, incompatibility with nohup has been reported in other threads on SEQanswers.



        • #5
          The program screen is useful for that kind of workflow.

          You can start your jobs in a screen session and close your window. After that, you can SSH back into
          your box and reattach to the session. A single screen session can hold several windows too.

          Sébastien
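
The screen workflow described in #5 can be sketched as follows, assuming GNU screen is installed on the remote machine. The helper name `start_in_screen` and the session name are hypothetical.

```shell
# Start a long-running command inside a detached screen session.
# start_in_screen is a hypothetical helper name.
start_in_screen() {
    session=$1
    shift
    if ! command -v screen >/dev/null 2>&1; then
        echo "screen is not installed" >&2
        return 127
    fi
    # -d -m: create the session already detached; -S: name it for later
    screen -dmS "$session" "$@"
}

# Possible use (file names are placeholders):
# start_in_screen bwa_sampe sh -c \
#     'bwa sampe ref.fa r1.sai r2.sai r1.txt r2.txt > out.sam 2> sampe.log'
# screen -ls            # list running sessions
# screen -r bwa_sampe   # reattach; press Ctrl-a d to detach again
```

A session started this way ends when its command exits, so the redirections inside the quoted command are what preserve the output.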
