Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • AnushaC
    Member
    • Sep 2013
    • 78

    Sam to Bam using bowtie and using the shell script

    Hi ,
    I am writing a script to convert sam file to bam files
    I wrote script like this


    #!/bin/bash
    for files in *.sra
    do
    OUT=(${files%.sra}.bam)
    ref=($"/raid/references-and-indexes/hg19/bwa/hg19.fa")
    fastq-dump -Z $files|bwa aln -t 4 $ref -|bwa samse $ref -| samtools view -bS -o > $OUT
    done

    I got the following doubts
    bwa aln database.fasta short_read.fastq > aln_sa.sai

    bwa samse database.fasta aln_sa.sai short_read.fastq > aln.sam

    How would my script can understand that I am taking fastq dump output and giving to bwa samse . Can some body help me to improve my code and tell me where I am going wrong .
    And also I want output my file to specific directory instead of current directory like this
    I want all my bam files in /raid/development/anusha/python_test/fetal_brain_bam_bwa

    Thanks in advance ,
    Anusha.Ch
  • AnushaC
    Member
    • Sep 2013
    • 78

    #2
    Hi All ,
    Sorry I meant bwa alignment

    Thanks in advance,
    Anusha.Ch

    Comment

    • gringer
      David Eccles (gringer)
      • May 2011
      • 845

      #3
      Originally posted by AnushaC View Post
      fastq-dump -Z $files | bwa aln -t 4 $ref - | bwa samse $ref - | samtools view -bS -o > $OUT
      ...
      I want all my bam files in /raid/development/anusha/python_test/fetal_brain_bam_bwa
      I Don't really understand the specifics of your question, so will assume that the bit before the 'samtools' pipe produces a properly-formatted SAM file.

      I like sorting the BAM files at the same time (in preparation for indexing), because it can all be done in the one pipe, so you don't need to keep the unsorted BAM file around:

      Code:
      fastq-dump -Z $files | bwa aln -t 4 $ref -|bwa samse $ref  - | \
         samtools view -bS - | \
         samtools sort - /raid/development/anusha/python_test/fetal_brain_bam_bwa/$OUT
      Last edited by gringer; 10-31-2013, 03:32 PM.

      Comment

      • AnushaC
        Member
        • Sep 2013
        • 78

        #4
        Hi ,
        I wrote script similar this one like fastq-dump -Z SRR203400.copy.sra|bwa aln -t 7 /raid/references-and-indexes/hg19/bwa/hg19.fa -|bwa samse /raid/references-and-indfexes/hg19/bwa/hg19.fa - -|samtools view -bT /raid/references-and-indfexes/hg19/bwa/hg19.fa - > scrapSRR203400.copy.bam
        The sam file generating from bwa samse is not generating an indexed samfile so i think samtools view -bS option is not working . I tried use -T with reference file and -t option it did not work.

        Thanks,
        Anusha.Ch

        Comment

        • swbarnes2
          Senior Member
          • May 2008
          • 910

          #5
          .sam files are never indexed, only .bam files are. samtools view -bS will work fine with a sam file. I use it all the time.

          If you are going to pipe from .sam | .bam | sorted.bam, I would use -Shu instead of -bS. -b compresses the .bam file, -u leaves it uncompressed, which makes the sort faster. In some versions of samtools, samtools sort will complain that the EOF is missing, but it's nothing to worry about.

          Comment

          • AnushaC
            Member
            • Sep 2013
            • 78

            #6
            Hi ,
            I am getting multiple errors while running pipeline these are following errors
            Usage: bwa samse [-n max_occ] [-f out.sam] [-r RG_line] <prefix> <in.sai> <in.fq>
            [samopen] no @SQ lines in the header.
            [sam_read1] missing header? Abort!
            [bwa_aln] 17bp reads: max_diff = 2
            [bwa_aln] 38bp reads: max_diff = 3
            [bwa_aln] 64bp reads: max_diff = 4
            [bwa_aln] 93bp reads: max_diff = 5
            [bwa_aln] 124bp reads: max_diff = 6
            [bwa_aln] 157bp reads: max_diff = 7
            [bwa_aln] 190bp reads: max_diff = 8
            [bwa_aln] 225bp reads: max_diff = 9
            [bwa_aln_core] calculate SA coordinate... 71.48 sec
            [bwa_aln_core] write to the directory

            [samopen] no @SQ lines in the header.
            [sam_read1] missing header? Abort!
            [bwa_aln_core] calculate SA coordinate... 53.11 sec
            [bwa_aln_core] write to the disk... [bwa_aln] 17bp reads: max_diff = 2
            [bwa_aln] 38bp reads: max_diff = 3
            [bwa_aln] 64bp reads: max_diff = 4
            [bwa_aln] 93bp reads: max_diff = 5
            [bwa_aln] 124bp reads: max_diff = 6
            [bwa_aln] 157bp reads: max_diff = 7
            [bwa_aln] 190bp reads: max_diff = 8
            [bwa_aln] 225bp reads: max_diff = 9
            Usage: bwa samse [-n max_occ] [-f out.sam] [-r RG_line] <prefix> <in.sai> <in.fq>
            [samopen] no @SQ lines in the header.
            [sam_read1] missing header? Abort!
            [bwa_aln_core] calculate SA coordinate... 53.71 sec
            [bwa_aln_core] write to the disk... [bwa_aln] 17bp reads: max_diff = 2
            [bwa_aln] 38bp reads: max_diff = 3
            [bwa_aln] 64bp reads: max_diff = 4
            [bwa_aln] 93bp reads: max_diff = 5
            [bwa_aln] 124bp reads: max_diff = 6
            [bwa_aln] 157bp reads: max_diff = 7
            Usage: bwa samse [-n max_occ] [-f out.sam] [-r RG_line] <prefix> <in.sai> <in.fq>
            [bwa_aln] 190bp reads: max_diff = 8
            [bwa_aln] 225bp reads: max_diff = 9
            [samopen] no @SQ lines in the header.
            [sam_read1] missing header? Abort!

            Then i changed my code little bit like this

            fastq-dump -Z $files | bwa aln -t 8 $ref -|bwa samse $ref - | \samtools view -Shu - | \
            # samtools sort - >/raid/development/anusha/python_test/fetalbrain_bam_bwa/$OUT


            fastq-dump -Z $files | bwa aln -t 4 $ref -|bwa samse $ref - - | \
            samtools view -bS - >/raid/development/anusha/python_test/fetal_brainbam_bwa/$OUT
            done
            samtbam_bwa.copy.sh: line 3: 33602 Broken pipe fastq-dump -Z $files
            33603 | bwa aln -t 4 $ref -
            33604 Exit 1 | bwa samse $ref - -
            33605 Aborted | samtools view -bS - > /raid/development/anusha/python_test/fetal_brainbam_bwa/$OUT
            [bwa_aln] 17bp reads: max_diff = 2
            [bwa_aln] 38bp reads: max_diff = 3
            [bwa_aln] 64bp reads: max_diff = 4
            [bwa_aln] 93bp reads: max_diff = 5
            [bwa_aln] 124bp reads: max_diff = 6
            [bwa_aln] 157bp reads: max_diff = 7
            [bwa_aln] 190bp reads: max_diff = 8
            [bwa_aln] 225bp reads: max_diff = 9
            [bwa_aln_core] calculate SA coordinate... 49.96 sec
            [bwa_aln_core] write to the disk... [fclose] Bad file descriptor
            [samopen] SAM header is present: 93 sequences.
            [sam_read1] reference 'SN:chr18_gl000207_random LN:4262


            ' is recognized as '*'.
            [main_samview] truncated file.
            [bwa_aln] 17bp reads: max_diff = 2
            [bwa_aln] 38bp reads: max_diff = 3
            [bwa_aln] 64bp reads: max_diff = 4
            [bwa_aln] 93bp reads: max_diff = 5
            [bwa_aln] 124bp reads: max_diff = 6
            [bwa_aln] 157bp reads: max_diff = 7
            [bwa_aln] 190bp reads: max_diff = 8
            [bwa_aln] 225bp reads: max_diff = 9
            [bwa_aln_core] calculate SA coord

            In a weird way

            Comment

            • AnushaC
              Member
              • Sep 2013
              • 78

              #7
              bwa_aln_core] calculate SA coordinate... 47.61 sec
              [bwa_aln_core] write to the disk... [bwa_aln_core] convert to sequence coordinate... 4.23 sec
              [bwa_aln_core] refine gapped alignments... 0.55 sec
              [bwa_aln_core] print alignments... [samopen] SAM header is present: 93 sequences.
              [sam_read1] reference '0' is recognized as '*'.
              Parse warning at line 95: mapped sequence without CIGAR
              Parse error at line 95: sequence and quality are inconsistent
              Aborted

              Comment

              • AnushaC
                Member
                • Sep 2013
                • 78

                #8
                Hi ,
                Can anybody can explain me in detail samse requires three inputs one is fastq file ,fast file and fai file . While I am specifying the reference but I how it can take fastq dump out and take it as input since in the intermediate there is a sai file. So what do for piping part

                Comment

                Latest Articles

                Collapse

                • SEQadmin2
                  Nine Things a Sample Prep Scientist Thinks About Before Sequencing
                  by SEQadmin2


                  I’m not a sequencing expert. I’m a purification scientist who uses NGS to evaluate workflows my group develops. With this perspective, we think about the sample first and the NGS workflow second. The sequencer is an exceptionally honest reporter, but it can only report on what you give it, so whether you get clean, interpretable data from an NGS workflow is largely determined before you begin.


                  Here are nine questions we think about, in roughly the order they matter, before...
                  06-18-2026, 07:11 AM
                • SEQadmin2
                  From Collection to Sequencing: Why Sample Preparation and Preservation Define Sequencing Data
                  by SEQadmin2


                  Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.


                  The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
                  ...
                  06-02-2026, 10:05 AM

                ad_right_rmr

                Collapse

                News

                Collapse

                Topics Statistics Last Post
                Started by SEQadmin2, 06-17-2026, 06:09 AM
                0 responses
                30 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-09-2026, 11:58 AM
                0 responses
                85 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-05-2026, 10:09 AM
                0 responses
                96 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-04-2026, 08:59 AM
                0 responses
                94 views
                0 reactions
                Last Post SEQadmin2  
                Working...