  • multiple bam files one command

    I have multiple .bam files in a directory that I would like to run the following command on:

    Code:
    samtools view -H Input.bam | sed '/^@PG/d' | samtools reheader - Input.bam > Input_newheader.bam
    The command works great for one file, but I am trying to run it on all .bam files in a directory (/home/cmccabe/Desktop/NGS).

    Would something like the command below do this for multiple files, or is there a better way? Thank you.

    Code:
    find *bam | parallel 'samtools view -H Input.bam | sed '/^@PG/d' | samtools reheader - Input.bam > Input_newheader.bam'

  • #2
    Code:
    for f in *.bam
    do
        prefix=${f%%.bam}    # file name without the .bam suffix
        samtools view -H "$f" | sed '/^@PG/d' | samtools reheader - "$f" > "${prefix}_newheader.bam"
    done
    This will quickly become I/O-bound, so you're unlikely to see much benefit from parallel. BTW, the simplest way to do this with parallel is to write a shell script that takes a single file as input and run that script under parallel.
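
A minimal sketch of that approach, assuming GNU parallel is installed and samtools is on the PATH. The script name reheader_one.sh and the function name reheader_one are hypothetical; the pipeline itself is the one from the first post.

```shell
#!/usr/bin/env bash
# reheader_one.sh (hypothetical name): remove @PG lines from the header of ONE BAM file.

reheader_one() {
    local f="$1"
    local prefix="${f%.bam}"    # strip the trailing .bam; any directory part is kept
    # Same pipeline as in the first post, with the variables quoted.
    samtools view -H "$f" \
        | sed '/^@PG/d' \
        | samtools reheader - "$f" > "${prefix}_newheader.bam"
}

# Do the work only when a file name is passed on the command line.
if [ "$#" -gt 0 ]; then
    reheader_one "$1"
fi
```

After `chmod +x reheader_one.sh`, something like `parallel ./reheader_one.sh ::: *.bam` should start one job per file. Because the work is mostly disk reads and writes, capping the number of simultaneous jobs (for example with `-j 2`) may be worth trying.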



    • #3
      If all the bam files are stored on a separate drive (/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215), and the output gets redirected to (/home/cmccabe/Desktop/NGS/pool_I_090215), will the below work?

      Code:
      cd "/home/cmccabe/Desktop/NGS"   # path to samtools
      
      for f in *.bam
      do
      prefix=${f%%.bam}
      samtools view -H /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/$f | sed '/^@PG/d' | samtools reheader - /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/$f > /home/cmccabe/Desktop/NGS/pool_I_090215/${prefix}_newheader.bam
      done
      Thank you



      • #4
        "*.bam" is looking in the current working directory, so no, that won't work. If you said "/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam" then note that you would need to do something like:
        Code:
        bname=`basename $f`
        pref=${bname%%.bam}
        That would strip the path to the file appropriately. You would also then just use $f instead of /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/$f. For why that's the case, run:

        Code:
        for f in /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam
        do
        echo $f
        done
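
To see the difference without running samtools, the two expansions can be compared directly. This is only a small sketch; the file name is inferred from the error messages later in the thread, and nothing here touches the disk.

```shell
f=/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902.bam

prefix=${f%%.bam}          # suffix stripped, directory part still attached
bname=$(basename "$f")     # directory part stripped
pref=${bname%%.bam}        # both stripped

echo "$prefix"   # /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902
echo "$pref"     # IonXpress_008_150902

# basename can also drop the suffix in the same call:
echo "$(basename "$f" .bam)"   # IonXpress_008_150902
```

Pasting `$prefix` after an output directory therefore produces a path with the whole input directory embedded in it, while `$pref` yields just the file name.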



        • #5
          Makes sense, so I tried the command below. The lines beginning with "bash:" are the output, so why is it looking for those files? Thank you for your help.

          Code:
          cmccabe@HPZ640:~/Desktop/NGS$ for f in /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam; do prefix=${f%%.bam}; samtools view -H $f | sed '/^@PG/d' | samtools reheader - $f > /home/cmccabe/Desktop/NGS/pool_I_090215${prefix}_newheader.bam; done
          bash: /home/cmccabe/Desktop/NGS/pool_I_090215/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902_newheader.bam: No such file or directory
          bash: /home/cmccabe/Desktop/NGS/pool_I_090215/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_015_rawlib_newheader.bam: No such file or directory
          bash: /home/cmccabe/Desktop/NGS/pool_I_090215/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_016_150902_newheader.bam: No such file or directory



          • #6
            You forgot the basename line.



            • #7
              I thought I got it but I am a bit confused:

              Code:
              bname=`basename $f`
              pref=${bname%%.bam}
              for f in /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam
              do
              prefix=${f%%.bam}
              samtools view -H $f | sed '/^@PG/d' | samtools reheader - $f > /home/cmccabe/Desktop/NGS/pool_I_090215/${prefix}_newheader.bam
              done
              gives the below error:

              Code:
              cmccabe@HPZ640:~$ cd "/home/cmccabe/Desktop/NGS"
              cmccabe@HPZ640:~/Desktop/NGS$ bname=`basename $f`
              basename: missing operand
              Try 'basename --help' for more information.
              cmccabe@HPZ640:~/Desktop/NGS$ pref=${bname%%.bam}
              cmccabe@HPZ640:~/Desktop/NGS$ for f in /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam
              > do
              > prefix=${f%%.bam}
              > samtools view -H $f | sed '/^@PG/d' | samtools reheader - $f > /home/cmccabe/Desktop/NGS/pool_I_090215/${prefix}_newheader.bam
              > done
              bash: /home/cmccabe/Desktop/NGS/pool_I_090215//media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902_newheader.bam: No such file or directory
              bash: /home/cmccabe/Desktop/NGS/pool_I_090215//media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_015_rawlib_newheader.bam: No such file or directory
              bash: /home/cmccabe/Desktop/NGS/pool_I_090215//media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_016_150902_newheader.bam: No such file or directory
              Thank you.



              • #8
                When you run basename, $f hasn't yet been defined...

                I'm actually not going to explicitly tell you the solution to this, you should be able to figure it out given a bit of playing around and noting the error message.



                • #9
                  So here is the command:

                  Code:
                  for f in /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam ; do     bname=`basename $f`;     pref=${bname%%.bam};     samtools view -H $f | sed '/^@PG/d' | samtools reheader - $f > /home/cmccabe/Desktop/NGS/pool_I_090215/${pref}_newheader.bam; done
                  This seems to work great, thank you for your help.
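
For reuse on other pools, the same loop could be wrapped in a function. The following is only a sketch under a few assumptions: the function name reheader_dir and its two arguments (input and output directory) are hypothetical, the variables are quoted so paths containing spaces survive, and the output directory is created if it is missing.

```shell
#!/usr/bin/env bash
# Sketch only: reheader_dir and its arguments are hypothetical names.

reheader_dir() {
    local in_dir="$1" out_dir="$2" f pref
    mkdir -p "$out_dir" || return 1      # create the output directory if needed
    for f in "$in_dir"/*.bam; do
        [ -e "$f" ] || continue          # skip the literal pattern when nothing matches
        pref=$(basename "$f" .bam)       # file name without directory or .bam suffix
        samtools view -H "$f" \
            | sed '/^@PG/d' \
            | samtools reheader - "$f" > "$out_dir/${pref}_newheader.bam"
    done
}
```

With the paths from this thread, the call would look like `reheader_dir /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215 /home/cmccabe/Desktop/NGS/pool_I_090215`.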

