  • multiple bam files one command

    I have multiple .bam files in a directory that I would like to run the following command on:

    Code:
    samtools view -H Input.bam | sed '/^@PG/d' | samtools reheader - Input.bam > Input_newheader.bam
    The command works great for one file, but I am trying to run it on all the .bam files in a directory (/home/cmccabe/Desktop/NGS).

    Would something like the command below work for multiple files, or is there a better way? Thank you.

    Code:
    find *bam | parallel 'samtools view -H Input.bam | sed '/^@PG/d' | samtools reheader - Input.bam > Input_newheader.bam'

  • #2
    Code:
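    # Loop over every BAM file in the current working directory.
    for f in *.bam
    do
        prefix=${f%%.bam}   # file name without the .bam suffix
        samtools view -H "$f" | sed '/^@PG/d' | samtools reheader - "$f" > "${prefix}_newheader.bam"
    done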
    This is going to quickly become IO bound, so you're unlikely to see much benefit from parallel. BTW, to do this with parallel, the simplest method is to just write a shell script that takes a single file as input and use that with parallel.
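A minimal sketch of the single-file script mentioned above, assuming bash and samtools on the PATH. The script name `reheader_one.sh` and the function names are hypothetical, chosen only for illustration; the pipeline itself is the one from the original question.

```shell
#!/usr/bin/env bash
# reheader_one.sh (hypothetical name): strip @PG lines from the header of ONE BAM file.
# Usage: ./reheader_one.sh Input.bam   -> writes Input_newheader.bam next to the input

# Print the output file name for a given input BAM path.
newheader_name() {
    printf '%s_newheader.bam\n' "${1%.bam}"
}

# Re-header a single BAM file (requires samtools on the PATH).
reheader_one() {
    local in=$1
    samtools view -H "$in" | sed '/^@PG/d' | samtools reheader - "$in" > "$(newheader_name "$in")"
}

# Only do work when a file name was passed on the command line.
if [ "$#" -gt 0 ]; then
    reheader_one "$1"
fi
```

With GNU parallel this could then be run as `parallel ./reheader_one.sh ::: *.bam`, where each file name after `:::` becomes the script's single argument. As noted above, the job is likely to be I/O bound, so the speed-up may be small.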


    • #3
      If all the bam files are stored on a separate drive (/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215), and the output should be redirected to (/home/cmccabe/Desktop/NGS/pool_I_090215), will the below work?

      Code:
      cd "/home/cmccabe/Desktop/NGS"   # path to samtools
      
      for f in *.bam
      do
      prefix=${f%%.bam}
      samtools view -H /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/$f | sed '/^@PG/d' | samtools reheader - /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/$f > /home/cmccabe/Desktop/NGS/pool_I_090215/${prefix}_newheader.bam
      done
      Thank you


      • #4
        "*.bam" is looking in the current working directory, so no, that won't work. If you said "/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam" then note that you would need to do something like:
        Code:
        bname=`basename $f`
        pref=${bname%%.bam}
        That would strip the path to the file appropriately. You would also then just use $f instead of /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/$f. For why that's the case, run:

        Code:
        for f in /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam
        do
        echo $f
        done
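To see what the two lines above do, here is a small sketch applied to one full path. The file name is inferred from output that appears later in this thread and is used only as an example; nothing here touches the file itself.

```shell
# Example path; the glob in the loop above expands to full paths like this one.
f=/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902.bam

whole=${f%%.bam}         # suffix stripped, but the directory part is still attached
bname=$(basename "$f")   # directory part removed, .bam suffix still present
pref=${bname%%.bam}      # directory part and suffix both removed

echo "$whole"
echo "$bname"
echo "$pref"
```

Only `$pref` is safe to append to a different output directory, because `$whole` still begins with `/media/...`.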


        • #5
          Makes sense, so I tried the command below. The "bash:" lines after it are the output, so why is it looking for those files? Thank you for your help.

          Code:
          cmccabe@HPZ640:~/Desktop/NGS$ for f in /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam; do prefix=${f%%.bam}; samtools view -H $f | sed '/^@PG/d' | samtools reheader - $f > /home/cmccabe/Desktop/NGS/pool_I_090215${prefix}_newheader.bam; done
          bash: /home/cmccabe/Desktop/NGS/pool_I_090215/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902_newheader.bam: No such file or directory
          bash: /home/cmccabe/Desktop/NGS/pool_I_090215/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_015_rawlib_newheader.bam: No such file or directory
          bash: /home/cmccabe/Desktop/NGS/pool_I_090215/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_016_150902_newheader.bam: No such file or directory


          • #6
            You forgot the basename line.


            • #7
              I thought I got it but I am a bit confused:

              Code:
              bname=`basename $f`
              pref=${bname%%.bam}
              for f in /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam
              do
              prefix=${f%%.bam}
              samtools view -H $f | sed '/^@PG/d' | samtools reheader - $f > /home/cmccabe/Desktop/NGS/pool_I_090215/${prefix}_newheader.bam
              done
              gives the below error:

              Code:
              cmccabe@HPZ640:~$ cd "/home/cmccabe/Desktop/NGS"
              cmccabe@HPZ640:~/Desktop/NGS$ bname=`basename $f`
              basename: missing operand
              Try 'basename --help' for more information.
              cmccabe@HPZ640:~/Desktop/NGS$ pref=${bname%%.bam}
              cmccabe@HPZ640:~/Desktop/NGS$ for f in /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam
              > do
              > prefix=${f%%.bam}
              > samtools view -H $f | sed '/^@PG/d' | samtools reheader - $f > /home/cmccabe/Desktop/NGS/pool_I_090215/${prefix}_newheader.bam
              > done
              bash: /home/cmccabe/Desktop/NGS/pool_I_090215//media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902_newheader.bam: No such file or directory
              bash: /home/cmccabe/Desktop/NGS/pool_I_090215//media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_015_rawlib_newheader.bam: No such file or directory
              bash: /home/cmccabe/Desktop/NGS/pool_I_090215//media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_016_150902_newheader.bam: No such file or directory
              Thank you .


              • #8
                When you run basename, $f hasn't yet been defined...

                I'm actually not going to explicitly tell you the solution to this, you should be able to figure it out given a bit of playing around and noting the error message.


                • #9
                  So here is the command:

                  Code:
                  for f in /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/*.bam ; do     bname=`basename $f`;     pref=${bname%%.bam};     samtools view -H $f | sed '/^@PG/d' | samtools reheader - $f > /home/cmccabe/Desktop/NGS/pool_I_090215/${pref}_newheader.bam; done
                  This seems to work great, thank you for your help .
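For reuse, the same loop could be wrapped up roughly as follows. This is only a sketch under a few assumptions: the variable names `in_dir` and `out_dir` and the function names `out_path` and `reheader_all` are hypothetical, the output directory is assumed to exist already, and variables are quoted so that paths containing spaces do not break the loop.

```shell
# Input and output directories from this thread; override them before sourcing if needed.
in_dir=${in_dir:-/media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215}
out_dir=${out_dir:-/home/cmccabe/Desktop/NGS/pool_I_090215}

# Map an input BAM path ($1) to its output path under a directory ($2).
out_path() {
    local bname
    bname=$(basename "$1" .bam)   # drops the directory and the .bam suffix in one step
    printf '%s/%s_newheader.bam\n' "$2" "$bname"
}

# Re-header every BAM in $in_dir, writing the results to $out_dir.
reheader_all() {
    local f
    for f in "$in_dir"/*.bam; do
        [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
        samtools view -H "$f" | sed '/^@PG/d' | samtools reheader - "$f" > "$(out_path "$f" "$out_dir")"
    done
}
```

Calling `reheader_all` after sourcing these definitions runs the loop. `basename` is called inside the loop, after `$f` has been set, which is the point made in #8.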
