
  • Remove a part of a filename in a Bash loop

    I have many files named like this:

    lib01.GFBAG_UHAU.fastq.sam.bam
    lib02.ABABAB_ZU.fastq.sam.bam
    lib03.ZGAZG_IAUDH.fastq.sam.bam

    Many parts of the filenames are thus variable in length, although they are connected through the same type of punctuation (. or _).
    What I want to achieve is to remove the part .fastq.sam.bam from each filename when I loop through these files in Bash. How do I do this?

  • #2
    You want to split the string on a "." delimiter and then keep the first two parts. Or use ".fastq.sam.bam" as a delimiter, I suppose!

    To split a string in Bash on a single-character delimiter (or a set of single-character delimiters), set IFS (the Internal Field Separator) to the delimiter(s) and read the string into an array. To split on a multi-character delimiter, use parameter expansion.
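
    Both approaches can be sketched as follows. This is only an illustration, assuming Bash (arrays and `<<<` are not available in plain `sh`); the variable names `file`, `parts`, `stem`, and `before` are mine, and the sample name is taken from the question.

    ```shell
    # Sample name from the question.
    file="lib01.GFBAG_UHAU.fastq.sam.bam"

    # Single-character delimiter: split on "." into an array.
    # IFS is changed only for this one read command.
    IFS='.' read -r -a parts <<< "$file"

    # Keep the first two fields and rejoin them with a dot.
    stem="${parts[0]}.${parts[1]}"
    echo "$stem"

    # Multi-character delimiter: parameter expansion removes
    # ".fastq.sam.bam" and anything after its first occurrence.
    before="${file%%.fastq.sam.bam*}"
    echo "$before"
    ```

    Both `echo` lines print lib01.GFBAG_UHAU for this name. The array version relies on the name having exactly two dot-separated fields before the suffix, so the parameter expansion is probably the safer choice here.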
    Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com



    • #3
      Refer to https://unix.stackexchange.com/quest...ck-of-variable

      An example for changing extension from fastq.sam.bam to txt.

      for file in *.fastq.sam.bam
      do
      mv ${file%.fastq.sam.bam} ${file%.fastq.sam.bam}.txt
      done



      • #4
        Originally posted by ungsik
        Refer to https://unix.stackexchange.com/quest...ck-of-variable

        An example for changing extension from fastq.sam.bam to txt.

        for file in *.fastq.sam.bam
        do
        mv ${file%.fastq.sam.bam} ${file%.fastq.sam.bam}.txt
        done
        Don't you mean:

        for file in *.fastq.sam.bam
        do
        mv "$file" "${file%.fastq.sam.bam}.txt"
        done
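
        To try the corrected loop without risking real data, here is a minimal sketch. The scratch directory, the sample files created with `touch` (one with a space in its name), and the `[ -e ... ]` guard are my additions for illustration, not part of the posts above.

        ```shell
        # Work in a scratch directory so the sketch does not touch real data.
        workdir=$(mktemp -d)
        cd "$workdir" || exit 1
        touch "lib01.GFBAG_UHAU.fastq.sam.bam" "lib 02.ABABAB_ZU.fastq.sam.bam"

        for file in *.fastq.sam.bam; do
            # If nothing matched, the pattern stays literal; skip it.
            [ -e "$file" ] || continue
            # Quotes keep names with spaces intact; -- ends option parsing.
            mv -- "$file" "${file%.fastq.sam.bam}.txt"
        done

        ls
        ```

        Replacing `mv` with `echo mv` gives a dry run that only prints the commands it would execute.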

        --
        Phillip



        • #5
          Using BASH parameter expansion:

          Code:
          for i in *.fastq.sam.bam; do mv "$i" "${i%.fastq.sam.bam}"; done
          Which is pretty fun: "%" more or less means "clip the shortest match of the pattern that follows from the very end of the value stored in variable $i." "#" does the analogous thing, but clips from the very front.

          But "%%" does a "greedy" removal: it clips the longest match of the pattern that follows it. So:

          Code:
          i=lib01.GFBAG_UHAU.fastq.sam.bam.fastq.sam.bam.fastq.sam.bam
          echo ${i%.fastq.sam.bam*}
          will produce:
          Code:
          lib01.GFBAG_UHAU.fastq.sam.bam.fastq.sam.bam
          whereas:

          Code:
          i=lib01.GFBAG_UHAU.fastq.sam.bam.fastq.sam.bam.fastq.sam.bam
          echo ${i%%.fastq.sam.bam*}
          will produce:
          Code:
          lib01.GFBAG_UHAU
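
          The front-clipping counterparts behave the same way. A small sketch, where the variable names `short_front` and `long_front` are only illustrative:

          ```shell
          i=lib01.GFBAG_UHAU.fastq.sam.bam

          # "#" removes the shortest prefix matching "*." (up to the first dot).
          short_front="${i#*.}"
          echo "$short_front"   # GFBAG_UHAU.fastq.sam.bam

          # "##" removes the longest prefix matching "*." (up to the last dot).
          long_front="${i##*.}"
          echo "$long_front"    # bam
          ```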
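          If you can run Perl, then finding the "rename.pl" script might be less arcane than deploying your Bash powers. Note that the dots need escaping, since an unescaped "." in a regular expression matches any character.

          rename.pl 's/\.fastq\.sam\.bam$//' *.fastq.sam.bam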

          Find rename.pl here:


          --
          Phillip


            • #7
              A range of options exists for munging the pathnames.

              The approach I would use might well depend on what else I am going to do in the loop.

              FWIW:

              [basename](https://linux.die.net/man/1/basename) can be used to remove a suffix of a filename.
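
              A minimal sketch of the basename approach, assuming a hypothetical path (`/data/run1/...` is made up for illustration):

              ```shell
              # Hypothetical path; only the file name comes from the question.
              path="/data/run1/lib01.GFBAG_UHAU.fastq.sam.bam"

              # basename drops the directory part and, when given a second
              # argument, also strips that suffix from the end of the name.
              stem=$(basename "$path" .fastq.sam.bam)
              echo "$stem"   # lib01.GFBAG_UHAU
              ```

              Unlike `${path%.fastq.sam.bam}`, this also removes the leading directories, which may or may not be what you want inside the loop.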

              [Shell Parameter Expansion](https://www.gnu.org/software/bash/ma...Expansion.html) can be used to strip or replace either suffixes or prefixes of pathnames stored in variables.

              [GNU parallel](https://www.gnu.org/software/parallel/) can in effect replace your Bash looping construct, and has simple syntax to refer to the basename of a file or directory, including `{=perl expression=}` to munge the pathname any way you like. It has many great features and is well worth exploring and keeping in your toolbelt.

              [rename](https://www.computerhope.com/unix/rename.htm) is very useful for batch renaming of files using regular expressions (if that is all you need to do).
