Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • musta1234
    Member
    • Jun 2013
    • 10

    Help with While []... done in bash

    Hello all!! I need help converting tab to fasta... I am a newbie and know only a little bash scripting. I need to convert a tab delimited SNP file into either a single fasta file or a multiple fasta files for each column using the first line as identifier.

    The closest I got was a script that generates the required .fasta files but enters a loop and can only be stopped by ctrl-C.


    #!/bin/bash


    echo ">" > carat.txt

    counter=1
    #My tab file has 64 columns

    while : [ $counter -lt 64]
    do

    less <SNP.txt |awk "{print$"$counter"}"| cat carat.txt - >$counter.fa
    counter=$(($counter +1))

    done

    exit



    SNP_001 SNP_002 SNP_003....
    T T T T
    C C C C
    C C C C
    C C C C
    A A A A
    A A A A
    T T T T
    T T T T
    C C C C
    G G G G
    G G G G
    C C C C
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    See if this thread helps: http://stackoverflow.com/questions/1...a-file-in-bash

    Comment

    • blakeoft
      Member
      • Oct 2013
      • 79

      #3
      Is your input the following:

      T T T T
      C C C C
      C C C C
      C C C C
      A A A A
      A A A A
      T T T T
      T T T T
      C C C C
      G G G G
      G G G G
      C C C C

      and do you want to get

      TCCCAATTCGGC
      TCCCAATTCGGC
      TCCCAATTCGGC
      TCCCAATTCGGC

      back as a result?
      Last edited by blakeoft; 03-31-2014, 05:00 AM. Reason: nitpicky spacing

      Comment

      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        I think @musta1234 wants the matrix transposed and then converted to a multi-fasta file.

        >SNP_001
        TCCCAATTCGGC
        >SNP_002
        TCCCAATTCGGC
        >SNP_003
        TCCCAATTCGGC

        Comment

        • musta1234
          Member
          • Jun 2013
          • 10

          #5
          Thats right

          Sorry for the sloppy explanation, but all the nucleotides are from a tab delimited file and Genomax stated the way I want it perfectly.


          >SNP_001
          TCCCAATTCGGC
          >SNP_002
          TCCCAATTCGGC
          >SNP_003
          TCCCAATTCGGC

          ......

          SNP_XXX
          ATGCATGCATGC

          Thanks

          Comment

          • GenoMax
            Senior Member
            • Feb 2008
            • 7142

            #6
            This is a bash shell script based on a solution in the stackoverflow thread I had posted above.

            Save the code in a file (script.sh in example below) and then run as follows:

            Code:
            $ sh script.sh your_data file
            Code:
            #!/bin/bash 
            declare -a array=( )                      # we build a 1-D-array
            
            read -a line < "$1"                       # read the headline
            
            COLS=${#line[@]}                          # save number of columns
            
            index=0
            while read -a line; do
                for (( COUNTER=0; COUNTER<${#line[@]}; COUNTER++ )); do
                    array[$index]=${line[$COUNTER]}
                    ((index++))
                done
            done < "$1"
            
            for (( ROW = 0; ROW < COLS; ROW++ )); do
                    printf ">"
              for (( COUNTER = ROW; COUNTER < ${#array[@]}; COUNTER += COLS )); do
                printf "%s" ${array[$COUNTER]}
                if [ $COUNTER == $ROW ]
                then
                    printf "\n"
                fi
              done
              printf "\n" 
            done

            Comment

            • musta1234
              Member
              • Jun 2013
              • 10

              #7
              Thanks

              I will definitely give it a try...

              Comment

              • musta1234
                Member
                • Jun 2013
                • 10

                #8
                Works GREAT!!!

                Hey Genomax and all!!

                The code works great... handles a file with 160 columns and 128,000 lines very well.

                Thanks

                Originally posted by GenoMax View Post
                This is a bash shell script based on a solution in the stackoverflow thread I had posted above.




                Save the code in a file (script.sh in example below) and then run as follows:

                Code:
                $ sh script.sh your_data file
                Code:
                #!/bin/bash 
                declare -a array=( )                      # we build a 1-D-array
                
                read -a line < "$1"                       # read the headline
                
                COLS=${#line[@]}                          # save number of columns
                
                index=0
                while read -a line; do
                    for (( COUNTER=0; COUNTER<${#line[@]}; COUNTER++ )); do
                        array[$index]=${line[$COUNTER]}
                        ((index++))
                    done
                done < "$1"
                
                for (( ROW = 0; ROW < COLS; ROW++ )); do
                        printf ">"
                  for (( COUNTER = ROW; COUNTER < ${#array[@]}; COUNTER += COLS )); do
                    printf "%s" ${array[$COUNTER]}
                    if [ $COUNTER == $ROW ]
                    then
                        printf "\n"
                    fi
                  done
                  printf "\n" 
                done

                Comment

                Latest Articles

                Collapse

                ad_right_rmr

                Collapse

                News

                Collapse

                Topics Statistics Last Post
                Started by SEQadmin2, 06-05-2026, 10:09 AM
                0 responses
                13 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-04-2026, 08:59 AM
                0 responses
                24 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-02-2026, 12:03 PM
                0 responses
                28 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-02-2026, 11:40 AM
                0 responses
                22 views
                0 reactions
                Last Post SEQadmin2  
                Working...