
  • Awk command

    Hello,

    I have a file like the following

    chr1 1234
    chr1 2345
    chr2 94837
    chr2 73457

    how can I split this data into two files

    chr1.txt

    chr1 1234
    chr1 2345

    chr2.txt

    chr2 94837
    chr2 73457

    Thanks in advance.

  • #2
    Originally posted by rkk
    Hello,

    I have a file like the following

    chr1 1234
    chr1 2345
    chr2 94837
    chr2 73457

    how can I split this data into two files

    chr1.txt

    chr1 1234
    chr1 2345

    chr2.txt

    chr2 94837
    chr2 73457

    Thanks in advance.
    $ awk '$1 =="chr1"' file > file1
    $ awk '$1 =="chr2"' file > file2

    In theory, this should work.


    • #3
      Code:
      awk '{print > $1".txt"}' input


      • #4
        A more "universal" way to do it:
        awk '{print > $1 ".txt"}' Input.file.txt


        • #5
          Good to learn an easier way to do this. Can you explain a bit how it works?


          • #6
            awk: syntax error at source line 1
            context is
            {print > $1 >>> ".txt" <<<
            awk: illegal statement at source line 1


            I am getting the above error...


            • #7
              Originally posted by rkk
              awk: syntax error at source line 1
              context is
              {print > $1 >>> ".txt" <<<
              awk: illegal statement at source line 1


              I am getting the above error...
              Make sure you pay attention to the single quotes, double quotes, brackets, etc. It worked for me.


              • #8
                $ head -5 test.txt

                1 9992
                1 9992
                1 9993
                1 9994
                1 9994


                $ awk '{print > $1 ".txt"}' test.txt

                awk: syntax error at source line 1
                context is
                {print > $1 >>> ".txt" <<<
                awk: illegal statement at source line 1

                This is what I get for my test.txt file


                • #9
                  Originally posted by rkk
                  $head -5 test.txt

                  1 9992
                  1 9992
                  1 9993
                  1 9994
                  1 9994


                  $awk '{print > $1 ".txt"}' test.txt

                  awk: syntax error at source line 1
                  context is
                  {print > $1 >>> ".txt" <<<
                  awk: illegal statement at source line 1

                  This is what I get for my test.txt file

                  It worked for me. I'm not sure why it's not working for you.


                  • #10
                    Originally posted by rkk
                    $head -5 test.txt

                    1 9992
                    1 9992
                    1 9993
                    1 9994
                    1 9994


                    $awk '{print > $1 ".txt"}' test.txt

                    awk: syntax error at source line 1
                    context is
                    {print > $1 >>> ".txt" <<<
                    awk: illegal statement at source line 1

                    This is what I get for my test.txt file
                    Where are you running it?

                    Are you on a Linux server or in your Mac's terminal?

                    Try using nawk or gawk instead of awk.
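
One possible explanation for the syntax error, which the thread never confirms: awk implementations differ in how they parse an unparenthesized concatenation after >. gawk accepts print > $1 ".txt", but some other awks reject it, and the error format above resembles the one printed by the awk shipped with macOS. Wrapping the file name expression in parentheses is the portable spelling. A minimal sketch, assuming input.txt (a hypothetical name) holds the sample data from the first post:

```shell
# Work in a scratch directory so the generated files are easy to find and remove.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data from the first post (input.txt is a hypothetical file name).
printf 'chr1 1234\nchr1 2345\nchr2 94837\nchr2 73457\n' > input.txt

# Parentheses make the output file name a single, unambiguous expression.
awk '{ print > ($1 ".txt") }' input.txt

# Show the two files that were created.
cat chr1.txt
cat chr2.txt
```

If the parenthesized form still fails, switching to gawk as suggested above remains the other option.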


                    • #11
                      Originally posted by gene_x
                      Good to learn an easier way to do this. Can you explain a bit how it works?

                      Code:
                      awk '{print > $1".txt"}' input
                      $1 refers to the first column.

                      For each distinct value in column 1, print the line to another file (>) named after that value ($1).
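
To make the explanation above concrete: awk runs the action once for every input line, and the file name expression is re-evaluated each time, so each line is routed by its own first field. The sketch below does not write any files. It only prints the name that $1 ".txt" would produce for each line of the sample data from the first post (the variable names out and trace are illustrative):

```shell
# Show, for each input line, which file name the expression $1 ".txt" yields.
trace=$(printf 'chr1 1234\nchr1 2345\nchr2 94837\nchr2 73457\n' |
    awk '{ out = $1 ".txt"; print "line " NR ": " $0 " -> " out }')

printf '%s\n' "$trace"
```

There is no separate splitting step. With print > file, awk truncates a file only the first time it opens it during a run and keeps it open afterwards, so later lines with the same first field are appended to the same file.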


                      • #12
                        Originally posted by gokhulkrishnakilaru
                        Code:
                        awk '{print > $1".txt"}' input
                        $1 refers to the first column.

                        for each distinct column1,
                        Code:
                        print
                        to another file
                        Code:
                        >
                        with the same column name
                        Code:
                        $1
                        I can understand printing to another file named after the column value. What I don't get is where the separation based on the first column's contents happens.


                        • #13
                          It looks like I need to run that command on Linux.

                          Now, I have another issue

                          I have a file like the following. I need to bin the first column into 100 bp regions and count the second-column values for each bin.
                          10175 1
                          10179 1
                          10189 1
                          10191 1
                          10201 1
                          10243 1
                          10249 1
                          10262 1
                          10313 1
                          10414 1
                          10485 1
                          10499 1

                          The output should be something like this..

                          10101-10200 4
                          10201-10300 4
                          10301-10400 1
                          10401-10500 3

                          Can someone help with this..

                          Thanks in advance..


                          • #14
                            Originally posted by rkk
                            I should use that command in LINUX...

                            Now, I have another issue

                            I have a file like following..I need to bin the first column in 100bp regions and count the second column value for that bin
                            10175 1
                            10179 1
                            10189 1
                            10191 1
                            10201 1
                            10243 1
                            10249 1
                            10262 1
                            10313 1
                            10414 1
                            10485 1
                            10499 1

                            The output should be something like this..

                            10101-10200 4
                            10201-10300 4
                            10301-10400 1
                            10401-10500 3

                            Can someone help with this..

                            Thanks in advance..
                            Do you already know your bins?

                            If not, what start and end values should the 100 bp bins use?


                            • #15
                              The command has to identify the minimum and maximum values in column 1 and then bin that range into 100 bp regions.
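
The thread ends without an answer, so here is one way the binning could be sketched in awk. It rests on assumptions read off the expected output above: bins are aligned to 10101-10200, 10201-10300 and so on (each bin starts at a multiple of 100 plus 1) rather than at the minimum value itself, and the second column is summed per bin. The file names positions.txt and binned.txt and the variable names are hypothetical.

```shell
# Work in a scratch directory (hypothetical file names throughout).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data from post #13.
cat > positions.txt <<'EOF'
10175 1
10179 1
10189 1
10191 1
10201 1
10243 1
10249 1
10262 1
10313 1
10414 1
10485 1
10499 1
EOF

awk '
NF >= 2 {
    # First position of the 100 bp bin containing $1, e.g. 10175 -> 10101.
    start = int(($1 - 1) / 100) * 100 + 1
    sum[start] += $2

    # Track the lowest and highest bin seen (derived from min/max of column 1).
    if (!seen || start < lo) lo = start
    if (!seen || start > hi) hi = start
    seen = 1
}
END {
    # Walk the bins in order from lowest to highest; skip bins with no data.
    for (b = lo; seen && b <= hi; b += 100)
        if (b in sum) print b "-" (b + 99), sum[b]
}' positions.txt > binned.txt

cat binned.txt
```

Walking from the lowest to the highest bin in the END block gives sorted output without a separate sort step. To report empty bins as zero instead of skipping them, the condition (b in sum) could be dropped and sum[b] + 0 printed.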
