  • mbk0asis
    Member
    • Dec 2011
    • 41

    Fasta manipulation tools???

    Hello, everyone.
    I have peak data that I can view in the UCSC Genome Browser.
    I downloaded the sequences of all the peaks in FASTA format and tried to align them with ClustalX/W,
    but the alignment failed because several sequences share the same header (sequence ID).

    Can anyone tell me how to change or remove the headers for multiple alignment?
    Is there any software for that?
  • twaddlac
    Member
    • Feb 2011
    • 49

    #2
    I don't know of any programs that solve that problem specifically, but you could always write a script to append an arbitrary identifier to the end of each sequence header.
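
A minimal sketch of such a script, in Python (the thread itself does not name a language). It appends a per-ID counter to the first word of every header, on the assumption that the aligner takes the first whitespace-delimited word as the sequence ID. The function name `make_headers_unique` and the `_1`, `_2` suffix scheme are illustrative choices, not part of any existing tool.

```python
from collections import Counter


def make_headers_unique(lines):
    """Yield FASTA lines with a running counter appended to each sequence ID.

    Assumption: the sequence ID is the first whitespace-delimited word
    after '>'; anything after it is kept as the description.
    """
    seen = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            seq_id, _, description = line[1:].partition(" ")
            seen[seq_id] += 1
            # Every ID gets a suffix, so repeated IDs become id_1, id_2, ...
            yield f">{seq_id}_{seen[seq_id]} {description}".rstrip()
        else:
            # Sequence lines pass through unchanged.
            yield line
```

The function works on any iterable of lines, so it can be fed an open file handle and its output written back out with a newline after each line. File names such as `input.fasta` would be placeholders chosen by the user.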

    • gringer
      David Eccles (gringer)
      • May 2011
      • 845

      #3
      Originally posted by mbk0asis
      Can anyone tell me how to change or remove the headers for multiple alignment?
      Is there any software for that?
      For FASTA manipulation, my first port of call is FASTX-Toolkit, and then EMBOSS. A quick glance through the FASTX-Toolkit reveals a program called fastx_renamer. Here's something that should change sequence headers to a simple counter:

      Code:
      fastx_renamer -n COUNT -i input.fasta -o output.fasta

      • mbk0asis
        Member
        • Dec 2011
        • 41

        #4
        Thank you very much!!!
        I will try now.

        • JackieBadger
          Senior Member
          • Mar 2009
          • 385

          #5
          Galaxy is a community-driven web-based analysis platform for life science research.


          Super easy FASTA manipulation. The video tutorials are also very good.

          It runs the programs mentioned above through a GUI.

          • mbk0asis
            Member
            • Dec 2011
            • 41

            #6
            Hi, all... again.
            I failed to install FastX-Toolkit on my computer (LinuxMint11 64bit - Ubuntu based).
            I am trying to find what went wrong.
            Meanwhile, what is the name of the EMBOSS tool that can manipulate FASTA files?

            • gringer
              David Eccles (gringer)
              • May 2011
              • 845

              #7
              Originally posted by mbk0asis
              I failed to install FastX-Toolkit on my computer (LinuxMint11 64bit - Ubuntu based).
              fastx-toolkit is in Debian, so it should also be in Ubuntu and Mint:

              Code:
              aptitude install fastx-toolkit

              • mbk0asis
                Member
                • Dec 2011
                • 41

                #8
                # sudo aptitude install fastx-toolkit

                No candidate version found for fastx-toolkit
                No candidate version found for fastx-toolkit
                No packages will be installed, upgraded, or removed.
                0 packages upgraded, 0 newly installed, 0 to remove and 18 not upgraded.
                Need to get 0 B of archives. After unpacking 0 B will be used.

                ???

                • gringer
                  David Eccles (gringer)
                  • May 2011
                  • 845

                  #9
                  It's in the oneiric and precise universe repositories from Ubuntu. Add one of those to your package repositories list:

                  • dvanic
                    Member
                    • Jan 2012
                    • 61

                    #10
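                    To completely remove the headers just use awk (this keeps every second line, so it assumes each record is exactly one header line followed by one unwrapped sequence line):
                    awk '0 == NR % 2' fasta.filename > sequences_only
                    But the FASTX-Toolkit is in general quite a useful utility to have, so perhaps have a look at the instructions here: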
                    http://hannonlab.cshl.edu/fastx_tool...all_ubuntu.txt and here:


                    Hope this helps!
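
The awk filter above keeps every second line, which only works when each sequence sits on a single line. For FASTA files where sequences are wrapped over several lines, a rough Python sketch along these lines could be used instead. The function name `sequences_only` is made up for illustration, and joining wrapped lines into one string per record is a design assumption rather than something the thread specifies.

```python
def sequences_only(lines):
    """Yield one unwrapped sequence string per FASTA record, dropping headers."""
    chunks = None  # stays None until the first header has been seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if chunks is not None:
                yield "".join(chunks)
            chunks = []
        elif line and chunks is not None:
            # Non-empty sequence line belonging to the current record.
            chunks.append(line)
    # Emit the final record.
    if chunks is not None:
        yield "".join(chunks)
```

Note that most multiple-alignment programs still need an identifier per sequence, so stripping headers entirely is mainly useful when only the raw sequences are wanted.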

                    • gringer
                      David Eccles (gringer)
                      • May 2011
                      • 845

                      #11
                      Originally posted by dvanic
                      To completely remove the headers just use awk
                      Here's a Perl one-liner which will replace FASTA headers with the current input line number (unique, although not consecutive):

                      Code:
                      perl -pe 's/^>.*$/">$."/e' input.fasta > output.fasta
                      Same thing with awk:
                      Code:
                      awk '{if(/^>/){print ">"FNR} else{print $0}}' input.fasta > output.fasta
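
Both one-liners above use the input line number as the new ID, so the original headers are lost. If the original names need to be recovered after the alignment, a small Python sketch like the following could number the records consecutively and keep a lookup table. The function name `rename_with_counter` and the returned mapping are assumptions made for illustration; they are not part of FASTX-Toolkit or of the one-liners.

```python
def rename_with_counter(lines):
    """Replace each FASTA header with >1, >2, ... and remember the originals.

    Returns (renamed_lines, mapping), where mapping maps the new ID
    (as a string) to the original header text without the leading '>'.
    """
    renamed = []
    mapping = {}
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            count += 1
            mapping[str(count)] = line[1:]
            renamed.append(f">{count}")
        else:
            # Sequence lines are copied as they are.
            renamed.append(line)
    return renamed, mapping
```

The mapping can be written to a tab-separated file next to the renamed FASTA, so that the IDs in the alignment can be traced back to the original peak headers.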

                      • mbk0asis
                        Member
                        • Dec 2011
                        • 41

                        #12
                        Finally, it worked, GRINGER.

                        I really really appreciate your help.
