
  • Align multiple sequences in tabular or fasta format

    Hi Folks,

    I have ~100,000 short sequences (~25 bp long) in FASTA format. They are oligo probes used in the Affymetrix mouse 430-2 chip. I want to align all the sequences against the mm9 genome to get either GFF or BED output. Can anyone suggest a good web-based or Windows-based tool for this purpose?

    The following is an example of the first probe, thanks!

    >probe:Mouse430_2:1415670_at:269:753; Interrogation_Position=2436; Antisense;
    GGCTGATCACATCCAAAAAGTCATG

  • #2
    There are several short-read aligners for this purpose:

    - Bowtie
    - SOAP2
    - BWA
    - Novoalign
    - ...



    • #3
      For an online option, I have seen Galaxy, which I think would be a good choice since your dataset is small.



      • #4
        Thanks to NicoBxl and husamia.

        Still trying to understand how to install bowtie on Windows...

        I did try Galaxy with my FASTA files. It failed with the error "reads file does not look like a FASTQ file." Galaxy requires 2 more columns (strandedness and quality score) to run the alignment. However, it still did not work even after I added 2 dummy columns and changed the file type from FASTA to FASTQ.

        Does anybody know how to run the alignment on Galaxy without going through the FASTQ requirement? Thanks a million!



        • #5
          Write a simple Perl script to convert your FASTA file into FASTQ format.
          Then run bowtie to do the alignment.
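
As a rough illustration of the conversion suggested above, here is a minimal sketch in Python rather than Perl. FASTA carries no quality information, so the sketch assigns every base the same placeholder quality character (`I`, which is Phred 40 in the Phred+33 encoding); that value and the function names `write_record` and `fasta_to_fastq` are assumptions made for this example, not something from the thread. Note also that bowtie itself can read FASTA input directly through its `-f` option, so the conversion mainly matters for front ends that insist on FASTQ.

```python
import io


def write_record(out, name, seq, quality_char):
    """Write one four-line FASTQ record with a constant placeholder quality."""
    out.write(f"@{name}\n{seq}\n+\n{quality_char * len(seq)}\n")


def fasta_to_fastq(fasta_lines, out, quality_char="I"):
    """Stream FASTA lines to FASTQ; multi-line sequences are joined."""
    name, parts = None, []
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                write_record(out, name, "".join(parts), quality_char)
            name, parts = line[1:], []
        elif name is not None:
            parts.append(line)
    if name is not None:
        write_record(out, name, "".join(parts), quality_char)


# Small demo using the probe from the first post.
demo = io.StringIO(
    ">probe:Mouse430_2:1415670_at:269:753; Interrogation_Position=2436; Antisense;\n"
    "GGCTGATCACATCCAAAAAGTCATG\n"
)
result = io.StringIO()
fasta_to_fastq(demo, result)
print(result.getvalue(), end="")
```

For real files, pass open file handles instead of the `io.StringIO` objects; the function reads one line at a time, so ~100,000 probes pose no memory problem.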



          • #6
            Galaxy should auto-detect your format, and it should be able to accept FASTA files. If it is giving a FASTQ-related error, make sure you are uploading with the correct options.
            Otherwise, the headers in your FASTA file may be causing problems. If you aren't familiar with the command line, I'm not sure whether WordPad or some other Windows program would let you change the headers to something simpler.
            There are large-text-file editors for Windows such as 'gVim', or google for one.
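
If editing the headers by hand is impractical, the simplification can be scripted. The sketch below is one possible approach under stated assumptions: it keeps only the first whitespace-delimited token of each header and replaces anything other than letters, digits, `_`, `.` and `-` with underscores. Whether this is what Galaxy actually needs is not confirmed in this thread, and the function names `simplify_header` and `simplify_fasta` are made up for the example.

```python
import re


def simplify_header(header_line):
    """Reduce a FASTA header to a single token of safe characters."""
    tokens = header_line[1:].split()
    token = tokens[0] if tokens else ""
    # Replace runs of unsafe characters (":", ";", "=", ...) with "_".
    cleaned = re.sub(r"[^A-Za-z0-9_.-]+", "_", token).strip("_")
    return ">" + cleaned


def simplify_fasta(lines):
    """Yield FASTA lines with simplified headers; sequence lines pass through."""
    for line in lines:
        if line.startswith(">"):
            yield simplify_header(line.rstrip("\r\n")) + "\n"
        else:
            yield line


header = ">probe:Mouse430_2:1415670_at:269:753; Interrogation_Position=2436; Antisense;"
print(simplify_header(header))
```

The example assumes the first token is already unique per probe (here it includes the probe set ID and the x/y coordinates); if it is not, the shortened names would collide and a counter would have to be appended.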



            • #7
              Originally posted by Kennels
              There are windows large text file editor programs such as 'gVim', or google for one.
              Does anybody have experience opening large text files such as FASTA in Windows? I use the search-and-replace function a lot. What are some good editors for large files (~12 GB)?
              I know this is a huge file, but I wonder if anybody knows of an editor that responsibly handles such files without hogging memory or crashing.
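
This is not an editor recommendation, but for plain search and replace on a file that size, a small script that streams the file line by line sidesteps the memory problem entirely. The following is a minimal sketch; the function name `replace_in_stream` and the demo strings are assumptions for illustration only.

```python
import io


def replace_in_stream(src, dst, old, new):
    """Copy src to dst line by line, replacing old with new.

    Only one line is held in memory at a time. Returns the number of
    replacements made. Matches that span a line break are not found.
    """
    count = 0
    for line in src:
        count += line.count(old)
        dst.write(line.replace(old, new))
    return count


# Demo on in-memory text; real use would pass two open file handles.
source = io.StringIO(">probe:a\nACGT\n>probe:b\nTTTT\n")
target = io.StringIO()
print(replace_in_stream(source, target, "probe:", "p_"))
print(target.getvalue(), end="")
```

Memory use is bounded by the longest line, so a FASTA file that stores a whole chromosome on a single line would still load that line in full. The output goes to a separate file, which means roughly another 12 GB of free disk space is needed for a file of the size mentioned above.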



              • #8
                It turned out that aligning with bowtie worked! Thank you everyone for your suggestions.
