  • Align multiple sequences in tabular or fasta format

    Hi Folks,

    I have ~100,000 short sequences (~25 bp each) in FASTA format. They are the oligo probes used on the Affymetrix Mouse 430-2 chip. I want to align all of them against the mm9 mouse genome and get the results as either GFF or BED. Can anyone suggest a good web-based or Windows-based tool for this?

    Here is the first probe as an example. Thanks!

    >probe:Mouse430_2:1415670_at:269:753; Interrogation_Position=2436; Antisense;
    GGCTGATCACATCCAAAAAGTCATG

  • #2
    There are several short-read aligners for this purpose:

    - Bowtie
    - Soap2
    - BWA
    - Novoalign
    - ...


    • #3
      For an online option, I have seen Galaxy, which I think would be a good choice since your dataset is small.


      • #4
        Thanks to NicoBxl and husamia.

        Still trying to understand how to install Bowtie on Windows...

        I did try Galaxy with my FASTA file. It failed with the error "reads file does not look like a FASTQ file." Galaxy requires 2 more columns (strandedness and quality score) to run the alignment. However, it still did not work even after I added 2 dummy columns and changed the file type from FASTA to FASTQ.

        Does anybody know how to run the alignment on Galaxy without going through the FASTQ requirement? Thanks a million!


        • #5
          Write a simple Perl script to convert your FASTA file to FASTQ format.
          Then run Bowtie to do the alignment.
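
A minimal sketch of that conversion, written in Python rather than Perl. FASTA carries no base qualities, so every base gets the same placeholder quality character; `I` (Phred 40 in the Phred+33 encoding) is an arbitrary choice here, and the function names are made up for illustration.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)  # sequence may be wrapped over several lines
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_fastq(fasta_handle, fastq_handle, dummy_quality="I"):
    """Write each FASTA record as a 4-line FASTQ record; return the record count."""
    count = 0
    for header, seq in read_fasta(fasta_handle):
        # FASTQ record: @name, sequence, '+', one quality character per base
        fastq_handle.write(f"@{header}\n{seq}\n+\n{dummy_quality * len(seq)}\n")
        count += 1
    return count


if __name__ == "__main__":
    probe = io.StringIO(
        ">probe:Mouse430_2:1415670_at:269:753; Interrogation_Position=2436; Antisense;\n"
        "GGCTGATCACATCCAAAAAGTCATG\n"
    )
    converted = io.StringIO()
    fasta_to_fastq(probe, converted)
    print(converted.getvalue(), end="")
```

For real files, pass handles from `open("probes.fa")` and `open("probes.fq", "w")` (file names assumed). Note that Bowtie itself can also read FASTA input directly with its `-f` option, so the conversion mainly matters for tools that insist on FASTQ.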


          • #6
            Galaxy should auto-detect your format, and it should be able to accept FASTA files. If it is spitting out a FASTQ-related error, make sure you are uploading with the correct options.
            Otherwise, the headers in your FASTA file may be causing problems. If you aren't familiar with the command line, I'm not sure whether WordPad or some other Windows program can do it, but you could try changing the headers to something simpler.
            There are Windows large-text-file editor programs such as 'gVim', or google for one.
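
If the headers are the problem, they can also be simplified with a few lines of script instead of an editor. This is only a sketch under my own assumptions: it keeps the text up to the first `;` or whitespace and replaces any remaining character outside letters, digits, `_`, `.` and `-` with `_`. Whether Galaxy actually needs this is not confirmed; the function names are hypothetical.

```python
import re


def simplify_header(header_line):
    """Reduce a FASTA header line to '>' plus a single cleaned-up token."""
    text = header_line.lstrip(">").strip()
    # Keep only the part before the first ';' or whitespace character.
    first_token = re.split(r"[;\s]", text, maxsplit=1)[0]
    # Replace anything outside a conservative character set with '_'.
    return ">" + re.sub(r"[^A-Za-z0-9_.-]", "_", first_token)


def simplify_fasta_headers(lines):
    """Yield FASTA lines with simplified headers; sequence lines pass through unchanged."""
    for line in lines:
        if line.startswith(">"):
            yield simplify_header(line) + "\n"
        else:
            yield line


if __name__ == "__main__":
    record = [
        ">probe:Mouse430_2:1415670_at:269:753; Interrogation_Position=2436; Antisense;\n",
        "GGCTGATCACATCCAAAAAGTCATG\n",
    ]
    print("".join(simplify_fasta_headers(record)), end="")
```

The strand and interrogation position are dropped from the header by this, so keep the original file if that information is needed later.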


            • #7
              Originally posted by Kennels
              There are Windows large-text-file editor programs such as 'gVim', or google for one.
              Does anybody have experience with opening large text files such as FASTA in Windows? I use the search-and-replace function a lot. What are some good editors for large files (~12 GB)?
              I know this is a huge file, but I wonder if anybody knows of an editor that handles such files responsibly, without hogging memory or crashing.
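
One way around the editor problem is not to open the file at all: a short script can stream through it line by line and write a changed copy, so memory use stays small regardless of file size. A rough sketch, assuming plain-text (not regex) search and replace, and that the text being replaced never spans a line break (a motif inside a wrapped FASTA sequence could). The function name and file names are made up for illustration.

```python
import os
import tempfile


def replace_in_large_file(src_path, dst_path, old, new):
    """Copy src_path to dst_path one line at a time, replacing `old` with `new`.

    Returns the number of lines that were changed.
    """
    changed = 0
    # newline="" keeps the original line endings (Windows or Unix) untouched.
    with open(src_path, "r", encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:  # only one line is held in memory at a time
            updated = line.replace(old, new)
            if updated != line:
                changed += 1
            dst.write(updated)
    return changed


if __name__ == "__main__":
    folder = tempfile.mkdtemp()
    source = os.path.join(folder, "genome.fa")
    target = os.path.join(folder, "genome.renamed.fa")
    with open(source, "w", encoding="utf-8", newline="") as fh:
        fh.write(">chr1 mm9\nACGTACGT\n>chr2 mm9\nGGCCGGCC\n")
    print(replace_in_large_file(source, target, " mm9", ""), "lines changed")
```

Writing to a second file needs as much free disk space as the original, but the input is never modified, so a mistake in the search string costs nothing.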


              • #8
                Aligning with Bowtie worked! Thank you, everyone, for your suggestions.
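
Since the original goal was BED output: assuming the Bowtie alignments are written in SAM format (its `-S` option), each aligned line can be turned into a BED line. This is a rough sketch rather than a replacement for established converters. The function name `sam_line_to_bed` is made up, and using the SAM mapping quality as the BED score is just one possible choice.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that consume positions on the reference sequence.
REF_CONSUMING = set("MDN=X")


def sam_line_to_bed(line):
    """Return a 6-column BED line for one SAM line, or None for headers and unmapped reads."""
    if line.startswith("@"):
        return None  # SAM header line
    fields = line.rstrip("\n").split("\t")
    name, flag, chrom = fields[0], int(fields[1]), fields[2]
    pos, mapq, cigar = int(fields[3]), fields[4], fields[5]
    if flag & 0x4 or chrom == "*" or cigar == "*":
        return None  # read did not align
    # Alignment length on the reference, derived from the CIGAR string.
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    start = pos - 1  # SAM positions are 1-based; BED starts are 0-based
    end = start + ref_len  # BED end is exclusive
    strand = "-" if flag & 0x10 else "+"  # 0x10 marks a reverse-strand alignment
    return "\t".join([chrom, str(start), str(end), name, mapq, strand])


def sam_to_bed(sam_lines):
    """Yield BED lines for every aligned record in an iterable of SAM lines."""
    for line in sam_lines:
        bed = sam_line_to_bed(line)
        if bed is not None:
            yield bed
```

With an open SAM file handle `sam`, `"\n".join(sam_to_bed(sam))` gives the BED text; the start/end arithmetic handles the 1-based versus 0-based difference between the two formats.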
