  • Convert SAM to an aligned multi-fasta

    I have aligned contigs to a reference using Bowtie2, which gives a SAM file as output.
    I require a multi-fasta alignment for downstream analysis, NOT a consensus, as my dataset contains unknown paralogs that I have to separate manually. I repeat: I do NOT want to extract a consensus sequence, I just want a set of aligned fastas.

    Is it possible to convert SAM files to an aligned multifasta, or am I going to have to do my whole alignment manually?

  • #2
    I wrote a tool named SAM4WebLogo: https://github.com/lindenb/jvarkit/wiki/SAM4WebLogo

    (Does it fulfill your needs?)

    • #3
      What does "aligned multifasta" mean? Do you just want a fasta file containing the reads that mapped?

      • #4
        It is a multi-fasta-like format, except that the sequences are extracted after a multiple sequence alignment, so every record is padded with gap characters ("-") to the same length: http://www.bioperl.org/wiki/FASTA_mu...ignment_format

        • #5
          No, lindenb, it does not meet my needs: it generates a bitmap which, while pretty, is completely unusable for downstream analysis and will not create the fasta file I require.

          Brian Bushnell: I am trying to speed up a rather lengthy manual alignment process by utilizing Bowtie 2 to create the alignment for me. However, this outputs a SAM file. I require all my output to be in fasta format for downstream analysis, so I need the alignment to be saved as a multi-fasta with each contig in position, as described in the link posted by GenoMax.
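The layout being asked for here (each contig held at its mapped position and padded with gaps) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the method of any tool mentioned in this thread: the function names `project_read` and `sam_to_aligned_fasta` are hypothetical, only the standard SAM columns (FLAG, RNAME, POS, CIGAR, SEQ) are read, and insertions and soft-clipped bases are dropped so that every row stays in reference coordinates.

```python
import re

# CIGAR operations as defined in the SAM specification.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def project_read(seq, cigar):
    """Lay a read onto reference coordinates.

    Deletions/skips become '-'; insertions and soft clips are dropped
    (a simplification so all rows share the reference coordinate system).
    """
    out = []
    i = 0  # current index into the read sequence
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":      # consumes read and reference
            out.append(seq[i:i + n])
            i += n
        elif op in "DN":     # consumes reference only
            out.append("-" * n)
        elif op in "IS":     # consumes read only
            i += n
        # H and P consume neither read nor reference
    return "".join(out)


def sam_to_aligned_fasta(sam_lines, ref_name):
    """Return gap-padded FASTA text for reads mapped to ref_name."""
    rows = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        f = line.rstrip("\n").split("\t")
        name, flag, rname = f[0], int(f[1]), f[2]
        pos, cigar, seq = int(f[3]), f[5], f[9]
        if flag & 4 or rname != ref_name or cigar == "*" or seq == "*":
            continue  # unmapped, other reference, or no alignment data
        # SEQ is already stored in reference orientation in SAM.
        rows.append((name, pos, project_read(seq, cigar)))
    if not rows:
        return ""
    start = min(pos for _, pos, _ in rows)
    end = max(pos + len(s) for _, pos, s in rows)
    records = []
    for name, pos, s in rows:
        left = "-" * (pos - start)
        right = "-" * (end - pos - len(s))
        records.append(f">{name}\n{left}{s}{right}")
    return "\n".join(records) + "\n"
```

Called as `sam_to_aligned_fasta(open("contigs.sam"), "ref")` (both names are placeholders), it returns one equal-width record per mapped contig. Dropping insertions keeps the rows aligned but loses the inserted bases, so it may not suit data where insertions matter for separating paralogs.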

          • #6
            fiona,

            Sorry, I have nothing currently that can do that. Maybe I'll add it.

            -Brian

            • #7
              Hi all,

              Thank you for your help. I have discovered that Geneious can export SAM files as fasta while maintaining the alignment. Although I usually prefer to use freeware, it appears to be the only method currently available for this conversion; luckily, it is not currently behind the paywall. Thank you for all your help.

              -Fiona

              • #8
                Originally posted by fiona_l
                No, lindenb, it does not meet my needs: it generates a bitmap which, while pretty, is completely unusable for downstream analysis and will not create the fasta file I require.
                It doesn't generate a bitmap, but a set of aligned fasta sequences:

                Code:
                >B7_593:4:106:316:452/1
                TGTTG--------------------------
                >B7_593:4:106:316:452a/1
                TGTTG--------------------------
                >B7_593:4:106:316:452b/1
                TGTTG--------------------------
                >B7_589:8:113:968:19/2
                TGGGG--------------------------
                >B7_589:8:113:968:19a/2
                TGGGG--------------------------
                >B7_589:8:113:968:19b/2
                TGGGG--------------------------
                >EAS54_65:3:321:311:983/1
                TGTGGG-------------------------
                >EAS54_65:3:321:311:983a/1
                TGTGGG-------------------------
                >EAS54_65:3:321:311:983b/1
                TGTGGG-------------------------
                >B7_591:6:155:12:674/2
                TGTGGGGG-----------------------
                >B7_591:6:155:12:674a/2
                TGTGGGGG-----------------------
                >B7_591:6:155:12:674b/2
                TGTGGGGG-----------------------
                >EAS219_FC30151:7:51:1429:1043/2
                TGTGGGGGGCGCCG-----------------

                • #9
                  I usually go to Picard tools or BedTools when faced with a BAM/SAM question. Perhaps the Picard program 'SamToFastq', followed by any of the FastQ-to-FastA converters. But this probably will not conserve the position and thus a

                  While I see what you are trying to do -- make alignments via a non-MSA (multiple sequence alignment) program since you have a reference to work from -- it seems awkward. An MSA program would take care of indels and SNPs that bowtie2 cannot handle. For anything moderately complex, bowtie2 does not seem like a good choice.
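The FastQ-to-FastA step mentioned above is simple enough to sketch, and doing so shows why the alignment position is lost: only the read name and bases survive. This is a minimal sketch, assuming plain four-line FASTQ records; the function name `fastq_to_fasta` is hypothetical and is not part of Picard or any other tool named in this thread.

```python
def fastq_to_fasta(fastq_text):
    """Convert four-line FASTQ records to FASTA text.

    Assumes each record is exactly four lines (header, bases, '+', qualities).
    Qualities are discarded; no mapping position is available to keep.
    """
    lines = fastq_text.splitlines()
    records = []
    for i in range(0, len(lines) - 3, 4):
        header, seq = lines[i], lines[i + 1]
        if not header.startswith("@"):
            raise ValueError(f"line {i + 1} is not a FASTQ header: {header!r}")
        records.append(f">{header[1:]}\n{seq}")
    return "\n".join(records) + ("\n" if records else "")
```

The output holds unpadded sequences with no offset or gap information, which matches the concern raised in this post: the reads would need to be re-aligned to recover their positions.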

                  • #10
                    I am also trying to convert a SAM/BAM file to an aligned fasta file.
                    The tools I have tried will convert to a fasta file, but not to an aligned fasta.
                    I tried Geneious as above, but cannot import my data as I have a lot of clipping.

                    I am trying to use SAM4WebLogo but cannot get the tool to work. I have installed jvarkit as per the instructions and it appears to have built successfully; however, when I try to run the tool
                    java -jar SAM4WebLogo.java
                    I get:
                    Error: Invalid or corrupt jarfile SAM4WebLogo.java

                    Any advice on getting the tool running, or on alternatives, would be appreciated.
                    Thanks

                    • #11
                      I don't know anything about that tool, but that's incorrect syntax for running a java program. You can't execute the .java files - those are the source code. You can only execute .class files or .jar files. For a .class file:

                      java SAM4WebLogo

                      for a .jar file:

                      java -jar SAM4WebLogo.jar

                      • #12
                        Thanks Brian,
                        I looked and it is a .java file - running as above I still get an error:
                        Error: Could not find or load main class

                        • #13
                          If it has no dependencies, and you have the full JDK installed, you may be able to compile it like this:

                          javac SAM4WebLogo.java

                          Which will give you a class file. But it's probably easier to explore the website where you found it to look for a compiled version.
                          Last edited by Brian Bushnell; 11-03-2014, 07:39 PM.

                          • #14
                            I went back to the website and have since got the tool running. Thanks again

                            • #15
                              Has anyone got this working? The "-h" output isn't helpful, and I'm unable to get fasta output no matter what flags I try. Yes, I've tried -r and -o.

                              sam4weblogo SAMPLE_274_R1_val_1_bismark_bt2_pe.sam
                              There was an error in the input parameters.
                              The following option is required: -r, --region, --interval
                              [INFO][Launcher]sam4weblogo Exited with failure (-1)
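The error above says a region is required. One way to find candidate values is to scan the SAM file for the span each reference is covered by. The sketch below is hedged accordingly: the function name `covered_regions` is hypothetical, and the `name:start-end` string it produces is the convention used by samtools and many other tools; that sam4weblogo accepts exactly this form is an assumption not confirmed in this thread.

```python
import re

# CIGAR operations as defined in the SAM specification.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def covered_regions(sam_lines):
    """Return 'name:start-end' strings (1-based, inclusive) per reference."""
    spans = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        f = line.rstrip("\n").split("\t")
        flag, rname, pos, cigar = int(f[1]), f[2], int(f[3]), f[5]
        if flag & 4 or rname == "*" or cigar == "*":
            continue  # unmapped read
        # Only M, D, N, = and X consume reference bases.
        ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MDN=X")
        end = pos + ref_len - 1
        lo, hi = spans.get(rname, (pos, end))
        spans[rname] = (min(lo, pos), max(hi, end))
    return [f"{name}:{lo}-{hi}" for name, (lo, hi) in spans.items()]
```

Each returned string could then be tried as the value of the `-r` option named in the error message, one reference at a time.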
