  • FASTA to BAM for IGV visualization

    Hi all,

    Does anyone know how to make a .fasta file into a .bam file so that we can incorporate two reference genomes into IGV (Integrative Genomics Viewer)?

    Thanks
    Joyce

  • #2
    Are you thinking of ["fasta" --> "align to a reference" (SAM) --> Convert to BAM --> View in IGV] workflow?

    What do you mean by incorporate two reference genomes into IGV?


    • #3
      Hi GenoMax, thanks for your reply.

      I will explain the situation a bit more clearly: we aligned our MiSeq data against a published "scaffold" reference genome, but that file contains a lot of "N"s, which makes it hard to tell true SNPs/indels apart from bad reference sequence.

      There is another well-annotated sequence of the same species but a different lineage, and we'd like to use it as our IGV reference genome. At the same time, we're wondering whether it's possible to convert the scaffold reference genome (.fasta) into a .bam file so we can open it alongside our real MiSeq samples.

      Thanks,
      Joyce


      • #4
        You could align your sample (and the scaffold fasta) to the annotated genome and then open both BAMs in IGV, using the annotated genome as your reference.


        • #5
          Genomax, do you know if I could do the conversion using samtools? If not, how do you think I could do so?
          Thanks
          Joyce


          • #6
            You could map the sample to the new reference and, in parallel, map the old reference to the new reference, then combine the results with samtools merge (which I think requires converting from SAM to BAM first).
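
A minimal sketch of the merge route described above, assuming samtools 1.x is installed and that the two alignments already exist as SAM files. The file names sample.sam, scaffold.sam and merged.bam are hypothetical placeholders, and the `run` helper is only there so the script prints the commands by default (set DRY_RUN=0 to execute them).

```shell
#!/bin/sh
# Sketch only: assumes samtools 1.x; all file names are hypothetical placeholders.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

merge_alignments() {
    # $1 = sample SAM, $2 = scaffold SAM, $3 = merged BAM to write
    sample_bam="${1%.sam}.sorted.bam"
    scaffold_bam="${2%.sam}.sorted.bam"
    # samtools merge expects sorted inputs, so convert to BAM and sort first.
    run samtools sort -O bam -o "$sample_bam" "$1"
    run samtools sort -O bam -o "$scaffold_bam" "$2"
    run samtools merge "$3" "$sample_bam" "$scaffold_bam"
    # IGV needs an index file next to the BAM it loads.
    run samtools index "$3"
}

merge_alignments sample.sam scaffold.sam merged.bam
```

Merging is optional for viewing: IGV can also load the two sorted BAMs as separate tracks, which keeps the scaffold alignment visually distinct from the sample reads.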


            • #7
              Originally posted by Bubblepig:
              Genomax, do you know if I could do the conversion using samtools? If not, how do you think I could do so?
              Thanks
              Joyce

              You will have to do some alignments to view the three datasets together in IGV.

              Minimally you should:

              1. Create an index for the annotated reference.
              2. Map your sequence data and the scaffold fasta against the annotated reference.
              3. Finally, view the aligned BAM files in IGV.

              You will need access to a UNIX machine to do these steps.

              Depending on how distant your organism (and the scaffold fasta) is from the annotated reference, you may or may not be able to get useful visual information from IGV.
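
The three steps above could look roughly like the following sketch. It assumes bowtie2 and samtools 1.x are on the PATH. The index name H37Rv and the scaffold file HN878.fasta are taken from later posts in this thread, while H37Rv.fasta, sample.fastq and the output names are hypothetical placeholders. The script prints the commands by default; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: assumes bowtie2 and samtools 1.x are installed.
# H37Rv.fasta, sample.fastq and the output names are hypothetical placeholders.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

align_to_bam() {
    # $1 = bowtie2 index prefix, $2 = input file, $3 = output name stem,
    # $4 = optional extra bowtie2 flag (for example -f for FASTA input)
    if [ -n "${4:-}" ]; then
        run bowtie2 "$4" -x "$1" -U "$2" -S "$3.sam"
    else
        run bowtie2 -x "$1" -U "$2" -S "$3.sam"
    fi
    # Convert to sorted BAM and index it so IGV can load it.
    run samtools sort -O bam -o "$3.bam" "$3.sam"
    run samtools index "$3.bam"
}

# 1. Create an index for the annotated reference.
run bowtie2-build H37Rv.fasta H37Rv
# 2. Map the sample reads (FASTQ) and the scaffold (FASTA, hence -f).
align_to_bam H37Rv sample.fastq sample
align_to_bam H37Rv HN878.fasta HN878 -f
# 3. In IGV, load H37Rv.fasta as the genome, then open sample.bam and HN878.bam.
```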


              • #8
                Thanks GenoMax and ctseto. I am having trouble mapping the scaffold genome fasta to the indexed reference genome using bowtie2.
                Error: reads file does not look like a FASTQ file
                libc++abi.dylib: terminate called throwing an exception
                bowtie2-align died with signal 6 (ABRT)
                Anything else I could try?

                Thanks
                Joyce


                • #9
                  Originally posted by Bubblepig:
                  Thanks Genomax and ctseto. I am having trouble mapping the scaffold genome fasta to the indexed reference genome using bowtie2.

                  Anything else I could try?

                  Thanks
                  Joyce

                  Are you remembering to include the "-f" option to indicate that your scaffold file is in "fasta" format?
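
As a quick sanity check before choosing the input flag, the format can be guessed from the first character of the file: FASTA records start with ">" and FASTQ records start with "@". The sketch below is only a heuristic, and the function names guess_seq_format and bowtie2_format_flag are made up for illustration. The flags it prints, -f (FASTA) and -q (FASTQ, the default), are bowtie2's own.

```shell
#!/bin/sh
# Sketch: guess the format from the first character of the first non-empty line.
# '>' starts a FASTA record and '@' starts a FASTQ record.
guess_seq_format() {
    first_char="$(awk 'NF { print substr($0, 1, 1); exit }' "$1")"
    case "$first_char" in
        '>') echo fasta ;;
        '@') echo fastq ;;
        *)   echo unknown ;;
    esac
}

# Map the guess to the bowtie2 input flag (-f for FASTA, -q for FASTQ).
# Returns non-zero when the format is not recognised.
bowtie2_format_flag() {
    case "$(guess_seq_format "$1")" in
        fasta) printf '%s\n' "-f" ;;
        fastq) printf '%s\n' "-q" ;;
        *)     return 1 ;;
    esac
}
```

For example, `guess_seq_format HN878.fasta` would be expected to print `fasta` for a well-formed scaffold file, in which case bowtie2 needs -f.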


                  • #10
                    Genomax, where should I indicate the -f option?

                    Here's the command:
                    wpa031012:bowtie2-2.1.0 Behr$ /Users/Behr/bowtie2-2.1.0/bowtie2-align -p 4 -x H37Rv -U /Users/Behr/bowtie2-2.1.0/HN878.fasta -S HN878aligned.sam
                    -x H37Rv is the indexed reference genome in bt2 format
                    HN878.fasta is the scaffold genome
                    -S HN878aligned.sam is the output file I want.

                    Thanks!!
                    Joyce


                    • #11
                      Try

                      Code:
                      $ /Users/Behr/bowtie2-2.1.0/bowtie2 -p 4 -f -x H37Rv -U /Users/Behr/bowtie2-2.1.0/HN878.fasta -S HN878aligned.sam

                      Why are you running bowtie2-align directly? Run the wrapper script "bowtie2" instead, as indicated above.


                      • #12
                        Hi Genomax,

                        Thanks, I tried your command:

                        /Users/Behr/bowtie2-2.1.0/bowtie2 -p 4 -f -x H37Rv -U /Users/Behr/bowtie2-2.1.0/HN878.fasta -S HN878aligned.sam
                        After several minutes the HN878aligned.sam file was still 0 KB, and the computer became abnormally slow. The HN878.fasta file itself is only 4.4 MB, so I suspected something was going wrong.
                        I hit Control+C to terminate the command and got this message:

                        ^Cbowtie2-align died with signal 2 (INT)
                        Any idea?

                        Thanks
                        Joyce
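
One thing that may be worth ruling out here (a guess, not something confirmed in this thread): Bowtie 2 is built for sequencing reads, and its manual describes it as best suited to reads of roughly 50 up to 100s or 1,000s of characters. With -U, every FASTA record is treated as one read, so it can help to see how many records the scaffold file holds and how long the longest one is. The function name fasta_stats is made up for illustration.

```shell
#!/bin/sh
# Sketch: count FASTA records and report the longest sequence length.
# Assumes plain-text FASTA with Unix line endings.
fasta_stats() {
    awk '
        BEGIN { n = 0; max = 0; len = 0 }
        /^>/  { if (len > max) max = len; n++; len = 0; next }
              { len += length($0) }
        END   { if (len > max) max = len
                printf "records=%d longest=%d\n", n, max }
    ' "$1"
}
```

Running `fasta_stats HN878.fasta` would show whether the "reads" being aligned are far longer than ordinary sequencing reads.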


                        • #13
                          How much RAM do you have (are you using a virtual machine to run bowtie)? How big is your genome index file?


                          • #14
                            It's a MacBook, so I used Terminal.
                            Here are the specs:
                            Processor 2.9 GHz Intel Core i7
                            Memory 8 GB 1600 MHz DDR3

                            There are 6 files for the indexed genome, all in bt2 format. They add up to less than 15 MB!
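
A small Bowtie 2 index consists of six files: prefix.1.bt2, prefix.2.bt2, prefix.3.bt2, prefix.4.bt2, prefix.rev.1.bt2 and prefix.rev.2.bt2. The sketch below checks that all six are present for a given prefix and adds up their sizes. The function name check_bt2_index is made up for illustration, and the check assumes a small index (large indexes use the .bt2l extension instead).

```shell
#!/bin/sh
# Sketch: verify that the six small-index files exist and sum their sizes.
# Assumes a small index (.bt2); large indexes (.bt2l) are not handled.
check_bt2_index() {
    prefix="$1"
    total=0
    missing=0
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        f="$prefix.$suffix"
        if [ -f "$f" ]; then
            size="$(wc -c < "$f" | tr -d ' ')"
            total=$((total + size))
        else
            echo "missing: $f"
            missing=$((missing + 1))
        fi
    done
    echo "total_bytes=$total missing=$missing"
    # Succeed only when every expected file was found.
    [ "$missing" -eq 0 ]
}
```

Here that would be `check_bt2_index H37Rv`, run from the directory holding the index. `bowtie2-inspect -s H37Rv` is another way to confirm the index is readable.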


                            • #15
                              Have you used bowtie2 on this machine before? Did you create the index files on this machine?
