  • Alignment tool for long reads?

    Hi,

    I'm just starting to work with some long reads (>1 kbp) from a PacBio sequencer, and I see that my usual alignment tools, such as MEGA, DNA STAR, bowtie, and bwa, are all restricted to shorter sequences (<500 bp). Does anybody have good experience with alignment tools that can handle reads longer than 1 kbp, up to about 2.5 kbp?

    TiA, Nash

  • #2
    long reads

    Maybe BWA-SW.


    • #3
      I used BLAT for reference-based scaffolding by aligning contigs to scaffolds, so I assume it should work with long reads as well.


      • #4
        Blat

        Agreed, BLAT is another good option.


        • #5
          Hi All

          We have actually developed a fast and accurate aligner named BLASR (Basic Local Alignment with Successive Refinement) - http://www.smrtcommunity.com/SMRT-An...gorithms/BLASR to align our long reads. The source code for this as well as the full analysis software suite is freely available at the same PacBio DevNet site. A publication on this algorithm is also currently in review, so stay tuned.


          • #6
            Thanks for posting. I look forward to the publication as I am also trying to map PacBio reads.


            • #7
              Re: Alignment tool for long reads

              Thank you all for suggesting blat and bwa sw for aligning long reads. I am looking at the docs for bwa sw and it looks like in the command:

              bwa bwasw database.fasta long_read.fastq >aln.sam

              the parameter "database.fasta" is the reference and the parameter "long_read.fastq" is the sequence being aligned. Right?

              So does it absolutely need the FASTQ file, or can it work without the quality data, i.e., with just a *.fasta file? Also, how about the CCS-based output from PacBio? Has anybody tried the PacBio CCS outputs? I'm trying to get the "blasr" tool from the PacBio pipeline installed here, but I am not there yet...

              Thanks in advance,

              Nash
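
The command quoted in this post can be sketched as a small Python helper that assembles the two command lines involved. This is only an illustration: the helper name `bwasw_commands` and the default output name are hypothetical, and the sketch assumes the reference still needs to be indexed with `bwa index` before `bwa bwasw` is run. The first argument to `bwa bwasw` is the reference and the second is the file of reads to align.

```python
import shlex


def bwasw_commands(reference, reads, out_sam="aln.sam"):
    """Return (index_line, align_line) as shell command strings.

    Hypothetical helper for illustration; it only builds the strings
    and does not run bwa.
    """
    # Step 1: build the index for the reference (needed once per reference).
    index_cmd = ["bwa", "index", reference]
    # Step 2: align the long reads; bwa writes SAM to standard output.
    align_cmd = ["bwa", "bwasw", reference, reads]
    return (
        shlex.join(index_cmd),
        shlex.join(align_cmd) + " > " + shlex.quote(out_sam),
    )


index_line, align_line = bwasw_commands("database.fasta", "long_read.fastq")
print(index_line)
print(align_line)
```

Building the argument lists first and quoting them with `shlex` keeps file names with spaces from breaking the command line.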


              • #8
                Hi Nash,
                You can install blasr on your own using github (https://github.com/PacificBiosciences/blasr).

                If you have the hdf files, there are options (-useccsdenovo) to align the ccs sequences instead of the raw subreads.

                HTH,
                -mark



                • #9
                  Thank you, Mark. I don't have access to the HDF5 files yet; they are with the core sequencing facility, and I am not sure they will give me those right now. But I am working with them to gradually get some of the pipeline tools running locally on my new Ubuntu machine, which still needs memory upgrades before I can run your pipeline tools.

                  Yeah, I hope to run blasr soon but, in the meantime, I am trying to learn some of these long read tools that I haven't worked with before. Do you know if you have to have the fastq files for bwa sw?

                  Nash


                  • #10
                    bwa sw aligns fasta sequences.

                    You will want the bas.h5 files since they have additional information about subread coordinates.
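
Since FASTA input is enough here, reads that are only on hand as FASTQ can have their quality lines dropped. A minimal sketch follows, assuming simple four-line FASTQ records (no wrapped sequence lines); the function name `fastq_to_fasta` and the demo reads are made up for illustration.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Copy four-line FASTQ records to FASTA, dropping qualities.

    Returns the number of records written. Assumes each record is exactly
    four lines: '@' header, sequence, '+' separator, quality string.
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        seq = fastq_handle.readline().rstrip("\n")
        fastq_handle.readline()  # '+' separator line, ignored
        fastq_handle.readline()  # quality line, ignored
        fasta_handle.write(">" + header[1:].rstrip("\n") + "\n" + seq + "\n")
        count += 1
    return count


# Two made-up reads, held in memory so the example needs no files.
demo = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n")
out = io.StringIO()
n = fastq_to_fasta(demo, out)
print(n)
print(out.getvalue(), end="")
```

With real data, the two `io.StringIO` objects would be replaced by files opened for reading and writing.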


                    • #11
                      blasr compilation

                      Mark,

                      I am trying to compile blasr on my machine and am missing some header files in the tar file distribution. Can you please point me to someone who can help, or provide the *.h files needed? Thanks much,

                      Nash


                      • #12
                        A quicker, refined flavour of BLAT is BFAST: http://www.plosone.org/article/info:...l.pone.0007767


                        • #13
                          PacBio "blasr" questions

                          Perhaps I should really start a new thread, but does anybody on this forum have good experience with blasr alignments and could discuss the various run options and the several output formats that are available? I have just started playing with some blasr runs, and I have some pointed questions I'd like to ask, and/or I'd like detailed docs to refer to for understanding all the options and outputs.

                          Any help available in this forum?

                          Thanks much in advance for any pointers,

                          Nash


                          • #14
                            You could say I'm pretty familiar with blasr output (I'm the author).

                            Most of the help may be found by running blasr -h, or blasr -help for detailed help. There are many output formats including tabular ones for which you can get column labels with the -header option, human readable output (-m 0), and sam (specified by -sam).

                            -mark
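
The output options mentioned in this post can be collected into a small command builder. This is a sketch under stated assumptions, not the documented interface: the helper name `blasr_command` and the file names are hypothetical, and the positional order (reads first, then reference) is an assumption that should be checked against `blasr -h`. Only the flags named in the post (`-sam`, `-m`, `-header`) are used.

```python
def blasr_command(reads, reference, sam=False, m=None, header=False):
    """Build a blasr argument list (hypothetical helper; nothing is run).

    sam    -- request SAM output with -sam
    m      -- numeric -m format code, e.g. 0 for human-readable output
    header -- add -header, which the post describes for tabular formats
    """
    if sam and m is not None:
        raise ValueError("choose either sam=True or an -m format, not both")
    # Assumption: reads come first, then the reference; verify with blasr -h.
    cmd = ["blasr", reads, reference]
    if sam:
        cmd.append("-sam")
    elif m is not None:
        cmd += ["-m", str(m)]
    if header:
        cmd.append("-header")
    return cmd


sam_cmd = blasr_command("reads.fasta", "ref.fasta", sam=True)
human_cmd = blasr_command("reads.fasta", "ref.fasta", m=0)
print(" ".join(sam_cmd))
print(" ".join(human_cmd))
```

Returning a list rather than a string means the result can be passed directly to `subprocess.run` once the options have been confirmed against the help text.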


                            • #15
                              blasr output

                              Mark,

                              That's great to know. I have printed out the help pages, but I still have unanswered questions. Would you like to take this discussion offline, or do you want me to post the questions right here? If there's a special PacBio support site for blasr, I can reach you through that. Please let me know what is convenient for you. Thanks much,

                              Nash
