  • Bambus2 Input files

    Dear all,
    I have merged three-four velvet assemblies using minimus2 and now I have .contig and .bnk/ files, I want to do scaffolding of these generated contigs using goBambus2. As goBambus2(Amos package) requires .mate file, How can I generate this file? I did my assemblies from Illumina paired-end reads. Is this .mate file refer to that pair information? I read many forums but did not get any solution. Can anybody please suggest me how can I proceed further?
    Best regards,
    Rahul
    Rahul Sharma,
    Ph.D
    Frankfurt am Main, Germany

  • #2
    Does this info at the AMOS Web site help?

    Download AMOS for free. AMOS is a collection of tools for genome assembly. AMOS is a collection of tools and class interfaces for the assembly of DNA reads. The package includes a robust infrastructure, modular assembly pipelines, and tools for overlapping, consensus generation, contigging, and assembly manipulation.



    • #3
      Yes, I checked this link. I have a .contig file and a .bnk/ directory. How do I generate the .mate file? Should I use the sed command on the Illumina paired-end read files, as discussed in some posts? The IDs in my .mate file and my .contig file do not correspond to each other. My .contig file has IDs like #NODE_1_length_1305_cov_18.627586(0) from Velvet, while the .mate file has Illumina IDs such as @HWUSI-EAS100R:6:73:941:1973#0/1 and @HWUSI-EAS100R:6:73:941:1973#0/2. How can I link this information? Can anybody please help?
      Regards,
      Rahul
      Rahul Sharma,
      Ph.D
      Frankfurt am Main, Germany



      • #4
        Did you merge Velvet assemblies from which you had extracted the contigs? If you instead merged Velvet output that was already scaffolded, and then merged those scaffolds using Minimus2, I would guess that you will not gain anything by running goBambus2, because your assembly is already scaffolded.

        The .mate file refers to your Illumina read IDs, and goBambus2 needs those IDs inside the AMOS bank. goBambus2 also needs to know where in each contig the different reads map, and I'm not sure how you would get that information into the bank. If you generated the .afg file when you ran Velvet, and used that file when you merged the assemblies, I think goBambus2 would have all the information it needs in the bank.

        You can either try
        Code:
        goBambus2 <your AMOS bank> <output prefix>
        or I would guess that the output of Minimus2 is as good as you can get your assembly.

        Ole



        • #5
          Dear Ole,
          Thanks for your message. Yes, I ran Velvet with scaffolding turned off, got 4-5 assemblies, and then merged them with Minimus2. So the assemblies I have are only contigs, without scaffolding and without N's in them.
          Yes, I tried the command you mentioned above, but it didn't work.

          Rahul
          Rahul Sharma,
          Ph.D
          Frankfurt am Main, Germany



          • #6
            Hi Rahul.

            I think I know what you can do. ABySS has a script called abyss-samtoafg, or you can find it in the AMOS repository too (direct link). What you need to do is to align your reads to the merged contigs using BWA or Bowtie or something else, resulting in a .sam file. Then you can use samtoafg.pl like this:
            Code:
            samtoafg.pl contigs.fa alignments.sam >assembly.afg
            You can optionally provide the mean fragment size, the standard deviation of the fragment size, and so on; see samtoafg.pl --help for the options.

            When you have done this, and have a .afg file, you can create a .bnk:
            Code:
            bank-transact -cb assembly.bnk -m assembly.afg
            I think, but haven't tried yet, that you can then just run Bambus2 on the resulting bank to scaffold your merged contigs.

            Hope this helps.

            Ole
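The steps in this post can be chained into one sequence. Below is a minimal dry-run sketch that only prints the commands rather than running them. The function name `print_scaffold_plan` and all file names are hypothetical placeholders. The `bwa index` and `bwa mem` lines are an assumption (the post only says "BWA or Bowtie or something else"), and the final `goBambus2` line follows the `<your AMOS bank> <output prefix>` form quoted earlier in the thread. The `samtoafg.pl` and `bank-transact` lines are copied from the post. Whether the resulting bank carries usable mate-pair links is not confirmed here.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the align -> .afg -> bank -> scaffold commands.
# All file names are placeholders; the bwa lines are an assumed aligner choice.
print_scaffold_plan() {
    local contigs=$1 reads1=$2 reads2=$3 prefix=$4
    # 1. Align the paired reads to the merged contigs (assumed: bwa mem).
    echo "bwa index ${contigs}"
    echo "bwa mem ${contigs} ${reads1} ${reads2} > ${prefix}.sam"
    # 2. Convert contigs plus alignments to an AMOS message file (from the post).
    echo "samtoafg.pl ${contigs} ${prefix}.sam > ${prefix}.afg"
    # 3. Load the message file into a new AMOS bank (from the post).
    echo "bank-transact -cb ${prefix}.bnk -m ${prefix}.afg"
    # 4. Scaffold from the bank (form quoted earlier in the thread).
    echo "goBambus2 ${prefix}.bnk ${prefix}_scaffolds"
}

print_scaffold_plan contigs.fa reads_1.fastq reads_2.fastq assembly
```

Printing the plan first makes it easy to check paths and swap in another aligner before committing to a long run.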



            • #7
              Dear Ole,

              Many thanks for your great help. The good news is that it's working! I tried the above commands with the SSPACE demo data in the example/ directory. Unfortunately, the Bambus2 scaffolds gave a very small genome size, whereas SSPACE generated very good results.

              Summary:
              Bambus2 Scaffolding:-
              Scaffolds_eco.contigs.fasta:595
              Scaffolds_eco.scaffold.fasta:23
              Scaffolds_eco.scaffold.linear.fasta:488

              SSPACE scaffolds were: 111(with_extension) and 127(without_extension)
              I have not checked synteny with MUMmer yet. I also have another question regarding the commands you mentioned above: will they include the mate-pair information as well if I run Bambus2 with the bank/ option? I am also trying to do this with a .contig and a .mate file, using the following commands:
              samtoafg.pl contigs.fa alignments.sam >assembly.afg
              bank-transact -cb assembly.bnk -m assembly.afg
              bank2contig assembly.bnk/ > assembly.contig
              cat SRR001665_1.fastq | grep "^@SRR" | sed s/@//g | awk '{print $1"/1""\t"$1"/2""\tsmall"}' > assembly.mate

              goBambus2 assembly.contigs myoutput --all --contigs
              Thanks,
              Rahul
              Rahul Sharma,
              Ph.D
              Frankfurt am Main, Germany
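The `grep`/`sed`/`awk` one-liner above can be written as a single `awk` call that picks header lines by position instead of by prefix. This is a minimal sketch, assuming strict 4-line FASTQ records and that both mates share the same base name. The function name `make_mate` is hypothetical, and the three-column output (read 1, read 2, library name) simply mirrors the one-liner in this post. Whether these names match the read names stored in the bank is not confirmed here.

```shell
#!/usr/bin/env bash
# Sketch: build "read/1<TAB>read/2<TAB>library" lines from the first FASTQ of a pair.
# Assumes strict 4-line FASTQ records; the library name defaults to "small".
make_mate() {
    local library=${1:-small}
    awk -v lib="$library" '
        NR % 4 == 1 {                  # header line of each 4-line record
            id = substr($1, 2)         # drop the leading "@"
            sub(/\/[12]$/, "", id)     # drop an existing /1 or /2 suffix, if any
            print id "/1\t" id "/2\t" lib
        }'
}

# Small inline demo record; a real run would be: make_mate small < SRR001665_1.fastq
printf '@SRR001665.1 length=4\nACGT\n+\nIIII\n' | make_mate small
```

Selecting every fourth line avoids accidentally matching a quality string that happens to start with `@`, which a prefix-based `grep` cannot rule out.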



              • #8
                After carefully reading the Bambus2 paper and thinking it over, I think your approach did not take advantage of what Bambus2 offers.

                1) The paper says the software can resolve variation motifs.
                2) Variation motifs can be represented during de Bruijn graph construction.
                3) BWA or another short-read aligner probably fails to reconstruct large variation motifs.

                But you used a .afg file derived from SAM alignments to do the scaffolding. I am sure Bambus2 is not designed or optimised to handle this kind of situation.
