  • crAss for comparative metagenomics using cross-assembly

    Cross-assembly of reads from different metagenomes allows us to assess the degree of similarity between the sampled communities. crAss is a tool to analyse the cross-assembly files. It creates a distance measure between metagenome pairs using several possible distance formulas. Please share your experiences with us!
    The tool is available online at http://edwards.sdsu.edu/crass/ and for download at https://sourceforge.net/projects/crass/

  • #2
    Greetings,
    I'm very interested in this tool. I've tried it, but when I load my datasets the program tells me that it doesn't recognize any reads coming from my unassembled, single metagenomes.
    I assembled my samples by concatenating them together and using Cap3 as the assembler. Each read in my samples has a unique identifier. Do I need to change it in order to tell the program which read comes from which metagenome?
    Thanks in advance.

    Michael Tangherlini


    • #3
      Hi Michael, my guess is that Cap3 does not list the read IDs in the ACE format in the same way as other assemblers. Could you send me your Fasta files and the resulting ACE file in an email? Maybe a small example of an assembly of just a couple of reads would be better than the whole thing at once. Then I'll take a look! Best, Bas


      • #4
        Hello Bas,
        I managed to solve the issue by simply adding the sample name as a prefix to each read ID. But I experienced issues with the website: after I upload my sequences, the page doesn't seem to update. The wheel on the left, after starting the run, keeps spinning forever.
        The program works great locally, though. I've installed it and it analyzes my data in mere seconds.
        I have a question regarding the plot, though. For 3 samples, crAss automatically generates a three-dimensional box, but I'm not able to understand what the dots and points stand for. I mean, I see that dots on each surface represent the contigs plotted along each axis according to the number of reads used to assemble them, but what about the points? What do they represent?
        Best regards

        Michael Tangherlini
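As an illustration of the workaround described in #4, here is a minimal Python sketch that prepends a sample name to every read ID in a FASTA file before the files are concatenated and assembled. The function name `prefix_fasta_ids` and the underscore separator are assumptions made for this example; the thread does not say which separator was used or which ID pattern crAss expects, so check the crAss help page first.

```python
def prefix_fasta_ids(lines, sample, sep="_"):
    """Yield FASTA lines with `sample + sep` prepended to every read ID.

    `sep` is an assumed separator, not one confirmed for crAss.
    """
    for line in lines:
        if line.startswith(">"):
            # Header line: keep '>' and put the sample prefix before the ID.
            yield ">" + sample + sep + line[1:]
        else:
            # Sequence line: pass through unchanged.
            yield line


# Tiny in-memory example; real input would be the lines of each FASTA file.
example = [">read1 length=4\n", "ACGT\n", ">read2\n", "GGCC\n"]
prefixed = list(prefix_fasta_ids(example, "sampleA"))
print("".join(prefixed), end="")
```

Running the same function once per metagenome, each with its own sample name, keeps the read IDs unique and makes the source of every read visible in the assembler output.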


        • #5
          Great that you managed to solve the issue! I will ask Rob Schmieder to take a look at the website when he's back at his desk in August (he made it). Could it be that the files are really big and just take a very long time to upload? You could try to make them smaller using the commands shown on the help page.

          The dots in the 3D plot are the projections of the points onto the three planes (superimposed on the points for visibility).


          • #6
            Hi Bas,
            Thanks for the reply! No, the files have already been treated to greatly reduce their size as specified on the website. On that PC I've been using Firefox on CentOS; now I'm repeating it with Chrome on Win7 and get the same result: the wheel keeps spinning.
            Is there a way to change the plot into a triangle plot like the one visible on the website? I fear that this kind of visualization is not so clear (at least for my samples).
            Kind regards

            Michael Tangherlini


            • #7
              The stand-alone script should also output both plots; we will fix that! Meanwhile, you can take the data from the "output.contigs2reads.txt" file, which lists the number of reads from each dataset assembled into each contig, and make your own plot. If you want, you can also filter out the unassembled reads (listed at the end of the file) because they will not really add much info to the plot: they will all overlap in the same three points: (0,0,1), (0,1,0) and (1,0,0).
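The suggestion above can be sketched in Python. This is only an illustration: it assumes that each line of "output.contigs2reads.txt" holds a contig (or read) name followed by tab-separated read counts, one per dataset, which is not confirmed in this thread. It also treats any row with fewer than two reads in total as an unassembled read. The function names `parse_counts` and `to_fractions` are hypothetical.

```python
def parse_counts(text):
    """Parse lines of 'name<TAB>count1<TAB>count2...' into (name, counts).

    The tab-separated layout is an assumption about the file format.
    """
    rows = []
    for line in text.splitlines():
        fields = line.strip().split("\t")
        if len(fields) < 2:
            continue  # blank or malformed line
        try:
            counts = [int(x) for x in fields[1:]]
        except ValueError:
            continue  # non-numeric row, e.g. a header line
        rows.append((fields[0], counts))
    return rows


def to_fractions(rows, min_reads=2):
    """Drop rows with fewer than `min_reads` reads and normalize the rest.

    Each remaining row becomes per-dataset fractions that sum to 1, so a
    contig built from one dataset only lands on a corner such as (1, 0, 0).
    """
    result = []
    for name, counts in rows:
        total = sum(counts)
        if total < min_reads:
            continue  # assumed to be an unassembled single read
        result.append((name, [c / total for c in counts]))
    return result


# Made-up example data in the assumed layout.
example = "contig1\t2\t1\t1\ncontig2\t0\t4\t0\nread9\t0\t1\t0\n"
points = to_fractions(parse_counts(example))
for name, fractions in points:
    print(name, fractions)
```

The resulting fractions can be written to a file and plotted with any tool, either as points in a 3D box or as a triangle plot.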


              • #8
                Hello Bas,
                I've used the "output.contigs2reads.txt" file provided by the program. I loaded it into a spreadsheet and, after removing the single, unassembled reads, assigned each contig a specific color. I created the color code by assigning one primary color of the RGB model to each of my three samples (so one was Red, one was Green and one was Blue), normalizing each contribution to the 0-255 range and deriving the corresponding hex value from the mix of the three single values.
                I then exported the spreadsheet and used the hex value as a fourth column in GNUplot to create a 3D plot as you do with crAss, except that I plot only the contigs as points, colored according to the contribution of each metagenome to the assembly. Pretty nifty.

                Michael Tangherlini
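The color coding described in #8 could look roughly like the following Python sketch. It is a guess at the procedure, not the spreadsheet formula actually used: the read counts of the three samples are scaled by their sum (an assumption; scaling by the largest count would be another reasonable choice), mapped to the 0-255 range, and joined into a hex string. The function name `contribution_to_hex` is hypothetical.

```python
def contribution_to_hex(counts):
    """Map three read counts (red, green, blue sample) to an RGB hex string.

    Each channel is the sample's share of the contig's reads, scaled to 0-255.
    """
    if len(counts) != 3:
        raise ValueError("expected exactly three counts, one per sample")
    total = sum(counts)
    if total == 0:
        return "#000000"  # no reads at all: fall back to black
    channels = [round(255 * c / total) for c in counts]
    return "#{:02X}{:02X}{:02X}".format(*channels)


# A contig built only from the first sample comes out pure red.
print(contribution_to_hex([10, 0, 0]))
# A contig built only from the third sample comes out pure blue.
print(contribution_to_hex([0, 0, 7]))
```

Contigs shared between samples get mixed colors, which is what makes the plot informative but, with many points, also harder to read.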


                • #9
                  Wow, that sounds great! Make sure you share it when it's published.
                  I'll see if we can include it in our plots too...
                  Last edited by dutilh; 07-04-2012, 06:53 AM. Reason: update


                  • #10
                    The end plot looks quite cluttered, though, and is a little hard to understand. So I did something different: I plotted the cross-assembled contigs using four colors. Colors 1, 2 and 3 stand for the three metagenomes considered and mark the largest contributor to each contig. Color 4 is for contigs with ties in contribution.
                    I can show the plot, since it's just an attempt with very preliminary data.

                    -MikeT
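A minimal Python sketch of the four-category scheme from #10, under the assumption that "most important contributor" means the sample with the highest raw read count in the contig (the post does not say whether counts were normalized first). The function name `dominant_category` is hypothetical.

```python
def dominant_category(counts):
    """Return 1, 2 or 3 for the sample contributing most reads, 4 for a tie.

    `counts` holds the read counts of the three metagenomes for one contig.
    More generally, a tie is reported as len(counts) + 1.
    """
    top = max(counts)
    winners = [i for i, c in enumerate(counts) if c == top]
    if len(winners) > 1:
        return len(counts) + 1  # two or more samples share the maximum
    return winners[0] + 1  # categories are 1-based


print(dominant_category([5, 1, 0]))  # first metagenome dominates
print(dominant_category([2, 2, 0]))  # tie between the first two
```

The returned category can then serve as a color index column in the plotting tool of choice.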
                    Attached Files


                    • #11
                      The first paper that cites crAss is out: "Taxonomic and Functional Microbial Signatures of the Endemic Marine Sponge Arenosclera brasiliensis" by Trindade-Silva et al.:
                      The endemic marine sponge Arenosclera brasiliensis (Porifera, Demospongiae, Haplosclerida) is a known source of secondary metabolites such as arenosclerins A-C. In the present study, we established the composition of the A. brasiliensis microbiome and the metabolic pathways associated with this community. We used 454 shotgun pyrosequencing to generate approximately 640,000 high-quality sponge-derived sequences (∼150 Mb). Clustering analysis including sponge, seawater and twenty-three other metagenomes derived from marine animal microbiomes shows that A. brasiliensis contains a specific microbiome. Fourteen bacterial phyla (including Proteobacteria, Cyanobacteria, Actinobacteria, Bacteroidetes, Firmicutes and Chloroflexi) were consistently found in the A. brasiliensis metagenomes. The A. brasiliensis microbiome is enriched for Betaproteobacteria (e.g., Burkholderia) and Gammaproteobacteria (e.g., Pseudomonas and Alteromonas) compared with the surrounding planktonic microbial communities. Functional analysis based on Rapid Annotation using Subsystem Technology (RAST) indicated that the A. brasiliensis microbiome is enriched for sequences associated with membrane transport and one-carbon metabolism. In addition, there was an overrepresentation of sequences associated with aerobic and anaerobic metabolism as well as the synthesis and degradation of secondary metabolites. This study represents the first analysis of sponge-associated microbial communities via shotgun pyrosequencing, a strategy commonly applied in similar analyses in other marine invertebrate hosts, such as corals and algae. We demonstrate that A. brasiliensis has a unique microbiome that is distinct from that of the surrounding planktonic microbes and from other marine organisms, indicating a species-specific microbiome.

                      It cites the website, as the paper is still under review.


                      • #12
                        And another paper that cites crAss is out: "Going viral: next-generation sequencing applied to phage populations in the human gut" by Reyes et al.:

                        Still citing the website ...
                        Last edited by dutilh; 09-03-2012, 07:29 AM.


                        • #13
                          crAss has now been published in Bioinformatics:


                          • #14
                            crAss Plots

                            Hello Bas:
                            Congrats on the publication! It lends credibility to all of our efforts to use crAss. I've run two experiments and all went well with one exception and I hope that you can provide some advice. The second run has 3 sets of reads contributing to the cross-assembly. The distance matrices were as expected but there was no plot at the end. The toggle between 3D and Triangle options is there and enabled but no plot shows. The output (output.contigs2reads.txt) is there and is populated with appropriate data. For reference, a similar analysis of 2 other data sets resulted in a 2D plot so it doesn't appear to be a browser issue.
                            Thank you in advance for providing advice,
                            Bonnie Brown


                            • #15
                              Hi Bonnie,
                              We have just installed a bigger server to run crAss, which might have caused some of the problems with the online version of crAss.
                              Second, the new version of DrawTree (3.69) creates an error in the plot for some cladograms (due to zero branch lengths), for example those in the toy example that we provide on the SourceForge site. This results in no output image being created (this should not be a problem for real datasets).
                              Finally, I've now uploaded a new version (v1.2) that allows the user to more easily change the executable commands for BioNJ, GNUplot, GhostScript and DrawTree.
                              I hope that everything is working for you now, let me know if you still have trouble! Best, Bas
