crAss for comparative metagenomics using cross-assembly


  • crAss for comparative metagenomics using cross-assembly

    Cross-assembly of reads from different metagenomes allows us to assess the degree of similarity between the sampled communities. crAss is a tool to analyse the cross-assembly files. It creates a distance measure between metagenome pairs using several possible distance formulas. Please share your experiences with us!
    The tool is available online at http://edwards.sdsu.edu/crass/ and for download at https://sourceforge.net/projects/crass/

  • #2
    Greetings,
    I'm very interested in this tool. I've tried it, but when I load my datasets the program tells me that it doesn't recognize any reads coming from my unassembled, single metagenomes.
    I assembled my samples by concatenating them together and using Cap3 as the assembler. Each read in my samples has a unique identifier. Do I need to change it in order to tell the program which read comes from which metagenome?
    Thanks in advance.

    Michael Tangherlini


    • #3
      Hi Michael, my guess is that Cap3 does not list the read IDs in the ACE format in the same way as other assemblers. Could you send me your Fasta files and the resulting ACE file in an email? Maybe a small example of an assembly of just a couple of reads would be better than the whole thing at once. Then I'll take a look! Best, Bas


      • #4
        Hello Bas,
        I managed to solve the issue by simply adding the sample name as a prefix to each read ID. But I experienced issues with the website: after I upload my sequences, the page doesn't seem to update. The wheel on the left keeps spinning forever after I start the run.
        Locally, however, the program works great. I've installed it and it analyzes my data in a matter of seconds.
        I have a question regarding the plot, though. For 3 samples, crAss automatically generates a three-dimensional box, but I'm not able to understand what the dots and points stand for. I mean, I see that dots on each surface represent the contigs plotted along each axis according to the number of reads used to assemble them, but what about the points? What do they represent?
        Best regards

        Michael Tangherlini
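For anyone hitting the same problem, the prefixing fix described in this post could look roughly like the sketch below. It is only an illustration: the thread does not say which separator crAss expects between the sample name and the read ID, so the underscore, the function name `prefix_fasta_ids` and the sample name `sampleA` are all assumptions.

```python
import io


def prefix_fasta_ids(lines, sample, sep="_"):
    """Yield FASTA lines with `sample + sep` prepended to every read ID.

    `sep` is an assumed separator; check what crAss expects for your data.
    """
    for line in lines:
        if line.startswith(">"):
            # Header line: insert the prefix right after ">", keeping any
            # description that follows the ID untouched.
            yield ">" + sample + sep + line[1:]
        else:
            # Sequence line: pass through unchanged.
            yield line


# Tiny in-memory example; real use would loop over one FASTA file per sample
# and write the prefixed reads out before concatenating and assembling.
demo = io.StringIO(">read1 length=4\nACGT\n>read2\nGGCC\n")
prefixed = "".join(prefix_fasta_ids(demo, "sampleA"))
print(prefixed)
```

The prefix has to be added before the samples are concatenated and assembled, so that the read names in the ACE file carry the sample name too.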


        • #5
          Great that you managed to solve the issue! I will ask Rob Schmieder to take a look at the website when he's back at his desk in August (he made it). Could it be that the files are really big and just take a very long time to upload? You could try to make them smaller using the commands shown on the help page.

          The dots in the 3D plot are the projections of the points onto the three planes (superimposed on the points for visibility).


          • #6
            Hi Bas,
            Thanks for the reply! No, the files have already been treated to greatly reduce their size as specified on the website. On that PC I was using Firefox on CentOS; now I'm repeating it with Chrome on Win7 and get the same result: the wheel keeps spinning.
            Is there a way to change the plot into a triangle plot like the one visible on the website? I fear that the 3D visualization is not so clear (at least for my samples).
            Kind regards

            Michael Tangherlini


            • #7
              The stand-alone script should also output both plots; we will fix that! Meanwhile, you can take the data from the "output.contigs2reads.txt" file, which lists the number of reads from each dataset assembled into each contig, and make your own plot. If you want, you can also filter out the unassembled reads (listed at the end of the file) because they will not really add much info to the plot: they will all overlap in the same three points: (0,0,1), (0,1,0) and (1,0,0).
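A minimal sketch of that do-it-yourself route is below. The layout of "output.contigs2reads.txt" is not described in this thread, so the code assumes a tab-separated file with a contig or read name in the first column and one read count per dataset after it; adjust the parsing if the real file differs. The function names and the demo rows are hypothetical. The sketch drops entries built from a single read, scales each contig's counts to fractions that sum to 1, and converts three fractions into 2D coordinates for a triangle plot.

```python
import math


def parse_counts(lines):
    """Return (name, [count per dataset]) for each data row.

    Assumed layout: name<TAB>count1<TAB>count2... (not confirmed by the thread).
    """
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or line.startswith("#"):
            continue  # blank or comment line
        try:
            counts = [int(x) for x in fields[1:]]
        except ValueError:
            continue  # header or other non-numeric row
        rows.append((fields[0], counts))
    return rows


def fractions(counts):
    """Scale read counts so they sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def ternary_xy(a, b, c):
    """Map fractions (a, b, c) with a + b + c = 1 onto a triangle whose
    corners are (0, 0), (1, 0) and (0.5, sqrt(3)/2)."""
    return (b + c / 2.0, c * math.sqrt(3) / 2.0)


demo = ["contig1\t2\t1\t1\n", "contig2\t0\t3\t3\n", "readX\t0\t0\t1\n"]
# Keep only entries assembled from more than one read.
rows = [(name, counts) for name, counts in parse_counts(demo) if sum(counts) > 1]
points = {name: fractions(counts) for name, counts in rows}
triangle = {name: ternary_xy(*frac) for name, frac in points.items()}
```

With three datasets, a contig made only of reads from one sample lands on a corner of the triangle, and a contig shared equally by all three lands in the centre.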


              • #8
                Hello Bas,
                I've used the "output.contigs2reads.txt" file provided by the program. I loaded it into a spreadsheet and assigned a specific color to each contig (after removing the single, unassembled reads). I created the color code by assigning one color of the RGB spectrum to each of my three samples (so one was Red, one was Green and one was Blue), normalizing each contribution to the 0-255 range and deriving the corresponding hex value from the mix of the three single values.
                I then exported the spreadsheet and used the hex value as a fourth column in GNUplot to create a 3D plot as you do with crAss, except that mine shows only the contigs, as points colored according to the contribution of each metagenome to the assembly. Pretty nifty.

                Michael Tangherlini
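The color coding described in this post could be sketched as follows. The post does not say exactly how the contributions were normalized, so this version assumes each sample's read count is divided by the contig's total and scaled to 0-255; the function name `contribution_to_hex` is hypothetical.

```python
def contribution_to_hex(counts):
    """Turn three read counts (one per sample) into an RGB hex string.

    Assumption: each channel is the sample's share of the contig's reads,
    scaled to the 0-255 range. Sample 1 is red, 2 is green, 3 is blue.
    """
    if len(counts) != 3:
        raise ValueError("expected exactly three counts")
    total = sum(counts)
    if total == 0:
        raise ValueError("contig has no reads")
    channels = [round(255 * c / total) for c in counts]
    return "#{:02X}{:02X}{:02X}".format(*channels)


print(contribution_to_hex([7, 0, 0]))  # contig built only from sample 1
print(contribution_to_hex([1, 1, 2]))  # contig shared by all three samples
```

A contig drawn from a single sample comes out as a pure primary color, while shared contigs get a mixed shade, which is why the resulting plot can become hard to read when many contigs are shared.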


                • #9
                  Wow, that sounds great! Make sure you share it when it's published.
                  I'll see if we can include it in our plots too...
                  Last edited by dutilh; 07-04-2012, 06:53 AM. Reason: update


                  • #10
                    The end plot looks quite cluttered, though, and is a little hard to understand. So I did something different: I plotted the cross-assembled contigs using four colors. Colors 1, 2 and 3 stand for the three metagenomes considered and mark the largest contributor to each contig. Color 4 is for contigs with ties in contribution.
                    I can show the plot, since it's just an attempt with very preliminary data.

                    -MikeT
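The four-color scheme above can be sketched in a few lines. This is only one reading of the post: it assumes a "tie" means two or more samples share the highest read count for a contig, and the function name `dominant_category` is hypothetical.

```python
def dominant_category(counts):
    """Return 1, 2 or 3 for the sample contributing the most reads to a
    contig, or len(counts) + 1 (4 for three samples) when the top count
    is shared by more than one sample."""
    top = max(counts)
    winners = [i for i, c in enumerate(counts) if c == top]
    if len(winners) > 1:
        return len(counts) + 1  # tie category
    return winners[0] + 1  # 1-based sample number


for counts in ([5, 2, 1], [1, 4, 4], [0, 0, 7]):
    print(counts, dominant_category(counts))
```

The returned number can then be used as a color index column when plotting, instead of a mixed RGB value.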

                    • #11
                      The first paper that cites crAss is out: "Taxonomic and Functional Microbial Signatures of the Endemic Marine Sponge Arenosclera brasiliensis" by Trindade-Silva et al.:
                      http://www.plosone.org/article/info%...l.pone.0039905
                      It cites the website, as the paper is still under review.


                      • #12
                        And another paper that cites crAss is out: "Going viral: next-generation sequencing applied to phage populations in the human gut" by Reyes et al.:
                        http://www.nature.com/nrmicro/journa...micro2853.html
                        Still citing the website ...
                        Last edited by dutilh; 09-03-2012, 07:29 AM.


                        • #13
                          crAss has now been published in Bioinformatics:
                          http://bioinformatics.oxfordjournals...vm&keytype=ref


                          • #14
                            crAss Plots

                            Hello Bas:
                            Congrats on the publication! It lends credibility to all of our efforts to use crAss. I've run two experiments and all went well with one exception and I hope that you can provide some advice. The second run has 3 sets of reads contributing to the cross-assembly. The distance matrices were as expected but there was no plot at the end. The toggle between 3D and Triangle options is there and enabled but no plot shows. The output (output.contigs2reads.txt) is there and is populated with appropriate data. For reference, a similar analysis of 2 other data sets resulted in a 2D plot so it doesn't appear to be a browser issue.
                            Thank you in advance for providing advice,
                            Bonnie Brown


                            • #15
                              Hi Bonnie,
                              First, we have just installed a bigger server to run crAss, and the move might have caused some of the problems with the online version.
                              Second, the new version of DrawTree (3.69) creates an error in the plot for some cladograms (due to zero branch lengths), for example those in the toy example that we provide on the SourceForge site. This results in no output image being created (this should not be a problem for real datasets).
                              Finally, I've now uploaded a new version (v1.2) that allows the user to more easily change the executable commands for BioNJ, GNUplot, GhostScript and DrawTree.
                              I hope that everything is working for you now, let me know if you still have trouble! Best, Bas
