Hi,
I'm a bench biologist with a load of RNA-seq data who is trying to find a good integrated tool for the analysis. I'm neither a bioinformatician nor a mathematician nor a statistics fan, so I’m looking for a program where I can put in my data and, without getting into much detail or complex algorithms, get some answers. With ‘I’, I am sure I represent many other colleagues. So far I have trialed Avadis NGS and CLC Genomics Workbench. I have one question before offering my view on these programs.
When it comes to the alignment algorithms within these programs, which one is better? I know that CLC uses Bowtie, TopHat, etc., but I do not know anything about Avadis.
Before offering my evaluation of the programs, I should say that I have only used the programs for two weeks (trial periods) and so have not gone into depth with either. Therefore, I have most probably missed other pros and cons of the programs. Then again, if I have not been able to find a specific feature in the software, isn’t that something for the developers to think about?!
Both programs
For groups that have no bioinformatician and no collaborations with such people (sometimes collaborations can take a long, long time), it is good that there are companies or groups making an effort to make bioinformatics tools available.
In my opinion, the makers have more or less thrown user-friendliness out of the window. These programs are supposed to be made for people with no prior knowledge of bioinformatics, or at least so the companies behind them claim, but nope. They talk about the algorithms and explain them in such a way that a biologist like me gets lost halfway (this is especially true of Avadis). And when you have finally passed the alignment step, you need a whole education to understand and interpret your results.
Neither program allows you to explore it without the manual. Honestly, how many people read the manual for that new TV they have bought, or the license agreements? And when you start reading the manuals, sometimes you wish you hadn’t, because they leave you more confused.
And what is this obsession with SNPs? Not everyone is interested in SNPs.
All I want is a simple list of which genes are significantly up- or downregulated in my cells with a specific treatment. Neither program helps you analyze time-series experiments to the fullest either.
The idea with these programs should be to allow the user to analyze the whole data set within the same program, i.e. upload, handle replicates, align, give data together with statistics for the expression of each gene, allow for comparison between samples and over series of samples, do statistical analysis, and help you produce tabular and graphical data.
Avadis
Pros
+ The integrated genome browser
+ Compatibility with UCSC genes
+ Data statistics (e.g. PCA plot)
+ The software helps combine replicates when the data are in different files
+ Ability to analyze the expression of a single gene and its different isoforms, in particular the fact that they can be traced in different samples graphically and with tabular data
+ The graphics
+ Stability
+ Requires less memory than CLC
+ Reference genomes available one click away
Cons
- The average expression for a specific gene in some cases may mask the difference in expression of its isoforms
- Cannot handle trends over samples
- Does not offer statistical data for finding differences in the expression of a specific gene between samples
- The manual (short, but very incomprehensible) leaves you with the same feeling as Kafka’s ‘The Trial’
- No visualization of quality scores
- When the trial period is over, you are not even allowed to open the program
- A lot of money for a program that only handles NGS data
CLC Genomics
Pros
+ Uses the alignment algorithms widely accepted in the RNA-seq community (one of the two reasons why I still consider this software)
+ Visualization of quality scores for the reads
+ Can do other things than just analyzing NGS data (although not very comprehensively), which makes it more worth the money (the second reason)
Cons
- The program crashed 4-5 times (3 times while running read alignment overnight, so don’t think you can be efficient!)
- Requires a lot of memory
- Does not offer a simple way to combine replicates
- Reference sequence not available; you have to download it manually – while I was attempting to download, it disconnected several times (forget about doing it overnight or while doing something else in the lab)
- No graphic visualization of data
- Cannot handle time series analysis or treatment
- No statistical data (PCA plot) offered
- Incompatibility with UCSC genomes
- The manual – thicker than Tolstoy’s ‘War and Peace’, and in some parts you’ll have to guess whether it is related to your type of analysis or not. You have to go through most of the manual to find the relevant pieces.
- Search cannot find the same gene in different samples
I am about to trial the Partek program, so perhaps I will continue on this thread…