I read (online) and hear (at conferences) a lot about scientific workflow systems, in the sense of http://en.wikipedia.org/wiki/Scientific_workflow_system.
There are many such systems under development, for example Kepler, Taverna, VisTrails, LoniPipeline, Apache Airavata.
Scientific workflow systems *sound* very interesting when I read or hear about them; they promise reproducibility, grid and cluster support, the ability to use R without programming, and so on.
But when I dig a bit deeper, they look like tools for stringing boxes together to coordinate different programs. That is nothing I can't do a dozen times faster with make, bash and awk. It's also great that they have grid support, but I don't have access to a grid. I do have access to one of the Top 500 supercomputers, but that system is so locked down that none of these workflow systems could connect to it in a million years.
I work in the bioinformatics department of a very big genomics research institute, and *no one* here is using scientific workflow systems either.
So I am just wondering: are scientific workflow systems (Kepler, Taverna, VisTrails, LoniPipeline, Apache Airavata...) useful for anything? Are they just a waste of time, or do they boost productivity? Do *you* use them for anything? And if so, do you use them for ad hoc data analysis or for building analysis services?
Looking forward to your opinions.