Hi all,
I recently wrote a piece of software that performs a set of analyses and helps users visualize their time series RNA-seq data. It has been suggested that I publish the software in a minor journal and make it available to anyone who wants to use it, and I'm looking for advice on whether people here think it's worth trying to publish.
This particular program, written in Python with only common dependencies (numpy, matplotlib, seaborn), was designed to help labs analyze RNA-seq data that doesn't have replicates. We all know that undertaking an RNA-seq project without replicates is meaningless, but the fact of the matter is that my lab is not the only one to have used this approach: people are currently generating and analyzing no-replicate RNA-seq data not only for in-house use (as a pilot study, for example), but are actually publishing this data.
For example:
(Dudakovic, A., Camilleri, E.T., Riester, S.M., Paradise, C.R., Gluscevic, M., O’Toole, T.M., Thaler, R., Evans, J.M., Yan, H., Subramaniam, M., et al. (2016). Enhancer of zeste homolog 2 inhibition stimulates bone formation and mitigates bone loss caused by ovariectomy in skeletally mature mice. J. Biol. Chem. 291, 24594–24606.)
My program, called SeqPyPlot (because it plots RNA-seq data and it's written in Python), reads in raw counts produced by HTSeq (or cuffnorm), organizes the data, filters it based on user-set parameters, arranges it in a variety of useful ways, and then prints arbitrary numbers of nicely formatted plots that I designed. They look nice and also have a sort of scale bar that is calculated by solving for the user-set log2 fold range around the mean of two single samples. Furthermore, it produces a series of plots as part of the analysis that help users select the optimal filtering parameters when creating prioritized gene lists.
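To make the filtering and scale bar idea concrete, here is a minimal sketch. It is not the actual SeqPyPlot code: the function names, the pseudocount, and the default thresholds are all hypothetical, and it assumes the scale bar is the pair of values whose arithmetic mean equals the mean of the two samples and whose ratio equals the user-set fold change.

```python
import numpy as np


def log2_fold_change(control, treated, pseudocount=1.0):
    """Per-gene log2 fold change between two single (unreplicated) samples.

    The pseudocount (an assumption here) avoids division by zero.
    """
    control = np.asarray(control, dtype=float)
    treated = np.asarray(treated, dtype=float)
    return np.log2((treated + pseudocount) / (control + pseudocount))


def filter_genes(control, treated, min_abs_log2fc=1.0, min_expression=10.0):
    """Boolean mask of genes passing both (hypothetical) thresholds."""
    control = np.asarray(control, dtype=float)
    treated = np.asarray(treated, dtype=float)
    # Keep genes expressed above the floor in at least one sample.
    expressed = np.maximum(control, treated) >= min_expression
    # Keep genes whose absolute log2 fold change reaches the cutoff.
    changed = np.abs(log2_fold_change(control, treated)) >= min_abs_log2fc
    return expressed & changed


def scale_bar_bounds(mean_expression, log2fc):
    """Solve (low + high) / 2 == mean_expression and high / low == 2**log2fc."""
    ratio = 2.0 ** log2fc
    low = 2.0 * mean_expression / (1.0 + ratio)
    high = ratio * low
    return low, high


# Made-up counts for four genes in two single samples.
control = np.array([5.0, 100.0, 400.0, 50.0])
treated = np.array([8.0, 120.0, 90.0, 210.0])

mask = filter_genes(control, treated)
low, high = scale_bar_bounds(mean_expression=150.0, log2fc=1.0)
```

With these made-up counts, only the last two genes pass the filter, and a 2-fold (log2 = 1) scale bar around a mean of 150 spans 100 to 200.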
I've used this tool in my own research to discover sets of genes that are enriched for GO terms associated with a developmental process that I study, as well as to analyze the data from the above paper to discover flagged gene sets enriched for relevant GO terms.
Anyway, does the community think this is something worth publishing as a tool for others to use? My lab and our neighboring labs have found the software very useful. In the publication, I'd describe the program, explain the parameter selection process and how to interpret all of the plots used for choosing parameters, and show some evidence of it working using my analyses of data from my lab as well as from the paper above.
The program is available on my GitHub at https://github.com/paulgradie/SeqPyPlot
(full documentation coming very soon)
Any feedback would be greatly appreciated!
I'll be providing some output examples on my github/blog in the next couple of days.
Cheers,
Paul