  • Reaching out to bioinformatics enthusiasts

    We are a group of bioinformatics/genomics enthusiasts currently working on developing a streamlined platform for genomic analysis to be used in the research and healthcare community (for purposes of application in translational medicine and basic science research). Our team consists of a biologist/programmer, an engineer/programmer, and two MDs. The progress so far has been to develop a web-based demo that integrates the most commonly used bioinformatics tools into one platform with easy usability for the end user.

    The efforts are opensource and use open source tools. Our goal however is to polish or even create new tools that are unique to the developing market's needs. Ultimately, our platform would help the end user run through a modifiable pipeline that is tailored to the end user's needs.

    Our approach/focus is on:
    • Data handling (compression, and optimization for use on cloud-based application)
    • Data analysis (modifiable but within scope of most commonly used tools or technologies to study gene expression and other future prospects such as genome evolution and studies across multiple species/phylogenetics, pharmacogenetics, and proteomics)
    • Data visualization (interactive data output tailored to the end user's needs, with a focus on usable results)

    Our team is currently in need of people with experience in:
    • Programming (Java, Python) - also looking into the possibility of integrating machine learning into such a platform.
    • Bioinformatics
    • Mathematics or data science/statistics (MATLAB, R; for purposes of data compression algorithm development and analysis)

    Again, our efforts are open source and serve the purpose of developing the end user experience in the research and healthcare world. Everyone on board is striving to learn more about the field during this endeavor. Our team members have varying levels of experience but are enthusiastic and continuing to show dedication to pushing the scientific community forward.

    Best,
    M Samour
    Teleogene LLC

  • #2
    Platforms like this for bioinformatics are all over the place. Please consider putting your efforts into improving one of the ones that already exist, rather than creating a new one:



    If you're not improving something that already exists you may get some interest, but chances are that most people will find something a little bit different about your solution that doesn't precisely fit their needs, and move on without mentioning their issues.


    • #3
      Thank you for taking the time to reply. I am aware of the multiple platforms available out there with similar approaches. However, I could not see any with the focus on making it available and usable for the masses. I understand that it can be difficult to believe such a need exists at the current time (as you mentioned, most would want pipelines that are tailored to their needs individually).

      I believe there is much to be improved in the current market. And yes, I agree that the focus should be on improving current platforms. It is something we are looking into, and we will most likely steer that way after further exploration.

      As sequencing becomes more affordable, the limiting factors in development will be data handling, storage, and security. Again, this is just a general outlook, with more details playing into the decision to develop such a platform.

      Best,


      • #4
        "I could not see any with the focus on making it available and usable for the masses"
        Pretty much every such platform has this focus. The point of creating an integrated toolkit is to make individual tools easier for less-skilled people to use. I don't have to look hard to find similar statements:

        The Galaxy Project: Data intensive biology for everyone.

        MPI Bioinformatics Toolkit: "The primary aim in developing the MPI Bioinformatics Toolkit was to offer a web service that is as easy to use as possible..."

        RobiNA: "RobiNA tries to bridge this gap by providing a flexible user friendly graphical interface to unleash the power of R/BioConductor for the individual biologist"

        Unipro UGENE: "Unipro UGENE is a multiplatform open-source software with the main goal of assisting molecular biologists without much expertise in bioinformatics to manage, analyze and visualize their data."

        ... or see the Wikipedia page on RNA-Seq tools.

        "I understand that it can be difficult to believe such a need exists at the current time"
        I agree that the need exists, I just think that there are more effective ways to address that need than trying to create something entirely new.


        • #5
          I completely agree with you. There are many available platforms which are redundant in some sense. I've looked through those you've linked before. I somewhat agree with your analysis of the market but disagree on some points. What in your perspective is currently immensely lacking in the toolkits available to allow for a better usage of structured pipelines by commonplace biologist researchers for example?

          Either way, this is one of the reasons I wanted to reach out to people here with hands-on experience on the matter: to get a different perspective and to make sure that any platform we develop avoids redundancy (or at least keeps it minimal).

          Have you by any chance worked or are aware of EDGE bioinf? https://bioedge.lanl.gov/edge_ui/


          • #6
            Originally posted by samour:
            What in your perspective is currently immensely lacking in the toolkits available to allow for a better usage of structured pipelines by commonplace biologist researchers for example?
            Publicity / marketing.

            "Have you by any chance worked or are aware of EDGE bioinf?"
            I'm not sure. It looks to me quite similar to DNANexus, or possibly Galaxy, both of which I've used previously to access a specific dataset. I'm comfortable enough with the command line that toolkits like these slow me down more than help me, because I spend a lot of time trying to get my shoehorns to work.


            • #7
              Originally posted by samour:
              What in your perspective is currently immensely lacking in the toolkits available to allow for a better usage of structured pipelines by commonplace biologist researchers for example?
              Curious if anyone on here agrees with me on this, but IMO what is needed is something that works both with and without a GUI. If you want bioinformaticians to care, you cannot force them to touch the mouse. If you want biologists and MDs to be able to use it, you cannot force them onto the command line.

              Development of another Galaxy is not at all what we need. Whatever you make, if it only works in a browser, power users will never touch it. These are also the potential developers, so having them actually want to use the platform, should help grow the platform. The holy grail here is to expose the power and flexibility of R scripting with a GUI. Thankfully, this is NOT just around the corner, so I will continue to have a job for the near future.


              • #8
                There's no need for a single unified platform on the CLI and GUI (N.B., you can do this with Galaxy for the most part, but I wouldn't recommend it unless you have a very very good reason). For wet-lab folks, Galaxy works quite well and our more advanced users are able to transition to directly using R and Jupyter since both are available from within Galaxy's interactive environments.

                The real glue between CLI and GUI is the Common Workflow Language (CWL), which is increasingly supported by Galaxy. When I can write a CWL workflow and have it work on Galaxy and with Snakemake then life will be wonderful. In fact, writing the Snakemake glue for this would be a more productive use of your time.

                As an aside, bioinformaticians should be forced to do at least a few wet-lab experiments at some point in their life. Similarly, wet-lab folks should have at least a vague clue about data analysis, preferably with R (yes, I also teach our wet-lab folks R).


                • #9
                  Originally posted by gringer:
                  Publicity / marketing.
                  Agreed. I personally am focusing on this. I believe I have access to markets that are currently untapped and have potential for growth in certain aspects of healthcare (mostly international, meaning outside the U.S.).


                  Originally posted by gringer:
                  I'm not sure. It looks to me quite similar to DNANexus, or possibly Galaxy, both of which I've used previously to access a specific dataset. I'm comfortable enough with the command line that toolkits like these slow me down more than help me, because I spend a lot of time trying to get my shoehorns to work.
                  That is completely fair and understandable. It is just about finding the right balance between all these variables. I am not proposing the ultimate fix-it-all solution with what I want to work on, but rather like you said, an improvement on what is available. Therefore, experience from people running the trends is valuable and should drive the development. As an MD, I am seeing this from the healthcare perspective and I do foresee the myriad applications for such technologies in the near future (my perspective might be flawed and missing out on a lot of important points, but is valid in its domain).

                  Again, there are many people cutting deep into developing these technologies without looking at the broader picture of applicability in healthcare or research. Any bioinformatician will see this idea as displeasing in its most rudimentary form and I can relate to that. The ultimate goal of the huge genomic efforts is to solve problems, and I see a huge lack in focus on bigger picture applications.


                  • #10
                    Originally posted by jiaco:
                    Curious if anyone on here agrees with me on this, but IMO what is needed is something that works both with and without a GUI. If you want bioinformaticians to care, you cannot force them to touch the mouse. If you want biologists and MDs to be able to use it, you cannot force them onto the command line.

                    Development of another Galaxy is not at all what we need. Whatever you make, if it only works in a browser, power users will never touch it. These are also the potential developers, so having them actually want to use the platform, should help grow the platform. The holy grail here is to expose the power and flexibility of R scripting with a GUI. Thankfully, this is NOT just around the corner, so I will continue to have a job for the near future.
                    Another Galaxy is not what is needed. There is no need for more fragmentation, and no need for something as non-universal. Thinking about this from a mass-use perspective (whether in healthcare or research), there needs to be compatibility and reproducibility in handling and visualizing data. The future will not be limited by how many technologies or methods we create for handling genomic data. It will depend on the people involved in the process being able to recognize patterns at scale and identify opportunities for disease identification, treatment, and scientific discovery in general. The huge amount of data that will come from advancing sequencing technologies will need to be better governed. To me, as an MD with modest experience in a wet-lab environment (as well as from seeing average biology researchers in a lab environment), the genomics world is a mess.

                    In summary, I am calling for an effort to classify and organize the handling of genomic data. The beauty of data is that it is relatively static and can be studied over and over. There is no point in a genomics revolution if the community deals with genomic data in a very arbitrary way, letting all of the money and time spent generating data go to waste. I could go on for hours about what I think is lacking in the field (again, this is from my humble perspective; I know a lot of it is obvious, but it does not seem to be well stressed in most approaches to genomics today), but I will refrain from doing so to allow for a more dynamic discussion.

                    Also, I completely agree with the need for a GUI-based web application with a non-GUI backend for developers or bioinformaticians/enthusiasts.

                    Thanks!


                    • #11
                      Originally posted by dpryan:
                      There's no need for a single unified platform on the CLI and GUI (N.B., you can do this with Galaxy for the most part, but I wouldn't recommend it unless you have a very very good reason). For wet-lab folks, Galaxy works quite well and our more advanced users are able to transition to directly using R and Jupyter since both are available from within Galaxy's interactive environments.

                      The real glue between CLI and GUI is the Common Workflow Language (CWL), which is increasingly supported by Galaxy. When I can write a CWL workflow and have it work on Galaxy and with Snakemake then life will be wonderful. In fact, writing the Snakemake glue for this would be a more productive use of your time.

                      As an aside, bioinformaticians should be forced to do at least a few wet-lab experiments at some point in their life. Similarly, wet-lab folks should have at least a vague clue about data analysis, preferably with R (yes, I also teach our wet-lab folks R).
                      This is inarguable. People need to know about the whole process and how it works, even if to varying degrees. In more advanced labs you will have biologists who can deal with data using basic bioinformatics skills, some more adept than others. However, I think that, just as when clinical research is done by MDs trying to do their own statistics, this could fail at any point.

                      Additionally, I have worked with biologists who collaborate with genomic cores for RNA-Seq and other analyses with zero understanding of what NGS is or how it works. Again, this is not your high-level genomics/biology lab studying the evolutionary tree of a specific species. It is your average biology wet lab studying human disease.

                      I am also not sure how many wet-labs have access to their local or regional bioinformatics department.

                      I will take your advice, thanks!


                      • #12
                        Be very careful about how you market this platform.

                        Under the minimum of something like a GNU license covering most of it; we’re not going to publish this to the public.... We do not communicate under unsafe avenues and will only provide this software to those willing to join in on this endeavor.
                        I see that you're developing this under a framework that is at least consistent with how GNU believes the GPL should be used, as long as everyone in the team (including toolkit users) shares this ideal of non-public release:



                        However, it feels to me like it goes against statements used in the opening post like "The efforts are opensource and use open source tools."

                        If you're starting out by generating a toolkit without public collaboration (and certainly if you intend to continue with that), there will be some bioinformaticians who will be put off by the closed nature of this project. Research is moving towards being more open, because it tends to be easier and work better that way:



                        As I've already mentioned, I'm not likely to use your platform because I've been burned too much by "kitchen sink" platforms previously, nor am I interested in helping with development of this. You have exposed additional concerns about freedom and openness that I didn't previously have, but these concerns may not apply to your target users (or developers). It would be a good idea to determine if that is the case before trying to expand your user/developer base.


                        • #13
                          I am not reassured (or more correctly, not assured) by this clarification. The opposite, really. Patents shouldn't be anywhere near software, particularly not when it is used on a general-purpose computing device.


                          • #14
                            Originally posted by gringer:
                            Be very careful about how you market this platform.



                            I see that you're developing this under a framework that is at least consistent with how GNU believes the GPL should be used, as long as everyone in the team (including toolkit users) shares this ideal of non-public release:



                            However, it feels to me like it goes against statements used in the opening post like "The efforts are opensource and use open source tools."

                            If you're starting out by generating a toolkit without public collaboration (and certainly if you intend to continue with that), there will be some bioinformaticians who will be put off by the closed nature of this project. Research is moving towards being more open, because it tends to be easier and work better that way:



                            As I've already mentioned, I'm not likely to use your platform because I've been burned too much by "kitchen sink" platforms previously, nor am I interested in helping with development of this. You have exposed additional concerns about freedom and openness that I didn't previously have, but these concerns may not apply to your target users (or developers). It would be a good idea to determine if that is the case before trying to expand your user/developer base.
                            The effort is ultimately open source. The current work is, however, limited to a team (for reasons of better communication and getting ideas across within the framework that is believed to be most desirable to the users). We are in no position to limit ourselves yet. The market is being explored and our work will be tailored to its needs. There is no advancement without openness to allow for faults and ideas to get through to us.

                            Thanks again,


                            • #15
                              @SDos, given your concern about security, do you really expect any of us on this forum to click on blind links posted by an essentially anonymous individual?
