  • #31
    I use a handful of languages. Here goes.

    Mathematica is my absolute favorite because of its versatility, high performance, built-in high-level functions, incredible documentation and useful interactive front end. I'm also the only biologist I know who uses it, though I know the Lawrence National Labs at least prototypes code in it.

    I rarely use Perl. It is easy to be sloppy in it, and I don't like that about it, but it is nice for some basic scripts (e.g. rearranging data). Python is about the same in terms of functionality, and I like its syntax better.

    Matlab is so clunky and poorly documented that I don't think it's worth the trouble.

    R: I dislike its syntax. Who came up with the reversed arrow assignments (<- and ->)? Plenty of other languages can do what R does, so I see no need for a dedicated statistics language.

    Java is good for applications requiring GUIs. It's OK for backend work too... despite common belief it's not inherently slow.

    C is the king if you need to do real software authoring.

    Finally, I use any of the .NET languages (Windows!) for hardware interfaces (if I'm writing control software for lab robotics).

    So, if I had to choose two, it would be Mathematica and C I suppose.



    • #32
      I got my Ph.D. in computer science as opposed to anything biological, so here's my two cents from that point of view...

      - If you're writing for computational speed, go with C or C++
      - If you're writing a GUI, go with Java
      - If you just need to get it done and never look at it again, go with Perl
      - If you're setting up a pipeline, especially one dealing with converting data formats, go with Python

      People tend to advertise jobs looking for C++ or Java when they want folks to write code that'll eventually be released for other people to use as a stand-alone tool.

      I do all of my analysis using Python to massage and analyze data for tools like Bowtie and genome browsers.
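To make the "converting data formats" point concrete, here is a minimal sketch of the kind of glue script meant above: turning FASTQ into FASTA. It assumes plain four-line FASTQ records (no wrapped sequences), and the names `read_fastq` and `fastq_to_fasta` are made up for illustration rather than taken from any library.

```python
from typing import Iterable, Iterator, Tuple


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_name, sequence) pairs from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        seq = next(it, None)    # sequence line
        plus = next(it, None)   # '+' separator line
        qual = next(it, None)   # quality string, not needed for FASTA
        if seq is None or plus is None or qual is None:
            raise ValueError("truncated FASTQ record")
        yield header[1:], seq.rstrip("\n")


def fastq_to_fasta(lines: Iterable[str]) -> str:
    """Return the FASTA text for an iterable of FASTQ lines."""
    return "".join(f">{name}\n{seq}\n" for name, seq in read_fastq(lines))


example = ["@r1\n", "ACGT\n", "+\n", "IIII\n", "@r2\n", "GG\n", "+\n", "II\n"]
print(fastq_to_fasta(example), end="")
```

Because `read_fastq` takes any iterable of lines, the same function works on an open file handle, which is usually all a pipeline step needs.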



      • #33
        First off, I do not have anything against all the shiny new languages and IDEs and such, but if a young programmer stumbles across this thread and takes to heart what people are saying, I feel obligated to input my opinion.

        Learn C and bash and the most basic stuff first. LEARN vi as your IDE and your word processor and your only way of knowing how to enter text. Understand how to log into a machine with only the most basic Linux install available and actually do something useful to bring it back to life. There will be times when there is no Python, no JVM, no Eclipse. If you cannot function in such an environment, then you are shooting yourself in the foot.

        The compiler is your friend. You can spend loads of time writing code. If you learn stellar white space habits, your code is readable and you can be fairly confident of what you have written. Then having the compiler pass over it before debugging is a great way to catch stupid and serious errors without wasting more time debugging.

        While I now use Qt (C++) exclusively, I still force myself to get dirty in C just to keep it fresh. Plus bash and the history command should be your best friends. They basically record your actions, give you an easy way to script up a pipeline and serve as a form of documentation. Once you have your script, be sure to comment it with database versions and any other relevant info that may change in the future.

        Best of luck

        :wq



        • #34
          Originally posted by jiaco
          First off, I do not have anything against all the shiny new languages and IDEs and such, but if a young programmer stumbles across this thread and takes to heart what people are saying, I feel obligated to input my opinion.
          Sorry, but for the benefit of these "young programmers" I have to disagree strongly.

          Basically, you should distinguish two cases. (a) A young student aspires to become a professional in (scientific or other) software development. (b) A scientist who already works in research (or is studying biology, not CS) wants to broaden his skills in order to perform some bioinformatics analyses himself.

          Jiaco's advice is well suited for case (a). There are many professional developers with a CS degree but without an understanding of what is going on under the hood of a computer. These usually come from universities which kicked C out of the curriculum and I share Jiaco's frustration about this.

          However, most readers here will fall into case (b). They don't want to be able to replace a fully qualified computer scientist. Rather, they already have a qualification, and that is biology or biotech engineering.

          Hence, I fully agree with the emphasis that was put in this thread on scripting languages, especially Python.

          They allow you to get a job done fast, and they are much easier to learn.

          What has not yet been mentioned here is the fundamental trade-off between compiled languages and scripting languages, namely runtime speed versus development speed. A developer experienced in both languages might need half a day to code something in Python and two days to get the same job done in C. However, the Python program may take, say, five minutes to run, while the C program needs only half a minute. You only get back your investment in development time, through waiting less for the program to run, if you plan to run the program very often.
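The break-even point for those example numbers can be worked out in a few lines. This is only a rough sketch: it assumes an 8-hour working day (so half a day is 4 hours and two days are 16 hours), counts waiting time and developer time as equally valuable, and the function name `break_even_runs` is made up for illustration.

```python
def break_even_runs(python_dev_hours, c_dev_hours, python_run_min, c_run_min):
    """Number of runs after which the C version's extra development time is repaid."""
    extra_dev_min = (c_dev_hours - python_dev_hours) * 60
    saved_per_run_min = python_run_min - c_run_min
    if saved_per_run_min <= 0:
        raise ValueError("the C version must be faster per run to ever break even")
    return extra_dev_min / saved_per_run_min


# Assumption: 8-hour working days, so 0.5 day = 4 h and 2 days = 16 h.
runs = break_even_runs(python_dev_hours=4, c_dev_hours=16,
                       python_run_min=5, c_run_min=0.5)
print(f"C pays off after about {runs:.0f} runs")
```

Under these assumptions the extra 12 hours of C development are repaid only after 160 runs, which is why a one-off analysis rarely justifies it.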

          That is not to say that there are not a lot of problems in bioinformatics that require strong C/C++ and computer science theory skills, but these skills are not something acquired within a few weeks or months. (I would be unemployed if all biologists were experts in computer science, too.)

          Simon
          Last edited by Simon Anders; 05-18-2010, 02:44 PM. Reason: slight rewording



          • #35
            Originally posted by Simon Anders
            Sorry, but for the benefit of these "young programmers" I have to disagree strongly.

            Basically, you should distinguish two cases. (a) A young student aspires to become a professional in (scientific or other) software development. (b) A scientist who already works in research (or is studying biology, not CS) wants to broaden his skills in order to perform some bioinformatics analyses himself.

            Jiaco's advice is well suited for case (a). There are many professional developers with a CS degree but without an understanding of what is going on under the hood of a computer. These usually come from universities which kicked C out of the curriculum and I share Jiaco's frustration about this.

            However, most readers here will fall into case (b). They don't want to be able to replace a fully qualified computer scientist. Rather, they already have a qualification, and that is biology or biotech engineering.

            Hence, I fully agree with the emphasis that was put in this thread on scripting languages, especially Python.

            They allow you to get a job done fast, and they are much easier to learn.

            What has not yet been mentioned here is the fundamental trade-off between compiled languages and scripting languages, namely runtime speed versus development speed. A developer experienced in both languages might need half a day to code something in Python and two days to get the same job done in C. However, the Python program may take, say, five minutes to run, while the C program needs only half a minute. You only get back your investment in development time, through waiting less for the program to run, if you plan to run the program very often.

            That is not to say that there are not a lot of problems in bioinformatics that require strong C/C++ and computer science theory skills but these skills are not something acquired within a few weeks or months. (I would be unemployed if all biologists were experts in computer science, too.)

            Simon
            Simon is dead-on here. Ferraris aren't wise choices for trips to the grocery store, just as my uncle's Prius won't win Le Mans. I use C++ (and sometimes C) for the races and Python for getting groceries, daily work and medium-size applications/prototypes. The two play very nicely together, and the general programming concepts are transferable. In one day you can get over the whitespace issue in Python, and in two more you can get past the extra bit of work that regexes and multi-level hashes/dictionaries require relative to Perl. Then learn iterators, comprehensions and what is and isn't mutable, and after that it's smooth sailing.
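For anyone coming from Perl, the "extra bit of work" looks roughly like this. It is a small sketch, not a recipe: the record layout (sample, chromosome, position) and the name `index_hits` are invented for illustration. Regexes go through the standard `re` module instead of built-in syntax, and Perl's autovivifying `$h{$a}{$b}` becomes nested `defaultdict`s.

```python
import re
from collections import defaultdict

# Hypothetical record layout for illustration: "sample chromosome position"
RECORD = re.compile(r"^(\w+)\s+(chr\w+)\s+(\d+)$")


def index_hits(lines):
    """Build hits[sample][chrom] -> list of positions, skipping malformed lines."""
    hits = defaultdict(lambda: defaultdict(list))
    for line in lines:
        match = RECORD.match(line.strip())
        if match is None:
            continue  # line does not look like a record
        sample, chrom, pos = match.groups()
        hits[sample][chrom].append(int(pos))
    return hits


lines = ["s1 chr1 100", "s1 chr1 250", "s1 chr2 40", "s2 chr1 7", "bad line"]
hits = index_hits(lines)

# Nested dict comprehension: number of positions per sample and chromosome.
counts = {sample: {chrom: len(positions) for chrom, positions in chroms.items()}
          for sample, chroms in hits.items()}

# Mutability: a list is shared by reference, so appending through an alias
# changes the list stored in hits as well. A tuple is an immutable snapshot.
alias = hits["s1"]["chr1"]
alias.append(999)
frozen = tuple(alias)
```

After this runs, `counts` still reflects the two original chr1 positions for s1, while `hits["s1"]["chr1"]` has grown to three entries because `alias` and the stored list are the same object.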

            In closing, http://xkcd.com/353/

