Originally posted by samanta:
Geez!! What a warped view of the world.
Computer science has two components - (i) algorithm development and (ii) coding the algorithm into some programming language. A new algorithm is a mathematical discovery that sometimes takes decades to develop, but once it is in place, it can revolutionize all aspects of science and non-science, including your beloved sequence analysis. Here is the development history of one chain of algorithms -
You may notice that when Myers and Manber were working on the concept of suffix arrays, they had no clue about how the future of sequencing technology would develop, yet two important lines of programs for short read analysis (Bowtie/BWA and string graph assemblers) rely on mathematical constructs developed by them.
Just like you have a reward structure regarding quick publication of your sequence-related paper, computer scientists have a different reward structure related to development of new algorithms. Historically it has been found that their reward structure contributes more to biology than another incremental biology paper. So biologists themselves (those more knowledgeable than you) encourage discoveries of new algorithms.
While the continued development of new algorithms and programs is of great use to biologists, their development is the primary concern of the specialists, not the biologist. Even for the computer scientists, it makes no sense to develop new programs from scratch for everything. It also makes no sense for the biologist to spend the vast amount of time necessary to learn C for simple and mundane applications if they already know Perl and can implement it in Perl in a shorter amount of time, even if it takes an hour longer to run. I can spend those extra few hours of runtime doing wet lab experiments. And since I don't know C, I can actually get more done using Perl than in all the time it would take me to learn a new language and go through the hassle of implementing it.
And frankly, why am I going to send my data to someone else to analyze if they are simply going to use the exact same tools that already exist and which I already know how to implement? Why should I wait 6 months or a year for my results while they create a completely new program when I can get the results in a week reusing tried and true programs?
If the computer scientist wants to create a new algorithm, then they are doing their job and that is sufficient for a paper in itself. Besides, it is better for the computer scientist, because then they get the credit rather than having to be a co-author on a paper where the program takes second place to the data.
They have their own careers to look after, I have mine. I understand that bioinformatics takes time to develop and I applaud those who develop it. But I am not seeking a career in the development of bioinformatic tools, nor are most biologists. We just use them and then it's on to the next step. If there is no preexisting tool, then I'll take the time to work with the computer scientist and wait for them to develop one and then use it to get to the question in hand. But otherwise, I see no reason for the biologist not to take advantage of pre-existing tools, and it is absurd to be dismissive of them just because they use a pre-existing tool or a language they are more comfortable with.
I'm not dismissing anyone here or their relative contributions. I'm just being frank about the fact that there are more practical concerns.
Last edited by chadn737; 06-11-2012, 02:58 PM.
-
Originally posted by chadn737:
I don't think the majority of people really care. I know I don't. I'm first and foremost a biologist. Sequencing is just a tool. Bioinformatics is just a tool. The real scientific question is the biology, not which is the best programming language. 10 years from now C++ and most of the bioinformatics will be outdated and lie unused, sequencing will be completely different, but the biology will remain. I think most of the hard core computer scientists here get that and certainly the biologists do. For most of us it is a waste of time writing new programs or rewriting old ones in a different language. It is far, far smarter to spend an extra hour of my time reusing a slightly slower program written by someone else in Perl or Python or Java and get my answer that week than to spend a year trying to develop something completely new and then get scooped by the guy who focused on the biology.
I've collaborated with enough computer scientists to know that it typically goes one of two ways:
1) They reuse tools already out there, which would be no different than what I could do on my own.
or
2) They want to develop something completely new and then I don't get my answer for 6 months, when I could have had it within the week and begun doing the follow up experiments.
So I have come to the conclusion that if I am going to collaborate to have that nice new program written in C++, I'd rather do my own work and get that published and let the computer scientist develop a program around already published data. Because if I get scooped waiting around that long, I'm the one who's screwed.
Geez!! What a warped view of the world.
Computer science has two components - (i) algorithm development and (ii) coding the algorithm into some programming language. A new algorithm is a mathematical discovery that sometimes takes decades to develop, but once it is in place, it can revolutionize all aspects of science and non-science, including your beloved sequence analysis. Here is the development history of one chain of algorithms -
You may notice that when Myers and Manber were working on the concept of suffix arrays, they had no clue about how the future of sequencing technology would develop, yet two important lines of programs for short read analysis (Bowtie/BWA and string graph assemblers) rely on mathematical constructs developed by them.
Just like you have a reward structure regarding quick publication of your sequence-related paper, computer scientists have a different reward structure related to development of new algorithms. Historically it has been found that their reward structure contributes more to biology than another incremental biology paper. So biologists themselves (those more knowledgeable than you) encourage discoveries of new algorithms.
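For readers who have not met the suffix array mentioned above, here is a minimal sketch in Python. It is only an illustration: the construction is the naive sort-all-suffixes approach, not the algorithm from the original suffix array work, and tools like Bowtie and BWA use compressed indexes built on related ideas rather than anything this simple. The function names `build_suffix_array` and `find_occurrences` are made up for this example.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of text, in sorted order.

    Naive construction: sorts the suffixes directly, which is fine for
    short strings but far too slow for a genome.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: list[int], pattern: str) -> list[int]:
    """Binary-search the suffix array for every position where pattern occurs."""
    n, m = len(suffix_array), len(pattern)

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = n
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in [first, lo) starts with pattern.
    return sorted(suffix_array[first:lo])


genome = "GATTACA"
index = build_suffix_array(genome)
print(find_occurrences(genome, index, "A"))  # [1, 4, 6]
```

The point of the structure is that once the index is built, each lookup costs a binary search over the sorted suffixes rather than a scan of the whole text, which is the property that makes this family of data structures attractive for matching many short reads against one reference.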
-
Originally posted by rskr:
A) It is funny that people will spend tens of years programming in languages that take five minutes to learn, yet spend hours a day waiting for the programs to run.
B) Don't write thousands of lines of code in bash or Perl; they aren't designed for it. They are weakly typed and don't take advantage of compiler checking, not to mention that the languages don't facilitate porting to many platforms.
I've collaborated with enough computer scientists to know that it typically goes one of two ways:
1) They reuse tools already out there, which would be no different than what I could do on my own.
or
2) They want to develop something completely new and then I don't get my answer for 6 months, when I could have had it within the week and begun doing the follow up experiments.
So I have come to the conclusion that if I am going to collaborate to have that nice new program written in C, I'd rather do my own work and get that published and let the computer scientist develop a program around already published data. Because if I get scooped waiting around that long, I'm the one who's screwed.
Last edited by chadn737; 06-11-2012, 02:08 PM.
-
Originally posted by greenhilly:
I have an extensive molecular biology background but am relatively new to bioinformatics. Would like to extend my computational/programming skills to maximize utility in analyzing sequencing and other high-throughput data, as well as to improve my own marketability.
Many job postings refer to some combination of Perl/Python/C++/Java experience. Any suggestions regarding where to focus effort, particularly in a forward-looking manner?
Thanks for any suggestions.
Searching a website for the folding of a set of miRNA sequences is bioinformatics. Writing server-side code for the program that does that folding is also bioinformatics. Analyzing hundreds of expression numbers in Excel or R is bioinformatics as well. Those three tasks take three different skills.
Last edited by samanta; 06-11-2012, 12:54 PM.
-
Originally posted by krobison:
It's also worth contemplating the huge fraction of security holes in the world that are due to buffer overflow, an easy error to make in C/C++ and a challenging one to make in languages which supply memory management.
Originally posted by krobison:
It's also useful to think of all the poor user interfaces in the world, such as entry boxes for social security numbers or credit cards which do not accept human-friendly punctuation or spacing, that are there because it was hard to do in C or a similar language, and so trivial to do in Perl that almost nobody could be too lazy to do them.
B) Don't write thousands of lines of code in bash or Perl; they aren't designed for it. They are weakly typed and don't take advantage of compiler checking, not to mention that the languages don't facilitate porting to many platforms.
-
Originally posted by Artem:
Where does Perl/Python fit into the mix?
A person can write multi-hundred-line Bash routines, but at some point the scripts become hard to maintain and expand. At that point you should use Perl/Python, unless you wish to go into the complexities of C/C++.
BTW: My longest bash script is 430 lines and is used to set up ABySS runs in various combinations of paired-end and single-end runs. My Perl scripts can run many times that length.
As I have said before, I consider 'R' to be a different path than bash/perl/python/C. Those languages are similar enough to have a common way of thinking. 'R' is all about statistical computing.
-
I'm actually in the same position as Greenhilly. I started programming seriously about a month ago, with background knowledge in Python and bash. My work has been in bash and R, though. I use R for calculations and I use bash for data formatting and pipelining. Eventually I do hope to learn some C for writing functions, but I see that as a while away.
Where does Perl/Python fit into the mix?
-
Originally posted by rskr:
In a forward-looking manner I wouldn't bother with Perl/Python/Java. They are mostly just fads, and any location you might want to work is just as likely to use the one you don't know, for no other reason than the CEO liked the Monty Python jokes or coffee. These scripting languages are easy enough to pick up if you know how to program in C, and most cool molecular dynamics simulators are in C for obvious performance reasons. Unix command line utilities are very handy for getting things done, and Perl and Python both draw heavily on the conventions, so if you encounter a script done in either of these you should be able to figure out what it does (knowing Linux, that is).
I am a biologist first & program mostly in Perl, because it fits my brain well. So did C#, which I suspect you would also denigrate -- and I wrote some very sophisticated dynamic programming algorithms (if I do say so myself) in C#.
For most biologists, the extra bookkeeping required by C/C# isn't worth the execution speed advantage. Many other languages offer higher levels of abstraction that are a better fit to their line of thinking.
Ultimately, if you have the time it is worth exploring multiple languages, as many people find that there are a subset that fit their brain well. A rare few individuals are excellent at most. For me, Perl & C# have been the best fits, with Scala probably just missing out.
It's also worth contemplating the huge fraction of security holes in the world that are due to buffer overflow, an easy error to make in C/C++ and a challenging one to make in languages which supply memory management. It's also useful to think of all the poor user interfaces in the world, such as entry boxes for social security numbers or credit cards which do not accept human-friendly punctuation or spacing, that are there because it was hard to do in C or a similar language, and so trivial to do in Perl that almost nobody could be too lazy to do them.
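To make the entry-box point concrete, here is a small sketch of how little code it takes to accept human-friendly spacing and punctuation in a scripting language. It is written in Python rather than Perl, and the function name `normalize_digits` and its `expected_length` parameter are invented for this illustration; a real form would also need checks this sketch leaves out (for example, a card checksum).

```python
import re


def normalize_digits(raw: str, expected_length: int) -> str:
    """Strip spaces and hyphens from user input and return the bare digits.

    Raises ValueError if anything other than digits, whitespace, or
    hyphens is present, or if the digit count is not expected_length.
    """
    cleaned = re.sub(r"[\s-]", "", raw)
    if not re.fullmatch(r"[0-9]+", cleaned) or len(cleaned) != expected_length:
        raise ValueError(f"expected {expected_length} digits, got {raw!r}")
    return cleaned


# Both the punctuated and the bare forms are accepted.
print(normalize_digits("123-45-6789", 9))           # 123456789
print(normalize_digits("4111 1111 1111 1111", 16))  # 4111111111111111
```

The user types whatever grouping is comfortable, and the program does the cleanup instead of rejecting the input.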
Biologists & hard core computer scientists need to forge links, but I've always found it was the polylingual & inclusive computer whizzes who were a joy to work with; language snobs are likely to have other motes in their eyes which will interfere with collaborations.
-
I normally think you'd need a good command of the Unix shell, your choice of scripting language (Perl/Python/Ruby) and one object-oriented programming language. A decent understanding of SQL queries might be pretty helpful as well, depending on the kind of setup you work in.
However, for a biologist, writing bioinformatics software in C will be a very steep learning curve, mainly because of memory management; not many computer science majors have a good command of it. Java is a much friendlier programming language, and it is well used in bioinformatics software.
-
"...the difference between an 'int' and a 'char'."
An ent (sic int) is a tree-like giant of Middle Earth; a char is a tasty cold-water fish.
(Sorry, I'm a little punchy from lack of sleep)
-
... I said lack of merit, not without merit...
Given the number of people on this forum for whom American English is not their first language I think we should allow a bit of leeway for the subtle differences in stating their opinions.
BTW: I agree with "dpryan". Learn the shell. Learn Perl/Python (or maybe Ruby). Learn R. Learn C. As for the differences between the 4 -- R is the most different in syntax. The other 3 are similar enough to be easy to pick up once you know one of them (although all are hard to master.)