Industry Analysis
Applicability of the .NET Platform to Bioinformatics Research
Use .NET to heal
Oct. 25, 2005 03:00 PM
A look at the field of bioinformatics today reveals that it is largely dominated by the Linux operating system and by programming languages such as Perl, Python, and Java. Windows and its native application development platforms are not in widespread use among present-day bioinformatics practitioners. In fact, Linux and other open source technologies will likely remain the dominant platforms on which most novel and/or large-scale bioinformatics research is conducted. Scientific computing of all types has deep roots in Unix and its derivatives, and as a result depends heavily on code bases written with *nix platforms in mind. Many scientific applications are written for High Performance Computing (HPC) architectures or distributed computing environments, and such applications often need to run for lengthy periods of time, which makes OS stability an important factor. While Windows operating systems have made inroads in the server market, Microsoft-based products remain largely absent from the HPC market. Practical issues aside, most bioinformatics practitioners also strongly favor open source ideologies and technologies, since the free exchange of ideas is valued as one of the fundamental building blocks of scientific progress.
The popularity of the Web interface of BLAST illustrates some important points about mainstream application acceptance among biologists. The BLAST Web interface presents users with labeled text boxes and drop-down menus, which simplifies the interaction between the user and the application. In contrast, BLAST is also available as a stand-alone client with a command line interface (CLI) that can be downloaded and run against local sequence databases, or that can interact with a Web API via a Perl script or equivalent. Professional bioinformaticists who are comfortable with CLI input or scripting often prefer these alternatives to the Web application because they allow a greater degree of customization and make it possible to automate large jobs. Still, these are skill sets that the average experimental biologist does not possess. Proficiency with the Windows OS and Windows-based applications, however, is commonplace among biologists, and like the Web interface, Windows applications provide a graphical means of user interaction. As time passes and biologists continue to amass large data sets, their desire to conduct bioinformatics-type analyses on those data sets will also grow. Many algorithms and techniques will go the way of BLAST and leave the realm of specialized knowledge, thereby achieving mainstream usage. Unlike BLAST, however, not all of these methods will be best suited to development as a Web application; some would be better offered as desktop applications. This is where there will be a newfound need for bioinformatics application developers to shift development efforts to a Windows platform, so that biologists can use the applications in an environment and layout that is familiar to them.
The need for simplified access to bioinformatics applications has also been recognized by Apple, which advertises that many bioinformatics tools can run on Mac OS X thanks to its Unix core, while the OS X desktop makes the experience more user-friendly. While perhaps not the best platform for the development of the most computationally intensive applications, the Windows environment has demonstrated itself to be suitable for many types of bioinformatics analyses. One of the most interesting examples of this is the recent demonstration by Microsoft Research that code found in the MS AntiSpyware application could be used to find genetic patterns in HIV. Getting more of these types of tools into the hands of biologists would greatly accelerate the pace at which many types of research findings could be made and would enhance the ability of biologists to tackle pressing biological problems such as disease, drug resistance, and bioterrorism, to name a few. The question is how to facilitate the development of such applications. I believe the answer lies in looking at how the present-day bioinformatics community operates, and in transferring some of that ideology to a Windows development platform. The widespread acceptance of open source methodologies within bioinformatics is often credited as a factor that contributes to the rate at which bioinformatics researchers are able to produce new tools and analysis techniques; a good example of this is the BioPerl library of modules. The BioPerl project allows bioinformaticists to contribute code to the project in the form of Perl modules, which other bioinformaticists can then download and freely use within their own applications. This keeps researchers from reinventing the wheel and lets them focus on the novel scientific aspects of their projects rather than on coding routine tasks, a concept that is not so different from the classes that make up the .NET Framework.
This suggests that the development of an increasing number of bioinformatics applications for Windows could be greatly facilitated if a .NET class library of specialized bioinformatics classes were developed. For example, such a library might contain functionality that computes the properties of protein or DNA sequences, as in the segment of example code provided in Listing 1, which calculates the molecular weight of a protein based on its amino acid sequence (see Figure 1). Standardized libraries are especially important in science, where reproducibility is of key importance, and building applications on a common set of underlying functionality is one way of ensuring it. This goal could be furthered by creating a class library that is interoperable with the Mono project, since this would allow the functionality to be reproduced in a more platform-independent manner. Moreover, it is important for the library to be developed in an open source manner, because modifications and contributions by the scientific community will be imperative. The needs of scientists are constantly changing, and the field of bioinformatics is quite diverse. It would be difficult for a single development team to develop a library with functionality broad enough to attract a cross section of all bioinformatics researchers. A community-based development approach would help to ensure that the library has the requisite diversity and that, as the field advances, so too do the classes that compose the library.
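To give a rough sense of the kind of sequence-property routine such a library might expose, here is a minimal sketch of a molecular weight calculation. It is not the article's Listing 1, and it is written in Python rather than a .NET language purely for illustration. The names `molecular_weight`, `AVERAGE_RESIDUE_MASS`, and `WATER_MASS` are hypothetical. The sketch assumes the 20 standard amino acids in one-letter code and uses commonly tabulated average residue masses: the weight of the chain is the sum of its residue masses plus one water molecule for the free termini.

```python
# Hypothetical sketch: average molecular weight of a protein from its
# one-letter amino acid sequence. Names and structure are illustrative only.

# Mass of one water molecule (Da), added once for the free N- and C-termini.
WATER_MASS = 18.0153

# Average residue masses (Da): each amino acid minus the water lost when a
# peptide bond forms. Only the 20 standard amino acids are covered.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}


def molecular_weight(sequence: str) -> float:
    """Return the average molecular weight (Da) of a protein sequence."""
    residues = sequence.strip().upper()
    if not residues:
        raise ValueError("sequence is empty")

    # Reject anything outside the 20 standard one-letter codes.
    unknown = sorted(set(residues) - set(AVERAGE_RESIDUE_MASS))
    if unknown:
        raise ValueError("unknown residue code(s): " + ", ".join(unknown))

    # Sum of residue masses plus one water for the chain termini.
    return sum(AVERAGE_RESIDUE_MASS[r] for r in residues) + WATER_MASS


if __name__ == "__main__":
    print(f"{molecular_weight('GA'):.4f} Da")
```

As a sanity check, `molecular_weight("G")` gives roughly 75.07 Da, the weight of free glycine. In a shared class library, agreeing on a single table of masses like this one is what would let different applications report the same value for the same sequence.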
Summary