BLOG-N-PLAY.COM

Applicability of the .NET Platform to Bioinformatics Research
Use .NET to heal

A look at the field of bioinformatics today reveals that it is largely dominated by the Linux operating system and by programming languages such as Perl, Python, and Java. Windows and its associated native application development platforms are not in widespread use among present-day bioinformatics practitioners. In fact, Linux and other open source technologies will likely remain the dominant platforms upon which most novel and/or large-scale bioinformatics research is conducted. Scientific computing of all types has deep-seated roots in Unix and its derivatives, and as a result depends heavily on code bases written with *nix platforms in mind. Many scientific applications are written for High Performance Computing (HPC) architectures or distributed computing environments, and such applications often need to run for lengthy periods of time, which makes OS stability an important factor. While Windows operating systems have made inroads in the server market, Microsoft-based products remain largely absent from the HPC market. Practical issues such as these aside, most bioinformatics practitioners strongly favor open source ideologies and technologies, since the free exchange of ideas is valued as one of the fundamental building blocks of scientific progress.

That having been said, in the not-so-distant future there will be an increasing demand for Windows-based bioinformatics applications. Bioinformatics has experienced rapid growth over the last decade and has in many ways revolutionized the way certain aspects of the biological sciences are conducted. Bioinformatics methodologies were of critical importance to completing projects such as the genome assembly portion of the Human Genome Project. Advances in genome sequencing, proteomics, microarrays, and other forms of biological data collection are generating voluminous amounts of data and are rapidly changing biology from a purely experimental science into an information science. As this change occurs, many bioinformatics tools are adopted by mainstream biologists and leave the realm of specialized knowledge that requires the skill set of a trained bioinformaticist. Perhaps the most prominent example of this is the widespread adoption of the BLAST algorithm, which aligns DNA or protein sequences based on the similarity of their composition. When the algorithm first appeared, it was mainly a tool used by individuals interested in the computational analysis of biological sequences, yet it is now a staple technique in many research projects that would not be considered computational in nature at all. In fact, BLAST is today among the most widely used of all bioinformatics applications, and the major interface for using it is the Web application hosted by the National Center for Biotechnology Information at www.ncbi.nlm.nih.gov/blast/index.shtml.
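To make the idea of similarity-based alignment concrete, here is a minimal Python sketch of local alignment scoring in the style of Smith-Waterman dynamic programming. It is not BLAST itself, which is a faster heuristic search over sequence databases; the function name and the match, mismatch, and gap values are illustrative assumptions rather than anything taken from BLAST.

```python
def local_alignment_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between two sequences.

    Smith-Waterman style scoring with a linear gap penalty.
    The scoring values are illustrative assumptions, not BLAST defaults.
    """
    previous_row = [0] * (len(seq_b) + 1)
    best = 0
    for char_a in seq_a:
        current_row = [0]
        for j, char_b in enumerate(seq_b, start=1):
            # Extend the alignment diagonally with a match or mismatch.
            diagonal = previous_row[j - 1] + (match if char_a == char_b else mismatch)
            # Or open a gap in one of the two sequences.
            gap_in_b = previous_row[j] + gap
            gap_in_a = current_row[j - 1] + gap
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diagonal, gap_in_b, gap_in_a)
            current_row.append(score)
            best = max(best, score)
        previous_row = current_row
    return best


if __name__ == "__main__":
    print(local_alignment_score("TTACGTTT", "GGACGTGG"))
```

A higher score indicates a longer or cleaner shared region. Keeping only two rows of the scoring matrix holds memory use proportional to the length of the second sequence, at the cost of not being able to recover the alignment itself.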

The popularity of the BLAST Web interface illustrates some important points about mainstream application acceptance among biologists. The Web interface presents users with labeled text boxes and drop-down menus, which simplifies the interaction between the user and the application. In contrast, there is also a stand-alone BLAST client with a command line interface (CLI) that can be downloaded and run against local sequence databases, and BLAST can also be accessed through a Web API via a Perl script or equivalent. Professional bioinformaticists who are comfortable with CLI input or scripting often prefer these alternatives to the Web application because they allow a greater degree of customization and make it possible to automate large jobs. Still, these are skill sets that the average experimental biologist does not possess. Proficiency with the Windows OS and Windows-based applications, however, is commonplace among biologists, and like the Web interface, Windows applications provide a graphical means for user interaction.

As time passes and biologists continue to amass large data sets, their desire to conduct bioinformatics-type analyses on those data sets will also grow. Many algorithms and techniques will go the way of BLAST and leave the realm of specialized knowledge, thereby achieving mainstream usage. Unlike BLAST, however, not all of these methods will be best suited for development as a Web application; some would be better offered as a desktop application. This is where there will be a newfound need for bioinformatics application developers to shift development efforts to a Windows platform, so that biologists can use the applications in an environment and layout that is familiar to them. The need for simplified access to bioinformatics applications has also been recognized by Apple, which advertises that many bioinformatics tools can be run on OS X thanks to its Unix core, and that the OS X desktop can make the experience more user-friendly. While perhaps not the best platform for developing the most computationally intensive applications, the Windows environment has demonstrated itself to be suitable for many types of bioinformatics analyses. One of the most interesting examples of this is the recent demonstration by Microsoft Research that code found in the MS AntiSpyware application could be used to find genetic patterns in HIV. Getting more of these types of tools into the hands of biologists would greatly accelerate the pace of many types of research and would enhance the ability of biologists to tackle pressing biological problems such as disease, drug resistance, and bioterrorism, to name a few.

The question is how to facilitate the development of such applications. I believe the answer lies in looking at how the present-day bioinformatics community operates, and in transferring some of that ideology to a Windows development platform. The widespread acceptance of open source methodologies within bioinformatics is often credited as a factor that contributes to the rate at which bioinformatics researchers are able to produce new tools and analysis techniques; a good example of this is the BioPerl library of modules. The BioPerl project allows bioinformaticists to contribute code to the project in the form of Perl modules, which other bioinformaticists can then download and freely use within their own applications. This keeps researchers from reinventing the wheel and lets them focus on the novel scientific aspects of their projects rather than on coding routine tasks, a concept that is not so different from the classes that make up the .NET framework.

This suggests that the development of an increasing number of bioinformatics applications for Windows could be greatly facilitated if a .NET class library consisting of specialized bioinformatics classes were developed. For example, such a library may contain functionality that computes the properties of protein or DNA sequences, such as in the segment of example code provided in Listing 1, which calculates the molecular weight of a protein based on its amino acid sequence (see Figure 1). Standardized libraries are especially important in science, where reproducibility is key, and having applications based on a common set of underlying functionality is one way of ensuring it. This goal may even be furthered by creating a class library that is interoperable with the Mono project, since this would allow the functionality to be reproduced in a more platform-independent manner. Moreover, it is important for the library to be developed in an open source manner, because modifications and contributions by the scientific community will be imperative. The needs of scientists are constantly changing and the field of bioinformatics is quite diverse. It would be difficult for a single development team to develop a library with functionality broad enough to attract a cross section of all bioinformatics researchers. A community-based development approach would help to ensure that the library has the requisite diversity and that, as the field advances, so too do the classes that compose the library.
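As a rough illustration of the kind of routine such a library might offer, here is a minimal sketch of a protein molecular weight calculation, written in Python rather than a .NET language. It is not the article's Listing 1. The function name is hypothetical, and the table holds approximate average residue masses in daltons as commonly tabulated; a real library would document its mass table and handle modified or ambiguous residues.

```python
# Approximate average masses (Da) of amino acid residues, i.e. each
# amino acid minus one water. Values are commonly tabulated averages
# and are an assumption of this sketch.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free chain termini


def protein_molecular_weight(sequence):
    """Estimate the average molecular weight (Da) of a protein.

    `sequence` is a string of one-letter amino acid codes for the
    20 standard residues; case and surrounding whitespace are ignored.
    """
    residues = sequence.strip().upper()
    if not residues:
        raise ValueError("sequence is empty")
    unknown = sorted(set(residues) - set(AVERAGE_RESIDUE_MASS))
    if unknown:
        raise ValueError("unsupported residue codes: " + ", ".join(unknown))
    # Sum the residue masses, then add one water for the N- and C-termini.
    return sum(AVERAGE_RESIDUE_MASS[r] for r in residues) + WATER_MASS


if __name__ == "__main__":
    print(round(protein_molecular_weight("MKWVTFISLL"), 2))
```

Rejecting unknown residue codes instead of silently skipping them is a deliberate choice here: a shared library that returns a plausible but wrong weight would undermine the reproducibility the paragraph above argues for.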

Summary
In all, bioinformatics is a field of research that has undergone a rapid explosion in the number of tools and techniques it has produced. While much of this progress can be attributed to the adaptability and flexibility of open source technologies such as Linux and Perl, it is important for bioinformatics professionals to consider that disseminating their tools to users can be as critical to scientific progress as developing new ones. A key way to facilitate the widespread dissemination of bioinformatics applications to biologists will likely be the development of an open source .NET class library to serve as a framework for Windows-based bioinformatics applications.

About Christopher Frenz
Christopher Frenz is the author of "Visual Basic and Visual Basic .NET for Scientists and Engineers" (Apress) and "Pro Perl Parsing" (Apress). He is a faculty member in the Department of Computer Engineering at the New York City College of Technology (CUNY), where he performs computational biology and machine learning research.
