www.Tutorialsforu.info

Free Tutorials Cave

  • Increase font size
  • Default font size
  • Decrease font size
Your Ad Here



Introduction to Clusters - page 4

E-mail Print
Article Index
Introduction to Clusters
page 2
page 3
page 4
page 5
All Pages

 

Problems Being Solved On Linux Clusters

Scientists and engineers have been the predominant users of Linux clusters. These users range from an aerospace engineer simulating a trajectory to a financial engineer performing economic scenario analysis. New applications are getting ported to Linux clusters every day. Clusters are now finding aggressive acceptance into data mining applications, where a lot of largely independent data, e.g. clicks on a website, needs to be crunched for finding interesting and useful patterns.

Clusters are sometimes used as development or intermediary tools to assist specialized supercomputers. In many environments parallel programs are designed and tested on clusters and then deployed on, e.g., an MPP machine. Linux clusters are very popular in universities and research labs for teaching and experimenting with parallel and distributed computing technologies.

In next sections we provide some examples of clusters usage in specific areas and point out reasons for their preference.

Earth and Space science

Many of the computing problems in the areas of geophysics and seismology lend themselves for efficient solution on Linux clusters. Many big corporations, including global oil giants, have deployed clusters for oil and mineral exploration. Government agencies, as well as insurance companies, are using clusters to model effects of an earthquake for varied objectives like city planning, evaluating emergency readiness, etc. Astrophysicists are using clusters to simulate large and complex cosmic phenomena, like colliding galaxies. Numerical weather prediction models have been ported and optimized for Linux clusters. In fact the purpose of the very first Beowulf system built at NASA Goddard was to operate on the large data sets typically associated with applications in earth and space sciences.

Space, earth's core & its surface can be divided into smaller chunks and for some of these problems each chunk can be processed independent of the other. For example, an oil exploration project can divide the sea-floor into various areas. Nodes in a throughput cluster can then process the seismic data from these areas in parallel.

Bioinformatics and Chemistry

Linux clusters are widely used in established and emerging areas in the fields of bioinformatics and computational chemistry. Scientists can use numerical simulation of electronic structures of molecules to predict various chemical phenomena, without actual laboratory experimentation. This simulation can be highly computationally intensive, and benefits significantly from scalability of a cluster. Another exciting usage of Linux clusters is in the areas of gene sequencing and protein modeling. Applications in these clusters are used for such crucial areas as understanding diseases, designing drugs and improving agricultural output.

Many popular computational chemistry applications have been ported and tuned for Linux clusters. Examples include Gaussian[], GAMESS[], CHARMM[] etc. A typical computational chemistry environment has a mix of both sequential jobs (requiring a throughput cluster) and parallel jobs (requiring a capability cluster). Therefore the deployed clusters are hybrid in nature, providing tools for both throughput and capability requirements.

Clusters deployed in bioinformatics applications require not only the scalability of their number crunching capabilities, but also the scalability of their data storage and retrieval. For example, a system meant for sequencing of genomes requires access to a large amount of pre-existing sequence knowledge along with the capability of adding numerous new entries. Special databases have been designed to serve the purposes of such clusters.

Rendering

Rendering refers to the process of transforming software scene descriptions into colored pixels. Rendering a sequence of frames to create an animation is one of the so called embarrassingly parallel applications. A cluster meant for rendering a set of frames is referred to as a Render farm.

A render farm is a throughput cluster, with each node capable of running a rendering algorithm. Even a small animation contains thousands of frames, and a single frame may take hours to get rendered. A render farm renders an animation in parallel, hence cutting down on production time. A Linux cluster was used to render the visual effects in the movie Titanic.

Artists submit the scenes of a film to a control server. The control server monitors the various nodes in the cluster to see availability of idle computing resources. It then submits the frame to an available CPU. The computing node renders the frame and sends it to a storage server (which may or may not be the same as the control server). The storage server makes sure that the frames are kept in the correct order as desired by the artists.

An advanced technique for exploiting the parallelism of rendering is to break a single frame into multiple segments, and spread these segments onto different nodes of the cluster. These nodes then send the rendered segments back to a master server. The master server usually has to do some post-processing on the frame after assembling all the segments. This technique is useful when fast rendering of an individual frame is desired.



 

Subscribe By Email

Enter your email address:

Delivered by FeedBurner

Translate

Donate

Development & maintainance needs time & money.
With your donation you can help us to keep this project alive
Donate:
  Monthly Monthly
Currency
Amount