www.Tutorialsforu.info

Free Tutorials Cave




Introduction to Clusters - page 3


 

Why a Cluster

The two key reasons for using clusters instead of a single large system are price/performance and scalability. As system size grows, the size of its installed base shrinks quite rapidly. The cost of developing the technology needed to scale a system to a higher number of processors is therefore amortized over relatively few systems. As a result, single systems reach a point of diminishing returns beyond which it is not cost-effective to scale them, except for a limited set of special applications.

Some benefits of clusters include:

  • Lower cost: In general, smaller systems benefit the most from the commoditization of technology. Both hardware and software acquisition costs tend to be significantly lower for smaller systems. However, you should consider the total cost of ownership of your computing environment when making a purchase decision. The "Downsides of Clusters" section below points to some factors that may offset part of a cluster's advantage in initial acquisition cost.

     

  • Scalability: In many environments the workload is so large that it simply cannot be processed on a single system within the time constraints of the organization. Clusters also provide an easier path for increasing computational resources as the workload grows over time. Most large systems scale only up to a certain number of processors, beyond which they require a costly forklift upgrade.

     

  • Vendor independence: Although it is generally advisable to use similar components across the servers in a cluster, it is good to maintain a certain degree of vendor independence, especially if the cluster is being deployed for long-term use. A Linux cluster based mostly on commodity hardware allows for much greater vendor independence than a large multi-processor system running a proprietary operating system.

     

  • Adaptability: It is much easier to adapt the topology of a cluster, i.e. the pattern in which the compute nodes are connected together, to best suit the application requirements of a computer center. Vendors typically support only very restricted topologies for massively parallel processors (MPPs) because of design, or sometimes testing, issues.

     

  • Reliability, Availability and Serviceability (RAS): A larger system is typically more susceptible to failure than a smaller system, and a major hardware or software component failure brings the whole system down. Hence, if a single large system is deployed as the computational resource, a component failure will take down a significant amount of computing power. In the case of a cluster, a single component failure affects only a small proportion of the overall computational resources.

    A system in the cluster can be serviced without bringing the rest of the cluster down. Additional computational resources can also be added to a cluster while it is running the user workload. A cluster therefore maintains continuity of user operations in both cases. In similar situations an SMP system requires a complete shutdown and restart.

     

  • Faster technology innovation: Clusters benefit from the work of thousands of researchers around the world, who typically work on smaller systems rather than expensive high-end systems.
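
The Reliability, Availability and Serviceability point in the list above can be made concrete with a little arithmetic. The following is a minimal sketch that assumes a cluster of identical, independent nodes and treats a single large system as a "cluster" of one node. The function name and the 64-node figure are hypothetical and chosen only for illustration.

```python
def capacity_lost_fraction(num_nodes: int, failed_nodes: int = 1) -> float:
    """Fraction of total compute capacity lost when `failed_nodes` of
    `num_nodes` identical, independent nodes go down."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    if not 0 <= failed_nodes <= num_nodes:
        raise ValueError("failed_nodes must be between 0 and num_nodes")
    return failed_nodes / num_nodes


# A single large system behaves like a "cluster" of one node:
# a major component failure takes away all of its capacity.
single_system_loss = capacity_lost_fraction(num_nodes=1)

# Hypothetical 64-node cluster: one failed node removes 1/64 of the capacity.
cluster_loss = capacity_lost_fraction(num_nodes=64)

print(f"single system: {single_system_loss:.1%} of capacity lost")
print(f"64-node cluster: {cluster_loss:.1%} of capacity lost")
```

This deliberately ignores shared components such as switches or storage, whose failure can affect many nodes at once.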

Downsides of Clusters

It is important to mention some disadvantages of using clusters as opposed to a single large system. These should be considered closely when deciding on an optimal computational resource for an organization. The organization's system administrators and programmers should take an active part in evaluating the following trade-offs.

A cluster increases the number of individual components in a computer center. Every server in a cluster has its own independent power supplies, network ports and so on. The increased number of components, and of cables running between servers, partially offsets the RAS advantages mentioned above. A single system is also easier to manage than multiple servers in a cluster, and there are many more system utilities available for managing computing resources within a single system than for managing a cluster. As clusters increasingly find their way into commercial organizations, more cluster-savvy tools will become available, which will bridge some of this gap.

For a cluster to scale and make effective use of multiple CPUs, the workload needs to be properly balanced across the cluster. Workload imbalance is easier to handle in a shared-memory environment, because switching tasks across processors does not require much data movement. On a cluster, by contrast, it tends to be very hard to move an already running task from one node to another. If the environment is such that workload balance cannot be controlled, a cluster may not provide good parallel efficiency.
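
The effect of imbalance on parallel efficiency can be sketched numerically. The example below assumes identical nodes, tasks that cannot migrate once started, and a workload that is finished only when the busiest node is finished. Under those assumptions, efficiency is the total work divided by the number of nodes times the load of the busiest node. The function name and the per-node loads are hypothetical.

```python
from typing import Sequence


def parallel_efficiency(node_loads: Sequence[float]) -> float:
    """Efficiency = total work / (number of nodes * load of busiest node).

    Assumes identical nodes and a workload that completes only when the
    busiest node completes.
    """
    if not node_loads:
        raise ValueError("need at least one node")
    busiest = max(node_loads)
    if busiest <= 0:
        raise ValueError("at least one node must have positive load")
    return sum(node_loads) / (len(node_loads) * busiest)


# Hypothetical hours of work per node; both cases total 40 hours.
balanced = [10.0, 10.0, 10.0, 10.0]
imbalanced = [25.0, 5.0, 5.0, 5.0]

print(parallel_efficiency(balanced))    # 1.0
print(parallel_efficiency(imbalanced))  # 0.4
```

In the imbalanced case three of the four nodes sit idle for most of the run, so the cluster delivers only 40% of its nominal capacity.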

Programming paradigms used on a cluster are usually different from those used on shared-memory systems. It is relatively easy to exploit parallelism on a shared-memory system, since the shared data is readily available to every processor. On a cluster, as on an MPP system, either the programmer or the compiler has to explicitly transport data from one node to another. Before deploying a cluster as a key resource in your environment, you should make sure that your system administrators and programmers are comfortable working in a cluster environment.
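
To illustrate what explicitly transporting data looks like, here is a minimal sketch in which a thread and two queues stand in for a remote node and its network links. This is only a simulation on one machine: a real cluster would typically use a message-passing library such as MPI, and the names `worker_node` and `distributed_sum` are hypothetical.

```python
import queue
import threading
from typing import List


def worker_node(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Simulated remote node: it sees only what arrives in its inbox."""
    chunk = inbox.get()        # explicit receive of the input data
    outbox.put(sum(chunk))     # explicit send of the partial result


def distributed_sum(data: List[int]) -> int:
    """Sum `data` using the calling 'node' plus one simulated remote node."""
    middle = len(data) // 2
    local_chunk, remote_chunk = data[:middle], data[middle:]

    to_worker: queue.Queue = queue.Queue()
    from_worker: queue.Queue = queue.Queue()
    node = threading.Thread(target=worker_node, args=(to_worker, from_worker))
    node.start()

    to_worker.put(list(remote_chunk))  # copy: the worker gets its own data
    local_total = sum(local_chunk)     # work on the local half meanwhile
    remote_total = from_worker.get()   # block until the partial result arrives
    node.join()
    return local_total + remote_total


print(distributed_sum(list(range(1, 101))))  # 5050
```

On a shared-memory system both halves could simply read the same list. Here every piece of data the worker needs, and every result it produces, has to be sent and received explicitly.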

Cluster types

There are two broad categories of compute clusters, based on the computational resource usage characteristics of the problems being solved on them: throughput clusters and capability clusters.

 

Throughput Clusters


Figure 1.1: Throughput Cluster

A throughput cluster is deployed to solve a large number of relatively small problems, any of which a single compute node has sufficient computational resources to solve. These could be independent applications or multiple instantiations of the same application. (Here, an application is the program written by a software developer to solve a problem, and a job is a particular instantiation of an application. A job may be composed of one or more processes or threads.)

A cluster is deployed to spread these problems across multiple compute nodes so that the overall workload can be executed in parallel (see fig 1.1). A load balancing tool is used to allocate compute nodes optimally to address the needs of the users.
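
As an illustration of what such a tool has to decide, the sketch below assigns independent jobs to nodes with a simple greedy heuristic: take the longest job first and give it to the node that currently has the least work. This is one common heuristic, not the algorithm of any particular load balancing product, and the job names and runtimes are hypothetical.

```python
import heapq
from typing import Dict, List


def assign_jobs(job_hours: Dict[str, float], num_nodes: int) -> List[List[str]]:
    """Assign independent jobs to nodes, longest job first, each to the
    currently least-loaded node. Returns the job names placed on each node."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    # Heap entries are (current load in hours, node index).
    heap = [(0.0, node) for node in range(num_nodes)]
    placement: List[List[str]] = [[] for _ in range(num_nodes)]
    longest_first = sorted(job_hours.items(), key=lambda kv: kv[1], reverse=True)
    for name, hours in longest_first:
        load, node = heapq.heappop(heap)   # least-loaded node so far
        placement[node].append(name)
        heapq.heappush(heap, (load + hours, node))
    return placement


# Hypothetical job names and estimated runtimes in hours.
jobs = {"render": 6.0, "backup": 2.0, "report": 3.0, "index": 5.0, "mail": 1.0}
for node, names in enumerate(assign_jobs(jobs, num_nodes=2)):
    print(f"node {node}: {names}")
```

Because each job fits on a single node, no communication between nodes is needed once the jobs have been placed.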

Capability Clusters


Figure 1.2: Capability Cluster

A capability cluster is deployed when a problem cannot be cost-effectively solved using a single server. As mentioned earlier, a set of small servers is significantly cheaper than a single system with the same aggregate number of CPUs. A cluster is therefore more cost-effective to deploy if it can execute the application at a level of performance comparable to that of a similarly sized single system. In some extreme cases the resource requirements of a problem could be so large that a single system image cannot effectively scale to meet the demand.

Multiple nodes in a capability cluster coordinate with each other to solve the problem concurrently (see fig 1.2). A parallel programming technique is used to spread the load of a problem across compute nodes.
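
One such technique is to decompose a single problem into blocks, compute each block on a different node, and combine the partial results. The sketch below applies this to a classic example, estimating pi by numerically integrating 4 / (1 + x^2) over [0, 1] with the midpoint rule. Threads stand in for compute nodes here, so this only illustrates the decomposition and is not a real multi-node program. The function names are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_pi(start: int, stop: int, steps: int) -> float:
    """Midpoint-rule contribution of intervals [start, stop) to the
    integral of 4 / (1 + x^2) over [0, 1], which equals pi."""
    width = 1.0 / steps
    total = sum(4.0 / (1.0 + ((i + 0.5) * width) ** 2) for i in range(start, stop))
    return total * width


def estimate_pi(steps: int, num_nodes: int) -> float:
    """Split the intervals into contiguous blocks, one per simulated node,
    and add up the partial results."""
    if steps < 1 or num_nodes < 1:
        raise ValueError("steps and num_nodes must be at least 1")
    # Block boundaries: node n handles intervals bounds[n] .. bounds[n + 1] - 1.
    bounds = [steps * n // num_nodes for n in range(num_nodes + 1)]
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        parts = pool.map(partial_pi, bounds[:-1], bounds[1:], [steps] * num_nodes)
        return sum(parts)


estimate = estimate_pi(steps=100_000, num_nodes=4)
print(estimate, abs(estimate - math.pi))
```

Unlike the jobs on a throughput cluster, the blocks here are parts of one problem: the answer exists only after every node has finished and the partial results have been combined.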



 
