MySQL Cluster Evaluation Guide
The purpose of this article is to present several key points to consider before and during an evaluation of MySQL Cluster and MySQL Cluster Carrier Grade Edition. These points will help you make the most of the time and resources you dedicate to an evaluation, and determine the product's suitability for your application or database migration.
What is MySQL Cluster?
MySQL Cluster is a relational database technology which enables the clustering of in-memory and disk-based tables with shared-nothing storage. A shared-nothing architecture is a distributed computing architecture in which each node is independent and self-sufficient, and there is no single point of contention across the system. This allows the system to work with commodity hardware and software components, such as the standards-based AdvancedTCA platform.
MySQL Cluster integrates the standard MySQL server with a clustered storage engine called NDB. The data within a MySQL Cluster can therefore be accessed via various MySQL connectors, such as those for PHP, Java or .NET. Data can also be accessed and manipulated directly using MySQL Cluster’s native NDB API. This C++ interface provides fast, low-level connectivity to data stored in a MySQL Cluster. A Java version of the NDB API, called NDB/J, is also available.
Nodes which comprise MySQL Cluster
Architecturally, MySQL Cluster consists of three different types of nodes, each providing a specialized role.
Data Nodes are the main nodes of a MySQL Cluster. They provide the following functionality:
• Storage and management of both in-memory and disk-based data
• Automatic and user defined partitioning of data
• Synchronous replication of data between data nodes
• Transactions and data retrieval
• Failover
• Resynchronization after failure
Because data is stored and distributed in a shared-nothing architecture, i.e. without the use of a shared disk, and is replicated synchronously between Data Nodes, there will always be at least one additional Data Node storing the same information if a Data Node happens to fail. This allows requests and transactions to continue to be satisfied without interruption. Transactions which are aborted because of a Data Node failure are rolled back and must be restarted.
As of version 5.1, it is possible to choose how data is stored: either on disk or completely in memory. In-memory storage can be especially useful for data that changes frequently (the active working set). Data stored in memory is routinely checkpointed to disk, both locally and globally across Data Nodes, so that the MySQL Cluster can be recovered in case of a system failure. Disk-based storage can be used for data with less strict performance requirements, where the data set is bigger than the available RAM. As with most other database servers, a page cache is used to cache frequently used disk-based data in order to increase performance.
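
The way partitioning and synchronous replication combine to survive a Data Node failure can be illustrated with a small toy model. This is only a sketch under stated assumptions: the node names, the grouping of Data Nodes into replica sets of two, and the use of CRC32 as the partitioning hash are all chosen for illustration and are not taken from this article or from the NDB implementation.

```python
import zlib


def node_groups(data_nodes, replicas=2):
    """Split the Data Nodes into groups; every node in a group holds the same rows."""
    if len(data_nodes) % replicas != 0:
        raise ValueError("number of Data Nodes must be a multiple of replicas")
    return [data_nodes[i:i + replicas] for i in range(0, len(data_nodes), replicas)]


def group_for_key(primary_key, groups):
    """Map a primary key to a group via a hash (CRC32 is a stand-in, not NDB's hash)."""
    digest = zlib.crc32(str(primary_key).encode("utf-8"))
    return groups[digest % len(groups)]


def readable(primary_key, groups, failed):
    """A row stays available while at least one node in its group is alive."""
    return any(node not in failed for node in group_for_key(primary_key, groups))


# Hypothetical four-node cluster with two copies of every row.
groups = node_groups(["dn1", "dn2", "dn3", "dn4"], replicas=2)
print(groups)  # [['dn1', 'dn2'], ['dn3', 'dn4']]

# Losing a single Data Node leaves every row reachable through its replica.
print(all(readable(key, groups, failed={"dn2"}) for key in range(1000)))  # True
```

In this model a row only becomes unavailable when every node holding a copy of it has failed, which is why spreading the replicas of a row over separate hosts matters during an evaluation.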
Application Nodes are the applications connecting to the database. An Application Node can be an application using one of the high-performance APIs, such as the NDB API or NDB/J. It can also be one or many MySQL Servers acting as SQL interfaces to the data stored within a MySQL Cluster. A common approach is to access the data from real-time applications using the NDB API, and to perform operations and maintenance tasks using the SQL interface, where real-time performance is not critical.
Data Nodes do not require any specific Application Nodes to be available and running in order to service requests from other Application Nodes. This means there is no interdependence between Application Nodes and Data Nodes. By minimizing the interdependency of nodes in this way, MySQL Cluster is able to minimize single points of failure.
Management Nodes manage the cluster configuration information and make it available to other nodes. The Management Nodes are used at startup, when a node wants to join the cluster, and when there is a system reconfiguration. Management Nodes can be stopped and restarted without affecting the ongoing execution of the Data and Application Nodes. By default, the Management Node also provides arbitration services in the event of a network failure which leads to a “split-brain” or to a cluster exhibiting “network partitioning”.
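
The role of arbitration during a network split can be sketched as a small decision function. The article does not spell out the rules, so the logic below is a simplified model and an assumption on our part: Data Nodes are taken to be organized in groups that each hold a full copy of one part of the data, a side of the split that lacks a whole group cannot continue, a side that holds a complete group can continue unaided, and in all other cases the arbitrator decides. The function and node names are hypothetical.

```python
def partition_outcome(partition, groups, wins_arbitration):
    """Decide whether one side of a network split may keep running.

    partition:        set of Data Node names that can still reach each other
    groups:           list of node groups, each a list of node names
    wins_arbitration: True if the arbitrator picked this side (assumed input)
    """
    alive_per_group = [[n for n in group if n in partition] for group in groups]

    # A side missing an entire group has lost part of the data.
    if any(len(alive) == 0 for alive in alive_per_group):
        return "shutdown"

    # A side holding a complete group knows the other side cannot have all
    # the data, so it can continue without asking the arbitrator.
    if any(len(alive) == len(group) for alive, group in zip(alive_per_group, groups)):
        return "continue"

    # Otherwise both sides might be viable: only the arbitrator's pick continues.
    return "continue" if wins_arbitration else "shutdown"


groups = [["dn1", "dn2"], ["dn3", "dn4"]]

# Even split: each side has one node of every group, so arbitration decides.
print(partition_outcome({"dn1", "dn3"}, groups, wins_arbitration=True))   # continue
print(partition_outcome({"dn2", "dn4"}, groups, wins_arbitration=False))  # shutdown
```

The point of the sketch is that at most one side of a split is ever allowed to keep accepting transactions, which is what prevents the two halves from diverging.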
Figure 1 shows a simplified architecture diagram of a MySQL Cluster consisting of four Data Nodes.
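
For an evaluation, a topology of this shape is described in the cluster configuration file served by the Management Node. The helper below is a minimal sketch that prints such a file for four Data Nodes. The host names, the single Management Node, the two MySQL Servers and the choice of two replicas are assumptions made for illustration; only the count of four Data Nodes comes from the text above, and a real configuration would need further parameters (memory sizes, data directories and so on).

```python
def render_config(mgm_host, data_hosts, sql_hosts, replicas=2):
    """Return the text of a minimal cluster configuration file as a string."""
    lines = ["[ndbd default]", f"NoOfReplicas={replicas}", ""]
    lines += ["[ndb_mgmd]", f"HostName={mgm_host}", ""]
    for host in data_hosts:   # one section per Data Node
        lines += ["[ndbd]", f"HostName={host}", ""]
    for host in sql_hosts:    # one section per MySQL Server (Application Node)
        lines += ["[mysqld]", f"HostName={host}", ""]
    return "\n".join(lines)


# Hypothetical hosts: one Management Node, four Data Nodes, two MySQL Servers.
config_text = render_config(
    mgm_host="mgm.example.com",
    data_hosts=[f"data{i}.example.com" for i in range(1, 5)],
    sql_hosts=["sql1.example.com", "sql2.example.com"],
)
print(config_text)
```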
The Benefits of MySQL Cluster
The shared-nothing architecture employed by MySQL Cluster offers
several key advantages:
Scalability
MySQL Cluster offers scalability on three different levels:
• If more storage or capacity is needed, Data Nodes can be added incrementally
• Application Nodes can be dynamically added to increase performance and parallelization
• Clients connecting to Application Nodes can also be dynamically added online
Performance
MySQL Cluster's architecture, which offers scalability on three tiers, delivers its highest performance when used in conjunction with:
• NDB API or NDB/J
• Primary key lookups
• Distribution-aware application design
• User-defined partitioning
• Parallelization
• Transaction batching
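
Several of these items work together: when an application knows how primary keys map to partitions, it can group lookups so that each batch is served by a single partition. The following is a toy sketch of that idea only. The CRC32 hash, the partition count and the function names are assumptions for illustration, and no NDB API calls are involved.

```python
import zlib
from collections import defaultdict


def partition_of(primary_key, partitions):
    """Map a primary key to a partition (CRC32 is a stand-in, not NDB's hash)."""
    return zlib.crc32(str(primary_key).encode("utf-8")) % partitions


def batch_by_partition(keys, partitions):
    """Group primary keys so that each batch targets exactly one partition."""
    batches = defaultdict(list)
    for key in keys:
        batches[partition_of(key, partitions)].append(key)
    return dict(batches)


keys = list(range(100))
batches = batch_by_partition(keys, partitions=4)

# Sent one by one, the lookups cost one round trip each; batched per
# partition, they cost at most one round trip per partition touched.
print(len(keys), "single-row requests versus", len(batches), "batched requests")
```

When evaluating performance, it is worth comparing a workload written in this batched, distribution-aware style against the same workload issued as individual requests.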
High-Availability
Data Nodes can fail and resynchronize automatically, without affecting service or forcing the Application Nodes to reconnect. Moreover, it is also possible to have redundant Management Servers and Application Nodes to maximize service availability. In version 5.1, it is also possible to replicate asynchronously between MySQL Clusters to allow for geographic redundancy.
Key features of MySQL Cluster 5.1
MySQL Cluster 5.1 introduces several new features that lend themselves to building a high-performance, scalable and highly available system. These include:
• Disk-based data
• Row-based replication
• Online add/drop index
• More efficient variable sized record storage
• Optimized node recovery
For more information about these features, see:
http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster51.php




