In this article we discuss a use case
for starting with a single server (single shard setup) and scaling
gradually as the application and its data grows. We also describe
here the different aspects of creating shards using MySQL Fabric and
what each of it means.
Note: Unlike the previous example
the present example is not working with data that already exists.
Note: The example also covers how
to modify the python application code to work with Fabric.
Prototyping the Application on a Single MySQL Shard Setup
The figure below depicts the
difference in the initial setup between using and not using MySQL
Fabric.
Note: For the initial setup the
Global Group can contain just another MySQL Server running on the
same machine.
Note: The application would also
need changes in the way it uses the python connector, when it
switches to FABRIC.
The Application
In order to keep the focus on working
with Fabric, a very simple application is used as an example. The
application basically registers and queries employee details. The
schema itself is highly simplified and contains the bare minimum to
cover the all the use cases in Fabric.
Application Schema:
The employee, department and the salary tables are sharded based on the emp_no that is used as the shard key.
The fuel_reimb represents a table
that is often used (in joins queries) to find the fuel allowance for
various employees. All the data in this table is often required in
joins.
Connecting to the database:
We use the python connector to write into the database. The python connector is MySQL Fabric aware. We shall look into the details of using the Fabric aware python connector later in this blog.
We use the python connector to write into the database. The python connector is MySQL Fabric aware. We shall look into the details of using the Fabric aware python connector later in this blog.
Starting MySQL Fabric
- Start a MySQL Server that acts as the state store.
- Change fabric.cfg, for configuring Fabric, to use this MySQL Server as the state store.
- Fabric persists metadata about the shard topology in a set of relation tables in the MySQL Server acting as the state store. This step creates the schemas of the tables in the state store.
Commands:
mysqlfabric
manage setup
- Starting the Fabric Server (contains an XML-RPC Server).
Commands:
mysqlfabric
manage start
Creating the sharding topology
Creating the Global Group
The global group in a sharding setup
is used to propagate schema updates to all tables in the sharding
setup and updates to global tables throughout the sharding ecosystem.
The global group contains,
- The schema of the sharded tables
- The global tables
Commands:
mysqlfabric
group create GLOBAL_GROUP
mysqlfabric
group add GLOBAL_GROUP localhost:13001 root ""
mysqlfabric
group add GLOBAL_GROUP localhost:13002 root ""
Find the UUID
of localhost:13001 and promote it as master
mysqlfabric
group lookup_servers GLOBAL_GROUP
mysqlfabric
group promote GLOBAL_GROUP <UUID of MySQL Server running on
localhost:13001>
Creating the Server Group (for the first shard)
Each shard maps to a server group.
Each server group is a set of MySQL servers in a HA configuration.
The application starts with a single server group and adds more shards (server groups)
as required. Let us create the first shard.
Commands:
mysqlfabric
group create GROUPID1
mysqlfabric group add GROUPID1 localhost:13003 root ""
mysqlfabric
group add GROUPID1 localhost:13004 root ""
mysqlfabric group lookup_servers GROUPID1
mysqlfabric
group promote <UUID of MySQLServer running on localhost:13003>
Create a Sharding Definition
A sharding definition introduces a
particular sharding scheme to which multiple tables (related by a
common sharding key) can be mapped. A Sharding definition is composed
of
- Sharding scheme – RANGE, HASH
- Global Group – Gets the global updates for all the shard tables mapped to this sharding definition. It also stores the global tables in this sharding definition.
Command
mysqlfabric sharding define RANGE
GLOBAL_GROUP
Map the tables to the sharding definition
Each sharding definition is associated with a unique ID. This unique ID is generated when we create a sharding definition. In the above case this ID is 1.We map this unique ID to the table being sharded. Each mapping maps a table and the column containing the shard key to the sharding definition.Commands:
mysqlfabric
sharding add_mapping 1
employee.employee emp_no
mysqlfabric
sharding add_mapping 1 employee.salaries emp_no
mysqlfabric
sharding add_mapping 1 employee.dept_emp emp_no
Add the Shards
When we add shards we define the way
the data is sharded based on the shard key. Each of the shards,
- Specify the shard mapping ID to which it belongs.
- It also specifies the group in which the shard data will be present.
- When we add a shard we also need to specify if it will be ENABLED for operations on shards.
- A RANGE shard definition also specifies a lower_bound which represents the least value of the shard key that will be present in a particular shard. In this case we assume that the lower_bound=1.
Commands:
mysqlfabric
sharding add_shard 1 GROUPID1 ENABLED 1
Define the Schema on the Global Group
The schema for the tables is defined
on the master of the global group so that it gets replicated to all
the shards.
Queries:
CREATE
DATABASE employee;
use
employee;
CREATE
TABLE employee(emp_no INT PRIMARY KEY AUTO_INCREMENT, birth_date
DATE, first_name VARCHAR(14), last_name VARCHAR(16), gender
ENUM('M','F'), hire_date DATE);
CREATE
TABLE salaries(emp_no INT, salary INT, from_date DATE, to_date DATE,
FOREIGN KEY(emp_no) REFERENCES employee(emp_no));
CREATE
TABLE dept_emp(emp_no INT, dept_no CHAR(4), from_date DATE, to_date
DATE, FOREIGN KEY(emp_no) REFERENCES employee(emp_no));
CREATE
TABLE fuel_reimb(role VARCHAR(20), allowance INTEGER);
Write the Application
The below application is written in
Python. The application is very simple and uses connector python to
insert a row into the table. The goal of the application is to
demonstrate the extensions in the python connector that make it
FABRIC aware.
Application Code:
import
mysql.connector.fabric as connector
if __name__
== "__main__":
fabric_params
= {
"fabric"
: {"host" : "localhost", "port" :
8080},
"user"
: "root", "passwd" : ""
}
__cnx =
connector.MySQLFabricConnection(**fabric_params)
__cnx.set_property(key=str(1),tables=["employee.employee"])
cur =
__cnx.cursor()
cur.execute("use
employee")
cur.execute("INSERT
INTO employee(birth_date, first_name, last_name, gender, hire_date)
VALUES(NULL, 'A', 'B', 'M', NULL)")
__cnx.commit()
Explanation:
fabric_params
= {
"fabric"
: {"host" : "localhost", "port" :
8080},
"user"
: "root", "passwd" : ""
}
__cnx
= connector.MySQLFabricConnection(**fabric_params)
The above code
gets a connection to the running Fabric XML-RPC server. The
parameters that are passed to the MySQLFabricConnection class are
“host”
- The hostname at which the Fabric server runs.
“port”
- The port number at which the Fabric server runs.
“user”,
“passwd” - Login credentials for the Fabric state store.
__cnx.set_property(key=str(1),tables=["employee.employee"])
key –
The shard key value
tables
– The table name being sharded.
set_property
automatically redirects internally to the correct shard on which the
queries are executed.
If a query
needs to be executed on all the shards, example an update on a global
table or a schema change operation on all the shards, then add a
parameter
scope =
“GLOBAL”
to signify
global operations. This is set of “LOCAL” by default.
Adding more Shards
As the size of the shard increases,
we add more shards to the sharding topology.
Commands:
Add a group
for the shard
mysqlfabric
group create GROUPID2
mysqlfabric
group add GROUPID2 localhost:13005 root ""
mysqlfabric
group add GROUPID2 localhost:13006 root ""
mysqlfabric
group lookup_servers GROUPID2
mysqlfabric
group promote <UUID of MySQLServer running on localhost:13005>
Add the
shard definition
The difference
from the previous definition will be that the lower_bound will point
to employee number from which we start inserting into this shard. In
this case we choose a hypothetical value of 10001. This means that
starting 10001 the employee information will be stored in this shard.
mysqlfabric
sharding add_shard 1 GROUPID2 ENABLED 10001
The following
will direct the application to this new shard.
__cnx.set_property(key=str(10001),tables=["employee.employee"])
In the above
case any key value greater than 10001 will automatically redirect the
insert to the new shard.
NOTE: The
above step actually adds a new shard starting at a specified
lower_bound. Other options for moving data around in a sharded setup
are shard moves and splits. These shall be explored in later blogs.
Find out how to migrate from an unsharded to a sharded setup here http://vnwrites.blogspot.in/2013/09/mysqlfabric-sharding-migration.html .
Find out how to migrate from an unsharded to a sharded setup here http://vnwrites.blogspot.in/2013/09/mysqlfabric-sharding-migration.html .