Introduction
Having written an application to work
with an unsharded MySQL setup, how do we migrate the application to a
sharded setup ?
The above problem can be divided into
two parts
- Migrating the MySQL Store (with minimum disruption to ongoing transactions)
- Migrating the Application.
This blog deals with the first
problem of migrating the MySQL Store. The second problem of migrating
the application will be handled in a separate blog post, and we shall
assume for now that we just replace the old unsharded non-Fabric
aware application with the new Fabric aware application.
Most enterprises start with a single
server setup to develop a prototype and later realize that they need
to scale. This blog presents a way for such a setup to seamlessly
scale to a sharded setup with minimal disruption of ongoing
transactions.
Unsharded Setup
The following is the schema in the
MySQL Server running on localhost:13013
The employee database has the
following tables
- employee
- salaries
- dept_emp
- fuel_reimb
Creating the Sharded Configuration
Let us assume that our final sharded
configuration will have three shards. Each of these shards will be
present in one group and there will be one global group.
The tables employee, salaries and
dept_emp need to be sharded based on emp_no while fuel_reimb is a
global table that needs to be present in all the shards.
Bring the Target server into the Fabric ecosystem
Create a Fabric group and add the
server to it. Promote the server to master.
Commands:
mysqlfabric
group create GROUPID1
mysqlfabric
group add GROUPID1 localhost:13013 root ""
mysqlfabric
group promote GROUPID1
The server is now part of the Fabric
setup.
Creating the Global Group
The global group in a sharding setup
is used to propagate schema updates to all tables in the sharding
setup and updates to global tables throughout the sharding ecosystem.
The global group contains
- The schema of the sharded tables
- The global tables
NOTE:
Once we have restored the schema on the global group, until the
global group is connected to the server group of the shard (GROUPID1) i.e. Until we
create the sharding setup, we need to ensure that schema changing
transactions do not occur on the server group of the shard. If schema changes occur
in the server group of the shard, these will need to be propogated to the global
group also, otherwise it will result in mismatching schemas in the
global group and the server group of the shard. This can cause problems for global
operations that expect the new schema. Once the sharding setup is
created, schema changes can be sent to the global group.
NOTE:
Similarly once the global tables have been restored on the global
group, we need to prevent changes to the global groups until the
sharding setup is created, after which the changes can be propagated
to the global group.
Commands:
mysqlfabric
group create GLOBAL_GROUP
mysqlfabric
group add GLOBAL_GROUP localhost:13011 root ""
mysqlfabric
group promote GLOBAL_GROUP
Creating the data in the Global Group
Backup the schema of all the tables
and restore them on the on the global group. Also backup the data in
the global tables and restore them on the global group.
Commands:
Use mysqldump
to backup the schemas
mysqldump
-d -u root --single-transaction --all-databases --socket <Server
socket file for localhost:13013> > schema.sql
NOTE:
--no-data can be used instead of -d in the above command.
Use mysql
client to restore the schemas
mysql -u
root --socket <Server socket file for localhost:13011> <
schema.sql
Use mysqldump
to backup the global tables
mysqldump
-u root --no-create-info --single-transaction --socket <server
socket file for localhost:13013> employee fuel_reimb >
global.sql
The
GTID_EXECUTED variable would have been set on the destination
server. This causes applying the dump using the mysql client to fail.
Reset this to allow restoring the global table.
reset
master
Restore the
data on the global tables
mysql
--database=employee -u root --socket <server socket file for
localhost:13011> < global.sql
Creating the Sharding Setup
Create a sharding definition
A sharding definition introduces a
particular sharding scheme to which multiple tables related by a
common sharding key can be mapped. A Sharding definition is composed
of
- Sharding scheme – RANGE, HASH
- Global Group – Gets the global updates for all the shard tables mapped to this sharding definition. It also stores the global tables in this sharding definition.
Command
mysqlfabric
sharding define HASH GLOBAL_GROUP
Map the tables to the sharding definition
Each sharding definition is
associated with a unique ID. We map this unique ID to the table being
sharded. Each mapping maps a table and the column containing the
shard key to the sharding definition.
Commands:
mysqlfabric
sharding add_mapping 1 employee.employee emp_no
mysqlfabric
sharding add_mapping 1 employee.dept_emp emp_no
mysqlfabric
sharding add_mapping 1 employee.salaries emp_no
Define the Shards
The shard definitions define the way
the data is sharded based on the shard key. For a hash based sharding
scheme, since the shards are automatically defined on the shard key,
we do not need to manually define the lower_bounds for each shard.
Each shard is associated with a
sharding definition and a group on which the data will be present.
When we define a shard we also need to specify if it will be ENABLED
for operations on shards.
Commands:
mysqlfabric
sharding add_shard 1 GROUPID1 ENABLED
The sharding setup now contains a
global group and one shard containing all the data.
Once the sharding setup has been
created we can start using the Fabric enabled connector to start
executing transactions. This would require shutting down the
non-Fabric-aware application and replacing it by a Fabric-aware
application.
Split the shards
Split the shard containing the data.
Splitting helps to distribute the load into another group. This is a
way to scale our setup for both read and write loads. The splitting
can happen while transactions are ongoing.
Commands:
Add another
group to which we want to split the data
mysqlfabric
group create GROUPID2
Add the
servers that are part of the split group
mysqlfabric
group add GROUPID2 localhost:13009 root ""
Promote a
server as master in the split group
mysqlfabric
group promote GROUPID2
Split the
shard
If you do not
know the shard ID, dump the sharding information to find the shard
ID.
mysqlfabric
store dump_sharding_information
split the
shard
mysqlfabric
sharding split 1 GROUPID2
Now the data is distributed between
the GROUPID1
and the GROUPID2.
Further Splits
The shards can be split further
depending on need. For example to split shard with shard ID = 3 into
a group with group ID = GroupID3 we will run the following command,
Commands:
mysqlfabric
sharding split 3 GROUPID3
After the
above command is executed the topology would look like the following.
Please find a few thoughts on maintaining a sharded system here http://vnwrites.blogspot.com/2013/09/mysqlfabric-sharding-maintenance.html