NAME

DBIx::Class::Storage::DBI::Replicated - BETA Replicated database support

SYNOPSIS

The following example shows how to change an existing $schema to a replicated storage type, add some replicated (read-only) databases, and perform reporting tasks.

You should set the 'storage_type' attribute to a replicated type. You should also define your arguments, such as which balancer you want and any arguments that the Pool object should get.

my $schema = Schema::Class->clone;
$schema->storage_type(['::DBI::Replicated', { balancer_type => '::Random' }]);
$schema->connection(...);

Next, you need to add in the Replicants. Basically this is an array of arrayrefs, where each arrayref is database connect information. Think of these arguments as what you'd pass to the 'normal' $schema->connect method.

$schema->storage->connect_replicants(
  [$dsn1, $user, $pass, \%opts],
  [$dsn2, $user, $pass, \%opts],
  [$dsn3, $user, $pass, \%opts],
);

Now, just use the $schema as you normally would. All reads are automatically delegated to the replicants, while writes go to the master.

$schema->resultset('Source')->search({name=>'etc'});

You can force a given query to use a particular storage using the search attribute 'force_pool'. For example:

my $rs = $schema->resultset('Source')->search(undef, {force_pool=>'master'});

Now $rs will force everything (both reads and writes) to use whatever was set up as the master storage. 'master' is hardcoded to always point to the Master, but you can also use any Replicant name. Please see DBIx::Class::Storage::DBI::Replicated::Pool and the replicants attribute for more.

Also see transactions and "execute_reliably" for alternative ways to force read traffic to the master. In general, you should wrap your statements in a transaction when you are reading and writing to the same tables at the same time, since your replicants will often lag a bit behind the master.

If you have a multi-statement read only transaction you can force it to select a random server in the pool by:

my $rs = $schema->resultset('Source')->search( undef,
  { force_pool => $schema->storage->read_handler->next_storage }
);

DESCRIPTION

Warning: This class is marked BETA. It has been running a production website using MySQL native replication as its backend, and we have some decent test coverage, but the code hasn't yet been stressed by a variety of databases. Individual DBs may have quirks we are not aware of. Please use this first in development and pass along your experiences/bug fixes.

This class implements a replicated data store for DBI. Currently you can define one master and numerous slave database connections. All write-type queries (INSERT, UPDATE, DELETE and even LAST_INSERT_ID) are routed to the master database; all read-type queries (SELECTs) go to the slave databases.

Basically, any method request that DBIx::Class::Storage::DBI would normally handle gets delegated to one of two attributes: "read_handler" or "write_handler". Additionally, some methods need to be distributed to all existing storages. This way our storage class is a drop-in replacement for DBIx::Class::Storage::DBI.

Read traffic is spread across the replicants (slaves) according to a user-selected algorithm. The default algorithm is random weighted.
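
The read/write split described above can be pictured with a small pure-Perl toy model. This is only a sketch under stated assumptions, not the module's implementation: the Toy::Replicated package, its handler_for method, and the list of "write" method names are all hypothetical, storages are represented by plain strings, and the pick is an unweighted random choice.

```perl
use strict;
use warnings;

{
    package Toy::Replicated;

    # Methods treated as writes in this toy model (an assumption for illustration).
    my %IS_WRITE = map { $_ => 1 } qw(insert update delete);

    sub new {
        my ($class, %args) = @_;
        return bless { master => $args{master}, replicants => $args{replicants} }, $class;
    }

    # Pick the storage that should handle $method:
    # writes always go to the master, everything else to a random replicant.
    sub handler_for {
        my ($self, $method) = @_;
        return $self->{master} if $IS_WRITE{$method};
        my @pool = @{ $self->{replicants} };
        return $pool[ int rand @pool ];    # unweighted random pick
    }
}

my $storage = Toy::Replicated->new(
    master     => 'master',
    replicants => [ 'replicant_1', 'replicant_2' ],
);

print $storage->handler_for('insert'), "\n";    # always the master
print $storage->handler_for('select'), "\n";    # one of the replicants
```

The point of the model is only that the caller never chooses a backend explicitly; the storage object decides per method call.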

NOTES

The consistency between master and replicants is database specific. The Pool gives you a method to validate its replicants, removing and replacing them when they fail/pass predefined criteria. Please make careful use of the ways to force a query to run against Master when needed.
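
As a rough illustration of what "validating" replicants against predefined criteria can mean, the following pure-Perl sketch deactivates replicants that have stopped replicating or lag too far behind, and reactivates ones that pass again. The data structure, the validate_replicants helper shown here, and the numbers are hypothetical; consult DBIx::Class::Storage::DBI::Replicated::Pool for the real attributes and behaviour.

```perl
use strict;
use warnings;

# Hypothetical snapshot of replicant health; in a real deployment these
# values would come from the databases, not from a hard-coded hash.
my %replicants = (
    replicant_1 => { is_replicating => 1, lag => 0,  active => 1 },
    replicant_2 => { is_replicating => 1, lag => 45, active => 1 },
    replicant_3 => { is_replicating => 0, lag => 0,  active => 1 },
    replicant_4 => { is_replicating => 1, lag => 2,  active => 0 },
);

# Mark each replicant active only if it is replicating and within the lag
# limit, then return the sorted names of the active ones.
sub validate_replicants {
    my ($pool, $maximum_lag) = @_;
    for my $replicant (values %$pool) {
        my $ok = $replicant->{is_replicating} && $replicant->{lag} <= $maximum_lag;
        $replicant->{active} = $ok ? 1 : 0;
    }
    my @active = sort grep { $pool->{$_}{active} } keys %$pool;
    return @active;
}

my @active = validate_replicants(\%replicants, 10);
print "active: @active\n";
```

Here replicant_2 is dropped for excessive lag, replicant_3 for not replicating, and replicant_4 is put back into rotation because it now meets the criteria.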

REQUIREMENTS

Replicated Storage has additional requirements not currently part of DBIx::Class. See DBIx::Class::Optional::Dependencies for more details.

ATTRIBUTES

This class defines the following attributes.

schema

The underlying DBIx::Class::Schema object this storage is attached to.

pool_type

Contains the classname which will instantiate the "pool" object. Defaults to: DBIx::Class::Storage::DBI::Replicated::Pool.

pool_args

Contains a hashref of initialized information to pass to the Pool object. See DBIx::Class::Storage::DBI::Replicated::Pool for available arguments.

balancer_type

The replication pool requires a balancer class to provide the methods for choosing how to spread the query load across each replicant in the pool.

balancer_args

Contains a hashref of initialized information to pass to the Balancer object. See DBIx::Class::Storage::DBI::Replicated::Balancer for available arguments.

pool

Is a DBIx::Class::Storage::DBI::Replicated::Pool or derived class. This is a container class for one or more replicated databases.

balancer

Is a DBIx::Class::Storage::DBI::Replicated::Balancer or derived class. This is a class that takes a pool (DBIx::Class::Storage::DBI::Replicated::Pool) and spreads the query load across its replicants.

master

The master defines the canonical state for a pool of connected databases. All the replicants are expected to match this database's state. Thus, in a classic Master / Slaves distributed system, all the slaves are expected to replicate the Master's state as quickly as possible. This is the only database in the pool of databases that is allowed to handle write traffic.

ATTRIBUTES IMPLEMENTING THE DBIx::Class::Storage::DBI INTERFACE

The following attributes are delegated all the methods required for the DBIx::Class::Storage::DBI interface.

read_handler

Defines an object that implements the read side of DBIx::Class::Storage::DBI.

write_handler

Defines an object that implements the write side of DBIx::Class::Storage::DBI, as well as methods that don't write or read that can be called on only one storage, methods that return a $dbh, and any methods that don't make sense to run on a replicant.

around: connect_info

Preserves the master's connect_info options (for merging with replicants). Also sets any Replicated-related options from connect_info, such as pool_type, pool_args, balancer_type and balancer_args.

METHODS

This class defines the following methods.

BUILDARGS

When instantiating its storage, DBIx::Class::Schema passes itself as the first argument, so we need to massage the arguments a bit so that all the bits get put into the correct places.

_build_master

Lazy builder for the "master" attribute.

_build_pool

Lazy builder for the "pool" attribute.

_build_balancer

Lazy builder for the "balancer" attribute. This takes a Pool object so that the balancer knows which pool it's balancing.

_build_write_handler

Lazy builder for the "write_handler" attribute. The default is to set this to the "master".

_build_read_handler

Lazy builder for the "read_handler" attribute. The default is to set this to the "balancer".

around: connect_replicants

All calls to connect_replicants need to have an existing $schema tacked onto the top of the args, since DBIx::Class::Storage::DBI needs it. Any connect_info options are also merged with the master's, with replicant options having higher priority.
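
The priority rule for merged options can be illustrated with a plain shallow hash merge. This is only a sketch: the option names and values are hypothetical examples, and the module's own merge may treat nested structures differently.

```perl
use strict;
use warnings;

# Hypothetical option sets; the keys are only examples.
my %master_opts    = ( AutoCommit => 1, RaiseError => 1, quote_char => '`' );
my %replicant_opts = ( quote_char => '"', ReadOnly => 1 );

# Later pairs win in a list-to-hash assignment, so replicant options
# override master options that share the same key.
my %merged = ( %master_opts, %replicant_opts );

print "$_ => $merged{$_}\n" for sort keys %merged;
```

The replicant keeps its own quote_char while still inheriting AutoCommit and RaiseError from the master.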

all_storages

Returns an array of all the connected storage backends. The first element in the returned array is the master, and the rest are each of the replicants.

execute_reliably ($coderef, ?@args)

Given a coderef, saves the current state of the "read_handler", forces it to use reliable storage (e.g. sets it to the master), executes the coderef and then restores the original state.

Example:

my $reliably = sub {
  my $name = shift @_;
  $schema->resultset('User')->create({name=>$name});
  my $user_rs = $schema->resultset('User')->find({name=>$name});
  return $user_rs;
};

my $user_rs = $schema->storage->execute_reliably($reliably, 'John');

Use this when you must be certain of your database state, such as when you just inserted something and need to get a resultset including it, etc.
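
The save, switch, run, restore sequence described for this method can be pictured with a small pure-Perl model. It is a sketch under stated assumptions, not the module's code: the storage is a plain hash, the handlers are strings, and restoring the handler when the coderef dies is a design choice of this sketch.

```perl
use strict;
use warnings;

# Toy storage holding only the pieces needed for the illustration.
my %storage = (
    master       => 'master',
    balancer     => 'balancer',
    read_handler => 'balancer',
);

# Run $code with reads pointed at the master, then restore the previous
# read handler, even if $code dies.
sub execute_reliably {
    my ($storage, $code, @args) = @_;
    my $previous = $storage->{read_handler};
    $storage->{read_handler} = $storage->{master};

    my @result;
    my $ok    = eval { @result = $code->(@args); 1 };
    my $error = $@;

    $storage->{read_handler} = $previous;
    die $error unless $ok;
    return wantarray ? @result : $result[0];
}

# Inside the coderef the read handler is the master; afterwards it is
# the balancer again.
my $seen = execute_reliably(\%storage, sub { $storage{read_handler} });
print "inside: $seen, after: $storage{read_handler}\n";
```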

set_reliable_storage

Sets the current $schema to be 'reliable'; that is, all queries, both read and write, are sent to the master.

set_balanced_storage

Sets the current $schema to use the "balancer" for all reads, while all writes are sent to the master only.

connected

Check that the master and at least one of the replicants are connected.

ensure_connected

Make sure all the storages are connected.

limit_dialect

Set the limit_dialect for all existing storages.

quote_char

Set the quote_char for all existing storages.

name_sep

Set the name_sep for all existing storages.

set_schema

Set the schema object for all existing storages.

debug

Set a debug flag across all storages.

debugobj

Set a debug object.

debugfh

Set a debugfh object.

debugcb

Set a debug callback.

disconnect

Disconnect everything.

cursor_class

Sets the cursor class on all storages, or returns the master's.

cursor

Sets the cursor class on all storages, or returns the master's. Alias for "cursor_class" above.

unsafe

Sets the "unsafe" in DBIx::Class::Storage::DBI option on all storages, or returns the master's current setting.

disable_sth_caching

Sets the "disable_sth_caching" in DBIx::Class::Storage::DBI option on all storages, or returns the master's current setting.

lag_behind_master

Returns the highest "lag_behind_master" in DBIx::Class::Storage::DBI value among the replicants.

is_replicating

Returns true if all replicants return true for "is_replicating" in DBIx::Class::Storage::DBI.

connect_call_datetime_setup

Calls "connect_call_datetime_setup" in DBIx::Class::Storage::DBI for all storages.

connect_call_rebase_sqlmaker

Calls "connect_call_rebase_sqlmaker" in DBIx::Class::Storage::DBI for all storages.

GOTCHAS

Because replicants can lag behind the master, you must use one of the methods for forcing read queries to the master whenever you need realtime data integrity. For example, if you insert a row and then immediately re-read it from the database (say, by doing $result->discard_changes), or you insert a row and then immediately build a query that expects that row to be present, you should force the master to handle reads. Otherwise, due to the lag, there is no certainty your data will be in the expected state.

For data integrity, all transactions automatically use the master storage for all read and write queries. Using a transaction is the preferred and recommended method to force the master to handle all read queries.
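
A minimal sketch of the transaction approach, assuming a result source named 'User' with a 'name' column (as in the execute_reliably example); the helper name create_and_reread_user is hypothetical. It relies only on the standard txn_do, resultset, create and find calls.

```perl
use strict;
use warnings;

# Create a row and re-read it inside one transaction, so that (per the
# text above) both the write and the read are handled by the master.
# Assumes a 'User' source with a 'name' column; adjust to your schema.
sub create_and_reread_user {
    my ($schema, $name) = @_;
    return $schema->txn_do(sub {
        my $users = $schema->resultset('User');
        $users->create({ name => $name });
        return $users->find({ name => $name });
    });
}
```

With a connected schema this would be called as create_and_reread_user($schema, 'John'), returning the freshly inserted row without depending on replication lag.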

Otherwise, you can force a single query to use the master with the 'force_pool' attribute:

my $result = $resultset->search(undef, {force_pool=>'master'})->find($pk);

This attribute will safely be ignored by non-replicated storages, so you can use the same code for both types of systems.

Lastly, you can use the "execute_reliably" method, which works very much like a transaction.

For debugging, you can turn replication on/off with the methods "set_reliable_storage" and "set_balanced_storage". However, this operates at a global level and is not suitable if you have a shared Schema object being used by multiple processes, such as on a web application server. You can get around this limitation by using the Schema clone method.

my $new_schema = $schema->clone;
$new_schema->set_reliable_storage;

## $new_schema will use only the Master storage for all reads/writes while
## the $schema object will use replicated storage.

FURTHER QUESTIONS?

Check the list of additional DBIC resources.