Replication

This document describes how to configure ATSD replication.

Replication runs from master to slave: all transactions on the master cluster are replayed on the slave cluster.

In this guide, atsd_master is the hostname of the master host and atsd_slave is the hostname of the slave host.

Note: This guide applies only to new ATSD installations. Executing this guide on an existing ATSD installation leads to the loss of all stored data on both the master and slave machines.

Note: If the master loses its connection to the slave, it accumulates all data and events for the duration of the connection loss and starts transferring the accumulated data once the connection to the slave is re-established. No data is lost in the process.

Requirements

Both the master and slave machines must have static IP addresses in the local network.

Both machines must have identical hardware configurations. Review ATSD Requirements.

The same version of ATSD must be installed on both machines. See ATSD installation guides.

Installation

MASTER & SLAVE: Complete this process on both machines – master and slave.

Stop ATSD and all components:

/opt/atsd/bin/atsd-all.sh stop

Edit /etc/hosts so that it has the following form:

sudo nano /etc/hosts
127.0.0.1    localhost
master_ip    master_hostname
slave_ip     slave_hostname

Note: The hosts file must not contain the following lines. This applies to both master and slave.

127.0.1.1    atsd_master
127.0.1.1    atsd_slave

Example of a correct hosts file:

127.0.0.1    localhost
172.30.0.66    atsd_master
172.30.0.78    atsd_slave
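The hosts file rule above can be checked with a small script. The following is a minimal sketch, not part of ATSD: the function name check_hosts is hypothetical, and it assumes that any non-comment line whose first field is 127.0.1.1 is an error.

```shell
# Sketch: fail if a hosts file contains a 127.0.1.1 entry.
# check_hosts is a hypothetical helper name, not an ATSD tool.
check_hosts() {
    hosts_file="${1:-/etc/hosts}"
    # Match lines whose first field is exactly 127.0.1.1 (commented lines are ignored).
    if grep -Eq '^[[:space:]]*127\.0\.1\.1([[:space:]]|$)' "$hosts_file"; then
        echo "found 127.0.1.1 entry in $hosts_file" >&2
        return 1
    fi
    return 0
}
```

Run check_hosts without arguments on both machines to inspect /etc/hosts. A non-zero exit status means a 127.0.1.1 line is still present.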

Add the hbase.replication property to the configuration tag in the hbase-site.xml file:

<property>
    <name>hbase.replication</name>
    <value>true</value>
</property>
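To confirm that the property was added on both machines, a check along these lines can help. This is a sketch under assumptions: the function name has_replication_enabled is hypothetical, the path to hbase-site.xml must be supplied by the caller, and the name and value tags are assumed to sit on consecutive lines as in the snippet above.

```shell
# Sketch: succeed if hbase.replication is set to true in the given file.
# has_replication_enabled is a hypothetical helper name.
has_replication_enabled() {
    site_file="$1"
    # On the line with the property name, read the next line and test its value.
    awk '/<name>hbase\.replication<\/name>/ {
             if ((getline) > 0 && $0 ~ /<value>true<\/value>/) found = 1
         }
         END { exit !found }' "$site_file"
}
```

The function only reads the file. It does not validate that the property sits inside the configuration tag.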

SLAVE: Only complete this process on the slave machine.

Edit the atsd-all.sh file to disable ATSD startup:

sudo nano /opt/atsd/bin/atsd-all.sh

Comment out the following lines in the start_all function:

${ATSD_TSD} start
if [ ! $? -eq 0 ]; then
    return 1
fi

Result:

#   ${ATSD_TSD} start
#   if [ ! $? -eq 0 ]; then
#       return 1
#   fi
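Before starting the services on the slave, it can be worth confirming that the edit took effect. The sketch below uses a hypothetical helper, tsd_start_disabled, and assumes the start line looks exactly like the one shown above, that is, optional indentation followed by ${ATSD_TSD} start.

```shell
# Sketch: succeed if no active "${ATSD_TSD} start" line remains in the script.
# tsd_start_disabled is a hypothetical helper name.
tsd_start_disabled() {
    script_file="${1:-/opt/atsd/bin/atsd-all.sh}"
    # An active line begins with optional whitespace, then ${ATSD_TSD} start.
    # Lines that begin with # do not match and are treated as disabled.
    if grep -Eq '^[[:space:]]*\$\{ATSD_TSD\}[[:space:]]+start' "$script_file"; then
        return 1
    fi
    return 0
}
```

A zero exit status means the ATSD start command is commented out or absent.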

Start Hadoop and HBase:

/opt/atsd/bin/atsd-all.sh start

Run the replication configuration script:

/opt/atsd/hbase_util/configure_replication.sh slave

This command truncates all ATSD tables.

Verify that ATSD tables are present.

Start HBase shell and list tables:

echo "list" | /opt/atsd/hbase/bin/hbase shell 2>/dev/null | grep -v "\["

The output contains a list of ATSD tables, all starting with atsd_.

MASTER: Only complete this process on the master machine.

Start Hadoop and HBase:

/opt/atsd/bin/atsd-dfs.sh start
/opt/atsd/bin/atsd-hbase.sh start

Add the replication peer:

echo "add_peer '1', \"atsd_slave:2181:/hbase\"" | /opt/atsd/hbase/bin/hbase shell

Ensure that the peer is set:

echo "list_peers" | /opt/atsd/hbase/bin/hbase shell

Output:

PEER_ID CLUSTER_KEY STATE
1 atsd_slave:2181:/hbase ENABLED
1 row(s) in 0.0930 seconds
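The peer check can also be scripted. The sketch below assumes the list_peers output has the three whitespace-separated columns shown above (PEER_ID, CLUSTER_KEY, STATE). The function name peer_enabled is hypothetical.

```shell
# Sketch: read list_peers output on stdin and succeed if the given
# cluster key is listed with state ENABLED.
# peer_enabled is a hypothetical helper name.
peer_enabled() {
    cluster_key="$1"
    # Column 2 holds the cluster key, column 3 the peer state.
    awk -v key="$cluster_key" \
        '$2 == key && $3 == "ENABLED" { found = 1 } END { exit !found }'
}

# Example usage on the master:
# echo "list_peers" | /opt/atsd/hbase/bin/hbase shell | peer_enabled "atsd_slave:2181:/hbase"
```

If the column layout differs in the installed HBase version, the awk condition needs to be adjusted.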

Run the replication configuration script:

/opt/atsd/hbase_util/configure_replication.sh master

This command truncates all ATSD tables and enables replication on all ATSD column families.

Start ATSD:

/opt/atsd/bin/atsd-tsd.sh start

Verify that ATSD tables are present by listing the tables:

echo "list" | /opt/atsd/hbase/bin/hbase shell 2>/dev/null | grep -v "\["

The output contains a list of ATSD tables, all starting with atsd_.
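To compare the table lists on master and slave without reading them by eye, the list output can be reduced to a count. This is a sketch that assumes the HBase shell prints one table name per line. The function name count_atsd_tables is hypothetical.

```shell
# Sketch: count lines on stdin that start with the atsd_ prefix.
# count_atsd_tables is a hypothetical helper name.
count_atsd_tables() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the exit status at 0 in that case.
    grep -c '^atsd_' || true
}

# Example usage on either machine:
# echo "list" | /opt/atsd/hbase/bin/hbase shell 2>/dev/null | count_atsd_tables
```

Matching counts on both machines are a quick sanity check, not proof that the schemas are identical.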

Replication for New Tables

New tables created in the source cluster are not automatically replicated. Configure the replication for new tables as follows:

MASTER: Only complete this process on the master machine.

Write the table schema to a file:

/opt/atsd/hbase_util/configure_replication.sh schema atsd_new > atsd_new_schema.txt

Copy the table schema file to the slave machine:

scp atsd_new_schema.txt atsd_slave:/tmp

SLAVE: Only complete this process on the slave machine.

Create the new table in the slave database:

/opt/atsd/hbase/bin/hbase shell < /tmp/atsd_new_schema.txt

MASTER: Only complete this process on the master machine.

Enable replication for the new table:

/opt/atsd/hbase_util/configure_replication.sh flag atsd_new

Verify that the new table is being replicated using the verification instructions below.

Verifying Replication

Option 1

SLAVE: Only complete this process on the slave machine.

Check HBase logs for replication activity:

tail -n 1000 /opt/atsd/hbase/logs/hbase-axibase-regionserver-atsd_slave.log | grep replicated

The output contains replication activity and the number of entries replicated on the slave machine:

2015-07-17 16:39:22,926 INFO  regionserver.ReplicationSink (ReplicationSink.java:replicateEntries(158)) - Total replicated: 4
2015-07-17 16:39:24,019 INFO  regionserver.ReplicationSink (ReplicationSink.java:replicateEntries(158)) - Total replicated: 1
2015-07-17 16:39:25,083 INFO  regionserver.ReplicationSink (ReplicationSink.java:replicateEntries(158)) - Total replicated: 1
2015-07-17 16:39:31,122 INFO  regionserver.ReplicationSink (ReplicationSink.java:replicateEntries(158)) - Total replicated: 1
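For a quick summary of the log excerpt, the per-batch counts can be added up. The sketch below assumes each relevant log line ends with "Total replicated: N" as in the sample above. The function name total_replicated is hypothetical.

```shell
# Sketch: sum the numbers that follow "Total replicated: " on stdin.
# total_replicated is a hypothetical helper name.
total_replicated() {
    # Split each line on the marker text; the second field is the count.
    awk -F'Total replicated: ' 'NF > 1 { sum += $2 } END { print sum + 0 }'
}

# Example usage on the slave:
# tail -n 1000 /opt/atsd/hbase/logs/hbase-axibase-regionserver-atsd_slave.log | total_replicated
```

A result of 0 means no replication entries were found in the inspected part of the log.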

Option 2

MASTER: Only complete this process on the master machine.

Open the ATSD web interface and navigate to the Alert > Rules page.

Click Create and complete the following fields:

  • Name: testrule
  • Metric: testrule
  • Condition: true

Click Save.

Scan the atsd_rule table and note the number of rows in the table:

echo "scan 'atsd_rule'" | /opt/atsd/hbase/bin/hbase shell

Output:

SLAVE: Only complete this process on the slave machine.

Scan the atsd_rule table and note the number of rows in the table:

echo "scan 'atsd_rule'" | /opt/atsd/hbase/bin/hbase shell

The output contains the same number of rows as on the master.
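The row comparison can be scripted by extracting the count from the scan output on each machine. This sketch assumes the scan output ends with a summary line of the form "N row(s) in X seconds", the same format that list_peers prints earlier in this guide. The function name row_count is hypothetical.

```shell
# Sketch: print N from the first "N row(s) in ..." line on stdin.
# row_count is a hypothetical helper name.
row_count() {
    # The summary line has the count in field 1 and "row(s)" in field 2.
    awk '$2 == "row(s)" { print $1; exit }'
}

# Example usage, run on each machine and compare the two numbers:
# echo "scan 'atsd_rule'" | /opt/atsd/hbase/bin/hbase shell | row_count
```

If the function prints nothing, the summary line was not found and the assumed output format does not apply to the installed HBase version.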