centos7 database-replication mysql-5.7 ubuntu-22.04 master-master

MySQL Master-Master Replication Error: "Error Connecting to Master" Error 2003


Context:

I'm migrating a MySQL 5.7.42 database from an existing CentOS 7 server outside AWS to a new Ubuntu 22.04 EC2 instance in a private subnet. Due to the nature of the surrounding architecture, I need a temporary master-master sync.

The databases are identical (one is a restored Percona XtraBackup of the other), apart from their server-id and auto-increment offset. The only noticeable difference is the base OS.

Both servers communicate over a single outbound SSH connection, opened from the EC2 instance through a bastion, with both a forward and a reverse port mapping. The SSH config maps connections to local port 3307 on either server, through the tunnel, to port 3306 on the other server.

Replication with CentOS as primary and Ubuntu as secondary works.

Problem:

Replication with Ubuntu as primary and CentOS as secondary fails, however, and SHOW SLAVE STATUS\G on CentOS reports:

Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
error connecting to master 'replicatorEC2@127.0.0.1:3307' - retry-time: 60  retries: 1

There is no denied connection log on the Ubuntu machine, and the error log on CentOS is as follows:

[ERROR] Slave I/O for channel '': error connecting to master 'replicatorEC2@127.0.0.1:3307' - retry-time: 60  retries: 1, Error_code: 2003

This suggests that the CentOS replication process isn't hitting the Ubuntu server.
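When repeating this check after every change, it helps to pull just the two relevant fields out of the status output. Below is a minimal sketch, assuming the standard 5.7 column names Slave_IO_Running and Last_IO_Errno; the function name slave_io_summary is my own and not part of MySQL. Error 2003 is the client-side "Can't connect to MySQL server" code, so it is raised before any authentication happens.

```shell
#!/bin/bash
# slave_io_summary: read `SHOW SLAVE STATUS\G` text on stdin and print
# the I/O thread state plus the last I/O error number on one line.
# (Hypothetical helper name; only the column names come from MySQL.)
slave_io_summary() {
  awk -F': ' '
    { key = $1; gsub(/^[ \t]+/, "", key) }   # \G output is right-aligned, trim the padding
    key == "Slave_IO_Running" { io = $2 }
    key == "Last_IO_Errno"    { errno = $2 }
    END { printf "io=%s errno=%s\n", io, errno }
  '
}
```

Usage would look something like: mysql -u root -p -e 'SHOW SLAVE STATUS\G' | slave_io_summary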

How I got here tl;dr:

I've created a mock version of the architecture so I can blow stuff up without worrying.

I've created a mock database on CentOS and cloned it via Percona XtraBackup 2.4 to Ubuntu.

I've created a replication user on each database and granted them REPLICATION SLAVE.

I've then set up the CHANGE MASTER TO command on each machine to point at the other.

What does work:

Each server is accessible from the other at 127.0.0.1:3307 via the MySQL CLI over the ssh tunnel.

The replication users can log in and have the appropriate grant.

The master.info file is appropriately populated by CHANGE MASTER TO.
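For completeness, the tunnel can also be probed without the MySQL CLI. This is a rough sketch using bash's /dev/tcp redirection and the coreutils timeout command; port_open is a hypothetical helper name. Note that a probe like this (or the CLI test above) only shows that the tunnel is reachable from a login shell; it says nothing about whether the mysqld process itself is allowed to open the same connection.

```shell
#!/bin/bash
# port_open HOST PORT: succeed if a TCP connection can be opened
# from the current shell within 3 seconds.
# (Hypothetical helper; relies on bash's /dev/tcp and coreutils `timeout`.)
port_open() {
  local host="$1" port="$2"
  # Run the connect attempt in a child bash so the descriptor closes when it exits.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example: port_open 127.0.0.1 3307 && echo "tunnel reachable from this shell"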

Notes:

There is no meaningful diff in SHOW VARIABLES; output between the database servers

The my.cnf file is identical (except for server id and id autoincrement offset)

The master.info file is identical (except for the binlog details and, because CentOS has never managed to connect as a secondary, the missing server UUID on the CentOS side)

The replication users are identical except for the username.

Moved the troubleshooting section to the end, as it continues to grow.

How I got here in detail:

CentOS

Created a new CentOS 7 server using iptables as the firewall with the following rules:

sudo iptables -L

Chain INPUT (policy DROP)
target  prot source    destination
ACCEPT  all  anywhere  anywhere    state RELATED,ESTABLISHED
ACCEPT  all  anywhere  anywhere
LOG     tcp  anywhere  anywhere    tcp dpt:SSHPort recent: UPDATE seconds: 10 hit_count: 5 name: SSH side: source mask: 255.255.255.255 LOG level debug prefix "SSH Repeat Attempts"
DROP    tcp  anywhere  anywhere    tcp dpt:SSHPort recent: UPDATE seconds: 10 hit_count: 6 name: SSH side: source mask: 255.255.255.255
ACCEPT  tcp  anywhere  anywhere    tcp dpt:SSHPort recent: SET name: SSH side: source mask: 255.255.255.255
ACCEPT  icmp anywhere  anywhere    icmp echo-request
LOGGING all  anywhere  anywhere
ACCEPT  tcp  anywhere  anywhere    tcp dpt:opsession-prxy
ACCEPT  tcp  anywhere  anywhere    tcp dpt:mysql

Chain FORWARD (policy DROP)
target  prot source    destination

Chain OUTPUT (policy ACCEPT)
target  prot source    destination

Chain LOGGING (1 references)
target  prot source    destination
LOG     all  anywhere  anywhere    limit: avg 2/min burst 5 LOG level warning prefix "IPTables-Dropped: "
DROP    all  anywhere  anywhere

Installed MySQL 5.7.42:

# Install wget
sudo yum install wget

# Get the MySQL community release
sudo yum localinstall https://dev.mysql.com/get/mysql57-community-release-el7-11.noarch.rpm

# Fix gpg key mismatch
sudo rpm --import https://repo.mysql.com/RPM-GPG-KEY-mysql-2022

# Install mysql community server
sudo yum install mysql-community-server

Start service

sudo systemctl start mysqld

Get the default server password

sudo grep 'password' /var/log/mysqld.log

Setup MySQL

sudo mysql_secure_installation
# Enable password validation
# Disable anonymous users
# Disable remote root
# Remove test database
# Reload privilege table

Update MySQL config:

[mysqld]
bind_address =              *

datadir =                   /var/lib/mysql
socket =                    /var/lib/mysql/mysql.sock

server-id =                 # a unique id
auto_increment_increment =  2
auto_increment_offset =     1 # this is 2 on the Ubuntu server

log_bin =                   /var/lib/mysql/mysql-bin.log
binlog-ignore-db =          mysql

relay-log =                 /var/lib/mysql/relay-bin
relay-log-index =           /var/lib/mysql/relay-bin.index
relay-log-info-file =       /var/lib/mysql/relay-bin.info

log-error =                 /var/log/mysqld.log
pid-file =                  /var/run/mysqld/mysqld.pid

# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links =            0
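As a quick illustration of why the two offsets matter: with auto_increment_increment = 2, the server with offset 1 hands out odd ids and the server with offset 2 hands out even ids, so inserts on both sides never collide. The sketch below only mimics that arithmetic; ids_for is a hypothetical helper, not a MySQL tool.

```shell
#!/bin/bash
# ids_for OFFSET INCREMENT COUNT: print the first COUNT auto-increment
# values a server configured with that offset and increment would generate
# on an empty table. (Hypothetical helper, pure arithmetic only.)
ids_for() {
  local offset="$1" increment="$2" count="$3" i out=""
  for (( i = 0; i < count; i++ )); do
    out+="$(( offset + i * increment )) "
  done
  echo "${out% }"   # drop the trailing space
}

ids_for 1 2 3   # CentOS (offset 1): 1 3 5
ids_for 2 2 3   # Ubuntu (offset 2): 2 4 6
```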

Create mockDB:

mysql -u root -p

CREATE DATABASE mockDB;

USE mockDB;

CREATE TABLE customers (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50),
    email VARCHAR(100),
    age INT
);

INSERT INTO customers (name, email, age)
VALUES ('John Doe', 'john.doe@example.com', 30);

Create a user to handle the replication process

CREATE USER 'replicatorCentOS'@'localhost' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'replicatorCentOS'@'localhost';

FLUSH PRIVILEGES;

Record binlog position

SHOW MASTER STATUS;

Example SHOW MASTER STATUS; response

mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000002 |     2142 |              | mysql            |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

Install Percona XtraBackup 2.4

sudo yum install https://repo.percona.com/yum/percona-release-latest.noarch.rpm
sudo yum install percona-xtrabackup-24

Create a hot backup with Percona XtraBackup

sudo xtrabackup --backup --no-timestamp --user=root --password='password' --target-dir=/home/user/backup

Permissions for backup files

sudo chown -R user:group ~/backup
sudo chmod -R u+rw ~/backup

Compress the backup

tar -cpzvf backup.tar.gz -C ~/backup .

Install AWS CLI

sudo yum install awscli

Configure AWS CLI

aws configure

Upload the compressed backup to S3

aws s3 cp backup.tar.gz s3://s3Address/backup.tar.gz

Ubuntu

AWS infrastructure:

SSH config

Host centOSReverse
  HostName centOSPublicIP
  Port SSH port
  User user
  IdentityFile ~/.ssh/sshKey
  ProxyJump bastion
  LocalForward 3307 localhost:3306
  RemoteForward 3307 localhost:3306

Host bastion
  HostName bastionPrivateIP
  Port SSH port
  User user
  IdentityFile ~/.ssh/sshKey

Host *
  ServerAliveInterval 60
  ServerAliveCountMax 3

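Open the tunnel from Ubuntu to CentOS (forward and reverse port maps), using either plain ssh or autossh. autossh's -M option needs a port argument; 0 disables the monitor port and relies on the ServerAlive settings above.

ssh -f -N centOSReverse
autossh -M 0 -f -N centOSReverse

Close the connections like this:

pkill -f "ssh -f -N centOSReverse"
pkill -f "autossh -M 0 -f -N centOSReverse"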

Install MySQL 5.7.42

wget https://dev.mysql.com/get/mysql-apt-config_0.8.17-1_all.deb
sudo dpkg -i mysql-apt-config_0.8.17-1_all.deb
# Select Bionic and hit OK
# Select the first option
# Select MySQL 5.7
# Select OK

# Add the MySQL APT repository key
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 467B942D3A79BD29

sudo apt-get update

# Install MySQL 5.7
sudo apt install -f mysql-client=5.7* mysql-community-server=5.7* mysql-server=5.7*

Install Percona XtraBackup 2.4

wget https://repo.percona.com/apt/percona-release_latest.$(lsb_release -sc)_all.deb
sudo dpkg -i percona-release_latest.$(lsb_release -sc)_all.deb
sudo apt-get update
sudo apt-get install percona-xtrabackup-24

Install/configure AWS CLI

sudo apt-get install awscli
aws configure

Download backup

aws s3 cp s3://S3Address/backup.tar.gz ~/backup.tar.gz

Uncompress backup

mkdir -p ~/backup
tar -xpzvf ~/backup.tar.gz -C ~/backup

Prepare backup

xtrabackup --prepare --target-dir=/home/user/backup

Stop MySQL service

sudo service mysql stop

Clear MySQL datadir

sudo rm -rf /var/lib/mysql/
sudo mkdir /var/lib/mysql
sudo chown -R mysql:mysql /var/lib/mysql

Restore backup

sudo xtrabackup --copy-back --target-dir=/home/user/backup

Adjust permissions

sudo chown -R mysql:mysql /var/lib/mysql
sudo find /var/lib/mysql -type d -exec chmod 700 {} \;
sudo find /var/lib/mysql -type f -exec chmod 600 {} \;

Start MySQL service

sudo service mysql start

Create a user to handle the replication process

CREATE USER 'replicatorUbuntu'@'localhost' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'replicatorUbuntu'@'localhost';

FLUSH PRIVILEGES;

Record binlog position

SHOW MASTER STATUS;

Example SHOW MASTER STATUS; response

mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      953 |              | mysql            |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

Both servers

Run change master to script

Script

#!/bin/bash

read -p "Enter the replication master username: " username
read -s -p "Enter the replication master password: " password
echo ""
read -p "Enter the replication host: " host
read -p "Enter the replication port: " port
read -p "Enter the log file name: " logFile
read -p "Enter the log position: " logPosition
echo "MySQL root password below: "
mysql -u root -p -e "
STOP SLAVE;
CHANGE MASTER TO MASTER_HOST='$host', MASTER_PORT=$port, MASTER_USER='$username', MASTER_PASSWORD='$password', MASTER_LOG_FILE='$logFile', MASTER_LOG_POS=$logPosition;
START SLAVE;"
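A slightly more defensive variant of the same idea is sketched below: it builds the SQL in a function, refuses a non-numeric port or log position, and prints the statement so it can be reviewed before being run. The function name build_change_master_sql is hypothetical, and the sketch assumes none of the values contain a single quote (they are not escaped).

```shell
#!/bin/bash
# build_change_master_sql HOST PORT USER PASSWORD LOG_FILE LOG_POS
# Prints the STOP SLAVE / CHANGE MASTER TO / START SLAVE statements.
# (Hypothetical helper; values are NOT escaped, so no single quotes in them.)
build_change_master_sql() {
  local host="$1" port="$2" user="$3" password="$4" log_file="$5" log_pos="$6"
  # MASTER_PORT and MASTER_LOG_POS are unquoted in the SQL, so insist on digits.
  if ! [[ "$port" =~ ^[0-9]+$ && "$log_pos" =~ ^[0-9]+$ ]]; then
    echo "port and log position must be numeric" >&2
    return 1
  fi
  printf "STOP SLAVE;\n"
  printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=%s, MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
    "$host" "$port" "$user" "$password" "$log_file" "$log_pos"
  printf "START SLAVE;\n"
}
```

It could then be run the same way as the script above, for example: mysql -u root -p -e "$(build_change_master_sql 127.0.0.1 3307 replicatorUbuntu 'password' mysql-bin.000001 953)"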

Make executable

chmod +x mysql_replication.sh

Run

Example responses CentOS

./mysql_replication.sh
username: replicatorUbuntu
replicatorUbuntuPassword
host: 127.0.0.1
port: 3307
logFile: mysql-bin.000001
LogPosition: 953
MySQLRootPassword

Ubuntu

./mysql_replication.sh
username: replicatorCentOS
replicatorCentOSPassword
host: 127.0.0.1
port: 3307
logFile: mysql-bin.000002
LogPosition: 2142
MySQLRootPassword

And voila, you will have half of the intended replication working for some reason.

My current theory:

All I know is that the problem starts at the CentOS MySQL server and ends before the connection leaves the CentOS machine.

Troubleshooting:

I've blown away and rebuilt these servers a few times now. Some things are too permissive in my desperation to fix this; I'll tighten them up once it's working.

Enabled general and verbose logging via MySQL config

log_error_verbosity =       3
general_log =               ON
general_log_file =          /var/lib/mysql/mysql-bin.log

MySQL documentation: Many posts suggest reading a MySQL docs page on Access Denied errors that is now dead. I believe these to be its analogues for 5.7:

Going to run debugging soon, per the docs pages "Debugging and creating trace files" and "Debugging options".

Research: Remove bind-address from config

Many error 2003 questions are resolved with firewall changes:

Disable iptables firewall

sudo systemctl stop iptables
sudo systemctl status iptables

Reset replication process:

mysql -u root -p
STOP SLAVE;
START SLAVE;

SHOW SLAVE STATUS\G

Slave status error

Last_IO_Error: error connecting to master 'replicatorUbuntu@127.0.0.1:3307' - retry-time: 60  retries: 1

Check error log

sudo tail -n 100 /var/log/mysqld.log

Logged error

[ERROR] Slave I/O for channel '': error connecting to master 'replicatorUbuntu@127.0.0.1:3307' - retry-time: 60  retries: 1, Error_code: 2003

Restart iptables service

sudo systemctl start iptables
sudo systemctl status iptables

Verify SSH connection: I tried a plain outbound SSH connection from each server, in case the reverse port map was the problem, but I got the same result:

Test Ubuntu SSH config:

Host CentOS
  HostName CentOSPublicIP
  Port SSHPort
  User User
  IdentityFile ~/.ssh/SSHKey
  ProxyJump bastion
  LocalForward 3307 localhost:3306

Host bastion
  HostName BastionLocalIP
  Port SSHPort
  User User
  IdentityFile ~/.ssh/SSHKey

Host *
  ServerAliveInterval 60
  ServerAliveCountMax 3

Test CentOS SSH config:

Host Ubuntu
  HostName UbuntuLocalIP
  Port SSHPort
  User User
  IdentityFile ~/.ssh/SSHKey
  ProxyJump bastion
  LocalForward 3307 localhost:3306

Host bastion
  HostName BastionPublicIP
  Port SSHPort
  User User
  IdentityFile ~/.ssh/SSHKey

Host *
  ServerAliveInterval 60
  ServerAliveCountMax 3

SELinux on the CentOS server: Check whether SELinux is enabled and in its default 'enforcing' mode

sestatus

Check existing port rules

sudo semanage port -l

According to the linked thread, this should tell SELinux to allow mysqld to connect out through port 3307:

semanage port -a -t mysqld_port_t -p tcp 3307

THAT WAS IT!


Solution

  • Adding port 3307 to SELinux solved it:

    semanage port -a -t mysqld_port_t -p tcp 3307
    

    My problem was a duplicate of this: https://serverfault.com/questions/461704/mysql-error-2003-over-ssh-tunnel-when-replicating-not-when-using-client

    This was my first time setting up a CentOS server, so I didn't know SELinux would be enabled and blocking any port without an explicit rule.
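    For anyone hitting the same thing, here is a small sketch for checking whether a port already carries the mysqld_port_t label before adding it. As I understand it, mysqld runs in a confined SELinux domain and may only connect out to ports with that label, while the mysql CLI started from a shell is not confined the same way, which would explain why the client worked and replication didn't. The function name port_labelled_mysqld is hypothetical, and the parsing assumes the usual "type proto port, port, lo-hi" layout of semanage port -l.

```shell
#!/bin/bash
# port_labelled_mysqld PORT: read `semanage port -l` output on stdin and
# succeed if PORT is covered by a tcp mysqld_port_t entry.
# (Hypothetical helper; assumes the usual three-column semanage layout.)
port_labelled_mysqld() {
  local port="$1"
  awk -v want="$port" '
    $1 == "mysqld_port_t" && $2 == "tcp" {
      for (i = 3; i <= NF; i++) {
        entry = $i
        sub(/,$/, "", entry)              # strip the comma between entries
        n = split(entry, range, "-")      # "63132-63164" is an inclusive range
        lo = range[1] + 0
        hi = (n == 2 ? range[2] : range[1]) + 0
        if (want + 0 >= lo && want + 0 <= hi) found = 1
      }
    }
    END { exit(found ? 0 : 1) }
  '
}
```

    Usage: sudo semanage port -l | port_labelled_mysqld 3307 || echo "3307 is not labelled mysqld_port_t". Recent denials should also show up in the audit log (sudo ausearch -m avc -ts recent), which would have been a quicker way to spot this.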