postgresqlamazon-web-servicesrdspgpool

Can't use PGPool with Amazon RDS Postgres


I have a Postgres 9.4 RDS instance with Multi-AZ, and there's a slave, read-only replica.

Up to this point the load balancing was made in the business layer of my app, but it's inefficient, and I was hoping to use PGPool, so the app interacts with a single Postgres connection.

It turns out that using PGPool has been a pain in the ass. If I set it to act as a load balancer, simple SELECT queries throw errors like:

 SQLSTATE[HY000]: General error: 7 
 message contents do not agree with length in message type "N" 
 server sent data ("D" message) 
 without prior row description ("T" message)

If I set it to act in a master/slave mode with stream replication (as suggested in Postgres mail list) I get:

psql: ERROR:  MD5 authentication is unsupported 
in replication and master-slave modes.
HINT:  check pg_hba.conf

Yeah, well, pg_hba.conf if off hands in RDS so I can't alter it.

Has anyone got PGPool to work in RDS? Are there other tools that can act as middleware to take advantage of reading replicas in RDS?


Solution

  • I was able to make it work here are my working config files:

    You have to use md5 authentication, and sync the username/password from your database to the pool_passwd file. Also need enable_pool_hba, load_balance_mode, and master_slave_mode on.

    pgpool.conf

    listen_addresses = '*'
    port = 9999
    pcp_listen_addresses = '*'
    pcp_port = 9898
    pcp_socket_dir = '/tmp'
    listen_backlog_multiplier = 1
    backend_hostname0 = 'master-rds-database-with-multi-AZ.us-west-2.rds.amazonaws.com'
    backend_port0 = 5432
    backend_weight0 = 0
    backend_flag0 = 'ALWAYS_MASTER'
    backend_hostname1 = 'readonly-replica.us-west-2.rds.amazonaws.com'
    backend_port1 = 5432
    backend_weight1 = 999
    backend_flag1 = 'ALWAYS_MASTER'
    enable_pool_hba = on
    pool_passwd = 'pool_passwd'
    ssl = on
    num_init_children = 1
    max_pool = 2
    connection_cache = off
    replication_mode = off
    load_balance_mode = on
    master_slave_mode = on
    

    pool_hba.conf

    local   all         all                               md5
    host    all         all         127.0.0.1/32          md5
    

    pool_passwd

    username:md5d51c9a7e9353746a6020f9602d452929
    

    to update pool_password you can use pg_md5 or

    echo username:md5`echo -n usernamepassword | md5sum`
    username:md5d51c9a7e9353746a6020f9602d452929 -
    

    Output of running example:

    psql --dbname=database --host=localhost --username=username --port=9999
    
    database=> SHOW POOL_NODES;
    
     node_id |                        hostname                 | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
    ---------+-------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------
     0       | master-rds-database.us-west-2.rds.amazonaws.com | 8193 | up     | 0.000000  | primary | 0          | false             | 0
     1       | readonly-replica.us-west-2.rds.amazonaws.com    | 8193 | up     | 1.000000  | standby | 0          | true              | 0
    
    database=> select now();
    
     node_id |                        hostname                 | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
    ---------+-------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------
     0       | master-rds-database.us-west-2.rds.amazonaws.com | 8193 | up     | 0.000000  | primary | 0          | false             | 0
     1       | readonly-replica.us-west-2.rds.amazonaws.com    | 8193 | up     | 1.000000  | standby | 1          | true              | 1
    
    database=> CREATE TABLE IF NOT EXISTS tmp_test_read_write ( data varchar(40) );
    CREATE TABLE
    database=> INSERT INTO tmp_test_read_write (data) VALUES (concat('',inet_server_addr()));
    INSERT 0 1
    database=> select data as master_ip,inet_server_addr() as replica_ip from tmp_test_read_write;
      master_ip   |  replica_ip
    --------------+---------------
     172.31.37.69 | 172.31.20.121
    (1 row)
    

    You can also see from the logs id does both databases:

    2018-10-16 07:56:37: pid 124528: LOG:  DB node id: 0 backend pid: 21731 statement: CREATE TABLE IF NOT EXISTS tmp_test_read_write ( data varchar(40) );
    2018-10-16 07:56:47: pid 124528: LOG:  DB node id: 0 backend pid: 21731 statement: INSERT INTO tmp_test_read_write (data) VALUES (concat('',inet_server_addr()));
    2018-10-16 07:56:52: pid 124528: LOG:  DB node id: 1 backend pid: 24890 statement: select data as master_ip,inet_server_addr() as replica_ip from tmp_test_read_write;
    

    Notice the insert used ip_address of master, and the next select used ip_address of the read only replica.

    I can update after more testing, but psql client testing looks promising.