postgresql repmgr

repmgr pg_rewind failed with could not fetch pg_control remote file


I'm testing repmgr to set up automatic failover and replication for a PostgreSQL cluster.

So far I have successfully set up a 3-node cluster consisting of one primary and two standbys. Failover works: when I deliberately take the primary node down, one of the standby nodes is automatically promoted to primary.

The problem starts when I try to bring the failed primary back into the cluster as a standby node...

If I run "repmgr node rejoin" while the postgres service is up, I get these messages:

ERROR: database is still running in state "in production"
HINT: "repmgr node rejoin" cannot be executed on a running node

Fair enough, but if I run "repmgr node rejoin" after stopping the postgres service, I get this instead:

ERROR: pg_rewind execution failed
DETAIL: pg_rewind: error: could not fetch remote file "global/pg_control": ERROR:  permission denied for function pg_read_binary_file

I am using PostgreSQL 15 and repmgr 5.3.3.

Kindly help me on this. Thanks~


Solution

  • The pg_rewind documentation says:

    The connection must be a normal (non-replication) connection with a role having sufficient permissions to execute the functions used by pg_rewind on the source server (see Notes section for details) or a superuser role.

    The user you are using for replication doesn't seem to meet these criteria.

    I am not a repmgr expert, but I'd say that you should have a superuser in conninfo and your regular replication user in replication_user.
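
If you would rather not use a superuser, the Notes section of the pg_rewind documentation lists the functions a non-superuser role needs on the source server. A minimal sketch, assuming the role that repmgr connects with is named repmgr (that name is an assumption, substitute your own) and that the statements are run as a superuser on the current primary, in the database named in conninfo:

```sql
-- Assumption: "repmgr" is the role used in conninfo; replace it with your role.
-- Run as a superuser on the source server (the current primary).
-- The function list is taken from the Notes section of the pg_rewind documentation.
GRANT EXECUTE ON FUNCTION pg_catalog.pg_ls_dir(text, boolean, boolean) TO repmgr;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_stat_file(text, boolean) TO repmgr;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_read_binary_file(text) TO repmgr;
GRANT EXECUTE ON FUNCTION pg_catalog.pg_read_binary_file(text, bigint, bigint, boolean) TO repmgr;
```

The error in the question names pg_read_binary_file, which is one of these four functions, so granting them should address that specific message. I have not tested this with repmgr 5.3.3, so treat it as a starting point rather than a confirmed fix.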