33533/7D2841D8
even though the received LSN keeps advancing. There is a logical worker that runs on the subscriber because I see it in pg_stat_activity
however I'm not sure what it's doing.I want to pull the plug on whatever this transaction is since it's been taking days. I found the pg_replication_origin_advance
function which will allow me to do that, however I don't know what the "next" LSN value. Is there a way I can look that up?
I figured out what my problem was: triggers on the subscriber.
I saw in pg_locks
that a table was constantly being hit. There was an exclusive lock on the table, and no other query wanted to use it which was good. It was just busy all the time.
Somehow I found pg_stat_user_functions
which says how much time a function (like a trigger function) executes for. There were a pair of functions that had an extremely large amount of time running and it kept rising.
I did an alter table foo disable trigger x
, and once that was done, replication immediately started flowing. My data was now out-of-date because the trigger wasn't running, but that's fine as long as my publisher doesn't crash due to running out of space. After replication was caught up, I re-enabled the triggers, and things have been running okay.