postgresql, haproxy

Is send-binary a must when connecting postgresql to haproxy?


I tried to connect to PostgreSQL through HAProxy, but PostgreSQL only logs the error 'invalid length of startup packet'. Does the HAProxy -> PostgreSQL health check have to be built by hand-calculating packets with tcp-check send-binary rather than tcp-check send?

haproxy.cfg

#add
global
    log 127.0.0.1 local0
    maxconn 1000
    user root
    group root
    daemon
    stats socket /var/run/haproxy.sock mode 660 level admin
    stats timeout 2m

defaults
    log global
    mode tcp
    balance roundrobin
    retries 3
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen pgReadWrite
    mode tcp
    bind *:5430
    balance roundrobin

    option tcp-check
    tcp-check connect
    tcp-check send info\ replication\r\n
    tcp-check expect string role:master
    #tcp-check send "select 1";

    # server list to check
    server db0 192.168.10.190:5432 check port 5432

But the PostgreSQL log shows:

[2024-01-10 10:47:05.183 KST][192.168.10.150(59524)][[unknown]][[unknown]][[unknown]][2521][00000]LOG:  connection received: host=192.168.10.150 port=59524
[2024-01-10 10:47:05.183 KST][192.168.10.150(59524)][[unknown]][[unknown]][[unknown]][2521][08P01]LOG:  invalid length of startup packet
[2024-01-10 10:47:07.185 KST][192.168.10.150(59530)][[unknown]][[unknown]][[unknown]][2522][00000]LOG:  connection received: host=192.168.10.150 port=59530
[2024-01-10 10:47:07.185 KST][192.168.10.150(59530)][[unknown]][[unknown]][[unknown]][2522][08P01]LOG:  invalid length of startup packet
[2024-01-10 10:47:09.190 KST][192.168.10.150(59546)][[unknown]][[unknown]][[unknown]][2523][00000]LOG:  connection received: host=192.168.10.150 port=59546
[2024-01-10 10:47:09.190 KST][192.168.10.150(59546)][[unknown]][[unknown]][[unknown]][2523][08P01]LOG:  invalid length of startup packet
[2024-01-10 10:47:11.192 KST][192.168.10.150(34472)][[unknown]][[unknown]][[unknown]][2524][00000]LOG:  connection received: host=192.168.10.150 port=34472
[2024-01-10 10:47:11.192 KST][192.168.10.150(34472)][[unknown]][[unknown]][[unknown]][2524][08P01]LOG:  invalid length of startup packet
[2024-01-10 10:47:13.195 KST][192.168.10.150(34488)][[unknown]][[unknown]][[unknown]][2525][00000]LOG:  connection received: host=192.168.10.150 port=34488
[2024-01-10 10:47:13.195 KST][192.168.10.150(34488)][[unknown]][[unknown]][[unknown]][2525][08P01]LOG:  invalid length of startup packet
[2024-01-10 10:47:15.200 KST][192.168.10.150(34504)][[unknown]][[unknown]][[unknown]][2527][00000]LOG:  connection received: host=192.168.10.150 port=34504
[2024-01-10 10:47:15.200 KST][192.168.10.150(34504)][[unknown]][[unknown]][[unknown]][2527][08P01]LOG:  invalid length of startup packet

Solution

  • No, this won't work as-is. Or rather, it can be made to work, but it is too hard and error-prone.

    What does tcp-check do? It sends the given bytes over TCP and expects some response. But how does network communication with Postgres work? Postgres uses a custom wire protocol. From the docs:

    The first byte of a message identifies the message type, and the next four bytes specify the length of the rest of the message

    You can read through the entire docs to see how messages to Postgres are actually constructed. So Postgres expects a specific sequence of bytes on the wire, not just a trivial select 1. The actual query will be embedded somewhere in a message, but the message itself is much more than the query text.
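As a rough illustration of that framing, here is a Python sketch (standard library only, not a complete client) that builds the two messages such a check would need: the StartupMessage, which exceptionally has no type byte, and a simple Query message:

```python
import struct

def startup_message(user: str, database: str) -> bytes:
    """Build a v3 StartupMessage: no type byte, just a length-prefixed body."""
    body = struct.pack("!i", 196608)  # protocol version 3.0 (0x00030000)
    for key, value in (("user", user), ("database", database)):
        body += key.encode() + b"\x00" + value.encode() + b"\x00"
    body += b"\x00"  # terminator
    # the 4-byte length field counts itself, hence len(body) + 4
    return struct.pack("!i", len(body) + 4) + body

def simple_query(sql: str) -> bytes:
    """Build a Query ('Q') message: type byte, then length-prefixed body."""
    body = sql.encode() + b"\x00"
    return b"Q" + struct.pack("!i", len(body) + 4) + body

pkt = startup_message("pgc", "template1")
print(len(pkt), pkt[:4].hex())  # prints: 37 00000025
```

Note that the four-byte length includes the length field itself. That is also why sending plain text with tcp-check send produces 'invalid length of startup packet': Postgres interprets the first four bytes of whatever arrives as the length of a startup packet, and arbitrary text decodes to a nonsensical value.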

    Then you have to look at the message flow, because it is not as simple as "send an encoded select 1 and validate the response". Communication starts with a startup message, followed by authentication, and so on. All of that is already far too difficult to implement manually, and you would have to, because this HAProxy feature only accepts raw TCP bytes. And if you add encryption on top (which should be mandatory anyway), it becomes impossible due to its dynamic nature, unless HAProxy supports decryption in this feature, which I'm not sure of. Still, with or without encryption, it is already too hard.

    As mentioned in the comments, some people actually do craft such packets manually in their HAProxy config. Here's a relevant piece of such a config:

    backend backend_pg
        option tcp-check
        tcp-check connect
    
    # user: pgc
    # database: template1
    #
        tcp-check send-binary 00000025                 # packet length
        tcp-check send-binary 00030000                 # protocol version
        tcp-check send-binary 7573657200               # "user"            ( 5 bytes )
        tcp-check send-binary 70676300                 # "pgc"             ( 4 bytes )
        tcp-check send-binary 646174616261736500       # "database"        ( 9 bytes )
        tcp-check send-binary 74656d706c6174653100     # "template1"       ( 10 bytes )
        tcp-check send-binary 00                       # terminator
    
    # expect: Auth
    #
        tcp-check expect binary 52                     # Auth request
        tcp-check expect binary 00000008               # packet length     ( 8 bytes )
        tcp-check expect binary 00000000               # auth response ok
    
    # write: run simple query
    # "select pg_is_in_recovery();"
    #
        tcp-check send-binary 51                       # simple query
        tcp-check send-binary 00000020                 # packet length: 32 (0x20)
        tcp-check send-binary 73656c65637420           # "select "         ( 7 bytes )
        # "pg_is_in_recovery();"
        tcp-check send-binary 70675f69735f696e5f7265636f7665727928293b    #   ( 20 bytes )
        tcp-check send-binary 00 # terminator                                 ( 1 byte )
    
    
        # write: terminate session
        tcp-check send-binary 58                       # Termination packet
        tcp-check send-binary 00000004                 # packet length: 4 (no body)
        # avoids :  <template1-pgc-2019-01-18 11:23:06 CET>LOG:  could not receive data from client: Connection reset by peer
    
    # expect: Row description packet
    #
        tcp-check expect binary 54                         # row description packet (1 byte)
        tcp-check expect binary 0000002a               # packet length: 42 (0x2a)
        tcp-check expect binary 0001                   # field count: 1
        tcp-check expect binary 70675f69735f696e5f7265636f7665727900 # field name: pg_is_in_recovery
        tcp-check expect binary 00000000               # table oid: 0
        tcp-check expect binary 0000                   # column index: 0
        tcp-check expect binary 00000010               # type oid: 16
        tcp-check expect binary 0001                   # type size: 1
        tcp-check expect binary ffffffff               # type modifier: -1
        tcp-check expect binary 0000                   # format: text
    
    # expect: query result data
    #
    # "f" means node in master mode
    # "t" means node in standby mode (read-only)
    #
        tcp-check expect binary 44                     # data row packet
        tcp-check expect binary 0000000b               # packet length: 11 (0x0b)
        tcp-check expect binary 0001                   # field count: 1
        tcp-check expect binary 00000001               # column length in bytes: 1
        tcp-check expect binary 66                     # column data, "f"
    
    # write: terminate session
        tcp-check send-binary 58                       # Termination packet
        tcp-check send-binary 00000004                 # packet length: 4 (no body)
    
    # close open sessions in case the downed server is still running but is out of sync with the master
        default-server on-marked-down shutdown-sessions
    
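If you do go down this road, it helps to verify the hand-written hex mechanically rather than by eye. A small Python sketch that decodes the fragments from the config above and re-derives the startup packet length:

```python
# Decode the hex fragments used in the tcp-check lines and confirm
# they spell out the intended strings (with NUL terminators).
fragments = {
    "7573657200": b"user\x00",
    "70676300": b"pgc\x00",
    "646174616261736500": b"database\x00",
    "74656d706c6174653100": b"template1\x00",
    "73656c65637420": b"select ",
    "70675f69735f696e5f7265636f7665727928293b": b"pg_is_in_recovery();",
}
for hexstr, expected in fragments.items():
    assert bytes.fromhex(hexstr) == expected

# Startup packet length: 4 (length field) + 4 (protocol version)
# + "user\0" (5) + "pgc\0" (4) + "database\0" (9) + "template1\0" (10)
# + 1 (terminator)
startup_len = 4 + 4 + 5 + 4 + 9 + 10 + 1
print(hex(startup_len))  # prints: 0x25, matching the first send-binary line
```

One mistyped nibble or a miscounted length silently breaks the check, so re-deriving every length like this before deploying is the least you should do.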

    And you would have to adapt all of this to your case. Quoting Colonel Kurtz: "The horror... the horror..." Besides, the Postgres wire protocol is not guaranteed to stay stable: a Postgres upgrade may introduce backwards-incompatible changes, and this has actually happened multiple times.

    My advice is to use a dedicated Postgres client instead. A Postgres client does all of that for you, and it is maintained by people who actually work on Postgres and know what they are doing. So: do not use this feature (it is hardly ever useful anyway), and if you want health checks, create a cron job (or equivalent) that calls the database through a Postgres client.

    That being said, if you want a load balancer in front of Postgres, why not use something dedicated to that job, like PgBouncer or Pgpool-II?