postgresql tcp proxy traefik sni

Is it possible to use Traefik to proxy PostgreSQL over SSL?


Motivations

I am running into an issue when trying to proxy PostgreSQL with Traefik over SSL using Let's Encrypt. I did some research, but this is not well documented, so I would like to confirm my observations and leave a record for everyone who faces this situation.

Configuration

I use the latest versions of PostgreSQL v12 and Traefik v2. I want to build a pure TCP flow from tcp://example.com:5432 -> tcp://postgresql:5432 over TLS using Let's Encrypt.

The Traefik service is configured as follows:

version: "3.6"

services:

  traefik:
    image: traefik:latest
    restart: unless-stopped
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "./configuration/traefik.toml:/etc/traefik/traefik.toml:ro"
      - "./configuration/dynamic_conf.toml:/etc/traefik/dynamic_conf.toml"
      - "./letsencrypt/acme.json:/acme.json"
    networks:
      - backend
    ports:
      - "80:80"
      - "443:443"
      - "5432:5432"

networks:
  backend:
    external: true

With the static setup:


[entryPoints]
  [entryPoints.web]
    address = ":80"
    [entryPoints.web.http]
      [entryPoints.web.http.redirections.entryPoint]
        to = "websecure"
        scheme = "https"

  [entryPoints.websecure]
    address = ":443"
    [entryPoints.websecure.http]
      [entryPoints.websecure.http.tls]
        certresolver = "lets"

  [entryPoints.postgres]
    address = ":5432"

The PostgreSQL service is configured as follows:

version: "3.6"

services:

  postgresql:
    image: postgres:latest
    environment:
      - POSTGRES_PASSWORD=secret
    volumes:
      - ./configuration/trial_config.conf:/etc/postgresql/postgresql.conf:ro
      - ./configuration/trial_hba.conf:/etc/postgresql/pg_hba.conf:ro
      - ./configuration/initdb:/docker-entrypoint-initdb.d
      - postgresql-data:/var/lib/postgresql/data
    networks:
      - backend
    #ports:
    #  - 5432:5432
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=backend"
      - "traefik.tcp.routers.postgres.entrypoints=postgres"
      - "traefik.tcp.routers.postgres.rule=HostSNI(`example.com`)"
      - "traefik.tcp.routers.postgres.tls=true"
      - "traefik.tcp.routers.postgres.tls.certresolver=lets"
      - "traefik.tcp.services.postgres.loadBalancer.server.port=5432"

networks:
  backend:
    external: true

volumes:
  postgresql-data:

My Traefik configuration seems correct: nothing is wrong in the logs and all sections in the dashboard are flagged as Success (no Warnings, no Errors), so I am confident in the Traefik configuration above. The complete flow is roughly:

EntryPoint(':5432') -> HostSNI(`example.com`) -> TcpRouter(`postgres`) -> Service(`postgres@docker`)

But there may be a limitation on the PostgreSQL side.

Debug

The problem is that I cannot connect to the PostgreSQL database. I always get a timeout error.

I have checked that PostgreSQL is listening properly (a wrong listen address being the usual cause of a timeout error):

# - Connection Settings -
listen_addresses = '*'
port = 5432

And I checked that I can connect to PostgreSQL from the host (outside the container):

psql --host 172.19.0.4 -U postgres
Password for user postgres:
psql (12.2 (Ubuntu 12.2-4), server 12.3 (Debian 12.3-1.pgdg100+1))
Type "help" for help.

postgres=#

Thus I know PostgreSQL is reachable from outside its container, so Traefik should be able to forward the flow to it. I have also checked that external traffic can reach the server:

sudo tcpdump -i ens3 port 5432
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
09:02:37.878614 IP x.y-z-w.isp.com.61229 > example.com.postgresql: Flags [S], seq 1027429527, win 64240, options [mss 1452,nop,wscale 8,nop,nop,sackOK], length 0
09:02:37.879858 IP example.com.postgresql > x.y-z-w.isp.com.61229: Flags [S.], seq 3545496818, ack 1027429528, win 64240, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
09:02:37.922591 IP x.y-z-w.isp.com.61229 > example.com.postgresql: Flags [.], ack 1, win 516, length 0
09:02:37.922718 IP x.y-z-w.isp.com.61229 > example.com.postgresql: Flags [P.], seq 1:9, ack 1, win 516, length 8
09:02:37.922750 IP example.com.postgresql > x.y-z-w.isp.com.61229: Flags [.], ack 9, win 502, length 0
09:02:47.908808 IP x.y-z-w.isp.com.61229 > example.com.postgresql: Flags [F.], seq 9, ack 1, win 516, length 0
09:02:47.909578 IP example.com.postgresql > x.y-z-w.isp.com.61229: Flags [P.], seq 1:104, ack 10, win 502, length 103
09:02:47.909754 IP example.com.postgresql > x.y-z-w.isp.com.61229: Flags [F.], seq 104, ack 10, win 502, length 0
09:02:47.961826 IP x.y-z-w.isp.com.61229 > example.com.postgresql: Flags [R.], seq 10, ack 104, win 0, length 0

So, I am wondering why the connection cannot succeed. Something must be wrong between Traefik and PostgreSQL.

SNI incompatibility?

Even when I remove the TLS configuration, the problem is still there, so I don't expect TLS to be the origin of this problem.

Then I searched and found a few posts describing a similar issue:

As far as I understand it, the SSL protocol of PostgreSQL is a custom one that does not support SNI for now and might never support it. If this is correct, it confirms that Traefik cannot proxy PostgreSQL for now and that this is a limitation.

By writing this post I would like to confirm my observations and at the same time leave a visible record on Stack Overflow for anyone who faces the same problem and is looking for help. My question is then: Is it possible to use Traefik to proxy PostgreSQL?
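For context on why the SNI never arrives: a PostgreSQL client does not open the connection with a TLS ClientHello. It first sends a plain 8-byte SSLRequest message and waits for a one-byte answer before starting the TLS handshake. The 8-byte payload in the tcpdump capture above is consistent with that message. The following is a minimal Python sketch of that first packet, based on the PostgreSQL frontend/backend protocol; `classify_first_bytes` is a hypothetical helper written for illustration, not something Traefik or PostgreSQL exposes.

```python
import struct

# SSLRequest request code from the PostgreSQL protocol: (1234 << 16) | 5679
SSL_REQUEST_CODE = 80877103


def build_ssl_request() -> bytes:
    """Return the 8-byte SSLRequest: Int32 length (8) followed by Int32 request code."""
    return struct.pack("!II", 8, SSL_REQUEST_CODE)


def classify_first_bytes(data: bytes) -> str:
    """Hypothetical helper: guess which protocol a client opened the connection with."""
    if data[:8] == build_ssl_request():
        return "postgres-sslrequest"
    # A TLS record carrying a handshake starts with content type 0x16, major version 0x03.
    if len(data) >= 2 and data[0] == 0x16 and data[1] == 0x03:
        return "tls-clienthello"
    return "unknown"


if __name__ == "__main__":
    print(build_ssl_request().hex())
    print(classify_first_bytes(build_ssl_request()))
```

A proxy that waits for a TLS ClientHello on this connection sees the SSLRequest instead, while the client waits for a reply that never comes, which would explain a timeout rather than an immediate error.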

Update

Interesting observation when using HostSNI(`*`) and Let's Encrypt:

    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=backend"
      - "traefik.tcp.routers.postgres.entrypoints=postgres"
      - "traefik.tcp.routers.postgres.rule=HostSNI(`*`)"
      - "traefik.tcp.routers.postgres.tls=true"
      - "traefik.tcp.routers.postgres.tls.certresolver=lets"
      - "traefik.tcp.services.postgres.loadBalancer.server.port=5432"

Everything is flagged as Success in the dashboard, but of course Let's Encrypt cannot perform the DNS challenge for the wildcard *, and it complains in the logs:

time="2020-08-12T10:25:22Z" level=error msg="Unable to obtain ACME certificate for domains \"*\": unable to generate a wildcard certificate in ACME provider for domain \"*\" : ACME needs a DNSChallenge" providerName=lets.acme routerName=postgres@docker rule="HostSNI(`*`)"

When I try the following configuration:

    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=backend"
      - "traefik.tcp.routers.postgres.entrypoints=postgres"
      - "traefik.tcp.routers.postgres.rule=HostSNI(`*`)"
      - "traefik.tcp.routers.postgres.tls=true"
      - "traefik.tcp.routers.postgres.tls.domains[0].main=example.com"
      - "traefik.tcp.routers.postgres.tls.certresolver=lets"
      - "traefik.tcp.services.postgres.loadBalancer.server.port=5432"

The error vanishes from the logs, and in both setups the dashboard looks fine, but traffic is still not routed to PostgreSQL (timeout). However, removing SSL from the configuration makes the flow complete (and insecure):

    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=backend"
      - "traefik.tcp.routers.postgres.entrypoints=postgres"
      - "traefik.tcp.routers.postgres.rule=HostSNI(`*`)"
      - "traefik.tcp.services.postgres.loadBalancer.server.port=5432"

Then it is possible to connect to the PostgreSQL database:

time="2020-08-12T10:30:52Z" level=debug msg="Handling connection from x.y.z.w:58389"

Solution

  • SNI routing for postgres with STARTTLS has been added to Traefik in this PR. Traefik now inspects the initial bytes sent by the postgres client, and if the client is going to initiate a TLS handshake (note that postgres TLS connections start as non-TLS and are then upgraded to TLS), Traefik handles the handshake and can then receive the TLS headers from the client, which contain the SNI information it needs to route the request properly. This means that you can use HostSNI("example.com") along with tls to expose postgres databases under different subdomains.

    As of writing this answer, I was able to get this working with the v3.0.0-beta2 image (Reference)
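The upgrade sequence described above can be sketched from the client side in Python. This is an illustration under assumptions, not Traefik's implementation: `negotiate_postgres_tls` and `connect_with_sni` are hypothetical names, and the sketch only performs the SSLRequest exchange followed by a TLS handshake that carries the host name as SNI.

```python
import socket
import ssl
import struct

# 8-byte PostgreSQL SSLRequest: Int32 length (8) + Int32 request code 80877103
SSL_REQUEST = struct.pack("!II", 8, 80877103)


def negotiate_postgres_tls(sock: socket.socket) -> bool:
    """Send SSLRequest in plaintext; return True if the peer answers 'S' (TLS accepted)."""
    sock.sendall(SSL_REQUEST)
    reply = sock.recv(1)  # 'S' = proceed with TLS, 'N' = TLS not available
    return reply == b"S"


def connect_with_sni(host: str, port: int = 5432, timeout: float = 5.0) -> ssl.SSLSocket:
    """Open a TCP connection, upgrade it the PostgreSQL way, and send `host` as SNI."""
    raw = socket.create_connection((host, port), timeout=timeout)
    if not negotiate_postgres_tls(raw):
        raw.close()
        raise ConnectionError("peer refused the TLS upgrade")
    context = ssl.create_default_context()
    # server_hostname is what ends up in the SNI extension of the ClientHello,
    # which is the value a HostSNI rule is matched against.
    return context.wrap_socket(raw, server_hostname=host)
```

Calling `connect_with_sni("example.com")` against a proxy that understands this sequence should complete the handshake with the certificate issued for that host name. Note that the client must actually send SNI for a HostSNI rule to match; to my knowledge libpq only does so from PostgreSQL 14 onward (the `sslsni` connection parameter), so older psql clients such as the 12.2 one used in the question may still fail to be routed.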