kubernetes, k3s, coredns

CoreDNS is not working correctly in K3s cluster


I have a k3s cluster with 1 master and 2 workers and I'm having a lot of issues with DNS resolution. DNS resolution stops working from time to time and I have to restart the CoreDNS pods to make it work again, and after a while it stops working once more. The CoreDNS logs only show something like this:

[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.override
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.override
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.override
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
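
For clarity, the restart I mention below is just a rollout restart of the CoreDNS deployment (on a default K3s install it should be the coredns deployment in kube-system, but the name may differ):

kubectl -n kube-system rollout restart deployment coredns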

When I restart the deployment and then run nslookup google.com from a DNS debug pod I have, it usually works a couple of times, and then I only get this:

/ # nslookup google.com
;; connection timed out; no servers could be reached
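
The debug pod is nothing special, just a throwaway pod with DNS tools. If you want to reproduce, something along these lines should work (the busybox image is just an example):

kubectl run dns-debug --image=busybox:1.36 --restart=Never -- sleep 3600
kubectl exec -it dns-debug -- nslookup google.com
kubectl exec -it dns-debug -- nslookup kubernetes.default.svc.cluster.local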

This is the Corefile in the CoreDNS ConfigMap:

data:
  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        hosts /etc/coredns/NodeHosts {
          ttl 60
          reload 15s
          fallthrough
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
        import /etc/coredns/custom/*.override
    }
    import /etc/coredns/custom/*.server
  NodeHosts: |
    10.25.165.243 i-0617a53e804d5cd0a
    10.25.130.16 i-0596bb6d78c4748c5
    10.25.157.146 i-0796e8161080685c4
    10.25.204.118 i-0c409f7c2d69c3011
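
For reference, this is from the coredns ConfigMap in kube-system, which you can view with:

kubectl -n kube-system get configmap coredns -o yaml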

I tried scaling CoreDNS to 2 replicas, which seems to help: now I sometimes get a response after the failures, but it still fails about half of the time.
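
In case it's useful, the scaling was just:

kubectl -n kube-system scale deployment coredns --replicas=2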

Update

I removed the import lines from the ConfigMap, since they were causing those odd warnings in the logs, but the error is still happening. The logs now look like this:

.:53
[INFO] plugin/reload: Running configuration SHA512 = b941b080e5322f6519009bb49349462c7ddb6317425b0f6a83e5451175b720703949e3f3b454a24e77f3ffe57fd5e9c6130e528a5a1dd00d9000e4afd6c1108d
CoreDNS-1.10.1
linux/amd64, go1.20, 055b2c3

Solution

  • The problem was the firewall: I needed to open port 53 between the nodes. Unfortunately this was not in the k3s documentation, but for DNS to work correctly the workers and the master need to be able to communicate over this port (see the example below).
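
For example, on nodes running firewalld this is roughly what I mean (adjust to whatever firewall or cloud security group you use; the point is to allow 53/udp and 53/tcp between the nodes):

firewall-cmd --permanent --add-port=53/udp
firewall-cmd --permanent --add-port=53/tcp
firewall-cmd --reload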