I have a k3s cluster with 1 master and 2 workers, and I'm having a lot of issues with DNS resolution. It stops working from time to time and I have to restart the pods to get it working; after a while it stops working again. The CoreDNS logs only show something like this:
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.override
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.override
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.override
[WARNING] No files matching import glob pattern: /etc/coredns/custom/*.server
When I restart the deployment and run nslookup google.com from a DNS debug pod I have, it usually works a couple of times, and then I only get this:
/ # nslookup google.com
;; connection timed out; no servers could be reached
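To put a number on "works a couple of times, then times out", the lookup can be repeated in a loop and the failures counted. This is only a minimal sketch, assuming Python is available in the debug pod; `probe` and its parameters are hypothetical names chosen for illustration, and the resolver is injectable so the loop can be tried without a real DNS server.

```python
import socket
import time


def probe(name, attempts=20, delay=0.5, resolve=socket.gethostbyname):
    """Resolve `name` repeatedly and return (successes, failures)."""
    ok = failed = 0
    for _ in range(attempts):
        try:
            resolve(name)
            ok += 1
        except OSError:
            # socket.gaierror and timeouts are both OSError subclasses
            failed += 1
        time.sleep(delay)
    return ok, failed
```

Calling `probe("google.com")` right after a restart and again a few minutes later would show whether the failure rate changes over time or stays roughly constant.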
This is the Corefile in the ConfigMap:
data:
  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        hosts /etc/coredns/NodeHosts {
          ttl 60
          reload 15s
          fallthrough
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
        import /etc/coredns/custom/*.override
    }
    import /etc/coredns/custom/*.server
  NodeHosts: |
    10.25.165.243 i-0617a53e804d5cd0a
    10.25.130.16 i-0596bb6d78c4748c5
    10.25.157.146 i-0796e8161080685c4
    10.25.204.118 i-0c409f7c2d69c3011
I tried scaling CoreDNS to 2 replicas, which seems to help: after a failure I can now sometimes get a response, but it still fails about half of the time.
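A failure rate of roughly one in two with two replicas may mean that only one of the replicas is reachable from the debug pod, although that is a guess rather than something the logs confirm. One way to check is to send a DNS query to each CoreDNS pod IP directly instead of going through the Service. The sketch below builds a minimal A-record query by hand with the standard library; `build_query` and `answers` are hypothetical helper names, and the pod IPs would have to be taken from `kubectl -n kube-system get pods -o wide`.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for an A record."""
    # Header: id, flags (recursion desired), 1 question, no other records
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def answers(server_ip, name="google.com", timeout=2.0, port=53):
    """Return True if `server_ip` sends back a reply with a matching query id."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, port))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Timeouts and unreachable-host errors both land here
            return False
    return reply[:2] == query[:2]
```

If `answers(...)` returns True for one pod IP and False for the other, the problem is more likely in the network path to that pod's node than in the Corefile.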
Update
I removed the import lines from the ConfigMap, since they were causing those odd warnings in the logs, but the error still happens. The logs now look like this:
.:53
[INFO] plugin/reload: Running configuration SHA512 = b941b080e5322f6519009bb49349462c7ddb6317425b0f6a83e5451175b720703949e3f3b454a24e77f3ffe57fd5e9c6130e528a5a1dd00d9000e4afd6c1108d
CoreDNS-1.10.1
linux/amd64, go1.20, 055b2c3
The problem was the firewall: I needed to open port 53. Unfortunately, this was not in the k3s documentation, but for DNS to work correctly the workers and the master need to be able to communicate over this port.
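A quick way to confirm that the port is reachable after changing the firewall is a plain TCP connection test, run from one node (or a pod on it) against a CoreDNS pod IP on another node. This is a rough sketch with a hypothetical helper name, `tcp_port_open`. DNS mostly uses UDP, so a passing TCP check is only partial evidence and the firewall rule for UDP port 53 still needs to be checked separately.

```python
import socket


def tcp_port_open(host, port=53, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable
        return False
```

The target host is whatever CoreDNS pod IP the cluster reports; none of the addresses in the question are assumed here.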