I'm trying to deploy a Prometheus node-exporter DaemonSet in my AWS EKS Kubernetes cluster.
```yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    app: prometheus
    chart: prometheus-11.12.1
    component: node-exporter
    heritage: Helm
    release: prometheus
  name: prometheus-node-exporter
  namespace: operations-tools-test
spec:
  selector:
    matchLabels:
      app: prometheus
      component: node-exporter
      release: prometheus
  template:
    metadata:
      labels:
        app: prometheus
        chart: prometheus-11.12.1
        component: node-exporter
        heritage: Helm
        release: prometheus
    spec:
      containers:
      - args:
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --web.listen-address=:9100
        image: prom/node-exporter:v1.0.1
        imagePullPolicy: IfNotPresent
        name: prometheus-node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100
          name: metrics
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /host/proc
          name: proc
          readOnly: true
        - mountPath: /host/sys
          name: sys
          readOnly: true
      dnsPolicy: ClusterFirst
      hostNetwork: true
      hostPID: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: prometheus-node-exporter
      serviceAccountName: prometheus-node-exporter
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /proc
          type: ""
        name: proc
      - hostPath:
          path: /sys
          type: ""
        name: sys
```
After deploying it, however, the Pod is not getting scheduled on one of the nodes. The Pod manifest for that node looks like this:
```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/psp: eks.privileged
  generateName: prometheus-node-exporter-
  labels:
    app: prometheus
    chart: prometheus-11.12.1
    component: node-exporter
    heritage: Helm
    pod-template-generation: "1"
    release: prometheus
  name: prometheus-node-exporter-xxxxx
  namespace: operations-tools-test
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: DaemonSet
    name: prometheus-node-exporter
  resourceVersion: "51496903"
  selfLink: /api/v1/namespaces/namespace-x/pods/prometheus-node-exporter-xxxxx
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name
            operator: In
            values:
            - ip-xxx-xx-xxx-xxx.ec2.internal
  containers:
  - args:
    - --path.procfs=/host/proc
    - --path.sysfs=/host/sys
    - --web.listen-address=:9100
    image: prom/node-exporter:v1.0.1
    imagePullPolicy: IfNotPresent
    name: prometheus-node-exporter
    ports:
    - containerPort: 9100
      hostPort: 9100
      name: metrics
      protocol: TCP
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /host/proc
      name: proc
      readOnly: true
    - mountPath: /host/sys
      name: sys
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: prometheus-node-exporter-token-xxxx
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  hostPID: true
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: prometheus-node-exporter
  serviceAccountName: prometheus-node-exporter
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/disk-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/pid-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/unschedulable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/network-unavailable
    operator: Exists
  volumes:
  - hostPath:
      path: /proc
      type: ""
    name: proc
  - hostPath:
      path: /sys
      type: ""
    name: sys
  - name: prometheus-node-exporter-token-xxxxx
    secret:
      defaultMode: 420
      secretName: prometheus-node-exporter-token-xxxxx
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-11-06T23:56:47Z"
    message: '0/4 nodes are available: 2 node(s) didn''t have free ports for the requested
      pod ports, 3 Insufficient pods, 3 node(s) didn''t match node selector.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: BestEffort
```
As seen above, the Pod's `nodeAffinity` matches on `metadata.name`, which exactly matches the name of my node.

But when I run the command below,

```
kubectl describe po prometheus-node-exporter-xxxxx
```

I see the following in the events:
```
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  60m                   default-scheduler  0/4 nodes are available: 1 Insufficient pods, 3 node(s) didn't match node selector.
  Warning  FailedScheduling  4m46s (x37 over 58m)  default-scheduler  0/4 nodes are available: 2 node(s) didn't have free ports for the requested pod ports, 3 Insufficient pods, 3 node(s) didn't match node selector.
```
I have also checked the CloudWatch logs for the scheduler, and I don't see any entries for my failed Pod. The node has ample resources left:
```
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests    Limits
  --------  --------    ------
  cpu       520m (26%)  210m (10%)
  memory    386Mi (4%)  486Mi (6%)
```
I don't see any reason why the Pod should not be scheduled. Can anyone help me with this?

TIA
As posted in the comments:

> Please add to the question the steps that you followed (editing any values in the Helm chart, etc.). Also please check if the nodes are not over the limit of pods that can be scheduled on them. Here you can find the link for more reference: LINK.

> No processes are occupying port 9100 on the given node.

> @DawidKruk The POD limit was reached. Thanks! I expected some error regarding that, rather than the vague "node selector property not matching" message.
Not really sure why the `node(s) didn't have free ports for the requested pod ports` and `node(s) didn't match node selector` messages were displayed.

The issue that the Pods couldn't be scheduled on the nodes (`Pending` state) was connected with the `Insufficient pods` message in the `$ kubectl get events` output.

That message is displayed when a node has reached its maximum capacity of pods (for example: `node1` can schedule a maximum of `30` pods).
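That per-node pod capacity is visible on the node object itself, under `.status.allocatable.pods`. As a rough sketch of the check (the `node` dict below is a hypothetical, trimmed stand-in for real `kubectl get node <name> -o json` output, and the scheduled-pod count is an assumed value):

```python
# Hypothetical, trimmed example of what `kubectl get node <name> -o json`
# returns -- the real object has many more fields.
node = {
    "metadata": {"name": "ip-xxx-xx-xxx-xxx.ec2.internal"},
    "status": {
        "allocatable": {"cpu": "2", "memory": "7637096Ki", "pods": "17"},
    },
}

# Assumed number of pods already scheduled on the node, e.g. counted with:
#   kubectl get pods -A --field-selector spec.nodeName=<name>
scheduled_pods = 17

allocatable = int(node["status"]["allocatable"]["pods"])
if scheduled_pods >= allocatable:
    print(f"{node['metadata']['name']} is at its pod limit ({allocatable})")
else:
    print(f"{allocatable - scheduled_pods} pod slot(s) free")
```

When the node is already at its `allocatable` pod count, a DaemonSet Pod pinned to that node has nowhere to go, which matches the `Insufficient pods` event above.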
More on `Insufficient pods` can be found in this GitHub issue comment:
> That's true. That's because of the CNI implementation on EKS. The max pods number is limited by the number of network interfaces attached to the instance multiplied by the number of IPs per ENI, which varies depending on the size of the instance. For small instances this number can be quite low.
>
> Docs.aws.amazon.com: AWS EC2: User Guide: Using ENI: Available IP per ENI

-- Github.com: Kubernetes: Autoscaler: Issue 1576: Comment 454100551
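The quoted rule can be sketched as a quick calculation. The EKS max-pods formula is commonly written as `ENIs * (IPs per ENI - 1) + 2`; the per-instance ENI/IP figures used below are illustrative values from the AWS table referenced above, so verify them for your own instance type:

```python
def eks_max_pods(enis: int, ips_per_eni: int) -> int:
    """Sketch of the EKS max-pods formula: ENIs * (IPs per ENI - 1) + 2."""
    # Each ENI reserves one IP as its primary address, hence the "- 1";
    # the "+ 2" is commonly attributed to host-networked pods
    # (e.g. aws-node, kube-proxy) that don't consume a VPC IP.
    return enis * (ips_per_eni - 1) + 2

print(eks_max_pods(3, 6))   # t3.medium-class instance: 3 ENIs, 6 IPs each -> 17
print(eks_max_pods(3, 10))  # m5.large-class instance: 3 ENIs, 10 IPs each -> 29
```

So on a small instance the cluster can run out of pod slots long before CPU or memory are exhausted, which is exactly what the "ample resources left" output in the question hides.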
Additional resources: