Tags: kubernetes, jenkins, jenkins-pipeline, cloud, jenkins-kubernetes

Kubernetes Cloud Configuration as Agent in Jenkins: Issue - Continuous Pod Creation and Unknown Client Name Error


I am working on configuring a Kubernetes cloud environment to manage agent provisioning for our QA Jenkins server. However, we are encountering two main issues:

  1. Continuous Pod Creation: After starting a Jenkins job, we observe a continuous creation of pods with seemingly similar names (e.g., sample-job-2-9-x79k3-vlvp9-v6lkj). This results in a large number of pods being spawned unnecessarily, consuming resources.

  2. "Unknown Client Name" Error in the pod log: Within the created pods, we are seeing the error message "Unknown client name: sample-job-2-9-x79k3-vlvp9-v6lkj" in the logs. This error is causing disruptions and preventing successful agent connection.

I have followed several Jenkins documentation guides and verified the setup. However, these issues persist and are blocking Jenkins pipeline execution.

Below is most of the information needed to understand the environment, along with the error logs at the Jenkins and agent levels.

Jenkins Job log:
[Pipeline] Start of Pipeline
[Pipeline] podTemplate
[Pipeline] {
[Pipeline] node
Created Pod: kubernetes-test jenkinsns/sample-job-2-9-x79k3-vlvp9-g134p
ERROR: Failed to launch sample-job-2-9-x79k3-vlvp9-g134p
Also:   java.lang.Throwable: waiting here
    at io.fabric8.kubernetes.client.utils.Utils.waitUntilReady(Utils.java:173)
    at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.waitUntilCondition(BaseOperation.java:890)
    at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.waitUntilReady(BaseOperation.java:878)
    at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.waitUntilReady(BaseOperation.java:93)
    at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:169)
    at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:298)
    at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
    at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:80)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://10.65.184.52:6443/api/v1/namespaces/jenkinsns/pods?fieldSelector=metadata.name%3Dsample-job-2-9-x79k3-vlvp9-g134p&resourceVersion=759022&allowWatchBookmarks=true&watch=true. Message: Forbidden.
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:728)
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:708)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.lambda$start$3(WatchConnectionManager.java:128)

#############################################################################
Kubernetes cloud configuration in Jenkins->Manage Jenkins->Cloud section

Name : kubernetes-test
Kubernetes URL : https://10.65.184.52:6443
Disable https certificate check : Checked
Kubernetes Namespace : jenkinsns
Credentials : Added (secret text id)
Jenkins URL: http://asvdqjenkins01:8080/
Jenkins tunnel: asvdqjenkins01:5555
Pod Labels by default: Key -> jenkins, Value -> slave

Pod Templates -> Not added
#############################################################################
Jenkins job name: sample_job_2
Jenkins job pipeline script:

pipeline {
    agent {
        // Reference the Kubernetes cloud configuration by its name
        kubernetes {
            cloud 'kubernetes-test'
            defaultContainer 'jnlp'
        }
    }
    
    stages {
        stage('Test') {
            steps {
                // Add your testing steps here
                sh 'echo "Testing in Kubernetes"'
            }
        }
    }
}
#############################################################################

The job runs continuously and pods are created one after another. Here is the output of kubectl get pods:

root@invk8-dqjenkins1-m1:~# kubectl get pods -n jenkinsns
NAME                               READY   STATUS    RESTARTS   AGE
sample-job-2-7-twtpt-510vt-mxcf5   1/1     Running   0          27m
sample-job-2-7-twtpt-510vt-xvn79   1/1     Running   0          27m
sample-job-2-9-x79k3-vlvp9-g134p   1/1     Running   0          13m
sample-job-2-9-x79k3-vlvp9-v6lkj   1/1     Running   0          13m

Here is the pod log from one such pod, sample-job-2-9-x79k3-vlvp9-v6lkj:

Aug 23, 2023 5:46:21 AM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: sample-job-2-9-x79k3-vlvp9-v6lkj
Aug 23, 2023 5:46:21 AM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3142.vcfca_0cd92128
Aug 23, 2023 5:46:21 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /home/jenkins/agent/remoting as a remoting work directory
Aug 23, 2023 5:46:21 AM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /home/jenkins/agent/remoting
Aug 23, 2023 5:46:21 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://asvdqjenkins01:8080/]
Aug 23, 2023 5:46:22 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Aug 23, 2023 5:46:22 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting TCP connection tunneling is enabled. Skipping the TCP Agent Listener Port availability check
Aug 23, 2023 5:46:22 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
  Agent address: asvdqjenkins01
  Agent port:    5555
  Identity:      88:f3:10:6d:77:7c:c4:68:4a:eb:96:9a:b2:a3:01:e2
Aug 23, 2023 5:46:22 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Aug 23, 2023 5:46:22 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to asvdqjenkins01:5555
Aug 23, 2023 5:46:22 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Aug 23, 2023 5:46:22 AM org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader run
INFO: Waiting for ProtocolStack to start.
Aug 23, 2023 5:46:23 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: 88:f3:10:6d:77:7c:c4:68:4a:eb:96:9a:b2:a3:01:e2
Aug 23, 2023 5:46:23 AM org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer onRecv
INFO: [JNLP4-connect connection to asvdqjenkins01/10.23.185.62:5555] Local headers refused by remote: Unknown client name: sample-job-2-9-x79k3-vlvp9-v6lkj
Aug 23, 2023 5:46:23 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Protocol JNLP4-connect encountered an unexpected exception
java.util.concurrent.ExecutionException: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Unknown client name: sample-job-2-9-x79k3-vlvp9-v6lkj
        at org.jenkinsci.remoting.util.SettableFuture.get(SettableFuture.java:223)
        at hudson.remoting.Engine.innerRun(Engine.java:814)
        at hudson.remoting.Engine.run(Engine.java:543)
Caused by: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Unknown client name: sample-job-2-9-x79k3-vlvp9-v6lkj
        at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.newAbortCause(ConnectionHeadersFilterLayer.java:380)
        at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.onRecvClosed(ConnectionHeadersFilterLayer.java:435)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:825)
        at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:289)
        at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:168)
        at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:825)
        at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:155)
        at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$700(BIONetworkLayer.java:51)
        at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:257)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:125)
        at java.base/java.lang.Thread.run(Unknown Source)
        Suppressed: java.nio.channels.ClosedChannelException
                ... 7 more
Aug 23, 2023 5:46:23 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: reconnect rejected, sleeping 10s:
java.lang.Exception: The server rejected the connection: None of the protocols were accepted
        at hudson.remoting.Engine.onConnectionRejected(Engine.java:893)
        at hudson.remoting.Engine.innerRun(Engine.java:840)
        at hudson.remoting.Engine.run(Engine.java:543)

    #############################################################################

Here is the output of the pod details:
`kubectl describe pod sample-job-2-9-x79k3-vlvp9-v6lkj -n jenkinsns`

Name:         sample-job-2-9-x79k3-vlvp9-v6lkj
Namespace:    jenkinsns
Priority:     0
Node:         invk8-dqjenkins1-w2/10.65.186.128
Start Time:   Wed, 23 Aug 2023 11:16:20 +0530
Labels:       jenkins=slave
              jenkins/label=sample_job_2_9-x79k3
              jenkins/label-digest=5948c9090bdf7c052caef5b9ef866f7bb0b8bf64
Annotations:  buildUrl: http://asvdqjenkins01:8080/job/sample_job_2/9/
              cni.projectcalico.org/containerID: 567412e70106ba3acc4dd821f23851050582dedf21572d64efea5c08ae5faba9
              cni.projectcalico.org/podIP: 192.168.39.46/32
              cni.projectcalico.org/podIPs: 192.168.39.46/32
              runUrl: job/sample_job_2/9/
Status:       Running
IP:           192.168.39.46
IPs:
  IP:  192.168.39.46
Containers:
  jnlp:
    Container ID:   docker://4999eebd72f259382de1ba00d63e40b6d5ac90cc5e1dee7b35cc4aee2f6ec844
    Image:          jenkins/inbound-agent:3142.vcfca_0cd92128-1
    Image ID:       docker-pullable://jenkins/inbound-agent@sha256:704a4b18ac78355701e89584999f2a4ba54e10cc674421a65a61763b0032eafe
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 23 Aug 2023 11:16:21 +0530
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     100m
      memory:  256Mi
    Environment:
      JENKINS_SECRET:         3efe2e05d59f988076ab1ed9836b9236c2d439cd2b99640ecd7123e94d76cbbe
      JENKINS_TUNNEL:         asvdqjenkins01:5555
      JENKINS_AGENT_NAME:     sample-job-2-9-x79k3-vlvp9-v6lkj
      JENKINS_NAME:           sample-job-2-9-x79k3-vlvp9-v6lkj
      JENKINS_AGENT_WORKDIR:  /home/jenkins/agent
      JENKINS_URL:            http://asvdqjenkins01:8080/
    Mounts:
      /home/jenkins/agent from workspace-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lqppt (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  workspace-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-lqppt:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  24m   default-scheduler  Successfully assigned jenkinsns/sample-job-2-9-x79k3-vlvp9-v6lkj to invk8-dqjenkins1-w2
  Normal  Pulled     24m   kubelet            Container image "jenkins/inbound-agent:3142.vcfca_0cd92128-1" already present on machine
  Normal  Created    24m   kubelet            Created container jnlp
  Normal  Started    24m   kubelet            Started container jnlp
    #############################################################################

Jenkins setup:

Jenkins: 2.401.3
OS: Linux - 5.15.0-79-generic
Java: 11.0.20 - Ubuntu (OpenJDK 64-Bit Server VM)
---
analysis-model-api:11.6.0
ant:497.v94e7d9fffa_b_9
antisamy-markup-formatter:162.v0e6ec0fcfcf6
apache-httpcomponents-client-4-api:4.5.14-150.v7a_b_9d17134a_5
authentication-tokens:1.53.v1c90fd9191a_b_
bootstrap5-api:5.3.0-1
bouncycastle-api:2.29
branch-api:2.1122.v09cb_8ea_8a_724
build-name-setter:2.3.0
build-timeout:1.31
caffeine-api:3.1.8-133.v17b_1ff2e0599
checks-api:2.0.0
cloudbees-folder:6.848.ve3b_fd7839a_81
command-launcher:107.v773860566e2e
commons-lang3-api:3.13.0-62.v7d18e55f51e2
commons-text-api:1.10.0-68.v0d0b_c439292b_
conditional-buildstep:1.4.3
config-file-provider:953.v0432a_802e4d2
credentials:1271.v54b_1c2c6388a_
credentials-binding:631.v861c06d062b_4
dashboard-view:2.495.v07e81500c3f2
data-tables-api:1.13.5-1
display-url-api:2.3.8
docker-commons:439.va_3cb_0a_6a_fb_29
docker-workflow:572.v950f58993843
durable-task:513.vc48a_a_075a_d93
echarts-api:5.4.0-5
email-ext:2.100
embeddable-build-status:412.v09da_db_1dee68
font-awesome-api:6.4.0-2
forensics-api:2.3.0
git-client:4.4.0
gradle:2.8.2
htmlpublisher:1.32
instance-identity:173.va_37c494ec4e5
ionicons-api:56.v1b_1c8c49374e
jackson2-api:2.15.2-350.v0c2f3f8fc595
jakarta-activation-api:2.0.1-3
jakarta-mail-api:2.0.1-3
javax-activation-api:1.2.0-6
javax-mail-api:1.6.2-9
jaxb:2.3.8-1
jdk-tool:73.vddf737284550
jquery3-api:3.7.0-1
junit:1217.v4297208a_a_b_ce
kubernetes:3995.v227c16b_675ee
kubernetes-client-api:6.4.1-215.v2ed17097a_8e9
kubernetes-credentials:0.10.0
kubernetes-pipeline-devops-steps:1.6
mailer:463.vedf8358e006b_
matrix-auth:3.1.10
matrix-project:808.v5a_b_5f56d6966
metrics:4.2.18-442.v02e107157925
mina-sshd-api-common:2.10.0-69.v28e3e36d18eb_
mina-sshd-api-core:2.10.0-69.v28e3e36d18eb_
okhttp-api:4.11.0-157.v6852a_a_fa_ec11
p4:1.14.2
pam-auth:1.10
parameterized-trigger:2.46
pipeline-build-step:505.v5f0844d8d126
pipeline-graph-analysis:202.va_d268e64deb_3
pipeline-groovy-lib:671.v07c339c842e8
pipeline-input-step:477.v339683a_8d55e
pipeline-milestone-step:111.v449306f708b_7
pipeline-model-api:2.2144.v077a_d1928a_40
pipeline-model-definition:2.2144.v077a_d1928a_40
pipeline-model-extensions:2.2144.v077a_d1928a_40
pipeline-rest-api:2.33
pipeline-stage-step:305.ve96d0205c1c6
pipeline-stage-tags-metadata:2.2144.v077a_d1928a_40
pipeline-stage-view:2.33
plain-credentials:143.v1b_df8b_d3b_e48
plugin-util-api:3.3.0
prism-api:1.29.0-7
resource-disposer:0.23
role-strategy:680.v3a_6a_1698b_864
run-condition:1.6
saml:4.429.v9a_781a_61f1da_
scm-api:676.v886669a_199a_a_
script-security:1269.v639888f5e366
snakeyaml-api:1.33-95.va_b_a_e3e47b_fa_4
ssh-agent:333.v878b_53c89511
ssh-credentials:308.ve4497b_ccd8f4
ssh-slaves:2.916.vd17b_43357ce4
sshd:3.312.v1c601b_c83b_0e
structs:324.va_f5d6774f3a_d
thinBackup:1.18
throttle-concurrents:2.14
timestamper:1.26
token-macro:384.vf35b_f26814ec
trilead-api:2.84.v72119de229b_7
variant:59.vf075fe829ccb
warnings-ng:10.4.0
workflow-aggregator:596.v8c21c963d92d
workflow-api:1259.vb_47f14fffc8a_
workflow-basic-steps:1042.ve7b_140c4a_e0c
workflow-cps:3744.v6f2c0fe0e54d
workflow-durable-task-step:1284.v4fcd365b_75b_e
workflow-job:1326.ve643e00e9220
workflow-multibranch:756.v891d88f2cd46
workflow-scm-step:415.v434365564324
workflow-step-api:639.v6eca_cd8c04a_a_
workflow-support:848.v5a_383b_d14921
ws-cleanup:0.45

Solution

  • Could you share more details about how you installed your Jenkins master instance? From your logs, the main reason the agent pods fail to launch appears to be this error message:

    io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://10.65.184.52:6443/api/v1/namespaces/jenkinsns/pods?fieldSelector=metadata.name%3Dsample-job-2-9-x79k3-vlvp9-g134p&resourceVersion=759022&allowWatchBookmarks=true&watch=true. Message: Forbidden.


    In my opinion, this is related to the RBAC permissions granted to the credentials your Jenkins master uses. You can use this role and the following role binding as a reference to set it up properly.

    For now, it looks like those credentials are missing read access to pods: the failing request is a GET with watch=true on pods in the jenkinsns namespace.
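
To make the RBAC suggestion concrete, here is a minimal sketch of a Role and RoleBinding, loosely modeled on the example manifest that ships with the Jenkins Kubernetes plugin. It assumes the secret-text credential configured in the cloud is the token of a service account named `jenkins` in the `jenkinsns` namespace; that account name is hypothetical and does not appear in the question, so substitute the identity your credential actually belongs to. Because the failing request is a GET with `watch=true`, Kubernetes RBAC authorizes it as the `watch` verb on `pods`, which is why `watch` and `list` are included alongside `get`.

```yaml
# Sketch only: the service account name "jenkins" is an assumption.
# The namespace "jenkinsns" is taken from the cloud configuration in the question.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jenkins
  namespace: jenkinsns
rules:
  # Create, inspect, watch and clean up agent pods.
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "delete", "get", "list", "patch", "update", "watch"]
  # Run commands inside agent containers.
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create", "delete", "get", "list", "patch", "update", "watch"]
  # Read container logs for build output and diagnostics.
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list", "watch"]
  # Watch pod events while an agent is being provisioned.
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jenkins
  namespace: jenkinsns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: jenkins
subjects:
  - kind: ServiceAccount
    name: jenkins        # hypothetical; use the account behind your credential
    namespace: jenkinsns
```

After applying it, a check such as `kubectl auth can-i watch pods -n jenkinsns --as=system:serviceaccount:jenkinsns:jenkins` (again using the assumed account name) should answer `yes` if the binding took effect.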