Tekton CLI (in a pod): cannot find existing pipeline

Running a 'tkn' command in a deployed pod results in an error message saying that it cannot find the pipeline in namespace 'test'.

The deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tkncli
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tkncli
  template:
    metadata:
      labels:
        app: tkncli
    spec:
      containers:
        - name: tkncli
          image: quay.io/rhcanada/tkn-cli
          imagePullPolicy: IfNotPresent
          command:
            - tkn
          args:
            - -n
            - test
            - pipeline
            - start
            - postsync-pipeline
            - --param
            - pause-duration="2"

The error message:

Error: Pipeline name postsync-pipeline does not exist in namespace test.

Are these resources really missing? Nope.

$ k get pipeline -n test
NAME AGE
postsync-pipeline 24m

$ k get task -n test
NAME AGE
postsync-task 12m

I replaced the Docker container with a more official one using this Dockerfile. Same result.

I can start the pipeline with this command:

$ tkn -n test pipeline start postsync-pipeline --param pause-duration="2" --showlog

The result is:

PipelineRun started: postsync-pipeline-run-jzfvf
Waiting for logs to be available...
[first-task : say-it] Text one
[second-task : say-it] Text two

How to reproduce?

  1. Add the K8s resource: deployment - see above.
  2. Add the K8s resources: pipeline and task - see below.

The pipeline:

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: postsync-pipeline
  namespace: test
spec:
  params:
    - name: pause-duration
      description: delay before starting
      type: string
      default: "2"
  tasks:
    - name: first-task
      taskRef:
        name: postsync-task
      params:
        - name: pause-duration
          value: $(params.pause-duration)
        - name: say-what
          value: "Text one"
    - name: second-task
      taskRef:
        name: postsync-task
      params:
        - name: pause-duration
          value: $(params.pause-duration)
        - name: say-what
          value: "Text two"

The Task:

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: postsync-task
  namespace: test
spec:
  params:
    - name: pause-duration
      description: How long to wait before saying something
      default: "0"
      type: string
    - name: say-what
      description: What should I say
      default: hello
      type: string
  steps:
    - name: say-it
      image: registry.access.redhat.com/ubi8/ubi
      command:
        - /bin/bash
      args: ['-c', 'sleep $(params.pause-duration) && echo $(params.say-what)']

When I simplify the situation even further, using (1) the default namespace, (2) a service account with RBAC and (3) just 'tkn pipeline ls', I get this error:

Error: Couldn't get kubeConfiguration namespace: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

There are 2 answers

VonC (best answer)

In your deployment YAML, I would try and change the command and args for the container to facilitate debugging.

command: ["/bin/sh"]
args: ["-c", "sleep infinity"]

That would replace the original tkn command with a shell command that keeps the container running indefinitely (sleep infinity). It allows you to exec into the pod and manually execute the tkn command. That way, you can interactively troubleshoot the issue.

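# "deploy/tkncli" lets kubectl pick a pod of the deployment;
# a concrete pod name from "kubectl get pods -n test" works as well
kubectl exec -it deploy/tkncli -n test -- /bin/sh
# then, for testing
tkn -n test pipeline start postsync-pipeline --param pause-duration="2" --showlog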

By using this debugging setup, you can manually test various tkn commands within the pod's environment to better understand why it is unable to find the pipeline.

Make sure the service account associated with the tkncli pod has the necessary permissions to interact with Tekton resources in the test namespace. You might need to create a Role and a RoleBinding for this purpose.
For an illustration, see tektoncd/pipeline issue 1830.
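
As a minimal sketch of such a Role and RoleBinding, assuming the error is permission-related: the names tkncli-sa, tkncli-role and tkncli-binding are hypothetical, and the verbs are a guess at what 'tkn pipeline start --showlog' needs (read the pipeline, create a PipelineRun, follow the pod logs). Tighten or extend them for your cluster.

```yaml
# Hypothetical service account for the tkncli pod.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tkncli-sa
  namespace: test
---
# Namespaced permissions on Tekton resources and on pod logs.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tkncli-role
  namespace: test
rules:
  # Read access to the Tekton objects that tkn looks up.
  - apiGroups: ["tekton.dev"]
    resources: ["pipelines", "tasks", "pipelineruns", "taskruns"]
    verbs: ["get", "list", "watch"]
  # Starting a pipeline means creating a PipelineRun.
  - apiGroups: ["tekton.dev"]
    resources: ["pipelineruns"]
    verbs: ["create"]
  # Following logs reads the pods behind the TaskRuns.
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
# Bind the Role to the service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tkncli-binding
  namespace: test
subjects:
  - kind: ServiceAccount
    name: tkncli-sa
    namespace: test
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: tkncli-role
```

The pod only uses these permissions if its spec sets serviceAccountName to the bound account (tkncli-sa in this sketch). Otherwise it runs as the namespace's default service account.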

Make sure the CLI version in your container is compatible with the Tekton version on your cluster.

Verify that no network policies (similar to the situation in tektoncd/pipeline issue 3154) prevent the pod from communicating with the Kubernetes API server. Also, make sure DNS resolution works correctly in the pod, so that it can resolve the API server address and reach the Tekton resources.


When experimenting with the above situation (in the default namespace, running tkn pipeline ls with another service account, as specified above), I get this error message: Error: Couldn't get kubeConfiguration namespace: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

It looks like the tkn CLI is unable to access the Kubernetes API because the necessary configuration is not available in the container's environment. Here is an ASCII diagram to illustrate the issue:

+----------------------------------------------------------+
| Kubernetes Cluster                                       |
|                                                          |
| +----------------------+                                 |
| | Deployment: tkncli   |                                 |
| | Namespace: default   |                                 |
| |                      |                                 |
| | Pod:                 |         +---------------------+ |
| | - tkncli container   |   ----> | Issue:              | |
| |   Command: tkn       |         | Missing Kubernetes  | |
| |   Args: pipeline ls  |         | configuration       | |
| |                      |         +---------------------+ |
| | ServiceAccount:      |                                 |
| | - Custom RBAC        |                                 |
| +----------------------+                                 |
|                                                          |
+----------------------------------------------------------+

The container running the tkn CLI needs to have access to the Kubernetes configuration, which usually means having a valid kubeconfig file or environment variables that provide the necessary information to connect to the Kubernetes API.

Kubernetes automatically creates a service account token and mounts it into pods, by default under /var/run/secrets/kubernetes.io/serviceaccount. Make sure your deployment is configured to use a service account with the appropriate permissions.

As suggested by the error message, you can set the KUBERNETES_MASTER environment variable in your deployment to point to the Kubernetes API server. That is usually the internal cluster address.

If the tkn CLI requires a kubeconfig file, you might need to create a ConfigMap or Secret containing the kubeconfig and mount it into your container. However, this is less common and can have security implications.
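
If you do go the kubeconfig route, one way to limit the security implications is a kubeconfig that holds no credentials of its own and only points at the files Kubernetes already mounts into the pod. The following is a sketch under assumptions: the ConfigMap name tkncli-kubeconfig and the context and user names are hypothetical, and it presumes the default service account mount path.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tkncli-kubeconfig   # hypothetical name
  namespace: default
data:
  config: |
    apiVersion: v1
    kind: Config
    clusters:
      - name: in-cluster
        cluster:
          # Internal API server address and the CA mounted into every pod.
          server: https://kubernetes.default.svc
          certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    users:
      - name: pod-service-account
        user:
          # Read the token from the mounted file instead of embedding it.
          tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    contexts:
      - name: in-cluster
        context:
          cluster: in-cluster
          user: pod-service-account
          namespace: default
    current-context: in-cluster
```

To use it, mount the ConfigMap into the container as a volume and point tkn at the mounted file, either with its --kubeconfig flag or, assuming tkn honors it the way kubectl does, with the KUBECONFIG environment variable.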

Your deployment YAML, with the KUBERNETES_MASTER environment variable, would be:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tkncli
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tkncli
  template:
    metadata:
      labels:
        app: tkncli
    spec:
      serviceAccountName: your-service-account
      containers:
        - name: tkncli
          image: quay.io/rhcanada/tkn-cli
          imagePullPolicy: IfNotPresent
          env:
            - name: KUBERNETES_MASTER
              value: https://kubernetes.default.svc
          command: ["/bin/sh"]
          args: ["-c", "sleep infinity"]

Replace your-service-account with the name of the service account that has the required permissions. After applying this updated deployment, you can exec into the pod and try running the tkn command again. If the issue is related to the Kubernetes API access, these changes should help resolve it.

tm1701

VonC's answer gives a lot of insight, so I accepted it as the solution.

My question was also: is there a better way to start a pipeline in a PostSync?

I found a solution: the PostSync job sends a curl request to an event listener, which starts the pipeline. That works great!
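
A rough sketch of what such a job could look like, assuming the PostSync is an Argo CD resource hook and the listener is a Tekton Triggers EventListener. The job name, the listener name postsync-listener (which Tekton Triggers would expose as the service el-postsync-listener on port 8080) and the JSON payload are all hypothetical. The payload shape depends entirely on your own TriggerBinding.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: trigger-postsync-pipeline   # hypothetical name
  namespace: test
  annotations:
    # Run after a successful sync and clean up the job when it succeeds.
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: trigger
          image: curlimages/curl
          command: ["curl"]
          args:
            - --fail                 # make the job fail on an HTTP error
            - -X
            - POST
            - -H
            - "Content-Type: application/json"
            - -d
            - '{"pause-duration": "2"}'   # hypothetical payload
            - http://el-postsync-listener.test.svc.cluster.local:8080
```

Compared with running tkn in a pod, this job needs no Kubernetes API credentials of its own. The EventListener's service account is the one that creates the PipelineRun.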

While reading about this, I came across this description. It includes a nice overview that confirms the solution above.

If you find a better solution, please suggest it.