Kubernetes cluster unreachable with Terraform and Helm


I am creating a GKE cluster with Terraform for my thesis. At first I was able to create it, but then I added Istio, Prometheus and other addons, so I destroyed the cluster to recreate it with everything included. Since then I keep getting the same error: Kubernetes cluster unreachable. I have checked for credential issues and added a service account, but that didn't work.

I thought it was a credentials issue with Helm, which I use to install Istio and its addons. I also suspected a problem with the kubeconfig file, though I am not sure how to resolve that. I set the KUBE_CONFIG_PATH environment variable, but it didn't help.

In the end I tried creating only the cluster again, and it still fails with Error: Kubernetes cluster unreachable. What is going on here? I think it is related either to the credentials or to the kubeconfig, but I am lost at this point.

Has anyone encountered this issue? How did you resolve it?

The Terraform providers file:

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "4.63.1"
    }

    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.21.1"
    }

    helm = {
      source  = "hashicorp/helm"
      version = "2.10.1"
    }

    kubectl = {
      source  = "gavinbunney/kubectl"
      version = ">= 1.7.0"
    }
  }
}

provider "google" {
  project = var.gcp_project_id
  region  = var.region
  credentials = file("${var.credentials_gcp}/service-account.json")
}

provider "kubernetes" {
  # config_path = "~/.kube/config"
  # config_context = "gke_kube-testing-384710_europe-west1_thesis-cluster"
  host                   = "https://${google_container_cluster.primary.endpoint}"
  token                  = data.google_client_config.main.access_token
  cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth.0.cluster_ca_certificate)
}


provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}
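
The kubernetes provider above references data.google_client_config.main, which is not shown here. Presumably it is declared roughly like this (a minimal sketch; only the name "main" is taken from the reference above):

# Supplies an OAuth2 access token for the active Google credentials,
# used as the bearer token in the kubernetes provider above.
data "google_client_config" "main" {}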

The Terraform Kubernetes file:

resource "google_container_cluster" "primary" {
    name = var.name
    location =  var.zone
    remove_default_node_pool = true
    initial_node_count = 1 

    network = google_compute_network.main.self_link
    subnetwork = google_compute_subnetwork.private.self_link

    logging_service = "none"         # "logging.googleapis.com/kubernetes"
    monitoring_service = "none"      # "monotoring/googleapis.com/kubernetes"   
    networking_mode = "VPC_NATIVE"

    # Optional, for multi-zonal cluster
    node_locations = var.multi-zonal ? local.zones : []   # if multi-zonal == true then use the zones in locals, else use []
     

     addons_config {
        http_load_balancing {
          disabled = true
        }

        horizontal_pod_autoscaling {
          disabled = true
        }
     }

     vertical_pod_autoscaling {
       enabled = false
     }

     release_channel {
       channel = "REGULAR"
     }

     workload_identity_config {
       workload_pool = "${var.gcp_project_id}.svc.id.goog"
     }

     ip_allocation_policy {
       cluster_secondary_range_name = "k8s-pod-range"
       services_secondary_range_name = "k8s-service-range"
    depends_on = [ 
      # module.enable_google_apis,
      # module.gcloud
      ]
     }

     private_cluster_config {
       enable_private_nodes = true
       enable_private_endpoint = false
       master_ipv4_cidr_block = "172.16.0.0/28"
     }
}

# Get credentials for cluster
resource "null_resource" "gcloud-connection" {
  provisioner "local-exec" {
    command = "gcloud container clusters get-credentials ${var.name} --zone ${var.zone} --project ${var.gcp_project_id}"
  }

  depends_on = [ google_container_cluster.primary ]
}

# Apply YAML kubernetes-manifest configurations      
resource "null_resource" "apply_deployment" {
  provisioner "local-exec" {
    interpreter = ["bash", "-exc"]
    command     = "kubectl apply -k ${var.filepath_manifest} -n ${var.namespace}"
  }

  depends_on = [ 
    null_resource.gcloud-connection
  ]
}

resource "google_service_account" "kubernetes" {
    account_id = "kubernetes"
}

Am I maybe using the service account incorrectly?
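
For reference, such a service account is usually consumed by a node pool's node_config. A hypothetical sketch of that wiring (the pool name, machine type and scopes here are assumptions, not code from this configuration):

# Hypothetical node pool attaching the "kubernetes" service account via node_config.
resource "google_container_node_pool" "general" {
  name       = "general"
  cluster    = google_container_cluster.primary.id
  node_count = 1

  node_config {
    machine_type    = "e2-medium"
    service_account = google_service_account.kubernetes.email
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
  }
}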

If you need any other piece of code or information, feel free to ask.


There are 2 answers

Νικόλας Καστρινάκης

First, I resolved a couple of issues with creating the cluster itself, and I can now create it without problems. As for the rest, I believe the problem is that the Helm provider requires the kubeconfig file to reach the cluster, but the cluster does not exist yet when Terraform first plans and applies the configuration.

To work around this, the solution I found is to first run:

terraform apply -target=google_container_node_pool.general --auto-approve

This creates the cluster (the node pool target pulls in the cluster itself as a dependency). Then I run:

gcloud container clusters get-credentials <cluster-name> --zone <zone> --project <project-id>

This adds the credentials of the new cluster to the kubeconfig file. Then I run:

terraform apply --auto-approve

This configures the Helm provider with the freshly created cluster's credentials from the new kubeconfig context, so the Helm provider can install charts into the cluster.

It works for now, but I would like to know if there is a way to avoid this whole process every time, using only Terraform and not the terminal.

Also, I don't understand why the get-credentials command doesn't work when run from Terraform; I run it there as well (in the null_resource local-exec), but it doesn't seem to matter.
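
One way to avoid the manual gcloud step is to configure the Helm provider the same way the kubernetes provider is already configured in the question: with the cluster endpoint and an access token instead of a kubeconfig path. A minimal sketch, assuming the same data.google_client_config.main data source; this is an adaptation, not the configuration actually used above:

# Sketch: point the Helm provider at the GKE endpoint directly,
# so no kubeconfig entry is needed before the first apply.
provider "helm" {
  kubernetes {
    host                   = "https://${google_container_cluster.primary.endpoint}"
    token                  = data.google_client_config.main.access_token
    cluster_ca_certificate = base64decode(
      google_container_cluster.primary.master_auth.0.cluster_ca_certificate
    )
  }
}

The access token is short-lived (see the other answer), so very long applies may still need an exec-based approach.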

Friedec

This solution is for AWS, not GCP, but maybe you can apply the same idea. The Kubernetes auth token is short-lived, only 10-15 minutes, and unfortunately it cannot be refreshed automatically, because from the provider configuration's point of view nothing has changed.

You need to use an exec block as in the provider documentation, with a slight adjustment to include the role ARN:

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_name
}

provider "kubernetes" {
  alias                  = "kubernetes"
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    args = [
      "eks", "get-token",
      "--cluster-name", data.aws_eks_cluster.cluster.name,
      "--role-arn", var.iam_admin_role_arn # add role-arn
    ]
    command = "aws"
  }
}
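
For GKE, a hedged adaptation of the same idea is to let an exec plugin mint a fresh token on every provider call instead of baking a short-lived access token into the provider configuration. A minimal sketch, assuming the gke-gcloud-auth-plugin binary is installed and on the PATH; the resource names follow the question's configuration:

# GKE adaptation (sketch): delegate token refresh to gke-gcloud-auth-plugin,
# which reads the active gcloud configuration, so no args are needed.
provider "kubernetes" {
  host                   = "https://${google_container_cluster.primary.endpoint}"
  cluster_ca_certificate = base64decode(
    google_container_cluster.primary.master_auth.0.cluster_ca_certificate
  )

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "gke-gcloud-auth-plugin"
  }
}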