I'm encountering a problem with scheduling Prometheus pods on AWS EKS Fargate within my Kubernetes cluster. Here are the details of my setup:
I have configured an AWS EKS Fargate profile named terraform_eks_fargate_profile_monitoring for my Prometheus namespace (monitoring). Here's the relevant Terraform configuration:
resource "aws_eks_fargate_profile" "terraform_eks_fargate_profile_monitoring" {
fargate_profile_name = "monitoring"
cluster_name = aws_eks_cluster.terraform_eks_cluster.name
pod_execution_role_arn = aws_iam_role.terraform_eks_fargate_pods.arn
subnet_ids = aws_subnet.terraform_eks_vpc_private_subnets[*].id
selector {
namespace = "monitoring"
}
}
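For what it's worth, the profile's status can be checked with the AWS CLI (with <cluster-name> standing in for the actual cluster name):

aws eks describe-fargate-profile --cluster-name <cluster-name> --fargate-profile-name monitoring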
I've deployed Prometheus using Helm with the following command:
helm install prometheus prometheus-community/prometheus -n monitoring
However, when I describe the node-exporter pods, they are stuck in Pending with a FailedScheduling warning:
Name:             prometheus-prometheus-node-exporter-4kvv7
Namespace:        monitoring
Priority:         0
Service Account:  prometheus-prometheus-node-exporter
Node:             <none>
Labels:           app.kubernetes.io/component=metrics
                  app.kubernetes.io/instance=prometheus
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=prometheus-node-exporter
                  app.kubernetes.io/part-of=prometheus-node-exporter
                  app.kubernetes.io/version=1.7.0
                  controller-revision-hash=9bd9c77f
                  helm.sh/chart=prometheus-node-exporter-4.31.0
                  pod-template-generation=1
Annotations:      cluster-autoscaler.kubernetes.io/safe-to-evict: true
Status:           Pending
IP:
IPs:              <none>
Controlled By:    DaemonSet/prometheus-prometheus-node-exporter
Containers:
  node-exporter:
    Image:      quay.io/prometheus/node-exporter:v1.7.0
    Port:       9100/TCP
    Host Port:  9100/TCP
    Args:
      --path.procfs=/host/proc
      --path.sysfs=/host/sys
      --path.rootfs=/host/root
      --path.udev.data=/host/root/run/udev/data
      --web.listen-address=[$(HOST_IP)]:9100
    Liveness:   http-get http://:9100/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:9100/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      HOST_IP:  0.0.0.0
    Mounts:
      /host/proc from proc (ro)
      /host/root from root (ro)
      /host/sys from sys (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  proc:
    Type:          HostPath (bare host directory volume)
    Path:          /proc
    HostPathType:
  sys:
    Type:          HostPath (bare host directory volume)
    Path:          /sys
    HostPathType:
  root:
    Type:          HostPath (bare host directory volume)
    Path:          /
    HostPathType:
QoS Class:         BestEffort
Node-Selectors:    kubernetes.io/os=linux
Tolerations:       :NoSchedule op=Exists
                   node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                   node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                   node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                   node.kubernetes.io/not-ready:NoExecute op=Exists
                   node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                   node.kubernetes.io/unreachable:NoExecute op=Exists
                   node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  21m                default-scheduler  0/5 nodes are available: 1 Too many pods. preemption: 0/5 nodes are available: 5 No preemption victims found for incoming pod.
  Warning  FailedScheduling  15s (x6 over 20m)  default-scheduler  0/7 nodes are available: 1 Too many pods. preemption: 0/7 nodes are available: 7 No preemption victims found for incoming pod.
I get this on all of the node-exporter pods; the pushgateway and metrics pods run fine. I'm not sure whether the resource requests and limits specified in the Prometheus pod configuration are compatible with Fargate.
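For reference, the requests and limits actually set on a pending pod can be inspected with something like this (pod name taken from the describe output above):

kubectl get pod prometheus-prometheus-node-exporter-4kvv7 -n monitoring \
  -o jsonpath='{.spec.containers[0].resources}'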
Since I installed with

helm install prometheus prometheus-community/prometheus -n monitoring

the chart uses its default resource specifications for the Prometheus server container.
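If explicit requests and limits turn out to matter for Fargate scheduling, my understanding is that they can be overridden through the chart values; a minimal sketch, assuming the server.resources key from the chart's values.yaml (the numbers are placeholders, not values I've tested):

# values.yaml
server:
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: "1"
      memory: 1Gi

applied with:

helm upgrade prometheus prometheus-community/prometheus -n monitoring -f values.yaml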