I am using Terraform to create an ECS cluster with an EC2 instance. My goal is to have a single task running on only one EC2 instance. I am managing both the capacity provider and auto-scaling for this cluster.
Initially, deploying a task to an EC2 instance runs smoothly. However, when I deploy a new task definition to replace the existing task, ECS keeps the new task in a PROVISIONING state. The task stays in this state until I change the auto-scaling group's max_size from 1 to 2. Once I do that, the new task is deployed on a new EC2 instance, and the previous instance is removed after some time by a scale-in action tied to a CloudWatch alarm (CapacityProviderReservation < 100 for 15 datapoints within 15 minutes).
For now, in my non-production environment, I want to keep only one instance in the cluster and have successive deployments of the task land on that same instance.
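From my reading of the aws_ecs_service documentation, the deployment percentages decide whether ECS may stop the running task before its replacement starts. Something like the following (an untested sketch against my existing service resource) might let the replacement be placed on the same instance, at the cost of a brief downtime during each deployment:

resource "aws_ecs_service" "this" {
  # ... existing arguments unchanged ...
  desired_count = 1

  # Let ECS drop to 0 healthy tasks during a deployment, so the old
  # task is stopped first and no extra instance capacity is required.
  deployment_minimum_healthy_percent = 0
  deployment_maximum_percent         = 100
}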
Terraform code:
# ECS service
resource "aws_ecs_service" "this" {
  name            = "cluster"
  iam_role        = aws_iam_role.ecs_role.arn
  cluster         = aws_ecs_cluster.cluster.id
  task_definition = aws_ecs_task_definition.task.arn
  desired_count   = 1

  force_new_deployment = true

  load_balancer {
    target_group_arn = aws_alb_target_group.lb.arn
    container_name   = aws_ecs_task_definition.task.family
    container_port   = 80
  }

  ordered_placement_strategy {
    type  = "binpack"
    field = "memory"
  }

  capacity_provider_strategy {
    capacity_provider = aws_ecs_capacity_provider.ecs_capacity_provider.name
    base              = 1
    weight            = 100
  }

  lifecycle {
    create_before_destroy = true
    ignore_changes = [
      desired_count
    ]
  }
}
# Auto scaling
resource "aws_autoscaling_group" "ecs_asg" {
  name                  = "asg"
  vpc_zone_identifier   = [for subnet in var.public_subnet_ids : subnet]
  max_size              = 1
  min_size              = 1
  desired_capacity      = 1
  health_check_type     = "EC2"
  protect_from_scale_in = false

  launch_template {
    id      = aws_launch_template.template.id
    version = "$Latest"
  }

  instance_refresh {
    strategy = "Rolling"
  }

  lifecycle {
    create_before_destroy = true
  }
}
# Capacity provider
resource "aws_ecs_capacity_provider" "ecs_capacity_provider" {
  name = "ecs_capacity_provider"

  auto_scaling_group_provider {
    auto_scaling_group_arn         = aws_autoscaling_group.ecs_asg.arn
    managed_termination_protection = "DISABLED"

    managed_scaling {
      maximum_scaling_step_size = 2
      minimum_scaling_step_size = 1
      status                    = "ENABLED"
      target_capacity           = 100
    }
  }
}

resource "aws_ecs_cluster_capacity_providers" "ecs_capacity_providers" {
  cluster_name       = aws_ecs_cluster.cluster.name
  capacity_providers = [aws_ecs_capacity_provider.ecs_capacity_provider.name]

  default_capacity_provider_strategy {
    base              = 1
    weight            = 100
    capacity_provider = aws_ecs_capacity_provider.ecs_capacity_provider.name
  }
}
# ECS task
resource "aws_ecs_task_definition" "task" {
  family = "task"

  container_definitions = jsonencode([
    {
      name      = "task"
      image     = "test"
      cpu       = 768
      memory    = 4096
      essential = true
      portMappings = [
        {
          containerPort = 80
          hostPort      = 80
          protocol      = "tcp"
        }
      ]
      logConfiguration = {
        logDriver = "awslogs"
        options = {
          "awslogs-group"         = aws_cloudwatch_log_group.logs.name
          "awslogs-region"        = var.region
          "awslogs-stream-prefix" = "app"
        }
      }
    }
  ])

  execution_role_arn = aws_iam_role.ecs_exec.arn
  task_role_arn      = aws_iam_role.ecs_task.arn
}
I also noticed a related CloudWatch alarm, linked to the capacity provider's managed scaling, that is triggered when I attempt the second task deployment:
Alarm: "TargetTracking-test-ecs-asg-AlarmHigh-e5a4556-5686-5669-26546-e745a5ed90cb"
