Amazon CloudWatch alarms for automated rollback let Amazon ECS monitor and automatically react to changes during a rolling update. This enables customers to add automatic safeguards to Amazon ECS service rolling updates, automate the discovery and remediation of failed deployments, and minimize the impact of a bad change. Customers can use the Amazon ECS application programming interfaces (APIs) or the Amazon ECS console to configure CloudWatch alarms for rolling updates. You can configure one or more CloudWatch metric alarms to determine whether a deployment was successful.
How automated rollback works
Amazon ECS customers use rolling updates to deploy their services to a cluster. During a rolling update, the Amazon ECS scheduler replaces the currently running tasks with new tasks on a rolling basis until the desired count is reached. The deployment configuration controls the number of tasks the scheduler adds to or removes from the service during the update. The new tasks make up the primary deployment, whereas the previous tasks are known as the active deployment. This feature lets you handle rollback automation natively through the Amazon ECS API, without additional services. When you create or update a service through the Amazon ECS API using a JSON string, you can configure one or more Amazon CloudWatch metric alarms in the deploymentConfiguration field.
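As a minimal sketch of what such a request body could look like, the Python snippet below builds the deploymentConfiguration part of an UpdateService-style request and prints it as JSON. The cluster, service, and alarm names are placeholders, and the helper function is hypothetical; only the alarms, alarmNames, enable, and rollback field names are taken from the Amazon ECS API.

```python
import json


def build_update_service_request(cluster, service, alarm_names, rollback=True):
    """Return a request body that enables deployment alarms for a service.

    Hypothetical helper for illustration; it only assembles a dictionary
    and does not call AWS.
    """
    if not alarm_names:
        raise ValueError("at least one CloudWatch alarm name is required")
    return {
        "cluster": cluster,
        "service": service,
        "deploymentConfiguration": {
            "alarms": {
                # CloudWatch alarms that ECS watches during the deployment
                "alarmNames": list(alarm_names),
                # Turn alarm monitoring on for the deployment
                "enable": True,
                # Roll back automatically when an alarm fires
                "rollback": rollback,
            }
        },
    }


# Placeholder names, not taken from a real account.
request = build_update_service_request(
    "demo-cluster", "demo-service", ["alarm1Name", "alarm2Name"]
)
print(json.dumps(request, indent=2))
```

The resulting dictionary has the same shape an SDK client would expect for the corresponding parameters, so it could be adapted to a real call once actual resource names are filled in.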
With the AWS CLI, the same alarm settings are passed through the --deployment-configuration option.
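The following sketch assembles such a command line in Python, assuming the AWS CLI shorthand syntax for the alarms structure. The helper names and the cluster, service, and alarm names are made up for illustration; the script only prints the command and does not run it.

```python
import shlex


def deployment_alarm_shorthand(alarm_names, enable=True, rollback=True):
    """Format the alarms structure in AWS CLI shorthand syntax."""
    names = ",".join(alarm_names)
    return (
        f"alarms={{alarmNames=[{names}],"
        f"enable={str(enable).lower()},rollback={str(rollback).lower()}}}"
    )


def update_service_command(cluster, service, alarm_names):
    """Build the argument list for an `aws ecs update-service` call."""
    return [
        "aws", "ecs", "update-service",
        "--cluster", cluster,
        "--service", service,
        "--deployment-configuration", deployment_alarm_shorthand(alarm_names),
    ]


# Placeholder names; replace with real resources before running the command.
cmd = update_service_command(
    "demo-cluster", "demo-service", ["alarm1Name", "alarm2Name"]
)
print(shlex.join(cmd))
```

Building the command as a list and quoting it with shlex.join keeps the braces and brackets of the shorthand value from being interpreted by the shell.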
As Figure 1 below shows, during a rolling update Amazon ECS starts monitoring the configured Amazon CloudWatch alarms as soon as one or more tasks of the updated service are running. If no alarms are configured, the deployment is complete once the primary deployment is healthy and has reached the desired count and the active deployment has scaled down to zero. With alarms configured, the deployment continues for an additional period, the bake time, during which the primary deployment remains in the IN_PROGRESS state. Amazon ECS calculates the length of the bake time from the properties of the Amazon CloudWatch alarms. If no alarm has been triggered by the end of the bake time and all alarms remain in the OK state, the deployment is considered a success.
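The completion rules described above can be modeled with a small decision function. This is a simplified sketch and not how Amazon ECS is implemented: the function name and parameters are hypothetical, the bake time is passed in as a plain number because its exact calculation is not covered here, and the sketch assumes that only the ALARM state fails a deployment.

```python
from enum import Enum


class RolloutState(Enum):
    IN_PROGRESS = "IN_PROGRESS"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


def evaluate_rollout(alarm_states, primary_at_desired_count, active_task_count,
                     bake_elapsed_s=0, bake_time_s=0):
    """Model the rollout outcome at one point in time.

    alarm_states maps alarm name to a CloudWatch state ("OK", "ALARM",
    "INSUFFICIENT_DATA"). An empty mapping means no alarms are configured.
    """
    # Assumption: any alarm in the ALARM state fails the deployment.
    if any(state == "ALARM" for state in alarm_states.values()):
        return RolloutState.FAILED
    # New tasks are not all healthy yet, or old tasks are still running.
    if not primary_at_desired_count or active_task_count > 0:
        return RolloutState.IN_PROGRESS
    # Without alarms there is no bake time, so the deployment is done.
    if not alarm_states:
        return RolloutState.COMPLETED
    # With alarms, stay in progress until the bake time has elapsed.
    if bake_elapsed_s < bake_time_s:
        return RolloutState.IN_PROGRESS
    return RolloutState.COMPLETED


# Tasks are replaced, but the (assumed) 300-second bake time is not over yet.
print(evaluate_rollout({"alarm1Name": "OK"}, True, 0,
                       bake_elapsed_s=120, bake_time_s=300))
```

Calling the function repeatedly with fresh alarm states mimics polling a deployment until it leaves the IN_PROGRESS state.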
As Figure 2 below shows, Amazon ECS begins the rollback if an Amazon CloudWatch alarm is triggered. It sends a notification about the failed deployment via the event bus and changes the status of the current deployment to FAILED. The active deployment becomes the primary deployment again and is scaled back up to the desired count, while the failed deployment is scaled down and deleted.
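To react to the failure notification, a rule on the event bus can filter for failed deployments. The sketch below assumes the Amazon EventBridge event format for Amazon ECS deployment state changes (detail-type "ECS Deployment State Change" with eventName SERVICE_DEPLOYMENT_FAILED). The matcher is a deliberately tiny stand-in for EventBridge pattern matching, and the sample event is trimmed and uses made-up values.

```python
import json

# Event pattern that selects failed ECS service deployments.
FAILED_DEPLOYMENT_PATTERN = {
    "source": ["aws.ecs"],
    "detail-type": ["ECS Deployment State Change"],
    "detail": {"eventName": ["SERVICE_DEPLOYMENT_FAILED"]},
}


def matches(pattern, event):
    """Minimal matcher: lists mean 'any of these values', dicts nest."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical, trimmed event; identifiers and reason text are invented.
sample_event = {
    "source": "aws.ecs",
    "detail-type": "ECS Deployment State Change",
    "detail": {
        "eventType": "ERROR",
        "eventName": "SERVICE_DEPLOYMENT_FAILED",
        "deploymentId": "ecs-svc/0000000000000000000",
        "reason": "example reason text",
    },
}

print(json.dumps(FAILED_DEPLOYMENT_PATTERN, indent=2))
print(matches(FAILED_DEPLOYMENT_PATTERN, sample_event))
```

The printed pattern could be used as the event pattern of a rule that forwards failed deployments to a notification target of your choice.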
If your application is instrumented with the OpenTelemetry SDK, you can use AWS Distro for OpenTelemetry to export application metrics to Amazon CloudWatch.
For Application Load Balancers, AWS recommends alarming on the HTTPCode_ELB_5XX_Count and HTTPCode_ELB_4XX_Count metrics to check for spikes in HTTP error codes. For existing applications, the CPUUtilization metric can be used to monitor CPU consumption and the MemoryUtilization metric to monitor memory consumption.
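As an illustration of how alarms on these metrics might be defined, the sketch below builds two parameter sets shaped like the input of the CloudWatch PutMetricAlarm API. The helper function, thresholds, periods, alarm names, and resource identifiers are all assumptions chosen for the example, not recommended values.

```python
def metric_alarm(name, namespace, metric, dimensions, statistic, threshold,
                 period_s=60, evaluation_periods=2):
    """Return keyword arguments shaped like CloudWatch PutMetricAlarm input."""
    return {
        "AlarmName": name,
        "Namespace": namespace,
        "MetricName": metric,
        "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
        "Statistic": statistic,
        "Period": period_s,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # Assumption: treat periods without data as healthy.
        "TreatMissingData": "notBreaching",
    }


# Alarm on load balancer 5XX errors (placeholder load balancer identifier).
elb_5xx = metric_alarm(
    "alarm1Name", "AWS/ApplicationELB", "HTTPCode_ELB_5XX_Count",
    {"LoadBalancer": "app/demo-alb/0123456789abcdef"},
    statistic="Sum", threshold=10,
)

# Alarm on service CPU utilization in percent (placeholder cluster and service).
cpu_high = metric_alarm(
    "alarm2Name", "AWS/ECS", "CPUUtilization",
    {"ClusterName": "demo-cluster", "ServiceName": "demo-service"},
    statistic="Average", threshold=80,
)

for alarm in (elb_5xx, cpu_high):
    print(alarm["AlarmName"], alarm["Namespace"], alarm["MetricName"])
```

With an SDK such as boto3, each dictionary could be passed as keyword arguments to the CloudWatch client's put_metric_alarm call, and the alarm names then listed in the service's deployment configuration.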
Conclusion
Amazon CloudWatch alarms for automated rollback can be used via the AWS CLI, an AWS SDK, or AWS CloudFormation. The feature applies to Amazon ECS services, where CloudWatch alarms are used to monitor and automatically react to changes during a rolling update.
Metclouds Technologies is here to help you automate your rollbacks with Amazon CloudWatch.