Automate the CloudWatch Agent configuration upon instance size change

Hetul Sheth
Mar 11, 2023

Hello Folks,

In this scenario, we already have instances with the CloudWatch Agent set up. Their disk and memory usage is monitored as custom metrics in CloudWatch, and alarms are configured on those metrics.

There have been scenarios where a customer intentionally changes the instance size, which unintentionally causes these alarms to go from the ‘OK’/‘IN ALARM’ state to the ‘INSUFFICIENT DATA’ state. That happens because when you set up metrics for disk or memory, you provide a CloudWatch Agent configuration file, and that file determines which custom metrics get monitored.[1]

In my case the agent configuration file used is:
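For illustration only, here is a minimal sketch of what a Windows agent configuration for disk and memory could look like, written as a Python dict so it can be rendered to JSON. This is not the exact file from my setup: the counters, the 60-second interval, and the `InstanceType` entry under `append_dimensions` are assumptions, and the helper names are hypothetical.

```python
import json


def build_agent_config(interval_seconds: int = 60) -> dict:
    """Build an illustrative CloudWatch Agent config for a Windows instance.

    Assumption: disk and memory are collected through the LogicalDisk and
    Memory performance counter objects, and the instance type is appended
    as a dimension on every metric.
    """
    return {
        "metrics": {
            "append_dimensions": {
                "InstanceId": "${aws:InstanceId}",
                "InstanceType": "${aws:InstanceType}",
            },
            "metrics_collected": {
                "LogicalDisk": {
                    "measurement": ["% Free Space"],
                    "metrics_collection_interval": interval_seconds,
                    "resources": ["*"],
                },
                "Memory": {
                    "measurement": ["% Committed Bytes In Use"],
                    "metrics_collection_interval": interval_seconds,
                },
            },
        }
    }


def render_agent_config(interval_seconds: int = 60) -> str:
    """Serialize the config to the JSON text that would be stored or uploaded."""
    return json.dumps(build_agent_config(interval_seconds), indent=2)


if __name__ == "__main__":
    print(render_agent_config())
```

If the instance type is part of the metric dimensions, as assumed here, an alarm bound to the old dimension value would stop receiving data after a resize.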

To remediate the issue, I set up an automation that detects when the instance size/type changes. That event triggers a Lambda function, which runs the SSM document ‘AmazonCloudWatch-ManageAgent’ through Run Command. The document reconfigures the CW agent from the latest configuration file, so the agent picks up the new instance size and the metrics are reconfigured, which keeps the alarms in a functional state.

To set this up, we’ll use EventBridge to detect the event, Lambda to filter for the event name ModifyInstanceAttribute with the ‘instance_type’ attribute, and the SSM send_command call to trigger Run Command with the desired parameters.

Here is the EventBridge rule I used:
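As a hedged sketch of such a rule (not necessarily identical to the one in my account), the event pattern below matches the `ModifyInstanceAttribute` API call recorded by CloudTrail. It assumes a CloudTrail trail is logging management events, since API-call events reach EventBridge that way. The `matches` helper is hypothetical and only approximates exact-value matching locally so the pattern can be tried without AWS.

```python
import json

# Event pattern for the EventBridge rule (assumed shape, based on the
# standard "AWS API Call via CloudTrail" event format).
EVENT_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["ec2.amazonaws.com"],
        "eventName": ["ModifyInstanceAttribute"],
    },
}


def matches(event: dict, pattern: dict = EVENT_PATTERN) -> bool:
    """Local approximation of EventBridge exact-value matching.

    Each pattern key must be present in the event; a list means "the event
    value must be one of these", a dict means "recurse into the nested object".
    """
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(actual, expected):
                return False
        elif actual not in expected:
            return False
    return True


if __name__ == "__main__":
    # This JSON text is what would go into the rule's event pattern field.
    print(json.dumps(EVENT_PATTERN, indent=2))
```

The pattern deliberately stays broad: it matches every `ModifyInstanceAttribute` call, and the narrower check for an instance type change is left to the Lambda function.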

This triggers a Lambda function, which has permission to access EventBridge and SSM.

Here is the Lambda code:
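The following is a minimal sketch of such a handler, written under stated assumptions rather than as the exact function I deployed. It assumes the CloudTrail event carries `requestParameters.instanceId` and an `instanceType` key when the type is changed, uses a hypothetical default delay of 180 seconds, and accepts an optional `ssm` client and `delay_seconds` argument purely so it can be exercised locally. The parameter names are the ones defined by the ‘AmazonCloudWatch-ManageAgent’ document.

```python
import time

DOCUMENT_NAME = "AmazonCloudWatch-ManageAgent"
CONFIG_PARAMETER = "AmazonCloudWatch-windows"  # SSM Parameter Store name
DEFAULT_DELAY_SECONDS = 180  # hypothetical value; tune to your boot time


def extract_resized_instance(event):
    """Return the instance ID if the event is an instance type change, else None."""
    detail = event.get("detail", {})
    if detail.get("eventName") != "ModifyInstanceAttribute":
        return None
    params = detail.get("requestParameters") or {}
    # Assumption: a type change shows up as an "instanceType" request parameter.
    if "instanceType" not in params:
        return None
    return params.get("instanceId")


def build_command(instance_id):
    """Build the keyword arguments for ssm.send_command."""
    return {
        "DocumentName": DOCUMENT_NAME,
        "InstanceIds": [instance_id],
        "Parameters": {
            "action": ["configure"],
            "mode": ["ec2"],
            "optionalConfigurationSource": ["ssm"],
            "optionalConfigurationLocation": [CONFIG_PARAMETER],
            "optionalRestart": ["yes"],
        },
    }


def lambda_handler(event, context, ssm=None, delay_seconds=DEFAULT_DELAY_SECONDS):
    instance_id = extract_resized_instance(event)
    if instance_id is None:
        return {"status": "ignored"}

    # Give the instance time to come back up before sending the command.
    time.sleep(delay_seconds)

    if ssm is None:
        import boto3  # imported lazily so the helpers can be tested without AWS

        ssm = boto3.client("ssm")

    response = ssm.send_command(**build_command(instance_id))
    return {"status": "sent", "commandId": response["Command"]["CommandId"]}
```

Lambda only passes `event` and `context`, so the extra arguments keep their defaults in production. The function timeout has to be longer than the delay.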

In the Lambda code, I stored the config file in SSM Parameter Store under the name ‘AmazonCloudWatch-windows’ (this is the same config I provided earlier), hence the parameter ‘optionalConfigurationLocation’: [‘AmazonCloudWatch-windows’].

Most of my time here was spent digging for the exact parameter names to use. I was getting a lot of errors, but then I reviewed the SSM document in detail and found the exact syntax.

It is also necessary to add a sleep. When the instance size changes, the event triggers Lambda immediately, but we need to wait until the instance is properly running with Session Manager active before SSM can run the command.
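A fixed sleep is what I used, but as an alternative, the wait could be done by polling. The sketch below is a swapped-in technique, not the code from my function: it checks that EC2 reports the instance as `running` and that SSM reports its `PingStatus` as `Online`, and gives up after a timeout. The function names, the timeout, and the poll interval are hypothetical, and the `sleep` and `clock` arguments exist only to make it testable.

```python
import time


def is_ready(ec2, ssm, instance_id):
    """True when the instance is running and its SSM agent reports Online."""
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    states = [
        instance["State"]["Name"]
        for reservation in reservations
        for instance in reservation["Instances"]
    ]
    if states != ["running"]:
        return False

    response = ssm.describe_instance_information(
        Filters=[{"Key": "InstanceIds", "Values": [instance_id]}]
    )
    infos = response.get("InstanceInformationList", [])
    return any(info.get("PingStatus") == "Online" for info in infos)


def wait_until_ready(ec2, ssm, instance_id, timeout_seconds=600, poll_seconds=15,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the instance is ready; return False if the timeout expires."""
    deadline = clock() + timeout_seconds
    while not is_ready(ec2, ssm, instance_id):
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
    return True
```

In a handler this would be called with `boto3.client("ec2")` and `boto3.client("ssm")` before `send_command`. The timeout still has to fit within the Lambda function's own timeout, and the function's role would need the matching describe permissions.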

Hope you find this helpful!

[1] https://aws.amazon.com/premiumsupport/knowledge-center/cloudwatch-memory-metrics-ec2/

Hetul Sheth

AWS Certified Solutions Architect, Developer and SysOps Admin Associate | Azure Certified