Scheduling AWS EMR cluster resizes
2019-07-22
Below is a sample of how to schedule an Amazon Elastic MapReduce (EMR) cluster resize. It is useful if you have a cluster that is used less at night or on weekends.
I used a Lambda function triggered by a CloudWatch rule. Here is my Python Lambda function:
import boto3

# Allowed range for the requested instance count
MIN = 1
MAX = 10

def lambda_handler(event, context):
    region = event["region"]
    ClusterId = event["ClusterId"]
    InstanceGroupId = event["InstanceGroupId"]
    InstanceCount = int(event["InstanceCount"])
    if MIN <= InstanceCount <= MAX:
        client = boto3.client("emr", region_name=region)
        # Resize the given instance group to the requested count
        response = client.modify_instance_groups(
            ClusterId=ClusterId,
            InstanceGroups=[{
                "InstanceGroupId": InstanceGroupId,
                "InstanceCount": InstanceCount
            }])
        return response
    else:
        # Reject counts outside [MIN, MAX] without calling EMR
        msg = "EMR cluster id %s (%s), instance group %s: InstanceCount=%d is NOT allowed [%d,%d]" % (
            ClusterId, region, InstanceGroupId, InstanceCount, MIN, MAX)
        return {"response": "ko", "message": msg}
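The handler expects the InstanceGroupId in the event. If you do not have it at hand, a minimal sketch like the one below could look it up with the EMR list_instance_groups call. The helper names pick_instance_group and find_instance_group_id are my own, not part of boto3, and I assume the cluster has few enough instance groups that the first page of results is sufficient.

```python
def pick_instance_group(response, group_type="CORE"):
    """Return the Id of the first instance group of the given type, or None.

    `response` is the dict returned by the EMR list_instance_groups call.
    """
    for group in response.get("InstanceGroups", []):
        if group.get("InstanceGroupType") == group_type:
            return group["Id"]
    return None


def find_instance_group_id(cluster_id, region, group_type="CORE"):
    """Hypothetical wrapper: query EMR and pick the matching group Id."""
    # Imported here so pick_instance_group can be tried without AWS access.
    import boto3

    client = boto3.client("emr", region_name=region)
    # Assumption: a single page of results is enough (no Marker handling).
    response = client.list_instance_groups(ClusterId=cluster_id)
    return pick_instance_group(response, group_type)
```

The group type is one of MASTER, CORE or TASK. For a nightly scale-down you will probably want to resize the TASK or CORE group, never the MASTER one.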
The CloudWatch rule passes a constant JSON object as the input event, like:
{"region": "eu-west-1","ClusterId": "j-dsds","InstanceGroupId": "ig-sdsd","InstanceCount": 8}
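I created the rule by hand, but it could also be created with boto3. The sketch below is only an illustration under a few assumptions: the function names build_schedule and apply_schedule, the rule name and the cron schedule are hypothetical, and the Lambda function is assumed to be already deployed. It builds the arguments for put_rule and put_targets with the constant JSON input, then grants CloudWatch Events permission to invoke the function.

```python
import json


def build_schedule(name, cron, instance_count, lambda_arn, base_event):
    """Build keyword arguments for events.put_rule and events.put_targets."""
    # Copy the base event and set the instance count for this schedule.
    event = dict(base_event, InstanceCount=instance_count)
    rule = {
        "Name": name,
        "ScheduleExpression": "cron(%s)" % cron,
        "State": "ENABLED",
    }
    targets = {
        "Rule": name,
        # Input is the constant JSON object handed to the Lambda function.
        "Targets": [{"Id": "1", "Arn": lambda_arn, "Input": json.dumps(event)}],
    }
    return rule, targets


def apply_schedule(rule, targets, function_name, region):
    """Create the rule, allow it to invoke the function, attach the target."""
    # Imported here so build_schedule can be tried without AWS access.
    import boto3

    events = boto3.client("events", region_name=region)
    rule_arn = events.put_rule(**rule)["RuleArn"]
    boto3.client("lambda", region_name=region).add_permission(
        FunctionName=function_name,
        StatementId=rule["Name"] + "-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    events.put_targets(**targets)
```

For example, build_schedule("emr-scale-down", "0 20 ? * MON-FRI *", 2, lambda_arn, base_event) would describe a rule that shrinks the group to 2 instances at 20:00 UTC on weekdays. A second rule with a morning cron expression and a larger InstanceCount would scale it back up.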