Optimizing Cumulative AWS Costs with Lambda - Real-Time Cost Control and Automated Cleanup

Cost control in AWS is critical, especially when handling multiple resources across dynamic environments. Here, I'll show you how to build a Lambda function that tracks cumulative costs in real-time, then triggers a cleanup if a budget threshold (like $3) is exceeded—all without manual intervention. This approach combines AWS CloudTrail, CloudWatch, and DynamoDB to monitor usage and enforce cost limits through dynamic configurations.


Objective: Enforce Cost Control Dynamically 💡

By monitoring cumulative costs and using automated cleanup, you:

  • Control Spending: Ensure costs stay within the $3 limit across all services.

  • Automate Cleanup: Trigger resource deletion automatically if cumulative costs exceed the budget.

  • Keep Code Lean: Rely on DynamoDB configurations for each resource, keeping the code generic and adaptable.


Step 1: Set Up Cleanup Configurations in DynamoDB 🗃️

The foundation of this Lambda function is DynamoDB, which holds cleanup configurations for each resource type. By specifying actions for each resource (pre-cleanup, main action, and post-cleanup), you create a flexible and easily manageable setup.

Example DynamoDB Configuration

For an EC2 instance type with pre-cleanup and post-cleanup actions:

{
    "ServiceName": "EC2",
    "ResourceType": "t2.micro",
    "CleanupActions": {
        "main_action": {
            "method": "terminate_instances",
            "params": {
                "InstanceIds": ["{{InstanceId}}"]
            }
        },
        "pre_cleanup": [
            {
                "method": "detach_volume",
                "params": {
                    "VolumeId": "{{volume_id}}"
                }
            }
        ],
        "post_cleanup": [
            {
                "method": "delete_snapshots",
                "params": {
                    "SnapshotId": "{{snapshot_id}}"
                }
            }
        ]
    },
    "CostThreshold": 0.50
}

This configuration dynamically applies actions depending on the resource type and cost threshold, simplifying the code and making it scalable.
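The cleanup code in Step 4 calls a `get_cleanup_config` helper to read these items. Here is a minimal sketch of what it might look like; the table name `CleanupConfig` and the `ServiceName`/`ResourceType` key schema are assumptions for illustration:

```python
def parse_cleanup_item(item):
    # Pull the action plan and per-resource cost threshold out of a
    # DynamoDB item shaped like the example configuration above.
    return item.get('CleanupActions', {}), item.get('CostThreshold')

def get_cleanup_config(service_name, resource_type, table_name='CleanupConfig'):
    # Table name and key schema are assumptions; adjust to your table.
    import boto3  # requires AWS credentials at runtime
    table = boto3.resource('dynamodb').Table(table_name)
    response = table.get_item(
        Key={'ServiceName': service_name, 'ResourceType': resource_type}
    )
    # get_item returns no 'Item' key when nothing matches, so default to {}
    return parse_cleanup_item(response.get('Item', {}))
```

Keeping the parsing separate from the DynamoDB call makes the config handling easy to unit test without AWS access.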


Step 2: Capture Recently Created Resources Using CloudTrail 📄

We use CloudTrail to identify resources created within the past three hours, so the Lambda function only tracks and analyzes the cost of newly spun-up resources.

Code for Fetching Recent Resources

import boto3
import json
from datetime import datetime, timedelta

cloudtrail = boto3.client('cloudtrail')

def fetch_resources_created_in_last_3_hours():
    three_hours_ago = datetime.utcnow() - timedelta(hours=3)
    created_resources = []

    # Fetch events in CloudTrail from the last 3 hours
    events = cloudtrail.lookup_events(
        StartTime=three_hours_ago,
        EndTime=datetime.utcnow()
    )

    for event in events['Events']:
        event_details = json.loads(event['CloudTrailEvent'])
        for resource in event_details.get('resources', []):
            created_resources.append({
                "resourceType": resource.get('resourceType'),
                "resourceName": resource.get('resourceName')
            })

    print(f"Resources created in last 3 hours: {created_resources}")
    return created_resources

This generic function captures all recent resources, keeping the code flexible for multiple service types.
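One caveat: `lookup_events` returns at most 50 events per call, so a busy account needs pagination. A hedged variant using boto3's built-in paginator, with the event parsing split out for testability, might look like this:

```python
import json
from datetime import datetime, timedelta

def extract_resources(event):
    # Parse one CloudTrail event record into resourceType/resourceName pairs.
    details = json.loads(event['CloudTrailEvent'])
    return [
        {"resourceType": r.get('resourceType'), "resourceName": r.get('resourceName')}
        for r in details.get('resources', [])
    ]

def fetch_all_recent_resources(hours=3):
    # lookup_events caps each page at 50 events, so walk every page.
    import boto3  # requires AWS credentials at runtime
    cloudtrail = boto3.client('cloudtrail')
    start = datetime.utcnow() - timedelta(hours=hours)
    resources = []
    paginator = cloudtrail.get_paginator('lookup_events')
    for page in paginator.paginate(StartTime=start, EndTime=datetime.utcnow()):
        for event in page['Events']:
            resources.extend(extract_resources(event))
    return resources
```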


Step 3: Calculate Cumulative Costs Using CloudWatch and AWS Pricing 📊

The Lambda function calculates the cumulative cost of these resources by fetching usage data from CloudWatch and pricing from AWS Pricing API. This gives real-time cost data, allowing the function to act as soon as the cumulative cost crosses the threshold.

Code for Calculating Cumulative Costs

pricing = boto3.client('pricing', region_name='us-east-1')
cloudwatch = boto3.client('cloudwatch')

def fetch_cumulative_usage(created_resources):
    cumulative_cost = 0.0

    for resource in created_resources:
        service_name = resource['resourceType']
        resource_id = resource['resourceName']

        # Fetch usage metrics (e.g., CPUUtilization for EC2)
        usage = fetch_usage(
            service_name,
            metric_name="CPUUtilization", 
            namespace=f"AWS/{service_name}",
            dimensions=[{"Name": "InstanceId", "Value": resource_id}]
        )

        # Fetch pricing for the specific usage type
        usage_type = get_usage_type(service_name)
        price_per_unit = fetch_pricing(service_name, usage_type)

        if price_per_unit is not None:
            cumulative_cost += usage * price_per_unit
        print(f"Cumulative cost so far: ${cumulative_cost:.2f}")

    print(f"Total cumulative cost over last 3 hours: ${cumulative_cost:.2f}")
    return cumulative_cost

def fetch_pricing(service_code, usage_type):
    # Note: the Pricing API expects service codes like "AmazonEC2",
    # not the short names CloudTrail reports (e.g. "EC2").
    response = pricing.get_products(
        ServiceCode=service_code,
        Filters=[{'Type': 'TERM_MATCH', 'Field': 'usagetype', 'Value': usage_type}]
    )
    if not response['PriceList']:
        return None
    price_list = json.loads(response['PriceList'][0])
    price_dimension = list(price_list['terms']['OnDemand'].values())[0]['priceDimensions']
    return float(list(price_dimension.values())[0]['pricePerUnit']['USD'])

def fetch_usage(service_name, metric_name, namespace, dimensions):
    usage_data = cloudwatch.get_metric_statistics(
        Namespace=namespace,
        MetricName=metric_name,
        Dimensions=dimensions,
        StartTime=datetime.utcnow() - timedelta(hours=1),
        EndTime=datetime.utcnow(),
        Period=3600,
        Statistics=['Sum']
    )
    usage = usage_data['Datapoints'][0]['Sum'] if usage_data['Datapoints'] else 0
    return usage

This code calculates cumulative costs across all new resources, ensuring the function can terminate resources when cumulative costs cross the threshold.
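The `get_usage_type` helper referenced above maps a service to its Pricing API usage type. A minimal sketch follows; the mapping values are illustrative only, since real usage types vary by region and instance size (e.g. `USE1-BoxUsage:t2.micro` in us-east-1), and it also maps CloudTrail's short names to the service codes the Pricing API expects:

```python
# Illustrative mappings; extend these for the services you actually run.
USAGE_TYPE_MAP = {
    "EC2": "BoxUsage:t2.micro",
    "S3": "TimedStorage-ByteHrs",
}

# CloudTrail reports short names, but the Pricing API wants full codes.
PRICING_SERVICE_CODES = {
    "EC2": "AmazonEC2",
    "S3": "AmazonS3",
}

def get_usage_type(service_name):
    return USAGE_TYPE_MAP.get(service_name)

def get_pricing_service_code(service_name):
    # Fall back to the given name if we have no explicit mapping.
    return PRICING_SERVICE_CODES.get(service_name, service_name)
```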


Step 4: Perform Cleanup Actions if Costs Exceed Threshold 🧹

When cumulative costs exceed $3, the Lambda function triggers cleanup actions as defined in DynamoDB. By applying pre-cleanup, main, and post-cleanup actions generically, the function ensures that resources are terminated efficiently without manual oversight.
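The threshold check itself lives in the Lambda handler, which ties the earlier steps together. A sketch of that wiring (the $3 limit is the budget discussed above; the called functions are the ones defined in Steps 2 through 4):

```python
BUDGET_LIMIT = 3.00  # dollars; the cumulative budget discussed above

def should_clean_up(cumulative_cost, limit=BUDGET_LIMIT):
    # Pure threshold check, kept separate so it is trivially testable.
    return cumulative_cost > limit

def lambda_handler(event, context):
    resources = fetch_resources_created_in_last_3_hours()  # Step 2
    cumulative_cost = fetch_cumulative_usage(resources)    # Step 3
    if should_clean_up(cumulative_cost):
        print(f"Budget of ${BUDGET_LIMIT:.2f} exceeded; starting cleanup")
        perform_generic_cleanup(resources)                 # Step 4
    else:
        print(f"Cost ${cumulative_cost:.2f} within budget; no action taken")
    return {"cumulative_cost": cumulative_cost}
```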

Code for Executing Cleanup

def perform_generic_cleanup(resources):
    for resource in resources:
        service_name = resource['resourceType']
        resource_name = resource['resourceName']

        # Fetch config for each resource type
        cleanup_actions, _ = get_cleanup_config(service_name, resource_name)
        # boto3 raises for unknown service names, so create the client defensively
        try:
            service_client = boto3.client(service_name.lower())
        except Exception as e:
            print(f"No client available for service {service_name}: {e}")
            continue

        # Execute pre-cleanup, main action, and post-cleanup actions
        execute_generic_action(service_client, cleanup_actions.get('pre_cleanup', []))
        execute_generic_action(service_client, [cleanup_actions.get('main_action', {})])
        execute_generic_action(service_client, cleanup_actions.get('post_cleanup', []))

def execute_generic_action(service_client, action_list):
    for action in action_list:
        if action:
            method_name = action['method']
            params = replace_placeholders(action.get('params', {}), actual_values={})
            try:
                method = getattr(service_client, method_name)
                method(**params)
                print(f"Executed {method_name} with parameters: {params}")
            except Exception as e:
                print(f"Error executing {method_name} with parameters {params}: {e}")

Using DynamoDB-configured actions, this function applies pre-configured cleanup steps to each resource dynamically, keeping the code lean and manageable.
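The `replace_placeholders` call above substitutes the `{{...}}` tokens stored in DynamoDB (like `{{InstanceId}}` in the Step 1 example) with live resource values. A minimal sketch of that helper, recursing through nested dicts and lists:

```python
import re

def replace_placeholders(params, actual_values):
    # Recursively swap "{{name}}" tokens for real values,
    # e.g. {"InstanceIds": ["{{InstanceId}}"]} + {"InstanceId": "i-0abc"}.
    if isinstance(params, dict):
        return {k: replace_placeholders(v, actual_values) for k, v in params.items()}
    if isinstance(params, list):
        return [replace_placeholders(v, actual_values) for v in params]
    if isinstance(params, str):
        # Leave unknown placeholders untouched rather than guessing.
        return re.sub(
            r"\{\{(\w+)\}\}",
            lambda m: str(actual_values.get(m.group(1), m.group(0))),
            params,
        )
    return params
```

Leaving unresolved placeholders in place (instead of dropping them) makes failed substitutions visible in the error logs printed by `execute_generic_action`.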


Conclusion: Achieving Strategic Cost Control with Automation

By setting up this Lambda function, you align financial goals with technical strategy in the cloud:

  • Automated cost control without the need for ongoing monitoring.

  • Real-time tracking of all cumulative costs across services.

  • Flexible and scalable cleanup driven by DynamoDB configurations.

This approach puts you in full control of AWS spending, ensuring you maintain cost efficiency and operational effectiveness across every service. With this solution, every new resource is tracked and managed automatically, saving time, budget, and unnecessary cloud complexity.