AWS Cost Efficiency

Automating Deletion of Incomplete Multipart Uploads

Managing incomplete multipart uploads in Amazon S3 is essential for controlling storage costs. Accumulated failed uploads can lead to unexpected charges. Automation is key to efficiently handling these uploads and preventing unnecessary expenses. By setting up automated strategies, you can quickly identify and clean up incomplete uploads, reducing manual effort and improving cost efficiency.

In this post, we’ll explore how to use AWS tools to automate the management of failed multipart uploads, helping you optimize storage and minimize costs. For the manual setup steps, refer to Part 1 - How to Reduce Amazon S3 Costs by Deleting Incomplete Multipart Uploads.

1. Using S3 Storage Lens

S3 Storage Lens provides visibility into your storage usage and activity trends. To set it up, follow these steps:

  • Step 1 - Create a JSON configuration file

This file contains the configuration for the S3 Storage Lens. Save it as s3-storage-lens-config.json.

{
    "Id": "example-storage-lens-configuration",
    "AccountLevel": {
        "ActivityMetrics": {
            "IsEnabled": true
        },
        "BucketLevel": {
            "ActivityMetrics": {
                "IsEnabled": true
            }
        }
    },
    "DataExport": {
        "CloudWatchMetrics": {
            "IsEnabled": true
        },
        "S3BucketDestination": {
            "Format": "CSV",
            "OutputSchemaVersion": "V_1",
            "AccountId": "your-account-id",
            "Arn": "arn:aws:s3:::your-bucket-name",
            "Prefix": "s3-storage-lens-output/"
        }
    },
    "IsEnabled": true,
    "AwsOrg": {
        "Arn": "arn:aws:organizations::your-account-id:organization/o-exampleorgid"
    },
    "StorageLensArn": "arn:aws:s3:us-east-1:your-account-id:storage-lens/example-storage-lens-configuration"
}

Replace your-account-id, your-bucket-name, and other placeholders with your specific details.

  • Step 2 - Run the AWS CLI command

Use the following AWS CLI command to create the Storage Lens configuration.

aws s3control put-storage-lens-configuration --config-id example-storage-lens-configuration --account-id your-account-id --storage-lens-configuration file://s3-storage-lens-config.json

This command creates the Storage Lens configuration. Its metrics include the storage consumed by incomplete multipart uploads, which you can review in the dashboard or in the exported data.
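If you prefer to manage the configuration from Python, the same setup can be sketched with Boto3. This is a minimal sketch, not a definitive implementation: the helper names build_storage_lens_config and apply_storage_lens_config are hypothetical, the configuration is assumed to be account-level (so the AwsOrg and StorageLensArn fields from the JSON file are left out), and the client is assumed to be created with boto3.client('s3control').

```python
def build_storage_lens_config(config_id, account_id, export_bucket):
    """Return a Storage Lens configuration dict mirroring the JSON file above.

    Assumption: an account-level dashboard, so AwsOrg and StorageLensArn
    are omitted here.
    """
    return {
        "Id": config_id,
        "AccountLevel": {
            "ActivityMetrics": {"IsEnabled": True},
            "BucketLevel": {"ActivityMetrics": {"IsEnabled": True}},
        },
        "DataExport": {
            "CloudWatchMetrics": {"IsEnabled": True},
            "S3BucketDestination": {
                "Format": "CSV",
                "OutputSchemaVersion": "V_1",
                "AccountId": account_id,
                "Arn": f"arn:aws:s3:::{export_bucket}",
                "Prefix": "s3-storage-lens-output/",
            },
        },
        "IsEnabled": True,
    }


def apply_storage_lens_config(s3control_client, account_id, config):
    """Create or update the Storage Lens configuration for the account."""
    s3control_client.put_storage_lens_configuration(
        ConfigId=config["Id"],
        AccountId=account_id,
        StorageLensConfiguration=config,
    )


# Example usage (requires AWS credentials; values are placeholders):
#   import boto3
#   config = build_storage_lens_config(
#       "example-storage-lens-configuration", "your-account-id", "your-bucket-name")
#   apply_storage_lens_config(boto3.client("s3control"), "your-account-id", config)
```

Keeping the dictionary builder separate from the API call makes it easy to review or print the configuration before anything is sent to AWS.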

2. Monitor Incomplete Multipart Uploads

To automate the monitoring and management of incomplete multipart uploads, you can use the AWS CLI together with the Boto3 library for Python. Here’s how you can list and abort incomplete uploads in your S3 bucket.

  • Step 1 - List Multipart Uploads

Use the following AWS CLI command to list in-progress multipart uploads in your S3 bucket. The output helps you identify incomplete uploads by providing details such as upload IDs and object keys.

aws s3api list-multipart-uploads --bucket example-bucket

Replace example-bucket with your actual bucket name. The response will include details about ongoing multipart uploads, such as 'UploadId' and 'Key', which are needed to manage these uploads.

Here's a sample response you might receive from the list-multipart-uploads command:

{
    "Uploads": [
        {
            "Initiator": {
                "DisplayName": "username",
                "ID": "arn:aws:iam::0123456789012:user/username"
            },
            "Initiated": "2015-06-02T18:01:30.000Z",
            "UploadId": "dfRtDYU0WWCCcH43C3WFbkRONycyCpTJJvxu2i5GYkZljF.Yxwh6XG7WfS2vC4to6HiV6Yjlx.cph0gtNBtJ8P3URCSbB7rjxI5iEwVDmgaXZOGgkk5nVTW16HOQ5l0R",
            "StorageClass": "STANDARD",
            "Key": "multipart/01",
            "Owner": {
                "DisplayName": "aws-account-name",
                "ID": "100719349fc3b6dcd7c820a124bf7aecd408092c3d7b51b38494939801fc248b"
            }
        }
    ],
    "CommonPrefixes": []
}

This response provides the essential details needed to manage ongoing multipart uploads, including the 'UploadId' and 'Key', which are required for operations like aborting or completing the uploads. The 'Initiated' timestamp shows how long each upload has been pending.
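The same listing can be done from Python, which makes it easier to filter by age. The sketch below is illustrative only: iter_multipart_uploads and stale_uploads are hypothetical helper names, the 7-day threshold is an assumption, and the client is assumed to come from boto3.client('s3'). It relies on Boto3's list_multipart_uploads paginator, because a single response returns at most 1,000 uploads.

```python
from datetime import datetime, timedelta, timezone


def iter_multipart_uploads(s3_client, bucket):
    """Yield every in-progress multipart upload in the bucket, page by page."""
    paginator = s3_client.get_paginator("list_multipart_uploads")
    for page in paginator.paginate(Bucket=bucket):
        # The "Uploads" key is absent when the page has no uploads.
        yield from page.get("Uploads", [])


def stale_uploads(uploads, max_age_days=7, now=None):
    """Return only the uploads initiated more than max_age_days ago.

    Boto3 returns "Initiated" as a timezone-aware datetime, so the
    comparison below uses an aware UTC timestamp as well.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [u for u in uploads if u["Initiated"] < cutoff]


# Example usage (requires AWS credentials; bucket name is a placeholder):
#   import boto3
#   s3 = boto3.client("s3")
#   for upload in stale_uploads(iter_multipart_uploads(s3, "example-bucket")):
#       print(upload["Key"], upload["UploadId"], upload["Initiated"])
```

Filtering on 'Initiated' keeps recent uploads that may still be in progress out of the cleanup list.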

  • Step 2 - Abort a Multipart Upload

Once you have the details of the ongoing multipart uploads from the previous step, you can abort a specific multipart upload using the following script:

import boto3

# Initialize S3 client
s3 = boto3.client('s3')

# Abort a specific multipart upload
s3.abort_multipart_upload(
    Bucket='<bucket-name>',
    Key='<object-key>',
    UploadId='<upload-id>'
)

By following these steps, you can find and abort incomplete multipart uploads in your S3 bucket, so that their uploaded parts stop consuming storage.
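To abort many uploads at once rather than one at a time, the single call can be wrapped in a loop. This is a rough sketch under stated assumptions: abort_uploads is a hypothetical helper, its input is a list of dictionaries shaped like the entries under 'Uploads' in the listing response, and the dry_run flag (an addition for safety, defaulting to True) only reports what would be aborted.

```python
def abort_uploads(s3_client, bucket, uploads, dry_run=True):
    """Abort each upload in `uploads` and return the (key, upload_id) pairs handled.

    With dry_run=True nothing is sent to S3; the pairs are only collected
    so they can be reviewed first.
    """
    handled = []
    for upload in uploads:
        key, upload_id = upload["Key"], upload["UploadId"]
        if not dry_run:
            s3_client.abort_multipart_upload(
                Bucket=bucket, Key=key, UploadId=upload_id
            )
        handled.append((key, upload_id))
    return handled


# Example usage (requires AWS credentials; values are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   uploads = s3.list_multipart_uploads(Bucket="example-bucket").get("Uploads", [])
#   print(abort_uploads(s3, "example-bucket", uploads))             # review only
#   abort_uploads(s3, "example-bucket", uploads, dry_run=False)     # actually abort
```

Running once in dry-run mode before aborting is a cautious default, since an aborted upload cannot be resumed.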

3. Abort Incomplete Multipart Uploads Using S3 Lifecycle 

A lifecycle policy in Amazon S3 is a set of rules that define actions to be applied to objects in a bucket. These actions can be used to manage and automate the storage lifecycle of objects. The typical actions include transitioning objects to different storage classes (like moving infrequently accessed data to a cheaper storage class) and expiring (deleting) objects that are no longer needed. Lifecycle policies help optimize costs and manage data more efficiently by automating these processes.

Steps to Set Up Lifecycle Policy:

  • Step 1 - Initialize the S3 Client: Use the Boto3 library to initialize the S3 client.
  • Step 2 - Define the Lifecycle Policy: Create a lifecycle policy that specifies the rules for aborting incomplete multipart uploads.
  • Step 3 - Apply the Lifecycle Policy: Apply the defined lifecycle policy to the specified S3 bucket.

Code Snippet to Set Lifecycle Policy:

import boto3

# Initialize S3 client
s3 = boto3.client('s3')

# Bucket name
bucket_name = 'your-bucket-name'

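# Define lifecycle policy
lifecycle_policy = {
    'Rules': [
        {
            'ID': 'AbortIncompleteMultipartUploadRule',
            'Status': 'Enabled',
            # Each rule needs a Filter; an empty prefix applies it to the whole bucket
            'Filter': {'Prefix': ''},
            'AbortIncompleteMultipartUpload': {
                'DaysAfterInitiation': 7
            }
        }
    ]
}

# Apply lifecycle policy
# Note: this call replaces any lifecycle configuration already on the bucket
s3.put_bucket_lifecycle_configuration(
    Bucket=bucket_name,
    LifecycleConfiguration=lifecycle_policy
)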

print(f"Lifecycle policy applied to bucket: {bucket_name}")

By following these steps and using the provided code snippet, you can set up a lifecycle policy in Amazon S3 to automatically abort incomplete multipart uploads after 7 days, helping to manage and optimize your storage costs.
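One caveat: put_bucket_lifecycle_configuration replaces the bucket's entire lifecycle configuration, so a bucket that already has rules would lose them. A possible way to handle this is to read the current rules first and merge the abort rule into them. The sketch below assumes this approach; merge_abort_rule and ensure_abort_rule are hypothetical helper names, the rule uses an empty-prefix Filter so that it covers the whole bucket, and the client is assumed to come from boto3.client('s3').

```python
RULE_ID = "AbortIncompleteMultipartUploadRule"


def merge_abort_rule(existing_rules, days=7, rule_id=RULE_ID):
    """Return a new rule list with the abort rule added, or updated if present."""
    abort_rule = {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # empty prefix: rule applies to the whole bucket
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
    }
    # Drop any earlier copy of this rule, keep everything else untouched.
    kept = [rule for rule in existing_rules if rule.get("ID") != rule_id]
    return kept + [abort_rule]


def ensure_abort_rule(s3_client, bucket, days=7):
    """Add the abort rule to the bucket while preserving its other rules."""
    try:
        current = s3_client.get_bucket_lifecycle_configuration(Bucket=bucket)
        rules = current.get("Rules", [])
    except s3_client.exceptions.ClientError as err:
        # A bucket without any lifecycle configuration raises this error code.
        if err.response["Error"]["Code"] != "NoSuchLifecycleConfiguration":
            raise
        rules = []
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": merge_abort_rule(rules, days)},
    )


# Example usage (requires AWS credentials; bucket name is a placeholder):
#   import boto3
#   ensure_abort_rule(boto3.client("s3"), "your-bucket-name", days=7)
```

Because the merge is keyed on the rule ID, running it repeatedly updates the same rule instead of adding duplicates.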

Conclusion

Automating the management of incomplete multipart uploads in Amazon S3 is both straightforward and highly effective. By using tools such as S3 Storage Lens, Boto3 for monitoring, and lifecycle policies, you can streamline the process of identifying and cleaning up failed uploads. These automated strategies reduce manual effort and ensure that incomplete uploads are addressed promptly, avoiding charges that would otherwise accumulate over time and keeping your S3 environment cost-effective and well-managed.

FAQs

1. What are incomplete multipart uploads in Amazon S3?

Incomplete multipart uploads occur when a file is uploaded in parts, but the upload process is not completed. These partial uploads can remain in S3 and accumulate, leading to unnecessary storage costs.

2. Why is automating the management of failed multipart uploads important?

Automation helps efficiently handle incomplete uploads, reducing manual effort and minimizing the risk of incurring additional storage charges. It ensures timely cleanup and optimizes cost management.

3. Which AWS tools can be used to automate the optimization of failed multipart uploads?

AWS tools such as S3 Storage Lens, S3 Lifecycle policies, AWS Lambda, Amazon SNS, and the Boto3 library can be used to automate the monitoring, management, and cleanup of incomplete multipart uploads.

4. How can I set up automated cleanup for failed multipart uploads?

You can use AWS Lambda functions triggered by S3 events, configure S3 Lifecycle policies to automatically abort incomplete uploads after a specified period, and set up S3 Storage Lens for monitoring.

5. What are the benefits of using S3 Storage Lens and AWS Lambda for this purpose?

S3 Storage Lens provides visibility into storage usage and trends, while AWS Lambda automates the cleanup process by responding to events and notifications. Together, they help manage and reduce storage costs effectively.
