How to Reduce Amazon S3 Costs by Deleting Incomplete Multipart Uploads

Best Practices for Managing Multipart Uploads and Reducing Unfinished Upload Costs

Did you know?

Incomplete uploads in Amazon S3 can account for a surprising share of storage: up to 20% of what you're paying for.
Depending on your upload volume, that can mean anywhere from 100 MB to multiple terabytes of storage wasted on unfinished files.

Amazon S3’s multipart upload feature is essential for handling large files, enabling improved throughput and resilience against network issues. It lets you upload a single large object as a set of parts, which can be uploaded in parallel, improving overall upload speed and reliability. However, incomplete multipart uploads can accumulate over time and lead to unnecessary storage costs. Because their parts do not appear in normal object listings, they are easy to overlook.

This detailed guide provides step-by-step instructions and insights to help you manage your S3 buckets more efficiently and cost-effectively.

Potential High Storage Costs of S3

Amazon S3 charges for all stored data, including incomplete multipart uploads. These hidden charges can add up, especially if you frequently upload large files in parts. Over time, the accumulated cost of these incomplete uploads can become a financial burden.

Financial Impact

  • Scenario: Tech Solutions frequently uploads large files to Amazon S3 using multipart uploads. Occasionally, these uploads fail, leaving incomplete multipart uploads that consume storage space and incur costs. The assumptions are:
      • The company uploads 10,000 large files per month.
      • Each file is split into 10 parts during the upload.
      • 5% of the multipart uploads fail, leaving incomplete uploads.
      • Each part of the upload is 100 MB in size.
      • Amazon S3 Standard storage costs $0.023 per GB per month.
  • Monthly Incomplete Uploads:

1. Number of failed uploads per month: 10,000 × 0.05 = 500

2. Number of incomplete parts per failed upload: 10

3. Total incomplete parts per month: 500×10 = 5000

4. Size of incomplete parts per month: 5,000 × 100 MB = 500,000 MB = 500 GB

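  • Storage accumulated after one year: 500 GB × 12 months = 6,000 GB
  • Monthly charge at that point: 6,000 GB × 0.023 USD per GB per month = 138 USD per month
  • Total billed over the first year: the stored volume grows by 500 GB each month, so the charges add up to (500 + 1,000 + ... + 6,000) GB-months × 0.023 USD = 39,000 × 0.023 USD = 897 USD

If nothing is cleaned up, the company pays about $897 in the first year for storage consumed by failed multipart uploads, and then $138 every month for those same parts even if no further uploads fail.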
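The arithmetic for this scenario can be reproduced with a few lines of Python. This is a minimal sketch, not a billing tool: the function name and its parameters are made up for illustration, it uses decimal units (1 GB = 1,000 MB) as the scenario does, and it assumes each month's parts are billed for full months. Because S3 Standard is priced per GB per month, it reports both the monthly charge once a year of parts has piled up and the total billed across those 12 months.

```python
def incomplete_upload_costs(files_per_month, failure_rate, parts_per_file,
                            part_size_mb, price_per_gb_month, months=12):
    """Estimate storage wasted by failed multipart uploads that are never cleaned up."""
    failed_uploads = files_per_month * failure_rate
    # Decimal units, as in the scenario: 1 GB = 1,000 MB.
    gb_added_per_month = failed_uploads * parts_per_file * part_size_mb / 1000

    stored_gb = 0.0
    total_billed = 0.0
    for _ in range(months):
        stored_gb += gb_added_per_month                 # leftover parts pile up
        total_billed += stored_gb * price_per_gb_month  # billed on everything stored so far

    return {
        "gb_added_per_month": round(gb_added_per_month, 2),
        "gb_stored_at_end": round(stored_gb, 2),
        "monthly_charge_at_end": round(stored_gb * price_per_gb_month, 2),
        "total_billed": round(total_billed, 2),
    }


# Scenario values from the text: 10,000 files, 5% failures, 10 parts of 100 MB each.
result = incomplete_upload_costs(10_000, 0.05, 10, 100, 0.023)
print(result)
```

Changing `failure_rate` or `part_size_mb` shows how quickly the waste scales with upload volume.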

Understanding Amazon S3 Multipart Uploads

Amazon S3 Multipart Uploads allow you to upload a single object as a set of parts. Each part is a contiguous portion of the object's data. This feature is especially useful for uploading large files, as it can improve upload efficiency and reliability.

Here are the main features of Amazon S3 multipart uploads:

Feature | Description
--- | ---
Minimum Size | Parts as small as 5 MB (the last part can be smaller).
Maximum Size | Up to 5 GB per part and 10,000 parts per upload; total object size up to 5 TB.
Concurrency | Upload multiple parts in parallel.
Fault Tolerance | Retry failed parts without affecting others.
Resumability | Pause and resume uploads if needed.
Initiation | Use 'CreateMultipartUpload' (called "Initiate Multipart Upload" in older documentation) to start.
Completion | Use 'CompleteMultipartUpload' to finish.
Abort | Abort incomplete uploads with 'AbortMultipartUpload'.
Lifecycle Policies | Automatically clean up incomplete uploads.
Security | Parts are encrypted in transit when uploaded over HTTPS.
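To make the start, complete, and abort operations concrete, here is a minimal sketch of the flow in Python. It assumes a boto3 S3 client is passed in as `s3`; the function name `upload_in_parts` and the idea of feeding it pre-split chunks are illustrative choices, not something this guide prescribes. The key point is the `except` branch: if the upload is not aborted there, the parts already sent stay in the bucket.

```python
def upload_in_parts(s3, bucket, key, chunks):
    """Upload an iterable of byte chunks as one object via multipart upload.

    `s3` is any object exposing the boto3 S3 client methods used below.
    Every chunk except the last must meet the minimum part size.
    """
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    try:
        for number, chunk in enumerate(chunks, start=1):
            response = s3.upload_part(
                Bucket=bucket, Key=key, UploadId=upload_id,
                PartNumber=number, Body=chunk,
            )
            # S3 needs each part's number and ETag to assemble the object.
            parts.append({"PartNumber": number, "ETag": response["ETag"]})
        s3.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Without this abort, the parts uploaded so far remain stored and billed.
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return upload_id
```

With boto3 installed and credentials configured, it could be called as `upload_in_parts(boto3.client("s3"), "example-bucket", "big-file.bin", chunks)`, where the bucket and key names are placeholders.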

Amazon S3 Storage Lens Metrics

Amazon S3 Storage Lens provides a comprehensive set of metrics to help you monitor and optimize your S3 storage. The metrics are divided into Free and Advanced categories, each offering different levels of detail and insight:

  • Free Metrics: Available to all users, these metrics cover essential storage and usage statistics. They include general metrics like total storage and object count, as well as detailed insights into cost optimization and data protection.
  • Advanced Metrics: These metrics provide deeper insights and are available through an upgrade. They include detailed data on request types, lifecycle rules, and advanced data protection features. They offer extended visibility into your storage and access patterns.

For detailed information, see the Amazon S3 Storage Lens metrics glossary.

By focusing on these metrics, you can gain better insights into your S3 usage and optimize the performance and cost-effectiveness of your multipart uploads.

Monitoring for Failed Multipart Uploads

Monitoring for failed multipart uploads in Amazon S3 is crucial for keeping storage costs low and storage efficiency high. To monitor them effectively, focus on the following metrics:

  • Number of Incomplete Multipart Uploads: Tracks the count of multipart uploads that have not been completed.
  • Total Storage of Incomplete Multipart Uploads: Measures the total storage occupied by incomplete multipart uploads.
  • Multipart Upload Completion Rate: Monitors the rate at which multipart uploads are completed successfully.
  • Multipart Upload Failure Rate: Tracks the rate at which multipart uploads fail to complete.
  • Time Since Incomplete Multipart Uploads: Measures the duration that multipart uploads have remained incomplete.
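Some of these figures can also be read straight from a bucket. The sketch below assumes a boto3 S3 client is passed in as `s3` and uses its `list_multipart_uploads` and `list_parts` paginators to report the count, stored bytes, and age of in-progress uploads; the function names are illustrative. It does not cover completion or failure rates, which would need your own upload logs.

```python
from datetime import datetime, timezone


def list_incomplete_uploads(s3, bucket):
    """Yield every in-progress multipart upload in the bucket."""
    paginator = s3.get_paginator("list_multipart_uploads")
    for page in paginator.paginate(Bucket=bucket):
        yield from page.get("Uploads", [])


def uploaded_bytes(s3, bucket, upload):
    """Sum the sizes of the parts already stored for one upload."""
    paginator = s3.get_paginator("list_parts")
    pages = paginator.paginate(
        Bucket=bucket, Key=upload["Key"], UploadId=upload["UploadId"]
    )
    return sum(part["Size"] for page in pages for part in page.get("Parts", []))


def summarize(s3, bucket, now=None):
    """Count, total size, and oldest age (in days) of incomplete uploads."""
    now = now or datetime.now(timezone.utc)
    uploads = list(list_incomplete_uploads(s3, bucket))
    # boto3 returns "Initiated" as a timezone-aware datetime.
    ages = [(now - upload["Initiated"]).days for upload in uploads]
    return {
        "count": len(uploads),
        "total_bytes": sum(uploaded_bytes(s3, bucket, u) for u in uploads),
        "oldest_age_days": max(ages, default=0),
    }
```

Calling `summarize(boto3.client("s3"), "example-bucket")` (a placeholder bucket name) issues one `list_parts` request sequence per upload, so it suits occasional checks rather than continuous monitoring of very busy buckets.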

Strategies to Manage and Avoid Costs from Incomplete Multipart Uploads in Amazon S3

To keep incomplete multipart uploads from accumulating in Amazon S3 and incurring unnecessary costs, use the following strategies:

1. Using S3 Storage Lens

S3 Storage Lens helps reduce Amazon S3 costs by providing clear visibility into incomplete multipart uploads. The dashboard shows how much storage these incomplete uploads occupy and how long they've been stored. By identifying and managing these uploads, you can delete unnecessary data, free up storage space, and lower your costs.

Manual Steps

  1. Sign in to the AWS Management Console
  2. Access S3 and Navigate to Storage Lens
  3. Create or Select a Storage Lens Dashboard

To create or select a Storage Lens dashboard, use a default option or click "Create dashboard" to set up a new one. Configure the scope by choosing buckets and regions, select between free or advanced metrics, and optionally set up metrics export to S3 for detailed analysis.

  4. Configure the Dashboard with Desired Buckets and Metrics

To configure the dashboard with desired buckets and metrics, select the relevant buckets and focus on metrics related to cost optimization. Primary metrics include total storage, object count, and average object size for an overview of storage usage. Secondary metrics offer detailed insights for cost optimization, such as noncurrent version bytes, noncurrent version object count, delete marker object count, delete marker storage bytes, % incomplete multipart upload bytes, % incomplete multipart upload object count, and incomplete multipart upload metrics older than 7 days.

Figure: trends and distributions of the primary metric, 'total storage,' and the secondary metric, '% incomplete multipart upload bytes,' over a specified time range.

  5. Select the Multipart Upload Metric

Select the "% Incomplete multipart upload bytes" metric to view the distribution and impact of incomplete multipart uploads.

  6. Optionally Set Up Data Export

The export writes the metrics to an S3 bucket, allowing you to analyze the data over extended periods.

  7. Review Metrics and Identify Incomplete Uploads

Identify any incomplete multipart uploads that are consuming unnecessary storage space and take corrective actions, such as deleting incomplete uploads, to optimize storage usage and reduce costs.
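The "% incomplete multipart upload bytes" figure is a simple ratio, and it can be recomputed from exported numbers when reviewing many buckets at once. The sketch below is illustrative only: the input dictionary (bucket name mapped to incomplete bytes and total bytes) is a hypothetical shape, not the Storage Lens export format, and the 1% threshold is an arbitrary assumption.

```python
def pct_incomplete_mpu_bytes(incomplete_bytes, total_bytes):
    """Share of stored bytes held by incomplete multipart uploads, in percent."""
    if total_bytes <= 0:
        return 0.0
    return 100.0 * incomplete_bytes / total_bytes


def flag_buckets(metrics_by_bucket, threshold_pct=1.0):
    """Return (bucket, percent) pairs above the threshold, worst first.

    `metrics_by_bucket` maps a bucket name to (incomplete_bytes, total_bytes);
    this shape is assumed for the example.
    """
    flagged = [
        (bucket, pct_incomplete_mpu_bytes(incomplete, total))
        for bucket, (incomplete, total) in metrics_by_bucket.items()
    ]
    flagged = [item for item in flagged if item[1] > threshold_pct]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up numbers: "logs" holds 5% incomplete bytes, "media" only 0.1%.
print(flag_buckets({"logs": (50, 1000), "media": (1, 1000)}))
```

Buckets returned by `flag_buckets` are the ones worth inspecting first for cleanup.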

2. Abort Incomplete Multipart Uploads Using S3 Lifecycle

An S3 Lifecycle rule can abort incomplete multipart uploads automatically, which directly lowers your storage costs. During a multipart upload, the file is divided into parts, which S3 stores until the upload is completed or aborted. If an upload is interrupted or fails, its parts can remain in your S3 bucket indefinitely, consuming storage space and increasing your costs. A lifecycle rule with the abort action tells S3 to stop any upload that is still incomplete a set number of days after it was initiated and to delete its parts.

Manual Steps

  1. Sign in to the AWS Management Console.
  2. Access S3 and navigate to your bucket.
  3. Open the Management tab.
  4. Create a new lifecycle rule.

Name the rule appropriately, such as "AbortIncompleteMultipartUploadRule." Lifecycle rules in general can filter objects by prefix, object tags, object size, or a combination, but the action that aborts incomplete multipart uploads is not supported with tag or size filters, so apply this rule to the entire bucket or to specific prefixes.

  5. Add an action to abort incomplete multipart uploads.

When configuring lifecycle rule actions, choose the actions you want the rule to perform. Options include moving current or noncurrent versions of objects between storage classes (per-request fees apply to these transitions; see Amazon S3 pricing for details), expiring current versions, permanently deleting noncurrent versions, and deleting expired object delete markers or incomplete multipart uploads. For this rule, select the option to delete incomplete multipart uploads.

  6. Set the action to trigger after 7 days.
  7. Review and create the rule.
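The same rule can be expressed in code. The sketch below builds the rule as the dictionary that boto3's `put_bucket_lifecycle_configuration` expects and applies it through a client passed in as `s3`; the helper names are illustrative. One caveat worth stating loudly: that API call replaces the bucket's entire lifecycle configuration, so the helper takes any existing rules as an argument and carries them over.

```python
def abort_incomplete_uploads_rule(days=7, prefix=""):
    """Lifecycle rule that aborts multipart uploads still incomplete after `days`."""
    return {
        "ID": "AbortIncompleteMultipartUploadRule",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},  # an empty prefix applies to the whole bucket
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
    }


def put_rules(s3, bucket, new_rule, existing_rules=()):
    """Apply `new_rule` while keeping existing rules that have a different ID."""
    # put_bucket_lifecycle_configuration overwrites the whole configuration,
    # so rules that should survive must be sent again alongside the new one.
    rules = [r for r in existing_rules if r.get("ID") != new_rule["ID"]]
    rules.append(new_rule)
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration={"Rules": rules}
    )
    return rules
```

Existing rules can be read beforehand with the client's `get_bucket_lifecycle_configuration` call, which raises an error when the bucket has no lifecycle configuration yet.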

3. Set Up Notifications for Multipart Uploads

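Implementing multipart upload notifications and management in Amazon S3 helps reduce costs by quickly identifying and handling incomplete uploads. By setting up S3 event notifications and using AWS Lambda to automatically abort stale uploads, you prevent unused storage from accumulating. Note that S3 event notifications fire when an object is created or removed (for example, s3:ObjectCreated:CompleteMultipartUpload when a multipart upload finishes), not when an upload stalls, so the Lambda function treats each notification as a trigger to list the bucket's in-progress uploads and abort the old ones. This ensures that only completed data is stored, avoiding unnecessary storage charges and keeping your S3 costs under control.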

Manual Steps

  1. Sign in to the AWS Management Console
  2. Create an SQS Queue

In the AWS Management Console, select "Services" and choose "Simple Queue Service (SQS)" under the Application Integration category. Click "Create queue," select "Standard" for the queue type, enter a name for your queue, and configure any additional settings as needed. Finally, click "Create queue" to complete the setup. The queue's access policy must allow Amazon S3 to send messages to it; otherwise S3 rejects the notification configuration in the next step.

  3. Configure S3 Event Notifications

To configure S3 event notifications, select "Services" from the AWS Management Console and choose "S3" under the Storage category. Next, select the bucket where you want to monitor multipart uploads. Click on the "Properties" tab, scroll down to the "Event notifications" section, and click "Create event notification." Enter a name for the event notification, then select "All object create events" and "All object remove events" in the "Event types" section. In the "Destination" section, choose "SQS queue" and select the SQS queue you previously created. Finally, click "Save changes" to complete the configuration.

  4. Create an AWS Lambda Function
  5. Add Code to Abort Incomplete Multipart Uploads
  6. Deploy the Lambda Function
  7. Add SQS Trigger to Lambda

To add an SQS trigger to your Lambda function, navigate to the Lambda function console and click the "Add trigger" button. Select "SQS" as the trigger type and choose the SQS queue you created earlier. Click "Add" to configure the trigger. Next, make sure the Lambda function's execution role includes permissions to read messages from the SQS queue and to list and abort multipart uploads in S3.
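The step for adding code to the Lambda function does not show the code itself, so here is one possible shape for it, offered as a sketch under stated assumptions rather than a definitive implementation. It assumes the Python runtime, a 7-day age threshold (chosen to match the lifecycle example, not mandated by anything), and an SQS message body that contains the standard S3 notification JSON. The helper names are illustrative; the boto3 calls are `list_multipart_uploads` (via its paginator) and `abort_multipart_upload`.

```python
import json
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=7)  # assumption: uploads older than this are abandoned


def buckets_from_sqs_event(event):
    """Collect bucket names from S3 notifications delivered through SQS."""
    buckets = set()
    for record in event.get("Records", []):
        body = json.loads(record["body"])
        # S3 test events carry no "Records" key, so default to an empty list.
        for s3_record in body.get("Records", []):
            buckets.add(s3_record["s3"]["bucket"]["name"])
    return buckets


def abort_stale_uploads(s3, bucket, max_age=MAX_AGE, now=None):
    """Abort in-progress uploads initiated more than `max_age` ago."""
    now = now or datetime.now(timezone.utc)
    stale = []
    paginator = s3.get_paginator("list_multipart_uploads")
    for page in paginator.paginate(Bucket=bucket):
        for upload in page.get("Uploads", []):
            if now - upload["Initiated"] > max_age:
                stale.append(upload)
    # Abort only after listing has finished, so the listing is not changed mid-way.
    for upload in stale:
        s3.abort_multipart_upload(
            Bucket=bucket, Key=upload["Key"], UploadId=upload["UploadId"]
        )
    return [upload["UploadId"] for upload in stale]


def lambda_handler(event, context):
    import boto3  # imported here so the helpers above can be tested without it

    s3 = boto3.client("s3")
    return {
        bucket: abort_stale_uploads(s3, bucket)
        for bucket in buckets_from_sqs_event(event)
    }
```

For this to work, the execution role needs the `s3:ListBucketMultipartUploads` and `s3:AbortMultipartUpload` permissions on the bucket, in addition to the SQS read permissions.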

4. Optimize Multipart Upload Part Sizes

Optimizing the size of the parts during a multipart upload can significantly enhance upload efficiency and reduce the likelihood of incomplete uploads. By selecting the appropriate part size based on file size and network conditions, you can achieve a more reliable and cost-effective upload process.

Scenario | Recommended Part Size | Reason
--- | --- | ---
Files Less Than 100 GB | 5 MB to 100 MB | Balances the number of parts and upload time.
Files Larger Than 100 GB | 500 MB to 1 GB | Reduces the total number of parts and improves performance.
Unstable Networks | Smaller Part Sizes (e.g., 5 MB to 50 MB) | Minimizes the impact of interruptions and makes retries easier.
Stable, High-Speed Networks | Larger Part Sizes (e.g., 100 MB to 1 GB) | Optimizes upload performance and reduces total upload time.
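Whatever size you prefer, it also has to fit S3's multipart limits: a part-size floor and ceiling, and a cap of 10,000 parts per upload, which means very large files cannot use the smallest part sizes. The sketch below picks a valid size from a preferred one. The constants follow the limits quoted in this guide (5 MB and 5 GB per part, 5 TB per object), interpreted as binary units, and the function name and the 100 MiB default are illustrative assumptions.

```python
MIB = 1024 ** 2
MIN_PART = 5 * MIB            # minimum part size (the last part may be smaller)
MAX_PART = 5 * 1024 * MIB     # maximum part size
MAX_PARTS = 10_000            # maximum number of parts per upload
MAX_OBJECT = 5 * 1024 ** 4    # maximum object size, as stated in this guide


def choose_part_size(file_size, preferred=100 * MIB):
    """Return a part size in bytes that satisfies the multipart limits."""
    if file_size > MAX_OBJECT:
        raise ValueError("file exceeds the maximum object size")
    # Smallest part size that keeps the upload within MAX_PARTS (integer ceiling).
    smallest_allowed = -(-file_size // MAX_PARTS)
    part_size = max(preferred, smallest_allowed, MIN_PART)
    return min(part_size, MAX_PART)


# A 1 GiB file keeps the preferred 100 MiB parts.
print(choose_part_size(1024 ** 3) // MIB)
# A 2 TiB file cannot use 5 MiB parts, so the size is raised to fit 10,000 parts.
print(choose_part_size(2 * 1024 ** 4, preferred=5 * MIB))
```

On unstable networks you might pass a smaller `preferred` value; the function only raises it when the part-count cap requires it.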

Conclusion

Incomplete multipart uploads in Amazon S3 can significantly inflate storage costs, accounting for up to 20% of the storage you pay for. To manage and reduce these costs, enable S3 Storage Lens to monitor usage, implement lifecycle policies to automatically delete incomplete uploads after a set period, and set up notifications with AWS Lambda to abort unfinished uploads promptly. These strategies help ensure that only fully uploaded data consumes storage, keeping your S3 costs under control.

Looking to streamline this process with automation and reduce effort?

Don't miss Part 2 of this guide, where we dive into automated solutions for reducing Amazon S3 costs by deleting incomplete multipart uploads.

Part 2 - https://www.astuto.ai/blogs/automating-deletion-of-incomplete-multipart-uploads
