Earlier this year I noticed that my AWS storage costs had increased significantly and wanted to understand why. In this post, I’ll describe my research, what I discovered, and how I adjusted my storage configuration.

When I opened my first AWS account, I assumed that I would only need one, which I could configure to serve my needs. Several years later I have six, am actively using AWS Organizations [1], and am constantly improving my footprint using the Well-Architected Framework [2]. In the Security Pillar, Enable Traceability is a defined design principle that I absolutely agree with, and it is the easiest to implement with automation. To achieve it, I log the CloudTrail events from each account to a centralized log bucket in a security account. I used the AWS documentation to set it up. By configuring a single trail for the organization, all accounts in that organization log their events to it without any additional configuration. As I add and remove accounts, their events will be logged to it for as long as they are part of the organization.
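For readers who prefer scripting to the console, here is a minimal Python sketch of creating an organization trail with boto3. It is not the procedure I followed (I used the AWS documentation), and the trail and bucket names are placeholders. It assumes the call is made from the organization's management account, that CloudTrail trusted access is already enabled for the organization, and that the bucket policy already allows CloudTrail to write to the bucket.

```python
def organization_trail_params(trail_name: str, bucket_name: str) -> dict:
    """Build the arguments for CloudTrail's create_trail call.

    Both names are placeholders chosen by the caller.
    """
    return {
        "Name": trail_name,
        "S3BucketName": bucket_name,
        # One trail that collects events from every account in the organization.
        "IsOrganizationTrail": True,
    }


def create_organization_trail(trail_name: str, bucket_name: str) -> str:
    """Create the trail, start logging, and return the trail ARN.

    Assumption: valid credentials for the management account are configured.
    """
    import boto3  # imported lazily so the helper above works without boto3

    cloudtrail = boto3.client("cloudtrail")
    trail = cloudtrail.create_trail(**organization_trail_params(trail_name, bucket_name))
    # A trail created through the API does not log until it is started.
    cloudtrail.start_logging(Name=trail["TrailARN"])
    return trail["TrailARN"]
```

Calling `create_organization_trail("org-trail", "my-central-log-bucket")` (both names hypothetical) would then point every member account at the central bucket.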

I deployed the centralized bucket via CloudFormation in early 2019 using the following template:

AWSTemplateFormatVersion: 2010-09-09
Description: >
  This template instantiates an S3 bucket to collect all CloudTrail trail logs
  from all accounts within the organization
Parameters:
  LogBucketName:
    Type: String
    Description: Name of S3 Bucket that will store data from all trails
Resources:
  LogBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName:
        Ref: LogBucketName
      LifecycleConfiguration:
        Rules:
          -
            Id: Archive
            Status: Enabled
            Transitions:
              -
                StorageClass: GLACIER
                TransitionInDays: 90
  LogBucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket:
        Ref: LogBucket
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          -
            Sid: CloudTrailACL
            Effect: Allow
            Principal:
              Service: cloudtrail.amazonaws.com
            Action: s3:GetBucketAcl
            Resource:
              !Sub arn:aws:s3:::${LogBucket}
          -
            Sid: CloudTrailWrite
            Effect: Allow
            Principal:
              Service: cloudtrail.amazonaws.com
            Action: s3:PutObject
            Resource:
              !Sub arn:aws:s3:::${LogBucket}/AWSLogs/*
            Condition:
              StringEquals:
                s3:x-amz-acl: bucket-owner-full-control

The most critical lines are 22 and 23 of the template (StorageClass: GLACIER and TransitionInDays: 90). My thinking behind setting up the lifecycle policy [3] this way was that I could store events in perpetuity and, after 90 days, benefit from lower storage costs by moving the data to Glacier.

When I noticed that my storage costs had increased instead of decreased, I figured I had misconfigured (read: miscalculated) my lifecycle policy. To find out why, I turned to AWS Cost Management [4], specifically Cost Explorer. Once Cost Explorer has analyzed the billing data across your accounts, you can use the UI (or the API) to filter and sort the data in as many ways as you like. In this case, I filtered on all S3 API operations. The image below is what I discovered.

Can you spot the unexpected cost?
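If you would rather use the API than the UI, the same breakdown can be sketched in Python with boto3's Cost Explorer client. This is a rough sketch, not what I ran: the service name string is my assumption of how S3 appears in the SERVICE dimension (check it against your own billing data), the dates are supplied by the caller, and pagination of the response is ignored.

```python
from collections import defaultdict


def build_query(start: str, end: str) -> dict:
    """Arguments for get_cost_and_usage: S3 costs grouped by API operation.

    start and end are ISO dates (YYYY-MM-DD); end is exclusive.
    """
    return {
        "TimePeriod": {"Start": start, "End": end},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        # Assumption: this is the SERVICE value used for S3 in your account.
        "Filter": {
            "Dimensions": {
                "Key": "SERVICE",
                "Values": ["Amazon Simple Storage Service"],
            }
        },
        "GroupBy": [{"Type": "DIMENSION", "Key": "OPERATION"}],
    }


def cost_by_operation(response: dict) -> dict:
    """Sum the cost per operation across all returned periods, largest first."""
    totals = defaultdict(float)
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            operation = group["Keys"][0]
            totals[operation] += float(group["Metrics"]["UnblendedCost"]["Amount"])
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


def fetch_s3_cost_by_operation(start: str, end: str) -> dict:
    """Run the query against Cost Explorer (requires AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    ce = boto3.client("ce")
    return cost_by_operation(ce.get_cost_and_usage(**build_query(start, end)))
```

Sorting by cost puts the most expensive operation first, which is the quickest way to spot a line item that should not be there.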

While the S3 Glacier storage cost per GB is considerably lower than the S3 Standard cost [5] ($0.004/GB versus $0.023/GB per month), I neglected to consider the cost of transitioning the data into Glacier. Lifecycle transition requests into Glacier cost $0.05 per 1,000 requests, and each object transitioned counts as one request. “Considering that CloudTrail publishes log files multiple times an hour, about every five minutes” [6], the number of transitions was significant!
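To make the trade-off concrete, here is a small Python sketch of the break-even arithmetic using only the prices quoted above. The object size in the usage note is a hypothetical figure, not a measurement from my bucket, and the sketch deliberately ignores other charges.

```python
# Prices quoted in this post (USD).
STANDARD_PER_GB_MONTH = 0.023
GLACIER_PER_GB_MONTH = 0.004
TRANSITION_PER_1000_REQUESTS = 0.05


def transition_cost(num_objects: int) -> float:
    """One-time cost of moving num_objects into Glacier (one request each)."""
    return num_objects / 1000 * TRANSITION_PER_1000_REQUESTS


def monthly_saving(total_bytes: int) -> float:
    """Monthly storage saving once total_bytes live in Glacier instead of Standard."""
    gigabytes = total_bytes / 1024**3
    return gigabytes * (STANDARD_PER_GB_MONTH - GLACIER_PER_GB_MONTH)


def break_even_months(object_size_bytes: int) -> float:
    """Months of Glacier storage needed to recoup one object's transition fee."""
    return transition_cost(1) / monthly_saving(object_size_bytes)
```

For an assumed 10 KB log file, `break_even_months(10 * 1024)` comes out at roughly 276 months, or about 23 years. As I understand it, Glacier also bills a small per-object metadata overhead that this sketch leaves out, so the real picture for tiny objects is worse still.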

I realized my mistake and made a slight adjustment to the lifecycle rule in the CloudFormation template above.

ExpirationInDays: 365
Transitions:
  -
    StorageClass: STANDARD_IA
    TransitionInDays: 90

Instead of moving the data into Glacier, I selected Standard-Infrequent Access (Standard-IA). S3 lifecycle rules do not transition objects smaller than 128 KB into Standard-IA, so the many small log files simply stay in Standard and are not charged a transition request. In my case, this made much more sense than storing all of the data in Glacier. In addition, I decided that a year's worth of data was enough, so any content in the bucket now expires (is deleted) 365 days after creation. This keeps my CloudTrail storage costs under control.
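The effect of the 128 KB threshold can be sketched in Python as follows. This is only an illustration of the rule as I understand it: the function takes a plain list of object sizes in bytes, which you would have to gather yourself (for example from an S3 inventory report), and the sizes in the usage note are made up.

```python
# Lifecycle rules skip objects below this size when moving to Standard-IA.
MIN_IA_TRANSITION_BYTES = 128 * 1024


def summarize_ia_transitions(object_sizes: list[int]) -> dict:
    """Split objects into those a Standard-IA rule would move and those it skips."""
    eligible = [size for size in object_sizes if size >= MIN_IA_TRANSITION_BYTES]
    skipped = [size for size in object_sizes if size < MIN_IA_TRANSITION_BYTES]
    return {
        "transitioned_objects": len(eligible),  # each one is a billed request
        "transitioned_bytes": sum(eligible),
        "skipped_objects": len(skipped),        # these stay in Standard
        "skipped_bytes": sum(skipped),
    }
```

For example, `summarize_ia_transitions([4_000, 20_000, 300_000])` reports one transitioned object and two skipped ones, so only the single large file would generate a transition request.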

The link to the current version of the CloudFormation I have deployed is here.

While the impact on my bill was small in absolute terms, the percentage difference was significant, which is why I wanted to call it out. For those of you who manage much larger deployments, cost optimization is highly recommended, as it's one of the pillars of any cloud well-architected framework.