Leverage the Elasticity of Your Cloud for Cost Optimization!

During the COVID-19 outbreak, business continuity has become a crucial concern for nearly all organizations. Optimizing costs to survive and thrive is a pressing need in the current scenario.

Technology plays an important role in organizations' business operations, and indeed in business continuity itself.

Cloud computing is already the norm in this era of digital disruption. During this difficult time, organizations that have not yet adopted cloud computing are hurriedly evaluating their cloud migration strategies. On the other hand, organizations already on the cloud are examining every possible measure to make their operations more cost-effective.

In this post, I shall articulate some AWS best practices that organizations can mandate as part of cost-efficient cloud operations.

1. Right-Sizing as an Ongoing Process: Reviewing the capacity of each workload on the cloud is essential, and should be a mandate, for effective operational efficiency.

The following solution is built on an AWS CloudFormation stack that deploys the required AWS resources, together with custom Python scripts that pull insights from CloudWatch for your instances. The solution provisions a two-node Amazon Redshift cluster and deploys an EC2 instance in a VPC. The instance hosts a sequence of Python scripts that collect utilization data from Amazon CloudWatch and then run a custom query in the temporary Amazon Redshift cluster to produce the right-sizing analysis. Both the raw CloudWatch data and the analysis (in CSV format) are stored in an Amazon S3 bucket. Users have the option to automatically terminate the Amazon EC2 instance and Amazon Redshift cluster after the analysis is delivered, to reduce ongoing cost. After downloading the analysis from Amazon S3, users can then manually delete the AWS CloudFormation stack.

This solution, available from AWS as part of its managed service offerings, performs a right-sizing analysis and offers detailed recommendations for a more cost-effective setup.
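The heart of such an analysis is pulling utilization metrics from CloudWatch and flagging instances that never approach their capacity. Here is a minimal sketch using boto3; the function names and the 20% threshold are this post's own illustration, not part of the AWS solution itself:

```python
from datetime import datetime, timedelta

def is_oversized(cpu_datapoints, threshold_pct=20.0):
    """Flag an instance as a downsizing candidate when its average CPU
    utilization never exceeds the threshold across the lookback window."""
    if not cpu_datapoints:
        return False  # no data -> no recommendation
    return max(cpu_datapoints) < threshold_pct

def fetch_cpu_utilization(instance_id, days=14):
    """Pull daily average CPUUtilization for one EC2 instance from
    CloudWatch (requires AWS credentials when actually run)."""
    import boto3
    cw = boto3.client("cloudwatch")
    resp = cw.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=datetime.utcnow() - timedelta(days=days),
        EndTime=datetime.utcnow(),
        Period=86400,  # one datapoint per day
        Statistics=["Average"],
    )
    return [dp["Average"] for dp in resp["Datapoints"]]
```

An instance whose daily averages are, say, `[5.0, 12.3, 8.1]` would be flagged, while one that peaks at 45% would not.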

There are other service offerings geared towards cost optimization, such as Trusted Advisor and Cost Explorer. However, organizations can leverage the above solution when they want deeper insights into their costs and wish to take corrective measures. As a recommendation, AWS suggests provisioning this solution every two weeks so that continuous monitoring becomes an established process.

2. Identify RDS Instances and Redshift Clusters with Low Utilization:

Organizations can leverage the out-of-the-box Trusted Advisor checks to identify DB instances that have not had any connections for a prolonged period of time, and can then delete those instances automatically.

If persistent storage is needed for data on the instance, you can use lower-cost options such as taking and retaining a DB snapshot. Manually created DB snapshots are retained until you delete them.
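The snapshot-then-delete pattern can be sketched with boto3 as follows; the idleness rule (zero `DatabaseConnections` maxima for seven consecutive days) and the helper names are assumptions for illustration:

```python
def is_idle_db(connection_maxima, min_days=7):
    """Treat a DB instance as idle when the daily maximum of the
    DatabaseConnections metric has been zero for `min_days` days."""
    if len(connection_maxima) < min_days:
        return False  # not enough history to decide
    return all(c == 0 for c in connection_maxima[-min_days:])

def snapshot_and_delete(db_instance_id):
    """Take a final DB snapshot, then delete the idle instance
    (requires AWS credentials when actually run)."""
    import boto3
    rds = boto3.client("rds")
    rds.create_db_snapshot(
        DBSnapshotIdentifier=f"final-{db_instance_id}",
        DBInstanceIdentifier=db_instance_id,
    )
    # Skipping the automatic final snapshot is safe here because we
    # created a manual one explicitly above; manual snapshots are
    # retained until you delete them.
    rds.delete_db_instance(
        DBInstanceIdentifier=db_instance_id,
        SkipFinalSnapshot=True,
    )
```

The manual snapshot preserves the data at a fraction of the running instance's cost, and the instance can be restored from it later if needed.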

Trusted Advisor's Underutilized Redshift Clusters check works in a similar manner to what was just explained for RDS. It identifies clusters that have had no connections for a predefined period, and CPU utilization below a defined threshold for a specified period, so that underutilized Redshift clusters can be paused.
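Acting on that check can be automated with the Redshift `PauseCluster` API. A small sketch, where the connection/CPU criteria mirror the Trusted Advisor logic but the exact thresholds are this post's own choices:

```python
def pause_if_underutilized(cluster_id, connection_counts, avg_cpu,
                           cpu_threshold=5.0):
    """Pause a Redshift cluster only when it has had no connections in
    the lookback window and its average CPU stays under the threshold.
    Returns True if a pause was issued (AWS call requires credentials)."""
    if any(connection_counts) or avg_cpu >= cpu_threshold:
        return False  # cluster is in use; leave it running
    import boto3
    boto3.client("redshift").pause_cluster(ClusterIdentifier=cluster_id)
    return True
```

Pausing (rather than deleting) keeps the cluster's data intact while suspending on-demand compute billing, so it is a low-risk first step.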

3. Amazon DynamoDB Scale-In/Scale-Out: Organizations should leverage DynamoDB's auto scaling capabilities to understand usage patterns and make smart decisions about scaling DynamoDB in or out.

DynamoDB uses a scaling policy in Application Auto Scaling. You can set minimum and maximum levels of read and write capacity in addition to the target utilization percentage. Auto scaling, as a standard pattern, uses CloudWatch to monitor a table's read and write capacity metrics (ConsumedReadCapacityUnits, ConsumedWriteCapacityUnits) and track consumed capacity. The upper threshold alarm is triggered when consumed reads or writes breach the target utilization percentage for two consecutive minutes.
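Wiring this up for a table's read capacity looks roughly as follows with boto3 and Application Auto Scaling; the table name, policy name, and the 70% target are illustrative defaults, not prescriptions:

```python
def read_scaling_policy(target_utilization=70.0):
    """Build a target-tracking policy configuration for DynamoDB reads."""
    return {
        "TargetValue": target_utilization,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
        },
        "ScaleInCooldown": 60,
        "ScaleOutCooldown": 60,
    }

def enable_read_autoscaling(table_name, min_units=5, max_units=100):
    """Register the table's read capacity as a scalable target and attach
    the target-tracking policy (requires AWS credentials when run)."""
    import boto3
    aas = boto3.client("application-autoscaling")
    aas.register_scalable_target(
        ServiceNamespace="dynamodb",
        ResourceId=f"table/{table_name}",
        ScalableDimension="dynamodb:table:ReadCapacityUnits",
        MinCapacity=min_units,
        MaxCapacity=max_units,
    )
    aas.put_scaling_policy(
        PolicyName=f"{table_name}-read-scaling",
        ServiceNamespace="dynamodb",
        ResourceId=f"table/{table_name}",
        ScalableDimension="dynamodb:table:ReadCapacityUnits",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration=read_scaling_policy(),
    )
```

A mirror-image policy with `DynamoDBWriteCapacityUtilization` and the `WriteCapacityUnits` dimension covers the write side.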

4. Serverless — Starting & Stopping Instances: Organizations should adopt an automated process for starting and stopping instances according to a time-based schedule in order to reduce expenses. It is a perfect example of the serverless approach. Environments such as sandbox and staging do not need to be available round the clock.

In this case, you can configure a scheduled cron rule on Amazon CloudWatch Events to trigger a Lambda function. The function's handler scans all EC2 instance metadata and identifies instances with an "Environment" tag, ignoring those without. Once all instances with the target tag have been identified, the Lambda function uses the EC2 API to start or stop them at the designated time.
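A minimal Lambda handler for this pattern might look like the sketch below. The `event["action"]` field is this sketch's own convention (it would be supplied as constant input on each CloudWatch Events rule); the tag filtering is factored out so it can be reasoned about separately:

```python
def select_tagged_instances(instances, tag_key="Environment"):
    """Return the IDs of instances carrying the target tag, ignoring
    instances that lack it."""
    ids = []
    for inst in instances:
        tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
        if tag_key in tags:
            ids.append(inst["InstanceId"])
    return ids

def lambda_handler(event, context):
    """Entry point invoked by the scheduled CloudWatch Events rule.
    Expects event["action"] of "start" or "stop" (sketch convention)."""
    import boto3
    ec2 = boto3.client("ec2")
    reservations = ec2.describe_instances()["Reservations"]
    instances = [i for r in reservations for i in r["Instances"]]
    ids = select_tagged_instances(instances)
    if not ids:
        return {"affected": []}
    if event.get("action") == "start":
        ec2.start_instances(InstanceIds=ids)
    else:
        ec2.stop_instances(InstanceIds=ids)
    return {"affected": ids}
```

Two rules — an evening cron passing `{"action": "stop"}` and a morning cron passing `{"action": "start"}` — complete the schedule.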

5. Deleting Unused Amazon EBS Volumes: Limited visibility into a volume's lifecycle can result in costs for unutilized resources. AWS builds cost-management products to help you access, organize, understand, control, and optimize costs on AWS.

For more reference and understanding, please refer to the detailed AWS blog on this topic.
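In boto3 terms, a volume in the `available` state has no attachments and is accruing cost for nothing. A sketch of a cleanup job (the function names are hypothetical; run with care, as deletion is irreversible unless you snapshot first):

```python
def unattached_volume_ids(volumes):
    """Volumes in the "available" state are attached to no instance."""
    return [v["VolumeId"] for v in volumes if v["State"] == "available"]

def delete_unattached_volumes():
    """Find and delete all unattached EBS volumes in the region
    (requires AWS credentials when actually run)."""
    import boto3
    ec2 = boto3.client("ec2")
    vols = ec2.describe_volumes(
        Filters=[{"Name": "status", "Values": ["available"]}]
    )["Volumes"]
    for vol_id in unattached_volume_ids(vols):
        ec2.delete_volume(VolumeId=vol_id)
```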

6. Deleting Unused Amazon Elastic IP Addresses: This check looks for Elastic IP addresses (EIPs) that are not associated with a running Amazon Elastic Compute Cloud (Amazon EC2) instance. EIPs are static IP addresses designed for dynamic cloud computing. Unlike traditional static IP addresses, EIPs can mask the failure of an instance or Availability Zone by remapping a public IP address to another instance in your account. A nominal charge is imposed for an EIP that is not associated with a running instance.

Lambda functions can be used to release unassigned Elastic IPs automatically. Since idle resources like these add up across an account, deleting orphaned resources can result in meaningful savings.
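The detection rule is simple: an address record without an `AssociationId` is attached to nothing. A sketch of the cleanup (helper names are this post's own):

```python
def unassociated_allocation_ids(addresses):
    """An Elastic IP with no AssociationId is attached to nothing
    and is incurring the idle-address charge."""
    return [a["AllocationId"] for a in addresses
            if "AssociationId" not in a]

def release_unused_eips():
    """Release every unassociated Elastic IP in the region
    (requires AWS credentials when actually run)."""
    import boto3
    ec2 = boto3.client("ec2")
    addresses = ec2.describe_addresses()["Addresses"]
    for alloc_id in unassociated_allocation_ids(addresses):
        ec2.release_address(AllocationId=alloc_id)
```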

7. Lazy Load Balancers: Organizations need to keep a vigilant check on load balancer configurations that are not actively used, since any configured load balancer accrues charges. As part of best practice and cost efficiency, organizations need to track whether a load balancer has no associated back-end instances, or whether its network traffic is severely limited; in either case the load balancer is not being used effectively.

Use the Trusted Advisor Idle Load Balancers check to fetch a report of all load balancers in your account whose RequestCount falls below a defined value over the past X days. Once you see load balancers that have been inactive for many days, you can manually delete them.

Note that this check currently covers only the Classic Load Balancer type within the ELB service. It does not include the other ELB types (Application Load Balancer, Network Load Balancer).
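If you want the same signal without Trusted Advisor, you can sum the `RequestCount` metric yourself from CloudWatch. A sketch (the 100-requests threshold and the helper names are illustrative assumptions):

```python
def idle_load_balancers(request_counts, threshold=100):
    """Given {lb_name: total_request_count}, return the names whose
    traffic over the lookback window fell below the threshold."""
    return [name for name, count in request_counts.items()
            if count < threshold]

def fetch_request_count(lb_name, days=7):
    """Sum the ELB RequestCount metric for a classic load balancer
    (requires AWS credentials when actually run)."""
    from datetime import datetime, timedelta
    import boto3
    cw = boto3.client("cloudwatch")
    resp = cw.get_metric_statistics(
        Namespace="AWS/ELB",
        MetricName="RequestCount",
        Dimensions=[{"Name": "LoadBalancerName", "Value": lb_name}],
        StartTime=datetime.utcnow() - timedelta(days=days),
        EndTime=datetime.utcnow(),
        Period=86400,
        Statistics=["Sum"],
    )
    return sum(dp["Sum"] for dp in resp["Datapoints"])
```

For Application and Network Load Balancers the same idea applies with the `AWS/ApplicationELB` and `AWS/NetworkELB` namespaces.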

8. Instance Reservation: Organizations should explore reserving Amazon EC2 Instances or evaluating EC2 Savings Plans, always considering what the usage loads will be during the coming months.

When reserving instances, several parameters affect the potential savings on compute costs:

· Region / location of the servers.

· Type of instances: families and sizes.

· Commitment period: 1 or 3 years.

· Payment method: full or partial payment at the beginning of the commitment, or simply establishing a commitment without any payment in advance.

Depending on how these choices are combined, the savings obtained can vary between roughly 10% and 75% versus the same usage on demand.
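The arithmetic behind that percentage is straightforward. The prices below are purely hypothetical placeholders chosen for round numbers — always check current AWS pricing for your region and instance family:

```python
def reservation_savings_pct(on_demand_hourly, effective_reserved_hourly):
    """Percentage saved by a reservation versus on-demand pricing."""
    return round(100 * (1 - effective_reserved_hourly / on_demand_hourly), 1)

# Hypothetical effective hourly rates, for illustration only.
on_demand = 0.10
one_year_no_upfront = 0.06
three_year_all_upfront = 0.04

print(reservation_savings_pct(on_demand, one_year_no_upfront))     # → 40.0
print(reservation_savings_pct(on_demand, three_year_all_upfront))  # → 60.0
```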

One can also use the AWS Spot Instance Advisor, a tool to check current spot prices, savings, and frequency of interruptions.

9. Move Object Data to Lower-Cost Tiers in S3: Almost all organizations use S3 as one of their preferred storage services. The best practice is to move data between storage tiers depending on its usage. For example, Infrequent Access storage is ideal for long-term storage, backups, and disaster recovery content, while Glacier is best suited for archival.

In addition, the Infrequent Access storage class is set at the object level and can exist in the same bucket as Standard. The conversion is as simple as editing the properties of the content within the bucket or creating a lifecycle policy to automatically transition S3 objects between storage classes.
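A lifecycle policy that tiers objects down over time can be expressed and attached with boto3 as below; the rule ID, bucket-wide prefix, and the 30/90-day transition points are illustrative choices:

```python
def lifecycle_rules(ia_after_days=30, glacier_after_days=90):
    """Lifecycle rules moving objects to Standard-IA, then Glacier."""
    return {
        "Rules": [{
            "ID": "tier-down",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix = whole bucket
            "Transitions": [
                {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                {"Days": glacier_after_days, "StorageClass": "GLACIER"},
            ],
        }]
    }

def apply_lifecycle(bucket):
    """Attach the lifecycle configuration to a bucket
    (requires AWS credentials when actually run)."""
    import boto3
    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=lifecycle_rules(),
    )
```

Once applied, S3 performs the transitions automatically; no per-object action is required.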

10. Delete Aged Snapshots: Many organizations use EBS snapshots to create point-in-time recovery points to use in case of data loss or disaster. However, EBS snapshot costs can quickly get out of control if not closely monitored. Individual snapshots are not costly, but the cost can grow quickly when many are retained.

You can use Amazon Data Lifecycle Manager to automate the creation, retention, and deletion of snapshots taken to back up your Amazon EBS volumes. Automating snapshot management helps you to:

· Protect valuable data by enforcing a regular backup schedule.

· Retain backups as required by auditors or internal compliance.

· Reduce storage costs by deleting outdated backups.

Organizations can help get EBS snapshots back under control by monitoring snapshot cost and usage per instance to make sure they do not spike out of control. Set a standard in your organization for how many snapshots should be retained per instance. Remember that the majority of the time, a recovery will occur from the most recent snapshot.
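A Data Lifecycle Manager policy encoding such a standard — daily snapshots of tagged volumes with a fixed retention count — can be sketched as follows. The tag key/value, schedule time, and 7-snapshot retention are example choices, and `create_lifecycle_policy` needs an IAM role that DLM is allowed to assume:

```python
def snapshot_policy_details(interval_hours=24, retain_count=7,
                            tag_key="Backup", tag_value="true"):
    """DLM policy details: daily snapshots of volumes carrying the
    target tag, keeping only the most recent `retain_count`."""
    return {
        "ResourceTypes": ["VOLUME"],
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [{
            "Name": "daily-snapshots",
            "CreateRule": {
                "Interval": interval_hours,
                "IntervalUnit": "HOURS",
                "Times": ["03:00"],  # start window, UTC
            },
            "RetainRule": {"Count": retain_count},
        }],
    }

def create_snapshot_policy(role_arn):
    """Register the policy with DLM (requires AWS credentials and an
    execution role ARN that DLM can assume)."""
    import boto3
    boto3.client("dlm").create_lifecycle_policy(
        ExecutionRoleArn=role_arn,
        Description="Daily EBS snapshots with 7-day retention",
        State="ENABLED",
        PolicyDetails=snapshot_policy_details(),
    )
```

With the `RetainRule` in place, DLM deletes the oldest snapshot automatically as each new one is created, which is exactly the aged-snapshot cleanup this section calls for.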

Conclusion

It is important for organizations to adopt these best practices as part of their continuous delivery process, not just as a one-time or project activity. Given the dynamic and ever-changing nature of the cloud, cost optimization should ideally take place continuously. I have covered the top 10 best practices that I see as important for organizations wanting well-defined control measures in place. There are many more offerings and best practices from AWS for cost optimization, which is one of the strongest pillars of the AWS Well-Architected Framework.

Principal Solution Architect with over 14 years of extensive IT architecture experience, who shares the enthusiasm for exploiting technology to create business value.