
AWS auto-scaling: Add notification and test to see what happens

Nick Hardiman finalizes his demonstration of setting up auto-scaling for an Amazon Web Service by showing you how to add a notification alert and then testing it all to see what happens.

I set up Amazon auto-scaling to satisfy my 12 principles of operational readiness. Here are the pieces of the auto-scaling demonstration, of which this is the final installment.

AWS Auto-scaling is now all configured and ready to rock. These three procedures complete the series.

  • Add auto-scaling notification. I want an email when something changes.
  • Torture the service. Load test and see what happens.
  • Leave auto-scaling in place or remove? Is AWS auto-scaling safe to leave in place?

Add Auto Scaling notification

This requires a brief diversion into the land of Amazon's SNS (Simple Notification Service).

Create a topic.

  1. Sign up for SNS.
  2. Create a topic. This gives me an ARN to use.
  3. Subscribe using my email address.

    You have chosen to subscribe to the topic:

    arn:aws:sns:eu-west-1:123494605340:topic01

    To confirm this subscription, click or visit the link below (If this was in error no action is necessary):

    Confirm subscription
  4. Confirm.
  5. Look for the confirmation e-mail.

    From: MyNewDisplayName <no-reply@sns.amazonaws.com>
    Subject: AWS Notification - Subscription Confirmation

Add a notification

My-MacBook-Pro:~ nick$ as-put-notification-configuration cag01 --topic-arn arn:aws:sns:eu-west-1:123494605340:topic01 --notification-types autoscaling:EC2_INSTANCE_LAUNCH,autoscaling:EC2_INSTANCE_TERMINATE
OK-Put Notification Configuration
My-MacBook-Pro:~ nick$

This addition sends a test email to my account.

From: MyNewDisplayName <no-reply@sns.amazonaws.com>
Subject: AWS Notification - autoscaling:TEST_NOTIFICATION
Service: AWS Auto Scaling
Time: 2012-06-12T23:16:38.816Z
RequestId: a97c1596-b4e4-11e1-a3d6-09c4d2dc5a81
Event: autoscaling:TEST_NOTIFICATION
AccountId: 123494605340
AutoScalingGroupName: cag01
AutoScalingGroupARN: arn:aws:autoscaling:eu-west-1:123494605340:autoScalingGroup:b8d8007e-51b8-4834-b0fb-0835d37d8be1:autoScalingGroupName/cag01
If you wish to stop receiving notifications from this topic, please click or visit the link below to unsubscribe:
https://sns.eu-west-1.amazonaws.com/unsubscribe.html?SubscriptionArn=arn:aws:sns:eu-west-1:123494605340:topic01:98854af2-9c58-4944-9dae-3e28e4f801c3&Endpoint=nick@internetmachines.co.uk
...

Check the configuration.

My-MacBook-Pro:~ nick$ as-describe-notification-configurations
NOTIFICATION-CONFIG  cag01  arn:aws:sns:eu-west-1:123494605340:topic01  autoscaling:EC2_INSTANCE_LAUNCH
NOTIFICATION-CONFIG  cag01  arn:aws:sns:eu-west-1:123494605340:topic01  autoscaling:EC2_INSTANCE_TERMINATE
My-MacBook-Pro:~ nick$
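
The same check can be scripted rather than read by eye. This is a minimal sketch, assuming the command keeps the whitespace-separated layout shown above (group name in column 2, notification type in column 4). The helper name has_notification is made up for illustration, and the sample rows are copied from the output above so the sketch runs without AWS access.

```shell
#!/bin/sh
# has_notification GROUP TYPE
# Reads as-describe-notification-configurations output on stdin and succeeds
# if GROUP has a NOTIFICATION-CONFIG row for TYPE.
# Assumes the column layout shown above; has_notification is a made-up name.
has_notification() {
    awk -v group="$1" -v type="$2" '
        $1 == "NOTIFICATION-CONFIG" && $2 == group && $4 == type { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Sample rows copied from the output above.
sample='NOTIFICATION-CONFIG  cag01  arn:aws:sns:eu-west-1:123494605340:topic01  autoscaling:EC2_INSTANCE_LAUNCH
NOTIFICATION-CONFIG  cag01  arn:aws:sns:eu-west-1:123494605340:topic01  autoscaling:EC2_INSTANCE_TERMINATE'

for type in autoscaling:EC2_INSTANCE_LAUNCH autoscaling:EC2_INSTANCE_TERMINATE; do
    if printf '%s\n' "$sample" | has_notification cag01 "$type"; then
        echo "configured: $type"
    else
        echo "MISSING: $type"
    fi
done
```

Against a live account, the command's output would be piped in instead of the sample: as-describe-notification-configurations | has_notification cag01 autoscaling:EC2_INSTANCE_LAUNCH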

Torture the service

As with all IT work, you have to check what you've done. Your assumptions may be wrong. Testing means measuring twice and torturing the service until it breaks.

  • Start a new instance to host the load test. An Amazon micro-size free tier instance is fine.
  • Update the OS.

Test scale out, following these steps:

1. Install httpd.

2. Use ab to create a heavy server load. Use the internal name, not the external one.

[ec2-user@ip-10-226-5-80 ~]$ ab -n 10000 -c 10 http://ip-10-58-134-91.eu-west-1.compute.internal/drupal7/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>

3. Check the latest measurements.

My-MacBook-Pro:~ nick$ mon-get-stats CPUUtilization --statistics "Average" --namespace "AWS/EC2" --dimensions "AutoScalingGroupName=cag01" 
...
2012-06-12 23:42:00  100.0  Percent
2012-06-12 23:43:00  99.14  Percent
2012-06-12 23:44:00  100.0  Percent
...

4. Check monitor alarm status.

My-MacBook-Pro:~ nick$ mon-describe-alarm-history
cma01-add  2012-06-12T23:46:27.198Z  Action               Successfully executed action arn:aws:autoscaling:eu-west-1:123494605340:scalingPolicy:98962cd4-123f-4e3b-a3d3-55a8db985713:autoScalingGroupName/cag01:policyName/cap01-add
cma01-add  2012-06-12T23:46:27.177Z  StateUpdate          Alarm updated from OK to ALARM
cma01-del  2012-06-12T23:43:16.983Z  StateUpdate          Alarm updated from ALARM to OK
...

The newest messages are at the top. The first line shows the scale-out policy being triggered, which should create a new EC2 machine.

5. Check the list of EC2 machines and the load balancer (using the command line tools or the AWS console - either is fine).

6. Check the e-mail inbox.

From:  MyNewDisplayName <no-reply@sns.amazonaws.com>
Subject:      AWS Notification - autoscaling:EC2_INSTANCE_LAUNCH
Service: AWS Auto Scaling
Time: 2012-06-12T23:49:31.398Z
RequestId: 7650781a-4cd2-4219-99c5-0dca26dd3747
Event: autoscaling:EC2_INSTANCE_LAUNCH
AccountId: 123494605340
AutoScalingGroupName: cag01
...

7. Wait a few minutes.

8. Check it all again.
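
The CPU readings from step 3 are easier to judge as a summary than as a scrolling list. Here is a minimal sketch, assuming mon-get-stats keeps the four-column layout shown above (date, time, value, unit). The helper name cpu_summary is made up, and the sample rows are the three readings quoted in step 3.

```shell
#!/bin/sh
# cpu_summary: read mon-get-stats output on stdin and print the number of
# datapoints with their minimum, maximum and mean.
# Assumes rows look like "2012-06-12 23:42:00  100.0  Percent"; anything
# else (such as the "..." elisions) is ignored. The name is made up.
cpu_summary() {
    awk '
        $4 == "Percent" {
            value = $3 + 0
            if (n == 0 || value < min) min = value
            if (n == 0 || value > max) max = value
            sum += value
            n++
        }
        END {
            if (n == 0) { print "no datapoints"; exit 1 }
            printf "samples=%d min=%.2f max=%.2f mean=%.2f\n", n, min, max, sum / n
        }
    '
}

# The three readings quoted in step 3.
stats='2012-06-12 23:42:00  100.0  Percent
2012-06-12 23:43:00  99.14  Percent
2012-06-12 23:44:00  100.0  Percent'

printf '%s\n' "$stats" | cpu_summary
```

For the three sample rows this prints samples=3 min=99.14 max=100.00 mean=99.71, which confirms that the load test is pinning the CPU.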

Test scale in.

9. Stop the load test.

10. Check monitor alarm status.

My-MacBook-Pro:~ nick$ mon-describe-alarm-history
cma01-del  2012-06-13T00:02:16.964Z  Action               Failed to execute action arn:aws:autoscaling:eu-west-1:123494605340:scalingPolicy:7893501c-2a0c-4e7e-88a9-bd68c9a34c0b:autoScalingGroupName/cag01:policyName/cap01-del
cma01-del  2012-06-13T00:01:17.002Z  Action               Successfully executed action arn:aws:autoscaling:eu-west-1:123494605340:scalingPolicy:7893501c-2a0c-4e7e-88a9-bd68c9a34c0b:autoScalingGroupName/cag01:policyName/cap01-del
cma01-del  2012-06-13T00:01:16.977Z  StateUpdate          Alarm updated from OK to ALARM
cma01-add  2012-06-12T23:50:27.169Z  StateUpdate          Alarm updated from ALARM to OK
...

11. Wait a few minutes.

12. Make sure EC2 machines were removed and the fleet is reset to the minimum level.

13. Check the statistics, e-mail inbox and the load balancer.
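
The alarm history in step 10 contains a "Failed to execute action" entry that is easy to miss among the long ARNs. The sketch below filters for such entries, assuming the column layout shown in step 10 (alarm name, timestamp, entry type, message). The helper name failed_actions is made up, and the sample is the history quoted in step 10.

```shell
#!/bin/sh
# failed_actions: read mon-describe-alarm-history output on stdin and print
# the alarm name and timestamp of every action that failed.
# Assumes the layout shown in step 10; failed_actions is a made-up name.
failed_actions() {
    awk '$3 == "Action" && $4 == "Failed" { print $1, $2 }'
}

# History rows copied from step 10.
history='cma01-del  2012-06-13T00:02:16.964Z  Action               Failed to execute action arn:aws:autoscaling:eu-west-1:123494605340:scalingPolicy:7893501c-2a0c-4e7e-88a9-bd68c9a34c0b:autoScalingGroupName/cag01:policyName/cap01-del
cma01-del  2012-06-13T00:01:17.002Z  Action               Successfully executed action arn:aws:autoscaling:eu-west-1:123494605340:scalingPolicy:7893501c-2a0c-4e7e-88a9-bd68c9a34c0b:autoScalingGroupName/cag01:policyName/cap01-del
cma01-del  2012-06-13T00:01:16.977Z  StateUpdate          Alarm updated from OK to ALARM
cma01-add  2012-06-12T23:50:27.169Z  StateUpdate          Alarm updated from ALARM to OK'

printf '%s\n' "$history" | failed_actions
```

For the sample history this prints a single line, cma01-del 2012-06-13T00:02:16.964Z. The history does not say why the action failed, so a failure like this is a prompt to go and look at the group, not a diagnosis.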

Clean up.

14. Destroy the new load test instance.

My wrong assumptions

My testing saved me from leaving a useless auto-scaling configuration in place. I started off trying to monitor the load balancer latency, not the CPU. That made perfect sense to me: there are metrics to do that kind of thing so it should be easy. I used this command.

mon-put-metric-alarm cma01-add \
 --alarm-actions arn:aws:autoscaling:eu-west-1:123494605340:scalingPolicy:45e9ed16-9462-4905-b9d4-cc99fcff0d43:autoScalingGroupName/cag01:policyName/cap01-add \
 --comparison-operator  GreaterThanThreshold \
 --dimensions  "LoadBalancerName=clb01"  \
 --evaluation-periods 3 \
 --metric-name  Latency \
 --namespace   "AWS/ELB" \
 --period  60  \
 --statistic  Average \
 --threshold  8  \
 --unit  Seconds

I thought that meant:

Watch the load balancer. If it takes more than 8 seconds to get a reply from the web server pool, kick off the Auto Scaling policy.

No, it didn't mean that at all. Despite my best efforts to torture my servers, the statistics were collected many minutes apart and reported as tiny sub-second latencies. Why? I still don't know. The answer will become clear at some point, but I did not get an answer in time for this configuration. I changed to CPU instead, and that worked well.
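
To make the alarm's arithmetic concrete: with --evaluation-periods 3, --period 60 and --threshold 8, CloudWatch raises the alarm only when the average latency is above 8 seconds in each of three consecutive one-minute periods. The sketch below is a simplified model of that rule, not CloudWatch's own logic; it ignores missing datapoints, the helper name alarm_state is made up, and the latency figures are invented for illustration.

```shell
#!/bin/sh
# alarm_state THRESHOLD PERIODS
# Reads one per-period average per line on stdin (oldest first) and prints
# ALARM if the most recent PERIODS values are all greater than THRESHOLD,
# otherwise OK. A simplified model; alarm_state is a made-up name.
alarm_state() {
    awk -v threshold="$1" -v periods="$2" '
        NF { values[++n] = $1 + 0 }
        END {
            # Not enough datapoints yet means the alarm cannot fire.
            state = (n >= periods + 0) ? "ALARM" : "OK"
            for (i = n - periods + 1; i <= n && state == "ALARM"; i++)
                if (values[i] <= threshold + 0) state = "OK"
            print state
        }
    '
}

# Invented datapoints: sub-second latencies never come near the threshold.
printf '0.21\n0.35\n0.18\n' | alarm_state 8 3    # prints OK
# Invented datapoints: three periods in a row above 8 seconds.
printf '9.5\n12.0\n8.4\n' | alarm_state 8 3      # prints ALARM
# One quiet period in the middle is enough to keep the alarm from firing.
printf '9.5\n2.0\n12.0\n' | alarm_state 8 3      # prints OK
```

Seen this way, reported latencies that stay below one second can never trip an 8-second threshold, however hard the servers are working.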

Leave auto-scaling in place or remove?

Auto-scaling is complete. Did it work as expected? If you are sure your auto-scaling works correctly, leave it in place.

Make plans to go check it regularly. The longer a configuration is left in place, the more fragile it becomes. The environment slowly changes around the configuration, until it is no longer relevant or just fails to work when called upon.

If in doubt, remove the new configuration. It's no fun paying for unused capacity or losing business because your customer service is under-powered. The AWS calculator gives an idea of what you will pay for your infrastructure.

Remove notification

My-MacBook-Pro:~ nick$ as-delete-notification-configuration cag01 --topic-arn arn:aws:sns:eu-west-1:123494605340:topic01 
 
 Are you sure you want to delete this notification configuration? [Ny]y
OK-Deleted Notification Configuration
My-MacBook-Pro:~ nick$

Remove the CloudWatch alarms.

My-MacBook-Pro:~ nick$ mon-delete-alarms cma01-add
 
 Are you sure you want to delete these Alarms? [Ny]y
OK-Deleted alarms
My-MacBook-Pro:~ nick$
 
My-MacBook-Pro:~ nick$ mon-delete-alarms cma01-del
 
 Are you sure you want to delete these Alarms? [Ny]y
OK-Deleted alarms
My-MacBook-Pro:~ nick$

Check your work.

My-MacBook-Pro:~ nick$ mon-describe-alarms
No alarms found
My-MacBook-Pro:~ nick$

Remove the Auto Scaling Policy

My-MacBook-Pro:~ nick$ as-delete-policy  arn:aws:autoscaling:eu-west-1:123494605340:scalingPolicy:45e9ed16-9462-4905-b9d4-cc99fcff0d43:autoScalingGroupName/cag01:policyName/cap01-add
Are you sure you want to delete this policy? [Ny]y
OK-Deleted Policy
My-MacBook-Pro:~ nick$
My-MacBook-Pro:~ nick$ as-delete-policy arn:aws:autoscaling:eu-west-1:123494605340:scalingPolicy:8bf0832a-45f3-446b-93bc-3b7a94609943:autoScalingGroupName/cag01:policyName/cap01-del
Are you sure you want to delete this policy? [Ny]y
OK-Deleted Policy
My-MacBook-Pro:~ nick$

Remove the Auto Scaling Group

Halt Auto Scaling.

My-MacBook-Pro:~ nick$ as-suspend-processes cag01
OK-Processes Suspended
My-MacBook-Pro:~ nick$

Remove any scaled-out EC2 machines.

My-MacBook-Pro:~ nick$ as-terminate-instance-in-auto-scaling-group i-9b0888d3 --no-decrement-desired-capacity
Are you sure you want to terminate this instance?  [Ny]y
INSTANCE  d19341b0-f35c-4648-9a0b-720ee0fa0a8f  InProgress
My-MacBook-Pro:~ nick$

Remove the group.

My-MacBook-Pro:~ nick$ as-delete-auto-scaling-group cag01
Are you sure you want to delete this AutoScalingGroup? [Ny]y
OK-Deleted AutoScalingGroup
My-MacBook-Pro:~ nick$

Remove the launch configuration

My-MacBook-Pro:~ nick$ as-delete-launch-config clc01
Are you sure you want to delete this launch configuration? [Ny]y
OK-Deleted launch configuration
My-MacBook-Pro:~ nick$
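
The removal steps above only work in a particular order: notifications, alarms and policies go first, then the group, then the launch configuration. The sketch below collects them into one dry-run script. The commands and identifiers are the ones used above; the run and teardown helpers and the DRY_RUN switch are made up for illustration. With DRY_RUN=0 each command would still stop at its confirmation prompt, as shown above.

```shell
#!/bin/sh
# Dry-run sketch of the teardown order used above.
# Identifiers are the ones from this walkthrough; run, teardown and DRY_RUN
# are made-up names. By default nothing is executed, only printed.
GROUP=cag01
TOPIC=arn:aws:sns:eu-west-1:123494605340:topic01
LAUNCH_CONFIG=clc01
INSTANCE=i-9b0888d3
POLICY_ADD=arn:aws:autoscaling:eu-west-1:123494605340:scalingPolicy:45e9ed16-9462-4905-b9d4-cc99fcff0d43:autoScalingGroupName/cag01:policyName/cap01-add
POLICY_DEL=arn:aws:autoscaling:eu-west-1:123494605340:scalingPolicy:8bf0832a-45f3-446b-93bc-3b7a94609943:autoScalingGroupName/cag01:policyName/cap01-del

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

teardown() {
    run as-delete-notification-configuration "$GROUP" --topic-arn "$TOPIC"
    run mon-delete-alarms cma01-add
    run mon-delete-alarms cma01-del
    run as-delete-policy "$POLICY_ADD"
    run as-delete-policy "$POLICY_DEL"
    run as-suspend-processes "$GROUP"
    run as-terminate-instance-in-auto-scaling-group "$INSTANCE" --no-decrement-desired-capacity
    run as-delete-auto-scaling-group "$GROUP"
    run as-delete-launch-config "$LAUNCH_CONFIG"
}

teardown
```

Reading the dry-run output before doing anything for real is a cheap way to confirm that the identifiers still match the environment.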

Commands covered

Here, in the order they appeared, are the commands used in the last few weeks to add, remove and tweak AWS auto-scaling:

ec2-describe-images
as-create-launch-config
as-describe-launch-configs
as-create-auto-scaling-group
as-describe-auto-scaling-groups
as-put-scaling-policy
as-describe-policies
mon-list-metrics
mon-get-stats
mon-put-metric-alarm
mon-describe-alarms
as-put-notification-configuration
as-describe-notification-configurations
mon-get-stats
mon-describe-alarm-history
as-delete-notification-configuration
mon-delete-alarms
mon-describe-alarms
as-delete-policy
as-suspend-processes
as-terminate-instance-in-auto-scaling-group
as-delete-auto-scaling-group
as-delete-launch-config

About

Nick Hardiman builds and maintains the infrastructure required to run Internet services. Nick deals with the lower layers of the Internet - the machines, networks, operating systems, and applications. Nick's job stops there, and he hands over to the ...
