So, you’re thinking about moving your application to the public cloud and leaning toward Amazon Web Services (AWS) as the best fit?
Well, that's no surprise. More and more businesses are looking to leverage the powerful tools developed by our friends in Seattle, and with AWS revenue continuing to grow at a rapid clip year over year, the platform's popularity is unlikely to fade anytime soon.
While on the surface AWS appears easy enough to set up and manage, there are countless deployment decisions you must make to ensure mission-critical apps can withstand a severe outage. The process for moving to the public cloud should be just as rigorous as any other major change to your IT strategy. With that in mind, here are three common pitfalls you'll want to avoid.
Pitfall #1: Assuming your Cloud Environment is Redundant
First, don't make the mistake of thinking that just because something is in the "cloud," it is automatically redundant. Although the physical hardware underlying your EC2 instance is redundant, you still need to run at least two copies of each EC2 instance, either behind a load balancer or in a primary/replica configuration.
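To make that concrete, here's a minimal sketch: a helper that builds the payload you would hand to boto3's `elbv2.register_targets` call, refusing to proceed with fewer than two instances behind the load balancer. The target group ARN and instance IDs are hypothetical.

```python
# Hypothetical helper: build the payload for boto3's
# elbv2.register_targets call, refusing to proceed unless at least two
# instances will sit behind the load balancer.
def build_register_targets_request(target_group_arn, instance_ids):
    if len(instance_ids) < 2:
        raise ValueError("need at least two instances for redundancy")
    return {
        "TargetGroupArn": target_group_arn,
        "Targets": [{"Id": iid} for iid in instance_ids],
    }

# In a real deployment you would pass this to the ELBv2 API, e.g.:
#   boto3.client("elbv2").register_targets(**request)
request = build_register_targets_request(
    "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-app/abc123",  # hypothetical ARN
    ["i-0123456789abcdef0", "i-0fedcba9876543210"],  # hypothetical instance IDs
)
```

The guard clause is the point: a single registered instance passes every health check right up until the day it doesn't.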
Pitfall #2: Running your Critical Apps in a Single Zone
Second, you need to be multi-zoned, and if your budget permits, multi-regioned. Even if you have primary and replica EC2 instances for your application, it will do you no good unless they are in different zones.
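One simple way to enforce this at deploy time is to assign copies of the app to zones round-robin, so no single-zone outage can take out every copy. A purely illustrative sketch (the AZ names are real Sydney-region zones; the instance labels are hypothetical):

```python
from itertools import cycle

# Illustrative placement helper: assign each copy of the app to an
# availability zone round-robin, guaranteeing the copies are spread
# across at least two zones.
def spread_across_zones(instance_names, zones):
    if len(zones) < 2:
        raise ValueError("use at least two availability zones")
    zone_iter = cycle(zones)
    return {name: next(zone_iter) for name in instance_names}

placement = spread_across_zones(
    ["app-primary", "app-replica"],
    ["ap-southeast-2a", "ap-southeast-2b"],
)
# app-primary and app-replica now land in different zones.
```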
Are zone-wide outages really something that happens with AWS? Absolutely.
In June 2016, an AWS data center outage in one of the Sydney, Australia region's availability zones (AZs) caused significant downtime for many major "Down Under" web entities, including Foxtel Play (a popular video streaming service) and Channel Nine (a leading television and entertainment network). To complicate matters, API call failures prevented even some customers with multi-zone redundancy from staying online, meaning only multi-region customers or customers with redundant infrastructure outside of AWS were safe.
The CIO of REA Group, an Australian digital advertising firm that weathered the outage, offered this on-point advice and insight in ITNews.com.au:
“Multi AZ and ultimately, multi-region, with some smart architecture for deployment is key to cloud resilience today . . . We learned a lot. Power failure is a tough event for anyone to suffer, and we have an A-team of engineers. Others will be learning different, tougher lessons about good AZ management.”
Pitfall #3: Failing to Back Up Storage Volumes
Third, make sure you plan for EBS (Elastic Block Store) failures. AWS best practices tell you to store all of your data on EBS volumes, which makes it easy to move that data to new hardware should the underlying host fail. That's sound advice, but you also need to account for the possibility of a failure in the EBS service itself.
To prepare for that scenario, keep backups or copies of your data somewhere other than the live volumes, for example in EBS snapshots (which are stored in S3) or on local (ephemeral) instance storage. That way, if the EBS service fails, you can still restore your application and meet your disaster recovery objectives.
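A common pattern is to snapshot volumes on a schedule and prune the old ones. Here's a sketch of the pruning half, the kind of helper you might run daily after `aws ec2 create-snapshot`; in practice you would feed it the output of `describe_snapshots` and pass the result to `delete_snapshot`. The seven-snapshot retention is an assumption, not an AWS recommendation.

```python
# Sketch of the pruning half of a daily snapshot rotation: given
# (snapshot_id, created_at) pairs, return the ids to delete, keeping
# only the `keep` most recent. The retention count is an assumption.
def snapshots_to_delete(snapshots, keep=7):
    ordered = sorted(snapshots, key=lambda s: s[1], reverse=True)
    return [snap_id for snap_id, _ in ordered[keep:]]

# Any sortable timestamp works; here, days counted from an epoch, with
# snap-0 the newest and snap-9 the oldest (ids are hypothetical).
old = snapshots_to_delete([("snap-%d" % i, 100 - i) for i in range(10)])
```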
If All Else Fails
Lastly, it's wise to maintain an "if all else fails" option. If you hit an issue where all of your load balancing and redundancy measures fail, it's helpful to have a mechanism that kicks in and displays an error message on your web page or application. This can be done quite easily by leveraging Route 53 DNS failover and Amazon CloudFront.
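Here's a minimal sketch of what that failover looks like in Route 53 terms: a change batch (the kind you would pass to `change_resource_record_sets`) with a PRIMARY alias record pointing at your load balancer and a SECONDARY alias pointing at a CloudFront distribution serving a static error page. All names and ids below are hypothetical except the CloudFront alias zone id, which is the fixed value AWS uses for every CloudFront alias target.

```python
# Fixed hosted zone id AWS uses for all CloudFront alias targets.
CLOUDFRONT_ALIAS_ZONE_ID = "Z2FDTNDATAQYW2"

# Build a Route 53 change batch: PRIMARY alias to the app's load
# balancer, SECONDARY alias to a CloudFront error page. Route 53 serves
# the SECONDARY record when the primary target is unhealthy.
def failover_change_batch(domain, app_dns_name, app_zone_id, error_page_cf_domain):
    def alias_record(role, dns_name, zone_id, evaluate_health):
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": domain,
                "Type": "A",
                "SetIdentifier": role.lower(),
                "Failover": role,  # "PRIMARY" or "SECONDARY"
                "AliasTarget": {
                    "DNSName": dns_name,
                    "HostedZoneId": zone_id,
                    # True on the primary so Route 53 fails over when it
                    # becomes unhealthy; False on the last-resort page.
                    "EvaluateTargetHealth": evaluate_health,
                },
            },
        }
    return {
        "Changes": [
            alias_record("PRIMARY", app_dns_name, app_zone_id, True),
            alias_record("SECONDARY", error_page_cf_domain,
                         CLOUDFRONT_ALIAS_ZONE_ID, False),
        ]
    }

# You would pass this batch, along with your hosted zone id, to
# boto3's route53.change_resource_record_sets.
batch = failover_change_batch(
    "www.example.com.",
    "my-app-lb-123.ap-southeast-2.elb.amazonaws.com.",  # hypothetical ELB
    "ZEXAMPLEELBZONE",                                  # hypothetical ELB zone id
    "d111111abcdef8.cloudfront.net.",                   # hypothetical distribution
)
```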
Moral of the story: Plan wisely when getting started in AWS — just as you would with any of your key infrastructure.
Updated: January 2019