When Amazon’s James Hamilton gives a talk, anyone remotely interested in data-center engineering listens. At this year’s AWS re:Invent, Hamilton, Vice President and Distinguished Engineer at Amazon, described the top-to-bottom overhaul, five years in the making, of the data-center ecosystem driving Amazon Web Services (AWS).

Networking needed to be fixed

Hamilton started out by suggesting that networking was the number-one target for improvement. Costs associated with networking were escalating, while computing costs were going down. The overarching reason: Amazon engineers could not adapt off-the-shelf networking gear and current protocols to meet load demands. So the company figured out what it needed and contracted an Original Design Manufacturer (ODM) to make custom networking gear. In addition, Amazon contracted a team to create a new protocol stack to reduce network hierarchy and latency.

Interestingly, Hamilton mentioned several times throughout the presentation that off-the-shelf equipment will never be a good choice for enterprise data centers. Because it is designed to meet the needs of a wide range of customers, off-the-shelf equipment carries hardware and software bloat that prevents streamlining for specific operations.

An overview of Amazon’s data-center ecosystem

Next, Hamilton mapped out the current AWS data-center ecosystem, starting at the top with the company’s global infrastructure.

Figure A

AWS Region (Figure A): Amazon divides the world into 11 regions. Doing so affords Amazon customers the following advantages:

  • Compliance with government regulations regarding storage of data is simplified.
  • Latency between the customer’s network and Amazon’s Transit Centers is reduced.

Hamilton said Amazon also decided early on to use private fiber runs between regions, which eliminates peering fights, increases reliability, lowers latency, and allows capacity planning.

Figure B

AWS Availability Zone (AZ) (Figure B): There are 28 AZs dispersed throughout the 11 AWS regions, meaning Amazon has at least 28 data centers. Each AZ has redundant paths to both Transit Centers and the other AZs in the region using Dense Wavelength Division Multiplexing (DWDM) links. Amazon demands a latency of less than 2 ms between the AZs, and the inter-AZ fiber links must carry traffic loads of 25 Tb per second.
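To get a feel for what a 2 ms budget implies physically, the following minimal sketch estimates the longest fiber run that fits inside it. It assumes light propagates through optical fiber at roughly 200,000 km per second (about two-thirds of its vacuum speed) and ignores switching and queuing delays, so real separations would be shorter. The talk as reported here does not say whether 2 ms is a one-way or round-trip figure, so the hypothetical helper max_fiber_km covers both readings.

```python
# Rough upper bound on fiber distance for a given latency budget.
# Assumption: light travels through optical fiber at about 200,000 km/s.
# This constant is an approximation and does not come from the talk.
FIBER_SPEED_KM_PER_S = 200_000.0


def max_fiber_km(latency_ms: float, round_trip: bool = False) -> float:
    """Return the longest fiber run (in km) that fits inside latency_ms."""
    seconds = latency_ms / 1000.0
    distance = FIBER_SPEED_KM_PER_S * seconds
    # A round-trip budget covers the path twice, so halve the distance.
    return distance / 2 if round_trip else distance


if __name__ == "__main__":
    one_way = max_fiber_km(2.0)
    both_ways = max_fiber_km(2.0, round_trip=True)
    print(f"One-way reading:    up to about {one_way:.0f} km of fiber")
    print(f"Round-trip reading: up to about {both_ways:.0f} km of fiber")
```

Under these assumptions the ceiling is on the order of a few hundred kilometers, which is consistent with AZs sitting within a single region.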

Figure C

AWS Data Center (Figure C): Hamilton mentioned Amazon settled on a data-center size of 25 to 30 megawatts, which works out to 50,000 to 80,000+ servers. This size, according to Hamilton, is optimal: increase the size and Amazon’s ROI drops. Also, bigger data centers pose a greater risk in the event of a catastrophic failure. Each data center is provisioned to handle a load of 102 Tb per second.
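As a back-of-the-envelope check, the 25 to 30 megawatts and 50,000 to 80,000 servers quoted above can be turned into an average power figure per server. The sketch below is only an illustration: it assumes the whole facility budget is divided evenly across servers, although part of it goes to cooling, networking, and power distribution, and the helper name watts_per_server is hypothetical.

```python
from itertools import product


def watts_per_server(facility_megawatts: float, servers: int) -> float:
    """Average facility power per server, in watts."""
    return facility_megawatts * 1_000_000 / servers


if __name__ == "__main__":
    # Endpoints of the ranges quoted in the talk. Which power figure
    # goes with which server count is not stated, so try every pairing.
    for mw, servers in product((25, 30), (50_000, 80_000)):
        watts = watts_per_server(mw, servers)
        print(f"{mw} MW across {servers:,} servers: {watts:.1f} W each")
```

Depending on the pairing, the result falls between roughly 300 and 600 watts per server, overhead included.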

Figure D

AWS Rack, Server, and NIC (Figure D): Besides network latency, Hamilton said they found unacceptable latency in the server software stack:

  • The software stack consisting of application, guest OS, hypervisor, and NIC: milliseconds of latency.
  • Traffic passing through the NIC: microseconds of latency.
  • Traffic passing from server to server through a fiber link: nanoseconds of latency.

To get rid of software-stack latency, Amazon now gives each guest a virtual network card using Single-Root I/O Virtualization (SR-IOV). Hamilton explained the tough part about using SR-IOV was figuring out how to isolate each virtual NIC, prevent DDoS attacks, and monitor capacity.
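The three latency tiers Hamilton listed differ by factors of a thousand, which is why the software stack was the obvious place to cut. The sketch below makes that concrete. It assumes one unit per tier (1 ms, 1 µs, 1 ns) purely for illustration, since only the orders of magnitude were given, and the name share_of_total is hypothetical.

```python
# Representative latencies in nanoseconds. Assumption: exactly one unit
# per tier (1 ms, 1 µs, 1 ns); only the orders of magnitude are sourced.
LATENCY_NS = {
    "software stack": 1_000_000,
    "NIC": 1_000,
    "fiber link": 1,
}


def share_of_total(latencies: dict) -> dict:
    """Return each tier's fraction of the summed latency."""
    total = sum(latencies.values())
    return {name: value / total for name, value in latencies.items()}


if __name__ == "__main__":
    for name, share in share_of_total(LATENCY_NS).items():
        print(f"{name:15s} {share:.4%} of end-to-end latency")
```

With these illustrative values, the software stack accounts for more than 99 percent of the total, so bypassing part of it with SR-IOV addresses nearly all of the delay.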

Figure E

AWS Custom Server and Storage Designs (Figure E): Off-the-shelf networking equipment was mentioned earlier as being a huge hindrance to Amazon. Hamilton said that same sentiment applied to servers, which led to the company’s decision to build proprietary servers, processors, and racks:

  • The servers are Amazon-designed and ODM-built.
  • The processors are a custom design that Amazon developed with Intel.
  • Amazon racks hold 864 hard drives and weigh more than 2,000 pounds.

Figure F

AWS Power Infrastructure (Figure F): Building proprietary networking gear, servers, processors, and racks does not seem too out of the ordinary. How about building electrical substations? Scheduling and building an electrical substation is a lengthy process, too lengthy for a company that is constantly building data centers. So Amazon management decided it was in the company’s best interest to build its own power substations and eliminate the bottleneck.

As for electricity, Amazon, like Google and Microsoft, prefers Power Purchase Agreements and the accompanying Renewable Energy Certificates.

Hamilton’s last comment

Near the end of his presentation, Hamilton talked about the “pace of innovation” that’s occurring at Amazon. AWS is growing fast, which raises some management concerns about remaining nimble in a competitive market. Hamilton happily reported that AWS is delivering more service at a faster pace, and reliability has increased.