Facebook's ground-up data-center redesign is saving the company billions of dollars and vast amounts of energy. Read about Facebook's announcements at the 2015 Open Compute Project Foundation summit.
When people are asked what they like about Facebook, few if any mention that the company is an innovative pioneer in data-center engineering. In a roundabout way, that silence is a tribute from more than 1.3 billion Facebook users to those who work in Facebook's data centers.
Redesign from the ground up
Continuing to provide good service to an ever-increasing number of members (Figure A) required Facebook to add data-center infrastructure and equipment design to its repertoire in 2009; that's when Facebook management tasked a group of company engineers with designing a data center from the ground up. Two years later, Facebook's data center in Prineville, Oregon went online. Once the facility was dialed in, operators determined it was using 40% less energy than Facebook's other data centers.
The following concepts, software, and equipment developed by Facebook engineers are the reasons the Prineville and the newer Altoona, Iowa data centers are so efficient:
- Open Compute server, an optimized, clean-slate server redesign
- Wedge, a new Top-of-Rack (ToR) switch
- FBOSS, a new operating system for network devices
- Autoscale, a new load-balancing technology
- Redesigned data center at Altoona
- Fabric, a new data-center topology
- 6-pack, a modular switch platform
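Among these, Autoscale's energy savings come from a simple scheduling idea: rather than spreading requests evenly across every server, concentrate them on as few servers as possible (up to a target utilization) so the remaining machines can sit in a low-power state during off-peak hours. Here is a minimal sketch of that idea; the utilization ceiling, capacities, and function names are illustrative assumptions, not Facebook's implementation:

```python
TARGET_UTILIZATION = 0.75  # assumed per-server utilization ceiling

def assign_requests(num_requests, servers):
    """Greedily pack requests onto active servers; wake more only as needed.

    `servers` is a list of dicts with 'capacity' (requests the server can
    handle at full load) and 'load' (requests currently assigned).
    Returns the set of server indices that must stay active.
    """
    active = set()
    for _ in range(num_requests):
        placed = False
        # Try to place the request on an already-active server
        # that is still below the target utilization.
        for i in active:
            s = servers[i]
            if s['load'] < s['capacity'] * TARGET_UTILIZATION:
                s['load'] += 1
                placed = True
                break
        if not placed:
            # All active servers are at the ceiling: wake the next idle one.
            for i, s in enumerate(servers):
                if i not in active:
                    active.add(i)
                    s['load'] += 1
                    break
    return active

servers = [{'capacity': 100, 'load': 0} for _ in range(10)]
active = assign_requests(240, servers)
# With a 0.75 ceiling, 240 requests pack onto 4 servers (3 full at 75,
# one at 15); the other 6 can enter a power-saving state.
print(len(active))  # → 4
```

The trade-off, of course, is that the active servers run hotter; the target utilization is what keeps them below the point where response times degrade.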
Open Compute Project
Facebook realized the importance of its efforts and made public all pertinent information about the above technology through the Open Compute Project (OCP) Foundation. "We believe that openly sharing ideas, specifications, and other intellectual property is the key to maximizing innovation and reducing operational complexity in the scalable computing space," explains the project's mission statement. "The Open Compute Project Foundation provides a structure in which individuals and organizations can share their intellectual property."
Announcements from the 2015 OCP summit
The OCP Foundation held its 2015 U.S. summit last week, and there was a slew of interesting announcements from Frank Frankovsky, president and chairman of the OCP Foundation. "I'm happy to say that the OCP community's influence has gained a lot of momentum in the past year, with new contributions and membership from companies like HP, Dell, Cisco, Apple, and Microsoft," said Frankovsky on the first day of the summit. "We have nearly 200 companies now participating in the project, and every day new technologies are being developed and contributed."
Frankovsky also disclosed that Facebook contributed the specifications for Wedge to the OCP Foundation. The press release mentions, "Facebook is working with Accton, Broadcom, Cumulus, and Big Switch to create a Wedge product package for the OCP community. Accton will begin shipping Wedge in the first half of 2015."
The social-media giant also released FBOSS Agent, software based on Broadcom's OpenNSL library and created to run on Facebook's Wedge, and OpenBMC, a system-management platform. "Recently, we found the Baseboard Management Controller (BMC) software stack too closed to meet our needs, so we built our own version, which we're open-sourcing today," explains Tian Fang, a software engineer at Facebook. OpenBMC will be deployed first on Wedge and then on 6-pack. The diagram in Figure B depicts how all the component technologies come together in the Wedge ToR switch.
Facebook's big announcement
The big announcement from Facebook at the summit was the company's new system-on-a-chip (SoC) platform code-named Yosemite. "Over the last 18 months, Facebook has been working with Intel on a new SoC compute server that dramatically increases speed and more efficiently serves Facebook traffic," according to Facebook's press release. "Yosemite is our first system-on-a-chip compute server that supports four independent servers (Mono Lake) at a performance-per-watt superior to traditional data-center servers for heavily parallelizable workloads."
Hu Li, Facebook hardware-design engineer, introduced Yosemite (Figure C) in this post, mentioning that it includes the following design elements:
- Server-class SoC with multiple memory channels, providing high-performance computing within a 65W TDP for the SoC and 90W for the whole server card.
- Standard SoC card interface that provides a CPU-agnostic system interface.
- Platform-agnostic system-management solution that manages the system and its four SoC server cards, regardless of vendor.
- Multi-host network-interconnect card, following the OCP Mezzanine Card 2.0 specification, that connects up to four SoC server cards through a single Ethernet port.
- Cost-effective, flexible, and easy-to-service system structure.
Facebook isn't finished
At the summit, Facebook representatives disclosed that the improvements engineers made to the company's data-center infrastructure have saved Facebook $2 billion in infrastructure costs over the past three years. Their parting message, "We're not finished — not even close," hinted that more is to come.
Also see
- Facebook's Wedge: A novel approach to Top-of-Rack switches
- How Facebook saves electricity by using its Autoscale load balancing technology
- Facebook's next-gen data center network debuts in Altoona, Iowa
- Facebook's 6-pack: The main ingredient in its data center network redesign
- Facebook Fabric: An innovative network topology for data centers
- Facebook's Open Compute Project gets the nod from Greenpeace
- 10 things you should know about Open Compute