Open evolution of cloud data centers behind OpenRack 3.0

On June 25th, OCP China Day was held in Beijing. The conference was co-hosted by the OCP Foundation and OCP Platinum members. Nearly 1,000 engineers and data center practitioners attended the conference.

OCP is the world’s largest open hardware community, with more than 200 core members, including Google, Microsoft, Intel, IBM, and Inspur. More than 7,000 companies have participated in community activities. Facebook launched the community in 2011 to restructure data center hardware design and build an innovative technology ecosystem through open source and openness. After its establishment, the community grew faster than anyone expected. In 2018, purchases of OCP equipment by non-board members grew by more than 120% year-on-year, reaching $2.56 billion, and are expected to exceed $10.7 billion by 2022.

At present, all cloud computing data centers adopt OCP open technologies in whole or in part. Many of the innovative technologies and products of the cloud computing era, such as whole-rack servers, storage servers, and high-density rack-mounted servers, were developed under the direct promotion of the OCP community. The development of OCP mirrors the cloud computing transformation of the entire data center industry.

5G spawns cloud data center 2.0

This OCP China Day event focused on three major topics: edge computing, AI, and cloud data centers. With the rollout of 5G, the information technology revolution represented by cloud computing, mobile internet, and big data is reaching a new starting point. The 5G era is the era of AI and edge computing, and also the era of the Internet of Things and of greater bandwidth. In an era of large-scale interconnection, cloud data centers need to carry more traffic and data, which further accelerates their growth in scale, their modernization, and their upgrading.

If the current cloud data center is version 1.0, then the cloud data center of the 5G era is version 2.0. Water cooling and 48V power supply overcome physical limitations and further raise data center power density. Software-defined technology is applied throughout, hardware is standardized, and firmware is open source and unified, so that IT infrastructure becomes truly unified, integrated, and open. The new Redfish management architecture replaces the current IPMI and, combined with OpenBMC, forms the next-generation data center management technology ecosystem.

Looking at the next generation of full-rack servers from OpenRack 3.0

As of January 2019, the number of hyperscale data centers in the world had reached 430, an increase of 11% year-on-year, and is expected to reach 500 by the end of the year. Assuming a capacity of 100,000 servers per data center, the hyperscale data centers already built can accommodate 43 million servers. By comparison, according to IDC data, total global server shipments in 2018 were only 11.75 million units.
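The capacity estimate in the paragraph above is simple arithmetic. A minimal sketch, using only the figures quoted there (the 100,000-servers-per-site capacity is the article's own assumption, not a measured value):

```python
# Figures quoted in the article.
hyperscale_sites = 430
servers_per_site = 100_000    # assumed capacity per data center (article's assumption)
shipments_2018 = 11_750_000   # global server shipments in 2018, per IDC as cited

# Total servers the existing hyperscale sites could hold.
total_capacity = hyperscale_sites * servers_per_site

# How many years of 2018-level shipments that capacity corresponds to.
years_of_shipments = total_capacity / shipments_2018

print(f"Total capacity: {total_capacity:,} servers")
print(f"Equivalent to {years_of_shipments:.1f} years of 2018 shipments")
```

On these numbers, the built-out capacity corresponds to several years of global server shipments, which is the point the comparison is making.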

These large-scale and hyperscale data centers have been continuously increasing equipment density, directly driving the rise of multi-node servers in various forms, dominated by whole-rack servers. Over the past 10 years, their share of global shipments has risen from zero to 20%. OCP’s OpenRack 2.0 and ODCC’s Scorpio 2.5 are the two major open technology standards for whole-rack servers, and most deployed whole-rack servers follow one of them. However, both standards have hit hard physical limits in power supply and heat dissipation, making it difficult to keep increasing density. Therefore, both OCP and ODCC are developing next-generation 3.0 standards, which generally feature 12-48V high-voltage power supply, support for 15-33 kW of power, and liquid cooling.
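Why a higher bus voltage helps at these power levels can be seen from basic circuit arithmetic: for the same delivered power, current falls in proportion to voltage, and resistive loss falls with the square of the current. The sketch below is an idealized illustration only. It assumes a DC bus with a fixed resistance and ignores conversion losses; the function names are made up for this example.

```python
def bus_current(power_w: float, voltage_v: float) -> float:
    """Current in amperes drawn from a DC bus delivering power_w at voltage_v."""
    return power_w / voltage_v


def relative_loss(voltage_low: float, voltage_high: float) -> float:
    """Ratio of I^2 * R loss at voltage_low to the loss at voltage_high,
    for the same delivered power and the same bus resistance."""
    return (voltage_high / voltage_low) ** 2


# Rack power range quoted in the article: 15 kW to 33 kW.
for rack_kw in (15, 33):
    power_w = rack_kw * 1000
    print(
        f"{rack_kw} kW rack: "
        f"{bus_current(power_w, 12):.1f} A at 12 V, "
        f"{bus_current(power_w, 48):.1f} A at 48 V"
    )

print(f"Resistive loss at 12 V vs 48 V: {relative_loss(12, 48):.0f}x")
```

Under these assumptions, a 48V bus carries a quarter of the current of a 12V bus for the same rack power, which is one reason the next-generation standards move to higher voltages.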

At the conference, Facebook technology lead Steve Mills explained the latest OpenRack 3.0 whole-rack server specification. The new specification increases the rack height from 40OU to 44OU and the maximum weight from 1,400 kg to 1,600 kg. It supports both 21-inch and 19-inch nodes, and node height can be specified in either OU or standard U. The internal structure has also been adjusted to allow users to deploy specialized modules such as heterogeneous accelerators and storage. Because the specification involves liquid cooling, 48V power supply, and other technologies that have not yet been deployed at large scale, many details are still to be determined, so the standard has not been officially released and is at the public consultation stage.
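To put the OU figures in perspective, the two height units can be converted to millimetres. This sketch assumes the usual definitions, 1 OU (OpenU) = 48 mm and 1 standard rack unit (U) = 44.45 mm, and assumes that the 40OU and 44OU figures refer to the height of the equipment space; the article does not state either point explicitly.

```python
OU_MM = 48.0   # one OpenU in millimetres (assumed definition)
U_MM = 44.45   # one standard rack unit (1.75 in) in millimetres


def ou_to_mm(ou: float) -> float:
    """Convert a height in OpenU to millimetres."""
    return ou * OU_MM


def ou_to_u(ou: float) -> float:
    """Convert a height in OpenU to the equivalent number of standard U."""
    return ou * OU_MM / U_MM


for ou in (40, 44):
    print(f"{ou} OU = {ou_to_mm(ou):.0f} mm, about {ou_to_u(ou):.1f} U")
```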

Efforts beyond OpenRack 3.0

The open standards of OCP are often derived from mature application practices. The leading practices of member companies become community standards after a series of complex, rigorous, and completely transparent processes. This strict process guarantees the practicality and authority of OCP standards, but it also brings a problem: OCP standards lag behind practical applications. For example, OAM, the standard for heterogeneous accelerators, was released only this year, whereas NVIDIA’s GPU technology appeared more than a decade ago and GPUs have been used in the AI field for about 10 years.

The rapid innovation of OCP and ODCC community members at the technical and solution level has made up for the problems caused by the slow upgrade of standards. The innovative IP or technical specifications of OCP members can be published on the community platform as long as they are accepted by the community. At this event, Tencent, together with Inspur, contributed the T-flex 2.0 specification to the OCP community; it had previously been accepted by the ODCC community. Based on I/O pooling technology, the specification decouples and recombines the different modules of a server to achieve modular iteration and flexible combination. It can serve different application scenarios such as heterogeneous acceleration, cold storage, and HPC clusters. In other words, hyperscale data centers can build a unified server architecture on the specification, reducing the complexity of procurement, operation, and maintenance and lowering overall costs.

The efficiency of the data center depends not only on innovation at the hardware level, but also on the improvement of management technology. Intel introduced two data center management technologies at the conference. First, once most cloud platforms enter management broadcast mode (that is, the management node sends various scheduling commands to the resource nodes), all resource nodes prioritize the management commands, which stalls the business processes currently queued and causes a brief service interruption. Intel has moved the management interrupt function to the PRM level, which can effectively shorten the business interruption. Second, the cooling system of the data center is adjusted according to the load level, but feedback in a large-scale data center is complex and has very high latency, so cooling adjustments lag significantly behind load changes. Intel has added an AI-supported prediction window to the management system, so cooling adjustments no longer have to rely on feedback alone and the cooling strategy is more precise.
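The idea of adjusting cooling from a predicted load instead of waiting for delayed feedback can be sketched in a few lines. This is not Intel's implementation, whose details the article does not give. The sketch substitutes a simple least-squares linear trend for the AI model, and the function names, window, horizon, and cooling ratio are all hypothetical.

```python
from typing import Sequence


def forecast_load(samples: Sequence[float], horizon: int) -> float:
    """Extrapolate recent load samples `horizon` steps ahead with a
    least-squares straight line (a stand-in for an AI predictor)."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one load sample")
    if n == 1:
        return float(samples[0])
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    slope = numerator / denominator
    future_x = (n - 1) + horizon
    # Load cannot be negative, so clamp the extrapolation at zero.
    return max(0.0, mean_y + slope * (future_x - mean_x))


def cooling_setpoint(load_kw: float, cooling_per_load: float = 1.0) -> float:
    """Cooling capacity to provision for a given IT load (hypothetical ratio)."""
    return load_kw * cooling_per_load


recent_load_kw = [100.0, 110.0, 120.0, 130.0, 140.0]

reactive = cooling_setpoint(recent_load_kw[-1])                      # feedback only
predictive = cooling_setpoint(forecast_load(recent_load_kw, horizon=3))

print(f"reactive setpoint:   {reactive:.0f} kW")
print(f"predictive setpoint: {predictive:.0f} kW")
```

With a steadily rising load, the predictive setpoint runs ahead of the last measured value, which is the behaviour the prediction window is meant to provide when feedback arrives late.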

Open refactoring of IT infrastructure

Microsoft’s SONiC has been the most successful open-source data center project in recent years. The 400G Ethernet switches from Mellanox and DELTA support SONiC, and the Phoenix project of China’s ODCC community also adopts it. With SONiC, the open-source switch operating system it built, Microsoft has brought white-box switches into the industry ecosystem.

Alibaba shared its experience applying SONiC at the event. Alibaba adopted SONiC to build a very large production network that connects hundreds of thousands of servers, millions of virtual machines, and tens of millions of terminals with high bandwidth and low latency, and that can withstand the “Double Eleven” traffic surge. Alibaba has also done extensive custom development on top of SONiC, which has supported innovation in its actual business.

SDN technologies such as SONiC are reshaping the data center network, SDS technologies such as Ceph are reshaping data center storage, and cloud computing technologies such as OpenStack are reshaping data center servers. Open-source, software-defined technology combined with standardized hardware is becoming the standard choice for next-generation IT infrastructure.

Open Firmware, open at the firmware level

Open Firmware, another OCP community project, has grown rapidly in recent years. Its mission is to develop agile, open, and standard firmware design specifications to meet the needs of next-generation cloud computing infrastructure. Firmware is the low-level code stored inside a device. Much like a driver, it sits between the operating system and the hardware: the operating system relies on firmware to drive the server’s components. With open firmware, a data center can develop deeply integrated unified management solutions and implement advanced operations such as unified remote firmware upgrades, thereby simplifying operation and maintenance and even moving toward an autonomous data center.

The project team is developing open-source kits that contain only the most basic platform code needed to identify white-box hardware, and is working with community members to develop white-box hardware systems that can be built and booted with them, forming an Open Firmware ecosystem that is open in both software and hardware.

OpenRMC, the framework for the next generation of management technology

OpenRMC, another project group, led by Inspur, is working on integrating OpenBMC and Redfish into a unified framework for next-generation data center management. It is a joint project of the Linux, DMTF, and OCP communities.

The Baseboard Management Controller (BMC) is an embedded management unit that monitors the status of the server and provides out-of-band management services. The BMC software stacks of all major server vendors are closed source and have poor compatibility, which affects the unified management of data center equipment. Therefore, in 2015, Facebook launched the OpenBMC open-source project, and the project was transferred to the Linux Foundation.

Redfish is the next-generation data center management standard developed by the DMTF standards organization to replace the current IPMI. IPMI has limited functionality and poor scalability, and is only suitable for managing small and medium-sized data centers. Redfish is highly scalable and feature-rich, and provides standardized, easy-to-integrate management interfaces for the diverse infrastructure of different vendors. In addition to servers, Redfish is gradually expanding its support for storage and networking to meet the advanced management needs of hyperscale data centers.
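Redfish exposes management data as JSON resources over a RESTful HTTPS interface, which is what makes it easy to integrate. As a rough illustration, the sketch below parses a hand-written sample shaped like a Redfish Thermal resource and flags sensors that are close to their critical threshold. The sample values, the chassis ID in the path, the `hot_sensors` helper, and the margin are all made up for this example; a real client would fetch the resource from a BMC over HTTPS with authentication.

```python
import json
from typing import List

# Hand-written sample in the shape of a Redfish Thermal resource.
# The values and the chassis ID "1" are illustrative only.
SAMPLE_THERMAL = """
{
  "@odata.id": "/redfish/v1/Chassis/1/Thermal",
  "Temperatures": [
    {"Name": "CPU1 Temp", "ReadingCelsius": 41, "UpperThresholdCritical": 90},
    {"Name": "Inlet Temp", "ReadingCelsius": 24, "UpperThresholdCritical": 42},
    {"Name": "DIMM Temp", "ReadingCelsius": null, "UpperThresholdCritical": 85}
  ]
}
"""


def hot_sensors(thermal: dict, margin_c: float = 10.0) -> List[str]:
    """Return names of sensors within margin_c degrees of their critical
    threshold. Sensors with a missing reading or threshold are skipped."""
    hot = []
    for sensor in thermal.get("Temperatures", []):
        reading = sensor.get("ReadingCelsius")
        limit = sensor.get("UpperThresholdCritical")
        if reading is None or limit is None:
            continue
        if limit - reading <= margin_c:
            hot.append(sensor["Name"])
    return hot


thermal = json.loads(SAMPLE_THERMAL)
readings = {s["Name"]: s["ReadingCelsius"] for s in thermal["Temperatures"]}

print(readings)
print(hot_sensors(thermal, margin_c=20.0))
```

Because every vendor's BMC is expected to return the same property names, the same few lines of client code can work across heterogeneous hardware, which is the integration benefit described above.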

OCP’s OpenRMC team has developed the industry’s first version of OpenBMC that is compliant with the Redfish specification, further enhancing the modularity and standardization of OpenBMC and accelerating the introduction of formal community standards. In the future, OpenRMC is expected to be integrated with Open Firmware to form a complete set of data center management architecture specifications deep into the firmware layer.

Unity, openness, integration, technology and industry

The technical content presented at OCP China Day adds up to a complete technology framework for the next-generation cloud data center: a new, upgraded hardware form in Open Rack 3.0, open integration, and richer management reaching from firmware to the data center. As 5G and AI are applied, these technologies will gradually replace existing ones and complete the upgrade of the entire technology ecosystem.

It is also worth mentioning that OCP, ODCC, and other open hardware and software communities, driven by real-world needs, are communicating and cooperating closely and even blurring their boundaries. At the conference, keynote speeches came not only from the OCP technical groups but also from the ODCC community; Baidu, for example, shared its practical experience with the Scorpio server.

Cindy19