
IDC’s insight on today’s flexible server architecture

IDC guest author | May 2024

This blog was written by a guest author from IDC for publication on Micron.com

One size server does not fit all

Extraordinary growth in data volume* and velocity is straining the data center. Generic, fixed-architecture servers that once met the needs of most data center workloads aren't flexible enough for today's ever-diversifying modern workloads, which demand different combinations of data processing, movement, storage and analysis capabilities.

*From 2023 to 2028, IDC forecasts that the volume of data created each year will increase at a CAGR of 24.4%.1
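For perspective, a quick back-of-the-envelope calculation (an illustrative sketch, not an IDC figure) shows what compounding that rate over the five-year window implies:

```python
# Back-of-the-envelope: compound IDC's forecast CAGR over the 2023-2028 window.
cagr = 0.244   # 24.4% CAGR for the volume of data created each year
years = 5      # 2023 -> 2028
growth_multiple = (1 + cagr) ** years
print(f"Annual data creation grows ~{growth_multiple:.2f}x over {years} years")
# Annual data creation grows ~2.98x over 5 years
```

In other words, at that rate the world creates roughly three times as much data per year in 2028 as it did in 2023.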
 

To provide those capabilities while conforming to performance, power and cost (performance/watt/$) constraints, enterprise and cloud data center architects often turn to white box servers because of their flexible architectures. In fact, these non-branded servers grew to almost 45% of overall worldwide server shipments in 2023. Architects are applying three major architectural techniques:

  • Artificial intelligence puts more intelligence in situ, that is, where the data are, pairing it with more intelligent memory, storage, networking and processing power to process, move, store and analyze data more efficiently. AI operates in many ways, big and small: analyzing large data sets and determining outcomes, deciding what data will be needed where (edge or core, for example) in an enterprise's infrastructure, and monitoring the network to determine who can be allowed on and who cannot. AI's deep penetration into so many aspects of IT and OT operations speaks to the care that must be taken to tune a system to provide the right capabilities.
  • Heterogeneous computing mixes and matches the memory, storage, processing and connectivity technologies within a server's configuration according to the needs of the workload; a minimal sketch after this list illustrates the idea. For example, servers that 10 years ago relied on GPUs integrated on fixed silicon dies now often have robust discrete GPUs supported by their own dedicated memories. AI servers that didn't exist 10 years ago exist now because many-core CPUs, high-end GPUs and specialized custom chips (ASICs) come together to serve the demanding throughput of AI.
  • Distributed computing moves servers to where the data are and adapts memory, storage, processing and connectivity capabilities to minimize the cost of moving data and to reduce latency between the data center and the end user. Servers in centralized core data centers handle high-performance tasks, requiring powerful CPUs, GPUs and FPGAs, while servers in edge data centers (edge being a computing paradigm where infrastructure and workloads are placed closer to where data are generated and consumed) handle more domain-specific tasks amid resource constraints, requiring more power-efficient CPUs and SoCs and low-power memory. Data center locations reflect a hybrid model: massive core data center servers coupled with edge servers located strategically close to the populations using the data.
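
As a rough illustration of the mix-and-match idea above, here is a minimal sketch, assuming a simplified and entirely hypothetical component catalog, of how a configuration might be keyed to the workload; it is not an actual sizing tool or vendor API:

```python
# Hypothetical sketch: heterogeneous server configuration keyed to the workload.
# All component names and numbers below are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class ServerConfig:
    cpu_cores: int
    accelerator: str  # discrete GPU, FPGA, custom ASIC, or none
    memory: str       # e.g., DDR5 modules, HBM3E on the accelerator
    storage: str      # e.g., NVMe SSD tiers
    network: str      # e.g., Ethernet, InfiniBand

def configure(workload: str) -> ServerConfig:
    """Pick components per workload instead of one fixed architecture."""
    if workload == "ai_training":     # core data center, throughput-bound
        return ServerConfig(96, "8x GPU", "DDR5 + HBM3E",
                            "high-IOPS NVMe", "InfiniBand or Ethernet back end")
    if workload == "edge_inference":  # resource-constrained, power-sensitive
        return ServerConfig(16, "SoC-integrated accelerator", "low-power DDR",
                            "compact NVMe", "Ethernet")
    return ServerConfig(32, "none", "DDR5", "NVMe", "Ethernet")  # general purpose

print(configure("ai_training"))
```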

Flex your data center’s potential

Flexible white box server architectures provide vast potential to tune a server's configuration according to the performance, power and cost needs of its targeted workloads. Compute, memory, storage and networking technologies today are far more scalable and cost-effective than they were just five years ago.
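
One simple way to frame that tuning (an illustrative metric of our own, not an IDC methodology) is a performance-per-watt-per-dollar figure of merit computed for each candidate configuration:

```python
# Illustrative figure of merit: performance per watt per dollar.
# The throughput, power and cost numbers are made up for the example;
# real sizing decisions use measured workload data.
def perf_per_watt_per_dollar(throughput_ops: float, watts: float, cost_usd: float) -> float:
    return throughput_ops / (watts * cost_usd)

config_a = perf_per_watt_per_dollar(throughput_ops=1.0e6, watts=400, cost_usd=12_000)
config_b = perf_per_watt_per_dollar(throughput_ops=1.6e6, watts=700, cost_usd=18_000)
print(f"config A: {config_a:.2e}, config B: {config_b:.2e}")
# The higher score wins for this workload; here config A does despite lower raw throughput.
```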

In compute, server microprocessor portfolios now offer a choice from as few as four cores for small, low-intensity workloads to as many as 144 cores for the highest-performance workloads; 288-core options will come into the mainstream in 2025. Further, server microprocessors support wider choices of memory capacity and I/O capacity. Through PCIe, a choice of high-performance accelerators, including GPUs, FPGAs and custom ASICs (many designed for AI and programmed to the intended workload), offloads work from the microprocessor, balancing performance and power needs across subsystems.

In memory, the industry is quickly transitioning to DDR5 for server main memory; DDR5 memory modules not only have more capacity but also carry more local intelligence (data buffers) and manage their own power consumption. For the accelerators, high-bandwidth memory (HBM), specifically HBM3E, is the standard today, providing dedicated, high-capacity, low-latency support for high-performance workloads such as AI model training.

In storage, AI is expected to be a catalyst for storing more data on SSDs. To adapt, system architects have inserted faster, higher-capacity, NVMe™-enabled drives into the memory and storage hierarchy to bring data to data processors faster. In storage infrastructure, the huge amounts of unstructured and structured data being used for training AI models are forcing storage architectures to combine object and file storage so that data pipelines can access data stored in both formats.

Networking is integral to cost- and performance-efficient servers. While data processing technologies like GPUs have received significant investment in the initial phases of AI infrastructure development, AI models move enormous amounts of data across server subsystems, from server to server, and across the data center. To minimize time in the network**, network ICs have increased Ethernet throughput to as much as 1,600 Gb/s, and compute architectures have segmented networks into conventional oversubscribed Ethernet networks, AI processing networks built on either Ethernet or InfiniBand™, and PCIe®- or NVLink™-enabled GPU back-end networks for scale-up connectivity.

**The time that data spends in the network can leave GPUs and CPUs idle up to 60% of the time.2
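To see why that idle time matters, here is a rough, hypothetical calculation (not an IDC model) of the compute effectively delivered when accelerators wait on the network:

```python
# Rough illustration: effective accelerator utilization vs. network wait time.
idle_fraction = 0.60                # GPUs/CPUs idle up to 60% of the time (see footnote)
print(f"Effective utilization: {1 - idle_fraction:.0%}")   # 40% of purchased compute
halved_idle = idle_fraction / 2     # hypothetical: a faster fabric halves the wait
print(f"With network wait halved: {1 - halved_idle:.0%}")  # 70% of purchased compute
```

Even a modest reduction in network wait translates directly into recovered accelerator time, which is why fabric throughput and topology receive so much architectural attention.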
 

The disruptive effect of generative AI as a major new workload requires the adoption of AI-ready infrastructure, an effort now entering a critical phase: buildout. Starting in 2024, enterprises will accelerate deployment of new AI-ready hardware and software infrastructure as they invest to drive meaningful gains in business and staff productivity as well as to reimagine customer digital experiences.

The flexibility that white box servers bring to the data center now is laying the groundwork for future data centers that are modular, highly scalable and supercharged by next-generation technologies like UCIe™, CXL™ and HBM3E, providing a roadmap for continuous architectural adaptation to accommodate the disruption of AI.
 

1 IDC, Worldwide IDC Global DataSphere Forecast, 2024–2028: AI Everywhere, But Upsurge in Data Will Take Time, Doc #US52076424, May 2024

2 IDC, Outlook for AI Semiconductors and Storage Components in IT Infrastructure, Doc #US51851524, February 2024

The opinions expressed in this article are those of the individual contributing author and not Micron Technology, Inc., its subsidiaries or affiliates. All information is provided “AS-IS” and neither Micron nor the author make any representations or warranties with respect to the information provided. Micron products are warranted as provided for in the products when sold, applicable data sheets or specifications. Information, products, and/or specifications are subject to change without notice. Micron and the Micron logo are trademarks or registered trademarks of Micron Technology, Inc. Any names or trademarks of third parties are owned by those parties and any references herein do not imply any endorsement, sponsorship or affiliation with these parties.

 

IDC guest author, Shane Rau

International Data Corporation (IDC) is a global market intelligence, data and events provider for the information technology, telecommunications and consumer technology markets.