AI at the Edge: Future memory and storage in accelerating intelligence
The expanding use of AI in industry is driving more complex approaches, including machine learning (ML), deep learning and even large language models. These advancements offer a glimpse of the massive amounts of data expected to be used at the edge. Although much of the current focus has been on accelerating neural network operations, Micron is focused on memory and storage that are refined for AI at the edge.
What is synthetic data?
IDC¹ predicts that, by 2025, 175 zettabytes (1 zettabyte = 1 billion terabytes) of new data will be generated worldwide. These quantities are hard to fathom, yet advances in AI will continue to push the envelope for data-starved systems.
In fact, ever-larger AI models have been constrained by the limited amount of real physical data that can be obtained from direct measurements or physical images. It’s easy to identify an orange if you have a sample of 10,000 readily available images of oranges. But if you need to compare specific scenes (for example, a random crowd vs. an organized march, or anomalies in a baked cookie vs. a perfect cookie), accurate results can be difficult to confirm unless you have all the variant samples needed to create your baseline model.
The industry is increasingly using synthetic data.² Synthetic data is artificially generated from simulation models that, for example, reproduce the statistical properties of real images. This approach is especially useful in industrial vision systems, where baselines for physical images are unique and where not enough “widgets” can be found on the web to build a valid model representation.
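As a rough illustration of simulation-based synthetic data, the sketch below generates labeled grayscale "cookie" images from a very simple parametric model: a bright disc with small random noise, plus an optional dark blemish standing in for an anomaly. The function names, image size and brightness values are all hypothetical choices made for this example. Real industrial pipelines typically rely on far richer rendering or simulation tools.

```python
import random

def synth_cookie(size=16, defective=False, rng=None):
    """Return a size x size grayscale image (list of rows, values 0-255)."""
    rng = rng or random.Random()
    c = (size - 1) / 2       # centre of the disc
    radius = size * 0.4      # hypothetical cookie radius
    img = [[0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            if (x - c) ** 2 + (y - c) ** 2 <= radius ** 2:
                # Cookie surface: base brightness plus small noise
                img[y][x] = 180 + rng.randint(-10, 10)
    if defective:
        # Paint a dark 2x2 blemish at a random spot near the centre
        bx = rng.randint(size // 2 - 2, size // 2 + 1)
        by = rng.randint(size // 2 - 2, size // 2 + 1)
        for dy in range(2):
            for dx in range(2):
                img[by + dy][bx + dx] = 40
    return img

def make_dataset(n, defect_rate=0.5, seed=0):
    """Return n (image, label) pairs; label 1 marks a defective sample."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        defective = rng.random() < defect_rate
        data.append((synth_cookie(defective=defective, rng=rng), int(defective)))
    return data

dataset = make_dataset(100)
print(sum(label for _, label in dataset), "defective samples out of", len(dataset))
```

Because every sample comes from a seeded generator, the dataset can be regenerated on demand instead of being stored in full, although any curated or rendered datasets still need a home in the cloud or at the edge.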
Source: “Forget About Your Real Data – Synthetic Data Is the Future of AI,” Maverick Research, 2021, via “What Is Synthetic Data,” Gerard Andrews, NVIDIA, 2021.
Of course, the challenge is where these new forms of data will reside. Certainly, any new datasets that are created must be stored either in the cloud or, for more unique representations, closer to where data needs to be analyzed – at the edge.
Model complexity and the memory wall
Finding the optimal balance between algorithmic efficiency and AI model performance is a complex task, as it depends on factors such as data characteristics and volume, resource availability, power consumption, workload requirements and more.
AI models are complex algorithms that can be characterized by their number of parameters: in general, the greater the number of parameters, the more accurate the results. The industry started with a common baseline model such as ResNet50, which was easy to implement and became the baseline for network performance. But that model was focused on limited datasets and limited applications. As newer architectures such as transformers have evolved, their parameter counts have grown faster than memory bandwidth.³ The resulting strain is obvious: regardless of how much data the model can handle, we are limited by the bandwidth of the memory and storage available for the model and its parameters.
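To make the bandwidth limit concrete, the following sketch estimates how much memory a model's weights occupy and the minimum time needed to read every weight once at a given memory bandwidth. It is a back-of-the-envelope bound only: the ResNet50 parameter count is an approximate, commonly quoted figure, the 34 GB/s bandwidth is the LPDDR5X x32 value from Table 1, and the function names are hypothetical. Activation traffic, caching and on-chip weight reuse are ignored.

```python
def weight_bytes(num_params, bytes_per_param):
    """Memory footprint of the model weights in bytes."""
    return num_params * bytes_per_param

def min_weight_stream_time_s(num_params, bytes_per_param, bandwidth_gb_s):
    """Lower bound on the time to read every weight once from memory."""
    return weight_bytes(num_params, bytes_per_param) / (bandwidth_gb_s * 1e9)

RESNET50_PARAMS = 25_600_000   # approximate parameter count (assumption)
BANDWIDTH_GB_S = 34            # LPDDR5X x32 device bandwidth from Table 1

for precision, bytes_per_param in [("FP32", 4), ("INT8", 1)]:
    size_mb = weight_bytes(RESNET50_PARAMS, bytes_per_param) / 1e6
    t_ms = min_weight_stream_time_s(
        RESNET50_PARAMS, bytes_per_param, BANDWIDTH_GB_S) * 1e3
    print(f"{precision}: {size_mb:.1f} MB of weights, "
          f">= {t_ms:.2f} ms per full weight read")
```

The same arithmetic scales linearly: a model with 100 times more parameters needs 100 times longer to stream its weights unless bandwidth grows with it, which is the gap the memory wall describes.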
Evolution of the number of parameters of state-of-the-art (SOTA) models over the years, along with the AI accelerator memory capacity (green dots). Source: “AI and Memory Wall,” Amir Gholami, Medium, 2021.
For a quick comparison, we can look at an embedded AI system’s performance in tera operations per second (TOPS). Here we see that AI edge devices delivering less than 100 TOPS may need around 225 GB/s of memory bandwidth, and those above 100 TOPS may require 451 GB/s (Table 1).
| | Intelligent endpoints | Customer premise edge | Infrastructure edge |
|---|---|---|---|
| INT8 TOPS | <20 | <100 | ~100 - 200 |
| Memory BW required* | 90 GB/s | 225 GB/s | 451 GB/s |
| I/O width requirements | x16, x32 | x64, x128 | x256 |
| Memory solutions | | | |
| Compute DRAM | LPDDR4 | LPDDR5; LPDDR5X | |
| Max transfer rate per pin | 4.2 GT/s | 6.4 GT/s; 8.5 GT/s | |
| Max device BW (x32) | 13 GB/s | 26 GB/s; 34 GB/s | |
Table 1 – Comparing AI system memory bandwidth requirements and memory technology device bandwidth. (* Estimated bandwidth required to saturate DLA for INT8 Resnet 50 model). Micron.
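The device bandwidth figures in Table 1 can be approximated from the per-pin transfer rate and the bus width. The sketch below uses the standard peak-bandwidth relation (transfers per second times bits per transfer, divided by 8) and then estimates how many x32 devices would be needed to reach each tier's required bandwidth. The function names are hypothetical, and the result is a theoretical peak that ignores refresh, bus turnaround and other real-world overheads. Note that 4.2 GT/s on a x32 bus works out to 16.8 GB/s, so the table's 13 GB/s LPDDR4 figure would correspond to roughly 3.2 GT/s; the sketch does not attempt to resolve that difference.

```python
import math

def peak_bandwidth_gb_s(rate_gt_s, bus_width_bits):
    """Theoretical peak: transfers/s x bits per transfer / 8 bits per byte."""
    return rate_gt_s * bus_width_bits / 8

def devices_needed(required_gb_s, device_gb_s):
    """Smallest number of devices whose combined peak meets the requirement."""
    return math.ceil(required_gb_s / device_gb_s)

# Per-pin rates from Table 1, evaluated for a x32 device
for name, rate in [("LPDDR4", 4.2), ("LPDDR5", 6.4), ("LPDDR5X", 8.5)]:
    print(f"{name}: {peak_bandwidth_gb_s(rate, 32):.1f} GB/s peak (x32)")

# Required bandwidth per tier from Table 1, served by 34 GB/s LPDDR5X devices
for tier, required in [("Intelligent endpoints", 90),
                       ("Customer premise edge", 225),
                       ("Infrastructure edge", 451)]:
    print(f"{tier}: {devices_needed(required, 34)} x32 LPDDR5X devices")
```

This lines up with the I/O width row of the table: higher tiers need wider aggregate buses, not just faster pins, to reach the required bandwidth.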
So, one way to optimize that model is to consider higher-performing memory that also offers low power consumption.
Memory is keeping up with AI accelerated solutions by evolving with new standards. For example, LPDDR4/4X (low-power DDR4 DRAM) and LPDDR5/5X (low-power DDR5 DRAM) solutions offer significant performance improvements over prior technologies.
LPDDR4 can run at up to 4.2 GT/s per pin (gigatransfers per second per pin) and supports bus widths of up to x64. LPDDR5 raises the per-pin rate by about 50%, to 6.4 GT/s, and LPDDR5X roughly doubles LPDDR4's rate, to as much as 8.5 GT/s per pin. In addition, LPDDR5 offers 20% better power efficiency than LPDDR4X (source: Micron). These are significant developments that can support a widening range of AI edge use cases.
What are the storage considerations?
It’s not enough to think that compute resources are limited only by the raw TOPS of the processing unit or by the bandwidth of the memory architecture. As ML models become more sophisticated, the number of parameters in a model is expanding exponentially as well.
Machine learning models and datasets expand to achieve better model efficiencies, so higher-performing embedded storage will be needed as well. Typical managed NAND solutions such as e.MMC 5.1 with 3.2 Gb/s are ideal not only for code bring-up but also for remote data storage. In addition, solutions such as UFS 3.1 can run seven times faster — to 23.2 Gb/s — to allow for more complex models.
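The practical effect of these interface speeds can be sketched as a best-case model load time. The example below converts the quoted interface rates from gigabits to bytes per second and divides a model size by the result. The 1 GB model size and the function name are hypothetical, and real devices will deliver less than the interface peak because of protocol overhead and NAND read performance.

```python
def load_time_s(model_size_gb, interface_gbit_s):
    """Best-case time to read a model at the interface's peak rate."""
    bytes_per_s = interface_gbit_s * 1e9 / 8   # gigabits/s -> bytes/s
    return model_size_gb * 1e9 / bytes_per_s

INTERFACES = {"e.MMC 5.1": 3.2, "UFS 3.1": 23.2}   # Gb/s, as quoted in the text
MODEL_SIZE_GB = 1.0                                # hypothetical model size

for name, rate in INTERFACES.items():
    print(f"{name}: {load_time_s(MODEL_SIZE_GB, rate):.2f} s to load "
          f"{MODEL_SIZE_GB:.0f} GB")
```

Under these assumptions, the ratio between the two load times equals the ratio of the interface rates, about 7.25, which matches the "seven times faster" figure.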
New architectures are also pushing functions to the edge that were typically relegated to cloud or IT infrastructure. For example, edge solutions implement a secure layer that offers an air gap between restricted operation data and the IT/cloud domain. AI at the edge also supports intelligent automation such as categorizing, tagging and retrieving stored data.
Storage developments such as NVMe™ SSDs built on 3D TLC NAND offer high performance for various edge workloads. For example, Micron’s 7450 NVMe SSD uses 176-layer NAND technology that’s ideal for most edge and data center workloads. With 2 ms quality of service (QoS) latency, it suits the performance requirements of SQL Server platforms. It also offers FIPS 140-3 Level 2 and TAA compliance for U.S. federal government procurement requirements.
The growing ecosystem of AI edge processors
Allied Market Research estimates the AI edge processor market will grow to $9.6 billion by 2030.⁴ Interestingly, a new cohort of AI processor start-ups is developing ASICs and proprietary ASSPs geared for more space- and power-constrained edge applications. These new chipsets also need to balance performance against power when it comes to memory and storage solutions.
In addition, AI chipset vendors have developed enterprise and data center standard form factor (EDSFF) accelerator cards that can be installed in a 1U solution and co-located with storage servers. The same module can be adapted to accelerate any workload, from AI/ML inference to video processing.
How do you seek the right memory and storage partner?
AI is no longer hype but a reality that’s being implemented in all verticals. In one study, 89% of industry already has, or within the next two years will have, a strategy around AI at the edge.⁵
But implementing AI is not a trivial task, and the right technologies and components will make all the difference. Micron’s portfolio of the latest technologies, both in memory and storage, leads the way for industrial customers with our IQ value proposition. If you are designing an AI edge system, let Micron help get your product to market faster than ever. Contact your local Micron representative or distributor of Micron products (www.micron.com).
1 Source: “The Digitization of the World – From Edge to Core,” IDC/Seagate, 2018.
2 Source: “Forget About Your Real Data – Synthetic Data Is the Future of AI,” Maverick Research, 2021, via “What Is Synthetic Data,” Gerard Andrews, NVIDIA, 2021.
3 Source: “AI and Memory Wall,” Amir Gholami, Medium, 2021.
4 Source: “Edge AI Processor Market Research, 2030,” Allied Market Research, June 2022.
5 Source: “Mastering Digital Transformation in Manufacturing,” Jash Bansidhar, Advantech Connect, 2023.