The use of data processing units (DPUs) is beginning to grow in large enterprises as AI, security and networking applications demand greater system performance.
Much DPU development to date has been aimed at hyperscalers. Looking ahead, DPU use in the data center and elsewhere in the enterprise network is expected to grow. One way that could happen is the melding of DPU technology with networking switches – a technology combination AMD Pensando calls a “smartswitch.”
An early entrant in that category is HPE Aruba’s CX 10000, which combines DPU technology from AMD Pensando with high-end switching capabilities. Available since early 2022, the CX 10000 is a top-of-rack, L2/3 data-center box with 3.6Tbps of switching capacity. The box eliminates the need for separate appliances to handle, for example, low-latency traffic, security and load balancing.
“We think smartswitches are the easiest way for enterprises to absorb DPU technology because it lets them retire old appliances and bring significant technology and scale to their networks,” said Soni Jiandani, chief business officer with AMD Pensando’s networking technologies and solutions group. “It lowers the barrier to entry for a lot of businesses as well – if we look at the 300-plus installations of the CX 10000 so far, you see a mix of very large to mid-sized customers looking to take advantage of DPU assets just like cloud customers.”
The smartswitch and DPU integration follows core enterprise initiatives such as consolidation, modernization and security, Jiandani said. “On day one of implementation, it lets them take advantage of great visibility, telemetry and performance.”
While the CX 10000 is currently the only switch on the market to integrate DPU technology, more are expected, experts say.
“During 2022 and in 1Q’23, we saw robust growth in the Smartswitch line from HPE Aruba and the CX 10000 platform. We expect to see more vendor partnerships productizing Smartswitch platforms over the next couple of years, with the most prominent western vendors (Cisco, Arista, Juniper, Dell) to explore and most to release this product class in the 2024 time frame,” stated Alan Weckel, founding technology analyst for the 650 Group, in a recent report.
By the end of the forecast period, over half of the data-center switch ports in the 650 Group’s forecast will be smart or programmable, whether through DPU-based solutions or through direct programmability in the switch ASIC itself, according to Weckel.
“As the data center market moves beyond traditional workloads to AI/ML, the network will need to evolve and become more than just speeds and feeds providing connectivity between compute appliances and the end-user,” Weckel stated.
“Traditional switching ASICs don’t have the processing capacity, sufficient hardware memory resources, or flexible programmable data planes to allow them to implement stateful network functions or services,” Weckel stated. “Networking will become more powerful, and stateful network functions for network virtualization, enhanced security (e.g., stateful firewalls), load balancing, QoS, and cost metering will migrate from costly appliances into Ethernet switches.”
Customers will get increased performance, cost savings, and better agility from their network with DPUs embedded in it, Weckel stated.
In virtual environments, putting functions like network-traffic encryption and firewalling onto DPUs is also expected to drive use of the technology. The processing required to enforce microsegmentation policies that divide networks into firewalled zones can also be handled by smartNICs, experts say.
“The ability to deliver east-west security and microsegmentation and firewall capabilities on every server and to protect the applications through a distributed policy-based model will be a core tenet of the DPU,” Jiandani said. “Currently customers see anywhere from a 30% to a 60% total cost of ownership reduction within the DPU environment.”
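The distributed, policy-based model described above can be illustrated with a minimal sketch of a microsegmentation policy check. The zone names, IP addresses, and rules below are hypothetical examples, not from any vendor's product; a real DPU policy engine evaluates rules like these statefully and at line rate in hardware, but the decision logic is conceptually similar.

```python
# Hypothetical sketch of a microsegmentation policy check for
# east-west traffic between firewalled zones. All names and
# addresses here are illustrative, not from a real deployment.

# Map workloads (by IP) to zones.
ZONE_OF = {
    "10.0.1.10": "web",
    "10.0.2.20": "app",
    "10.0.3.30": "db",
}

# Allow-list of (source zone, destination zone, destination port).
# Anything not listed is denied by default.
POLICY = {
    ("web", "app", 8080),
    ("app", "db", 5432),
}

def allow(src_ip: str, dst_ip: str, dst_port: int) -> bool:
    """Return True if the flow is permitted by the segmentation policy."""
    src_zone = ZONE_OF.get(src_ip)
    dst_zone = ZONE_OF.get(dst_ip)
    if src_zone is None or dst_zone is None:
        return False  # unknown workloads are denied by default
    return (src_zone, dst_zone, dst_port) in POLICY

print(allow("10.0.1.10", "10.0.2.20", 8080))  # web -> app on 8080: True
print(allow("10.0.1.10", "10.0.3.30", 5432))  # web -> db directly: False
```

The default-deny posture is the point of microsegmentation: rather than permitting any traffic inside the perimeter, only explicitly allowed zone-to-zone flows pass, and offloading that enforcement to the DPU keeps it off the server CPU.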
Enterprise organizations can utilize DPUs for other core applications as well.
“Storage offloads include accelerators for inline encryption and support for NVMe-oF. The hypervisor can also be moved from the CPU to the SmartNIC, as in the case of Project Monterey from VMware, potentially improving utilization without significant customizations,” said Dell’Oro Group senior director Baron Fung in a recent SmartNIC Summit presentation.
As part of its Project Monterey, VMware developed a feature called DPU-based Acceleration for NSX, which lets customers move networking, load balancing, and security functions to a DPU, freeing up server CPU capacity. The system can support distributed firewalls on the DPU, or large database servers that could securely handle tons of traffic without impacting their server environment, according to VMware.
“While Project Monterey is expected to spur enterprise Smart NIC adoption and is supported by the major vendors such as AMD, Intel, and Nvidia, traction has been slow this year to date as end-users are still assessing the total cost of ownership (TCO) of Smart NICs,” Fung stated.
While growth of the standard network interface card market is stagnating to low single digits over the next five years, Dell’Oro projects growth of the SmartNIC market, which includes variants such as data processing units (DPUs) and infrastructure processing units (IPUs), to surpass 30%, Fung said.
Another major application is to help large enterprise customers support AI applications. In its most recent five-year data center forecast, Dell’Oro Group stated that 20% of Ethernet data center switch ports will be connected to accelerated servers to support AI workloads by 2027. The rise of new generative AI applications will help fuel more growth in an already robust data center switch market, which is projected to exceed $100 billion in cumulative sales over the next five years, said Sameh Boujelbene, vice president at Dell’Oro.
In another recent report, the 650 Group stated that AI/ML puts a tremendous amount of bandwidth performance requirements on the network, and AI/ML is one of the major growth drivers for data center switching over the next five years. “With bandwidth in AI growing, the portion of Ethernet switching attached to AI/ML and accelerated computing will migrate from a niche today to a significant portion of the market by 2027,” the 650 Group stated.
Ethernet technologies will continue to evolve to meet the growing requirements of AI networking, Jiandani said.
Speaking about AMD Pensando’s DPU technology, Jiandani said the advantage is that it’s programmable, so customers will be able to build customizable AI pipelines with their own congestion management capabilities.
One such effort is AMD’s support for the Ultra Ethernet Consortium (UEC).
AMD, Arista, Broadcom, Cisco, Eviden, HPE, Intel, Meta and Microsoft recently announced the UEC, a group hosted by the Linux Foundation that’s working to develop physical, link, transport and software layer Ethernet advances. The idea is to augment current Ethernet technology in order to handle the scale and speed required by AI.
“We have the ability to accommodate for the essential services that we need to deliver for AI networks and the applications that run on top of it,” Jiandani said. “We’re going to build out a broad ecosystem of partners, which will help lower the cost of AI networking and give customers the freedom of picking best-of-breed networking technologies. We want customers to be able to accommodate AI in a highly programmable way.”
Looking at the market for AI, Nvidia vice president of enterprise computing Manuvir Das said at the Goldman Sachs Communacopia and Tech Conference that the total addressable market for AI will consist of $300 billion in chips and systems, $150 billion in generative AI software, and $150 billion in omniverse enterprise software. These figures represent growth over the long term, Das said, though he did not specify a target date, according to a Yahoo Finance story.
Nvidia is capitalizing heavily on AI and the use of its GPU technology, largely in hyperscaler networks at this point. The company’s second-quarter revenue came in at $13.51 billion, a 101% jump year-over-year that it credited largely to AI development.