The world has changed dramatically since generative AI made its debut. Businesses are beginning to use it to summarize online reviews. Consumers are getting issues resolved through chatbots. Employees are accomplishing their jobs faster with AI assistants. What these AI applications have in common is that they rely on generative AI models that have been trained on high-performance, back-end networks in the data center and served through AI inference clusters deployed in data center front-end networks.
Training models can use billions or even trillions of parameters to process massive data sets across artificial intelligence/machine learning (AI/ML) clusters of graphics processing unit (GPU)-based servers. Any delays, such as those from network congestion or packet loss, can dramatically impact the accuracy and training time of these AI models. As AI/ML clusters grow ever larger, the platforms used to build them need to support higher port speeds as well as higher radices (that is, more ports per device). A higher radix allows flatter topologies to be built, which reduces layers and improves performance.
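To make the radix-versus-layers tradeoff concrete, here is a minimal back-of-the-envelope sketch. It applies standard folded-Clos math and is an illustration, not a vendor sizing tool; it assumes a non-blocking design in which every switch splits its ports evenly between downlinks and uplinks.

```python
# Rough upper bound on addressable hosts in a non-blocking folded-Clos
# fabric, assuming each switch splits its radix 50/50 between
# downlinks and uplinks (a common textbook simplification).

def max_hosts(radix: int, tiers: int) -> int:
    half = radix // 2
    # Each additional tier multiplies the reachable leaf count by half
    # the radix, so hosts = 2 * (radix/2) ** tiers.
    return 2 * half ** tiers

# A 512-port device at two tiers already addresses more hosts than a
# 64-port device reaches with three tiers:
print(max_hosts(512, 2))  # 131072
print(max_hosts(64, 3))   # 65536
```

Fewer tiers mean fewer switch hops, fewer optics, and lower latency, which is why a higher radix translates directly into flatter, simpler fabrics.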
Meeting the demands of high-performance AI clusters
Recently, we have seen GPU needs for scale-out bandwidth increase from 200G to 400G to 800G, accelerating connectivity requirements compared to traditional CPU-based compute solutions. The density of the data center leaf must increase accordingly, while also maximizing the number of addressable nodes with flatter topologies.
To address these needs, we're introducing the Cisco 8122-64EH/EHF with support for 64 ports of 800G. This new platform is powered by the Cisco Silicon One G200, a 5 nm 51.2T processor that uses 512 x 112G SerDes, which enables high scaling capabilities in just a two-rack-unit (2RU) form factor (see Figure 1). With 64 QSFP-DD800 or OSFP interfaces, the Cisco 8122 supports options for 2x 400G and 8x 100G Ethernet connectivity.
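The headline figures follow directly from the port math; the quick sketch below (illustrative arithmetic only) checks the aggregate bandwidth and the breakout port counts:

```python
# Port math for a 64-port 800G system (illustrative arithmetic only).
PORTS = 64
PORT_SPEED_G = 800

aggregate_tbps = PORTS * PORT_SPEED_G / 1000
print(aggregate_tbps)   # 51.2, matching the G200's 51.2T capacity

# Each QSFP-DD800/OSFP cage can break out to 2x 400G or 8x 100G:
print(PORTS * 2)        # 128 x 400G ports
print(PORTS * 8)        # 512 x 100G ports
```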
The Cisco Silicon One architecture, with its fully shared packet buffer for congestion control and its P4 programmable forwarding engine, along with the Silicon One software development kit (SDK), is proven and trusted by hyperscalers globally. Through major innovations, the Cisco Silicon One G200 delivers 2x the performance and power efficiency, as well as lower latency, compared to the previous-generation device.
With the introduction of the Cisco Silicon One G200 last year, Cisco was first to market with a 512-wide radix, which can help cloud providers lower costs, complexity, and latency by designing networks with fewer layers, switches, and optics. Advancements in load balancing, link-failure avoidance, and congestion response/avoidance help improve job completion times and reliability at scale for better AI workload performance (see Cisco Silicon One Breaks the 51.2 Tbps Barrier for more details).
The Cisco 8122 supports open network operating systems (NOSs), such as Software for Open Networking in the Cloud (SONiC), and other third-party NOSs. Through broad application programming interface (API) support, cloud providers can use their own tooling for management and visibility to efficiently operate the network. With these customizable options, we're making it easier for hyperscalers, and other cloud providers adopting the hyperscaler model, to meet their requirements.
In addition to scaling out back-end networks, the Cisco 8122 can also be used for mainstream workloads in front-end networks, such as email and web servers, databases, and other traditional applications.
Improving customer outcomes
With these innovations, cloud providers can benefit from:
Simplification: Cloud providers can streamline networks by reducing the number of platforms needed to scale, using high-capacity, compact systems that rely on fewer networking layers, fewer optics, and less cabling. Having fewer platforms to manage also reduces complexity, which can help lower operational costs.
Flexibility: An open platform allows cloud providers to choose the network operating system (NOS) that best fits their needs and to develop custom automation tools that operate the network through APIs.
Network speed: Scaling the infrastructure efficiently leads to fewer potential bottlenecks and delays that could cause slower response times and undesirable outcomes with AI workloads. Advanced congestion management, optimized reliability capabilities, and increased scalability help enable better network performance for AI/ML clusters.
Sustainability: The power efficiency of the Cisco Silicon One G200 can help cloud providers meet data center sustainability goals. The higher radix helps reduce the number of devices, using a flatter structure to better control power consumption.
The future of cloud network infrastructure
We're giving cloud providers the flexibility to meet critical cloud network infrastructure requirements for AI training and inferencing with the Cisco 8122-64EH/EHF. With this platform, cloud providers can better control costs, latency, space, power consumption, and complexity in both front-end and back-end networks. At Cisco, we're investing in silicon, systems, and optics to help build scalable, high-performance data center networks that enable cloud providers to deliver high-quality outcomes and insights quickly with AI and mainstream workloads.
The Open Compute Project (OCP) Global Summit is October 15–17, 2024, in San Jose. Come visit us in the community lounge to learn more about our exciting new innovations; customers can sign up to see a demo here.