Intel claims to have doubled the performance and value of its Xeon family of data centre processors with the introduction of a new sixth-generation Xeon 6 line, plus new Gaudi 3 accelerators. The former – sold as Xeon 6 with P-cores (‘performance’ cores) – doubles the performance of its fifth-generation Xeon central processing units (CPUs) for artificial intelligence (AI) and high-performance computing (HPC) workloads, it said; the latter offers “up to” 20 percent more throughput and twice the compute value (“price/performance”) of Nvidia’s H100 graphics processing unit (GPU).
It based the latter measure on inference performance with the LLaMa 2 70B foundation language model. The new P-core chips, codenamed Granite Rapids and aimed at enterprise data centres, replace the old ‘Scalable Xeon’ (‘Emerald Rapids’) units; they sit alongside its second-track E-core platform, named Xeon 6 Sierra Forest and announced previously, for cloud customers. The dual-track P-line and E-line nomenclature distinguishes CPUs for enterprise-geared HPC in data centres from cloud-oriented core density and energy efficiency in multi-threaded workloads.
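Taken together, the two headline figures imply something about pricing. A minimal sketch of the arithmetic, assuming the conventional definition of price/performance as throughput divided by price (the ratios are the article's claimed figures, not measured data):

```python
# What Intel's Gaudi 3 claims imply, assuming price/performance = throughput / price
# and taking the article's figures at face value.

throughput_ratio = 1.2   # "up to" 20% more inference throughput than Nvidia's H100
price_perf_ratio = 2.0   # claimed 2x price/performance vs the H100

# If price/performance doubles while throughput rises only 1.2x,
# the implied relative price is:
implied_price_ratio = throughput_ratio / price_perf_ratio
print(f"Implied Gaudi 3 price relative to H100: {implied_price_ratio:.0%}")
# i.e. the claim only holds if Gaudi 3 sells for roughly 60% of an H100's price
```

In other words, most of the claimed price/performance advantage would come from price rather than throughput.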
Intel’s new Xeon 6 with P-cores family comprises five CPUs so far, available with 64, 72, 96, 120, or 128 cores – more than doubling the core count of the fifth-generation Xeon line at the high end. They are being produced on Intel’s new 3nm-class process technology (Intel 3), actually classed as 5nm, where its previous generation was on Intel 7 at 10nm. The Xeon 6 also features double the memory bandwidth, and AI acceleration in every core. “This processor is engineered to meet the performance demands of AI from edge to data center and cloud environments,” the firm said.
Meanwhile, the Gaudi 3 accelerator features 64 tensor processor cores (TPCs) and eight matrix multiplication engines (MMEs) for deep neural network computations. It includes 128GB of HBM2e memory for training and inference, and 24 200Gb Ethernet ports for scalable networking. It is compatible with the PyTorch framework and Hugging Face transformer and diffuser models. Intel is working with IBM to deploy Gaudi 3 AI accelerators as-a-service on IBM Cloud – to further “lower total cost of ownership (TCO) to leverage and scale AI, while enhancing performance”, it said.
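In practice, PyTorch compatibility means Gaudi hardware is addressed like any other PyTorch device. A hedged sketch of what device selection might look like, assuming Intel's Gaudi PyTorch bridge (`habana_frameworks`) is installed and that it exposes Gaudi cards under the `"hpu"` device type; the fallback means the script still runs on machines without the bridge or the hardware:

```python
# Hedged sketch: select a Gaudi ("hpu") device if the Gaudi PyTorch bridge is
# present, otherwise fall back to CPU. habana_frameworks and torch.hpu are
# assumptions based on Intel's Gaudi software stack, not shown in the article.
try:
    import torch
    import habana_frameworks.torch.core  # noqa: F401 - registers the "hpu" device
    device = "hpu" if torch.hpu.is_available() else "cpu"
except ImportError:
    # No PyTorch or no Gaudi bridge installed - run on CPU instead.
    device = "cpu"

print(f"Selected device: {device}")
# A model would then be moved to the device as usual, e.g. model.to(device).
```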
Intel is also working with OEMs including Dell Technologies and Supermicro to develop co-engineered systems tailored to specific customer needs for effective AI deployments, it said. Dell is currently co-engineering retrieval-augmented generation (RAG) solutions using Gaudi 3 and Xeon 6. It stated: “Transitioning gen AI from prototypes to production systems presents challenges in real-time monitoring, error handling, logging, security and scalability. Intel addresses these through co-engineering efforts with OEMs and partners to deliver production-ready RAG solutions.”
It continued: “These solutions, built on the Open Platform Enterprise AI (OPEA) platform, integrate OPEA-based microservices into a scalable RAG system, optimized for Xeon and Gaudi AI systems, designed to allow customers to easily integrate applications from Kubernetes, Red Hat OpenShift AI, and Red Hat Enterprise Linux AI.” It talked up its Tiber portfolio to tackle developer challenges around cost, complexity, security, and scalability. It is offering Xeon 6 preview systems for evaluation and testing, and early access to Gaudi 3 for validating AI model deployments.
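For readers unfamiliar with the pattern, RAG retrieves relevant documents and feeds them to a language model as context. A generic toy illustration of the retrieval step – not Intel's or OPEA's implementation; bag-of-words cosine similarity stands in for a real embedding model and vector store:

```python
# Toy illustration of RAG retrieval: rank documents by similarity to a query,
# then prepend the best matches to the LLM prompt. Real systems use learned
# embeddings and a vector database; this uses word counts for clarity.
from collections import Counter
from math import sqrt

DOCS = [
    "Xeon 6 P-cores target high-performance data centre workloads",
    "Gaudi 3 accelerators handle training and inference for large models",
    "E-core Xeons prioritise core density and energy efficiency",
]

def embed(text: str) -> Counter:
    # Stand-in for an embedding model: simple bag-of-words vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

# The retrieved context would be inserted into the generation prompt.
context = retrieve("which chip is for training large models?")
print(context[0])
```

The production concerns quoted above (monitoring, error handling, scaling) sit around this core loop, which is why the announcement frames RAG as a systems-integration problem rather than a model problem.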
Justin Hotard, executive vice president and general manager of Intel’s data centre and AI group, said: “Demand for AI is leading to a massive transformation in the data center, and the industry is asking for choice in hardware, software and developer tools. With our launch of Xeon 6 with P-cores and Gaudi 3 AI accelerators, Intel is enabling an open ecosystem that allows our customers to implement all of their workloads with greater performance, efficiency and security.”