Architecture and application scenarios in research
With a peak computing power of more than 38 petaflop/s (the number of double-precision floating-point operations the system can perform per second), the ZIH system Capella achieves very good rankings in the current November edition of the Top 500 list of the fastest high-performance computers. The system, installed by the Saxon company Megware and designed in close cooperation with Lenovo and other companies, comprises more than 140 nodes. Each node has four NVIDIA H100 accelerators and two AMD processors with 32 cores each, and each graphics processor (GPU) is equipped with 94 gigabytes of high-bandwidth memory. A very fast storage system of over 1 petabyte serves as a so-called “burst buffer”, supplying the AI accelerators with data at over 1,500 gigabytes/s, which benefits data-intensive applications such as the training of large AI models. Integrating Capella into the existing data center infrastructure also made it possible to connect the file systems of the Barnard HPC cluster efficiently to the overall system. With its hot-water cooling and waste heat recovery, Capella continues the high energy-efficiency standards of the ZIH data center.
Capella is versatile. With its combination of more than 560 fast AI accelerators and a fast cache for data delivery, however, it is particularly well suited to applications in artificial intelligence and data analysis and will be an essential tool for, among others, the members of the AI Competence Center ScaDS.AI Dresden/Leipzig. One important application scenario is the training and improvement of European language models, as in the OpenGPT-X project. Other application areas with high performance and memory requirements are found in medical research, where machine learning methods are used, for example, for cancer diagnosis and drug development, and in earth system science, for example to gain new insights into natural disasters and climate change from earth observation data.
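The figures above can be combined into a few aggregate numbers. The following is a minimal sketch, not an official specification: it treats the stated "more than" values as lower bounds, assumes decimal units (1 petabyte = 1,000,000 gigabytes), and the function and constant names are chosen purely for illustration.

```python
# Lower-bound figures taken from the text ("more than 140 nodes", etc.).
NODES = 140
GPUS_PER_NODE = 4            # NVIDIA H100 accelerators per node
CPUS_PER_NODE = 2            # AMD processors per node
CORES_PER_CPU = 32
HBM_PER_GPU_GB = 94          # high-bandwidth memory per GPU
BURST_BUFFER_GB = 1_000_000  # "over 1 petabyte", decimal units assumed
BURST_BUFFER_GB_PER_S = 1_500


def capella_lower_bounds():
    """Return aggregate lower bounds derived from the per-node figures."""
    gpus = NODES * GPUS_PER_NODE
    return {
        "gpus": gpus,
        "cpu_cores": NODES * CPUS_PER_NODE * CORES_PER_CPU,
        "hbm_total_gb": gpus * HBM_PER_GPU_GB,
        # Time to stream the whole burst buffer once at the stated rate.
        "buffer_sweep_s": BURST_BUFFER_GB / BURST_BUFFER_GB_PER_S,
    }


if __name__ == "__main__":
    for name, value in capella_lower_bounds().items():
        print(f"{name}: {value:,.0f}")
```

Under these assumptions the system has at least 560 GPUs, 8,960 CPU cores and roughly 52.6 terabytes of GPU memory, and reading the full burst buffer once at 1,500 gigabytes/s would take about eleven minutes.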
Supercomputing at the ZIH
As a National Supercomputing Center in the NHR network (National High Performance Computing), the ZIH has the task of providing researchers with precisely the infrastructure, method development and support they need to advance solutions to the complex research questions in their fields efficiently and sustainably. An important focus of the ZIH is on the strategic priorities of big data, data analytics and AI, particularly in the life and earth system sciences. To fulfill this mission, ZIH employees have for years been successfully operating, developing and testing new technologies for high-performance computing (HPC), data-intensive computing and energy-efficient computing. Alongside continuous in-house research, which focuses in particular on potential solutions to major HPC challenges, these efforts are accompanied by ongoing market analyses and an intensive exchange with users in order to offer tailor-made systems at the cutting edge of technology.
As with the previously installed clusters, Capella's components also offer interesting opportunities for measurements in energy efficiency research and performance optimization at the ZIH.
In addition to direct HPC support for the systems provided, which is open to all users, the ZIH portfolio also includes further training. The numerous training courses on offer range from introductions to using the systems to training on the available software tools and programming. As a training company, the ZIH also offers first-class vocational training for IT specialists, thereby addressing the future need for support services and the technical operation of the systems.
Financing
The cluster is funded through National High Performance Computing (NHR@TUD) and the AI Competence Center ScaDS.AI Dresden/Leipzig (financed in equal parts by the BMBF and the Free State of Saxony), as well as by the German Center for Astrophysics.
Information on the ranking of the Top 500 and the GreenHPC list
While most of the systems on the list are located in the USA (172), the overall ranking shows that the most powerful European computers are located in Italy, Finland and Switzerland. With 41 systems, Germany is ahead of France (24) and the UK (14). The medium-sized company Megware from Chemnitz is well represented in the current list, with eight computers at German universities, including three new systems, one of which is Capella at TU Dresden.
The Green 500 list ranks the systems in the Top 500 according to their energy efficiency, which is measured in gigaflops/watt. The decisive factor here is therefore not raw performance or the size of the system, but how much computing power it delivers per watt of electrical power consumed, which depends primarily on its technology.
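The efficiency metric described above is a simple ratio of computing performance to electrical power. The sketch below illustrates why a smaller system can rank ahead of a larger one; the function name and both example systems are hypothetical, and the numbers are invented for illustration rather than taken from any list.

```python
def gflops_per_watt(performance_pflops: float, power_kw: float) -> float:
    """Energy efficiency in gigaflops per watt.

    performance_pflops: sustained performance in petaflop/s
    power_kw: electrical power drawn while delivering it, in kilowatts
    """
    if power_kw <= 0:
        raise ValueError("power_kw must be positive")
    gigaflops = performance_pflops * 1e6  # 1 petaflop/s = 1,000,000 gigaflop/s
    watts = power_kw * 1e3                # 1 kilowatt = 1,000 watts
    return gigaflops / watts


if __name__ == "__main__":
    # Two hypothetical systems: one large, one small but more efficient.
    large = gflops_per_watt(performance_pflops=10.0, power_kw=500.0)
    small = gflops_per_watt(performance_pflops=2.0, power_kw=40.0)
    print(f"large system: {large:.1f} gigaflops/watt")
    print(f"small system: {small:.1f} gigaflops/watt")
```

In this invented example the larger system delivers five times the performance but only 20 gigaflops/watt, while the smaller one reaches 50 gigaflops/watt and would therefore be placed higher in an efficiency ranking.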
Contact
Jacqueline Papperitz
Project Coordination/Public Relations
CIDS – Center for Interdisciplinary Digital Sciences
Tel.: +49(351)463-32431
Email: jacqueline.papperitz@tu-dresden.de
– – – – –
Further links
👉 https://tu-dresden.de
Photo: TU Dresden / Daniel Hackenberg