India began its supercomputing journey with the Param in the 1980s, after the United States (US) banned the export of Cray supercomputers. In 2008, ‘EKA’, from Tata’s Computational Research Laboratories, was ranked 9th among global high-performance computing (HPC) systems. This was a time when both India and China had only a handful of supercomputing systems that could make it to the TOP500 list. Since then, however, their supercomputing stories have diverged significantly. In India, benign neglect of the HPC sector has left the nascent ecosystem floundering, while China has made furious progress and its indigenously developed supercomputers now rank among the fastest in the world. Indeed, behind the recent hype surrounding the unveiling of petaflop HPC systems in India lies the sad reality that they are either wholly imported or assembled from imported components. A belated attempt is being made to revitalize the HPC sector in India, but before we examine that, let us take a closer look at what China has achieved.

 

 

China moves

China’s Sunway TaihuLight supercomputer, housed at the National Supercomputing Center in Wuxi (one of six National Supercomputing Centers in China), was developed by the National Research Center of Parallel Computer Engineering and Technology (NRCPC) using processors and interconnects designed and fabricated in China. It uses the 28 nm Sunway SW26010 processor and Sunway interconnects to deliver a Linpack performance of 93,014.6 Tflop/s, and currently ranks 3rd on the TOP500 list. Right behind it is Tianhe-2 (Milky Way-2) at the National Supercomputer Center in Guangzhou, which uses Intel Xeon E5-2692v2 processors and TH Express-2, a proprietary interconnection network designed by the National University of Defense Technology (NUDT), and delivers a Linpack performance (Rmax) of 61,444.5 Tflop/s.

TaihuLight was ranked first from June 2016 to November 2017. The Sunway SW26010 is a 260-core processor based on a RISC architecture and designed by the National High Performance Integrated Circuit Design Center, Shanghai, which was supported by the National Science and Technology Major Project (NMP): Core Electronic Devices, High-end Generic Chips, and Basic Software. The system was funded from three sources: the central Chinese government, the province of Jiangsu, and the city of Wuxi. Each contributed approximately 600 million RMB, for a total of 1.8 billion RMB or approximately $270 million. That figure covers the building, hardware, R&D, and software. Before TaihuLight, the Tianhe-2 supercomputer held the top rank for three years from June 2013, ahead of the U.S. “Titan” system at Oak Ridge National Laboratory. The stated goals for Chinese supercomputing, and for the Sunway machine in particular, cover four areas: advanced manufacturing (CAE, CFD), earth system modeling and weather forecasting, life science, and big data analytics.
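The funding figures above can be checked with a few lines of arithmetic. The sketch below sums the three contributions and derives the exchange rate implied by the quoted dollar figure. The variable names are illustrative, and the implied rate is only as precise as the rounded numbers in the text.

```python
# Three funders, each contributing roughly 600 million RMB (figures from the text).
contributions_rmb = {
    "central government": 600e6,
    "Jiangsu province": 600e6,
    "city of Wuxi": 600e6,
}

total_rmb = sum(contributions_rmb.values())

# The text quotes roughly $270 million; the implied rate follows from
# these two rounded figures and is not an official exchange rate.
quoted_usd = 270e6
implied_rmb_per_usd = total_rmb / quoted_usd

print(f"Total: {total_rmb / 1e9:.1f} billion RMB")
print(f"Implied exchange rate: {implied_rmb_per_usd:.2f} RMB per USD")
```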

 

 

Will China win the exascale race?

China has made high-performance computing a focus since 2002 and has now turned its efforts to building an exascale system. The United States Department of Commerce banned the export of Intel, Nvidia and AMD CPUs and GPUs to Chinese supercomputing centers and research institutions, citing the use of such hardware in “nuclear explosive activities”. In response, as part of a broad-based focus on domestic development of software and hardware, China developed the Phytium FT-2000/64, unveiled in Fall ’16 and presumed to be on par with Intel’s Broadwell Xeon series. China is simultaneously developing multiple exascale prototypes based on ARM, x86 (Zen-based) and homegrown architectures, and plans a broad-based supercomputing future built on them. The Sugon prototype will use traditional technologies such as x86 processors and accelerators made by Chinese chip maker Hygon, a multi-level interconnect design, and immersion cooling that does away with the need for fans. The Tianhe prototype will use the new 16-nanometer MT-2000+ many-core processor from Matrix and a 3D butterfly network with a maximum of four hops across the whole system. The Sunway prototype will use SW26010 chips, a high-bandwidth, high-throughput network powered by a self-developed network chip, and a water-cooling system with an enhanced copper cold plate.

 

Hyperion Research estimated at ISC 2018 that China will beat the United States, Japan and Europe to exascale, fielding a system with exascale peak performance in 2020. China’s proposed systems are expected to source hardware and processors from Chinese vendors. Hyperion lists four target systems in the U.S. lineup: ANL’s A21, ORNL’s Frontier (OLCF5), LLNL’s El Capitan (ATS-4), and NERSC-10. Deliveries of these systems are expected to begin in 2021, with roughly one year between installations and early operation expected one year after delivery for each system.

 

Beijing takes the AI race seriously

The alluring promise of an AI-enabled future has led to greater funding elsewhere for chip startups that cater to AI computing. The Chinese government continues to set challenging targets and to support chip design companies and manufacturers. However, these companies lag significantly in the R&D funding that is vital for growth in this sector. Despite the rapid growth of its IC industry, China continues to rely on imported semiconductors. The Chinese company Hygon obtained AMD’s Zen architecture IP in 2016 through THATIC, a convoluted joint venture with AMD. It has now begun producing its own Dhyana processor, which many believe bears a remarkable resemblance to AMD’s EPYC processor. China’s recent foray into manufacturing is not limited to memory: every aspect of computing now has domestic programs and builders. So much so that, even if only symbolically, government procurement now lists a domestic chip for the first time.

 

Silicon manufacturing is a prerequisite for a component manufacturing ecosystem, which in turn enables local assembly of electronic products and removes the need to import semi-finished or fully built products. India’s efforts to set up a commercial-scale semiconductor fabrication plant (fab) have produced no outcomes, while China has raced ahead. Of course, there is no denying that fabs are eye-wateringly expensive and that cutting-edge production requires massive investments, not just in physical infrastructure but also in the necessary technology. India has balked at making such investments.

 

Meanwhile, SK Hynix will build a new plant in Wuxi, China in 2019 together with Wuxi Industrial Development Group, an investment firm run by the regional government. The plant will house a new 200 mm wafer analog foundry production line. To tackle rising computing demands, Chinese designers are also looking past the x86 platform to newer architectures. According to Hyperion, R&D investments in HPC in China are estimated at nearly $2 billion per year. To cater to varied application demands, many chip firms are developing purpose-built products on a range of architectures in lieu of current CPU- and GPU-based systems.

 

 

Can India catch up?

India’s National Supercomputing Mission (NSM), in the works since 2012 and finally unveiled in 2015, aims to create a grid of supercomputers connecting academic and research institutions across the country. The NSM envisages setting up nearly 50 supercomputers in three phases, with progressively higher levels of indigenous content. The program is estimated to cost Rs 4,500 crore, of which Rs 2,800 crore will be funded by the Ministry of Science and Technology and the rest, about Rs 1,700 crore, by the Ministry of Electronics & Information Technology (MeitY).
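As a quick check on the budget split described above, the following sketch adds the two contributions and works out each ministry's share. The dictionary keys and variable names are illustrative labels, and the amounts are the rounded figures quoted in the text.

```python
# NSM budget split in crore rupees, as quoted in the text.
budget_crore = {
    "Ministry of Science and Technology": 2800,
    "MeitY": 1700,
}

total_crore = sum(budget_crore.values())

# Share of the total carried by each funder.
shares = {name: amount / total_crore for name, amount in budget_crore.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} of Rs {total_crore} crore")
```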

 

Milind Kulkarni, who is overseeing the project, said six supercomputers will be built in the first phase, with three of these to be foreign built. Moving towards India’s goal of indigenous supercomputing, the remaining three will be manufactured abroad but assembled in India, with C-DAC handling the overall system design. India has signed a deal with French supercomputer manufacturer Atos to build the first phase of the NSM. Atos will supply its BullSequana supercomputers, which will be assembled in India for the first phase. The share of domestic components will then increase progressively, with the ultimate aim of an Indian supercomputer design in the third phase.

 

Indeed, in the second phase, major parts such as high-speed Internet switches, compute nodes and network systems will be manufactured in India, and a greater part of the system will be built in India in the third phase. IIT-Kharagpur will receive a 1.3 petaflop machine, while a 650 teraflop machine each has been planned for IISER Pune and IIT-BHU.

 

Fig 1. Market share of Top100 HPC Systems https://science.energy.gov/~/media/ascr/ascac/pdf/meetings/201609/Dongarra-ascac-sunway.pdf

 

The NSM’s objectives notwithstanding, as of today India’s fastest supercomputer is “PRATYUSH”, a Cray XC40 system imported at a cost of Rs 440 crore and located at the Indian Institute of Tropical Meteorology (IITM), Pune. Launched in June 2018, it is ranked 45 on the list of the fastest supercomputers in the world. It uses Intel Xeon Broadwell E5-2695 processors and Cray’s Aries NoC with the Dragonfly interconnect topology, and has a peak performance of 4,006 TFLOPS. The India Meteorological Department (IMD) is generating short-range (three to five days) and medium-range (four to ten days) weather forecasts using this supercomputer.

 

SAHASRAT, India’s first petaflop system, located at the Supercomputer Education and Research Centre (SERC) at the IISc, is also an import. It is a Cray XC40 using Intel Xeon Haswell E5-2680 processors and the proprietary Cray Aries interconnect with Dragonfly topology, along with accelerator nodes carrying Nvidia Tesla GPU cards and Intel Xeon Phi processors. It was India’s fastest supercomputer when unveiled in February 2015.

 

In a bid to reduce India’s overwhelming import dependency in the supercomputing sphere, the RISE group at IIT-M, which runs the SHAKTI project, is working on Para-SHAKTI (parallel SHAKTI). Para-SHAKTI will make microprocessors with over 32 SHAKTI cores for indigenous high-performance computers. Along with a family of processors, the group also aims to build high-speed interconnects for servers and supercomputers based on variants of the RapidIO and Gen-Z standards.

 

A domestic Indian processor, and a system based on it, will create an entire ecosystem of dedicated interconnects, co-processors, accelerators, compilers, operating systems, libraries, and binaries to run applications. It will also foster further research and development to produce next-generation systems that deliver faster and more efficient results. Possession of such capabilities will aid India in its geopolitical competition with China, since a number of smaller nations today seek such capabilities. CERN’s gift of old servers to Nepal’s Kathmandu University should make decision makers in New Delhi sit up and take note, for China will not be far behind with similar moves in India’s neighbourhood.


© Delhi Defence Review. Reproducing this content in full without permission is prohibited.