Data challenges for next-generation stargazers


Image credit: skatelescope.org


Since the time of Galileo, stargazers have been using telescopes to discover the hidden secrets of deep space. As scientific methods progressed decade after decade and century after century, the goals of observation also changed dramatically. The twentieth century saw an explosion of new technologies, from satellites to mobile computing. Telescopes grew enormously to see deeper and fainter objects hidden at the far end of the universe. The larger the diameter of a telescope, the deeper and fainter it can see; this was the driving principle behind building those giant machines. As telescope sizes grew, so did the volume of downloaded data, from megabytes to terabytes.


On the other hand, computing disciplines were evolving at the speed of light. The growing processing power of computers permitted scientists to perform data-intensive computations, which later culminated in compute-hungry simulations. In the last quarter of the twentieth century, multi-wavelength astronomy began to evolve, and astronomers started watching celestial events at other wavelengths. Enormous arrays of dishes were constructed to watch the heavens at radio wavelengths; collectively, such arrays offered the equivalent of many square kilometers of area for collecting radio photons. Such an array is called an interferometer.


Square Kilometer Array Radio Telescope

The Square Kilometer Array (SKA) is a massive radio interferometer currently being built in Australia and South Africa. Once operational, it will dwarf all the giant telescopes currently in use across the globe. SKA will image the entire sky 1,000,000 times faster than today’s most advanced telescopes. When completed, SKA phase 1 will have 130,000 antennae and 200 parabolic dishes to collect the valuable information emerging from enormous clusters of galaxies, gigantic black holes, the celestial lighthouses called pulsars, and the events that happened at the time of the Big Bang. SKA phase 2 is in the planning stage and will be 10 times larger than phase 1. It will be the biggest radio telescope in the world, with the highest sensitivity, angular resolution, and survey speed. The overall photon-collecting area of the telescope will be equivalent to one square kilometer, hence the name.


At the heart of the SKA’s data handling capabilities is the so-called science data processor (SDP), which must be able to handle exascale processing within stringent power limitations. The SDP receives raw data known as "visibilities" from the central signal processor (CSP), another processor in the pipeline, which in turn receives raw data directly from the antenna arrays.


This is a story about those SKA computing challenges and the novel solutions suggested by two Thoughtworkers. The solution was published as a position paper at the CARRV 2021 workshop at the prestigious International Symposium on Computer Architecture (ISCA). The paper was well received by the astronomical and computing communities. The authors, Mr. Harshal H. and Mr. Arunkumar M. V. of e4r™, are currently working on the solutions in collaboration with research and academic organizations within India.


The SKA telescope design implements the very long baseline interferometry (VLBI) technique, in which the antennae of the array are separated by very long distances. For SKA, the antenna arrays will stretch over 65 km in Australia and 150 km in South Africa. The signals from the antennae will be combined and analyzed at a raw data rate of petabits per second, which is far beyond the capabilities of today’s commercial off-the-shelf hardware and software.

Challenges of exascale computing

Let us understand the fundamental challenges of exascale processing. The SKA-SDP needs to process 19,000 petabytes of data per year in real time, and its high-performance computer (HPC) will need a processing capability of around 300 petaflops. Flops stands for floating-point operations per second and is a measure of processor speed; one petaflops is one thousand trillion, or one quadrillion, operations per second, and the SKA demands 300 times that. The first challenge is how to achieve this processing speed. The second challenge is the high energy cost of data movement in the system: today, moving one byte of data costs roughly 50 picojoules, where one picojoule is one-million-millionth of a joule. Even at such a tiny cost per byte, the SDP will need around 10 megawatts of power. As per Prof. Peter Braam of Oxford University, it is impossible to get that much power in Cape Town or in Perth, where the SDP facilities will be located.
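To make these orders of magnitude concrete, here is a rough back-of-envelope sketch in Python. The 19,000 PB/year, 300 petaflops, 50 pJ/byte and 10 MW figures come from the text above; everything derived from them is only illustrative.

```python
# Back-of-envelope numbers for the SKA science data processor (SDP).
# Inputs are the figures quoted in the article; derived values are illustrative only.

PB = 1e15                           # bytes in a petabyte
SECONDS_PER_YEAR = 365 * 24 * 3600

ingest_per_year = 19_000 * PB       # 19,000 petabytes of data per year
ingest_rate = ingest_per_year / SECONDS_PER_YEAR
print(f"Average ingest rate: {ingest_rate / 1e12:.2f} TB/s")

compute = 300e15                    # 300 petaflops of sustained processing
print(f"Required compute: {compute:.0e} floating-point operations per second")

energy_per_byte = 50e-12            # 50 picojoules to move one byte
power_budget = 10e6                 # roughly 10 MW available to the SDP

# How much data movement a 10 MW budget allows at 50 pJ/byte,
# ignoring the power consumed by the computation itself.
movable_bytes_per_second = power_budget / energy_per_byte
print(f"Data movement affordable at 10 MW: {movable_bytes_per_second / PB:.0f} PB/s")
```

Although the average ingest rate looks modest, the gridding, degridding and FFT stages make many passes over the data, so the total memory traffic, and hence the data-movement energy, is far larger than the ingest rate alone suggests.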


Now let us understand the software challenges. Thoughtworkers have identified three key algorithms running in the SDP. This trio is responsible for converting the visibilities into science-ready data products such as clean images. The table below outlines the algorithms and their challenges.

Algorithm | Function | Challenges
Gridding | Populates the global sky with visibilities | Memory intensive. Slow.
Degridding | Estimates visibilities based on input | Memory intensive. Slow.
2D FFT | Processes static images for spectral analysis | Low utilization of CPU. Slow.
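To give a feel for what gridding involves, here is a minimal, illustrative Python/NumPy sketch. The function name, kernel and grid size are invented for this example and are not taken from any SKA-SDP code; the point is to show why the operation is memory-bound: every visibility triggers scattered read-modify-write accesses across a large grid.

```python
import numpy as np

def grid_visibilities(u, v, vis, grid_size=1024, kernel_halfwidth=3):
    """Toy convolutional gridding: accumulate visibilities onto a uv-grid.

    u, v : uv-coordinates already scaled to grid cells (floats)
    vis  : complex visibility samples
    """
    grid = np.zeros((grid_size, grid_size), dtype=np.complex128)
    # A simple Gaussian gridding kernel (real gridders use prolate spheroidal kernels).
    offsets = np.arange(-kernel_halfwidth, kernel_halfwidth + 1)
    kernel_1d = np.exp(-0.5 * (offsets / 1.5) ** 2)
    kernel = np.outer(kernel_1d, kernel_1d)
    kernel /= kernel.sum()

    for uu, vv, sample in zip(u, v, vis):
        iu, iv = int(round(uu)), int(round(vv))
        # Scattered read-modify-write over a small patch of a very large grid:
        # this irregular access pattern is what makes gridding memory-bound.
        grid[iv - kernel_halfwidth: iv + kernel_halfwidth + 1,
             iu - kernel_halfwidth: iu + kernel_halfwidth + 1] += sample * kernel
    return grid

# Tiny usage example with random visibilities placed safely inside the grid.
rng = np.random.default_rng(0)
n = 10_000
u = rng.uniform(10, 1013, n)
v = rng.uniform(10, 1013, n)
vis = rng.normal(size=n) + 1j * rng.normal(size=n)
uv_grid = grid_visibilities(u, v, vis)
# A 2D FFT of the gridded visibilities then yields a (dirty) sky image.
dirty_image = np.fft.fftshift(np.fft.ifft2(uv_grid))
```

Degridding is essentially the reverse interpolation, predicting visibilities from a model grid, and it suffers from the same irregular memory access pattern.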

Thoughtworkers have also identified critical bottlenecks in today’s SDP design. They are briefly listed in the table below with the suggested remedies; a short roofline-style sketch after the table illustrates the arithmetic-intensity bottleneck.

Bottleneck | Remedy
Nature of the algorithms | Choose an algorithm appropriate for the architecture, and use a modular design within the algorithm
Fewer computations per byte of data accessed from memory (low arithmetic intensity); time delay and larger energy consumption for moving data | Bring compute closer to data (in-memory and near-memory strategies)
Irregular memory access; high inefficiency | Data reordering and larger caches; improve the data locality of the algorithms
“Limited” off-chip memory bandwidth | On-chip high-bandwidth memory
“Low” CPU performance per watt | Offload to domain-specific accelerators
Frequent data transfers to and from discrete accelerators | Integrated accelerators
“Small” memory size in discrete accelerators | Integrated accelerators with uniform memory access
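The low-arithmetic-intensity bottleneck can be made concrete with a simple roofline-style estimate. The sketch below is a generic illustration, not an SKA-SDP measurement: the peak-compute, memory-bandwidth and flops-per-byte numbers are placeholder assumptions chosen only to show how a memory-bound kernel leaves most of a CPU’s peak performance unused.

```python
# Roofline-style back-of-envelope: attainable performance is limited by
# min(peak compute, arithmetic intensity x memory bandwidth).
# All numbers below are placeholder assumptions for illustration.

peak_flops = 2e12          # assume a 2 Tflop/s node
mem_bandwidth = 200e9      # assume 200 GB/s off-chip memory bandwidth

def attainable(flops_per_byte):
    """Roofline model: performance attainable at a given arithmetic intensity."""
    return min(peak_flops, flops_per_byte * mem_bandwidth)

# Gridding-like kernels perform only a handful of operations per byte moved.
for ai in (0.25, 1.0, 4.0, 16.0):
    perf = attainable(ai)
    print(f"arithmetic intensity {ai:5.2f} flop/byte -> "
          f"{perf / 1e9:8.0f} Gflop/s ({perf / peak_flops:.0%} of peak)")
```

Raising arithmetic intensity through better data locality and larger caches, or bringing compute closer to the data with in-memory, near-memory and on-chip high-bandwidth memory, moves kernels up this roofline, which is exactly what the remedies in the table aim at.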

After analyzing the SKA computing architecture, the team proposed multiple solutions. The proposal, dubbed RISKA (the reduced instruction set computer implementation for the Square Kilometer Array), had three goals: exascale computing with high energy efficiency, domain-specific supercomputing, and high-performance data-intensive computing in a single environment.

RISKA suggestions 

  • Implement a memory-centric architecture where data processing happens in-memory or near-memory. This will reduce locality issues and help cut the idle time of the processor between operations; the overall result will be reduced energy consumption.


  • Implement an open-source system-on-chip design with SDP-specific accelerators. Close integration with the host is key. This approach can employ uniform memory access (UMA), a shared-memory architecture for multiprocessors in which the processors access a single memory through an interconnection network and each processor gets equal memory access time and speed. This will reduce the inefficiency caused by separate memories for the CPU and the accelerators.


  • Implement domain-specific hardware architectures (DSA) along with a domain-specific language (DSL). This approach suits the SDP algorithms especially well, as they exhibit a degree of data parallelism (a small data-parallel sketch follows this list). The solution also suggests hardware-software co-design to mitigate the high complexity of the SKA-SDP architecture caused by the existing discrete-component design. This will ease the algorithm challenges mentioned above and reduce energy consumption while increasing processing speed.


  • Implement a neural DSA. It will prove helpful especially in pulsar discovery observations, where heuristic and machine-learning capabilities, along with high speed, are of the utmost importance.


  • The proposal is based on RISC-V, an open-source instruction set specification that will help reduce costs. RISC-V delivers a new degree of freedom and extensibility in both software and hardware architectural design, and it is on par with modern CPUs in terms of performance, code density, and power. Its base integer instruction set has only 47 instructions, it uses a modular approach, and it makes application- and domain-specific designs simple to develop. The modular approach also makes heterogeneous multicore architectures possible. This approach will be much more energy-efficient, which is one of the core requirements of the SKA-SDP design.
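As a hint of the data parallelism a DSA or DSL could exploit (mentioned in the third suggestion above), the sketch below recasts gridding as a single vectorized scatter-add in NumPy. It uses nearest-neighbour gridding with no convolution kernel for simplicity, and the names and sizes are invented for the example; it illustrates the programming model only and is not a proposal for the RISKA DSL.

```python
import numpy as np

# Vectorized form of the toy gridder's inner loop: all visibilities are
# accumulated onto their nearest uv-cells in one data-parallel scatter-add.
# Regular, independent bulk operations like this are what a domain-specific
# accelerator (or a DSL compiling to one) can schedule efficiently.

def grid_nearest_vectorized(u, v, vis, grid_size=1024):
    grid = np.zeros((grid_size, grid_size), dtype=np.complex128)
    iu = np.rint(u).astype(int)
    iv = np.rint(v).astype(int)
    # np.add.at performs an unbuffered scatter-add, handling repeated cells correctly.
    np.add.at(grid, (iv, iu), vis)
    return grid

rng = np.random.default_rng(1)
n = 100_000
u = rng.uniform(0, 1023, n)
v = rng.uniform(0, 1023, n)
vis = rng.normal(size=n) + 1j * rng.normal(size=n)
uv_grid = grid_nearest_vectorized(u, v, vis)
```

Expressing the kernel as one bulk, data-parallel operation rather than a scalar loop is exactly the shape of computation that domain-specific hardware can exploit.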

Future

SKA computing is a typical example of future mega-scale science projects. More are in the pipeline, such as the Thirty Meter Telescope (TMT), the Large Synoptic Survey Telescope (LSST), the Extremely Large Telescope (ELT), and the like. The challenges of exascale data processing and low energy consumption will be ubiquitous across such projects. Memory-centric processing architectures built on system-on-chip designs may become standard practice for them. Another necessary part of the solution is domain-specific architectures and languages; such architectures will have dedicated accelerators for pre-defined functions.


The final part of the solution is open-source hardware and software. Open-source hardware, verification, and tooling are gaining momentum. Hardware engineering is learning from software about abstraction, object-oriented design, and libraries of functionality. Corporations and universities are collaborating on mutual challenges of interest. Business models are evolving to incorporate open-source development, and corporations are innovating at higher levels, leveraging open-source IP and accelerating time to market.


The team’s next challenge is the daunting task of prototyping the domain-specific system-on-chip design, fabricating it, and deploying it on time. Because such systems are designed specially for mega-science projects, there is very little experience of testing and deploying them. Modeling and simulation may give us some clues to the unknown intricacies of the new architectures. Future architectures will bring different challenges and will open new windows of opportunity for computing professionals. Collaboration between academia and industry can make it happen faster than previously imagined.


(I want to thank Mr. Harshal H. and Mr. Arunkumar M. V. from e4r™, Thoughtworks, for their meticulous review of the content. This article is based on the research paper written by them.)
