The EuroExa project proposes a High-Performance Computing (HPC) architecture that both scales to Exascale performance levels and delivers world-leading power efficiency. This is achieved through the use of low-power ARM processors accelerated by closely coupled FPGA programmable components. To demonstrate the efficacy of the design, the EuroExa project includes porting work across a rich set of applications. One such application is the new weather and climate model, LFRic (named in honour of Lewis Fry Richardson), which is being developed by the UK Met Office and its partners for operational deployment in the middle of the next decade.
Much of the run-time of the LFRic model consists of compute-intensive operations that are suitable for acceleration using FPGAs. Programming methods for such high-performance numerical workloads are still immature for FPGAs compared with traditional HPC architectures. This paper describes the porting of a matrix-vector kernel using the Xilinx Vivado toolset, including High-Level Synthesis (HLS), discusses the benefits of a range of optimizations, and reports the performance achieved on the Xilinx UltraScale+ SoC.
Performance results are reported for the FPGA code and compared with single-socket OpenMP performance on an Intel Broadwell CPU. We find the performance of the FPGA to be competitive when price and power consumption are taken into account.