Optimising Dynamic Binary Modification across 64-bit Arm Microarchitectures

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Authors:
  • Guillermo Callaghan
  • Cosmin Gorgovan
  • Mikel Luján

Abstract

Trace generation is a common optimisation in most Dynamic Binary Modification (DBM) systems because traces improve locality and code layout. We describe an optimised code layout for traces and show how to adapt the runtime algorithm to generate it. This reduces the overhead on all the Arm systems evaluated, which span five different microarchitectures.

A major source of overhead for DBMs comes from handling indirect branches. Indirect Branch Inlining (IBI) is a mechanism that attempts to avoid this overhead by using predictions about the target of the indirect branch. We analyse the behaviour of IBI, propose a new predictor, Trace Restricted IBI (TRIBI), and show how to optimise IBI given the new trace generation algorithm.
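The general idea behind indirect branch inlining can be illustrated with a small model. The sketch below is a generic single-prediction scheme written for illustration only; it is not the TRIBI predictor proposed in the paper, and the class name, function name, and addresses are all hypothetical. Each translated indirect branch site remembers one predicted target; a matching target takes a cheap inlined path, and any other target falls back to a slower lookup.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class IndirectBranchSite:
    """Hypothetical model of one translated indirect branch with one inlined prediction."""
    predicted_target: Optional[int] = None    # application address guessed at this site
    predicted_fragment: Optional[int] = None  # code-cache address translated for that guess
    hits: int = 0
    misses: int = 0


def resolve(site: IndirectBranchSite, target: int, code_cache: Dict[int, int]) -> int:
    """Return the code-cache address for `target`, preferring the inlined prediction."""
    # Fast path: models the inlined compare emitted at the branch site.
    if site.predicted_target is not None and site.predicted_target == target:
        site.hits += 1
        return site.predicted_fragment
    # Slow path: models the generic lookup that IBI tries to avoid.
    site.misses += 1
    fragment = code_cache[target]
    # Assumption: the first observed target becomes the prediction and is never replaced.
    if site.predicted_target is None:
        site.predicted_target = target
        site.predicted_fragment = fragment
    return fragment
```

In this model, a branch that almost always goes to the same target pays the slow lookup once and then stays on the fast path; a branch with many targets mostly misses, which is why the quality of the prediction matters.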

Our evaluation shows a geometric mean overhead for SPEC CPU2006 of 9% on a Cortex-A53 (in-order core) and, for out-of-order cores, 11% on an X-Gene-2, 10% on a Cortex-A57, 7% on a Cortex-A72 and 8% on a Cortex-A73, when compared to native execution. This is a reduction of the overhead of between 30% and 50% compared to the publicly available DBM system MAMBO, and an even larger reduction compared to DynamoRIO. Using PARSEC 3.0, we evaluate the scalability across threads on the X-Gene-2 system (the server machine with the highest number of cores) and show a geomean overhead of 6-8%.
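For readers unfamiliar with the metric, a geometric mean overhead is commonly computed from per-benchmark runtimes normalised to native execution. The sketch below shows that common convention; the paper's exact procedure is not stated in this abstract, so treat the formula as an assumption, and note that the function name is hypothetical.

```python
import math
from typing import Sequence


def geomean_overhead(normalised_times: Sequence[float]) -> float:
    """Geometric mean of (DBM runtime / native runtime) ratios, as an overhead.

    A return value of 0.09 corresponds to a 9% overhead.
    """
    if not normalised_times:
        raise ValueError("need at least one benchmark ratio")
    if any(t <= 0 for t in normalised_times):
        raise ValueError("runtime ratios must be positive")
    # Average in log space, then convert back; this equals the n-th root of the product.
    log_sum = sum(math.log(t) for t in normalised_times)
    return math.exp(log_sum / len(normalised_times)) - 1.0
```

The geometric mean is the usual choice for normalised ratios because the result does not depend on which system is used as the baseline.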

Bibliographical metadata

Original language: English
Title of host publication: Proceedings of the 16th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE ’20)
Publisher: Association for Computing Machinery
Number of pages: 13
Publication status: Published - 17 Mar 2020
