Low Overhead Dynamic Binary Translation on ARM

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Authors:
  • Amanieu d'Antras
  • Cosmin Gorgovan
  • Jim Garside
  • Mikel Luján


The ARMv8 architecture introduced AArch64, a 64-bit execution mode with a new instruction set, while retaining binary compatibility with previous versions of the ARM architecture through AArch32, a 32-bit execution mode. Most hardware implementations of ARMv8 processors support both AArch32 and AArch64, which comes at a cost in hardware complexity.

We present MAMBO-X64, a dynamic binary translator for Linux which executes 32-bit ARM binaries using only the AArch64 instruction set. We have evaluated the performance of MAMBO-X64 on three existing ARMv8 processors which support both the AArch32 and AArch64 instruction sets. Performance was measured by comparing the running time of 32-bit benchmarks under MAMBO-X64 with that of the same benchmarks running natively. On SPEC CPU2006, we achieve a geometric mean overhead of less than 7.5% on in-order Cortex-A53 processors and a performance improvement of 1% on out-of-order X-Gene 1 processors.
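As a rough illustration of how a geometric mean overhead of this kind can be computed from paired running times, the following Python sketch may help. The function name and the timing values are hypothetical and are not taken from the paper; the actual evaluation methodology may differ in its details.

```python
import math

def geomean_overhead(native_times, translated_times):
    """Return the geometric mean of per-benchmark slowdown ratios, minus 1.

    A positive result is an overhead; a negative result is a speed-up.
    """
    ratios = [t / n for n, t in zip(native_times, translated_times)]
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return math.exp(log_mean) - 1.0

# Made-up running times in seconds, for illustration only.
native = [100.0, 200.0, 50.0]
translated = [110.0, 190.0, 55.0]
overhead = geomean_overhead(native, translated)
print(f"geometric mean overhead: {overhead:.1%}")
```

The geometric mean is the usual choice for averaging ratios, since a benchmark that runs twice as fast and one that runs twice as slow then cancel out exactly.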

MAMBO-X64 achieves such low overhead through novel optimizations that map AArch32 floating-point registers to AArch64 registers dynamically, handle overflowing address calculations efficiently, generate traces that harness hardware return address prediction, and handle operating system signals accurately.
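To make the address overflow issue concrete, here is a small Python sketch of the underlying problem only. It does not show how MAMBO-X64 handles it, which the abstract does not describe. The function names are hypothetical. A 32-bit address calculation wraps around at 2^32, whereas the same addition carried out in a 64-bit register can yield a value above the 32-bit range.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def address_32bit(base, offset):
    """Base plus offset with 32-bit wrap-around, as a 32-bit program expects."""
    return (base + offset) & MASK32

def address_64bit_naive(base, offset):
    """The same addition in a 64-bit register, with no truncation applied."""
    return (base + offset) & MASK64

base, offset = 0xFFFFFFF0, 0x20
wrapped = address_32bit(base, offset)      # wraps around to a low address
naive = address_64bit_naive(base, offset)  # lands above the 32-bit range
print(hex(wrapped), hex(naive))
```

A translator therefore has to make sure that translated memory accesses see the wrapped value, ideally without paying for an explicit truncation on every access.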

Bibliographical metadata

Original language: English
Title of host publication: Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017
Publication status: Published - 18 Jun 2017