Exploration of task-based scheduling for convolutional neural networks accelerators under memory constraints

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Development of application-specific accelerators for deep convolutional neural networks (ConvNets) has mainly focussed on accelerating the computationally intensive layers, that is, the convolutional layers, to improve performance and energy efficiency. Traditional approaches in this space have relied on handcrafted dataflow implementations to leverage the fine-grained parallelism and data-locality properties within these layers. However, ConvNet layers also have untapped potential from cross-layer data locality.
In our work, we explore a novel approach in the context of deep neural network accelerators by modelling the computation as a task-dependency directed acyclic graph and proposing a memory-aware heuristic based on Heterogeneous Earliest Finish Time (HEFT) for task-graph scheduling on shared-memory systems.
Our results show the benefits of task graphs in terms of better memory use (23.4% less) over conventional layer-by-layer processing in a simulated environment with the first three layers of LeNet-5. Certain task graphs trade off makespan (10% increase) for memory use (20% decrease). Finally, our exploration of graphs with different slicing configurations for the pooling layer, using memory-aware HEFT versus the original HEFT, reveals that regularly shaped tiles across layers offer better makespan and memory use than tiles with large dimensions along one axis.
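
To make the scheduling idea in the abstract concrete, the following is a minimal sketch of a simplified HEFT-style list scheduler on a toy task graph. It is not the memory-aware heuristic proposed in the paper, whose details the abstract does not give. The graph, the task names (conv0, pool0, ...), the costs, the buffer sizes and the buffer-lifetime rule in peak_memory are all hypothetical. Communication cost is assumed to be zero (shared memory), and the insertion-based slot search of the original HEFT is omitted.

```python
from functools import lru_cache

# Hypothetical toy graph: two convolution tiles, each feeding one pooling tile.
# succ maps a task to its successors; cost[t][p] is the run time of t on processor p.
succ = {"conv0": ["pool0"], "conv1": ["pool1"], "pool0": [], "pool1": []}
cost = {"conv0": [4, 4], "conv1": [4, 4], "pool0": [1, 1], "pool1": [1, 1]}
out_bytes = {"conv0": 8, "conv1": 8, "pool0": 2, "pool1": 2}  # assumed buffer sizes


def upward_rank(succ, cost):
    """Upward rank: mean cost of a task plus the largest rank among its successors."""
    @lru_cache(maxsize=None)
    def rank(t):
        mean = sum(cost[t]) / len(cost[t])
        return mean + max((rank(s) for s in succ[t]), default=0.0)
    return {t: rank(t) for t in succ}


def heft(succ, cost, n_procs):
    """Return {task: (processor, start, finish)} using earliest-finish-time placement."""
    pred = {t: [] for t in succ}
    for t, successors in succ.items():
        for s in successors:
            pred[s].append(t)
    ranks = upward_rank(succ, cost)
    # With positive costs, decreasing upward rank is a valid topological order.
    order = sorted(succ, key=lambda t: -ranks[t])
    proc_free = [0.0] * n_procs
    sched = {}
    for t in order:
        # A task is ready once all of its predecessors have finished.
        ready = max((sched[p][2] for p in pred[t]), default=0.0)
        best = min(range(n_procs),
                   key=lambda p: max(ready, proc_free[p]) + cost[t][p])
        start = max(ready, proc_free[best])
        finish = start + cost[t][best]
        sched[t] = (best, start, finish)
        proc_free[best] = finish
    return sched


def peak_memory(sched, succ, out_bytes):
    """Peak live bytes, assuming an output buffer lives from its producer's start
    until its last consumer finishes (sink outputs live until the makespan)."""
    makespan = max(finish for _, _, finish in sched.values())
    events = []
    for t, (_, start, _) in sched.items():
        freed = max((sched[s][2] for s in succ[t]), default=makespan)
        events += [(start, out_bytes[t]), (freed, -out_bytes[t])]
    live = peak = 0
    # Sorting by (time, delta) applies frees before allocations at equal times.
    for _, delta in sorted(events):
        live += delta
        peak = max(peak, live)
    return peak


schedule = heft(succ, cost, n_procs=2)
makespan = max(finish for _, _, finish in schedule.values())
print(schedule, makespan, peak_memory(schedule, succ, out_bytes))
```

A memory-aware variant could, for example, take the projected peak from peak_memory into account when choosing a processor or ordering ready tasks; that is only one possible design and is not taken from the paper.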

Bibliographical metadata

Original language: English
Title of host publication: 5th Workshop on design of Low Power EMbedded Systems at Computing Frontiers 2019
Pages: 366-372
DOIs
Publication status: Published - 2019
Event: The 16th ACM International Conference - Alghero, Italy
Event duration: 30 Apr 2019 to 2 May 2019

Conference

Conference: The 16th ACM International Conference
Period: 30/04/19 to 2/05/19