Writing efficient programs for heterogeneous platforms is challenging: programmers must juggle multiple programming models, partition work across CPUs and accelerators that differ in compute capability and in the amount of parallelism they require, and manage memory in multiple distinct address spaces. Programming models that only require expressing parallelism and data dependences can therefore not only unburden the programmer from these technical decisions, but also increase code and performance portability. Past research has shown that data-flow task parallel programming models are a good fit both for increasing programmer productivity and for unleashing the parallel processing power of massively parallel heterogeneous architectures. In particular, the dependence information readily available in modern data-flow task parallel programming models can be exploited to make better task and data placement decisions, achieving higher performance and portability. This thesis focuses on the efficient scheduling of data-flow task parallel programs onto a wide range of heterogeneous architectures, from multi-core CPUs combined with discrete GPUs to multi-core CPUs paired with FPGAs in systems-on-chip. The proposed strategies balance the workload across heterogeneous resources while simultaneously leveraging the task dependence information available in OpenStream, a platform-neutral and heterogeneity-agnostic data-flow programming model, to optimize the scheduling of tasks and data transfers.