This paper presents a compact, efficient, yet powerful binary framework based on image gradients for robust facial representation, termed Binary Gradient Patterns (BGP). To discover underlying local structures in the gradient domain, image gradients are computed from multiple directions and encoded into a set of binary strings. Certain types of these binary strings capture meaningful local structures and textures: they detect oriented micro-edges and retain strong local orientation, which makes them highly discriminative. Face representations built from histograms of these structural BGPs are highly robust to various facial image variations, in particular illumination. The binary strategy, realized through local correlations, substantially reduces computational complexity and processes a typical image in only 0.0032 s in Matlab. Furthermore, the discriminative power of BGP is enhanced by applying it over a set of orientations of the image-gradient magnitudes. Extensive experimental results on various benchmarks demonstrate that BGP-based representations significantly improve over existing local descriptors and state-of-the-art methods in terms of discrimination, robustness, and complexity, and in many cases the improvements are substantial. Combined with deep networks, the proposed descriptors can further improve the performance of those networks on real-world datasets.
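To make the encoding idea concrete, the following is a minimal sketch of a gradient-sign binary code and its histogram. It is an illustration under stated assumptions rather than the exact BGP definition: it assumes a 3x3 neighbourhood, four directions, and one bit per direction set by the sign of the difference between opposite neighbours. The function names `binary_gradient_codes` and `code_histogram` are hypothetical, and the selection of structural patterns is not shown.

```python
import numpy as np


def binary_gradient_codes(image):
    """Encode each interior pixel as a 4-bit code of gradient signs.

    Assumption (not taken from the text): gradients are differences
    between opposite neighbours in a 3x3 window, along four directions.
    """
    img = np.asarray(image, dtype=np.float64)
    if img.ndim != 2 or min(img.shape) < 3:
        raise ValueError("expected a 2-D image of at least 3x3 pixels")
    h, w = img.shape
    # (dy, dx) offsets; the opposite neighbour sits at (-dy, -dx).
    directions = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(directions):
        forward = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        backward = img[1 - dy:h - 1 - dy, 1 - dx:w - 1 - dx]
        # Bit is 1 when the gradient along this direction is non-negative.
        codes |= (forward >= backward).astype(np.uint8) << bit
    return codes


def code_histogram(codes, n_bits=4):
    """Normalised histogram of codes, usable as a simple block descriptor."""
    hist = np.bincount(codes.ravel(), minlength=1 << n_bits)
    hist = hist.astype(np.float64)
    return hist / hist.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = rng.integers(0, 256, size=(64, 64))  # stand-in for a face image
    descriptor = code_histogram(binary_gradient_codes(face))
    print(descriptor.shape)
```

Because each bit depends only on the sign of a local difference, the codes in this sketch are unchanged by any positive scaling or constant offset of the pixel intensities, which gives some intuition for the robustness to illumination described above.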