Deictic communication embodies referential relationships about the external world through the multi-modal integration of social and non-social cues such as pointing gestures, demonstratives and perceived objects. The multi-modal processing of these cues requires the coordination of attention in time and space to extract and understand the indexical meanings conveyed by interlocutors. This thesis aims to model the cognitive processes of deictic communication for humanoid robots, improving their interaction abilities by following a developmental robotics approach. It tackles the computational problems of multi-modal cognitive processing by integrating artificial neural networks with dynamic neural fields to capture the temporal dynamics and spatial representations of deictic communication. The cognitive model is inspired by how infants develop a comprehension of deixis and how this development is linked to two attentional networks in the infant brain, i.e. the orienting and alerting networks. The model has been evaluated by replicating developmental psychology experiments with the iCub robot in place of infants, setting the age parameters of the cognitive model accordingly. As a result, this thesis provides a novel cognitive robotics model for the iCub humanoid robot inspired by the developmental processes of joint attention in early infancy. The replication experiments demonstrate the model's ability to generate orienting behaviours modulated by pointing gestures, deictic cues and object perception, similarly to 5-, 7-, 10- and 12-month-old infants, and thereby provide a proof of concept for the cognitive abilities of this novel hybrid computational approach. Finally, this thesis suggests specific functional links between deictic cues and the attentional networks that may exist in the infant brain, although these hypotheses remain to be validated by child experiments in future work.
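To illustrate the kind of spatial representation the hybrid approach builds on, the sketch below simulates a minimal one-dimensional dynamic neural field of the Amari type, in which a localized input (e.g. a pointed-at location) drives a peak of activation. This is not the thesis model itself: the field size, kernel parameters, time constant and resting level are all assumptions chosen purely for illustration.

```python
import numpy as np

def gaussian_kernel(x, sigma, amplitude):
    """Gaussian bump used both for inputs and for lateral interactions."""
    return amplitude * np.exp(-x**2 / (2 * sigma**2))

def simulate_dnf(stimulus, steps=200, dt=1.0, tau=20.0, h=-5.0):
    """Euler integration of the field equation
    tau * du/dt = -u + h + S(x) + (w * f(u))(x),
    with a sigmoidal output f and a Mexican-hat kernel w.
    All parameter values are illustrative assumptions."""
    n = stimulus.shape[0]
    positions = np.arange(n) - n // 2
    # Lateral interaction: local excitation minus broader inhibition.
    w = gaussian_kernel(positions, sigma=4.0, amplitude=15.0) \
        - gaussian_kernel(positions, sigma=12.0, amplitude=7.0)
    u = np.full(n, h)  # field starts at its resting level
    for _ in range(steps):
        f_u = 1.0 / (1.0 + np.exp(-u))               # sigmoidal output
        lateral = np.convolve(f_u, w, mode="same") / n
        u += (dt / tau) * (-u + h + stimulus + lateral)
    return u

# A localized stimulus centred on position 0 (index 50 of 101).
x = np.arange(101) - 50
stimulus = gaussian_kernel(x, sigma=5.0, amplitude=8.0)
u = simulate_dnf(stimulus)
print(int(np.argmax(u)))  # the activation peak forms at the stimulus centre
```

In the thesis architecture, fields of this kind would supply the continuous spatio-temporal attention dynamics, while artificial neural networks handle the perceptual processing of cues such as gestures and objects; how the two are actually coupled is specified in the body of the thesis, not in this sketch.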