Video-games are an important AI research area, and are often used as a convenient proxy for testing general AI techniques that are applicable to wider domains. An increasingly popular technique for video-game AI is deep reinforcement learning. When it is applied to video-games, information about the current game-state is typically provided as a series of images, mirroring how a human player might play by watching a computer or television screen. While this representation ensures parity between the information given to human and AI players, for many applications an image-based representation may not be the most practical way of conveying information to the AI agent. In this thesis we explore an alternative representation, in which the game-state is described in terms of objects. Since structuring object information in a way that is compatible with conventional neural network architectures is a challenging problem, we instead look at alternative neural network architectures suited to alternative input structures. One class of networks we study is architectures for sets. We provide a plausible explanation for the core mechanism behind these set networks, and investigate the properties implied by this mechanism via an empirical study. By using a hand-crafted mapping from objects to feature vectors, we demonstrate how an object-based AI can be constructed using set networks, and propose a basic object-based agent based on both the DQN and PPO algorithms. We compare this object-based agent to a standard image-based agent, as well as a similar object-based agent using relational transformations, on a variety of games from the GVG-AI Competition. We also propose a general framework for end-to-end learning on JSON data. Two instantiations of this framework are compared to a number of baseline approaches on a variety of UCI datasets recast as JSON.
We use this framework to construct a JSON-based video-game agent, thus removing the reliance on hand-crafted object-level feature mappings, and compare its performance to our object-based agent.
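
As a rough illustration of the object-based idea, the sketch below maps a handful of game objects to feature vectors by hand and passes them through a small network whose output does not depend on the order of the objects. It is a minimal sketch under stated assumptions and not the architecture studied in the thesis: the object schema (`type`, `x`, `y`), the list `OBJECT_TYPES`, the layer sizes, the random weights, and the choice of sum pooling (in the style of Deep Sets) are all hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical object schema: each game object has a type and a grid position.
OBJECT_TYPES = ["avatar", "enemy", "wall"]  # assumed, not taken from the thesis


def object_to_features(obj):
    """Hand-crafted mapping: one-hot object type followed by the (x, y) position."""
    one_hot = [1.0 if obj["type"] == t else 0.0 for t in OBJECT_TYPES]
    return one_hot + [float(obj["x"]), float(obj["y"])]


def make_linear(n_in, n_out, rng):
    """Return randomly initialised (weights, biases) for a dense layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def linear_tanh(layer, x):
    """Apply a dense layer followed by a tanh non-linearity."""
    weights, biases = layer
    return [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def set_network(objects, phi, rho):
    """Embed each object with phi, sum-pool the embeddings, then apply rho."""
    embedded = [linear_tanh(phi, object_to_features(o)) for o in objects]
    # Summation ignores object order (up to floating-point rounding).
    pooled = [sum(column) for column in zip(*embedded)]
    return linear_tanh(rho, pooled)


rng = random.Random(0)
phi = make_linear(len(OBJECT_TYPES) + 2, 8, rng)  # per-object embedding
rho = make_linear(8, 4, rng)                      # e.g. one output per action

state = [
    {"type": "avatar", "x": 1, "y": 2},
    {"type": "enemy", "x": 4, "y": 0},
    {"type": "wall", "x": 0, "y": 0},
]
out_a = set_network(state, phi, rho)
out_b = set_network(list(reversed(state)), phi, rho)
```

Here `out_a` and `out_b` agree up to rounding even though the objects are presented in opposite orders, and the same network accepts any non-zero number of objects. In a reinforcement learning agent, the output vector could be read as action values or policy logits, with the weights trained rather than drawn at random.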