An FPGA implementation of a fine grain generalpurpose SIMD processor array is presented. The processor architecture has a compact processing element which is encapsulated into two configurable logic blocks (CLBs) and is then replicated to form an array. A 32 x 32 processing element array is implemented on a low-cost Xilinx XC5VLX50 FPGA using four-neighbour connectivity with the possibility to scale up using a larger FPGA. The processor array operates at a frequency of 150 MHz and executes a peak of 153.6 GOPS (bit-serial operations). Binary and 8-bit greyscale image processing is performed and demonstrated. © 2012 IEEE.