BACKGROUND: Police attend numerous domestic violence events each year, recording details of these events as both structured (coded) data and unstructured free-text narratives. Abuse types (including physical, psychological, emotional, and financial) perpetrated by persons of interest (POIs), along with any injuries sustained by victims, are typically recorded in long descriptive narratives.
OBJECTIVE: We aimed to determine if an automated text mining method could identify abuse types and any injuries sustained by domestic violence victims in narratives contained in a large police dataset from the New South Wales Police Force.
METHODS: We used a training set of 200 recorded domestic violence events to design a knowledge-driven approach based on syntactical patterns in the text and then applied this approach to a large set of police reports.
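As an illustration of what a knowledge-driven, pattern-based extractor can look like, the following is a minimal sketch only. The term lists, the trigger verbs, the three-word gap, and the function names are all hypothetical and are not the rules or dictionaries developed in this study.

```python
import re

# Hypothetical dictionaries; the study's actual term lists are not reproduced here.
ABUSE_VERBS = {
    "punching": r"punch(?:ed|es|ing)?",
    "kicking": r"kick(?:ed|s|ing)?",
}
INJURY_NOUNS = {
    "bruising": r"bruis(?:e|es|ing)",
    "cut/abrasion": r"cuts?|abrasions?",
}

# Allow up to three filler words between the anchors of a pattern (an assumption).
GAP = r"(?:\W+\w+){0,3}?\W+"


def abuse_pattern(verb_regex):
    """Pattern of the form '<POI> ... <abuse verb> ... <victim>'."""
    return re.compile(
        rf"\b(?:poi|accused){GAP}(?:{verb_regex}){GAP}(?:the\s+)?victim\b",
        re.IGNORECASE,
    )


def injury_pattern(noun_regex):
    """Pattern of the form '<victim> ... <sustained/suffered/received> ... <injury>'."""
    return re.compile(
        rf"\bvictim{GAP}(?:sustained|suffered|received){GAP}(?:{noun_regex})\b",
        re.IGNORECASE,
    )


def extract(narrative):
    """Return the abuse types and injuries whose patterns match the narrative."""
    abuse = sorted(
        label for label, rx in ABUSE_VERBS.items()
        if abuse_pattern(rx).search(narrative)
    )
    injuries = sorted(
        label for label, rx in INJURY_NOUNS.items()
        if injury_pattern(rx).search(narrative)
    )
    return {"abuse_types": abuse, "injuries": injuries}
```

Because the patterns encode word order, a sentence such as "The victim punched the wall" would not be labelled as punching by the POI, whereas a simple keyword search would flag it.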
RESULTS: Testing our approach on an evaluation set of 100 domestic violence events provided precision values of 90.2% and 85.0% for abuse type and victim injuries, respectively. In a set of 492,393 domestic violence reports, 351,178 events (71.32%) mentioned one or more abuse types, and more than one-third (177,117 events; 35.97%) mentioned victim injuries. "Emotional/verbal abuse" (117,488 events; 33.46%) was the most common abuse type, followed by "punching" (86,322 events; 24.58%) and "property damage" (78,203 events; 22.27%). "Bruising" was the most common form of injury sustained (51,455 events; 29.03%), with "cut/abrasion" (51,284 events; 28.93%) and "red marks/signs" (42,038 events; 23.71%) ranking second and third, respectively.
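For readers unfamiliar with the metric, precision is the share of extracted mentions that are correct: true positives divided by the sum of true positives and false positives. The sketch below shows one way to compute it against manually annotated events. The function name and the example labels are illustrative assumptions and are not data from the evaluation set.

```python
def precision(extracted, gold):
    """Precision of extracted labels against gold annotations, event by event.

    `extracted` and `gold` are equal-length lists of label sets, one per event.
    """
    true_pos = 0
    false_pos = 0
    for found, truth in zip(extracted, gold):
        true_pos += len(found & truth)   # extracted and annotated
        false_pos += len(found - truth)  # extracted but not annotated
    if true_pos + false_pos == 0:
        raise ValueError("precision is undefined when nothing was extracted")
    return true_pos / (true_pos + false_pos)


# Illustrative labels only (not taken from the study).
example_extracted = [{"punching"}, {"punching", "property damage"}, set()]
example_gold = [{"punching"}, {"punching"}, {"kicking"}]
example_precision = precision(example_extracted, example_gold)
```

Note that the missed "kicking" label in the third event does not lower precision; it would affect recall, which is not reported here.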
CONCLUSIONS: The results suggest that text mining can automatically extract information from police-recorded domestic violence events that can support further public health research into domestic violence, such as examining how abuse types relate to victim injuries and how gender and abuse types relate to risk escalation for victims of domestic violence. Potential also exists for this extracted information to be linked to information on mental health status.