Assessing the reliability of automatic sentiment analysis tools on rating the sentiment of reviews of NHS dental practices in England

Citation formats

BibTeX

@article{f9df957063ff4dbaabbca9f3b3894483,
title = "Assessing the reliability of automatic sentiment analysis tools on rating the sentiment of reviews of NHS dental practices in England",
abstract = "Background: Online reviews may act as a rich source of data to assess the quality of dental practices. Assessing the content and sentiment of reviews on a large scale is time-consuming and expensive. Automating the assignment of sentiment to big-data samples of reviews may allow reviews to be used as Patient Reported Experience Measures for primary care dentistry. Aim: To assess the reliability of three online sentiment analysis tools (Amazon Web Services, Google and MonkeyLearn) at rating the sentiment of reviews of dental practices working on National Health Service contracts in the United Kingdom. Methods: A Python 3 script was used to mine 15,800 reviews from 4,803 unique dental practices on the NHS.uk website between April 2018 and March 2019. A random sample of 270 reviews was rated by the three sentiment analysis tools. These reviews were also rated by three blinded, independent human reviewers and a pooled sentiment score was assigned. Kappa statistics were used to assess the level of agreement. Disagreements between the automated and human ratings were qualitatively assessed. Results: There was good agreement between the sentiment assigned to reviews by the human reviewers and AWS (k=0.660). Google (k=0.706) and MonkeyLearn (k=0.728) showed slightly better agreement, at the expense of usability on a massive dataset. There were 33 disagreements in rating between AWS and the human reviewers, of which n=16 were due to syntax errors, n=10 were due to misjudgment of the strength of conflicting emotions and n=7 were due to a lack of overtly emotive language in the text. Conclusions: There is good agreement between the sentiment of an online review assigned by a group of humans and by cloud-based sentiment analysis tools. This may allow the use of automated sentiment analysis for quality assessment of dental service provision in the NHS.",
author = "Matthew Byrne and Lucy O'Malley and Anne-Marie Glenny and Iain Pretty and Martin Tickle",
year = "2021",
month = dec,
day = "15",
doi = "10.1371/journal.pone.0259797",
language = "English",
volume = "16",
journal = "PLoS ONE",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "12",
pages = "e0259797",

}

RIS

TY - JOUR

T1 - Assessing the reliability of automatic sentiment analysis tools on rating the sentiment of reviews of NHS dental practices in England

AU - Byrne, Matthew

AU - O'Malley, Lucy

AU - Glenny, Anne-Marie

AU - Pretty, Iain

AU - Tickle, Martin

PY - 2021/12/15

Y1 - 2021/12/15

N2 - Background: Online reviews may act as a rich source of data to assess the quality of dental practices. Assessing the content and sentiment of reviews on a large scale is time-consuming and expensive. Automating the assignment of sentiment to big-data samples of reviews may allow reviews to be used as Patient Reported Experience Measures for primary care dentistry. Aim: To assess the reliability of three online sentiment analysis tools (Amazon Web Services, Google and MonkeyLearn) at rating the sentiment of reviews of dental practices working on National Health Service contracts in the United Kingdom. Methods: A Python 3 script was used to mine 15,800 reviews from 4,803 unique dental practices on the NHS.uk website between April 2018 and March 2019. A random sample of 270 reviews was rated by the three sentiment analysis tools. These reviews were also rated by three blinded, independent human reviewers and a pooled sentiment score was assigned. Kappa statistics were used to assess the level of agreement. Disagreements between the automated and human ratings were qualitatively assessed. Results: There was good agreement between the sentiment assigned to reviews by the human reviewers and AWS (k=0.660). Google (k=0.706) and MonkeyLearn (k=0.728) showed slightly better agreement, at the expense of usability on a massive dataset. There were 33 disagreements in rating between AWS and the human reviewers, of which n=16 were due to syntax errors, n=10 were due to misjudgment of the strength of conflicting emotions and n=7 were due to a lack of overtly emotive language in the text. Conclusions: There is good agreement between the sentiment of an online review assigned by a group of humans and by cloud-based sentiment analysis tools. This may allow the use of automated sentiment analysis for quality assessment of dental service provision in the NHS.

AB - Background: Online reviews may act as a rich source of data to assess the quality of dental practices. Assessing the content and sentiment of reviews on a large scale is time-consuming and expensive. Automating the assignment of sentiment to big-data samples of reviews may allow reviews to be used as Patient Reported Experience Measures for primary care dentistry. Aim: To assess the reliability of three online sentiment analysis tools (Amazon Web Services, Google and MonkeyLearn) at rating the sentiment of reviews of dental practices working on National Health Service contracts in the United Kingdom. Methods: A Python 3 script was used to mine 15,800 reviews from 4,803 unique dental practices on the NHS.uk website between April 2018 and March 2019. A random sample of 270 reviews was rated by the three sentiment analysis tools. These reviews were also rated by three blinded, independent human reviewers and a pooled sentiment score was assigned. Kappa statistics were used to assess the level of agreement. Disagreements between the automated and human ratings were qualitatively assessed. Results: There was good agreement between the sentiment assigned to reviews by the human reviewers and AWS (k=0.660). Google (k=0.706) and MonkeyLearn (k=0.728) showed slightly better agreement, at the expense of usability on a massive dataset. There were 33 disagreements in rating between AWS and the human reviewers, of which n=16 were due to syntax errors, n=10 were due to misjudgment of the strength of conflicting emotions and n=7 were due to a lack of overtly emotive language in the text. Conclusions: There is good agreement between the sentiment of an online review assigned by a group of humans and by cloud-based sentiment analysis tools. This may allow the use of automated sentiment analysis for quality assessment of dental service provision in the NHS.

U2 - 10.1371/journal.pone.0259797

DO - 10.1371/journal.pone.0259797

M3 - Article

VL - 16

JO - PLoS ONE

JF - PLoS ONE

SN - 1932-6203

IS - 12

M1 - e0259797

ER -
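The abstract reports agreement between the pooled human ratings and each tool as kappa statistics (e.g. k=0.660 for AWS). As a minimal illustration of that metric, here is a pure-Python sketch of Cohen's kappa for two raters over sentiment labels; the example labels and ratings below are hypothetical, not the study's data.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must supply equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labelled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label]
              for label in set(counts_a) | set(counts_b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical example: pooled human ratings vs. one tool's ratings.
human = ["positive", "positive", "negative", "neutral", "positive"]
tool  = ["positive", "negative", "negative", "neutral", "positive"]
print(round(cohen_kappa(human, tool), 4))  # 0.6875
```

Kappa of 1.0 means perfect agreement and 0.0 means agreement no better than chance, which is why values around 0.66–0.73, as reported in the abstract, are conventionally read as "good" agreement.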