The Mechanical Psychologist: How Computational Techniques Can Aid Social Researchers in the Analysis of High-Stakes Conversation



Cook, Darren ORCID: 0000-0002-6810-0281
(2022) The Mechanical Psychologist: How Computational Techniques Can Aid Social Researchers in the Analysis of High-Stakes Conversation. PhD thesis, University of Liverpool.

201096188_Oct2022.pdf - Author Accepted Manuscript

Abstract

Qualitative coding is an essential observational tool for describing behaviour in the social sciences. However, it traditionally relies on manual, time-consuming, and error-prone methods performed by humans. To overcome these issues, cross-disciplinary researchers are increasingly exploring computational methods such as Natural Language Processing (NLP) and Machine Learning (ML) to annotate behaviour automatically. Automated methods offer scalability, error reduction, and the discovery of increasingly subtle patterns in data compared to human effort alone (N. C. Chen et al., 2018). Despite promising advancements, concerns regarding generalisability, mistrust of automation, and value alignment between humans and machines persist (Friedberg et al., 2012; Grimmer et al., 2021; Jiang et al., 2021; R. Levitan & Hirschberg, 2011; Mills, 2019; Nenkova et al., 2008; Rahimi et al., 2017; Yarkoni et al., 2021). This thesis investigates the potential of computational techniques, such as social signal processing, text mining, and machine learning, to streamline qualitative coding in the social sciences, focusing on two high-stakes conversational case studies. The first case study analyses political interviewing using a corpus of 691 interview transcripts from US news networks. Psychological behaviours associated with effective interviewing are measured and used to predict conversational quality through supervised machine learning. Feature engineering employs a Social Signal Processing (SSP) approach to extract latent behaviours from low-level social signals (Vinciarelli, Salamin, et al., 2009). Conversational quality, calculated from desired characteristics of interviewee speech, is validated by a human-rater study. The findings support the potential of computational approaches in qualitative coding while acknowledging challenges in interpreting low-level social signals. 
The second case study investigates the ability of machines to learn expert-defined behaviours from human annotation, specifically in detecting predatory behaviour in known cases of online child grooming. In this section, the author utilises 623 chat logs obtained from a US-based online watchdog, with expert annotators labelling a subset of these chat logs to train a large language model. The goal was to investigate the machine’s ability to detect eleven predatory behaviours based on expert annotations. The results show that the machine could detect several behaviours with as few as fifty labelled instances, but rare behaviours were frequently over-predicted. The author next implemented a collaborative human-AI approach to investigate the trade-off between human accuracy and machine efficiency. The results suggested that a human-in-the-loop approach could improve human efficiency and machine accuracy, achieving near-human performance on several behaviours approximately fifteen times faster than human effort alone. The conclusion emphasises the value of increased automation in social sciences while recognising the importance of social scientific expertise in cross-disciplinary research, especially when addressing real-world problems. It advocates for technology that augments and enhances human effort and expertise without replacing it entirely. This thesis acknowledges the challenges in interpreting computational signals and the importance of preserving human insight in qualitative coding. The thesis also highlights potential avenues for future research, such as refining computational methods for qualitative coding and exploring collaborative human-AI approaches to address the limitations of automated methods.

Item Type: Thesis (PhD)
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 04 Aug 2023 11:38
Last Modified: 04 Aug 2023 11:38
DOI: 10.17638/03170404
Supervisors:
  • Maskell, Simon
  • Alison, Laurence
URI: https://livrepository.liverpool.ac.uk/id/eprint/3170404