Measuring the Effect of Examiner Variability in a Multiple-Circuit Objective Structured Clinical Examination (OSCE)



Yeates, Peter, Moult, Alice, Cope, Natalie, McCray, Gareth, Xilas, Eleftheria, Lovelock, Tom, Vaughan, Nicholas, Daw, Dan, Fuller, Richard ORCID: 0000-0001-7965-4864 and McKinley, Robert K Bob
(2021) Measuring the Effect of Examiner Variability in a Multiple-Circuit Objective Structured Clinical Examination (OSCE). Academic Medicine, 96 (8). pp. 1189-1196.


Abstract

Purpose: Ensuring that examiners in different parallel circuits of objective structured clinical examinations (OSCEs) judge to the same standard is critical to the chain of validity. Recent work suggests that examiner-cohort (i.e., the particular group of examiners) could significantly alter outcomes for some candidates. Despite this, examiner-cohort effects are rarely examined, since fully nested data (i.e., no crossover between the students judged by different examiner groups) limit comparisons. In this study, the authors aim to replicate and further develop a novel method called Video-based Examiner Score Comparison and Adjustment (VESCA), so it can be used to enhance quality assurance of distributed or national OSCEs.

Method: In 2019, 6 volunteer students were filmed on 12 stations in a summative OSCE. In addition to examining live student performances, examiners from 8 separate examiner-cohorts scored the pool of video performances. Examiners scored videos specific to their station. Video scores linked otherwise fully nested data, enabling comparisons by Many Facet Rasch Modeling. The authors compared and adjusted for examiner-cohort effects. They also compared examiners' scores when videos were embedded (interspersed between live students during the OSCE) or judged later via the Internet.

Results: Having accounted for differences in students' ability, different examiner-cohort scores for the same ability of student ranged from 18.57 out of 27 (68.8%) to 20.49 (75.9%), Cohen's d = 1.3. Score adjustment changed the pass/fail classification for up to 16% of students, depending on the modeled cut score. Internet and embedded video scoring showed no difference in mean scores or variability. Examiners' accuracy did not deteriorate over the 3-week Internet scoring period.

Conclusions: Examiner-cohorts produced a replicable, significant influence on OSCE scores that was unaccounted for by typical assessment psychometrics. VESCA offers a promising means to enhance validity and fairness in distributed OSCEs or national exams. Internet-based scoring may enhance VESCA's feasibility.
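The linking logic described in the abstract (shared video performances connecting otherwise fully nested examiner-cohorts) can be illustrated with a minimal sketch. This is not the authors' implementation: VESCA uses Many Facet Rasch Modeling to estimate examiner-cohort effects jointly, whereas the toy model below uses a simple additive approximation, with all function names and data invented for illustration.

```python
# Illustrative sketch only (assumed additive model, not the authors' MFRM):
# each cohort scores the same pool of video performances, so a cohort's mean
# deviation from the grand mean on those videos estimates its severity, and
# that estimate can be subtracted from its live scores.

def cohort_severity(video_scores_by_cohort):
    """Estimate each cohort's leniency (+) or severity (-) as its mean
    deviation from the grand mean across the shared video pool."""
    all_scores = [s for scores in video_scores_by_cohort.values() for s in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    return {cohort: sum(scores) / len(scores) - grand_mean
            for cohort, scores in video_scores_by_cohort.items()}

def adjust(live_score, cohort, severity):
    """Remove the estimated cohort effect from a live score."""
    return live_score - severity[cohort]

# Hypothetical data: cohorts A and B score the same three videos.
videos = {"A": [20.0, 21.0, 19.0], "B": [17.0, 18.0, 16.0]}
severity = cohort_severity(videos)   # A is lenient (+1.5), B severe (-1.5)
```

Under this toy model, a student scored 19.0 by lenient cohort A and a student scored 16.0 by severe cohort B both adjust to 17.5, matching the abstract's point that unadjusted scores can misclassify candidates near a cut score.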

Item Type: Article
Uncontrolled Keywords: Humans, Physical Examination, Psychometrics, Educational Measurement, Clinical Competence
Divisions: Faculty of Health and Life Sciences
Faculty of Health and Life Sciences > Institute of Life Courses and Medical Sciences
Depositing User: Symplectic Admin
Date Deposited: 09 Oct 2023 12:52
Last Modified: 09 Oct 2023 12:52
DOI: 10.1097/ACM.0000000000004028
Open Access URL: https://journals.lww.com/academicmedicine/fulltext...
URI: https://livrepository.liverpool.ac.uk/id/eprint/3173546