Letter to the Editor, Commentary

Can a 1-Item Scale for Psychotherapy Outcomes Be Psychometrically Robust?

Scott T. Meier*¹

[1] Counseling, School and Educational Psychology, University at Buffalo, Buffalo, NY, USA.

Clinical Psychology in Europe, 2025, Vol. 7(1), Article e15207, https://doi.org/10.32872/cpe.15207

Published (VoR): 2025-02-28.

*Corresponding author at: 80 Londonderry Ln, Getzville, NY 14068, USA. Phone: +1 716 906-9420. E-mail: stmeier@buffalo.edu

This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Gonçalves et al. (2024) recently described the selection of a 1-item outcome scale for the European Psychotherapy Consortium. The field has been trending toward brief scales because of research indicating greater patient compliance with fewer items (Miller et al., 2003; Miller et al., 2005). From a psychometric perspective, however, the 1-item emotional and psychological outcomes (EPO-1) measure is likely to produce low reliability and validity estimates. This expectation follows from two measurement principles: (a) reliability estimates increase with the number of items, and (b) validity estimates depend upon reliability. Measurement error decreases with an increasing number of item responses because random error sources tend to balance or cancel (Meier, 2013). If a patient misunderstands a question, for example, this error becomes a major influence on the data in a single-item self-report. Because multiple factors typically influence responses to any psychological item, scores on 1-item scales are less likely to be sensitive to change resulting from psychotherapy than scores on a multi-item scale that aggregates change-relevant variance (Meier, 1997).
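The two principles above can be illustrated with the Spearman-Brown prophecy formula and the classical test theory bound on validity. The following is a minimal sketch, not an analysis of EPO-1 data: the single-item reliability of .40 is a purely hypothetical value chosen for illustration, the function names are invented for this example, and the formula assumes that added items are parallel to the original one.

```python
import math


def spearman_brown(single_item_reliability: float, n_items: int) -> float:
    """Projected reliability of a scale built from n parallel items."""
    r = single_item_reliability
    return n_items * r / (1 + (n_items - 1) * r)


def max_validity(reliability: float) -> float:
    """Classical test theory ceiling on a validity coefficient.

    An observed correlation with any criterion cannot exceed the
    square root of the measure's reliability.
    """
    return math.sqrt(reliability)


# Hypothetical value for illustration only; not an EPO-1 estimate.
HYPOTHETICAL_ITEM_RELIABILITY = 0.40

for k in (1, 2, 5, 10):
    rel = spearman_brown(HYPOTHETICAL_ITEM_RELIABILITY, k)
    print(f"{k:>2} items: reliability = {rel:.2f}, "
          f"validity ceiling = {max_validity(rel):.2f}")
```

Under these assumptions, projected reliability rises with each added item, and the ceiling on any validity coefficient rises with it, which is the dependency described in principles (a) and (b).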

Other research suggests that many patients will not interpret the EPO-1 as test developers intended (Schwarz, 1999). Labeling this problem intracategory variability, Dohrenwend (2006) observed that test-takers respond to item content on a self-report measure based on a wide range of personal experiences. When asked to report on a recent serious illness, for example, respondents will describe episodes that vary from simple flu to heart attacks. As a result, the basis on which individuals respond to health-related categories on self-report measures can range “from the catastrophic to the trivial” (Dohrenwend, 2006, p. 479). The EPO-1’s content is “At this moment, how well do you feel you are getting along emotionally and psychologically?” Patients respond on a 5-point scale ranging from 0 (“Very poorly; I can barely manage to deal with things”) to 4 (“Very well; I have no important complaints”). For many individuals, interpreting this question and selecting a rating are cognitively complex tasks likely to lead to heterogeneous response processes and ratings.

Given that single-item measures are inappropriate with ambiguous constructs (Allen et al., 2022), future research should evaluate reliability and validity estimates for the EPO-1. At a minimum, EPO-1 scores should show (a) stability over time in the absence of any intervention, (b) change over time when the patient participates in a psychosocial intervention, and (c) moderate to high correlations with existing measures of outcome. If EPO-1 scores fail to meet these standards, possible next steps include (a) augmenting EPO-1 data with one or more items related to common factors that have been shown to influence outcome and (b) developing a system that minimizes respondent burden. Regarding (a), working alliance would appear to be a strong candidate given that psychotherapy researchers consistently find a modest positive effect of the client/therapist alliance on outcomes (Flückiger et al., 2018). Regarding (b), recent studies suggest that AI could produce outcome information through analysis of text produced by client discourse recorded during therapy sessions as well as clinicians’ unstructured progress notes (Chu et al., 2024).
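The three minimum standards proposed above could be checked along the following lines. This is a rough sketch only: every score below is invented toy data rather than results from any study, the variable and function names are hypothetical, and standardizing change by the pretest standard deviation is one common convention among several.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ssx * ssy)


def standardized_mean_change(pre, post):
    """Mean pre-to-post change divided by the pretest standard deviation."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / stdev(pre)


# Invented toy data on the 0-4 EPO-1 response scale (illustration only).
no_treatment_t1 = [1, 2, 0, 3, 2, 1]
no_treatment_t2 = [1, 2, 1, 3, 2, 1]
treated_pre = [1, 0, 2, 1, 1, 2]
treated_post = [3, 2, 3, 2, 3, 4]
# Invented posttest scores on a hypothetical multi-item outcome measure.
comparison_post = [24, 15, 27, 18, 25, 33]

# (a) Stability without intervention: test-retest correlation.
print("test-retest r:", round(pearson_r(no_treatment_t1, no_treatment_t2), 2))
# (b) Change during intervention: standardized mean change.
print("pre-post change:", round(standardized_mean_change(treated_pre, treated_post), 2))
# (c) Convergence with an existing outcome measure.
print("convergent r:", round(pearson_r(treated_post, comparison_post), 2))
```

With real data, each statistic would need confidence intervals and adequate sample sizes before any conclusion about the EPO-1 could be drawn.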

Funding

The author has no funding to report.

Acknowledgments

The author has no additional (i.e., non-financial) support to report.

Competing Interests

The author has declared that no competing interests exist.

References

Allen, M. S., Iliescu, D., & Greiff, S. (2022). Single item measures in psychological science: A call to action [Editorial]. European Journal of Psychological Assessment, 38(1), 1-5. https://doi.org/10.1027/1015-5759/a000699
Chu, C., Sun, T., Zhang, B., & Rounds, R. (2024). Assessing vocational interests through chat: Development and validation of the Career Guidance Chatbot (CGC-bot) [Unpublished manuscript]. University of Illinois at Urbana-Champaign.
Dohrenwend, B. P. (2006). Inventorying stressful life events as risk factors for psychopathology: Toward resolution of the problem of intracategory variability. Psychological Bulletin, 132(3), 477-495. https://doi.org/10.1037/0033-2909.132.3.477
Flückiger, C., Del Re, A. C., Wampold, B. E., & Horvath, A. O. (2018). The alliance in adult psychotherapy: A meta-analytic synthesis. Psychotherapy, 55(4), 316-340. https://doi.org/10.1037/pst0000172
Gonçalves, M. M., Lutz, W., Schwartz, B., Oliveira, J. T., Saarni, S. E., Tishby, O., Rubel, J. A., Boehnke, J. R., Montesano, A., Paiva, D., Ceridono, D., Zech, E., Willemsen, J., Saarni, S. I., Kompan Erzar, K., Janeiro, L., Gelo, O. C. G., Errázuriz, P., Holas, P., . . . Barkham, M. (2024). Developing a European Psychotherapy Consortium (EPoC): Towards adopting a single-item self-report outcome measure across European countries. Clinical Psychology in Europe, 6(3), Article e13827. https://doi.org/10.32872/cpe.13827
Meier, S. (1997). Nomothetic item selection rules for tests of psychological interventions. Psychotherapy Research, 7(4), 419-427. https://doi.org/10.1080/10503309712331332113
Meier, S. T. (2013). Measuring change in counseling and psychotherapy. Guilford Press.
Miller, S. D., Duncan, B. L., Brown, J., Sparks, J. A., & Claud, D. A. (2003). The Outcome Rating Scale: A preliminary study of the reliability, validity, and feasibility of a brief visual analog measure. Journal of Brief Therapy, 2, 91-100. https://www.researchgate.net/publication/242159752
Miller, S. D., Duncan, B. L., Sorrell, R., & Brown, G. S. (2005). The Partners for Change Outcome Management System. Journal of Clinical Psychology, 61(2), 199-208. https://doi.org/10.1002/jclp.20111
Schwarz, N. (1999). Self-reports: How the questions shape the answers. The American Psychologist, 54(2), 93-105. https://doi.org/10.1037/0003-066X.54.2.93