Publications

Qualitative analysis of speech-language pathologists’ voice evaluation practices and perspectives

Madoule, M., Marks, K.L., Nagle, K.F., Kirchgessner, E., Gill, A., Kline, J.C., Vojtech, J.M. & Stepp, C.E. (2025). American Journal of Speech-Language Pathology. https://pubs.asha.org/doi/10.1044/2025_AJSLP-24-00417

Purpose: The purpose of this qualitative study was to examine the structure of voice evaluations and gather clinicians’ opinions on the barriers to and benefits of using acoustic measures in these evaluations. A secondary goal was to investigate how clinicians assess strain and vocal effort.
Method: Fifteen voice-specialized speech-language pathologists from voice centers around the United States were interviewed to query their current voice evaluation practice patterns and opinions on acoustic measures. They were also asked how they evaluate strain and vocal effort. Thematic analysis was performed by two researchers based on the recorded interviews.
Results: Differences among practitioners were found in almost every component of the evaluation. Four themes related to barriers to and benefits of implementing acoustic measures in a voice evaluation were identified: Collecting and analyzing acoustic measures (a) take time, (b) do not inform therapy patterns, (c) allow for the most accurate comparison, and (d) supplement patient-centered care. Three themes emerged related to evaluating vocal effort and strain: Clinicians (a) lack consensus on objective measures of strain, (b) use more than just auditory perception to evaluate strain, and (c) assess vocal effort in different ways.
Conclusions: Although some speech-language pathologists view acoustic assessment as the gold standard for guiding therapeutic decisions, others believe it may not be strictly necessary for delivering effective voice therapy. Variations in the assessment of strain and vocal effort across voice clinics suggest a need for additional research in this area.

Madoule et al 2025

Development and rationale for the Consensus Auditory-Perceptual Evaluation of Voice—Revised (CAPE-Vr) – OPEN ACCESS

Kempster, G., Nagle, K.F. & Solomon, N.P. (2025). Journal of Voice. https://doi.org/10.1016/j.jvoice.2025.01.022

Rationale: The Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) has been in circulation for more than 20 years. Over the course of time, issues have arisen that have had an impact on the intended administration and interpretation of this common clinical tool.
Purpose: Based on published literature, clinical experience, recent survey data, and practical considerations, and while maintaining the original purpose of the instrument, the authors developed a revised protocol, new rating form, and updated instructions for the CAPE-V, now called the CAPE-V—Revised (CAPE-Vr).
Summary of Modifications: Revisions to the CAPE-V include the following: removal of textual labels indicating regions of severity under each visual analog scale on the rating form, instead displaying terms indicating the direction of the lines; modification of several of the stimuli; revised rating options for pitch, loudness, and resonance, and an added category for nasality; added space to describe inconsistencies according to task; modified options for vocal instabilities and other features; and added space for comments about overall impression. The form also includes sections for documenting recording and rating conditions. Updated instructions are provided to clarify the CAPE-Vr protocol and correspond closely to the rating form.
Conclusion: The CAPE-Vr is constructed to avoid common errors and problems identified from previous use of the original CAPE-V. This paper provides a rationale for each modification to the original CAPE-V, an updated form, and an example of a completed form. The CAPE-Vr is intended as a clear and useful assessment tool for documenting the auditory-perceptual evaluation of voice.

Survey of voice-focused speech-language pathologists’ usage of the Consensus Auditory Perceptual Evaluation of Voice (CAPE-V)

Nagle, K. F., Kempster, G. B., & Solomon, N. P. (2024). Journal of Voice. https://doi.org/10.1016/j.jvoice.2024.08.032

Purpose: As part of the process of developing specific recommendations for modifying certain elements of the Consensus Auditory Perceptual Evaluation of Voice (CAPE-V) to promote end-user fidelity, the authors sought input from voice clinicians who regularly use the CAPE-V to assess voice quality.
Method: At an academic meeting focusing on voice disorders, we presented a poster briefly reviewing the CAPE-V protocol and describing several sources of variability that have been reported in its current use. Interested viewers were directed to a QR code linking to a brief, anonymous survey on how individuals currently use the CAPE-V and how they might improve it. A link to the survey was also distributed on the conference discussion board.
Results: Fifty-nine participants responded to the survey; 49 completed it. The median respondent reported 8 years of experience conducting voice evaluations, with 50% of their current practice in voice, and about eight voice evaluations per week. Key findings from this survey were that fewer than half of respondents reported audio recording any components of in-person or virtual voice evaluations, and that most respondents reported changing some aspect of the CAPE-V tasks and stimuli in practice.
Conclusion: This exploratory study revealed a wide range of idiosyncratic practices by clinicians when administering and scoring the CAPE-V. The findings support planned revisions to the CAPE-V protocol and form involving the tasks, stimuli, and rating procedures.

Nagle et al 2024 Survey

Professionalism in the context of providing elective services: Reflecting on bias

Nagle, K.F. & Pilkington, B. (2024). Journal of Communication in Healthcare. https://doi.org/10.1080/17538068.2024.2323852

We examine the provision of elective pronunciation services, such as intelligibility enhancement, to non-native speakers by speech language pathologists (SLPs). Practices associated with the “modification” of non-native accent raise significant professionalism questions about bias for SLPs and healthcare team professionals. These questions arise partly due to the socio-cultural context in which SLPs practice and their clients live, and the relational nature of communication. We argue that due to the ambiguity inherent in accent modification practices, SLPs must weigh a variety of considerations before determining the contexts and circumstances in which such services are professionally acceptable. Our argument is rooted in consideration of the complex nature of professionalism related to communication. After surveying potentially relevant models from other healthcare professions and finding them wanting, we support our position in light of current literature on topics such as accounts of functionality. We conclude by generalizing our anti-bias recommendations to interprofessional healthcare professionalism.

Nagle & Pilkington 2024

Clinical use of the CAPE-V scales: Agreement, reliability & notes on voice quality

Nagle, K.F. (2022). Journal of Voice. https://doi.org/10.1016/j.jvoice.2022.11.014

The CAPE-V is a widely used protocol developed to help standardize the evaluation of voice. Variability of voice quality ratings has prevented development of training protocols that might themselves improve interrater agreement among new clinicians. As part of a larger mixed methods project, this study examines agreement and reliability for experienced clinicians using the CAPE-V scales. Experienced voice clinicians (N=20) provided ratings of recordings from 12 speakers representing a range of overall voice quality. Participants were instructed to rate the voices as they normally would, using the CAPE-V scales. Descriptive data were recorded and two levels of agreement were calculated. Single rater reliability was calculated using a 2-way random model of absolute agreement for intraclass correlations (ICC [2,1]). Participants’ use of the CAPE-V scales varied considerably, although most rated overall severity, breathiness, roughness and strain. Data from one participant did not meet a priori agreement criteria. Because outcomes were significantly different without their data, agreement and reliability were analyzed based on the reduced data set from 19 participants. Interrater agreement and reliability were comparable to previous research; the mean range of ratings was at least 47 mm for all dimensions of voice quality. Results indicated differential use of the components of the CAPE-V form and scales in evaluating voice quality and severity of dysphonia, including categorical variability among ratings of all of the primary CAPE-V dimensions of voice quality that may complicate the clinical description of a voice as mildly, moderately or severely dysphonic.

Nagle 2022

Influence of phonatory break duration on auditory-perceptual ratings of speech acceptability and listener comfort in adductor-type laryngeal dystonia.

Doyle, P.C., Woldmo, R., Nagle, K.F., Crews, N. & Jovanovic, N. (2021). Journal of Voice. https://doi.org/10.1016/j.jvoice.2021.10.025

Abstract: This study empirically evaluated the influence of phonatory break duration on auditory-perceptual measures of speech produced by 26 adult speakers diagnosed with adductor-type laryngeal dystonia (AdLD). Fifteen inexperienced, young adult normal-hearing listeners provided ratings of speech acceptability and listener comfort for samples of running speech. Four phonatory break timing conditions were assessed using visual analog scaling methods. All stimuli were randomized for presentation and listeners were presented with experimental stimuli in a counterbalanced manner. Results indicate that the duration of phonatory breaks directly influenced listener ratings of speech acceptability (p<.001) and listener comfort (p<.001), with significant differences between original and modified recordings for both. Speech acceptability and listener comfort ratings were strongly correlated across all timing conditions (r = .85-.97). The duration of phonatory breaks and pauses has a significant influence on judgments of speech acceptability and listener comfort for AdLD. This suggests that temporal factors such as phonatory break duration and pause time in AdLD may carry a substantial negative impact on listeners’ perception relative to other auditory-perceptual features that co-exist in the signal.

Doyle et al 2021

Effect of noise on speech intelligibility & perceived listening effort in head & neck cancer.

Eadie, T., Durr, H., Sauder, C., Nagle, K.F., Kapsner-Smith, M. & Spencer, K. (2021). American Journal of Speech-Language Pathology. https://doi.org/10.1044/2020_AJSLP-20-00149

Abstract: This study (a) examined the effect of different levels of background noise on speech intelligibility and perceived listening effort in speakers with impaired and intact speech following treatment for head and neck cancer (HNC) and (b) determined the relative contribution of speech intelligibility, speaker group, and background noise to a measure of perceived listening effort. Ten speakers diagnosed with nasal, oral, or oropharyngeal HNC provided audio recordings of six sentences from the Sentence Intelligibility Test. All speakers were 100% intelligible in quiet: Five speakers with HNC exhibited mild speech imprecisions (speech impairment group), and five speakers with HNC demonstrated intact speech (HNC control group). Speech recordings were presented to 30 inexperienced listeners, who transcribed the sentences and rated perceived listening effort in quiet and two levels (+7 and +5 dB SNR) of background noise. Significant Group × Noise interactions were found for speech intelligibility and perceived listening effort. While no differences in speech intelligibility were found between the speaker groups in quiet, the results showed that, as the signal-to-noise ratio decreased, speakers with intact speech (HNC control) performed significantly better (greater intelligibility, less perceived listening effort) than those with speech imprecisions in the two noise conditions. Perceived listening effort was also shown to be associated with decreased speech intelligibility, imprecise speech, and increased background noise. Speakers with HNC who are 100% intelligible in quiet but who exhibit some degree of imprecise speech are particularly vulnerable to the effects of increased background noise in comparison to those with intact speech. Results have implications for speech evaluations, counseling, and rehabilitation.

Eadie et al. 2021

Perceptual and acoustic assessment of strain using synthetically modified voice samples.

Park, Y., Diaz-Cadiz, M., Nagle, K.F. & Stepp, C. (2020). Journal of Speech, Language, and Hearing Research. https://doi.org/10.1044/2020_JSLHR-20-00294

Abstract: Assessment of strained voice quality is difficult due to the weak reliability of auditory-perceptual evaluation and lack of strong acoustic correlates. This study evaluated the contributions of relative fundamental frequency (RFF) and mid-to-high frequency noise to the perception of strain. Stimuli were created using recordings of speakers producing /ifi/ with a comfortable voice and with maximum vocal effort. RFF values of the comfortable voice samples were synthetically lowered, and RFF values of the maximum vocal effort samples were synthetically raised. Mid-to-high frequency noise was added to the samples. Twenty listeners rated strain in a visual sort-and-rate task. The effects of RFF modification and added noise on strain were assessed using an analysis of variance; intra- and interrater reliability were compared with and without noise. Lowering RFF in the comfortable voice samples increased their perceived strain, whereas raising RFF in the maximum vocal effort samples decreased their strain. Adding noise increased strain and decreased intra- and interrater reliability relative to samples without added noise. Both RFF and mid-to-high frequency noise contribute to the perception of strain. The presence of dysphonia may decrease the reliability of auditory-perceptual evaluation of strain, which supports the need for complementary objective assessments.

Supplemental Material
https://doi.org/10.23641/asha.13172252

Elements of clinical training with the electrolarynx.

Nagle, K.F. (2019). In P. Doyle (Ed.), Clinical care and rehabilitation in head and neck cancer (pp. 129-143). Cham, Switzerland: Springer.

Abstract: The electromechanical device commonly known as an electrolarynx (EL) is a popular primary or backup mode of postlaryngectomy alaryngeal communication. Learning to efficiently and successfully use an EL requires the acquisition of several skills, including: 1) appropriate placement of the device; 2) control of voice activation; 3) over-articulation and modulation of speech rate; and 4) awareness of paralinguistic behaviors. Mastering such skills can increase comprehensibility, and in turn, the potential for communicative success with the EL. Design features vary among commercially available devices, mostly in the type and degree of pitch modulation they offer. To optimize the ability of newer devices to modulate pitch, users may need specific practice directed toward enhancement of the suprasegmental aspects of their EL speech. This chapter reviews current EL features and outlines how speech-language pathologists (SLPs) can provide valuable training and insight for laryngectomees seeking to use this popular method of post-laryngectomy communication.

Perceived listener effort as an outcome measure for disordered speech.

Nagle, K.F. & Eadie, T.L. (2018). Journal of Communication Disorders, 73, 34-49.

Abstract: Perceived listening effort is a perceptual dimension used to identify the amount of work necessary to understand disordered speech. The purpose of this study was to investigate the utility of perceived listening effort to provide unique information about disordered speech. The relationships between perceived listening effort and two current outcome measures (speech acceptability, intelligibility) were examined for listeners rating electrolaryngeal speech, along with their reliability and intra-rater agreement. Ten healthy male speakers read low-context sentences using an electrolarynx. Twenty-five inexperienced listeners orthographically transcribed and rated the stimuli for perceived listening effort and speech acceptability using a visual analog scale. Strict reliability and agreement criteria were set. Perceived listening effort was moderately to strongly correlated with intelligibility (r = −0.76) and acceptability (r = −0.80), each of which contributed uniquely to ratings of perceived listening effort. However, only 17 listeners met stringent reliability and agreement criteria. Ratings of perceived listening effort may provide unique information about the communicative success of individuals with communication disorders. There is great variability, however, among inexperienced listeners’ perceptual ratings of electrolaryngeal speech. Future research should investigate variables that may affect perceived listening effort specifically and auditory-perceptual ratings in general.

Nagle & Eadie, 2018

Perceived naturalness of electrolaryngeal speech produced using sEMG-controlled vs. manual pitch modulation.

Nagle, K.F. & Heaton, J.T. (2016). Interspeech 2016. San Francisco, CA.

Abstract: Producing speech with natural prosodic patterns is an ongoing challenge for users of electrolaryngeal (EL) speech. This study describes speech produced using a method currently in development, wherein a prosodic pattern is derived from skin surface electromyographical (sEMG) signals recorded from under the chin (submental surface). Eight laryngectomees who currently use a TruTone EL as their primary or backup mode of speech provided samples of EL speech in two modes: conventional thumb-pressure pitch-modulated control (represented by the TruTone EL; Griffin Laboratories, CA, U.S.A.) and sEMG-based pitch-modulated control (EMG-EL). Ratings of perceived naturalness were obtained from ten listeners unfamiliar with EL speech. Listener ratings indicated that five speakers produced equally natural speech using both devices, and three produced significantly more natural speech using the EMG-EL than the TruTone EL. Mean fundamental frequency (f0) was similar within speakers for both modes; however, mean f0 range and standard deviation were significantly larger for the EMG-EL than for the TruTone EL, despite both devices having similar potential f0 range. This study showed that the EMG-EL provides an intuitive means of controlling f0-based prosodic patterns that are more natural-sounding than push-button control for some EL users.

Nagle & Heaton 2016

Emerging Scientist: Challenges to CAPE-V as a Standard.

Nagle, K.F. (2016). Perspectives of the ASHA Special Interest Groups, 1, 47-53.

Abstract: The Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V; American Speech-Language-Hearing Association, 2002) outlines a protocol for obtaining voice samples and rating their voice quality. It was developed as a standard voice protocol based on expert consensus and psychophysically appropriate measurement of auditory perceptual qualities of voice. The CAPE-V has since obtained widespread research and clinical use, but research suggests considerable variability in how both expert and new clinicians use its rating scales. In this paper, I review remaining challenges to standardizing voice quality evaluation and describe ongoing research addressing these challenges.

Nagle-2016-Perspectives_of_the_ASHA_Special_Interest_Groups

Generating tonal distinctions in Mandarin Chinese using an electrolarynx with preprogrammed tone patterns.

Guo, L., Nagle, K.F. & Heaton, J.T. (2016). Speech Communication, 78, 34-41.

Abstract: An electrolarynx (EL) is a valuable rehabilitative option for individuals who have undergone laryngectomy, but current monotone ELs do not support controlled variations in fundamental frequency for producing tonal languages. The present study examined the production and perception of Mandarin Chinese using a customized hand-held EL driven by computer software to generate tonal distinctions (tonal EL). Four native Mandarin speakers were trained to articulate their speech coincidentally with preprogrammed tonal patterns in order to produce mono- and di-syllabic words with a monotone EL and tonal EL. Three native Mandarin speakers later transcribed and rated the speech samples for intelligibility and acceptability. Results indicated that words produced using the tonal EL were significantly more intelligible and acceptable than those produced using the monotone EL.

Guo et al 2016

Speech & Voice Outcomes Lab

At Seton Hall University
