Plenary Session 1
SMALLER STEPS, BIGGER GAINS: USING NESTED EPAS TO CAPTURE PROGRESSIVE OPERATIVE COMPETENCE
Maria R Castro, MD1, Kara Faktor, MD1, Yaacov Davidow, DO1, Olle ten Cate, PhD2, Patricia S O'Sullivan, EdD1, Lan T Vu, MD2; 1University of California, San Francisco, 2University Medical Center Utrecht, Utrecht, Netherlands
Introduction:
The American Board of Surgery (ABS) defines 18 Core EPAs for general surgery, assessed on a 4-point scale (Limited Participation, Direct Supervision, Indirect Supervision, Practice Ready). To provide a more granular assessment of progressive skill acquisition, we developed 10 Nested EPAs: discrete surgical tasks embedded within Core EPAs, such as minimally invasive abdominal entry and ostomy creation. These targeted tasks allow residents to achieve full entrustment ("Practice Ready") on specific procedural components before demonstrating entrustment for the entire operation. We hypothesized that residents would earn higher entrustment scores on Nested EPAs than on Core EPAs, making Nested EPAs a more sensitive mechanism for detecting incremental operative competence.
Methods:
We analyzed EPA assessment scores for all clinical general surgery residents with both Core and Nested EPA assessments during the 2024-2025 academic year. Core and Nested EPA scores were assessed on the ABS 4-point scale. Residents were categorized as Interns (PGY1), Junior Residents (PGY2–3), and Senior Residents (PGY4–5). Mean Core and Nested EPA scores were calculated for each resident, then used to derive group means. Mean paired difference (Nested – Core) was evaluated using a one-sided paired t-test.
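The paired comparison described above can be sketched in standard-library Python; the per-resident mean scores below are hypothetical, not study data, and a full analysis would typically use a statistics package (e.g., scipy.stats.ttest_rel with alternative='greater') to obtain the one-sided p-value.

```python
import math
import statistics

def paired_t_statistic(nested, core):
    """One-sided paired t statistic for H1: mean(nested - core) > 0."""
    diffs = [n - c for n, c in zip(nested, core)]
    mean_d = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))  # SD of differences / sqrt(n)
    return mean_d / se

# Hypothetical per-resident mean EPA scores on the ABS 4-point scale (not study data)
nested = [2.5, 3.0, 2.0, 3.5, 2.5]
core = [2.0, 2.5, 2.0, 3.0, 2.5]
t = paired_t_statistic(nested, core)
print(f"t = {t:.3f} on {len(nested) - 1} df")
```

The statistic is then compared against the t distribution with n-1 degrees of freedom for the one-sided test.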
Results:
Thirty-nine residents (Intern=12, Junior Resident=18, Senior Resident=9) met inclusion criteria. Mean Core EPA scores were: Intern 1.89 (SD=0.35), Junior Resident 2.45 (SD=0.27), Senior Resident 3.23 (SD=0.32). Mean Nested EPA scores were: Intern 2.06 (SD=0.46), Junior Resident 2.48 (SD=0.47), Senior Resident 3.56 (SD=0.73). Mean paired differences (Nested - Core) were: Intern 0.17 (SD=0.45), Junior Resident 0.03 (SD=0.38), Senior Resident 0.32 (SD=0.89) (Figure 1). Across all residents, the overall mean paired difference (Nested - Core) was 0.14 (SD=0.55, p=0.06).
Conclusions:
Across all clinical years, mean Nested EPA scores trended higher than mean Core EPA scores. These findings suggest that Nested EPAs could serve as a valuable adjunct to existing competency-based assessment frameworks by capturing earlier and more nuanced progress within complex procedures and enabling programs to support individualized learning trajectories.

DOES MORE MEAN BETTER? EVALUATING FACULTY-TO-RESIDENT FEEDBACK IN GENERAL SURGERY
Michel S Kabbash, MD1, Jennifer H Fieber, MD1, Christiana M Shaw, MD, MS1, George A Sarosi, MD1, John L Falcone, MD, MS2; 1University of Florida, 2Owensboro Health
Introduction
Entrustable Professional Activities (EPAs) are increasingly used in general surgery residency programs. However, there is limited understanding of faculty-level feedback patterns. This study examines the relationship between the quality of narrative feedback submitted by faculty for general surgery residents and the number of evaluations they provide. We hypothesized that faculty who submit more EPA micro-assessments provide higher quality feedback.
Methods
We utilized EPA micro-assessments for gallbladder disease, appendicitis, and inguinal hernia to analyze narrative feedback submitted by faculty to general surgery residents during the 2023-2025 academic years at an academic institution. De-identified data were extracted from the Surgery EPA application. Two independent raters evaluated the narrative feedback for quality using the QuAL (Quality of Assessment for Learning) rubric, which scores assessments on a zero-to-five-point scale across three domains: evidence, suggestion, and connection. Narratives with discordant scores were evaluated by a third rater. Inter-rater reliability was assessed. Data were analyzed using descriptive statistics, correlation testing, and linear regression. We used alpha=0.05 to determine statistical significance. This study was declared exempt by the Institutional Review Board.
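The abstract reports inter-rater reliability without naming the statistic; Cohen's kappa is one common choice for two raters assigning categorical scores, and can be sketched as follows (the QuAL scores below are hypothetical, not study data):

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    p1, p2 = Counter(rater1), Counter(rater2)
    # chance agreement: product of each rater's marginal category proportions
    expected = sum(p1[c] / n * p2[c] / n for c in p1)
    return (observed - expected) / (1 - expected)

# Hypothetical QuAL scores (0-5) from two independent raters -- not study data
r1 = [3, 4, 2, 5, 3, 4, 2, 3]
r2 = [3, 4, 2, 4, 3, 4, 2, 3]
kappa = cohens_kappa(r1, r2)
print(f"kappa = {kappa:.3f}")
```

Values above roughly 0.8 are conventionally read as strong agreement, consistent with the 0.85 reported here.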
Results
661 EPA micro-assessments from 26 faculty members were analyzed for quality. Nine evaluations had discordant scores and were adjudicated by a third rater. Inter-rater reliability was 0.85, indicating strong consistency. The mean QuAL score was 3.2 (IQR 2.5-4). The mean word count was 60 (IQR 26-86) and positively correlated with QuAL scores (ρ=0.81, p<0.001). Faculty with <25 evaluations had a mean QuAL score of 2.69 vs 3.3 for those who submitted ≥25 evaluations (U=96, p=0.015; Cohen's d=0.97, indicating a large effect size). Faculty members' years in practice ranged from one to 37. Regression analysis showed that submitting ≥25 evaluations (p=0.015), narrative feedback word count (p<0.001), and faculty years in practice (p=0.015) were independently associated with higher-quality feedback.
Conclusion
The quality of EPA narrative feedback submitted to residents by general surgery faculty was associated with longer narratives, more faculty years in practice, and submitting more EPA micro-assessments (≥25). These findings support faculty development efforts that encourage submitting more evaluations and providing longer narrative feedback to enhance the educational impact of EPA micro-assessments.
HIGHER PERFORMING SURGICAL TRAINEES RECEIVE LOWER-QUALITY FEEDBACK IN A NATIONAL, MULTI-INSTITUTIONAL STUDY OF ENTRUSTABLE PROFESSIONAL ACTIVITY ASSESSMENTS
Katie Glasgow, MD1, Ting Sun, PhD1, Brigitte Smith, MD, MHPE2, Libby Weaver, MD, MHPE1; 1University of Utah, 2University of Wisconsin
Background:
Surgical trainees require high-quality feedback. However, most faculty do not receive training on delivering effective feedback. Entrustable professional activities (EPA) assessments allow for frequent feedback in all phases of care. We sought to evaluate the quality of narrative feedback provided with EPAs in a national, multi-institutional study.
Methods:
EPA assessments were collected from 19 institutions in the Vascular Surgery EPA Pilot in 2024. Each assessment includes an entrustment score, phase of care, and narrative feedback. Entrustment ratings include limited participation, direct supervision, indirect supervision, and practice ready. Demographic information was self-reported. Quality scores were assigned to narrative feedback using the Quality of Assessment of Learning (QuAL) system. Descriptive statistics were calculated, stratified by demographic characteristics and phase of care. A mixed-effects logistic regression was conducted to examine feedback quality across entrustment ratings, controlling for trainee gender, postgraduate year (PGY), trainee ethnicity (underrepresented in medicine [URiM] and non-URiM), faculty gender, faculty-trainee gender concordance, and phase of care. Quality scores were dichotomized into high (QuAL score 4-5) and low (QuAL score 1-3) for analysis.
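The study's mixed-effects logistic regression is not reproduced here, but one of its building blocks, an unadjusted odds ratio for the dichotomized quality outcome with a Wald 95% confidence interval, can be sketched with standard-library Python (the counts below are hypothetical, not the study's adjusted estimates):

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI for a 2x2 table:
    rows = entrustment group, columns = high vs low quality feedback."""
    or_ = (a * d) / (b * c)
    # standard error of log(OR): sqrt of summed reciprocal cell counts
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se_log)
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, lo, hi

# Hypothetical counts: high/low-quality feedback for a lower entrustment
# rating vs "practice ready" -- not study data
or_, lo, hi = odds_ratio_ci(a=40, b=60, c=10, d=90)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The mixed-effects model additionally adds random effects (e.g., per faculty or institution) and the listed covariates, for which a package such as R's lme4 or Python's statsmodels would normally be used.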
Results:
Faculty feedback was provided in 759 of 1,998 evaluations, with 548 (72%) scored as low quality. 56% of trainees were men, 54% were white, and 64% of faculty were men. No trainee or faculty demographic characteristic was associated with the probability of receiving or giving high-quality feedback. Compared with "practice ready" entrustment ratings, however, lower entrustment ratings were associated with higher odds of receiving high-quality feedback ('indirect supervision': OR 3.25, 95% CI 1.86-5.68; 'direct supervision': OR 6.39, 95% CI 3.29-12.44; 'limited participation': OR 5.84, 95% CI 1.14-24.16) (Figure 1). Additionally, trainees had lower odds of receiving high-quality feedback during the preoperative phase compared to the intraoperative phase (OR 0.32, 95% CI 0.09-1.19).
Conclusion:
In this national, multi-institutional study of EPA assessments, overall feedback quality was low. However, previously identified demographic differences in surgical feedback were not observed. Feedback quality was inversely related to trainee performance, with higher-performing trainees receiving lower-quality feedback. These findings highlight the need to foster faculty feedback skills for trainees at all performance levels so that learning is not limited for higher performers.

BRIDGING THE GAP: LONGITUDINAL TRENDS IN INTRAOPERATIVE VERSUS NONOPERATIVE ENTRUSTMENT LEVELS
Andrada Diaconescu, MD1, Julia Kasmirski, MD1, Erin White, MD1, James Korndorffer, MD, MHPE2, M. Chandler McLeod, PhD1, George Sarosi, MD3, Carol Barry, PhD4, Andrew Jones, PhD4, Rebecca Minter, MD, MBA5, Karen Brasel, MD, MPH6, Brenessa Lindeman, MD, MEHP1; 1Department of Surgery, University of Alabama at Birmingham, Birmingham, AL, 2Department of Surgery and Perioperative Care, The University of Texas at Austin, Dell Medical School, Austin, TX, 3Department of Surgery, University of Florida, Gainesville, FL, 4American Board of Surgery, Philadelphia, PA, 5Department of Surgery, University of Wisconsin School of Medicine and Public Health, Madison, WI, 6Department of Surgery, Oregon Health & Science University, Portland, OR
Introduction
Resident intraoperative entrustment frequently lags behind nonoperative entrustment. However, less is known about how that dynamic changes over time. We hypothesized that the difference between intraoperative and nonoperative entrustment, as measured by the American Board of Surgery (ABS) entrustable professional activities (EPAs), would be greatest in the PGY-1 year and would approach zero as residents progress through training.
Methods
National EPA data from the ABS for PGY-1 through PGY-5 residents from July 2023 to July 2025 were reviewed. Entrustment levels were compared by phase and post-graduate year for common EPAs (abdominal wall hernia, appendicitis, colon disease, gallbladder disease, inguinal hernia). Statistical analyses were performed using Kruskal–Wallis tests, Wilcoxon rank-sum tests, and multivariable logistic regression.
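Because entrustment levels are heavily tied ordinal data (four possible values), the rank-based tests named above must handle ties with mid-ranks. A minimal Mann-Whitney U statistic, equivalent to the Wilcoxon rank-sum test, can be sketched as follows; the scores below are hypothetical, not study data:

```python
def mann_whitney_u(x, y):
    """U statistic for group x vs group y, using mid-ranks for tied values."""
    pooled = x + y

    def midrank(v):
        # 1-based rank averaged across all tied observations equal to v
        less = sum(1 for p in pooled if p < v)
        equal = sum(1 for p in pooled if p == v)
        return less + (equal + 1) / 2

    rank_sum = sum(midrank(v) for v in x)
    return rank_sum - len(x) * (len(x) + 1) / 2

# Hypothetical PGY-1 entrustment scores (1-4): intraoperative vs nonoperative
intra = [1, 2, 2, 1, 2]
nonop = [2, 2, 3, 2, 3]
u = mann_whitney_u(intra, nonop)
print(f"U = {u}")
```

A small U relative to len(x) * len(y) / 2 indicates the first group tends to score lower; a p-value would then come from the U distribution (with a tie correction), as implemented in standard statistics packages.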
Results
Subjects included 8,969 residents from 344 programs. A total of 97,252 EPA microassessments were available for analysis (82,759 intraoperative, 14,493 nonoperative/preoperative/postoperative). For PGY-1s, mean individual entrustment scores (1=limited participation, 2=direct supervision, 3=indirect supervision, 4=practice ready) were significantly lower for the intraoperative phase than for the nonoperative/preoperative phase for each of the EPAs analyzed (mean scores ± SD for intraoperative vs nonoperative: abdominal wall hernia 1.65±0.50 vs 2.20±0.59, appendicitis 1.69±0.50 vs 2.32±0.65, colon disease 1.60±0.55 vs 2.16±0.63, gallbladder disease 1.59±0.49 vs 2.27±0.63, inguinal hernia 1.62±0.55 vs 2.24±0.67; all p<0.001). Each successive year the gap between intraoperative and nonoperative EPA scores progressively narrowed, as indicated by significant interactions between PGY and EPA type (OR(95% CI): all p<0.001; abdominal wall hernia OR=1.47(1.28,1.69); appendicitis OR=1.90(1.65,2.19); colon disease OR=1.48(1.31,1.67); gallbladder disease OR=1.85(1.66,2.06); inguinal hernia OR=1.51(1.26,1.81)) (Figure 1). However, at PGY-5, the odds of receiving practice ready entrustment (4) remained lower for intraoperative EPAs (OR(95% CI): all p<0.001; abdominal wall hernia OR=0.20(0.14,0.28); appendicitis OR=0.32(0.19,0.53); colon disease OR=0.15(0.11,0.20); gallbladder disease OR=0.26(0.19,0.36); inguinal hernia OR=0.10(0.06,0.17)).
Conclusions
There has been concern that intraoperative entrustment lags behind nonoperative entrustment. This study confirms that PGY-1 residents begin training with significantly lower entrustment in the intraoperative phase than in the nonoperative/preoperative/postoperative phases. By the end of training this difference narrows, though a gap persists: PGY-5 residents remain less likely to receive practice ready entrustment intraoperatively.
Figure 1

WHEN IMPLEMENTING SIMPL IS NOT SO SIMPLE: RE-IMPLEMENTATION AND ENGAGEMENT STUDY
Ramiz Kardan, MD1, Moshumi Godbole, MD1, Jason Lei, MD2, Benjamin Moran, MD1, Ramsey Dallal, MD1, Jandie Posner, MD1; 1Jefferson Einstein Philadelphia Hospital, 2Jefferson Torresdale Hospital
Background: Entrustable Professional Activities (EPAs) became mandatory for U.S. general surgery residency assessments in July 2023, usually recorded through the SIMPL mobile platform. Limited participation and inconsistent faculty involvement reduced EPA effectiveness at our institution. We aimed to identify barriers to EPA completion and develop a structured, reproducible strategy to enhance resident and faculty engagement.
Methods: We conducted a single-center, pre–post quality improvement study using the ADKAR change management model. Barriers were identified through a faculty survey and resident feedback. The interventions (August–December 2024) included targeted re-education sessions, a survey to clarify workflow needs, and gamification strategies such as team-based leaderboards and incentives. We compared resident EPA assessment activity across two 3-month periods: pre-intervention (Jan–Mar 2024) and post-intervention (Jan–Mar 2025). Primary outcomes were total EPA assessments and resident participation rates. Secondary outcomes included assessments by phase of care, faculty-initiated assessment rates, the number of unique EPA categories per resident, and differences by PGY level. Statistical analysis used Fisher's exact or chi-square tests for proportions and Welch's t-test for means.
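Fisher's exact test for a 2x2 proportion comparison can be sketched with standard-library Python. The demo input uses the abstract's participation counts (9/24 residents pre-intervention vs 18/24 post-intervention); a statistics package (e.g., scipy.stats.fisher_exact) would normally be used in practice.

```python
from math import comb

def fisher_exact(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def p_table(x):
        # hypergeometric probability of a table with x in the top-left cell,
        # margins held fixed
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = p_table(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # two-sided: sum probabilities of all tables at least as extreme as observed
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))

# Participation: 9/24 residents pre-intervention vs 18/24 post-intervention
p = fisher_exact(9, 15, 18, 6)
print(f"p = {p:.4f}")
```

The 1e-9 tolerance guards against floating-point rounding when deciding which tables are "at least as extreme," a standard precaution in exact-test implementations.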
Results: The faculty survey response rate was 13/66 (19.7%); the main barriers identified were time constraints, unclear workflow, and low perceived value. Total resident EPA assessments increased from 36 to 100 (p=0.007). Trainee participation rose from 9/24 (37.5%) to 18/24 (75%). By phase: pre-operative assessments increased from 0 to 6, intra-operative from 33 to 79, post-operative from 1 to 9, and other from 2 to 6. Faculty-initiated assessments increased from 0% to 12% (raw counts 0/36 → 12/100). The mean number of unique EPA categories per resident increased from 0.96 ± 1.55 to 2.83 ± 2.97 (p=0.0096). Differences among PGY levels did not reach significance (p=0.057). No concurrent changes in program structure or case volume explained the observed effect.
Conclusions: A targeted, ADKAR-informed re-engagement strategy significantly boosted EPA engagement in a general surgery residency. Limited faculty survey responses and the single-center, pre–post design restrict generalizability. The intervention components are reproducible and may help other programs improve competency-based assessment uptake.
Figure 1: EPA evaluations by assessment type
