ePoster
(P001) THEMATIC ANALYSIS OF INTRAOPERATIVE TEACHER-LEARNER COMMUNICATION: HOW RELIABLE IS AI?
Michael J Ferzoco, Lauren Lewis, MEng; Washington University School of Medicine in St. Louis
Introduction
Evaluating intraoperative teaching and resident autonomy remains a significant challenge in surgical education. Dialogue between attendings and trainees is a rich data source, but analysis is constrained by labor-intensive qualitative coding. We therefore evaluated the reliability of a GPT-based artificial intelligence (AI) assistant in performing line-by-line qualitative coding of intraoperative dialogue as an automated, scalable assessment tool.
Methods
Five intraoperative audio recordings from distinct attending-trainee dyads were transcribed using automated speech recognition. A 35- to 65-minute de-identified segment from each recording, selected for dense interaction, was split into sentences. Using a 36-category codebook, the AI assistant coded each transcript twice. Reliability between AI runs was calculated using percent agreement and Cohen’s kappa (κ). Per-code precision was assessed with the Jaccard index.
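The reliability metrics named above are standard and simple to compute. A minimal sketch, assuming each run assigns a single code per transcript line (the per-code Jaccard form treats each code as a set of line indices tagged with it in each run):

```python
from collections import Counter

def percent_agreement(run1, run2):
    """Fraction of transcript lines assigned the same code in both runs."""
    return sum(a == b for a, b in zip(run1, run2)) / len(run1)

def cohens_kappa(run1, run2):
    """Chance-corrected agreement between two coding runs."""
    n = len(run1)
    po = percent_agreement(run1, run2)
    c1, c2 = Counter(run1), Counter(run2)
    # Expected chance agreement from each run's marginal code frequencies.
    pe = sum(c1[code] * c2[code] for code in c1) / (n * n)
    return (po - pe) / (1 - pe)

def jaccard(lines_run1, lines_run2):
    """Per-code Jaccard index: overlap of the line sets tagged with a code."""
    s1, s2 = set(lines_run1), set(lines_run2)
    return len(s1 & s2) / len(s1 | s2)
```

For example, two runs coding four lines as `["a","a","b","b"]` and `["a","b","b","b"]` give 75% agreement and κ = 0.5; a code applied to lines {1, 2, 3} in one run and {2, 3, 4} in the other has a Jaccard index of 0.5.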
Results
Across 2,245 transcript lines, between-run reliability demonstrated 79% agreement and κ of 0.75 (p<0.05). Per-code agreement varied by code complexity. Straightforward sentences were coded more consistently than those requiring implicit reasoning or contextual interpretation (Table 1).
Conclusions
The AI assistant achieved substantial agreement for intraoperative linguistic coding, supporting the reliability of automated dialogue analysis as a scalable assessment tool. Lower reliability for abstract sentences highlights a need to refine code definitions and transcript accuracy to better capture nuanced teaching. This work represents a key step toward developing automated tools for real-time assessment of intraoperative teaching and providing structured feedback to trainees to advance autonomy.
| Code | Definition | Jaccard Index |
|---|---|---|
| Instrument Request | Attending or trainee asks for instrument | 0.82* |
| Acknowledgement/Affirmation | Confirmation or receipt of information | 0.81* |
| Off Target Talking | Conversations not related to the operation | 0.77* |
| Instruction | Directive guidance on next steps or actions | 0.76* |
| Explanation/Education | Explanations and teaching points | 0.57 |
| Feedback | Performance-oriented comments that reinforce or correct actions | 0.57 |
| Shared Mental Modeling | Explicit alignment on goals or next steps | 0.49 |
| Case Complexity/Anatomy | Characterizing difficulty/anatomical variation | 0.32* |
*p<0.05
(P002) FROM OR DIALOGUE TO ACTIONABLE FEEDBACK: A SCALABLE METHOD FOR ANALYZING INTRAOPERATIVE TEACHING
Blake T Beneville, MD, Katharine E Caldwell, MD, MSCI, Jenna Bennett, BS, Mohamed Jama, BS, Cory Fox, BS, Mike Ferzoco, BS, Lauren Lewis, BS, Jonathan Tong, BS, Michael M Awad, MD, PhD, MHPE; Washington University in St. Louis
Background: Feedback is central to surgical skill acquisition, yet trainees consistently report receiving less feedback than they need. Much of the most specific feedback occurs as real-time dialogue in the operating room and is quickly lost. Traditional qualitative review of these conversations is too time-intensive to scale. We sought to develop a practical method to capture, analyze, and return operative teaching feedback to trainees and programs.
Methods: Attending–trainee dialogue from 25 general surgery operations using open and minimally invasive techniques was audio-recorded, transcribed, manually corrected, and de-identified. Transcripts were reformatted to one sentence per line and coded using a 36-category codebook focused on feedback, coaching behaviors, questions, and off-target talk. A small set of pilot transcripts were manually coded to calibrate an AI assistant (GPT-based) within a human-in-the-loop workflow. The assistant then coded the remaining transcripts in reviewable batches, and investigators confirmed or corrected codes. Throughput and error rates were compared with manual coding alone.
Results: The AI-assisted workflow required ~10 minutes per 100 transcript lines versus >60 minutes for manual coding, operating at <20% of the manual time and saving an estimated 239 hours on the full dataset (28,799 lines). Overall error rate was 2.0% (≈98% accuracy), with calibrated batch accuracy ≥95%. Most errors involved boundary cases (for example, distinguishing interjections from off-target talk, or labeling brief words like “good/okay” as feedback versus simple agreement) and occasional misattribution of teaching directed to a third party. These were readily corrected during review.
Conclusions/Implications: Capturing and analyzing OR dialogue with an AI-supported, human-verified process makes it feasible to return structured, case-specific feedback to trainees at scale. Programs can generate rapid post-case summaries (for example, number and type of feedback statements, coaching strategies used, and missed opportunities), support faculty development with objective teaching profiles, and monitor learning climate over time. This approach turns ephemeral intraoperative teaching into actionable feedback for learners while preserving rigor and faculty time.
(P003) PUBLICLY AVAILABLE CHATGPT SIGNIFICANTLY INCREASED ARTIFICIAL INTELLIGENCE AUTHORSHIP IN PERSONAL STATEMENTS USED IN GENERAL SURGERY RESIDENCY APPLICATIONS
Sana Khan, Dana Cooley, Miguel Tobon, Eliza Beal, David Bouwman, David Edelman; Wayne State University School of Medicine
Background
The use of artificial intelligence authorship (AIA) remains controversial throughout the educational spectrum and could be considered a disruptive tool in the generation of the personal statement (PS). We hypothesized that the use of AIA in candidates' PSs for general surgery residency applications increased significantly after ChatGPT became widely available.
Methods
All applications to a large, urban, academically affiliated general surgery residency program during three application cycles (2022-2025) were included. Because ChatGPT became publicly available on November 30, 2022, the 2022-2023 application cycle was considered the PRE-ChatGPT group (PRE), and the 2023-2024 and 2024-2025 cycles combined were considered the POST-ChatGPT group (POST). Appropriate IRB and organizational permissions were obtained for this study. PSs were extracted, blinded, and analyzed using commercially available artificial intelligence detection software (Originality.ai). Additional items analyzed included type of medical school (United States vs. international), USMLE Step 2CK scores, and visa status. Appropriate statistical analyses were applied; p < 0.05 was considered significant.
Results
A total of 4596 PSs were included in this study of which 1822 (40%) were PRE and 2774 (60%) were POST. Overall, AIA was used in 2256 (49%) PSs. AIA usage was significantly higher in the POST group (1853/2774 = 67%) compared to the PRE group (403/1822 = 22%), p<0.01. Within the POST group, AIA was used significantly more in candidates attending international medical schools and non-US citizens (p<0.05). No correlation in AIA usage was seen based on Step 2CK scores (p>0.05). When analyzing the two application cycles within the POST group, AIA usage remained stable among US applicants and increased with international applicants.
Conclusions
AIA increased significantly after ChatGPT became publicly available and is now commonly used. Candidates applying from international medical schools and non-US citizens use AIA more often. AIA usage compromises the value of PSs in the consideration of a candidate for a general surgery residency position. A continued national conversation is warranted to develop policy around artificial intelligence authorship, including the integration of detection algorithms and the redefinition of evaluation metrics for PSs.
(P004) YOU DON’T SAY? AI FEEDBACK IMPROVES RESIDENT COMMUNICATION QUALITY
William D Rieger, MD1, Renee W Green, MD1, Marissa N Thibodeaux1, Anne R Jeckovich, MD1, Rogith Deevakar, MBBS, PhD2, Peggy H Hsieh, PhD, MEd1, Toufeeq Syed, PhD2, Allison R Ownby, PhD, MEd1, Sasha D Adams, MD1, Lillian S Kao, MD, MS, MBA, FACS1, Krislynn M Mueck, MD, MS, FACS1; 1McGovern Medical School at UTHealth Houston, 2McWilliams School of Biomedical Informatics at UTHealth Houston
Introduction: Effective physician-patient communication is crucial for optimal outcomes and is an entrustable professional activity for residents. However, few tools exist to both measure and improve that communication. We aimed to develop an artificial intelligence (AI)-based communication feedback tool and to test its ability to improve the complexity and quality of transcribed interactions between residents and standardized patients (SPs). We hypothesized that application of AI-based feedback would improve the complexity and quality of resident communications with SPs.
Methods: We performed a cross-sectional study of surgery residents at an urban academic program. Residents completed a simulated, beta-tested surgical case with SPs. Communications were recorded, transcribed, and analyzed for complexity using the Flesch Kincaid Grade Level (FKGL) readability instrument and quality using the Ensuring Quality Information for Patients (EQIP) assessment, with lower readability grade level and higher quality scores preferred. Two-thirds of the transcripts were used to refine an enterprise-grade Copilot (Microsoft) Generative Pre-trained Transformer (GPT)-4 AI agent to analyze resident-SP communication and then provide feedback to improve communication complexity and quality. AI-based feedback was applied to the remaining third of transcripts and these adjusted transcripts were re-analyzed for complexity via FKGL and quality via EQIP. Descriptive and univariate statistics were performed.
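The FKGL instrument used above is a fixed formula over sentence, word, and syllable counts. A minimal sketch follows; the syllable counter is a crude vowel-group heuristic (production readability tools use pronunciation dictionaries), so scores will differ slightly from any particular commercial implementation:

```python
import re

def count_syllables(word):
    """Crude heuristic: count vowel groups, dropping a trailing silent 'e'."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def fkgl(text):
    """Flesch-Kincaid Grade Level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)
```

Shorter sentences and fewer syllables per word lower the grade level, which is why a mean FKGL near 5 indicates patient-friendly speech.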
Results: Of 23 initial resident-SP transcripts, 16 were used for AI training and 7 were used for testing. Among the original testing transcripts, the mean readability grade level using FKGL was 5.2 (SD 0.8), while the mean EQIP score was 73.8% (SD 2.2). After feedback application and adjustment, the transcript mean readability grade level was 5.0 (SD 0.4) and mean EQIP score was 80.6% (SD 1.9%). The mean difference in readability after adjustment was -0.25 grade levels (CI -1.4 to 0.56, p=0.49), and the mean difference in quality was 6.8% (CI 5.2 to 8.4, p<0.01).
Conclusion: In this single-program study, an AI-based feedback tool was developed and showed promise at increasing the quality of resident communication transcripts. While speech complexity was not significantly decreased, additional AI model adjustments may improve this feedback. Further evaluation is required to assess AI feedback feasibility and effectiveness with residents in targeted communication education.
(P005) MOCK ORALS REIMAGINED: CHATGPT’S ROLE IN BOARD EXAM PREPARATION
Jessica L Masch, MD, John R Austin, MD, Michelle Lippincott, MD, Allison E Berndtson, MD, Jarrett E Santorelli, MD; University of California San Diego
Background:
AI is rapidly transforming medical education, yet its role in surgical training remains uncertain. The ABS Certifying Exam demands real-time clinical reasoning, traditionally refined through faculty-led mock orals or expensive review courses. ChatGPT-4o, with its dynamic response capabilities, could serve as a scalable alternative. This study evaluates its effectiveness in simulating oral board scenarios compared to traditional techniques.
Methods:
Eight senior general surgery residents (PGY 4–5) completed two mock oral sessions: one with trauma faculty and one using ChatGPT-4o. ChatGPT was pre-programmed to present cases, await resident responses at decision points, and adapt scenarios accordingly without providing immediate feedback. Feedback was given only at the end of each session. Residents then rated authenticity, utility, and likelihood of future use.
Results:
ChatGPT reliably emulated the ABS exam format when appropriately prompted. There was no significant difference in residents' ratings of case similarity (p=0.13). However, residents felt using ChatGPT was less valuable for preparation (p=0.03) and reported being less likely to use ChatGPT for future preparation (p<0.01), citing concerns about authenticity (p=0.01), beneficial feedback (p=0.03), and information accuracy compared to faculty-led sessions (p<0.01).
Conclusion:
ChatGPT can generate structured oral board scenarios, highlighting AI’s potential in surgical education. Nonetheless, skepticism regarding authenticity and accuracy limits its adoption. Refining AI-driven mock orals—potentially by integrating verified ABS study materials—could enhance trust and usability, making AI a scalable, cost-effective adjunct in board preparation.
(P006) EARLY EXPERIENCES AND PERCEIVED BARRIERS TO IMPLEMENTATION OF ENTRUSTABLE PROFESSIONAL ACTIVITIES (EPAS) IN GENERAL SURGERY RESIDENCY PROGRAMS
Tasha Posid, MA, PhD1, Osama Elsawy, DO2, Jenny Guido, MD3, Lan Vu, MD4, Leslie Haislip5, Lisa Cunningham, MD1, Theresa N Wang, MD6, Emily Huang, MD1, Ellen Hagopian, MD, MHPE, MEd, MESA, FACS, FSSO7, Minna M Wieck, MD8, Justine Broecker, MD9, Kshama Jaiswal, MD10; 1The Ohio State University Wexner Medical Center, 2Saint Joseph's University Medical Center, 3Sanford Health, 4UCSF Benioff Children’s Hospital, 5Eastern Carolina University, 6University of Washington, 7University of Toledo College of Medicine and Life Sciences, 8University of California-Davis Health, 9University of California San Francisco, 10University of Utah
Introduction: Entrustable Professional Activities (EPAs) are a core component of competency-based education (CBE), offering a structured, workplace-based assessment of observable trainee behaviors. Despite recent adoption by the American Board of Surgery, and initial validity evidence across surgical specialties, implementation challenges remain underexplored. This study examines early experiences and perceived barriers to EPA use among general surgery residents and faculty.
Methods: A preliminary survey was distributed via REDCap to members of two ASE committee working groups and their respective institutions. Respondents included general surgery residents (n=30) and faculty (n=7). The survey assessed frequency and modality of EPA use, perceived ease of use, feedback practices, and perceived barriers to implementation. Descriptive and comparative analyses were performed using summary statistics, independent-sample t-tests or chi-square tests to compare residents and faculty, and single-sample tests to evaluate survey responses against predetermined benchmarks for satisfaction and ease of use.
Results: All respondents (100%) accessed their EPA platforms via mobile app, primarily for operative case assessments (Mean of EPAs completed: Residents = 27.6, Faculty = 31.4, p>0.1). Both groups reported the platform as “very easy” to use (p<0.05 vs. neutral). Residents reviewed completed EPAs with attendings less than half the time, and only 69% reported using EPA feedback for self-directed improvement. The most frequently cited barriers overall (>50% collectively) were difficulty developing the habit to complete EPAs, high cognitive load, and burnout (Figure 1). Residents more strongly perceived low faculty buy-in (p<0.05) and limited faculty response rates (p<0.05) as barriers. Faculty, conversely, identified the lack of linkage between EPAs and milestones (p<0.05) as a key limitation. Both groups were equally uncertain whether EPAs promote resident autonomy (p>0.1), though most residents (77%, p<0.05) viewed them as a valuable feedback tool. Qualitative comments highlighted that faculty reminders and structured expectations improved engagement.
Conclusion: Early implementation of EPAs in general surgery appears feasible and educationally valuable but is limited by workflow challenges and perceptions of low faculty buy-in. While these preliminary findings are based on a small sample, we anticipate additional responses through a forthcoming ASE-wide survey.

(P007) ALIGNMENT BETWEEN OVERALL AND COMPONENT ENTRUSTMENT SCORES: INSIGHTS FROM GENERAL SURGERY EPA ASSESSMENTS
Kara L Faktor, MD, MSc1, Jessica R Santos-Parker, MD, MS, PhD1, Keli S Santos-Parker, MD, MS, PhD1, Patricia O'Sullivan, EdD1, Olle ten Cate, PhD2, Lan Vu, MD1; 1University of California San Francisco, 2Utrecht University
Introduction: A four-level entrustment-supervision scale (limited participation, direct supervision, indirect supervision, practice ready) is used to assess Entrustable Professional Activities (EPAs) in general surgery residency. The narrative description of each entrustment level comprises several components. We aimed to understand how scores on discrete components of intra-operative performance contribute to the overall EPA assessment score, to gain insight into the aspects of resident intra-operative performance that most strongly influence faculty entrustment decisions across PGY levels.
Methods: Assessments on four common general surgery EPAs (gallbladder disease, appendicitis, inguinal hernia, small bowel obstruction) were selected from intra-operative EPA assessments for a single general surgery residency program in the 2024-2025 academic year. The American Board of Surgery (ABS) EPA narrative descriptions were deconstructed into component domains, including anatomy, operative steps and technical skill. We fit a linear regression model treating overall EPA assessment scores (1-4) as continuous with cluster-robust standard errors by attending. Case difficulty was coded as an ordered trend (straightforward, moderate, complex).
Results: A total of 758 EPA assessments were analyzed, including 348 gallbladder disease, 202 appendicitis, 171 inguinal hernia and 37 small bowel obstruction. Case complexity increased with PGY-level (OR 2.92 - 6.12 PGY2-5 compared to PGY1, p<0.001). Each step in increased complexity was associated with a -0.21 change in expected overall EPA assessment score (SE 0.04, p<0.001). Overall EPA assessment scores increased with PGY-level with significant increases in the PGY-4 (β = 0.47, p<0.001) and PGY-5 (β = 0.72, p<0.001) years. When holding all other component scores constant, technical skill had the strongest correlation with the overall EPA assessment score (β = 0.34, p<0.001), followed by operative steps (β = 0.27, p<0.001). Anatomy demonstrated a minimal and statistically borderline association with the overall EPA assessment score (β = 0.06, p = 0.051). The model explained 67.8% of the variance in overall EPA assessment scores.
Conclusion: Overall entrustment increased with PGY-level and decreased with case complexity. Technical skill and knowledge of operative steps demonstrated the strongest influence on overall faculty entrustment. However, 32.2% of the variance was unexplained, suggesting faculty rely on additional factors when making entrustment decisions.
(P008) IMPLEMENTATION OF ENTRUSTABLE PROFESSIONAL ACTIVITIES IN COMPLEX GENERAL SURGICAL ONCOLOGY FELLOWSHIP
Anneliese N Hierl, MD, LaDonna Kearse, MD, Kristen Jogerst, MD, Paul Graham, MD, MS, Naruhiko Ikoma, MD, MS, Jessica Maxwell, MD, MBA, Christopher Scally, MD, MS, Ching-Wei Tzeng, MD, MS, Brian Bednarski, MD, MEHP, Heather Lillemoe, MD; University of Texas MD Anderson Cancer Center
Background:
Entrustable Professional Activities (EPAs) provide a structured framework for competency-based assessment. While increasingly utilized in residency, their adoption in surgical fellowships remains limited. This study evaluates EPA implementation in a Complex General Surgical Oncology (CGSO) fellowship, assessing perceived barriers and benefits, and measuring engagement during initial EPA adoption.
Methods:
A pre-implementation survey was distributed to all faculty and fellows in a single CGSO fellowship via REDCap. Usage data were then extracted from the Firefly application to assess frequency of EPA completion and entrustment level concordance between faculty and fellows. Data were analyzed using descriptive statistics. Post-implementation surveys and focus groups are planned to better understand EPA utilization and faculty assignment of entrustment levels.
Results:
The pre-implementation survey was completed by 100% (14/14) of fellows and 61% (41/67) of eligible faculty. Respondents anticipated the following barriers to EPAs: forgetting to initiate/reciprocate EPAs (64%), change in workflow (63%), time constraints (54%), and perceived burden to others (30%). Potential benefits included structured feedback across phases of care (75%), formalized feedback (73%), clearer learning objectives (57%), and improved communication (50%). Prior to implementation, most faculty moderately agreed they provided adequate feedback to trainees across the phases of care, while fellows reported moderate comfort requesting feedback.
In the initial two months following EPA implementation, 260 EPAs were completed: 74 pre/nonoperative, 113 intraoperative, and 73 postoperative. Twenty-nine of 63 eligible faculty and all 14 eligible clinical fellows participated. At least one assessment was completed for each of the 12 core CGSO EPA types: the majority initiated by faculty (Figure). Among assessments with both faculty and fellow entrustment ratings (n=53), 25 (47%) were not concordant, with all but one differing by just one level of entrustment. Fifteen (60%) had faculty ratings with lower entrustment than rated by the fellow.
Conclusions:
Preliminary findings highlight strong perceived value of CGSO EPAs and robust initial uptake in our program. However, practical implementation barriers remain. Ongoing analysis will evaluate engagement trends, concordance between fellow and faculty evaluations, and perceived barriers to implementation.
Figure 1: Entrustable Professional Activity Heatmap: Core EPA Type by Entrustment Level Rating

(P009) MORE THAN A METRIC: SURGEONS IN THE EARLY ROBOTIC LEARNING CURVE REPORT VALUE AND IMPROVED CONFIDENCE FROM FEEDBACK AND TRAINING RECOMMENDATIONS BASED ON OBJECTIVE PERFORMANCE INDICATORS (OPIS)
Gretchen P Jackson, MD, PhD, FACS, FACMI, FAMIA1, Jeffrey Voien2, Karlis Draulis2, Andrew Yee, PhD2, Michael M Awad, MD, PhD, MHPE, FACS3; 1Intuitive Surgical / Vanderbilt University Medical Center, 2Intuitive Surgical, 3Washington University, Department of Surgery and Institute for Surgical Education (WISE)
Introduction: Providing objective, scalable feedback to new robotic surgeons is a critical challenge in surgical education. Traditional case observation is subjective and resource intensive. We hypothesized that objective performance indicator (OPI) reports, derived from robotic system data and delivered early in the learning curve, would be perceived by surgeons as valuable and effective educational tools.
Methods: A mixed-methods prospective study of robotic-naïve, practicing surgeons was conducted. Participants received reports detailing four case-level OPIs (clutch and camera metrics) after their 5th, 15th, and 25th robotic cases. Reports included percentile scores benchmarked against peers matched by training and experience. Surgeons scoring below the 50th percentile received training recommendations (e.g., simulation, video review). Perceptions were evaluated with qualitative surveys.
Results: 110 surgeons were enrolled; 107 completed the study, and 66 responded to surveys. OPI reports and recommendations were well received. Most surgeons agreed (agree/strongly agree) that reports provided useful information about technical skills (84%), identified areas for improvement (83%), and would positively impact their practice (80%). 88% reported that it was important to receive OPI reports in training pathways, with the desired frequency of annually (6%), quarterly (67%), monthly (20%), twice monthly (5%), and daily (2%). 84% reported wanting to receive training recommendations, with most agreeing that recommendations were valuable (82%) and improved technical skills (80%), efficiency (78%), and confidence (75%). Desired frequency for recommendation delivery was quarterly (62%), monthly (22%), twice monthly (11%), weekly (2%) and daily (2%). 45% of surgeons receiving recommendations completed them, citing time (53%) and simulation access (25%) as barriers. No significant differences in OPI improvement were found between participants who completed recommendations and those who did not. Qualitative themes revealed a desire for objective data but a need to translate metrics into actionable practice.
Conclusions: This study demonstrated that OPI-based feedback and training recommendations were highly valued and effective educational tools for practicing surgeons on the early robotic surgery learning curve. A primary driver of this educational benefit, which improved confidence, is likely the delivery of objective, benchmarked feedback, rather than the completion of remedial tasks. This "learning loop" model represents a scalable paradigm for continuing surgical education.
(P010) WE CAN DO BETTER THAN SUMMING OR AVERAGING OSATS: A NOVEL COMPOSITE SKILL METRIC TO CORRECT FOR UNEQUAL ITEM WEIGHTS, RATER EFFECTS, AND OPERATIVE DIFFICULTY.
Ryan Chou1, Alexandra J Berges2, Kofi O Boahene2, Jessica H Maxwell3, John R Wanamaker4, Patrick J Byrne5, Ira D Papel2, Theda C Kontis2, Matthew S Holden6, Gregory D Hager1, Sonya Malekzadeh4, Lisa E Ishii2, S. Swaroop Vedula1, Masaru Ishii2; 1Johns Hopkins University, 2Johns Hopkins University School of Medicine, 3University of Pittsburgh Medical Center, 4MedStar Georgetown University Hospital, 5Cleveland Clinic, 6Carleton University
Introduction
Surgeons’ operating room skill is often evaluated using the Objective Structured Assessment of Technical Skills (OSATS) scale, with seven items each rated 1-5. Conventionally, OSATS scores are calculated as the sum or average of the item scores, implicitly assuming equal item weights. Although average OSATS correlates with short-term postoperative outcomes and with surgeons’ training or experience, the assumption of equal item weighting is unsubstantiated, and averaging does not correct for rater effects or operative difficulty. As a result, conventional scores are less useful for monitoring skill learning, entrustment, privileging, and credentialing. Our objective was to address these limitations with a novel composite metric of surgeons’ skill derived from OSATS.
Methods
We used data from a multisite prospective cohort study of residents performing nasal septoplasty. After each procedure, the supervising attending assessed the resident’s skill using a modified OSATS scale, three questions on operative difficulty, and three intraoperative errors (“elevating the flap in the wrong plane”, “flap tear(s)”, and “leaving residual deflection”). We fit structural equation models to estimate a latent skill measure from OSATS item scores and latent operative difficulty from the three questions. We simultaneously estimated random effects for rater effects. This allowed adjusting the latent skill score for both operative difficulty and rater effects. We fit multivariate probit regressions to predict the probability of any intraoperative error from the adjusted latent skill score and average OSATS. We used scatterplots to describe the probability of intraoperative error.
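The probit link in these error models maps a linear predictor to a probability through the standard normal CDF. A minimal sketch of how a slope coefficient translates into error probability; the intercept value below is hypothetical (the abstract reports only the slope coefficients):

```python
from math import erf, sqrt

def probit_probability(intercept, coef, skill):
    """P(error) = Phi(intercept + coef * skill), Phi = standard normal CDF."""
    z = intercept + coef * skill
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))
```

For example, with a hypothetical intercept of -1.0 and the reported flap-plane slope of -0.7364, raising the adjusted latent skill score from 0 to 1 lowers the predicted error probability from roughly 0.16 to roughly 0.04, illustrating how a negative coefficient means "more skill, fewer errors."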
Results
We analyzed data from 41 trainees assessed by seven faculty raters in 188 procedures. The latent skill score had probit regression coefficients of -0.7364, -0.1691, -0.3301 for the three errors, respectively, i.e., as skill improved, the probability of error decreased. The coefficients for OSATS average were -0.0842, 0.2398, 0.1972, i.e., the probability of error may increase as skill improved. The latent skill score better discriminated probability of error than OSATS average (Figure 1).
Conclusions
A latent, weighted skill score from OSATS items, adjusted for rater effects and operative difficulty, predicts intraoperative error. Averaging OSATS is insufficient to model the probability of intraoperative error.

(P011) ROBOTIC SIMULATION AS A MARKER OF RESIDENT PROGRESSION IN GENERAL SURGERY
Vikram Krishna, MD, Drew Bolster, MD, Raffaele Rocco, MD, Philicia Moonsamy, MD, Harmik J Soukiasian, MD, Farin Amersi, MD, Andrew R Brownlee, MD; Cedars-Sinai Medical Center
Introduction:
Competency-based assessment in general surgery training is lacking. Robotic simulation provides an opportunity to use objective metrics to assess proficiency. Our study evaluated robotic simulator performance as a marker of resident progression.
Methods:
All PGY-1 to PGY-3 general surgery residents at a single academic institution were enrolled in a curriculum-based robotics simulation program. Robotic skills data were prospectively collected from August to October 2025. Residents’ total number of attempts, mean composite score, proportion of average scores >75 per skill (“competent performance”), and proportion of average scores >90 (“high performance”) were recorded. The primary outcome was mean composite score progression by PGY year. Secondary outcomes included competent performance and high performance by PGY year.
Results:
A total of 11 residents were included. Across all levels, 806 simulation attempts were recorded. Mean composite scores significantly increased with each PGY level (PGY-1: 61.6 vs PGY-2: 71.9 vs PGY-3: 82.1, p=0.048). The percentage of attempts with competent scores (>75) and high-performance scores (>90) also significantly increased with each PGY year (competent: 9.4% vs 15.1% vs 20.1%, p<0.001; high-performance: 10.8% vs 17.9% vs 23.1%, p<0.001). PGY-1 and PGY-2 residents also spent more time on the robotic console than PGY-3 residents (9 vs 9 vs 5 min; p=0.002).
Conclusions:
Robotic simulation data provide an objective method to assess resident performance and engagement. Our results show progressive improvement across PGY levels. Integration of simulator analytics into residency curricula may enhance competency-based training and promote data-driven feedback in surgical education.

(P012) A SIMULATION-BASED LAPAROSCOPIC VENTRAL HERNIA CURRICULUM IMPROVES SURGICAL RESIDENTS’ TECHNICAL PERFORMANCE
Sangrag Ganguli, MD1, Kristine Kuchta, MS2, Colin Johnson, MD2, Syed A Mehdi, MBBS2, Aram Rojas, MD2, Alessia Vallorani, MD2, Arjun Thapa Chhetri, BVSc2, Melissa E Hogg, MD2, Stephen Haggerty, MD2; 1University of Chicago Medical Center, 2Endeavor Health
Introduction
Abdominal wall hernias remain a common challenge for general surgeons. Laparoscopic ventral hernia repair offers lower wound complication rates with similar recurrence compared to open repair, making surgery resident proficiency essential. This study demonstrates the efficacy of a simulation-based laparoscopic ventral hernia repair curriculum at an academic surgical residency program.
Methods
General surgery residents completed a simulation-based module for laparoscopic ventral hernia repair. Junior residents (PGY2-3) participated early in training and repeated the module in their senior years (PGY4-5). Participants provided demographic, prior exposure, and sleep/fatigue data. Residents then completed a pre-test survey assessing confidence in key operative steps. They performed a simulated laparoscopic ventral hernia repair, self-scored their performance, and were concurrently evaluated by an experienced proctor. After a mentored feedback session, residents repeated the simulation with both self- and proctor-assessment.
Results
55 general surgery residents participated in the study – 2 (3.6%) PGY-2, 24 (43.6%) PGY-3, 24 (43.6%) PGY-4, and 5 (9.1%) PGY-5 residents. Post-test scores were significantly higher for both self-evaluation (30.0 vs. 25.1; p<0.0001) and proctor evaluation (31.1 vs. 22.7; p<0.0001), as was resident confidence (43.1 vs. 36.4; p<0.0001). Residents scored themselves higher than evaluators on pre-test (25.1 vs. 22.7; p<0.001), but this difference disappeared on post-test (30.0 vs. 31.1; p=0.09). Evaluators reported greater comfort with residents performing the repair independently than residents’ self-evaluation (4.1 vs. 3.6; p<0.05). Performance on this module did not differ by PGY level, perceived difficulty, self-reported comfort with the procedure, prior hernia repair exposure, or video game experience. Residents with more sleep (> 7 hours) performed better on mesh positioning (4.8 vs. 4.4; p<0.05). Fatigued residents showed greater improvement overall (9.3 vs. 7.3, p=0.04). Qualitative feedback highlighted the operative practice as a strength with the adhesiolysis simulation as the main weakness of the module.
Conclusion
A simulation-based laparoscopic ventral hernia curriculum may be beneficial in improving performance on technical and anatomy-based tasks. Factors such as year in training, prior exposure, fatigue, and perception of difficulty were not associated with performance on the module. Subjective analysis showed the module was especially helpful for practicing key steps of a laparoscopic ventral hernia repair.
(P013) CHARACTERIZATION OF SURGICAL SKILL IMPROVEMENT USING GESTURES IN THE ADVANCED TRAINING IN LAPAROSCOPIC SKILLS (ATLAS) NEEDLE HANDLING TASK
Sofia Garces Palacios, MD1, Sharanya Vunnava, BS1, Shreya Vunnava, BS1, Madhuri Nagaraj, MD2, Kaustubh Gopal1, Daniel J Scott, MD1, Ganesh Sankaranarayanan, PhD1; 1University of Texas Southwestern Medical Center (SSO), 2University of Colorado Anschutz School of Medicine
Introduction
The Advanced Training in Laparoscopic Suturing Skills (ATLAS) is a structured curriculum designed to enhance laparoscopic suturing skills beyond the fundamental level. The needle handling task (Task 1) requires maneuvering a needle through six variably angled holes on a circular model. This study evaluates skill improvement during proficiency-based training of the ATLAS needle handling task using surgical gestures.
Methods
A retrospective video review was conducted using data from an IRB-approved proficiency-based study. Fifteen first-year medical students were randomized into a training group (n = 10) and a control group (n = 5). All participants completed pre- and post-tests. The training group proceeded through Fundamentals of Laparoscopic Surgery (FLS) to proficiency, followed by ATLAS training; the control group received no additional training. Trained independent raters scored all videos using eight predefined needle handling gestures (needle reposition, control, orientation, grasping, withdrawal, motion, force, and trajectory), rated on a 3-point scale (low, average, excellent). Using Messick’s unitary framework, internal structure validity was assessed using the intraclass correlation coefficient (ICC) for inter-rater agreement. A two-way mixed ANOVA with Bonferroni post-hoc tests analyzed between-group performance. A generalized additive mixed model (GAMM) was used to analyze the learning curve.
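The ICC for inter-rater agreement can take several forms; a two-way random-effects, single-rater ICC(2,1) is a common choice for fixed panels of raters and can be sketched directly from the rating matrix. This is an illustrative assumption, since the abstract does not state which ICC form was used:

```python
def icc_2_1(ratings):
    """Two-way random-effects, absolute-agreement, single-rater ICC(2,1).

    ratings: list of n subjects, each a list of k rater scores.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between raters
    ms_r = ss_rows / (n - 1)
    ms_c = ss_cols / (k - 1)
    ms_e = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)
```

Perfectly concordant raters yield an ICC of 1.0; values near the reported 0.98 indicate the graders were nearly interchangeable.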
Results
High inter-rater agreement was achieved between the graders (ICC = 0.98, p < 0.001). Mixed ANOVA showed significant main effects for group (p < 0.001) and time (p = 0.026), with no significant interaction (p = 0.236). Post hoc tests indicated the training group significantly outperformed the control group at both pre- (p = 0.002) and post-test (p < 0.001). Within-group comparisons revealed significant improvement over time in the training group (p = 0.007) but not in the control group (p = 0.535). GAMM revealed a significant non-linear improvement across training trials (p < 0.001), explaining 56.2% of the variance (R2 = 0.56). Scores increased sharply in early trials and plateaued at trial 13, with an overall gain of 30 points in gesture score.
Conclusion
ATLAS training significantly improves needle handling proficiency at the gesture level. Gesture-based assessment offers a sensitive method for tracking surgical skill acquisition across training sessions.

