Podium IB - Education Technology
(S014) AUGMENTED REALITY-ENHANCED CADAVER SURGICAL TRAINING FOR HEAD AND NECK SURGEONS: A PROOF-OF-CONCEPT STUDY
Sohei Mitani, MD, PhD1, Masaharu Isshiki2, Kayo Sakamoto3, Eriko Sato1, Yuki Irifune1, Yuki Hosokawa1, Koji Kinoshita2, Naohito Hato1; 1Department of Otolaryngology-Head and Neck Surgery, Ehime University Graduate School of Medicine, 2Graduate School of Science and Engineering, Ehime University, 3Division of Head and Neck Surgery, Shizuoka Cancer Center
Purpose: Surgical proficiency is a critical factor influencing postoperative complications and treatment outcomes. However, ensuring sufficient surgical training hours has become increasingly challenging due to advancements in medical technology, concerns about patient safety, and work-hour restrictions for trainees. To address these challenges, we developed an augmented reality (AR)-enhanced cadaver surgical training (CST) system to supplement traditional training methods and conducted a proof-of-concept study.
Methods: Neck dissection, a foundational procedure for head and neck surgeons, was divided into nine specific steps. For each step, we prepared written instructions, surgical videos, schemas, and 3D models. The 3D models were based on computed tomography images of a patient with normal neck anatomy, after obtaining informed consent. In collaboration with the engineering department, we developed an application that projects these materials as AR content onto the real world via a transparent head-mounted display (HMD). This allowed surgeons to access educational materials in real time during CST, and the AR content could be manipulated freely using eye tracking and fingertip gestures without contaminating their hands. Otolaryngology–head and neck surgeons performed neck dissections on cadavers using this AR-enhanced system, followed by a questionnaire survey. The questionnaire employed a 5-point Likert scale to evaluate various aspects of the training experience.
Results: Fourteen surgeons, with 4 to 18 years of postgraduate experience, participated in the AR-enhanced CST and responded to the survey. The overall usefulness of the AR-enhanced CST was rated highly, with an average score of 4.79±0.43 out of 5. The additional educational value provided by the AR content in CST was rated a perfect score of 5 by all participants. The ease of use of the AR content received a score of 4.07±0.92, and the burden of viewing content through the HMD was relatively low, with a score of 1.57±0.76 out of 5.
Conclusion: We developed an AR-enhanced CST system for neck dissection. This proof-of-concept study demonstrated the potential of the AR-enhanced system to complement and improve traditional surgical education effectively.
(S015) FORMAL LEADERSHIP CURRICULUM IN UNITED STATES SURGICAL RESIDENCY PROGRAMS: A SYSTEMATIC REVIEW
Charlotte F Wahle1, Michelle Ryder1, Rania Berkane2, Yifan Mao1, Christian de Virgilio, MD3, Shahrzad Bazargan, PhD2; 1David Geffen School of Medicine, University of California, Los Angeles, 2Charles R. Drew University of Medicine and Science, 3Department of Surgery, Harbor-UCLA Medical Center
Background
Leadership skills are critical for success as a surgical trainee and attending. However, there is a lack of structured leadership training curricula in surgical residency training programs. Many surveys and needs assessments have been conducted on this subject, but implementation lags behind. The objective of this study was to assess the current state of the literature on formal leadership curricula in surgical residency programs in the United States.
Methods
A systematic literature search was performed using the PRISMA guidelines. The search was conducted via PubMed, Embase, and Cochrane Reviews. Inclusion criteria consisted of all original research articles that discussed the implementation of a formal surgical leadership curriculum in ACGME-accredited residency programs. Exclusion criteria included non-surgical residencies, leadership training for medical students or faculty, non-U.S.-based residency programs, and needs-assessment-based research.
Results
A total of 199 studies underwent screening. Eleven of those studies (5.53%) met the final inclusion criteria. The years of publication ranged from 2004 to 2024. Of the 11 relevant studies, there were no randomized controlled trials. Ten of the included studies (90.9%) were prospective cohort studies, and the remaining study was a retrospective cohort study. A total of 297 residents were included across nine of the studies (two of the 11 studies did not disclose how many residents participated). The length of the curricular interventions ranged from 3 weeks to 5 years. The most common forms of curricula were formal leadership lectures (including free-standing lectures and leadership-focused grand rounds speakers), journal clubs, and individualized leadership mentoring/coaching. Seven of the 11 formal leadership programs (64%) were modeled on literature, programs, or experts from other industries; however, these models varied widely, and none was surgical in nature. Curricular impact was measured via faculty feedback, rates of attrition, and surveys/feedback from the residents following the intervention.
Conclusion
Following a thorough review of the literature, it was concluded that formal leadership curricula remain sparse in US surgical residency programs. A select few residencies have begun to implement such programming, and these programs have largely been met with success and positive reactions from participants.
(S017) EXPLORING THE EFFICACY OF CHATGPT IN SIMULATION-BASED MEDICAL TRAINING FOR SURGICAL HISTORY TAKING
Cathleen A McCarrick, BA, MB, BAO, BCh, MCh, MPhil, Philip D McEntee, MB, BAO, BCh, Patrick D Boland, MB, BAO, BCh, Suzanne Donnelly, Helen Heneghan, MB, BAO, BCh, PhD, FRCS, Ronan A Cahill, MB, BAO, BCh, MD, PhD, FRCS; University College Dublin/ Mater Misericordiae Hospital
Introduction
Simulation-based medical training (SBMT) is increasingly recognized for its effectiveness in enhancing clinical skills among undergraduate students. Accompanying this trend is a growing interest in integrating artificial intelligence, particularly ChatGPT, to further enrich educational experiences. This study aims to explore the potential of AI in SBMT by assessing the impact of a program utilizing ChatGPT on the competency of senior clinical students in surgical history taking.
Methodology
With institutional ethical approval, the use of ChatGPT was implemented during our undergraduate clinical surgery module. Students were divided into two groups through randomized cluster sampling, with all participants undergoing a baseline patient history-taking assessment with a simulated patient (SP), evaluated by an independent blinded senior clinician. Control students continued with standard clinical experiential learning, while intervention group students engaged in directed sessions with ChatGPT. In these sessions, students took on the role of the doctor, while ChatGPT simulated the patient, using a provided synopsis of the surgical issue. Following this, session information was sent to a surgical tutor to ensure accuracy. Both groups underwent repeat assessments of observed SP history-taking by an independent blinded senior clinician within the same week. Communication skills were scored using an externally validated scoring rubric. Additionally, intervention group students completed surveys post-study.
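For illustration only, the sketch below shows how a chat model could be prompted to role-play the simulated patient from a case synopsis; this is not the authors' implementation (the study may have used the ChatGPT interface directly), and the model name, prompt wording, and synopsis are assumptions.

```python
# Hypothetical sketch only: prompting a chat model to role-play a simulated
# surgical patient from a case synopsis. Model name, prompt, and synopsis are
# illustrative assumptions, not the study's materials.
from openai import OpenAI

client = OpenAI()  # assumes an API key is configured in the environment

case_synopsis = (
    "45-year-old with 12 hours of right upper quadrant pain after a fatty meal, "
    "with nausea but no fever or jaundice."  # illustrative synopsis only
)

history = [{
    "role": "system",
    "content": ("Role-play a patient attending a surgical clinic. Answer only what "
                "the student asks, in plain lay language, and stay consistent with "
                "this synopsis: " + case_synopsis),
}]

def patient_reply(student_question: str) -> str:
    """Send the student's question and return the simulated patient's answer."""
    history.append({"role": "user", "content": student_question})
    response = client.chat.completions.create(model="gpt-4o", messages=history)
    answer = response.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer

print(patient_reply("What brought you in to see us today?"))
```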
Results
A total of 108 students participated, with 54 assigned to each group. Mean scores were similar between groups at baseline (p > 0.05), but students who underwent ChatGPT sessions demonstrated significantly higher scores at the repeat assessment (p < 0.001).
Conclusion
The findings of this novel study indicate that ChatGPT is a valuable tool for enhancing communication skills within medical training. The significant improvement in scores for the intervention group suggests that AI can effectively simulate patient interactions, providing a safe and flexible environment for students to practice and refine their history-taking abilities. As medical education evolves, integrating AI technologies like ChatGPT opens new avenues for exploration, potentially transforming communication training approaches. Continued research is essential to fully understand the implications and applications of AI in medical education, paving the way for innovative methodologies that prepare students for the complexities of real-world clinical interactions.
(S018) A HIGHER CALIBER APPROACH TO ORAL BOARDS PREPARATION DURING RESIDENCY
Adam M Wegener, MD, MS1, Justin D Faulkner, MD1, Austin T Coale, BS2, Sarah S Fox, MD, FACS1, Frank DiSilvio, MD3, Laura R Brown, MD, PhD, FACS3, Brian R Smith, MD, FACS4, Luke R Putnam, MD, MS, FACS5; 1Novant Health New Hanover Regional Medical Center, 2University of North Carolina at Chapel Hill School of Medicine, 3University of Illinois College of Medicine Peoria, 4University of California, Irvine, 5University of Southern California
Introduction:
Most general surgery graduates report feeling under-prepared for the American Board of Surgery (ABS) Certifying Exam. Residency programs have unequal access to dedicated faculty and oral board curricula. Thus, a group of surgeons developed an online platform to provide on-demand, SCORE-based mock oral board scenarios with trained, Board-certified examiners. The purpose of this study was to evaluate the perceived effectiveness of residency partnerships with the platform for oral board preparation.
Methods:
Three ACGME-accredited general surgery residency programs partnered with the platform to provide oral board preparation for senior residents from 2023-2024. A 6-question survey was administered to four cohorts: current residents, past residents, program directors, and coordinators. A 5-point Likert scale was used. Descriptive statistics were performed to assess overall satisfaction and perceived utility of the platform.
Results:
A total of 56 participants (93%) completed the surveys. All four groups felt the online platform made a significant impact on oral board preparation and was superior to their program's previous curriculum, and all were very likely to recommend the platform to colleagues (Table). Among residents, session recordings and external examiners were rated as very important (4.0 ± 1.1 and 4.5 ± 1.0, respectively). Program coordinators unanimously felt the platform was “extremely helpful” in saving time and reducing stress associated with facilitating mock orals.
| | Current Residents (n=39) | Past Residents (n=11) | Program Directors (n=5) | Program Coordinators (n=4) |
|---|---|---|---|---|
| Overall Impact | 4.7±0.5 | 4.4±0.7 | 4.2±0.4 | 4.8±0.3 |
| Better Curriculum | 4.6±0.6 | 4.1±0.7 | 4.4±0.9 | 5.0±0.0 |
| Recommend to Others | 4.6±0.6 | 4.4±0.7 | 4.4±0.5 | 5.0±0.0 |
Conclusion:
Residency partnerships with the online platform focused on high-fidelity oral board preparation were perceived by residents and program leadership to be very effective and more comprehensive than previous curricula. Residents valued session recordings and external examiners while coordinators benefited from the platform’s assistance with providing content and examiners. Further study of these partnerships to assess longitudinal outcomes, including ABS Certifying Exam pass rates, is warranted.
(S019) ASSESSING RESIDENT COMPETENCY IN ROBOTIC CHOLECYSTECTOMY WITH CASE TIMELINES
Hoover Wu, MD, Joshua Ng, MD, Kulmeet Sandhu, MD, Miguel Burch, MD, Farin Amersi, MD, Yufei Chen, MD; Cedars-Sinai Medical Center
OBJECTIVE
Robotic surgery has been widely adopted across general surgery specialties. However, more tools are needed to track trainees' stepwise progression and proficiency. We aimed to utilize robotic console data to assess resident competency in robotic cholecystectomies.
METHODS
We reviewed console timelines of surgical residents involved in robotic cholecystectomies. The console data included the types of instruments utilized, docking time, and instrument exchanges. Clip insertion was used as a marker for having obtained the critical view. Data points included how long residents performed the dissection before placing the first clip (preC), the amount of time to remove the gallbladder after the last clip was placed (postC), who placed the clips, and the number of handoffs to the attending.
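As a rough illustration of how such timeline metrics could be derived, a minimal sketch follows; this is an assumed implementation, not the authors' analysis pipeline, and the event fields are hypothetical.

```python
# Assumed sketch, not the authors' pipeline: deriving preC, postC, clip timing as a
# percentage of case completion, and handoff counts from a console event timeline.
# The event fields below are hypothetical.
from dataclasses import dataclass

@dataclass
class ConsoleEvent:
    time_min: float   # minutes from console start
    operator: str     # "resident" or "attending"
    event: str        # e.g. "instrument_exchange", "clip", "handoff"

def timeline_metrics(events: list[ConsoleEvent], console_end_min: float) -> dict:
    """Summarize one case timeline, using clip insertion as the critical-view marker."""
    clip_times = [e.time_min for e in events if e.event == "clip"]
    first_clip, last_clip = min(clip_times), max(clip_times)
    return {
        "preC_min": first_clip,                     # dissection time before first clip
        "postC_min": console_end_min - last_clip,   # time from last clip to gallbladder removal
        "first_clip_pct": 100 * first_clip / console_end_min,
        "last_clip_pct": 100 * last_clip / console_end_min,
        "clip_operators": {e.operator for e in events if e.event == "clip"},
        "handoffs_to_attending": sum(e.event == "handoff" for e in events),
    }
```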
RESULTS
A total of 39 case timelines were assessed from senior (year 4-5, 29 cases) and junior (year 1-3, 10 cases) residents. Fifty-five percent were elective cases. The average console time was 59.8±23.0 mins, with the first clip placed at a mean of 38.7±10.7% and the last clip at 57.2±14.8% of case completion. The critical view was achieved in 97.4% of cases. Residents were active on the console for 46.2% of case time, with 22.7% during preC and 23.7% during postC. Clips were placed by the resident alone in 48.7%, the attending alone in 20.5%, or both in 30.8% of cases. Senior and junior residents spent a similar percentage of time active during postC (24.5±11.5% vs. 21.0±13.8%, p=0.50); however, senior residents were more active during preC (26.4±13.2% vs. 9.0±14.0%, p=0.02). Senior residents had a higher total percentage of operating time on the console than junior residents (50.6±17.9% vs. 34.7±23.5%, p=0.045). Senior residents placed clips independently 62.1% of the time compared with 10% for junior residents. Handoffs to the attending did not differ between resident groups during postC.
CONCLUSIONS
Senior residents have more active console time overall than junior residents, which may indicate increased operative autonomy. Robotic case timelines in cholecystectomies may be used to assess progressive competency in surgical training.
(S020) APPLICATION OF AI TO REDUCE ADMINISTRATIVE BURDEN DURING THE SURGERY CLERKSHIP: A PILOT STUDY
Swetha Jayavelu, MD, Lureye Myers, MS, Paul Haidet, MD, MPH, Brian Saunders, MD, Afif N Kulaylat, MD, MSc; The Pennsylvania State University College of Medicine
Introduction
Clinical evaluations are an important component of assessing medical student performance. Summative narratives are often used to describe performance on both clerkship rotations and Medical Student Performance Evaluation (MSPE) letters for residency application. Artificial intelligence (AI) provides the potential to reduce the administrative burden associated with this process. Our aim was to incorporate an AI language model to aid in summarizing clinical evaluations during the surgery clerkship.
Methods
We collected clinical evaluations of third-year medical students completing a university hospital-based surgical clerkship from 2023-2024. A prompt was constructed and used with ChatGPT (4.0)™ to summarize the clinical evaluations for each student. The AI-generated summaries were then compared to the corresponding human-generated summaries, which had previously been submitted for each student. For this pilot study, summaries were compared based on general time efficiency and accuracy.
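As an illustration only (not the authors' prompt or code; the model name and prompt wording are assumptions), a minimal sketch of how narrative evaluations might be summarized programmatically:

```python
# Illustrative sketch only: summarizing one student's narrative evaluations with a
# chat-completion call. Prompt wording and model name are assumptions.
from openai import OpenAI

client = OpenAI()

def summarize_evaluations(evaluations: list[str]) -> str:
    """Condense a student's clerkship evaluations into one summative paragraph."""
    prompt = (
        "Summarize the following clinical evaluations of a third-year medical "
        "student on the surgery clerkship into one concise summative narrative, "
        "preserving specific strengths and areas for improvement:\n\n"
        + "\n---\n".join(evaluations)
    )
    response = client.chat.completions.create(
        model="gpt-4",  # the abstract reports ChatGPT (4.0)
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```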
Results
There were 126 students included in the study, of whom 64% were female (n=81). Each student completed a 6-week surgical rotation that comprised a 4-week “general surgery” and a 2-week “subspecialty surgery” experience. AI-generated summaries took an average of 1.6 minutes per student. Based on informal conversations with the educational leadership who completed the human-generated summaries, these took, on average, approximately 15-30 minutes per student to complete. No major errors or misrepresentations were noted in the AI-generated student summaries.
Conclusion
AI-generated summative evaluations were substantially more time-efficient than human-generated evaluations, and appeared to be free of major errors or misrepresentations. This pilot study suggests there may be a role for AI in reducing the manual workload associated with administrative clerkship tasks in medical education, while maintaining accuracy. Future studies will focus on qualitative comparisons between AI- and human-generated summative evaluations, as well as perspectives and preferences among faculty and students.
(S021) POTENTIAL AND PITFALLS: CHATGPT’S PERFORMANCE ON SURGERY SHELF EXAMINATION
Baylee Brochu, BS1, Michael D Cobler-Lichter, MD2, Nikita M Shah1, Jessica M Delamater2, Ana M Reyes2, Talia R Arcieri2, Matthew S Sussman2, Edward B Lineen2, Laurence R Sands2, Vanessa W Hui2, Steven E Rodgers2, Chad M Thorson2; 1University of Miami Miller School of Medicine, 2University of Miami Miller School of Medicine DeWitt Daughtry Family Department of Surgery
Introduction: The recent surge in artificial intelligence (AI)-related technologies presents potentially revolutionary advances in traditional methods of medical education. ChatGPT is one such application of AI that can take free-form text as input and generate a human-like response. This study sought to evaluate ChatGPT’s performance on a simulated surgery shelf exam and assess its potential as a learning tool for medical students.
Methods: Two 50-question tests were randomly selected from the National Board of Medical Examiners’ (NBME) practice surgery shelf exams. ChatGPT (Generative Pretrained Transformer-4o, September 2024) evaluated each question sequentially. Questions with images were excluded. Responses were recorded, and a board-certified general surgeon evaluated each justification. Each justification was graded as having no errors, minor errors that do not significantly impact understanding of the topic, or major errors that significantly impact understanding of the topic.
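For illustration, the sketch below shows how the recorded answers and manual justification grades could be tallied into the proportions reported below; the record structure is an assumption, not the authors' analysis code.

```python
# Assumed analysis sketch, not the authors' code: tallying recorded answers and the
# surgeon's justification grades into overall and per-correct-answer error rates.
from collections import Counter

# one record per text-only practice question, completed after manual grading
records = [
    {"correct": True, "grade": "no_error"},
    {"correct": True, "grade": "minor_error"},
    {"correct": False, "grade": "major_error"},
    # ... one entry per remaining graded question
]

n = len(records)
accuracy = sum(r["correct"] for r in records) / n
grades = Counter(r["grade"] for r in records)
correct = [r for r in records if r["correct"]]

print(f"Accuracy: {accuracy:.1%}")
print(f"Minor errors, all responses: {grades['minor_error'] / n:.1%}")
print(f"Major errors, all responses: {grades['major_error'] / n:.1%}")
print(f"Minor errors among correct answers: "
      f"{sum(r['grade'] == 'minor_error' for r in correct) / len(correct):.1%}")
print(f"Major errors among correct answers: "
      f"{sum(r['grade'] == 'major_error' for r in correct) / len(correct):.1%}")
```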
Results: ChatGPT answered 96.6% of questions correctly. Of all responses, 9.2% contained minor errors and 9.2% contained major errors. Among correctly answered questions, 9.5% contained minor errors, while 6.0% contained major errors; an example is shown in Figure 1. All major errors resulted from incorrect information that was asserted as correct.
Conclusion: ChatGPT demonstrates very high accuracy when assessed on its ability to correctly answer multiple-choice questions that medical students could encounter on a surgery shelf exam. However, caution is warranted when using ChatGPT as an adjunct to traditional education methods. Because 15.5% of ChatGPT’s correct responses contained errors, often confidently asserting incorrect information, students risk learning incorrect information if they use ChatGPT for this purpose without awareness of this limitation.
Figure 1: Surgery shelf practice question evaluated by ChatGPT and ChatGPT's full response, which answers the question correctly but incorrectly implies that an elevated D-dimer is confirmatory for a pulmonary embolism.