The training of future surgeons in virtual operating rooms poses the challenge of objectively evaluating their performance. Such evaluations must be precise enough to support realistic judgements about which skills have actually been mastered and which need improvement, so that tailored educational programs can be built on individual learning curves.

Historically, aspiring doctors learned anatomy through the illicit dissection of unburied corpses, an unregulated practice that even fuelled a trade in bodies for educational purposes. Today, anatomical dissection remains a core component of medical and surgical education globally. However, countries like Italy encounter challenges due to the scarcity of cadavers available for training. The 2020 law on the «disposition of one’s body and tissues post-mortem for study, training, and research» has done little to raise awareness or address these issues.

Contemporary surgical trainees not only study from textbooks and attend lectures but also gain practical experience by assisting senior surgeons in operating theatres, practicing on educational mannequins or anatomical specimens acquired from abroad, and observing autopsies in teaching hospitals. Over the last decade, immersive technologies have complemented traditional educational methods in high-income countries’ surgical training programs, including Italy, where virtual reality first featured in a surgical lesson in 2017.

Virtual reality (VR) entered the medical sphere in the 1990s, initially adopted by senior surgeons for planning complex operations. Since then, VR technology has advanced in graphics and execution speed and can now replicate increasingly realistic virtual environments and sensory experiences (auditory, olfactory, visual, and tactile). These advancements render VR an innovative educational tool in surgical training [source: “The Use of Virtual Reality in Surgical Training: Implications for Education, Patient Safety, and Global Health Equity” – Journal “Surgeries”].


The principal challenge in virtual reality surgical training is the classification of skills acquired by students. These skills are complex, not reducible to straightforward performance metrics, and cannot be effectively captured by a shallow neural network.
From Yunnan Normal University in China comes a sophisticated classification approach using five machine learning algorithms. This technique distinguishes between various levels of neurosurgical skill captured by a VR simulator, allowing for a bespoke final evaluation for each trainee.
Future scenarios include the development of methodologies for tailored assessment of surgical skills acquired in VR, leading to a more prompt and detailed understanding of each student’s behaviour during training. This advancement will enable the creation of customised training programmes for each individual.

Virtual reality surgical training: the benefits

Virtual reality creates a three-dimensional digital realm that accurately mimics the physical environment, immersing the user fully via a VR headset. Likewise, the virtual operating room, powered by surgery-specific VR software with visual and tactile feedback, allows students to navigate as though they were engaged in actual clinical practice.

The 3D imagery from the headset portrays a realistic surgical setting, intricately detailing the patient’s internal anatomy, organ topology, and pathologies, immersing the trainee completely. This setup lets them perceive even the minutest details necessary for simulating surgical procedures, with acute awareness of precise points on the patient’s body, hand positioning, movements, and exerted pressure [source: “Virtual Reality System Helps Surgeons, Reassures Patients” – Stanford Medicine].

One benefit of integrating virtual reality into surgical academic training is the preoperative immersion of students in the patient’s three-dimensional anatomy, coupled with a realistic, albeit virtual, experience of surgical techniques and procedures. This makes the simulation of the surgical act more effective [source: “Research: How Virtual Reality Can Help Train Surgeons” – Harvard Business Review].

An independent study by researchers from the Faculty of Medicine at McMaster University in Ontario, Canada – outlined in “Immersive Virtual Reality for Surgical Training: A Systematic Review” – examined the positive aspects of virtual reality surgical training.

An extensive review of primary studies published in major scientific journals from January 1, 2000, to January 26, 2021, yielded a synthesis of qualitative data and descriptive statistics. Notably, the analysis revealed that «out of 307 students who completed training across four surgical disciplines, those trained in virtual reality completed procedures 18% to 43% faster than their peers in the control group, and achieved higher scores in manual precision».

Virtual reality surgical training: the assessment issue

A notable drawback of virtual reality surgical training involves the analysis of performance indicators to gauge students’ skill levels and, concurrently, the training’s effectiveness. What, then, are “performance indicators” in the context of VR surgical training?

It is worth noting that during simulations, the VR surgical simulator gathers various data related to the surgical procedure itself, including manual skill in instrument handling, movement quality, precision, and speed or, conversely, any slowing, stopping, or potential signs of distress from the student. The aggregate of these indicators dictates the simulation’s outcome and, subsequently, the classification of the competencies acquired by the trainees, which can inform the development of more effective surgical training programmes.

A fundamental consideration, as detailed by researchers from Yunnan Normal University in China in “Personalized Assessment and Training of Neurosurgical Skills in Virtual Reality: An Interpretable Machine Learning Approach” (Virtual Reality & Intelligent Hardware, February 2024 issue), is that:

«Most surgical procedures necessitate a variety of complex psychomotor skills that single parameters cannot effectively evaluate»

Accordingly, the Chinese team advises that to comprehensively assess surgical experience, «multiple parameters must be combined and cross-referenced».

A Canadian study, “Neurosurgical Virtual Reality Simulation Metrics to Assess Psychomotor Skills During Brain Tumor Resection” (International Journal of Computer Assisted Radiology and Surgery, June 2014), highlighted this particularly for oncological neurosurgery, where training evaluation parameters are often multilevel and encompass various aspects, such as the percentage of brain tumor removed during a simulation, the amount of ‘normal’ brain tissue excised, the duration of the procedure, and the cumulative forces applied.

Moreover, the level of safety displayed by students during neurosurgical simulations should also be assessed, as the study group contends that performance in neurosurgery «relies on both psychomotor and cognitive abilities».

The potential of machine learning to discriminate between levels of parameters and skills

In the realm of virtual reality surgical training, the team at Yunnan Normal University points out that machine learning algorithms, leveraging data from VR surgical simulators, can accurately classify students’ acquired skills.

Several studies have explored the use of these algorithms in surgical simulators, for example to evaluate skills in vascular procedures, robotic surgeries, and spinal operations. Particularly notable is the research featured in “Machine learning distinguishes neurosurgical skill levels in a virtual reality tumor resection task” (published in “Medical & Biological Engineering & Computing,” 2020), which the Yunnan Normal University researchers credit as a pioneering exploration of machine learning’s capacity to differentiate between the surgical performances of experts and novices in a VR brain tumor resection task.

Yet, rather than using multiple multi-layered neural networks, this study, like many in its field, opted for a single-layer neural network. This choice «does not facilitate an understanding of the performance related to each participant’s individual features, nor the crafting of a bespoke training plan for each». This remains a critical point.

Thus, the authors of the cited Chinese study, published in the journal Virtual Reality & Intelligent Hardware, undertook the development of five classifiers, each powered by a distinct machine learning algorithm. These classifiers aimed to discern varying levels of neurosurgical skill from performances captured by a VR simulator, focusing on individual students to ensure the most precise and tailored evaluation possible.

In detail, the five algorithms include a decision tree classifier; one based on linear discriminant analysis; a Naive Bayes algorithm; a supervised learning model known as “Support Vector Machines”; and lastly, a K-Nearest Neighbours (KNN) algorithm, all trained using data from a neurosurgical simulator.
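As a rough illustration of how five such classifiers can be compared side by side, here is a minimal sketch in Python with scikit-learn. It is not the authors' code: the data is synthetic (generated to mirror only the study's 79 participants, 15 indicators, and three experience levels), all hyperparameters are library defaults, and the Gaussian variant of Naive Bayes and the default SVM kernel are assumptions, since the article does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 79 rows, 15 features, 3 classes.
# These are NOT the study's measurements, only the same shape.
X, y = make_classification(
    n_samples=79, n_features=15, n_informative=6, n_redundant=0,
    n_classes=3, random_state=0,
)

# One estimator per algorithm family named in the article (default settings).
classifiers = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "lda": LinearDiscriminantAnalysis(),
    "naive_bayes": GaussianNB(),
    "svm": SVC(),
    "knn": KNeighborsClassifier(),
}

def evaluate(classifiers, X, y, folds=5):
    """Return the mean cross-validated accuracy of each classifier."""
    results = {}
    for name, clf in classifiers.items():
        # Scaling matters for SVM and KNN; it is applied to all for simplicity.
        model = make_pipeline(StandardScaler(), clf)
        results[name] = cross_val_score(model, X, y, cv=folds).mean()
    return results

results = evaluate(classifiers, X, y)
for name, accuracy in results.items():
    print(f"{name}: {accuracy:.2f}")
```

Cross-validation is used here because, with only 79 participants, a single train/test split would give an unstable accuracy estimate. The accuracies printed on synthetic data say nothing about the results reported in the study.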

From performance parameters to defining indicators of skill and surgical competence areas

The Chinese study on virtual reality surgical training evaluated the five algorithms by simulating a complete brain tumor removal procedure (involving skull perforation, meningeal cutting, and tumor resection), carried out by 79 surgeons with no prior experience with the VR simulator, who had only received verbal and written briefings on the equipment used.

The surgeons were categorized into three experience levels: ten neurosurgeons who had completed their specialty training, 21 senior neurosurgery trainees, and 48 junior trainees.

The test utilised a neurosurgical VR simulator with visual and tactile rendering capabilities; «the former produced 3D images of brain tissue and surgical tools through the VR headset, while the latter, via force feedback devices, enabled interaction with virtual objects and real-time tactile feedback».

Throughout the test, the simulator collected extensive data on the usage of surgical instruments, including positioning, time of use, and number of actions performed. Also recorded were «the number of meningeal contacts during cranial perforation, prescribed trigger contacts during meningeal cutting, and completed tumor resections».

Analysis of this extensive data allowed the machine learning algorithms to first identify broad neurosurgical performance parameters such as speed, acceleration, distance, variance, and the coefficient of variation of instrument movement. Drawing on existing literature, the researchers then used these parameters to define 91 neurosurgical skill indicators (subsequently reduced to 15), categorized into four competence areas: movement, coordination, safety, and efficiency. These indicators were used to assess the varying skill levels of the participants.

Fifteen final skill indicators, categorized according to three neurosurgical simulation scenarios: Skull Drilling, Meningeal Cutting, and Tumor Resection [source: “Personalized assessment and training of neurosurgical skills in virtual reality: An interpretable machine learning approach” – Virtual Reality & Intelligent Hardware – https://www.sciencedirect.com/science/article/pii/S2096579623000451?via%3Dihub].
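To make the broad performance parameters mentioned above more concrete, the following minimal Python sketch derives distance, speed, acceleration, variance, and coefficient of variation from a sampled instrument trajectory. The function name, the input format (a list of x, y, z positions taken at a fixed interval), and the choice to compute variance and coefficient of variation on speed are assumptions made for illustration. The article does not describe how the simulator or the authors compute these values.

```python
import math
import statistics

def kinematic_parameters(positions, dt):
    """Summarize an instrument trajectory.

    positions: list of (x, y, z) tuples sampled every `dt` seconds
               (hypothetical input format; needs at least 3 samples).
    """
    # Straight-line distance covered between consecutive samples.
    steps = [math.dist(a, b) for a, b in zip(positions, positions[1:])]
    # Speed per interval and change of speed between intervals.
    speeds = [step / dt for step in steps]
    accelerations = [(v2 - v1) / dt for v1, v2 in zip(speeds, speeds[1:])]

    mean_speed = statistics.fmean(speeds)
    speed_variance = statistics.pvariance(speeds)
    # Coefficient of variation: standard deviation relative to the mean.
    speed_cv = math.sqrt(speed_variance) / mean_speed if mean_speed else float("nan")

    return {
        "distance": sum(steps),
        "mean_speed": mean_speed,
        "mean_abs_acceleration": statistics.fmean([abs(a) for a in accelerations]),
        "speed_variance": speed_variance,
        "speed_cv": speed_cv,
    }

# Usage: a tool moving in a straight line at constant speed.
steady = kinematic_parameters([(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)], dt=0.5)
print(steady)
```

A perfectly steady movement yields a coefficient of variation of zero, while hesitant or jerky handling raises it, which is why such a ratio is a plausible ingredient for movement-quality indicators.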

Personalized evaluation via machine learning: insights from the neurosurgical simulation

The classification by the five machine learning algorithms developed by the Yunnan Normal University team successfully identified and quantified the varying skills of three distinct groups of surgeons, differentiated by their training and experience levels. These skills were linked to multiple surgical performance parameters.

This thorough and nuanced analysis promises to pave the way for innovative educational metrics in virtual reality surgical training. The three charts illustrate how, in the models’ classification, most performance indicators correlate linearly with the surgical skill levels of the participant groups.

Trajectories of colored dots depict the varying levels of neurosurgical proficiency among three groups of participants – post-fellowship (blue), junior residents (red), and senior residents (green) – measured by variation coefficients in three simulated surgical scenarios: Skull Drilling, Meningeal Cutting, and Tumor Resection [Credit: “Personalized assessment and training of neurosurgical skills in virtual reality: An interpretable machine learning approach” – Virtual Reality & Intelligent Hardware – https://www.sciencedirect.com/science/article/pii/S2096579623000451?via%3Dihub].

The classification further pinpointed critical performance parameters for individual participants in the neurosurgical simulation, effectively highlighting each person’s performance.

For example, «the twenty-third participant – the team notes – faced challenges with the drill’s movement during cranial perforation, the scalpel’s angle during meningeal cutting, and the left-hand surgical forceps during tumor removal, suggesting targeted areas for future training».

Additionally, «the sixtieth participant showed insufficiently smooth movements during cranial perforation and excessively wide angles. Moreover, the cutting trajectory during meningeal cutting was non-standard, while the final participant failed to achieve the required movement angles, especially during perforation and tumor removal».
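One simple way to picture this kind of per-participant feedback is to compare a trainee's indicator values with those of an expert reference group and flag the ones that deviate strongly. The Python sketch below does this with z-scores. It is a stand-in technique chosen for illustration, not the interpretable machine learning method used in the study, and the indicator names, values, and threshold are all hypothetical.

```python
import statistics

def flag_weak_indicators(trainee, expert_scores, threshold=2.0):
    """Return indicators where the trainee deviates from the expert group.

    trainee:       dict mapping indicator name -> trainee's value
    expert_scores: dict mapping indicator name -> list of expert values
    threshold:     z-score magnitude above which an indicator is flagged
                   (hypothetical cut-off, chosen for illustration)
    """
    flagged = {}
    for name, value in trainee.items():
        reference = expert_scores[name]
        mean = statistics.fmean(reference)
        spread = statistics.stdev(reference)
        if spread == 0:
            continue  # No variation among experts: z-score is undefined.
        z = (value - mean) / spread
        if abs(z) > threshold:
            flagged[name] = round(z, 2)
    return flagged

# Hypothetical indicator names and values, not taken from the study.
experts = {
    "drill_speed_cv": [0.20, 0.22, 0.18, 0.21, 0.19],
    "scalpel_angle_deg": [30, 32, 29, 31, 28],
}
trainee = {"drill_speed_cv": 0.45, "scalpel_angle_deg": 30.5}

flagged = flag_weak_indicators(trainee, experts)
print(flagged)
```

In this made-up example only the drill-speed indicator is flagged, which would point the trainee's next sessions toward the skull drilling scenario rather than meningeal cutting.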

Glimpses of Futures

What has been demonstrated conclusively dispels any doubts about the potential of machine learning techniques to differentiate – and classify – the various levels of skill exhibited by both individuals and groups during neurosurgical simulations in virtual reality. This provides a clear methodology for personalized assessment, which is essential for implementing surgical training in virtual reality. Such assessments are crucial for creating customized educational pathways for surgical trainees, offering more effective solutions than traditional academic teaching models.

Traditional models often necessitate students’ physical presence alongside senior surgeons in operating theatres and involve practice on training mannequins or expensive cadaveric specimens imported from abroad.

With the aim of anticipating possible future scenarios, let us now explore – using the STEPS matrix – the potential impacts of the evolution of personalized competency assessment techniques in virtual reality surgical training across various domains.

S – Social: in the future, the advancement of artificial intelligence techniques, such as machine learning, for personalized competency assessments in VR surgical training will facilitate a more timely and comprehensive understanding of each student’s performance relative to specific parameters and a wide array of skill indicators during simulations. This will be immensely beneficial for the educational frameworks of universities for two main reasons: firstly, it will allow for the creation of precise training plans tailored to the competencies each student has mastered, partially mastered, or failed to master, focusing sequentially on their next educational steps; secondly, it will expedite the training process itself. This acceleration goes beyond merely speeding up processes: it involves swiftly identifying subpar surgical performances and promptly customizing educational interventions to improve them. Moreover, the continuing transition from traditional collective training to individual and “autonomous” training (with VR platforms enabling on-demand training accessible from anywhere at any time) raises the question of whether Italian universities can manage and keep pace with this shift, especially considering the extensive ongoing technological transformation that has challenged them for years.

T – Technological: looking to the future, as VR surgical simulators evolve to simulate increasingly realistic surgical scenarios and patient anatomies with enhanced visual and tactile rendering, the AI algorithms responsible for classifying the skills learned by students during virtual reality training will need to be even more sophisticated. AI techniques that support personalized competency assessments could lead to predictive learning analyses of trainees, thanks to a greater volume of data collected by future simulators. Such analyses would aim to forecast each trainee’s future learning curve, anticipated progress in various surgical skills, and potential challenges or stagnations. This technological progress would serve as a strategic asset, not only by providing early insights into students’ future performances but also by facilitating the design of targeted interventions and proactive support.

E – Economic: from an economic perspective, the impact of virtual reality (VR) surgical training on the finances of Italian universities is overwhelmingly positive when compared to the running costs of traditional anatomy labs. Up until about a decade ago, before the introduction of immersive technologies in surgical education, students in those labs predominantly practiced on cadavers imported from countries like the USA or the Netherlands (where body donation for scientific purposes has never been taboo). These cadavers were subsequently returned at our expense or even purchased at substantial cost. Even today, as mentioned, anatomical specimens derived from cadavers are still acquired from abroad, particularly to allow residents in our surgical schools to practice suturing techniques. Focusing on the cost-benefit analysis of VR surgical training and considering the potential for VR to fully replace, in certain surgical areas, both mannequins and cadaver parts in training aspiring surgeons, the development of techniques for personalized assessment of skills acquired through VR education is crucial both to solidifying this practice and to making it effective, innovative, and cost-efficient.

P – Political: in the future, with the advancement of supporting techniques, Surgical Certification Bodies must unequivocally consider the method of personalized evaluation of skills acquired through VR surgical education. This approach would enable them to verify the competencies of those trained with VR simulators more objectively and efficiently, and to identify truly qualified surgeons, who are valuable to the community. One example in the EU is the European Breast Surgical Oncology Certification, where a panel of expert breast surgeons from across Europe sets «a curriculum of theoretical knowledge and practical skills expected of a qualified breast surgeon practicing within the European Union and the European Economic Area». Another significant example of surgical competency certification in Italy is provided by the Italian Association of Hospital Pulmonologists (AIPO), which offers a technical-practical training pathway specifically designed to certify doctors’ competencies in managing patients with respiratory diseases.

S – Sustainability: from a perspective of social and economic sustainability, the impact of VR surgical training and the personalized assessment of skills learned through this modality is particularly pronounced in low- and middle-income regions, where a shortage of qualified surgeons and limited access to quality training mean that «these tools can significantly empower local professionals, enhance surgical care, and reduce reliance on external interventions». Similarly, the team from the Grossman School of Medicine in New York, in their 2023 publication “The Use of Virtual Reality in Surgical Training: Implications for Education, Patient Safety, and Global Health Equity”, noted that augmented reality surgical training in rural, remote, and resource-limited settings (where access to equipment and internet connectivity is feasible) «could serve as a valuable resource by linking these communities with others that can provide educational support».
