How to Build a Patient-Specific Hybrid Simulator for Orthopaedic Open Surgery: Benefits and Limits of Mixed-Reality Using the Microsoft HoloLens

Orthopaedic simulators are popular in innovative surgical training programs, where trainees gain procedural experience in a safe and controlled environment. Recent studies suggest that an ideal simulator should combine haptic, visual, and audio technology to create an immersive training environment. This article explores the potential of mixed-reality using the HoloLens to develop a hybrid training system for orthopaedic open surgery. Hip arthroplasty, one of the most common orthopaedic procedures, was chosen as a benchmark to evaluate the proposed system. Patient-specific anatomical 3D models were extracted from a patient computed tomography to implement the virtual content and to fabricate the physical components of the simulator. Rapid prototyping was used to create synthetic bones. The Vuforia SDK was utilized to register virtual and physical contents. The Unity3D game engine was employed to develop the software allowing interactions with the virtual content using head movements, gestures, and voice commands. Quantitative tests were performed to estimate the accuracy of the system by evaluating the perceived position of augmented reality targets. Mean and maximum errors matched the requirements of the target application. Qualitative tests were carried out to evaluate workload and usability of the HoloLens for our orthopaedic simulator, considering visual and audio perception and interaction and ergonomics issues. The perceived overall workload was low, and the self-assessed performance was considered satisfactory. Visual and audio perception and gesture and voice interactions received positive feedback. Postural discomfort and visual fatigue obtained a nonnegative evaluation for a simulation session of 40 minutes. These results encourage using mixed-reality to implement a hybrid simulator for orthopaedic open surgery. An optimal design of the simulation tasks and equipment setup is required to minimize user discomfort.
Future work will include Face Validity, Content Validity, and Construct Validity to complete the assessment of the hip arthroplasty simulator.


Introduction
Surgical simulation, a key enabling technique to revolutionize patient care and patient safety, can provide a standardized method for surgical training without the risks that come with operating on real patients [1].
Orthopaedic simulation has generally lagged behind other specialties, with fewer validated simulators available; this trend is now changing and recent studies support the notion that orthopaedic simulators have the potential to translate useful technical skills into the operating theatre [2].
Several techniques of simulation are available today, including virtual reality (VR) simulation, physical simulation, and hybrid (virtual-physical) simulation.
Existing VR orthopaedic simulators are limited by poor haptic feedback. One of the major issues to be addressed is the simplification of the computational models to speed up the interactive simulation without compromising the effective realism of the tissue response [3]. Moreover, conventional haptic interfaces are limited in the magnitude of the forces being rendered, so they do not enable a realistic simulation of the surgical instrument/bone interaction, particularly in open surgery where the interaction forces can be of considerable magnitude. This could explain why, in a recent study [2], Morgan et al. found that commercially available VR simulators are mainly focused on arthroscopy, a minimally invasive procedure.
As for physical simulation, companies like Sawbones [4] offer orthopaedic training models for open surgery procedures such as joint replacement surgery. The strength of these simulators lies in the realism of the synthetic bone, which requires no special handling or preservation and exhibits mechanical properties similar to human bone [5][6][7].
This is very important for a good simulation experience to allow the surgeon to develop a force-feedback memory, which is crucial for the success of a surgical procedure including tasks such as bone drilling. However, standard commercial mannequins lack objective assessment of performance and cover a very limited range of individual differences and pathologies. Patient-specific simulation, a new frontier that promises great benefits for surgical training and rehearsal [8][9][10], can overcome this latter limitation.
As suggested by a literature review on orthopaedic surgery simulation [11], "an ideal simulator should be multimodal, combining haptic, visual and audio technology to create an immersive training environment." Hybrid simulation technologies, which combine VR with physical models of the anatomy, are the best candidate to meet these requirements. Hybrid systems indeed have the advantages of physical simulators, which can mimic the properties of human tissue [12][13][14], offering the trainee the possibility to use actual surgical instruments and experience realistic haptic feedback; at the same time, they exploit the benefit of computer visualization and simulation, also offering objective tools for assessing the surgical performance. Moreover, augmented reality (AR) elements can be added to enrich the synthetic environment, to make hidden structures visible, and to present additional information for guidance during the surgical tasks [10][15][16][17][18][19]. Finally, spatial sound can be added in AR applications to improve the realism of the simulated scenario.
Available display technologies for AR include spatial displays (screen-based and projection-based); hand-held displays (such as phones and tablets); and head-mounted displays (HMDs). HMDs are deemed the most ergonomic solution for applications that include manual tasks performed by the user under direct vision, as happens in open surgery. HMDs indeed intrinsically provide the user with an egocentric viewpoint and allow the user to work hands-free [20].
This work explores the potential offered by mixed-reality (MR) using the HoloLens [21], a head-mounted display designed by Microsoft for MR applications, to develop a hybrid training system with immersive and interactive content.
Hip arthroplasty (HA), which involves replacing a damaged hip joint with a prosthetic implant, was chosen as a benchmark to evaluate the benefits/limits of the proposed system because it is one of the most widely performed procedures in orthopaedic practice [22], and there is a gap in the market for high-fidelity hip replacement training simulators [11].
In a previous work [23], we presented a lower torso phantom for HA including a patient-specific hemi-pelvis replica embedded in a soft synthetic foam. In this paper, we present the HipSim app: an evolution of our former simulator, focusing on the details of the implementation of wearable AR functionalities using the HoloLens. Quantitative and qualitative tests were carried out to perform a preliminary evaluation of our multimodal surgical simulator and to explore advantages and limits of the new design and novel technologies being used.

Materials and Methods
The following paragraphs describe the peculiarities of the adopted HMD; the virtual content and the physical components of the simulator, with details on the implementation/fabrication strategy; the calibration and registration methods to align the VR content with the physical world; and the testing strategies for a preliminary validation of the simulator.

Selection of the Head-Mounted Display.
HoloLens is an Optical See-Through (OST) HMD, which enables optical superposition of virtual content onto the user's direct view of the physical world. Being an OST system, it offers an unhindered and instantaneous full-resolution view of the real environment, which assures that visual and proprioception information is synchronized [24].
By contrast, in Video See-Through (VST) HMDs, the virtual content is merged with the camera images captured by one or two external cameras rigidly fixed on the visor frame. This more obtrusive technology blocks out the real-world view in exchange for the ability to offer higher geometric coherence between virtual and real content, without requiring a user-specific eye-to-display calibration [25]. A complete comparison of OST and VST technologies is reported in [26].
Assuming that for simulation purposes the perceived positioning accuracy of the VR content is not as important as the possibility to give the user a naturalistic experience, we have opted for an OST system. More specifically, the HoloLens was chosen for our application since it provides significant benefits over other commercial HMDs from human factors and ergonomics standpoints [27] and integrates important functionalities for an immersive and interactive simulation experience. In fact, the HoloLens offers head tracking, hand gesture controls, and voice commands and enables binaural audio to simulate effects such as spatial sound within the user environment. Additionally, HoloLens has no physical tethering constraints that can limit the movements/gestures of the user during the simulation of the surgical tasks.
A recent literature study on the evaluation of OST-HMD suitability for mixed-reality surgical intervention [28] shows that Microsoft HoloLens outperforms other currently available OST HMDs (Epson Moverio BT-200, ODG R-7), in terms of contrast perception, task load, and frame rate. The same study shows that the integration of indoor localization and tracking functionalities, enabled by HoloLens environmental understanding sensors, provides significantly less system lag in a relatively motionless scenario.
For all these reasons, HoloLens can be considered a good candidate for the implementation of mixed-reality open surgery simulators. However, some well-known technical issues of HMDs have to be considered, such as a small overlay field of view (FOV); the vergence-accommodation conflict (VAC) [29]; the perceptual issues, intrinsic to standard optical see-through HMDs, due to mismatched accommodation between the virtual content and the real-world scene [30]; and the difficulties of OST systems in handling occlusion between the real and virtual contents [26]. The overlay FOV can be defined as "the region of the field of view where graphical information and real information are superimposed" [26] which, in the HoloLens, is about 35°.
As for the vergence-accommodation conflict, users wearing HoloLens are forced to accommodate their eyes to a fixed focal distance of approximately 2.0 m (Figure 1) to maintain a clear image of the virtual content, while the depth of the virtual objects (and hence the binocular disparity) varies depending on the application.
This results in conflicting information within the vergence-accommodation feedback loops, causing visual discomfort [30].
Moreover, the focal distance of each physical object in the real world depends on its relative distance from the user: if the distance gap between the display focal plane and real-world objects is beyond the depth of field of the human eye, the user cannot keep both the virtual and real content in focus at the same time [20]. The discomfort due to the vergence-accommodation conflict can be reduced by keeping the position of the virtual content stable over time [31,32]. However, the mismatch between the focal distances of real and virtual objects, together with the difficulties in handling the occlusions of overlapping objects, can affect the accuracy of the rendered depth [26].
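The size of this focus mismatch can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: the 2.0 m focal distance is the value quoted above, while the 0.5 m object distance (taken here as a plausible arm's-length working distance) and the 63 mm interpupillary distance are assumed values, and the function names are hypothetical.

```python
import math


def vergence_angle_deg(ipd_m: float, distance_m: float) -> float:
    """Angle between the two lines of sight when fixating a point straight ahead."""
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / distance_m))


def focus_mismatch_dioptres(focal_plane_m: float, object_m: float) -> float:
    """Accommodation demand of the object minus that of the display focal plane."""
    return 1.0 / object_m - 1.0 / focal_plane_m


FOCAL_PLANE_M = 2.0       # fixed focal distance of the display quoted in the text
WORKING_DISTANCE_M = 0.5  # assumption: arm's-length working distance
IPD_M = 0.063             # assumption: typical adult interpupillary distance

mismatch = focus_mismatch_dioptres(FOCAL_PLANE_M, WORKING_DISTANCE_M)
near_vergence = vergence_angle_deg(IPD_M, WORKING_DISTANCE_M)
focal_vergence = vergence_angle_deg(IPD_M, FOCAL_PLANE_M)

print(f"focus mismatch: {mismatch:.2f} D")
print(f"vergence at working distance: {near_vergence:.2f} deg")
print(f"vergence at focal plane: {focal_vergence:.2f} deg")
```

Under these assumptions the eyes converge for an object at arm's length while accommodation is driven towards the farther focal plane, which is the conflict described above.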
For this reason, quantitative and qualitative tests were performed to evaluate if the perceived positioning accuracy matches the requirements of the target application. Moreover, qualitative tests were also performed to evaluate the visual discomfort and the usability of the proposed HMD for our specific scenario: orthopaedic open surgery simulation.

Design and Implementation of the Simulator Components: The Virtual Content.
The development of the simulator starts from the segmentation and surface extraction of the anatomical organs of interest from a real CT dataset (Figure 2). The stack of medical images in DICOM format is processed using a semiautomatic tool, the EndoCAS Segmentation Pipeline [33] integrated in the open source software ITK-SNAP 1.5 [34]. Then, mesh reconstruction and optimization (artefacts removal, holes filling, simplification, and filtering) stages are performed to generate the 3D models of the patient anatomy necessary for the surgical simulation. Optimization stages are performed using the open source software MeshLab [35] and Blender [36]. The bone models included in the present version of the simulator are: hip bones, sacrum, coccyx, and femoral heads. Moreover, a model of pelvis and the principal muscles around the hip joint (such as gluteal muscles, piriformis, inferior gemellus, superior gemellus, obturator internus) are included to increase the anatomical knowledge of the user-trainee and form a solid basis for a complete surgical simulation system. Other key surgical structures to be added for further improving the simulation are fasciae, nerves, tendons, and blood vessels.
Finally, the virtual environment is enriched with information from a simulated planning phase with the 3D Hip Plugin [23]: a pair of viewfinders and a dotted line are added to the virtual anatomical model to show the surgeon the optimal trajectory for the reaming tool. This information, coupled with the real-time tracking of the surgical instrument, could also be used for a quantitative evaluation of the surgical performance on the basis of the deviation of the reaming tool from the optimal trajectory.
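One possible way to quantify such a deviation is sketched below. This is an illustrative sketch rather than the metric implemented in the simulator: it assumes that the tracked tool is described by a tip position and an axis direction, that the planned trajectory is described by an entry point and a direction, and that all quantities are expressed in the same reference frame. The function names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def angular_deviation_deg(tool_axis, planned_axis):
    """Angle between the tool axis and the planned trajectory direction."""
    cosine = _dot(tool_axis, planned_axis) / (_norm(tool_axis) * _norm(planned_axis))
    # Clamp to [-1, 1] to protect acos from floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def distance_to_trajectory(tool_tip, entry_point, planned_axis):
    """Perpendicular distance of the tool tip from the planned trajectory line."""
    offset = _sub(tool_tip, entry_point)
    return _norm(_cross(offset, planned_axis)) / _norm(planned_axis)


# Hypothetical values (millimetres), for illustration only.
planned_axis = (0.0, 0.0, 1.0)
tilted_axis = (0.0, math.sin(math.radians(10.0)), math.cos(math.radians(10.0)))
print(angular_deviation_deg(tilted_axis, planned_axis))
print(distance_to_trajectory((3.0, 4.0, 20.0), (0.0, 0.0, 0.0), planned_axis))
```

The two values (an angle and a lateral offset) are kept separate because a tool can be parallel to the planned trajectory and still be displaced from it.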
Moreover, a selection of radiological images (a hip radiograph, a CT slice, a CT volumetric rendering) (Figure 3) is added to the virtual content, enriching the digital information available to the learner during the simulation.

Design and Implementation of the Simulator Components: The Physical Components.
The development of the physical simulator starts from the CAD design (Figure 4). 3D virtual models are imported in the Creo Parametric 3D Modelling software, and each physical component is designed, including a support for the registration target (an Image Target as described in the following section). This support is rigidly anchored to the bone synthetic replica to guarantee a precise registration of the virtual content to the real scene.
A 3D printer (Dimension Elite 3D Printer) is used to turn the 3D CAD models into tangible 3D synthetic replicas made of acrylonitrile butadiene styrene (ABS). This plastic is commonly used for the manufacturing of bone replicas for orthopaedic surgery simulation since it adequately approximates the mechanical behaviour of the natural tissue [37]. Finally, silicone mixtures and polyurethane materials are used for the manufacturing of the soft parts. The final mannequin includes a replica of the acetabulum embedded in a soft synthetic foam. Moreover, a skin-like covering is provided for an accurate simulation of palpation and surgical incision.

Calibration and Registration of the Virtual and Physical Content.
Calibration and registration procedures must be performed to properly align the virtual content with the real objects. The calibration procedure is necessary to match, intrinsically and extrinsically, the virtual viewing frustum to the user's viewing volume. To perform this calibration, the Microsoft HoloLens includes an official "Calibration" app, which, however, does not offer a complete user-based calibration procedure: it is designed solely to determine the interpupillary distance (IPD) [38]. The registration can be accomplished in real time by tracking the relative position and orientation of the real objects with respect to the rendering camera; this information is then used to update the corresponding transformations within the virtual world.
HoloLens includes a world-facing camera; thus, the optical detection and tracking of a target can be used for real-time registration purposes, with no need for an external tracking system. To this end, in our application, we use the detection and tracking functionalities offered by the Vuforia SDK [39].
More specifically, we employ an Image Target (Figure 5). Image Targets represent images that Vuforia Engine can detect and track at runtime. The Vuforia Engine detects and tracks the features that are naturally found in an image. These features, extracted from the original image, are stored in a preprocessed database, which can then be integrated in the software application. This database can then be used by Vuforia Engine for runtime comparisons. Once the Image Target is detected, Vuforia Engine will track it as long as it is at least partially visible by the camera. The fundamental attributes for an accurate tracking of an Image Target are good contrast, no repetitive patterns, and wealth of details. Moreover, for near-field applications, a physical printed Image Target should be at least 12 cm in width and of reasonable height [39]. For a more detailed definition of Vuforia Image Targets, please refer to the Vuforia SDK [39].

Implementation Details.
On the software side, Unity3D (5.6.1f) was used to create the application (the HipSim app). The MixedRealityToolkit (2017.1.2), a collection of C# scripts and Unity components to develop mixed-reality applications, was utilized for the development of the surgical simulator. This toolkit allows the user to interact with the virtual content by means of head movements (Gaze), gestures (Air Tap, Bloom, etc.), and voice commands (via Cortana). A virtual cursor is added to the application to indicate the head/view direction: this interaction through head movements is called Gaze. The Gaze is estimated from the position and orientation of the user's head, without considering the direction of the user's eyes (since the current version of HoloLens does not include any eye-tracking sensor).
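The geometry behind a head-driven cursor can be sketched as a ray cast from the head position along the head's forward direction. The following is a language-neutral illustration written in Python, not the MixedRealityToolkit implementation (which operates on Unity scene geometry); the function name and the choice of a flat target surface are assumptions made for the example.

```python
def gaze_cursor_on_plane(head_position, head_forward, plane_point, plane_normal):
    """Return the point where the head-forward ray meets a plane, or None.

    None is returned when the ray is parallel to the plane or when the plane
    lies behind the user.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    denominator = dot(head_forward, plane_normal)
    if abs(denominator) < 1e-9:
        return None  # ray parallel to the surface
    to_plane = tuple(p - h for p, h in zip(plane_point, head_position))
    t = dot(to_plane, plane_normal) / denominator
    if t < 0.0:
        return None  # intersection is behind the head
    return tuple(h + t * d for h, d in zip(head_position, head_forward))


# Hypothetical scene: the user looks straight ahead at a surface 2 m away.
cursor = gaze_cursor_on_plane(
    head_position=(0.0, 0.0, 0.0),
    head_forward=(0.0, 0.0, 1.0),
    plane_point=(0.0, 0.0, 2.0),
    plane_normal=(0.0, 0.0, -1.0),
)
print(cursor)
```

Because only the head pose enters the computation, the cursor follows head rotation and is unaffected by where the eyes are looking, which matches the description of Gaze above.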
A Fitbox (a MixedRealityToolkit tool) is used in Unity to anchor the virtual collection of radiological images in the physical space according to the user preferences (Figure 3).
A virtual menu with multiple toggle buttons has been implemented to select the virtual components (pelvis, bones, and muscles; preoperative plan) to be visualized during each surgical task. Figure 6 shows examples of AR images captured by the HoloLens world-facing camera during a surgical simulation trial.
Operating room ambient sound, including voices of surgical staff and sounds of medical equipment, has been included in the HipSim app to improve the realism and immersion of the surgical simulation.

Quantitative Study.
Quantitative tests were performed to estimate the accuracy of the system by evaluating the perceived position of AR targets.
Five (5) subjects (gender: 2 males, 3 females, 0 nonbinary; years of age: 24 min, 32 mean, 39 max, 6 STD) with 10/10 vision were recruited to participate in this study. The HoloLens was used to present four (4) virtual targets consisting of red spheres (0.5 mm radius) virtually located on the acetabulum surface (Figure 7(a)). Targets were designed in the CAD environment and their 3D positions were acquired in the virtual environment reference frame. Figure 7 shows the experimental setup consisting of: (iv) the NDI Aurora electromagnetic tracking system (V2 System); and (v) the NDI Aurora calibrated 6 degrees of freedom (DOF) digitizer.
The mannequin and the Aurora EM emitter were fixed in a stable position to avoid relative movement during the targeting trials. The rigid transformation ${}^{A}T_{V}$ between the Aurora reference system and the virtual environment reference frame was derived with a point-based registration algorithm: the positions of three landmarks (three corners of the simulator) were acquired in the CAD environment; the positions of the same landmarks were then acquired in the Aurora reference system with the digitizer; and then the transformation was derived with a least-squares error minimization algorithm [40]. Finally, the root mean squared registration error (RMSE) and the maximum registration error (MR) were computed and saved. The official HoloLens calibration app was used to calibrate the HMD for each user before the targeting session. The tracking and registration functionalities supported by the Vuforia SDK were used for the real-time registration of the virtual targets and the real mannequin. The Image Target used obtained a 5/5-star rating: the star rating defines how well an image can be detected and tracked using the Vuforia SDK, and this rating is displayed in the Target Manager and returned for each uploaded Image Target via the Vuforia web API. The subjects were asked to use the digitizer to point at the perceived position of the four (4) virtual targets displayed through the HMD (Figure 7). Each target was acquired 3 times by each user, for a total of 12 targeting trials per person (60 in total). Target positions, acquired in the Aurora reference frame, were then expressed in the virtual environment reference frame by means of the ${}^{A}T_{V}$ rigid transformation.
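The point-based registration step can be illustrated with a short sketch. The code below uses the SVD-based closed-form least-squares solution, which is one common choice; whether the algorithm of [40] follows this exact formulation is an assumption. The landmark coordinates are hypothetical, the digitizer readings are simulated from a known transformation, and NumPy is assumed to be available.

```python
import numpy as np


def rigid_registration(source, target):
    """Least-squares rotation and translation mapping source points onto target points."""
    src = np.asarray(source, dtype=float)
    dst = np.asarray(target, dtype=float)
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    # Cross-covariance of the centred point sets.
    covariance = (src - src_centroid).T @ (dst - dst_centroid)
    u, _, vt = np.linalg.svd(covariance)
    # Force a proper rotation (determinant +1) instead of a reflection.
    sign = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    translation = dst_centroid - rotation @ src_centroid
    return rotation, translation


def to_target_frame(points, rotation, translation):
    """Apply the rigid transformation to an array of row-vector points."""
    return np.asarray(points, dtype=float) @ rotation.T + translation


def registration_errors(source, target, rotation, translation):
    """Return (RMSE, maximum error) of the registered landmarks."""
    residuals = np.linalg.norm(
        to_target_frame(source, rotation, translation) - np.asarray(target, dtype=float),
        axis=1,
    )
    return float(np.sqrt(np.mean(residuals ** 2))), float(residuals.max())


# Hypothetical landmark coordinates in millimetres (not measured data).
cad_landmarks = np.array([[0.0, 0.0, 0.0], [300.0, 0.0, 0.0], [0.0, 200.0, 0.0]])
angle = np.radians(30.0)
true_rotation = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                          [np.sin(angle), np.cos(angle), 0.0],
                          [0.0, 0.0, 1.0]])
true_translation = np.array([10.0, -5.0, 2.0])
# Simulated digitizer readings: the same corners expressed in the tracker frame.
aurora_landmarks = (cad_landmarks - true_translation) @ true_rotation

rotation, translation = rigid_registration(aurora_landmarks, cad_landmarks)
rmse, max_error = registration_errors(aurora_landmarks, cad_landmarks, rotation, translation)
print(rmse, max_error)
```

With real digitized landmarks the residuals are nonzero, and the same `to_target_frame` call can be used to express digitized target positions in the virtual environment reference frame.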
Targeting accuracy was measured as the average Euclidean distance between the perceived (digitized) position and the actual position of each target. The maximum and minimum error (Euclidean distance), as well as the standard deviation, were also calculated for each target.

Qualitative Study.
Twenty (20) subjects with 10/10 or corrected (lenses) to 10/10 vision were recruited from technical employees (engineers) and personnel with medical background (medical students, orthopaedic resident surgeons, orthopaedic surgeons) of the University of Pisa (see Table 1 for detailed demographics). The qualitative study includes: subjective workload assessments with the NASA Task Load Index (NASA-TLX) Questionnaire and a Likert Questionnaire to evaluate visual and audio perception, and interaction and ergonomics issues. NASA-TLX is a multidimensional rating procedure that provides an overall workload score, between 0 and 100, based on a weighted average of ratings on six subscales [41]:
(i) mental demands ("How mentally demanding was the task?"),
(ii) physical demands ("How physically demanding was the task?"),
(iii) temporal demands ("How hurried or rushed was the pace of the task?"),
(iv) own performance ("How successful were you in performing the task?"),
(v) effort ("How hard did you have to work to achieve your level of performance?"), and
(vi) frustration ("How insecure, discouraged, irritated, stressed, and annoyed were you?").
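The weighted average behind the overall score can be sketched as follows. The sketch assumes the standard NASA-TLX weighting scheme, in which each subscale weight is the number of times that subscale is chosen across the 15 pairwise comparisons, so the weights sum to 15. The ratings and weights shown are hypothetical and are not data from this study.

```python
SUBSCALES = ("mental", "physical", "temporal", "performance", "effort", "frustration")


def nasa_tlx_overall(ratings, weights):
    """Weighted NASA-TLX workload score on a 0-100 scale.

    ratings: subscale name -> rating in [0, 100]
    weights: subscale name -> tally from the 15 pairwise comparisons (0 to 5)
    """
    if sum(weights[name] for name in SUBSCALES) != 15:
        raise ValueError("pairwise-comparison weights must sum to 15")
    return sum(ratings[name] * weights[name] for name in SUBSCALES) / 15.0


# Hypothetical participant, for illustration only.
example_ratings = {"mental": 40, "physical": 20, "temporal": 30,
                   "performance": 50, "effort": 35, "frustration": 10}
example_weights = {"mental": 4, "physical": 1, "temporal": 2,
                   "performance": 5, "effort": 3, "frustration": 0}
overall = nasa_tlx_overall(example_ratings, example_weights)
print(round(overall, 2))
```

A subscale with weight 0 does not contribute to the score, which is why the weighted score can differ noticeably from the plain mean of the six ratings.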
The NASA-TLX Questionnaire was administered to identify the primary source of workload during the execution of the proposed AR-based simulation and to investigate workload levels of users with differing characteristics ("Profession/Position Held," "Experience with AR" etc.). The Likert Questionnaire, which is reported in Table 2, comprises 14 items, each evaluated using a 5-point Likert scale (from 1 = strongly disagree, to 5 = strongly agree). The experimental setup is depicted in Figure 6. The mannequin was positioned on a fixed height surgical table.
The study protocol for each participant included the following steps: (1) The participant fills out a Consent Form and a Demographic Form (Table 1) including information [...]
Statistical analysis of data was performed using the SPSS® Statistics Base 19 software. Results of the NASA-TLX Questionnaire are summarized in terms of means and standard deviation. Data were processed using the analysis of variance (ANOVA) to examine possible relationships between individual characteristics and workload.
As for the Likert Questionnaire, the central tendencies of responses to a single Likert item were summarized by using median, with dispersion measured by interquartile range.
The Mann-Whitney U test and Kruskal-Wallis test were used to understand whether the answering tendencies (with respect to each Likert item) differ based on "Profession/Position Held" and "Experience with AR"/"Experience with HoloLens". A p value <0.05 was considered statistically significant.
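The descriptive statistics and the rank-based comparison can be illustrated with a small sketch using only the Python standard library. It computes the median and interquartile range of a Likert item and the Mann-Whitney U statistic with average ranks for ties; it does not compute p values, which in the study were obtained with SPSS. The quartile method ("inclusive") and the sample responses are assumptions made for the example.

```python
import statistics


def median_and_iqr(responses):
    """Median and interquartile range of the responses to one Likert item."""
    q1, _, q3 = statistics.quantiles(responses, n=4, method="inclusive")
    return statistics.median(responses), q3 - q1


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic (the smaller of the two U values)."""
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:len(group_a)])
    u_a = rank_sum_a - len(group_a) * (len(group_a) + 1) / 2.0
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Hypothetical responses to one item from two groups (not study data).
engineers = [4, 5, 4, 5, 3]
clinicians = [3, 3, 4, 2, 3]
print(median_and_iqr(engineers + clinicians))
print(mann_whitney_u(engineers, clinicians))
```

Median and interquartile range are used rather than mean and standard deviation because Likert responses are ordinal.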

Quantitative Evaluation Results.
The obtained RMSE and MR are, respectively, 0.6 mm and 0.8 mm. Table 2 reports the accuracy obtained for each target, as well as the maximum error, minimum error and the standard deviation. The maximum error is compatible with values declared by HoloLens developers: Klein G. reported [42] a maximum static registration error <10 mrad, which results in an error of about 5 mm at a distance of 50 cm from the user (the approximate working distance in our setup).
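The conversion from an angular registration error to a linear error at the working distance, and the per-target statistics, can be reproduced with a few lines of code. This is a minimal sketch: the target coordinates are hypothetical (not the values of Table 2), the use of the sample standard deviation is an assumption, and the function names are illustrative.

```python
import math
import statistics


def linear_error_mm(angular_error_mrad, distance_mm):
    """Linear displacement produced by an angular error at a given distance."""
    return distance_mm * math.tan(angular_error_mrad / 1000.0)


def targeting_stats(perceived_positions, actual_position):
    """Mean, minimum, maximum, and standard deviation of the Euclidean errors."""
    errors = [math.dist(p, actual_position) for p in perceived_positions]
    return {
        "mean": statistics.mean(errors),
        "min": min(errors),
        "max": max(errors),
        "std": statistics.stdev(errors),  # assumption: sample standard deviation
    }


# 10 mrad at the 50 cm working distance quoted in the text.
bound_mm = linear_error_mm(10.0, 500.0)
print(f"{bound_mm:.2f} mm")

# Hypothetical digitized positions of one target (millimetres).
stats = targeting_stats([(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)],
                        (0.0, 0.0, 0.0))
print(stats)
```

For such small angles the tangent is practically equal to the angle in radians, so the bound reduces to distance times angle, that is, 500 mm × 0.010 rad ≈ 5 mm.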

Qualitative Evaluation Results.
The average time for the completion of the study was 40 minutes. The overall workload obtained (30.65) can be considered low given that the average overall score observed in the literature for medical tasks is 50.60 (min 9.00; max 77.35) and for computer activities is 54.00 (min 7.46; max 78.00) [43]. The Performance subscale induced the highest workload, indicating overall satisfaction with self-assessed performance. Table 3 summarizes the results of the Likert Questionnaire. Results show no statistically significant differences in answering tendencies between engineers and clinicians, with the exception of the postural discomfort during the application and the ease of aligning the surgical instrument to the AR viewfinders.
As for the postural discomfort, clinicians expressed a neutral opinion (median 3), while engineers agreed that they did not experience postural discomfort (median 4). Moreover, clinicians also expressed a neutral opinion (median 3) regarding the ease of aligning the surgical instrument, while engineers strongly agreed that this task is easy (median 5).
Overall, participants agreed/strongly agreed that the virtual content is correctly aligned to the real objects (median 5), it is easy to perceive the spatial relationships between real and virtual objects (median 5), they did not notice motion of virtual content (median 4), they did not notice latency (median 4), they did not notice jitter (median 4), they did not experience double vision (median 5), they did not notice colour separation (median 5), the field of view is adequate for the application (median 4), the spatial sounds make the experience more immersive (median 4.5), the gesture interaction is easy and intuitive (median 5), and the voice interaction is easy and intuitive (median 4.5). The overall median opinion regarding the experience of visual fatigue is neutral (median 3.5).

Conclusions
As suggested by a recent literature review on orthopaedic surgery simulation [11], "an ideal simulator should be multimodal, combining haptic, visual and audio technology to create an immersive training environment." In this work, we present an innovative multimodal simulation tool, which takes advantage of patient-specific modelling to improve the realism of the simulated surgical case; rapid prototyping for the manufacturing of synthetic models, which guarantees realistic haptic feedback; AR to enrich the simulated scenario and guide the learner during the surgical procedure; and HoloLens functionalities for an interactive and immersive simulation experience.
Results of the quantitative and qualitative studies encourage the usage of HoloLens technology for the implementation of a hybrid simulator for orthopaedic open surgery. The perceived positioning accuracy matches the requirements of the target application. Moreover, the perceived overall workload can be considered low, and subjects participating in this study expressed satisfaction with self-assessed performance. Positive feedback was obtained on visual and audio perception, and on gesture and voice interaction, independently of the level of previous experience with AR and HoloLens and of educational background (medical or technical). As regards postural discomfort during the application and the experience of visual fatigue, the results show a nonnegative opinion for a simulation experience lasting 40 minutes (enough for the specific purposes). More prolonged usage could negatively impact comfort because of an increase in visual fatigue. An optimal design of the simulation tasks and the simulation setup (time for each task, height of the surgical table, distance of user interaction) is required to minimize user discomfort, so that the virtual content appears in the optimal/comfort zone for most of the simulation period, and the head tilt is sustainable. Moreover, attention should be paid to the design of AR viewfinders (optimal shape, colour, transparency level) to ease the alignment task, which is already impaired by the focus rivalry between the physical and virtual content.
Hip arthroplasty, a surgical procedure which could take great advantage of simulation, was selected as a benchmark for this study. Primary and revision total HA were indeed ranked third and fourth among the orthopaedic interventions accounting for the greatest share of adverse events and excess hospital stay [44] and, as shown by several studies [45,46], the risk of complications after HA is strongly related to the surgeon's case volume. In this context, surgical simulation could play a pivotal role, offering novices an opportunity to practice skills outside the operating theatre, in a safe controlled environment.
Future work will include Face Validity, Content Validity, and Construct Validity for a complete assessment of the proposed simulator for this specific orthopaedic intervention. Additionally, in the future, our system could integrate novel haptic equipment able to simulate high-magnitude force feedback. However, in this case, the usage of haptic interfaces will be limited to the simulation of the reamer-bone interactions, whereas the direct interactions between the surgeon's hands and the soft tissue will still be simulated using the current synthetic mannequin.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.