3D surgical instrument collection for computer vision and extended reality
The data collection consists of five key steps: instrument preparation for scanning purposes, 3D scanning using structured light scanners, post-processing in proprietary software, analysis of the generated models, and model adaptation to create a variety of models based on the originals. Figure 1 provides an overview of our pipeline, which we will now describe in detail.
103 surgical instruments were used to create the dataset. These instruments were kindly provided by the Department of Oral and Maxillofacial Surgery of the University Hospital Rheinische-Westfälische Technische Hochschule (RWTH) Aachen. The study was conducted at the Institute of Artificial Intelligence in Medicine (IKIM) of the Essen University Hospital (AöR). For scanning, the instruments were prepared with AESUB white (Scanningspray Vertriebs GMBH, Recklinghausen, North Rhine-Westphalia, Germany).
Two structured light 3D scanners, the Autoscan Inspec (Shining 3D Corporation, Hangzhou, Zhejiang, China) and the Artec Leo (Artec3D, Senningerberg, Canton Luxembourg, Luxembourg), were used for obtaining the 3D models. Post-processing was done with their respective commercially available proprietary software UltraScan version 2.0.0.7 and Artec Studio 17 Professional version 1.0.141. All models were exported from the proprietary software as Stereolithography (STL) files (Stereolithography Interface Specification, June 1988, 3D Systems Inc., Valencia, California, United States).
Additional visual inspection was done in Microsoft 3D Viewer (Microsoft Corporation, Redmond, Washington, United States) and Blender 3.4.1 (The Blender Foundation, Amsterdam, Noord-Holland, Netherlands). A Python-based Blender add-on was implemented to alter the original models, enabling the creation of a plethora of likewise models based on each original scanned instrument. Likewise models can also be created fully automatically using a Python 3.9 (The Python Software Foundation, Wilmington, Delaware, United States) script. It allows scaling the instruments along the local axes of the original model, or applying different types of smoothing (Taubin, Laplacian and Humphrey algorithms).
All processing scripts and the Blender add-on are included in the data repository30. The original STL models were analysed according to their geometric properties, such as length, width, height and volume. These descriptors, together with all created STL files, are available in the data repository30. The STL models will also be made available on MedShapeNet (https://medshapenet.ikim.nrw)31. Figure 1 shows an overview of all steps involved in producing the data collection.
Out of the 103 surgical instruments, 49 were scanned with the Artec Leo and 55 with the Autoscan Inspec; one instrument was scanned with both scanners. Eleven instruments were scanned in alternative configurations, such as open and closed stances: eight instruments were additionally scanned in an open stance using the Artec Leo, and three instruments were scanned in alternative configurations with the Autoscan Inspec, yielding four supplementary scans. To account for diverse post-processing options, we generated two STL files per scan, representing different settings. The differences between these two settings are explained in the sections “Data acquisition and post-processing with Artec Leo and Artec Studio Professional 17” and “Data acquisition with Autoscan Inspec and UltraScan”. To confirm reproducibility, one instrument was scanned and processed by two different people using the Autoscan Inspec, resulting in one additional scan. Thus, scans using the Artec Leo resulted in (49 + 8) × 2 = 114 STL files. Similarly, scans using the Autoscan Inspec resulted in (55 + 4 + 1) × 2 = 120 STL files. For details on which instruments were scanned multiple times, we refer to the ‘Overview’ Word file within the data repository30.
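The file counts above can be reproduced with a few lines of Python. This is only a bookkeeping sketch; the variable names are illustrative and do not come from the scripts in the data repository.

```python
# Scan counts as stated in the text; all variable names are illustrative.
FILES_PER_SCAN = 2  # two post-processing settings are exported per scan

artec_instruments = 49
artec_extra_configurations = 8      # additional open-stance scans
inspec_instruments = 55
inspec_extra_configurations = 4     # supplementary scans of three instruments
inspec_reproducibility_scans = 1    # same instrument, second operator
scanned_with_both = 1

artec_stl = (artec_instruments + artec_extra_configurations) * FILES_PER_SCAN
inspec_stl = (
    inspec_instruments + inspec_extra_configurations + inspec_reproducibility_scans
) * FILES_PER_SCAN
unique_instruments = artec_instruments + inspec_instruments - scanned_with_both

print(artec_stl, inspec_stl, unique_instruments)  # 114 120 103
```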
Instrument preparation
The instruments used in the department of oral and maxillofacial surgery were divided into 27 classes, see Fig. 2. The instruments can be used for a plethora of interventions and actions. For example, retractors provide a clear view, hammers and chisels apply controlled force, clamps hold blood vessels and tissues, forceps grasp and manipulate these tissues, and dental probes examine teeth and gums.
Nearly all instruments are made of stainless steel, due to its durability and resistance to corrosion. The instruments are also smoothly polished. The reflective nature of polished stainless steel causes scatter and limits the accuracy of point cloud data obtained with structured light 3D scanners. Some instruments further have black handles, which absorb the light emitted by the scanner, thus limiting the obtained data severely.
Therefore, to enable appropriate scanned mesh quality, all instruments were prepared with 3D scanning spray. AESUB white, one of the most used 3D scanning sprays, is easily washable but does not evaporate, and has a layer thickness of approximately 0.007 millimetres. These properties are why we deem this spray suitable.
The Artec Leo has a 3D point accuracy of up to 0.1 millimetres and a resolution of up to 0.2 millimetres, so a couple of spray layers of AESUB white should not negatively influence how closely the scan resembles the real instrument. The Autoscan Inspec, however, has an accuracy of 0.01 millimetres, and spraying was therefore kept to a necessary minimum.
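To make this reasoning concrete, the following sketch compares the nominal layer thickness with the stated scanner accuracies. It assumes, as a simplification, that layers stack linearly and uniformly at 0.007 millimetres each; the function name is hypothetical.

```python
# Nominal values taken from the text; linear stacking of layers is an assumption.
LAYER_THICKNESS_MM = 0.007
SCANNER_ACCURACY_MM = {"Artec Leo": 0.1, "Autoscan Inspec": 0.01}


def layers_within_accuracy(accuracy_mm):
    """Number of whole spray layers whose combined thickness stays within the accuracy."""
    return int(accuracy_mm // LAYER_THICKNESS_MM)


for scanner, accuracy in SCANNER_ACCURACY_MM.items():
    print(scanner, layers_within_accuracy(accuracy))
```

Under these assumptions, more than ten layers stay within the Artec Leo accuracy, whereas a second layer already exceeds the Autoscan Inspec accuracy.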
There are sprays on the market that enable the scanning of an object’s colours or provide a lower layer thickness than AESUB white, for instance AESUB transparent, orange, and yellow. However, AESUB orange offers only a slight reduction of a few micrometres in layer thickness, thus still influencing the scan outcome. AESUB yellow necessitates the use of a spray gun and additional expertise in spraying. In our experience, AESUB transparent does not decrease the surgical instrument’s reflectivity sufficiently to allow it to be accurately scanned with the Artec Leo. We concluded that AESUB white is the most suitable choice for scanning our instruments.
This is especially true since instruments scanned with the Artec Leo required even and fully covering layers of spray. Although the Autoscan Inspec was more adept at handling the reflective, shiny and absorbing properties of our instruments, in our setup, all stainless-steel instruments or parts required a single layer of spray for appropriate scans. In accordance with the manufacturer’s instructions, the scanning spray was always shaken prior to usage and sprayed at a distance of 15–20 centimetres while slowly and steadily moving around the instrument. An example of the results obtained before and after spraying is shown in Fig. 3. The obvious downside of this method is that the original surface texture cannot be captured. We therefore consider the universal STL format, which stores geometry only, appropriate for sharing the models created in this study.
Scanners and post-processing
A desktop computer with an AMD Ryzen 9 5900X 12-core processor and 3,200 megahertz DDR4 RAM, along with an NVIDIA GeForce RTX 3090 graphics card, was used for post-processing, analysing and augmenting the data collection. The 3D point cloud data was obtained using the Artec Leo and Autoscan Inspec structured light scanners. The corresponding software, Artec Studio Professional 17 and UltraScan version 2.0.0.7, was used for post-processing and model generation.
Data acquisition and post-processing with Artec Leo and Artec Studio Professional 17
Artec Leo is a handheld 3D scanner. It utilises a white array of 12 light-emitting diodes (LEDs) as its light source, with an optimal working distance of 0.35–1.2 metres. An accuracy of 0.1 mm + 0.3 mm/m should be obtainable according to the manufacturer.
As a trade-off between accuracy and the desire to have the whole instrument within our field of view during scanning, a scanning distance of 0.5 metres was chosen with the recommended exposure time of one millisecond. To guarantee this distance, the scanner was set to show the distance colour map superimposed on the scanned object while scanning. To minimise error, the scanner was set to only scan the object if tracking was maintained. In this context, tracking refers to the automatic estimation of the relative frame position performed by the scanner during recording. Tracking is based on common surface and texture features. Therefore, a simple background rich in texture features was chosen. Recording was performed with 60 high-definition frames per second. Artec Leo also allows recording additional texture frames, and the combined setting was used to record a total of 65 frames per second. Scanning was done by slowly and smoothly encircling the stationary instrument, changing the angle of the scanner in relation to the instrument throughout all positions in the circle. The actual scanning rate varied because Artec Leo records only while tracking is accurate. Each scan recorded approximately 800–1200 frames, which took less than a minute to acquire.
Each instrument was scanned in two orientations to ensure an all-encompassing capture of the instrument for later post-processing. Both scans of the instrument were imported into Artec Studio 17 Professional with a data density factor of 8 for its artificial intelligence-powered enhanced (HD) reconstruction. This is a feature from the manufacturer to increase the number of data points and reduce the noise within the scan. These are also the maximum recommended settings for our hardware setup.
Figure 4 gives an overview of the steps for post-processing objects scanned with the Artec Leo and its proprietary software. After importing the recording and HD reconstruction, the global registration feature was used, and a region of interest was cropped out with the editor tool. The global registration converts all surfaces created from single frames to a single coordinate system. To do so, the software selects geometry and texture points within a frame and matches these points to the other frames. The software then tries to minimise the mean differences between these points. Unfortunately, the exact algorithm is not publicly disclosed by the manufacturer. Frames with an error distance greater than 0.3 millimetres were not used to generate surface models. To ensure accurate registration, a key-frame ratio of 0.5 was used for geometry and texture-based registration, which looks for features within areas of five square millimetres. The key-frame ratio determines the fraction of frames the software utilises for the registration on a scale of zero to one.
Semi-manual alignment was then performed (Fig. 4b), followed by an additional global registration to compensate for potential earlier mismatches based on features that had since been deleted. The background, noise and artefacts were manually removed from both scans using the editor tool. Visual inspection was conducted, and if necessary, additional semi-manual alignment was performed. This was needed in cases where, due to a lack of features on the surgical instrument in combination with its symmetrical shape, the algorithm of the Artec Studio software prioritised background features over instrument features. Subsequently, outlier removal was performed with the 3D-noise level set to three and the accuracy set to 0.2 millimetres, the value the manufacturer recommends based on the scanner’s maximum 3D resolution of 0.2 millimetres. Outlier removal was performed by calculating the mean distance and standard deviation between neighbouring surface points. Surface points whose values fell outside an interval defined by the mean and standard deviation of all neighbourhood points were classified as outliers and removed from the scene automatically. The 3D-noise level, which multiplies the standard deviation of neighbourhood points, controls outlier assignment: a higher 3D-noise level reduces the number of identified outliers. Notably, our earlier manual point removal stage contributes to decreased noise on our surfaces. However, the specific process details remain undisclosed by the manufacturer, Artec3D.
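Since the exact Artec Studio procedure is undisclosed, the outlier criterion can only be illustrated with a generic statistical outlier removal that matches the description above. The following minimal sketch is not the manufacturer’s algorithm; the function names, the neighbourhood size `k` and the brute-force neighbour search are assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def mean_neighbour_distance(points, index, k):
    """Mean Euclidean distance from points[index] to its k nearest neighbours."""
    p = points[index]
    distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != index)
    return mean(distances[:k])


def remove_outliers(points, k=3, noise_level=3.0):
    """Keep points whose mean neighbour distance does not exceed
    (global mean) + noise_level * (global standard deviation)."""
    local = [mean_neighbour_distance(points, i, k) for i in range(len(points))]
    threshold = mean(local) + noise_level * pstdev(local)
    return [p for p, d in zip(points, local) if d <= threshold]


# Toy point cloud: a regular 3 x 3 x 3 grid plus one far-away point.
cloud = [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]
cloud.append((10.0, 10.0, 10.0))

print(len(remove_outliers(cloud, noise_level=3.0)))   # far-away point removed
print(len(remove_outliers(cloud, noise_level=10.0)))  # far-away point kept
```

The second call shows the behaviour described in the text: a higher noise level widens the accepted interval, so fewer points are classified as outliers.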
Once both scans of the instrument were aligned and the background and noise were removed, fusion was applied to generate a watertight fusion mesh. Both a sharp and a smooth fusion were performed, resulting in two STL models per scan. Sharp fusion contains a higher level of detail and achieves the maximum 3D resolution of 0.2 millimetres according to the manufacturer. The downside of this method is that potential noise left in the model after steps a–e from Fig. 4 may be intensified. The smooth fusion results in smoother models: despite the target resolution of 0.2 millimetres, the software may remove points to reach a maximum mean point distance of 0.6 millimetres. Additionally, models are automatically smoothed, and since surgical instruments are generally smooth, this might result in an aesthetically more appealing model. Unfortunately, the exact algorithms are not given by the manufacturer.
Fusion was performed with a resolution of 0.2 millimetres, ultra-HD sensitivity, and excluding frames above the maximum error threshold of 0.3 millimetres. In rare cases, the smooth fusion model was manually edited. These cases were the self-retaining retractors and speculums, where fine details and structures were not separated appropriately by the fusion process. In summary, while the sharp fusion models are always presented as is, the smooth fusion models are processed with the in-built smoothing function from Artec Studio, and manual editing in case of small structures which were over-smoothed by the software.
The models were inspected using the Artec Software and inside a rendering environment such as Blender. Eight instruments with movable parts, which may be present in different states in a surgical scene, were selected and scanned in different configurations. For example, some of the mouth gags were scanned with an open and closed state of the instrument. The final STL files are available in the data repository30.
Data acquisition with Autoscan Inspec and UltraScan
Autoscan Inspec is a high-end industrial desktop scanner that is well suited to reverse engineering of small parts and has become well known for its usage in dental applications. It utilises a blue light source and two five-megapixel grey-scale cameras. According to the manufacturer, its accuracy is 0.01 millimetres, resulting in an overall 3D resolution of 0.05 millimetres. Its maximum scanning area is 100 × 100 × 75 millimetres, although scanning the object in multiple orientations does allow for scanning larger objects.
The instrument was attached to the scanner’s robotic table at the end of its arm. The table can rotate 360 degrees and the arm itself can rotate from 0 to 135 degrees, with 50 being level with the desktop table. The scanner has a default path with 10 pre-set rotational positions of the table and arm, leading to 10 frames making up a single scan. If desired, the user can manually change these positions and the number of frames. There is also an “add scan” option, which allows users to scan extra frames in a desired position and add them to the earlier scan sequence. A manual scan path was used only for a few thin, lengthy objects. The default scan path, supplemented by a few additional scans upon visual inspection, was deemed sufficient for the remaining instruments. The scan was edited by removing unwanted data points; then the instrument was rotated 180 degrees and scanned again. This process is called a flip scan. The two scans were either automatically aligned using UltraScan’s proprietary alignment method, or semi-manually aligned if automatic alignment was not possible.
Automatic alignment identifies identical data points in the point clouds of the initial and flip scans and performs the alignment according to them. Given the symmetry of surgical instruments, automatic alignment is challenging and often fails. It was possible in fewer than 10 cases. For the remaining cases, semi-manual alignment had to be performed, introducing potential human error. Alignment success was always assessed by the post-processing operator. Figure 5c1,c2 shows semi-manual alignment, resulting in Figure 5c3. If automatic alignment succeeded, the steps in Figure 5c1,c2 were bypassed and the result in Figure 5c3 was displayed directly. Undesired data remaining in the point cloud after alignment was then removed manually.
A watertight mesh was created from the resulting point cloud. Models were generated as STL twice: once with the ‘remove highlight’ function on, and once without. ‘Remove highlight’ eliminates spikes. Spikes are defined as triangles arising from point cloud data that deviate from the smooth surrounding surfaces. These spikes are often caused by reflective surfaces. Unfortunately, the manufacturer, Shining 3D, does not provide exact algorithmic details for surface model generation.
Three instruments were scanned in different configurations. For example, we scanned a surgical knife in two configurations: one with the protection clip, and another with the protection clip removed.
Instruments not completely captured with two scans, the initial scan and the flip scan, were scanned with three or four scans instead. If the instrument did not fit into the scanner, or multiple additional scans overloaded the PC’s RAM, it was not scanned with the Autoscan Inspec, but the Artec Leo instead. The final STL files are available in the data repository30.
Generating multiple likewise models
Blender add-on
To illustrate how virtual instruments can be easily transformed into multiple likewise instruments, we implemented a Blender Python add-on to assist in this task, see Fig. 6. The add-on builds on the simple deform function in Blender, which allows bending, twisting, tapering and stretching of the 3D meshes, and on basic transformations, including rotation, translation and scaling.
Although these operations impact the entire model’s mesh, users have the option to automatically generate a fitting lattice, Figure 6a1-2, and manually define vertex groups within the lattice, such as those corresponding to handle and tip regions. A lattice is a 3D non-renderable deformation cage. When assigning these vertex groups within the lattice, they become accessible within the add-on. Predetermined groups can be linked to specific operations. This methodology ensures smoother transitions between vertex groups within the final mesh than working with the vertex groups directly.
The user can determine which operation to apply to which group of the lattice, including the minimum and maximum angle, number of units or factor. A step size between the minimum and maximum value can be determined as well. After the user manually creates the vertex group(s) and sets the desired deform and transformation parameters, the add-on automatically creates all models and saves them to the current Blender directory as STL files. It is important to realise that the add-on uses the local axes of the model when specifying which action to perform around a certain axis.
If a factor or angle is zero, or scaling factor is one, the model is not saved since applying the change would not lead to a changed model. The same applies when rotation, scaling or translation operations are applied on the entire mesh. Even though the model’s position and orientation related to the world coordinate frame origin change, the model itself does not.
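The parameter sweep and the skipping of unchanged models can be sketched in plain Python, independently of Blender. This is not the add-on’s code; the function name `sweep` and the example ranges are hypothetical.

```python
def sweep(minimum, maximum, step, identity):
    """Return the values from minimum to maximum (inclusive) in increments of step,
    omitting the identity value that would leave the model unchanged."""
    count = round((maximum - minimum) / step) + 1
    # Rounding guards against floating-point drift such as 1.0000000000000002.
    values = [round(minimum + i * step, 6) for i in range(count)]
    return [v for v in values if v != identity]


# Hypothetical settings: bend angles in degrees and unitless scaling factors.
bend_angles = sweep(-30, 30, 15, identity=0)
scale_factors = sweep(0.8, 1.2, 0.1, identity=1)

print(bend_angles)
print(scale_factors)
```

An angle of zero and a scaling factor of one are dropped from the respective lists, so no STL file would be written for them.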
The add-on also has the option to create, show and save several differently smoothed models, using the subsurface, corrective smooth, Laplacian smooth and normal smooth modifiers. The add-on can also create multiple rescaled and smoothed meshes from all STL files found in a given directory path, e.g., the current Blender directory path. We did not use these functions for the published collection, since we found that a Python script utilising the Trimesh library does this more efficiently and automatically, and since storing all resulting models would take enormous amounts of data storage. Therefore, we deemed it more appropriate to show an example and let potential users run the scripts themselves.
As an example of the Blender add-on usage, a single instrument from each of 12 different classes was modified using the Blender Python add-on. An overview of the settings and results while using the Blender add-on to create examples is given within the used instrument folder in the data repository30. The results for our example instruments can be found within the data repository30.
Python script
We furthermore provide a Python script to apply additional modifications to the instruments within our collection, to enlarge the dataset even further. Using the Trimesh library, this script can smooth the meshes and scale them along their local axes. If we scale one model with factors from 0.5 to 1.5 in steps of 0.1 along all three axes, we can generate 11³ − 1 = 1,330 likewise models. We subtract one because rescaling with a factor of one along all three axes results in an unchanged model. This can be useful as a form of data augmentation for creating deep-learning datasets. The Python script also implements automatic Taubin, Laplacian and Humphrey smoothing on the input models32,33.
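The number of scaling combinations can be verified by enumerating them. The sketch below only counts the factor triples and applies one of them to a single vertex; it does not use Trimesh and is not the script from the data repository. The helper `scale_vertex` is hypothetical.

```python
from itertools import product

# Eleven factors per axis: 0.5, 0.6, ..., 1.5 (rounded to avoid float drift).
factors = [round(0.5 + 0.1 * i, 1) for i in range(11)]

# All (x, y, z) factor triples except the identity, which leaves the model unchanged.
combinations = [c for c in product(factors, repeat=3) if c != (1.0, 1.0, 1.0)]


def scale_vertex(vertex, triple):
    """Scale one vertex, given in the model's local axes, by per-axis factors."""
    return tuple(coordinate * factor for coordinate, factor in zip(vertex, triple))


print(len(factors), len(combinations))  # 11 1330
print(scale_vertex((10.0, 2.0, 1.0), (1.5, 1.0, 0.5)))  # (15.0, 2.0, 0.5)
```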
At the start of the process, the user is prompted to provide a main directory and to indicate whether scaling and smoothing are desired, together with the scaling factors and the number of smoothing iterations, respectively. The script will then search for all STL files within the main directory and its sub-directories and apply the specified scaling and smoothing operations accordingly. The resulting files are exported to newly created folders, which mirror the folder structure of the main directory provided by the user. We provide examples for two instruments in our data repository30, as shown in Fig. 7.
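The mirrored folder structure can be illustrated with the standard `pathlib` module. This sketch only plans the output paths and omits the mesh processing; the function names, the file-name suffix and the folder names are hypothetical and need not match the script in the data repository.

```python
import tempfile
from pathlib import Path


def mirrored_output_path(stl_path, main_dir, output_dir, suffix):
    """Map an STL file under main_dir to the same relative location under output_dir."""
    relative = Path(stl_path).relative_to(main_dir)
    return Path(output_dir) / relative.with_name(f"{relative.stem}_{suffix}.stl")


def plan_outputs(main_dir, output_dir, suffix):
    """Find all STL files below main_dir (recursively) and plan their output paths.
    Note that the '*.stl' pattern is case-sensitive on most Linux file systems."""
    main_dir = Path(main_dir)
    return {
        source: mirrored_output_path(source, main_dir, output_dir, suffix)
        for source in sorted(main_dir.rglob("*.stl"))
    }


# Demonstration inside a temporary directory with one dummy (empty) STL file.
with tempfile.TemporaryDirectory() as tmp:
    main = Path(tmp) / "instruments"
    (main / "clamps").mkdir(parents=True)
    (main / "clamps" / "clamp_1.stl").touch()
    for source, target in plan_outputs(main, Path(tmp) / "augmented", "scaled").items():
        target.parent.mkdir(parents=True, exist_ok=True)  # recreate the sub-folder
        print(source.relative_to(tmp), "->", target.relative_to(tmp))
```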
Analysis
All scanned models were made watertight in the proprietary software and visually inspected in Microsoft 3D Viewer. The use of scanning spray and the recommended settings, as described in the sections “Instrument preparation” to “Data acquisition with Autoscan Inspec and UltraScan”, is expected to result in scans with submillimetre precision, as specified by the manufacturers Artec3D and Shining 3D.
Using Python with the Numpy and Trimesh libraries, all scanned models were aligned with a tight enclosing bounding box. The bounding box was automatically oriented to minimize its volume. From this bounding box, the width, height and length in millimetres were calculated. These measurements were compared to the physical model using a flexible millimetre ruler to test if the models were precise on a millimetre level. For this assessment, we utilised the virtual models without the “remove highlight” function (Autoscan Inspec) and models generated with sharp fusion (Artec Leo).
The Trimesh library was also used to calculate the volume of the models. When comparing models from identical scans, i.e., the smooth and sharp fusion, or the scans with and without removing highlights, the average difference is less than 1 millimetre in all directions.
We scanned ‘Arterial Clamp Halsted Mosquito 1’ twice with the Autoscan Inspec and ‘Inspec Mouth Gag Denhart 1’ with both 3D scanners. The deviation between these virtual models was under 1.5 millimetres, suggesting that human error in the manual post-processing contributed less than 1.5 millimetres of error. Individual measurements can be found in “Table 2 MeasurementsOnVirtualModelsUsingTrimeshLibrary” in the data repository30.
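A measurement and comparison of this kind could look roughly as follows. This is a sketch under assumptions, not the analysis script from the data repository: it assumes the STL coordinates are in millimetres, treats the sorted extents of Trimesh’s oriented bounding box as the model dimensions, and uses hypothetical function names and made-up example numbers.

```python
def measure(stl_path):
    """Return the oriented-bounding-box extents (largest first) and the mesh volume."""
    # Third-party dependency, imported lazily so max_deviation works without it.
    import trimesh

    mesh = trimesh.load_mesh(stl_path)
    extents = sorted(mesh.bounding_box_oriented.primitive.extents, reverse=True)
    return {"extents_mm": [float(e) for e in extents], "volume_mm3": float(mesh.volume)}


def max_deviation(extents_a, extents_b):
    """Largest absolute difference between two sets of dimensions, compared size-ranked."""
    ranked_a = sorted(extents_a, reverse=True)
    ranked_b = sorted(extents_b, reverse=True)
    return max(abs(a - b) for a, b in zip(ranked_a, ranked_b))


# Made-up dimensions of two scans of the same instrument, for illustration only.
first_scan = [150.2, 30.1, 8.0]
second_scan = [8.4, 149.0, 30.0]
print(round(max_deviation(first_scan, second_scan), 3))
```

With real data, the lists passed to `max_deviation` would come from `measure(...)["extents_mm"]` for each of the two STL files being compared.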