Preoperative evaluation of C2 pedicle screw placement using a deep learning model: Development and validation study

  • Junhao Bao,

    Contributed equally to this work with: Junhao Bao, Wei Wang

    Roles Data curation, Formal analysis, Writing – original draft

    Affiliation Spine Department, Orthopaedic Center, The Affiliated Guangdong Second Provincial General Hospital of Jinan University, Guangzhou, Guangdong, China

  • Wei Wang,

    Contributed equally to this work with: Junhao Bao, Wei Wang

    Roles Formal analysis, Validation, Writing – original draft

    Affiliation Institute for Healthcare Artificial Intelligence Application, Guangdong Second Provincial General Hospital, Guangzhou, Guangdong, China

  • Yuelin Wu,

    Roles Validation

    Affiliations Spine Department, Orthopaedic Center, The Affiliated Guangdong Second Provincial General Hospital of Jinan University, Guangzhou, Guangdong, China, Health Science Center, Jinan University, Guangzhou, Guangdong, China

  • Hao Ren,

    Roles Data curation

    Affiliations Institute for Healthcare Artificial Intelligence Application, Guangdong Second Provincial General Hospital, Guangzhou, Guangdong, China, Health Science Center, Jinan University, Guangzhou, Guangdong, China, Guangzhou Key Laboratory of Smart Home Ward and Health Sensing, Guangzhou, Guangdong, China

  • Zhaoquan Liang,

    Roles Data curation

    Affiliation Second School of Clinical Medicine, Southern Medical University, Guangzhou, Guangdong, China

  • Qiang Xiao,

    Roles Data curation

    Affiliation Second School of Clinical Medicine, Southern Medical University, Guangzhou, Guangdong, China

  • Yeyang Wang,

    Roles Investigation

    Affiliations Spine Department, Orthopaedic Center, The Affiliated Guangdong Second Provincial General Hospital of Jinan University, Guangzhou, Guangdong, China, Health Science Center, Jinan University, Guangzhou, Guangdong, China

  • Fengshi Jing,

    Roles Conceptualization

    Affiliations Institute for Healthcare Artificial Intelligence Application, Guangdong Second Provincial General Hospital, Guangzhou, Guangdong, China, Health Science Center, Jinan University, Guangzhou, Guangdong, China, Guangzhou Key Laboratory of Smart Home Ward and Health Sensing, Guangzhou, Guangdong, China, Faculty of Data Science, City University of Macau, Macau SAR, China

  • Weibin Cheng,

    Roles Supervision, Writing – review & editing

    chwb817@gmail.com (WC); lizhang686@163.com (LZ)

    Affiliations Institute for Healthcare Artificial Intelligence Application, Guangdong Second Provincial General Hospital, Guangzhou, Guangdong, China, Health Science Center, Jinan University, Guangzhou, Guangdong, China, Guangzhou Key Laboratory of Smart Home Ward and Health Sensing, Guangzhou, Guangdong, China, School of Data Science, City University of Hong Kong, Hong Kong SAR, China

  • Li Zhang

    Roles Conceptualization, Investigation, Writing – review & editing

    Affiliations Spine Department, Orthopaedic Center, The Affiliated Guangdong Second Provincial General Hospital of Jinan University, Guangzhou, Guangdong, China, Health Science Center, Jinan University, Guangzhou, Guangdong, China

Abstract

Background

Current preoperative assessment methods for C2 pedicle screw placement face challenges including low consistency, operational complexity, and high skill demands.

Objective

This study aimed to develop and validate a deep learning model for rapid and accurate assessment of C2 pedicle screw placement feasibility.

Materials and methods

We developed C2-Net, an automated deep learning pipeline incorporating an image segmentation module for delineating C2 pedicles in CT images and a screw placement probability assessment module. The model's performance was evaluated using 3D-printed manually placed screws as ground truth and compared with surgeons of different experience levels.

Results

On the test set, C2-Net achieved an accuracy of 89.4%, sensitivity of 90.0%, and specificity of 89.0%. The model demonstrated performance comparable to senior surgeons and numerically superior to junior surgeons, with higher consistency in diagnostic metrics. Attention maps generated by the model provided visual interpretation of the decision-making process. The predicted probabilities demonstrated capability in differentiating structural variations of C2 pedicles.

Conclusion

C2-Net shows high accuracy and efficiency in assessing C2 pedicle screw placement, outperforming junior surgeons. With its ability to provide rapid, consistent evaluations and visual interpretations, C2-Net demonstrates potential as a valuable assistive tool for clinical decision-making in spinal surgery.

Trial Registration: ChiCTR2500101655

Introduction

Atlantoaxial pedicle screw fixation is currently the most widely used posterior internal fixation technique in clinical practice for addressing atlantoaxial instability [1–3]. Among these techniques, pedicle screw fixation of the axis has emerged as the preferred approach for posterior atlantoaxial fixation due to its superior biomechanical properties, low complication rate, and high fusion rates [4,5]. The axis (C2) vertebra, serving as the transitional vertebra between the atlas (C1) and the lower cervical spine, exhibits complex and irregular anatomical morphology and structure. Additionally, the atlas pedicle is in close proximity to the spinal cord and vertebral artery. Consequently, during the placement of atlas pedicle screws, there is a risk of vertebral artery injury, which may lead to intraoperative vertebral artery rupture, compromising blood supply to the vertebrobasilar artery system and thereby posing a threat to patient safety. Moreover, the success of C2 pedicle screw placement often hinges on the dimensions of the narrowest segment of the patient's pediculoisthmic component (PIC), underscoring the importance of preoperative assessment of screw placement feasibility [6].

Our previous studies validated that CT multiplanar reconstruction (MPR) can effectively delineate the trajectory of axis pedicle screws, acquire the narrowest reconstructed section of the PIC, and precisely measure its width, making it the foremost CT assessment method for preoperative evaluation of axis pedicle screw placement [7,8]. However, this method is complex to perform, requires specific software support, and entails some degree of subjectivity and error. There is a pressing need for more precise and objective evaluation methods.

In recent years, rapid advances in artificial intelligence (AI) technology have provided new approaches for tackling this issue [9–11]. Image processing and deep learning techniques have made remarkable strides in medical imaging, furnishing surgeons with effective image analysis tools [12–14]. Deep learning models have been widely applied in the spine field and have shown significant progress and great potential [15–19]. In light of these challenges and opportunities, we developed an AI-powered model that analyzes cervical spine CT images to assess C2 pedicle screw placement feasibility. Our study aims to introduce advanced methodologies to spinal surgery, offering surgeons a decision support tool to enhance surgical outcomes and safeguard patient welfare.

Materials and methods

The construction and evaluation of the deep learning model, named C2-Net, are shown schematically in Fig 1. The model consisted of two modules: the image segmentation module, which separated the C2 pedicle from the subject's cervical spine CT images, and the probability assessment module, which reported the feasibility of placing pedicle screws.

Fig 1. Schematic structure of the C2-Net model construction and evaluation.

a shows the resampling of CT images to standardize pixel spacing across different subject data. b depicts the image segmentation module of the C2-Net model, which is used to segment the C2 pedicle regions from CT images. c illustrates the probability assessment module of the C2-Net model, which is used to evaluate the feasibility of placing screws in the C2 pedicles. The output of the module is the probability of successful and failed placement. d presents the evaluation results of the model, comparing the predicted results with the 3D-printed ground truth and the judgments of surgeons with different levels of experience.

https://doi.org/10.1371/journal.pone.0342349.g001

Study samples and data acquisition

This study was implemented at a tertiary teaching hospital in southern China. Ethical approval was obtained from the Medical Research Ethics Committee of the hospital, and written informed consent was waived due to the retrospective nature of the study. Data were accessed for research purposes from 04/12/2023–03/12/2024.

Model training data were collected from patients who underwent cervical spine CT or CTA examinations at a tertiary hospital from 01/01/2017–31/12/2022. The inclusion criteria encompassed subjects with well-defined clinical diagnoses unrelated to vertebral artery anomalies, who had undergone routine 2 mm thin-slice CT scans and 0.4–1.0 mm thin-slice head and neck CTA scans. Subjects with rheumatoid arthritis, ankylosing spondylitis, spinal metastases, congenital cervical fusion (e.g., Klippel-Feil syndrome), or a history of prior head or cervical spine surgery were excluded from the study.

DICOM (Digital Imaging and Communications in Medicine) data derived from spinal CT scans of the participants were systematically gathered. The imaging protocol was standardized using a high-resolution 256-slice spiral CT scanner (iCT 256, Philips Healthcare, Amsterdam, Netherlands). Each scan was acquired in spiral mode with the following settings: a layer thickness of 2 mm, a tube voltage of 100 kV, a tube current of 340 mA, a window width of 2000, and a window level of 800. Participants were selected based on predefined inclusion and exclusion criteria to ensure homogeneity and relevance of the data for the intended study objectives.

Training and validation set

In clinical practice, preoperative evaluation for C2 pedicle screw placement is performed separately for the left and right pedicles. Each C2 vertebra was therefore split into a left and a right pedicle, and each was treated as an independent unit. Pedicles were categorized into high-risk and low-risk groups based on the minimal pediculoisthmic component diameter (MPD), measured using the RadiAnt software. The volumetric rendering (VR) and multiplanar reconstruction (MPR) functions were used to reconstruct the C2 pedicle, with a linear measurement precision of 0.01 mm. Labeling was based on our previous CTA-based study in which cortical breach on 3D-printed models served as the reference standard. ROC analysis identified 4.78 mm as the optimal cutoff using the Youden index; pedicles with MPD ≥ 4.78 mm were labeled as low-risk and those < 4.78 mm as high-risk (Fig 2) [7]. To prevent data leakage, splitting was performed at the patient level; both left and right pedicles from the same patient were always assigned to the same subset. Patients were randomly divided into training and validation cohorts at an 8:2 ratio, and this split was repeated ten times. Additionally, a sensitivity analysis was performed using an alternative cutoff of 4.30 mm, selected as a clinically pragmatic reference threshold based on common empirical considerations of the screw-to-pedicle size relationship, while the labels of the external test set remained unchanged.
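
The labeling rule and the patient-level split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record keys (`patient_id`, `side`, `mpd_mm`), the helper names, the toy MPD values, and the use of one random seed per repetition are all assumptions.

```python
import random

MPD_CUTOFF_MM = 4.78  # ROC-derived cutoff reported in the text


def label_pedicle(mpd_mm, cutoff=MPD_CUTOFF_MM):
    """Return 'low-risk' if MPD >= cutoff, otherwise 'high-risk'."""
    return "low-risk" if mpd_mm >= cutoff else "high-risk"


def patient_level_split(pedicles, train_fraction=0.8, seed=0):
    """Split pedicle records so both sides of a patient share a subset.

    `pedicles` is a list of dicts with hypothetical keys
    'patient_id', 'side', and 'mpd_mm'.
    """
    patient_ids = sorted({p["patient_id"] for p in pedicles})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_train = round(train_fraction * len(patient_ids))
    train_ids = set(patient_ids[:n_train])
    train = [p for p in pedicles if p["patient_id"] in train_ids]
    val = [p for p in pedicles if p["patient_id"] not in train_ids]
    return train, val


# Toy records with made-up MPD values, for illustration only.
toy_mpd = [(5.8, 5.5), (4.1, 4.9), (6.2, 3.9), (4.78, 5.0), (3.5, 4.2)]
records = [
    {"patient_id": pid, "side": side, "mpd_mm": mpd}
    for pid, (left, right) in enumerate(toy_mpd)
    for side, mpd in (("left", left), ("right", right))
]

# Ten repeated random 8:2 splits, one seed per repetition (assumed scheme).
splits = [patient_level_split(records, seed=s) for s in range(10)]
```

Shuffling patient identifiers rather than pedicles is what keeps the left and right pedicles of one patient in the same subset.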

Fig 2. CT-MPR operation procedure.

a: Utilize the “Multi-Planar Reconstruction (MPR) function” of the RadiAnt DICOM Viewer software; b: Employ the MPR function to correct potential skeletal deformities or improper positioning during patient scans to the standard transverse and standard sagittal planes; c: Reconstruct a tilted transverse plane along the longitudinal axis of the vertebral arch root notch in the standard sagittal position (yellow line); d: Reconstruct an oblique coronal plane perpendicular to the longitudinal axis of the vertebral arch root notch in the tilted transverse plane (red line), ensuring that the oblique coronal section is simultaneously perpendicular to the axis (blue line) and sagittal plane (yellow line) of the root arch, and measure the marrow cavity width (MPD) at the narrowest part of the vertebral arch root notch on the oblique coronal section; e: Classify cases with MPD ≥ 4.78 mm into the low-risk group; f: Classify cases with MPD < 4.78 mm into the high-risk group.

https://doi.org/10.1371/journal.pone.0342349.g002

3D printed test set

The confirmation of C2 pedicle screw placement feasibility was determined through simulated surgery on 3D-printed cervical spine models. DICOM data were imported into Mimics software (v23.0; Materialise, Belgium) for 3D bone model reconstruction of C2 vertebrae and exported as STL files. Solid bone models were then printed at a 1:1 scale using a Stratasys J850 3D printer with BoneMatrix RGD516 material, which provides high geometric fidelity and sufficient rigidity for trajectory and cortical breach assessment, although it does not fully replicate the biomechanical properties of living bone. This 3D printing process was supported by the Medical 3D Printing Center of Tengwei Technology.

A 3.5 mm C2 pedicle screw was manually placed into each 3D-printed model [20]. Pedicle wall rupture was assessed by direct visualization and CT scanning, which served as the ground truth (Fig 3). If a pedicle wall rupture was detected, the C2 pedicle was assigned to the high-risk group of the test set; if no rupture was observed, it was assigned to the low-risk group.

Fig 3. 3D-printed model and CT assessment of C2 pedicle screw insertion.

C2 pedicle screw placement in a 3D-printed bone model, followed by CT multiplanar reconstruction to evaluate screw trajectory and cortical integrity.

https://doi.org/10.1371/journal.pone.0342349.g003

Image processing

To prepare data for analysis, DICOM images in the training, validation, and test sets were converted to JPEG format. Resampling was used to standardize pixel spacing across CT datasets: the sampling interval was set to 1.0 along all axes so that the processed images have uniform physical dimensions, ensuring comparability of C2 pedicles among subjects. The ratio between the original and desired pixel spacing was first calculated and then used to interpolate the images while maintaining spatial resolution. Image standardization was applied to reduce intensity discrepancies between CT images. Five CT images per C2 pedicle, ordered from wide to narrow according to the narrowing of the pedicle isthmus, were selected as input data. All CT images were three-channel grayscale images (R = G = B). During standardization, the mean was subtracted from the R, G, and B values and the result was divided by the standard deviation. The mean and standard deviation were calculated from the R, G, and B values within the training set.
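
The two preprocessing steps above, resampling to a common spacing and standardizing intensities with training-set statistics, can be sketched with plain Python arithmetic. The function names, the example grid size and spacing, and the use of the population standard deviation are assumptions made for illustration; the interpolation itself is not shown.

```python
import math


def resample_factors(spacing, target=1.0):
    """Zoom factor per axis: original spacing divided by desired spacing."""
    return tuple(s / target for s in spacing)


def resampled_shape(shape, spacing, target=1.0):
    """Grid size after interpolating to the target spacing (rounded)."""
    factors = resample_factors(spacing, target)
    return tuple(max(1, round(n * f)) for n, f in zip(shape, factors))


def intensity_stats(pixels):
    """Mean and population standard deviation of a flat intensity list."""
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return mean, math.sqrt(variance)


def standardize(pixels, mean, std):
    """Subtract the mean and divide by the standard deviation."""
    return [(p - mean) / std for p in pixels]


# Hypothetical volume: 512 x 512 x 100 voxels at spacing (0.5, 0.5, 2.0).
new_shape = resampled_shape((512, 512, 100), (0.5, 0.5, 2.0))

# Statistics come from training intensities only (toy values here),
# and are then reused unchanged for validation and test images.
train_pixels = [0, 50, 100, 150, 200]
mean, std = intensity_stats(train_pixels)
z_scores = standardize(train_pixels, mean, std)
```

Because the images are grayscale with R = G = B, a single mean and standard deviation applies identically to all three channels.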

Deep learning model construction

C2-Net is an automated pipeline that includes an image segmentation module for delineating C2 pedicles in CT images and a screw placement probability assessment module for evaluating the feasibility of screw placement. These modules consist of standard deep learning components, including convolutional layers, pooling layers, nonlinear activation functions, dropout, and fully connected layers, enabling automated feature extraction and end-to-end learning.

The segmentation module employed a U-Net architecture with a five-level encoder–decoder structure and skip connections. The number of feature channels in the encoder progressively increased from 64 to 512, and the decoder symmetrically restored spatial resolution by upsampling and feature concatenation. All convolutional layers used 3 × 3 kernels with ReLU activation. This module enabled accurate localization and isolation of the C2 pedicles from surrounding anatomical structures.

Following segmentation, each C2 pedicle was separated into left and right components, which were treated as independent analytical units. For each pedicle, five consecutive axial CT slices centered on the narrowest pedicle region were extracted as input for feasibility assessment.

The screw placement feasibility assessment module was implemented using a modified C3D network comprising four 3D convolutional blocks with channel sizes of 16, 32, 64, and 128, respectively [21,22]. Each block consisted of 3 × 3 × 3 convolutions with padding, ReLU activation, and max-pooling. The extracted features were subsequently processed through three fully connected layers (4096, 1024, and 2 neurons), with the final layer outputting the probability of screw placement feasibility (low-risk vs high-risk). Dropout (rate = 0.5) was applied to reduce overfitting.
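
As a rough sense of scale for the convolutional part of this module, the parameter count of the four blocks can be worked out by hand. The sketch below assumes one 3 × 3 × 3 convolution with bias per block and three input channels (the tri-channel slices described under Image processing); neither assumption is confirmed by the text, and the fully connected layers are left out because the flattened feature size is not reported.

```python
def conv3d_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per output channel for a k x k x k convolution."""
    return (kernel ** 3 * in_channels + 1) * out_channels


IN_CHANNELS = 3                     # assumption: tri-channel input slices
BLOCK_CHANNELS = [16, 32, 64, 128]  # channel sizes stated in the text

params_per_block = []
previous = IN_CHANNELS
for out_channels in BLOCK_CHANNELS:
    params_per_block.append(conv3d_params(previous, out_channels))
    previous = out_channels

total_conv_params = sum(params_per_block)
print(params_per_block, total_conv_params)
# prints [1312, 13856, 55360, 221312] 291840
```

Under these assumptions the convolutional stack holds roughly 0.3 million parameters, most of them in the last block.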

Model training was performed using the Adam optimizer with an initial learning rate of 1 × 10⁻⁴ and a batch size of 4. A cosine annealing learning rate scheduler was applied, and training was conducted for up to 200 epochs with early stopping if the validation loss did not improve for 10 consecutive epochs.
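
The learning rate schedule and stopping rule can be illustrated without a training loop. The sketch below uses the standard closed form of cosine annealing; the annealing period of 200 epochs and the minimum learning rate of 0 are assumptions, since the text reports only the initial rate and the patience, and the loss values are invented.

```python
import math


def cosine_lr(epoch, base_lr=1e-4, max_epochs=200, min_lr=0.0):
    """Cosine-annealed learning rate at a given epoch (closed form)."""
    cosine = math.cos(math.pi * epoch / max_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + cosine)


def stopping_epoch(val_losses, patience=10):
    """Index of the epoch at which early stopping triggers.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs; if that never happens, the last
    epoch index is returned.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses) - 1


# Invented validation losses: improvement stops after epoch 2.
losses = [1.0, 0.9, 0.8] + [0.85] * 20
stop_at = stopping_epoch(losses)
```

With these toy losses the best value occurs at epoch 2, so ten non-improving epochs later training would stop at epoch 12.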

The overall training workflow is illustrated in Fig 1, and the final model parameters were selected based on the best performance on the validation set.

Screw placement probability assessment

The model categorizes split pedicles into high-risk and low-risk groups and generates a GIF illustrating the probability of C2 pedicle screw placement.

After analyzing the input data, the model outputs raw scores, known as logits, for each category. The Softmax function is then applied to transform the logits into a probability distribution. For the logits vector [z1, z2,..., zn] of each input sample, the Softmax function is computed using the formula:

$$\mathrm{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}$$

where $e^{z_i}$ represents the exponentiation of logit zi, and the denominator is the sum of the exponentiations of all logits. This ensures that the sum of probabilities for all categories equals 1. Through this approach, the model not only classifies split pedicles into those high-risk and low-risk for screw placement but also provides the probability of screw placement feasibility. This aids in assessing the model’s confidence and performance.
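
The Softmax step can be sketched in a few lines of Python. Subtracting the largest logit before exponentiating is a common numerical safeguard that leaves the result unchanged; it is added here as an assumption and is not described in the text. The example logits are invented, and which output index corresponds to low-risk or high-risk is not specified.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    shift = max(logits)  # stabilizes exp() without changing the result
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two-class example with invented logits.
probs = softmax([2.0, 0.0])
```

For two classes, the probability of the first class reduces to 1 / (1 + e^(z2 - z1)), so a logit gap of 2 gives a probability of about 0.88.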

The screw placement probability assessment module training and validation

After segmentation, each of the split C2 pedicles, which includes five segmented CT images, was fed into the screw placement probability assessment module to evaluate the overall risk and predict the feasibility of screw placement. The screw placement probability assessment module underwent training for 200 epochs. In each epoch, the model received the split C2 pedicle’s images and labels from the training set, resulting in the generation of model parameters. Model parameters were saved every 10 epochs for validation using the validation set. Parameters demonstrating optimal diagnostic efficacy were selected as the final model parameters and subjected to further validation using the external test set.

Model evaluation

The model's performance was externally validated on the 3D-printed test set. The same test set was also evaluated by surgeons of varying experience levels (senior spine surgeons: Dr. Zhang and Dr. Wang; junior spine surgeons: Dr. Lin and Dr. Liang) to compare the model's diagnostic performance with that of physicians, thereby validating the model's credibility.

Senior spine surgeons in this study had over 10 years of experience and held senior professional titles, while junior spine surgeons had less than 5 years of experience and held intermediate or lower professional titles.

Model attention

A deep learning visualization method was used to generate attention maps showing the areas that attracted the most attention from C2-Net. A cut-off value of 0.5 was used to retain only the high-response attention areas.
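
The thresholding step can be illustrated as follows. The text does not name the visualization method or say how the map is scaled, so this sketch assumes a min-max normalization to [0, 1] followed by zeroing every value below the 0.5 cut-off; the function name and the toy map are hypothetical.

```python
def threshold_attention(attention, cutoff=0.5):
    """Min-max normalize a 2D attention map, then zero low responses."""
    flat = [value for row in attention for value in row]
    low, high = min(flat), max(flat)
    span = high - low
    if span == 0:
        # A constant map carries no contrast, so nothing is highlighted.
        return [[0.0 for _ in row] for row in attention]
    normalized = [[(value - low) / span for value in row] for row in attention]
    return [[v if v >= cutoff else 0.0 for v in row] for row in normalized]


# Toy 2 x 2 attention map with invented raw responses.
highlighted = threshold_attention([[0.0, 2.0], [1.0, 4.0]])
```

Only the surviving (non-zero) values would then be overlaid on the CT slice as the highlighted region.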

Statistical analysis

Continuous variables are presented as means ± standard deviations, while discrete variables are expressed as frequencies and percentages. C2-Net's classification performance was evaluated using accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). These metrics were reported as percentages with corresponding 95% confidence intervals (CIs). All deep learning methods were performed using the PyTorch toolkit and Python 3.7 (Python Software Foundation, www.python.org).
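
For reference, the point estimates of these metrics can be computed from a confusion matrix and a list of predicted scores as sketched below. Treating one class as "positive" (coded 1), the pairwise-comparison form of the AUC, and the toy labels and scores are assumptions for illustration; the confidence interval method is not stated in the text and is not reproduced here.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, fp, tn, fn


def diagnostic_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc_from_scores(y_true, scores, positive=1):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented labels, predictions, and scores for seven pedicles.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.6]

metrics = diagnostic_metrics(*confusion_counts(y_true, y_pred))
auc = auc_from_scores(y_true, scores)
```

In this toy case 11 of the 12 positive-negative pairs are ordered correctly, giving an AUC of about 0.92.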

Results

Data acquisition and dataset splitting

A flowchart of subject data retrieval and the study is shown in Fig 4, and the demographics of the two cohorts of enrolled subjects are shown in Table 1. A total of 490 split C2 pedicles were included, of which 245 were categorized as high-risk and 245 as low-risk. Among these, 440 pedicles were proportionally assigned to the training and validation sets, while the remaining 50 formed the external test set. With five CT images per pedicle, this yielded a total of 2450 C2 pedicle CT images. We used MPR to measure the narrowest diameter of the C2 pedicle isthmus. The average narrowest width of the left C2 pedicle isthmus was 5.82 ± 1.61 mm, and that of the right was 5.55 ± 1.44 mm. The left PIC measurement was significantly greater than the right.

Table 1. Summary of clinical and imaging features of study subjects and C2 minimum pediculoisthmic component diameter.

https://doi.org/10.1371/journal.pone.0342349.t001

Fig 4. Flowchart of subject data retrieval and the study.

405 patients (490 C2 pedicles) were enrolled and stratified by pedicle diameter into low- and high-risk groups for screw placement. Both groups were included in the 10-fold cross-validation for C2-Net training and validation, while an independent test set was used to compare the AI model's performance with that of doctors.

https://doi.org/10.1371/journal.pone.0342349.g004

The image segmentation module performance

The image segmentation module of C2-Net demonstrates high performance, accurately segmenting the C2 pedicle from CT images. In the validation set for image segmentation, it achieved a dice coefficient of 0.959.

Model diagnostic efficacy

The model's average accuracy in determining whether C2 pedicle screw placement was feasible in the validation set was 91.4%, with an average AUC of 0.94 (95% CI, 0.91 to 0.97) and average sensitivity and specificity of 0.93 and 0.90, respectively. In the 3D-printed test set, the model's average accuracy was 89.4%, the average AUC was 0.94 (95% CI, 0.91 to 0.96), and the average sensitivity and specificity were 0.90 and 0.89, respectively (Fig 5). No statistically significant differences were observed between the validation and test sets across these performance metrics. Screw placement probability was displayed as a GIF by combining the five input images of the C2 pedicle, as illustrated in Fig 1.

Fig 5. ROC curves of the validation and test sets.

ROC curves for the C2-Net model, showing performance in the 10-fold cross-validation (left) and independent test set (right). Each curve corresponds to a validation fold, with AUC values indicating strong and consistent diagnostic accuracy.

https://doi.org/10.1371/journal.pone.0342349.g005

Comparison with clinical surgeons

Compared with predictions made by junior surgeons, the C2-Net model showed a slight advantage in assessing the feasibility of C2 pedicle screw placement, with an accuracy of 89.4% versus 88.0% (P > 0.05) and an AUC of 0.94 (95% CI, 0.91 to 0.96) versus 0.88 (P > 0.05). However, senior surgeons numerically outperformed the C2-Net model, with an accuracy of 96.0% versus 89.4% (P > 0.05) and an AUC of 0.98 versus 0.94 (95% CI, 0.91 to 0.96; P > 0.05) (Fig 6). Moreover, the senior surgeon group demonstrated significantly higher accuracy and sensitivity than the junior surgeon group (P < 0.05).

Fig 6. Performance comparison of the C2-Net model with junior and senior surgeons in C2 pedicle screw placement assessment.

Confusion matrices illustrate the diagnostic outcomes of junior and senior surgeons, while the ROC curves compare their performance to the C2-Net model, demonstrating the model's superior accuracy in identifying high-risk C2 pedicles.

https://doi.org/10.1371/journal.pone.0342349.g006

Visual interpretation of deep-learning internal decision making

Attention areas in the CT images detected by C2-Net are shown in Fig 7. In both the validation and test sets, the attention maps showed that C2-Net was able to detect the pedicle isthmus and label it as a highly responsive area, indicating that the model can adequately learn the features of C2 pedicle CT images and respond appropriately.

Fig 7. Attention areas in the CT images detected by C2-Net.

Original CT images (top) and corresponding model attention maps (bottom) show the regions of interest the C2-Net model focuses on to evaluate C2 pedicle screw placement risk, with color intensity reflecting the degree of attention.

https://doi.org/10.1371/journal.pone.0342349.g007

Discussion

This study introduced C2-Net, a deep learning model that combines image segmentation and probability assessment to offer an end-to-end solution for surgical planning of pedicle screw placement. Model performance was evaluated by performing pedicle screw placement on 3D-printed C2 models. With this model, we were able to integrate the feasibility assessment of placing pedicle screws in the C2 PIC into a complete, automated workflow, thus improving the accuracy and efficiency of the evaluation. Similar AI-based models have been developed for lumbar or thoracic pedicle screw planning, demonstrating high accuracy and clinical feasibility, highlighting the growing potential of deep learning in spinal surgical planning [23–25].

Various CT- or CTA-based techniques have been employed to assess the morphometric characteristics of the C2 PIC for determining the safety and feasibility of C2 pedicle screw or transarticular screw placement [26–28]. These methods include measuring the transverse C2 pedicle width, defining a high-riding vertebral artery (HRVA), and oblique CT scan reconstructions. Reported screw misplacement rates vary from 5% to 41%, reflecting the influence of measurement techniques, surgical experience, and anatomical variability [29]. In contrast, C2-Net uses a 3D convolutional (C3D) architecture that analyzes volumetric image sequences to capture richer anatomical context, outperforming traditional 2D or slice-based analyses [30,31]. This design ensures higher diagnostic performance, reproducibility, and robustness across different imaging sources. However, because this study focused solely on pedicle morphometry, vascular variants of the vertebral artery and bone quality were not incorporated into the current model. Although the model's performance did not differ significantly from that of either the senior or junior surgeon groups, this may be due to the limited sample size. Nevertheless, the model demonstrated a performance pattern more consistent with senior surgeons across multiple metrics, including accuracy, sensitivity, and specificity, and showed numerically higher but not statistically significant performance compared with junior surgeons. Moreover, a significant difference in both sensitivity and accuracy was found between the senior and junior surgeon groups, suggesting that surgical experience plays an important role in maintaining diagnostic consistency.

The definition of a “narrow C2 pedicle” remains controversial, as different CT measurement techniques yield variable pedicle diameter values, leading to inconsistent criteria for determining screw placement feasibility. Maki et al reported that pedicles with a medullary canal width ≤ 4 mm were unsuitable for safe C2 pedicle screw insertion [32]. Similarly, Marques et al used the MPR function of OsiriX and proposed that pedicle widths of at least 5.5 mm and 6.0 mm are required for 3.5 mm and 4.0 mm screws, respectively [33]. In this study, based on ROC analysis of cortical breach observed in 3D-printed models, the Youden index identified 4.78 mm as the optimal MPD cutoff [7]. This threshold was validated in our earlier work, where simulated insertion of a 3.5 mm screw frequently resulted in cortical breach when the pedicle width was less than 4.78 mm. Using this standardized cutoff allowed C2-Net to achieve diagnostic performance comparable to that of senior surgeons, supporting its potential as an objective and reproducible tool for preoperative risk assessment. When a more conservative alternative cutoff based on the commonly used 80% screw-to-pedicle ratio was applied for sensitivity analysis, model performance decreased, supporting the robustness of the ROC-derived 4.78-mm threshold (Supplementary S1 Table). However, this threshold was derived from single-center data primarily involving an Asian population and may not be directly generalizable to other ethnic groups with different anatomical morphologies. In addition, subtle differences in operator measurement techniques, CT resolution, and segmentation precision may introduce variability near this boundary. Therefore, the 4.78-mm cutoff should be regarded as a statistical reference rather than an absolute safety limit. Future research should explore adaptive or patient-specific thresholding strategies or develop a continuous risk-scoring system to more flexibly represent the probability of cortical breach.

Due to the limited availability and high cost of cadaveric specimens, BoneMatrix 3D-printed bone models were used to simulate the C2 pedicle screw placement process. These models approximate the cortical thickness and mechanical characteristics of the C2 pedicle, although, as noted in the Methods, they do not fully replicate the biomechanical properties of living bone. They allow a controlled and reproducible testing environment while overcoming sample size limitations. Although the X-ray absorption differences between the model cortex and marrow cannot be completely distinguished, this limitation has minimal impact on the evaluation of screw trajectory or cortical breach detection.

Clinically, C2-Net demonstrates strong potential for application. It can provide less experienced surgeons with rapid, standardized, and reproducible preoperative assessments. By optimizing workflow, it can also save time and labor costs, improve resource utilization, and enhance reproducibility across institutions. Ultimately, such automation can promote safer and more individualized treatment strategies, potentially improving patient outcomes in spine surgery.

While the overall accuracy and AUC of the model demonstrated strong performance comparable to senior surgeons, detailed analysis of false-positive and false-negative predictions provides further insight into its clinical reliability. False positives—cases in which trajectories are incorrectly predicted as high-risk despite being safe—may lead to unnecessary alterations in fixation strategies or avoidance of optimal screw paths, thereby increasing operative time and surgical complexity. Conversely, false negatives—cases predicted as safe when they are actually high-risk—pose substantial clinical danger, potentially resulting in vertebral artery, spinal cord, or nerve root injury. In this study, the senior surgeons demonstrated the lowest false-positive rate (4%), significantly lower than that of the junior group (20%, P = 0.037). The AI model's false-positive rate showed no significant difference compared with either surgeon group, indicating its comparable capability in minimizing false risk assessment. False-negative rates were generally low and did not differ significantly among the three groups. Future optimization of the model will focus on improving the recognition of borderline cases by increasing the weighting of high-risk samples during training and integrating human-in-the-loop validation, allowing surgeons to review model outputs through visualization and attention maps before making final decisions.

This study has several limitations. The training dataset consisted primarily of Asian patients, so external validation across diverse populations and multi-center settings is needed to ensure generalizability. Direct in vivo validation was not performed, and 3D-printed models cannot fully replicate intraoperative conditions. In addition, the external test cohort of 50 pedicles limits statistical power and may underestimate performance variability. Future work will therefore include prospective clinical validation using CTA-based blind prediction compared with postoperative CT outcomes, expansion of the external test cohorts, and integration of C2-Net into navigation or surgical planning systems to build a comprehensive AI-assisted surgical planning platform.
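To make the sample size limitation concrete, the sketch below computes a Wilson score confidence interval for a proportion such as accuracy. It is a generic statistical illustration, not the analysis used in this study. The function name and the counts in the example (45 correct out of 50, and 450 out of 500) are hypothetical and do not correspond to the reported results.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1.0 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z ** 2 / (4 * n ** 2)
    ) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical counts with the same observed proportion at two sample sizes.
    for correct, total in [(45, 50), (450, 500)]:
        low, high = wilson_interval(correct, total)
        print(f"n={total}: {low:.3f} to {high:.3f}")
```

For the same observed proportion, the interval at n = 50 is considerably wider than at n = 500, which is why a larger external cohort would give a more precise estimate of model performance.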

Conclusion

The C2-Net model represents a significant advancement in the preoperative assessment of C2 pedicle screw placement feasibility. By combining deep learning techniques for image segmentation and probability assessment, C2-Net offers an automated, efficient, and accurate solution for surgical planning. The model's performance is comparable to that of experienced surgeons, while maintaining time-efficiency and reducing subjective errors. This advancement promises to enrich clinical practice, optimize patient care, and ultimately contribute to improved surgical outcomes and patient safety.

Supporting information

S1 Table. Sensitivity analysis of model performance under an alternative cutoff definition (All results are based on the same 3D-printed test set).

https://doi.org/10.1371/journal.pone.0342349.s001

(DOCX)

References

  1. Harms J, Melcher RP. Posterior C1-C2 fusion with polyaxial screw and rod fixation. Spine (Phila Pa 1976). 2001;26(22):2467–71. pmid:11707712
  2. Yuan F, Yang H-L, Guo K-J, Li J-S, Xu K, Zhang Z-M, et al. A clinical morphologic study of the C2 pedicle and isthmus. Eur Spine J. 2013;22(1):39–45. pmid:22890566
  3. Goel A, Laheri V. Plate and screw fixation for atlanto-axial subluxation. Acta Neurochir (Wien). 1994;129(1–2):47–53. pmid:7998495
  4. Rajinda P, Towiwat S, Chirappapha P. Comparison of outcomes after atlantoaxial fusion with C1 lateral mass-C2 pedicle screws and C1-C2 transarticular screws. Eur Spine J. 2017;26(4):1064–72. pmid:27771789
  5. Sciubba DM, Noggle JC, Vellimana AK, Alosh H, McGirt MJ, Gokaslan ZL, et al. Radiographic and clinical evaluation of free-hand placement of C-2 pedicle screws. Clinical article. J Neurosurg Spine. 2009;11(1):15–22. pmid:19569935
  6. Naderi S, Arman C, Güvençer M, Korman E, Senoğlu M, Tetik S, et al. An anatomical study of the C-2 pedicle. J Neurosurg Spine. 2004;1(3):306–10. pmid:15478369
  7. Wu Y, Liang Z, Bao J, Wen L, Zhang L. C2 pedicle screw placement on 3D-printed models for the performance assessment of CTA-based screw preclusion. J Orthop Surg Res. 2023;18(1):7. pmid:36597148
  8. Wu Y, Liang Z, Bao J, Wen L, Zhang L. Morphology analysis of the C2 pediculoisthmic component and feasibility of safe C2 pedicle screw placement: comparison of multiplanar reconstruction versus traditional radiographic methods. J Orthop Surg Res. 2023;18(1):252. pmid:36973803
  9. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44. pmid:26017442
  10. Hamet P, Tremblay J. Artificial intelligence in medicine. Metabolism. 2017;69S:S36–40. pmid:28126242
  11. Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K, et al. A guide to deep learning in healthcare. Nat Med. 2019;25(1):24–9. pmid:30617335
  12. Won D, Lee H-J, Lee S-J, Park SH. Spinal stenosis grading in magnetic resonance imaging using deep convolutional neural networks. Spine (Phila Pa 1976). 2020;45(12):804–12. pmid:31923125
  13. Chuang C-H, Lin C-Y, Tsai Y-Y, Lian Z-Y, Xie H-X, Hsu C-C, et al. Efficient triple output network for vertebral segmentation and identification. IEEE Access. 2019;7:117978–85.
  14. Grover P, Siebenwirth J, Caspari C, Drange S, Dreischarf M, Le Huec J-C, et al. Can artificial intelligence support or even replace physicians in measuring sagittal balance? A validation study on preoperative and postoperative full spine images of 170 patients. Eur Spine J. 2022;31(8):1943–51. pmid:35796837
  15. Liawrungrueang W, Kim P, Kotheeranurak V, Jitpakdee K, Sarasombath P. Automatic detection, classification, and grading of lumbar intervertebral disc degeneration using an artificial neural network model. Diagnostics (Basel). 2023;13(4):663. pmid:36832151
  16. Trinh GM, Shao H-C, Hsieh KL-C, Lee C-Y, Liu H-W, Lai C-W, et al. Detection of lumbar spondylolisthesis from X-ray images using deep learning network. J Clin Med. 2022;11(18):5450. pmid:36143096
  17. Gami P, Qiu K, Kannappan S, Alperin Y, Biase GD, Buchanan IA, et al. Semiautomated intraoperative measurement of Cobb angle and coronal C7 plumb line using deep learning and computer vision for scoliosis correction: a feasibility study. J Neurosurg Spine. 2022;37(5):713–21. pmid:36303475
  18. Uemura K, Fujimori T, Otake Y, Shimomoto Y, Kono S, Takashima K, et al. Development of a system to assess the two- and three-dimensional bone mineral density of the lumbar vertebrae from clinical quantitative CT images. Arch Osteoporos. 2023;18(1):22. pmid:36680601
  19. Yamada K, Nagahama K, Abe Y, Hyugaji Y, Ukeba D, Endo T, et al. Evaluation of surgical indications for full endoscopic discectomy at lumbosacral disc levels using three-dimensional magnetic resonance/computed tomography fusion images created with artificial intelligence. Medicina (Kaunas). 2023;59(5):860. pmid:37241092
  20. Li Y, Lin J, Wang Y, Luo H, Wang J, Lu S, et al. Comparative study of 3D printed navigation template-assisted atlantoaxial pedicle screws versus free-hand screws for type II odontoid fractures. Eur Spine J. 2021;30(2):498–506. pmid:33098009
  21. Saeed MU, Dikaios N, Dastgir A, Ali G, Hamid M, Hajjej F. An automated deep learning approach for spine segmentation and vertebrae recognition using computed tomography images. Diagnostics (Basel). 2023;13(16):2658. pmid:37627917
  22. Tran D, Bourdev L, Fergus R, Torresani L, Paluri M. Learning spatiotemporal features with 3d convolutional networks. In: Proceedings of the IEEE International conference on computer vision. 2015.
  23. Zhang Q, Zhao F, Zhang Y, Huang M, Gong X, Deng X. Automated measurement of lumbar pedicle screw parameters using deep learning algorithm on preoperative CT scans. J Bone Oncol. 2024;47:100627. pmid:39188420
  24. Yang T, Liu X, Tang J, Xu C, Wu Y, Bao B, et al. Feasibility analysis of a three-dimensional U-net algorithm-assisted automatic pedicle screw planning. World Neurosurg. 2025;201:124302. pmid:40683465
  25. Jia S, Weng Y, Wang K, Qi H, Yang Y, Ma C, et al. Performance evaluation of an AI-based preoperative planning software application for automatic selection of pedicle screws based on computed tomography images. Front Surg. 2023;10:1247527. pmid:37753530
  26. Yin D, Oh G, Neckrysh S. Axial and oblique C2 pedicle diameters and feasibility of C2 pedicle screw placement: Technical note. Surg Neurol Int. 2018;9:40. pmid:29527398
  27. Sieradzki JP, Karaikovic EE, Lautenschlager EP, Lazarus ML. Preoperative imaging of cervical pedicles: comparison of accuracy of oblique radiographs versus axial CT scans. Eur Spine J. 2008;17(9):1230–6. pmid:18661159
  28. Yeom JS, Buchowski JM, Kim H-J, Chang B-S, Lee C-K, Riew KD. Risk of vertebral artery injury: comparison between C1-C2 transarticular and C2 pedicle screws. Spine J. 2013;13(7):775–85. pmid:23684237
  29. Luchmann D, Jecklin S, Cavalcanti NA, Laux CJ, Massalimova A, Esfandiari H, et al. Spinal navigation with AI-driven 3D-reconstruction of fluoroscopy images: an ex-vivo feasibility study. BMC Musculoskelet Disord. 2024;25(1):925. pmid:39558228
  30. Fang Y, Li W, Chen X, Chen K, Kang H, Yu P, et al. Opportunistic osteoporosis screening in multi-detector CT images using deep convolutional neural networks. Eur Radiol. 2021;31(4):1831–42. pmid:33001308
  31. Tharmaseelan H, Vellala AK, Hertel A, Tollens F, Rotkopf LT, Rink J, et al. Tumor classification of gastrointestinal liver metastases using CT-based radiomics and deep learning. Cancer Imag. 2023;23(1):95. pmid:37798797
  32. Maki S, Koda M, Iijima Y, Furuya T, Inada T, Kamiya K, et al. Medially-shifted rather than high-riding vertebral arteries preclude safe pedicle screw insertion. J Clin Neurosci. 2016;29:169–72. pmid:26916906
  33. Marques LMS, d’Almeida GN, Cabral J. “Two-step” technique with OsiriX™ to evaluate feasibility of C2 pedicle for surgical fixation. J Craniovertebr Junct Spine. 2016;7(2):75–81. pmid:27217652