The application of artificial intelligence techniques in predicting game outcomes of professional basketball league: A systematic review

Abstract

Background

Predicting basketball game outcomes is a critical area in sports science and data analysis, providing concrete benefits for optimizing coaching strategies, improving team management, and informing betting decisions.

Objective

This methodological review systematically evaluates the effectiveness of specific artificial intelligence technologies in predicting professional basketball game outcomes over the past five years from 2019 to 2024, providing detailed insights into current methodologies and identifying emerging trends and challenges in this domain.

Methods

Following PRISMA-SCR guidelines, a comprehensive keyword search was conducted across four electronic bibliographic databases: PubMed, Web of Science, Scopus, and EBSCO. Studies were included if they utilized artificial intelligence techniques, focused on professional leagues, and aimed to predict game outcomes.

Results

This review incorporated 34 studies that met the predefined eligibility criteria, examining various artificial intelligence techniques used to predict professional basketball game outcomes over the past five years. The findings reveal that artificial intelligence models, particularly the multilayer perceptron neural network, achieved a high prediction accuracy of 98.90%. The random forest model, based on four factors, reached an accuracy of 93.81%, while the voting regression ensemble model achieved 93.3%. The studies underscore the importance of effective data processing and feature selection in enhancing model performance. Additionally, dynamic prediction models that adapt to real-time changes in the game were shown to be particularly useful for tactical decisions and betting strategies.

Conclusions

Artificial intelligence significantly improves the accuracy of predicting outcomes in professional basketball games. Future research should include diverse basketball leagues and employ more advanced validation techniques to enhance model robustness and applicability. Integrating real-time data and exploring transfer learning will likely improve prediction accuracy and decision-making support.

Introduction

Basketball is a widely followed sport globally and the National Basketball Association (NBA) is the premier professional basketball league in the United States [1]. In these competitive environments, accurate predictions of game outcomes can provide valuable insights for coaching staff, helping them to better understand opponents and develop effective strategies [2]. Additionally, these predictions can inform team management decisions regarding player acquisitions, lineup adjustments, and other strategic areas [3]. For bettors and investors, reliable prediction models are important tools for making informed decisions [4]. As a result, the technology for forecasting outcomes in professional basketball games has become a critical area of research within sports analytics.

As the volume and complexity of game data increased, the limitations of traditional methods in predicting match outcomes became increasingly evident. Traditional sports science relies heavily on the expertise of coaches, team leaders, and analysts [5]. Initially, data collection for competitions was primarily manual [6]. However, as data volumes rapidly increased, traditional methods lacked sufficient data-processing capacity, often resulting in incomplete data and low prediction accuracy [7]. Over the past few decades, advancements in computer technology have driven scholars to explore more efficient methods to predict match outcomes. In 1997, the PC-based Advanced Scout system was developed, marking the entry of NBA data analytics into data mining and machine learning (ML) [8]. As artificial intelligence (AI) technology advances, AI has become a popular research focus in the field of sports outcome prediction, encompassing a variety of sports, including basketball [9–13].

Among various sports events, the prediction of basketball game outcomes has garnered particular attention due to its complexity and high level of competitiveness. Existing research has explored the application of AI in various areas, including performance prediction [14–16], injury risk assessment [17–19], game outcome prediction [20–24], player performance evaluation [25–27], team performance evaluation [28–30], tactical lineup optimization [31–33], action recognition [34–36], and tactical decision-making [37–39]. Among these, researchers have made significant progress in predicting basketball game outcomes using various AI models [1–3,9,20,40,41]. For example, some studies have developed hybrid ensemble learning frameworks that effectively addressed issues such as feature redundancy, sample noise, and dataset imbalance by integrating bagging strategies and random subspace algorithms, achieving an accuracy rate of 84.00% [3]. Similarly, an intelligent framework based on ML and feature selection was proposed to predict NBA game outcomes [9]. Using naive Bayes (NB), artificial neural networks (ANN), and decision tree (DT) models, the study identified defensive rebounding as the most significant factor influencing game outcomes [9]. Additionally, seven different ML models were utilized to predict NBA game outcomes, with the k-nearest neighbors (KNN) algorithm yielding the best overall prediction results at an average accuracy of 60.01% [41].

Despite considerable research on AI applications for predicting basketball match outcomes, the effectiveness of these methods varies significantly, and comprehensive review studies remain scarce. The few existing reviews suffer from several limitations, including a lack of focus on game outcome prediction, the absence of a systematic literature review methodology, the inclusion of a limited number of studies, and the frequent omission of recent literature, particularly literature that reflects the rapid advancements in AI technology over the past two years [42,43]. Additionally, the lack of a comprehensive evaluation of the effectiveness of different AI methods across various leagues has made it difficult to determine the most suitable AI techniques for different scenarios. Notably, these reviews often overlook the uniqueness of professional basketball, which features higher competition levels, more complex tactical systems, and closer teamwork.

This review aims to address the gaps identified by providing a comprehensive and systematic evaluation of AI technologies in predicting professional basketball match outcomes, specifically focusing on developments from the past five years. Over this period, AI technology has made significant advancements in algorithms, models, and applications, making it an essential timeframe for capturing the current state and emerging trends in the field. By focusing on this period, we aim to better reflect the latest technological progress and application trends. Specifically, this study will make contributions in the following three areas: first, we will summarize the specific application areas of existing AI technologies in predicting basketball match outcomes, providing a comprehensive and accessible introduction to core AI methods to facilitate their widespread dissemination and application in both academic and practical sports analytics. Second, we will conduct an in-depth examination of the application of AI technologies in data collection and processing and in feature selection and extraction, identifying the key factors that influence prediction accuracy. By analyzing how these factors impact AI model performance across different scenarios, we aim to propose optimization strategies for improving prediction accuracy. Third, we will conduct a thorough evaluation of the accuracy and applicability of different AI methods for predicting professional basketball match outcomes. By comparing various models in detail, we will identify the strengths and limitations of each model across different scenarios, providing a scientific basis for optimizing and selecting the most appropriate AI models.

The remainder of this review is structured as follows: Section 2 (Methods) describes the methodology used in this review, including study selection criteria, search strategy, data extraction and synthesis, and study quality assessment. Section 3 (Results) presents the findings from the reviewed studies, including identified AI methodologies, predictive performance, and key influencing factors. Section 4 (Discussion) provides a comparative analysis of AI techniques, the impact of feature selection and data preprocessing, model validation methods, and a discussion of static versus dynamic predictions. This section also compares our findings with previous studies and highlights limitations and future research directions. Finally, Section 5 (Conclusion) summarizes the key findings of this review, emphasizing the effectiveness of AI in predicting basketball game outcomes and the potential for future advancements in this field.

Methods

This scoping review followed the PRISMA-SCR (Preferred Reporting Items for Systematic Review and Meta-Analysis Extensions for Scoping Reviews) guidelines [44].

Study selection criteria

Studies were included in the review if they met all of the following criteria: (1) Study Design: Experimental or observational studies; (2) Analytic Approach: Utilized AI technology (e.g., ML, deep learning (DL)) to predict the outcome of basketball games; (3) Research Samples: Focused on professional competitive basketball leagues worldwide (e.g., NBA, Chinese Basketball Association (CBA), Euroleague, Turkish League); (4) Outcomes: Aimed at predicting the results of basketball games (e.g., game outcome, point spread, win percentage); (5) Article Type: Original, empirical, and peer-reviewed journal publications; (6) Time Window of Search: Covered articles published from January 1, 2019 to March 29, 2024, focusing on the last five years; and (7) Language: articles written in English.

Studies were excluded from the review if they met any of the following criteria: (1) Focused on games outside professional basketball leagues (e.g., intramural or intercollegiate competition in high school or college); (2) Centered on predictions unrelated to outcomes (e.g., predicting or evaluating sports performance or sports injuries); (3) Pertained to sports other than basketball (e.g., football, other team sports); (4) Lacked AI technology in the predictive modeling; (5) Failed to report data sources for basketball games; (6) Involved student-athletes rather than professional competitive athletes (e.g., NCAA participants); (7) Were not written in English; (8) Had no retrievable full text or were review articles; and (9) Were published before 2019.

Search strategy

A comprehensive keyword search was conducted across four electronic bibliographic databases: PubMed, Web of Science, Scopus, and EBSCO. The search algorithm comprised all possible combinations of keywords from two groups: (1) “artificial intelligence,” “computational intelligence,” “AI,” “machine intelligence,” “computer reasoning,” “machine learning,” “deep learning,” “neural network,” “neural networks,” “reinforcement learning,” and “intelligent systems;” and (2) “basketball.” Additionally, the Medical Subject Headings (MeSH) terms “artificial intelligence” and “basketball” were incorporated into the PubMed search. The detailed search algorithm is provided in S1 File.
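The pairing of the two keyword groups can be illustrated with a short sketch. The boolean syntax produced by `build_query` is a generic assumption and is not the exact string submitted to any of the four databases; the authoritative search algorithm remains the one in S1 File. The function names are hypothetical.

```python
from itertools import product

# Keyword groups as listed in the search strategy.
AI_TERMS = [
    "artificial intelligence", "computational intelligence", "AI",
    "machine intelligence", "computer reasoning", "machine learning",
    "deep learning", "neural network", "neural networks",
    "reinforcement learning", "intelligent systems",
]
SPORT_TERMS = ["basketball"]


def build_pairs(group_a, group_b):
    """Return every (term_a, term_b) combination across the two groups."""
    return list(product(group_a, group_b))


def build_query(group_a, group_b):
    """Assumed generic boolean form: (a1 OR a2 ...) AND (b1 OR b2 ...)."""
    left = " OR ".join(f'"{term}"' for term in group_a)
    right = " OR ".join(f'"{term}"' for term in group_b)
    return f"({left}) AND ({right})"


pairs = build_pairs(AI_TERMS, SPORT_TERMS)
query = build_query(AI_TERMS, SPORT_TERMS)
print(len(pairs), "keyword combinations")
print(query)
```

Because the second group contains a single term, the number of combinations equals the number of AI-related terms.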

Two coauthors independently screened the titles and abstracts of the articles identified through the keyword search, retrieving those that appeared to meet the inclusion criteria, and then thoroughly evaluated the full texts. Additionally, for the included articles, their references and citations were further screened, following a snowballing approach, until no new relevant studies were identified. The interrater agreement between the two coauthors was quantified using Cohen’s kappa (κ = 0.75), indicating substantial agreement. Any disagreements were resolved through discussion.
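For readers unfamiliar with the statistic, the following minimal sketch shows how Cohen's kappa is computed from two raters' screening decisions. The decision lists are hypothetical and are not the data behind the reported κ = 0.75.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty item list.")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # Both raters used one identical label throughout.
    return (p_o - p_e) / (1 - p_e)


# Hypothetical screening decisions (illustrative only).
reviewer_1 = ["include", "include", "exclude", "exclude",
              "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude",
              "include", "exclude", "exclude", "include"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3))
```

Kappa discounts the agreement expected by chance, which is why it is lower than the raw proportion of matching decisions.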

Data extraction and synthesis

A standardized data extraction form was used to collect each study’s key methodological and outcome variables. These variables included the first author, year of publication, country, AI techniques used, datasets source, number and types of seasons, number of teams, volume and types of data, model validation approaches, study quality assessment, feature selection, feature extraction, number and types of input features, predicted outcomes, performance metrics, and key findings. During the data synthesis process, following a thematic analysis of the included studies, the results were categorized and synthesized based on various types of predicted match outcomes, including binary outcomes such as win/loss, as well as continuous metrics like winning percentages, final scores, and score differentials. Furthermore, recognizing the significance of the AI methodologies employed, this review systematically synthesized the applied methods, organizing them into broader categories such as general AI approaches, traditional statistical techniques, ML algorithms, and DL models.

Study quality assessment

We used the National Institutes of Health’s Quality Assessment Tool for Observational Cohort and Cross-Sectional Studies to evaluate the quality of each included study [45]. This instrument scores each study based on 14 criteria. For each criterion, one point is awarded for a “yes” response and zero for any other response, including “no,” “not applicable,” “not reported,” or “unable to determine.” Summing the scores across all criteria yields a study-specific overall score ranging from 0 to 14. This quality assessment helps gauge the strength of the scientific evidence but does not determine the inclusion of studies. Two co-authors of this review independently conducted the quality assessments, with discrepancies resolved through discussion with the third co-author.
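The scoring rule described above can be sketched as follows. The function name and the example responses are hypothetical; only the rule itself (one point per "yes" across 14 criteria) comes from the text.

```python
N_CRITERIA = 14


def quality_score(responses):
    """Sum one point per "yes" response; every other response scores zero."""
    if len(responses) != N_CRITERIA:
        raise ValueError(f"Expected {N_CRITERIA} responses, got {len(responses)}.")
    return sum(1 for response in responses if response.strip().lower() == "yes")


# Hypothetical responses for one study (illustrative only).
example = ["yes"] * 9 + [
    "no", "not applicable", "not reported", "unable to determine", "yes",
]
score = quality_score(example)
print(score)
```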

Results

Identification of studies

Fig 1 presents the PRISMA flow diagram. An initial keyword search identified 1,482 articles, and after the removal of duplicates, 824 unique articles remained for screening based on their titles and abstracts. Of these, 768 articles were considered irrelevant and subsequently excluded from the review. The study selection criteria were then applied to the remaining 56 articles, leading to the exclusion of 30 studies for various reasons, such as publication outside the 2019–2024 period (n = 15), focus on non-professional basketball leagues (n = 8), absence of specific empirical data (n = 2), and lack of predicting game outcomes (n = 1). After a comprehensive review of the references and cited literature of the 26 included articles, 8 additional articles were identified as relevant and included in the review. Consequently, a total of 34 articles were ultimately included in this study, guaranteeing a more thorough and in-depth analysis of the subject [1–7,9,20–24,40,41,46–64].

Study characteristics

Table 1 comprehensively summarizes 34 studies that leverage various AI techniques to analyze basketball game data across different countries and periods. These studies were conducted in China (n = 12), the USA (n = 5), India (n = 5), Croatia (n = 3), Turkey (n = 3), the UK (n = 2), New Zealand (n = 1), Italy (n = 1), Sweden (n = 1) and Greece (n = 1) (Fig 2). The studies employ a diverse array of AI techniques, including random forests (RF) (n = 14), logistic regression (LR) (n = 12), NB (n = 7), support vector machines (SVM) (n = 6), KNN (n = 6), linear regression (n = 5), eXtreme Gradient Boosting (XGBoost) (n = 5), among others (Fig 3). Several studies utilize other ensemble methods and hybrid models, combining multiple AI techniques to enhance predictive accuracy and robustness. The primary focus is on NBA data (n = 29), with additional datasets from the CBA (n = 1), EuroLeague (n = 2), the Turkish Basketball Super League (n = 1), and combined datasets from both the NBA and the Turkish league (n = 1).

The data in these studies span 1980 to 2024, covering a range of 1–40 seasons, with the majority encompassing multiple seasons. The most frequently analyzed durations are 1 season and 5 seasons. Notably, two studies present unique cases: one does not specify the number of seasons analyzed, while another selects 5 NBA seasons and 1 Iranian Super League season. The data utilized in these studies are exclusively numeric, mainly from regular seasons. Dataset sizes vary greatly, and some studies do not specify the amount of data. Most studies analyze 30 teams, while others focus on different numbers. Typically, 60–90% of data is used for training, with the remainder for testing. Cross-validation, especially 5-fold and 10-fold, is widely employed to ensure robust evaluation. The complete Table 1 is available in S1 Table.
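As an illustration of the k-fold procedure mentioned above, the sketch below partitions game indices into five folds, each serving once as the test set. It is a generic shuffled k-fold written with the standard library, not the procedure of any particular included study; the sample size of 1,230 games is used only as an illustrative value.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples.")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=1230, k=5))
for train_idx, test_idx in folds:
    print(len(train_idx), "training games,", len(test_idx), "test games")
```

With five folds, each iteration trains on 80% of the games and tests on 20%, which falls inside the 60–90% training share reported above. Shuffling mixes earlier and later games, so a chronological split may be preferable when temporal leakage is a concern.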

Outcome measures

Table 2 provides a comprehensive analysis of the feature selection, types of input features, and predicted outcomes utilized in the 34 included studies. The studies employ various feature selection methods to identify the most relevant features from a large dataset. Commonly used methods include statistical-based selection techniques (e.g., Chi-square test, information gain, Gini score), regularization methods (e.g., Least Absolute Shrinkage and Selection Operator (LASSO)), correlation analysis (e.g., Pearson correlation coefficient, mutual information), and more advanced algorithms (e.g., backward elimination, feature importance in RF). Some studies rely on domain knowledge or manual selection to exclude redundant or irrelevant features.
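Correlation-based selection, one of the methods listed above, can be sketched in a few lines. The feature names and values are hypothetical, and ranking by absolute Pearson correlation is only one simple variant of the approaches used in the included studies.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # A constant column carries no linear signal.
    return cov / sqrt(var_x * var_y)


def rank_features(features, target):
    """Sort feature names by absolute correlation with the target, strongest first."""
    scores = {name: abs(pearson(column, target)) for name, column in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical per-game values (illustrative only).
features = {
    "field_goal_pct": [0.41, 0.45, 0.48, 0.52, 0.55, 0.58],
    "rest_days": [1, 3, 2, 1, 3, 2],
    "constant_flag": [1, 1, 1, 1, 1, 1],
}
won = [0, 0, 0, 1, 1, 1]
ranking = rank_features(features, won)
print(ranking)
```

A filter of this kind ignores interactions between features, which is one reason some studies prefer wrapper or embedded methods such as backward elimination or LASSO.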

Table 2. Feature information, analysis and predicted outcome of included studies.

https://doi.org/10.1371/journal.pone.0326326.t002

The input features primarily fall into three categories: team statistics, player statistics, and external factors related to the game schedule. Team statistics include points scored, shooting percentages, and defensive efficiency; player statistics encompass individual performance indicators like points, assists, rebounds, and fouls; external factors involve contextual elements like home-court advantage, season stage, and rest days. Some studies integrate advanced statistical models and metrics, such as Elo ratings, PageRank, and player impact ratings. As shown in Fig 4, the scatter-plot presents the relationship between the number of input features and the prediction accuracy. It can be seen from the figure that the relationship between the two is rather complex and not a simple linear one. When the number of input features is at a low level (around 0–20), the prediction accuracy fluctuates significantly, ranging widely from 58% to 99%. As the number of input features increases to the range of 20–60, the prediction accuracy is relatively concentrated between 65% and 80%. When the number of features exceeds 60, the number of sample points decreases, and the prediction accuracy roughly fluctuates between 65% and 80%.

Fig 4. Relationship between input features and prediction accuracy.

https://doi.org/10.1371/journal.pone.0326326.g004

The primary predicted outcomes include game-winner (win or loss), winning percentages of teams, game scores, and difference in scores. Specifically, game-winner predictions were made in 22 studies, winning percentages of teams were predicted in 4 studies, game scores were predicted in 6 studies, and differences in scores were predicted in 2 studies. The complete Table 2 is available in S2 Table.

Main findings

Table 3 summarizes the performance metrics and key findings of the studies included in this review. Four primary themes emerged from the analysis: game-winner predictions, team-winning percentages, game scores, and point differences. These themes reflect the diverse applications of AI models in predicting various aspects of basketball game outcomes. The complete Table 3 is available in S3 Table, Fig 5.

Table 3. Performance metrics and key findings of included studies.

https://doi.org/10.1371/journal.pone.0326326.t003

Game winner (win or loss).

Over the past five years, the application of AI models in predicting the outcomes of sports competitions has garnered widespread attention. The most commonly used models include RF, SVM, and ANN. The multilayer perceptron (MLP) model achieved the highest accuracy at 98.9% [53], while RF models also performed strongly, up to 93.8% [52]. Traditional models like linear regression showed lower performance, with accuracies between 63.3% and 67.2% [51].

Feature selection significantly enhances model accuracy, with methods like LASSO and multiple regression improving performance by identifying key features such as field goal percentage and defensive rebounds [9,51]. However, model adaptability across different datasets remains a challenge. For instance, the AdaBoost model’s accuracy dropped from 75% during cross-validation to 66.8% on an independent test set [49], underscoring the need for further validation and optimization across diverse scenarios. Predictive accuracy was improved to 78% by dynamically adjusting training data based on an extended team efficiency index, enhancing the model’s practical applicability [6]. However, broader application and generalization across diverse datasets still require further validation and optimization.

Winning percentages of teams.

Several studies have shown that AI models, such as neural networks (NN) and ensemble learning models, excel in predicting NBA winning percentages. For example, an NN with 64 parameters outperformed traditional linear regression, achieving a root mean square error (RMSE) of 0.034 and an R² of 0.954 [48]. Similarly, an ANN model achieved 68.08% accuracy, surpassing linear regression’s 66.24% [55]. Ensemble models further enhanced accuracy, achieving an R² of 0.9332 and an RMSE of 0.0364 [56].
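The two regression metrics quoted above can be computed as in the following sketch. The winning percentages are hypothetical and do not reproduce any study's results.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observed values are equal.")
    return 1 - ss_res / ss_tot


# Hypothetical season winning percentages (illustrative only).
actual = [0.30, 0.45, 0.50, 0.62, 0.73]
predicted = [0.33, 0.43, 0.52, 0.60, 0.70]
print(round(rmse(actual, predicted), 4), round(r_squared(actual, predicted), 4))
```

RMSE is expressed in the units of the target (here, winning percentage as a fraction), whereas R² is unitless, so the two are best read together.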

Feature selection and model optimization were crucial for improving model performance. Methods like one-hot encoding and selecting key features such as point differential and offensive efficiency significantly enhanced predictions [55,56]. Additionally, the adaptability of these models was validated in dynamic environments [59], with ensemble methods proving particularly effective in practical applications [56].

Game scores.

XGBoost is widely recognized as one of the best-performing models for predicting game scores. Studies have consistently shown its efficiency in handling large datasets and complex features. For example, a two-stage XGBoost model achieved a mean absolute percentage error (MAPE) of 0.0818, outperforming other models [40]. Similarly, after feature selection, XGBoost’s MAPE improved from 2.44 to 2.13, further validating its accuracy [57]. Another study confirmed XGBoost’s superior performance when weighted by lagged game information [2].
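Since MAPE is the metric used to compare score-prediction models here, a minimal sketch of its calculation may help. MAPE is sometimes reported as a fraction and sometimes as a percentage, so the hypothetical `as_percent` flag exposes both conventions; the scores are illustrative only.

```python
def mape(y_true, y_pred, as_percent=False):
    """Mean absolute percentage error, as a fraction or as a percentage."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("Inputs must be non-empty and of equal length.")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when an observed value is zero.")
    ratio = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
    return ratio * 100 if as_percent else ratio


# Hypothetical final scores (illustrative only).
actual_scores = [100, 110, 120, 125]
predicted_scores = [95, 121, 114, 125]
print(mape(actual_scores, predicted_scores))
print(mape(actual_scores, predicted_scores, as_percent=True))
```

When comparing MAPE values across studies, it is worth checking which of the two scales each study uses.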

Feature selection is critical for optimizing model performance. Incorporating nonlinear features and domain knowledge significantly enhanced predictive accuracy [4,50]. Optimizing feature sets is essential for improving model efficiency and effectiveness in practical applications. To enhance adaptability and practicality, studies explored integrating real-time data and adaptive features, leading to more reliable predictions in dynamic environments [1,2]. These improvements suggest that XGBoost and similar models can provide accurate predictions even in the complex and rapidly changing context of sports competitions.

Point difference.

AI technologies have shown significant potential in predicting point differentials in basketball games, with each model exhibiting unique strengths and weaknesses. A study compared data snapshot methods, Long Short-Term Memory (LSTM) and Generalized Linear Models (GLM) for predicting point differentials [46]. While the initial prediction error was around 11 points, it decreased to approximately 2 points by the game’s final minute [46]. Despite comparable accuracy to LSTM and GLM, the data snapshot method offered greater computational efficiency, making it ideal for real-time applications like betting and in-game coaching decisions [46]. A linear regression model incorporating team ability and home advantage was used, reducing the prediction error to 11.98 points after weighted adjustments [47]. This model effectively predicted the outcomes of 13 out of 15 games during the Golden State Warriors’ 2016–17 season [47]. However, factors such as player injuries, trades, and coaching changes can affect these models’ accuracy, highlighting the importance of integrating dynamic variables for more accurate predictions [47].

Methodological review

Overview of AI

Among sports applications, predicting the outcome of professional basketball games has become a prominent showcase of AI technology’s capabilities, transforming a task that traditionally relied on human intuition and experience. The successful application of AI technologies such as ML and DL, with their powerful data processing and analysis capabilities, has significantly enhanced prediction accuracy in this field.

Traditional statistical approaches

Traditional statistical methods, including linear regression, LR, and multiple linear regression (MLR), remain crucial in basketball game outcome prediction due to their simplicity, interpretability, and effectiveness in capturing linear relationships. While these methods may struggle with complex, nonlinear data compared to ML approaches, they still offer significant predictive power in specific scenarios.

Linear regression.

Linear regression is commonly used to predict continuous outcomes like point differentials or team win rates. A study found that while linear regression effectively modeled linear relationships, its accuracy declined with nonlinear features [48]. Another study showed that linear regression was practical during regular season predictions but less accurate in high-pressure playoff environments [47].

Logistic regression.

LR is widely used for binary classification tasks, such as predicting game wins or losses. Studies demonstrated LR’s reasonable accuracy, although more complex models like RF often perform better [58,63]. LR’s strength lies in its simplicity and robustness across various datasets.

Multiple linear regression.

The application of MLR in NBA game outcomes prediction involves analyzing offensive and defensive metrics to model team performance. Yao (2019) demonstrated that MLR models effectively identified critical variables such as field goal percentage and defensive parameters, achieving reliable predictions after addressing multicollinearity through parameter elimination and VIF validation [48]. However, MLR’s linear assumptions limited its accuracy compared to NNs when handling nonlinear relationships in larger datasets [48]. In contrast, Sikka (2022) integrated MLR into an ensemble model with other algorithms, where it contributed interpretability by highlighting key predictors like points per game and efficiency differentials, though the ensemble’s superior accuracy stemmed from combining MLR’s transparency with non-linear methods [56].

Machine learning

ML techniques can be categorized into unsupervised learning and supervised learning. For predicting game outcomes, supervised learning methods are primarily used to predict specific results such as win/loss and game scores. Unsupervised learning is typically utilized for data preprocessing and feature extraction to enhance the performance of prediction models. This section summarizes the use of various supervised learning algorithms in the included studies, focusing on RF, KNN, SVM, DT, and Ensemble learning.

Random forests.

RF, an ensemble learning method based on DT, has demonstrated strong performance in predicting basketball outcomes. RF’s best reported accuracy reached 93.81% in predicting game results and team win rates [52]. RF is also effective in feature selection and model optimization, where combining Sequential Forward Selection and Recursive Feature Elimination (RFE) improved prediction accuracy to 67.98% [60].

RF is often used in hybrid models to enhance predictive capabilities. Studies successfully integrated RF with other algorithms, achieving high prediction accuracies across various basketball leagues [62]. Another study further demonstrated the benefits of combining RF with Graph Convolutional Networks (GCN), improving prediction accuracy to 71.54% by considering spatial structures [7]. RF’s dynamic prediction capability is another key advantage. In one study, RF’s accuracy improved from 61.5% at the start of a game to 69.8% by the fourth quarter, highlighting its potential for real-time analysis [64].

K-Nearest neighbors.

KNN is an instance-based, non-parametric algorithm known for its simplicity and effectiveness, particularly in predicting basketball game outcomes. KNN’s core principle involves measuring the distance to the nearest neighbors and making predictions through voting or averaging, making it well-suited for small datasets with well-defined features.
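The distance-and-vote principle described above can be sketched as follows. The two features are hypothetical, already standardized values (so that neither dominates the Euclidean distance), and the labels are illustrative rather than drawn from any included study.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points.")
    # Order training points by Euclidean distance to the query point.
    neighbours = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical standardized features: (field goal %, defensive rebounds).
train_x = [(-1.2, -0.8), (-0.9, -1.1), (-0.4, -0.6),
           (0.5, 0.7), (0.9, 1.2), (1.3, 0.4)]
train_y = ["loss", "loss", "loss", "win", "win", "win"]

print(knn_predict(train_x, train_y, query=(0.8, 0.9)))
print(knn_predict(train_x, train_y, query=(-1.0, -1.0)))
```

Every prediction scans the full training set, which illustrates why the cost of KNN grows with dataset size.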

A study demonstrated that KNN’s accuracy ranged from 57.90% to 60.01%, with improvements when recent game data were incorporated [41]. Despite a slightly higher MAPE than models like XGBoost, KNN still shows strong predictive capabilities [40].

However, KNN’s computational cost increases with larger datasets, highlighting the need for optimization or combining it with other algorithms in large-scale applications. KNN’s flexibility allows it to adapt to various datasets and tasks, particularly when enhanced by feature selection, cross-validation, and hybrid models [40,41,61].

Support vector machines.

SVMs are widely used in predicting basketball game outcomes due to their ability to handle high-dimensional and nonlinear data. SVMs achieve precise classification by maximizing the margin between classes. For instance, an SVM with a Radial Basis Function Kernel achieved 84% accuracy, outperforming traditional methods [3]. While SVM’s accuracy is slightly lower than RF [63], it excels in managing nonlinear feature relationships.

Feature selection further enhances SVM performance. For instance, in basketball outcome prediction studies, Li (2020) demonstrated that applying LASSO-based feature selection to SVM significantly improved accuracy by prioritizing high-impact metrics such as defensive rebounds, opponent free throws attempted, and home/away team assists, which effectively reduced model noise and amplified discriminative patterns [51]. Zheng (2022) further corroborated this by showing SVM models integrated with sequential forward selection on hybrid features (e.g., tiredness-adjusted Elo ratings and recent performance differentials) achieved enhanced generalization, as feature pruning mitigated overfitting while emphasizing latent interactions between team fatigue dynamics and skill metrics [60]. These studies collectively affirm that targeted feature engineering and selection optimize SVM’s capacity to delineate critical decision boundaries in complex, high-dimensional sports datasets.

Decision trees.

DTs are widely used in predicting basketball game outcomes due to their intuitive and interpretable nature. DTs build a tree-like model by recursively partitioning data, making them practical for classification and regression tasks. Although they are prone to overfitting and data noise, their simplicity and interpretability have made them valuable in various studies.
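The recursive partitioning described above repeats one basic step: choosing the threshold that best separates the classes. The sketch below performs that single step for one feature using Gini impurity, which is one common splitting criterion and is assumed here for illustration; the data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity."""
    best_threshold, best_impurity = None, float("inf")
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Hypothetical defensive rebounds per game and outcomes (illustrative only).
def_rebounds = [28, 30, 31, 33, 36, 38, 40, 41]
outcome = ["loss", "loss", "loss", "win", "win", "loss", "win", "win"]
threshold, impurity = best_split(def_rebounds, outcome)
print(threshold, round(impurity, 3))
```

A full tree applies the same search to each resulting subset until a stopping rule is met; without such a rule the tree can fit noise, which is the overfitting risk noted above.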

It was found that while DTs had slightly lower accuracy (53.37% to 54.66%) compared to more complex models like KNN, their performance improved with up-to-date game data [41]. Similarly, Patrot (2023) highlighted that despite DTs achieving a relatively lower accuracy (69%) than other models such as Linear Regression (92%) and SVM (85%), they proved effective in identifying critical game-influencing features, such as defensive rebounds and three-point percentage, through feature selection, which enhanced their utility for strategic analytics despite their modest prediction rates [24]. DTs are flexible in feature selection and data processing, and they can effectively assess the importance of features, making them particularly useful for quickly screening multiple features. Combining DT-based methods with feature selection enhances predictive capabilities for critical features like defensive metrics [7].

DTs are often integrated into hybrid models to boost predictive performance. It has been demonstrated that while DTs’ standalone performance might be limited, they enhance decision-making processes when combined with other models, improving overall accuracy in ensemble frameworks [56].

Ensemble learning.

Ensemble learning is a powerful technique for predicting basketball game outcomes, combining multiple models to enhance accuracy, robustness, and generalization. Strategies like Bagging, Boosting, and Stacking have proven effective in handling complex, multidimensional data.

For example, an 84% prediction accuracy was achieved using a hybrid ensemble framework [3]. The stability of integrating models like RF, LR, and XGBoost across various leagues was demonstrated [62]. Ensemble learning also plays a crucial role in feature selection and model optimization, as seen in a study that combined multiple algorithms to predict NBA win rates with high accuracy (R² = 0.9332) [56]. Comparative model analyses in NBA studies reveal how heterogeneous algorithms like RF (dependent on historical player metrics) and LR (focused on short-term dynamics) provide complementary strengths, offering a foundation for ensemble frameworks to address playoff complexity. Simultaneously, novel features such as fatigue quantification [60] and multi-model optimized input designs [58] further validate ensemble adaptability in capturing contextual and temporal interactions.
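As a minimal illustration of combining models, the sketch below uses hard (majority) voting, a simpler aggregation scheme than the Bagging, Boosting, and Stacking strategies named above. The three base models are hypothetical hand-written rules standing in for trained classifiers, and the feature names are assumptions made for this example only.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model class predictions for one game by majority vote."""
    counts = Counter(predictions)
    top = max(counts.values())
    # Ties are broken in favor of the earliest model in the list.
    for label in predictions:
        if counts[label] == top:
            return label

def ensemble_predict(models, game):
    """Apply every base model to one game and vote on the outcome."""
    return majority_vote([model(game) for model in models])

# Hypothetical base models: simple rules standing in for trained classifiers.
def by_elo(game):
    return 1 if game["home_elo"] > game["away_elo"] else 0

def by_form(game):
    return 1 if game["home_recent_wins"] >= game["away_recent_wins"] else 0

def by_rest(game):
    return 1 if game["home_rest_days"] > game["away_rest_days"] else 0

game = {"home_elo": 1550, "away_elo": 1600,
        "home_recent_wins": 7, "away_recent_wins": 5,
        "home_rest_days": 2, "away_rest_days": 1}
prediction = ensemble_predict([by_elo, by_form, by_rest], game)  # 1 = home win
```

Here the rating-based rule favors the away team, but it is outvoted by the two rules based on recent form and rest, which is the kind of complementary behavior the ensemble studies above rely on.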

Deep learning

To achieve more accurate predictions of basketball game outcomes, an increasing number of studies have employed DL techniques to process and analyze game-related data. The following DL approaches have been widely utilized:

(1) LSTM: A variant of RNN, LSTM is particularly effective in handling dependencies in time series data, making it suitable for capturing dynamic game changes over time. However, its computational efficiency can be a challenge for real-time applications. Kayhan (2019) compared LSTM with General Linear Models for predicting point spreads and found that while both models performed well, LSTM was computationally less efficient, making real-time decision-making more difficult [46].

(2) GCN: Designed for processing graph-structured data, GCNs are particularly effective in capturing relationships between players and team structures. This makes them well-suited for modeling complex interaction patterns in basketball games. Zhao et al. (2023) combined GCN with RF, achieving a prediction accuracy of 71.54%, demonstrating the model’s ability to enhance game outcome predictions by leveraging relational data [7].

(3) MLP: A fully connected NN model, MLP is commonly used for game outcome prediction due to its ability to learn patterns from structured numerical data. However, its performance can vary based on the dataset and model configuration. Balli and Ozdemir (2021) reported that MLP achieved an impressive 98.90% accuracy in predicting EuroLeague outcomes [53], whereas Santos et al. (2022) found that RF slightly outperformed MLP in NBA game predictions, with respective accuracies of 69.88% and 68.85% [58].

(4) Hybrid Neural Networks: These models integrate multiple DL techniques to leverage their combined strengths, improving predictive accuracy in complex datasets. Khanmohammadi et al. (2022) introduced MambaNet, a hybrid neural network architecture that combines different DL models [54]. Their study demonstrated that MambaNet achieved superior performance in basketball game prediction, with AUC scores ranging from 0.72 to 0.82 [54].

Study quality assessment

S4 Table reports criterion-specific and global ratings from the study quality assessment. On average, the included studies scored 10 out of 14 (range: 9–11). All studies clearly stated their research questions or objectives, specified and defined their study populations, maintained a participation rate of at least 50%, uniformly applied inclusion and exclusion criteria to participants, maintained an attrition rate of 20% or less, measured exposures prior to outcomes, ensured clear definition and consistent application of outcome measures, and implemented clearly defined, valid, and reliable exposure measures consistently across participants. Most studies had a sufficient timeframe to observe an association between exposure and outcome (n = 27). Only a few studies (n = 7) statistically adjusted for key potential confounding variables, and even fewer (n = 2) assessed exposures more than once over time. However, none of the studies examined different levels of exposure in relation to the outcomes, nor did they have outcome assessors blinded to the exposure status of participants.

Discussion

This study systematically reviewed the application of AI techniques in predicting the outcomes of professional basketball games. The findings indicate that AI models are highly effective in predicting basketball game outcomes, often surpassing traditional statistical methods in accuracy. Studies focused on achieving higher accuracy by comparing different AI models and employing feature selection and extraction methods before algorithm application. The review of 34 articles revealed a growing trend in adopting advanced ML models to tackle these complex tasks. However, the effectiveness of AI in predicting basketball game outcomes is influenced by several factors, which are discussed in the following sections.

Comparative analysis of AI techniques and their effectiveness

This review reveals that the effectiveness of AI techniques in predicting basketball game outcomes largely depends on the quality of the data, the relevance of the selected features, and the available computational resources. Although RF models using four factors achieved up to 93.81% accuracy in one study [52], RF accuracy was much lower in others. This suggests that while the RF model is robust, it may still be prone to overfitting when the data distribution or features differ [65,66]. Moreover, the success of AI models like NN and XGBoost often depends more on the quality and relevance of input features than on the complexity of the models themselves. While advanced methods, such as combining GCNs with RF, have improved accuracy (up to 71.54%), they have also increased model complexity and computational demands [7]. This underscores the need to develop models that are both accurate and computationally efficient.

Simple models like KNN are attractive for small datasets, but their computational intensity makes them impractical for large-scale datasets [41]. Similarly, while SVMs are effective for handling high-dimensional data, their stringent requirements for hyperparameter tuning limit their use in real-time decision-making [51]. Ensemble and hybrid models often provide robust predictions by combining multiple AI techniques [67], though they add complexity and may be less suited for real-time applications [3,40].

Given that the trade-off between model complexity and computational efficiency is particularly critical in real-time applications requiring rapid decision-making, future research should focus on developing techniques such as model pruning, quantization, and hardware-specific optimizations (e.g., graphics processing unit acceleration) to maintain high predictive performance while reducing computational load [68]. Additionally, multi-task learning models, which can predict multiple related outcomes simultaneously, offer a way to maximize data utilization and capture the complex relationships between different aspects of a game, thereby enhancing model adaptability and improving their generalization across different leagues [69].

Impact of data processing and feature selection on model performance

Effective data processing and feature selection are crucial for enhancing AI model performance in predicting basketball game outcomes. Data processing steps like cleaning, normalization, and feature engineering influence model accuracy and robustness. Handling missing values and outliers is essential for maintaining data quality, while normalization techniques, such as standardization, improve prediction consistency and reliability [5,22]. The ability to process and integrate unstructured data is also crucial. In addition to structured data such as player statistics and game scores, there is a vast amount of unstructured data (e.g., game videos, social media posts, and basketball news). For instance, applying convolutional neural networks to video data can identify specific players or evaluate player movements [70], while using natural language processing to analyze coach comments or player interviews can reveal psychological factors [71] that may affect game performance, thereby enhancing predictions by integrating this information with structured data.
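The cleaning and standardization steps mentioned above can be sketched as follows. This is a minimal illustration using mean imputation and z-score standardization; the reviewed studies may have handled missing values differently, and the column of values is hypothetical.

```python
from statistics import mean, pstdev

def impute_mean(column):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]  # a constant column carries no information
    return [(v - mu) / sigma for v in column]

# Hypothetical points-per-game column with one missing entry.
points = [100, None, 110, 90]
cleaned = impute_mean(points)   # the gap is filled with the observed mean
scaled = standardize(cleaned)
```

In a real evaluation, the imputation mean and the scaling parameters would be computed on the training data only and then reused on validation data, so that no information leaks from the held-out games.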

Feature selection significantly enhances model performance by identifying the most relevant features [72]. Methods like Information Gain, Chi-Square Test, and RFE have proven effective in improving computational efficiency and reducing overfitting [6,60,61]. However, it was also shown that traditional feature selection methods, such as Information Gain and Chi-Square Test, typically focus on the statistical significance of individual features, potentially overlooking complex interactions among multiple variables [73]. In contrast, model-based feature selection methods, such as DT and RF, inherently account for interactions between features, leading to more accurate predictions of game outcomes [74,75]. ML approaches like Generalized Additive Models can also effectively model nonlinear interactions between variables [76], which is particularly important in basketball, where outcomes are often determined by the combined effects of multiple factors. Future research should focus on further developing and optimizing feature selection methods that can capture feature interactions to improve the predictive accuracy of models.
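To make the Information Gain criterion concrete, the following sketch scores two discrete features against a win/loss label. The data and feature names are hypothetical and chosen only so that the result is easy to check by hand.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Entropy reduction obtained by partitioning labels on a discrete feature."""
    n = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical toy data: 1 = win, 0 = loss.
wins = [1, 1, 0, 0]
home_game = [1, 1, 0, 0]      # perfectly aligned with the outcome
back_to_back = [1, 0, 1, 0]   # unrelated to the outcome
gain_home = information_gain(home_game, wins)
gain_b2b = information_gain(back_to_back, wins)
```

Because each feature is scored in isolation, a criterion of this kind cannot detect a feature that matters only in combination with another, which is the limitation of univariate methods noted above.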

The review found that feature sets ranging from 2 to 110 features often provided the best balance between accuracy and model complexity, with accuracy rates between 53.4% and 98.9%. While larger feature sets can sometimes marginally improve accuracy, they also increase model complexity, making them less practical for real-time applications [48]. Feature selection also enhances computational efficiency and interpretability, which is critical for real-time applications where resources are limited [64].

In conclusion, data processing and feature selection are foundational for developing effective AI models in basketball game prediction. These processes improve accuracy, efficiency, and interpretability, making models more applicable in real-world scenarios. Future research should focus on balancing model complexity and performance to optimize AI models in sports analytics.

Methods of model validation and performance metrics

Effective model validation and performance evaluation are crucial for assessing the robustness and accuracy of AI models in predicting basketball game outcomes. The reviewed studies widely used cross-validation techniques, particularly k-fold cross-validation, to ensure comprehensive model assessment. This method, which involves rotating dataset subsets as validation sets, provides a more reliable performance estimate than a single train-test split [41]. For example, Sikka et al. (2022) evaluated their voting ensemble model, which integrated five base algorithms, with 5-fold and 10-fold cross-validation and reported high prediction accuracy, supporting the use of cross-validation to guard against overfitting and to obtain reliable performance estimates for basketball win percentage prediction [56].
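The rotation of validation subsets in k-fold cross-validation can be sketched as below. This minimal version uses contiguous folds without shuffling, which is an assumption made for simplicity; the `fit_and_score` callback is a hypothetical placeholder for training a model on the training indices and scoring it on the validation indices.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Distribute the remainder so that fold sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop

def cross_validate(n_samples, k, fit_and_score):
    """Average the score returned by fit_and_score over all k folds."""
    scores = [fit_and_score(train, val) for train, val in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

folds = list(k_fold_indices(10, 5))
```

Each sample appears in exactly one validation fold, so every game contributes once to the performance estimate and k − 1 times to training.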

Incorporating real-time data into model validation further enhances accuracy by accounting for the dynamic nature of basketball games. It was shown that dynamically updating models with current game data improved prediction accuracy [64], while static models relying solely on historical data often failed to adapt to ongoing game nuances [62]. Performance metrics such as accuracy, precision, recall, F1 score, Mean Absolute Error (MAE), RMSE, and Area Under the Receiver Operating Characteristic Curve (AUC-ROC) were commonly used to evaluate AI models. The choice of metric varied depending on the prediction task. For instance, precision, recall, and the F1 score were emphasized for predicting rare events like upsets [9]. RMSE was frequently used to assess the precision of continuous predictions like total game scores or point differentials [40]. Selecting appropriate performance metrics that align with the prediction task’s goals is essential. While AUC-ROC is helpful for binary classification [7], it may be less effective for continuous outcomes or multi-class predictions.
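The classification and regression metrics named above follow standard definitions, sketched here on hypothetical data (upset labels and point differentials invented for illustration).

```python
from math import sqrt

def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 score for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def mae(y_true, y_pred):
    """Mean absolute error of continuous predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error of continuous predictions."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical upset labels (1 = upset) and predicted labels.
upsets_true = [1, 0, 0, 1, 0, 0]
upsets_pred = [1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(upsets_true, upsets_pred)

# Hypothetical true and predicted point differentials.
margin_true = [5, -3, 10, -8]
margin_pred = [2, -1, 7, -12]
margin_mae = mae(margin_true, margin_pred)
margin_rmse = rmse(margin_true, margin_pred)
```

In this toy example four of six games are classified correctly, yet precision and recall for the upset class are both only 0.5, which illustrates why accuracy alone can be misleading for rare events.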

In conclusion, robust and context-specific validation methods and performance metrics are necessary to develop accurate and applicable AI prediction models for basketball game predictions. Cross-validation combined with real-time data integration provides a reliable evaluation of model robustness. Future research should refine validation techniques and explore more sophisticated metrics to capture the complexities of basketball predictions.

Static vs. dynamic predictions and comparative league analysis

Static prediction models rely on historical data and pre-game statistics, offering considerable accuracy in forecasts but lacking the ability to adapt to in-game changes like injuries or strategy shifts, leading to decreased accuracy as the game progresses [41,47,51].

In contrast, dynamic prediction models adjust predictions in real time, using updated data such as player fatigue and tactical changes to enhance accuracy. For example, it was shown that dynamic models could improve prediction accuracy from 62% at the start to 78% by the final quarter [64]. However, these models are less studied due to challenges in real-time data acquisition and computational efficiency [46]. Additionally, real-time predictive models must be capable of quickly integrating new information, such as sudden injuries, fouls, or changes in team strategy, which can significantly alter the course of a game. Delays in data processing and model inference can be a critical bottleneck, especially in high-stakes scenarios like live betting or in-game coaching decisions.

Another key observation in this review is the heavy concentration of existing research on the NBA, with 29 out of 34 studies focused specifically on this league. This NBA-centric focus raises concerns about the generalizability of AI models to other leagues that feature different playing styles, structures, and levels of competition, such as the CBA [3] or the defense-oriented European leagues [62]. For instance, defensive metrics may play a more significant role in European leagues [77], while offensive efficiency metrics might be more critical in the NBA [78]. Additionally, the structure of the league, including season length, playoff format, and level of competition, can also impact model performance. Consequently, AI models trained on NBA data may not perform as effectively when applied to data from other leagues without significant adjustments. Future research should broaden its scope to include a wider range of leagues and game formats, focusing on developing adaptive AI models that can be fine-tuned or retrained to account for these inter-league differences. For example, incorporating transfer learning techniques, where a model trained on data from one league is adapted to another, could enhance the versatility of AI models and their applicability across global leagues [79,80].

Identification of key features and prediction outcome types

Identifying relevant game features is crucial for improving prediction accuracy in basketball outcomes. Key metrics such as shooting efficiency (field goal percentage, three-point percentage), turnovers, and defensive rebounds significantly impact game results [9,40,52]. For instance, defensive rebounds are critical as they limit opponents’ scoring opportunities and enhance a team’s control over the game [7]. Field goal percentage is another crucial predictor, reflecting a team’s offensive efficiency. Advanced metrics like effective field goal percentage and offensive efficiency ratings provide a deeper understanding of scoring capabilities [62]. Additionally, rating systems like Elo ratings, PageRank, and player impact ratings improve predictive accuracy by dynamically capturing team and player strengths [62]. Different prediction tasks cater to various stakeholders. Binary win/loss predictions are relevant for casual viewers and bettors, while predictions involving scores and point differentials offer deeper insights valuable for strategic decisions by coaches and managers. The accuracy of these predictions depends on data quality, feature selection, and analysis techniques.
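As an illustration of how a rating system captures team strength dynamically, the sketch below implements the standard Elo update. The update factor `k=20` and the starting ratings are assumptions for this example; the reviewed studies use their own variants (for instance, tiredness-adjusted ratings), which are not reproduced here.

```python
def expected_score(rating_a, rating_b):
    """Probability that team A beats team B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update_elo(rating_a, rating_b, a_won, k=20):
    """Return both teams' ratings after one game (k is an assumed update factor)."""
    expected_a = expected_score(rating_a, rating_b)
    delta = k * ((1.0 if a_won else 0.0) - expected_a)
    return rating_a + delta, rating_b - delta

# Hypothetical example: two equally rated teams, team A wins.
new_a, new_b = update_elo(1500, 1500, a_won=True)
```

The rating difference before a game can then be used as a single input feature, and because ratings are updated after every game, the feature tracks changes in team form over the season.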

Integrating critical features into AI models is essential for improving their performance. Combining metrics like defensive rebounds, turnovers, and shooting efficiency enhances a model’s predictive power by offering a comprehensive view of team performance [7]. Ongoing feature selection and engineering refinement will continue to develop more accurate AI models for basketball predictions.

Comparison with previous research studies

Several reviews have examined AI applications in predicting basketball outcomes. For example, a review highlighted high-performing models like LR and Hybrid Fuzzy-SVM, with 93.20% and 88.26% accuracy, respectively [42]. Our review, however, emphasizes the superior effectiveness of advanced models like MLP and RF, achieving accuracies of 98.90% and 93.81%.

Another review focused on AI in sports betting, noting the versatility of models [81]. Our findings align with and extend this discussion, highlighting the superior accuracy of hybrid and ensemble techniques, particularly in complex tasks like real-time game outcome forecasting. Similarly, while another review discussed ensemble methods achieving 84% accuracy [43], our study shows that models using real-time data and dynamic updates, such as LSTM networks and GCNs, offer even higher predictive accuracy.

Building on a previous review that examined ML in various sports [82], our study focuses on basketball, underscoring the importance of feature engineering and real-time data integration. Ensemble techniques, like combining RF with GCNs, further advance AI methodologies in basketball analytics [7].

Insights from traffic flow prediction for basketball game forecasting

In the realm of basketball game outcome prediction, the dynamic and complex nature of the data poses significant challenges. Drawing on the dynamic analysis methods used in traffic flow forecasting can offer valuable insights for predicting basketball game outcomes.

Firstly, when exploring the “spatiotemporal” correlation of traffic flow scheduling, traffic flow prediction mainly focuses on the relationship of flow changes at different times and on different road sections [83,84]. In basketball games, the schedule can be analogized to “time”, and different opponents and game locations can be analogized to “space”. In the future, by analyzing the performance of teams at different stages of the season (such as the start, middle, and end of the season), against different opponents, and in home and away games, a model similar to the dynamic spatiotemporal connection in traffic flow can be constructed to capture the patterns of team form changes.

Secondly, in terms of dynamic weight allocation, in traffic flow prediction, the Spatio-Temporal Attention Unit controls the neighbor aggregation weights and integrates the spatiotemporal characteristics of different neighbors [83]. In the future, in basketball game prediction, learning can be enhanced by assigning dynamic weights to different game data. For example, at the critical moments of a game, the weights of scoring and rebounding data increase; at the beginning of a game, the weight of the historical head-to-head records of the two teams is greater. By dynamically adjusting the weights, the model can more reasonably integrate various data from different stages of the game, thus improving the prediction accuracy.
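The dynamic weighting idea above could be sketched, under strong simplifying assumptions, as a blend of a pre-game estimate and an in-game estimate whose weight shifts as the game progresses. The linear schedule and the function name are hypothetical; an attention-based model such as the one cited would learn these weights from data rather than fix them by rule.

```python
def blended_win_probability(pregame_prob, ingame_prob, elapsed_fraction):
    """Blend a pre-game estimate with an in-game estimate.

    The weight on the in-game estimate grows linearly with elapsed game time;
    this linear schedule is an illustrative assumption, not a learned weight.
    """
    if not 0.0 <= elapsed_fraction <= 1.0:
        raise ValueError("elapsed_fraction must lie in [0, 1]")
    weight_ingame = elapsed_fraction
    return (1.0 - weight_ingame) * pregame_prob + weight_ingame * ingame_prob

# Hypothetical estimates: history favors the home team, live data do not.
at_tipoff = blended_win_probability(0.60, 0.40, 0.0)
at_halftime = blended_win_probability(0.60, 0.40, 0.5)
at_final = blended_win_probability(0.60, 0.40, 1.0)
```

At tip-off the prediction rests entirely on historical information, and by the end of the game it rests entirely on in-game data, mirroring the shift in emphasis described above.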

Finally, in terms of capturing the impact of unexpected events, traffic flow can be affected by unexpected incidents such as traffic accidents. Similarly, in basketball games, there are comparable situations, such as key players getting injured or sudden technical fouls. In the future, a similar mechanism can be established to promptly capture these “unexpected events”. Just as traffic flow prediction takes into account the impact of abnormal factors on traffic flow, we can assess their impact on the game trend and incorporate them into the prediction model. This will make the prediction results more consistent with the actual dynamic changes in the game.

Limitations and directions for future research

This review highlights the use of AI techniques in predicting professional basketball game outcomes over the past five years. However, several recurring limitations should be noted across the included studies. Most research has focused primarily on the NBA, which limits the generalizability of the findings to other leagues such as the CBA, EuroLeague, and Turkish Basketball Super League. Moreover, there is a noticeable gap in research on women’s professional basketball games and 3x3 Basketball, which have unique dynamics and strategies that differ from traditional men’s 5-on-5 games. The lack of attention to these areas restricts the applicability of AI models across the broader spectrum of basketball competitions.

Methodologically, many studies relied on data from a single season [2,3,40,64], which may compromise the stability and universality of the results. Moreover, some studies did not adequately address feature engineering—failing to implement or report systematic methods for feature extraction and selection, such as correlation-based analysis, statistical selection techniques, or regularization methods [2,10,20,47,53,63]. Effective feature engineering is essential to identify the most influential variables, eliminate redundancies, and improve model interpretability and generalizability. Furthermore, while most studies focused on structured game statistics, few considered unstructured or contextual data such as player injuries, coaching decisions, or game location [9,22,47,50,51,60], which can significantly affect match outcomes.

Another significant limitation is the tendency of AI models, especially those utilizing DL techniques, to overfit the training data. While these models may perform exceptionally well on the data they were trained on, their performance often diminishes when applied to new, unseen data. This overfitting issue highlights the challenge of ensuring that AI models generalize well across different datasets and real-world scenarios. Additionally, many reviewed studies focus on pre-game predictions, with fewer efforts dedicated to real-time, in-game predictions. Real-time predictions are crucial for applications such as live betting and in-game strategy adjustments, yet they pose challenges regarding data acquisition, processing speed, and model adaptation. Developing models that can integrate and process real-time data effectively remains crucial for future research.

This review itself also has certain limitations that should be acknowledged. The included studies were limited to those published in English and retrieved from a select group of academic databases, which may have excluded relevant research published in other languages or sources. Moreover, the review emphasizes performance metrics such as accuracy, while paying less attention to the interpretability, transparency, and real-world applicability of AI systems.

Future research should expand data sources to include a broader range of leagues and international competitions, enhancing AI models’ generalizability. Incorporating data from women’s basketball leagues and 3x3 basketball competitions will provide additional insights and help create more universally applicable models. Furthermore, advanced validation techniques, such as nested cross-validation and time-series cross-validation, should be employed to improve model robustness and reduce the risk of overfitting. Exploring transfer learning approaches is another promising direction, as it allows models trained in one league to be adapted for use in another, enhancing their flexibility and applicability. Integrating AI with emerging technologies like the Internet of Things for real-time data collection could improve data quality and model performance, enabling more accurate and timely predictions. Improving feature engineering by developing techniques that capture the complexities of basketball games and combining domain knowledge with ML can lead to discovering novel and impactful features. Finally, longitudinal studies exploring the long-term impact of AI predictions over multiple seasons are needed to comprehensively understand their influence on team performance, player development, and league dynamics. Such studies could provide valuable insights into AI models’ sustainability and long-term effectiveness in professional sports.
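The time-series cross-validation suggested above can be sketched with an expanding training window, in which every test window lies strictly after the games used for training. The function name and parameters are hypothetical, and the games are assumed to be sorted chronologically.

```python
def expanding_window_splits(n_samples, n_splits, min_train):
    """Yield (train_indices, test_indices); training data always precede test data."""
    test_size = (n_samples - min_train) // n_splits
    if test_size < 1:
        raise ValueError("not enough samples for the requested splits")
    for i in range(n_splits):
        train_end = min_train + i * test_size
        yield list(range(train_end)), list(range(train_end, train_end + test_size))

# Hypothetical setup: 10 games in chronological order, 3 evaluation windows.
splits = list(expanding_window_splits(10, n_splits=3, min_train=4))
```

Unlike shuffled k-fold validation, this scheme never trains on games played after the ones being predicted, which matches how a model would be used during a season.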

Conclusion

This study systematically evaluated AI technologies in predicting professional basketball outcomes over the past five years, highlighting the superiority of models like MLP, RF, and ensemble learning in enhancing predictive accuracy. The effectiveness of these AI models is largely attributed to advanced data processing, feature selection, and model optimization techniques, which outperform traditional methods. While dynamic models show promise for real-time decision-making, the current research is limited by its focus on NBA data, with insufficient exploration of other leagues and game types. Future research should broaden its scope to include diverse basketball contexts and utilize advanced validation methods and real-time data integration to improve prediction accuracy and model applicability.

Supporting information

S1 Table. Basic information of included studies.

https://doi.org/10.1371/journal.pone.0326326.s004

(DOCX)

S2 Table. Feature information, analysis and predicted outcome of included studies.

https://doi.org/10.1371/journal.pone.0326326.s005

(DOCX)

S3 Table. Performance metrics and key findings of included studies.

https://doi.org/10.1371/journal.pone.0326326.s006

(DOCX)

S5 Table. Studies exclude or include reasons.

https://doi.org/10.1371/journal.pone.0326326.s008

(DOCX)

References

  1. 1. Song K, Gao Y, Shi J. Making real-time predictions for NBA basketball games by combining the historical data and bookmaker’s betting line. Physica A: Statistical Mechanics and its Applications. 2020;547:124411.
  2. 2. Lu C-J, Lee T-S, Wang C-C, Chen W-J. Improving sports outcome prediction process using integrating adaptive weighted features and machine learning techniques. Processes. 2021;9(9):1563.
  3. 3. Cai W, Yu D, Wu Z, Du X, Zhou T. A hybrid ensemble learning framework for basketball outcomes prediction. Physica A: Statistical Mechanics and its Applications. 2019;528:121461.
  4. 4. Alameda-Basora E, Ryan S. Application of bayesian network to total points in NBA games. In: IIE Annual Conference. Proceedings, 2019. 142–7.
  5. 5. Ozkan IA. A Novel Basketball result prediction model using a concurrent neuro-fuzzy system. Applied Artificial Intelligence. 2020;34(13):1038–54.
  6. 6. Horvat T, Job J, Logozar R, Livada Č. A data-driven machine learning algorithm for predicting the outcomes of NBA Games. Symmetry. 2023;15(4):798.
  7. 7. Zhao K, Du C, Tan G. Enhancing basketball game outcome prediction through fused graph convolutional networks and random forest algorithm. Entropy (Basel). 2023;25(5):765. pmid:37238520
  8. 8. Bhandari I, Colet E, Parker J, Pines Z, Pratap R, Ramanujam K. Advanced scout data mining and knowledge discovery in NBA data. Data Min Knowl Discov. 1997;1(1):121–5.
  9. 9. Thabtah F, Zhang L, Abdelhamid N. NBA Game Result Prediction Using Feature Analysis and Machine Learning. Ann Data Sci. 2019;6(1):103–16.
  10. 10. Soto Valero C. Predicting Win-Loss outcomes in MLB regular season games – A comparative study using data mining methods. International J Computer Science in Sport. 2016;15(2):91–112.
  11. 11. Tax N, Joustra Y. Predicting the Dutch football competition using public data: A machine learning approach. Transactions on Knowledge Data Eng. 2015;10(10):1–13.
  12. 12. Delen D, Cogdell D, Kasap N. A comparative analysis of data mining methods in predicting NCAA bowl outcomes. Int J Forecast. 2012;28(2):543–52.
  13. 13. Pathak N, Wadhwa H. Applications of modern classification techniques to predict the outcome of ODI cricket. Procedia Computer Science. 2016;87:55–60.
  14. 14. Skinner B, Guy SJ. A method for using player tracking data in basketball to learn player skills and predict team performance. PLoS One. 2015;10(9):e0136393. pmid:26351846
  15. 15. Felsen P, Agrawal P, Malik J. What will Happen Next? Forecasting Player Moves in Sports Videos. In: 2017 IEEE International Conference on Computer Vision (ICCV). 2017;3362–71.
  16. 16. Taber CB, Sharma S, Raval MS, Senbel S, Keefe A, Shah J, et al. A holistic approach to performance prediction in collegiate athletics: player, team, and conference perspectives. Sci Rep. 2024;14(1):1162. pmid:38216641
  17. 17. Zhang X, Ogasawara I, Konda S, Matsuo T, Uno Y, Miyakawa M, et al. Absorption function loss due to the history of previous ankle sprain explored by unsupervised machine learning. Gait Posture. 2024;109:56–63. pmid:38277765
  18. 18. Sarlis V, Papageorgiou G, Tjortjis C. Sports analytics and text mining nba data to assess recovery from injuries and their economic impact. Computers. 2023;12(12):261.
  19. 19. Huang Y, Huang S, Wang Y, Li Y, Gui Y, Huang C. A novel lower extremity non-contact injury risk prediction model based on multimodal fusion and interpretable machine learning. Front Physiol. 2022;13:937546. pmid:36187785
  20. 20. Horvat T, Job J. Importance of the training dataset length in basketball game outcome prediction by using naive classification machine learning methods. Elektrotehniski Vestnik/Electrotechnical Review. 2019;86(4):197–202.
  21. 21. Chen M, Su F. A basketball game prediction system based on artificial intelligence. J Supercomput. 2022;78(10):12528–52.
  22. 22. Ma B, Wang Y, Li Z. Application of data mining in basketball statistics. Appl Mathematics Nonlinear Sciences. 2022;8(1):2179–88.
  23. 23. Osken C, Onay C. Predicting the winning team in basketball: A novel approach. Heliyon. 2022;8(12):e12189. pmid:36561688
  24. 24. Patrot A, H H, B S, P LG, Sahana. NBA game prediction using machine learning algorithm. In: 2023.
  25. 25. Zuccolotto P, Sandri M, Manisera M. Spatial performance analysis in basketball with CART, random forest and extremely randomized trees. Ann Oper Res. 2023;325(1):495–519. pmid:35677064
  26. 26. Hore S, Bhattacharya T, editors. A machine learning based approach towards building a sustainability model for NBA players. 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT); Coimbatore, India; 2018, pp. 1690-4.
  27. 27. Metulini R, Gnecco G. Measuring players’ importance in basketball using the generalized Shapley value. Ann Oper Res. 2022;325(1):441–65.
  28. 28. Shen Z, Yang Y. Real-Time Regulation Model of Physical Fitness Training Intensity Based on Wavelet Recursive Fuzzy Neural Network. Comput Intell Neurosci. 2022;2022:2078642. pmid:35498205
  29. 29. Ulas E. Examination of national basketball association (NBA) team values based on dynamic linear mixed models. PLoS One. 2021;16(6):e0253179. pmid:34138919
  30. 30. Yanai C, Solomon A, Katz G, Shapira B, Rokach L. Q-Ball: Modeling basketball games using deep reinforcement learning. Proc AAAI Conf Artif Intell. 2022;36(8):8806-13.
  31. 31. Chen C-H, Liu T-L, Wang Y-S, Chu H-K, Tang NC, Liao H-YM. Spatio-Temporal learning of basketball offensive strategies. Proc 23rd ACM Int Conf Multimed. 2015. p. 1123–6.
  32. 32. Chen X, Jiang JY, Jin K, Zhou Y, Liu M, Brantingham PJ. ReLiable: Offline reinforcement learning for tactical strategies in professional basketball games. In: Proceedings of the 31st ACM International Conference on Information and Knowledge Management. 2022. p. 3023–32.
  33. 33. Yu A, Chung SS, editors. Framework for Analysis and Prediction of NBA Basketball Plays: On-Ball Screens. IEEE Conference on SmartWorld, Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovation; 2019.
  34. 34. Xin W. Application of intelligent trajectory analysis based on new spectral imaging technology in basketball match motion recognition. Opt Quant Electron. 2023;56(3).
  35. 35. Meng XH, Shi HY, Shang WH. Analysis of basketball technical movements based on human-computer interaction with deep learning. Comput Intell Neurosci. 2022;2022:4247082.
  36. 36. Khobdeh SB, Yamaghani MR, Sareshkeh SK. Basketball action recognition based on the combination of YOLO and a deep fuzzy LSTM network. J Supercomput. 2024;80(3):3528–53.
  37. 37. Medved V, Srpak D, Havaš L, Horvat T. Data-driven basketball web application for support in making decisions. 7th Int Conf on Sport Sciences Research and Technology Support (icSPORTS); 2019 Sep 20–21.
  38. 38. Tsai W-L, Su L-W, Ko T-Y, Pan T-Y, Hu M-C. Feasibility Study on Using AI and VR for Decision-Making Training of Basketball Players. IEEE Trans Learning Technol. 2021;14(6):754–62.
  39. 39. Rong J, Cui L, editors. Research on the decision-making system of basketball game of virtual human and application prospect. 8th International Symposium on Computer Science in Sport (IACSS 2011); 2011.
  40. 40. Chen WJ, Jhou MJ, Lee TS, Lu CJ. Hybrid basketball game outcome prediction model by integrating data mining methods for the National Basketball Association. Entropy. 2021;23(4):477.
  41. 41. Horvat T, Havaš L, Srpak D. The Impact of Selecting a Validation Method in Machine Learning on Predicting Basketball Game Outcomes. Symmetry. 2020;12(3):431.
  42. 42. Clementswami S, Selvam D, Sankar M, Veda P, Sugumar C. Application of artificial intelligence and machine learning to predict basketball match outcomes: a systematic review. Comp Integrated Manufacturing Syst. 2022:998–1009.
  43. 43. Li B, Xu X. Application of Artificial Intelligence in Basketball Sport. J Educ Health Sport. 2021;11(7):54–67.
  44. 44. Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann Intern Med. 2018;169(7):467–73. pmid:30178033
  45. 45. Quality assessment tool for observational cohort and cross-sectional studies. https://www.nhlbi.nih.gov/health-topics/study-quality-assessment-tools
  46. 46. Kayhan VO, Watkins A. Predicting the point spread in professional basketball in real time: a data snapshot approach. J Business Analyt. 2019;2(1):63–73.
  47. 47. Lu J, Chen Y, Zhu Y, editors. Prediction of future NBA games’ point difference: a statistical modeling approach. 2019 International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI). 2019.
  48. 48. Yao A, editor. Comparing neural and regression models to predict NBA team records. 5th International Conference on Fuzzy Systems and Data Mining (FSDM); 2019.
  49. 49. Giasemidis G. Descriptive and predictive analysis of Euroleague basketball games and the wisdom of basketball crowds. 2020.
  50. 50. Huang M-L, Lin Y-J. Regression Tree Model for Predicting Game Scores for the Golden State Warriors in the National Basketball Association. Symmetry. 2020;12(5):835.
  51. 51. Li S. Revisiting the correlation of basketball stats and match outcome prediction. Proceedings of the 2020 12th International Conference on Machine Learning and Computing (ICMLC). 2020. p. 63–7.
  52. 52. Migliorati M. Detecting drivers of basketball successful games: an exploratory study with machine learning algorithms. Electronic J Appl Statistical Analysis. 2020;13(2):454–73.
  53. 53. Ballı S, Özdemir E. A novel method for prediction of EuroLeague game results using hybrid feature extraction and machine learning techniques. Chaos, Solitons Fractals. 2021;150:111119.
  54. 54. Khanmohammadi R, Saba-Sadiya S, Esfandiarpour S, Alhanai T, Ghassemi MM. MambaNet: A Hybrid Neural Network for Predicting the NBA Playoffs. SN Comput Sci. 2024;5(5).
  55. 55. Krishnan NJ, Suresh R, Vashisht SJ. Prediction of National Basketball Association games using machine learning with integrating advanced statistics. ECS Transactions. 2022;1:107.
  56. 56. Sikka DDR. Basketball win percentage prediction using ensemble-based machine learning. 2022 6th International Conference on Electronics, Communication and Aerospace Technology. 2022. p. 885–90. https://doi.org/10.1109/iceca55336.2022.10009313
  57. 57. Su F, Chen M. Basketball players’ score prediction using artificial intelligence technology via the Internet of Things. J Supercomput. 2022;78(17):19138–66.
  58. 58. Teno, González DS, Wang C, Carlsson N, et al. Predicting season outcomes for the NBA. Springer, Cham; 2022. https://doi.org/10.1007/978-3-031-02044-5_11
  59. 59. Wang Y, Liu W, Liu X. Explainable AI techniques with application to NBA gameplay prediction. Neurocomputing. 2022;483:59–71.
  60. 60. Zheng X. NBA winner prediction: a hybrid framework incorporating internal and external factors. 4th Int Conf Big Data Eng. 2022. p. 71–80.
  61. 61. Daundkar D, Kandhway K, editors. Predicting winner of a professional basketball match. 23rd International Conference on Control, Automation and Systems (ICCAS). 2023. p. 958–63.
  62. 62. Lampis T, Ioannis N, Vasilios V, Stavrianna D. Predictions of European basketball match results with machine learning algorithms. Journal of Sports Analytics. 2023;9(2):171–90.
  63. 63. Wang J. Predictive analysis of NBA game outcomes through machine learning. The 6th International Conference on Machine Learning and Machine Intelligence (MLMI ’23). 2023. p. 46–55.
  64. 64. Kandhway K. Dynamic outcome prediction of an NBA match. 2024 18th International Conference on Ubiquitous Information Management and Communication (IMCOM). 2024. p. 1–4.
  65. 65. Luan J, Zhang C, Xu B, Xue Y, Ren Y. The predictive performances of random forest models with limited sample size and different species traits. Fish Res. 2020;227:105534.
  66. 66. Lang L, Tiancai L, Shan A, Xiangyan T. An improved random forest algorithm and its application to wind pressure prediction. Int J Intell Syst. 2021;36(8):4016–32.
  67. 67. Khattak BHA, Shafi I, Khan AS, Flores ES, Lara RG, Samad MdA, et al. A Systematic Survey of AI Models in Financial Market Forecasting for Profitability Analysis. IEEE Access. 2023;11:125359–80.
  68. 68. Liang T, Glossner J, Wang L, Shi S, Zhang X. Pruning and quantization for deep neural network acceleration: A survey. Neurocomputing. 2021;461:370–403.
  69. 69. Liang X, Wu Y, Han J, Xu H, Xu C, Liang X. Effective adaptation in multi-task co-training for unified autonomous driving. Adv Neural Inform Processing Syst. 2022;35:19645–58.
  70. 70. Yoon Y, Hwang H, Choi Y, Joo M, Oh H, Park I, et al. IEEE Access. 2019;7:56564–76.
  71. 71. Guetterman TC, Chang T, DeJonckheere M, Basu T, Scruggs E, Vydiswaran VGV. Augmenting Qualitative Text Analysis with Natural Language Processing: Methodological Study. J Med Internet Res. 2018;20(6):e231. pmid:29959110
  72. 72. Hamdard MS, Lodin H. Effect of feature selection on the accuracy of machine learning model. Int J Multidiscipl Res Anal. 2023;6(9):4460–6.
  73. 73. Hall P, Xue JH. On selecting interacting features from high-dimensional data. Comput Stat Data Anal. 2014;71:694–708.
  74. 74. Madan K, Taneja K, Taneja H. AI based feature selection model for soccer sports management. Int J Eng Sci Humanit. 2024;14(Special Issue 1):38–42.
  75. 75. Tuv E, Borisov A, Runger G, Torkkola K. Feature selection with ensembles, artificial variables, and redundancy elimination. J Mach Learn Res. 2009;10:1341–66.
  76. 76. Larsen K. GAM: the predictive modeling silver bullet. Multithreaded. Stitch Fix. 2015;30:1–27.
  77. 77. Mandić R, Jakovljević S, Erčulj F, Štrumbelj E. Trends in NBA and Euroleague basketball: analysis and comparison of statistical data from 2000 to 2017. PLoS One. 2019;14(10):e0223524.
  78. 78. Mikołajec K, Maszczyk A, Zając T. Game Indicators Determining Sports Performance in the NBA. J Hum Kinet. 2013;37:145–51. pmid:24146715
  79. 79. Hosna A, Merry E, Gyalmo J, Alom Z, Aung Z, Azim MA. Transfer learning: a friendly introduction. J Big Data. 2022;9(1):102.
  80. 80. Constantinou AC. Dolores: a model that predicts football match outcomes from all over the world. Mach Learn. 2019;108(1):49–75.
  81. 81. Kollár A. Betting models using AI: a review on ANN, SVM, and Markov chain. MPRA Paper; 2021.
  82. 82. Bunker R, Susnjak T. The application of machine learning techniques for predicting match results in team sport: a review. J Art Intell Res. 2022;73:1285–322.
  83. 83. Ali A, Ullah I, Shabaz M, Sharafian A, Attique Khan M, Bai X, et al. A Resource-Aware Multi-Graph Neural Network for Urban Traffic Flow Prediction in Multi-Access Edge Computing Systems. IEEE Trans Consumer Electron. 2024;70(4):7252–65.
  84. 84. Ali A, Zhu Y, Zakarya M. Exploiting dynamic spatio-temporal graph convolutional neural networks for citywide traffic flows prediction. Neural Netw. 2022;145:233–47. pmid:34773899