Innovative approaches in QSPR modelling using topological indices for the development of cancer treatments

Xiaolong Shi; Saeed Kosari; Masoud Ghods; Negar Kheirkhahan

doi:10.1371/journal.pone.0317507

Abstract

This paper provides a comprehensive review of quantitative structure-property relationships (QSPR) applied to cancer drugs, with a focus on the application of topological indices (TI) and data analysis techniques. Cancer is a serious and life-threatening disease for which no complete cure currently exists. Consequently, extensive research is ongoing to develop new therapeutic agents. Topological indices have become a significant tool in chemistry and medicine, particularly for investigating the molecular, pharmacological, and therapeutic properties of drugs. This article investigates the potential of temperature indices in analyzing the physicochemical properties of drugs used for cancer treatment. The approach employs QSPR modeling to establish correlations between the molecular structure of a compound and its physical and chemical properties. The analysis covers a range of cancer drugs, including Aminopterin, Convolutamide A, Convolutamydine A, Daunorubicin, Minocycline, Podophyllotoxin, Caulibugulone E, Perfragilin A, Melatonin, Tambjamine K, Amathaspiramide E, and Aspidostomide E. The findings demonstrate that regression models incorporating TI (fifty-eight models in total) can effectively predict physicochemical properties such as Boiling Point (BP), Enthalpy (EN), Flash Point (FP), Molar Refractivity (MR), Polar Surface Area (PSA), Surface Tension (ST), Molecular Volume (MV), and Complexity (COM). This research suggests that temperature-based topological indices are promising tools for the development and optimization of cancer drugs, as demonstrated by statistically significant results with p-values below 0.05. In addition to the linear regression model, which performed best, two machine learning models, Support Vector Regression (SVR) and Random Forest, were used to compare performance in predicting the physicochemical properties of the drugs and to assess the advantages and disadvantages of each model.

Citation: Shi X, Kosari S, Ghods M, Kheirkhahan N (2025) Innovative approaches in QSPR modelling using topological indices for the development of cancer treatments. PLoS ONE 20(2): e0317507. https://doi.org/10.1371/journal.pone.0317507

Editor: Niravkumar Joshi, Federal University of ABC, BRAZIL

Received: July 4, 2024; Accepted: December 30, 2024; Published: February 21, 2025

Copyright: © 2025 Shi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data that support the findings of this study are openly available in ChemSpider at http://www.chemspider.com/AboutUs.aspx. Data are contained within the article.

Funding: This work was supported by the National Natural Science Foundation of China under grants 62332006 and 62172302, with Xiaolong Shi as the principal recipient.

Competing interests: The authors have declared that no competing interests exist.

1 Introduction

In the treatment of cancer, alkylating agents and metabolites are commonly employed. Although significant attention is devoted to the development and research of initial cancer therapies, the process of drug discovery, from identifying novel chemical compounds to obtaining regulatory approval, remains complex, costly, and time-intensive. Traditional approaches frequently encounter obstacles in compound synthesis and biological screening, leading the scientific community to explore more efficient methods for compound discovery. Chemical graph theory, an interdisciplinary field, is used to examine molecular structures and to establish correlations between activities, properties, and various phenomena. In this context, a molecular graph represents the structural formula of a chemical compound, with vertices corresponding to atoms and edges to chemical bonds. Chemical graph theory provides tools for analyzing chemical structures, including topological indices: real numbers that serve as descriptors of the structure and specific properties of a molecular graph [1, 2]. Numerous studies have applied topological indices in the analysis of molecular graphs and drug structures [3–8]. A fundamental approach to exploring the relationship between a substance’s physicochemical properties and its topological indices is through Quantitative Structure-Property Relationship (QSPR) models. These models use regression analysis to examine the correlations between physical and chemical properties and topological indices. Additionally, many studies in Quantitative Structure-Activity Relationship (QSAR) have applied topological indices to drug structures [9, 10]. In this article, various temperature-based indices are evaluated across several cancer drugs, enabling researchers to identify the associated physical properties and chemical reactions.
Furthermore, in addition to linear regression, we employed Support Vector Regression (SVR) and Random Forest models to explore and assess the predictive capabilities of these methods in determining the physicochemical properties of cancer drugs. These models were applied to identify the most effective model for predicting the properties of the drugs. The results of our analysis help in selecting the best predictive model, which is crucial for improving drug design and optimizing the therapeutic effectiveness of cancer treatments [11, 12].

In this study, the drug’s structure is modeled as a graph where each vertex V(G) represents an atom and each edge E(G) signifies a chemical bond between atoms. The graphs considered are simple and connected. The degree of a vertex, defined as the number of edges incident to it, characterizes its connectivity [13].
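As a small illustration of this graph model, the sketch below stores a hydrogen-suppressed molecular graph as an edge list and counts the edges incident to each vertex. The example molecule (the carbon skeleton of isobutane) and the function name `vertex_degrees` are illustrative assumptions and are not taken from the study.

```python
from collections import defaultdict


def vertex_degrees(edges):
    """Return {vertex: degree} for a simple undirected graph given as an edge list."""
    degree = defaultdict(int)
    for u, v in edges:
        # Each edge contributes one to the degree of both of its end vertices.
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


# Carbon skeleton of isobutane: central atom 0 bonded to atoms 1, 2 and 3.
isobutane_edges = [(0, 1), (0, 2), (0, 3)]
degrees = vertex_degrees(isobutane_edges)
print(degrees)  # {0: 3, 1: 1, 2: 1, 3: 1}
```

An edge list is enough here because every quantity used later depends only on the degrees of the two end vertices of each bond.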

2 Methodology and analysis

In this study, cancer drugs are modeled as simple graphs. To calculate the topological indices of these drug structures, we utilize techniques such as vertex partitioning, edge partitioning, and various computational methods. Our analysis is restricted to finite, simple, connected graphs. Let G denote a graph with a vertex set V and an edge set E. The degree d_u of a vertex u is defined as the number of vertices adjacent to u. Below is a list of the topological formulas used in this study.

Definition 2.1 Fajtlowicz defined the temperature of a vertex u of a connected graph G as follows [14]: (1)

Definition 2.2 Product connectivity temperature index [15] is (2)

Definition 2.3 Harmonic temperature index [3] is (3)

Definition 2.4 Symmetric division temperature index [16] is (4)

Definition 2.5 Modified third temperature index [17] is (5)

Definition 2.6 Modified second temperature index [17] is (6)

Definition 2.7 Second hyper temperature index [2] is (7)

Definition 2.8 Sum connectivity temperature index [16] is (8)

Definition 2.9 F-temperature index [16] is (9)

Definition 2.10 Second temperature index [16] is (10)

Definition 2.11 Reciprocal product connectivity temperature index [16] is (11)

Definition 2.12 First hyper temperature index [2] is (12)
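To make the definitions concrete, the following Python sketch computes ten of the indices for a small graph. It is a sketch under stated assumptions, not a transcription of Eqs (1) to (12): it assumes the forms usually given in the temperature-index literature, namely T(u) = d_u / (n − d_u) with n the number of vertices, and edge sums such as Σ 1/√(T(u)T(v)) for the product connectivity temperature index. These forms should be checked against [14–17] before use. The modified third temperature index is omitted because its form is not assumed here, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def vertex_temperatures(edges):
    """Assumed form: T(u) = d_u / (n - d_u), with n the number of vertices."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)  # in a connected graph every vertex lies on some edge
    return {u: d / (n - d) for u, d in degree.items()}


def temperature_indices(edges):
    """Sum the edge contributions of ten temperature indices (assumed forms)."""
    t = vertex_temperatures(edges)
    pairs = [(t[u], t[v]) for u, v in edges]
    return {
        "PT": sum(1 / math.sqrt(a * b) for a, b in pairs),   # product connectivity
        "HT": sum(2 / (a + b) for a, b in pairs),            # harmonic
        "SDT": sum(a / b + b / a for a, b in pairs),         # symmetric division
        "mT2": sum(1 / (a * b) for a, b in pairs),           # modified second
        "HT2": sum((a * b) ** 2 for a, b in pairs),          # second hyper
        "ST": sum(1 / math.sqrt(a + b) for a, b in pairs),   # sum connectivity
        "FT": sum(a ** 2 + b ** 2 for a, b in pairs),        # F-temperature
        "T2": sum(a * b for a, b in pairs),                  # second
        "RPT": sum(math.sqrt(a * b) for a, b in pairs),      # reciprocal product
        "HT1": sum((a + b) ** 2 for a, b in pairs),          # first hyper
    }


# Path on three vertices (for example the carbon skeleton of propane).
indices = temperature_indices([(0, 1), (1, 2)])
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

For the three-vertex path the end vertices have temperature 1/2 and the middle vertex has temperature 2, so each edge contributes the same amount to every index.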

A list of abbreviations used in the article is given in Table 1.

Table 1. Abbreviations list.

https://doi.org/10.1371/journal.pone.0317507.t001

In recent years, scientists have increasingly used the QSPR/QSAR methodology to predict the physicochemical properties of chemical compounds through topological indices. This approach has been applied in numerous studies to analyze a diverse array of drugs and structures, including highly resistant anticancer agents, anti-COVID-19 drugs targeting the Omicron variant, breast cancer therapies, entropy tests involving benzene derivatives, nanotubes, Lyme disease treatments, and research on temperature indices [18–22].

3 Mathematical computations of topological indices

This section presents the topological indices (TI) of cancer drugs and the QSPR modeling of their molecular structures.

3.1 Topological Index computation

Let A be a graph representing Aspidostomide E, where the edges are partitioned into distinct subsets based on specific criteria.

The study of the edges in A is shown in Table 2.

Table 2. Dividing the edges of graph A.

https://doi.org/10.1371/journal.pone.0317507.t002
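The edge partition can be sketched in Python as follows, assuming the usual criterion that edges are grouped by the degrees of their two end vertices. The example graph (the carbon skeleton of 2-methylbutane) and the function name `edge_partition` are illustrative assumptions; the actual partition for Aspidostomide E is the one given in Table 2.

```python
from collections import Counter, defaultdict


def edge_partition(edges):
    """Count edges by the (smaller, larger) degrees of their end vertices."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Sorting the pair makes (1, 3) and (3, 1) fall into the same class.
    return Counter(tuple(sorted((degree[u], degree[v]))) for u, v in edges)


# Carbon skeleton of 2-methylbutane: chain 0-1-2-3 with a branch 1-4.
partition = edge_partition([(0, 1), (1, 2), (2, 3), (1, 4)])
for degree_pair, count in sorted(partition.items()):
    print(degree_pair, count)
```

Because every temperature index is a sum over edges of a function of the end-vertex degrees, each index can then be evaluated as one term per class multiplied by the class size.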

By applying Definitions 2.1 through 2.12, we obtain the following results:

Fig 1 shows the Chemical structure and Molecular graph of Aspidostomide E.

Fig 1. Chemical structure and Molecular graph of Aspidostomide E.

a) Chemical structure of Aspidostomide E. b) Molecular graph of Aspidostomide E. https://doi.org/10.6084/m9.figshare.26984881.v1.

https://doi.org/10.1371/journal.pone.0317507.g001

Topological indices for the other drugs can be computed in the same way using Eqs (1) to (12) from Section 2. The resulting values are listed in Tables 3 and 4, and Fig 2 shows the structures of the drugs. Additional information about these drugs is available from ChemicalBook [23], and Table 5 summarizes their physical and chemical properties [15, 24].

Fig 2. Chemical structure of cancer drugs from ChemSpider.

https://doi.org/10.6084/m9.figshare.26984215.v1.

https://doi.org/10.1371/journal.pone.0317507.g002

Table 3. The values of the temperature indices of the drugs.

https://doi.org/10.1371/journal.pone.0317507.t003

Table 4. The values of the temperature indices of the drugs.

https://doi.org/10.1371/journal.pone.0317507.t004

Table 5. Physicochemical properties of cancer drug.

https://doi.org/10.1371/journal.pone.0317507.t005

3.2 Discussion and comparison of advanced machine learning models and linear models for QSPR analysis

The primary objective of this section is to conduct a QSPR analysis of the topological indices (TI) and examine their correlation with several physicochemical properties of the drugs. The drugs under investigation are Aminopterin, Convolutamide A, Convolutamydine A, Daunorubicin, Minocycline, Podophyllotoxin, Caulibugulone E, Perfragilin A, Melatonin, Tambjamine K, Amathaspiramide E, and Aspidostomide E. We assessed the effectiveness of these TI in predicting eight physicochemical properties: Boiling Point (BP), Enthalpy (EN), Flash Point (FP), Molar Refractivity (MR), Polar Surface Area (PSA), Surface Tension (ST), Molecular Volume (MV), and Complexity (COM), with values obtained from PubChem and ChemSpider. Table 6 displays the correlation coefficients (r) between these physicochemical properties and the degree-based topological indices. Tables 7–13 demonstrate that a linear QSPR model provides the best fit for predicting these properties. The values are normally distributed, and fifty-eight regression models were employed for data analysis. Notably, the PT(G), HT(G), ^mT₃(G), T₂(G), and SDT(G) indices exhibit high correlations with COM, with r-values of 0.913, 0.905, 0.908, 0.915, and 0.905, respectively. Additionally, the ST(G) index shows a strong positive correlation with MR, with r = 0.924. In contrast, the RPT(G) index does not show a significant correlation with any physicochemical property. The HT₁(G) and T₂(G) indices have a significant inverse correlation with MR and MV. The HT₂(G) index is a significant predictor of BP, EN, MR, and MV, in each case with an inverse correlation.

Table 6. Correlation coefficients of physical properties of drugs.

https://doi.org/10.1371/journal.pone.0317507.t006
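The correlation coefficient r reported in Table 6 is the Pearson coefficient between one index and one property over the set of drugs. A minimal Python sketch of that calculation is given below; the function name `pearson_r` and the numerical values are placeholders for illustration and are not data from Tables 3 to 5.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Placeholder values: an index and a property that rise together,
# and a second index that falls as the property rises.
index_up = [10.0, 20.0, 30.0, 40.0]
index_down = [0.8, 0.6, 0.4, 0.2]
prop = [105.0, 110.0, 115.0, 120.0]

print(pearson_r(index_up, prop))
print(pearson_r(index_down, prop))
```

A value near +1 corresponds to the positive correlations described above, and a value near −1 to the inverse correlations reported for HT₁(G), T₂(G), and HT₂(G).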

Table 7. Statistical metrics for the linear QSPR model applied to PT (G).

https://doi.org/10.1371/journal.pone.0317507.t007

Table 8. Statistical metrics for the linear QSPR model applied to HT (G).

https://doi.org/10.1371/journal.pone.0317507.t008

Table 9. Statistical metrics for the linear QSPR model applied to SDT (G).

https://doi.org/10.1371/journal.pone.0317507.t009

Table 10. Statistical metrics for the linear QSPR model applied to ^m T₃ (G).

https://doi.org/10.1371/journal.pone.0317507.t010

Table 11. Statistical metrics for the linear QSPR model applied to ^mT₂ (G).

https://doi.org/10.1371/journal.pone.0317507.t011

Table 12. Statistical metrics for the linear QSPR model applied to HT₂ (G).

https://doi.org/10.1371/journal.pone.0317507.t012

Table 13. Statistical metrics for the linear QSPR model applied to FT (G).

https://doi.org/10.1371/journal.pone.0317507.t013

Two machine learning models, SVR and Random Forest, were employed alongside Linear Regression (a traditional model) for the analysis. The findings revealed the following key observations:

The tuned SVR and Linear Regression models exhibited superior performance in predicting physicochemical properties, achieving correlation coefficients (r) above 0.9 for most properties. These results underscore the high predictive power of these techniques in QSPR analysis (Vapnik, 1995; Seber & Lee, 2003).
The Random Forest model also showed acceptable performance. Although its accuracy was slightly lower than that of the tuned SVR and Linear Regression models, it provided valuable insights into the relationships between topological indices and drug properties (Breiman, 2001).
In contrast, the SVR model without parameter tuning demonstrated weaker performance, with lower correlation coefficients, highlighting the necessity of parameter optimization for achieving accurate predictions (Vapnik, 1995).

Fig 3 provides a graphical representation of the correlations between TI and physicochemical properties. Fig 4 illustrates the relationship between TI and the physical properties of the drugs studied.

Fig 3. Two-dimensional (2D) graph illustrating the relationship between drugs and their topological indices.

https://doi.org/10.6084/m9.figshare.26983915.v.

https://doi.org/10.1371/journal.pone.0317507.g003

Fig 4. Physicochemical properties with topological indices.

https://doi.org/10.6084/m9.figshare.26988649.v1.

https://doi.org/10.1371/journal.pone.0317507.g004

3.3 QSPR analysis

Building upon the temperature indices defined in Section 2 and computed in Section 3.1, this section develops a linear regression model. This model is used to elucidate the relationships between the temperature indices and the physical and chemical properties of the drugs.

P = B + A [TI] (13)

Where:

P: Represents the cancer drug property (dependent variable)
B: Constant term (y-intercept)
A: Regression coefficient
TI: Topological index (independent variable)

Eq (13) represents the formulated linear regression model. In this equation, "P" denotes a specific property of a cancer drug that we aim to predict or analyze. "B" is the constant term, and "A" is the regression coefficient, which indicates the change in "P" associated with a unit increase in the topological index. The analysis was performed using SPSS software to develop linear models for eight properties across twelve cancer drugs. These models are based on the eleven topological indices computed earlier. The following section presents the linear models for each of the eight drug properties, using Eq (13) as the general framework.
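For readers who want to reproduce this kind of fit outside SPSS, the sketch below estimates A and B of the linear model P = B + A·TI by ordinary least squares and also reports R², the standard error of the estimate, and the F-statistic for a single predictor. It is a minimal sketch: the function name `fit_linear_qspr` and the five data points are made up for illustration and do not come from the study.

```python
import math


def fit_linear_qspr(ti, p):
    """Least-squares fit of P = B + A * TI with basic goodness-of-fit statistics."""
    n = len(ti)
    mean_ti = sum(ti) / n
    mean_p = sum(p) / n
    sxx = sum((x - mean_ti) ** 2 for x in ti)
    sxy = sum((x - mean_ti) * (y - mean_p) for x, y in zip(ti, p))
    a = sxy / sxx                 # regression coefficient A
    b = mean_p - a * mean_ti      # constant term B
    sse = sum((y - (b + a * x)) ** 2 for x, y in zip(ti, p))  # residual sum of squares
    sst = sum((y - mean_p) ** 2 for y in p)                   # total sum of squares
    return {
        "A": a,
        "B": b,
        "R2": 1 - sse / sst,
        "SE": math.sqrt(sse / (n - 2)),
        # F for one predictor: explained variance over residual variance.
        "F": (sst - sse) / (sse / (n - 2)) if sse > 0 else math.inf,
    }


# Made-up index values and property values, for illustration only.
model = fit_linear_qspr([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1])
print({name: round(value, 4) for name, value in model.items()})
```

With twelve drugs per model the residual degrees of freedom would be n − 2 = 10, which is the divisor used for SE and F above.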

3.4 Linear regression models

In this section, the linear regression models for topological indices (TI) are discussed using Eq (13). Tables 7–13 present the parameters and QSPR models associated with these TI. The following linear models for temperature indices are derived based on Eq (13):

1. Product connectivity temperature index [PT (G)]

BP = 396.850+0.748 [PT (G)], EN = 64.216+0.106 [PT (G)], FP = 184.053+0.453 [PT (G)]

MR = 57.416+0.141 [PT (G)], PSA = 29.617+0.271 [PT (G)], ST = 42.638+0.085 [PT (G)]

MV = 166.264+0.333 [PT (G)], COM = 237.722+1.237 [PT (G)]

2. Harmonic temperature index [HT (G)]

BP = 398.336+0.788 [HT (G)], EN = 64.457+0.112 [HT (G)], FP = 185.297+0.476 [HT (G)]

MR = 57.178+0.150 [HT (G)], PSA = 30.878+0.282 [HT (G)], ST = 43.220+0.088 [HT (G)]

MV = 164.758+0.359 [HT (G)], COM = 243.005+1.295 [HT (G)]

3. Symmetric division temperature index [SDT (G)]

BP = 186.910+6.469 [SDT (G)], EN = 37.003+0.912 [SDT (G)], FP = 66.890+4.187 [SDT (G)]

MR = 26.265+1.091 [SDT (G)], PSA = -45232.412+2.279 [SDT (G)], ST = 10.415+0.792 [SDT (G)]

MV = 17.235+2.345 [SDT (G)], COM = 0.494+10.271 [SDT (G)]

4. Modified third temperature index [^mT₃ (G)]

BP = 395.950+1.589 [^mT₃ (G)], EN = 64.102+0.225 [^mT₃ (G)], FP = 183.747+0.961 [^mT₃ (G)]

MR = 56.854+0.301 [^mT₃ (G)], PSA = 29.724+0.572 [^mT₃ (G)], ST = 42.788+0.179 [^mT₃ (G)]

MV = 164.329+0.718 [^mT₃ (G)], COM = 238.144+2.618 [^mT₃ (G)]

5. Modified second temperature index [^mT₂ (G)]

BP = 462.043+0.040 [^mT₂ (G)], EN = 73.347+0.006 [^mT₂ (G)], FP = 224.106+0.024 [^mT₂ (G)]

MR = 69.634+0.008 [^mT₂ (G)], PSA = 50.813+0.015 [^mT₂ (G)], ST = 49.727+0.005 [^mT₂ (G)]

MV = 194.512+0.018 [^mT₂ (G)], COM = 333.617+0.070 [^mT₂ (G)]

6. Second hyper temperature indices [HT₂ (G)]

BP = 696.092–15380.93 [HT₂ (G)], EN = 106.779–2199.264 [HT₂ (G)]

MR = 120.232–3735.217 [HT₂ (G)], MV = 323.901–10116.120 [HT₂ (G)]

7. F-temperature index FT (G)

MR = 120.232–3735.217 [FT (G)], MV = 323.901–10116.120 [FT (G)]

8. First hyper temperature indices HT₁ (G)

MR = 120.232–3735.217 [HT₁ (G)], MV = 323.901–10116.120 [HT₁ (G)]

9. Second temperature index T₂ (G)

MR = 120.232–3735.217 [T₂ (G)], MV = 323.901–10116.120 [T₂ (G)]

10. Sum connectivity temperature index [ST (G)]

BP = 327.804+4.642 [ST (G)], EN = 54.435+0.658 [ST (G)], FP = 149.323+2.682 [ST (G)]

MR = 42.390+0.911 [ST (G)], PSA = 11.502+1.565 [ST (G)], ST = 37.512+0.481 [ST (G)]

MV = 125.946+2.239 [ST (G)], COM = 127.395+7.726 [ST (G)]
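Once fitted, each model is applied by substituting an index value into its equation. The Python sketch below does this for the eight PT(G) models listed under item 1; the coefficients are copied from those equations, while the function name `predict_properties` and the input value of 100.0 are hypothetical and chosen only to show the arithmetic.

```python
# Intercept B and slope A of P = B + A * PT(G) for each property (item 1 above).
PT_MODELS = {
    "BP": (396.850, 0.748),
    "EN": (64.216, 0.106),
    "FP": (184.053, 0.453),
    "MR": (57.416, 0.141),
    "PSA": (29.617, 0.271),
    "ST": (42.638, 0.085),
    "MV": (166.264, 0.333),
    "COM": (237.722, 1.237),
}


def predict_properties(pt_value, models=PT_MODELS):
    """Evaluate every linear model at one PT(G) value."""
    return {name: b + a * pt_value for name, (b, a) in models.items()}


# 100.0 is a hypothetical PT(G) value, not one of the drugs in Table 3.
predictions = predict_properties(100.0)
for name, value in predictions.items():
    print(f"{name} = {value:.3f}")
```

For example, a PT(G) value of 100 gives BP = 396.850 + 0.748 × 100 = 471.650 under the first model.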

4 Machine learning models for predictive analysis

In this study, machine learning models were employed to predict the physicochemical properties of drugs used in the treatment of cancer. The primary goal was to assess the potential of these models in identifying complex and nonlinear relationships between molecular structures and physicochemical properties. Machine learning methods offer the advantage of uncovering hidden patterns within the data that traditional methods may fail to identify (Vapnik, 1995).

4.1 Rationale for using machine learning models

Machine learning models are particularly suitable for capturing intricate, nonlinear relationships in large datasets. This is crucial for predicting drug properties, as these relationships are not always straightforward or linear. In this study, machine learning models were used to model these complex patterns and predict key physicochemical properties of drugs. These properties are vital for drug design, as they influence the drug’s behavior, efficacy, and safety profile. Traditional statistical methods often fail to account for these complexities, making machine learning an ideal choice.

For this analysis, in addition to linear regression, two other machine learning methods were used, which are described below:

Support Vector Regression (SVR): This model is well-known for its effectiveness in handling nonlinear data.
Random Forest: A model based on an ensemble of decision trees, which aggregates the predictions of many trees to improve accuracy and reduce overfitting. Random Forest is particularly effective for regression tasks in complex datasets.

These models were employed to predict the following physicochemical properties of the drugs: BP, EN, FP, MR, PSA, ST, MV, COM.
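A comparison of this kind can be sketched with scikit-learn as shown below. This is an illustrative setup rather than the configuration used in the study: the twelve index values and property values are synthetic, the hyperparameters (`C`, `epsilon`, `n_estimators`) are assumptions, and the scores are in-sample R² values, which tend to flatter flexible models on a data set this small.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR

# Synthetic stand-in for twelve drugs: one index value and one property each.
rng = np.random.default_rng(0)
ti = np.linspace(50.0, 400.0, 12).reshape(-1, 1)
prop = 400.0 + 0.75 * ti.ravel() + rng.normal(0.0, 5.0, size=12)

# Hyperparameters below are assumptions made for this sketch.
models = {
    "Linear regression": LinearRegression(),
    "SVR": SVR(kernel="rbf", C=100.0, epsilon=1.0),
    "Random forest": RandomForestRegressor(n_estimators=200, random_state=0),
}

# score() returns the coefficient of determination R^2 on the given data.
scores = {name: model.fit(ti, prop).score(ti, prop) for name, model in models.items()}
for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

With only twelve samples, a held-out or leave-one-out evaluation would give a more cautious picture than the in-sample scores printed here.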

4.2 Comparison of prediction and analysis of models

Linear regression performed the best, effectively capturing the relationships between the molecular structure of the drugs and their physicochemical properties. The SVR model also captured complex patterns but showed weaker results compared to linear regression. Random Forest performed the least well among the models. Tables 14–16 present the predictions of physicochemical properties obtained with the different models, and the evaluation results are presented in Table 17 and Fig 5.

Fig 5. Comparison of machine learning models for predicting physicochemical properties of cancer drugs.

https://figshare.com/articles/figure/_/28078031.

https://doi.org/10.1371/journal.pone.0317507.g005

Table 14. Prediction of physical and chemical properties using linear regression.

https://doi.org/10.1371/journal.pone.0317507.t014

Table 15. Prediction of physical and chemical properties using SVR.

https://doi.org/10.1371/journal.pone.0317507.t015

Table 16. Prediction of physical and chemical properties using random forest.

https://doi.org/10.1371/journal.pone.0317507.t016

Table 17. Evaluation of advanced machine learning models based on the coefficient of determination (R²) for predicting physicochemical properties of drugs.

https://doi.org/10.1371/journal.pone.0317507.t017

The linear regression model performed well in predicting most physical and chemical properties such as BP, EN, MR, and MV, with its predictions closely matching the actual values. Overall, the model is effective in modeling linear relationships.

The SVR model performed relatively well in predicting most physical and chemical properties, with predictions for BP, EN, MR, and MV being close to the actual values. Although the model showed reasonable accuracy for most properties, there were some discrepancies, especially for COM. Overall, the SVR model was effective in capturing complex, non-linear relationships in the data, but linear regression performed better in providing more accurate predictions.

The Random Forest model showed acceptable results in predicting the physical and chemical properties of the drugs, but compared to the Linear Regression and SVR models, its accuracy was lower in some predictions. For instance, for properties like BP and EN, there were notable discrepancies between the predicted values and the actual values, indicating lower precision in these cases. Therefore, it can be concluded that the Linear Regression and SVR models performed better in most cases, with their predictions being closer to the actual values.

As depicted in Fig 5, Linear regression demonstrated the best performance overall. Random Forest excelled in predicting non-linear relationships in some cases but showed lower accuracy in others. The SVR model exhibited weak performance.

Based on the evaluation of machine learning models using the coefficient of determination (R²) for predicting the physicochemical properties of drugs, linear regression demonstrated the best performance, achieving the highest R² values for most properties such as BP (0.95), EN (0.91), and MV (0.93), indicating strong predictive accuracy. Random Forest provided valuable insights into complex, non-linear relationships, though its accuracy was slightly lower than that of linear regression. Finally, SVR performed poorly and provided less accurate results compared to the other two models. Therefore, linear regression can be considered the best model for predicting the physicochemical properties of drugs.

5 Conclusion

Table 6 and Fig 3 illustrate the correlation between the physical and chemical properties of anti-cancer drugs and the defined temperature indices.

The Polar Surface Area is best predicted by the modified second temperature index, with a correlation coefficient (r) of 0.808.
The Sum Connectivity temperature index is the most effective predictor for Boiling Point (r = 0.836) and Molecular Volume (r = 0.848). It also exhibits the highest significant correlations with Molar Refractivity (r = 0.924) and Complexity (r = 0.921).
The Symmetric Division temperature index shows a positive correlation with Enthalpy of Vaporization (r = 0.854), Flash Point (r = 0.857), and Surface Tension (r = 0.808).

This analysis reveals a positive correlation between the physical and chemical properties of cancer drugs and the temperature indices. Tables 7–13 and 18–20 present regression models for various physical and chemical properties. The results demonstrate that the correlation coefficients (r) exceed 0.6 and the p-values are below 0.05, indicating that these predictors are reliable for linear regression. The equations are selected based on criteria such as minimum standard error (SE), maximum R-squared (R²), and maximum F-statistic. Consequently, it can be concluded that the fitted relationships for all eight physical and chemical properties are statistically significant. This underscores the potential value of these topological indices in QSPR analysis for cancer drugs, as evidenced by the plotted regression lines. The study’s findings can be applied to the production, development, and enhancement of more effective cancer drugs. The theoretical insights derived from this study are beneficial for the development of new cancer therapies. Our findings reveal a clear trend in examining drug structures and their physical characteristics. Ultimately, this research contributes to the efficient design of new drugs and the development of preventive measures for the diseases in question. The principles of QSPR and topological indices offer valuable new approaches for estimating properties related to specific diseases and drugs, as demonstrated by the conclusions of this study. Furthermore, when comparing the three methods, Linear Regression, despite its simplicity, consistently showed the best performance in predicting the physical and chemical properties of cancer drugs, outperforming both the SVR and Random Forest models. This emphasizes the effectiveness of Linear Regression in capturing the relationships within the data.

Table 18. Statistical metrics for the linear QSPR model applied to HT₁ (G).

https://doi.org/10.1371/journal.pone.0317507.t018

Table 19. Statistical metrics for the linear QSPR model applied to T₂ (G).

https://doi.org/10.1371/journal.pone.0317507.t019

Table 20. Statistical metrics for the linear QSPR model applied to ST (G).

https://doi.org/10.1371/journal.pone.0317507.t020

References

1. Ghorbani M, Hosseinzadeh MA. A new version of Zagreb indices. Filomat. 2012;26(1):93–100.
2. Kulli VR, Pal M, Samanta S, Pal A. Handbook of Research of Advanced Applications of Graph Theory in Modern Society. Hershey, USA: IGI Global; 2020.
3. Ghods M, Ramezani Tousi J. Computing Revan Polynomials and Revan Indices of Copper (I) Oxide and Copper (II) Oxide. Communications in Combinatorics, Cryptography & Computer Science. 2021;1(1):50–8.
4. Kosari S. On spectral radius and Zagreb Estrada index of graphs. Asian-European Journal of Mathematics. 2023;16(10):4167.
5. Kosari S, Dehgardi N, Khan A. Lower bound on the KG-Sombor index. Communications in Combinatorics and Optimization. 2023;8(4):751–7.
6. Ramezani Tousi J, Ghods M. Computing K Banhatti and K Hyper Banhatti Indices of Titania Nanotubes TiO₂ [m, n]. Journal of Information and Optimization Sciences. 2023;44(2):207–16.
7. Ramezani Tousi J, Ghods M. Investigating Banhatti indices on the molecular graph and the line graph of Glass with M-polynomial approach. Proyecciones Journal of Mathematics. 2024;43(1):199–219.
8. Shi X, Kosari S, Hameed S, Shah AG, Ullah S. Application of connectivity index of cubic fuzzy graphs for identification of danger zones of tsunami threat. PLoS ONE. 2024;19(1):1–24. pmid:38289906
9. Havare ÖÇ. Quantitative structure analysis of some molecules in drugs used in the treatment of COVID-19 with topological indices. Polycyclic Aromatic Compounds. 2022;42(8):5249–60.
10. Huang L, Wang Y, Pattabiraman K, Danesh P, Siddiqui MK, Cancan M. Topological indices and QSPR modeling of new antiviral drugs for cancer treatment. Polycyclic Aromatic Compounds. 2023;43(9):8147–70.
11. Breiman L. Random forests. Machine Learning. 2001;45(1):5–32. https://doi.org/10.1023/A:1010933404324
12. Vapnik VN. The nature of statistical learning theory. Springer; 1995.
13. Ghani MU, Sultan F, Tag El Din ESM, Khan AR, Liu JB, Cancan M. A Paradigmatic Approach to Find the Valency-Based K-Banhatti and Redefined Zagreb Entropy for Niobium Oxide and a Metal–Organic Framework. Molecules. 2022;27(20):6975. pmid:36296567
14. Fajtlowicz S. On conjectures of Graffiti. Discrete Mathematics. 1988;72(1):113–8.
15. PubChem. PubChem: National Center for Biotechnology Information [Internet]. [cited 2024 Sep 11]. Available from: https://pubchem.ncbi.nlm.nih.gov/.
16. Kulli VR. Computation of Some Temperature Indices of HC₅C₅ [p, q] Nanotubes. Annals of Pure and Applied Mathematics. 2019;20(2):69–74.
17. Kulli VR. Inverse sum temperature index and multiplicative inverse sum temperature index of certain nanotubes. International Journal of Recent Scientific Research. 2021;12(01):40635–9.
18. Husin MN, Khan AR, Awan NUH, Campena FJH, Tchier F, Hussain S. Multicriteria decision making attributes and estimation of physicochemical properties of kidney cancer drugs via topological descriptors. PLoS ONE. 2024;19(5):e0302276. pmid:38713692
19. Jahanbani A, Khoeilar R, Cancan M. On the Temperature Indices of Molecular Structures of Some Networks. Journal of Mathematics. 2022;2022(1):1–7.
20. Kansal N, Garg P, Singh O. Temperature-based topological indices and QSPR Analysis of COVID-19 Drugs. Polycyclic Aromatic Compounds. 2023;43(5):4148–69.
21. Ramezani Tousi J, Ghods M. Some polynomials and degree-based topological indices of molecular graph and line graph of Titanium dioxide nanotubes. Journal of Information and Optimization Sciences. 2024;45(1):95–106.
22. Zhang Y, Khalid A, Siddiqui MK, Rehman H, Ishtiaq M, Cancan M. On analysis of temperature based topological indices of some Covid-19 drugs. Polycyclic Aromatic Compounds. 2023;43(4):3810–26.
23. ChemicalBook. ChemicalBook: Chemical Information [Internet]. [cited 2024 Sep 11]. Available from: https://www.chemicalbook.com/.
24. ChemSpider. Search and share chemistry [Internet]. 2021. Available from: http://www.chemspider.com/AboutUs.aspx.

