Associations between artificial sweetener intake from cereals, coffee, and tea and the risk of type 2 diabetes mellitus: A genetic correlation, mediation, and Mendelian randomization analysis

Background: Previous studies have emphasized the association between the intake of artificial sweeteners (AS) and type 2 diabetes mellitus (T2DM), but whether this relationship is causal remains unclear. Methods: This study employed univariate Mendelian randomization (MR) analysis to assess the causal link between AS intake from various sources and T2DM. Linkage disequilibrium score (LDSC) regression was used to evaluate the genetic correlation between phenotypes. Multivariate and mediation MR were applied to investigate confounding factors and mediating effects. Data on AS intake from different sources (N = 64,949) were sourced from the UK Biobank, while T2DM data were derived from the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) consortium. The primary method was inverse variance weighted (IVW), complemented by three validation techniques. A series of sensitivity analyses was performed to evaluate pleiotropy and heterogeneity. Results: LDSC analysis revealed a significant genetic correlation between AS intake from different sources and T2DM (rg range: -0.006 to 0.15, all P < 0.05). After false discovery rate (FDR) correction, the primary IVW method indicated that AS intake in coffee was a risk factor for T2DM (OR = 1.265, 95% CI: 1.035–1.545, P = 0.021, PFDR = 0.042). Further multivariable and mediation MR analyses identified high-density lipoprotein cholesterol (HDL-C) as mediating a portion of this causal relationship. In reverse MR analysis, T2DM was positively associated with AS intake in coffee (β = 0.013, 95% CI: 0.004–0.022, P = 0.004, PFDR = 0.012), cereal (β = 0.007, 95% CI: 0.002–0.012, P = 0.004, PFDR = 0.012), and tea (β = 0.009, 95% CI: 0.001–0.017, P = 0.036, PFDR = 0.049). No other causal associations were identified (P > 0.05, PFDR > 0.05). Conclusion: The MR analysis established a causal relationship between AS intake in coffee and T2DM. The mediation by HDL-C highlights potential metabolic pathways underpinning this relationship.


Introduction
Diabetes mellitus (DM) represents a significant and pressing global health concern [1], with type 2 diabetes mellitus (T2DM) constituting approximately 90% of all diabetes cases worldwide [2]. The World Health Organization (WHO) estimates that over 422 million people currently live with diabetes globally and that this number will reach 629 million by 2045 [3,4]. Notably, the prevalence of diabetes has risen in developing nations, including China and Pakistan, placing considerable direct and indirect financial strain on society [5]. Consequently, identifying novel modifiable risk factors for T2DM is imperative for informing clinical management strategies and mitigating the onset and progression of the disease.
As lifestyles change, the demand for sweet treats is gradually increasing. Artificial sweeteners (AS), as low-calorie and sugar-free alternatives, have gained popularity as sugar substitutes to decrease caloric intake [6]. The most popular AS include aspartame, saccharin, acesulfame potassium, and sucralose [7], which are commonly used in foods and beverages such as cereals [8], coffee [9], and tea [10]. Current research has identified associations between AS and T2DM; however, findings from observational studies are inconsistent. Some investigations have reported a 3% elevated relative risk of T2DM per additional daily serving of AS [11][12][13][14], while others have shown that the intake of artificially sweetened beverages, compared with water, is associated with a 21% rise in T2DM incidence [15]. Other studies have found no correlation between AS and T2DM [16,17]. Despite the widespread use of AS in the daily diet and their popularity among people with T2DM, these inconsistent findings mean there is no consensus on whether AS are causally related to diabetes.
Previous research has struggled to establish a definitive causal relationship between exposures and outcomes, largely because of confounding and reverse causation. Given these constraints of observational studies, Mendelian randomization (MR) offers a valuable alternative. MR studies use genetic variants identified through genome-wide association analyses as instrumental variables (IVs) to estimate the causal effect of an exposure on an outcome. Under certain conditions, this technique allows causal inferences to be drawn by using genetic variants as surrogates for the exposure [18]. Often described as a natural randomized controlled trial, MR relies on the Mendelian laws of inheritance, under which parental alleles are randomly allocated to offspring. Compared with observational epidemiological research, this approach is less vulnerable to confounding and therefore provides a higher level of evidence. This study employs univariate MR (UVMR), multivariate MR (MVMR), mediation MR, and linkage disequilibrium score (LDSC) regression to investigate the relationship between AS intake from various sources and T2DM, and further examines the mediating roles of five confounding factors.

Study design
The datasets for this study were obtained from genome-wide association studies (GWAS). Each included GWAS had obtained the necessary approvals from its respective institutional review board. As this study is a secondary analysis of publicly available data, no additional ethical permissions were required. IVs for the exposure were required to satisfy three core assumptions: (i) the selected genetic variant should be strongly associated with the exposure; (ii) the variant should not be associated with any known confounders; and (iii) the variant should affect the outcome solely through the exposure, with no alternative pathways [19]. The MR approach is detailed in Fig 1, while the summary statistics from the data sources are presented in Table 1.

Selection of genetic instrumental variables
The summary-level GWAS data for AS intake in coffee/tea/cereal were all sourced from the UK Biobank (UKB) [20], encompassing 64,949 European individuals. This information was collected using questionnaires in which participants detailed the amount of AS (for example, Canderel) they added to their daily coffee or tea/infusion on a per-drink basis. Additionally, those who reported consuming cereal or porridge the previous day specified the quantity of sweetener added per bowl. To ensure the accuracy of MR analyses, we adhered to stringent criteria for the selection of single nucleotide polymorphisms (SNPs): (i) SNPs selected as IVs must be associated with the defined exposure at a genome-wide significance level (P < 5×10⁻⁸). Because no SNPs reached genome-wide significance for the exposures, we applied a relaxed threshold of P < 5×10⁻⁶ to capture more SNPs for these phenotypes [21]. (ii) Chosen SNPs were further filtered to ensure no associations with potential confounders and to preserve independence among them, thereby mitigating potential biases from linkage disequilibrium (r² < 0.001, clumping distance = 10,000 kb). (iii) The strength of the selected SNPs as IVs was assessed using F-statistics (F = β²/SE², where β is the SNP-exposure effect estimate and SE is its standard error) to evaluate the possibility of weak instrumental variables [22]. A high F-statistic indicates robust instrumental strength, and our criteria required all SNPs to have an F-statistic above 10. (iv) To enhance the reliability of our results, we applied MR-Steiger filtering, which systematically excludes variants more strongly correlated with the outcome than with the exposure [23]. (v) Where a SNP was absent from the outcome dataset, we used the SNiPA online platform (http://snipa.helmholtz-muenchen.de/snipa3/), based on European population genotype data from Phase 3 of the 1000 Genomes Project, to identify a proxy SNP in strong linkage disequilibrium (r² > 0.8) with the original SNP. (vi) The effect of the SNP on the exposure and its effect on the outcome should align with the same allele.
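Criteria (i) and (iii) above can be illustrated with a minimal Python sketch. The SNP identifiers and summary statistics are hypothetical, and the helper names `f_statistic` and `select_instruments` are introduced here for illustration only; this is not the authors' pipeline, which also involved clumping, confounder screening, Steiger filtering, and harmonization.

```python
def f_statistic(beta, se):
    """Approximate per-SNP F-statistic, F = beta^2 / se^2."""
    return (beta / se) ** 2


def select_instruments(snps, p_threshold=5e-6, f_threshold=10.0):
    """Keep SNPs below the relaxed P threshold whose F-statistic exceeds 10."""
    kept = []
    for snp in snps:
        f_stat = f_statistic(snp["beta"], snp["se"])
        if snp["pval"] < p_threshold and f_stat > f_threshold:
            kept.append({**snp, "F": f_stat})
    return kept


# Hypothetical SNP-exposure summary statistics (illustrative values only).
candidates = [
    {"rsid": "rs_demo_1", "beta": 0.050, "se": 0.010, "pval": 5.7e-7},
    {"rsid": "rs_demo_2", "beta": 0.020, "se": 0.010, "pval": 4.6e-2},
    {"rsid": "rs_demo_3", "beta": -0.045, "se": 0.009, "pval": 5.7e-7},
]

instruments = select_instruments(candidates)
for snp in instruments:
    print(snp["rsid"], round(snp["F"], 1))
```

In this toy input the second SNP fails both the P-value and the F-statistic filters, so only the first and third are retained.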

Source of outcome phenotypes
The summary-level GWAS meta-analysis for T2DM integrated 22 cohorts, sourced from the AMP-T2D Knowledge Portal and the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) consortium [24]. T2DM was defined by ICD-10 codes, and the dataset includes 180,834 cases and 492,191 controls from European populations.

Primary MR analysis
Within the UVMR framework, individual IVs were evaluated using the Wald ratio test. Where multiple IVs were available, the multiplicative random-effects inverse-variance-weighted (IVW) approach was used as the primary method, supplemented by the MR-Egger and weighted median methods. The IVW estimate is a weighted average of the per-SNP Wald ratios, with each ratio weighted by the inverse of its variance [28]. IVW yields dependable estimates when all genetic variants are valid instruments; the weighted median remains consistent provided that at least half of the weight comes from valid instruments, and MR-Egger can accommodate the case in which all instruments are invalid, provided the pleiotropic effects are independent of instrument strength [29]. Moreover, we used the constrained maximum likelihood (CML) method. This technique allows for combined estimation across multiple genetic variants while considering possible confounders and genetic heterogeneity, which improves the precision and reliability of the estimates, especially when many genetic variants and potential confounding variables are involved [30]. Multiple comparisons were accounted for using the false discovery rate (FDR), with an FDR-adjusted P-value < 0.05 taken as evidence of a causal association. Associations with a raw P-value < 0.05 but an FDR-adjusted P-value above 0.05 were considered suggestive rather than conclusive.
For the MVMR and mediation MR analyses, we considered BMI, HbA1c, TG, LDL-C, and HDL-C as potential confounders or mediators on the exposure-outcome pathway and employed MVMR to estimate the direct causal effect of the exposure. The first MVMR assumption is that the genetic variants are associated with at least one of the exposures included in the model, while the remaining assumptions parallel those of UVMR [31]. Mediated effects were then quantified in two steps. First, the effect of the exposure on each mediator was estimated with UVMR using the IVW approach. Second, MVMR was used to estimate the effect of each mediator on the outcome. The product of these two estimates gave the indirect effect of the exposure through that mediator, and the ratio of the indirect effect to the total effect gave the proportion of the total effect attributable to each mediator.
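The IVW step can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the software used in the study: the input values are hypothetical, the Wald ratio standard error uses the common first-order approximation se_out / |beta_exp|, and the multiplicative random-effects correction follows the convention of inflating the standard error by the square root of max(1, Q / (k - 1)).

```python
import math


def ivw_estimate(beta_exp, beta_out, se_out):
    """Inverse-variance-weighted MR estimate from per-SNP summary statistics."""
    # Per-SNP Wald ratios and their first-order standard errors.
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    ratio_se = [so / abs(be) for be, so in zip(beta_exp, se_out)]

    # Weights are the inverse variances of the Wald ratios.
    weights = [1.0 / s ** 2 for s in ratio_se]
    w_sum = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / w_sum
    se_fixed = math.sqrt(1.0 / w_sum)

    # Cochran's Q measures heterogeneity among the per-SNP ratios.
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    dof = len(ratios) - 1

    # Multiplicative random effects: inflate the SE when Q exceeds its df.
    phi = max(1.0, q / dof) if dof > 0 else 1.0
    se_random = se_fixed * math.sqrt(phi)
    return {"beta": estimate, "se": se_random, "Q": q, "dof": dof}


# Hypothetical SNP-exposure and SNP-outcome associations.
result = ivw_estimate(
    beta_exp=[0.10, 0.20, 0.05],
    beta_out=[0.02, 0.05, 0.01],
    se_out=[0.01, 0.01, 0.01],
)

# For a binary outcome, the log-odds estimate converts to an odds ratio.
odds_ratio = math.exp(result["beta"])
ci_low = math.exp(result["beta"] - 1.96 * result["se"])
ci_high = math.exp(result["beta"] + 1.96 * result["se"])
print(round(odds_ratio, 3), round(ci_low, 3), round(ci_high, 3))
```

The same Q statistic returned here is the quantity examined in the heterogeneity test; comparing it with a chi-square distribution on `dof` degrees of freedom gives the corresponding P-value.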
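The product-of-coefficients calculation described above can be sketched as follows. All effect sizes are hypothetical, the function name `mediation_summary` is introduced for illustration, and the delta-method standard error is a common choice that the text does not confirm was used in this study.

```python
import math


def mediation_summary(beta_total, beta_exp_med, se_exp_med, beta_med_out, se_med_out):
    """Indirect effect, its delta-method SE, and the proportion mediated."""
    # Indirect effect: (exposure -> mediator) x (mediator -> outcome).
    indirect = beta_exp_med * beta_med_out
    # First-order delta-method standard error of the product.
    indirect_se = math.sqrt(
        beta_exp_med ** 2 * se_med_out ** 2 + beta_med_out ** 2 * se_exp_med ** 2
    )
    return {
        "indirect": indirect,
        "indirect_se": indirect_se,
        "direct": beta_total - indirect,
        "proportion_mediated": indirect / beta_total,
    }


# Hypothetical estimates on the log-odds scale (not values from the study).
summary = mediation_summary(
    beta_total=0.24,     # total effect of exposure on outcome (UVMR)
    beta_exp_med=-0.10,  # effect of exposure on mediator (UVMR)
    se_exp_med=0.04,
    beta_med_out=-0.30,  # effect of mediator on outcome (MVMR)
    se_med_out=0.05,
)
print(round(summary["indirect"], 3), round(summary["proportion_mediated"], 3))
```

With these toy inputs the indirect effect is 0.03 and the mediator accounts for 12.5% of the total effect; the proportion is only interpretable when the indirect and total effects share the same sign.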

Genetic correlation analysis
Linkage disequilibrium score (LDSC) regression, applied to summary-level GWAS datasets, is an established technique for estimating genetic correlations between complex diseases or traits. The method separates genuine polygenic signal from potential confounders such as cryptic relatedness or population stratification [32]. A statistically and quantitatively significant genetic correlation implies that the overall phenotypic correlation is not explained by environmental confounders alone [32].
For detailed investigations into genetic correlations between specific exposures and corresponding phenotypic outcomes, the LDSC platform is available at https://github.com/bulik/ldsc.

Sensitivity analysis
Within the UVMR framework, we ran a series of tests to validate the robustness of the analysis. Cochran's Q test was used to measure heterogeneity among the chosen genetic instruments, with a P-value below 0.05 indicating significant heterogeneity between the SNPs under examination [33]. MR-Egger regression was employed to explore potential directional pleiotropy [34]. An intercept P-value below 0.05 in the MR-Egger regression indicates considerable directional pleiotropy, although the method may have limited accuracy [35]. The MR Pleiotropy Residual Sum and Outlier (MR-PRESSO) test was used to identify outliers and assess horizontal pleiotropy, with a global P-value below 0.05 considered significant [36]. Excluding the outliers it identifies yields an outlier-corrected estimate. Additionally, the leave-one-out method was used to evaluate the influence of individual SNPs on the overall results [37]. The variance explained by each instrumental SNP was computed as R² = 2 × MAF × (1 − MAF) × β², where MAF is the minor allele frequency and β is the SNP-exposure effect estimate. The sum of these values gave the total R² used for power computation [38]. Statistical power was derived using the mRnd website [39] (https://shiny.cnsgenomics.com/mRnd/).
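Two of the quantities above, the summed variance explained and the leave-one-out estimates, can be sketched as follows. The inputs are hypothetical, the R² formula assumes effect sizes expressed in standard-deviation units of the exposure, and the leave-one-out step uses a plain fixed-effect IVW average for brevity rather than the full set of methods used in the study.

```python
def variance_explained(maf, beta):
    """Per-SNP variance explained: R^2 = 2 * MAF * (1 - MAF) * beta^2."""
    return 2.0 * maf * (1.0 - maf) * beta ** 2


def ivw(ratios, ratio_ses):
    """Fixed-effect inverse-variance-weighted average of Wald ratios."""
    weights = [1.0 / s ** 2 for s in ratio_ses]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


def leave_one_out(ratios, ratio_ses):
    """Re-estimate the IVW effect with each SNP removed in turn."""
    estimates = []
    for i in range(len(ratios)):
        kept_ratios = ratios[:i] + ratios[i + 1:]
        kept_ses = ratio_ses[:i] + ratio_ses[i + 1:]
        estimates.append(ivw(kept_ratios, kept_ses))
    return estimates


# Hypothetical instruments: minor allele frequencies and exposure effects.
mafs = [0.30, 0.10]
betas = [0.05, 0.10]
total_r2 = sum(variance_explained(m, b) for m, b in zip(mafs, betas))

# Hypothetical per-SNP Wald ratios and their standard errors.
wald_ratios = [0.20, 0.25, 0.20]
wald_ses = [0.10, 0.05, 0.20]
loo = leave_one_out(wald_ratios, wald_ses)

print(round(total_r2, 5), [round(x, 4) for x in loo])
```

If one leave-one-out estimate departs sharply from the others, the dropped SNP is driving the overall result; the summed R² is the value a power calculator such as mRnd takes as input.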

Genetic instrument selection and genetic correlation between phenotypes
F statistics exceeded 20 for all IVs, indicating that bias from weak instruments is unlikely. The number of SNPs selected as IVs ranged from 14 to 176, explaining 0.09% to 28.47% of the variance, and the statistical power ranged from 6% to 100% (S1 Table).
In UVMR, AS intake in coffee showed evidence of a causal relationship with T2DM that passed the statistical significance threshold (P < 0.05 and PFDR < 0.05), and neither of these associations was significant in MVMR analyses that accounted for potential confounders (LDL-C, HDL-C, TG, HbA1c, BMI and all models) (S3 Table). Further mediation MR analysis revealed that HDL-C partially mediates the causal relationship with AS intake in  2). The causal relationships among the three align with the principles of mediation MR (P > 0.05).
A series of sensitivity analyses confirmed the robustness of the forward and reverse UVMR results (Table 3). Cochran's Q statistic suggested no heterogeneity (P > 0.05). MR-PRESSO detected no outliers and no evidence of horizontal pleiotropy (P > 0.05).

Discussion
This study conducted a comprehensive MR analysis to examine the genetic susceptibility linking AS intake from various sources with T2DM. The MR findings corroborated prior epidemiological studies [11][12][13][14], establishing a causal relationship between AS intake in coffee and an elevated risk of T2DM. Moreover, we identified a positive correlation between T2DM and AS intake in coffee, cereal, and tea. LDSC analysis revealed a significant genetic correlation between the exposure and outcome phenotypes. MVMR analyses revealed the influence of several confounding factors, while mediation MR indicated that HDL-C partially mediates the causal relationship. Previous epidemiological research has observed a link between AS and T2DM. Those findings demonstrated an association between artificial sweetener usage and the emergence of insulin resistance and T2DM among diabetic patients, leading to a heightened occurrence of obesity. However, animal studies have hinted at a negative relationship between AS and T2DM [40][41][42]. In contrast, thorough safety assessments have confirmed the safety of AS [43], and reputable organizations have endorsed it [7]. Nevertheless, inherent limitations of observational studies make it difficult to rule out unobserved confounding and reverse causality, so such studies speak to correlation rather than causation. By employing MR analysis, this study minimized the effects of bias and confounding factors, establishing a causal relationship between AS intake in coffee and T2DM.
Several mechanisms may explain these results, including gastrointestinal reactions, insulin resistance and secretion, alterations in the microbiome, and changes in feeding behavior. Chlorogenic acid and caffeic acid, as bioactive components in coffee, are known to influence intestinal motility and gastric acid secretion, subsequently affecting food and nutrient absorption and digestion [44]. When combined with AS, these compounds could potentially influence the secretion of gut hormones. Research conducted by Jing Ma's team suggests that AS can stimulate the secretion of GLP-1 and GIP from the intestinal endocrine cell line GLUTag, as well as GLP-1 secretion from the human L-cell line NCI-H1 [45], subsequently influencing insulin secretion and glucose homeostasis. Furthermore, intake of AS might alter the structure of the gut microbiome, resulting in gut bacterial imbalance and glucose metabolic disturbances. Evidence from Jotham Suez et al. [46] indicates that AS can modify the gut microbiome to induce glucose intolerance in mice and various human subgroups, resulting in sustained hyperglycemic states. Coffee itself, with bioactive compounds such as caffeine and polyphenols, can also alter the gut flora. A review by Astrid Nehlig [47] indicates that coffee consumption mainly affects the population levels of bifidobacteria.
AS may directly or indirectly influence insulin secretion and function. Research by Cristina Bosetti and colleagues [48] suggests that AS such as saccharin can induce pancreatic cells to release insulin, resulting in short-term hyperinsulinemia. Long-term overstimulation might lead to the functional exhaustion of pancreatic cells. Caffeine can accelerate gastric emptying, temporarily increasing blood glucose and insulin resistance. A meta-analysis involving seven cohorts by Xiuqin Shi's team supports this notion [49], suggesting that caffeine intake can reduce insulin sensitivity in healthy subjects, possibly due to interference with intracellular glucose uptake. Considering both effects, artificial sweeteners in coffee might exacerbate this burden, leading to overexertion of pancreatic cells or further glucose metabolic disruption. Coffee, an integral part of daily life, is frequently consumed with high-sugar, high-fat foods like pastries, potentially impacting glucose absorption and metabolic rates [50]. Given the "zero-calorie" characteristic of AS, if consumers mistakenly believe that coffee with AS can offset the intake of other unhealthy foods, this could lead to an overall increase in caloric intake, thus elevating the risk of T2DM. This notion is further supported by the reverse MR analysis, in which T2DM was associated with increased AS intake from various sources. Other beverages, such as tea, might be consumed independently at other times of the day. Research by Bangde Li and colleagues suggests that sweetness and coffee flavor are two key sensory attributes that directly influence consumers [51]. The robust flavor of coffee, compared with tea or cereals, might necessitate more artificial sweetener to achieve the desired sweetness. Consequently, the amount of sweetener consumed might vary by source, which may further explain why AS intake from other sources did not show a causal link with T2DM.
The mediation MR analysis points to a pivotal role for HDL-C as an intermediary in the relationship between AS intake in coffee and the risk of T2DM. HDL-C, commonly termed the "beneficial cholesterol", facilitates reverse cholesterol transport by actively sequestering surplus cholesterol from peripheral tissues and conveying it to the liver for subsequent excretion [52]. AS intake in coffee may influence HDL-C levels by altering lipid metabolism through changes in the gut microbiome [53]. Ample evidence suggests that artificial sweeteners are associated with liver damage [54,55], and liver function, in turn, affects the synthesis and metabolism of HDL-C. Concurrently, HDL-C may affect insulin sensitivity by modulating β-cell function and glucose uptake in peripheral tissues [56]. Therefore, the mediating effect of HDL-C implies that interventions targeting the regulation of HDL-C, combined with controlling AS intake, might offer a synergistic approach for preventing or mitigating the risk of T2DM.
The study has several notable strengths. Primarily, this is the first MR analysis to establish a causal link between specific sources of AS intake and T2DM. Furthermore, because all SNPs used as IVs were identified within the European population, the probability of population stratification bias is diminished, bolstering the credibility of the MR assumptions. In addition, the use of strong instruments (e.g., F statistics well above 10) helps mitigate potential biases stemming from sample overlap [57]. However, our study is not without limitations. While every SNP was scrutinized, not all potential risk factors were considered. Furthermore, the relatively small number of SNPs selected as IVs may explain only a small percentage of the variation in exposure, thereby limiting the statistical power of the causal estimates. In addition, the lack of detailed disease severity and demographic information in the GWAS database made it impossible to undertake further subgroup analyses.

Conclusion
In summary, the MR analysis has established a causal relationship between AS intake in coffee and an elevated risk of T2DM, with HDL-C mediating a portion of this causal effect. The reverse analysis indicates a positive correlation between T2DM and artificial sweetener intake from all sources. Future MR analyses employing larger-scale GWAS summary data and more genetic instruments are needed to corroborate these conclusions.

Fig 1. Overview of research design and analysis strategy. Exposures come from UKB, with outcomes including type 2 diabetes mellitus. The MR framework is based on three fundamental MR assumptions, with MVMR analyses adjusting for five mediating factors for positive results. MVMR, Multivariate MR; UVMR, Univariate MR; BMI, Body Mass Index; SNP, Single Nucleotide Polymorphism; MR-PRESSO, MR Pleiotropy Residual Sum and Outlier; HbA1c, Glycated Hemoglobin A1c; LDL-C, Low Density Lipoprotein Cholesterol; HDL-C, High Density Lipoprotein Cholesterol; TG, Triglyceride; DIAGRAM, DIAbetes Genetics Replication And Meta-analysis; UKB, UK Biobank; AS, Artificial sweetener. https://doi.org/10.1371/journal.pone.0287496.g001

Fig 2. Genetic associations with AS intake from different sources (horizontal axis, standard deviation units) and with T2DM (vertical axis, log odds ratios) at a genome-wide level of significance. (A) AS intake in coffee on T2DM (B) AS intake in tea on T2DM (C) AS intake in cereal on T2DM (D) T2DM on AS intake in coffee (E) T2DM on AS intake in tea (F) T2DM on AS intake in cereal. Horizontal and vertical lines represent 95% confidence intervals for the genetic associations. AS, Artificial sweetener; T2DM, type 2 diabetes mellitus. https://doi.org/10.1371/journal.pone.0287496.g002

MR-Egger detected no horizontal pleiotropy (P > 0.05). The leave-one-out analysis further validated that the causal relationship was not driven by any single SNP (S1 Fig), and the funnel plot showed symmetry (S2 Fig). All SNPs passed the Steiger test, and the direction of causality remained unchanged, further supporting the results. The forest plots can be found in S3 Fig.