Intestinal Collinsella may mitigate infection and exacerbation of COVID-19 by producing ursodeoxycholate

The mortality rates of COVID-19 vary widely across countries, but the underlying mechanisms remain unelucidated. We aimed to elucidate the relationship between gut microbiota and the mortality rates of COVID-19 across countries. Raw sequencing data of 16S rRNA V3-V5 regions of gut microbiota in 953 healthy subjects in ten countries were obtained from public databases. We built a generalized linear model (GLM) to predict the COVID-19 mortality rates using gut microbiota. The GLM revealed that low abundance of genus Collinsella predicted high COVID-19 mortality rates with a markedly low p-value. Unsupervised clustering of gut microbiota in 953 subjects yielded five enterotypes. The mortality rates increased from enterotypes 1 to 5, whereas the abundances of Collinsella decreased from enterotypes 1 to 5 except for enterotype 2. Collinsella produces ursodeoxycholate. Ursodeoxycholate was previously reported to inhibit binding of SARS-CoV-2 to angiotensin-converting enzyme 2; suppress pro-inflammatory cytokines like TNF-α, IL-1β, IL-2, IL-4, and IL-6; have antioxidant and anti-apoptotic effects; and increase alveolar fluid clearance in acute respiratory distress syndrome. Ursodeoxycholate produced by Collinsella may prevent COVID-19 infection and ameliorate acute respiratory distress syndrome in COVID-19 by suppressing cytokine storm syndrome.


Introduction
COVID-19 is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The infection has rapidly spread worldwide and has had a great impact on medical care and the economy. SARS-CoV-2 causes widely variable phenotypes, ranging from no symptoms, through mild and rapidly progressive disease, to respiratory failure. The mortality rate increases exponentially with age, reaching about one in ten patients over 80 years of age [1]. Risk factors associated with high mortality rates include obesity, diabetes, tobacco smoking, a past history of respiratory infection, and aging [2]. These risk factors should be similar among countries. However, there are large differences in mortality rates between countries. Mortality rates are higher in the United States, Europe, and South America than in Asia. In Europe, Spain and Italy have high mortality rates, whereas Germany and Northern Europe have low mortality rates (https://ourworldindata.org/). Similarly, in Asia, Taiwan and China have lower mortality rates than Japan and Korea. The differences could be accounted for by differences in genome, previous exposure to less virulent coronaviruses, policies against the COVID-19 pandemic, and/or gut microbiota. A link between gut microbiota and COVID-19 has been postulated based on four observations [3]. First, chronic obstructive pulmonary disease (COPD) and inflammatory bowel diseases (IBDs) share similarities in epidemiology, clinical features, and inflammatory pathologies, which can be accounted for by gut dysbiosis, although other explanations are still possible [4]. Second, gut microbiota regulates the innate and adaptive immune systems [5]. Third, germ-free mice, which lack gut bacteria, are not able to clear pathogens from the lungs [6]. Fourth, the removal of neomycin-sensitive gut bacteria in mice increases susceptibility to influenza virus infection [7].
Temporal profiles of gut microbiota in COVID-19 infection have been reported but without consistent bacterial features [8][9][10][11].
In the course of our worldwide analysis of gut microbiota in Parkinson's disease [12] and idiopathic rapid-eye-movement sleep behavior disorder [13], we noticed that the mortality rates of COVID-19 may be associated with gut microbiota. We analyzed the relationship between the composition of intestinal bacteria in 953 healthy subjects in ten countries and the mortality rates of COVID-19 in these countries, and found that genus Collinsella was negatively correlated with the mortality rates of COVID-19.

Generalized linear model (GLM) analysis
To examine the effects of gut microbiota on the COVID-19 mortality rates across countries, we obtained 16S rRNA V3-V5 sequencing data of 953 healthy subjects in ten countries in the Organization for Economic Co-operation and Development (OECD) (Table 1), where similarly high medical and hygienic standards were expected and geopolitical factors were expected to introduce less bias. We first analyzed the relative abundance of each intestinal bacterium at the genus level in each subject. We then predicted the COVID-19 mortality rates in ten countries from the 30 most abundant genera using GLM. In GLM analysis, we compared gaussian, gamma, and inverse gaussian distributions, and found that the gamma distribution gave the lowest Akaike's Information Criterion (AIC) (gaussian, 14599; gamma, 14573; and inverse gaussian, 15318). The results of GLM analysis using the gamma distribution are plotted in Fig 1 and are indicated in S1 Table. Genus Collinsella had a marked negative predictive value for the COVID-19 mortality rates with the lowest p-value. Genera Dorea and Fusicatenibacter, which are short chain fatty acid (SCFA)-producing bacteria, also had high predictive values for the COVID-19 mortality rates with the second and third lowest p-values.
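The family selection described above can be illustrated with a minimal sketch that takes the three AIC values reported in the text and picks the lowest one. The function name `select_lowest_aic` is hypothetical, and the sketch does not refit the models; it only reproduces the comparison step.

```python
# AIC values as reported in the text for the three candidate GLM families.
reported_aic = {
    "gaussian": 14599,
    "gamma": 14573,
    "inverse gaussian": 15318,
}


def select_lowest_aic(aic_by_family):
    """Return the family with the lowest AIC and each family's AIC
    difference from that best family (hypothetical helper)."""
    best = min(aic_by_family, key=aic_by_family.get)
    delta = {fam: aic - aic_by_family[best] for fam, aic in aic_by_family.items()}
    return best, delta


best_family, delta_aic = select_lowest_aic(reported_aic)
print(best_family)
print(delta_aic)
```

Reporting the differences alongside the winner makes it easy to see how close the competing families are: here the gaussian fit is much closer to the gamma fit than the inverse gaussian fit is.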

Linked Inference of Genomic Experimental Relationships (LIGER) analysis
Non-negative matrix factorization of gut microbiota in 953 healthy subjects in ten countries by a single-cell RNA-seq analysis tool, LIGER [23], yielded five enterotypes (Fig 2A). The mean relative abundances of the 30 most prevalent genera for each enterotype are collated in S2 Table. The ten countries were sorted in order of increasing COVID-19 mortality rates, and the fractions of the five enterotypes were plotted in Fig 2B. The fraction of enterotype 1 was high in countries where the mortality rates were low, whereas the fractions of enterotypes 4 and 5 were high in countries where the mortality rates were high. Indeed, color-coding of the COVID-19 mortality rates on the LIGER plot showed that the mortality rates increased from the right side to the left side (Fig 2C). The average mortality rates increased from enterotypes 1 to 5 (Fig 2D). Color-coding of the relative abundance of genus Collinsella on the LIGER plot showed that Collinsella decreased from the right side to the left side (Fig 2E). The average relative abundances of Collinsella decreased from enterotypes 1 to 5 except for enterotype 2 (Fig 2F). Thus, in five enterotypes in ten countries, high Collinsella was predictive of low mortality rates of COVID-19. Plots of the average relative abundances of genera Dorea and Fusicatenibacter, which had the second and third lowest p-values in GLM analysis (S1 Table), for each enterotype showed that genus Dorea decreased from enterotypes 1 to 5 except for enterotype 2 (S1a Fig) and that genus Fusicatenibacter was the highest in enterotype 1 (S1b Fig). Thus, genera Dorea and Fusicatenibacter were also predictive of mortality rates of COVID-19, although their p-values in GLM analysis were 4.20 × 10⁷ and 5.54 × 10⁸ times higher than that of genus Collinsella.
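The ordered trends across enterotypes in Fig 2D and 2F are assessed in the figure legend with a Jonckheere-Terpstra trend test. The following is a minimal sketch of that test under stated assumptions: it uses the large-sample normal approximation without a tie correction, tests only the "increasing" alternative, and runs on hypothetical toy values rather than the study data. The function name is illustrative and not taken from the authors' pipeline.

```python
import math


def jonckheere_terpstra(groups):
    """Jonckheere-Terpstra statistic for groups given in their hypothesized
    order, with a one-sided normal-approximation p-value for an increasing
    trend. No tie correction is applied to the variance."""
    # For every (earlier, later) pair of groups, count observation pairs in
    # which the later-group value is larger; ties contribute 0.5.
    j_stat = 0.0
    for a in range(len(groups)):
        for b in range(a + 1, len(groups)):
            for x in groups[a]:
                for y in groups[b]:
                    if y > x:
                        j_stat += 1.0
                    elif y == x:
                        j_stat += 0.5
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    # Null mean and variance of J (no ties).
    mean = (n * n - sum(s * s for s in sizes)) / 4.0
    var = (n * n * (2 * n + 3) - sum(s * s * (2 * s + 3) for s in sizes)) / 72.0
    z = (j_stat - mean) / math.sqrt(var)
    # Upper-tail probability of the standard normal distribution.
    p_increasing = 0.5 * math.erfc(z / math.sqrt(2.0))
    return j_stat, mean, var, z, p_increasing


# Hypothetical values for three ordered groups (not study data).
toy_groups = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
j_stat, null_mean, null_var, z_score, p_value = jonckheere_terpstra(toy_groups)
print(j_stat, null_mean, round(z_score, 3), p_value)
```

To test a decreasing trend, as for Collinsella abundance, the same function can be applied with the group order reversed.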

Discussion
We made a machine-learning GLM to predict the COVID-19 mortality rates with gut microbiota in 953 healthy subjects in ten countries. Some of the 953 subjects might have been infected by SARS-CoV-2 and might have died, but anonymity of these subjects prevented us from tracing COVID-19 in these subjects. Even if we could trace COVID-19, the number of subjects was too low to analyze gut microbiota in fatal cases. However, as specific bacteria or enterotypes are enriched in specific countries [24,25], we hypothesized that gut microbiota in healthy subjects in ten countries might account for the widely variable COVID-19 mortality rates across countries. We found that genus Collinsella was negatively correlated with the mortality rate with a markedly low p-value of 1.58 × 10⁻¹⁵ (S1 Table and Fig 1). We next performed unsupervised clustering of gut microbiota in 953 healthy subjects using LIGER, and observed the presence of five enterotypes (Fig 2A). The mortality rates increased (Fig 2B and 2D) and the relative abundances of genus Collinsella decreased (Fig 2F) from enterotypes 1 to 5. SCFA-producing genera Dorea and Fusicatenibacter had the second and third lowest p-values in GLM analysis and were highly correlated with the COVID-19 mortality rates (S1 Table and Fig 1).
In accordance with our observations, analyses of gut microbiota in COVID-19 patients in Hong Kong [9] and Portugal [26] showed that genus Collinsella and SCFA-producing bacteria were decreased in severe COVID-19 patients compared to mild COVID-19 patients. About 5% of primary bile acids escape absorption in the small intestine and are biotransformed to secondary bile acids including ursodeoxycholic acid (UDCA) by intestinal bacteria [27]. Genus Collinsella carries a gene encoding an NADPH-dependent 7β-hydroxysteroid dehydrogenase (7β-HSDH) (Fig 3) [28], and is an essential intestinal bacterium for producing UDCA and other secondary bile acids. Three lines of evidence suggest that UDCA prevents SARS-CoV-2 infection and/or ameliorates COVID-19. First, docking simulation indicates that UDCA blocks binding of the spike region of SARS-CoV-2 to ACE2, and UDCA indeed prevents their interaction in a dose-dependent manner [29,30]. Second, UDCA suppresses proinflammatory cytokines like TNF-α, IL-1β, IL-2, IL-4, and IL-6 at the mRNA and protein levels [31,32]. UDCA also has an antioxidant effect as a remarkable scavenger [33]. UDCA additionally has an anti-apoptotic effect [34]. UDCA is thus expected to suppress the cytokine storm syndrome causing respiratory failure in COVID-19 [35,36]. Third, UDCA increases alveolar fluid clearance in a rat model of acute respiratory distress syndrome (ARDS) via the ALX/cAMP/PI3K pathway [37]. UDCA has been approved by the US Food and Drug Administration (FDA) and in other countries for primary biliary cholangitis and other cholestatic disorders, and has no major adverse effects [38]. It is thus expected that UDCA prevents binding of SARS-CoV-2 to ACE2, and ameliorates COVID-19 by suppressing pro-inflammatory cytokines and by mitigating ARDS. Further prospective and/or retrospective studies of COVID-19 patients are required to confirm whether genus Collinsella is protective against COVID-19 infection and mitigates ARDS in COVID-19.

Fig 2. Five enterotypes in ten countries and their relevance to the COVID-19 mortality rates and genus Collinsella. (a) Unsupervised clustering of gut microbiota in 953 healthy subjects in ten countries by LIGER generated five enterotypes. Each subject is plotted with t-SNE and is color-coded by its enterotype. (b) Fractions of enterotypes 1 to 5 in ten countries. Ten countries are sorted in ascending order of the COVID-19 mortality rates per million, which are indicated in parentheses. (c) The t-SNE plot is color-coded by the COVID-19 mortality rates in ten countries. (d) Mean and standard error of the COVID-19 mortality rates in enterotypes 1 to 5. P = 2.2E-16 by Jonckheere-Terpstra trend test. (e) The t-SNE plot is color-coded by the relative abundance of genus Collinsella. (f) Mean and standard error of the relative abundance of genus Collinsella in enterotypes 1 to 5. P = 3.7E-12 by Jonckheere-Terpstra trend test. Color codes in (a), (b), (d), and (f) are matched. https://doi.org/10.1371/journal.pone.0260451.g002

PLOS ONE

Datasets
The prevalence of COVID-19 is highly variable from country to country. The accurate numbers of COVID-19 patients are difficult to estimate, because young subjects infected with SARS-CoV-2 tend to be asymptomatic and the chance of detecting SARS-CoV-2 infection depends on the accessibility of PCR tests [39][40][41][42][43][44]. Compared to the number of COVID-19 patients, the number of deaths due to COVID-19 should be less biased by geopolitical factors. To further reduce bias from geopolitical factors, we selected ten OECD countries for which the raw 16S rRNA V3-V5 sequencing data of normal subjects were publicly available (Japan [12], Korea [14], Finland [15], Canada [16], Germany [17], Mexico [18], USA [19], Italy [20], UK [21], and Belgium [22]) (Table 1). OECD countries are expected to have high medical and hygienic standards, and dependable health statistics. As our analysis was inevitably biased by the number of samples in each country, we randomly selected 137 samples out of 1,561 samples in Canada and 2,700 samples in the United Kingdom. The total number of samples was 953 in ten countries. The datasets were not filtered by age or sex, because we hoped to analyze as many subjects as possible and because average gut microbiota is similar from ages 20 to 70 in each country [45]. The accumulated numbers of deaths per million people were obtained from Our World in Data (https://ourworldindata.org/) on February 9, 2021 (Table 1), when vaccines were not widely used in these countries.
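The random down-sampling of the two largest cohorts can be sketched as follows. This is an illustration only: the sample identifiers and the seed are hypothetical, and the sketch reads the text as drawing 137 samples from each of the Canadian and UK cohorts, which is an assumption about the wording rather than a confirmed detail of the authors' procedure.

```python
import random


def subsample_ids(sample_ids, k, seed):
    """Draw k distinct sample IDs without replacement, reproducibly."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    return sorted(rng.sample(sample_ids, k))


# Hypothetical identifiers; only the cohort sizes (1,561 and 2,700) and the
# subsample size (137) come from the text.
canada_ids = [f"CA_{i:04d}" for i in range(1561)]
uk_ids = [f"UK_{i:04d}" for i in range(2700)]

canada_subset = subsample_ids(canada_ids, 137, seed=1)
uk_subset = subsample_ids(uk_ids, 137, seed=1)
print(len(canada_subset), len(uk_subset))
```

Fixing the seed keeps the selected subset stable across reruns, so downstream results do not change with each execution.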

Taxonomic analysis of gut microbiota
Taxonomic analysis was performed using QIIME2 as previously described [12,13]. Briefly, the FASTQ files were quality-filtered, and Amplicon Sequence Variants (ASVs) were generated by DADA2. No sample was discarded during this step. For taxonomic determination, a reference classifier was trained on the SILVA taxonomy database release 138 using q2-feature-classifier.
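Once reads carry genus labels, per-subject relative abundances and a ranking of the most abundant genera can be derived from the count table. The sketch below is a minimal illustration with hypothetical read counts. Ranking genera by their mean relative abundance across subjects is an assumption, since the text does not state how "most abundant" was defined, and the function names are not from QIIME2.

```python
def relative_abundance(counts):
    """Convert one subject's genus -> read count mapping into fractions."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("subject has no reads")
    return {genus: c / total for genus, c in counts.items()}


def top_genera(rel_by_subject, n):
    """Rank genera by mean relative abundance across subjects and keep the
    top n. A genus absent from a subject contributes 0 for that subject."""
    totals = {}
    for rel in rel_by_subject.values():
        for genus, frac in rel.items():
            totals[genus] = totals.get(genus, 0.0) + frac
    n_subjects = len(rel_by_subject)
    means = {g: t / n_subjects for g, t in totals.items()}
    # Sort by descending mean abundance, breaking ties alphabetically.
    return sorted(means, key=lambda g: (-means[g], g))[:n]


# Hypothetical read counts for illustration only.
counts_by_subject = {
    "subject_1": {"Bacteroides": 500, "Collinsella": 300, "Dorea": 200},
    "subject_2": {"Bacteroides": 600, "Collinsella": 100, "Fusicatenibacter": 300},
}
rel_by_subject = {s: relative_abundance(c) for s, c in counts_by_subject.items()}
print(top_genera(rel_by_subject, 2))
```

In the study setting, `n` would be 30 and the input would hold one entry per subject.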

GLM analysis
We first used GLM to predict the mortality rates using gut microbiota of healthy subjects in ten countries. To build the GLM, we used the 30 most abundant intestinal genera. The variance inflation factor (VIF) was calculated for each pair of genera using the R package HH version 3.1-40. We confirmed that the VIFs were all less than 2, indicating that there was no multicollinearity among these genera. We used the "glm" function in R. We applied the gaussian, gamma, and inverse gaussian distributions to GLM, and adopted the distribution with the lowest AIC.
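The multicollinearity check can be illustrated with a small stdlib-only sketch. It assumes the standard per-predictor definition, VIF_j = 1 / (1 - R_j^2), computed as the j-th diagonal element of the inverse correlation matrix of the predictors. It is not the HH package implementation, and the abundance values are hypothetical.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for equal-length predictor columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    p = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(p)] for i in range(p)]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    p = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(p)]
           for i, row in enumerate(matrix)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]


def variance_inflation_factors(columns):
    """Return one VIF per predictor column."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Hypothetical abundances of two genera in four subjects; the columns are
# deliberately correlated so that the VIFs are clearly above 1.
genus_a = [1.0, 2.0, 3.0, 4.0]
genus_b = [1.0, 3.0, 2.0, 4.0]
vifs = variance_inflation_factors([genus_a, genus_b])
print(vifs)
```

With two predictors the result reduces to 1 / (1 - r^2), where r is their Pearson correlation, which gives a quick sanity check on the matrix route.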

LIGER analysis
We previously developed three topic-based tools to analyze gut microbiota and enterotypes [46][47][48], but here we used LIGER, which was developed for single-cell RNA-seq analysis [23], to identify enterotypes in ten countries. We previously applied LIGER to intestinal enterotype analysis in Parkinson's disease and rapid eye movement sleep behavior disorder [13]. We then analyzed the correlation between the enterotypes and the mortality rates.
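To give a sense of the factorization idea behind enterotype discovery, the following sketch runs plain non-negative matrix factorization with Lee-Seung multiplicative updates on a hypothetical subjects-by-genera matrix and assigns each subject to its dominant factor. This is a simplified stand-in and not LIGER itself: LIGER uses an integrative NMF across datasets followed by its own clustering steps, none of which are reproduced here. All function names and the toy matrix are assumptions for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||v - w h||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, n_iter=200, seed=0, eps=1e-12):
    """Factor non-negative v (subjects x genera) into w (subjects x k) and
    h (k x genera) with multiplicative updates for squared error."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


def assign_clusters(w):
    """Assign each subject (row of w) to the factor with the largest loading."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]


# Hypothetical relative abundances: 4 subjects x 4 genera.
toy = [
    [0.6, 0.3, 0.1, 0.0],
    [0.5, 0.4, 0.1, 0.0],
    [0.0, 0.1, 0.3, 0.6],
    [0.1, 0.0, 0.4, 0.5],
]
w_fit, h_fit = nmf(toy, k=2)
print(assign_clusters(w_fit))
print(frobenius_error(toy, w_fit, h_fit))
```

Taking the dominant factor per subject is the simplest possible cluster assignment; the number of factors `k` is a free parameter here, whereas the study reports five enterotypes.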