Associations between microbial communities and key chemical constituents in U.S. domestic moist snuff

Background Smokeless tobacco (ST) products are widely used throughout the world and contribute to morbidity and mortality in users through an increased risk of cancers and oral diseases. Bacterial populations in ST contribute to taste, but their presence can also create carcinogenic tobacco-specific N-nitrosamines (TSNAs). Previous studies of microbial communities in tobacco products lacked chemistry data (e.g., nicotine, TSNAs) to characterize the products and identify associations between carcinogen levels and taxonomic groups. This study uses statistical analysis to identify potential associations between microbial and chemical constituents in moist snuff products. Methods We quantitatively analyzed 38 smokeless tobacco products for TSNAs using liquid chromatography with tandem mass spectrometry (LC-MS/MS), and nicotine using gas chromatography with mass spectrometry (GC-MS). Moisture content determinations (by weight loss on drying) and pH measurements were also performed. We used 16S rRNA gene sequencing to characterize the microbial composition, and additionally measured total 16S bacterial counts using a quantitative PCR assay. Results Our findings link chemical constituents to their associated bacterial populations. We found core taxonomic groups often varied between manufacturers. When manufacturer and flavor were controlled for as confounding variables, the genus Lactobacillus was found to be positively associated with TSNAs, while the genera Enteractinococcus and Brevibacterium were negatively associated. Three genera (Corynebacterium, Brachybacterium, and Xanthomonas) were found to be negatively associated with nicotine concentrations. Associations were also investigated separately for products from each manufacturer. Products from one manufacturer had a positive association between TSNAs and bacteria in the genus Marinilactibacillus. 
Additionally, we found that TSNA levels in many products were lower compared with previously published chemical surveys. Finally, we observed consistent results when either relative or absolute abundance data were analyzed, while results from analyses of log-ratio-transformed abundances were divergent.


Introduction
Smokeless tobacco (ST) use contributes to oral diseases, increases cancer risks, and results in an unnecessary burden on the healthcare system [1,2]. Moist snuff is the largest category of smokeless tobacco products sold in the United States, having an estimated 5.9 million users [3]. The negative effects of ST are attributed to the wide range of toxicants contained within each product. The microbial components of ST affect its chemistry throughout agricultural practices, curing, and manufacturing. These steps, which extend from curing through aging and fermentation, all contribute to the product's palatability. They also create a metabolically active environment [4][5][6] that incidentally results in more harmful products [7,8].
During tobacco curing and aging, nitrate-reducing microorganisms convert nitrate (NO3−) to nitrite (NO2−) [9]. Nitrite is a reactive species known to be actively transported out of the cells in some bacterial species [10,11]. Once nitrite is in the extracellular environment, it reacts abiotically with abundant tobacco alkaloids, such as nicotine and nornicotine, that have been released by ruptured cells, forming tobacco-specific N-nitrosamines (TSNAs). These chemical reactions occur more favorably at the low pH conditions during curing and aging of tobacco [4,8,12,13]. TSNAs are some of the most potent and abundant carcinogens in smokeless tobacco. Two TSNA compounds in particular, N'-nitrosonornicotine (NNN) and 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK), have been classified by the International Agency for Research on Cancer (IARC) as Group 1 carcinogens (carcinogenic to humans) [14]. Various means have been suggested to reduce TSNAs in ST products. These include sanitizing fermentation vats, adding non-nitrate-reducing bacteria [5], and using agents such as green tea extract or ascorbic acid to neutralize nitrite [15]. Additionally, seeding of a microbe identified as a nitrite-reducing strain of Bacillus amyloliquefaciens to scavenge nitrite has been suggested [16]. Some microbial nitrate reduction has been achieved by newer farming and manufacturing techniques [17]. As microbial activity remains a key process in domestic smokeless tobacco manufacturing, TSNA formation may not be significantly reduced without a fundamental change such as the pasteurization of selected Swedish snus products [18]. The microbial taxa responsible for nitrate-to-nitrite conversion in smokeless tobacco are not known. Culture-independent studies [19][20][21][22][23][24] have confirmed the presence of diverse bacterial communities in ST products, but have failed to yield definitive answers as to which microbes may be associated with TSNAs. 
Most microbial-focused studies to date investigated multiple types of products, but only included a limited number of samples that were not accompanied by relevant chemical measurements.
Several extensive chemical profiles of smokeless tobacco products including TSNA measurements have been published, but without microbial community data [25][26][27][28][29][30][31][32]. Associations between chemical attributes and microbial taxa have been studied, but characterizations are limited to fermenting tobacco intended for cigars, and in another study, lab-produced (noncommercial) smokeless products [22,33]. Additional studies that included chemistry measurements were performed with cigarette and small cigar tobacco, where microbial community changes due to storage conditions were also explored [34][35][36]. However, smokeless tobacco product microbiotas are substantially different from cigar and cigarette tobacco [33,35,37,38].
Although products such as snus and new "tobacco-free" nicotine pouches are rapidly gaining popularity, moist snuff products remain very popular among ST users. For instance, the moist snuff products Copenhagen, Grizzly, and Skoal were the top 3 selling ST brands on the market in 2019 (data from https://www.statista.com/). Due to their popularity, we focused on traditional moist snuff products that are fermented rather than pasteurized. Moist snuff products that utilize fermentation are clearly distinct from unfermented tobacco products, such as Swedish snus, which is subjected to heat treatment to remove microbes, thereby omitting the fermentation process [39].
This study provides an updated survey of chemistry in popular moist snuff products on the domestic market and explores microbes associated with TSNAs in these products. To examine this association we analyzed 38 smokeless tobacco products using analytical chemistry measurements (TSNAs, nicotine, pH, and moisture) and 16S microbial community surveys. Combining data from chemistry and the microbiota allowed us to relate several bacterial taxa with TSNAs. Since both the relative and absolute abundance of taxa were measured, this study also provides an opportunity to compare and contrast the results of analyses of these different data types, as well as analyses based on log-ratios of abundance data.

Samples, preparation and storage
Commercial smokeless tobacco products (N = 38) were purchased in 2016 by Lab Depot (Dawsonville, GA, USA) and shipped to CDC. Moisture samples were taken from individual tins, then three packages, or tins, of each tobacco product were pooled to ensure complete homogenization. After moisture measurement aliquots were taken, the remaining contents were placed into large polypropylene tubes and rotated for 30 minutes. Samples were stored at -80˚C until thawed for DNA extraction. Aliquots were taken for chemistry after thawing and prior to taking samples for the microbiological experiments.
We focused on moist snuff, but also included one product, Hawken Wintergreen (Hawken), that is marketed towards moist snuff users but is substantially different. Hawken is compositionally more similar to a chewing tobacco. It has generally been viewed as an "introductory" product with lower pH and nicotine, which would deliver less free nicotine during use, presumably alleviating nausea caused by high nicotine levels [40].
Two versions of Copenhagen Long Cut Straight with different labeling were obtained, which we termed 'A' and 'B'. Aside from the labeling on the package, contents of the two versions appeared identical.
All quantitative analytical chemistry measurements were performed in accordance with laboratory ISO 17025 quality guidelines. After aliquoting for chemical measurements, samples were stored at -20˚C prior to measurement and allowed to equilibrate to room temperature before analysis.

Quantifications of moisture and pH
Moisture was quantified by the mass loss on drying method described in Lawler et al., 2013 [32]. Moisture was measured with two replicates for each product and the means are presented in the results. The weight difference of freshly opened product (prior to pooling) versus dried tobacco was used to determine moisture. Tobacco was dried at 99˚C for 3 hours and then placed in a desiccator for approximately 30 minutes [41].
The product pH was measured as previously described in Lawler, et al., 2013 [32, 41]. Briefly, 10 mL of deionized distilled water was added to 1.0 g of sample and measured on Sirius Vinotrate pH meters (Sirius Analytical, East Sussex, UK). The meter was calibrated daily with pH buffers of 4.01, 7.00, and 10.01. Duplicate pH readings were averaged.

Quantification of nicotine by gas chromatography with mass spectrometry and free nicotine calculations
Nicotine was extracted from ST products and subsequently analyzed, in triplicate, using an Agilent 6890 Gas Chromatograph/5973N Mass Spectrometer fitted with an Agilent Ultra2 GC column (25 m x 0.32 mm x 0.52 μm) (Agilent Technologies; Santa Clara, CA) with parameters described elsewhere [42]. Methyl tert-butyl ether (MTBE), sodium hydroxide (NaOH) and chemical standard quinoline were purchased from Sigma-Aldrich (St. Louis, MO). Nicotine standards were obtained from Accustandard (New Haven, CT, USA). Briefly, the method involves weighing a 0.4 g product sample into a sample bottle, then adding 1 mL of 2N NaOH and 10 mL MTBE with quinoline added as an internal standard. The extraction solution plus sample were agitated on a Rugged Rotator for 60 minutes at 70 rpm. Approximately 1.5 mL of extract was transferred to a 2 mL autosampler vial, and a 1-μL aliquot of each sample extract was injected into the GC/MS operated in selected ion monitoring mode. GC parameters included a column flow rate of 1.7 mL/minute and an inlet temperature of 230˚C; the auxiliary line temperature was held at 280˚C. The GC oven ramp parameters were as follows: hold 175˚C for 1 min; ramp at 5˚C/minute to 180˚C; and finally, ramp at 35˚C/minute to 240˚C. The total run time is 3.7 minutes. Plotting relative response factors (nicotine quantitation ion area/quinoline quantitation ion area) against nicotine concentrations yielded a calibration curve that was used to quantify total nicotine. Unprotonated (free or freebase) nicotine is the charge-neutral form of nicotine that is most easily released from tobacco and absorbed across oral membranes. Free nicotine percentage was calculated using the measured pH of the product and the pKa value of the pyrrolidine nitrogen of nicotine (8.02) substituted into the Henderson-Hasselbalch equation [41]. The percentage of free nicotine was multiplied by total nicotine to get the amount of free nicotine (mg/g).
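The free nicotine calculation described above can be illustrated with a short sketch. This is a minimal example assuming the standard Henderson-Hasselbalch form, pH = pKa + log10([free base]/[protonated]), and the pKa of 8.02 stated in the text; the function names are hypothetical and are not taken from the study's scripts.

```python
def free_nicotine_fraction(ph, pka=8.02):
    """Fraction of nicotine in the unprotonated (free-base) form.

    Rearranged Henderson-Hasselbalch: [B]/[BH+] = 10**(pH - pKa),
    so the free fraction is ratio / (1 + ratio).
    """
    ratio = 10.0 ** (ph - pka)
    return ratio / (1.0 + ratio)


def free_nicotine_mg_per_g(total_nicotine_mg_per_g, ph, pka=8.02):
    """Free nicotine (mg/g) = total nicotine (mg/g) x free fraction."""
    return total_nicotine_mg_per_g * free_nicotine_fraction(ph, pka)


# pH values reported for the products in this study
for ph in (5.25, 6.89, 8.20):
    print(f"pH {ph:.2f}: {100 * free_nicotine_fraction(ph):.1f}% free nicotine")
```

Under these assumptions, the pH extremes reported for the moist snuff products (6.89 and 8.20) give free fractions of about 6.9% and 60.2%, and the Hawken pH of 5.25 gives about 0.2%, in line with the percentages reported in the results.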

Nucleic acid extractions
Nucleic acids were extracted from tobacco products using the PowerSoil DNA Elution Accessory Kit together with the Total RNA Isolation Kit (MoBio, Carlsbad, CA, USA), with a few modifications. These kits were used for co-extraction of nucleic acids in this study because we originally intended to sequence cDNA made from extracted RNA as well as the DNA itself. We found, however, that the amounts of RNA extracted were highly variable, potentially making analysis difficult, and thus we limited this study to an examination of the DNA. For reference, RNA extraction values for the first eight product extractions are provided in S1 Table in S4 File. Extraction protocol modifications included the use of 0.5 grams of tobacco, weighed into polypropylene tubes, and the addition of 0.5 mL of molecular biology grade, nuclease-free water prior to extraction. Additionally, MPBio's Lysing Matrix E (MP Biomedicals, Santa Ana, CA, USA) was used in lieu of the bead-beating tubes provided with that kit. All bead-beating was performed using a SPEX GenoGrinder with 4 cycles of 2 minutes at 1750 RPM, with cooling on ice for 2 minutes between grinding steps (Spex Sample Prep, Metuchen, NJ, USA). Due to varying amounts of potentially inhibitory contamination that gave eluates a varying shade of color (likely from excess humic acids), we used an additional cleanup step for further purification. This cleanup step was performed after the extraction using Qiagen DNeasy columns from the Qiagen QIAamp DNA mini kit (Qiagen, Germantown, MD, USA). All products were homogenized, then treated with RNAProtect prior to extraction (Qiagen, Germantown, MD, USA). Duplicate samples for each product were extracted and sequenced. S2 Table in S4 File lists extracted amounts of DNA for each sample.

Library preparation and sequencing
Libraries were prepared from amplicons using primers derived from the Illumina MiSeq 16S protocol with some changes, as described below. Primer specifics for the V4-V5 region of the 16S rRNA gene [43] are provided in S3 Table in S4 File. Pooled and frameshifted primers were used to increase sequencing diversity [44,45]. Multiplexing indexes were included in the primers, as were annealing sequences for Illumina sequencing. Three reverse primer sequences were used to provide greater coverage for the V4-V5 hypervariable regions. Sequencing plate setup and index numbers used are given in S1 File. PCR amplification of 16S regions used KAPA HiFi HotStart 2X ready mix (Kapa Biosystems, Wilmington, MA, USA). Thermal cycler conditions were as follows: one cycle at 95˚C for 3 minutes; 25 cycles of 98˚C for 30 seconds, 58˚C for 30 seconds, and 72˚C for 30 seconds; then a 72˚C hold for 5 minutes followed by cooling and a final hold at 4˚C. For each reaction, 12.5 ng (1.5 ng/μL) of template DNA was used, with primer concentration at 5 μM. AMPure XP was then used for PCR cleanup. For library preparation, eight additional cycles were used with Nextera XT Indexes (5 μL each), followed by a further cleanup using AMPure XP.
Library quality was assessed using an Agilent Bioanalyzer 2100 with a High Sensitivity DNA chip (Agilent Technologies; Santa Clara, CA, USA), and quantified using a Qubit 2.0 with the Qubit dsDNA HS Assay Kit (Thermo Fisher; Waltham, MA, USA). The DNA libraries were then combined in equimolar amounts before going onto the sequencer. Sequencing was performed on an Illumina MiSeq using the MiSeq Reagent Kit V2 (500-cycle) (Illumina Inc., San Diego, CA, USA).

Measurement of total bacterial load by qPCR
Measurements were completed as described in Al-Hebshi, 2017 [23]. Briefly, a small, well-conserved portion of 16S (1406F-1525R primer set) was used in conjunction with a control for inhibition. The inhibition control, run in parallel to the 16S samples, used samples spiked with genomic DNA extracted from DH10B E. coli. A standard curve using serial dilutions of the rpsL gene was constructed. Total bacterial 16S counts were computed based on the slope of the calculated calibration curve. Three dilutions of each sample were run, in triplicate. Averages of triplicates were used in calculation of the 16S counts per gram of tobacco. One sample, Stoker's Long Cut Natural, was omitted due to loss of the DNA sample prior to the qPCR bacterial load quantitation.
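As an illustration of how counts are typically read off a qPCR calibration curve, the sketch below fits a straight line of threshold cycle (Ct) against log10 copy number and inverts it for an unknown sample. It is a generic sketch, not the protocol of [23]: the standard values, template volume, elution volume, and dilution factor are all hypothetical, and only the 0.5 g sample mass comes from the extraction description.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Ordinary least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to get copies per reaction."""
    return 10.0 ** ((ct - intercept) / slope)


def copies_per_gram(copies_per_reaction, template_ul, elution_ul,
                    dilution_factor, sample_grams):
    """Scale copies per reaction up to copies per gram of tobacco."""
    copies_per_ul = copies_per_reaction / template_ul
    return copies_per_ul * elution_ul * dilution_factor / sample_grams


# Hypothetical ten-fold dilution series of a standard (not study data)
standards_log10 = [3.0, 4.0, 5.0, 6.0, 7.0]
standards_ct = [40.0 - 3.32 * x for x in standards_log10]
slope, intercept = fit_standard_curve(standards_log10, standards_ct)

reaction_copies = copies_from_ct(23.4, slope, intercept)
# Hypothetical volumes: 2 uL template, 100 uL eluate, 10-fold dilution
per_gram = copies_per_gram(reaction_copies, 2.0, 100.0, 10.0, 0.5)
print(f"slope={slope:.2f}, copies/reaction={reaction_copies:.3g}, "
      f"copies/g={per_gram:.3g}")
```

Averaging the triplicate Ct values before inversion, as described above, would be a small extension of this sketch.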

Bioinformatics analysis-Data QC processing and 16S pipeline
Sequences were uploaded to National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) with the BioProject ID PRJNA684146. FaQCs v2.08 was used to generate statistics on average read length, GC %, and average quality score per pair of reads (S4 Table in S4 File).
An operational taxonomic unit (OTU) table was constructed after glomming taxonomy to genus level (S5 Table in S4 File), as the 16S V-region used would not accommodate useful species-level identification in all instances, yet higher-level associations (e.g., family level) may not be specific enough to provide useful information on which microbes are involved. The OTU table was imported into the R statistical software suite using the R package 'qiime2R' (https://github.com/jbisanz/qiime2R/). The R package 'phyloseq' was used for data visualization including alpha diversity and PCA analyses [50]. All QIIME2 commands are given in S2 File, and R scripts in S3 File.

Statistical analysis
For statistical analyses, Hawken Wintergreen was excluded from the study as it represents a different type of product (chewing tobacco vs. moist snuff); in addition, it had substantially fewer reads, many or all of which may have been artifactual, as Illumina multiplexed sequencing sometimes results in small amounts of crossover between barcodes [43]. Stoker's LC Natural was also omitted from the statistical analysis due to a loss of the sample that prevented us from using it in the total bacterial load qPCR.
Statistical models and software packages tailored to microbiome data were used. First, we analyzed relative abundance data obtained by dividing the counts observed for each sample by the library size for that sample. We then analyzed quantitative count (absolute abundance) data obtained by multiplying the relative abundances for each sample by the measured number of 16S sequences per gram of sample. Finally, we generated centered log ratio (CLR) data by replacing each abundance by its log transform, then subtracting the mean log abundance for each sample. A pseudocount of 1 was added to each zero count to allow the log to be taken. Data on all three scales were each analyzed using the R package LDM (linear decomposition model, version 4.0), first as a whole, followed by separate analyses of products from each manufacturer [51]. The LDM uses statistical methods that are appropriate for the nature of microbiome data, while controlling the False Discovery Rate (FDR) [51]; we note that many other methods developed for analysis of microbiome data do not control the FDR [52].
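The three data scales described above can be sketched for a single sample as follows. This is an illustrative sketch only: the analyses in this study were run in R (S3 File), the function names here are hypothetical, and the example counts are invented. The CLR step follows the wording of the text, replacing zero counts with a pseudocount of 1 before taking logs.

```python
import math


def relative_abundance(counts):
    """Counts divided by the library size (total reads) of the sample."""
    library_size = sum(counts)
    return [c / library_size for c in counts]


def absolute_abundance(counts, copies_16s_per_gram):
    """Relative abundance scaled by the qPCR-measured 16S copies per gram."""
    return [p * copies_16s_per_gram for p in relative_abundance(counts)]


def clr(counts, pseudocount=1):
    """Centered log ratio: log counts minus the sample's mean log count.

    Zero counts are replaced by the pseudocount so the log is defined.
    """
    adjusted = [c if c > 0 else pseudocount for c in counts]
    logs = [math.log(c) for c in adjusted]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Invented genus-level counts for one sample and a load of 2.4e9 copies/g
counts = [80, 15, 5, 0]
print(relative_abundance(counts))
print(absolute_abundance(counts, 2.4e9))
print(clr(counts))
```

Note that CLR values always sum to zero within a sample, so they carry information only about ratios between taxa, whereas the absolute scale also retains the total bacterial load.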
The LDM gives both an overall (global) test of microbiome association as well as association tests for each OTU, while also allowing for control of confounding covariates. LDM was used to test for association between TSNAs, nicotine and other analytes and the microbiota, while controlling for potentially confounding factors including manufacturer, moisture, and pH. Because 'classic' and 'crisp' flavors were rare in our sample (see S6 Table in S4 File), we used ordination, with manufacturer as a confounder, to determine that 'classic' flavor products were closest to the flavor category 'none' (corresponding to products where no flavor was noted on the package); the distance between 'classic' and 'none' was smaller than the distance between any two other flavors. Thus, we assigned the products with 'classic' flavor the flavor value 'none'. Unfortunately, no other flavor was close to the single 'crisp' product; thus, for the statistical association analyses where flavor was used as a confounder, the single 'crisp' flavored product was left out of the analysis. Associations between chemical analytes and individual OTUs were obtained from the LDM, using a nominal false-discovery rate (FDR) of 10%. Significance was defined for the LDM as a q-value less than the nominal FDR (q < 0.1) (S3 File). The direction of each identified association was obtained using the sign of the 'v.freq' effects for individual taxa in the LDM (S3 File). To reduce Monte Carlo error, we fixed the number of permutations to 1,000,000 for analyses of the entire dataset, and to 100,000 for the analyses of samples from individual manufacturers. For relative abundance analyses, p-values from the LDM omni test (which optimizes over untransformed and arcsin-root transformed tests) are reported. For the absolute abundance and CLR analyses, the arcsin-root transformation is not appropriate, so only results from the untransformed data are reported (denoted as FREQ in the LDM). 
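To make the q < 0.1 criterion concrete, the sketch below computes q-values with the Benjamini-Hochberg step-up procedure, a common way of controlling the FDR. This is an assumption made for illustration: the LDM obtains its p-values by permutation and its internal FDR procedure may differ in detail, and the p-values below are invented.

```python
def benjamini_hochberg_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues


# Invented per-taxon p-values, tested at a nominal FDR of 10%
pvals = [0.001, 0.02, 0.03, 0.5]
qvals = benjamini_hochberg_qvalues(pvals)
significant = [q < 0.1 for q in qvals]
print(qvals, significant)
```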
The ordination plot was created using R package 'phyloseq' using the Bray-Curtis dissimilarity based on relative abundances. All R commands used in the analyses presented here are given in S3 File, where we also use comments to indicate how we used the LDM to analyze the quantitative count and CLR data.
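For readers unfamiliar with the Bray-Curtis dissimilarity used for the ordination, a minimal sketch of the calculation for two samples is given below. The function name and the example profiles are hypothetical; the study itself computed these distances in R with 'phyloseq'.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i).

    Ranges from 0 (identical profiles) to 1 (no shared taxa).
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must cover the same taxa")
    total = sum(sample_a) + sum(sample_b)
    if total == 0:
        raise ValueError("at least one sample must be non-empty")
    difference = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    return difference / total


# Invented relative-abundance profiles over the same four genera
product_1 = [0.80, 0.15, 0.05, 0.00]
product_2 = [0.10, 0.20, 0.00, 0.70]
print(bray_curtis(product_1, product_2))
```

When both inputs are relative abundances that each sum to 1, the denominator is 2, so the value reduces to half the summed absolute differences between the two profiles.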

Products
All varieties of moist snuff products available and listed by the selected vendor were purchased for this study. Products that were available in different cuts were only purchased in long cut. To obtain a representative and homogeneous product, with enough material for all measurements, three tins of each product were sampled for moisture, then pooled together, homogenized, and frozen for later sampling. From the pooled material, samples were taken for chemistry and microbiological measurements (S6 Table in

Chemical measurements and observations
Chemical measurements exhibited some variation between products, with marked differences for the product Hawken Wintergreen (Hawken). For instance, the total nicotine levels of ST products ranged from 11.3 to 16.7 mg/g, while Hawken was found to be 7.10 mg/g (Fig 1, S6 Table in S4 File). Similarly, product pH ranged from 6.89 to 8.20, yet Hawken had a pH of 5.25 (Fig 2, S6 Table in S4 File). The free nicotine ranged from 0.90 to 7.65 mg/g, with the exception of Hawken at 0.01 mg/g (Fig 3, S6 Table in S4 File). The percentage of nicotine as free nicotine in these products ranged from 6.9 to 60.2%, whereas Hawken was 0.2%. These values are consistent with past measurements of top-selling moist snuff brands [25,53,54].
Previously, Richter et al. [25] identified a correlation between the top market products and highest free nicotine. We found that the current top three market share brands, Copenhagen, Grizzly, and Skoal, had a wide range of free nicotine, from the highest of the group in Grizzly Long Cut Wintergreen (at 7.65 mg/g free nicotine) to a product with comparatively low free nicotine (Copenhagen Long Cut Straight 'B', at 1.84 mg/g free nicotine). The two versions of Copenhagen Long Cut Straight with different labeling were particularly interesting in that they had very different levels of free nicotine (5.45 mg/g vs 1.84 mg/g, for 'A' and 'B', respectively).

PLOS ONE
We found that Wintergreen-flavored products had significantly more free nicotine (as a percentage of total nicotine, S1 Fig).

The moisture content of most products was just over 50% by weight, with values ranging from 50.3% to 56.8% (Fig 5, S6 Table in S4 File). An exception was Hawken, which had 27.4% moisture, consistent with a previous study [25]. Moisture was overall very consistent between moist snuff products and even between our results and studies reported >10 years ago [25,53].
A multivariate analysis of the chemistry measurements is given in S3 Fig. Aside from the expected correlation of pH with free nicotine percentage, other correlations were not observed.

DNA extractions, bacterial load, and observations of bacterial communities
We obtained microbial community profiles for all thirty-eight products using 16S rRNA sequencing. For the main study, extraction yields were highly variable, and ranged from 0.306 to 39.2 ng/mL with an average of 9.73 ng/mL, and a standard deviation of 7.68 ng/mL (S2 Table in S4 File), but PCR amplification did not produce consistently measurable amplicons. The average read size was also shorter in Hawken (236 bases average read size vs 249 bases for the other samples). Because sequencing characteristics were not comparable to other samples, results from this product were deemed insufficient for analysis in PCA and other statistical metrics and consequently omitted from further analysis.

Total bacterial load had a median of 2.4 × 10^9 16S copies per gram of tobacco (S4 Fig). With the exception of Hawken, which had a much lower bacterial load of 3.0 × 10^7 copies per gram of tobacco, the load ranged from 2.3 × 10^8 16S copies per gram of product (MS34) to 2 × 10^10 16S copies per gram of product (MS11), corresponding to absolute abundances that varied up to 100-fold across samples.
Qualitatively, observed bacterial communities in the moist snuff samples were, overall, fairly consistent with results obtained in previous publications based on 16S analysis [20][21][22][23]. Most samples were dominated by just a few bacterial taxa, mainly Firmicutes, including a few members of the orders Bacillales (genera including Bacillus, Geobacillus, Oceanobacillus, Staphylococcus) and Lactobacillales (genera including Lactobacillus, Tetragenococcus), along with Actinobacteria (Corynebacterium genus). Specific taxa were brand-dependent, and relative abundances of those brand-specific taxa varied between products within the same manufacturer. Relative abundances of taxa for each product are presented in bar graph form (Fig 6). We found that overall, the greatest driver of community composition was the product manufacturer, as most of the products by a single manufacturer had similar presence or absence of taxa and clustered together in the PCA analysis (Fig 7).

Associations between chemical analytes and the microbiome
The results of our analyses of the quantitative count data can be found in Table 1A-1C. We also analyzed these data using relative abundances and centered-log-ratio (CLR) transformed relative abundances. Data from all three scales were analyzed using the LDM. A comparison of the three results from relative, absolute and CLR-transformed abundance analyses is presented in S8 and S9 Tables in S4 File. Results for analyses using relative and absolute abundance were similar, while CLR-transformed analyses were divergent. Because absolute abundance data is thought to be the most informative, we present only these results here. Results for the other two scales can be found in S8 and S9 Tables in S4 File.
We first used the LDM to determine how much of the variability found in the absolute abundance data each important variable explained. We found that manufacturer explained 69% of the variability, while flavor, total TSNA, nicotine, moisture, and pH explained 15%, 0.69%, 3.0%, 9.7%, and 2.5%, respectively. The effect of manufacturer on overall (global) taxonomic abundance was highly significant (p<0.0002). TSNAs were also significantly associated with taxa on a global level (p = 0.0064). Flavor appeared to have an effect, but when manufacturer was used as a confounder in the LDM, flavor did not reach significance as a driver of taxa globally (p = 0.61). The global association between TSNA and microbial composition was also not significant when flavor was incorporated as a confounder; however, some individual genera were found to be associated with TSNAs (Table 1A). Nicotine was not found to have a significant association with overall microbial composition on a global scale, but it was found to be associated with individual taxa, shown in Table 1B. Log transformations for TSNA were also investigated, but associations remained largely unchanged.

Taxonomy highlights of samples by manufacturer
We also investigated associations between TSNAs and microbial taxa using the LDM with data from samples corresponding to each manufacturer individually. A phylogenetic tree with abundances colored by manufacturer is presented in Fig 8. Four phyla were represented in the samples: Actinobacteria, Bacteroidetes, Firmicutes, and Proteobacteria. In this tree, each manufacturer's microbiota and abundance patterns are shown by the presence or absence of dots at the tree's tips and by the sizes of those dots. For example, products manufactured by Stoker's were unique in that their microbiota were dominated by Bacillus spp. with only a few other taxa present. In Fig 8, the three clades that appear at the top of the tree are composed primarily of taxa that appear exclusively or predominantly in Stoker's products.
Within-sample product (alpha) diversities, as measured by the Shannon diversity index, are shown in Fig 9; Shannon diversity ranged from less than 0.1 to greater than 3.5. Products from Pinkerton had the greatest microbial diversity among the products we analyzed. U.S. Smokeless Tobacco Company (USSTC), Swisher, and ASC had Shannon diversity values that were similar to one another and lower than Pinkerton's (Fig 9). In contrast, the Stoker's products sampled were differentiated from those of all other manufacturers by having the lowest Shannon diversity measures, and were heavily dominated by Bacillus spp.
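The Shannon index reported above can be sketched as follows, assuming the usual definition H = -Σ p_i ln(p_i) with natural logarithms over the genus-level proportions p_i. The function name and example counts are hypothetical; the values in Fig 9 were computed in R.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return sum(-p * math.log(p) for p in proportions)


# Invented examples: a sample dominated by one genus vs. an even community
dominated = [990, 5, 3, 2]
even = [250, 250, 250, 250]
print(shannon_diversity(dominated))  # close to 0: one genus dominates
print(shannon_diversity(even))       # ln(4): four equally abundant genera
```

A product dominated by a single genus, as described for the Stoker's products, yields a value near zero, while the index grows as more genera are present at comparable abundances.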
When analyzed by manufacturer, the only significant findings were for products manufactured by the American Snuff Company (ASC). We found that Marinilactibacillus was positively correlated with TSNAs. This means that when more of this taxon was present in a product, TSNA values were likely to be higher. Marinilactibacillus was the single most abundant genus across all the samples. The genus was completely absent in all Stoker's products and was only found in low amounts in several ASC products. Tetragenococcus spp. were found in large amounts in most Pinkerton and some USSTC products.
Microbial community composition also differed substantially between the two variations of Copenhagen Long Cut Straight with different labeling, with sample 'A' having a much lower relative abundance of Tetragenococcus, a much higher relative abundance of Atopostipes, and a somewhat higher abundance of Marinilactibacillus.

Survey of moist snuff
Continued monitoring of smokeless tobacco products is important because its chemical and biological constituents can vary over time, even with products within the same brand. These variations can be due to multiple factors, including sources of tobacco, weather changes (e.g. rainfall, humidity), and changes in farming and manufacturing processes implemented over time [5]. When we compared products previously measured in our lab, we found a reduction in TSNAs for new products identically named. In this study, we noted three Skoal products (Skoal Long Cut Classic, Skoal Long Cut Mint, and Skoal Long Cut Wintergreen), having much lower TSNA values compared to the exact same products in the Richter et al., 2008 study [25]. Detailed manufacturing steps of moist snuff products remain as trade secrets. However, communications with regulatory authorities (e.g. FDA contacts), as well as recent comments from industry on proposed TSNAs regulations, suggest that at least some, or perhaps most, manufacturers tailor the microbes used in fermentation in order to minimize TSNAs [17]. This may also explain the patterns observed in the within-sample diversity metric (Shannon diversity), where we found similar values for all products from the same manufacturer.

Table 1. Associations between product chemistry (TSNAs and nicotine) to taxa. A. Taxon-specific associations to the value of the sum of TSNAs NNN, NNK, NNAL (red or green background indicates direction of association in the LDM

Fig 8. Phylogenetic tree and abundance by manufacturer. R package Phyloseq was used to generate a taxonomic tree using data glommed to Genus level. Each tip represents a Genus, with bootstrap values given at intersections. Each dot after the tree tip and label represents a product that was found to have that taxon. The size of the dots represents abundance in that particular product. Sequencing for each sample was conducted in triplicate, with mean of three replicates presented in the figure. https://doi.org/10.1371/journal.pone.0267104.g008
We found that product microbiotas segregated readily by manufacturer, but none were notably similar to those previously reported in cigarette tobacco or in cured or aged tobacco leaves [55,56]. In most of those products, Proteobacteria, not Firmicutes, made up the majority of the taxa identified. At the highest level of taxonomy, most products tested here had a microbiota made up mainly of three genera within Firmicutes. Marinilactibacillus spp. were the taxa most likely to be present and were found in 74% of products tested (28/38, Fig 6). The dominance of most of these products by Marinilactibacillus contrasts with recently characterized little cigar and cigarillo [35,36] and cigarette microbial communities, where this genus was not prominent [37].
To date, most smokeless tobacco product surveys have not coupled microbiota data with chemistry data, or have included only limited data (e.g. Han et al. [22]). Law et al., 2016, established correlations between taxa and chemistry, but the samples in that study were not commercial products [24]. Thus, we attempted to establish associations between chemistry (TSNAs, nicotine, moisture, pH) and specific taxa (OTU abundances) in commercial products.
Microbiome data analysis. We conducted separate analyses of 16S relative abundance data, absolute abundance data, and centered log-ratio (CLR)-transformed data. We found similar results for the relative and absolute abundance analyses, while results obtained using CLR-transformed data were divergent. This is important because some authors have argued that the only valid analyses of relative abundance data are those conducted on the log-ratio scale. While it is true that analyses based on log-ratios are invariant to the (typically unknown) absolute quantity of DNA in a sample, it is telling that the analyses based on relative abundance were in fact consistent with the analyses based on absolute abundances, while those based on log-ratios generally reached different conclusions. This finding is consistent with the observation that log-ratio-based analyses test different hypotheses than those we tested here, which were based on differences in absolute abundance [51]. In particular, the log-ratio-based analyses allow changes in the abundances (either relative or absolute) of pairs of taxa to be consistent with the null hypothesis as long as their ratio is unchanged. Further, it is known that the choice of the pseudocount used to handle zero counts before taking logarithms can change the conclusions of an analysis [57,58].
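The two points above (ratio invariance and pseudocount sensitivity) can be illustrated with a small sketch. The counts below are hypothetical and the helper functions are written for illustration only; they are not the study's analysis code, and the pseudocount values are arbitrary choices.

```python
import math

def relative_abundance(counts):
    """Convert raw counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(counts, pseudocount):
    """Centered log-ratio: log(count + pseudocount) minus the mean of those logs."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

# Hypothetical counts for three taxa (not data from this study).
sample_a = [100, 50, 850]
sample_b = [200, 100, 700]  # taxa 1 and 2 both double; their ratio stays 2:1

rel_a = relative_abundance(sample_a)
rel_b = relative_abundance(sample_b)
clr_a = clr(sample_a, pseudocount=0)  # no zero counts, so no pseudocount needed
clr_b = clr(sample_b, pseudocount=0)

# The abundance of taxon 1 changes between samples ...
print("relative abundance, taxon 1:", rel_a[0], rel_b[0])
# ... but the log-ratio of taxon 1 to taxon 2 does not.
print("log-ratio taxon 1 / taxon 2:", clr_a[0] - clr_a[1], clr_b[0] - clr_b[1])

# With a zero count, the CLR value depends heavily on the pseudocount chosen.
sample_c = [0, 10, 990]
clr_half = clr(sample_c, pseudocount=0.5)
clr_tiny = clr(sample_c, pseudocount=1e-6)
print("CLR of the zero-count taxon:", clr_half[0], clr_tiny[0])
```

In this toy case an abundance-based test would register a change in taxa 1 and 2, while a test on their log-ratio would not, which is one way the two kinds of analysis can reach different conclusions.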
In our analyses, we chose to consider manufacturer as a confounder because we saw that both TSNA levels and microbial composition varied by manufacturer. In future analyses, it may be interesting to determine the extent to which microbial levels mediate the effect of manufacturer on TSNA levels. Flavor was included as a confounder because it appears to have a large effect on product composition, potentially affecting both the microbes present and, in turn, the TSNA levels. We also considered nicotine as a potential confounder because it is a precursor for NNN. However, nicotine did not have global significance even when manufacturer and flavor were included as confounders. One argument against nicotine confounding associations between microbial taxa and TSNAs is the disproportionate abundance of nicotine compared to TSNAs in tobacco. Consequently, nicotine should have little direct effect on the amount of TSNAs.
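To make the role of a confounder concrete, the sketch below shows how an association between a taxon and TSNAs can change once a categorical variable such as manufacturer is accounted for. It uses simple within-group centering followed by a Pearson correlation, which is a swapped-in simplification and not the LDM used in this study. All values, names, and group labels are hypothetical.

```python
import math
from collections import defaultdict

def center_within_groups(values, groups):
    """Subtract each group's mean from its members (removes between-group differences)."""
    sums = defaultdict(float)
    sizes = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        sizes[g] += 1
    return [v - sums[g] / sizes[g] for v, g in zip(values, groups)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical products from two manufacturers (not data from this study).
manufacturer = ["A", "A", "A", "B", "B", "B"]
taxon = [0.10, 0.12, 0.14, 0.50, 0.52, 0.54]  # relative abundance of one genus
tsna = [2.2, 2.0, 1.8, 6.2, 6.0, 5.8]         # summed TSNAs, arbitrary units

crude = pearson(taxon, tsna)
adjusted = pearson(center_within_groups(taxon, manufacturer),
                   center_within_groups(tsna, manufacturer))

print("crude correlation:", crude)
print("manufacturer-adjusted correlation:", adjusted)
```

In this constructed example the crude correlation is strongly positive only because manufacturer B has both more of the taxon and more TSNAs; within each manufacturer the relationship runs the other way.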
The pH values of these products when TSNAs are being formed are likely to have an effect on TSNA levels, but we did not observe pH to be correlated to TSNAs in these samples, possibly because manufacturers tailor pH for their products after all TSNAs have been generated. Moisture was also tested for significance, but was not found to have significant effects on the microbiota, so it was not considered confounding for the statistical analysis.
Marinilactibacillus, the most common genus found in our samples, has not been well documented to date, with only a few species described and sequenced. None of the species so far identified in the Marinilactibacillus genus have an annotated nitrate reductase gene or have been found to reduce nitrate in culture [59,60]. It is possible that a nitrate reductase will be identified in this genus in the future, but it is more likely that most microbes in the genus do not reduce nitrate.
Most studies of smokeless tobacco products have identified many bacteria known for being halotolerant, including Marinilactibacillus. We hypothesize that most bacteria observed in the finished products result from an ongoing selection brought about by the addition of salts and other manufacturing treatments to prevent further TSNA generation. For example, the addition of unknown amounts of sodium chlorate [17], a human toxin if ingested, conventionally used as a weedkiller, was reported by one manufacturer. Sodium chlorate is reported to be added as a competitive inhibitor for nitrate when exposed to nitrate-reducing bacteria, preventing these bacteria from proliferating and generating nitrite in products after packaging. Due to the toxic nature of sodium chlorate, further investigation into the concentrations of sodium chlorate in these products is warranted.

Comments on individual products
Compared with previous observations of moist snuff tobacco products [25,53], the chemical measurements of the products in this study were broadly similar, but a few trends were noticed, including, on average, lower TSNAs than previously observed [18]. Total nicotine concentrations were higher overall, while free (unprotonated) nicotine concentrations were lower than previously observed for products with the same brand names. Differences in chemical concentrations may result from lot-to-lot variability, influenced by the tobacco source, or from temporal changes in the manufacturing process itself.
Unlike the other products described here, Hawken Wintergreen was previously found to be virtually sterile by bacterial culturing methods [21]. We were nevertheless able to extract quantifiable amounts of DNA. However, even when an abundance of template DNA (by total DNA measurement) was used, we were unable to effectively amplify 16S sequences from the extraction, as demonstrated by undetectable amounts of DNA after PCR amplification. We were able to obtain a usable signal in the bacterial load qPCR, which targets only a small portion (~125 base pairs) of the ubiquitous 16S rDNA sequence, but the measured bacterial load was about 10-fold lower than that of the second-lowest product and almost 100-fold lower than the median measurement of the products. This suggests that either 1) the DNA purified from Hawken may have been degraded to a great extent, or 2) it originates from eukaryotic sources. We consider the latter unlikely, based on shotgun metagenomic data from other ST products that showed eukaryotic microbes in very low abundance in most moist snuff products [6,61]. Despite Hawken being seemingly bacteria-free as an end product, it had amounts of TSNAs comparable to other moist snuff products, although it was lower in nicotine (by wet weight) than all the other moist snuff products, and much lower in free nicotine. This may further support the idea that the end-product microbiota may not represent the microbes that were present when TSNAs were formed [61].
The clear segregation of microbiotas by manufacturer in the products analyzed suggests that processing or source of tobacco is more important for a product's observable communities than added flavor. Further investigation into the effect of flavor on ST microbiota is warranted because of the potential to use flavor to tailor the microbiota away from taxa involved in the generation of TSNAs.

Limitations of the study
We focused on bacterial constituents because fungi and other eukaryotes have been found to be less prominent in these products. Further, based on previous literature and metagenome studies, bacteria are expected to have larger effects than fungi on TSNAs in the manufacturing process [6,24,33,61]. The microbiota observed may simply reflect fluctuations in tobacco source arising from growing conditions. This change in source material is likely to be linked to the manufacturer-to-manufacturer variability seen in this study. Therefore, one limitation of this study is that it presents samples obtained at a single time point and from a single vendor. A wider range of sampling may better characterize the product variability and would help identify the stability, or lack thereof, of the microbiota in these products. We also did not consider lot-to-lot variation in these products (except the difference between the two versions of Copenhagen) or the impact of product aging, as the products are not labeled with the date of manufacture.
We also acknowledge that these data represent statistical associations only. Experimentation is necessary to investigate whether these taxa are causally responsible for reducing nitrate and are therefore directly involved in the generation of harmful TSNAs during the production of these products. However, it is very unlikely that we could obtain longitudinal samples from a single batch or lot as it goes through the stages of manufacture; even sequential cross-sectional samples are not likely to be made available. As a result, the final-product associations presented here are the only evidence available on any possible relationship between bacteria and the chemical composition of ST products. Different product types appear to have different core microbiotas, and it is clear that the bacterial constituency of the products we observed does not reflect the microbiotas found in raw tobacco [34-36,38]. Associations identified here may be relevant only for moist snuff, and this approach may not be suitable for the specific identification of organisms involved in nitrate reduction and, thus, TSNA generation. Communities observed in off-the-shelf tobacco products, especially moist snuff, which has a different consistency from raw tobacco, may not reflect the taxa that were present during the active periods of TSNA formation. In the products tested here, many of the bacteria identified are known to be halotolerant, which may simply reflect the presence of a high-salt environment in these tobacco products. Many genera, including some of the most abundant in these products such as Lactobacillus and Marinilactibacillus, do not even include known nitrate reduction genes in their genomes. These observations support a hypothesis that the community may have shifted between the time of TSNA generation and the time at which we observe the community in the off-the-shelf product.
We suggest that further investigation is needed to identify the nitrate-reducing microbes that may be active at earlier time points in the manufacturing process and may be ultimately responsible for TSNA formation. Lastly, the number of products tested here is relatively small for drawing robust statistical conclusions; analyzing a larger number could reveal associations overlooked in this study.

Conclusions
This study offers a publicly available large sample set of amplicon sequencing of U.S. domestic moist snuff products. The chemical and microbiota measurements provide a starting and reference point for ongoing explorations of the potential associations between product chemistry and its microbiota. We found a number of taxa were associated with TSNAs, though interpreting these associations in light of their occurrence at the end of a time-dependent process may be problematic. Future studies directed towards samples obtained at earlier time points in the manufacturing process may help greatly reduce the potential confounding variables in this complex system, but such samples would only be available from the manufacturers and so are likely difficult to obtain.
Advancing our knowledge of smokeless tobacco product microbiotas will greatly help in the ability to suggest regulations that may lead to lower toxicant levels. Although lowering exposure to a select but potent class of carcinogenic chemicals might not lower overall harm, it seems prudent to consider reducing exposures to harmful chemicals in these products when feasible. Tailoring the bacterial composition by adding species that do not reduce nitrate or that increase nitrite assimilation is an approach that has had some demonstrated success in reducing TSNA concentrations, and further research should be encouraged in these areas.