Analysis of the interests of Google users on toothache information

Background Knowledge of the health interests of a given population of Internet users might add to the evidence on a community's dental needs and, consequently, improve public health planning. The frequency of Google searches on specific issues can be analyzed with Google Trends.
Aim In this study, we aimed to characterize the interest in toothache information among Google users from the United States, United Kingdom, Australia and Brazil.
Methods The monthly variation of relative search volume (RSV) and the lists of main toothache-related queries were determined from January 2004 through April 2016 using Google Trends. Autoregressive Integrated Moving Average (ARIMA) forecasting models were developed to predict RSV values for toothache for an additional 12 months. Autocorrelation plots and a generalized additive model (GAM) were applied to determine trends and seasonality in the RSV curves. Through linear regression models, we assessed the association between the variation of annual means of RSV values and national statistics for toothache in the U.S. and U.K. Subsequently, the distribution of main queries according to the identification of endodontic pain, the type of information sought, and the interest in self-management of toothache was evaluated for the four countries.
Results The autocorrelation plots showed patterns of non-stationary time series. The monthly variation influenced the data of the U.S. and U.K., with higher RSV values found in January/July and December, respectively. Also, the interest in toothache in the U.K. increases in the second semester and in the fourth quarter, especially in December. Additionally, an annual variation significantly affected all time series, with increments of RSV means over the years varying from 265% in the U.S. to 745% in Brazil. In parallel, increments in RSV values were also observed in all predictive curves. The annual variation of observed and fitted RSV values was directly associated with the increase in toothache visits in the U.S. and urgent dental treatments in the U.K. Moreover, the queries typed on Google were markedly linked to searches for information on endodontic pain, especially in Brazil, where residents usually searched for relief and/or self-management of pain.
Conclusions These findings indicate an increasing interest of Google users in toothache-related topics, regardless of country and season. Internet activity can be used by policy makers as a complementary source of data for the development and implementation of strategies to control and prevent toothache complications.



Introduction
The World Health Organization (WHO) and the World Dental Federation (FDI) recommended the reduction of toothache as one of the priority issues in the Global Oral Health Promotion Agenda [1]. Toothache is defined as orofacial pain originating from a tooth and/or adjacent structures as a consequence of several diseases or conditions, such as dental caries, periodontal disease, trauma, occlusal dysfunction, and abscess [2]. It is more prevalent among socioeconomically deprived groups [3,4] and affects sleep, feeding, school/work performance, and productivity [5,6]; consequently, toothache negatively impacts individuals' quality of life [7]. For these reasons, people suffering from toothache may search the Internet for useful advice on the self-management of pain, such as medication, home remedies and emergency dental care [8]. Internet users also browse the web to confirm professional instructions or to self-diagnose alterations and/or diseases [9]. Health seekers frequently begin their web searches with the Google Search engine, which accounts for more than 70% of the market share of this industry [10]. The Big Data produced from structured queries typed on Google can be systematically analyzed with Google Trends [11], an online tool developed to evaluate market and opinion trends on the Internet in near real-time. In this scenario, awareness of the volume and profile of oral health searches performed on Google from specific regions might contribute to the recognition of a community's dental needs and guide policy makers in the development of further action plans, providing information in an area where traditional methods are largely incomplete: behavioral health [12], a multidisciplinary field that combines knowledge of health and human sciences with a focus on the better understanding of health and disease [13].
This would be even more important given the difficulty of obtaining accurate toothache data, since the diagnosis depends on how patients deal with the physiological, observational and self-reporting components of pain [14]. In addition, access barriers prevent a sizeable proportion of individuals from reaching appropriate dental treatment, leading to under-registration of toothache [2,15]. Recently, the analysis of Internet activity has been increasingly used to measure people's interest in medical conditions [16-20].
Taking into consideration the challenges posed by toothache, the aim of this study was to characterize the toothache-related interests of Google users from the United States, United Kingdom, Australia and Brazil. We hypothesize that the surveillance of health seeking behaviors of Internet users could provide complementary information about the impact of toothache in different population groups, aiding in the minimization of damage caused by the self-management of that clinical condition.

Study design
This cross-sectional study was based on the analysis of computational data from four distinct countries. After the development of specific query strategies, the relative search volume (RSV) and the popularity of toothache-related queries were determined using Google Trends in two distinct periods. The RSV values indicate the ratio between the search volume of specific Google queries and the overall search volume of Google queries performed in specific regions and time intervals. These values are normalized to the maximum value of the time series (RSV = 100) and therefore vary from 0 to 100. Also, the main toothache-related terms typed on Google are ranked in descending order of their respective RSV values.
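The normalization described above can be illustrated with a minimal sketch. It assumes that raw search counts are available, which Google Trends does not expose; the function name and the monthly counts are hypothetical and serve only to show how a share of searches is rescaled so that the series maximum equals 100.

```python
def relative_search_volume(term_counts, total_counts):
    """Scale per-period search shares so that the series maximum equals 100."""
    if len(term_counts) != len(total_counts):
        raise ValueError("series must have the same length")
    # Share of all searches in each period that matched the term.
    shares = [t / n for t, n in zip(term_counts, total_counts)]
    peak = max(shares)
    if peak == 0:
        return [0.0 for _ in shares]
    # Divide by the peak share so that the largest value becomes 100.
    return [100.0 * s / peak for s in shares]


# Hypothetical monthly counts (illustrative only, not real Google data).
term = [20, 30, 45, 40]
total = [1000, 1000, 1500, 1000]
rsv = relative_search_volume(term, total)
print([round(v) for v in rsv])
```

Note that the third month has more toothache searches than the fourth in absolute terms, yet a lower RSV, because the overall search volume in that month was also larger.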
The data collected were assessed according to the following aspects: (a) validation of analysis, (b) search volume trends and seasonality, (c) production of forecasting models, (d) association with available toothache statistics, (e) relation with media interests, and (f) distribution of categories of main toothache-related topics.

Countries selection
The countries were selected according to three criteria: (i) more than 20 million Internet users, (ii) more than 50% of Internet penetration, and (iii) detection of relevant Internet-based toothache data in Google Trends. To evaluate season-based effects, two countries from each hemisphere, located on distinct continents, were chosen. The United States, the United Kingdom, Australia, and Brazil met all requirements for inclusion in this study.

Query strategies
The development of the query strategies is shown in Fig 1. Initially, free queries written in English and Brazilian Portuguese were performed on Google Search to choose relevant toothache-related terms for the four selected countries. In this first step, only three terms in English ["toothache", "tooth pain", and "aching tooth"] and four terms in Portuguese ["dor de dente" (toothache), "dor dente" (toothache), "dente dolorido" (sore tooth), and "dente doendo" (aching tooth)] were considered relevant. Next, eight other toothache-related terms described by Ahlwardt et al [8] were added to the initial English term list.
Then, each term was individually entered in Keyword Planner [21], a tool used to find new keywords based on pre-determined terms. After that, 26 terms in English and 16 terms in Portuguese were added to the keyword lists; however, terms that were redundant, scientific, or not related to humans were excluded (English: 21 excluded, n = 16 remaining; Portuguese: 3 excluded, n = 17 remaining). Finally, after the individual analysis of each term on Google Trends, keywords with irrelevant RSV values were also excluded (English: 6 excluded, n = 10 remaining; Portuguese: 14 excluded, n = 3 remaining).
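The term counts reported above can be checked with a short bookkeeping sketch. The helper name is hypothetical; the numbers are those given in the text.

```python
def remaining(initial, added, excluded_steps):
    """Return the number of terms left after each exclusion step."""
    count = initial + added
    history = []
    for excluded in excluded_steps:
        count -= excluded
        history.append(count)
    return history


# English: 3 free-query terms + 8 terms from Ahlwardt et al, then 26 from Keyword Planner.
english = remaining(initial=3 + 8, added=26, excluded_steps=[21, 6])
# Portuguese: 4 free-query terms, then 16 from Keyword Planner.
portuguese = remaining(initial=4, added=16, excluded_steps=[3, 14])
print(english)     # [16, 10]
print(portuguese)  # [17, 3]
```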

Search volume trends
The RSV curves were obtained using Google Trends, as described above. The data were collected separately for each of the four selected countries on two distinct dates, in November 2014 and on April 30, 2016. To confirm the equivalence of the two query strategies in capturing the search volume of toothache-related terms in different countries, the trends for the topic "Toothache-Disease", generated by automatic algorithms supplied by Google Trends, were also obtained for the U.S., U.K., and Brazil in the second session of data collection (April 30, 2016). Google Trends did not provide automatic algorithms for Australia.

Main queries
The main toothache-related queries used by Internet users from the different countries were saved in .csv files, also provided by the Google Trends platform. An investigator dichotomized those queries according to three categories: identification of endodontic pain (no/yes), type of information sought (cause-symptoms/relief-treatment), and interest in self-management strategies for toothache (no/yes). "No" and "cause-symptoms" were coded as 0 (zero), while "yes" and "relief-treatment" were coded as 1 (one). Each query was weighted by its respective RSV value. The differences in the distribution of toothache characteristics among countries were determined for each category by comparing the percentages of weighted queries.
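The weighting scheme above can be sketched as follows. The field names, the example queries, and their RSV values are hypothetical and are not the study data; only the 0/1 coding and the RSV weighting follow the description in the text.

```python
def weighted_percentage(queries, category):
    """Percentage of RSV-weighted queries coded 1 for the given category."""
    total_weight = sum(q["rsv"] for q in queries)
    if total_weight == 0:
        raise ValueError("total RSV weight is zero")
    positive_weight = sum(q["rsv"] for q in queries if q[category] == 1)
    return 100.0 * positive_weight / total_weight


# Hypothetical queries, RSV weights, and codes (illustrative only).
queries = [
    {"text": "toothache remedy", "rsv": 100, "endodontic": 1, "relief": 1, "self_mgmt": 1},
    {"text": "toothache causes", "rsv": 60, "endodontic": 1, "relief": 0, "self_mgmt": 0},
    {"text": "sinus toothache", "rsv": 40, "endodontic": 0, "relief": 0, "self_mgmt": 0},
]

for category in ("endodontic", "relief", "self_mgmt"):
    print(category, weighted_percentage(queries, category))
```

Weighting by RSV means that a popular query contributes more to a category's percentage than a rarely typed one, rather than every query counting equally.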

Available statistics for toothache
The U.S. toothache statistics and population estimates between the years 2005 and 2010 were extracted from the National Hospital Ambulatory Medical Care Survey (NHAMCS) [22] and the U.S. Census Bureau [23] databases, respectively. The variables age, sex, patient's reason for visit (rfv1-3), and patient visit weight (patwt) were considered to calculate the absolute and relative numbers of toothache visits registered by the Emergency Department. The patient visit weights (patwt) of visits with a diagnosis of toothache were summed to obtain, on an annual basis, the number of toothache cases per 1,000 population, per 1,000 population aged 15-69 years, and per 1,000 female population.
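As a rough illustration of this calculation, the sketch below sums the visit weights of records whose reason-for-visit fields indicate toothache and expresses the total per 1,000 population. The variable names rfv1-3 and patwt come from the text; the placeholder code `TOOTHACHE`, the records, and the population figure are hypothetical, and the actual NHAMCS coding should be taken from the survey documentation.

```python
# Hypothetical placeholder for the reason-for-visit code that denotes toothache.
TOOTHACHE = "TOOTHACHE"


def is_toothache_visit(record):
    """True if any of the three reason-for-visit fields is coded as toothache."""
    return TOOTHACHE in (record["rfv1"], record["rfv2"], record["rfv3"])


def visits_per_1000(records, population):
    """Sum the visit weights of toothache records and scale per 1,000 population."""
    weighted_visits = sum(r["patwt"] for r in records if is_toothache_visit(r))
    return 1000.0 * weighted_visits / population


# Hypothetical sampled records; each patwt is the number of visits the record represents.
records = [
    {"rfv1": TOOTHACHE, "rfv2": None, "rfv3": None, "patwt": 3000},
    {"rfv1": "OTHER", "rfv2": TOOTHACHE, "rfv3": None, "patwt": 2000},
    {"rfv1": "OTHER", "rfv2": None, "rfv3": None, "patwt": 5000},
]
rate = visits_per_1000(records, population=1_000_000)
print(rate)  # 5.0
```

The same function could be applied to a subset of records and the matching population estimate (for example, one age group or one sex) to obtain group-specific rates.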
The estimates of the number of urgent dental treatments in England per 1,000 population and adult age group were based on the official reports of the NHS Dental Statistics for England [24]. The population data were obtained from the Office for National Statistics [25]. Both statistics were published between 2006/2007 and 2014/2015.
The toothache statistics for Australia were obtained from the Australian Institute of Health and Welfare [26], which provided a trend curve of toothache cases among persons aged 15 and over, from 1994 through 2010. The prevalence of toothache in Brazil was collected from the reports of two distinct surveys: SB Brasil 2003 [27] and SB Brasil 2010 [28].

Data analysis
Data were analyzed with the Statistical Package for Social Science (version 21.0; SPSS, Chicago, USA), considering the following aspects:
1. Validation of analysis: The Intraclass Correlation Coefficient (ICC) was used to assess the stability of the RSV toothache data collected on two dates (Nov 2014 and Apr 2016), and to analyze the absolute concordance between the curves originating from the present query strategies and those originating from the automatic algorithms of Google Trends. The predictive performance of the preliminary forecasting models (November 2014) was evaluated against the real data collected in April 2016 (see details on predictive analysis below).
2. Search volume trends and seasonality: The autocorrelation and partial autocorrelation plots of the RSV toothache values were analyzed to identify the patterns of fluctuation of the data over time. The effect of seasonality on the time series was evaluated by a generalized additive model (GAM). This comprised detrending each long-term curve by its lag-1 difference, followed by the application of distinct generalized linear models to evaluate the effect of monthly, quarterly, semi-annual and annual seasonality on the time series.
5. Relation with media interests: The months associated with abrupt RSV spikes, identified as outliers in the ARIMA models, were qualitatively analyzed in relation to toothache-related media publications. Seven types of outliers were considered for this analysis: additive, level shift, innovational, transient, seasonal additive, local trend, and additive patch.
6. Distribution of categories of main toothache-related topics: The differences in the distribution of the main toothache-related topics among countries were evaluated by Pearson's chi-square test.
For all statistical analyses, P values < 0.05 were considered significant.
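The detrending step used in the seasonality analysis can be illustrated with a simplified sketch. It is not the SPSS procedure or the GAM fit used in the study: it only shows lag-1 differencing, a sample autocorrelation, and the mean month-to-month change grouped by calendar month. The series is synthetic (a linear trend with a hypothetical December bump), and all function names are illustrative.

```python
def lag1_difference(series):
    """Detrend a series by subtracting each value from the following one."""
    return [b - a for a, b in zip(series, series[1:])]


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return num / denom


def mean_change_by_month(series, start_month=1):
    """Mean lag-1 difference grouped by the calendar month of the later observation."""
    sums, counts = {}, {}
    for i, d in enumerate(lag1_difference(series)):
        month = (start_month + i) % 12 + 1  # month of observation i + 1
        sums[month] = sums.get(month, 0) + d
        counts[month] = counts.get(month, 0) + 1
    return {m: sums[m] / counts[m] for m in sorted(sums)}


# Synthetic 36-month series starting in January: upward trend plus a December bump.
series = [10 + t + (5 if t % 12 == 11 else 0) for t in range(36)]
diffs = lag1_difference(series)
monthly = mean_change_by_month(series, start_month=1)

print(autocorrelation(series, 1) > 0.8)  # strong lag-1 autocorrelation: trend dominates
print(autocorrelation(diffs, 1) < 0)     # trend removed after differencing
print(monthly[12], monthly[1])           # 6.0 -4.0
```

In this toy series the trend masks the seasonal pattern in the raw values, whereas the differenced series isolates the December rise and the January fall.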

Validation of analysis
The RSV data of the four countries demonstrated excellent stability over time. The minimum and maximum ICC values for consistency were found in Brazil (0.88, 95% CI: 0.82-0.91) and the U.S. (0.99, 95% CI: 0.99-1.00), respectively. Also, the absolute concordance between the current RSV toothache series and Google's automatic time series varied from 0.95 (U.K., 95% CI: 0.49-0.99) to 0.98 (Brazil, 95% CI: 0.98-0.98). The preliminary 12-month forecasts displayed significant increments of RSV toothache values in the U.S. (6.04%), U.K. (4.56%), Australia (3.26%) and Brazil (7.10%). These increases were confirmed by the real data collected in April 2016; however, the forecasts were conservative for the U.K., Australia and Brazil, where the observed gains were 8.89%, 5.36%, and 26.23%, respectively. In contrast, the U.S. showed a lower variation (3.72%) than predicted.
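The comparison between forecast and observed gains can be restated as a small check. The percentages are those reported above; the two helper functions are hypothetical and simply encode "same direction of change" and "forecast smaller than the observed gain".

```python
# Percentage gains in RSV reported in the text: (forecast, observed).
gains = {
    "U.S.": (6.04, 3.72),
    "U.K.": (4.56, 8.89),
    "Australia": (3.26, 5.36),
    "Brazil": (7.10, 26.23),
}


def direction_confirmed(forecast, observed):
    """True if forecast and observed changes have the same sign."""
    return (forecast > 0) == (observed > 0)


def is_conservative(forecast, observed):
    """True if the forecast understated the magnitude of the observed change."""
    return abs(forecast) < abs(observed)


confirmed = [c for c, (f, o) in gains.items() if direction_confirmed(f, o)]
conservative = [c for c, (f, o) in gains.items() if is_conservative(f, o)]
print(confirmed)     # ['U.S.', 'U.K.', 'Australia', 'Brazil']
print(conservative)  # ['U.K.', 'Australia', 'Brazil']
```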

Search volume trends and seasonality
The autocorrelation plots clearly demonstrate the pattern of non-stationary time series (Fig 2). The monthly variation influenced the data of the U.S. and U.K., with higher RSV values found in January/July and December, respectively (Fig 3). Also, the interest in toothache in the U.K. increases in the second semester and in the fourth quarter, especially in December (Fig 3). Additionally, annual seasonality significantly affected all time series, with considerable increments of RSV means over the years, varying from 265% in the U.S. to 745% in Brazil (Figs 2 and 3). (Fig 4).

Association with available toothache statistics
Table 3 depicts a significant positive association between the variation of the annual means of observed and fitted RSV toothache values and the increment of toothache visits in the U.S. (+29.7%) or urgent dental treatments in the U.K. (+21.9%). The toothache statistics for the total population of the U.S. and for the adult populations of the U.S. and U.K. were more strongly correlated with RSV toothache values than those for other population groups.

Relation with media interests
Twenty-seven outliers were detected in the different time series, distributed among the U.S. (n = 10), U.K. (n = 5), and Brazil (n = 12). Only one publication, a scientific article published in January 2005, was connected to an outlier (an additive patch in the U.S. time series). Therefore, media interests did not disturb the variation of RSV toothache values in any of the four countries.

Main topics of toothache information
Although the main toothache queries were markedly associated with searches for information on endodontic pain in all countries, the proportion found in Brazil was significantly higher than in the other countries. Queries such as "earache", "wisdom tooth", "sinus toothache", "sinus infection", "sensitive tooth", and "jaw pain", for instance, were categorized as non-endodontic pain. In addition, Brazilian residents seemed more prone to seek instructions on the relief/treatment of toothache than other nationalities. The adoption of self-management strategies, such as the consumption of medications and home remedies, was usually identified among individuals interested in pain relief/treatment, mainly in the U.S. and Brazil (Table 4). The percentages of searches related to adequate treatment of toothache were 1.2% in Brazil, 1.8% in the U.S., 3.9% in Australia, and 7.9% in the U.K. (Table 5).

Discussion
These findings indicate a continuous increase in the interest of Internet users in toothache information over the years. These trends are consistent among different countries and were not influenced by the media. In most cases, the Internet is used as a source of instruction on how to self-manage dental pain through the consumption of medication and/or home remedies. To our knowledge, this is the first study to demonstrate the utility of Google-based data for improving the understanding of the dental needs of distinct countries.
The search volume trends for toothache agree with the national statistics of the U.S., U.K., and Australia, which registered an increase in the incidence of toothache over time [22,24,26]. Honkala et al [29] reported a growth in toothache cases among Finnish adolescents from 1977 to 1997, even with the decline of dental caries. More recently, Lewis et al [30] found a gradual increase in toothache cases among U.S. young adults who sought [...] [31] demonstrated that the burden of untreated dental caries in permanent teeth rose 3.2% from 1990 through 2010. The impact of tooth loss among children and adolescents progressively decreased in response to the reduction of dental caries rates. This fact, in conjunction with population aging, may explain the increase in untreated caries lesions among adults and, consequently, the higher incidence of toothache [32]. The deterioration of employability and health systems due to financial crises hampers individuals' access to preventive dental visits, also contributing to the increase in toothache [29,33]. In this sense, Althouse et al [20] showed an increase in toothache concerns among web health seekers during the U.S. Great Recession (2008-2011). Another seasonal stressor might explain the significant increase in toothache-related queries in December and January: Christmas festivities.
The impact of Christmas ranges from increased stress, family conflicts and alcohol misuse to heightened loneliness, increasing mental health difficulties and domestic violence [34]. Curiously, the higher means of detrended RSV values for toothache were also observed during December in Brazil and Australia.
Although our results demonstrate an increasing trend in the search volume of toothache-related queries in Brazil, the national statistics produced in 2003 and 2010 revealed a substantial decline in toothache cases, from 34.8% to 27.5% [27,28]. This divergent result might be explained by the advancement of digital penetration, which facilitates Internet access for deprived groups, who are typically more affected by toothache [35,36]. This greater Internet access also intensifies health seeking behavior, leading people to begin their toothache-related searches when they are experiencing milder pain [29]. Although this overreaction to pain might be a confounding factor for toothache analysis, a closer examination of the present queries identified endodontic pain as the main cause of Internet toothache searches in Brazil. On the other hand, two methodological differences between the Brazilian surveys might influence the interpretation of these results. The sampling criteria adopted in the surveys differed significantly, with the preponderant inclusion of metropolitan areas in the second survey. The greater availability of healthcare services in urbanized areas may facilitate access to dental treatment, reducing the number of toothache cases in those population groups. Besides, in both surveys, the prevalence of toothache was calculated from the participants' recall of their dental experiences in the previous 6 months, which predisposes to inaccurate reports and observer-expectancy effects [37,38]. In contrast, the prevalence of toothache in the U.S., U.K. and Australia was based on information collected from emergency patient records [22,24,26].
It is not possible to affirm that all queries typed on Google were performed by people experiencing toothache; however, the absence of media interference in the behavior of the time series and the direct link of the main queries with the self-management of endodontic pain indicate that searches were mostly conducted by individuals interested in resolving dental pain. Also, overestimation of the interest in toothache cannot be ruled out because of possible duplicate searches, since Google counts all specific queries originating from different Internet Protocol (IP) addresses in a time range [11]. Similarly, Santillana et al. [39] showed the tendency of Google Flu Trends to overestimate cases of influenza over time. Indeed, the same person suffering from toothache can search on Google from several computers, located at home, at work and in public places. While this repetitive action limits the potential of Internet-based methods to replace traditional epidemiological approaches, the registration of multiple queries from a single person may indirectly represent the severity of pain, the importance of health barriers or even the time elapsed until the dental visit; therefore, this analysis can contribute greatly to the elucidation of behavioral aspects related to toothache. It is noteworthy that the number of people who used Google to seek adequate treatment for toothache was alarmingly low, varying from 1.2% in Brazil to 7.9% in the U.K. These results are even more dramatic when the profile of Google health seekers is considered, composed predominantly of white women with high income and education [9].
Individuals interested in the relief and/or treatment of toothache usually searched for the self-management of pain through home remedies and/or over-the-counter painkillers. These results corroborate previous social media-based findings [8,40]. Also, a major impact of the self-management of endodontic pain was found in Brazil, a developing country with profound social inequalities and, therefore, less access to dental treatment. These results raise concern about the negative effect of incorrect and/or inaccurate website content on patients' health conditions, which may hamper the person-professional relationship and the shared decision-making process.
The abundance of Google Trends data enables the development of well-fitted forecasting models for predicting the health interests of specific populations in the near future, overcoming the expected delay in the production and publication of dental statistics. Furthermore, this approach presents other advantages: i) the anonymous and objective collection of data decreases the reporting bias of surveys; ii) the analysis can be periodically updated and filtered for particular areas; iii) the assessment of the consumption of health information is simpler, faster, and more economical than interview-based methodologies [19]; iv) it provides near real-time data [37,38]; v) the methodological standardization enables the comparison of results from distinct studies; vi) these data can be useful for improving the quality of surveys through the inclusion of relevant questions and the identification of new hypotheses using online proxies [12]. It should also be considered that greater access to mobile technologies, such as smartphones and tablets, and the diffusion of high-speed Internet should lead to an increase in the number of health-related queries in the coming years. Additionally, Internet penetration is advancing consistently even among the poorest countries [41], which supports the representativeness of samples in further studies.
The strength of these methods can be observed in the excellent stability of the time series data, with the maintenance or improvement of the level of association between the national statistics for toothache and the fitted RSV curves, which indicates a low risk of spurious correlation. In this analysis, we had to compare the RSV toothache values obtained for the U.K. with dental statistics for England because the respective data were lacking. In our opinion, this comparison is feasible because the RSV data for England are presented as a constant value of the U.K. time series. Moreover, the confirmation of all preliminary forecasts demonstrated the power of ARIMA models in predicting the direction of people's interest in toothache information. Likewise, the strong positive correlation between the curves resulting from the English or Portuguese query strategies and those given by the automatic Google Trends algorithms revealed the potential for comparing results produced in distinct languages.
In conclusion, Google users from the U.S., U.K., Australia and Brazil demonstrated an increasing interest in toothache information, regardless of media influence. Based on these results, Internet activity can be used as a complementary source of data to support health policy makers in the development and implementation of person-centered strategies, such as awareness campaigns on the possible consequences of untreated toothache and/or the training of professional teams to prescribe good-quality dental information to their patients. These measures would be relevant for minimizing the damage caused by ineffective self-management of toothache.