Investigating public behavior with artificial intelligence-assisted detection of face mask wearing during the COVID-19 pandemic

Objectives: Face masks are low-cost but effective in preventing transmission of COVID-19. To visualize the public’s practice of protection during the outbreak, we reported the rate of face mask wearing using an artificial intelligence-assisted face mask detector, AiMASK.
Methods: After validation, AiMASK collected data from 32 districts in Bangkok. We analyzed the association between factors affecting the unprotected group (incorrect or non-mask wearing) using univariate logistic regression analysis.
Results: AiMASK was validated before data collection, with an accuracy of 97.83% and 91% during internal and external validation, respectively. AiMASK detected a total of 1,124,524 people. The unprotected group consisted of 2.06% in the incorrect mask-wearing group and 1.96% in the non-mask-wearing group. A moderate negative correlation was found between the number of COVID-19 patients and the proportion of unprotected people (r = -0.507, p<0.001). People were 1.15 times more likely to be unprotected during the holidays and in the evening than on working days and in the morning (OR = 1.15, 95% CI 1.13–1.17, p<0.001).
Conclusions: AiMASK was as effective as human graders in detecting face mask wearing. The prevailing number of COVID-19 infections affected people’s mask-wearing behavior. Higher tendencies towards no protection were found in the evenings, during holidays, and in city centers.


AI-assisted face mask wearing (AiMASK)
The AiMASK system was developed using OpenPose for pose detection, Norfair for human tracking, and MobileNetV3 for mask detection. OpenPose is a real-time multi-person system that detects human body, hand, face, and foot key-points (135 key-points in total) on single images. Norfair is a Python library for real-time 2D object tracking built by Tryolabs. It predicts each point's future location based on previous positions and aligns these approximate locations with the detector's newly observed points to perform tracking. MobileNetV3 was used with TensorFlow Lite to detect masks (Fig 1).
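The predict-then-align idea behind tracking can be illustrated with a minimal sketch. It is not Norfair's implementation or API: the constant-velocity prediction, the greedy nearest-neighbour matching, the function names, the `max_dist` threshold, and the sample coordinates are all assumptions chosen for illustration.

```python
import math

def predict(prev, curr):
    """Constant-velocity guess: next position = current + (current - previous)."""
    return (2 * curr[0] - prev[0], 2 * curr[1] - prev[1])

def match(predictions, detections, max_dist=50.0):
    """Greedily pair each predicted point with its nearest unused detection.

    predictions: dict of track_id -> (x, y) predicted location
    detections:  list of (x, y) points observed in the new frame
    Returns a dict of track_id -> index into detections.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (math.dist(p, d), tid, j)
        for tid, p in predictions.items()
        for j, d in enumerate(detections)
    )
    assigned, used = {}, set()
    for dist, tid, j in pairs:
        if dist > max_dist:  # remaining pairs are even farther apart
            break
        if tid in assigned or j in used:
            continue
        assigned[tid] = j
        used.add(j)
    return assigned

# Two hypothetical tracks with their last two observed positions (pixels).
history = {"A": [(100, 100), (110, 100)], "B": [(300, 200), (300, 190)]}
predictions = {tid: predict(*pts) for tid, pts in history.items()}

# Hypothetical detections in the new frame, in arbitrary order.
detections = [(301, 181), (119, 101)]
assignment = match(predictions, detections)
print(assignment)
```

Matching on predicted rather than last-seen positions helps keep identities stable when people move between frames, which is what allows each person to be counted once.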
Images were categorized into protected and unprotected groups. The protected group was the correct mask-wearing group, and the unprotected group consisted of incorrect mask-wearing and non-mask-wearing people. The correct mask-wearing group consisted of people whose face masks covered their mouth, nose, and chin simultaneously, while the incorrect mask-wearing group was composed of people who wore a face mask that did not cover their mouth, nose, and chin at the same time. The non-mask-wearing group was made up of those whose face mask was not detected (S1 Fig).

PLOS ONE | https://doi.org/10.1371/journal.pone.0281841 | April 11, 2023
AiMask to be used by the government. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. Thanaruk Theeramunkong had full access to all the data in the study and had final responsibility for the decision to submit for publication.

All images for training and validation were obtained from closed-circuit television (CCTV) files owned by the Bangkok Metropolitan, which granted us permission to access all the CCTV files.
For the last process, we utilized 8,551 images obtained from CCTVs around the city during December 2020, which were independent of the images used for the training of AiMASK. Initially, 6,795 images were used for training; AiMASK marked 3,180 of these images as protected and 3,615 as unprotected (Table 1). After successful training, internal validation was performed using the testing set, which comprised around 20% (1,756 images) of the initial images. During internal validation, the average accuracy of AiMASK was 97.83% (95% CI 97.04-98.46%) (Table 1).
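As a rough check on the internal validation figures, the sketch below computes an accuracy and a 95% confidence interval for a proportion. Two points are assumptions rather than details from the study: the count of 1,718 correctly classified images is inferred from 97.83% of 1,756, and the Wilson score interval is used because the interval method is not stated, so the bounds may differ slightly from the reported ones.

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

total_images = 1756    # size of the testing set reported in the text
correct_images = 1718  # assumption: inferred from 97.83% of 1,756

accuracy = correct_images / total_images
low, high = wilson_interval(correct_images, total_images)
print(f"accuracy = {accuracy:.2%}, 95% CI {low:.2%} to {high:.2%}")
```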
After training and internal validation were completed, external validation was performed on an independent data set of randomly selected images from 64 one-hour CCTV files, recorded in the morning and in the evening in each of the 32 districts on different dates and at different times during 20-22 January 2021. The results of the human graders were then compared with those of AiMASK (Fig 2).

AiMASK data collection
Having been verified internally and externally, AiMASK was used to gather information for this study. Data classified into the unprotected group were manually allocated into the non-mask-wearing and incorrect mask-wearing groups. AiMASK analyzed recorded videos from the same CCTVs from which the training images were gathered. Files from CCTVs in various areas in every district around Bangkok were sent to AiMASK for evaluation. Each file had a duration of one hour, and all data were updated daily on the AiMASK website (https://aimask.aiat.or.th/). AiMASK detected images from public areas between January 23, 2021, and April 22, 2021 (90 days). Videos were taken from two separate timeframes, one in the morning (7am-8am) and one in the evening (5pm-6pm).
We subcategorized days into working days and holidays, with working days defined as Monday to Friday, and holidays defined as weekends and public holidays. We also subcategorized data by the type of place from which images were gathered into 7 categories: market entrances; inside the markets; public transportation (bus stops and sky trains); malls and convenience store entrances; building entrances; footbridges; and along the sidewalk. Districts were subdivided into 2 groups, the city center and suburban districts.
Numbers of COVID-19 patients were taken from the daily official reports from the Bangkok Metropolitan Data Center, which report new cases over a 24-hour period. New clusters of COVID-19 in this research comprised 2 large events announced by the Department of Disease Control under the Ministry of Public Health. The first cluster was reported on March 14, 2021, from Bang Kae, and the second cluster was in Thonglor on April 5, 2021. Patients who tested positive were reported according to their current residential district.
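The day and timeframe categories described above could be assigned along the following lines. This is a minimal sketch: the function names are hypothetical, and the public-holiday set holds a single illustrative entry because the list of public holidays used in the study is not given.

```python
from datetime import date, time

# Illustrative placeholder only; the study's actual holiday list is not given.
PUBLIC_HOLIDAYS = {date(2021, 4, 13)}

def day_type(d):
    """Holiday = weekend or public holiday; otherwise a working day."""
    if d.weekday() >= 5 or d in PUBLIC_HOLIDAYS:  # 5 = Saturday, 6 = Sunday
        return "holiday"
    return "working day"

def timeframe(start):
    """Map the start time of a one-hour CCTV file to its timeframe."""
    if start == time(7, 0):
        return "morning"
    if start == time(17, 0):
        return "evening"
    raise ValueError(f"file starting at {start} is outside both timeframes")

# Length of the collection period, counting both end dates.
study_start, study_end = date(2021, 1, 23), date(2021, 4, 22)
study_days = (study_end - study_start).days + 1

print(study_days, day_type(date(2021, 3, 14)), timeframe(time(17, 0)))
```

Counting both end dates gives the 90 days stated in the text.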

Outcome measures
Data were categorized into two groups for analysis: the protected group and the unprotected group. The primary outcome was the proportion of the protected and unprotected groups, together with correlations with the reported number of COVID-19 infections. The secondary outcome was the identification of factors showing correlations with the varying proportions of the unprotected group.

Statistical analysis
Descriptive statistics were used to report the total number of people analyzed by AiMASK and the proportion of mask wearing. Continuous data were reported using mean, median, and standard deviation (SD). External validation of AiMASK was analyzed using a confusion matrix. The accuracy, precision, recall, and F1 scores were calculated. Correlations were calculated by Pearson's correlation coefficient. A "very high" correlation was defined as a correlation coefficient of 0.90-1.00, a "high" correlation was a value of 0.70-0.89, a "moderate" correlation was defined as a correlation coefficient of 0.50-0.69, and a "low" correlation was a value of 0.30-0.49. Little or no correlation was considered to be a correlation coefficient ≤ 0.29.
Categorical variables were compared using the Chi-Square test. Data were analyzed using univariate analysis with 95% confidence interval (CI), and p<0.05 was considered statistically significant. All analyses were performed with SPSS 16.0 for Windows (SPSS Inc., Chicago, IL, USA).
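The correlation thresholds defined above can be written as a small lookup alongside a plain implementation of Pearson's coefficient. This is a sketch, not the SPSS procedure used in the study: the sample series are hypothetical, and applying the thresholds to the absolute value of r (so that negative coefficients are labeled by magnitude) is an assumption.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def strength(r):
    """Label a coefficient using the thresholds stated in the text."""
    magnitude = abs(r)  # assumption: the sign is ignored when labeling
    if magnitude >= 0.90:
        return "very high"
    if magnitude >= 0.70:
        return "high"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.30:
        return "low"
    return "little or none"

# Hypothetical daily series, for illustration only (not study data).
daily_cases = [2, 5, 9, 40, 120]
unprotected_pct = [5.1, 4.8, 4.9, 3.6, 2.7]

r = pearson_r(daily_cases, unprotected_pct)
print(f"r = {r:.3f} ({strength(r)})")
```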

External validation of AiMASK
AiMASK classified 3,000 people into the same group as human graders, giving AiMASK an accuracy of 91%. The F1 score was 0.91 in both groups (Table 2).
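For reference, accuracy, precision, recall, and F1 score follow from a two-class confusion matrix as sketched below. The counts are hypothetical and are not the values in Table 2; here the "positive" class stands for the unprotected group, with human grading taken as the reference.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp: unprotected by both AiMASK and the human graders
    fp: unprotected by AiMASK, protected by the human graders
    fn: protected by AiMASK, unprotected by the human graders
    tn: protected by both
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for illustration only.
scores = classification_metrics(tp=45, fp=5, fn=3, tn=47)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Swapping which group is treated as positive gives the per-group precision, recall, and F1 for the protected group.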

Overall data
During the 90 days of the study, 1,124,524 people were counted. The protected group accounted for the larger proportion (95.98%), and the unprotected group for the remaining 4.02%. The incorrect mask-wearing and non-mask-wearing groups constituted 2.06% and 1.96%, respectively. The protected group was over 90% at every time point. The average number of places analyzed per day was 24.87 ± 5.34, and the average number of people detected per day was 12,494 ± 3,044.63.
During the same 90-day timeframe, the total number of new COVID-19 patients was 6,312. The median number of new daily cases was 15.5 (range 0-446). Two weeks before the Bang Kae cluster, the size of the unprotected group increased to 5.31%. During the same two weeks before the first cluster, an average of 4.2 cases were reported per day. The highest percentage of people in the unprotected group was 8.38% on March 14, 2021, the same day the Bang Kae cluster was announced. A day later, the size of the unprotected group decreased to 4.74%.
Twenty-three days after the Bang Kae cluster, another cluster was announced in the Thonglor area (April 5, 2021). On that day, the unprotected group accounted for 3.51% of the observed individuals. One day after the Thonglor cluster, the unprotected group decreased to just 2.90%. After this second cluster, the proportion of people in the unprotected group varied between 2.18% and 3.84%. After the Thonglor cluster, the mean number of cases reported per day was 272.9 (Fig 3). A low positive correlation was found between the number of new COVID-19 patients and the size of the protected group (r = 0.432, p<0.001), while a negative correlation was found between the number of new COVID-19 patients and the size of the unprotected group (r = -0.507, p<0.001). Overall, there was a moderate negative correlation between the number of new COVID-19 patients and the proportion of people in the unprotected group.

Face mask wearing divided by place
The percentage of unprotected individuals differed significantly across the 7 types of places (p<0.001). The lowest percentage of unprotected individuals was found inside markets, at 2.64%. Compared with inside markets, the odds ratio (OR) for building entrances was 2.30 (95% CI 2.20-2.41, p<0.001), while the OR for the sidewalk was 1.88 (95% CI 1.80-1.96, p<0.001) (Table 3).

Face mask wearing divided by date and time
The proportion of people in the unprotected group was significantly lower in the morning than in the evening (3.27% and 3.74% respectively, p<0.001). Among the unprotected group, incorrect mask-wearing was more prevalent than non-mask-wearing both in the evening and in the morning (Table 3).
Sunday evening had the highest rate of unprotected people at 5.05%. On holidays and in the evening, people were 1.15 times more likely to be unprotected than on working days and in the morning (OR = 1.15, 95% CI 1.13-1.17, p<0.001) (Table 3).
The percentage of unprotected people during the holidays (4.06%) was higher than on working days (3.40%). Sundays and Saturdays had the highest rates of unprotected individuals at 4.98% and 3.99% respectively, while Mondays had the lowest rate at 3.23% of the 5 working days (Table 4).
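To show how an odds ratio relates to the percentages reported above, the sketch below computes the OR for evening versus morning from the two proportions (3.74% and 3.27%). The confidence-interval function uses the standard log-odds (Woolf) approximation and needs the underlying counts, which appear in Table 3 rather than in the text. The counts used here are hypothetical (10,000 people per timeframe), so the resulting interval is much wider than the reported one.

```python
import math

def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)

def odds_ratio(p_exposed, p_reference):
    """Odds ratio comparing two proportions."""
    return odds(p_exposed) / odds(p_reference)

def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and Woolf confidence interval from a 2x2 table.

    a, b: unprotected and protected counts in the exposed group
    c, d: unprotected and protected counts in the reference group
    """
    ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    low = math.exp(math.log(ratio) - z * se)
    high = math.exp(math.log(ratio) + z * se)
    return ratio, low, high

# Proportions of unprotected people reported in the text.
or_evening = odds_ratio(p_exposed=0.0374, p_reference=0.0327)
print(f"OR (evening vs morning) = {or_evening:.2f}")

# Hypothetical counts for illustration only: 10,000 people per timeframe.
ratio, low, high = odds_ratio_ci(a=374, b=9626, c=327, d=9673)
print(f"OR = {ratio:.2f}, 95% CI {low:.2f} to {high:.2f}")
```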

Face mask in different districts
The 5 districts with the highest proportions of unprotected people are all situated in the center of Bangkok and are adjacent to each other (Fig 4). Districts in the city center were 1.31 times more likely to have higher rates of unprotected people than suburban districts (OR = 1.31, 95% CI 1.28-1.34, p<0.001) (Table 3). No correlation was found between reported COVID-19 cases and unprotected people when the data were divided by district.

AI-assisted face mask wearing (AiMASK)
Constant face mask detection is required to gather information about the public's compliance with recommendations regarding wearing face masks. While previous studies have used manual methods to acquire this information, AiMASK-assisted face mask detection methods have allowed us to monitor a large number of people in a short period of time with high accuracy. AiMASK's accuracy has been assessed using actual images captured from CCTVs through external validation processes. This study is the first to use AI-assisted face mask detection in real-world settings. Previous studies from Egypt and China also developed machine-learning devices to detect face masks, but they only performed internal validation, with reported accuracy ranging from 98% to 100% [25,26].

Overall data
The overall rate of mask-wearing in Bangkok was 95.98%. At the beginning of the year 2021 (January 22 to February 28), the percentage of unprotected people was 2.95%, during which the number of new COVID-19 patients was around 8.7 cases per day. Two weeks before the first cluster was announced, the proportion of unprotected individuals increased to 5.31%, reaching its maximum at 8.38%. The increase in the unprotected group was due to the lower number of new infections per day, averaging 4.2 cases. The low number of new cases resulted in people letting their guard down. Immediately after the first cluster was announced, the size of the unprotected group started to decline gradually. Not long after the first, the announcement of the second cluster brought about a further decrease in the size of the unprotected group, which dropped to 2.61%. When the number of patients increases rapidly, people tend to exercise more care to protect themselves.
The government has emphasized the importance of social distancing and self-protection ever since the pandemic began in Thailand in 2020. Even when the situation was improving, the public health department still encouraged everyone to keep their distance and to not drop their guard. Data provided by AiMASK showed that measures taken have not been effective enough to maintain adequate prevention. Awareness has been raised by the announcement of new outbreaks, and the longer the duration of sustained increases in new COVID-19 patients, the more the proportion of unprotected people decreases, with high correlations. This illustrates that when the public see that the situation is not showing signs of improving, they are more aware of the high risk of contracting the virus.
Interestingly, the unprotected group consisted more of incorrect mask-wearing people than of non-mask-wearing ones. The reason for improper usage of face masks might be carelessness or lack of knowledge; either way, measures should be taken to ensure not only mask usage but also correct mask usage. During the first COVID-19 outbreak in Thailand, availability of face masks was a problem, resulting in people not wearing masks; however, this was no longer a problem at the time this study was conducted.

Global comparison
A study of the rate of mask-wearing in public in Poland, which observed 2,353 people over 3 days, found that 65-75% of people wore masks [22]. Other studies used series of photographs to estimate rates of mask wearing from 3-5 April 2020, and they found rates in Cambodia, Peru, India, Mexico, and the USA of 97%, 86%, 41%, 25%, and 21%, respectively [27]. Face mask wearing in France, Iran, and Hong Kong was reported at 56.4%, 45.6%, and 87%, respectively [23,28,29]. A self-reported questionnaire from Brazil found that 95.5% of people wore face masks [30].
Current published studies reported only face mask use, not its correctness. If worn incorrectly, the effectiveness of face masks decreases significantly; it is therefore important to wear face masks correctly.

Face mask wearing divided by place
From our observation, the location with the lowest unprotected rate was inside markets (2.64%). Since the start of the COVID-19 spread in Thailand, before the Bang Kae market cluster, another market cluster was reported in Samut Sakhon in December 2020. Markets became regarded as high-risk places for the SARS-CoV-2 virus, leading people to believe that they had a greater chance of contracting the virus if they went to the market, causing them to take self-protective measures such as wearing masks. In reality, although markets have a high density of people occupying a limited amount of space, conditions which lead to rapid spread, other places such as malls and closed spaces also present a high risk of virus infection.

Face mask wearing divided by date and time
Higher rates of unprotected behavior were found during the holidays, with Sunday evening showing the highest percentage, while the lowest rates were observed on Mondays. This could be due to the desire for relaxation after a long week of working. Before the pandemic, wearing masks was not habitual, but now it is mandatory on public transportation and in workplaces. Gradual adoption of this new habit might be the reason why, on days when people are not strictly required to wear masks, they prefer not to.
A higher percentage of the unprotected group was seen in the evenings than in the mornings, showing that people have a tendency to relax protective measures more often later in the day. A study in Iran similarly found that the rate of mask wearing in the morning was significantly higher than in the evening [23].
Knowing that during holidays and evenings people are prone to be under-protected, measures should be taken to emphasize the need to maintain mask wearing throughout the day and week. Offices and schools can help encourage people to check their protection before leaving and entering their premises. Strategies to boost the economy by promoting holidays might not be the best idea during this ongoing pandemic.

Face mask in different districts
The 5 districts containing the highest number of people in the unprotected group are all adjacent to each other and situated in central business areas. A study from France also reported an independent association between correct mask position and rural areas [28]. In contrast, the 5 districts with the highest number of COVID-19 patients were not those with the largest unprotected group. No correlation was found between reported cases and the unprotected group by district, and this may be because the cases found in each district were reported according to where the people resided rather than where they contracted the virus.

Strengths and limitations
This study is the first to use AI machines to detect mask wearing in public populations and had the largest number of participants in the world. We provided data over a period of 90 days, which included time frames of both low contraction rates and high spread. The longest duration of data collection in a previous study was 30 days [16]. We obtained data from various places across different districts and also identified those who wore masks incorrectly.
In hopes of providing a foundation for health care policies, we provided quantitative evidence of the percentages of people correctly and incorrectly wearing masks, and correlations with increased numbers of COVID-19 infections. Highlighting areas with higher rates of non-compliance could lead to new strategies aimed at decreasing the risk of incorrectly worn masks. These data are very important for policy makers, not only for COVID-19, but also for future cases of droplet-borne respiratory tract infections.
One limitation of this study was that information was collected from public areas of a single city, Bangkok, which might not represent mask wearing in Thailand as a whole. Reported cases of COVID-19 infection could come from home transmissions and clinical settings, and we did not gather information on face mask wearing inside homes and hospitals [29]. Thailand has had a large number of reported cases arising from close contact within families, and the government has released a policy encouraging people to wear masks while inside as well. Our study only addressed the public aspect of face mask wearing, which may only account for some of the COVID-19 cases.
The aim of our study was not to find correlations between face mask wearing and the number of COVID-19 patients, for several reasons. First, we do not know when the daily reported cases were infected with the virus. It is therefore hard to evaluate whether patients who tested positive for coronavirus today were infected because of inadequate mask wearing last week or the week before, or even whether their infection resulted from not wearing masks in public at all. Second, wearing a face mask is only one of many measures to lower rates of COVID-19 infection, and it does not entirely eliminate the risk of contracting the virus.
Lastly, our current AI machine could not differentiate between types of masks such as N95, cloth, and medical masks. Because the system is newly developed, AiMASK still has limits in grading poor-quality images or images for which its confidence is low. These images are classified as ungradable, and the data are later manually allocated into the correct group or discarded.

Conclusion
AI-assisted face mask detection illustrates the current rates of mask wearing, which we believe reflect the public's awareness. Higher tendencies toward no protection were found in the evenings, during holidays, and in city centers. Though the overall rate of mask wearing in Bangkok is relatively high (95.98%), data have shown that lower rates of COVID-19 have led to increased numbers in the unprotected group, and whether this is a large or small percentage, it is still unprotected and presents a higher possibility of transmission, eventually leading to recurrent outbreaks. This study shows current gaps in the public's behavior which can be adjusted to heighten self-protective measures. Policies focusing on current shortfalls will help maintain a high rate of protection. AiMASK has been developed to use images obtained from CCTVs already available throughout the city, meaning that it can be used on a nationwide scale or even worldwide with high percentages of accuracy.