Fig 1.
The framework for obtaining social media messages about “outdoor air pollution”.
The framework contains two phases: data collection and data pre-processing.
Table 1.
Different types of microblogs relevant to air pollution collected in Beijing.
Fig 2.
Examples of high correlations between AQI and individual messages.
(A) The trends of AQI (red line) and individual messages (blue line) for January 2012 in Beijing. (B) The trends of AQI (red line) and individual messages (blue line) for April 2012 in Beijing.
Table 2.
Correlation coefficient between AQI and individual messages.
Fig 3.
An example of the low correlation between AQI and individual messages.
The trends of AQI (red line) and the individual messages (blue line) for July 2012 in Beijing. Point A is the lowest AQI value and point B is the highest daily frequency of individual messages in July.
Fig 4.
An example of the high correlation between AQI and negative individual messages.
The trends of AQI (red line) and negative individual messages (blue line) for July 2012 in Beijing.
Table 3.
Correlation coefficient between AQI and negative individual messages.
Fig 5.
An example of high correlation between AQI and pollution app messages.
The trends of AQI (red line) and pollution app messages (blue line) for July 2012 in Beijing.
Table 4.
Correlation coefficient between AQI and app messages.
Fig 6.
An example of the low correlation between AQI and retweets.
The trends of AQI (red line) and retweets (blue line) for June 2012 in Beijing. Points C and D are the two highest daily frequencies of retweets in June.
Table 5.
Correlation coefficient between AQI and retweets.
Fig 7.
The inferred AQI and the observed AQI for each month of 2013.
The trends of observed AQI (red line) and inferred AQI (blue line) for each month of 2013.
Table 6.
RMSE for each month in 2013.
Fig 8.
The profile page of a Sina Weibo user.
Basic information in a user’s profile. Sina Weibo users select their geolocation from a drop-down list, unlike Twitter, which uses an open entry field.