Fig 1.
Framework flowchart of the text classification algorithm for subordinate classes of tourist attractions.
Fig 2.
CBOW and skip-gram models [36].
Fig 3.
Structure of skip-gram model [37].
Table 1.
Binary contingency table.
Table 2.
Evaluation indicators for classification results (1).
Table 3.
Evaluation indicators for classification results (2).
Fig 4.
Frequency of occurrence of different text lengths.
Table 4.
Distribution of experimental dataset categories.
Table 5.
Hyperparameter settings for Word2Vec and Doc2Vec.
Table 6.
Hyperparameter settings of the classification models.
Table 7.
Hyperparameter settings of the BERT model.
Table 8.
Classification performance of the MLP classifier on the entire test set at each stage of improvement.
Table 9.
Classification performance of the MLP classifier on each category of the test set at each stage of improvement.
Table 10.
Classification performance of different combinations of text representation methods & classifiers.
Fig 5.
Difference line graph of "micro-F1 minus weighted-F1".
Fig 6.
Difference line graph of "weighted-F1 minus macro-F1".
Fig 7.
Weighted-F1 values for different combinations of text representations & classifiers.
Fig 8.
Values of each evaluation index under the optimal classification combination model.
Fig 9.
Classification results of each category in the optimal combination of different text representations & classifiers.
Fig 10.
F1-measure of each category in the optimal combination of different text representations & classifiers.
Fig 11.
Weighted-F1 of the optimal combination model for text sets of different scales.
Table 11.
The optimal combination of text representations and classifiers for different-size text sets.
Fig 12.
Confusion matrix heat map of the optimal classification combination models for a text set size of 3498.
Fig 13.
F1-measure of the composite category of the optimal classification combination models for text sets of different scales.
Fig 14.
Quantity ratio of the true and predicted values for each category of tourist attractions.
Fig 15.
Confusion matrix heat map of the test set in Shanghai and Hunan Province.
Table 12.
Comparing true and predicted values of the top 2–3 categories in different-level attractions.
Table 13.
Quantity ratio of the true and predicted values of various attractions in the two provinces.
Table 14.
Comparing true and predicted values of the top 2–3 categories in different-level attractions in the two provinces.