The influence of neighborhood quality on tourism in China: Using Baidu Street View pictures and deep learning techniques


Previous studies have investigated the determinants of urban tourism development from the various attributes of neighborhood quality. However, traditional methods of assessing neighborhood quality are often subjective, costly, and limited to a small scale. To fill this research gap, this study applies recent developments in street view big data, deep learning algorithms, and image processing technology to quantitatively assess four attributes of neighborhood quality, namely street facilities, architectural landscape, green or ecological environment, and scene visibility. The paper collects more than 7.8 million Baidu street view pictures (SVPs) of 232 prefecture-level cities in China and applies deep learning techniques to recognize these images. It then examines the influence of neighborhood quality on regional tourism development. Empirical results show that the levels of both street facilities and greenery environment promote tourism. However, the construction intensity of the landscape has an inhibitory influence on the development of tourism. The threshold test shows that the intensity of the influence varies with the city’s overall economic level. These conclusions are of great significance for the development of China’s urban construction and tourism economy, and also provide a useful reference for policymakers. The methodological procedure is replicable and can be applied to other challenging cases.


The improvement of neighborhood quality is closely related to the development of urban tourism [1]. Good neighborhood quality includes not only high-level public service, but also the city’s unique architectural style, beautiful ecological environment, convenient street facilities, etc. All of these can bring the visitors a comfortable sightseeing experience, improve the attractiveness for tourists, and significantly strengthen the city’s competitiveness in tourism [2] and other industries. However, scholars have argued that tourism as an integral part of city marketing and development hasn’t gained sufficient attention in urban research [3]. Therefore, this paper attempts to objectively evaluate the neighborhood quality and explore its influence on urban tourism development.

Previous studies have investigated the determinants of urban tourism development from the perspective of neighborhood context or environment. However, most of them are qualitative analyses that focus on interpreting the concept, or they are general experience-based analyses or case studies. Whether these results generalize to other cities is unknown. Because quantitative analysis is insufficient, studies have not yet reached a consensus about the specific relationship between neighborhood quality and the tourism industry. Traditional methods of assessing neighborhood quality are often subjective, costly, and limited to a small scale [4]. As Long [5] asserts, it is difficult for researchers to obtain large-scale micro-data at the street level to measure neighborhood quality. In addition, previous evaluation methods, such as questionnaires, on-site surveys, and field observation, are time-consuming and prone to bias [6]. Therefore, the existing literature offers limited objective and effective evaluation of neighborhood quality, which is a prerequisite for further exploring its impact on regional tourism development.

To overcome the limitations of previous evaluation methods, this study applies recent developments in street view big data, machine learning, and image processing technology to measure neighborhood quality effectively [7]. The purpose of this paper is to examine the influence of neighborhood quality on regional tourism development and to provide a feasible method that combines SVPs and deep learning algorithms to quantify neighborhood quality for a large number of cities. The paper first collects more than 7.8 million Baidu SVPs from 232 prefecture-level cities in China. There are altogether 285 prefecture-level administrative cities in China. Because SVPs are unavailable for some remote cities, this paper uses 232 of them as the research sample. Then, we apply deep learning techniques to recognize these images and quantitatively measure four attributes of neighborhood quality, namely street facilities, architectural landscape, green or ecological environment, and scene visibility. The paper then conducts an empirical regression analysis of neighborhood quality and tourism development. Moreover, it explores the threshold effect, considering the heterogeneity among cities. The empirical results offer strong support for the development of China’s urban construction and tourism economy, and also provide a useful reference for policymakers.

The main contributions of this paper are: (1) It introduces the concept of neighborhood quality in the research field of economics and explores its relationship with urban tourism. This study broadens the theoretical and empirical research perspective about the tourism industry in economics and urban study. (2) Applying the combined method of deep learning and large-scale street view at a pedestrian level, we quantitatively analyze and measure four attributes of neighborhood quality. It advances previous qualitative research in urban planning. In this regard, our study enriches the research method in related fields and introduces the possibility of applying SVPs and deep learning methods for urban assessment.

The rest of this paper is arranged as follows: Section 2 introduces the related literature review. Section 3 describes the materials and methods for our analysis. Section 4 reports empirical results and further analysis. Conclusion and discussion are presented in the last section.

Literature review

This study builds on two main branches of literature. One concerns the relationship between neighborhood quality and urban tourism. The other concerns the methodological application of SVPs and deep learning techniques to measuring street landscapes and neighborhood quality.

Relationship between neighborhood quality & tourism

In the early stage of industrial society, tourism and cities diverged from each other. The urban problems brought by high-speed industrialization forced people to escape from the city and return to nature, and cities became an important source of tourists. Later, with the improvement of the urban environment and the continuous upgrading of equipment and facilities, the city took on the functions of management, reception, leisure, and entertainment. Tourism began to urbanize [8], and the urban tourism industry entered a competitive growth stage [9]. As a result of the rapid development of transportation and Internet technology, competition among cities has focused on local quality, and the neighborhood is the starting point for improving local quality. Previous studies have shown that good neighborhood quality can significantly enhance a city’s tourism competitiveness [2]. From the perspective of environmental psychology and behavioral science, neighborhood quality usually includes four attributes: street facilities, the city’s architectural landscape, greenery or ecological environment, and scene visibility.

In research on neighborhood quality and tourism development, scholars mostly focus on the influence of the street-related attributes of neighborhood quality on tourism. A street layout with high density and strong connectivity broadens the available options for urban transportation and increases the competitiveness of urban tourism [10]. Smooth road traffic and well-developed road facilities can provide tourists with a convenient transportation experience [11]. Meanwhile, road traffic with high mobility greatly improves accessibility [12]. However, it is accompanied by high accident rates and severe pollution. By contrast, narrow pedestrian roads provide visitors with more opportunities for walking, resting, and sightseeing [13].

Regarding architectural style, an appropriate building layout creates friendly and integrated surroundings, thus bringing tourists a comfortable sightseeing experience. In contrast, a dense building layout produces pressure [11]. For example, Tyrväinen and others [14] study the impact of building density on the subjective psychological feelings of tourists and find that a high-density building layout in resorts significantly reduces the visual scope of green space and natural landscape and thus causes psychological pressure on tourists. Similarly, Ren [15] argues in a study of Xi’an that high-density multi-storey buildings can block tourists’ sightlines and negatively affect their mood.

Therefore, urban landscape designers usually employ various plants in busy downtown areas to increase the city’s greenery and to give tourists relief from dense, tall buildings [16]. Greenery can reduce the sense of pressure caused by high building density [17]. The visibility of vegetation can improve the perceived safety of urban areas and bring tourists an easy and pleasant sightseeing experience [16]. In a study of Chinese cities, Fok [18] finds that improving the urban greenery coverage rate in Hong Kong can promote the city’s brand and reputation, thus enhancing the competitiveness of urban tourism. Studies of cities outside China have concluded that urban greenery creates more robust ecosystems and that forest landscape plays a vital role in tourism. Therefore, appropriate management of urban greenery helps enhance urban tourism [19–21].

In addition, the street enclosure formed mainly by buildings also affects tourists’ sense of security, comfort, and satisfaction, and in turn influences tourism development in the city [22, 23]. A street’s spatial enclosure determines the scale perceived by visitors [24], and the sky ratio affects their field of view [25]. Both relate directly to scene visibility and ultimately influence visitors’ emotional pleasure and pressure, and thus their willingness to stay.

Altogether, the existing studies mostly emphasize the influence of a particular attribute of the neighborhood quality on the tourism industry while neglecting comprehensive analysis taking all main elements into one framework simultaneously. Moreover, the available research is limited to the study of one specific city or case study which lacks the reliability of quantitative analysis results. As a result, the conclusions may not have reference value for other cities considering cities’ heterogeneity.

The application of SVPs & deep learning

Recently, methods for measuring street landscapes have surged in both domestic and global academia. The formation of a new data environment worldwide, technical breakthroughs in map and image processing, and the introduction of measurement technology from related fields allow further refinement of quantitative research on street form. Among these, the emerging SVPs provide massive data from a pedestrian perspective, with the advantages of rich information, high collection efficiency, and low collection cost [26]. SVPs can thus accurately reflect various street landscape elements, such as sky views, buildings and architectural style, greenery environment, visibility, and transportation convenience, as basic elements of neighborhood quality [24, 25, 27]. Scholars have already applied this technology to the evaluation of urban streets. For example, Li and others [28] construct a framework for evaluating the greenery recovery degree of urban streets based on Google SVPs and apply it to the East Village in the Manhattan district of New York. After that, Hao and Long [29] revise the approach of Li and others [28] and apply the street greenery index to more than 30,000 streets in Chengdu, China. Similarly, Liu and others [30] use Baidu SVPs to evaluate the buildings on both sides of Beijing’s Fifth Ring Road streets. Steinmetz-Wood and others [31] use Google Street View to describe the microscale environments of a total of 129 streets in the Montreal and Toronto regions.

In recent years, scholars have begun to broaden the use of street scenes through color band methods or machine learning techniques [32]. For example, Nagata and others [33] assess the street view walkability of communities around Tokyo city through semantic segmentation and statistical modeling for Google Street View images. Based on the generative adversarial networks (GAN) models through machine learning, Rachele and others [34] observe the influence on healthy activity from architectural features of more than 200 community buildings in Brisbane, Australia.

Although various methods exist to evaluate the micro-morphology of the city through SVPs and machine learning, existing applications cover only a single city or evaluate a single element [31–33, 35]. To fill this research gap, this paper uses street-level micro-data for each city to evaluate neighborhood quality across its four attributes and then carries out an empirical test.

Materials & methods

To examine the influence of neighborhood quality on the tourism industry, we need to quantitatively measure neighborhood quality based on the processed SVPs. Then, combined with data from China Urban Statistical Yearbook and China Environmental Statistical Yearbook, we conduct the empirical analysis with 232 major prefecture-level cities in China as the research samples. Fig 1 shows the research flow chart of this paper.

Acquisition of SVPs

The SVPs in this study came from the Baidu Map open platform. Baidu Street View and Tencent Street View cover most cities in China and are now the primary street view data sources in mainland China. The Baidu Map open platform provides free online access and an API (Application Programming Interface) for downloading Baidu SVPs (example Baidu SVPs API query parameters: width=512&height=256&location=116.313393,40.04778&fov=180).

In recent years, SVP data have been widely used in many studies, providing new ideas for large-scale research at the micro-level [36–38]. Compared with traditional urban data, urban SVPs have the following advantages: (1) They reflect objective reality. Unlike questionnaires or interviews, SVPs systematically map the city’s street-level scene in detail. SVPs contain rich information on urban infrastructure and show both artificial and natural landscapes at the street level, so they intuitively and accurately reflect the facade information of the city. (2) Street view images are taken by street view cars, with comprehensive coverage and different perspectives. Baidu SVPs cover hundreds of cities in China, which provides a reliable data source for this paper. (3) Data collection is highly efficient and comparatively low in cost. The processing logic is simple and easy to supervise and quality-control. In addition, collection is not limited by weather, time, or place.

By combining GIS and the Baidu API, we apply the following steps to download the SVPs and create a dataset:

Firstly, we obtain a street network map of China from the OpenStreetMap (OSM) website. OSM is a collaborative project that produces and offers free geographic data. We obtain data on five types of roads: main roads (Primary), secondary roads (Secondary), tertiary roads (Tertiary), residential roads (Residential), and service roads (Service). We then use ArcGIS software to optimize the road network data, removing streets with possible visual field duplication. The result is a road network map of China.

Secondly, we generate GPS (Global Positioning System) points. Previous studies have applied various sampling intervals to obtain SVPs, such as 20 meters [39], 30 meters [40], 50 meters [32], and 100 meters [4]. Considering the time required to obtain SVPs and the efficacy and validity of the image content for our research purpose, we set the sampling interval to 100 meters. We set the shortest allowable distance between any two points so that there is at least one Baidu Street View panorama every 100 meters along a road. The road network obtained in the previous step is sampled with the CreatePointsLines plugin of ArcGIS. GIS enables direct links, through spatial connections, between latitude and longitude coordinate pairs and the related datasets. As a result, we get GPS points with latitude and longitude coordinates every 100 meters along each street. The coordinates of all sample points are saved.

Thirdly, we collect SVPs. The Baidu API allows users to download SVPs by specifying acquisition parameters such as Size (the output size of the image in pixels), Location (coordinates of the location), Fov (horizontal field of view of the image, accepted values are 0–360), Heading (compass heading of the camera, accepted values are 0–360), and Pitch (up or down angle of the camera, accepted values are 0–90). To make the obtained street view closer to the eye-level view of pedestrians, we set "Fov" to 90° and "Heading" to 0, 90, 180, and 270, respectively. By entering the coordinates of the sampling points into a Python script, we access the Baidu Street View API and collect four images at each location, which together cover the 360-degree horizontal surroundings. These 360-degree images capture the entire streetscape from a specific position and are more informative than the single static images used in some studies. The requested SVPs were mainly captured in 2016. A tiny portion was collected in 2015 or 2017; these images mostly show small roads with low pedestrian traffic and elevated highways, which had not changed. In addition, some images need to be removed in the following cases: (1) for some specified street sites, the Baidu Street View map has no recorded images; (2) in some frames, the central view is blocked by objects such as trucks and buses rather than showing the street. After deleting the invalid pictures, we ultimately collect more than 7.8 million Baidu SVPs (860*573 pixels) from 5,868 streets in China’s 232 prefecture-level cities.
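The second and third steps can be illustrated with a short Python sketch. It is not the authors' script. The sampling function stands in for the ArcGIS plugin and assumes road vertices already projected to a metre-based coordinate system. The endpoint `BASE_URL`, the key parameter `ak`, and the lowercase `heading` parameter name are placeholders assumed for illustration; only `width`, `height`, `location`, and `fov` appear in the example request given for the Baidu API. No network call is made.

```python
import math
from urllib.parse import urlencode

# Hypothetical endpoint: the real URL is not given in the text.
BASE_URL = "https://example.invalid/panorama"
HEADINGS = (0, 90, 180, 270)  # four views that together cover 360 degrees


def sample_points(polyline, spacing=100.0):
    """Return points every `spacing` metres along a road polyline.

    `polyline` is a list of (x, y) vertices in a projected, metre-based
    coordinate system (an assumption made for this sketch).
    """
    points = [polyline[0]]
    remaining = spacing  # distance still to travel before the next sample
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        pos = 0.0  # distance already covered on this segment
        while seg_len - pos >= remaining:
            pos += remaining
            t = pos / seg_len
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            remaining = spacing
        remaining -= seg_len - pos
    return points


def build_requests(lng, lat, api_key="YOUR_KEY", width=860, height=573):
    """Build one request URL per heading for a single sampling point."""
    urls = []
    for heading in HEADINGS:
        params = {
            "ak": api_key,          # assumed name of the key parameter
            "width": width,
            "height": height,
            "location": f"{lng},{lat}",
            "fov": 90,              # 90-degree field of view per image
            "heading": heading,     # assumed lowercase parameter name
        }
        urls.append(f"{BASE_URL}?{urlencode(params)}")
    return urls
```

Calling `build_requests(121.513046, 31.288205)` yields four URLs for one location, one per heading. A real pipeline would also need to convert the sampled points back to longitude and latitude and to skip locations for which no image is returned.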

Recognition of SVPs

To measure neighborhood quality based on SVPs, we need to assess the physical features of the objects in each image. Specifically, there are two procedures. (1) We perform a precise semantic segmentation of the different physical features in each image and assign a category label to each pixel. (2) We use Python to optimize the segmentation and count the total number of pixels in each category.
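The pixel-counting procedure can be sketched as follows, assuming the segmentation output for one image is a 2-D grid of integer class ids. The function name, the class list, and the tiny mask are illustrative and are not taken from the authors' code.

```python
from collections import Counter


def class_proportions(label_mask, class_names):
    """Return the share of pixels per class for one segmented image.

    `label_mask` is a 2-D list of integer class ids, where id `i`
    corresponds to `class_names[i]` (an assumed encoding).
    """
    counts = Counter(pixel for row in label_mask for pixel in row)
    total = sum(counts.values())
    return {
        name: counts.get(idx, 0) / total
        for idx, name in enumerate(class_names)
    }


# A toy 2 x 3 "image": two road pixels, one building, three vegetation.
mask = [[0, 0, 1],
        [2, 2, 2]]
shares = class_proportions(mask, ["road", "building", "vegetation"])
```

The resulting per-image proportions are the quantities that the city-level indicators later average over all SVPs of a city.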

Image semantic segmentation uses machine learning to segment the contents of an image semantically. Scene parsing is a strict form of segmentation that partitions the image into semantically meaningful parts and classifies each pixel into one of the pre-determined categories. Deep learning is one of the most advanced approaches to scene parsing [40] and can extract the underlying features directly from the raw image data, avoiding manual feature extraction. Various architectures are used in deep learning applications, such as convolutional neural networks (CNN), fully convolutional neural networks (FCNN, [41]), pyramid scene parsing networks (PSPNet, [42]), and Deep Labelling for Semantic Network (DeepLabV3; [43]). This paper uses DeepLabV3+, which is based on the dilated FCN (Fully Convolutional Network) framework, for semantic segmentation. DeepLabV3+ is the fourth version of the DeepLab series proposed by Google [44]. It extends DeepLabV3 by employing an encoder-decoder structure to refine the segmentation results, especially along object boundaries. The overall architecture of the DeepLabV3+ model is shown in Fig 2. The collected SVP is fed into the model and passes through the backbone network (the part marked as DCNN (deep convolutional neural network) Atrous Conv(olution) in Fig 2). The encoder module encodes multi-scale contextual information by applying atrous convolution at multiple scales, while the simple yet effective decoder module refines the segmentation results along object boundaries. Finally, the model produces a pixel-wise classified street view image with semantic categories, which has the same size as the input image.

Fig 2. The overall architecture of the DeepLabv3+ model [44].

Generally, a deep learning model needs additional training to become suitable for scene segmentation and semantic understanding. Several datasets are available for training deep neural networks, such as ADE20K and PASCAL VOC. This paper uses the Cityscapes dataset to train the model. The Cityscapes dataset is a large-scale database that focuses on semantic understanding of urban street scenes. It consists of images of complex scenes from 50 different cities with labeled objects. The photos feature a large number of dynamic objects, varying scene layouts, and varying backgrounds. Compared to other datasets, the Cityscapes dataset captures the complexity of real-world landscapes [45], similar to the Baidu SVPs we use. The Cityscapes dataset defines semantic, instance-wise, and dense pixel annotations for 30 visual classes grouped into eight categories (flat, human, vehicle, construction, object, nature, sky, and void). It excludes 11 classes that rarely appear in streetscapes and leaves 19 classes for evaluation. In this paper, we use the default set of 19 classes (determined by the number of finely annotated pixels) for semantic segmentation. The 19 classes and their categories are: flat (road and sidewalk); construction (building, wall, and fence); object (pole, traffic light, and traffic sign); nature (vegetation and terrain); human (person and rider); vehicle (car, truck, bus, train, motorcycle, and bicycle); and sky.

Based on the DeepLabv3+ algorithm and the Cityscapes dataset, we complete the semantic segmentation of more than 7.8 million Baidu SVPs. That is, we segment each image pixel-wise into pixel sets for the sky, buildings, walls, and other classes (19 classes altogether). Fig 3 shows the semantic segmentation results for the school gate of Tongji University in Shanghai as an example. The upper four SVPs are images taken at four different angles, and the lower four are the corresponding pictures after semantic segmentation. The GPS point is the Siping Road school gate of Tongji University, with the coordinates (121.513046, 31.288205). Fig 3 shows that different objects in the pictures are clearly separated and given different colors. Among them, gray represents the road, yellow represents the sidewalk, blue represents vegetation, green represents buildings, red represents cars, purple represents the sky, purple represents pedestrians, blue-green represents poles, and so on.

Although the literature review has shown that the DeepLabV3+ model is currently one of the most advanced techniques for recognizing SVPs in related studies, we carry out additional comparative assessments of the efficiency and accuracy of our approach. Fig 4 compares our approach with a manual approach using Adobe Photoshop. Our approach detects more than the manual approach. In addition, following the method of Xia and others [40], we use linear regression to compare our results with those obtained manually. We randomly select SVPs and calculate the GVI (Green View Index) value of each image according to the measurement equation explained in the next section. Fig 5 shows the relationship between the GVI results of the two approaches. The correlation coefficient of the regression line is 0.938, with R² = 0.9851.
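A comparison of this kind can be sketched with an ordinary least-squares line fitted to paired GVI values. The function below is a generic illustration, not the authors' validation code, and the sample values are invented solely to show the call.

```python
def fit_line(x, y):
    """Fit y = slope * x + intercept by least squares and return
    (slope, intercept, r_squared) for paired observations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # In simple linear regression, R^2 is the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Invented example: manual GVI vs. model GVI for five images.
manual = [0.10, 0.22, 0.35, 0.41, 0.58]
model = [0.12, 0.20, 0.37, 0.40, 0.60]
slope, intercept, r_squared = fit_line(manual, model)
```

A slope near 1, an intercept near 0, and an R² close to 1 would indicate that the automated values track the manual ones closely.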

Measurement of neighborhood quality

As stated above, the neighborhood quality can be measured through four attributes: the level of traffic convenience, the urban architectural landscape, the greenery or ecological environment, and the scene visibility along the street. Referring to previous research, we use four indicators to evaluate neighborhood quality.

Measurement of the level of street facilities (SC).

Considering the available data, we use the street capacity (SC) to indicate the level of street facilities and the spatial accessibility of a city. Street capacity is usually measured by the physical scale of the road interface, such as the distance between buildings or structures (walls, fences) on both sides of the street [46], or the sum of secondary spatial scales such as motorized lanes, non-motorized lanes, and sidewalks [47]. Based on the image recognition information of various street-related objects, including motorized lanes, non-motorized lanes, sidewalks, pedestrians, vehicles, and so on, this paper first defines four street-related indicators. Then, we use the entropy method [48] to evaluate the street capacity based on the four indicators. The expression formula for the four street-related indicators is as follows:

Where PSWi is the proportion of sidewalks in city i, PRWi is the proportion of roadways in city i, CSWi is the capacity of sidewalks (non-motorized lanes) in city i, and CRWi is the capacity of roadways (motorized lanes) in city i. NUMi is the total number of SVPs collected in city i, roadik is the pixel proportion of the motorized lane in the kth SVP of city i, and sidewalkik is the pixel proportion of the non-motorized lane and sidewalk in the kth SVP of city i. personik, riderik, bicycleik, carik, truckik, busik, and motorbicycleik are the pixel proportions of person, rider, bicycle, car, truck, bus, and motorcycle in the kth SVP of city i, respectively.

Here we apply the Entropy Weight Method (EWM) to weight the different impacts of these four indicators on street capacity. EWM measures the degree of dispersion of each variable, avoiding subjective evaluation and information overlap among variables. Generally, carrying out the EWM involves six steps: 1) data acquisition and collation; 2) data standardization, to make the indicators’ values comparable; 3) calculation of each indicator’s proportions; 4) measurement of the information entropy [48], which quantifies the information in each indicator; 5) calculation of the difference coefficient, which is negatively correlated with the entropy value; and 6) calculation of the weights. After implementing the entropy method in Python, we obtain the weights of the four indicators shown in Table 1. In this paper, we apply the weight of each indicator to calculate the street capacity of each city.
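The steps above can be sketched in a few lines of Python. This is a minimal sketch of the textbook entropy weight method, not the authors' implementation. It assumes min-max standardization, that every indicator is of the "larger is better" type, that each indicator takes at least two distinct values, and that there are at least two cities; the input numbers are invented.

```python
import math


def entropy_weights(rows):
    """Return one entropy-based weight per indicator.

    `rows` holds one list of indicator values per city.
    """
    n = len(rows)        # number of cities
    m = len(rows[0])     # number of indicators
    diffs = []
    for j in range(m):
        col = [row[j] for row in rows]
        lo, hi = min(col), max(col)
        # Step 2: min-max standardization (assumes hi > lo).
        z = [(v - lo) / (hi - lo) for v in col]
        # Step 3: proportion of each city within the indicator.
        total = sum(z)
        p = [v / total for v in z]
        # Step 4: information entropy, treating 0 * ln(0) as 0.
        entropy = -sum(v * math.log(v) for v in p if v > 0) / math.log(n)
        # Step 5: difference coefficient (falls as entropy rises).
        diffs.append(1.0 - entropy)
    # Step 6: normalize the difference coefficients into weights.
    total_diff = sum(diffs)
    return [d / total_diff for d in diffs]


# Invented data: four cities, two indicators.
weights = entropy_weights([[0, 0], [0, 1], [0, 2], [1, 3]])
```

The first indicator is concentrated in a single city, so its entropy is lower and it receives the larger weight. A composite score such as street capacity would then be a weighted sum of the standardized indicators.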

Table 1. The weight measurement of 4 indicators for evaluating the street capacity.

Measurement of urban architectural landscape (CD).

We use construction density to reflect urban development intensity, which relates directly to the city’s landscape. Construction density captures the open space rate of a unit of space and the density of building coverage. A higher building density reflects greater construction intensity and thus a higher degree of utilization of urban space. The building density of an area is calculated as the percentage of building area in the whole area. Based on the information identified from SVPs, this paper calculates the building density with the following formula:

CDi is the construction density of city i, and Buildingik is the pixel proportion of buildings in the kth SVP of the city i.
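The displayed formula did not survive in this copy of the text. Based on the symbol definitions given here, and assuming a simple average over the NUMi SVPs of city i (the count defined for the street indicators), a plausible form is:

```latex
CD_i = \frac{1}{NUM_i} \sum_{k=1}^{NUM_i} Building_{ik}
```

The averaging over images is an assumption; the original may aggregate differently.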

Measurement of greenery environment (GVI).

Street greenery plays a vital role in enhancing the overall quality of a city [16]. The concept of the “Green View Index (GVI)” was first proposed by Japanese scholars in 1985 [49] and was based on the ratio of green within the frame of vision. The green view index indicates the proportion of greenery in the street and reflects the degree of people’s visual perception of the surrounding greenery. It focuses on the three-dimensional composition and visual effect of urban greenery and thus better reflects the quality of the public greenery environment from residents’ viewpoint [50]. Therefore, the GVI is useful for evaluating environmental quality, urban planning, and residential value [51]. However, most early research focuses on green coverage derived from satellite images or aerial photographs [16]. As Leslie and others [52] point out, green indicators based on remotely sensed images are objectively measured but cannot represent neighborhood greenness as perceived by pedestrians.

In 2009, Yang and others [53] used four pictures taken in four directions, applied an object-based image analysis approach, and measured the GVI as the ratio of green area to the total area of the four pictures. This index quantitatively measures how much greenery a pedestrian can see from street level. Since then, researchers have modified the GVI through various calculation methods (e.g., [40]). This paper follows the method of Ki and Lee [32]. For SVPs, the GVI is the number of pixels of greenery elements in an SVP as a proportion of the total number of pixels in the picture. The calculation formula is as follows:

GVIi is the green view index of city i, and Vegetationik is the pixel proportion of vegetation in the kth SVP of city i. The value of GVIi is high when a lot of greenery is visible from the sampled positions.
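The displayed formula is missing from this copy. Given the definitions above, and assuming the city-level index is the mean vegetation share over the NUMi SVPs collected in city i, one consistent form is:

```latex
GVI_i = \frac{1}{NUM_i} \sum_{k=1}^{NUM_i} Vegetation_{ik}
```

Here the use of a plain average over all images of a city is an assumption made for illustration.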

Measurement of scene visibility (DE).

A street’s continuous scene visibility can satisfy people’s material, spiritual, psychological, behavioral, and other needs. It reflects a street’s original texture and lets users feel the designers’ humanistic care, thus promoting neighborhood quality [22, 23]. The public space of a city is mainly surrounded by buildings, so scene visibility can be indicated by the spatial enclosure. The degree of enclosure is mainly determined by the proportions of street wall and sky [47]. In this paper, we use the sky proportion to reflect the degree of spatial enclosure felt by pedestrians. We use the following formula to get the degree of enclosure in a city:

Where DEi is the degree of enclosure of city i, Skyik represents the pixel proportion of sky elements in the kth SVP of city i. ALLik indicates all pixels in the kth SVP of city i (value is 1).
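The displayed formula is missing from this copy. One reading that is consistent with the symbols defined here, in which enclosure rises as the visible sky share falls, is the following; both the subtraction and the averaging over NUMi images are assumptions:

```latex
DE_i = \frac{1}{NUM_i} \sum_{k=1}^{NUM_i} \left( ALL_{ik} - Sky_{ik} \right)
```

Because ALLik equals 1, this would reduce to one minus the mean sky proportion of the city's SVPs.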

Empirical analysis and results

Basic regression

Because the Baidu SVPs are from the year 2016, we use income in the tourism industry (TIi) of the same year as a proxy measure to indicate the development level of the tourism industry in the city i. The corresponding regression empirical model is as follows: (1)

TIi is the tourism income of city i; QSi is the core explanatory variable of this paper and represents the neighborhood quality of city i, expressed by four indicators. Following the existing literature on the tourism industry (e.g., [54]), we adopt a series of control variables for city i, Xi: the city’s level of economic development, measured by GDP (Gross Domestic Product); the level of service development, measured by the number of employees in the city’s service industry (POP3); the level of living standards, measured by per capita income (GZ); and the degree of pollution, measured by urban industrial dust emissions (GYFC).
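Equation (1) itself is not legible in this copy. A generic cross-sectional specification that matches the variable definitions above would be the following, where the coefficient symbols and the error term are assumed notation, and whether any variable enters in logarithms is not stated:

```latex
TI_i = \alpha + \beta \, QS_i + \gamma \, X_i + \varepsilon_i
```

In this form QSi stands for the vector of the four indicators (SCi, CDi, GVIi, DEi), so β and γ are coefficient vectors.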

Table 2 shows the regression results. The estimated coefficients of two variables (SCi, GVIi) are significantly positive at the 1% and 5% levels, respectively. The results are consistent with those of Moniruzzaman and Páez [11] and Fok [18], who argue that a high level of street convenience and a green or ecological environment tend to improve a city’s tourism industry. In addition, the estimated coefficient of the variable CDi indicates a significant inhibitory effect on tourism development. This is reasonable, since people in modern society are tired of high-density buildings and over-development and prefer the natural environment. It also confirms the conclusion of Tyrväinen and others [14]. However, contrary to our expectation, there is no significant relationship between the variable DEi and tourism income. That is, apart from scene visibility, the other three attributes of neighborhood quality significantly influence a city’s tourism industry. The following analysis shows that the significance of the four variables changes when the city’s heterogeneity is considered.

Test of robustness

To test whether the above empirical results are sensitive to the model specification and the sample, we conduct two robustness tests.

Firstly, we replace the explained variable, tourism income (TIi), with the number of tourists (TI_2i) in the regression model. Columns (1) and (2) of Table 3 report the empirical results. The signs and significance of the explanatory and control variables are consistent with the baseline regression.

Table 3. Robustness test (replacement of the explained variables).

Secondly, we exclude special cities from the sample and re-estimate the regression. Cities in China differ considerably according to their administrative levels.

Most of the cities in our sample are ordinary prefecture-level cities. However, there are about 39 special cities with other administrative levels, such as provincial capitals, municipalities directly under the central government, sub-provincial cities, special economic zone cities, and cities listed separately in the national social and economic development plan. Because the administrative level reflects heterogeneity in resource capacity and policy-making rights, which might influence the development of urban tourism, differences at the city level may bias the regression results. Column (1) in Table 4 shows the regression results after excluding the special cities. Column (2) in Table 4 shows the results after also replacing the explained variable with the number of tourists. Both results show that the signs and significance of the explanatory and control variables are consistent with the baseline regression.

The above two tests show that the signs and significance of the coefficients of the main variables are consistent with those of the basic regression, indicating that our empirical results are robust.

Threshold test

Existing literature has demonstrated that China is a vast country whose cities are heterogeneous in many respects. Such heterogeneity might lead to differences in the intensity of the influence of neighborhood quality. To account for this heterogeneity, this paper further explores the threshold effect of the economic level, measured by GDP. Table 5 shows that the economic level of the 39 special cities is significantly higher than that of the ordinary cities.

Table 5. Descriptive statistics of economic levels in different types of cities.

Table 6 shows the test for the threshold effect, with the economic level as the threshold variable and neighborhood quality as the core independent variable. The p-value of 0.043 is significant at the 5% level, indicating a single threshold effect with a threshold value of 0.248. The p-value of 0.279 for the double-threshold test is not significant, so the model has no double threshold effect. That is, neighborhood quality begins to affect the tourism industry only after the city’s economic level reaches the threshold of 0.248.

Furthermore, Table 7 shows the results of the threshold regression. If the economic level is below the threshold (0.248), as in Column (1), none of the four attributes of neighborhood quality has a significant effect on the tourism industry. Conversely, if the economic level exceeds the threshold (0.248), as in Column (2), the results are basically the same as those of the basic regression in Table 2. These results confirm the existence of heterogeneity among cities.
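The idea behind a single-threshold regression can be sketched as a grid search: for each candidate threshold, the slope on the regressor is allowed to differ between the two regimes, and the candidate with the smallest sum of squared residuals is kept. The code below is a minimal sketch on synthetic data with one regressor. The function names, the 15% trimming, and the synthetic threshold of 0.4 are assumptions for illustration. It does not reproduce the paper’s estimator, and the significance test for the threshold is omitted.

```python
import numpy as np


def ssr(y: np.ndarray, X: np.ndarray) -> float:
    """Sum of squared residuals from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def single_threshold(y, x, q, trim=0.15):
    """Grid-search the threshold on q that minimises the SSR when the
    slope on x differs between the regimes q <= gamma and q > gamma."""
    lo, hi = np.quantile(q, [trim, 1 - trim])
    candidates = np.unique(q[(q >= lo) & (q <= hi)])
    best_gamma, best_ssr = None, np.inf
    for gamma in candidates:
        low = (q <= gamma).astype(float)
        X = np.column_stack([np.ones_like(y), x * low, x * (1 - low)])
        s = ssr(y, X)
        if s < best_ssr:
            best_gamma, best_ssr = float(gamma), s
    return best_gamma, best_ssr


rng = np.random.default_rng(1)
n = 500
q = rng.uniform(0, 1, n)     # threshold variable (stand-in for economic level)
x = rng.normal(size=n)       # regressor (stand-in for one quality indicator)
true_gamma = 0.4             # synthetic value, unrelated to the reported 0.248
# No effect of x below the threshold, unit effect above it.
y = 1.0 + np.where(q <= true_gamma, 0.0, 1.0) * x + rng.normal(scale=0.2, size=n)

gamma_hat, _ = single_threshold(y, x, q)
print(round(gamma_hat, 3))
```

Trimming the extreme quantiles of the threshold variable keeps enough observations in each regime for the two slopes to be estimated.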


Previous studies have shown that improving neighborhood quality influences urban tourism. However, most assess neighborhood quality subjectively and on a small scale. The lack of quantification leads to doubts about the applicability and reliability of the research conclusions. Traditional methods such as case studies, questionnaires, and field surveys are time-consuming and not applicable to large-scale studies. Although previous literature has applied satellite remote sensing images to quantify environmental characteristics (e.g., [55]), these images are taken from a bird’s-eye or overhead view, whereas our study requires a visitor’s or pedestrian’s perspective. The Baidu SVPs and the methodology used in this paper therefore provide large-scale, detailed data and make it possible to measure the neighborhood environment quantitatively, ultimately making our discussion of the impact of neighborhood quality on tourism more accurate and reliable. To the best of our knowledge, this paper is the first to combine SVPs, deep learning techniques, and regression models to examine the influence of neighborhood quality on tourism development in almost all Chinese prefecture-level cities.

This paper contributes to the literature on urban planning. Most previous research employs questionnaires or field research to explore the influence of neighborhood quality. Our findings provide new insight into how four attributes of neighborhood quality affect the tourism industry, thereby enriching the existing discussion about urban design and planning. In addition, our study provides a methodological reference for employing SVPs and deep learning techniques in economic studies, especially for the use of large amounts of micro-level data.

Our results are of great significance for policymakers seeking to improve China’s urban construction and urban tourism economy. In recent years, the regional integration process in China has accelerated significantly. As the basic unit of the city, the city block carries the city’s main public space. The neighborhood quality of a city is closely related to the city’s attractiveness and thus to its economic development. Meanwhile, “ecological civilization” has become one of the national strategies and the primary guidance for spatial development in China. Under these circumstances, improving neighborhood quality has become an essential strategy for local governments participating in the new urban development process. More specifically, this article offers policy implications for promoting local quality in aspects such as street capacity and the greenery environment. It also cautions against dense construction in a city. However, these effects depend on a threshold, namely the city’s economic level. Once the overall economic level of a city reaches a certain degree, its urban spatial design should gradually shift from “scale expansion” to “stock optimization”. For cities with relatively low economic levels, promoting the urban economy through infrastructure investment and construction is reasonable. Meanwhile, such cities need to reserve enough space for future urban design, such as greenery and a reasonable layout of buildings and streets.

Although the SVP is a valuable data source for assessing the urban environment, scholars have pointed out several limitations in recent years. Firstly, almost all existing studies on Chinese street maps do not involve multi-temporal analysis and are carried out for a specific time, e.g., the year 2016 in our study. However, a time series of data can better reflect the dynamic change of neighborhood quality and thus its influence over time. The main reason is that the acquired Baidu SVPs are captured in a single day or a short period and lack continuous temporal coverage [56]. Although Baidu has added a “time machine” function to its online query system that provides multi-temporal SVPs, users are not able to acquire these historical SVPs through the API. The study by Yu and others [57] is one of the few that focus on temporal changes. It uses Baidu SVPs to assess the dynamic changes in urban greening between 2014 and 2019, but it is limited to one Chinese prefecture-level city. Secondly, the SVP is criticized for its bias in spatial coverage and street accessibility. Chen and others [58] argue that the vehicle used to take the SVPs cannot reach some streets or alleys in the city. For tourists in particular, the places that attract them may not appear on the street view map. Thirdly, the weather conditions or seasonality at the time the photo is taken, such as lighting and solar radiation [56], can limit the quality of the photo and thus the precision of the recognition. In such cases, other data sources need to be added to ensure the accuracy of the analysis. Furthermore, subjective perception measures based on the real environment and those based on the street view may not be consistent. The study by Feng and others [59] points out that the Baidu SVPs and the real site are relatively consistent in the subjective perception assessment of street quality, while significant differences exist in ambiance perception. Yue and others [60] also acknowledge the importance of human visual perception of the urban environment, in addition to the physical environment. Future research could combine other techniques or information sources, such as questionnaires, for a more accurate assessment.

In addition, existing literature has applied various indicators to explain the concept of neighborhood quality. This paper describes the concept through four attributes. However, other variables also relate to the quality of the neighborhood, such as demographic makeup [61], psychological perception [4], including safety [16], wealth, and boredom, urban commerce distribution [39], social inequalities [62], and perceived scene complexity [63]. Considering the tourist-related characteristics and the data available from SVPs, this paper is limited to the four indicators. Future studies can focus on the above-mentioned aspects and yield further implications. For example, in future work we plan to use POI (point of interest) data obtained through Baidu, in the hope of improving the assessment of neighborhood quality. Also, the definition and measurement of some indicators of neighborhood quality in our paper are relatively simple. Although our evaluation of the first indicator, the level of street facilities, is relatively comprehensive, the measurements of the other three indicators rely only on the proportion of pixels. One reason is that our methodological approach is relatively new, and little literature can be referenced, except for the greening rate. The other reason is the limitation of SVP identification. The limited measurement might therefore weaken the effectiveness and robustness of the analysis. Future research will benefit from advances in SVP-identification technologies.


This paper fills the research gap by using Baidu SVPs and deep learning to quantitatively evaluate four attributes of neighborhood quality, namely street facilities, architectural landscape, green or ecological environment, and scene visibility. Based on more than 7.8 million Baidu SVPs of 232 prefecture-level cities in China, the paper investigates the influence of neighborhood quality on the development of the tourism industry in a city. The empirical study finds that: (1) The improvement of neighborhood quality, in general, promotes the development of the tourism industry in a city. More specifically, the level of street facilities, measured by street capacity, and the greenery environment, measured by GVI, have a significantly positive influence on tourism. Conversely, higher construction density (CD) has a significant inhibitory effect on the development of the tourism industry. (2) These effects are subject to a threshold when cities’ heterogeneity, such as the urban economic level, is taken into account. Only when the economic level crosses the threshold value of 0.248 does neighborhood quality have a significant impact on the urban tourism industry.

Compared with previous related studies that assess neighborhood quality subjectively and on a small scale, our study shows that Baidu SVPs and deep learning techniques can be regarded as a reliable tool for evaluating neighborhood quality over a large study area. The methods proposed in this study could be applied to related research areas in other countries and regions, such as land use and built environment, health and wellbeing, and urban modeling and demographic surveillance [64].


  1. Boivin M., Tanguay G. A. Analysis of the determinants of urban tourism attractiveness: The case of Québec City and Bordeaux. Journal of destination marketing & management. 2019; 11, 67–79.
  2. Enright M. J., Newton J. Tourism destination competitiveness: a quantitative approach. Tourism management. 2004; 25(6), 777–788.
  3. Yuan J., Deng J., Pierskalla C., King B. Urban tourism attributes and overall satisfaction: An asymmetric impact-performance analysis. Urban Forestry & Urban Greening. 2018; 30, 169–181.
  4. Wang R., Liu Y., Lu Y., Yuan Y., Zhang J., Liu P., et al. The linkage between the perception of neighbourhood and physical activity in Guangzhou, China: using street view imagery with deep learning techniques. International Journal of Health Geographics. 2019; 18(1), 1–11.
  5. Long Y. Street Urbanism: A New Perspective for Urban Studies and City Planning in the New Data Environment. Time+ Architecture. 2016; 02: 128–132. (in Chinese)
  6. Wang R., Liu Y., Lu Y., Zhang J., Liu P., Yao Y., et al. Perceptions of built environment and health outcomes for older Chinese in Beijing: A big data approach with street view images and deep learning technique. Computers, Environment and Urban Systems. 2019; 78, 101386.
  7. He N., Li G. Urban neighbourhood environment assessment based on street view image processing: A review of research trends. Environmental Challenges. 2021; 4, 100090.
  8. Mullins P. Tourism urbanization. International journal of urban and regional research. 1991; 15(3), 326–342.
  9. Page S. J. Urban tourism. Cahiers Espaces. 1995; (1), 77–79.
  10. Saelens B. E., Sallis J. F., Frank L. D. Environmental correlates of walking and cycling: findings from the transportation, urban design, and planning literatures. Annals of behavioral medicine. 2003; 25(2), 80–91. pmid:12704009
  11. Moniruzzaman M., Páez A. A model-based approach to select case sites for walkability audits. Health & Place. 2012; 18(6), 1323–1334. pmid:23103425
  12. Wendt J. A., Grama V., Ilieş G., Mikhaylov A. S., Borza S. G., Herman G. V., et al. Transport infrastructure and political factors as determinants of tourism development in the cross-border region of Bihor and Maramureş. A comparative analysis. Sustainability. 2021; 13(10), 5385.
  13. Arnold H. F. Trees in urban design. 1980.
  14. Tyrväinen L., Uusitalo M., Silvennoinen H., Hasu E. Towards sustainable growth in nature-based tourism destinations: Clients’ views of land use options in Finnish Lapland. Landscape and Urban Planning. 2014; 122, 1–15.
  15. Ren Y. Xi’an tourism urbanization process, pattern and mechanism. Journal of Xi’an University of Finance and Economics. 2017; 30(3), 68–75. (in Chinese)
  16. Li X., Zhang C., Li W. Does the visibility of greenery increase perceived safety in urban areas? Evidence from the place pulse 1.0 dataset. ISPRS International Journal of Geo-Information. 2015; 4(3), 1166–1183.
  17. Mikulić J., Krešić D., Miličević K., Šerić M., Ćurković B. Destination attractiveness drivers among urban hostel tourists: An analysis of frustrators and delighters. International Journal of Tourism Research. 2016; 18(1), 74–81.
  18. Fok K. City re-imagined: multi-stakeholder study on branding hong kong as a city of greenery. Journal of Environmental Management. 2018; 206, 1039–1051. pmid:30029338
  19. Santos M. C., Ferreira A. M., Costa C. Influential factors in the competitiveness of mature tourism destinations. Universidade Do Algarve. 2014; 10(1), 73–81.
  20. Dobbs C., Kendal D., Nitschke C. R. Multiple ecosystem services and disservices of the urban forest establishing their connections with landscape structure and sociodemographics. Ecological Indicators. 2014; 43, 44–55.
  21. Sgroi F. Forest resources and sustainable tourism, a combination for the resilience of the landscape and development of mountain areas. Science of The Total Environment. 2020; 736, 139539. pmid:32474275
  22. Gong F. Y., Zeng Z. C., Zhang F., Li X., Ng E., Norford L. K. Mapping sky, tree, and building view factors of street canyons in a high-density urban environment. Building and Environment. 2018; 134, 155–167.
  23. Liang J., Gong J., Sun J., Zhou J., Li W., Yi L. Automatic sky view factor estimation from street view photographs: a big data approach. Remote Sensing. 2017; 9(5), 411.
  24. Harvey C., Aultman-Hall L., Hurley S. E., Troy A. Effects of skeletal streetscape design on perceived safety. Landscape and Urban Planning. 2015; 142, 18–28.
  25. Alexander C. A pattern language: towns, buildings, construction. 1977. Oxford university press.
  26. Zhang L., Pei T., Chen Y., Song C., Liu X. A Review of Urban Environment Evaluation Based on Street View Image. Journal of the Earth Information Sciences. 2019; 21(1), 13. (in Chinese)
  27. Rapoport A. Human aspects of urban form: towards a man-environment approach to urban form and design. 2016. Elsevier.
  28. Li X., Zhang C., Li W., Ricard R., Meng Q., Zhang W. Assessing street-level urban greenery using Google Street View and a modified green view index. Urban Forestry & Urban Greening. 2015; 14(3), 675–685.
  29. Hao X., Long Y. Street greening: a new walkability evaluation index. Shanghai Urban Planning. 2017; (1), 6. (in Chinese)
  30. Liu M. Y., Lin S., Ramalingam S., Tuzel O. Layered interpretation of street view images. Computer Science. 2015; 10, 393–396.
  31. Steinmetz-Wood M., El-Geneidy A., Ross N. A. Moving to policy-amenable options for built environment research: The role of micro-scale neighborhood environment in promoting walking. Health & Place. 2020; 66, 102462. pmid:33120068
  32. Ki D., Lee S. Analyzing the effects of Green View Index of neighborhood streets on walking time using Google Street View and deep learning. Landscape and Urban Planning. 2021; 205, 103920.
  33. Nagata S., Nakaya T., Hanibuchi T., Amagasa S., Kikuchi H., Inoue S. Objective scoring of streetscape walkability related to leisure walking: Statistical modeling approach with semantic segmentation of Google Street View images. Health & Place. 2020; 66, 102428. pmid:32977303
  34. Rachele J. N., Wang J., Wijnands J. S., Zhao H., Stevenson M. Using machine learning to examine associations between the built environment and physical function: a feasibility study. Health & Place. 2021; 70(1), 102601. pmid:34157507
  35. Appleyard D., Lintell M. The environmental quality of city streets: the residents’ viewpoint. Journal of the American Institute of Planners. 1972; 38(2), 84–101.
  36. Li X., Ratti C. Mapping the spatial distribution of shade provision of street trees in Boston using Google Street View panoramas. Urban Forestry & Urban Greening. 2018; 31, 109–119.
  37. Lu Y., Sarkar C., Xiao Y. The effect of street-level greenery on walking behavior: Evidence from Hong Kong. Social Science & Medicine. 2018; 208, 41–49.
  38. Wang W., Xiao L., Zhang J., Yang Y., Tian P., Wang H., et al. Potential of Internet street-view images for measuring tree sizes in roadside forests. Urban Forestry & Urban Greening. 2018; 35, 211–220.
  39. Ye N., Wang B., Kita M., Xie M., Cai W. Urban commerce distribution analysis based on street view and deep learning. IEEE Access. 2019; 7, 162841–162849.
  40. Xia Y., Yabuki N., Fukuda T. Development of a system for assessing the quality of urban street-level greenery using street view images and deep learning. Urban Forestry & Urban Greening. 2021; 59, 126995.
  41. Long J., Shelhamer E., Darrell T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2015; 3431–3440.
  42. Zhao H., Shi J., Qi X., Wang X., Jia J. Pyramid scene parsing network. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2017; 2881–2890.
  43. Chen L. C., Papandreou G., Schroff F., Adam H. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587. 2017.
  44. Chen L. C., Zhu Y., Papandreou G., Schroff F., Adam H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European conference on computer vision (ECCV). 2018; 801–818.
  45. Cordts M., Omran M., Ramos S., Rehfeld T., Enzweiler M., Benenson R., et al. The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2016; 3213–3223.
  46. Jia R. Research on expressway settlement evaluation and expansion longitudinal surface design based on driving comfort. (Doctoral dissertation, Zhejiang University). 2016. (in Chinese)
  47. Li K., Cheng S., Li X. Quality index and measurement method of urban streets. Statistics and Decision-Making. 2019; 11, 4. (in Chinese)
  48. Shannon C. E. A mathematical theory of communication. The Bell system technical journal. 1948; 27(3), 379–423.
  49. Aoki Y., Yasuoka Y., Naito M. Assessing the impression of street‐side greenery. Landscape Research. 1985; 10(1), 9–13.
  50. Long Y., Liu L. How green are the streets? An analysis for central areas of Chinese cities using Tencent Street View. PloS one. 2017; 12(2), e0171110. pmid:28196071
  51. Zhu H., Yang F., Bao Z., Nan X. A study on the impact of Visible Green Index and vegetation structures on brain wave change in residential landscape. Urban Forestry & Urban Greening. 2021; 64, 127299.
  52. Leslie E., Sugiyama T., Ierodiaconou D., Kremer P. Perceived and objectively measured greenness of neighbourhoods: Are they measuring the same thing? Landscape and urban planning. 2010; 95(1–2), 28–33.
  53. Yang J., Zhao L., Mcbride J., Gong P. Can you see green? Assessing the visibility of urban forests in cities. Landscape and Urban Planning. 2009; 91(2), 97–104.
  54. Zhang Q., Chen J., Guan W., Wang Y., Ahmed M. Asymmetric impacts of technology innovation and environmental quality on tourism development in emerging economies. Economic Research-Ekonomska Istraživanja. 2021; 1–17.
  55. Feng T., Ma H., Cheng X., Zhang H. Calculation of the optimal segmentation scale in object-based multiresolution segmentation based on the scene complexity of high-resolution remote sensing images. Journal of Applied Remote Sensing. 2018; 12(2), 025006.
  56. Deng M., Yang W., Chen C., Wu Z., Liu Y., Xiang C. Street-level solar radiation mapping and patterns profiling using Baidu Street View images. Sustainable Cities and Society. 2021; 75, 103289.
  57. Yu X., Her Y., Huo W., Chen G., Qi W. Spatio-temporal monitoring of urban street-side vegetation greenery using Baidu Street View images. Urban Forestry & Urban Greening. 2022; 127617.
  58. Chen X., Meng Q., Hu D., Zhang L., Yang J. Evaluating greenery around streets using Baidu panoramic street view images and the panoramic green view index. Forests. 2019; 10(12), 1109.
  59. Feng G., Zou G., Piga B. E., Hu H. The Validity of Street View Service Applied to Ambiance Perception of Street: A Comparison of Assessment in Real Site and Baidu Street View. In International Conference on Applied Human Factors and Ergonomics. 2021; 740–748. Springer, Cham.
  60. Yue H., Xie H., Liu L., Chen J. Detecting People on the Street and the Streetscape Physical Environment from Baidu Street View Images and Their Effects on Community-Level Street Crime in a Chinese City. ISPRS International Journal of Geo-Information. 2022; 11(3), 151.
  61. Gebru T., Krause J., Wang Y., Chen D., Deng J., Aiden E. L., et al. Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences. 2017; 114(50), 13108–13113. pmid:29183967
  62. Zhou H., He S., Cai Y., Wang M., Su S. Social inequalities in neighborhood visual walkability: Using street view imagery and deep learning technologies to facilitate healthy city planning. Sustainable cities and society. 2019; 50, 101605.
  63. Guan F., Fang Z., Wang L., Zhang X., Zhong H., Huang H. Modelling people’s perceived scene complexity of real-world environments using street-view panoramas and open geodata. ISPRS Journal of Photogrammetry and Remote Sensing. 2022; 186, 315–331.
  64. Cinnamon J., Jahiu L. Panoramic Street-Level Imagery in Data-Driven Urban Research: A Comprehensive Global Review of Applications, Techniques, and Practical Considerations. ISPRS International Journal of Geo-Information. 2021; 10(7), 471.