A New Methodology of Spatial Cross-Correlation Analysis

Spatial correlation modeling comprises both spatial autocorrelation and spatial cross-correlation processes. The spatial autocorrelation theory has been well-developed. It is necessary to advance the method of spatial cross-correlation analysis to supplement the autocorrelation analysis. This paper presents a set of models and analytical procedures for spatial cross-correlation analysis. By analogy with Moran’s index newly expressed in a spatial quadratic form, a theoretical framework is derived for geographical cross-correlation modeling. First, two sets of spatial cross-correlation coefficients are defined, including a global spatial cross-correlation coefficient and local spatial cross-correlation coefficients. Second, a pair of scatterplots of spatial cross-correlation is proposed, and the plots can be used to visually reveal the causality behind spatial systems. Based on the global cross-correlation coefficient, Pearson’s correlation coefficient can be decomposed into two parts: direct correlation (partial correlation) and indirect correlation (spatial cross-correlation). As an example, the methodology is applied to the relationships between China’s urbanization and economic development to illustrate how to model spatial cross-correlation phenomena. This study is an introduction to developing the theory of spatial cross-correlation, and future geographical spatial analysis might benefit from these models and indexes.


Introduction
In geographical research, spatial correlation processes fall into two types: autocorrelation and cross-correlation. The former reflects intra-sample correlation, that is, a relationship between one measure and itself, while the latter reflects inter-sample correlation, namely, a relationship between one measure and another measure. Spatial autocorrelation is defined by one size measurement (e.g., city population) and one spatial contiguity measurement (e.g., Euclidean distance), while spatial cross-correlation is defined by two size measurements (e.g., city population and urban area) and one spatial contiguity measurement. Based on the statistical measurements of Moran's index and Geary's coefficient [1,2], a relatively mature theory has been developed for spatial autocorrelation analysis. Spatial autocorrelation modeling has been widely applied to various correlational analyses of natural and human phenomena in many fields [26-41], and in particular it has been integrated into the spatial analytical

Global and local measurements of spatial cross-correlation
The theoretical framework of spatial cross-correlation analysis consists of a set of models and measurements. The basic mathematical reasoning is helpful for understanding these models and indices. Suppose there are n elements (e.g., cities) in a system (e.g., a network of cities and its hinterland) which can be measured by two variables (e.g., city population and urban area), X and Y. A pair of vectors can be defined as below:

$$X = [x_1 \; x_2 \; \cdots \; x_n]^{\mathrm{T}}, \quad Y = [y_1 \; y_2 \; \cdots \; y_n]^{\mathrm{T}}, \tag{1}$$

where $x_i$ and $y_i$ are two size measurements of the ith element (i = 1, 2, ..., n), and the symbol "T" denotes transpose. The centralized variables can be calculated by

$$X_c = X - \mu_x, \quad Y_c = Y - \mu_y, \tag{2}$$

where $\mu_x$ and $\mu_y$ represent the average values of the variables $x_i$ and $y_i$, which are expressed as

$$\mu_x = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad \mu_y = \frac{1}{n}\sum_{i=1}^{n} y_i. \tag{3}$$

The population variances of the two variables are as follows

$$\sigma_x^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu_x)^2, \quad \sigma_y^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \mu_y)^2, \tag{4}$$

where $\sigma_x$ and $\sigma_y$ denote the population standard deviations of $x_i$ and $y_i$, respectively. The results of a scaling transform of the centralized variables form a pair of standardized vectors such as

$$x = \frac{X_c}{\sigma_x}, \quad y = \frac{Y_c}{\sigma_y}, \tag{5}$$

which are termed standard scores in statistics. It can be shown that the squared norms of x and y, i.e., the squared lengths of the two vectors, $\|x\|^2$ and $\|y\|^2$, exactly equal the dimension of the vectors, i.e., the number of elements in the system, n. Thus we have

$$x^{\mathrm{T}}x = y^{\mathrm{T}}y = n. \tag{6}$$

The models of spatial correlation, including autocorrelation and cross-correlation, are based on spatial distance or spatial contiguity. Define an n-by-n spatial weights matrix such as

$$W = [w_{ij}]_{n \times n}, \tag{7}$$

which is a unitized spatial weights matrix (USWM).
The matrix can be produced by a spatial contiguity matrix (SCM), and it has three properties as below: (1) Symmetry, i.e., $w_{ij} = w_{ji}$; (2) Zero diagonal elements, namely, $w_{ii} = 0$, which implies that the entries in the diagonal are all 0; (3) Unitization condition, that is

$$\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij} = 1. \tag{8}$$

Then, by analogy with the improved formula of Moran's index for spatial autocorrelation (Chen, 2013a), a new measurement for spatial cross-correlation analysis can be defined as

$$R_c = x^{\mathrm{T}} W y, \tag{9}$$

where $R_c$ denotes the coefficient of spatial cross-correlation, which can be termed spatial cross-correlation index (SCI). It is easy to prove that the SCI is a correlation coefficient, and its value falls between -1 and 1. Because of symmetry of the spatial weights matrix, transposing $R_c$ yields another expression

$$R_c = y^{\mathrm{T}} W x, \tag{10}$$

which is numerically equivalent to Eq (9). However, as indicated in the following section, from Eqs (9) and (10), we can derive different models for different uses of spatial analysis. A set of matrix equations can be constructed on the basis of the SCI formulae. Left-multiplying both sides of Eq (10) by x and both sides of Eq (9) by y yields

$$x y^{\mathrm{T}} W x = R_c x, \tag{11}$$

$$y x^{\mathrm{T}} W y = R_c y. \tag{12}$$

It is easy to demonstrate that $x y^{\mathrm{T}} W x = x x^{\mathrm{T}} W y$ and $y x^{\mathrm{T}} W y = y y^{\mathrm{T}} W x$. In these equations, there are two ideal spatial correlation matrixes (ISCM) for spatial autocorrelation as below:

$$M^{(xx)} = x x^{\mathrm{T}} W, \tag{13}$$

$$M^{(yy)} = y y^{\mathrm{T}} W. \tag{14}$$

There are two ISCMs for spatial cross-correlation such as

$$M^{(xy)} = x y^{\mathrm{T}} W, \tag{15}$$

$$M^{(yx)} = y x^{\mathrm{T}} W. \tag{16}$$

Eqs (11) and (12) show that the SCI is an eigenvalue of the ISCMs of spatial cross-correlation. This differs from Moran's index, which is the characteristic value of the ISCM of spatial autocorrelation [7]. An important measurement of spatial autocorrelation is called local indicators of spatial association (LISA). LISA is also termed local Moran's index [3]. Similarly, two sets of local spatial cross-correlation coefficients can be defined by

$$R_i = x_i \sum_{j=1}^{n} w_{ij} y_j, \tag{17}$$

$$R_j = y_j \sum_{i=1}^{n} w_{ij} x_i, \tag{18}$$

where $x_i$ and $y_j$ here denote elements of the standardized vectors x and y, and $R_i$ and $R_j$ refer to the local spatial cross-correlation index (LSCI) of the ith element and the jth element.
Accordingly, $R_c$ denotes the global spatial cross-correlation index (GSCI), which can be termed SCI for short. The GSCI is used to reflect the summation of cross-correlation between any two elements, while the LSCI is utilized to measure the cross-correlation between a given element and all other elements in a geographical system. As $w_{ij} = w_{ji}$, for arbitrary n, the LSCI can be expressed with matrix equations as follows

$$\begin{bmatrix} R_1 \\ R_2 \\ \vdots \\ R_n \end{bmatrix}_{(i)} = \mathrm{diag}(x_1, x_2, \ldots, x_n)\, W y, \tag{19}$$

$$\begin{bmatrix} R_1 \\ R_2 \\ \vdots \\ R_n \end{bmatrix}_{(j)} = \mathrm{diag}(y_1, y_2, \ldots, y_n)\, W x. \tag{20}$$

Practical equations for spatial cross-correlation
In practice, the spatial cross-correlation coefficient can be defined in another form. The precondition for Eq (9) is as follows

$$n W y = R_c x, \tag{21}$$

which represents a practical relation for the SCI. According to Eq (6), left-multiplying Eq (21) by $x^{\mathrm{T}}$ yields $n x^{\mathrm{T}} W y = x^{\mathrm{T}} R_c x = n R_c$, which results in Eq (9). Similarly, the precondition for Eq (10) is as below

$$n W x = R_c y, \tag{22}$$

which, left-multiplied by $y^{\mathrm{T}}$, yields $n y^{\mathrm{T}} W x = y^{\mathrm{T}} R_c y = n R_c$, and thus yields Eq (10). A real spatial correlation matrix (RSCM) for cross-correlation can be defined as

$$M = nW = x^{\mathrm{T}} x W = y^{\mathrm{T}} y W. \tag{23}$$

It can be proved that $R_c$ is an eigenvalue of M, and the corresponding eigenvector is (x+y). Actually, Eq (21) plus Eq (22) yields

$$M(x + y) = nW(x + y) = R_c (x + y). \tag{24}$$

This suggests that M corresponds to $M^{(xy)}$ and $M^{(yx)}$. The relationship between Eq (13) and Eq (21) gives an error equation

$$M y - M^{(xx)} y = nWy - R_c x = U, \tag{25}$$

in which U represents an error vector. The relationship between Eq (14) and Eq (22) gives another error equation

$$M x - M^{(yy)} x = nWx - R_c y = V, \tag{26}$$

in which V represents another error vector. There are errors between $My = x^{\mathrm{T}} x W y$ and $M^{(xx)} y = x x^{\mathrm{T}} W y$, and likewise between $Mx = y^{\mathrm{T}} y W x$ and $M^{(yy)} x = y y^{\mathrm{T}} W x$. This suggests an approach to testing the "goodness of fit" of a spatial cross-correlation model relative to observational data. If the spatial cross-correlation is strong, Mx will be close to $M^{(yy)} x$, and My will be close to $M^{(xx)} y$.
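The global and local indices described in this section can be illustrated with a small numerical sketch. It assumes a hypothetical five-element system; the data, the binary contiguity matrix, and the helper names (`standardize`, `unitize`, `spatial_lag`, `global_sci`, `local_sci`) are illustrative choices and not part of the original study or its MatLab programs.

```python
import math

def standardize(values):
    """Z-scores based on the population standard deviation (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def unitize(contiguity):
    """Zero the diagonal, then divide every entry by the grand total."""
    n = len(contiguity)
    v = [[0.0 if i == j else float(contiguity[i][j]) for j in range(n)]
         for i in range(n)]
    total = sum(sum(row) for row in v)
    return [[entry / total for entry in row] for row in v]

def spatial_lag(w, vec):
    """Matrix-vector product W vec."""
    return [sum(wij * vj for wij, vj in zip(row, vec)) for row in w]

def global_sci(x, y, w):
    """Global index R_c = x^T W y, as in Eq (9)."""
    return sum(xi * lag for xi, lag in zip(x, spatial_lag(w, y)))

def local_sci(x, y, w):
    """Local indices x_i * sum_j w_ij y_j; they sum to the global index."""
    return [xi * lag for xi, lag in zip(x, spatial_lag(w, y))]

# Hypothetical toy system of five places (not data from the paper).
population = [120.0, 80.0, 45.0, 30.0, 25.0]
urban_area = [300.0, 260.0, 150.0, 90.0, 110.0]
contiguity = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0],
]

x = standardize(population)
y = standardize(urban_area)
w = unitize(contiguity)

r_xy = global_sci(x, y, w)   # Eq (9)
r_yx = global_sci(y, x, w)   # Eq (10); equal to r_xy because W is symmetric
lsci_x = local_sci(x, y, w)  # first set of local indices
lsci_y = local_sci(y, x, w)  # second set of local indices
print(round(r_xy, 4), round(sum(lsci_x), 4), round(sum(lsci_y), 4))
```

The three printed numbers coincide, which reflects the two properties stated in the text: the two global expressions are numerically equivalent, and each set of local indices sums to the global index.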

Spatial cross-correlation scatterplots
Spatial cross-correlation can be visually displayed with two scatterplots, which are similar to Moran's scatterplot of spatial autocorrelation. However, the cross-correlation scatterplots come in pairs. In order to create the scatterplots, six variables based on the spatial correlation matrix are defined as below:

$$f^{(y)} = nWy, \tag{27}$$

$$f^{(xy)} = x y^{\mathrm{T}} W x, \tag{28}$$

$$f^{(xx)} = x x^{\mathrm{T}} W y, \tag{29}$$

$$f^{(x)} = nWx, \tag{30}$$

$$f^{(yx)} = y x^{\mathrm{T}} W y, \tag{31}$$

$$f^{(yy)} = y y^{\mathrm{T}} W x. \tag{32}$$

Using these equations, we can generate a set of scatterplots comprising four graphs with observational data and the corresponding calculations. The newly defined variables can be matched to make cross-correlation scatterplots as follows. The relationship between x and $f^{(xy)}$ gives the first scatterplot, the relationship between x and $f^{(xx)}$ gives the second scatterplot, the relationship between y and $f^{(yx)}$ gives the third scatterplot, and the relationship between y and $f^{(yy)}$ gives the fourth scatterplot (Table 1). In fact, the first plot is the same as the second one, while the third plot is identical in form to the fourth one. For this reason, only two scatterplots are needed to illustrate spatial cross-correlation in empirical studies.
The approach to making cross-correlation scatterplots is as follows. Taking x or y as the abscissa (x-axis) and $f^{(y)}$ or $f^{(x)}$ as the ordinate (y-axis), we can create a scatterplot. Then, using the relationship between x or y and $f^{(xx)}$, $f^{(xy)}$, $f^{(yx)}$, or $f^{(yy)}$, we can produce a trendline. In short, each scatterplot includes two parts: n scattered points and a straight line. The relationship between x or y and $f^{(y)}$ or $f^{(x)}$ takes the form of scattered points, whereas the relationship between x or y and $f^{(xx)}$, $f^{(xy)}$, $f^{(yx)}$, or $f^{(yy)}$ forms a trendline, which is in fact a regression line. In other words, the plot of $f^{(y)}$ or $f^{(x)}$ vs. x or y presents a set of scattered data points, while the plot of $f^{(xx)}$, $f^{(xy)}$, $f^{(yx)}$, or $f^{(yy)}$ vs. x or y shows a set of ordered data points, which lie on a straight line through the origin. Superimposing the trendline onto the scattered data points yields a scatter diagram for spatial cross-correlation analysis.
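As a rough illustration of how the two layers of one scatterplot can be generated, the sketch below builds the scattered points (x against nWy) and the trendline points (x against xx^T Wy) for a hypothetical five-element system. The data and the function name `scatterplot_series` are assumptions made for illustration only; no plotting library is used, so the output is just the coordinate lists.

```python
import math

def standardize(values):
    """Z-scores based on the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def unitize(contiguity):
    """Zero the diagonal and scale the matrix so all entries sum to 1."""
    n = len(contiguity)
    v = [[0.0 if i == j else float(contiguity[i][j]) for j in range(n)]
         for i in range(n)]
    total = sum(sum(row) for row in v)
    return [[entry / total for entry in row] for row in v]

def scatterplot_series(x, y, w):
    """Return (points, trend, r_c) for the plot with x on the abscissa.

    points: (x_i, n * (W y)_i)      scattered points, f(y) = nWy
    trend:  (x_i, x_i * x^T W y)    trendline, f(xx) = x x^T W y
    """
    n = len(x)
    lag = [sum(wij * yj for wij, yj in zip(row, y)) for row in w]
    r_c = sum(xi * li for xi, li in zip(x, lag))
    points = [(xi, n * li) for xi, li in zip(x, lag)]
    trend = [(xi, r_c * xi) for xi in x]
    return points, trend, r_c

# Hypothetical data (not from the paper).
income = [9.1, 7.4, 3.2, 2.8, 4.0]
urban_share = [0.86, 0.80, 0.45, 0.41, 0.52]
contiguity = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 0, 0, 1, 0],
]

x, y, w = standardize(income), standardize(urban_share), unitize(contiguity)
points, trend, r_c = scatterplot_series(x, y, w)

# Least-squares slope through the origin fitted to the scattered points.
slope = sum(px * py for px, py in points) / sum(px * px for px, _ in points)
print(round(r_c, 4), round(slope, 4))
```

The fitted slope of the scattered points equals the slope of the trendline, which is why the trendline can be read as a regression line whose slope is the global index.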

Geographical meanings of spatial cross-correlation measurements
The geographical meaning of the spatial cross-correlation can be illuminated by clarifying the mathematical relationship between Pearson's correlation coefficient and the SCI. Leaving spatial distance out of account, we can re-express Eqs (9) and (10) as follows

$$R_0 = x^{\mathrm{T}} W_0\, y = \frac{1}{n} x^{\mathrm{T}} E y, \tag{33}$$

$$R_0 = y^{\mathrm{T}} W_0\, x = \frac{1}{n} y^{\mathrm{T}} E x, \tag{34}$$

where $R_0$ is the simple correlation coefficient (SCC), which can be treated as a special case of the SCI, $W_0 = E/n$ represents a unitized identity matrix, which takes the place of the USWM, and E denotes an identity matrix. It can be proved that $R_0$ is just Pearson's correlation coefficient:

$$R_0 = \frac{1}{n}\sum_{i=1}^{n} \frac{(x_i - \mu_x)(y_i - \mu_y)}{\sigma_x \sigma_y}, \tag{35}$$

which indicates a simple cross-correlation between x and y. Based on Eq (35), a partial correlation coefficient can be defined as

$$R_p = R_0 - R_c, \tag{36}$$

where $R_p$ refers to the partial spatial cross-correlation coefficient (PSCC). Now, the meanings of the spatial correlation coefficients can be explained as follows. The SCI, $R_c$, denotes the indirect correlation between x and y through the spatial distances and other elements in a geographical system; the PSCC, $R_p$, represents the direct cross-correlation between x and y, which is free of the spatial distance and other elements; Pearson's correlation coefficient, $R_0$, is a simple cross-correlation coefficient reflecting the summation of spatial correlation, including both the direct cross-correlation and the indirect cross-correlation. The SCI has two functions. First, it presents the indirect correlation between x and y, which is based on spatial distance. Second, using the indirect spatial cross-correlation coefficient, we can estimate the direct cross-correlation coefficient. Thus, the simple spatial correlation, Pearson's correlation, can be separated into two parts: a direct correlation without distance effect and an indirect correlation based on the distance decay effect.
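The decomposition of Pearson's correlation into a direct and an indirect part can be checked numerically. The following is a minimal sketch on hypothetical data; the function names `decompose` and `pearson` are illustrative, and the Pearson coefficient is computed independently from the raw values so that it can be compared with the identity-matrix form of Eqs (33) and (34).

```python
import math

def standardize(values):
    """Z-scores based on the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def unitize(contiguity):
    """Zero the diagonal and scale the matrix so all entries sum to 1."""
    n = len(contiguity)
    v = [[0.0 if i == j else float(contiguity[i][j]) for j in range(n)]
         for i in range(n)]
    total = sum(sum(row) for row in v)
    return [[entry / total for entry in row] for row in v]

def pearson(a, b):
    """Pearson's correlation computed directly from raw values."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    return cov / math.sqrt(sum((p - ma) ** 2 for p in a)
                           * sum((q - mb) ** 2 for q in b))

def decompose(raw_x, raw_y, contiguity):
    """Return (R_0, R_c, R_p) with R_p = R_0 - R_c."""
    x, y, w = standardize(raw_x), standardize(raw_y), unitize(contiguity)
    n = len(x)
    r_0 = sum(xi * yi for xi, yi in zip(x, y)) / n  # x^T (E/n) y
    r_c = sum(x[i] * w[i][j] * y[j] for i in range(n) for j in range(n))
    return r_0, r_c, r_0 - r_c

# Hypothetical data (not from the paper).
income = [9.1, 7.4, 3.2, 2.8, 4.0]
urban_share = [0.86, 0.80, 0.45, 0.41, 0.52]
contiguity = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0],
]

r_0, r_c, r_p = decompose(income, urban_share, contiguity)
print(round(r_0, 4), round(r_c, 4), round(r_p, 4))
```

Replacing the weights matrix with E/n reproduces the ordinary Pearson coefficient, so the only difference between the simple and the spatial index lies in which pairs of elements are weighted.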

Comparison between spatial autocorrelation and cross-correlation
For spatial analysis, autocorrelation and cross-correlation represent two different sides of the same coin. In fact, the concept of autocorrelation comes from the simplest cross-correlation, i.e. the one independent of a time lag or a spatial displacement. The autocorrelation coefficient defined in the 2-dimensional space proceeds from the autocorrelation function defined in the 1-dimensional time or space (Fig 1). The 2-dimensional cross-correlation coefficient is constructed by analogy with the 2-dimensional autocorrelation coefficient, i.e., Moran's index, which was re-expressed in a new mathematical form [7]. A comparison can be drawn between spatial autocorrelation and spatial cross-correlation as shown in Table 2. In short, the spatial autocorrelation is the intra-sample spatial correlation, while the spatial cross-correlation is the inter-sample spatial correlation. The former is based on one size measurement, while the latter is based on two size measurements.
The 2-dimensional spatial correlation analyses, including spatial autocorrelation and spatial cross-correlation, are based on spatial weight matrices. A spatial weight matrix comes from an SCM, which can be generated in at least four ways [6]. For a geographical system with n spatial elements, an SCM can be expressed as $V = [v_{ij}]$, where V denotes the SCM, and $v_{ij}$ is a measure used to compare and judge the degree of contiguity between place i and place j (i, j = 1, 2, ..., n). The elements on the diagonal are zeros; otherwise they must be converted into zeros (i.e., for i = j, $v_{ii} = 0$). A USWM can be defined as $w_{ij} = v_{ij}/T$, where $T = \sum_i \sum_j v_{ij}$ denotes the sum of the SCM entries. Thus, based on the population standard deviation (PSD), the SCI formulae, Eqs (9) and (10), can be written out in full as follows

$$R_c = \frac{n \sum_{i=1}^{n}\sum_{j=1}^{n} v_{ij}(x_i - \mu_x)(y_j - \mu_y)}{T \sqrt{\sum_{i=1}^{n}(x_i - \mu_x)^2}\, \sqrt{\sum_{j=1}^{n}(y_j - \mu_y)^2}}, \tag{37}$$

$$R_c = \frac{n \sum_{i=1}^{n}\sum_{j=1}^{n} v_{ij}(y_i - \mu_y)(x_j - \mu_x)}{T \sqrt{\sum_{i=1}^{n}(y_i - \mu_y)^2}\, \sqrt{\sum_{j=1}^{n}(x_j - \mu_x)^2}}, \tag{38}$$

which bear an analogy with the traditional expression of Moran's index. If our spatial analysis is based on a sample rather than a population (universe), the PSD should be replaced by the sample standard deviation (SSD). In this case, the sample size n in Eqs (37) and (38) should be replaced by the total degrees of freedom (n-1). For the comparability between the spatial cross-correlation index and Moran's index, PSD rather than SSD will be employed to make empirical analyses in the next section.
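The equivalence between the expanded, Moran-like expression and the matrix form can be verified with a short sketch. The data and contiguity values below are hypothetical, and the function names `sci_traditional` and `sci_matrix` are illustrative. The `sample` flag implements the substitution of n by n-1 described above for the sample standard deviation.

```python
import math

def sci_traditional(xs, ys, v, sample=False):
    """Expanded form: scale * sum_ij v_ij dx_i dy_j / (T * |dx| * |dy|).

    scale is n for the population standard deviation and n - 1 for the
    sample standard deviation.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    total = sum(v[i][j] for i, j in pairs)
    cross = sum(v[i][j] * (xs[i] - mx) * (ys[j] - my) for i, j in pairs)
    norm = (math.sqrt(sum((a - mx) ** 2 for a in xs))
            * math.sqrt(sum((b - my) ** 2 for b in ys)))
    scale = n - 1 if sample else n
    return scale * cross / (total * norm)

def sci_matrix(xs, ys, v):
    """Matrix form x^T W y with population z-scores and w_ij = v_ij / T."""
    n = len(xs)

    def zscores(values):
        mean = sum(values) / n
        sd = math.sqrt(sum((a - mean) ** 2 for a in values) / n)
        return [(a - mean) / sd for a in values]

    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    total = sum(v[i][j] for i, j in pairs)
    x, y = zscores(xs), zscores(ys)
    return sum(x[i] * (v[i][j] / total) * y[j] for i, j in pairs)

# Hypothetical data and contiguity measures (not from the paper).
xs = [12.0, 7.5, 3.0, 4.2, 9.8]
ys = [0.9, 0.7, 0.3, 0.5, 0.6]
v = [
    [0.0, 0.5, 0.2, 0.1, 0.4],
    [0.5, 0.0, 0.3, 0.2, 0.1],
    [0.2, 0.3, 0.0, 0.6, 0.2],
    [0.1, 0.2, 0.6, 0.0, 0.3],
    [0.4, 0.1, 0.2, 0.3, 0.0],
]

r_population = sci_traditional(xs, ys, v)
r_sample = sci_traditional(xs, ys, v, sample=True)
r_from_matrix = sci_matrix(xs, ys, v)
print(round(r_population, 4), round(r_from_matrix, 4), round(r_sample, 4))
```

The sample-based value differs from the population-based value only by the factor (n-1)/n, so the choice matters mainly for small systems.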

Study area, measurements, and analytical process
The new framework of spatial cross-correlation can be employed to study the relationship between urbanization and economic development of a country. It has been confirmed that there is correlation between population urbanization and regional economic development [74]. However, the relationship between the cause and effect is not yet clear. The spatial cross-correlation analysis can be used to reveal the causality between urbanization and economic development. As an example, the spatial cross-correlation models and methods will be applied to Mainland China's regions and cities. The spatial objects are the 31 provinces, autonomous regions, and municipalities directly under the Central Government of China and the capital cities of these regions. The level of urbanization is measured by the proportion of urban population to total population in a region, while the level of economic development is measured by the per capita gross regional product (GRP). The distances by train between any two capital cities can be used to quantify the spatial contiguity and to make a spatial weight matrix. The statistical data of urbanization levels and per capita GRP (2000-2013) are available from the website of the National Bureau of Statistics (NBS) of the People's Republic of China (http://www.stats.gov.cn/tjsj/ndsj/), and the railroad distance matrix can be found in many Chinese road atlases (datasets in S1 File). Because Lhasa (2000 to 2006) and Haikou (2000 to 2013) were not connected to the network of Chinese cities by railway, only 29 regions and their capital cities are taken into account, and thus the size of each spatial sample is n = 29 (Table 3).

Table 1. The functional relationships of two pairs of scatterplots defined for spatial cross-correlation analysis.
Plot | Abscissa | Scattered points | Trend line | Meaning
The first plot | x | f(y) = nWy | f(xy) = xy^T Wx | x acts on y
The second plot | x | f(y) = nWy | f(xx) = xx^T Wy | x acts on y
The third plot | y | f(x) = nWx | f(yx) = yx^T Wy | y reacts on x
The fourth plot | y | f(x) = nWx | f(yy) = yy^T Wx | y reacts on x

Table 2. The similarities and differences between spatial autocorrelation and spatial cross-correlation.
Item | Spatial autocorrelation | Spatial cross-correlation
Correlation property | Intra-sample correlation | Inter-sample correlation

According to the theoretical model (Results), the analytical process of spatial cross-correlation comprises three principal steps.
Step 1: global analysis of spatial cross-correlation. The basic measurement is the GSCI, which can be given by Eqs (9) and (10). This step is to examine the sum of spatial cross-correlation between any two regions.
Step 2: local analysis of spatial cross-correlation. The basic measurements are the LSCIs, which can be calculated one by one using Eqs (17) and (18), or processed as batches using Eqs (19) and (20). The two vectors of LSCIs can be visually displayed with a scatterplot. This step is to investigate the spatial cross-correlation between each region and all other regions.
Step 3: explanation of spatial cross-correlation scatterplots. Two pairs of scatterplots can be drawn using Eqs (27) to (32). Among them we need only one pair of scatterplots. Table 1 shows the corresponding relationships between different equations. This step is to illustrate the spatial cross-correlation patterns. The local cross-correlation is reflected by the scattered points, while the global cross-correlation is mirrored by the trend lines.

Calculations and analyses
The new calculation method for Moran's index presented by Chen [7] can be adapted to the spatial cross-correlation coefficients. Based on the standardized vectors x and y and the unitized weights matrix W, the SCI can be computed easily (see the example in S2 File). The method comprises three steps as follows (see the instruction in S3 File).
Step 1: standardize the size variables. In other words, convert the initial vectors X, Y in Eq (1) into the standardized vectors in Eq (5). As indicated above, the PSD instead of the SSD will be employed to standardize the data so that the results are comparable with Moran's index and Pearson's correlation coefficient. The results for 2012 are shown in Table 3.
Step 2: unitize the spatial weight matrix. Using the matrix of railway distances, we can compute the SCM with the distance decay function v(d) = 1/d, where d denotes the railway distance between any two capital cities. Note that the diagonal elements of the matrix should be set to zero. Then unitize the contiguity matrix by dividing each entry by the sum of all entries. The final weights matrix can be characterized with Eqs (7) and (8).
Step 3: compute SCI. According to Eq (9), the USWM is first left multiplied by the transpose of x, and then the product of $x^{\mathrm{T}}$ and W is right multiplied by y; according to Eq (10), the unitized weights matrix is first left multiplied by the transpose of y, and then the product of $y^{\mathrm{T}}$ and W is right multiplied by x. The final product of the continued multiplication yields the value of the SCI, and the two results are numerically equivalent to one another. For example, in 2012, the index of spatial cross-correlation between the level of urbanization and per capita GRP is $R_c = x^{\mathrm{T}} W y \approx 0.1566$, $R_c = y^{\mathrm{T}} W x \approx 0.1566$. The SCI can be separated into LSCIs, which reflect the spatial correlation between a region or city and all other regions or cities. Using Eq (19), we can calculate the first vector of the local spatial correlation coefficient, which reflects the action of x (economic development) on y (urbanization); using Eq (20), we can compute the second vector of LSCI, which reflects the reaction of y (urbanization) on x (economic development). All the results are displayed in Table 3, which shows that the sum of the LSCIs equals the GSCI. The calculations can be carried out with MatLab-based computer programs (two programs in S4 File).
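The three calculation steps can be sketched in a few lines. This is not the MatLab code of S4 File; it is a hedged Python illustration in which the four-city distance matrix, the per capita GRP values, and the urbanization levels are all made-up numbers, and the function names are illustrative.

```python
import math

def standardize(values):
    """Step 1: z-scores based on the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def inverse_distance_weights(distances):
    """Step 2: v_ij = 1 / d_ij off the diagonal, 0 on it, then unitize."""
    n = len(distances)
    v = [[0.0 if i == j else 1.0 / distances[i][j] for j in range(n)]
         for i in range(n)]
    total = sum(sum(row) for row in v)
    return [[entry / total for entry in row] for row in v]

def quadratic_form(a, w, b):
    """Step 3: the continued product a^T W b."""
    n = len(a)
    return sum(a[i] * w[i][j] * b[j] for i in range(n) for j in range(n))

# Hypothetical inputs for four cities (not the data in S1 File).
distances = [
    [0, 150, 900, 1200],
    [150, 0, 800, 1100],
    [900, 800, 0, 300],
    [1200, 1100, 300, 0],
]
grp_per_capita = [8.7, 9.3, 3.1, 2.9]
urbanization = [86.2, 81.6, 45.0, 42.7]

x = standardize(grp_per_capita)
y = standardize(urbanization)
w = inverse_distance_weights(distances)

r_xy = quadratic_form(x, w, y)  # Eq (9)
r_yx = quadratic_form(y, w, x)  # Eq (10)
print(round(r_xy, 4), round(r_yx, 4))
```

Other decay functions, such as the exponential decay mentioned for S2 File, could be substituted in step 2 without changing steps 1 and 3.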
A pair of scatterplots of spatial cross-correlation can be drawn by two approaches. The first approach is to make use of the variables x, y, nWx, nWy, $xx^{\mathrm{T}}Wy$, and $yy^{\mathrm{T}}Wx$. One scatterplot is based on the relationship between x (x-axis) and nWy as well as $xx^{\mathrm{T}}Wy$ (y-axis), which reflects the action of x (per capita GRP) on y (level of urbanization). The relationship between x and nWy gives the scattered points indicative of the first set of LSCIs, while the relationship between x and $xx^{\mathrm{T}}Wy$ yields the trendline indicative of the GSCI (Fig 2A). The other scatterplot is based on the relationship between y (horizontal axis) and nWx as well as $yy^{\mathrm{T}}Wx$ (vertical axis), which reflects the reaction of y (level of urbanization) on x (per capita GRP). The relationship between y and nWx yields the scattered points indicating the second set of LSCIs, while the relationship between y and $yy^{\mathrm{T}}Wx$ gives the trendline indicating the same GSCI (Fig 2B). The second approach is to utilize the variables x, y, nWx, nWy, $xy^{\mathrm{T}}Wx$, and $yx^{\mathrm{T}}Wy$. Compared with the first approach, $xx^{\mathrm{T}}Wy$ is replaced by $xy^{\mathrm{T}}Wx$, and $yy^{\mathrm{T}}Wx$ is replaced by $yx^{\mathrm{T}}Wy$. The results are the same as those from the first approach, and the plots are the same as those displayed in Fig 2 (for 2012).
The dual scatterplots can be used to categorize Chinese cities or regions in terms of spatial cross-correlation. Each spatial cross-correlation plot includes four quadrants that indicate four basic types of geographical elements. Accordingly, Chinese regions can be grouped under four heads. The first quadrant represents the high-high (H-H) type: an element is at a higher level, and its neighbors are also at a higher level; the second quadrant represents the low-high (L-H) type: an element is at a lower level, but its neighbors are at a higher level; the third quadrant represents the low-low (L-L) type: an element is at a lower level, and its neighbors are also at a lower level; the fourth quadrant represents the high-low (H-L) type: an element is at a higher level, but its neighbors are at a lower level. For 2012, the classification result based on the first scatterplot is generally consistent with that based on the second scatterplot (Table 4). Only three regions are uncertain, namely Chongqing (municipality), Heilongjiang (province), and Shandong (province). This suggests that the three regions are at the edges of different types of economic and urban zones. The relationship between the two sets of LSCIs can also be shown by a scatterplot. The plot is a visual aid for categorizing Chinese regions (Fig 3). For 2012, the 29 Chinese regions fall into four groups according to the quadrants of a Cartesian coordinate system. This classification process rearranges the results given by the dual cross-correlation scatterplots. The H-H and L-L types, such as Beijing, Tianjin, and Shanghai, are in the first quadrant; the L-H type, such as Anhui and Hebei, and the H-L type, such as Guangdong, are in the third quadrant; and the uncertain type, including Chongqing, Heilongjiang, and Shandong, is in the second and fourth quadrants (Table 5).
Apparently, the LSCI scatterplot lends further support to the clustering result from the spatial cross-correlation scatterplots.
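The quadrant-based classification can be expressed as a simple rule on the signs of the two coordinates of each scattered point. The sketch below is an illustration on hypothetical data; the function name `classify_quadrants` and the convention of assigning a zero value to the "high" side are assumptions, not choices documented in the paper.

```python
import math

def standardize(values):
    """Z-scores based on the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def unitize(contiguity):
    """Zero the diagonal and scale the matrix so all entries sum to 1."""
    n = len(contiguity)
    v = [[0.0 if i == j else float(contiguity[i][j]) for j in range(n)]
         for i in range(n)]
    total = sum(sum(row) for row in v)
    return [[entry / total for entry in row] for row in v]

def classify_quadrants(x, y, w):
    """Label each element by the quadrant of the point (x_i, n (W y)_i).

    The first letter describes the element itself (sign of x_i); the second
    describes its neighbors (sign of the spatially weighted y).
    """
    n = len(x)
    labels, lags = [], []
    for i in range(n):
        lag = n * sum(w[i][j] * y[j] for j in range(n))
        own = "H" if x[i] >= 0 else "L"
        neighbors = "H" if lag >= 0 else "L"
        labels.append(own + "-" + neighbors)
        lags.append(lag)
    return labels, lags

# Hypothetical data (not from the paper).
income = [9.1, 7.4, 3.2, 2.8, 4.0, 6.5]
urban_share = [0.86, 0.80, 0.45, 0.41, 0.52, 0.48]
contiguity = [
    [0, 1, 0, 0, 0, 1],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 0],
]

x, y, w = standardize(income), standardize(urban_share), unitize(contiguity)
labels, lags = classify_quadrants(x, y, w)
local_indices = [xi * lag / len(x) for xi, lag in zip(x, lags)]
for label, value in zip(labels, local_indices):
    print(label, round(value, 4))
```

Elements labeled H-H or L-L have non-negative local indices, while L-H and H-L elements have non-positive ones, which is consistent with the way the LSCI scatterplot regroups the four types.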
A difficult problem concerning the relationship between urbanization and economic development is to reveal the causality. This problem can be approached by spatial cross-correlation analysis. In the cross-correlation scatterplots, the slopes of the trend lines equal the SCI value. This suggests that we can employ regression analysis based on the least squares method to estimate the SCI using Eqs (21) and (22). Notably, Eq (21) and Eq (22) give the same SCI value ($R_c$), but the values of goodness of fit ($R^2$) are different. If the independent variable is x, the dependent variable will be nWy. For 2012, the SCI value is about $R_c$ = 0.1566, and the coefficient of determination is approximately $R^2$ = 0.3710. The standard error of $R_c$ is about δ = 0.0385. This suggests that the per capita GRP can explain about 37.10% of the spatial change of the level of urbanization. If the independent variable is y, the dependent variable will be nWx. For 2012, the SCI value is still about $R_c$ = 0.1566, but the determination coefficient is approximately $R^2$ = 0.3139. The standard error of $R_c$ is around δ = 0.0438. This suggests that the level of urbanization can explain about 31.39% of the spatial change of the per capita GRP. Note that the intercept should be set to 0, as there is no constant term in the regression equations above.
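The pair of zero-intercept regressions can be sketched as follows. The example uses hypothetical data and assumes that the goodness of fit is the uncentered R² (one minus the residual sum of squares over the raw sum of squares of the dependent variable), which is the usual convention when the intercept is forced to zero; whether the paper's spreadsheet calculation uses exactly this convention is an assumption here. The function name `fit_through_origin` is illustrative.

```python
import math

def standardize(values):
    """Z-scores based on the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def unitize(contiguity):
    """Zero the diagonal and scale the matrix so all entries sum to 1."""
    n = len(contiguity)
    v = [[0.0 if i == j else float(contiguity[i][j]) for j in range(n)]
         for i in range(n)]
    total = sum(sum(row) for row in v)
    return [[entry / total for entry in row] for row in v]

def scaled_lag(w, vec):
    """Return n * W vec, the ordinate of the scattered points."""
    n = len(vec)
    return [n * sum(wij * vj for wij, vj in zip(row, vec)) for row in w]

def fit_through_origin(u, v):
    """Least squares of v on u without intercept: (slope, uncentered R^2)."""
    slope = sum(a * b for a, b in zip(u, v)) / sum(a * a for a in u)
    sse = sum((b - slope * a) ** 2 for a, b in zip(u, v))
    sst = sum(b * b for b in v)  # uncentered total sum of squares
    return slope, 1.0 - sse / sst

# Hypothetical data (not from the paper).
income = [9.1, 7.4, 3.2, 2.8, 4.0, 6.5]
urban_share = [0.86, 0.80, 0.45, 0.41, 0.52, 0.48]
contiguity = [
    [0, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
]

x, y, w = standardize(income), standardize(urban_share), unitize(contiguity)
r_c = sum(xi * li for xi, li in zip(x, scaled_lag(w, y))) / len(x)

slope_xy, r2_xy = fit_through_origin(x, scaled_lag(w, y))  # nWy on x
slope_yx, r2_yx = fit_through_origin(y, scaled_lag(w, x))  # nWx on y
print(round(slope_xy, 4), round(slope_yx, 4), round(r2_xy, 4), round(r2_yx, 4))
```

Both fits return the same slope, equal to the global index, while the two R² values generally differ; it is this asymmetry that the text interprets.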
Different coefficients of determination result in different values of the F statistic, t statistic, and standard error. The F statistics can be used to judge the cause and effect, the t statistics can be utilized to judge the level of confidence of a model's parameter, and the standard errors can be employed to estimate the margin of error of a coefficient. For a regression analysis with a single explanatory variable, the F statistic, t statistic, and parameter standard error (δ) are all functions of $R^2$, and can be computed by the following formulae

$$F = t^2 = \frac{v R^2}{1 - R^2}, \tag{39}$$

$$\delta = R_c \sqrt{\frac{1}{v}\left(\frac{1}{R^2} - 1\right)}, \tag{40}$$

where v refers to the residual degrees of freedom. For our example, because the intercept (constant term) is zero, the degrees of freedom are v = n-1 = 28. Given $R^2$ = 0.3710, it follows that F = 16.5136, t = 4.0637, δ = 0.0385; given $R^2$ = 0.3139, it follows that F = 12.8075, t = 3.5788, δ = 0.0438. Accordingly, the significance for δ = 0.0385 is about p = 0.0004, and that for δ = 0.0438 is about p = 0.0013 (Table 6) (see the example in S2 File). The remaining values can be derived in the same way. In a linear regression analysis, the F statistic indicates the extent to which an independent variable can explain the corresponding dependent variable. For 2012, the action of x on y ($R^2$ = 0.3710, F = 16.5136) is stronger than the reaction of y on x ($R^2$ = 0.3139, F = 12.8075). This seems to suggest that the influence of economic development on urbanization is greater than the impact of urbanization on economic development. The coefficient of simple correlation between the level of urbanization and that of economic development of Mainland China can be decomposed by using the SCI value, and thus we obtain direct correlation coefficients. For example, for 2012, the Pearson correlation coefficient can be calculated with Eq (35), and the result is about $R_0$ = 0.9457. Then, according to Eq (36), the PSCC is about $R_p$ = 0.9457-0.1566 = 0.7891.
A conclusion can be drawn from these values of correlation coefficients that the direct correlation index of the 29 regions is about 0.7891, and the indirect correlation index is about 0.1566. The former has little relation to the distances between different provincial capital cities and can be regarded as intra-group correlation, but the latter is related to spatial interaction of different regions based on distances and can be treated as intergroup correlation. Similarly, the analytical process of spatial cross-correlation can be applied one by one to the other datasets of the years from 2000 to 2013. The parameters include SCC ($R_0$), SCI ($R_c$), and PSCC ($R_p$), and the goodness of fit for the regression analyses of spatial cross-correlation has been computed (Table 6). The related statistics can be evaluated with Eqs (39) and (40). From these calculations, we can get useful spatio-temporal information for China's urbanization and economic development.
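The conversion from R² to the F statistic, t statistic, and standard error can be reproduced with the 2012 figures quoted above. The sketch below applies F = t² = vR²/(1-R²) and δ = R_c[(1/R²-1)/v]^(1/2) with v = 28; the function name `regression_statistics` is illustrative. Small differences from the reported values are expected because the inputs are rounded to four decimals, and the p-values are not recomputed here because the Python standard library has no F distribution function.

```python
import math

def regression_statistics(r_squared, r_c, dof):
    """Return (F, t, delta) for a single-regressor fit.

    F = t^2 = dof * R^2 / (1 - R^2)
    delta = R_c * sqrt((1 / R^2 - 1) / dof), which equals R_c / t
    """
    f_stat = dof * r_squared / (1.0 - r_squared)
    t_stat = math.sqrt(f_stat)
    delta = r_c * math.sqrt((1.0 / r_squared - 1.0) / dof)
    return f_stat, t_stat, delta

R_C_2012 = 0.1566  # SCI for 2012 as quoted in the text
DOF = 28           # n - 1 with n = 29 regions and a zero intercept

# Regression of nWy on x (R^2 = 0.3710) and of nWx on y (R^2 = 0.3139).
f_xy, t_xy, delta_xy = regression_statistics(0.3710, R_C_2012, DOF)
f_yx, t_yx, delta_yx = regression_statistics(0.3139, R_C_2012, DOF)

print(round(f_xy, 2), round(t_xy, 2), round(delta_xy, 4))
print(round(f_yx, 2), round(t_yx, 2), round(delta_yx, 4))
```

Because δ equals R_c divided by t, a larger R² for the same slope always implies a smaller standard error, which is why the two directions of regression can be ranked by any of the three statistics.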
First, there is weak positive spatial cross-correlation between Chinese per capita GRP and the level of urbanization. The SCI values come between 0.0995 and 0.1575. The Pearson correlation coefficients (SCC) range from 0.9142 to 0.9577. Thus the partial correlation coefficients (PSCC) vary from 0.7872 to 0.8147. This suggests that the correlation between urbanization and economic development includes an influence factor from spatial interaction.
Second, the spatial cross-correlation between urbanization and economic development of China became stronger and stronger. The simple correlation is relatively stable, and the SCC values fluctuate around 0.95. However, the SCI values went up and up, while the PSCC values went down gradually (Fig 4). This suggests that the spatial interaction between different regions and cities became more and more significant in the process of spatio-temporal evolution of China's regional systems.
Third, the action of economic development on urbanization is relatively stronger than the reaction of urbanization on economic development. The goodness of fit for the regression of nWy depending on x, $R^2_{(x-y)}$, is always greater than that for the regression of nWx depending on y, $R^2_{(y-x)}$. This suggests that economic development is a cause of urbanization, and urbanization is an effect of economic development. On the whole, the values of both $R^2_{(x-y)}$ and $R^2_{(y-x)}$ rise from 2000 to 2013. This lends support to the inference that the spatial interaction of the 29 regions became more and more significant over time. The absolute and relative growth rates of $R^2_{(y-x)}$ are less than those of $R^2_{(x-y)}$. The relationship between the relative growth rate of $R^2_{(y-x)}$ and that of $R^2_{(x-y)}$ can be shown by the allometric scaling pattern (Fig 5). The allometric exponent of $R^2_{(y-x)}$ depending on $R^2_{(x-y)}$ is about 0.9133, which is less than 1. The $R^2$ values can be converted into F statistics using the hyperbolic function, Eq (39), and we have $F_{(x-y)} > F_{(y-x)}$. The F statistics imply that economic development explains more of urbanization than urbanization explains of economic development. This lends further support to the proposition that the level of urbanization in a geographical region is determined by the level of economic development and in turn reacts to it.

Note (Table 6): $R^2_{(x-y)}$ denotes the goodness of fit for the regression of nWy depending on x, and $R^2_{(y-x)}$ refers to the goodness of fit for the regression of nWx depending on y. Based on the $R^2$ values, the F statistics, t statistics, and parameter standard errors δ can be calculated with the formulae $F = t^2 = (n-1)R^2/(1-R^2)$ and $\delta = R_c[(1/R^2-1)/(n-1)]^{1/2}$. Further, the significance, P, can be obtained using the F distribution function of MS Excel.

Conclusions
This paper is devoted to laying the foundation for development of spatial cross-correlation theory. The basic measurements and analytical methods are put forward and applied to an urban study of China. In terms of technology, two computer programs based on MatLab have been written and provided for readers. On the basis of the theoretical results and empirical analyses, three basic conclusions can be drawn as follows.
First, spatial autocorrelation and spatial cross-correlation can complement one another. Both autocorrelation and cross-correlation analyses can be employed to study the correlation effect of different spatial elements in a regional system or different subsystems within a geographical system. The two methods are different, but they can be combined into an integrated framework. The spatial autocorrelation analysis shows the simultaneous change in value of one random variable, while the spatial cross-correlation analysis displays the simultaneous change in values of two random variables. If we use one variable to measure a number of spatial entities, we can make a spatial autocorrelation analysis; on the other hand, if we use two or more variables to measure a number of spatial entities, we can make both spatial autocorrelation analysis and spatial cross-correlation analysis.
Second, the spatial cross-correlation coefficient represents the indirect relationships between spatial variables. Using the SCI, we can analyze the well-known simple correlation coefficient in spatial statistics. Pearson's correlation between two spatial variables includes two components: direct correlation and indirect correlation. The spatial cross-correlation coefficient reflects the indirect correlation based on the spatial contiguity between any two geographical entities. Pearson's correlation coefficient minus the spatial cross-correlation coefficient leaves the direct correlation coefficient. The direct correlation is actually a kind of partial correlation, which is independent of spatial patterns. In this sense, spatial cross-correlation analysis can reveal the importance of the part played by geographical distances or spatial relationships.
Third, the dual scatterplots of spatial cross-correlation can be used to reveal the causality between two variables visually. Pearson's correlation coefficient and the spatial cross-correlation coefficient can reflect the correlation between two variables, but they cannot distinguish between cause and effect. The scatterplots of spatial cross-correlation can be used to differentiate between the cause and the effect. The spatial cross-correlation plots appear in pairs, and the two plots are asymmetric. Therefore, they can show us which variable is in the leading position and which is in the subordinate position. In scientific research, determining causality may be more important than describing correlation in a system. Moreover, the scatterplots can serve as an auxiliary approach to making a spatial classification of geographical elements.

Supporting Information
S1 File. Datasets of per capita GRP, level of urbanization, and railway distances. This file contains the original or preliminarily processed data used in this paper. (XLSX)
S2 File. The calculation process of SCI for 2012 (example). It provides two complete processes of computing the spatial cross-correlation coefficients based on power-law decay and exponential decay, respectively. (XLSX)
S3 File. An instruction for calculating SCI using MS Excel. It illustrates how to calculate a spatial cross-correlation coefficient step by step using MS Excel. (PDF)
S4 File. Two MatLab programs for spatial cross-correlation analysis. It provides two MatLab programs for calculating spatial cross-correlation coefficients: one is based on the power-law decay function, and the other is based on the exponential-decay function. Readers can employ the programs to carry out spatial cross-correlation analyses by substituting the data with their own data. (M)