Embedding graphs in Lorentzian spacetime

Geometric approaches to network analysis combine simply defined models with great descriptive power. In this work we provide a method for embedding directed acyclic graphs (DAGs) into Minkowski spacetime using multidimensional scaling (MDS). First we generalise the classical MDS algorithm, defined only for metrics with a Riemannian signature, to manifolds of any metric signature. We then use this general method to develop an algorithm which exploits the causal structure of a DAG to assign space and time coordinates in a Minkowski spacetime to each vertex. As in the causal set approach to quantum gravity, causal connections in the discrete graph correspond to timelike separation in the continuous spacetime. The method is demonstrated by calculating embeddings for simple models of causal sets and random DAGs, as well as real citation networks. We find that the citation networks we test yield significantly more accurate embeddings than random DAGs of the same size. Finally we suggest a number of applications in citation analysis, such as paper recommendation, identifying missing citations and fitting citation models to data using this geometric approach.


Introduction
One success of network science has been to identify that some complex systems can be simplified by considering just the topology of the pairwise interactions between their parts. Abstracting a complex system as a graph can bring physical insights and predictive power. Yet these graphs can still be very complicated. Network geometry is an approach which further abstracts the system by modelling the nodes of the network as points in a geometric space.
Most existing approaches use Riemannian spaces, the simplest example of which is Euclidean space. Random Geometric Graphs (RGG) are graphs embedded in Euclidean space [1,2,3].
Recently there has been much interest in geometric approaches to the study of networks in non-Euclidean spaces [4,5,6,7]. Embedding in hyperbolic spaces can yield scale free, clustered networks with community structure, illustrating the remarkable power that geometric approaches have to recover complex network properties.
A well established geometric approach to data analysis is Multidimensional Scaling (MDS), a technique to give data expressed as distances or similarities a spatial representation [8]. In most MDS analysis, the space used for that spatial representation has been Euclidean, and the technique, as usually described, requires a Riemannian manifold, where the triangle inequality is maintained.
MDS has been used in network science to fit models of RGGs to networks from real data, for example from protein interactions [9]. Normally, the MDS algorithm takes as an argument pairwise distances between objects, so when applying it to simple networks, where only binary pairwise relations exist, these distances have to be inferred from the network structure. In the simplest Euclidean case (as in [9]), the shortest path on the network is used as an estimate for the distances between vertices, from which MDS is used to calculate coordinates. Once these coordinates have been calculated, a new RGG can be built from them, and if it is similar to the original graph, the initial geometric assumption was a good one.
In this paper we will consider networks where each node is associated with a particular time and directed edges between nodes represent causal relations. Such a network forms a directed acyclic graph (DAG) [10]. Instead of embedding a network in geometric space alone, the causal ordering of the nodes in a DAG suggests that an embedding in space and time is needed. The causal structure of such a network has the same constraints as the causal structure of spacetime as used in special and general relativity [11,12]. This suggests that the geometries used in relativity, which are pseudo-Riemannian, are the appropriate ones to use because of the special properties of a time dimension. In particular, we will consider Lorentzian spacetimes, a special case of pseudo-Riemannian manifolds in which there is one time dimension with some number of spatial dimensions.
Euclidean space, being flat and isotropic, is the simplest Riemannian manifold. Analogously, flat isotropic spacetime is Minkowski spacetime, the simplest Lorentzian manifold. In this paper, we first generalise classical MDS to allow converting distances into coordinates for pseudo-Riemannian manifolds. We then show how this allows embedding of DAGs in Minkowski spacetime (footnote 1), and that the coordinates of geometric graphs in Minkowski spacetime can be successfully recovered. We then illustrate this technique by finding coordinates for some citation networks, which naturally form DAGs, and suggest applications such as paper recommendation.

Review of Standard MDS
We will begin by briefly summarising the details of standard MDS in Euclidean space. Suppose we have N objects, and are given the squared Euclidean distance, $S_{ij}$, between each pair. We wish to find the coordinates of the objects, which will be D-dimensional vectors, $x_i$ for each object i, such that they fit the constraint that
$$S_{ij} = \sum_{k=1}^{D} (x_i^k - x_j^k)^2 .$$
The classical MDS algorithm solves this problem by using this N × N matrix of squared distances, S, and then constructing the double centred matrix
$$B = -\tfrac{1}{2} J S J ,$$
where $J = I - \tfrac{1}{N} \mathbf{1}\mathbf{1}^T$ is the centring matrix, with I the N × N identity matrix and $\mathbf{1}$ a column vector of N ones.

Footnote 1: Somewhat confusingly, in some of the literature on MDS (particularly in psychology, see for instance Shepard [13]), a Minkowski metric refers to a distance measure of the form $d_{xy} = \left( \sum_i |x_i - y_i|^{\lambda} \right)^{1/\lambda}$. This is not what we mean by a Minkowski metric in this paper; rather we mean the Minkowski spacetime of special relativity [14].
It can then be shown (details are available in [8]) that
$$B = X X^T ,$$
where X is an N × D matrix of coordinate vectors x which satisfy the constraint of recovering the original distances, with the centre of mass of the coordinates at the origin. B is guaranteed to be positive semi-definite (i.e. it has no negative eigenvalues). So we can then find (up to a distance-preserving symmetry transformation) the coordinates in X by decomposing
$$B = U \Sigma U^T ,$$
where Σ is a diagonal matrix of the eigenvalues of B, and U a matrix of its eigenvectors. A solution is given by
$$X = U \Sigma^{1/2} .$$
This process yields coordinates in N dimensions, but only D of the eigenvalues will be non-zero. It is possible to retrieve coordinates in fewer dimensions by using only the largest D eigenvalues and their corresponding eigenvectors. The larger eigenvalues correspond to principal components, meaning that using them as the coordinates minimises the squared difference between the original distances we started with and the ones calculated from these inferred coordinates. These coordinates are in this sense the most accurate D-dimensional representation of the original data.
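The steps of classical MDS summarised above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used for the results in this paper: the function names `classical_mds` and `squared_distances`, the random test points and the clipping of round-off negatives are all assumptions made for the example.

```python
import numpy as np

def squared_distances(X):
    """Matrix of squared Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def classical_mds(S, D):
    """Classical MDS: N x N squared distances S -> N x D coordinates."""
    N = S.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N          # centring matrix J = I - (1/N) 1 1^T
    B = -0.5 * J @ S @ J                         # double centred matrix
    eigvals, eigvecs = np.linalg.eigh(B)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:D]        # indices of the D largest eigenvalues
    sigma = np.clip(eigvals[order], 0.0, None)   # assumption: clip tiny negative round-off
    return eigvecs[:, order] * np.sqrt(sigma)    # X = U Sigma^(1/2)

# Hypothetical test data: 30 random points in 3 dimensions.
rng = np.random.default_rng(0)
points = rng.random((30, 3))
S = squared_distances(points)
X = classical_mds(S, 3)

# The recovered coordinates differ from `points` by a rotation, reflection
# and translation, but they reproduce the squared distances.
max_error = np.abs(squared_distances(X) - S).max()
```

Because the recovered coordinates are only defined up to a distance-preserving transformation, the check compares distances rather than the coordinates themselves.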

Lorentzian MDS
Minkowski spacetime is a combination of a d-dimensional Euclidean space and one time dimension, forming a (d+1)-dimensional manifold. A point i in this space has coordinates $x_i$ consisting of a time coordinate, $x_i^0$, and spatial coordinates $x_i^k$, with k = 1, 2, ..., d. The Minkowski separation between two such spacetime points i and j is given by
$$M_{ij} = -(x_i^0 - x_j^0)^2 + \sum_{k=1}^{d} (x_i^k - x_j^k)^2 . \qquad (4)$$
Pairs of points can then be classified into three types: for a positive separation the pair is spacelike separated, for a negative separation the pair is timelike separated, while exactly zero separation means the pair is lightlike separated. We can now ask the same question that classical Euclidean MDS poses: given pairwise separations $M_{ij}$ for points in this space, can we recover coordinates which respect these separations? Proceeding with the classical Euclidean algorithm we can construct the double centred matrix B as before using $B = -\tfrac{1}{2} J M J$. However we now encounter a problem when decomposing B. Previously the eigenvalues were guaranteed to be non-negative, but now we find one negative eigenvalue, corresponding to the time dimension's negative sign in equation 4. Since we need to take the square root of these eigenvalues, and we want real coordinates, this is a problem.
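As a small worked illustration of the separation and the three types of pair described above, the following sketch computes the separation for a few hand-picked points in 2+1 dimensions. The function names, the tolerance `tol` and the example points are assumptions made for this illustration; the time coordinate is taken to be the first entry of each tuple.

```python
def minkowski_separation(x, y):
    """Minkowski separation between points x and y (time coordinate first)."""
    dt = x[0] - y[0]
    spatial = sum((a - b) ** 2 for a, b in zip(x[1:], y[1:]))
    return -(dt ** 2) + spatial

def classify(separation, tol=1e-12):
    """Classify a pair by the sign of its separation (tol is an assumed tolerance)."""
    if separation < -tol:
        return "timelike"
    if separation > tol:
        return "spacelike"
    return "lightlike"

# Hypothetical points (t, x, y) in 2+1 dimensions.
a = (0.0, 0.0, 0.0)
b = (2.0, 1.0, 0.0)   # separation from a: -4 + 1 = -3
c = (1.0, 2.0, 0.0)   # separation from a: -1 + 4 = +3
d = (1.0, 1.0, 0.0)   # separation from a: -1 + 1 = 0

labels = [classify(minkowski_separation(a, p)) for p in (b, c, d)]
```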
It turns out that the changes required to the classical MDS algorithm are remarkably simple (details are given in appendix A). Instead of looking for a matrix of coordinates X such that $B = X X^T$, we now search for solutions to
$$B = X G X^T ,$$
where G is a matrix representing the metric of the embedding space. For traditional MDS with its Euclidean space, G is just the identity matrix, so this factor drops out from the analysis. However for DAGs, where nodes are also associated with a time coordinate, we choose G to represent the Minkowski metric. In our conventions, this is a diagonal matrix with −1 as its first diagonal entry and +1 as the others. We again decompose B and now need solutions to
$$U \Sigma U^T = X G X^T .$$
The difference to traditional MDS is that the −1 present in G changes the sign of the one negative eigenvalue in Σ. This allows us to take the square root of Σ as we do in classical Euclidean MDS.
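A minimal NumPy sketch of this modification is given below, assuming exact separations as input. It takes the most negative eigenvalue for the time coordinate and the d largest positive eigenvalues for the spatial coordinates, and uses $X = U |\Sigma|^{1/2}$, which follows from the argument in appendix A. The function names and the random test points are assumptions made for the example, not the code used for the results in this paper.

```python
import numpy as np

def minkowski_separations(X):
    """Pairwise Minkowski separations; column 0 of X is the time coordinate."""
    sq = (X[:, None, :] - X[None, :, :]) ** 2
    return -sq[..., 0] + sq[..., 1:].sum(axis=-1)

def lorentzian_mds(M, d):
    """Separations M (N x N) -> coordinates (N x (d+1)), time in column 0."""
    N = M.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N
    B = -0.5 * J @ M @ J                      # double centred matrix
    eigvals, eigvecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    # Column 0: the most negative eigenvalue (time).
    # Remaining columns: the d largest eigenvalues (space).
    keep = np.concatenate(([0], np.arange(N - 1, N - 1 - d, -1)))
    # The -1 in G absorbs the sign of the negative eigenvalue, so the
    # square root is taken of the absolute values.
    X = eigvecs[:, keep] * np.sqrt(np.abs(eigvals[keep]))
    return X, eigvals

# Hypothetical test data: 40 random points in 2+1 dimensional spacetime.
rng = np.random.default_rng(1)
points = rng.random((40, 3))
M = minkowski_separations(points)
X, eigvals = lorentzian_mds(M, d=2)

n_negative = int((eigvals < -1e-9).sum())    # expected: one timelike direction
n_positive = int((eigvals > 1e-9).sum())     # expected: d spacelike directions
max_error = np.abs(minkowski_separations(X) - M).max()
```

With exact separations the spectrum of B has one negative and d positive eigenvalues, and the recovered coordinates reproduce every separation up to round-off. With separations estimated from a graph the remaining eigenvalues are no longer zero, and this selection is only a best approximation.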

Causal Set DAGs and Minkowski spacetime
To use Lorentzian MDS on networks, we require a method of estimating the separations between every pair of nodes from the network structure. We will do this using ideas from causal set theory. A causal set is a locally finite partially ordered set.
In the causal set approach to quantum gravity, the underlying structure of spacetime is postulated to be a causal set. Spacetime is seen as discrete at the Planck scale, and the continuous spacetime we perceive emerges at larger length scales [15,16,17]. The only structures present are the elements of the set and the relation ≺ between pairs of elements. In the correspondence between the discrete causal set and the continuous spacetime that emerges, the relation ≺ corresponds to timelike separation, where x ≺ y corresponds to x being in the past of y, and spacelike separated pairs are not related. Causal sets give us a natural way of discretising spacetime with related elements, and we can use them in our approach because they have the same structure as a DAG. In physics a timelike separated pair are allowed to be causally related, as information could pass from one to the other, from past to future. However a spacelike separated pair can never be causally related, as light could not reach one event from the other. This is why we will associate timelike separation in the continuous spacetime with a link representing a causal relation in the network.
We can now construct the equivalent of an RGG in Minkowski space. We begin by assigning coordinates in $\mathbb{M}^D$ to N points, by sampling uniformly at random from $[0, 1]^D$. We will denote the coordinates of point i by $x_i^\mu$, µ = 0, 1, ..., d, where µ = 0 is the time coordinate. We then construct the causal set graph, $G_D$, by placing an edge from i to j if $M_{ij} < 0$ and directing that edge from the past to the future. The fact that edges in the graph now represent causal relations illustrates why the graph is necessarily a DAG, as closed causal loops are forbidden.
In all cases (figure 1), one large negative eigenvalue is seen, corresponding to the one timelike dimension. For the causal sets, this timelike dimension is the one time dimension in the Minkowski spacetime they are embedded in. For the random DAG, this corresponds to the time ordering which could be created as a consequence of the acyclic property of this graph. We observe d large positive eigenvalues corresponding to the spatial dimensions in each causal set, illustrating that the coarse graining of the causal set does not prevent the MDS algorithm from successfully identifying the principal components of the space. For a given number of points, higher dimensionality will mean fewer relations between the causal set's elements, and since these relations are the information used by the MDS algorithm, its ability to cleanly pick out d dimensions diminishes as d increases. However for the random DAG there is no clear separation of large eigenvalues, suggesting there is no natural dimension to an embedding Minkowski spacetime.
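The sprinkling construction of the causal set graph described above can be sketched as follows, using only the Python standard library. The function name `sprinkle_causal_set`, the fixed seed and the choice of 50 points in 1+1 dimensions are assumptions made for this illustration.

```python
import random

def sprinkle_causal_set(n_points, dim, seed=0):
    """Sample points uniformly in [0, 1]^dim (time coordinate first) and
    link every timelike separated pair from past to future."""
    rng = random.Random(seed)
    coords = [tuple(rng.random() for _ in range(dim)) for _ in range(n_points)]
    edges = set()
    for i, x in enumerate(coords):
        for j, y in enumerate(coords):
            if i == j:
                continue
            dt = y[0] - x[0]
            spatial = sum((a - b) ** 2 for a, b in zip(x[1:], y[1:]))
            separation = -(dt ** 2) + spatial
            # Edge i -> j only if the pair is timelike and j is later than i.
            if separation < 0 and dt > 0:
                edges.add((i, j))
    return coords, edges

# Hypothetical example: 50 points in 1+1 dimensions.
coords, edges = sprinkle_causal_set(50, 2)
```

Every edge points forward in time, so no directed cycle can exist. The resulting edge set is also transitively closed, since any point in the future of j is in the future of every point in the past of j.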

Embedding Graphs using Lorentzian MDS
To use MDS on a network we must estimate the separation matrix S using the network's structure. For Euclidean MDS, the separation is always a non-negative number and the shortest path between nodes in the graph is a natural and effective estimator for the distance. However in Minkowski spacetime the separation of points is not always positive, so the number of steps along some path is not going to be a measure of all Lorentzian separations. The solution is to estimate spacelike and timelike separations separately when studying a DAG.
Suppose we have two connected nodes in the graph, meaning they are timelike separated. It was conjectured in [18] and later shown in [19] that for timelike separated points i and j in $G_D$ the length $L_{ij}$ of the longest path between them is proportional to their timelike separation (in the limit of long separations, and where the longest path must respect the edges' direction). So in this case we set
$$S_{ij} = -L_{ij}^2 .$$
Finding the distance between spacelike pairs is more challenging and to our knowledge there is no solution as easily calculated as the longest path is for timelike pairs. Good approximations are known however, and we will use a very simple one, described in [20,21] as 'naive spatial distance'. Suppose we have two disconnected points i and j in $G_D$, meaning they are spacelike separated. We then look for a pair of nodes, k and l, where k is in the future of both nodes i and j while l is in the past of both i and j. We then choose nodes k and l so as to minimise the length of the longest path between k and l (which are necessarily connected, via paths through i and j). The timelike separation between k and l is then used as an estimate for the spacelike separation between i and j. Figure 2 shows an example of how we estimate timelike and spacelike separations.
This estimate is simple and at first appealing, but fails in more than two dimensions for large graphs (hence the 'naive' in the name). Nonetheless we find it is sufficient for our purposes. This is partly because it is inaccurate only for large causal sets, but also because in the MDS algorithm each point's coordinates are fixed by many distances, both timelike and spacelike, which limits the effect of noise from some poor estimates of spacelike separations. We can then set $S_{ij} = L_{kl}^2$ for the chosen k and l. These timelike and spacelike distances define our separation matrix S (where timelike separation has the − sign in our conventions). Finally we use the algorithm described in the previous section to assign coordinates in some D-dimensional Minkowski spacetime $\mathbb{M}^D$ to each vertex in the DAG.
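The two estimates described above, the longest path for timelike pairs and the naive spatial distance for spacelike pairs, can be sketched with the standard library alone. This is a simple illustration for small graphs rather than an efficient implementation. The function names and the seven-node example graph are hypothetical, and returning `None` for a spacelike pair with no common past or future is an assumption, since the text does not specify that case.

```python
from functools import lru_cache

def longest_paths(nodes, edges):
    """L[(i, j)] = number of edges on the longest directed path from i to j."""
    successors = {v: [] for v in nodes}
    for u, v in edges:
        successors[u].append(v)

    @lru_cache(maxsize=None)
    def longest(u, target):
        if u == target:
            return 0
        lengths = [longest(w, target) for w in successors[u]]
        lengths = [n + 1 for n in lengths if n is not None]
        return max(lengths) if lengths else None   # None: target unreachable

    L = {}
    for i in nodes:
        for j in nodes:
            if i != j and longest(i, j) is not None:
                L[(i, j)] = longest(i, j)
    return L

def estimate_separations(nodes, edges):
    """Separation estimates S[(i, j)]: -L^2 for timelike pairs, +L_kl^2 for
    spacelike pairs via the naive spatial distance."""
    L = longest_paths(nodes, edges)
    S = {}
    for i in nodes:
        for j in nodes:
            if i == j:
                S[(i, j)] = 0
            elif (i, j) in L or (j, i) in L:
                length = L.get((i, j), L.get((j, i)))
                S[(i, j)] = -(length ** 2)
            else:
                # l in the common past, k in the common future of i and j;
                # take the pair with the shortest longest-path l -> k.
                candidates = [L[(l, k)] for l in nodes for k in nodes
                              if (l, i) in L and (l, j) in L
                              and (i, k) in L and (j, k) in L]
                S[(i, j)] = min(candidates) ** 2 if candidates else None
    return S

# Hypothetical DAG: a chain A -> B -> C -> D -> E -> F plus a branch A -> G -> F.
nodes = ["A", "B", "C", "D", "E", "F", "G"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"),
         ("A", "G"), ("G", "F")]
S = estimate_separations(nodes, edges)
```

In this example the longest path from A to F has 5 edges, so S for (A, F) is −25. B and G are unrelated, their only common past is A and their only common future is F, so their naive spatial separation is also 5 units and S for (B, G) is +25.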

Testing Lorentzian MDS on Causal Sets
Given a causal set graph, $G_D$, it is possible, in principle and for large enough $N = |G_D|$, to recover all properties of the spacetime (up to a factor of the density of the sprinkling) [11,22].
In Minkowski spacetime there is only one parameter to recover, D, and this can be estimated for the process described above and for DAGs in general [18,23,6]. Our task here is not to recover properties of the manifold in which the nodes are embedded, but to find the full details of that embedding.
If the graph was originally made by sprinkling points into $\mathbb{M}^D$ then we know that an exact embedding is possible (since the original sprinkled coordinates must be a solution), and so the embedding algorithm should approximately recover the original coordinates (up to distance-preserving transformations). If the graph was not, then we may only be able to find an approximate embedding.
As is the case for classical MDS, our Lorentzian MDS is guaranteed to recover the coordinates of points when the exact distances are used as the algorithm's input. However, when distances are estimated using graph topology, the pairwise separations will be noisy as they are coarse grained by the discrete graph (see figure 3). To assess the reliability of this algorithm we will test it first on the causal set model described above. We will take the coordinates produced by the Lorentzian MDS algorithm and use them to rebuild the graph by again placing edges only between timelike separated pairs. If the overlap between the edges in the recreated graph and those in the original graph is high, the embedding is an accurate one, and similarly if the overlap is low the embedding is poor. As in [9] we will measure this using the sensitivity (the fraction of the correct edges which were predicted) and specificity (the fraction of correct non-edges which were predicted).
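The two overlap measures can be computed directly from the edge sets of the original and rebuilt graphs. The sketch below is illustrative only: the function name and the toy edge sets are hypothetical, and treating every ordered pair of distinct nodes as a possible edge is an assumption about how non-edges are counted.

```python
def sensitivity_specificity(nodes, true_edges, predicted_edges):
    """Sensitivity: fraction of true edges that were predicted.
    Specificity: fraction of true non-edges that were predicted as non-edges."""
    true_edges = set(true_edges)
    predicted_edges = set(predicted_edges)
    # Assumption: every ordered pair of distinct nodes is a candidate edge.
    all_pairs = {(i, j) for i in nodes for j in nodes if i != j}
    non_edges = all_pairs - true_edges
    true_positives = len(true_edges & predicted_edges)
    true_negatives = len(non_edges - predicted_edges)
    sensitivity = true_positives / len(true_edges)
    specificity = true_negatives / len(non_edges)
    return sensitivity, specificity

# Hypothetical example with four nodes.
nodes = [0, 1, 2, 3]
true_edges = {(0, 1), (1, 2), (0, 2)}
predicted_edges = {(0, 1), (0, 2), (2, 3)}   # one missed edge, one spurious edge
sensitivity, specificity = sensitivity_specificity(nodes, true_edges, predicted_edges)
```

Here two of the three true edges are predicted, giving a sensitivity of 2/3, and eight of the nine true non-edges are correctly left unconnected, giving a specificity of 8/9.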

Citation Networks in Minkowski spacetime
Our MDS algorithm is able to find accurate embeddings for randomly sprinkled causal sets. We can now attempt to embed networks formed from real social systems, and here we will use citation networks from the arXiv (2003 KDD cup datasets) and the US Supreme Court [24], as well as random DAGs for comparison. Recall that the dimensionality of the embedding is something we can choose, by selecting the D largest eigenvalues in the MDS algorithm. To measure the effectiveness of the embedding we will compare the original network with a new network generated from the coordinates determined by the MDS algorithm. We want to measure the effectiveness of a classifier which predicts edges in the network from the MDS coordinates. To do this we will use the established method of the area under the receiver operating characteristic (ROC) curve, the AUC. Varying a continuous parameter, the sensitivity and specificity of the embedding are measured and plotted, as in figure 5, and the area under this curve describes the quality of the classifier. The continuous parameter we will vary is the speed of light (or the speed at which information can be transferred) in the Minkowski space, c. Previously, we have set c = 1, but varying this speed will change which nodes are connected in the new network generated from the MDS coordinates. Now, nodes i and j are connected if their coordinates satisfy
$$-c^2 (x_i^0 - x_j^0)^2 + \sum_{k=1}^{d} (x_i^k - x_j^k)^2 < 0 .$$
For small values of c, very few nodes are connected and so the specificity is high (few false positives) but the sensitivity is low (many false negatives). For large values of c, many nodes are connected and so the reverse is true. We will measure the ease of embedding a network by taking the mean of the area under this curve for networks of size 100-500. In the case of the citation networks this was done by randomly sampling intervals of this size from the citation network. In the cases of the causal set graph and the random DAGs, many instances of the required sizes are stochastically generated.
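The procedure of sweeping c and measuring the area under the resulting curve can be sketched as below, using only the standard library. Several choices here are assumptions made for the illustration rather than details taken from this paper: connections are treated as unordered pairs, the ROC curve is taken in the usual convention of sensitivity against 1 − specificity, the end points (0, 0) and (1, 1) are added for the limits of very small and very large c, and the area is computed with the trapezium rule. The function names and test coordinates are hypothetical.

```python
import random

def predict_pairs(coords, c):
    """Unordered pairs (i, j), i < j, that are timelike separated when the
    speed of light is c. The time coordinate is the first entry."""
    pairs = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dt = coords[i][0] - coords[j][0]
            spatial = sum((a - b) ** 2
                          for a, b in zip(coords[i][1:], coords[j][1:]))
            if -((c * dt) ** 2) + spatial < 0:
                pairs.add((i, j))
    return pairs

def roc_auc(coords, true_pairs, speeds):
    """Area under the curve of sensitivity against (1 - specificity),
    traced out by varying c over `speeds`."""
    n = len(coords)
    all_pairs = {(i, j) for i in range(n) for j in range(i + 1, n)}
    non_pairs = all_pairs - true_pairs
    curve = [(0.0, 0.0), (1.0, 1.0)]         # assumed limits c -> 0 and c -> infinity
    for c in speeds:
        predicted = predict_pairs(coords, c)
        sensitivity = len(predicted & true_pairs) / len(true_pairs)
        false_positive_rate = len(predicted & non_pairs) / len(non_pairs)
        curve.append((false_positive_rate, sensitivity))
    curve.sort()
    # Trapezium rule over consecutive points on the curve.
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))

# Hypothetical check: coordinates that embed the graph exactly at c = 1.
rng = random.Random(2)
coords = [(rng.random(), rng.random()) for _ in range(30)]
true_pairs = predict_pairs(coords, 1.0)
auc = roc_auc(coords, true_pairs, speeds=[0.25, 0.5, 1.0, 2.0, 4.0])
```

When the coordinates reproduce the graph exactly at c = 1, the curve passes through the top-left corner and the area is 1. Noisy coordinates pull the curve away from that corner and reduce the area.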

Discussion and Applications
Finding a good geometric embedding of a network provides a powerful tool for the analysis of that network as it allows standard geometric techniques and intuition to be used. Calculations of network properties can be made more efficient, for example, when finding optimal routes from one node to another, the node coordinates provide local information which can improve routing algorithms [25]. Coordinates resulting from geometric embedding also provide a natural visualisation for a network. Such visualisations are used in bibliometrics to help identify distinct fields or assist literature reviews [26].
In the case of citation analysis, where we conjecture that the spatial dimensions that result from a geometric approach correspond to similarity in the topic of a paper, our approach yields spatial similarities between papers while accounting for the time difference in their publication. Once embedding coordinates are known, the idea that nodes may be 'similar' can be expressed as nodes being close by using an appropriate metric, which need not be the Lorentzian one we used in the construction of the coordinates. Two nodes might be spacelike separated, so have no direct links, yet be close in a Euclidean sense because MDS calculates coordinates globally using information from all vertices and edges. Papers could be recommended if spatially close to others a reader has interest in, even if there are no local connections between them, potentially bringing work or authors to the attention of readers who are not aware of them. Spatial similarity can also be used to define clusters. The idea of 'centrality' or importance of a node has a natural representation in terms of the density of points in the geometric neighbourhood of the point linked with the chosen node. Another use of this approach is where edges in a network are placed primarily according to some geometric rule but their connections are also governed by some smaller second order effect. It may only be possible to measure the smaller effect once we have accounted for the primary geometric one by assigning coordinates using Lorentzian MDS or a similar method. We can see this effect clearly when the geometric embedding is one in real geographic space, such as in [27], where accounting for geographic distance in phone-call data allows more accurate prediction of the second order effect of shared language.
Other approaches to geometric embedding exist in the literature. MDS is characterised by maintaining global separations between pairs. Others, such as Isomap [28,29], maintain shortest paths between pairs linked by local interactions. That approach may not be as appropriate in pseudo-Riemannian geometries. Firstly, the shortest path locally may not correspond to the geodesic distance like it does in Euclidean space; as discussed above, in graphs embedded in Minkowski space it is the longest path which corresponds to the geodesic. Secondly, the idea of local neighbours is less clearly defined if there are different types of separation, or if, as is the case for Minkowski space, the number of nearest neighbours diverges. Another class of embedding approaches are probabilistic, such as Stochastic Neighbour Embedding [30]. Although it is beyond the scope of this work, we do not see why such approaches could not be adapted to pseudo-Riemannian manifolds, and the ability to use a mixture of separated images for the same object may prove very useful.

Finally we note that inserting a metric signature into the equations for classical MDS allows it to be used on any metric signature, even though we have focused only on the Lorentzian signature here. To our knowledge this pseudo-Riemannian output is a new development, although some manifold learning techniques exist which can take pseudo-Riemannian manifolds as their input [31,32]. We note that when performing the Lorentzian network MDS algorithm we often find multiple negative eigenvalues, suggesting that embeddings in spaces with more than one timelike dimension are also possible, as are potential embeddings into Lorentzian manifolds other than Minkowski space, incorporating curvature or preferred directions.

Appendix A
For any metric, B is symmetric and so can be decomposed into orthogonal eigenvectors, U, and eigenvalues, Σ:
$$B = U \Sigma U^T .$$
We aim to find a solution to the equation $B = X G X^T$. Trying X = UD, where D is some real diagonal matrix, gives
$$U \Sigma U^T = U D G D^T U^T .$$
Assuming that the metric G is diagonal we then have that $\Sigma_i = D_i^2 G_i$ and, since D is real, the signs of the elements of Σ must equal those of G.

Figure 1 shows the distribution of the eigenvalues Σ for causal sets and random DAGs.

Figure 2: The timelike separation between nodes A and F is approximated as 5 units, as this is the number of edges in the longest direction-respecting path between them. Nodes B and G are spacelike separated. To estimate this separation we find a pair of points in their mutual past and future. In this case, the only such pair is (A, F). The naive spatial separation between (B, G) is then given by the timelike separation between (A, F), so is also in this case 5 units. Note, only the edges not implied by transitivity have been drawn.

Figure 3: 20 points of a 200 point causal set (left) and its embedding after the MDS algorithm (right). The similarity between the two plots shows the success of the embedding algorithm in finding the points' coordinates using only the edges of the graph.

Figure 4: A D = 1+1 embedding of the top 200 most cited papers in the hep-th citation network. Node size is proportional to the number of citations, and lines correspond to citations amongst these papers. A large group of papers is visible in the middle of the plot, forming a long chain of citations, as well as some more isolated papers on either side. A small number of spacelike citations are visible (those edges more than 45 degrees from vertical) because this two-dimensional embedding is not perfect, but only the optimal set of coordinates found by the MDS algorithm.

Figure 5: Sensitivity and specificity are measured for various c values for D = (2 + 1) dimensional embeddings of 5 networks. Where the curve reaches the top-left of the plot we are in the c ≈ 1 regime (c = 1 denoted by the large black point on each curve), where the trade-off between false positives and false negatives is balanced. The shape of these ROC curves measures the effectiveness of the embedding and it is clear that the causal set performs best, the random DAG worst and the citation networks in between.

Figure 6: The AUC for 5 types of DAG, embedded in Minkowski spacetime of various dimensions. In this plot, the causal set graphs are created from sprinkling uniformly into a (d + 1)-dimensional spacetime, then embedded back into that same dimensional spacetime. Their AUC values therefore represent the ideal case, where we know an embedding is possible. The random DAGs show the worst embeddings. Their AUC values represent the success of an embedding for a graph which has no structure. They are noticeably more embeddable in higher dimensions, which is because there are more degrees of freedom in which to assign coordinates while maintaining the randomly placed links. The bars show means and standard deviations of the AUC for random DAGs of size N = 100 to N = 500. Citation networks from the arXiv and the US Supreme Court fall between these extremes, illustrating the presence of some structure which makes them easier to embed in spacetime. We sampled 250 intervals with between 100 and 500 nodes randomly from each citation network, and the bars show means and standard deviations for the AUC in each case.