Hypergraph partitioning using tensor eigenvalue decomposition

Hypergraphs have gained increasing attention in the machine learning community lately due to their superiority over graphs in capturing super-dyadic interactions among entities. In this work, we propose a novel approach for the partitioning of k-uniform hypergraphs. Most of the existing methods work by reducing the hypergraph to a graph and then applying standard graph partitioning algorithms. The reduction step restricts the algorithms to capturing only some weighted pairwise interactions and hence loses essential information about the original hypergraph. We overcome this issue by utilizing a tensor-based representation of hypergraphs, which enables us to capture actual super-dyadic interactions. We extend the notion of minimum ratio-cut and normalized-cut from graphs to hypergraphs and show that the relaxed optimization problem can be solved using eigenvalue decomposition of the Laplacian tensor. Unlike the existing reduction approaches, this novel formulation also enables us to remove a hyperedge completely by using the "hyperedge score" metric that we propose. We propose a hypergraph partitioning algorithm inspired by spectral graph theory and also derive a tighter upper bound on the minimum positive eigenvalue of the even-order hypergraph Laplacian tensor in terms of its conductance, which is utilized in the partitioning algorithm to approximate the normalized cut. The efficacy of the proposed method is demonstrated numerically on synthetic hypergraphs generated by the stochastic block model. We also show an improvement in the min-cut solution on 2-uniform hypergraphs (graphs) over the standard spectral partitioning algorithm.


Introduction
In machine learning, interacting systems are often modeled as graphs. In graph modeling, an interacting object is represented as a node, and an edge captures the interaction between a pair of objects. A conventional approach is to quantify the extent of interaction by associating a positive weight with the corresponding edge. This graph formulation is further utilized for various standard machine learning applications like clustering [1] and semi-supervised learning [2]. While a graph representation is limited to capturing only pairwise interactions, many real-world systems involve interactions that are more complex than this simple pairwise formulation [3]. For instance, a collaboration network may involve agents interacting at a group level (also called super-dyadic interactions), which cannot be captured by modeling the system as a graph.
Recently, hypergraphs have been used to represent and analyze such complex super-dyadic relationships. Hypergraphs are generalizations of graphs where an edge could potentially connect multiple nodes. These edges are commonly referred to as hyperedges. A k-uniform hypergraph refers to the case when all hyperedges are constrained to contain exactly k nodes. Hypergraph partitioning has been used in a variety of applications in several domains, such as VLSI placement [4], object segmentation in videos [5], and citation networks [6].
Existing hypergraph modeling frameworks can be classified into two paradigms, based on whether they reduce the hypergraph to a graph explicitly [7] or implicitly. These reduction-based approaches are quite popular in the machine learning community due to their scalability to large datasets [8,9] and the provable performance guarantees of graph-based algorithms [10]. Thus, most of the existing approaches make use of hypergraph reduction in order to utilize standard graph-based algorithms, which defeats the motivation behind using hypergraphs. As graphs are limited to capturing only dyadic interactions, the reduction-based approaches fail to model the desired super-dyadic relationships.
An exciting aspect of hypergraph partitioning is that a hyperedge can be cut in multiple ways, unlike the case of an edge in graphs: the nodes of a hyperedge can be split across partitions in different configurations. Most of the existing reduction-based partitioning methods do not differentiate these configurations and penalize them equally, which, ideally, should not be the case. In fact, Ihler et al. [11] show that the reduction-based approaches cannot model a hypergraph cut, i.e., the complete removal of a hyperedge from a given hypergraph.
On the other hand, tensors have gained increasing attention for modeling hypergraphs, primarily in the mathematics community. For instance, Hu et al. [12] extended to uniform hypergraphs the fundamental theorem of spectral graph theory that relates the multiplicity of the zero eigenvalue of the graph Laplacian to the number of connected components. Specifically, they proved that the algebraic multiplicity of the zero eigenvalue of a symmetric Laplacian tensor is equal to the sum of the number of even-bipartite connected components and the number of connected components, excluding the number of singletons in the given hypergraph. Such insights cannot be revealed from the clique reduction methods and their variants [7]. In the machine learning community, the tensor representation of hypergraphs has not gained much attention, except for a few works [13,14]. In this work, we utilize the tensor representation of hypergraphs for detecting densely connected components.

Our Contributions & Outline
The preliminaries of hypergraph reduction to graphs and tensor representation are covered in Section 2. We make the following contributions in this work:
• In Section 2.3, we utilize the tensor representation of hypergraphs and prove that hypergraph reduction is a special case of tensor contraction. We propose the ratio-cut and normalized-cut objective functions in Section 3.2, which are capable of distinguishing the multiple ways of cutting a hyperedge.
• In Section 3.3, we prove that the solution to the relaxed ratio-cut minimization problem can be obtained from the eigenvector corresponding to the minimum positive eigenvalue of the Laplacian tensor. We also derive an upper bound on the minimum positive eigenvalue in terms of the conductance of the hypergraph, which is a significant improvement over the existing bounds.
• In Section 3.4, we exploit the structure of the Laplacian tensor to reduce the objective function computation time from $O(n^k)$ to $O(m)$, where $n$, $m$, $k$ are the number of nodes, the number of hyperedges, and the cardinality of the hyperedges, respectively.
In Section 4, we demonstrate the efficacy of the proposed algorithm on synthetic hypergraphs generated by the stochastic block model. We also report an $n/8$-fold improvement of the ratio-cut over conventional spectral partitioning for the cockroach graph.

Preliminaries
In this section, we briefly discuss the prevalent approach to representing hypergraphs and their partitioning. A hypergraph $G$ is defined as a pair $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of entities called vertices or nodes and $E = \{e_1, e_2, \ldots, e_m\}$ is a set of non-empty subsets of $V$ referred to as hyperedges. The strength of interaction among nodes in the same hyperedge is quantified by a positive weight, $w_e = \{w_{e_1}, w_{e_2}, \ldots, w_{e_m}\}$.
The vertex-edge incidence matrix is denoted by $H$ and has dimension $|V| \times |E|$. The entry $h(i, j)$ is defined to be 1 if $v_i \in e_j$ and 0 otherwise. The degree of node $v_i$ is defined by $d(v_i) = \sum_{e_j \in E} w_{e_j} h(i, j)$. We can also define two diagonal matrices, $W$ and $D$, of dimensions $m \times m$ and $n \times n$, containing the hyperedge weights and node degrees, respectively. Note that up to this point there is no loss of information in this representation of hypergraphs: a unique hypergraph can be reconstructed from a given incidence matrix.
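As a concrete illustration, the following minimal sketch builds $H$, $W$, and the degrees from a hyperedge list; the small hypergraph and all variable names are illustrative, not taken from the paper.

```python
import numpy as np

# A small 3-uniform hypergraph used only for illustration.
hyperedges = [{0, 1, 2}, {1, 2, 3}, {2, 3, 4}]
weights = [1.0, 2.0, 1.0]
n, m = 5, len(hyperedges)

# Vertex-edge incidence matrix H: h(i, j) = 1 iff node i belongs to hyperedge j.
H = np.zeros((n, m))
for j, e in enumerate(hyperedges):
    for i in e:
        H[i, j] = 1.0

W = np.diag(weights)                 # m x m diagonal matrix of hyperedge weights
degrees = H @ np.array(weights)      # d(v_i) = sum_j w_{e_j} h(i, j)
D = np.diag(degrees)                 # n x n diagonal matrix of node degrees
```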

Reducing a Hypergraph to a Graph
Now, we discuss the widely accepted approach for hypergraph reduction in the machine learning community. The fundamental idea is to reduce a hypergraph to a graph and subsequently apply standard graph-based algorithms. In this subsection, we briefly discuss the merits and demerits of these approaches and articulate the reasons for choosing the tensor-based representation of hypergraphs.

Definition 1. The clique expansion of a hypergraph is the graph obtained by replacing each hyperedge with the corresponding clique. The same can be stated in matrix form as
$$A = H W H^{\top} - D, \qquad (1)$$
where $A$ represents the adjacency matrix of the reduced hypergraph. Another traditional hypergraph reduction approach is star expansion [15]. Most of the other reduction approaches are built on these two; please see [7] and the references therein for more details. The reduction step is very convenient, as we can then employ graph learning algorithms that scale well and come with theoretical guarantees. A natural question arises on the need for these different reduction-based approaches. We believe that each reduction approach preserves a few, but not all, hypergraph properties in the reduction step. The preserved property may be useful for the end task of learning on hypergraphs. For example, [16] demonstrates an improvement in clustering by proposing a reduction approach that preserves the node degrees.
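A minimal sketch of this reduction, assuming the matrix form in Eq. (1) and reusing `H` and `W` from the previous snippet:

```python
# Clique expansion: nodes i and j are connected iff they co-occur in a hyperedge,
# with edge weight equal to the sum of the weights of the shared hyperedges.
A = H @ W @ H.T
np.fill_diagonal(A, 0.0)   # the diagonal of H W H^T holds the node degrees; zero it out

# Two different (H, W) pairs can produce the same A, which is exactly the
# information loss discussed below.
```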
More often, the reduction step loses vital information about the hypergraph, since two different hypergraphs can reduce to the same graph. This can be seen directly from Eq. (1): two distinct hypergraphs with different $H$ and $W$ can reduce to the same adjacency matrix $A$. An illustrative example is presented in Appendix B.1.
We discuss the demerits of reduction-based approaches in the context of hypergraph partitioning in the later sections. In Section 2.3, we use the tensor representation of hypergraphs and show that reduction is just a special case of tensor contraction.

Tensor representation of Hypergraphs
In this subsection, we briefly review the tensor-based representation of hypergraphs [17,18]. A natural representation of a k-uniform hypergraph is a k-order, n-dimensional tensor $A$, which consists of $n^k$ entries and is defined by
$$a_{i_1 i_2 \ldots i_k} = \begin{cases} \dfrac{w_e}{(k-1)!} & \text{if } \{i_1, i_2, \ldots, i_k\} = e \in E, \\ 0 & \text{otherwise.} \end{cases} \qquad (2)$$
It should be noted that $A$ is a "super-symmetric" tensor, i.e., $a_{i_1 i_2 \ldots i_k} = a_{\sigma(i_1 i_2 \ldots i_k)}$, where $\sigma(i_1, i_2, \ldots, i_k)$ denotes any permutation of the elements in the set $\{i_1, i_2, \ldots, i_k\}$. The order or mode of the tensor refers to the hyperedge cardinality, which is $k$ for $A$. The degrees of all the vertices can be represented by a k-order, n-dimensional diagonal tensor $D$ with $d_{ii\ldots i} = d(v_i)$. The Laplacian tensor $L$ is defined as
$$L = D - A. \qquad (3)$$
An example demonstrating the tensor representation of a 4-uniform hypergraph is presented in Appendix B.2. The normalized Laplacian tensor, denoted by $\mathcal{L}$, can also be defined in a similar manner by normalizing the entries with the node degrees (Eq. (4)). For the sake of completeness, we define the tensor eigenvalue decomposition as
$$L x^{k-1} = \lambda x, \qquad x^{\top} x = 1, \qquad (5)$$
where $(L x^{k-1})_i = \sum_{i_2, \ldots, i_k = 1}^{n} l_{i\, i_2 \ldots i_k} x_{i_2} \cdots x_{i_k}$. In the next section, we discuss the merits of choosing the tensor representation of hypergraphs over reductions to graphs.
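For illustration, the sketch below materializes the dense adjacency and Laplacian tensors for the small 3-uniform example above, assuming the $1/(k-1)!$ normalization of Eq. (2); this is viable only for tiny instances, since the tensor has $n^k$ entries.

```python
from itertools import permutations
from math import factorial

k = 3
A_t = np.zeros((n,) * k)       # k-order, n-dimensional adjacency tensor
for e, w in zip(hyperedges, weights):
    for idx in permutations(e):            # all k! orderings of the hyperedge
        A_t[idx] = w / factorial(k - 1)    # super-symmetric entries

D_t = np.zeros((n,) * k)       # diagonal degree tensor: d_{ii...i} = d(v_i)
for i in range(n):
    D_t[(i,) * k] = degrees[i]

L_t = D_t - A_t                # Laplacian tensor, Eq. (3)
# Sanity check: contracting A_t along k-1 modes recovers the node degrees.
assert np.allclose(A_t.sum(axis=tuple(range(1, k))), degrees)
```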

Hypergraph Reduction using Tensor Contraction
Agarwal et al. [7] propose to unify various hypergraph reduction methods, such as star expansion [15], Bolla's Laplacian [19], Rodriguez's Laplacian [20], and Zhou's normalized Laplacian [3], under the clique expansion idea. In this section, we show that clique expansion, and hence the other existing hypergraph reduction methods, are just a special case of tensor contraction. We define a contraction as an operation on a tensor that reduces its mode or order. The following lemma is crucial for the proof.
Theorem 3. Let $L_T$ be the graph Laplacian corresponding to the clique expansion of a k-uniform hypergraph such that the node degrees are preserved [16]. $L_T$ can be obtained from the hypergraph Laplacian tensor using $L_T = L \mathbf{1}_n^{k-2}$, where $\mathbf{1}_n$ is a column vector of dimension $n$ with all entries equal to unity and $L \mathbf{1}_n^{k-2} \in \mathbb{R}^{n \times n}$ with elements defined by
$$(L \mathbf{1}_n^{k-2})_{ij} = \sum_{i_3, \ldots, i_k = 1}^{n} l_{i j i_3 \ldots i_k}.$$
Corollary 4. The Laplacian corresponding to the clique expansion of a k-uniform hypergraph, denoted by $L_c$, can be derived from the hypergraph Laplacian tensor using
$$L_c = (k-1)\, L \mathbf{1}_n^{k-2}.$$
This can also be interpreted as a $(k-2)$-th order contraction of the original k-order tensor. In this section, the generalized nature of the tensor Laplacian was demonstrated by deriving the mapping for the $(k-2)$-order contraction of the original hypergraph Laplacian tensor of order $k$. This operation reduces a k-th order tensor to a matrix, which can be viewed as a 2nd-order tensor.
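A short sketch of this contraction on the dense tensors from the previous snippet, assuming the $(k-1)$ scaling of Corollary 4:

```python
# Contract the last k-2 modes of the Laplacian tensor with all-ones vectors.
ones = np.ones(n)
L_reduced = L_t
for _ in range(k - 2):
    L_reduced = L_reduced @ ones   # each matmul contracts the trailing mode

# L_reduced is the degree-preserving clique-expansion Laplacian L_T (Theorem 3);
# scaling by (k - 1) gives the plain clique-expansion Laplacian L_c (Corollary 4).
L_c = (k - 1) * L_reduced
```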
Having established the generality of the tensor representation, we revisit the original problem of hypergraph partitioning in the next section.

Partitioning of Hypergraphs
We start this section with a brief review of spectral graph theory for the partitioning of graphs [21] and then extend these ideas to hypergraphs.

Partitioning of Graphs
Let the $p$ partitions of the vertex set $V$ be denoted by sets $C_1, C_2, \ldots, C_p$ such that
$$C_i \cap C_j = \emptyset \ \ \forall\, i \neq j, \qquad C_1 \cup C_2 \cup \cdots \cup C_p = V. \qquad (8)$$
The two most commonly used objectives of graph partitioning are the ratio cut [22] and the normalized cut [23]:
$$\mathrm{RatioCut}(C_1, \ldots, C_p) = \sum_{i=1}^{p} \frac{\sum_{r \in C_i,\, s \notin C_i} w_{rs}}{|C_i|}, \qquad (9)$$
$$\mathrm{NCut}(C_1, \ldots, C_p) = \sum_{i=1}^{p} \frac{\sum_{r \in C_i,\, s \notin C_i} w_{rs}}{\sum_{r \in C_i} d_r}, \qquad (10)$$
where $w_{rs}$ denotes the weight of the edge between nodes $r$ and $s$, and $d_r$ denotes the degree of the r-th node. It is well known that the solutions to the relaxed versions of minimizing the ratio cut and the normalized cut can be obtained from the Fiedler vector of the unnormalized and normalized Laplacians, respectively. The approximation made in the relaxation step is theoretically analyzed by [24]. Several extensions of this work can be seen in [25,26].

Ratio-Cut and Normalized-Cut for Hypergraphs
We start the discussion with a formal description of the problem. Let $C_1, C_2, \ldots, C_p$ be the $p$ partitions as defined in Eq. (8). For a given hypergraph $G(V, E, W_e)$, we intend to remove a subset of hyperedges $\partial E \subseteq E$ such that $G \setminus \partial E$ produces at least $p$ disjoint partitions [27,28].
The hyperedge boundary $\partial E$ can be defined as the set of hyperedges that span more than one partition:
$$\partial E = \{ e \in E : e \not\subseteq C_i \ \ \forall\, i \in \{1, \ldots, p\} \}. \qquad (11)$$
The next step is to define the objective function to be minimized for obtaining the optimal partitions. The measures described in Eq. (9) and Eq. (10) for graphs are not well-suited for hypergraphs.
We propose generalizations of the ratio-cut and normalized-cut for hypergraphs.
Definition 5. The cut cost for the partition $C_i$, denoted by $w_h(C_i)$, and the total cut cost for all the partitions, denoted by $w_{h,t}(V)$, are defined as
$$w_h(C_i) = \sum_{e_j \in \partial E} w_{e_j}\, |C_i \cap e_j|, \qquad w_{h,t}(V) = \sum_{i=1}^{p} w_h(C_i). \qquad (12)$$
The cut cost for a partition and the total cut cost defined in Eq. (12) reduce to the numerator terms in Eq. (9) and Eq. (10) for $k = 2$, because the term $|C_i \cap e_j|$ reduces to unity $\forall e_j \in \partial E$ in graphs. We further demonstrate the merits of this cut cost with the following example.
It should be noted that $w_{e_1}$ is not reflected in these cut costs because hyperedge $e_1$ is not cut. It can easily be verified that this cut cost is not equivalent to the clique reduction approach: the cut costs derived from the two approaches are different. On further inspection, we infer $w_g(C_i) = 2 w_h(C_i)$ for $i = \{2, 3\}$, which means the cut costs for partitions $C_2$ and $C_3$ in the reduced hypergraph are just a scaled version of the costs involved in the original hypergraph. The same relation does not hold for partition $C_1$ due to the presence of the term $|C_i \cap e_j|$ in Eq. (12). Please refer to Appendix B.3 for the computation of these cut costs.
From this illustrative example, it can be inferred that the proposed cut cost for hypergraphs defined in Eq. (12) carries more information about the cut than the reduced hypergraph. The term $|C_i \cap e_j|$ in Eq. (12) leads to a greater penalty for cutting hyperedges with more elements from $C_i$. A hyperedge with a higher $|C_i \cap e_j|$ is likely to have more association with partition $C_i$, so the corresponding cut should be penalized more.
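A small sketch of this cut cost, assuming the form of Eq. (12); the partition and hyperedge below are illustrative:

```python
def hypergraph_cut_cost(partition, boundary_edges, weights):
    """w_h(C_i) from Eq. (12): each cut hyperedge e_j is charged w_{e_j} |C_i ∩ e_j|."""
    return sum(w * len(partition & e) for e, w in zip(boundary_edges, weights))

# Example: cutting e = {1, 2, 3} against C = {1, 2} is penalized twice as much
# as cutting it against C = {1}, unlike in clique-reduction-based costs.
C = {1, 2}
print(hypergraph_cut_cost(C, [{1, 2, 3}], [1.0]))   # -> 2.0
```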
Minimizing the total cut cost defined in Eq. (12) directly may lead to "unbalanced" partitions with minimum cost. To bypass such trivial and undesirable partitions, we propose the following normalization.
Definition 6. The ratio-cut and normalized-cut for $p$ partitions are defined as
$$\mathrm{Rcut}(C_1, \ldots, C_p) = \sum_{i=1}^{p} \frac{w_h(C_i)}{|C_i|^{k/2}}, \qquad (13)$$
$$\mathrm{Ncut}(C_1, \ldots, C_p) = \sum_{i=1}^{p} \frac{w_h(C_i)}{\mathrm{vol}(C_i)^{k/2}}, \qquad (14)$$
where $w_h(C_i)$ is defined in Eq. (12) and $\mathrm{vol}(C_i) = \sum_{v_j \in C_i} d(v_j)$. The above terms for the ratio-cut and normalized-cut simplify to Eq. (9) and Eq. (10), respectively, for $k = 2$. We prefer the normalization by an exponential factor in the denominator to bypass partitions with singletons or a small number of nodes. [10,29] define the normalized associativity of a hypergraph in a similar manner but utilize the hypergraph reduction approach for modeling purposes.

Hypergraph Partitioning Algorithm
The partitions $C_1, C_2, \ldots, C_p$ can be derived by minimizing the ratio-cut or the normalized-cut. For further discussion, we focus on the minimization of the ratio-cut; the same approach extends to the normalized-cut, as shown later. The optimal partitions can be obtained by solving
$$\min_{C_1, \ldots, C_p} \mathrm{Rcut}(C_1, \ldots, C_p).$$
Unfortunately, the above problem is NP-hard. Inspired by spectral graph theory, we propose to solve a relaxed version of the above optimization problem.
Theorem 7. The minimization of the ratio-cut in Eq. (13) can be equivalently expressed as
$$\min_{C_1, \ldots, C_p} \sum_{i=1}^{p} L f_i^k \quad \text{with} \quad f_{i,j} = \begin{cases} 1/|C_i|^{1/2} & v_j \in C_i, \\ 0 & \text{otherwise,} \end{cases} \qquad (15)$$
where we define $p$ indicator vectors $f_i$ whose j-th element, denoted by $f_{i,j}$, indicates whether the vertex $v_j$ belongs to cluster $C_i$. The solution to the above problem after relaxing $f_i \in \mathbb{R}^n$, rather than an indicator vector, can be derived from the eigenvector corresponding to the minimum positive eigenvalue stated in Eq. (5).
Hence, we focus on the computation of the objective function
$$L x^k = \sum_{i_1, i_2, \ldots, i_k = 1}^{n} l_{i_1 i_2 \ldots i_k}\, x_{i_1} x_{i_2} \cdots x_{i_k} \qquad (16)$$
for any vector $x \in \mathbb{R}^n$.
Theorem 8. The objective for the Laplacian tensor of a hypergraph can be simplified as
$$L x^k = \sum_{e_j \in E} w_{e_j}\, k \left[ \mathrm{A.M.}\big(\{x_i^k\}_{i \in e_j}\big) - (-1)^{n_s}\, \mathrm{G.M.}\big(\{|x_i|^k\}_{i \in e_j}\big) \right], \qquad (17)$$
where $n_s = |\{i_j : x_{i_j} < 0\}|$ counts the negative entries of $x$ within the hyperedge, and A.M. and G.M. stand for the arithmetic and geometric means, respectively.
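A minimal sketch of this $O(m)$ computation, assuming the AM-GM form of Eq. (17); the function name is illustrative:

```python
import numpy as np

def laplacian_objective(hyperedges, weights, x):
    """Compute L x^k in O(m) time via the per-hyperedge AM-GM form (Eq. (17))."""
    total = 0.0
    for e, w in zip(hyperedges, weights):
        vals = np.array([x[i] for i in e])
        k = len(vals)
        n_s = np.sum(vals < 0)                 # negative entries within the hyperedge
        am = np.mean(vals ** k)                # A.M. of x_i^k
        gm = np.prod(np.abs(vals))             # G.M. of |x_i|^k equals prod_i |x_i|
        total += w * k * (am - (-1) ** n_s * gm)
    return total

# For k = 2 this reduces to the familiar graph objective sum_e w_e (x_a - x_b)^2.
```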
This objective function can be viewed as a generalization of the graph objective: for any edge $\{a, b\}$, it reduces to $w_{ab} (x_a - x_b)^2$. We continue the discussion on partitioning with the following example.
From the above example, it is clear that the traditional approach of partitioning does not yield the desired partitions for hypergraphs. This is primarily because the eigenvectors of the Laplacian tensor of a hypergraph cannot be interpreted in the same way as the eigenvectors of the Laplacian matrix of a graph. To understand the implication of the minimum ratio-cut associated with the minimum positive $\lambda$, we analyze the computation of the Laplacian objective function using the Fiedler vector:
$$L f^k = \sum_{e_j \in E} w_{e_j}\, l_{e_j}(f),$$
where $l_{e_j}(f)$ denotes the "score" for hyperedge $e_j$ computed for the eigenvector $f$. With a slight abuse of terminology, we argue that a higher value of this score indicates that the corresponding hyperedge is "close" to the separator boundary $\partial E$. The measure of closeness between two nodes is quantified by the minimum number of hyperedges to be traversed to reach one node from the other. This can be validated easily by careful inspection of the hyperedge score $l_{e_j}(f)$ when the vector $f$ is treated as the cluster indicator variable shown in Eq. (15): for such an ideal choice of $f$, the hyperedge score is non-zero only for the hyperedges on the separating boundary, i.e., the score is zero $\forall e_j \in \{E \setminus \partial E\}$. We carry forward the same intuition and prefer to cut the hyperedges with a "higher" score.
The score may not be exactly zero for any hyperedge when the Fiedler vector is used for the score computation, as it is obtained from the relaxed minimization of the ratio-cut (Theorem 7). Applying this approach to Example 2, we report a maximum score of 0.017 for the hyperedge {1, 2, 3} and hence cut it to obtain the optimal partitions. The proposed algorithm is summarized in Table 1.
Table 1. Proposed hypergraph partitioning algorithm:
1. Compute the Fiedler vector $f$ of the Laplacian tensor $L$, i.e., the eigenvector for the minimum positive eigenvalue in Eq. (5).
2. Compute the hyperedge score $l_{e_j}(f)$ for every hyperedge.
3. Remove hyperedges with maximum score until p disjoint partitions are obtained.
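A sketch of steps 2 and 3 of Table 1, assuming the Fiedler vector $f$ of the Laplacian tensor is supplied by an external solver (see the discussion on computing tensor eigenvectors below); all helper names are illustrative:

```python
import numpy as np

def hyperedge_score(e, w, f):
    # Per-hyperedge AM-GM score l_e(f) from Eq. (17).
    vals = np.array([f[i] for i in e])
    k = len(vals)
    n_s = np.sum(vals < 0)
    return w * k * (np.mean(vals ** k) - (-1) ** n_s * np.prod(np.abs(vals)))

def count_components(n, hyperedges):
    # Connected components of the hypergraph via depth-first search.
    adj = [[] for _ in range(n)]
    for e in hyperedges:
        e = list(e)
        for u in e:
            adj[u].extend(v for v in e if v != u)
    seen, comps = set(), 0
    for s in range(n):
        if s in seen:
            continue
        comps, stack = comps + 1, [s]
        while stack:
            u = stack.pop()
            if u not in seen:
                seen.add(u)
                stack.extend(adj[u])
    return comps

def partition(n, hyperedges, weights, f, p=2):
    # Score hyperedges, then greedily remove the highest-scoring ones
    # until the hypergraph splits into at least p components.
    order = sorted(range(len(hyperedges)),
                   key=lambda j: hyperedge_score(hyperedges[j], weights[j], f),
                   reverse=True)
    kept = list(range(len(hyperedges)))
    for j in order:
        if count_components(n, [hyperedges[q] for q in kept]) >= p:
            break
        kept.remove(j)
    return kept   # indices of the hyperedges that survive the cut
```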
The intuition behind using the hyperedge score for deriving $\partial E$ is motivated by spectral graph theory. It is interesting to note that this novel use of hyperedge scores helps compute a better ratio-cut for the cockroach graph presented in Section 4.
A similar analysis can be performed for the minimization of the normalized cut of hypergraphs.
Corollary 9. The solution to the relaxed optimization problem of minimizing the normalized cut in Definition 6 can be derived using the eigenvector corresponding to the minimum positive eigenvalue of the normalized Laplacian tensor defined in Eq. (4).
We perform a theoretical analysis of the proposed algorithm and derive an interesting bound on the approximation made in normalized cuts.
Theorem 10. The minimum positive eigenvalue of an even-order k-uniform hypergraph admits an upper bound that is linear in the conductance of the hypergraph, where $\lambda_1$ is the first positive eigenvalue satisfying Eq. (5) for the normalized Laplacian tensor $\mathcal{L}$ and $\phi(G)$ refers to the conductance of the hypergraph.
The above inequality helps to analyze the order of approximation involved in relaxing the N-min-cut problem by deriving the solution through tensor EVD. The tightness of the bound indicates the goodness of the approximation. Several other attempts have been made to derive such approximation bounds for hypergraphs. For example, [30] utilizes a different Laplacian tensor and an alternative hyperedge score to derive a similar bound on $\lambda_1$ of that tensor. Their bound is weaker, of an exponential nature, whereas we propose a tighter bound of a linear nature in Theorem 10.

Computation of Tensor Eigenvectors
The computation of eigenvectors of real super-symmetric tensors is quite challenging and not as straightforward as in the case of real symmetric matrices. This is primarily due to the non-orthogonality of tensor eigenvectors. There are several other challenging aspects; for example, real symmetric tensors can have complex eigenpairs, unlike matrices. Also, a real symmetric matrix of size $n \times n$ can have at most $n$ eigenvalues, whereas a tensor can have a much larger number of eigenpairs [31]. Please refer to [17] and the references therein for more details.
Most of the existing works on the computation of eigenpairs are for tensors with special structure [32] or for extreme eigenvalues such as the maximum or minimum eigenvalue [33,34]. As discussed in Section 3.3, only the Fiedler vector is required for partitioning a given hypergraph. As the Fiedler vector is not one of the extreme eigenvectors, the above methods are not helpful in our case.
Recently, [35] proposed an algorithm to compute all the eigenvalues of a tensor using homotopy methods. They pose the problem as finding the roots of a vector of high-order polynomials generated from $P(y) = L x^{k-1} - \lambda x = 0$, where $y = [x\ \lambda] \in \mathbb{R}^{n+1}$. As it is hard to compute the zeros of $P(y)$ directly, the core idea of linear homotopy methods is to construct a vector function $H(y, t) = (1 - t) Q(y) + t P(y)$, where $t \in [0, 1]$ and $Q(y)$ is a suitable vector polynomial whose roots can be computed easily. The next step is to slowly iterate from the solution of $H(y, t = 0) = Q(y) = 0$ to $H(y, t = 1) = P(y) = 0$. Despite the novel formulation, this approach is forced to compute all the complex eigenpairs, even if we are interested in the real eigenpairs only.
We prefer the approach of [36], which computes all the real eigenvalues sequentially from the maximum to the minimum by using Jacobian semidefinite relaxations in polynomial optimization. Assuming $\lambda_i$ is known, they formulate a polynomial optimization problem to compute $\lambda_{i+1}$, restricted to the region $f(x) \leq \lambda_i - \delta$ with $0 < \delta < \lambda_i - \lambda_{i+1}$, where the auxiliary constraints $h_r(x)$ are built from the Jacobian of the objective $f(x) = L x^k$ and the normalization constraint $g(x) = x_1^2 + x_2^2 + \ldots + x_n^2 - 1$. They further utilize Lasserre's hierarchy of semidefinite relaxations [37] to solve the above problem.
The computation of the objective function $f(x)$ and the constraints $h_r(x)$ is expensive and takes $O(n^k)$ for general tensors. Using Theorem 8, the objective function can be computed in linear time $O(m)$ for Laplacian tensors. The constraints can also be simplified using
$$(L x^{k-1})_i = d(v_i)\, x_i^{k-1} - \sum_{e_q \in E_i} w_{e_q} \prod_{v_j \in e_q \setminus \{v_i\}} x_j,$$
where $E_i = \{e_q \in E : v_i \in e_q\}$. This approach is very helpful, as not all the eigenvalues need to be computed to reach the Fiedler eigenvalue. Hence, the Fiedler vector can be computed efficiently by using Theorem 8 and the above equation.

Related works
As stated earlier, most of the existing methods utilize hypergraph reductions either implicitly [6,7] or explicitly. For example, [38] utilizes the tensor-based representation of hypergraphs but constructs a matrix by concatenating the slices of the tensor, and then applies the standard spectral partitioning algorithm on the covariance of that matrix. These variants of hypergraph reduction differ in the method of expanding a hyperedge and produce graphs with different edge weights. The Laplacian objective function (Eq. (16)) of any graph is a second-order polynomial, which captures weighted interactions between two nodes. A second-order polynomial is insufficient for capturing super-dyadic interactions among multiple nodes ($\geq 3$) of a hyperedge. Also, note that multiple hypergraphs may reduce to the same graph.
[39] discuss the inability of reduction methods to preserve hyperedge cuts for general hypergraphs. We utilize the Laplacian tensor (Eq. (16)) to penalize these multiple cuts differently. A few other recent works try to capture the multiple ways of splitting the nodes of a hyperedge. For example, [40] proposes a non-uniform clique expansion and provides a quadratic approximation under submodularity constraints of the inhomogeneous cost function. [41] extends the notion of the p-Laplacian from graphs to hypergraphs by introducing the following hyperedge score:
$$l_e(x) = w_e \max_{u, v \in e} |x_u - x_v|^p. \qquad (23)$$
Ideally, any definition of a hyperedge score should capture the non-uniformity among the nodes in a hyperedge, but the above equation fails to capture the variation perfectly. For example, consider two hyperedges of cardinality 4 with node labels assigned as {0, 1, 1, 2} and {0, 1, 2, 2}. Equation (23) computes only the maximum difference and hence will not differentiate between these two hyperedges, but the AM-GM difference (Eq. (17)) will capture the variance among all the nodes of the hyperedge. [42,43,44] consider a similar formulation of the hyperedge score function.
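To make this concrete, a quick check on the two hyperedges above (all labels non-negative, so $n_s = 0$), using the AM-GM form of Eq. (17):
$$\max_{u, v \in e} |x_u - x_v| = 2 \quad \text{for both } \{0, 1, 1, 2\} \text{ and } \{0, 1, 2, 2\},$$
$$\mathrm{A.M.}\big(\{x_i^4\}\big) = \frac{0 + 1 + 1 + 16}{4} = 4.5 \quad \text{vs.} \quad \frac{0 + 1 + 16 + 16}{4} = 8.25, \qquad \mathrm{G.M.}\big(\{|x_i|^4\}\big) = \prod_i |x_i| = 0,$$
so the AM-GM scores $k\,(\mathrm{A.M.} - \mathrm{G.M.})$ are $18$ and $33$, respectively, while the maximum-difference scores coincide.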

Experiments
The proposed algorithm is examined on well-known and synthetic hypergraphs. The numerical details of the Fiedler vectors and hyperedge scores are presented in Appendix B.
Example 3. Consider the cockroach graph shown in Figure 3 and taken from [21].
The traditional spectral partitioning makes the red cut shown in the graph. We utilize the edge scores as suggested in the proposed algorithm and report that the edges $\{v_t, v_{t+1}\}$ and $\{v_{3t}, v_{3t+1}\}$ have the maximum scores, and we cut these edges to obtain the partition. Therefore, the solution obtained by the proposed algorithm is $t/2$ times better than the traditional approach. We made this observation for $t = \{3, 4, \ldots, 20\}$.
Example 4. In this example, we consider different types of synthetic graphs and compare the ratio-cut values computed by the existing and proposed methods.
We begin with a study on random graphs generated from the Erdős-Rényi (ER) model, denoted by $G(n, p)$, where $n$ is the number of nodes and $p$ is the probability of an edge between any two nodes. We compare the ratio-cut values for 100 different graphs for $n = 100$ and for each value of $p = \{0.2, 0.4, 0.6\}$. We observe that the ratio-cut value produced by our proposed algorithm is always less than the ratio-cut obtained by sign-based Fiedler vector partitioning. Hence, we define the following metric, termed percentage improvement (PI), to showcase the proposed algorithm's performance:
$$\mathrm{PI} = \frac{R_f - R_p}{R_f} \times 100,$$
where $R_f$ and $R_p$ denote the ratio-cut values of sign-based Fiedler partitioning and the proposed algorithm, respectively. A positive value of PI indicates that the proposed algorithm has produced a better ratio-cut value, and the magnitude of the value represents the extent of the improvement.
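A sketch of one trial of this comparison for graphs ($k = 2$), where the proposed method is emulated by score-based edge removal as in Table 1; the setup and helper names are illustrative:

```python
import numpy as np

def components(adj):
    # Number of connected components via depth-first search on an adjacency matrix.
    n = adj.shape[0]
    seen, comps = np.zeros(n, bool), 0
    for s in range(n):
        if seen[s]:
            continue
        comps, stack = comps + 1, [s]
        while stack:
            u = stack.pop()
            if not seen[u]:
                seen[u] = True
                stack.extend(np.flatnonzero(adj[u]))
    return comps

def ratio_cut(A, mask):
    # Ratio cut of the bipartition (mask, ~mask), Eq. (9) with p = 2.
    cut = A[mask][:, ~mask].sum()
    return cut / max(mask.sum(), 1) + cut / max((~mask).sum(), 1)

rng = np.random.default_rng(0)
n, p = 100, 0.2
A = np.triu((rng.random((n, n)) < p).astype(float), 1)
A = A + A.T                                    # Erdos-Renyi adjacency G(n, p)

Lap = np.diag(A.sum(1)) - A
fiedler = np.linalg.eigh(Lap)[1][:, 1]         # second-smallest eigenvector

R_f = ratio_cut(A, fiedler > 0)                # baseline: sign-based partitioning

# Emulated proposed variant for k = 2: score each edge by w_ij (f_i - f_j)^2 and
# remove the highest-scoring edges until the graph disconnects.
B = A.copy()
edges = sorted(zip(*np.triu_indices(n, 1)),
               key=lambda e: A[e] * (fiedler[e[0]] - fiedler[e[1]]) ** 2,
               reverse=True)
for (i, j) in (e for e in edges if A[e] > 0):
    if components(B) >= 2:
        break
    B[i, j] = B[j, i] = 0
# The components of B define the partition whose ratio cut on A gives R_p,
# and PI = (R_f - R_p) / R_f * 100.
```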
Figure 4 shows the results as histograms for the different values of $p = \{0.2, 0.4, 0.6\}$. It can be seen that the proposed algorithm performs better than sign-based Fiedler partitioning in all the cases. We perform a similar analysis for another graph generation model, the stochastic block model (SBM). This model provides the freedom to control the number of partitions, the number of nodes in each partition (denoted by $n_1, n_2$), the probability of an edge within a partition (denoted by $p$), and across partitions ($q$). Note that $p = q$ yields the ER model with $n = n_1 + n_2$, as discussed previously.
We consider graphs for multiple combinations of probabilities $p, q$ and 2 partitions with $n_1 = n_2 = 50$. It should be noted that we consider the SBM with an assortative community structure, which implies $p > q$. We generate 100 random graphs for each of these settings and compare the ratio-cut values. A histogram plot summarizing the results is presented in Figure 5.
It is evident from Figure 5 that the proposed algorithm produces a lower ratio-cut value for most of the graphs generated by the SBM. We perform a similar analysis on synthetic hypergraphs generated by the SBM [38]. We generate 100 random 4-uniform hypergraphs with 2 partitions, 60 nodes, and relatively small values of the intra-cluster probability ($p$) and inter-cluster probability ($q$) as compared to the case of graphs. This is primarily because the number of possible hyperedges for a 4-uniform hypergraph is $\binom{n}{4}$, which is much larger than in the case of graphs. The proposed algorithm is compared against conventional sign-based partitioning using the Fiedler vector computed from the Laplacian tensor of the hypergraph. A histogram plot summarizing the results is shown in Figure 6. It can be observed that the proposed algorithm improves the ratio-cut value (defined in Eq. (13)) significantly as compared to traditional sign-based partitioning. This is primarily because cutting a few hyperedges does not necessarily produce only two components, unlike the case of graphs. For example, if we cut the single hyperedge of a hypergraph consisting of one hyperedge over 3 nodes, we get 3 disconnected components; there is no way to obtain exactly two connected components. Hence, we may get 3 connected components even if we desired only 2. This is a unique property of hypergraphs. Any partitioning algorithm producing many small connected components (like singletons) will have a higher ratio-cut value. We observe that the conventional sign-based partitioning approach using the Fiedler vector of the Laplacian tensor is more prone to producing many small connected components than the proposed algorithm. Hence, the ratio-cut value of sign-based partitioning is significantly higher.
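A sketch of a k-uniform SBM generator consistent with this setup: a candidate hyperedge is kept with probability $p$ if all of its nodes lie in one block and with probability $q$ otherwise (the parameters below are illustrative):

```python
from itertools import combinations
import numpy as np

def sample_uniform_sbm_hypergraph(n1, n2, k, p, q, rng):
    """Sample a k-uniform hypergraph from a two-block stochastic block model."""
    block = np.array([0] * n1 + [1] * n2)
    hyperedges = []
    for e in combinations(range(n1 + n2), k):
        same_block = len({block[i] for i in e}) == 1
        if rng.random() < (p if same_block else q):
            hyperedges.append(set(e))
    return hyperedges

rng = np.random.default_rng(0)
H_edges = sample_uniform_sbm_hypergraph(n1=30, n2=30, k=4, p=0.01, q=0.001, rng=rng)
```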
In this section, we examined the performance of the algorithm on various synthetic hypergraphs and observed that the proposed approach derives partitions with lower ratio-cut values.

Conclusions & Future Work
In this work, we propose a hypergraph partitioning algorithm using the tensor eigenvalue framework and establish its superiority over existing hypergraph reduction methods. We extend the notion of ratio-cut and normalized-cut from graphs to hypergraphs and show the equivalence of the relaxed optimization problem to a tensor eigenvalue problem. Further, we derive a tighter upper bound for the approximation of the normalized-cut problem. We also drastically reduce the computation time of the Fiedler vector of the Laplacian tensor of a hypergraph. Future directions of this work include similar analyses for non-uniform and directed hypergraphs.

A.2 Proof of Theorem 3
Proof. Using Eq. (3) and Lemma 1, the following may be stated: $L \mathbf{1}_n^{k-2} = D \mathbf{1}_n^{k-2} - A \mathbf{1}_n^{k-2}$. As $D$ is a diagonal tensor of order $k$, the first term $D \mathbf{1}_n^{k-2}$ is a contracted diagonal tensor of order $k - (k-2) = 2$. The diagonal elements of the resulting 2nd-order tensor (a matrix) are the same as the diagonal of the original k-mode tensor.
As $A$ is a super-symmetric tensor of order $k$, the second term $A \mathbf{1}_n^{k-2}$ will be a 2-dimensional symmetric tensor. This is due to the fact that all $k - 2$ contracted modes of the tensor are scaled with the same vector of ones. As all the diagonal terms of $A \mathbf{1}_n^{k-2}$ are zero, the vertex degrees can be seen to be preserved in $L \mathbf{1}_n^{k-2}$, which describes the Laplacian of the corresponding reduced graph.

A.3 Proof of Corollary 4
The Laplacian corresponding to the clique expansion [7] of a k-uniform hypergraph, denoted by $L_c$, can be derived from the hypergraph Laplacian tensor as $L_c = (k-1)\, L \mathbf{1}_n^{k-2}$ (Eq. (25)). Proof. This is a direct consequence of Lemma 1, which proposes an approach to preserve the node degrees. In the basic clique expansion algorithm, the vertex degree is not preserved: each node in a hyperedge of $k$ nodes is paired up with the remaining $k - 1$ nodes, so the scaling factor $k - 1$ appears in Eq. (25). The proof is trivial because the input is a k-uniform hypergraph.
A.4 Proof of Theorem 7
Proof. For any partition $C_i$, we compute $L f_i^k$ with the indicator vector $f_i$ defined in Eq. (15). We use Theorem 8 to compute this term. There are three cases for each hyperedge.
1. All the nodes in hyperedge $e_j$ are assigned $1/|C_i|^{1/2}$: both the A.M. and G.M. terms equal $1/|C_i|^{k/2}$, and the overall term reduces to 0.
2. All the nodes in hyperedge $e_j$ are assigned 0: both terms are zero, and the overall term is zero.
3. Some of the nodes are assigned $1/|C_i|^{1/2}$ and the rest 0: the G.M. term vanishes, and the hyperedge contributes $w_{e_j} |C_i \cap e_j| / |C_i|^{k/2}$. Summing over all hyperedges and then similarly adding the other partitions, we arrive at
$$\sum_{i=1}^{p} L f_i^k = \sum_{i=1}^{p} \frac{w_h(C_i)}{|C_i|^{k/2}}.$$
The RHS of the above equation is the same as the defined ratio-cut for hypergraphs. It should be noted that $f_i^{\top} f_i = 1$. As the objective function and constraint are the same under relaxation, both problems are equivalent, and the solution can be derived from the tensor eigenvalue decomposition.

A.5 Proof of Theorem 8
Proof. Expanding $L x^k = D x^k - A x^k$ and grouping terms per hyperedge: as there are $(k-1)!$ and $k!$ permutations contributing to the first and second terms, respectively,
$$L x^k = \sum_{e_j \in E} w_{e_j} \Big( \sum_{i \in e_j} x_i^k - k \prod_{i \in e_j} x_i \Big) = \sum_{e_j \in E} w_{e_j}\, k \left[ \mathrm{A.M.}\big(\{x_i^k\}\big) - (-1)^{n_s}\, \mathrm{G.M.}\big(\{|x_i|^k\}\big) \right].$$

A.6 Proof of Corollary 9
The proof of this corollary is very similar to the proof of Theorem 7. In this case, we choose the indicator variable analogously, with the normalization adapted to the normalized cut. The next step is to compute $\mathcal{L} x^k$, where the normalized Laplacian tensor $\mathcal{L}$ is defined in Eq. (4). The rest of the proof is very similar to the proof of Theorem 7.

A.7 Proof of Theorem 10
Proof. Let $x$ be an $n \times 1$ vector parameterized by a constant $\omega$. It can be easily verified that $x^{\top} x = 1$. We substitute $x$ into the expression for the normalized hypergraph Laplacian defined in Eq. (4). Please note that we do not use the signs of the elements in the Fiedler vector to compute partitions, as discussed in the main manuscript.
where $n_s = |\{i_k : x_{i_k} < 0\}|$. For even-order hypergraphs, the above can be reduced to the bound stated in Theorem 10.

B Examples & Numerical Details of Experiments

B.1 Hypergraph reduction to the same graph
Various hypergraph reduction methods have been summarized in [7]; one of the prevalent approaches is the clique expansion of Eq. (1). It should be noticed that these reduction methods are a non-unique mapping from a hypergraph to an adjacency matrix. This implies that there could be multiple different hypergraphs that reduce to the same graph. For example, the clique reduction approach reduces the four-uniform hypergraph and the three-uniform hypergraph shown in Figure 7 to the same graph.
This non-uniqueness property of the hypergraph reduction methods plays a crucial role in the task of hypergraph partitioning. The reduced hypergraph has lost information about the original hypergraph structure, so there is no assurance that an analysis on the reduced hypergraph delivers correct results for the original hypergraph. To avoid this loss of information in the reduction step, we utilize the tensor-based representation of hypergraphs, as shown in the next example.

B.2 Representation of Hypergraphs
This example shows the procedure to construct the adjacency and Laplacian tensors for any k-uniform hypergraph.

B.3 Partition Cost
The costs of the other partitions can be derived in a similar fashion and are observed to be $w_h(C_2) = w_{e_2} + w_{e_3}$ and $w_h(C_3) = w_{e_3}$.
Further, we compute the cut cost for the reduced hypergraph. The edge between nodes $i$ and $j$ is named $e_{ij}$, and the corresponding weight is denoted by $w_{e_{ij}}$. The edges $e_{24}, e_{34}, e_{35}, e_{45}$ have to be removed to arrive at the desired partition, so $\partial E_g = \{e_{24}, e_{34}, e_{35}, e_{45}\}$. The cut costs for the other partitions can be calculated as $w_g(C_2) = 2(w_{e_2} + w_{e_3})$ and $w_g(C_3) = 2 w_{e_3}$.
It can easily be noticed that the cut costs for the two cases are not equal. On further inspection, we infer $w_g(C_i) = 2 w_h(C_i)$ for $i = \{2, 3\}$, which means the cut costs for partitions $C_2$ and $C_3$ in the reduced hypergraph are just a scaled version of the costs involved in the original hypergraph. The same relation does not hold for partition $C_1$ due to the presence of the term $|C_i \cap e_j|$ in Eq. (12).

Fig 4. Histogram plot for percentage improvement by the proposed method for graphs generated by the ER model for different values of p. It shows that the proposed algorithm performs better for all the generated graphs.
Fig 5. Histogram plot for percentage improvement by the proposed method for graphs generated by the SBM for different values of intra-cluster probability (p) and inter-cluster probability (q). It confirms that the proposed algorithm performs better for most of the generated graphs, as there are very few cases of negative PI.

Fig 6. Histogram plot for percentage improvement by the proposed method for hypergraphs generated by the SBM for different values of q. It shows that the proposed algorithm performs significantly better compared to sign-based partitioning for all generated hypergraphs.

Fig 7. Two hypergraphs reducing to the same graph.