Abstract
Recommendation systems often face challenges such as data sparsity, limited scalability, and a lack of interpretability. Classical Collaborative Filtering (CF) based on Singular Value Decomposition (SVD) is scalable but frequently degrades under extreme sparsity. Conversely, Graph Neural Networks (GNNs) are highly accurate but tend to lack explanatory power. This research presents a hybrid framework that sequentially combines Louvain community detection with SVD-based collaborative filtering to address the sparsity-interpretability trade-off. Unlike existing models that use communities only for regularization, the framework employs modularity-based clustering to pre-partition the user space, enabling SVD to operate on denser, more homogeneous sub-matrices. This methodological contribution reduces computational noise and overhead and supports community-level justifications of recommendations. In the experimental analysis on the Netflix Prize dataset, the framework produced a Root-Mean-Square Error (RMSE) of 0.9966 and a Mean Absolute Error (MAE) of 0.7968. The hybrid model achieves competitive predictive performance with significantly higher interpretability and lower computational cost than complex deep learning baselines, despite a modestly higher RMSE that results from the deliberate trade-off for transparency and efficiency on extremely sparse data. The framework enables scalable and transparent recommendation engines suitable for large-scale sparse datasets.
Citation: Keerthika T, Ignisha RG, Rajamani V, Kumar SS, Selvam K, Aiyyappan MR, et al. (2026) Enhancing Recommendation Systems through SVD-based collaborative filtering and community detection. PLoS One 21(4): e0346579. https://doi.org/10.1371/journal.pone.0346579
Editor: Shih-Lin Lin, National Changhua University of Education, TAIWAN
Received: November 26, 2025; Accepted: March 21, 2026; Published: April 13, 2026
Copyright: © 2026 Keerthika et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper. The dataset link is also provided in the manuscript. The dataset can be accessed from the Kaggle repository at https://www.kaggle.com/datasets/netflix-inc/netflix-prize-data. A copy of the original dataset is available at doi: 10.6084/m9.figshare.31428857.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Present-day Recommendation Systems (RS) build on two established techniques: Content-Based Filtering (CBF) and Collaborative Filtering (CF) [1,2]. CF rests on the principle that users whose past behaviors overlap tend to share similar preferences. CF systems fall into two subcategories, which form recommendations either from the behavior of similar users or from the similarity between items [2,3]. The ability of CF to uncover hidden user-item relations is usually constrained by data sparsity and scalability problems [4]. CBF, in turn, relies on the characteristics or profiles of items to generate recommendations, which limits its ability to discover implicit patterns and to extrapolate beyond the specified features [2].
Singular Value Decomposition (SVD) has been recognized as an important component of Matrix Factorization (MF) methods, which have achieved strong results in the past [5]. SVD obtains latent factors by decomposing the user-item interaction matrix, exposing underlying preferences and characteristics. This approach reduces the complexity of the data and identifies relationships that other CF solutions would not have identified [6]. The use of SVD to improve recommendation accuracy is, however, limited. It performs poorly on very sparse datasets and in cold-start situations, and it does not recognize the internal relationships between users that form communities [4,7,8].
Recent developments have shifted toward deep learning and Graph Neural Networks (GNNs) [9,10]. Most recently, Transformer-based models and Large Language Models (LLMs) have been studied for sequential recommendation, achieving state-of-the-art accuracy but typically at the cost of high computational latency and a black-box, often proprietary, nature. Despite these advances, the industry still requires solutions that balance high accuracy with interpretability and low resource consumption.
To fill these gaps, this paper revisits matrix factorization through the prism of community detection. Latent patterns are identified with Singular Value Decomposition (SVD), and natural group structures are exploited with Louvain community detection [7,11].
Contributions
The main contributions of the present research are the following:
- Hyperscale Hybrid Architecture: A sequential approach combines Louvain community detection with SVD, shifting from global to local matrix factorization. This addresses data sparsity by forming denser user clusters.
- Strength of Interpretability: As opposed to black-box GNN models, the proposed approach can give transparent, community-level explanations for recommendations (e.g., peer-group influence).
- Scalable Performance: Partitioning the user space is shown to limit the dimensionality load on SVD, and competitive accuracy (RMSE 0.9966) is achieved with lower computational overhead than deep learning baselines.
Organization of the paper
The remainder of this paper is organized as follows: Section 2 reviews related literature on collaborative filtering and community detection. Section 3 elaborates the methodological framework, mathematical equations, theoretical assumptions, and complexity analysis. Section 4 describes the proposed hybrid architecture, hyperparameter tuning, and experimental configuration. Section 5 presents the experimental findings and a comparative discussion. Finally, Section 6 concludes the research and outlines directions for future work.
Related work
The development of recommendation systems has advanced through the integration of graph-based approaches, community detection, and SVD-based matrix factorization. Detailed analysis has shown that graph-based recommender systems are valuable because they extract key information from graph representations to form strong recommendations and enhance interpretability [1].
Graph-based recommenders have been regarded as transparent, which has motivated research into the effects of graph structures, such as community detection and node modeling, on trust in recommendations [12]. Several publications suggest that graph learning methods can successfully address data sparsity and cold-start issues [4,13].
Community detection algorithms partition the set of users into distinct groups, which leads to better personalized recommendations. The three most popular community detection algorithms, Louvain, Leiden, and Label Propagation, support large-scale network applications [2,11]. The Louvain method is one of the preferred strategies for producing high-quality community structures and has therefore been adopted in several real-world settings [9,14]. Recent memory-efficient advances have improved the scalability of such algorithms on large-scale databases [15].
Hybrid recommender systems combine collaborative and content-based methods and are popular because they offer high accuracy and diversity. Research on SVD in collaborative filtering has grown because the algorithm identifies hidden patterns in user-item data matrices. Hybrid recommendation systems draw on the advantages of both methods to address sparsity constraints and overfitting and to enhance user-specific recommendations. SVD-based systems predict user preferences robustly by incorporating bias terms that capture user- and item-specific tendencies in the selection of a product or service [6,7].
Graph neural networks (GNNs) form a core area of graph representation learning, combining the information of each node in a way that supports learning recommendations [9,16,17]. These studies mainly explore methods for modeling the interactions of users with various items. For example, Multi-Behavior GNN (MB-GNN) improves representation learning by processing various types of interaction, including clicks and purchases [10].
Concluding remarks on research gap
The literature leaves a gap in balancing interpretability with the handling of sparsity. Although SVD helps with scalability and GNNs help with accuracy, few combined models use the modularity of community detection to explicitly clean up the interaction matrix before factorization. The proposed work aims to bridge this gap with a sequential framework that uses community detection not only to regularize the data but also to pre-process it, generating dense, homogeneous sub-matrices on which simpler factorization is possible.
Methodology
Singular Value Decomposition (SVD)
Singular Value Decomposition (SVD) is a classical element of collaborative filtering and provides a sound mathematical basis for identifying the latent variables used to predict user-item interactions. The user-item interaction data are represented by a matrix R. SVD addresses sparsity by decomposing R into three components, $U$, $\Sigma$, and $V^{T}$, as follows:

$$R \approx U \Sigma V^{T}$$

Here, $U$ is an m × k matrix capturing latent features of users, $\Sigma$ is a k × k diagonal matrix containing the singular values, and $V^{T}$ is a k × n matrix representing latent features of items.
The optimization process minimizes the squared differences between predicted and actual user ratings:

$$\min_{U,\Sigma,V} \sum_{(u,i) \in \mathcal{K}} \left( r_{u,i} - \hat{r}_{u,i} \right)^{2}$$

where $\mathcal{K}$ denotes the set of observed user-item interactions, $r_{u,i}$ is the actual rating, and $\hat{r}_{u,i}$ is the predicted rating computed as:

$$\hat{r}_{u,i} = \sum_{f=1}^{k} U_{u,f}\, \sigma_{f}\, V_{i,f}$$

with $\sigma_{f}$ the f-th diagonal entry of $\Sigma$.
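The prediction rule and squared-error objective described above can be illustrated with a minimal sketch. The function names `predict` and `squared_error` and all numeric values are hypothetical and chosen only for illustration; the factors are written by hand rather than fitted, so this shows how a rating is read off the three components, not how the paper's model was trained.

```python
def predict(U, sigma, V, u, i):
    """Entry (u, i) of U * diag(sigma) * V^T: a sum over the latent factors."""
    return sum(U[u][f] * sigma[f] * V[i][f] for f in range(len(sigma)))


def squared_error(ratings, U, sigma, V):
    """Sum of squared errors over observed (user, item, rating) triples."""
    return sum((r - predict(U, sigma, V, u, i)) ** 2 for u, i, r in ratings)


# Toy factors with k = 2 latent dimensions (illustrative values only).
U = [[1.0, 0.0], [0.5, 0.5]]                # 2 users x 2 factors
sigma = [4.0, 2.0]                          # diagonal of Sigma
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]    # 3 items x 2 factors

# Observed interactions: (user index, item index, actual rating).
observed = [(0, 0, 4.0), (1, 1, 1.0), (1, 2, 2.0)]

r_hat = predict(U, sigma, V, 1, 2)          # predicted rating of user 1 for item 2
loss = squared_error(observed, U, sigma, V)
print(r_hat, loss)
```

Only the observed triples enter the loss, which is what allows the objective to be evaluated on a sparse rating matrix without imputing the missing entries.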
Louvain community-detection method
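The core function of the Louvain algorithm is to discover clusters of users with similar preferences. The quality of a network partition is quantified through the modularity Q, defined as [14,15]:

$$Q = \frac{1}{2W} \sum_{i,j} \left[ w_{ij} - \frac{k_i k_j}{2W} \right] \delta(c_i, c_j)$$

where $w_{ij}$ represents the weight of the edge between nodes i and j, $k_i$ and $k_j$ represent the weighted degrees of nodes i and j, $W = \tfrac{1}{2}\sum_{i,j} w_{ij}$ is the total edge weight, and $\delta(c_i, c_j)$ is 1 if nodes i and j are in the same community and 0 otherwise.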
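As a worked illustration of the modularity measure, the following sketch evaluates Q for a fixed partition of a small graph. It computes modularity only and does not implement the Louvain optimization itself; the function name `modularity`, the toy graph, and the use of the total edge weight (called W in the comments) are illustrative assumptions.

```python
def modularity(weights, community):
    """Modularity Q of a partition of an undirected weighted graph.

    weights:   symmetric matrix, weights[i][j] = weight of edge i-j (0 if absent)
    community: community[i] = community label of node i
    """
    n = len(weights)
    degree = [sum(row) for row in weights]   # weighted degree k_i
    two_w = sum(degree)                      # 2W: every edge is counted twice
    q = 0.0
    for i in range(n):
        for j in range(n):
            if community[i] == community[j]:
                q += weights[i][j] - degree[i] * degree[j] / two_w
    return q / two_w


# Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0.0] * 6 for _ in range(6)]
for a, b in edges:
    A[a][b] = A[b][a] = 1.0

q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(A, [0] * 6)             # all nodes in one community
print(q_split, q_single)
```

For this graph the two-triangle partition gives Q = 5/14, while placing every node in a single community gives Q = 0, so the measure rewards the partition that matches the visible group structure.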
Assumptions and theoretical constraints
The proposed hybrid system relies on certain theoretical assumptions under which the SVD and Louvain algorithms work appropriately:
- Low-Rank Assumption: The user-item interaction matrix is assumed to be low-rank ($\mathrm{rank}(R) \ll \min(m, n)$), meaning that user preferences can be well modeled by a few latent variables.
- Homophily Assumption: The Louvain method is premised on users within the same community having statistically more similar taste preferences than users outside the community.
- Identifiability with SVD: To achieve unique reconstruction of the rating matrix (up to permutation), the singular values are assumed to be distinct, and the intra-community sub-matrices, while not fully dense, are assumed to be sufficiently connected to ensure convergence during Stochastic Gradient Descent (SGD).
Computational complexity analysis
One of the main strengths of the proposed strategy is scalability. The overall complexity comprises two parts:
- Louvain Algorithm: Community detection has a time bound of $O(N \log N)$, where N is the number of users, which makes it efficient on sparse graphs.
- Local SVD: Standard global SVD has a complexity of $O(p \cdot |R| \cdot I)$, where p is the number of latent dimensions, |R| is the number of ratings, and I is the number of iterations. By partitioning users into C communities, C independent SVDs are performed on smaller matrices. While the total number of operations remains proportional, the convergence time is reduced because the sub-matrices are denser and more homogeneous, requiring fewer iterations I to minimize the error.
Compared with GNN-based methods (e.g., NGCF), the proposed method is much lighter in terms of memory and floating-point operations.
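The claim that community sub-matrices are denser than the global matrix can be made concrete with a small sketch. It assumes that each community's sub-matrix is restricted to the items rated by at least one of its members; the source does not state how item columns are selected, so this restriction, the helper names `density` and `rated_items`, and the toy data are all hypothetical.

```python
def density(R, users, items):
    """Fraction of (user, item) cells in the block that hold a rating (non-zero)."""
    rated = sum(1 for u in users for i in items if R[u][i] != 0)
    return rated / (len(users) * len(items))


def rated_items(R, users):
    """Items rated by at least one user of the group (assumed column selection)."""
    return [i for i in range(len(R[0])) if any(R[u][i] != 0 for u in users)]


# Toy rating matrix: 4 users x 6 items, 0 means "not rated".
R = [
    [5, 4, 0, 0, 0, 0],
    [4, 0, 5, 0, 0, 0],
    [0, 0, 0, 3, 4, 0],
    [0, 0, 0, 0, 5, 4],
]
communities = [[0, 1], [2, 3]]   # hypothetical community assignment

global_density = density(R, range(4), range(6))
local_densities = [density(R, c, rated_items(R, c)) for c in communities]
print(global_density, local_densities)
```

In this toy case the global matrix is one-third full, while each community block is two-thirds full, which is the kind of densification the complexity argument relies on.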
Proposed method
The proposed hybrid recommendation system combines the Singular Value Decomposition algorithm with Louvain community detection to capture group-level dependencies while addressing sparsity, as shown in Fig 1. The first stage establishes the user-item interaction matrix R. The next stage constructs a user similarity graph that represents the relationships between users based on their interaction profiles. The similarity of two users, x and y, is computed as the cosine similarity of their rating vectors:

$$\mathrm{sim}(x, y) = \frac{\mathbf{r}_x \cdot \mathbf{r}_y}{\lVert \mathbf{r}_x \rVert \, \lVert \mathbf{r}_y \rVert}$$

where $\mathbf{r}_x$ and $\mathbf{r}_y$ are the rating vectors (rows of R) of users x and y. Louvain community detection is then applied to this graph.
Once communities are identified, SVD is applied within each community. For each community c, a sub-matrix $R_c$ is extracted. SVD is then performed on $R_c$ to decompose it:

$$R_c \approx U_c \Sigma_c V_c^{T}$$
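The graph-construction and sub-matrix extraction steps of this pipeline can be sketched as follows. The similarity threshold used to decide which edges enter the graph is an assumption, since the source does not state one, and the community membership passed to `community_submatrix` is written by hand as a stand-in for the output of Louvain detection. All function names are hypothetical.

```python
from math import sqrt


def cosine_similarity(rx, ry):
    """Cosine similarity of two rating vectors (0 denotes 'not rated')."""
    dot = sum(a * b for a, b in zip(rx, ry))
    norm_x = sqrt(sum(a * a for a in rx))
    norm_y = sqrt(sum(b * b for b in ry))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0                      # a user with no ratings is similar to no one
    return dot / (norm_x * norm_y)


def similarity_graph(R, threshold):
    """Weighted edges (x, y, sim) for user pairs whose similarity exceeds threshold."""
    edges = []
    for x in range(len(R)):
        for y in range(x + 1, len(R)):
            s = cosine_similarity(R[x], R[y])
            if s > threshold:           # threshold is an assumed design choice
                edges.append((x, y, s))
    return edges


def community_submatrix(R, members):
    """Rows of R belonging to one community: the sub-matrix R_c."""
    return [R[u] for u in members]


R = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 5, 4],
    [0, 1, 4, 5],
]
edges = similarity_graph(R, threshold=0.5)
R_c = community_submatrix(R, [0, 1])    # stand-in for a Louvain-detected community
print(edges, R_c)
```

With this toy matrix only the pairs (0, 1) and (2, 3) are connected, so a community detection step run on the resulting graph would have two clearly separated groups to recover.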
Hyperparameter tuning
To ensure optimal performance (Equations 4 and 5), a Grid Search strategy was employed for hyperparameter optimization.
- Learning Rate: Values were explored in the range {0.001, 0.005, 0.01, 0.02}. Higher learning rates led to divergent loss, while lower ones resulted in slow convergence. The optimal value was identified as 0.005.
- Regularization: Several regularization strengths were tested. Values lower than 0.02 caused overfitting on the training communities, while larger values reduced predictive accuracy by penalizing latent features too heavily. The best-performing value was selected for the reported results.
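A grid search of this kind can be sketched as below. The sketch assumes a regularized SGD matrix factorization in the style commonly used for rating prediction, which the source's references to SGD, learning rate, and regularization suggest but do not spell out. The learning-rate grid is the one stated in the text; the regularization grid, the toy data, and the names `train_mf` and `rmse` are hypothetical.

```python
import random
from math import sqrt


def train_mf(ratings, n_users, n_items, k, lr, reg, epochs, seed=0):
    """Fit user factors P and item factors Q by SGD on observed ratings."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)   # gradient step with L2 penalty
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root-mean-square error of the factor model on (user, item, rating) triples."""
    k = len(P[0])
    se = sum((r - sum(P[u][f] * Q[i][f] for f in range(k))) ** 2
             for u, i, r in ratings)
    return sqrt(se / len(ratings))


# Toy train/validation split (illustrative values only).
train = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 1, 5), (2, 2, 5),
         (2, 3, 4), (3, 2, 4), (3, 3, 5), (0, 2, 1), (3, 0, 2)]
valid = [(1, 2, 1), (2, 0, 2)]

learning_rates = [0.001, 0.005, 0.01, 0.02]   # grid stated in the text
regularizations = [0.01, 0.02, 0.05]          # hypothetical grid

results = {}
for lr in learning_rates:
    for reg in regularizations:
        P, Q = train_mf(train, 4, 4, k=2, lr=lr, reg=reg, epochs=200)
        results[(lr, reg)] = rmse(valid, P, Q)

best = min(results, key=results.get)          # (lr, reg) with lowest validation RMSE
print(best, results[best])
```

Selecting on a held-out validation set rather than on the training error is what lets the search detect the overfitting that weak regularization produces.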
Results and discussion
The framework was evaluated on the Netflix Prize dataset. The dataset contains more than 24 million ratings from 470,758 users (Table 1).
Fig 2 shows that the distribution of ratings is long-tailed, which confirms the sparsity of the data and motivates the community-based approach.
A random selection of 100 users and 50 movies was used for a detailed analysis. Louvain community detection identified six distinct clusters within this sample.
Fig 3 illustrates the network topology, with Community 0 shown as more densely connected than Community 2, which is sparsely connected. This visual distinction supports the effectiveness of the algorithm in separating distinct preference groups.
Fig 4 presents the distribution of users across communities. Community 0 has the largest number of users (25), which can be interpreted as the prevailing preference pattern in the sample, whereas the other communities represent niche interests.
Fig 5 shows a heatmap of the community-movie matrix. For example, films that are highly rated in Community 1 have clearly lower affinity in Community 3, confirming the preference segregation obtained by the model (Table 2).
Comparative analysis and discussion
To assess the significance of the obtained RMSE (0.9966), a comparison was made against standard values reported in the literature. The hybrid SVD-Louvain approach outperforms standard memory-based CF and performs competitively with basic matrix factorization baselines.
Fig 6 visualizes the performance gap. Although GNN-based methods achieve a slightly lower RMSE (approximately 0.92) [17], this difference arises because the proposed method prioritizes interpretability (clear community-based explanations) and computational efficiency (local factorization on denser sub-matrices) over the marginal accuracy gains of black-box GNN architectures, which require substantially more resources on sparse datasets such as Netflix. The proposed method (RMSE 0.9966) also significantly outperforms isolated SVD (typically 1.05) [6] while retaining strong interpretability. This trade-off is quantified in Table 3.
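For reference, the two error measures used in this comparison can be computed as in the following sketch. The function names and the sample ratings are illustrative only and are not taken from the experiments.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root-mean-square error: penalizes large deviations more heavily."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error: average magnitude of the deviations."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [4.0, 3.0, 5.0, 2.0]       # illustrative ratings
predicted = [3.0, 3.0, 4.0, 4.0]    # illustrative predictions

print(rmse(actual, predicted), mae(actual, predicted))
```

RMSE is never smaller than MAE on the same predictions, which is consistent with the reported pair of 0.9966 and 0.7968.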
Discussion on diversity and personalization
Although the key measures are error-based (RMSE/MAE), the hybrid architecture implicitly increases the diversity of recommendations. Localizing collaborative filtering to communities reduces popularity bias, in which globally popular items crowd out less popular ones. In smaller, niche communities, such as “Indie Horror Fans”, users get recommendations tailored to the interests of their group instead of generic blockbusters, which enhances catalog coverage and the diversity of personalization compared with a global SVD model.
Conclusion
In this paper, a hybrid recommendation system was developed and validated by combining the Louvain community detection algorithm with Singular Value Decomposition (SVD). The data sparsity inherent in global matrix factorization was addressed by splitting the user-item interaction space into modularity-based communities. In experiments, this community-centered factorization achieved a competitive RMSE of 0.9966. Most importantly, applying SVD within localized groups of users preserved predictive power while greatly improving interpretability compared with black-box deep learning alternatives.
Despite these benefits, the proposed approach has limitations. The use of pre-computed communities implies that handling incoming users in real time (cold-start) depends on periodic re-clustering, which could be computationally expensive. Future studies should consider incorporating temporal dynamics to model how users' community memberships change over time. Moreover, information from user reviews or social network links could be introduced, which would likely refine the community detection process. It is further suggested that this hybrid architecture be tested on a distributed computing framework to determine the feasibility of real-time recommendation at scale on ultra-large e-commerce platforms.
References
- 1. Zhang J, Fei J, Song X, Feng J. An Improved Louvain Algorithm for Community Detection. In: 2021. https://doi.org/10.48550/arXiv.2110.00891
- 2. Zhao Y, Zhou C, Cao J. Building a Movie Recommendation System Using SVD Algorithm. Scientific Reports. 2024;14:3853.
- 3. Chen L, Zhang W. Movie recommendation system based on collaborative filtering and matrix factorization. Journal of Emerging Technologies and Innovative Research. 2023.
- 4. Choudhary C, Singh I, Kumar M. Community detection algorithms for recommendation systems: techniques and metrics. Computing. 2022;105(2):417–53.
- 5. Soman ST, Soumya SVJ, Soman KP. Singular Value Decomposition: A Classroom Approach. International Journal of Recent Trends in Engineering. 2009;2(1).
- 6. Koren Y, Bell R, Volinsky C. Matrix Factorization Techniques for Recommender Systems. Computer. 2009;42(8):30–7.
- 7. Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech. 2008;2008(10):P10008.
- 8. Jothilakshmi SL, Bharathi R. Survey on Collaborative Filtering Technique for Recommender System Using Deep Learning. Lecture Notes in Electrical Engineering. Springer Nature Singapore. 2023. p. 217–25. https://doi.org/10.1007/978-981-19-7169-3_20
- 9. Gao C, Zheng Y, Li N. A survey of graph neural networks for recommender systems: challenges, methods, and directions. arXiv preprint. 2022. https://doi.org/10.48550/arXiv.2109.12843
- 10. Zhu H, Kapoor V, Sharma P. Reviewing developments of graph convolutional network techniques for recommendation systems. 2023. https://doi.org/10.48550/arXiv.2311.06323
- 11. Gasparetti F, Sansonetti G, Micarelli A. Community Detection in Social Recommender Systems: A Survey. Applied Intelligence. 2021;51(6):3975–95.
- 12. Li M, Zhao L, Ren Y. Graph Neural Network Recommendation Algorithm Based on Improved Dual-Tower Model. Gels. 2024;11(2):141.
- 13. Wu S, Sun F, Zhang W. Graph Neural Networks in Recommender Systems: A Survey. ACM Computing Surveys. 2022;55(7):1–37.
- 14. Hamilton WL, Ying R, Leskovec J. Inductive Representation Learning on Large Graphs. In: Advances in Neural Information Processing Systems (NeurIPS), 2017. https://doi.org/arXiv:1706.02216
- 15. Mohammadi M, Fazlali M, Hosseinzadeh M. Parallel Louvain Community Detection Algorithm Based on Dynamic Thread Assignment on Graphic Processing Unit. Journal of Electrical and Computer Engineering Innovations (JECEI). 2022;10(1):75–88.
- 16. Yang L, Wang Y, Tang J. Graph Neural Networks in Recommender Systems: A Survey. 2020. https://doi.org/10.48550/arXiv.2011.02260
- 17. Sharma K, Lee YC, Nambi S, Park Y. A survey of graph neural networks for social recommender systems. ACM Computing Surveys. 2024. https://doi.org/10.1145/3637528