Computer and Information Sciences - Theses
Browsing Computer and Information Sciences - Theses by Author "Ayangleima Laishram"
Metaheuristics for Collaborative Filtering in Recommender Systems (University of Hyderabad, 2021-06-29) Ayangleima Laishram; Vineet Padmanabhan
In the current internet-based era it has become imperative to advance technology so that the preferences of individual users can be learned from existing data and recommendations made on unseen data, such that the user is satisfied with the recommended items to a large extent. Recommender systems technology has been put forward with this idea in mind, and several multinationals make use of this paradigm to expand their business initiatives. In this thesis we focus mainly on devising methods that can improve recommendation as well as prediction accuracy in collaborative filtering (CF) based recommender systems. To this end we propose a variety of algorithms in which metaheuristic techniques are combined with matrix factorisation methods, and the combined framework is tested on the two main approaches to collaborative filtering in recommender systems, namely model-based and neighbourhood-based collaborative filtering. In the case of model-based collaborative filtering we demonstrate how metaheuristic techniques such as Particle Swarm Optimisation (PSO) and Genetic Algorithm (GA) can be combined with matrix factorisation techniques such as Maximum Margin Matrix Factorisation (MMMF). Metaheuristic algorithms such as PSO and GA are exploratory in nature, and they complement the exploitative nature of the gradient descent used in traditional model-based collaborative filtering techniques such as MMMF. Gradient descent on its own may get trapped in local optima, which is why we employ metaheuristic techniques. Our algorithm starts from multiple initial points and uses both gradient information and swarm search as the search progresses. We show that this process yields an efficient search scheme for reaching a near-optimal point for maximum margin matrix factorisation.
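The idea of starting from several initial points and mixing gradient information with swarm search can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: for brevity it replaces the hinge-loss objective of MMMF with a plain regularised squared-error factorisation, and every name (`loss_and_grad`, `hybrid_pso_mf`), every hyperparameter value and the toy ratings are assumptions made for illustration only.

```python
import random


def loss_and_grad(x, ratings, n_users, k, reg):
    """Regularised squared error of a factorisation and its gradient.

    x is a flat list holding the user factors first, then the item factors.
    NOTE: squared error is an assumed stand-in for the MMMF hinge loss.
    """
    grad = [2.0 * reg * xi for xi in x]
    loss = reg * sum(xi * xi for xi in x)
    off = n_users * k  # where the item factors start inside x
    for u, i, r in ratings:
        pu = x[u * k:(u + 1) * k]
        qi = x[off + i * k:off + (i + 1) * k]
        err = sum(a * b for a, b in zip(pu, qi)) - r
        loss += err * err
        for f in range(k):
            grad[u * k + f] += 2.0 * err * qi[f]
            grad[off + i * k + f] += 2.0 * err * pu[f]
    return loss, grad


def hybrid_pso_mf(ratings, n_users, n_items, k=2, n_particles=8, iters=200,
                  w=0.5, c1=1.0, c2=1.0, lr=0.01, reg=0.01, seed=0):
    """PSO over factor matrices with a gradient term added to the velocity."""
    rng = random.Random(seed)
    dim = (n_users + n_items) * k
    # Multiple random starting points, one per particle.
    xs = [[rng.uniform(-0.5, 0.5) for _ in range(dim)]
          for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(x) for x in xs]
    pbest_loss = [loss_and_grad(x, ratings, n_users, k, reg)[0] for x in xs]
    g = min(range(n_particles), key=lambda p: pbest_loss[p])
    gbest, gbest_loss = list(pbest[g]), pbest_loss[g]
    initial_loss = gbest_loss
    for _ in range(iters):
        for p in range(n_particles):
            _, grad = loss_and_grad(xs[p], ratings, n_users, k, reg)
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Swarm terms explore; the gradient term exploits locally.
                vs[p][d] = (w * vs[p][d]
                            + c1 * r1 * (pbest[p][d] - xs[p][d])
                            + c2 * r2 * (gbest[d] - xs[p][d])
                            - lr * grad[d])
                xs[p][d] += vs[p][d]
            cur, _ = loss_and_grad(xs[p], ratings, n_users, k, reg)
            if cur < pbest_loss[p]:
                pbest[p], pbest_loss[p] = list(xs[p]), cur
                if cur < gbest_loss:
                    gbest, gbest_loss = list(xs[p]), cur
    return gbest, gbest_loss, initial_loss


# Toy data (hypothetical): (user, item, rating) triples.
toy_ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
factors, final_loss, start_loss = hybrid_pso_mf(toy_ratings, n_users=3, n_items=3)
print("best initial loss:", start_loss, "best final loss:", final_loss)
```

Because the global best is only replaced by a strictly better particle, the reported final loss can never exceed the best initial loss; how much it improves depends on the assumed hyperparameters.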
Our experimental results on benchmark datasets demonstrate that when the exploration capability of population-based search algorithms is combined with the gradient search direction of MMMF, the proposed models achieve better accuracy, as evidenced by the resulting RMSE and MAE values. We extended the neighbourhood-based collaborative filtering technique by adopting the concept of discovering highly correlated user-item subgroups. Our proposed user-item-subgroup based collaborative filtering can be constructed in two ways, namely through a two-step approach and through a fuzzy c-means clustering approach. In the two-step approach, we proposed different algorithms to identify the highly correlated user-item subgroups, and then used the least squares method to predict the missing ratings from the rating information of those subgroups. In the fuzzy c-means clustering based user-item subgroup algorithm, the highly correlated user-item subgroups are discovered in one step. We optimized the initialization of centroids in fuzzy c-means by using particle swarm optimization in order to discover highly correlated user-item subgroups in CF accurately. We observed that our proposed algorithms in the two-step approach outperform all the CF models under comparison on all benchmark datasets. Our findings in terms of MAP suggest that the correlation of the subgroups discovered by the GA, which evaluates fitness by calculating mean squared residue and row variance, is significant, though it is less effective on smaller datasets. In the case of the fuzzy c-means approach, the metaheuristic optimization algorithm acts as a booster that improves fuzzy c-means clustering in discovering highly correlated user-item subgroups by initializing the cluster centroids to near-optimal solutions.
Our experimental results have shown a promising way of making use of user-item subgroups in helping to capture highly similar user preferences on a subset of items.
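The two quantities named in the abstract as the basis of the GA fitness, mean squared residue and row variance, can be sketched for a single candidate user-item subgroup as follows. This is a hedged illustration using the standard biclustering definitions; the function name `msr_and_row_variance`, the assumption that the selected submatrix is fully observed, and the toy matrix are all hypothetical, and the way the thesis combines the two values into a fitness score is not stated here, so it is not reproduced.

```python
def msr_and_row_variance(matrix, rows, cols):
    """Mean squared residue and row variance of the submatrix rows x cols.

    Assumes every selected entry is observed (no missing ratings).
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_r, n_c = len(rows), len(cols)
    row_mean = [sum(r) / n_c for r in sub]
    col_mean = [sum(sub[a][b] for a in range(n_r)) / n_r for b in range(n_c)]
    all_mean = sum(row_mean) / n_r
    msr = 0.0
    row_var = 0.0
    for a in range(n_r):
        for b in range(n_c):
            # Residue: what is left after removing row, column and overall effects.
            residue = sub[a][b] - row_mean[a] - col_mean[b] + all_mean
            msr += residue * residue
            row_var += (sub[a][b] - row_mean[a]) ** 2
    size = n_r * n_c
    return msr / size, row_var / size


# Toy ratings (hypothetical): each user rates with the same pattern, shifted by
# a constant, so the subgroup is perfectly coherent.
toy = [[1, 2, 3],
       [2, 3, 4],
       [4, 5, 6]]
msr, row_var = msr_and_row_variance(toy, rows=[0, 1, 2], cols=[0, 1, 2])
print(msr, row_var)
```

A low mean squared residue indicates that the selected users rate the selected items coherently, while a non-zero row variance rules out trivial subgroups in which every rating is the same.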