Metaheuristics for Collaborative Filtering in Recommender Systems
Date
2021-06-29
Authors
Ayangleima Laishram
Publisher
University of Hyderabad
Abstract
In the current internet-based era, it has become imperative to advance
technology so that the preferences of individual users can be
learned from existing data and recommendations made on unseen
data, such that the user is satisfied with the recommended items to a
large extent. Recommender systems technology has been put forward
with this idea in mind, and several multinationals make use of this
paradigm to expand their business initiatives. In this thesis we focus
mainly on devising methods that improve recommendation as well
as prediction accuracy in collaborative filtering (CF) based recommender
systems. To this end we propose a variety of algorithms in which
metaheuristic techniques are combined with matrix factorisation methods,
and the combined framework is tested on the two main approaches to
collaborative filtering in recommender systems, namely model-based and
neighbourhood-based collaborative filtering.
In the case of model-based collaborative filtering, we demonstrate how
metaheuristic techniques such as Particle Swarm Optimisation (PSO) and
Genetic Algorithm (GA) can be combined with matrix factorisation techniques
such as Maximum Margin Matrix Factorisation (MMMF). Metaheuristic
algorithms such as PSO and GA are exploratory in nature, and they
complement the exploitative nature of the gradient descent used in
traditional model-based collaborative filtering techniques such as MMMF.
Gradient descent may get trapped in local optima, which is why we employ
metaheuristic techniques. Our algorithm starts from multiple initial points
and uses both gradient information and swarm search as the search progresses.
We show that this process yields an efficient search scheme for reaching a
near-optimal point for maximum margin matrix factorisation. Our experimental
results on benchmark datasets demonstrate that when the exploration
capability of population-based search algorithms is combined with the gradient
search direction of MMMF, the proposed models achieve
better accuracy, as evidenced by the resulting RMSE and MAE
values.
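The idea of starting from multiple initial points and mixing gradient information with swarm search can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes a plain regularised squared-error factorisation in place of the MMMF hinge loss, and the function names, the way the gradient term is added to the PSO velocity, and all hyperparameter values (w, c1, c2, lr, reg) are illustrative assumptions.

```python
import random

def loss_and_grad(x, ratings, n_users, k, reg):
    """Regularised squared error and its gradient for flat parameters x.

    x stores user factors first, then item factors, k values per row.
    Assumption: squared loss stands in for the MMMF hinge loss.
    """
    grad = [2.0 * reg * p for p in x]
    loss = reg * sum(p * p for p in x)
    off = n_users * k  # item factors start after the user factors
    for u, i, r in ratings:
        pu = [x[u * k + f] for f in range(k)]
        qi = [x[off + i * k + f] for f in range(k)]
        err = sum(a * b for a, b in zip(pu, qi)) - r
        loss += err * err
        for f in range(k):
            grad[u * k + f] += 2.0 * err * qi[f]
            grad[off + i * k + f] += 2.0 * err * pu[f]
    return loss, grad

def hybrid_pso_mf(ratings, n_users, n_items, k=2, n_particles=8, iters=200,
                  w=0.5, c1=1.0, c2=1.0, lr=0.01, reg=0.01, seed=0):
    """Hypothetical hybrid: PSO velocity update plus a gradient descent term."""
    rng = random.Random(seed)
    dim = (n_users + n_items) * k

    def evaluate(x):
        return loss_and_grad(x, ratings, n_users, k, reg)

    # Multiple random starting points, one per particle.
    pos = [[rng.uniform(-0.5, 0.5) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_loss = [evaluate(p)[0] for p in pos]
    g = min(range(n_particles), key=pbest_loss.__getitem__)
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]
    history = [gbest_loss]

    for _ in range(iters):
        for n in range(n_particles):
            _, grad = evaluate(pos[n])
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[n][d] = (w * vel[n][d]
                             + c1 * r1 * (pbest[n][d] - pos[n][d])  # own memory
                             + c2 * r2 * (gbest[d] - pos[n][d])     # swarm pull
                             - lr * grad[d])                        # gradient step
                pos[n][d] += vel[n][d]
            cur, _ = evaluate(pos[n])
            if cur < pbest_loss[n]:
                pbest[n], pbest_loss[n] = pos[n][:], cur
                if cur < gbest_loss:
                    gbest, gbest_loss = pos[n][:], cur
        history.append(gbest_loss)
    return gbest, history

# Toy usage: (user, item, rating) triples for 3 users and 3 items.
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
best, history = hybrid_pso_mf(toy_ratings, n_users=3, n_items=3)
print("initial best loss:", history[0], "final best loss:", history[-1])
```

In this sketch the swarm terms provide exploration across particles while the gradient term supplies a local descent direction; the best loss recorded across the swarm can only stay the same or decrease from one iteration to the next.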
We extended the neighbourhood-based collaborative filtering technique
by adopting the concept of discovering highly correlated user-item subgroups.
Our proposed user-item-subgroup-based collaborative filtering
can be constructed in two ways, namely through a two-step
approach and through a fuzzy c-means clustering approach. In the two-step
approach, we proposed different algorithms to identify the highly correlated
user-item subgroups. We then used the least squares method to predict the
missing ratings from the rating information of the highly correlated
user-item subgroups. In the fuzzy c-means clustering based user-item subgroup
algorithm, the highly correlated user-item subgroups are discovered
in one step. We optimised the initialisation of centroids in fuzzy c-means
using particle swarm optimisation in order to discover highly
correlated user-item subgroups in CF accurately. We observed that our proposed algorithms
in the two-step approach outperform all the CF models under
comparison on all benchmark datasets. Our findings in terms of MAP
suggest that the correlation of the subgroups discovered by the GA that
evaluates fitness using mean squared residue and row variance is significant,
though its effectiveness is lower on smaller datasets. In the case of
the fuzzy c-means approach, the metaheuristic optimisation algorithm acts as
a booster that improves fuzzy c-means clustering in discovering highly
correlated user-item subgroups by initialising the cluster centroids
at near-optimal solutions. Our experimental results
show a promising way of using user-item subgroups
to capture highly similar user preferences on a subset of items.
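The two quantities named above as the basis of the GA fitness, mean squared residue and row variance, can be computed for a candidate user-item subgroup as in the sketch below. It uses their standard biclustering definitions and assumes a fully observed submatrix. The abstract does not say how the thesis handles missing ratings or how the two values are combined into a single fitness score, so the sketch returns them separately, and the function name is hypothetical.

```python
def msr_and_row_variance(matrix, rows, cols):
    """Mean squared residue and row variance of the submatrix rows x cols.

    Assumption: every selected entry is observed (no missing ratings).
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_r, n_c = len(rows), len(cols)
    row_mean = [sum(r) / n_c for r in sub]
    col_mean = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    all_mean = sum(row_mean) / n_r

    # Residue of an entry: what is left after removing row and column effects.
    msr = sum((sub[i][j] - row_mean[i] - col_mean[j] + all_mean) ** 2
              for i in range(n_r) for j in range(n_c)) / (n_r * n_c)
    # Row variance: how much entries vary around their own row mean.
    row_var = sum((sub[i][j] - row_mean[i]) ** 2
                  for i in range(n_r) for j in range(n_c)) / (n_r * n_c)
    return msr, row_var

# Toy usage: three users whose ratings differ only by a constant offset.
toy = [[1, 2, 3],
       [2, 3, 4],
       [4, 5, 6]]
msr, row_var = msr_and_row_variance(toy, rows=[0, 1, 2], cols=[0, 1, 2])
print("mean squared residue:", msr, "row variance:", row_var)
```

A low mean squared residue indicates that the selected users rate the selected items in a coherent pattern, while a non-trivial row variance rules out subgroups in which every rating is simply the same constant.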