| Title: | Collaborative Filtering Models for Recommendation Systems |
|---|---|
| Description: | Implements collaborative filtering methods for recommendation systems based on user-item interaction data. Supports both explicit feedback (ratings) and implicit feedback (consumption). The package uses efficient sparse matrix representations and provides incremental updates for users, items, and similarity structures through an R6 class-based architecture. See Aggarwal (2016) <doi:10.1007/978-3-319-29659-3> for an overview. |
| Authors: | Jessica Kubrusly [aut, cre] (ORCID: <https://orcid.org/0000-0003-0465-4629>), Thiago Lima [ctb], Lucas Oliveira [ctb], Caio Salviano [ctb] |
| Maintainer: | Jessica Kubrusly <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.1 |
| Built: | 2026-05-30 22:18:01 UTC |
| Source: | https://github.com/cran/CFilt |
CF is a class of objects that stores information about a recommendation system: the consumption or rating of each (user, item) pair in the utility matrix MU; the similarities between each pair of users in the matrix SU and between each pair of items in the matrix SI; the number of items consumed and/or rated by each user in the vector n_aval_u; the number of users who consumed and/or rated each item in the vector n_aval_i; the average rating given by each user in the vector averages_u; the average rating received by each item in the vector averages_i; the number of items in common for each pair of users in the matrix IntU; and the number of users in common for each pair of items in the matrix IntI. Methods such as addnewemptyuser, addnewemptyitem, newrating, deleteuser, deleteitem and deleterating modify the object by altering users, items, or consumption data. The companion functions kclosestitems, topkitems and topkusers return the items most similar to a given item, the items to recommend to a user, or the users to whom an item should be recommended. An object of the CF class is created with the CFbuilder function.
This class implements a collaborative filtering system supporting both explicit (ratings) and implicit (consumption) feedback. The internal state is updated incrementally after each operation.
MU: The utility matrix, which contains all the users' ratings. Rows correspond to users and columns to items.
SU: The user similarity matrix.
SI: The item similarity matrix.
IntU: A symmetric matrix that records the number of items in common between pairs of users.
IntI: A symmetric matrix that records the number of users in common between pairs of items.
averages_u: A vector that contains the averages of users' ratings.
averages_i: A vector that contains the averages of items' ratings.
n_aval_u: A vector that stores the number of items rated by each user.
n_aval_i: A vector that stores the number of users who consumed each item.
datatype: A character string that indicates the type of data, which can be either "consumption" or "rating".
similarity: A character string indicating the similarity measure used. It can be "pearson" or "cosine" for rating data, and "jaccard" for consumption data.
data_0: A data.frame containing the original dataset used to build the object. This corresponds to the input data provided to CFbuilder and is stored for reference.
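As a rough illustration of how these fields relate to one another, the base R sketch below derives them from a small rating table. The toy data and the local variable names are made up for the example, and dense matrices stand in for the sparse representations mentioned in the package description; this is not the package's internal code.

```r
# Toy rating data: user, item, rating (hypothetical values)
toy <- data.frame(
  user   = c("u1", "u1", "u2", "u2", "u3"),
  item   = c("A",  "B",  "A",  "C",  "B"),
  rating = c(5,    3,    4,    2,    1),
  stringsAsFactors = FALSE
)

users <- sort(unique(toy$user))
items <- sort(unique(toy$item))

# Utility matrix: users in rows, items in columns, NA = not rated
MU <- matrix(NA_real_, nrow = length(users), ncol = length(items),
             dimnames = list(users, items))
MU[cbind(toy$user, toy$item)] <- toy$rating

# 1 where a rating was observed, 0 otherwise
rated <- (!is.na(MU)) * 1

n_aval_u   <- rowSums(rated)              # items rated by each user
n_aval_i   <- colSums(rated)              # users who rated each item
averages_u <- rowMeans(MU, na.rm = TRUE)  # average rating given by each user
averages_i <- colMeans(MU, na.rm = TRUE)  # average rating received by each item

# Co-occurrence counts
IntU <- rated %*% t(rated)   # items in common for each pair of users
IntI <- t(rated) %*% rated   # users in common for each pair of items

IntU["u1", "u2"]  # 1: both users rated item A
```

The diagonals of IntU and IntI in this sketch simply repeat n_aval_u and n_aval_i.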
CF$addnewemptyuser(): Add a new empty user to the system. This method creates a new user with no interactions.

    CF$addnewemptyuser(Id_u)

Id_u: A character string (or a list of strings) representing user ID(s).
Invisibly returns the updated object.
CF$addnewemptyitem(): Add a new empty item to the system.

    CF$addnewemptyitem(Id_i)

Id_i: A character string (or a list of strings) representing item ID(s).
CF$newrating(): Add a new rating or consumption.

    CF$newrating(Id_u, Id_i, r = NULL)

Id_u: A character string (or a list of strings) representing user ID(s).
Id_i: A character string (or a list of strings) representing item ID(s).
r: A numeric value (or a list of numeric values) giving the rating(s). Used only for rating data.
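The idea of updating a user's rating count and average incrementally, rather than recomputing them from scratch, can be sketched with running-mean arithmetic. The helper update_mean below is hypothetical and only illustrates the principle; the package's own update code may differ.

```r
# Hypothetical helper: update a running mean after one new rating arrives
update_mean <- function(old_mean, old_n, r) {
  if (old_n == 0) return(r)              # first rating: the mean is the rating itself
  (old_mean * old_n + r) / (old_n + 1)
}

# A user starts with no ratings, then rates four items: 5, 2, 3, 4
ratings <- c(5, 2, 3, 4)
n   <- 0
avg <- NA_real_
for (r in ratings) {
  avg <- update_mean(avg, n, r)
  n   <- n + 1
}

n                                    # number of ratings seen so far
isTRUE(all.equal(avg, mean(ratings)))  # the running mean agrees with the batch mean
```

Each update needs only the previous mean and count, so the cost does not depend on how many ratings the user already has.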
CF$deleteuser(): Delete a user from the system.

    CF$deleteuser(Id_u)

Id_u: A character string (or a list of strings) representing user ID(s).

CF$deleteitem(): Delete an item from the system.

    CF$deleteitem(Id_i)

Id_i: A character string (or a list of strings) representing item ID(s).

CF$deleterating(): Delete a rating or consumption.

    CF$deleterating(Id_u, Id_i)

Id_u: A character string (or a list of strings) representing user ID(s).
Id_i: A character string (or a list of strings) representing item ID(s).
CF$clone(): Objects of this class are cloneable with this method.

    CF$clone(deep = FALSE)

deep: Whether to make a deep clone.
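Because CF is an R6 class, its methods change the object in place, and plain assignment does not copy it. The sketch below uses a hypothetical toy class, Tally, to show why clone() is useful for keeping a snapshot before modifying an object; it assumes the R6 package is installed and does not use CFilt itself.

```r
library(R6)

# Hypothetical toy class, used only to show R6 copy behaviour
Tally <- R6Class("Tally",
  public = list(
    n = 0,
    add = function(by = 1) {
      self$n <- self$n + by
      invisible(self)   # returning self invisibly allows method chaining
    }
  )
)

a    <- Tally$new()
b    <- a            # plain assignment: b refers to the same object as a
snap <- a$clone()    # independent copy taken before the change

a$add()$add(2)       # modifies the object in place

a$n     # 3
b$n     # 3, because a and b are the same object
snap$n  # 0, the clone was not affected
```

In R6, a deep clone (deep = TRUE) additionally copies fields that are themselves R6 objects, which a shallow clone would share with the original.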
Jessica Kubrusly
Linden, G., Smith, B., & York, J. (2003). Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1), 76-80.
Aggarwal, C. C. (2016). Recommender systems (Vol. 1). Cham: Springer International Publishing.
Leskovec, J., Rajaraman, A., & Ullman, J. D. (2020). Mining of massive data sets. Cambridge University Press.

    data(movies, package = "CFilt")

    # --- Rating data ---
    objectCF_r <- CFbuilder(Data = movies[1:500, ], Datatype = "rating",
                            similarity = "pearson")
    dim(objectCF_r$MU)
    colnames(objectCF_r$MU)  # movie IDs
    rownames(objectCF_r$MU)  # user IDs
    dim(objectCF_r$SU)
    dim(objectCF_r$SI)
    objectCF_r$averages_u
    objectCF_r$averages_i
    objectCF_r$n_aval_u
    objectCF_r$n_aval_i

    objectCF_r$addnewemptyuser(Id_u = "newuser1")
    objectCF_r$newrating(Id_u = "newuser1", Id_i = "Frozen", r = 5)
    objectCF_r$MU["newuser1", "Frozen"]
    objectCF_r$newrating(Id_u = list("newuser1", "newuser1", "newuser1"),
                         Id_i = list("Thor: The Dark World", "The Lego Movie",
                                     "Despicable Me 2"),
                         r = list(2, 3, 4))
    objectCF_r$n_aval_u["newuser1"]
    objectCF_r$averages_u["newuser1"]

    objectCF_r$addnewemptyuser(Id_u = "newuser2")
    objectCF_r$newrating(Id_u = list("newuser2", "newuser1"),
                         Id_i = list("Frozen", "Her"),
                         r = c(2, 1))

    objectCF_r$addnewemptyuser(Id_u = list("newuser3", "newuser4"))
    objectCF_r$newrating(Id_u = list("newuser3", "newuser3", "newuser4", "newuser4"),
                         Id_i = list("The Lego Movie", "Wreck-It Ralph",
                                     "Fast & Furious 6", "12 Years a Slave"),
                         r = list(4, 5, 4, 2))

    objectCF_r$addnewemptyitem(Id_i = list("movie1", "movie2", "movie3"))
    objectCF_r$newrating(
      Id_u = list("newuser1", "newuser1", "newuser1",
                  "newuser2", "newuser2", "newuser2",
                  "newuser3", "newuser3", "newuser3",
                  "newuser4", "newuser4", "newuser4"),
      Id_i = list("movie1", "movie2", "movie3",
                  "movie1", "movie2", "movie3",
                  "movie1", "movie2", "movie3",
                  "movie1", "movie2", "movie3"),
      r = list(4, 5, 4, 2, 1, 2, 1, 1, 4, 3, 1, 2))
    objectCF_r$MU[, "movie1"]
    objectCF_r$SI["movie1", "movie2"]
    objectCF_r$SU["newuser2", "newuser4"]

    # --- Consumption data ---
    objectCF_c <- CFbuilder(Data = movies[1:300, -3], Datatype = "consumption",
                            similarity = "jaccard")
    objectCF_c$addnewemptyuser(Id_u = list("newuser1", "newuser2", "newuser3"))
    objectCF_c$newrating(Id_u = list("newuser1", "newuser2", "newuser3"),
                         Id_i = list("Frozen", "Frozen", "Frozen"))
    objectCF_c$newrating(Id_u = list("newuser1", "newuser1", "newuser1"),
                         Id_i = list("Gravity", "The Wolverine", "Iron Man 3"))
    objectCF_c$addnewemptyitem(Id_i = list("movie1", "movie2", "movie3"))
    objectCF_c$newrating(Id_u = list("newuser1", "newuser1", "newuser2",
                                     "newuser2", "newuser3"),
                         Id_i = list("movie1", "movie2", "movie1",
                                     "movie3", "movie3"))
    objectCF_c$MU[, "movie1"]
    objectCF_c$SI["movie1", "movie2"]
    objectCF_c$SI["movie1", "movie3"]
    objectCF_c$SI["movie2", "movie3"]
    objectCF_c$SU["newuser1", "newuser2"]
    objectCF_c$SU["newuser2", "newuser3"]

Creates an object of class CF from a dataset of user-item interactions.
The dataset can represent either explicit ratings or implicit consumption.

    CFbuilder(
      Data,
      Datatype = ifelse(ncol(Data) == 2, "consumption", "rating"),
      similarity = ifelse(Datatype == "consumption", "jaccard", "pearson")
    )

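| Argument | Description |
|---|---|
| Data | A data.frame containing the user IDs in the first column, the item IDs in the second column and, for rating data, the ratings in a third column. |
| Datatype | A character string indicating the type of data: "consumption" or "rating". Default is chosen based on the number of columns of Data. |
| similarity | A character string indicating the similarity measure: "pearson" or "cosine" for rating data, "jaccard" for consumption data. Default is chosen based on Datatype. |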
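For reference, the three similarity measures named above can be written in a few lines of base R. These are textbook definitions applied to toy vectors and restricted to co-rated items; the function names are hypothetical, and the package may treat details such as centering or missing values differently.

```r
# Ratings of two users over the same five items (NA = not rated)
x <- c(5, 3, NA, 4, 1)
y <- c(4, NA, 2, 5, 2)

both <- !is.na(x) & !is.na(y)   # items rated by both users

# Pearson correlation restricted to co-rated items
pearson_sim <- function(x, y, both) cor(x[both], y[both])

# Cosine similarity restricted to co-rated items
cosine_sim <- function(x, y, both) {
  sum(x[both] * y[both]) / sqrt(sum(x[both]^2) * sum(y[both]^2))
}

# Jaccard similarity on consumption sets: |intersection| / |union|
jaccard_sim <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

p <- pearson_sim(x, y, both)
s <- cosine_sim(x, y, both)
j <- jaccard_sim(c("Frozen", "Gravity", "Her"),
                 c("Frozen", "Her", "Thor"))   # 2 shared / 4 distinct = 0.5
```

Jaccard only needs to know which items were consumed, which is why it is the measure paired with consumption data, while Pearson and cosine use the rating values.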
An object of class CF.

    data(movies, package = "CFilt")

    # --- Rating data ---
    CF1  <- CFbuilder(Data = movies[1:300, ], Datatype = "rating",
                      similarity = "pearson")
    CF1_ <- CFbuilder(Data = movies[1:300, ])
    CF2  <- CFbuilder(Data = movies[1:300, ], Datatype = "rating",
                      similarity = "cosine")
    CF2_ <- CFbuilder(Data = movies[1:300, ], similarity = "cosine")

    # --- Consumption data ---
    CF3  <- CFbuilder(Data = movies[1:300, -3], Datatype = "consumption",
                      similarity = "jaccard")
    CF3_ <- CFbuilder(Data = movies[1:300, -3])

Returns the k most similar items to a given item based on the item-item similarity matrix of a collaborative filtering model.

    kclosestitems(CF, Id_i, k = 10)

| Argument | Description |
|---|---|
| CF | An object of class CF. |
| Id_i | A character string representing the item ID. |
| k | A positive integer indicating the number of similar items to return. Default is 10. |
The similarity between items is obtained from the item similarity matrix
stored in CF$SI. The item itself is excluded from the result.
A character vector containing the IDs of the k most similar items.
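The selection step described above (look up one row of the item similarity matrix, drop the item itself, keep the k largest values) can be sketched in base R. The matrix values and the helper k_closest are hypothetical and serve only to illustrate the logic.

```r
# Toy symmetric item-item similarity matrix (hypothetical values)
items <- c("A", "B", "C", "D")
SI <- matrix(c(1.0, 0.2, 0.8, 0.5,
               0.2, 1.0, 0.1, 0.9,
               0.8, 0.1, 1.0, 0.3,
               0.5, 0.9, 0.3, 1.0),
             nrow = 4, byrow = TRUE, dimnames = list(items, items))

# Hypothetical helper: IDs of the k items most similar to `id`
k_closest <- function(S, id, k = 10) {
  others <- setdiff(colnames(S), id)        # exclude the item itself
  sims   <- setNames(S[id, others], others)
  sims   <- sort(sims, decreasing = TRUE)   # most similar first
  names(sims)[seq_len(min(k, length(sims)))]
}

k_closest(SI, "A", k = 2)  # "C" "D"
```

When k exceeds the number of other items, this sketch simply returns all of them in decreasing order of similarity.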
CFbuilder, topkitems, topkusers

    data(movies, package = "CFilt")
    CF1 <- CFbuilder(movies[1:200, ], Datatype = "rating")

    # Find the 5 items most similar to a given item
    kclosestitems(CF1, Id_i = "Frozen", k = 5)

A dataset containing 7276 ratings for 50 movies by 526 users.
A data frame with 7276 rows and 3 variables:
User identifier. Numbers from 1 to 526.
Movie identifier. Movie names.
Movie ratings by users. Ratings from 1 to 5 (Likert scale).
Giglio (2014)
Returns the top-k items to recommend for a given user based on a collaborative filtering model.

    topkitems(CF, Id_u, k = 10, type = "user")

| Argument | Description |
|---|---|
| CF | An object of class CF. |
| Id_u | A character string representing the user ID. |
| k | A positive integer indicating the number of items to recommend. Default is 10. |
| type | A character string indicating the recommendation strategy: "user" (user-based) or "item" (item-based). Default is "user". |
For type = "user", recommendations are computed based on similarities
between users. For type = "item", recommendations are computed based
on similarities between items.
Only items not yet consumed/rated by the user are considered.
A character vector containing the IDs of the top-k recommended items.
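One common way to turn user similarities into recommendations is to score each unrated item by a similarity-weighted average of the ratings given by other users. The sketch below shows that formulation on toy data; the helper top_k_items_user and all values are hypothetical, and the scoring rule actually used by topkitems is not stated here and may differ.

```r
users <- c("u1", "u2", "u3")
items <- c("A", "B", "C", "D")

# Toy utility matrix (NA = not rated) and user similarity matrix, both hypothetical
MU <- matrix(c(5, NA, NA, NA,
               4,  2,  5, NA,
               1,  5, NA,  4),
             nrow = 3, byrow = TRUE, dimnames = list(users, items))
SU <- matrix(c(1.0, 0.9, 0.1,
               0.9, 1.0, 0.2,
               0.1, 0.2, 1.0),
             nrow = 3, byrow = TRUE, dimnames = list(users, users))

# Hypothetical helper: similarity-weighted average of the other users' ratings
top_k_items_user <- function(MU, SU, id_u, k = 10) {
  candidates <- colnames(MU)[is.na(MU[id_u, ])]   # items the user has not rated
  others     <- setdiff(rownames(MU), id_u)
  w          <- SU[id_u, others]
  scores <- vapply(candidates, function(j) {
    r    <- MU[others, j]
    seen <- !is.na(r)
    if (!any(seen)) return(NA_real_)              # nobody rated this item
    sum(w[seen] * r[seen]) / sum(abs(w[seen]))
  }, numeric(1))
  scores <- sort(scores, decreasing = TRUE)       # sort() drops NA scores
  names(scores)[seq_len(min(k, length(scores)))]
}

top_k_items_user(MU, SU, "u1", k = 2)  # "C" "D"
```

Here item C ranks first because the only user who rated it, u2, is very similar to u1 and gave it a 5; item A is never returned because u1 has already rated it.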

    data(movies, package = "CFilt")
    CF1 <- CFbuilder(movies[1:200, ], Datatype = "rating")

    # Recommend items for a user using user-based CF
    topkitems(CF1, Id_u = "1", k = 5, type = "user")
    # Recommend items for a user using item-based CF
    topkitems(CF1, Id_u = "1", k = 3, type = "item")

    CF2 <- CFbuilder(movies[1:200, -3])

    # Recommend items for a user using user-based CF
    topkitems(CF2, Id_u = "1", k = 5, type = "user")
    # Recommend items for a user using item-based CF
    topkitems(CF2, Id_u = "1", k = 3, type = "item")

Returns the top-k users to whom a given item should be recommended, based on a collaborative filtering model.

    topkusers(CF, Id_i, k = 10, type = "user")

| Argument | Description |
|---|---|
| CF | An object of class CF. |
| Id_i | A character string representing the item ID. |
| k | A positive integer indicating the number of users to return. Default is 10. |
| type | A character string indicating the recommendation strategy: "user" (user-based) or "item" (item-based). Default is "user". |
For type = "user", recommendations are based on similarities
between users. For type = "item", recommendations are based on
similarities between items.
Only users who have not yet consumed/rated the item are considered.
A character vector containing the IDs of the top-k users for whom the item is recommended.
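For consumption data, an item-based variant of this idea can be sketched by scoring each user who has not consumed the target item with the total similarity between that item and the items the user did consume. The helper top_k_users_item and the toy matrices are hypothetical; this is one plausible scoring rule, not necessarily the one implemented by topkusers.

```r
users <- c("u1", "u2", "u3", "u4")
items <- c("A", "B", "C")

# Toy binary consumption matrix (1 = consumed) and item similarities, both hypothetical
MU <- matrix(c(1, 1, 0,
               0, 1, 0,
               0, 0, 1,
               0, 1, 1),
             nrow = 4, byrow = TRUE, dimnames = list(users, items))
SI <- matrix(c(1.0, 0.6, 0.1,
               0.6, 1.0, 0.3,
               0.1, 0.3, 1.0),
             nrow = 3, byrow = TRUE, dimnames = list(items, items))

# Hypothetical helper: rank users by summed similarity to the target item
top_k_users_item <- function(MU, SI, id_i, k = 10) {
  candidates <- rownames(MU)[MU[, id_i] == 0]   # users who have not consumed the item
  others     <- setdiff(colnames(MU), id_i)
  raw        <- MU[candidates, others, drop = FALSE] %*% SI[others, id_i]
  scores     <- setNames(as.vector(raw), candidates)
  scores     <- sort(scores, decreasing = TRUE)
  names(scores)[seq_len(min(k, length(scores)))]
}

top_k_users_item(MU, SI, "A", k = 2)  # "u4" "u2"
```

User u1 is excluded because they already consumed item A; u4 ranks first because both of their consumed items contribute similarity to A.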

    data(movies, package = "CFilt")
    CF1 <- CFbuilder(movies[1:200, ], Datatype = "rating")

    # Recommend users for an item using user-based CF
    topkusers(CF1, Id_i = "Frozen", k = 5, type = "user")
    # Recommend users for an item using item-based CF
    topkusers(CF1, Id_i = "Frozen", k = 5, type = "item")

    CF2 <- CFbuilder(movies[1:200, -3])

    # Recommend users for an item using user-based CF
    topkusers(CF2, Id_i = "Frozen", k = 5, type = "user")
    # Recommend users for an item using item-based CF
    topkusers(CF2, Id_i = "Frozen", k = 3, type = "item")
