openrec.legacy.utils.implicit_dataset module
class openrec.legacy.utils.implicit_dataset.ImplicitDataset(raw_data, max_user, max_item, name='dataset')
Bases: openrec.legacy.utils.dataset.Dataset
The ImplicitDataset class stores and parses a sequence of user implicit feedback for training or evaluation. It extends the functionality of the Dataset class.
Parameters:
- raw_data (numpy structured array) – Input raw data. Other legacy formats (e.g., sparse matrix) are supported but not recommended.
- max_user (int) – Maximum number of users in the recommendation system.
- max_item (int) – Maximum number of items in the recommendation system.
- name (str) – Name of the dataset.
Notes
The ImplicitDataset class parses the input raw_data into structured dictionaries (consumed by samplers or the model trainer). This class expects raw_data as a numpy structured array, where each row represents a data point and contains at least two keys:
- user_id: the user involved in the interaction.
- item_id: the item involved in the interaction.
raw_data might contain other keys, such as timestamp and location, depending on the use case of the recommendation system. Each user should be uniquely and numerically indexed from 0 to total_number_of_users - 1. The items should be indexed likewise.
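The input format described in the notes above can be sketched with plain NumPy. This is a minimal illustration, not code from OpenRec: the identifiers (`u42`, `song_a`, and so on) and the timestamps are hypothetical, and setting max_user and max_item to the number of distinct ids is an assumption that fits the 0 to total_number_of_users - 1 indexing rule.

```python
import numpy as np

# Hypothetical raw log with arbitrary (non-contiguous) identifiers.
raw_users = np.array(["u42", "u7", "u42", "u99"])
raw_items = np.array(["song_b", "song_a", "song_c", "song_b"])
timestamps = np.array([1500000000, 1500000100, 1500000200, 1500000300])

# return_inverse=True maps every identifier to its position in the sorted
# array of unique values, which yields contiguous indices starting at 0.
user_vocab, user_idx = np.unique(raw_users, return_inverse=True)
item_vocab, item_idx = np.unique(raw_items, return_inverse=True)

# Structured array with the two required keys plus one optional key.
raw_data = np.zeros(
    len(raw_users),
    dtype=[("user_id", np.int32), ("item_id", np.int32), ("timestamp", np.int64)],
)
raw_data["user_id"] = user_idx
raw_data["item_id"] = item_idx
raw_data["timestamp"] = timestamps

# Assumption: the totals are taken from the number of distinct ids.
max_user = len(user_vocab)
max_item = len(item_vocab)

# With OpenRec installed, the array would be passed to the constructor
# following the signature documented above:
# dataset = ImplicitDataset(raw_data, max_user, max_item, name="train")
```

Keeping `user_vocab` and `item_vocab` around makes it possible to translate the numeric indices back to the original identifiers later.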
contain_item(item_id)
Check whether an item is involved in any interaction.
Parameters: item_id (int) – Target item id.
Returns: A boolean indicator.
Return type: bool
contain_user(user_id)
Check whether a user is involved in any interaction.
Parameters: user_id (int) – Target user id.
Returns: A boolean indicator.
Return type: bool
get_interactions_by_item_gb_user(item_id)
Retrieve the interactions (grouped by user ids) that involve a specific item.
Parameters: item_id (int) – Target item id.
Returns: Users that have interacted with the given item.
Return type: list
get_interactions_by_user_gb_item(user_id)
Retrieve the interactions (grouped by item ids) that involve a specific user.
Parameters: user_id (int) – Target user id.
Returns: Items that the given user has interacted with.
Return type: list
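The lookup semantics of the four methods above can be illustrated with two dictionaries built from raw_data. This is only a sketch of the described behavior under assumed data, not OpenRec's implementation; the names `items_by_user`, `users_by_item`, and the `*_sketch` helpers are hypothetical.

```python
from collections import defaultdict

import numpy as np

# Small hypothetical dataset: (user_id, item_id) pairs.
raw_data = np.array(
    [(0, 2), (0, 0), (1, 1), (2, 2)],
    dtype=[("user_id", np.int32), ("item_id", np.int32)],
)

# Group the interactions in both directions in a single pass.
items_by_user = defaultdict(list)
users_by_item = defaultdict(list)
for row in raw_data:
    user, item = int(row["user_id"]), int(row["item_id"])
    items_by_user[user].append(item)
    users_by_item[item].append(user)


def contain_user_sketch(user_id):
    """True if the user appears in at least one interaction."""
    return user_id in items_by_user


def contain_item_sketch(item_id):
    """True if the item appears in at least one interaction."""
    return item_id in users_by_item


def interactions_by_user_sketch(user_id):
    """Items the given user interacted with (empty list if none)."""
    return list(items_by_user.get(user_id, []))


def interactions_by_item_sketch(item_id):
    """Users who interacted with the given item (empty list if none)."""
    return list(users_by_item.get(item_id, []))
```

Using `.get` with a default avoids silently inserting empty entries into the `defaultdict`, so the membership checks stay accurate after a lookup for an unknown id.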
get_unique_item_list()
Retrieve a list of unique item ids.
Returns: A list of unique item ids.
Return type: numpy array
get_unique_user_list()
Retrieve a list of unique user ids.
Returns: A list of unique user ids.
Return type: numpy array
unique_item_count()
Number of unique items.
Returns: Number of unique items.
Return type: int
unique_user_count()
Number of unique users.
Returns: Number of unique users.
Return type: int
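The unique-id lists and counts described above can be approximated with `np.unique`. The sketch below uses hypothetical data and assumes that "unique" refers to ids that actually occur in raw_data, which means the counts can be smaller than max_user and max_item; it does not reproduce OpenRec's code.

```python
import numpy as np

# Hypothetical system with 4 users and 5 items, of which only some interact.
max_user, max_item = 4, 5
raw_data = np.array(
    [(0, 4), (0, 1), (3, 4)],
    dtype=[("user_id", np.int32), ("item_id", np.int32)],
)

# np.unique returns the sorted distinct values of each field.
unique_users = np.unique(raw_data["user_id"])
unique_items = np.unique(raw_data["item_id"])

unique_user_count = len(unique_users)
unique_item_count = len(unique_items)

print(unique_users.tolist(), unique_user_count)
print(unique_items.tolist(), unique_item_count)
```

Here users 1 and 2 and items 0, 2, and 3 never appear, so both counts are 2 even though the system is sized for 4 users and 5 items.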