openrec.legacy.utils.implicit_dataset module
class openrec.legacy.utils.implicit_dataset.ImplicitDataset(raw_data, max_user, max_item, name='dataset')
Bases: openrec.legacy.utils.dataset.Dataset
The ImplicitDataset class stores and parses a sequence of user implicit feedback for training or evaluation. It extends the functionality of the Dataset class.
Parameters:
- raw_data (numpy structured array) – Input raw data. Other legacy formats (e.g., sparse matrix) are supported but not recommended.
- max_user (int) – Maximum number of users in the recommendation system.
- max_item (int) – Maximum number of items in the recommendation system.
- name (str) – Name of the dataset.
Notes
The ImplicitDataset class parses the input raw_data into structured dictionaries (consumed by samplers or the model trainer). This class expects raw_data as a numpy structured array, where each row represents a data point and contains at least two keys:
- user_id: the user involved in the interaction.
- item_id: the item involved in the interaction.
raw_data might contain other keys, such as timestamp and location, depending on the use case of the recommendation system. Users should be uniquely and numerically indexed from 0 to total_number_of_users - 1. Items should be indexed likewise.
contain_item(item_id)
Check whether or not an item is involved in any interaction.
Parameters: item_id (int) – target item id.
Returns: A boolean indicator.
Return type: bool
contain_user(user_id)
Check whether or not a user is involved in any interaction.
Parameters: user_id (int) – target user id.
Returns: A boolean indicator.
Return type: bool
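The membership checks can be pictured with a small numpy-only sketch. The functions below are hypothetical stand-ins that mirror the documented behavior of contain_user and contain_item on a toy array; they are not the library's implementation.

```python
import numpy as np

# Hypothetical toy data: user 1 and item 0 never appear in an interaction.
raw_data = np.array(
    [(0, 1), (0, 2), (2, 2)],
    dtype=[("user_id", np.int32), ("item_id", np.int32)],
)

# Collect the ids that occur in at least one interaction.
seen_users = set(raw_data["user_id"].tolist())
seen_items = set(raw_data["item_id"].tolist())

def contain_user(user_id):
    """Stand-in: True if the user appears in any interaction."""
    return user_id in seen_users

def contain_item(item_id):
    """Stand-in: True if the item appears in any interaction."""
    return item_id in seen_items
```

A check like this is useful because max_user and max_item only bound the id space; an id inside that range may still have no interactions in a given split.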
get_interactions_by_item_gb_user(item_id)
Retrieve the interactions (grouped by user ids) that involve a specific item.
Parameters: item_id (int) – target item id.
Returns: Users that have interacted with the given item.
Return type: list
get_interactions_by_user_gb_item(user_id)
Retrieve the interactions (grouped by item ids) that involve a specific user.
Parameters: user_id (int) – target user id.
Returns: Items that the given user has interacted with.
Return type: list
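To illustrate what grouping interactions by user or by item could look like, here is a rough sketch using plain dictionaries. The helper names interactions_by_user and interactions_by_item are hypothetical, and returning an empty list for an unseen id is an assumption; the library's actual return value in that case is not documented here.

```python
from collections import defaultdict

import numpy as np

# Hypothetical toy data with the two required keys.
raw_data = np.array(
    [(0, 1), (0, 2), (1, 2), (2, 2)],
    dtype=[("user_id", np.int32), ("item_id", np.int32)],
)

# Build both indexes in a single pass over the interactions.
by_user = defaultdict(list)  # user_id -> item ids the user interacted with
by_item = defaultdict(list)  # item_id -> user ids that interacted with the item
for row in raw_data:
    user, item = int(row["user_id"]), int(row["item_id"])
    by_user[user].append(item)
    by_item[item].append(user)

def interactions_by_user(user_id):
    """Stand-in for get_interactions_by_user_gb_item (empty list if unseen)."""
    return by_user.get(user_id, [])

def interactions_by_item(item_id):
    """Stand-in for get_interactions_by_item_gb_user (empty list if unseen)."""
    return by_item.get(item_id, [])
```

Precomputing both indexes trades memory for constant-time lookups, which matters when a sampler queries the same user or item many times.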
get_unique_item_list()
Retrieve a list of unique item ids.
Returns: A list of unique item ids.
Return type: numpy array
get_unique_user_list()
Retrieve a list of unique user ids.
Returns: A list of unique user ids.
Return type: numpy array
unique_item_count()
Number of unique items.
Returns: Number of unique items.
Return type: int
unique_user_count()
Number of unique users.
Returns: Number of unique users.
Return type: int
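The unique-id lists and counts can be approximated with numpy.unique on the corresponding fields. This is only a sketch on made-up data: numpy.unique returns sorted ids, whereas the ordering of the arrays returned by get_unique_user_list and get_unique_item_list is not stated in this reference.

```python
import numpy as np

# Hypothetical toy data: three distinct users, two distinct items.
raw_data = np.array(
    [(0, 1), (0, 2), (1, 2), (2, 2)],
    dtype=[("user_id", np.int32), ("item_id", np.int32)],
)

# Distinct ids that appear in at least one interaction (sorted by np.unique).
unique_users = np.unique(raw_data["user_id"])
unique_items = np.unique(raw_data["item_id"])

# Counts corresponding to unique_user_count() and unique_item_count().
n_unique_users = len(unique_users)
n_unique_items = len(unique_items)
```

Note that these counts describe the ids actually observed, so they can be smaller than max_user and max_item when some ids have no interactions in the dataset.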