openrec.legacy.utils.implicit_dataset module

class openrec.legacy.utils.implicit_dataset.ImplicitDataset(raw_data, max_user, max_item, name='dataset')

Bases: openrec.legacy.utils.dataset.Dataset

The ImplicitDataset class stores and parses a sequence of user implicit feedback for training or evaluation. It extends the functionality of the Dataset class.

Parameters:
  • raw_data (numpy structured array) – Input raw data. Other legacy formats (e.g., sparse matrix) are supported but not recommended.
  • max_user (int) – Maximum number of users in the recommendation system.
  • max_item (int) – Maximum number of items in the recommendation system.
  • name (str) – Name of the dataset.

Notes

The ImplicitDataset class parses the input raw_data into structured dictionaries, which are consumed by samplers or model trainers. This class expects raw_data as a numpy structured array, where each row represents a data point and contains at least two keys:

  • user_id: the user involved in the interaction.
  • item_id: the item involved in the interaction.

raw_data may contain other keys, such as timestamp or location, depending on the use case of the recommendation system. Users should be uniquely and numerically indexed from 0 to total_number_of_users - 1. Items should be indexed likewise.
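
The input format described above can be sketched with plain numpy. This is a minimal illustration, not taken from the library's own examples: the sample interactions are made up, and deriving max_user and max_item as "largest id + 1" is an assumption that follows from the 0-based indexing requirement. The constructor call is shown only as a comment, using the signature documented above, so that the snippet runs without OpenRec installed.

```python
import numpy as np

# Hypothetical interactions as (user_id, item_id) pairs, already 0-indexed.
interactions = [(0, 2), (0, 3), (1, 2), (2, 0)]

# A numpy structured array with the two required keys.
raw_data = np.array(
    interactions,
    dtype=[("user_id", np.int32), ("item_id", np.int32)],
)

# Assumption: with contiguous 0-based ids, the totals are "largest id + 1".
max_user = int(raw_data["user_id"].max()) + 1
max_item = int(raw_data["item_id"].max()) + 1

# With OpenRec installed, the dataset would then be built roughly as:
# from openrec.legacy.utils.implicit_dataset import ImplicitDataset
# dataset = ImplicitDataset(raw_data, max_user=max_user, max_item=max_item, name="train")
```

Extra keys such as timestamp would be added as further fields in the dtype list.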

contain_item(item_id)

Check whether or not an item is involved in any interaction.

Parameters: item_id (int) – target item id.
Returns: True if the item appears in at least one interaction, False otherwise.
Return type: bool

contain_user(user_id)

Check whether or not a user is involved in any interaction.

Parameters: user_id (int) – target user id.
Returns: True if the user appears in at least one interaction, False otherwise.
Return type: bool

get_interactions_by_item_gb_user(item_id)

Retrieve the interactions involving a specific item, grouped by user id.

Parameters: item_id (int) – target item id.
Returns: Users that have interacted with the given item.
Return type: list

get_interactions_by_user_gb_item(user_id)

Retrieve the interactions involving a specific user, grouped by item id.

Parameters: user_id (int) – target user id.
Returns: Items that the given user has interacted with.
Return type: list
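
To make the grouping and membership methods above concrete, here is a rough sketch of the idea in plain Python and numpy. It is not the library's implementation. The dictionaries user_to_items and item_to_users are hypothetical names introduced only for illustration, and the sample data is made up.

```python
from collections import defaultdict

import numpy as np

# Hypothetical sample data in the documented raw_data format.
raw_data = np.array(
    [(0, 2), (0, 3), (1, 2), (2, 0)],
    dtype=[("user_id", np.int32), ("item_id", np.int32)],
)

# Group interactions both ways: items per user, and users per item.
user_to_items = defaultdict(list)
item_to_users = defaultdict(list)
for user_id, item_id in zip(raw_data["user_id"], raw_data["item_id"]):
    user_to_items[int(user_id)].append(int(item_id))
    item_to_users[int(item_id)].append(int(user_id))

# Analogue of get_interactions_by_user_gb_item(0): items user 0 interacted with.
items_of_user_0 = user_to_items[0]

# Analogue of get_interactions_by_item_gb_user(2): users who interacted with item 2.
users_of_item_2 = item_to_users[2]

# Analogue of contain_user(5): a membership test that does not create a new key.
user_5_present = 5 in user_to_items
```

In this sketch, a user or item with no interactions simply has no key, which is what the membership checks rely on.
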
get_unique_item_list()

Retrieve a list of unique item ids.

Returns: A list of unique item ids.
Return type: numpy array

get_unique_user_list()

Retrieve a list of unique user ids.

Returns: A list of unique user ids.
Return type: numpy array

unique_item_count()

Number of unique items.

Returns: Number of unique items.
Return type: int

unique_user_count()

Number of unique users.

Returns: Number of unique users.
Return type: int
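
The unique-id lists and counts can be illustrated with numpy alone. The sketch below is not the library's code and uses made-up sample data. Note that np.unique returns sorted ids, whereas the ordering returned by the class itself is not specified in this reference.

```python
import numpy as np

# Hypothetical sample data in the documented raw_data format.
raw_data = np.array(
    [(0, 2), (0, 3), (1, 2), (2, 0)],
    dtype=[("user_id", np.int32), ("item_id", np.int32)],
)

# Analogue of get_unique_user_list() / get_unique_item_list().
unique_users = np.unique(raw_data["user_id"])
unique_items = np.unique(raw_data["item_id"])

# Analogue of unique_user_count() / unique_item_count().
n_users = int(unique_users.size)
n_items = int(unique_items.size)
```

In this sample, item 1 never appears in an interaction, so the number of unique items is smaller than the largest item id plus one. The unique counts describe the ids actually present in raw_data, which can differ from the max_user and max_item values passed to the constructor.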