openrec.legacy.utils.dataset module
class openrec.legacy.utils.dataset.Dataset(raw_data, max_user, max_item, name='dataset')
Bases: object
The Dataset class stores a sequence of data points for training or evaluation.
Parameters:
- raw_data (numpy structured array) – Input raw data.
- max_user (int) – Maximum number of users in the recommendation system.
- max_item (int) – Maximum number of items in the recommendation system.
- name (str) – Name of the dataset.
Notes
The Dataset class expects raw_data as a numpy structured array, where each row represents a data point and contains at least two keys:
- user_id: the user involved in the interaction.
- item_id: the item involved in the interaction.
raw_data might contain other keys, such as timestamp and location, depending on the use case of the recommendation system. Users should be uniquely and numerically indexed from 0 to total_number_of_users - 1. Items should be indexed likewise.
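The layout described in these notes can be sketched with plain NumPy. This is a minimal illustration, not code from the library: the sample interactions, the timestamp values, and the field widths (int32, int64) are assumptions chosen for the example, as is the reading of max_user and max_item as total counts.

```python
import numpy as np

# Each row is one interaction; user_id and item_id are the two required keys.
# The timestamp field is an optional extra key, included here for illustration.
interaction_dtype = np.dtype([
    ("user_id", np.int32),
    ("item_id", np.int32),
    ("timestamp", np.int64),
])

raw_data = np.array(
    [
        (0, 2, 1500000000),
        (0, 1, 1500000060),
        (1, 0, 1500000120),
        (2, 2, 1500000180),
    ],
    dtype=interaction_dtype,
)

# Assumption: max_user and max_item are total counts, so valid ids
# run from 0 to count - 1.
max_user = 3
max_item = 3

# Check the indexing contract before handing the array to Dataset.
assert raw_data["user_id"].min() >= 0 and raw_data["user_id"].max() < max_user
assert raw_data["item_id"].min() >= 0 and raw_data["item_id"].max() < max_item

# With OpenRec installed, the array could then be passed to the constructor
# using the documented signature:
# from openrec.legacy.utils.dataset import Dataset
# dataset = Dataset(raw_data, max_user=max_user, max_item=max_item, name="train")
```

Fields are accessed by key (for example raw_data["user_id"]), which is why the required keys must carry exactly these names.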
max_item()
Maximum number of items.
Returns: Maximum number of items.
Return type: int
max_user()
Maximum number of users.
Returns: Maximum number of users.
Return type: int
shuffle()
Shuffle the dataset entries.
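The documentation does not say how shuffle() is implemented. As a rough sketch of what shuffling dataset entries means for a structured array, the snippet below permutes rows in place with NumPy's random Generator; the sample data and the seed are assumptions made for illustration, and this is not the library's own code.

```python
import numpy as np

raw_data = np.array(
    [(0, 2), (0, 1), (1, 0), (2, 2)],
    dtype=[("user_id", np.int32), ("item_id", np.int32)],
)
original = raw_data.copy()

# Seeded generator so the permutation is reproducible.
rng = np.random.default_rng(seed=0)
rng.shuffle(raw_data)  # permutes whole rows in place along the first axis

# Each row stays intact: the same (user_id, item_id) pairs are present,
# only their order may differ.
same_rows = sorted(raw_data.tolist()) == sorted(original.tolist())
```

Because whole rows move together, a user_id never becomes separated from its item_id.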