reservoirpy.datasets.one_hot_encode

reservoirpy.datasets.one_hot_encode(y: ndarray | List)

Encode categorical features as a one-hot numeric array.

This function creates a trailing column for each class in the dataset. It also accepts lists of NumPy arrays, to stay compatible with the ReservoirPy format.

Accepted inputs and corresponding outputs:

  • array of shape (n, ) or (n, 1) or list of length n -> array of shape (n, n_classes)

  • array of shape (n, m) or (n, m, 1) -> array of shape (n, m, n_classes)

  • list of arrays with shape (m, ) or (m, 1) -> list of arrays with shape (m, n_classes)
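
The shape mapping listed above can be illustrated with a NumPy-only sketch. This is not ReservoirPy's implementation: the helper names `one_hot_sketch`, `_drop_trailing_axis` and `_encode` are hypothetical, and the sketch assumes that classes are sorted and shared across all arrays of a list input.

```python
import numpy as np


def _drop_trailing_axis(a):
    # Treat (n, 1) like (n,) and (n, m, 1) like (n, m).
    if a.ndim > 1 and a.shape[-1] == 1:
        return a[..., 0]
    return a


def _encode(a, classes):
    # Map each value to its index in the sorted classes,
    # then pick the matching row of the identity matrix.
    idx = np.searchsorted(classes, a)
    return np.eye(len(classes))[idx]


def one_hot_sketch(y):
    """Hypothetical helper that mimics the documented shapes only."""
    if isinstance(y, list) and len(y) > 0 and isinstance(y[0], np.ndarray):
        # List of sequences: classes are collected over all sequences.
        seqs = [_drop_trailing_axis(s) for s in y]
        classes = np.unique(np.concatenate([s.ravel() for s in seqs]))
        return [_encode(s, classes) for s in seqs], classes
    arr = _drop_trailing_axis(np.asarray(y))
    classes = np.unique(arr)
    return _encode(arr, classes), classes


flat, _ = one_hot_sketch([0, 2, 1, 2])
print(flat.shape)    # (4, 3)

grid, _ = one_hot_sketch(np.array([[0, 1], [1, 2], [2, 0]]))
print(grid.shape)    # (3, 2, 3)

seqs, _ = one_hot_sketch([np.array([0, 1, 1]), np.array([[2], [0]])])
print([s.shape for s in seqs])
```

In this sketch, each array of a list input keeps its own length along the first axis, and only the trailing class axis is added.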

Parameters:
  • y (array or list of categorical values, or list of arrays of categorical values) – The data used to determine the categories, and the data that is encoded.

Returns:

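One-hot encoded dataset. As the example below shows, the call also returns the array of classes alongside the encoded data.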

Return type:

array or list. See above for details.

Example

>>> import numpy as np
>>> from reservoirpy.datasets import one_hot_encode
>>> X = np.random.normal(size=(10, 100, 1))  # 10 series, 100 timesteps
>>> y = np.mean(X, axis=(1,2)) > 0. # a boolean for each series
>>> print(y)
[ True False False False  True False  True  True  True False]
>>> y_encoded, classes = one_hot_encode(y)
>>> y_encoded
array([[0., 1.],
       [1., 0.],
       [1., 0.],
       [1., 0.],
       [0., 1.],
       [1., 0.],
       [0., 1.],
       [0., 1.],
       [0., 1.],
       [1., 0.]])
>>> classes
array([False,  True])
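
To map encoded rows back to labels, one option is to take the argmax along the last axis and index into `classes`. The sketch below uses hand-written arrays shaped like the output above, and assumes that column `i` of the encoding corresponds to `classes[i]`, as the example suggests.

```python
import numpy as np

# Hand-written stand-ins for the values returned in the example above.
y_encoded = np.array([[0., 1.],
                      [1., 0.],
                      [1., 0.]])
classes = np.array([False, True])

# Index of the active column in each row, then look up the class label.
decoded = classes[np.argmax(y_encoded, axis=-1)]
print(decoded)
```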