skika.data.stream_generator_redundancy_drift#

Classes

StreamGeneratorRedund([base_stream, ...])

Stream generator with change in number of redundant features

class skika.data.stream_generator_redundancy_drift.StreamGeneratorRedund(base_stream=RandomRBFGeneratorRedund(model_random_state=None, n_centroids=50, n_classes=2, n_features=30, noise_percentage=0.0, perc_redund_feature=None, sample_random_state=None), random_state=None, n_drifts=10, n_instances=10000)#

Stream generator with change in number of redundant features

Create a stream from RandomRBFGeneratorRedund or HyperPlanRedun that contains a given number of drifts over a given number of instances. Each concept contains a different number of redundant features (0, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100% of the total number of features).

Drifts are regularly placed every n_instances/n_drifts instances.
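
As a rough illustration of this spacing, the sketch below computes where drifts would fall for the default settings. The helper name `drift_positions` is hypothetical and not part of skika. It assumes that the interval is an integer division and that the k-th drift sits k intervals into the stream; whether the library counts the boundary at the very end of the stream as a drift is not stated here.

```python
def drift_positions(n_instances=10000, n_drifts=10):
    """Return the assumed instance indices at which a drift occurs."""
    # Interval between consecutive drifts (integer division is an assumption).
    interval = n_instances // n_drifts
    # Assumption: the k-th drift is placed k intervals into the stream.
    return [k * interval for k in range(1, n_drifts + 1)]


positions = drift_positions()
print(positions[:3])  # [1000, 2000, 3000]
```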

Parameters

base_stream (Stream (Default: RandomRBFGeneratorRedund)) – The base stream to use.
random_state (int, RandomState instance or None, optional (default=None)) – If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
n_drifts (int (Default: 10)) – Number of drifts to be generated.
n_instances (int (Default: 10000)) – Number of instances to be generated.

Example

>>> # Imports
>>> from skika.data.stream_generator_redundancy_drift import StreamGeneratorRedund
>>> from skika.data.random_rbf_generator_redund import RandomRBFGeneratorRedund
>>> # Set the stream
>>> stream = StreamGeneratorRedund(
...     base_stream=RandomRBFGeneratorRedund(
...         n_classes=2, n_features=30, n_centroids=50, noise_percentage=0.0),
...     random_state=None, n_drifts=10, n_instances=10000)
>>> stream.prepare_for_use()
>>> # Retrieve next sample
>>> stream.next_sample()
(array([[0.21780997, 0.37810599, 0.24129934, 0.78979064, 0.83463727,
             0.90272964, 0.5611584 , 0.58977699, 0.78035701, 0.89178544,
             0.55418949, 0.30293076, 0.09691338, 0.75894948, 0.03441104,
             0.58977699, 0.75894948, 0.24129934, 0.78979064, 0.83463727,
             0.37810599, 0.55418949, 0.75894948, 0.24129934, 0.55418949,
             0.78035701, 0.09691338, 0.90272964, 0.83463727, 0.24129934]]),
array([1]))

property feature_names#

Retrieve the names of the features.

Returns: names of the features
Return type: list

get_data_info()#

Retrieves minimum information from the stream.

Used by evaluator methods to identify the stream.

The default format is: ‘Stream name - n_targets, n_classes, n_features’.

Returns: Stream data information
Return type: string

get_info()#

Collects and returns the information about the configuration of the estimator.

Returns: Configuration of the estimator.
Return type: string

get_params(deep=True)#

Get parameters for this estimator.

Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any

has_more_samples()#

Checks if stream has more samples.

Returns: True if stream has more samples.
Return type: Boolean

is_restartable()#

Determine if the stream is restartable.

Returns: True if stream is restartable.
Return type: Boolean

last_sample()#

Retrieves last batch_size samples in the stream.

Returns: A numpy.ndarray of shape (batch_size, n_features) and an array-like of shape (batch_size, n_targets), representing the last batch_size samples.
Return type: tuple or tuple list

property n_cat_features#

Retrieve the number of categorical (integer) features.

Returns: The number of integer features in the stream.
Return type: int

property n_features#

Retrieve the number of features.

Returns: The total number of features.
Return type: int

property n_num_features#

Retrieve the number of numerical features.

Returns: The number of numerical features in the stream.
Return type: int

n_remaining_samples()#

Returns the estimated number of remaining samples.

Returns: Remaining number of samples. -1 if infinite (e.g. generator)
Return type: int
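
Because a generator may report -1 remaining samples, a consuming loop usually needs its own upper bound. The sketch below shows that pattern with `ToyStream`, a hypothetical stand-in that only mimics the method names documented on this page; it is not the skika implementation, and `consume` is an illustrative helper.

```python
class ToyStream:
    """Hypothetical stand-in exposing the stream methods documented here."""

    def __init__(self, n_samples):
        self._n_samples = n_samples
        self._index = 0

    def has_more_samples(self):
        return self._index < self._n_samples

    def n_remaining_samples(self):
        return self._n_samples - self._index

    def next_sample(self, batch_size=1):
        # Return placeholder (features, labels) lists for the batch.
        end = min(self._index + batch_size, self._n_samples)
        X = [[float(i)] for i in range(self._index, end)]
        y = [i % 2 for i in range(self._index, end)]
        self._index = end
        return X, y


def consume(stream, max_samples, batch_size=1):
    """Read up to max_samples, stopping early if the stream runs out."""
    seen = 0
    while seen < max_samples and stream.has_more_samples():
        X, y = stream.next_sample(batch_size)
        seen += len(y)
    return seen
```

Bounding the loop with `max_samples` keeps it finite even when the stream itself never reports exhaustion.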

property n_targets#

Retrieve the number of targets.

Returns: the number of targets in the stream.
Return type: int

next_sample(batch_size=1)#

Return batch_size samples generated by choosing a centroid at random and randomly offsetting its attributes so that it is placed inside the hypersphere of that centroid.

Parameters: batch_size (int) – The number of samples to return.
Returns: Return a tuple with the features matrix and the labels matrix for the batch_size samples that were requested.
Return type: tuple or tuple list
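
The sampling idea described above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the library's code: centroids are chosen uniformly, the offset magnitude is drawn uniformly below a per-centroid radius, and the names `sample_near_centroid`, `centroids` and `radii` are hypothetical.

```python
import math
import random


def sample_near_centroid(centroids, radii, rng):
    """Draw one point inside the hypersphere of a randomly chosen centroid."""
    # Choose a centroid uniformly at random (real generators may weight them).
    idx = rng.randrange(len(centroids))
    centre, radius = centroids[idx], radii[idx]
    # Random direction: Gaussian components normalised to unit length.
    direction = [rng.gauss(0.0, 1.0) for _ in centre]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    # A magnitude below the radius keeps the point inside the hypersphere.
    magnitude = rng.random() * radius
    point = [c + magnitude * d / norm for c, d in zip(centre, direction)]
    return point, idx


rng = random.Random(0)
centroids = [[0.2, 0.8], [0.7, 0.3]]
radii = [0.1, 0.05]
point, idx = sample_near_centroid(centroids, radii, rng)
```

In an RBF-style generator the chosen centroid would also determine the class label of the sample.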

property perc_redund_features#

Retrieve the number of redundant features.

Returns: The total number of redundant features.
Return type: int

prepare_for_use()#

Prepares the stream for use. Randomly creates the list of numbers of redundant features used for each concept in the stream.

Notes

This function should always be called after the stream initialization.
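
To make the per-concept redundancy list more concrete, here is a minimal sketch of how such a list could be drawn. It is an assumption-laden illustration rather than the library's code: the helper `draw_concept_redundancy` is hypothetical, levels are drawn independently (so repeats are possible, which the real generator may prevent), and the feature count is taken as the rounded share of `n_features`.

```python
import random

# The eleven redundancy levels listed in the class description, in percent.
LEVELS = list(range(0, 101, 10))


def draw_concept_redundancy(n_concepts, n_features, seed=None):
    """Pick a level per concept; return (percent, feature count) pairs."""
    rng = random.Random(seed)
    plan = []
    for _ in range(n_concepts):
        percent = rng.choice(LEVELS)
        # Assumption: the feature count is the rounded share of n_features.
        plan.append((percent, round(n_features * percent / 100)))
    return plan


plan = draw_concept_redundancy(n_concepts=10, n_features=30, seed=42)
```

Passing a fixed `seed` makes the drawn plan reproducible, which mirrors the role of `random_state` in the class signature.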

reset()#

Resets the estimator to its initial state.

Return type: self

restart()#

Restart the stream.

set_params(**params)#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Return type: self
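
The `<component>__<parameter>` convention can be illustrated with a small, self-contained sketch. The `Component` class below is hypothetical and only mimics the key-splitting behaviour; whether `StreamGeneratorRedund` accepts a key such as `base_stream__n_features` is an assumption that should be checked against the installed version.

```python
class Component:
    """Hypothetical object with get/set-style parameters."""

    def __init__(self, **params):
        self.params = dict(params)

    def set_params(self, **params):
        for key, value in params.items():
            # Split "component__parameter" at the first double underscore.
            name, sep, sub_key = key.partition("__")
            if sep:
                # Delegate the remainder of the key to the nested component.
                self.params[name].set_params(**{sub_key: value})
            else:
                self.params[name] = value
        return self


base = Component(n_features=30)
stream = Component(base_stream=base, n_drifts=10)
stream.set_params(n_drifts=5, base_stream__n_features=20)
```

After the call, the top-level `n_drifts` and the nested `n_features` have both been updated, and `set_params` returns the object itself so calls can be chained.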

property target_names#

Retrieve the names of the targets.

Returns: the names of the targets in the stream.
Return type: list

property target_values#

Retrieve all target_values in the stream for each target.

Returns: list of lists of all target_values for each target
Return type: list