skika.data.stream_generator_redundancy_drift#

Classes

StreamGeneratorRedund([base_stream, ...])

Stream generator with change in number of redundant features

class skika.data.stream_generator_redundancy_drift.StreamGeneratorRedund(base_stream=RandomRBFGeneratorRedund(model_random_state=None, n_centroids=50, n_classes=2, n_features=30, noise_percentage=0.0, perc_redund_feature=None, sample_random_state=None), random_state=None, n_drifts=10, n_instances=10000)#

Stream generator with change in number of redundant features

Creates a stream from a RandomRBFGeneratorRedund or HyperPlanRedun base stream that contains a given number of drifts over a given number of instances. Each concept uses a different proportion of redundant features (0, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100% of the total number of features).

Drifts are regularly placed every n_instances/n_drifts instances.
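As a rough illustration of the arithmetic described above, the following sketch computes the drift spacing and the number of redundant features at each level for the default settings. The helper names are hypothetical and not part of skika. The use of integer division, and the placement of the k-th drift k intervals after the start, are assumptions rather than confirmed library behaviour.

```python
def drift_spacing(n_instances, n_drifts):
    """Instances between two consecutive drifts (assumes integer division)."""
    return n_instances // n_drifts


def redundant_counts(n_features, levels=range(0, 101, 10)):
    """Map each redundancy level (in percent) to a number of features."""
    return {level: n_features * level // 100 for level in levels}


spacing = drift_spacing(n_instances=10000, n_drifts=10)
# Assumption: the k-th drift falls k intervals after the start of the stream.
positions = [spacing * k for k in range(1, 4)]
counts = redundant_counts(n_features=30)

print(spacing)     # 1000
print(positions)   # [1000, 2000, 3000]
print(counts[50])  # 15
```

With the default 30 features, every level maps to a whole number of features (0, 3, 6, ..., 30), so no rounding rule is needed in this case.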

Parameters
  • base_stream (Stream (Default: RandomRBFGeneratorRedund)) – The base stream to use.

  • random_state (int, RandomState instance or None, optional (default=None)) – If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.

  • n_drifts (int (Default: 10)) – Number of drifts to be generated.

  • n_instances (int (Default: 10000)) – Number of instances to be generated.

Example

>>> # Imports
>>> from skika.data.stream_generator_redundancy_drift import StreamGeneratorRedund
>>> from skika.data.random_rbf_generator_redund import RandomRBFGeneratorRedund
>>> # Set the stream
>>> stream = StreamGeneratorRedund(
...     base_stream=RandomRBFGeneratorRedund(
...         n_classes=2, n_features=30, n_centroids=50, noise_percentage=0.0
...     ),
...     random_state=None,
...     n_drifts=10,
...     n_instances=10000,
... )
>>> stream.prepare_for_use()
>>> # Retrieve next sample
>>> stream.next_sample()
(array([[0.21780997, 0.37810599, 0.24129934, 0.78979064, 0.83463727,
             0.90272964, 0.5611584 , 0.58977699, 0.78035701, 0.89178544,
             0.55418949, 0.30293076, 0.09691338, 0.75894948, 0.03441104,
             0.58977699, 0.75894948, 0.24129934, 0.78979064, 0.83463727,
             0.37810599, 0.55418949, 0.75894948, 0.24129934, 0.55418949,
             0.78035701, 0.09691338, 0.90272964, 0.83463727, 0.24129934]]),
array([1]))
property feature_names#

Retrieve the names of the features.

Returns

names of the features

Return type

list

get_data_info()#

Retrieves minimum information from the stream.

Used by evaluator methods to identify the stream.

The default format is: ‘Stream name - n_targets, n_classes, n_features’.

Returns

Stream data information

Return type

string

get_info()#

Collects and returns information about the configuration of the estimator.

Returns

Configuration of the estimator.

Return type

string

get_params(deep=True)#

Get parameters for this estimator.

Parameters

deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

mapping of string to any

has_more_samples()#

Checks if stream has more samples.

Returns

True if stream has more samples.

Return type

Boolean

is_restartable()#

Determine if the stream is restartable.

Returns

True if stream is restartable.

Return type

Boolean

last_sample()#

Retrieves the last batch_size samples in the stream.

Returns

A numpy.ndarray of shape (batch_size, n_features) and an array-like of shape (batch_size, n_targets), representing the last batch_size samples.

Return type

tuple or tuple list

property n_cat_features#

Retrieve the number of categorical (integer-valued) features.

Returns

The number of categorical (integer-valued) features in the stream.

Return type

int

property n_features#

Retrieve the number of features.

Returns

The total number of features.

Return type

int

property n_num_features#

Retrieve the number of numerical features.

Returns

The number of numerical features in the stream.

Return type

int

n_remaining_samples()#

Returns the estimated number of remaining samples.

Returns

Remaining number of samples. -1 if infinite (e.g. generator)

Return type

int

property n_targets#

Retrieve the number of targets.

Returns

The number of targets in the stream.

Return type

int

next_sample(batch_size=1)#

Return batch_size samples generated by choosing a centroid at random and randomly offsetting its attributes so that it is placed inside the hypersphere of that centroid.

Parameters

batch_size (int) – The number of samples to return.

Returns

Return a tuple with the features matrix and the labels matrix for the batch_size samples that were requested.

Return type

tuple or tuple list
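The methods documented here are typically combined in a simple consumption loop. The sketch below shows that pattern against a hypothetical stand-in class, ToyStream, which only mimics the method names has_more_samples() and next_sample(batch_size). It is not the skika implementation, and it returns plain lists rather than numpy arrays so that it runs without any dependency.

```python
import random


class ToyStream:
    """Hypothetical stand-in that mimics the stream's method names."""

    def __init__(self, n_instances, n_features, seed=0):
        self._rng = random.Random(seed)
        self._remaining = n_instances
        self._n_features = n_features

    def has_more_samples(self):
        return self._remaining > 0

    def next_sample(self, batch_size=1):
        # Return at most batch_size samples; fewer if the stream is nearly exhausted.
        size = min(batch_size, self._remaining)
        self._remaining -= size
        X = [[self._rng.random() for _ in range(self._n_features)]
             for _ in range(size)]
        y = [self._rng.randrange(2) for _ in range(size)]
        return X, y


stream = ToyStream(n_instances=250, n_features=30)
n_seen = 0
while stream.has_more_samples():
    X, y = stream.next_sample(batch_size=100)
    n_seen += len(X)

print(n_seen)  # 250
```

With the real stream, the same loop would presumably be preceded by a call to prepare_for_use(), as in the example near the top of this page.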

property perc_redund_features#

Retrieve the number of redundant features.

Returns

The total number of redundant features.

Return type

int

prepare_for_use()#

Prepares the stream for use. Randomly creates the list of numbers of redundant features used for each concept in the stream.

Notes

This function should always be called after the stream is initialized.
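One way to picture the random list built during preparation is as a shuffled selection of the redundancy levels, one per concept. The sketch below is only an illustration under stated assumptions: the function redundancy_schedule is hypothetical, the stream is assumed to contain n_drifts + 1 concepts, and levels are assumed to be drawn without repetition. None of this is confirmed by the library documentation.

```python
import random

LEVELS = list(range(0, 101, 10))  # 0, 10, ..., 100 percent


def redundancy_schedule(n_drifts, seed=None):
    """Draw one redundancy level per concept, without repetition."""
    rng = random.Random(seed)
    n_concepts = n_drifts + 1  # assumption: one concept precedes the first drift
    # random.Random.sample raises ValueError if n_concepts exceeds len(LEVELS).
    return rng.sample(LEVELS, n_concepts)


schedule = redundancy_schedule(n_drifts=10, seed=42)
print(len(schedule))               # 11
print(sorted(schedule) == LEVELS)  # True
```

Passing a fixed seed makes the schedule reproducible, which mirrors the role random_state plays in the constructor.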

reset()#

Resets the estimator to its initial state.

Return type

self

restart()#

Restart the stream.

set_params(**params)#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Return type

self
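The <component>__<parameter> naming used for nested objects can be illustrated with a small helper that groups flat keyword arguments by component. This is a sketch of the naming convention only, not the library's implementation. The function split_nested_params is hypothetical, and the key base_stream__n_features is an illustrative example that has not been verified against this class.

```python
def split_nested_params(params):
    """Group flat keyword arguments by component using the '__' separator."""
    top_level, nested = {}, {}
    for key, value in params.items():
        component, sep, sub_key = key.partition("__")
        if sep:
            # 'base_stream__n_features' -> component 'base_stream', parameter 'n_features'
            nested.setdefault(component, {})[sub_key] = value
        else:
            top_level[key] = value
    return top_level, nested


top, nested = split_nested_params(
    {"n_drifts": 5, "base_stream__n_features": 20}
)
print(top)     # {'n_drifts': 5}
print(nested)  # {'base_stream': {'n_features': 20}}
```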

property target_names#

Retrieve the names of the targets.

Returns

The names of the targets in the stream.

Return type

list

property target_values#

Retrieve all target_values in the stream for each target.

Returns

list of lists of all target_values for each target

Return type

list