skika.data.random_rbf_generator_redund
Classes
Centroid | Class that stores a centroid's attributes.
RandomRBFGeneratorRedund | Random Radial Basis Function stream generator.
- class skika.data.random_rbf_generator_redund.Centroid
Class that stores a centroid’s attributes.
- class skika.data.random_rbf_generator_redund.RandomRBFGeneratorRedund(model_random_state=None, sample_random_state=None, n_classes=2, n_features=10, perc_redund_feature=0.0, n_centroids=50, noise_percentage=0.0)
Random Radial Basis Function stream generator.
Modified version of scikit-multiflow code to include generation of redundant attributes.
Produces a radial basis function stream. A number of centroids are generated, each with a random central position, a standard deviation, a class label and a weight. A new sample is created by choosing one of the centroids at random, taking their weights into account, and offsetting the attributes in a random direction from the centroid’s center. The offset length is drawn from a Gaussian distribution.
This process creates a normally distributed hypersphere of samples around each centroid.
We added a parameter to set the percentage of redundant features among the total number of features.
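The generation process described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names (`make_centroids`, `draw_sample`, `redundant_sources`) are hypothetical, redundant features are assumed to be copies of randomly chosen base features, and the two seeded generators are assumed to play the roles of `model_random_state` and `sample_random_state`.

```python
import math
import random


def make_centroids(rng, n_centroids, n_features, n_classes):
    """Create centroids with a random centre, class label, std dev and weight."""
    return [
        {
            "centre": [rng.random() for _ in range(n_features)],
            "label": rng.randrange(n_classes),
            "std_dev": rng.random(),
            "weight": rng.random(),
        }
        for _ in range(n_centroids)
    ]


def draw_sample(rng, centroids, redundant_sources):
    """Draw one (features, label) pair, then append the redundant copies."""
    weights = [c["weight"] for c in centroids]
    centroid = rng.choices(centroids, weights=weights, k=1)[0]

    # Random direction: a random vector scaled to unit length.
    direction = [rng.uniform(-1.0, 1.0) for _ in centroid["centre"]]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    # Offset length drawn from a Gaussian, scaled by the centroid's std dev.
    length = rng.gauss(0.0, 1.0) * centroid["std_dev"]

    base = [c + d / norm * length for c, d in zip(centroid["centre"], direction)]
    # Assumption: each redundant feature is a copy of one base feature.
    return base + [base[i] for i in redundant_sources], centroid["label"]


n_features, perc_redund_feature = 10, 0.4
n_redundant = int(n_features * perc_redund_feature)  # 4 redundant features
n_base = n_features - n_redundant                    # 6 base features

model_rng = random.Random(99)   # assumed role: fixes centroids and copy mapping
sample_rng = random.Random(50)  # assumed role: fixes the sequence of samples

centroids = make_centroids(model_rng, n_centroids=50, n_features=n_base, n_classes=4)
redundant_sources = [model_rng.randrange(n_base) for _ in range(n_redundant)]

features, label = draw_sample(sample_rng, centroids, redundant_sources)
```

Keeping two separate generators means the centroid layout can stay fixed while the sample sequence varies, which is one plausible reason for the class exposing two random states.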
- Parameters
model_random_state (int, RandomState instance or None, optional (default=None)) – If int, model_random_state is the seed used by the random number generator; if RandomState instance, model_random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
sample_random_state (int, RandomState instance or None, optional (default=None)) – If int, sample_random_state is the seed used by the random number generator; if RandomState instance, sample_random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
n_classes (int (Default: 2)) – The number of class labels to generate.
n_features (int (Default: 10)) – The number of numerical features to generate.
perc_redund_feature (float (Default: 0.0)) – The proportion of features that are redundant, expressed as a fraction from 0.0 to 1.0.
n_centroids (int (Default: 50)) – The number of centroids to generate.
noise_percentage (float (Default: 0.0)) – The proportion of noise to add to the data, expressed as a fraction from 0.0 to 1.0.
Examples
>>> # Imports
>>> from skika.data.random_rbf_generator_redund import RandomRBFGeneratorRedund
>>> # Setting up the stream
>>> stream = RandomRBFGeneratorRedund(model_random_state=99, sample_random_state=50, n_classes=4,
...                                   n_features=10, perc_redund_feature=0.4, n_centroids=50)
>>> stream.prepare_for_use()
>>> # Retrieving one sample
>>> stream.next_sample()
(array([[0.44952282, 1.09201096, 0.34778443, 0.92181679, 0.19503463,
         0.28834419, 0.44952282, 0.19503463, 0.92181679, 0.19503463]]), array([3]))
>>> # Retrieving 10 samples
>>> stream.next_sample(10)
(array([[ 0.70374896, 0.65752835, 0.20343463, 0.56136917, 0.76659286,
          0.61081231, 0.70374896, 0.76659286, 0.56136917, 0.76659286],
        [ 0.27797196, 0.05640135, 0.80946171, 0.60572837, 0.95080656,
          0.25512099, 0.27797196, 0.95080656, 0.60572837, 0.95080656],
        [ 0.33696167, 0.10923638, 0.85987231, 0.61868598, 0.85755211,
          0.19469184, 0.33696167, 0.85755211, 0.61868598, 0.85755211],
        [ 0.71886223, 0.23078927, 0.45013806, 0.03019141, 0.42679505,
          0.03841721, 0.71886223, 0.42679505, 0.03019141, 0.42679505],
        [-0.01849262, 0.92570731, 0.87564868, 0.49372553, 0.39717634,
          0.46697609, -0.01849262, 0.39717634, 0.49372553, 0.39717634],
        [ 0.81850217, 0.87228851, 0.18873385, -0.04254749, 0.06942877,
          0.55567756, 0.81850217, 0.06942877, -0.04254749, 0.06942877],
        [ 0.69888163, 0.61994977, 0.43074298, 0.27526838, 0.69566798,
          0.91059369, 0.69888163, 0.69566798, 0.27526838, 0.69566798],
        [ 1.01929588, 0.80181051, 0.50547533, 0.14715636, 0.42889167,
          0.61513174, 1.01929588, 0.42889167, 0.14715636, 0.42889167],
        [ 0.37738633, 0.60922205, 0.64216064, 0.90009707, 0.91787083,
          0.36189554, 0.37738633, 0.91787083, 0.90009707, 0.91787083],
        [ 0.62185359, 0.75178244, 1.00436662, 0.24412816, 0.41070861,
          0.52547739, 0.62185359, 0.41070861, 0.24412816, 0.41070861]]),
 array([3, 3, 3, 2, 3, 2, 0, 2, 0, 2]))
>>> # Generators will have infinite remaining instances, so it returns -1
>>> stream.n_remaining_samples()
-1
>>> stream.has_more_samples()
True
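In the output above, the last four columns of every sample repeat values from earlier columns, which is what a setting of perc_redund_feature=0.4 with 10 features would suggest. The short sketch below, using only the first sample shown in the example, recovers which column each redundant column duplicates. The variable names are illustrative, and matching on exact equality is an assumption that holds for this printed output.

```python
# First sample from the example output (10 features, 4 of them redundant).
sample = [0.44952282, 1.09201096, 0.34778443, 0.92181679, 0.19503463,
          0.28834419, 0.44952282, 0.19503463, 0.92181679, 0.19503463]

first_seen = {}    # value -> index of the first column holding it
redundant_of = {}  # redundant column index -> column it duplicates
for index, value in enumerate(sample):
    if value in first_seen:
        redundant_of[index] = first_seen[value]
    else:
        first_seen[value] = index

print(redundant_of)  # {6: 0, 7: 4, 8: 3, 9: 4}
```

The same mapping holds for each of the ten rows returned by stream.next_sample(10) in the example, so the copy pattern appears to be fixed for the lifetime of the stream rather than redrawn per sample.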
- property feature_names
Retrieve the names of the features.
- Returns
names of the features
- Return type
list
- generate_centroids()
Sequentially creates all the centroids, choosing at random a center, a label, a standard deviation and a weight.
- get_data_info()
Retrieves minimum information from the stream.
Used by evaluator methods to id the stream.
The default format is: ‘Stream name - n_targets, n_classes, n_features’.
- Returns
Stream data information
- Return type
string
- get_info()
Collects and returns the information about the configuration of the estimator.
- Returns
Configuration of the estimator.
- Return type
string
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
mapping of string to any
- has_more_samples()
Checks if stream has more samples.
- Returns
True if stream has more samples.
- Return type
Boolean
- is_restartable()
Determine if the stream is restartable.
- Returns
True if stream is restartable.
- Return type
Boolean
- last_sample()
Retrieves last batch_size samples in the stream.
- Returns
A numpy.ndarray of shape (batch_size, n_features) and an array-like of shape (batch_size, n_targets), representing the last batch_size samples.
- Return type
tuple or tuple list
- property n_cat_features
Retrieve the number of integer features.
- Returns
The number of integer features in the stream.
- Return type
int
- property n_features
Retrieve the number of features.
- Returns
The total number of features.
- Return type
int
- property n_num_features
Retrieve the number of numerical features.
- Returns
The number of numerical features in the stream.
- Return type
int
- n_remaining_samples()
Returns the estimated number of remaining samples.
- Returns
Remaining number of samples. -1 if infinite (e.g. generator)
- Return type
int
- property n_targets
Retrieve the number of targets.
- Returns
the number of targets in the stream.
- Return type
int
- next_sample(batch_size=1)
Return batch_size samples generated by choosing a centroid at random and randomly offsetting its attributes so that it is placed inside the hypersphere of that centroid.
- Parameters
batch_size (int) – The number of samples to return.
- Returns
Return a tuple with the features matrix and the labels matrix for the batch_size samples that were requested.
- Return type
tuple or tuple list
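Because a generator reports -1 remaining samples and always has more samples, a consuming loop needs its own stopping rule. The sketch below shows one way to cap consumption at a fixed budget. `ToyStream` is a hypothetical stand-in that only mimics the method names documented on this page (it returns plain lists rather than numpy arrays), and `take` is an illustrative helper, not part of skika.

```python
import random


class ToyStream:
    """Hypothetical stand-in exposing the same method names as the generator."""

    def __init__(self, seed=0, n_features=3, n_classes=2):
        self._rng = random.Random(seed)
        self._n_features = n_features
        self._n_classes = n_classes

    def has_more_samples(self):
        return True  # a generator never runs out

    def n_remaining_samples(self):
        return -1  # -1 signals an infinite stream

    def next_sample(self, batch_size=1):
        X = [[self._rng.random() for _ in range(self._n_features)]
             for _ in range(batch_size)]
        y = [self._rng.randrange(self._n_classes) for _ in range(batch_size)]
        return X, y


def take(stream, max_samples, batch_size=10):
    """Collect at most max_samples samples, requesting them in batches."""
    collected_X, collected_y = [], []
    while stream.has_more_samples() and len(collected_y) < max_samples:
        # Shrink the final batch so the budget is never exceeded.
        size = min(batch_size, max_samples - len(collected_y))
        X, y = stream.next_sample(size)
        collected_X.extend(X)
        collected_y.extend(y)
    return collected_X, collected_y


X, y = take(ToyStream(seed=1), max_samples=25)
```

With the real generator, the same loop shape should apply after calling prepare_for_use(), with the collected batches being numpy arrays instead of lists.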
- property perc_redund_features
Retrieve the number of redundant features.
- Returns
The total number of redundant features.
- Return type
int
- prepare_for_use()
Prepares the stream for use.
Notes
This function should always be called after the stream initialization.
- reset()
Resets the estimator to its initial state.
- Return type
self
- restart()
Restart the stream.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form
<component>__<parameter>
so that it’s possible to update each component of a nested object.
- Return type
self
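The `<component>__<parameter>` naming convention can be illustrated with a small key-splitting helper. This is only a sketch of the convention as described above; `split_param_key` is a hypothetical name, and the `generator` component in the usage lines is an invented example rather than anything defined by this class.

```python
def split_param_key(key):
    """Split 'component__parameter' into (component, parameter).

    A key without the double-underscore delimiter names a parameter on the
    object itself, so the component is reported as None.
    """
    component, delimiter, parameter = key.partition("__")
    if not delimiter:
        return None, key
    return component, parameter


# A plain parameter of the stream itself.
print(split_param_key("n_centroids"))             # (None, 'n_centroids')
# A parameter addressed through a (hypothetical) nested component.
print(split_param_key("generator__n_centroids"))  # ('generator', 'n_centroids')
```

For a stand-alone stream such as this generator, only the plain form is likely to be needed, for example passing n_centroids directly to set_params.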
- property target_names
Retrieve the names of the targets.
- Returns
the names of the targets in the stream.
- Return type
list
- property target_values
Retrieve all target_values in the stream for each target.
- Returns
list of lists of all target_values for each target
- Return type
list