skika.data.generate_dataset#
Functions
- generate_concept_chain: Given a list of available concepts, generate a dict with (start, id) pairs giving the start of each concept.
- generate_experiment_concept_chain: Generates a list of concepts for a datastream.
- generate_pattern_concept_chain: Given a list of available concepts, generate a dict with (start, id) pairs giving the start of each concept.
- get_concepts: Given [(gt_concept, start_i, end_i)…], return the ground truth occurring at a given index.
- saveStreamToArff: Save examples to an ARFF file.
- save_stream: Create, generate and save a data stream to CSV or ARFF.
Classes
- DatastreamOptions: Options for generating a concept.
- NPEncoder
- class skika.data.generate_dataset.DatastreamOptions(noise, num_concepts, hard_diff, easy_diff, hard_appear, easy_appear, hard_prop, examples_per_appearence, stream_type, seed, gradual)#
Options for generating a concept.
- class skika.data.generate_dataset.NPEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)#
- default(obj)#
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
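A minimal runnable version of that pattern, using only the standard-library json module (the IterEncoder name is illustrative, not part of skika):

```python
import json

class IterEncoder(json.JSONEncoder):
    """Follows the default() pattern above: fall back to list() for iterables."""
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class raise the TypeError for non-iterables
        return super().default(o)

print(IterEncoder().encode({"ids": range(3)}))  # {"ids": [0, 1, 2]}
```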
- encode(o)#
Return a JSON string representation of a Python data structure.
    >>> from json.encoder import JSONEncoder
    >>> JSONEncoder().encode({"foo": ["bar", "baz"]})
    '{"foo": ["bar", "baz"]}'
- iterencode(o, _one_shot=False)#
Encode the given object and yield each string representation as available.
For example:
    for chunk in JSONEncoder().iterencode(bigobject):
        mysocket.write(chunk)
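A small self-contained sketch of streaming with iterencode, writing to an in-memory buffer in place of a socket:

```python
import io
import json

buf = io.StringIO()  # stands in for a socket or file
# iterencode yields the JSON string piece by piece instead of all at once.
for chunk in json.JSONEncoder().iterencode({"foo": ["bar", "baz"]}):
    buf.write(chunk)

print(buf.getvalue())  # {"foo": ["bar", "baz"]}
```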
- skika.data.generate_dataset.generate_concept_chain(concept_desc, sequential)#
Given a list of available concepts, generate a dict with (start, id) pairs giving the start of each concept.
- Parameters
sequential (bool) – If true, concept transitions are determined by ID without randomness.
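To illustrate the shape of the result in the sequential case only, here is a stand-in sketch; the helper name and the fixed segment length are assumptions for illustration, not skika's implementation:

```python
def sequential_concept_chain(concept_ids, segment_length):
    """Sketch: map each segment's start index to a concept id, in ID order."""
    return {i * segment_length: cid
            for i, cid in enumerate(sorted(concept_ids))}

print(sequential_concept_chain([2, 0, 1], 500))
# {0: 0, 500: 1, 1000: 2}
```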
- skika.data.generate_dataset.generate_experiment_concept_chain(ds_options, sequential, pattern)#
Generates a list of concepts for a datastream.
- Parameters
ds_options – options for the data stream
sequential (bool) – If true, concepts are ordered sequentially rather than randomly
pattern (bool) – If true, transitions follow an underlying pattern
- Returns
concept_chain (dict<int><int>)
num_samples (int)
concept_descriptions (list<ConceptOccurence>)
- skika.data.generate_dataset.generate_pattern_concept_chain(concept_desc, sequential)#
Given a list of available concepts, generate a dict with (start, id) pairs giving the start of each concept. This is generated using a random Markov model, so specific transition patterns have unique properties.
- Parameters
sequential (bool) – If true, concept transitions are determined by ID without randomness.
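A hedged sketch of the random-Markov-model idea: each next concept is drawn from a weight row tied to the current concept. The function name, fixed segment length, and uniform random weights are assumptions for illustration, not skika's actual logic:

```python
import random

def markov_concept_chain(concept_ids, n_transitions, segment_length, seed=0):
    """Sketch: build a {start_index: concept_id} chain from a random Markov model."""
    rng = random.Random(seed)
    # One random weight row per concept: this is the transition matrix.
    weights = {c: [rng.random() for _ in concept_ids] for c in concept_ids}
    chain, current, start = {}, concept_ids[0], 0
    for _ in range(n_transitions):
        chain[start] = current
        # Draw the next concept according to the current concept's row.
        current = rng.choices(concept_ids, weights=weights[current])[0]
        start += segment_length
    return chain
```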
- skika.data.generate_dataset.get_concepts(gt_concepts, ex_index, num_samples)#
Given [(gt_concept, start_i, end_i)…], return the ground truth occurring at a given index.
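The lookup it describes can be sketched as follows (ground_truth_at is an illustrative stand-in, not the skika function):

```python
def ground_truth_at(gt_concepts, ex_index):
    """Return the concept whose [start_i, end_i) interval covers ex_index."""
    for gt_concept, start_i, end_i in gt_concepts:
        if start_i <= ex_index < end_i:
            return gt_concept
    return None  # index falls outside every segment

segments = [("A", 0, 100), ("B", 100, 250)]
print(ground_truth_at(segments, 150))  # B
```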
- skika.data.generate_dataset.saveStreamToArff(filename, stream_examples, stream_supplementary, arff)#
Save examples to an ARFF file.
- Parameters
filename (str) – filename with extension
stream_examples (list) – list of examples [[X, y]]
stream_supplementary (list) – list of supplementary info for each observation
arff (bool) – If true, save as ARFF; otherwise CSV
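A minimal sketch of writing [[X, y]] rows in ARFF form; the attribute names and the all-numeric assumption are illustrative, and the real function also handles the supplementary info and the CSV branch:

```python
import os
import tempfile

def write_arff(filename, examples, relation="stream"):
    """Sketch: write rows of [x0, ..., xn, y] as a minimal numeric ARFF file."""
    n_features = len(examples[0]) - 1
    with open(filename, "w") as f:
        f.write(f"@RELATION {relation}\n")
        for i in range(n_features):
            f.write(f"@ATTRIBUTE x{i} NUMERIC\n")
        f.write("@ATTRIBUTE class NUMERIC\n@DATA\n")
        for row in examples:
            f.write(",".join(str(v) for v in row) + "\n")

path = os.path.join(tempfile.mkdtemp(), "demo.arff")
write_arff(path, [[0.1, 0.2, 1], [0.3, 0.4, 0]])
```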
- skika.data.generate_dataset.save_stream(options, ds_options, pattern=False, arff=False)#
Create, generate and save a data stream to CSV or ARFF.
- Parameters
options (ExperimentOptions) – options for the experiment
ds_options (DatastreamOptions) – options for the stream
pattern (bool) – If true, use a pattern for concept ordering
arff (bool) – Save to ARFF