skika.data.generate_dataset

Functions

generate_concept_chain(concept_desc, sequential)

Given a list of available concepts, generate a dict with (start, id) pairs giving the start of each concept.

generate_experiment_concept_chain(...)

Generates a list of concepts for a datastream.

generate_pattern_concept_chain(concept_desc, ...)

Given a list of available concepts, generate a dict with (start, id) pairs giving the start of each concept.

get_concept_by_example(num_samples, ...)

get_concepts(gt_concepts, ex_index, num_samples)

Given [(gt_concept, start_i, end_i), ...], return the ground truth occurring at a given index.

get_concepts_from_model(concept_chain, ...)

get_model_drifts(num_samples, datastream)

make_reuse_folder(experiment_directory)

saveStreamToArff(filename, stream_examples, ...)

Save examples to an ARFF file.

save_stream(options, ds_options[, pattern, arff])

Create, generate and save a data stream to CSV or ARFF.

Classes

DatastreamOptions(noise, num_concepts, ...)

Options for generating a concept.

ExperimentOptions(seed, stream_type, directory)

NPEncoder(*[, skipkeys, ensure_ascii, ...])

class skika.data.generate_dataset.DatastreamOptions(noise, num_concepts, hard_diff, easy_diff, hard_appear, easy_appear, hard_prop, examples_per_appearence, stream_type, seed, gradual)

Options for generating a concept.

class skika.data.generate_dataset.NPEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

default(obj)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
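
The page does not document what NPEncoder's own default override does. As a hedged illustration of the subclassing pattern described above, the sketch below converts array-like objects through a tolist() method, which NumPy arrays and scalars provide. The names ArrayLikeEncoder and FakeArray are hypothetical and are not part of skika.

```python
import json


class ArrayLikeEncoder(json.JSONEncoder):
    """Hypothetical encoder: serializes any object exposing tolist()."""

    def default(self, obj):
        # NumPy arrays and scalars expose tolist(); duck-typing on it keeps
        # this sketch runnable without NumPy installed.
        tolist = getattr(obj, "tolist", None)
        if callable(tolist):
            return tolist()
        # Let the base class raise TypeError for unsupported objects.
        return super().default(obj)


class FakeArray:
    """Stand-in for an array type, used only for this illustration."""

    def __init__(self, values):
        self.values = list(values)

    def tolist(self):
        return self.values


# Pass the encoder class through the cls argument of json.dumps.
encoded = json.dumps({"drifts": FakeArray([0, 500, 1000])}, cls=ArrayLikeEncoder)
```

Passing the class through cls leaves the rest of the json.dumps call unchanged, so only otherwise unsupported objects reach default.
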

encode(o)

Return a JSON string representation of a Python data structure.

>>> from json.encoder import JSONEncoder
>>> JSONEncoder().encode({"foo": ["bar", "baz"]})
'{"foo": ["bar", "baz"]}'

iterencode(o, _one_shot=False)

Encode the given object and yield each string representation as available.

For example:

for chunk in JSONEncoder().iterencode(bigobject):
    mysocket.write(chunk)

skika.data.generate_dataset.generate_concept_chain(concept_desc, sequential)

Given a list of available concepts, generate a dict with (start, id) pairs giving the start of each concept.

Parameters

sequential (bool) – If true, concept transitions are determined by ID without randomness.
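
To make the (start, id) structure concrete, here is a minimal sketch of how such a chain could be built. The shape of concept_desc is not documented on this page, so the sketch assumes a hypothetical dict mapping each concept id to (appearances, examples_per_appearance). The function name sketch_concept_chain and the seed argument are also assumptions, not skika's API.

```python
import random


def sketch_concept_chain(concept_desc, sequential, seed=0):
    """Return ({start_index: concept_id}, total_samples).

    concept_desc maps concept id -> (appearances, examples_per_appearance).
    This shape is an assumption, not skika's actual data structure.
    """
    rng = random.Random(seed)
    remaining = {cid: n for cid, (n, _) in concept_desc.items()}
    chain = {}
    start = 0
    last = None
    while any(remaining.values()):
        candidates = sorted(cid for cid, n in remaining.items() if n > 0)
        if sequential:
            # Pick the next id after the previous one, wrapping around.
            later = [c for c in candidates if last is None or c > last]
            cid = later[0] if later else candidates[0]
        else:
            # Otherwise pick any concept that still has appearances left.
            cid = rng.choice(candidates)
        chain[start] = cid
        start += concept_desc[cid][1]
        remaining[cid] -= 1
        last = cid
    return chain, start


chain, num_samples = sketch_concept_chain({0: (2, 100), 1: (2, 50)}, sequential=True)
```

With sequential=True the order depends only on the ids, so the result is reproducible regardless of the seed.
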

skika.data.generate_dataset.generate_experiment_concept_chain(ds_options, sequential, pattern)

Generates a list of concepts for a datastream.

Parameters
  • ds_options – options for the data stream

  • sequential (bool) – Whether concepts should appear sequentially rather than in random order

  • pattern (bool) – Whether transitions should follow an underlying pattern

Returns

  • concept_chain (dict<int><int>)

  • num_samples (int)

  • concept_descriptions (list<ConceptOccurence>)

skika.data.generate_dataset.generate_pattern_concept_chain(concept_desc, sequential)

Given a list of available concepts, generate a dict with (start, id) pairs giving the start of each concept. This is generated using a random Markov model, so specific transition patterns have unique properties.

Parameters

sequential (bool) – If true, concept transitions are determined by ID without randomness.
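
The idea of drawing concept transitions from a random Markov model can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the function name sketch_pattern_chain, the fixed segment_length, the exclusion of self-transitions, and the seed argument are all hypothetical, and the sequential flag is left out.

```python
import random


def sketch_pattern_chain(concept_ids, segment_length, num_segments, seed=0):
    """Sample {start_index: concept_id} from a random first-order Markov model.

    Requires at least two concept ids, because self-transitions are excluded.
    """
    rng = random.Random(seed)

    # Build a random transition table: each row is a probability
    # distribution over the other concepts.
    transitions = {}
    for src in concept_ids:
        targets = [c for c in concept_ids if c != src]
        weights = [rng.uniform(0.1, 1.0) for _ in targets]
        total = sum(weights)
        transitions[src] = (targets, [w / total for w in weights])

    # Walk the model, recording where each concept segment starts.
    chain = {}
    current = concept_ids[0]
    for segment in range(num_segments):
        chain[segment * segment_length] = current
        targets, probs = transitions[current]
        current = rng.choices(targets, weights=probs, k=1)[0]
    return chain, transitions


pattern_chain, transitions = sketch_pattern_chain([0, 1, 2], 100, 5, seed=1)
```

Because each concept has its own row of transition probabilities, some concept-to-concept transitions occur far more often than others, which is what gives the stream a learnable pattern.
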

skika.data.generate_dataset.get_concepts(gt_concepts, ex_index, num_samples)

Given [(gt_concept, start_i, end_i), …], return the ground truth occurring at a given index.
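
A lookup of this kind can be sketched as a scan over the spans. The sketch assumes that end_i is inclusive and returns None when no span matches; neither behavior is confirmed by this page. The name sketch_concept_at is hypothetical, and the num_samples argument is omitted because its role is not documented here.

```python
def sketch_concept_at(gt_concepts, ex_index):
    """Return the concept whose [start_i, end_i] span contains ex_index."""
    for gt_concept, start_i, end_i in gt_concepts:
        # Inclusive bounds are an assumption of this sketch.
        if start_i <= ex_index <= end_i:
            return gt_concept
    return None


spans = [(0, 0, 99), (1, 100, 149), (0, 150, 249)]
concept = sketch_concept_at(spans, 120)
```
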

skika.data.generate_dataset.saveStreamToArff(filename, stream_examples, stream_supplementary, arff)

Save examples to an ARFF file.

Parameters
  • filename (str) – filename with extension

  • stream_examples (list) – list of examples [[X, y]]

  • stream_supplementary (list) – list of supplementary info for each observation

  • arff (bool) – Whether to write ARFF instead of CSV
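
To show how the two output formats differ, the sketch below renders a list of [X, y] examples as either ARFF or CSV text. It is an illustration only: the function name sketch_format_stream, the attribute names x0, x1, ..., the relation name, and the choice to return a string rather than write to filename are assumptions, and stream_supplementary is ignored.

```python
import csv
import io


def sketch_format_stream(stream_examples, arff, relation="stream"):
    """Render [[X, y], ...] examples as ARFF or CSV text."""
    num_features = len(stream_examples[0][0])
    names = [f"x{i}" for i in range(num_features)] + ["y"]
    out = io.StringIO()

    if arff:
        # ARFF header: relation name, one @attribute line per column,
        # then the @data marker.
        out.write(f"@relation {relation}\n\n")
        for name in names[:-1]:
            out.write(f"@attribute {name} numeric\n")
        labels = sorted({y for _, y in stream_examples})
        out.write("@attribute y {" + ",".join(str(v) for v in labels) + "}\n\n")
        out.write("@data\n")

    writer = csv.writer(out, lineterminator="\n")
    if not arff:
        # CSV carries the column names in a header row instead.
        writer.writerow(names)
    for X, y in stream_examples:
        writer.writerow(list(X) + [y])
    return out.getvalue()


examples = [[[0.5, 1.0], 0], [[0.25, 2.0], 1]]
csv_text = sketch_format_stream(examples, arff=False)
arff_text = sketch_format_stream(examples, arff=True)
```

The data rows are identical in both formats; ARFF differs only in declaring the relation and attribute types before the @data marker.
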

skika.data.generate_dataset.save_stream(options, ds_options, pattern=False, arff=False)

Create, generate and save a data stream to CSV or ARFF.

Parameters
  • options (ExperimentOptions) – options for the experiment

  • ds_options (DatastreamOptions) – options for the stream

  • pattern (bool) – Whether to use a pattern for concept ordering

  • arff (bool) – Whether to save as ARFF instead of CSV