skika.hyper_parameter_tuning.drift_detectors.build_pareto_knowledge_drifts
Classes
- class skika.hyper_parameter_tuning.drift_detectors.build_pareto_knowledge_drifts.BuildDriftKnowledge(results_directory, names_detectors, names_streams, output, verbose=False)
- Description :
Class to build the Pareto knowledge from hyper-parameter configurations evaluated on different datasets for drift detector tuning. The knowledge consists of the best hyper-parameter configuration for each dataset.
The datasets are characterised by meta-features, and a knowledge base can then be built to link these features to the best configurations.
- Parameters :
- results_directory: str
Path to the directory containing the knowledge files (results of the evaluation of the configurations on example streams)
- names_detectors: list of str
List of the names of the detectors
- names_streams: list of str
List of the names of the streams
- n_meta_features: int, default = 15 ((severity, magnitude, interval) * (med, kurto, skew, per10, per90))
Number of meta-features extracted from the stream. NOT USED FOR THE MOMENT, as we use theoretical meta-features and not measured ones.
- knowledge_type: str
String indicating what knowledge is being calculated (for ARF tree tuning or drift detectors). NOT USED FOR THE MOMENT; further implementation is needed to bring the two applications together.
- output: str
Path to the directory where the output file is saved
- verbose: bool, default = False
Print Pareto figures if True
- Output:
CSV file containing the configurations selected for each example stream (one row per stream)
Example
>>> import os
>>> from skika.hyper_parameter_tuning.drift_detectors.build_pareto_knowledge_drifts import BuildDriftKnowledge
>>>
>>> # Names of the example streams (one results file per stream)
>>> names_stm = ['BernouW1ME0010','BernouW1ME005095','BernouW1ME00509','BernouW1ME0109','BernouW1ME0108','BernouW1ME0208','BernouW1ME0207','BernouW1ME0307','BernouW1ME0306','BernouW1ME0406','BernouW1ME0506','BernouW1ME05506',
...              'BernouW100ME0010','BernouW100ME005095','BernouW100ME00509','BernouW100ME0109','BernouW100ME0108','BernouW100ME0208','BernouW100ME0207','BernouW100ME0307','BernouW100ME0306','BernouW100ME0406','BernouW100ME0506','BernouW100ME05506',
...              'BernouW500ME0010','BernouW500ME005095','BernouW500ME00509','BernouW500ME0109','BernouW500ME0108','BernouW500ME0208','BernouW500ME0207','BernouW500ME0307','BernouW500ME0306','BernouW500ME0406','BernouW500ME0506','BernouW500ME05506']
>>>
>>> # One inner list of configuration names per detector
>>> names_detect = [['PH1','PH2','PH3','PH4','PH5','PH6','PH7','PH8','PH9','PH10','PH11','PH12','PH13','PH14','PH15','PH16'],
...                 ['ADWIN1','ADWIN2','ADWIN3','ADWIN4','ADWIN5','ADWIN6','ADWIN7','ADWIN8','ADWIN9'],
...                 ['DDM1','DDM2','DDM3','DDM4','DDM5','DDM6','DDM7','DDM8','DDM9','DDM10'],
...                 ['SeqDrift21','SeqDrift22','SeqDrift23','SeqDrift24','SeqDrift25','SeqDrift26','SeqDrift27','SeqDrift28','SeqDrift29','SeqDrift210',
...                  'SeqDrift211','SeqDrift212','SeqDrift213','SeqDrift214','SeqDrift215','SeqDrift216','SeqDrift217','SeqDrift218']]
>>>
>>> output_dir = os.getcwd()
>>> directory_path_files = 'examples/pareto_knowledge/ExampleDriftKnowledge'  # Available in hyper-param-tuning-examples repository
>>>
>>> pareto_build = BuildDriftKnowledge(results_directory=directory_path_files, names_detectors=names_detect, names_streams=names_stm, output=output_dir, verbose=True)
>>> pareto_build.load_drift_data()
>>> pareto_build.calculate_pareto()
>>> pareto_build.best_config
- property best_config
Retrieve the best hyper-parameter configurations selected for each stream. In the example above it is read after calculate_pareto() has run. :returns: The best configurations.
- calculate_crowding(scores)
From https://github.com/MichaelAllen1966. Crowding is based on a vector of scores for each individual. Each dimension is normalised between low and high. For any one dimension, all solutions are sorted in order from low to high. The crowding for chromosome x for that score is the difference between the next highest and the next lowest score. The total crowding value sums the crowding over all scores.
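The crowding computation described above can be sketched in plain Python as follows. This is a minimal illustration and not the library's implementation: the function name `crowding_distances` is hypothetical, and giving the two end points of each dimension a fixed value of 1 is an assumption the description does not state.

```python
def crowding_distances(scores):
    """Return one crowding value per solution.

    scores: list of rows, one row per solution, one column per score.
    Hypothetical helper written for illustration only.
    """
    n = len(scores)
    if n == 0:
        return []
    n_scores = len(scores[0])
    crowding = [0.0] * n

    for j in range(n_scores):
        column = [row[j] for row in scores]
        low, high = min(column), max(column)
        span = high - low
        # Normalise this dimension to [0, 1]; a constant column carries no spread
        norm = [(v - low) / span if span > 0 else 0.0 for v in column]

        # Sort the solutions from low to high for this dimension
        order = sorted(range(n), key=lambda i: norm[i])

        # Assumption: end points have only one neighbour, so they get a fixed 1
        crowding[order[0]] += 1.0
        if n > 1:
            crowding[order[-1]] += 1.0

        # Interior points: next highest score minus next lowest score
        for k in range(1, n - 1):
            crowding[order[k]] += norm[order[k + 1]] - norm[order[k - 1]]

    return crowding


example_scores = [[0.0, 1.0], [0.1, 0.9], [0.5, 0.5], [1.0, 0.0]]
example_crowding = crowding_distances(example_scores)
```

In this example the second solution sits close to the first, so it receives the smallest crowding value, while the more isolated third solution scores higher.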
- calculate_pareto()
Function to calculate the Pareto front and detect the knee point
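The source does not say how the knee point is detected. One common heuristic, shown here purely as an assumption, is to take the point of a two-objective front that lies farthest from the straight line joining the two extreme points. The function name `knee_point` is hypothetical, and the sketch assumes both objectives are already on comparable scales.

```python
import math


def knee_point(front):
    """Return the point of a 2-objective front farthest from the chord
    joining its two extremes (a heuristic, not the library's method).

    front: list of (x, y) tuples.
    """
    pts = sorted(front)
    if len(pts) < 3:
        # With fewer than three points there is no interior candidate
        return pts[0]

    (x1, y1), (x2, y2) = pts[0], pts[-1]
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        # Degenerate front: both extremes coincide
        return pts[0]

    def distance_to_chord(point):
        x0, y0 = point
        # Perpendicular distance from (x0, y0) to the line through the extremes
        return abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1) / length

    return max(pts, key=distance_to_chord)


example_front = [(0.0, 1.0), (0.2, 0.9), (0.8, 0.8), (1.0, 0.0)]
example_knee = knee_point(example_front)
```

Here the point (0.8, 0.8) is selected because it bulges farthest away from the line between (0.0, 1.0) and (1.0, 0.0).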
- identify_pareto(scores)
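This method carries no description in the source. As a rough illustration of what identifying a Pareto front from a score matrix usually involves, the sketch below keeps every solution that no other solution dominates. It assumes that higher is better for every score, and the function name `pareto_front_indices` is hypothetical rather than part of the library.

```python
def pareto_front_indices(scores):
    """Return the indices of the non-dominated rows of `scores`.

    Assumption: every score is to be maximised. A row is dominated when
    another row is at least as good everywhere and strictly better somewhere.
    """
    front = []
    for i, candidate in enumerate(scores):
        dominated = False
        for j, other in enumerate(scores):
            if i == j:
                continue
            at_least_as_good = all(o >= c for o, c in zip(other, candidate))
            strictly_better = any(o > c for o, c in zip(other, candidate))
            if at_least_as_good and strictly_better:
                dominated = True
                break
        if not dominated:
            front.append(i)
    return front


example_scores = [[1, 5], [2, 4], [1, 4], [3, 1], [2, 2]]
example_front = pareto_front_indices(example_scores)
```

In this example the rows [1, 4] and [2, 2] are both dominated by [2, 4], so only the other three rows remain on the front.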
- load_drift_data()
Function to load the performance data from the CSV files
- reduce_by_crowding(scores, number_to_select)
From https://github.com/MichaelAllen1966. This function selects a number of solutions based on a tournament of crowding distances. Two members of the population are picked at random. The one with the higher crowding distance is always picked.
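The tournament described above can be sketched as follows. This is an illustration under assumptions, not the library's code: the hypothetical function `select_by_crowding_tournament` takes precomputed crowding distances rather than raw scores, always draws two distinct contestants, and removes each winner from the pool so that it cannot be picked twice.

```python
import random


def select_by_crowding_tournament(crowding, number_to_select, rng=None):
    """Pick `number_to_select` indices by repeated two-way tournaments.

    crowding: list of crowding distances, one per solution.
    rng: optional random.Random instance, for reproducible draws.
    """
    if number_to_select > len(crowding):
        raise ValueError("cannot select more solutions than are available")
    rng = rng or random.Random()

    remaining = list(range(len(crowding)))
    picked = []
    while len(picked) < number_to_select:
        if len(remaining) == 1:
            # Only one candidate left: it wins by default
            winner = remaining[0]
        else:
            # Draw two distinct contestants; the higher crowding distance wins
            a, b = rng.sample(remaining, 2)
            winner = a if crowding[a] >= crowding[b] else b
        picked.append(winner)
        remaining.remove(winner)
    return picked


example_crowding = [2.0, 1.0, 1.8, 2.0]
example_picked = select_by_crowding_tournament(
    example_crowding, 3, rng=random.Random(0)
)
```

Because the solution with the smallest crowding distance loses every tournament it enters, it can only be selected once it is the last one remaining, which favours solutions in sparse regions of the front.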