A generator of test datasets
Install the dependencies:
pip install -e requirements.txtStart the UI and open localhost:
python main.py --uiNOTE 0: the UI starts in debug mode as default
NOTE 1: it is possible to open the dataset already created to inspect them just using the load button with the same folder name of the generated dataset
You can use also the script mode passing a valid configuration as in the following example:
python dataset_generator.py gen configs/HighFreqDataset.json --dest-folder ./dataset{
"seed": 42,
"num_days": 90,
"num_req_x_day": 1000,
"dest_folder": "HighFrequencyDataset",
"function": {
"function_name": "HighFrequencyDataset",
"kwargs": {
"num_files": 100000,
"min_file_size": 100,
"max_file_size": 24000,
"lambda_less_req_files": 1.0,
"lambda_more_req_files": 10.0,
"perc_more_req_files": 10.0,
"perc_files_x_day": 1.0,
"size_generator_function": "gen_random_sizes"
}
}
}NOTE: the
function_namehave to be a validGenFunctionderived class and the kwargs are the input passed to the__init__function.