DataHandler
- class DataHandler(sampling_plan, **kwargs)
Bases: object
Post-processing of data created from a sampling plan. The data (individual samples) are created with do_mpc.sampling.Sampler. The list of all samples originates from do_mpc.sampling.SamplingPlanner and is used to initiate this class (sampling_plan).
The class can be created with optional keyword arguments, which are passed to set_param().
Configuration and retrieving processed data:
1. Initiate the object with the sampling_plan originating from do_mpc.sampling.SamplingPlanner.
2. Set parameters with set_param(). Most importantly, the directory in which the individual samples are located should be passed with the data_dir argument.
3. (Optional) Set one (or multiple) post-processing functions. These functions are applied to each loaded sample and can, e.g., extract or compile important information.
4. Load and return samples either by indexing with the __getitem__() method or by filtering with filter().
Example:
import numpy as np
import do_mpc

sp = do_mpc.sampling.SamplingPlanner()

# Plan with two variables alpha and beta:
sp.set_sampling_var('alpha', np.random.randn)
sp.set_sampling_var('beta', lambda: np.random.randint(0,5))

plan = sp.gen_sampling_plan(n_samples=10)

sampler = do_mpc.sampling.Sampler(plan)

# Sampler computes the product of two variables alpha and beta
# that were created in the SamplingPlanner:
def sample_function(alpha, beta):
    return alpha*beta

sampler.set_sample_function(sample_function)
sampler.sample_data()

# Create DataHandler object with same plan:
dh = do_mpc.sampling.DataHandler(plan)

# Assume you want to compute the square of the result of each sample
dh.set_post_processing('square', lambda res: res**2)

# As well as the value itself:
dh.set_post_processing('default', lambda res: res)

# Query all post-processed results with:
dh[:]
- __getitem__(ind)
Index results from the DataHandler. Pass an index or a slice operator.
Methods
filter
- filter(self, input_filter=None, output_filter=None)
Filter data from the DataHandler. Filters can be applied to the inputs or to the results obtained with the post-processing functions. Filtering returns only a subset of the created samples, selected by arbitrary user-defined conditions.
Example:
sp = do_mpc.sampling.SamplingPlanner()

# SamplingPlanner with two variables alpha and beta:
sp.set_sampling_var('alpha', np.random.randn)
sp.set_sampling_var('beta', lambda: np.random.randint(0,5))
plan = sp.gen_sampling_plan()

...

dh = do_mpc.sampling.DataHandler(plan)
dh.set_post_processing('square', lambda res: res**2)

# Return all samples with alpha < 0 and beta > 2
dh.filter(input_filter = lambda alpha, beta: alpha < 0 and beta > 2)
# Return all samples for which the computed value square < 5
dh.filter(output_filter = lambda square: square < 5)
- Parameters
input_filter (Union[FunctionType, BuiltinMethodType, None]) – Function to filter the data by its inputs (the sampling variables).
output_filter (Union[FunctionType, BuiltinMethodType, None]) – Function to filter the data by the results of the post-processing functions.
- Raises
assertion – No post-processing function is set
assertion – filter_fun must be either Function or BuiltinFunction_or_Method
- Returns
list – Returns the post-processed samples that satisfy the filter
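The filtering behaviour described above can be illustrated without do_mpc installed. The following is a minimal standalone sketch, not the library's implementation: the helper names `call_by_name` and `select` and the hard-coded `samples` list are hypothetical, and it assumes that filter arguments are matched by name to the sampling variables (input filter) or to the post-processing names (output filter), as the lambdas in the example suggest.

```python
import inspect

# Hypothetical stand-in for loaded samples: sampling-plan inputs plus the
# raw result of the sample function (here alpha * beta).
samples = [
    {"alpha": -1.5, "beta": 3, "result": -4.5},
    {"alpha": -0.5, "beta": 4, "result": -2.0},
    {"alpha": 0.7, "beta": 1, "result": 0.7},
    {"alpha": 2.0, "beta": 4, "result": 8.0},
]

# Named post-processing functions, mirroring set_post_processing('square', ...).
post_processing = {"square": lambda res: res**2}


def call_by_name(fun, values):
    """Call fun with the entries of values whose keys match its argument names."""
    names = inspect.signature(fun).parameters
    return fun(**{name: values[name] for name in names})


def select(samples, input_filter=None, output_filter=None):
    """Return the post-processed samples that pass both (optional) filters."""
    selected = []
    for sample in samples:
        inputs = {k: v for k, v in sample.items() if k != "result"}
        outputs = {name: f(sample["result"]) for name, f in post_processing.items()}
        if input_filter is not None and not call_by_name(input_filter, inputs):
            continue
        if output_filter is not None and not call_by_name(output_filter, outputs):
            continue
        selected.append({**inputs, **outputs})
    return selected


# Samples with alpha < 0 and beta > 2:
by_input = select(samples, input_filter=lambda alpha, beta: alpha < 0 and beta > 2)
# Samples whose post-processed value square is below 5:
by_output = select(samples, output_filter=lambda square: square < 5)
```

With the data above, `by_input` keeps the first two samples and `by_output` keeps the second and third. Matching arguments by name lets each filter declare only the variables it needs.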
set_param
- set_param(self, **kwargs)
Set the parameters of the DataHandler.
Parameters must be passed as pairs of valid keywords and respective argument. For example:
datahandler.set_param(data_dir = './samples/')
- Parameters
data_dir (str) – Directory where the data can be found (as defined in the do_mpc.sampling.Sampler).
sample_name (str) – Naming scheme for samples (as defined in the do_mpc.sampling.Sampler).
save_format (str) – Choose either pickle or mat (as defined in the do_mpc.sampling.Sampler).
- Return type
None
set_post_processing
- set_post_processing(self, name, post_processing_function)
Set a post-processing function. The post-processing function is applied to all loaded samples, e.g. with __getitem__() or filter(). Users can set an arbitrary number of post-processing functions by repeatedly calling this method.
The post_processing_function can have two possible signatures:
post_processing_function(case_definition, sample_result)
post_processing_function(sample_result)
Where case_definition is a dict of all variables introduced in the do_mpc.sampling.SamplingPlanner and sample_result is the result obtained from the function introduced with do_mpc.sampling.Sampler.set_sample_function.
Note
Setting a post-processing function with an already existing name will overwrite the previously set post-processing function.
Example:
import numpy as np
import do_mpc

sp = do_mpc.sampling.SamplingPlanner()

# Plan with two variables alpha and beta:
sp.set_sampling_var('alpha', np.random.randn)
sp.set_sampling_var('beta', lambda: np.random.randint(0,5))

plan = sp.gen_sampling_plan(n_samples=10)

sampler = do_mpc.sampling.Sampler(plan)

# Sampler computes the product of two variables alpha and beta
# that were created in the SamplingPlanner:
def sample_function(alpha, beta):
    return alpha*beta

sampler.set_sample_function(sample_function)
sampler.sample_data()

# Create DataHandler object with same plan:
dh = do_mpc.sampling.DataHandler(plan)

# Assume you want to compute the square of the result of each sample
dh.set_post_processing('square', lambda res: res**2)

# As well as the value itself:
dh.set_post_processing('default', lambda res: res)

# Query all post-processed results with:
dh[:]
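To make the two accepted signatures concrete, the sketch below calls one function of each form by hand. It is a standalone illustration under stated assumptions, not do_mpc code: the `case_definition` and `sample_result` values are made up, the function names are hypothetical, and the `registry` dict is a stand-in that only mimics the overwrite-by-name behaviour described in the note.

```python
# Hypothetical values: one case from a sampling plan and the result the
# sample function (alpha * beta) would have produced for it.
case_definition = {"alpha": 0.5, "beta": 4}
sample_result = case_definition["alpha"] * case_definition["beta"]


def square(sample_result):
    """One-argument form: uses only the sample result."""
    return sample_result**2


def result_per_beta(case_definition, sample_result):
    """Two-argument form: also reads the sampled variables of the case."""
    return sample_result / case_definition["beta"]


# Stand-in registry keyed by name; assigning to an existing name replaces
# the earlier function, as the note describes for set_post_processing.
registry = {}
registry["square"] = square
registry["square"] = lambda res: res**2 + 1  # replaces the first 'square'
registry["per_beta"] = result_per_beta

squared = square(sample_result)
per_beta = result_per_beta(case_definition, sample_result)
overwritten = registry["square"](sample_result)
```

The two-argument form is useful when the post-processed quantity depends on the sampled variables themselves, for example when normalising a result by one of the inputs.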
- Parameters
name (str) – Name of the output of the post-processing operation
post_processing_function (Union[FunctionType, BuiltinMethodType]) – The post-processing function to be evaluated
- Raises
assertion – name must be string
assertion – post_processing_function must be either Function or BuiltinFunction
- Return type
None
Attributes
data_dir
- DataHandler.data_dir
Set the directory where the results are stored.