DataHandler#

class DataHandler(sampling_plan, **kwargs)[source]#

Bases: object

Post-processes data created from a sampling plan. The data (individual samples) are created with do_mpc.sampling.Sampler. The list of all samples originates from do_mpc.sampling.SamplingPlanner and is passed to this class as sampling_plan at initialization.

The class can be created with optional keyword arguments which are passed to set_param().

Configuration and retrieving processed data:

  1. Initialize the object with the sampling_plan originating from do_mpc.sampling.SamplingPlanner.

  2. Set parameters with set_param(). Most importantly, pass the directory in which the individual samples are located with the data_dir argument.

  3. (Optional) set one (or multiple) post-processing functions. These functions are applied to each loaded sample and can, e.g., extract or compile important information.

  4. Load and return samples either by indexing with the __getitem__() method or by filtering with filter().

Example:

import numpy as np
import do_mpc

sp = do_mpc.sampling.SamplingPlanner()

# Plan with two variables alpha and beta:
sp.set_sampling_var('alpha', np.random.randn)
sp.set_sampling_var('beta', lambda: np.random.randint(0,5))

plan = sp.gen_sampling_plan(n_samples=10)

sampler = do_mpc.sampling.Sampler(plan)

# Sampler computes the product of two variables alpha and beta
# that were created in the SamplingPlanner:

def sample_function(alpha, beta):
    return alpha*beta

sampler.set_sample_function(sample_function)

sampler.sample_data()

# Create DataHandler object with same plan:
dh = do_mpc.sampling.DataHandler(plan)

# Assume you want to compute the square of the result of each sample
dh.set_post_processing('square', lambda res: res**2)

# As well as the value itself:
dh.set_post_processing('default', lambda res: res)

# Query all post-processed results with:
dh[:]

__getitem__(ind)[source]#

Index results from the DataHandler. Pass an index or a slice operator.
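To make the difference between an index and a slice concrete, here is a minimal sketch in plain Python. ToyHandler is a hypothetical stand-in and not the do_mpc implementation. It assumes that every query returns a list of dicts, one per sample, holding the sampling variables plus one entry per post-processing function, and that a single integer index yields a one-element list.

```python
# Hypothetical stand-in, NOT the do_mpc implementation: it only mimics
# indexing a collection of post-processed samples by int or slice.
class ToyHandler:
    def __init__(self, plan, results):
        self.plan = plan        # list of dicts, one per sample
        self.results = results  # raw result per sample, same order
        self.post = {}          # name -> post-processing function

    def set_post_processing(self, name, fun):
        self.post[name] = fun

    def __getitem__(self, ind):
        positions = range(len(self.plan))
        if isinstance(ind, int):
            # Assumption: a single index still returns a list.
            selected = [positions[ind]]
        elif isinstance(ind, slice):
            selected = list(positions[ind])
        else:
            raise TypeError("ind must be an int or a slice")
        out = []
        for i in selected:
            row = dict(self.plan[i])  # copy, so the plan is not mutated
            for name, fun in self.post.items():
                row[name] = fun(self.results[i])
            out.append(row)
        return out


plan = [
    {"alpha": 0.5, "beta": 2},
    {"alpha": -1.0, "beta": 4},
    {"alpha": 2.0, "beta": 3},
]
results = [p["alpha"] * p["beta"] for p in plan]
toy = ToyHandler(plan, results)
toy.set_post_processing("square", lambda res: res**2)

first = toy[0]      # list with one post-processed sample
last_two = toy[1:]  # list with two post-processed samples
```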

Methods#

filter#

filter(self, input_filter=None, output_filter=None)#

Filter data from the DataHandler. Filters can be applied to the inputs (the sampling variables) or to the results obtained with the post-processing functions. Filtering returns only the subset of the created samples that satisfies the given conditions.

Example:

import numpy as np
import do_mpc

sp = do_mpc.sampling.SamplingPlanner()

# SamplingPlanner with two variables alpha and beta:
sp.set_sampling_var('alpha', np.random.randn)
sp.set_sampling_var('beta', lambda: np.random.randint(0,5))
plan = sp.gen_sampling_plan(n_samples=10)

...

dh = do_mpc.sampling.DataHandler(plan)
dh.set_post_processing('square', lambda res: res**2)

# Return all samples with alpha < 0 and beta > 2
dh.filter(input_filter = lambda alpha, beta: alpha < 0 and beta > 2)
# Return all samples for which the computed value square < 5
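dh.filter(output_filter = lambda square: square < 5)

Parameters:
  • input_filter (Union[FunctionType, BuiltinMethodType]) – Function of the sampling variables (e.g. alpha, beta) that returns True for every sample to keep.

  • output_filter (Union[FunctionType, BuiltinMethodType]) – Function of the post-processing results (e.g. square) that returns True for every sample to keep.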

Raises:
  • assertion – No post processing function is set

  • assertion – filter_fun must be either Function or BuiltinFunction_or_Method

Returns:

list – Returns the post processed samples that satisfy the filter
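The filtering described above can be sketched in plain Python. toy_filter is a hypothetical helper and not part of do_mpc. It assumes that each sample is a dict containing both the sampling variables and the post-processed outputs, and that a filter function receives exactly the entries named by its arguments, as the lambdas in the example suggest.

```python
import inspect


def toy_filter(samples, input_filter=None, output_filter=None):
    """Return the samples for which every given filter evaluates to True.

    Hypothetical stand-in for illustration, not the do_mpc implementation.
    """
    def accepts(fun, row):
        if fun is None:
            return True  # no filter given: keep everything
        # Pass only the entries that the filter names as arguments.
        names = inspect.signature(fun).parameters
        return bool(fun(**{name: row[name] for name in names}))

    return [
        row for row in samples
        if accepts(input_filter, row) and accepts(output_filter, row)
    ]


# Made-up samples with square = (alpha * beta) ** 2
samples = [
    {"alpha": -0.5, "beta": 3, "square": 2.25},
    {"alpha": -2.0, "beta": 4, "square": 64.0},
    {"alpha": 1.0, "beta": 1, "square": 1.0},
]

by_input = toy_filter(
    samples, input_filter=lambda alpha, beta: alpha < 0 and beta > 2
)
by_output = toy_filter(samples, output_filter=lambda square: square < 5)
by_both = toy_filter(
    samples,
    input_filter=lambda alpha: alpha < 0,
    output_filter=lambda square: square < 5,
)
```

Matching arguments by name is what lets a filter mention only the variables it needs, for example alpha alone.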

set_param#

set_param(self, **kwargs)#

Set the parameters of the DataHandler.

Parameters must be passed as keyword arguments, i.e. as pairs of a valid keyword and its value. For example:

datahandler.set_param(overwrite = True)

Return type:

None
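The keyword-pair pattern, including forwarding of constructor keywords to set_param(), might look like the following sketch. ToyConfig is a hypothetical stand-in and not the do_mpc class. The default values and the choice to raise on unknown keywords are assumptions made for illustration; only the keywords data_dir and overwrite are taken from this page.

```python
class ToyConfig:
    """Hypothetical stand-in illustrating keyword-based parameters."""

    # Keywords taken from this page; the defaults below are assumptions.
    _valid_keys = ("data_dir", "overwrite")

    def __init__(self, **kwargs):
        self.data_dir = "./"
        self.overwrite = False
        # Optional constructor keywords are forwarded to set_param().
        self.set_param(**kwargs)

    def set_param(self, **kwargs):
        for key, value in kwargs.items():
            if key not in self._valid_keys:
                # Assumption: unknown keywords are rejected.
                raise KeyError(f"unknown parameter: {key}")
            setattr(self, key, value)


cfg = ToyConfig(data_dir="./samples/")
cfg.set_param(overwrite=True)
```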

set_post_processing#

set_post_processing(self, name, post_processing_function)#

Set a post processing function. The post processing function is applied to all loaded samples, for example when indexing with __getitem__() or calling filter(). Users can set an arbitrary number of post processing functions by repeatedly calling this method.

The post_processing_function can have two possible signatures:

  1. post_processing_function(case_definition, sample_result)

  2. post_processing_function(sample_result)

Here, case_definition is a dict of all variables introduced in the do_mpc.sampling.SamplingPlanner and sample_result is the result obtained from the function introduced with do_mpc.sampling.Sampler.set_sample_function.

Note

Setting a post processing function with an already existing name will overwrite the previously set post processing function.
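The two signatures and the overwrite behavior described above can be illustrated with a small plain-Python sketch. The registry dict and both helper functions are hypothetical and not the do_mpc implementation. The sketch assumes that the signature is selected by the number of arguments the function accepts.

```python
import inspect

# Hypothetical registry: name -> post-processing function
post_processing = {}


def set_post_processing(name, fun):
    assert isinstance(name, str), "name must be string"
    # Re-using a name replaces the previously stored function.
    post_processing[name] = fun


def apply_post_processing(case_definition, sample_result):
    out = dict(case_definition)  # copy, so the input dict is untouched
    for name, fun in post_processing.items():
        n_args = len(inspect.signature(fun).parameters)
        if n_args == 1:
            out[name] = fun(sample_result)
        elif n_args == 2:
            out[name] = fun(case_definition, sample_result)
        else:
            raise TypeError("expected a function with one or two arguments")
    return out


set_post_processing("square", lambda res: res)  # placeholder, replaced below
set_post_processing("scaled", lambda case, res: res / case["beta"])
set_post_processing("square", lambda res: res**2)  # overwrites 'square'

case = {"alpha": 1.5, "beta": 2}
processed = apply_post_processing(case, case["alpha"] * case["beta"])
```

The two-argument form is useful when the post-processed value depends on the sampling variables, as in the scaled entry above.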

Example:

import numpy as np
import do_mpc

sp = do_mpc.sampling.SamplingPlanner()

# Plan with two variables alpha and beta:
sp.set_sampling_var('alpha', np.random.randn)
sp.set_sampling_var('beta', lambda: np.random.randint(0,5))

plan = sp.gen_sampling_plan(n_samples=10)

sampler = do_mpc.sampling.Sampler(plan)

# Sampler computes the product of two variables alpha and beta
# that were created in the SamplingPlanner:

def sample_function(alpha, beta):
    return alpha*beta

sampler.set_sample_function(sample_function)

sampler.sample_data()

# Create DataHandler object with same plan:
dh = do_mpc.sampling.DataHandler(plan)

# Assume you want to compute the square of the result of each sample
dh.set_post_processing('square', lambda res: res**2)

# As well as the value itself:
dh.set_post_processing('default', lambda res: res)

# Query all post-processed results with:
dh[:]

Parameters:
  • name (str) – Name of the output of the post-processing operation

  • post_processing_function (Union[FunctionType, BuiltinMethodType]) – The post processing function to be evaluated

Raises:
  • assertion – name must be string

  • assertion – post_processing_function must be either Function or BuiltinFunction

Return type:

None

Attributes#

data_dir#

DataHandler.data_dir#

Set the directory where the results are stored.