multistate_kernel package

Submodules

multistate_kernel.kernel module

class multistate_kernel.kernel.MultiStateKernel(kernels, scale, scale_bounds)

Bases: multistate_kernel.kernel.VariadicKernelOperator

Kernel for multi-state process

This kernel handles the analysis of multidimensional stochastic processes inside the sklearn.gaussian_process framework.

Let us consider n_states different Gaussian processes. Every process is allowed to have its own internal correlation properties, and all processes are mutually independent. We call these processes the generating processes. At any time moment (X-position) we can construct a vector \({\bf e}\) of n_states elements, where the i-th element contains the value of the i-th process. Since the generating processes are independent, the covariance matrix of the generating vector is diagonal.

To introduce correlations between the processes, a lower triangular scaling matrix \({\rm R}\) is used. At every time moment we call \({\rm R} {\bf e}\) the observable vector and the elements of \({\rm R} {\bf e}\) the observed processes.
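The effect of \({\rm R}\) can be checked numerically; a minimal NumPy sketch (the diagonal covariance and the matrix below are made-up example values):

```python
import numpy as np

# Diagonal covariance of the independent generating processes
D = np.diag([1.0, 2.0])
# Hypothetical lower triangular scale matrix R
R = np.array([[1.0, 0.0],
              [-0.5, 1.0]])

# The covariance of the observable vector R e is R D R^T;
# its off-diagonal terms are non-zero, so the observed
# processes are correlated
cov = R @ D @ R.T
```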

The user specifies a kernel for each generating process in natural order. The kernel parameters and the scaling matrix \({\rm R}\) are found by the optimizer when training on real data. The user should not put multiplicative constants in front of the provided kernels: the scales are already handled by the scaling matrix \({\rm R}\), and additional parameters would make the optimization problem degenerate.

We assume that at every time moment (X-position) only one observable process is measured. Since the observable processes are correlated, a measured value Y for one process provides additional information, in the form of a conditional probability, for the other processes at the same time moment (X-position).
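This is the standard conditional formula for a bivariate Gaussian; a short sketch with made-up covariance values:

```python
import numpy as np

# Hypothetical joint covariance of two observable processes
# at the same X-position
cov = np.array([[1.0, -0.5],
                [-0.5, 2.25]])
y1 = 0.3  # measured value of the first process (zero prior mean assumed)

# Conditional mean and variance of the second process given y1
cond_mean = cov[1, 0] / cov[0, 0] * y1
cond_var = cov[1, 1] - cov[1, 0] ** 2 / cov[0, 0]
```

Measuring the anti-correlated first process above zero pulls the conditional mean of the second one below zero and reduces its variance from 2.25 to 2.0.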

The data for training should be provided in the following format. Each X sample must be prepended with an additional integer element, called the state, enumerating the observed process the sample belongs to. The corresponding Y contains the measured value. Two samples with different states but the same time moment (X-position) are supported.

For instance, suppose we have two observable processes and measured the first process as follows

X 1 3 5
Y 0.0 0.3 0.1

and the second process as follows

X 2 6
Y 0.2 0.4

Then the expected input X for MultiStateKernel is [[0,1],[1,2],[0,3],[0,5],[1,6]], and the values Y are arranged correspondingly: [0.0, 0.2, 0.3, 0.1, 0.4]. The same rule applies to both training and prediction.
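The interleaved format can be built with plain NumPy; a sketch reproducing the example above (the sort by X-position is optional — any consistent ordering of aligned X and Y rows works):

```python
import numpy as np

# Per-process data from the tables above
x1 = np.array([1, 3, 5])
y1 = np.array([0.0, 0.3, 0.1])
x2 = np.array([2, 6])
y2 = np.array([0.2, 0.4])

# Prepend the state index to every X sample
X1 = np.stack([np.zeros_like(x1), x1], axis=1)  # state 0
X2 = np.stack([np.ones_like(x2), x2], axis=1)   # state 1

# Merge both states and sort by X-position, keeping X and Y aligned
X = np.concatenate([X1, X2])
Y = np.concatenate([y1, y2])
order = np.argsort(X[:, 1], kind='stable')
X, Y = X[order], Y[order]
```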

Parameters:
  • kernels (list of sklearn.gaussian_process.kernels.Kernel) – Array of kernels for each state. The array length should be n_states.
  • scale (array, shape (n_states, n_states)) – Initial lower triangular scale matrix.
  • scale_bounds (array, shape (2, n_states, n_states)) – Lower and upper bounds for elements of the scale matrix.

Examples

Here we construct a MultiStateKernel for the case of two states. The first generating process is expected to obey a Matern kernel, the second one is white noise. We also specify an initial scaling matrix with anti-correlation between these states.

>>> import numpy as np
>>> from sklearn.gaussian_process.kernels import Matern, WhiteKernel
>>> from multistate_kernel.kernel import MultiStateKernel
>>> k1 = Matern(nu=0.5, length_scale=1.0, length_scale_bounds=(0.01, 100))
>>> k2 = WhiteKernel(noise_level=1, noise_level_bounds='fixed')
>>> ms_kernel = MultiStateKernel((k1, k2),
...                              np.array([[1, 0], [-0.5, 1]]),
...                              [np.array([[-2.0, -2.0], [-2.0, -2.0]]),
...                               np.array([[2.0, 2.0], [2.0, 2.0]])])

See also

sklearn.gaussian_process.kernels.Kernel
base kernel interface
class ConstantMatrix(coeffs, coeffs_bounds)

Bases: sklearn.gaussian_process.kernels.Kernel

bounds
diag(X)

Returns the diagonal of the kernel k(X, X).

The result of this method is identical to np.diag(self(X)); however, it can be evaluated more efficiently since only the diagonal is evaluated.

Parameters:X (array, shape (n_samples_X, n_features)) – Left argument of the returned kernel k(X, Y)
Returns:K_diag – Diagonal of kernel k(X, X)
Return type:array, shape (n_samples_X,)
get_params(deep=True)

Get parameters of this kernel.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
hyperparameter_coeffs
is_stationary()

Returns whether the kernel is stationary.

theta
tril
diag(X)

Returns the diagonal of the kernel k(X, X). The result of this method is identical to np.diag(self(X)); however, it can be evaluated more efficiently since only the diagonal is evaluated.

Parameters:X (array, shape (n_samples_X, n_features)) – Left argument of the returned kernel k(X, Y)
Returns:K_diag – Diagonal of kernel k(X, X)
Return type:array, shape (n_samples_X,)
get_params(deep=True)

Get parameters of this kernel.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
class multistate_kernel.kernel.VariadicKernelOperator(**kernels)

Bases: sklearn.gaussian_process.kernels.Kernel

Container for a variadic number of nested kernels.

This is a base class that simplifies handling multiple nested kernels as a single kernel. It arranges the parameters to be optimized and handles other basic bookkeeping. The user should inherit from this class and implement the rest of the sklearn.gaussian_process.kernels.KernelOperator interface, for instance __call__().

Parameters:kernels (dict of sklearn.gaussian_process.kernels.Kernel) – The named collection of the kernels objects to be nested.
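As an illustration of the intended subclassing pattern, here is a hypothetical minimal operator that sums an arbitrary number of named kernels (SumOperator is not part of the package; gradient support is omitted for brevity):

```python
import numpy as np
from sklearn.gaussian_process.kernels import Kernel, RBF, WhiteKernel

class SumOperator(Kernel):
    """Hypothetical sketch: sum over an arbitrary number of named kernels."""

    def __init__(self, **kernels):
        self.kernels = kernels

    def get_params(self, deep=True):
        # Expose nested kernel parameters with 'name__param' keys,
        # as sklearn estimators do
        params = dict(self.kernels)
        if deep:
            for name, kernel in self.kernels.items():
                for key, value in kernel.get_params().items():
                    params['{}__{}'.format(name, key)] = value
        return params

    def __call__(self, X, Y=None, eval_gradient=False):
        # Gradient evaluation omitted in this sketch
        return sum(k(X, Y) for k in self.kernels.values())

    def diag(self, X):
        return sum(k.diag(X) for k in self.kernels.values())

    def is_stationary(self):
        return all(k.is_stationary() for k in self.kernels.values())
```

VariadicKernelOperator provides the parameter plumbing shown in get_params above, so a real subclass mostly needs the __call__ part.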

See also

sklearn.gaussian_process.kernels.Kernel
Base kernel interface.
sklearn.gaussian_process.kernels.KernelOperator
Kernel operator for two nested kernels.
bounds
get_params(deep=True)

Get parameters of this kernel.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
hyperparameters

Returns a list of all hyperparameters.

is_stationary()

Returns whether the kernel is stationary.

theta

multistate_kernel.util module

class multistate_kernel.util.FrozenOrderedDict(*args, **kwargs)

Bases: _abcoll.Mapping

Immutable ordered dictionary

It is based on collections.OrderedDict, so it remembers insertion order

class multistate_kernel.util.StateData(x, y, err)

Bases: tuple

err

Alias for field number 2

x

Alias for field number 0

y

Alias for field number 1

class multistate_kernel.util.MultiStateData(state_data_odict, scikit_learn_data)

Bases: object

Multi state data class

This class holds two representations of the multi-state data. The first representation is a frozen ordered dictionary .odict mapping each state key to a triple of x, y, err (all 1-D arrays). The second representation is the .arrays namedtuple composed of three scikit-learn friendly arrays: x (2-D, as needed by MultiStateKernel), y and err, plus an additional constant norm by which y and err should be multiplied to obtain the .odict values.

This class shouldn’t be constructed via __init__ but via its class methods

Parameters:
  • state_data_odict (FrozenOrderedDict[str: StateData or numpy.recarray]) – Ordered dictionary of objects with .x, .y, .err attributes, all of which should be 1-D numpy.ndarray
  • scikit_learn_data (ScikitLearnData) – Object with .x (2-D numpy.ndarray), .y (1-D numpy.ndarray), .err (1-D numpy.ndarray), .norm (positive float).
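The relation between the two representations can be sketched with plain NumPy (example values are made up; the real class stores StateData tuples and preserves state names):

```python
import numpy as np
from collections import OrderedDict

# Scikit-learn style arrays: state index in the first column of x
x = np.array([[0, 1], [1, 2], [0, 3], [0, 5], [1, 6]], dtype=float)
y = np.array([0.0, 0.1, 0.15, 0.05, 0.2])
err = np.full_like(y, 0.005)
norm = 2.0

# Rebuild the per-state dictionary; y and err are multiplied
# by norm to recover the original values
odict = OrderedDict()
for state in np.unique(x[:, 0]).astype(int):
    mask = x[:, 0] == state
    odict[state] = (x[mask, 1], y[mask] * norm, err[mask] * norm)
```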
odict

FrozenOrderedDict[str: StateData or numpy.recarray]

arrays

ScikitLearnData

norm

float

keys

tuple

key(idx)

State name by its index

idx(key)

State index by its name

convert_arrays(x, y, err)

New MultiStateData object from scikit-learn style arrays

append(other)

Add data from another MultiStateData object

Parameters:other (MultiStateData) –
append_dict(d)

Add data from dictionary

Parameters:d (dict-like) – Dictionary that is similar to .odict
arrays
convert_arrays(x, y, err)

Get a new MultiStateData object from scikit-learn style arrays

Parameters:
  • x (2-D numpy.ndarray) – X-data in the format specified by MultiStateKernel: the first column is the state index, the second column is the coordinate.
  • y (1-D numpy.ndarray) – y-data
  • err (1-D numpy.ndarray) – Errors for y
Returns:

New MultiStateData object with the same .norm and .keys as original

Return type:

MultiStateData

classmethod from_arrays(x, y, err, norm=1, **kwargs)

Construct from scikit-learn style arrays

Parameters:
  • x (2-D numpy.ndarray) – X-data in the format specified by MultiStateKernel: the first column is the state index, the second column is the coordinate.
  • y (1-D numpy.ndarray) – y-data
  • err (1-D numpy.ndarray) – Errors for y
  • norm (positive float, optional) – The positive constant to multiply y and err to obtain their original values
  • keys (array_like, optional) – The names for states. The default is integral indexes
Raises:

IndexError: inconsistent input data shapes

classmethod from_items(items)

Construct from an iterable of (key: (x, y, err))

Raises:ValueError: inconsistent input data shapes
classmethod from_scikit_learn_data(data, keys=None)

Construct from ScikitLearnData

Parameters:
  • data (ScikitLearnData) – An object with x, y, err and norm attributes. For details of these attributes see .from_arrays()
  • keys (array_like, optional) – The names for states. The default is integral indexes
Raises:

IndexError: inconsistent input data shapes

classmethod from_state_data(*args, **kwargs)

Construct from an iterable of (key: object), where each object should have attributes x, y and err, all 1-D numpy.ndarray

Raises:ValueError: inconsistent input data shapes
keys()
norm
odict
sample(x)

Generate scikit-learn style sample from 1-d array

Parameters:x (1-D numpy.ndarray) – x sample data; it is assumed to be the sample for every state
Returns:X-data in the format specified by MultiStateKernel
Return type:2-D numpy.ndarray
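A rough NumPy equivalent of what sample() is described to do (the helper name sample_like and the row ordering, grouped by state, are assumptions):

```python
import numpy as np

def sample_like(x, n_states):
    """Prepend every state index to a shared 1-D grid of X-positions."""
    x = np.asarray(x)
    state = np.repeat(np.arange(n_states), len(x))
    return np.stack([state, np.tile(x, n_states)], axis=1)
```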

Module contents

class multistate_kernel.MultiStateKernel(kernels, scale, scale_bounds)

Bases: multistate_kernel.kernel.VariadicKernelOperator

Kernel for multi-state process. This class is re-exported from multistate_kernel.kernel; see multistate_kernel.kernel.MultiStateKernel above for the full documentation, parameters and examples.