multistate_kernel package

Submodules

multistate_kernel.kernel module

class multistate_kernel.kernel.MultiStateKernel(kernels, scale, scale_bounds)

Bases: multistate_kernel.kernel.VariadicKernelOperator

Kernel for multi-state process

This kernel handles the analysis of multidimensional stochastic processes inside the sklearn.gaussian_process framework.

Let us consider n_states different Gaussian processes. Every process is allowed to have its own internal correlation properties, and all processes are mutually independent. We call these processes the generating processes. At any time moment (X-position) we can construct a vector \({\bf e}\) of n_states elements, where the i-th element contains the value of the i-th process. Since the generating processes are independent, the covariance matrix of the generating vector is diagonal.

To introduce correlations between the processes, a lower triangular scaling matrix \({\rm R}\) is used. At every time moment we call \({\rm R} {\bf e}\) the observable vector and the elements of \({\rm R} {\bf e}\) the observed processes.
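The effect of \({\rm R}\) can be checked numerically; a minimal NumPy sketch (the diagonal covariance and the matrix below are made-up example values):

```python
import numpy as np

# Diagonal covariance of the independent generating processes
D = np.diag([1.0, 2.0])
# Hypothetical lower triangular scale matrix R
R = np.array([[1.0, 0.0],
              [-0.5, 1.0]])

# The covariance of the observable vector R e is R D R^T;
# its off-diagonal terms are non-zero, so the observed
# processes are correlated
cov = R @ D @ R.T
```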

The user specifies a kernel for each generating process in natural order. The kernel parameters and the scaling matrix \({\rm R}\) are found by the optimizer when training on real data. The user should not put multiplicative constants in front of the provided kernels: the scales are already handled by the scaling matrix \({\rm R}\), and additional parameters would make the optimization problem degenerate.

We assume that at every time moment (X-position) only one observable process is measured. Since the observable processes are correlated, a measured value Y for one process provides additional information, in the form of a conditional probability, for the other processes at the same time moment (X-position).
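This is the standard conditional formula for a bivariate Gaussian; a short sketch with made-up covariance values:

```python
import numpy as np

# Hypothetical joint covariance of two observable processes
# at the same X-position
cov = np.array([[1.0, -0.5],
                [-0.5, 2.25]])
y1 = 0.3  # measured value of the first process (zero prior mean assumed)

# Conditional mean and variance of the second process given y1
cond_mean = cov[1, 0] / cov[0, 0] * y1
cond_var = cov[1, 1] - cov[1, 0] ** 2 / cov[0, 0]
```

Measuring the anti-correlated first process above zero pulls the conditional mean of the second one below zero and reduces its variance from 2.25 to 2.0.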

The data for training should be provided in the following format. Each X sample must be prepended with an additional integer element, called the state, enumerating the observed process the sample belongs to. The corresponding Y contains the measured value. Two samples with different states but the same time moment (X-position) are supported.

For instance, suppose we have two observable processes and measured the first process as follows

X 1 3 5
Y 0.0 0.3 0.1

and the second process as follows

X 2 6
Y 0.2 0.4

Then the expected input X for MultiStateKernel is [[0,1],[1,2],[0,3],[0,5],[1,6]], and the values Y are arranged correspondingly: [0.0, 0.2, 0.3, 0.1, 0.4]. The same rule applies to both training and prediction.
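The interleaved format can be built with plain NumPy; a sketch reproducing the example above (the sort by X-position is optional — any consistent ordering of aligned X and Y rows works):

```python
import numpy as np

# Per-process data from the tables above
x1 = np.array([1, 3, 5])
y1 = np.array([0.0, 0.3, 0.1])
x2 = np.array([2, 6])
y2 = np.array([0.2, 0.4])

# Prepend the state index to every X sample
X1 = np.stack([np.zeros_like(x1), x1], axis=1)  # state 0
X2 = np.stack([np.ones_like(x2), x2], axis=1)   # state 1

# Merge both states and sort by X-position, keeping X and Y aligned
X = np.concatenate([X1, X2])
Y = np.concatenate([y1, y2])
order = np.argsort(X[:, 1], kind='stable')
X, Y = X[order], Y[order]
```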

Parameters:
  • kernels (list of sklearn.gaussian_process.kernels.Kernel) – Array of kernels for each state. The array length should be n_states.
  • scale (array, shape (n_states, n_states)) – Initial lower triangular scale matrix.
  • scale_bounds (array, shape (2, n_states, n_states)) – Lower and upper bounds for elements of the scale matrix.

Examples

Here we construct a MultiStateKernel for the case of two states. The first generating process is expected to obey a Matern kernel, the second one is white noise. We also specify an initial scaling matrix with anti-correlation between these states.

>>> import numpy as np
>>> from sklearn.gaussian_process.kernels import Matern, WhiteKernel
>>> from multistate_kernel.kernel import MultiStateKernel
>>> k1 = Matern(nu=0.5, length_scale=1.0, length_scale_bounds=(0.01, 100))
>>> k2 = WhiteKernel(noise_level=1, noise_level_bounds='fixed')
>>> ms_kernel = MultiStateKernel((k1, k2),
...                              np.array([[1, 0], [-0.5, 1]]),
...                              [np.array([[-2.0, -2.0], [-2.0, -2.0]]),
...                               np.array([[2.0, 2.0], [2.0, 2.0]])])

See also

sklearn.gaussian_process.kernels.Kernel
base kernel interface
class ConstantMatrix(coeffs, coeffs_bounds)

Bases: sklearn.gaussian_process.kernels.Kernel

bounds
diag(X)

Returns the diagonal of the kernel k(X, X).

The result of this method is identical to np.diag(self(X)); however, it can be evaluated more efficiently since only the diagonal is evaluated.

Parameters:X (array, shape (n_samples_X, n_features)) – Left argument of the returned kernel k(X, Y)
Returns:K_diag – Diagonal of kernel k(X, X)
Return type:array, shape (n_samples_X,)
get_params(deep=True)

Get parameters of this kernel.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
hyperparameter_coeffs
is_stationary()

Returns whether the kernel is stationary.

theta
tril
diag(X)

Returns the diagonal of the kernel k(X, X). The result of this method is identical to np.diag(self(X)); however, it can be evaluated more efficiently since only the diagonal is evaluated.

Parameters:X (array, shape (n_samples_X, n_features)) – Left argument of the returned kernel k(X, Y)
Returns:K_diag – Diagonal of kernel k(X, X)
Return type:array, shape (n_samples_X,)
get_params(deep=True)

Get parameters of this kernel.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
class multistate_kernel.kernel.VariadicKernelOperator(**kernels)

Bases: sklearn.gaussian_process.kernels.Kernel

Container for a variadic number of nested kernels.

This is a base class that simplifies handling multiple nested kernels as a single kernel. It arranges the parameters to be optimized and handles other basic bookkeeping. The user should inherit from this class and implement the rest of the sklearn.gaussian_process.kernels.KernelOperator interface, for instance __call__().

Parameters:kernels (dict of sklearn.gaussian_process.kernels.Kernel) – The named collection of the kernels objects to be nested.
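As an illustration of the intended subclassing pattern, here is a hypothetical minimal operator that sums an arbitrary number of named kernels (SumOperator is not part of the package; gradient support is omitted for brevity):

```python
import numpy as np
from sklearn.gaussian_process.kernels import Kernel, RBF, WhiteKernel

class SumOperator(Kernel):
    """Hypothetical sketch: sum over an arbitrary number of named kernels."""

    def __init__(self, **kernels):
        self.kernels = kernels

    def get_params(self, deep=True):
        # Expose nested kernel parameters with 'name__param' keys,
        # as sklearn estimators do
        params = dict(self.kernels)
        if deep:
            for name, kernel in self.kernels.items():
                for key, value in kernel.get_params().items():
                    params['{}__{}'.format(name, key)] = value
        return params

    def __call__(self, X, Y=None, eval_gradient=False):
        # Gradient evaluation omitted in this sketch
        return sum(k(X, Y) for k in self.kernels.values())

    def diag(self, X):
        return sum(k.diag(X) for k in self.kernels.values())

    def is_stationary(self):
        return all(k.is_stationary() for k in self.kernels.values())
```

VariadicKernelOperator provides the parameter plumbing shown in get_params above, so a real subclass mostly needs the __call__ part.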

See also

sklearn.gaussian_process.kernels.Kernel
Base kernel interface.
sklearn.gaussian_process.kernels.KernelOperator
Kernel operator for two nested kernels.
bounds
get_params(deep=True)

Get parameters of this kernel.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
hyperparameters

Returns a list of all hyperparameters.

is_stationary()

Returns whether the kernel is stationary.

theta

multistate_kernel.util module

class multistate_kernel.util.FrozenOrderedDict(*args, **kwargs)

Bases: _abcoll.Mapping

Immutable ordered dictionary

It is based on collections.OrderedDict, so it remembers insertion order

class multistate_kernel.util.StateData(x, y, err)

Bases: tuple

err

Alias for field number 2

x

Alias for field number 0

y

Alias for field number 1

class multistate_kernel.util.MultiStateData(state_data_odict, scikit_learn_data)

Bases: object

Multi state data class

This class holds two representations of the multi-state data. The first representation is a frozen ordered dictionary .odict mapping each state key to a triple of x, y, err (all 1-D arrays). The second representation is the .arrays namedtuple composed of three scikit-learn friendly arrays: x (2-D, as needed by MultiStateKernel), y and err, plus an additional constant norm by which y and err should be multiplied to obtain the .odict values.

This class shouldn’t be constructed via __init__ but via its class methods

Parameters:
  • state_data_odict (FrozenOrderedDict[str: StateData or numpy.recarray]) – Ordered dictionary of objects with .x, .y, .err attributes, all of which should be 1-D numpy.ndarray
  • scikit_learn_data (ScikitLearnData) – Object with .x (2-D numpy.ndarray), .y (1-D numpy.ndarray), .err (1-D numpy.ndarray), .norm (positive float).
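The relation between the two representations can be sketched with plain NumPy (example values are made up; the real class stores StateData tuples and preserves state names):

```python
import numpy as np
from collections import OrderedDict

# Scikit-learn style arrays: state index in the first column of x
x = np.array([[0, 1], [1, 2], [0, 3], [0, 5], [1, 6]], dtype=float)
y = np.array([0.0, 0.1, 0.15, 0.05, 0.2])
err = np.full_like(y, 0.005)
norm = 2.0

# Rebuild the per-state dictionary; y and err are multiplied
# by norm to recover the original values
odict = OrderedDict()
for state in np.unique(x[:, 0]).astype(int):
    mask = x[:, 0] == state
    odict[state] = (x[mask, 1], y[mask] * norm, err[mask] * norm)
```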
odict

FrozenOrderedDict[str: StateData or numpy.recarray]

arrays

ScikitLearnData

norm

float

keys

tuple

key(idx)

State name by its index

idx(key)

State index by its name

convert_arrays(x, y, err)

New MultiStateData object from scikit-learn style arrays

append(other)

Add data from another MultiStateData object

Parameters:other (MultiStateData) –
append_dict(d)

Add data from dictionary

Parameters:d (dict-like) – Dictionary that is similar to .odict
arrays
convert_arrays(x, y, err)

Get a new MultiStateData object from scikit-learn style arrays

Parameters:
  • x (2-D numpy.ndarray) – X-data in the format specified by MultiStateKernel: the first column is the state index, the second column is the coordinate.
  • y (1-D numpy.ndarray) – y-data
  • err (1-D numpy.ndarray) – Errors for y
Returns:

New MultiStateData object with the same .norm and .keys as original

Return type:

MultiStateData

classmethod from_arrays(x, y, err, norm=1, **kwargs)

Construct from scikit-learn style arrays

Parameters:
  • x (2-D numpy.ndarray) – X-data in the format specified by MultiStateKernel: the first column is the state index, the second column is the coordinate.
  • y (1-D numpy.ndarray) – y-data
  • err (1-D numpy.ndarray) – Errors for y
  • norm (positive float, optional) – The positive constant to multiply y and err to obtain their original values
  • keys (array_like, optional) – The names for states. The default is integral indexes
Raises:

IndexError: inconsistent input data shapes

classmethod from_items(items)

Construct from an iterable of (key: (x, y, err))

Raises:ValueError: inconsistent input data shapes
classmethod from_scikit_learn_data(data, keys=None)

Construct from ScikitLearnData

Parameters:
  • data (ScikitLearnData) – An object with x, y, err and norm attributes. For details of these attributes see .from_arrays()
  • keys (array_like, optional) – The names for states. The default is integral indexes
Raises:

IndexError: inconsistent input data shapes

classmethod from_state_data(*args, **kwargs)

Construct from an iterable of (key: object), where each object should have attributes x, y and err, all 1-D numpy.ndarray

Raises:ValueError: inconsistent input data shapes
keys()
norm
odict
sample(x)

Generate scikit-learn style sample from 1-d array

Parameters:x (1-D numpy.ndarray) – x sample data; it is assumed to be the sample for every state
Returns:X-data in the format specified by MultiStateKernel
Return type:2-D numpy.ndarray
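A rough NumPy equivalent of what sample() is described to do (the helper name sample_like and the row ordering, grouped by state, are assumptions):

```python
import numpy as np

def sample_like(x, n_states):
    """Prepend every state index to a shared 1-D grid of X-positions."""
    x = np.asarray(x)
    state = np.repeat(np.arange(n_states), len(x))
    return np.stack([state, np.tile(x, n_states)], axis=1)
```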

Module contents

class multistate_kernel.MultiStateKernel(kernels, scale, scale_bounds)

Bases: multistate_kernel.kernel.VariadicKernelOperator

Kernel for multi-state process. This class is re-exported from multistate_kernel.kernel; see multistate_kernel.kernel.MultiStateKernel above for the full documentation, parameters and examples.