multistate_kernel package¶
Submodules¶
multistate_kernel.kernel module¶
- class multistate_kernel.kernel.MultiStateKernel(kernels, scale, scale_bounds)¶
  Bases: multistate_kernel.kernel.VariadicKernelOperator
Kernel for multi-state processes
This kernel enables the analysis of multidimensional stochastic processes inside the sklearn.gaussian_process framework.
Let us consider n_states different Gaussian processes. Every process is allowed to have its own internal correlation properties, and all processes are mutually independent. We call these processes the generating processes. At any time-moment (X-position) we can construct the vector \({\bf e}\) with n_states elements, where the i-th element contains the value of the i-th process. Since the generating processes are independent, the covariance matrix of the generating vector is diagonal. To introduce correlations between processes, a lower triangular scaling matrix \({\rm R}\) is used. At every time-moment we call \({\rm R} {\bf e}\) the observable vector and the corresponding elements of \({\rm R} {\bf e}\) the observed processes.
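The covariance statement above can be checked numerically. This is a plain-numpy sketch, not part of the package: the matrices D and R are hypothetical examples, and it verifies that for a generating vector e with diagonal covariance D, the observable vector \({\rm R} {\bf e}\) has covariance \({\rm R} {\rm D} {\rm R}^T\).

```python
import numpy as np

# Hypothetical two-state example: independent generating processes with
# diagonal covariance D, mixed by a lower triangular scaling matrix R.
rng = np.random.default_rng(0)
D = np.diag([1.0, 2.0])                  # diagonal: processes are independent
R = np.array([[1.0, 0.0],
              [-0.5, 1.0]])              # lower triangular scaling matrix

e = rng.multivariate_normal(np.zeros(2), D, size=200_000)  # generating vectors
obs = e @ R.T                                              # observed vectors R e

empirical = np.cov(obs, rowvar=False)  # sample covariance of the observed vectors
analytic = R @ D @ R.T                 # the expected covariance
print(np.allclose(empirical, analytic, atol=0.05))  # → True
```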
The user specifies kernels for each generating process in natural order. The kernel parameters and the scaling matrix \({\rm R}\) are found by the optimizer when training on real data. The user should not introduce multiplicative constants in front of the provided kernels: the scales are already handled by the scaling matrix \({\rm R}\), and additional parameters would make the optimization problem degenerate.
We assume that at every time-moment (X-position) only one observable process is measured. Since the observable processes are correlated, a measured value Y for one process carries additional information, in the form of a conditional probability, for the other processes at the same time-moment (X-position).
The data for training should be provided in the following format. Each X sample must be prepended with an additional element enumerating the observed process which the sample belongs to. This additional integer value is called a state. The corresponding Y contains the measured value. Two samples with different states but the same time-moment (X-position) are supported.
For instance, suppose we have two observable processes and the first process was measured as follows:

X    1    3    5
Y    0.0  0.3  0.1

The second process as follows:

X    2    6
Y    0.2  0.4

Then the expected input X for MultiStateKernel is [[0,1],[1,2],[0,3],[0,5],[1,6]], and the Y values are arranged with respect to X: [0.0, 0.2, 0.3, 0.1, 0.4]. This rule is common to both training and predict operations.
Parameters: - kernels (list of sklearn.gaussian_process.kernels.Kernel) – List of kernels, one for each state. The list length should be n_states.
- scale (array, shape (n_states, n_states)) – Initial lower triangular scale matrix.
- scale_bounds (array, shape (2, n_states, n_states)) – Lower and upper bounds for elements of the scale matrix.
Examples
Here, we construct MultiStateKernel for the case of two states. The first generating process is expected to obey a Matern kernel; the second one is white noise. We also specify an initial scaling matrix with anti-correlation between these states.
>>> k1 = Matern(nu=0.5, length_scale=1.0, length_scale_bounds=(0.01, 100))
>>> k2 = WhiteKernel(noise_level=1, noise_level_bounds='fixed')
>>> ms_kernel = MultiStateKernel((k1, k2,),
...                              np.array([[1,0],[-0.5,1]]),
...                              [np.array([[-2.0,-2.0],[-2.0,-2.0]]),
...                               np.array([[2.0,2.0],[2.0,2.0]])])
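The interleaving rule for the input data can be sketched with plain numpy. The series values are taken from the two measurement tables in this section; the assembly code itself is illustrative and not part of the package API.

```python
import numpy as np

# Two measured series: state 0 (first process) and state 1 (second process).
x1, y1 = np.array([1, 3, 5]), np.array([0.0, 0.3, 0.1])  # state 0
x2, y2 = np.array([2, 6]), np.array([0.2, 0.4])          # state 1

# Prepend each X sample with its state index, then sort by time-moment.
states = np.concatenate([np.zeros_like(x1), np.ones_like(x2)])
times = np.concatenate([x1, x2])
y = np.concatenate([y1, y2])

order = np.argsort(times, kind='stable')
X = np.stack([states[order], times[order]], axis=1)  # [[state, time], ...]
print(X.tolist())         # [[0, 1], [1, 2], [0, 3], [0, 5], [1, 6]]
print(y[order].tolist())  # [0.0, 0.2, 0.3, 0.1, 0.4]
```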
See also
sklearn.gaussian_process.kernels.Kernel
- base kernel interface
- class ConstantMatrix(coeffs, coeffs_bounds)¶
  Bases: sklearn.gaussian_process.kernels.Kernel
- bounds¶
- diag(X)¶ Returns the diagonal of the kernel k(X, X).
  The result of this method is identical to np.diag(self(X)); however, it can be evaluated more efficiently since only the diagonal is evaluated.
  Parameters: X (array, shape (n_samples_X, n_features)) – Left argument of the returned kernel k(X, Y)
  Returns: K_diag – Diagonal of kernel k(X, X)
  Return type: array, shape (n_samples_X,)
- get_params(deep=True)¶ Get parameters of this kernel.
  Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
  Returns: params – Parameter names mapped to their values.
  Return type: mapping of string to any
- hyperparameter_coeffs¶
- is_stationary()¶ Returns whether the kernel is stationary.
- theta¶
- tril¶
- diag(X)¶ Returns the diagonal of the kernel k(X, X). The result of this method is identical to np.diag(self(X)); however, it can be evaluated more efficiently since only the diagonal is evaluated.
  Parameters: X (array, shape (n_samples_X, n_features)) – Left argument of the returned kernel k(X, Y)
  Returns: K_diag – Diagonal of kernel k(X, X)
  Return type: array, shape (n_samples_X,)
- get_params(deep=True)¶ Get parameters of this kernel.
  Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
  Returns: params – Parameter names mapped to their values.
  Return type: mapping of string to any
- class multistate_kernel.kernel.VariadicKernelOperator(**kernels)¶
  Bases: sklearn.gaussian_process.kernels.Kernel
  Container for a variadic number of nested kernels.
  This is a base class to simplify handling multiple nested kernels as a single kernel. The class arranges the parameters to be optimized and handles other basic bookkeeping. The user should inherit from this class and implement the rest of the sklearn.gaussian_process.kernels.KernelOperator interface, for instance __call__().
  Parameters: kernels (dict of sklearn.gaussian_process.kernels.Kernel) – The named collection of kernel objects to be nested.
  See also
  sklearn.gaussian_process.kernels.Kernel - Base kernel interface.
  sklearn.gaussian_process.kernels.KernelOperator - Kernel operator for two nested kernels.
- bounds¶
- get_params(deep=True)¶ Get parameters of this kernel.
  Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
  Returns: params – Parameter names mapped to their values.
  Return type: mapping of string to any
- hyperparameters¶ Returns a list of all hyperparameters.
- is_stationary()¶ Returns whether the kernel is stationary.
- theta¶
multistate_kernel.util module¶
- class multistate_kernel.util.FrozenOrderedDict(*args, **kwargs)¶
  Bases: _abcoll.Mapping
  Immutable ordered dictionary
  It is based on collections.OrderedDict, so it remembers insertion order.
- class multistate_kernel.util.StateData(x, y, err)¶
  Bases: tuple
  - err¶ Alias for field number 2
  - x¶ Alias for field number 0
  - y¶ Alias for field number 1
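StateData is a named triple of per-state arrays. A minimal standard-library sketch of its behaviour (the real class lives in multistate_kernel.util; the field values here are made up):

```python
from collections import namedtuple

# Equivalent shape: a namedtuple with fields x, y, err (field numbers 0, 1, 2).
StateData = namedtuple('StateData', ('x', 'y', 'err'))

sd = StateData(x=[1, 3, 5], y=[0.0, 0.3, 0.1], err=[0.1, 0.1, 0.1])
print(sd.y)           # attribute access, field number 1
print(sd[1] is sd.y)  # True: index access and attribute access agree
```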
- class multistate_kernel.util.MultiStateData(state_data_odict, scikit_learn_data)¶
  Bases: object
Multi-state data class
This class holds two representations of the multi-state data. The first representation is a frozen ordered dictionary .odict, mapping each key to a triplet of x, y, err (all 1-D arrays). The second representation is the .arrays namedtuple composed of three scikit-learn friendly arrays: x (2-D, as needed by MultiStateKernel), y and err, and an additional constant norm that should multiply y and err to get the .odict values.
This class shouldn't be constructed by __init__ but by class methods.
Parameters:
- state_data_odict (FrozenOrderedDict[str: StateData or numpy.recarray]) – Ordered dictionary of objects with .x, .y, .err attributes, all of them 1-D numpy.ndarray
- scikit_learn_data (ScikitLearnData) – Object with .x (2-D numpy.ndarray), .y (1-D numpy.ndarray), .err (1-D numpy.ndarray), .norm (positive float).
- odict¶ FrozenOrderedDict[str: StateData or numpy.recarray]
- arrays¶ ScikitLearnData
- norm¶ float
- keys¶ tuple
- key(idx)¶ State name by its index
- idx(key)¶ State index by its name
- convert_arrays(x, y, err)¶ New MultiStateData object from scikit-learn style arrays
- append(other)¶ Add data from another MultiStateData object
  Parameters: other (MultiStateData) –
- append_dict(d)¶ Add data from a dictionary
  Parameters: d (dict-like) – Dictionary that is similar to .odict
- arrays
- convert_arrays(x, y, err) Get a new MultiStateData object from scikit-learn style arrays
  Parameters:
  - x (2-D numpy.ndarray) – X-data in the format specified by MultiStateKernel: the first column is the state index, the second column is the coordinate.
  - y (1-D numpy.ndarray) – y-data
  - err (1-D numpy.ndarray) – Errors for y
  Returns: New MultiStateData object with the same .norm and .keys as the original
  Return type: MultiStateData
- classmethod from_arrays(x, y, err, norm=1, **kwargs)¶ Construct from scikit-learn style arrays
  Parameters:
  - x (2-D numpy.ndarray) – X-data in the format specified by MultiStateKernel: the first column is the state index, the second column is the coordinate.
  - y (1-D numpy.ndarray) – y-data
  - err (1-D numpy.ndarray) – Errors for y
  - norm (positive float, optional) – The positive constant to multiply y and err by to obtain their original values
  - keys (array_like, optional) – The names for states. The default is integer indexes
  Raises: IndexError – inconsistent input data shapes
- classmethod from_items(items)¶ Construct from an iterable of (key: (x, y, err))
  Raises: ValueError – inconsistent input data shapes
- classmethod from_scikit_learn_data(data, keys=None)¶ Construct from ScikitLearnData
  Parameters:
  - data (ScikitLearnData) – An object with x, y, err and norm attributes. For details of these attributes see .from_arrays()
  - keys (array_like, optional) – The names for states. The default is integer indexes
  Raises: IndexError – inconsistent input data shapes
- classmethod from_state_data(*args, **kwargs)¶ Construct from an iterable of (key: object), where each object should have attributes x, y and err, all 1-D numpy.ndarray
  Raises: ValueError – inconsistent input data shapes
- keys()
- norm
- odict
- sample(x)¶ Generate a scikit-learn style sample from a 1-D array
  Parameters: x (1-D numpy.ndarray) – x sample data; it is assumed to be the sample for every state
  Returns: X-data in the format specified by MultiStateKernel
  Return type: 2-D numpy.ndarray
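A plain-numpy sketch of the behaviour described for sample(): every state index is paired with every element of the 1-D input array, producing X in the two-column format used by MultiStateKernel. The helper function and the state-major output ordering are assumptions for illustration, not the package implementation.

```python
import numpy as np

def sample_sketch(x, n_states):
    """Return 2-D X: column 0 is the state index, column 1 the coordinate."""
    x = np.asarray(x, dtype=float)
    states = np.repeat(np.arange(n_states, dtype=float), len(x))  # 0,0,...,1,1,...
    coords = np.tile(x, n_states)                                 # x repeated per state
    return np.stack([states, coords], axis=1)

X = sample_sketch([1.0, 4.0], n_states=2)
print(X.tolist())  # [[0.0, 1.0], [0.0, 4.0], [1.0, 1.0], [1.0, 4.0]]
```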
Module contents¶
- class multistate_kernel.MultiStateKernel(kernels, scale, scale_bounds)¶
  Bases: multistate_kernel.kernel.VariadicKernelOperator
  Alias of multistate_kernel.kernel.MultiStateKernel; see the multistate_kernel.kernel module section above for the full documentation.