List of Functions and Default Parameters

Mathematical Definitions of Standard Kernels

Constant kernel: k(x,x') = \sigma^2

Gabor kernel: k(x,x') = \exp{(-\frac{||x-x'||^2} {2l^2})}  \cos{(\frac{2\pi||x-x'||}{p})}

Linear kernel with ARD: k(x,x') = \sigma^2 x^T L^{-1} x'

where L is a diagonal matrix consisting of the squared length scales l_{i}^2 for each dimension i.

Linear kernel: k(x,x') = \sigma^2 x^T x'

Matern kernel: k(x,x') = \sigma^2 f(r\sqrt{d})\exp{(-r\sqrt{d})}

with f(t)=1 for d=1, f(t)=1+t for d=3 and f(t)=1+t+t^2/3 for d=5.

where r is the scaled distance r=\sqrt{(x-x')^T L^{-1} (x-x')} and L is a diagonal matrix consisting of the squared length scales l_{i}^2 for each dimension i.

Independent noise kernel: k(x,x') = \sigma_{n}^{2}\,\delta_{x,x'}, i.e. \sigma_{n}^{2} if x = x' and 0 otherwise.

Periodic kernel: k(x,x') = \sigma^2\exp{(-\frac{2\sin^2(\pi||x-x'||/p)}{l^2})}

Piecewise polynomial kernel: k(x,x') = \sigma^2 \max{(1-r,0)}^{(j+v)} f(r,j)

with j = floor(D/2)+v+1

where D is the dimension of the input, L is a diagonal matrix consisting of the squared length scales l_{i}^2 for each dimension i, and f is a function of r and j depending on v. See the GPML MATLAB toolbox v3.4.

Polynomial kernel: k(x,x') = \sigma^2 (x^T x' + c )^d

Squared exponential kernel: k(x,x') = \sigma^2 \exp{(-\frac{||x-x'||^2}{2l^2})}

Squared exponential kernel with ARD: k(x,x') = \sigma^2 \exp{(-\frac{(x-x')^T L^{-1}(x-x')}{2})}

where L is a diagonal matrix consisting of the squared length scales l_{i}^2 for each dimension i.

Squared exponential kernel with unit magnitude: k(x,x') = \exp{(-\frac{||x-x'||^2}{2l^2})}

Rational quadratic kernel: k(x,x') = \sigma^2 (1+\frac{||x-x'||^2}{2\alpha l^2})^{-\alpha}

Rational quadratic kernel with ARD: k(x,x') = \sigma^2 (1+\frac{(x-x')^T L^{-1}(x-x')}{2\alpha})^{-\alpha}

where L is a diagonal matrix consisting of the squared length scales l_{i}^2 for each dimension i.
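As a quick sanity check of the formulas above, here is a small standalone NumPy sketch (not part of pyGPs) that evaluates the isotropic squared exponential and rational quadratic kernels. It follows the log-scale hyperparameter convention used throughout this page, i.e. \sigma^2 = exp(2 log_sigma), l = exp(log_ell) and \alpha = exp(log_alpha):

    import numpy as np

    def sq_dist(x, z):
        # Pairwise squared Euclidean distances ||x - z||^2, shape (n, m)
        return ((x[:, None, :] - z[None, :, :]) ** 2).sum(-1)

    def k_rbf(x, z, log_ell=0.0, log_sigma=0.0):
        # Squared exponential: sigma^2 * exp(-||x - z||^2 / (2 l^2))
        ell, sf2 = np.exp(log_ell), np.exp(2 * log_sigma)
        return sf2 * np.exp(-sq_dist(x, z) / (2 * ell ** 2))

    def k_rq(x, z, log_ell=0.0, log_sigma=0.0, log_alpha=0.0):
        # Rational quadratic: sigma^2 * (1 + ||x - z||^2 / (2 alpha l^2))^(-alpha)
        ell, sf2, alpha = np.exp(log_ell), np.exp(2 * log_sigma), np.exp(log_alpha)
        return sf2 * (1.0 + sq_dist(x, z) / (2 * alpha * ell ** 2)) ** (-alpha)

    x = np.random.randn(5, 2)                     # 5 points in 2 dimensions
    print(k_rbf(x, x).shape, k_rq(x, x).shape)    # (5, 5) (5, 5)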

List of Kernels and Default Parameters

class pyGPs.Core.cov.Const(log_sigma=0.0)[source]

Constant kernel. hyp = [ log_sigma ]

Parameters: log_sigma – signal deviation.
class pyGPs.Core.cov.FITCOfKernel(cov, inducingInput)[source]

Covariance function to be used together with the FITC approximation. The function allows for more than one output argument and does not respect the interface of a proper covariance function. Instead of outputting the full covariance, it returns cross-covariances between the inputs x, z and the inducing inputs xu, as needed by infFITC.

class pyGPs.Core.cov.Gabor(log_ell=0.0, log_p=0.0)[source]

Gabor covariance function with length scale ell and period p. The covariance function is parameterized as:

k(x,z) = h( ||x-z|| ) with h(t) = exp(-t^2/(2*ell^2))*cos(2*pi*t/p).

The hyperparameters are:

hyp = [log(ell), log(p)]

Note that SM covariance implements a weighted sum of Gabor covariance functions, but using an alternative (spectral) parameterization.

Parameters:
  • log_ell – characteristic length scale.
  • log_p – period.
class pyGPs.Core.cov.Kernel[source]

This is the base class for kernel functions. There is no computation in this class; it only defines the rules that a kernel class should follow. Each covariance function inherits from it and implements its own behaviour.

checkInputGetCovMatrix(x, z, mode)[source]

Check validity of inputs for the method getCovMatrix()

Parameters:
  • x – training data
  • z – test data
  • mode (str) – 'self_test': return the self covariance matrix of the test data (test by 1); 'train': return the training covariance matrix (train by train); 'cross': return the cross covariance matrix between x and z (train by test)
checkInputGetDerMatrix(x, z, mode, der)[source]

Check validity of inputs for the method getDerMatrix()

Parameters:
  • x – training data
  • z – test data
  • mode (str) – 'self_test': return the self derivative matrix of the test data (test by 1); 'train': return the training derivative matrix (train by train); 'cross': return the cross derivative matrix between x and z (train by test)
  • der (int) – index of the hyperparameter whose derivative is to be computed
fitc(inducingInput)[source]

Covariance function to be used together with the FITC approximation. Setting up a FITC GP model will implicitly call this method.

Returns: an instance of FITCOfKernel
getCovMatrix(x=None, z=None, mode=None)[source]

Return the specific covariance matrix according to input mode

Parameters:
  • x – training data
  • z – test data
  • mode (str) – 'self_test': return the self covariance matrix of the test data (test by 1); 'train': return the training covariance matrix (train by train); 'cross': return the cross covariance matrix between x and z (train by test)
Returns:

the corresponding covariance matrix

getDerMatrix(x=None, z=None, mode=None, der=None)[source]

Compute derivatives wrt. hyperparameters according to input mode

Parameters:
  • x – training data
  • z – test data
  • mode (str) – 'self_test': return the self derivative matrix of the test data (test by 1); 'train': return the training derivative matrix (train by train); 'cross': return the cross derivative matrix between x and z (train by test)
  • der (int) – index of the hyperparameter whose derivative is to be computed
Returns:

the corresponding derivative matrix
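A brief usage sketch of getCovMatrix() and getDerMatrix(), assuming the training and test inputs are n x D NumPy arrays and using the RBF kernel documented below (its hyperparameter index 0 is log_ell, per hyp = [log_ell, log_sigma]):

    import numpy as np
    from pyGPs.Core import cov

    x = np.random.randn(20, 3)    # 20 training points in 3 dimensions
    z = np.random.randn(10, 3)    # 10 test points

    k = cov.RBF(log_ell=0.0, log_sigma=0.0)

    K_train = k.getCovMatrix(x=x, mode='train')        # 20 x 20 training covariance
    K_cross = k.getCovMatrix(x=x, z=z, mode='cross')   # 20 x 10 cross covariance
    k_self  = k.getCovMatrix(z=z, mode='self_test')    # 10 x 1 self covariances of z

    # Derivative of the training covariance w.r.t. log_ell (hyperparameter index 0)
    dK = k.getDerMatrix(x=x, mode='train', der=0)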

class pyGPs.Core.cov.LINard(D=None, log_ell_list=None)[source]

Linear covariance function with Automatic Relevance Determination. hyp = log_ell_list

Parameters:
  • D – dimension of training data. Set if you want default ell, which is 1 for each dimension.
  • log_ell_list – characteristic length scale for each dimension.
class pyGPs.Core.cov.Linear(log_sigma=0.0)[source]

Linear kernel. hyp = [ log_sigma ].

Parameters: log_sigma – signal deviation.
class pyGPs.Core.cov.Matern(log_ell=0.0, d=3, log_sigma=0.0)[source]

Matern covariance function with nu = d/2 and isotropic distance measure. For d=1 the function is also known as the exponential covariance function or the Ornstein-Uhlenbeck covariance in 1d. d will be rounded to 1, 3, 5 or 7. hyp = [ log_ell, log_sigma ]

Parameters:
  • d – d is 2 times nu. Can only be 1, 3, 5, or 7.
  • log_ell – characteristic length scale.
  • log_sigma – signal deviation.
class pyGPs.Core.cov.Noise(log_sigma=0.0)[source]

Independent covariance function, i.e. "white noise", with specified variance. Normally NOT used anymore since noise is now added in the likelihood. hyp = [ log_sigma ]

Parameters: log_sigma – signal deviation.
class pyGPs.Core.cov.Periodic(log_ell=0.0, log_p=0.0, log_sigma=0.0)[source]

Stationary kernel for a smooth periodic function. hyp = [ log_ell, log_p, log_sigma]

Parameters:
  • log_p – period.
  • log_ell – characteristic length scale.
  • log_sigma – signal deviation.
class pyGPs.Core.cov.PiecePoly(log_ell=0.0, v=2, log_sigma=0.0)[source]

Piecewise polynomial kernel with compact support. hyp = [log_ell, log_sigma]

Parameters:
  • log_ell – characteristic length scale.
  • log_sigma – signal deviation.
  • v – degree v will be rounded to 0, 1, 2, or 3 (not treated as a hyperparameter, i.e. will not be trained).
class pyGPs.Core.cov.Poly(log_c=0.0, d=2, log_sigma=0.0)[source]

Polynomial covariance function. hyp = [ log_c, log_sigma ]

Parameters:
  • log_c – inhomogeneous offset.
  • log_sigma – signal deviation.
  • d – degree of polynomial (not treated as hyperparameter, i.e. will not be trained).
class pyGPs.Core.cov.Pre(M1, M2)[source]

Precomputed kernel matrix. No hyperparameters and thus nothing will be optimised.

Parameters:
  • M1 – cross covariance matrix (train+1 by test). The last row contains the self covariances (diagonal of test by test).
  • M2 – training set covariance matrix (train by train)
class pyGPs.Core.cov.ProductOfKernel(cov1, cov2)[source]

Product of two kernel functions.

class pyGPs.Core.cov.RBF(log_ell=0.0, log_sigma=0.0)[source]

Squared Exponential kernel with isotropic distance measure. hyp = [log_ell, log_sigma]

Parameters:
  • log_ell – characteristic length scale.
  • log_sigma – signal deviation.
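For context, here is a short regression sketch in the style of the pyGPs demos using the RBF kernel above; the top-level GPR model interface (setPrior, optimize, predict) is assumed here, since it is documented elsewhere:

    import numpy as np
    import pyGPs

    x = np.random.randn(30, 1)                      # training inputs (n x D)
    y = np.sin(x) + 0.1 * np.random.randn(30, 1)    # noisy training targets
    z = np.linspace(-3, 3, 100).reshape(-1, 1)      # test inputs

    model = pyGPs.GPR()                             # GP regression model
    model.setPrior(kernel=pyGPs.cov.RBF(log_ell=0.0, log_sigma=0.0))
    model.optimize(x, y)                            # fit hyperparameters by marginal likelihood
    ym, ys2, fm, fs2, lp = model.predict(z)         # predictive mean and variance at z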
class pyGPs.Core.cov.RBFard(D=None, log_ell_list=None, log_sigma=0.0)[source]

Squared Exponential kernel with Automatic Relevance Determination. hyp = log_ell_list + [log_sigma]

Parameters:
  • D – dimension of pattern. Set if you want default ell, which is 1 for each dimension.
  • log_ell_list – characteristic length scale for each dimension.
  • log_sigma – signal deviation.
class pyGPs.Core.cov.RBFunit(log_ell=0.0)[source]

Squared Exponential kernel with isotropic distance measure and unit magnitude, i.e. the signal variance is always 1. hyp = [ log_ell ]

Parameters: log_ell – characteristic length scale.
class pyGPs.Core.cov.RQ(log_ell=0.0, log_sigma=0.0, log_alpha=0.0)[source]

Rational Quadratic covariance function with isotropic distance measure. hyp = [ log_ell, log_sigma, log_alpha ]

Parameters:
  • log_ell – characteristic length scale.
  • log_sigma – signal deviation.
  • log_alpha – shape parameter for the RQ covariance.
class pyGPs.Core.cov.RQard(D=None, log_ell_list=None, log_sigma=0.0, log_alpha=0.0)[source]

Rational Quadratic covariance function with Automatic Relevance Determination (ARD) distance measure. hyp = log_ell_list + [ log_sigma, log_alpha ]

Parameters:
  • D – dimension of pattern. Set if you want default ell, which is 0.5 for each dimension.
  • log_ell_list – characteristic length scale for each dimension.
  • log_sigma – signal deviation.
  • log_alpha – shape parameter for the RQ covariance.
class pyGPs.Core.cov.SM(Q=0, hyps=[], D=None)[source]

Gaussian Spectral Mixture covariance function. The covariance function is parameterized as:

k(x^p,x^q) = w'*prod( exp(-2*pi^2*d^2*v)*cos(2*pi*d*m), 2 ), with d = |x^p - x^q|

where m(DxQ), v(DxQ) are the means and variances of the spectral mixture components and w are the mixture weights. The hyperparameters are:

hyp = [ log(w), log(m(:)), log(sqrt(v(:))) ]

Copyright (c) by Andrew Gordon Wilson and Hannes Nickisch, 2013-10-09.

For more details, see 1) Gaussian Process Kernels for Pattern Discovery and Extrapolation, ICML, 2013, by Andrew Gordon Wilson and Ryan Prescott Adams. 2) GPatt: Fast Multidimensional Pattern Extrapolation with Gaussian Processes, arXiv 1310.5288, 2013, by Andrew Gordon Wilson, Elad Gilboa, Arye Nehorai and John P. Cunningham, and http://mlg.eng.cam.ac.uk/andrew/pattern

Parameters:
  • log_w – weight coefficients.
  • log_m – spectral means (frequencies).
  • log_v – spectral variances.
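Given the hyperparameter layout above, hyps for Q mixture components in D input dimensions is a flat list of length Q + 2*D*Q, concatenated as [log(w), log(m(:)), log(sqrt(v(:)))]. A hedged construction sketch (the initial values below are arbitrary placeholders, not recommended settings):

    import numpy as np
    from pyGPs.Core import cov

    Q, D = 4, 2                                    # mixture components, input dimension
    log_w = np.log(np.ones(Q) / Q)                 # log mixture weights, length Q
    log_m = np.log(np.random.rand(D * Q) + 0.1)    # log spectral means, flattened D x Q
    log_s = np.log(np.random.rand(D * Q) + 0.1)    # log sqrt of spectral variances, flattened D x Q

    hyps = list(np.concatenate([log_w, log_m, log_s]))
    k_sm = cov.SM(Q=Q, hyps=hyps, D=D)             # spectral mixture kernel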
class pyGPs.Core.cov.ScaleOfKernel(cov, scalar)[source]

Scale of a kernel function.

class pyGPs.Core.cov.SumOfKernel(cov1, cov2)[source]

Sum of two kernel functions.
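The composite classes SumOfKernel, ProductOfKernel and ScaleOfKernel can be combined to build structured kernels. The sketch below uses only the constructors documented on this page (pyGPs may also provide operator shortcuts, but those are not shown here):

    from pyGPs.Core import cov

    k_smooth   = cov.RBF(log_ell=1.0, log_sigma=0.0)
    k_periodic = cov.Periodic(log_ell=0.0, log_p=0.0, log_sigma=0.0)
    k_linear   = cov.Linear(log_sigma=0.0)

    k_locper = cov.ProductOfKernel(k_smooth, k_periodic)   # locally periodic component
    k_total  = cov.SumOfKernel(k_locper, k_linear)         # ... plus a linear trend
    k_scaled = cov.ScaleOfKernel(k_total, 2.0)              # overall rescaling by a constant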

List of Means and Default Parameters

class pyGPs.Core.mean.Const(c=5.0)[source]

Constant mean function. hyp = [c]

Parameters: c – constant value for mean.
class pyGPs.Core.mean.Linear(D=None, alpha_list=None)[source]

Linear mean function. self.hyp = alpha_list

Parameters:
  • D – dimension of training data. Set if you want default alpha, which is 0.5 for each dimension.
  • alpha_list – scalar alpha for each dimension.
class pyGPs.Core.mean.Mean[source]

The base class for mean functions.

getDerMatrix(x=None, der=None)[source]

Compute derivatives wrt. hyperparameters.

Parameters:
  • x – training inputs
  • der (int) – index of the hyperparameter whose derivative is to be computed
Returns:

the corresponding derivative matrix

getMean(x=None)[source]

Get the mean vector based on the inputs.

Parameters: x – training data
class pyGPs.Core.mean.One[source]

One mean.

class pyGPs.Core.mean.PowerOfMean(mean, d)[source]

Power of a mean function.

class pyGPs.Core.mean.ProductOfMean(mean1, mean2)[source]

Product of two mean functions.

class pyGPs.Core.mean.ScaleOfMean(mean, scalar)[source]

Scale of a mean function.

class pyGPs.Core.mean.SumOfMean(mean1, mean2)[source]

Sum of two mean functions.

class pyGPs.Core.mean.Zero[source]

Zero mean.
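A sketch combining the mean classes above with a kernel as a GP prior, in the style of the pyGPs demos; the model-level setPrior(mean=..., kernel=...) call is assumed from the library's top-level interface:

    import numpy as np
    import pyGPs

    x = np.random.randn(30, 2)                               # 30 points in 2 dimensions
    y = x.sum(axis=1, keepdims=True) + 0.1 * np.random.randn(30, 1)

    m = pyGPs.mean.Linear(D=2)                               # one alpha per input dimension
    m = pyGPs.mean.SumOfMean(m, pyGPs.mean.Const(c=1.0))     # ... plus a constant offset
    k = pyGPs.cov.RBF()

    model = pyGPs.GPR()
    model.setPrior(mean=m, kernel=k)                         # use a non-zero prior mean
    model.optimize(x, y)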

List of Likelihoods

class pyGPs.Core.lik.Erf[source]

Error function or cumulative Gaussian likelihood function for binary classification or probit regression.

Erf(t)=\frac{1}{2}(1+erf(\frac{t}{\sqrt{2}}))=normcdf(t)

class pyGPs.Core.lik.Gauss(log_sigma=-2.3025850929940455)[source]

Gaussian likelihood function for regression.

Gauss(t)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(t-y)^2}{2\sigma^2}}, where y is the mean and \sigma is the standard deviation.

hyp = [ log_sigma ]

class pyGPs.Core.lik.Laplace(log_sigma=-2.3025850929940455)[source]

Laplacian likelihood function for regression. ONLY works with EP inference!

Laplace(t) = \frac{1}{2b}e^{-\frac{|t-y|}{b}} where b=\frac{\sigma}{\sqrt{2}}, y is the mean and \sigma is the standard deviation.

hyp = [ log_sigma ]
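To make the two regression likelihoods concrete, here is a standalone NumPy check (independent of pyGPs) of the densities given above; note that the default log_sigma = -2.3025... corresponds to sigma = exp(log_sigma) = 0.1:

    import numpy as np

    log_sigma = -2.3025850929940455       # the default used by Gauss and Laplace above
    sigma = np.exp(log_sigma)             # 0.1

    def gauss_density(t, y):
        # Gaussian density of t with mean y and standard deviation sigma
        return np.exp(-(t - y) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

    def laplace_density(t, y):
        # Laplace density of t with mean y; b = sigma / sqrt(2) gives standard deviation sigma
        b = sigma / np.sqrt(2)
        return np.exp(-np.abs(t - y) / b) / (2 * b)

    print(sigma)                          # 0.1
    print(gauss_density(0.0, 0.0))        # peak value 1/(sqrt(2 pi) sigma), about 3.989
    print(laplace_density(0.0, 0.0))      # peak value 1/(2 b), about 7.071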

List of Inference

class pyGPs.Core.inf.Exact[source]

Exact inference for a GP with Gaussian likelihood. Compute a parametrization of the posterior, the negative log marginal likelihood and its derivatives w.r.t. the hyperparameters.

class pyGPs.Core.inf.EP[source]

Expectation Propagation approximation to the posterior Gaussian Process.

class pyGPs.Core.inf.Laplace[source]

Laplace’s Approximation to the posterior Gaussian process.

class pyGPs.Core.inf.FITC_Exact[source]

FITC approximation to the posterior Gaussian process. The function is equivalent to infExact with the covariance function: Kt = Q + G; G = diag(g); g = diag(K-Q); Q = Ku' * inv(Quu) * Ku; where Ku and Kuu are covariances w.r.t. the inducing inputs xu, snu2 = sn2/1e6 is the noise of the inducing inputs, and Quu = Kuu + snu2*eye(nu).

class pyGPs.Core.inf.FITC_EP[source]

FITC-EP approximation to the posterior Gaussian process. The function is equivalent to infEP with the covariance function: Kt = Q + G; G = diag(g); g = diag(K-Q); Q = Ku' * inv(Kuu + snu2 * eye(nu)) * Ku; where Ku and Kuu are covariances w.r.t. the inducing inputs xu and snu2 = sn2/1e6 is the noise of the inducing inputs. We fix the standard deviation of the inducing inputs snu to one per mil of the measurement noise's standard deviation sn. In case of a likelihood without a noise parameter sn2, we simply use snu2 = 1e-6. For details, see The Generalized FITC Approximation, Andrew Naish-Guzman and Sean Holden, NIPS, 2007.

class pyGPs.Core.inf.FITC_Laplace[source]

FITC-Laplace approximation to the posterior Gaussian process. The function is equivalent to Laplace with the covariance function: Kt = Q + G; G = diag(g); g = diag(K-Q); Q = Ku' * inv(Kuu + snu2 * eye(nu)) * Ku; where Ku and Kuu are covariances w.r.t. the inducing inputs xu and snu2 = sn2/1e6 is the noise of the inducing inputs. We fix the standard deviation of the inducing inputs snu to one per mil of the measurement noise's standard deviation sn. In case of a likelihood without a noise parameter sn2, we simply use snu2 = 1e-6.
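A sparse-regression sketch in the style of the pyGPs FITC demo; the GPR_FITC model class and its inducing_points keyword are assumed from the library's top-level interface and are not documented on this page:

    import numpy as np
    import pyGPs

    x = np.random.randn(500, 1)                    # a larger training set
    y = np.sin(x) + 0.1 * np.random.randn(500, 1)
    z = np.linspace(-3, 3, 200).reshape(-1, 1)     # test inputs
    u = np.linspace(-3, 3, 10).reshape(-1, 1)      # 10 inducing inputs xu

    model = pyGPs.GPR_FITC()                       # regression with the FITC approximation
    model.setPrior(kernel=pyGPs.cov.RBF(), inducing_points=u)
    model.optimize(x, y)                           # hyperparameters fitted under FITC
    ym, ys2, fm, fs2, lp = model.predict(z)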