Miguel Lazaro-Gredilla Homepage

# SSGP code simple usage tutorial

Following these simple steps, you can easily reproduce the experimental section of the article "Sparse Spectrum Gaussian Processes for Regression" (submitted for review). To do so, you may also want to download the corresponding Regression datasets.

## Usage

1. Download the code.
2. Unzip it to some directory of your choice.
3. Within Matlab, change to that directory (cd directory) or add it to the path (addpath directory).
4. Prepare your data:
   • Store your n training input vectors in a matrix X_tr of size nxD (D is the dimension of the inputs).
   • Store the corresponding n targets in a vector T_tr of size nx1.
   • Store your n_tst test input vectors in a matrix X_tst of size n_tstxD.
   • Store the corresponding n_tst targets in a vector T_tst of size n_tstx1. (If they are not available, just use a properly sized vector of zeros.)
5. Set m to the desired number of spectral points (one half the number of desired basis functions) and run the main function:

[NMSE, mu, S2, NMLP, loghyper, convergence] = ssgprfixed_ui(X_tr, T_tr, X_tst, T_tst, m);

6. The function selects the model using the training data and makes predictions for the test data. It returns the predictive means (mu), predictive variances (S2), error measures on the test data NMSE (Normalized Mean Square Error) and NMLP (Negative Mean Log Probability), the selected hyperparameters (loghyper), and the convergence curve (convergence).
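The steps above can be sketched end to end on synthetic data. Only ssgprfixed_ui and its calling signature come from this tutorial; the data, sizes, and the choice m = 25 are illustrative assumptions:

```matlab
% Minimal usage sketch with synthetic 1-D data (all values illustrative).
n = 200; n_tst = 50; D = 1;
X_tr  = linspace(-5, 5, n)';              % n x D training inputs
T_tr  = sin(X_tr) + 0.1*randn(n, 1);      % n x 1 noisy training targets
X_tst = linspace(-6, 6, n_tst)';          % n_tst x D test inputs
T_tst = zeros(n_tst, 1);                  % targets unknown: properly sized zeros

m = 25;                                   % spectral points (50 basis functions)
[NMSE, mu, S2, NMLP, loghyper, convergence] = ...
    ssgprfixed_ui(X_tr, T_tr, X_tst, T_tst, m);
```

With placeholder zero targets, mu and S2 remain valid predictions, but the reported NMSE and NMLP are meaningless and should be ignored.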

## Notes

• Training cost scales as O(nm²), i.e., linearly in the number of training points.
• Choosing bigger values for m usually results in more accurate solutions, at a higher computational cost.
• In the limit (for infinite m), the model is equivalent to a GP model with squared exponential covariance.
• The starting point for the hyperparameter search is randomized. Different runs will lead to slightly different results.
• Results obtained with this procedure represent a baseline. If you can provide a better initial (or final) guess for the hyperparameters, better results are expected.
• The default number of optimization iterations is probably unnecessarily large for most problems; it is set that high to ensure proper convergence. You can reduce it to save time.
• If you want to:
   • manually select the initialization point,
   • avoid having to reselect the model each time you use it,
   • change the number of optimization iterations,
   • use a different covariance function,
   • etc.
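Since the hyperparameter search starts from a random point, a cheap way to stabilize results is to run the optimization a few times and keep the best run. This is a hedged sketch, not part of the toolbox: it assumes the last entry of the returned convergence curve holds the final (minimized) objective value, which should be checked against the code before relying on it.

```matlab
% Sketch: best of several randomized restarts.
% ASSUMPTION: convergence(end) is the final minimized objective value.
best_obj = Inf;
for r = 1:5
    [NMSE, mu, S2, NMLP, loghyper, convergence] = ...
        ssgprfixed_ui(X_tr, T_tr, X_tst, T_tst, m);
    if convergence(end) < best_obj          % keep the run that converged lowest
        best_obj = convergence(end);
        best = struct('NMSE', NMSE, 'mu', mu, 'S2', S2, ...
                      'NMLP', NMLP, 'loghyper', loghyper);
    end
end
```

Five restarts is an arbitrary choice; each restart costs a full O(nm²) optimization, so trade accuracy against time as needed.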