See the "Mathematical description of lgpr models" vignette for more information about how the different options map to the resulting statistical model.

create_model(
  formula,
  data,
  likelihood = "gaussian",
  prior = NULL,
  c_hat = NULL,
  num_trials = NULL,
  options = NULL,
  prior_only = FALSE,
  verbose = FALSE,
  sample_f = !(likelihood == "gaussian")
)
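As a minimal sketch of a call, the example below builds a small simulated data frame that follows the requirements listed for the data argument and passes it to create_model. The data values, the formula `y ~ age + id`, and the helper `check_lgp_data` are assumptions made for illustration and are not part of lgpr. The call itself is guarded so that the block also runs when lgpr is not installed.

```r
# Simulated longitudinal data: 4 individuals, 5 time points each (made-up values).
set.seed(1)
dat <- data.frame(
  id  = factor(rep(1:4, each = 5)),             # categorical covariate: factor
  age = rep(c(12, 24, 36, 48, 60), times = 4)   # continuous covariate: numeric
)
dat$y <- sin(dat$age / 20) + rnorm(nrow(dat), sd = 0.2)  # numeric response

# Hypothetical helper (not part of lgpr): checks the documented data requirements.
check_lgp_data <- function(data, response) {
  # every column must be numeric (continuous) or factor (categorical)
  ok_type <- vapply(data, function(col) is.numeric(col) || is.factor(col), logical(1))
  # column names must not start or end with an underscore
  ok_names <- !grepl("^_|_$", names(data))
  # the response must not contain missing values (anyNA covers NA and NaN)
  ok_resp <- !anyNA(data[[response]])
  all(ok_type) && all(ok_names) && ok_resp
}
stopifnot(check_lgp_data(dat, "y"))

# Only attempt the call if lgpr is available; the formula is an assumed example.
if (requireNamespace("lgpr", quietly = TRUE)) {
  model <- lgpr::create_model(y ~ age + id, data = dat, likelihood = "gaussian")
  print(class(model))
}
```

Running the check before the call is optional; it only surfaces data problems earlier than the package's own input parsing would.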

## Arguments

formula
The model formula. It must contain exactly one tilde (~), with the response variable on the left-hand side and the model terms on the right-hand side. Terms are separated by a plus (+) sign, and all variables appearing in formula must be found in data. See the "Model formula syntax" section in the documentation of lgp for instructions on how to specify the model terms.

data
A data.frame where each column corresponds to one variable, and each row is one observation. Continuous covariates and the response variable must have type "numeric" and categorical covariates must have type "factor". Missing values should be indicated with NaN or NA. The response variable cannot contain missing values. Column names should not contain trailing or leading underscores.

likelihood
Determines the observation model. Must be either "gaussian" (default), "poisson", "nb" (negative binomial), "binomial" or "bb" (beta binomial).

prior
A named list, defining the prior distribution of model (hyper)parameters. See the "Defining priors" section in the documentation of lgp.

c_hat
The GP mean. This should only be given if sample_f is TRUE; otherwise the GP will always have zero mean. If sample_f is TRUE, the given c_hat can be a vector of length dim(data), or a real number defining a constant GP mean. If not specified and sample_f is TRUE, c_hat is set to c_hat = mean(y) if likelihood is "gaussian"; c_hat = log(mean(y)) if likelihood is "poisson" or "nb"; and c_hat = log(p/(1-p)), where p = mean(y/num_trials), if likelihood is "binomial" or "bb". Here y denotes the response variable measurements.

num_trials
The number of trials. This argument is only needed when likelihood is "binomial" or "bb". Must have length one or equal to the number of data points. Setting num_trials=1 and likelihood="binomial" corresponds to a Bernoulli observation model.

options
A named list with the following possible fields: delta, the amount of jitter added to ensure positive definite covariance matrices, and vm_params, the variance mask function parameters (numeric vector of length 2). If options is NULL, default options are used. The defaults are equivalent to options = list(delta = 1e-8, vm_params = c(0.025, 1)).

prior_only
Should the likelihood be ignored? See also sample_param_prior, which can be used for any lgpmodel and whose runtime is independent of the number of observations.

verbose
Should some informative messages be printed?

sample_f
Determines if the latent function values are sampled (must be TRUE if likelihood is not "gaussian"). If this is TRUE, the response variable will be normalized to have zero mean and unit variance.
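The default value of c_hat described above can be worked through in base R. The function `default_c_hat` below is a hypothetical helper written for this illustration, not an lgpr function; it only mirrors the documented rule and makes no claim about how the package computes the value internally. The input vectors are made-up numbers.

```r
# Hypothetical helper (not part of lgpr): mirrors the documented default for c_hat.
default_c_hat <- function(y, likelihood, num_trials = NULL) {
  switch(likelihood,
    gaussian = mean(y),
    poisson  = ,                      # falls through to the "nb" branch
    nb       = log(mean(y)),
    binomial = ,                      # falls through to the "bb" branch
    bb       = {
      p <- mean(y / num_trials)       # average success proportion
      log(p / (1 - p))                # logit of p
    },
    stop("unknown likelihood: ", likelihood)
  )
}

# Count data: mean(y) is 4, so the default is log(4).
y_counts <- c(2, 4, 6)
c_hat_pois <- default_c_hat(y_counts, "poisson")

# Binomial data with 4 trials per observation: p is 0.5, so the logit is 0.
y_succ <- c(1, 2, 3)
c_hat_binom <- default_c_hat(y_succ, "binomial", num_trials = 4)

print(c(poisson = c_hat_pois, binomial = c_hat_binom))
```

For the count-based likelihoods the default places the constant GP mean at the log of the average response, and for the binomial-type likelihoods at the logit of the average success proportion.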

## Value

An object of class lgpmodel, containing the Stan input created based on parsing the specified formula, prior, and other options.

Other main functions: draw_pred(), get_draws(), lgp(), pred(), prior_pred(), sample_model()