Creates an additive Gaussian process model using
`create_model`

and fits it using `sample_model`

.
See the
Mathematical description of lgpr models
vignette for more information about the connection between different options
and the created statistical model.

```
lgp(
formula,
data,
likelihood = "gaussian",
prior = NULL,
c_hat = NULL,
num_trials = NULL,
options = NULL,
prior_only = FALSE,
verbose = FALSE,
sample_f = !(likelihood == "gaussian"),
quiet = FALSE,
skip_postproc = sample_f,
...
)
```

- formula
The model formula, where

it must contain exatly one tilde (

`~`

), with response variable on the left-hand side and model terms on the right-hand sideterms are be separated by a plus (

`+`

) signall variables appearing in

`formula`

must be found in`data`

See the "Model formula syntax" section below (

`lgp`

) for instructions on how to specify the model terms.- data
A

`data.frame`

where each column corresponds to one variable, and each row is one observation. Continuous covariates and the response variable must have type`"numeric"`

and categorical covariates must have type`"factor"`

. Missing values should be indicated with`NaN`

or`NA`

. The response variable cannot contain missing values. Column names should not contain trailing or leading underscores.- likelihood
Determines the observation model. Must be either

`"gaussian"`

(default),`"poisson"`

,`"nb"`

(negative binomial),`"binomial"`

or`"bb"`

(beta binomial).- prior
A named list, defining the prior distribution of model (hyper)parameters. See the "Defining priors" section below (

`lgp`

).- c_hat
The GP mean. This should only be given if

`sample_f`

is`TRUE`

, otherwise the GP will always have zero mean. If`sample_f`

is`TRUE`

, the given`c_hat`

can be a vector of length`dim(data)[1]`

, or a real number defining a constant GP mean. If not specified and`sample_f`

is`TRUE`

,`c_hat`

is set to`c_hat = mean(y)`

, if`likelihood`

is`"gaussian"`

,`c_hat =`

`log(mean(y))`

if`likelihood`

is`"poisson"`

or`"nb"`

,`c_hat =`

`log(p/(1-p))`

, where`p = mean(y/num_trials)`

if`likelihood`

is`"binomial"`

or`"bb"`

,

where

`y`

denotes the response variable measurements.- num_trials
This argument (number of trials) is only needed when likelihood is

`"binomial"`

or`"bb"`

. Must have length one or equal to the number of data points. Setting`num_trials=1`

and`likelihood="binomial"`

corresponds to Bernoulli observation model.- options
A named list with the following possible fields:

`delta`

Amount of added jitter to ensure positive definite covariance matrices.`vm_params`

Variance mask function parameters (numeric vector of length 2).

If

`options`

is`NULL`

, default options are used. The defaults are equivalent to`options = list(delta = 1e-8, vm_params = c(0.025, 1))`

.- prior_only
Should likelihood be ignored? See also

`sample_param_prior`

which can be used for any lgpmodel, and whose runtime is independent of the number of observations.- verbose
Can messages be printed during model creation? Has no effect if

`quiet=TRUE`

.- sample_f
Determines if the latent function values are sampled (must be

`TRUE`

if likelihood is not`"gaussian"`

). If this is`TRUE`

, the response variable will be normalized to have zero mean and unit variance.- quiet
Should all output messages be suppressed? You need to set also

`refresh=0`

if you want to suppress also the progress update messages from`sampling`

.- skip_postproc
Should all postprocessing be skipped? If this is

`TRUE`

, the returned lgpfit object will likely be much smaller (if`sample_f=FALSE`

).- ...
Optional arguments passed to

`sampling`

or`optimizing`

.

Returns an object of the S4 class lgpfit.

There are two ways to define the model formula:

Using a common

`formula`

-like syntax, like in`y ~ age +`

`age|id`

`+ sex`

. Terms can consist of a single variable, such as`age`

, or an interaction of two variables, such as`age|id`

. In single-variable terms, the variable can be either continuous (numeric) or categorical (factor), whereas in interaction terms the variable on the left-hand side of the vertical bar (`|`

) has to be continuous and the one on the right-hand side has to be categorical. Formulae specified using this syntax are translated to the advanced format so thatsingle-variable terms become

`gp(x)`

if variable`x`

is numeric and`zs(x)`

if`x`

is a factorinteraction terms

`x|z`

become`gp(x)*zs(z)`

Using the advanced syntax, like in

`y ~ gp(age) +`

`gp(age)*zs(id) +`

`het(id)*gp_vm(disAge)`

. This creates lgprhs objects, which consist of lgpterms, which consist of lgpexprs. This approach must be used if creating nonstationary, heterogeneous or temporally uncertain components.

Either one of the approaches should be used and they should not be mixed.

The `prior`

argument must be a named list, like
`list(alpha=student_t(4), wrp=igam(30,10))`

. See examples in tutorials.
Possible allowed names are

`"alpha"`

= component magnitude parameters`"ell"`

= component lengthscale parameters`"wrp"`

= input warping steepness parameters`"sigma"`

= noise magnitude (Gaussian obs. model)`"phi"`

= inv. overdispersion (negative binomial obs. model)`"gamma"`

= overdispersion (beta-binomial obs. model)`"beta"`

= heterogeneity parameters`"effect_time"`

= uncertain effect time parameters`"effect_time_info"`

= additional options for the above

See `priors`

for functions that can be
used to define the list elements. If a parameter of a model is not given
in this list, a default prior will be used for it.

It is not recommended to use default priors blindly. Rather, priors should
be specified according to the knowledge about the problem at hand, as in any
Bayesian analysis. In `lgpr`

this is especially important when

Using a non-Gaussian likelihood or otherwise setting

`sample_f = TRUE`

. In this case the response variable is not normalized, so the scale on which the data varies must be taken into account when defining priors of the signal magnitude parameters`alpha`

and possible noise parameters (`sigma`

,`phi`

,`gamma`

). Also it should be checked if`c_hat`

is set in a sensible way.Using a model that contains a

`gp_ns(x)`

or`gp_vm(x)`

expression in its formula. In this case the corresponding covariate`x`

is not normalized, and the prior for the input warping steepness parameter`wrp`

must be set according to the expected width of the window in which the nonstationary effect of`x`

occurs. By default, the width of this window is about 36, which has been set assuming that the unit of`x`

is months.

Other main functions:
`create_model()`

,
`draw_pred()`

,
`get_draws()`

,
`pred()`

,
`prior_pred()`

,
`sample_model()`