split_by_factor: splits according to the given levels of a factor
split_within_factor: splits according to the given data point indices within each level of a factor
split_within_factor_random: selects k points from each level of a factor uniformly at random as test data
split_random: splits uniformly at random
split_data: splits according to the given data rows
split_by_factor(data, test, var_name = "id")
split_within_factor(data, idx_test, var_name = "id")
split_within_factor_random(data, k_test = 1, var_name = "id")
split_random(data, p_test = 0.2, n_test = NULL)
split_data(data, i_test, sort_ids = TRUE)
data: a data frame
test: the levels of the factor that will be used as test data
var_name: name of a factor in the data
idx_test: data point indices within each level of the factor
k_test: desired number of test data points for each level of the factor
p_test: desired proportion of test data
n_test: desired number of test data points (if NULL, p_test is used to compute this)
i_test: test data row indices
sort_ids: should the test indices be sorted into increasing order?
a named list with names train, test, i_train and i_test
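The shape of this return value can be illustrated with a small base R sketch. The helper sketch_split_random below is hypothetical and is not the package's implementation. It assumes that train and test hold the corresponding rows of the data frame, that i_train and i_test are row indices, and that the test set size is obtained by rounding p_test times the number of rows when n_test is NULL (the rounding rule is an assumption).

```r
# Hypothetical helper, not the package's code: a base R sketch of a
# uniformly random train/test split that returns a list with the
# names train, test, i_train and i_test.
sketch_split_random <- function(data, p_test = 0.2, n_test = NULL) {
  n <- nrow(data)
  if (is.null(n_test)) {
    # Assumption: the test set size is the rounded proportion of rows
    n_test <- round(p_test * n)
  }
  # Draw test row indices uniformly at random, without replacement
  i_test <- sort(sample.int(n, size = n_test))
  i_train <- setdiff(seq_len(n), i_test)
  list(
    train = data[i_train, , drop = FALSE],
    test = data[i_test, , drop = FALSE],
    i_train = i_train,
    i_test = i_test
  )
}

set.seed(1)
dat <- data.frame(id = factor(rep(1:5, each = 4)), y = rnorm(20))
s <- sketch_split_random(dat, p_test = 0.2)
names(s)
nrow(s$test)
```

In this sketch the train and test indices never overlap and together cover every row of the input, which is the property one would usually want from any of the splitting functions.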
Other data frame handling functions: add_dis_age(), add_factor(), add_factor_crossing(), adjusted_c_hat(), new_x()
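As an illustration of selecting k points from each level of a factor uniformly at random, the following base R sketch may help. The helper sketch_within_factor_random is hypothetical and is not the package's implementation. It assumes that every level of the factor has at least k_test rows and returns only the chosen test row indices.

```r
# Hypothetical helper, not the package's code: pick k_test rows uniformly
# at random from each level of the factor named var_name.
sketch_within_factor_random <- function(data, k_test = 1, var_name = "id") {
  f <- data[[var_name]]
  # Row indices grouped by factor level
  rows_by_level <- split(seq_len(nrow(data)), f)
  i_test <- unlist(lapply(rows_by_level, function(rows) {
    # Index into rows so that a level with a single row is handled
    # correctly (sample(x, ...) treats a length-one numeric x as 1:x)
    rows[sample.int(length(rows), size = k_test)]
  }), use.names = FALSE)
  # Return the test row indices in increasing order
  sort(i_test)
}

set.seed(1)
dat <- data.frame(id = factor(rep(c("a", "b", "c"), times = c(3, 4, 5))))
i_test <- sketch_within_factor_random(dat, k_test = 2)
table(dat$id[i_test])
```

Each level contributes exactly k_test rows regardless of how many rows it has, so the test set stays balanced across levels even when the data are not.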