split_by_factor
splits according to the given levels of a factor
split_within_factor
splits according to given data point indices within the same level of a factor
split_within_factor_random
selects k points from each level of a factor uniformly at random as test data
split_random
splits uniformly at random
split_data
splits according to given data rows
split_by_factor(data, test, var_name = "id")
split_within_factor(data, idx_test, var_name = "id")
split_within_factor_random(data, k_test = 1, var_name = "id")
split_random(data, p_test = 0.2, n_test = NULL)
split_data(data, i_test, sort_ids = TRUE)
Argument | Description |
---|---|
data | a data frame |
test | the levels of the factor that will be used as test data |
var_name | name of a factor in the data |
idx_test | data point indices within each level of the factor |
k_test | desired number of test data points for each level of the factor |
p_test | desired proportion of test data |
n_test | desired number of test data points (if NULL, it is determined from p_test) |
i_test | test data row indices |
sort_ids | whether the test indices should be sorted into increasing order |
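To make the factor-based arguments concrete, here is a minimal base-R sketch of the two ideas described above: holding out whole levels of a factor, and picking `k_test` rows at random from each level. The helper names `split_by_factor_sketch` and `pick_k_per_level`, as well as the toy data frame, are hypothetical. They are written only for illustration and are not the package's implementation.

```r
# Hypothetical helper: hold out every row whose factor level is in `test`.
split_by_factor_sketch <- function(data, test, var_name = "id") {
  i_test <- which(data[[var_name]] %in% test)
  i_train <- setdiff(seq_len(nrow(data)), i_test)
  list(
    train = data[i_train, , drop = FALSE],
    test = data[i_test, , drop = FALSE],
    i_train = i_train,
    i_test = i_test
  )
}

# Hypothetical helper: choose k_test row indices uniformly at random
# from each (non-empty) level of the factor.
pick_k_per_level <- function(data, k_test = 1, var_name = "id") {
  rows_by_level <- split(seq_len(nrow(data)), data[[var_name]], drop = TRUE)
  i_test <- unlist(lapply(rows_by_level, function(rows) {
    # index via sample.int so a level with a single row is handled correctly
    rows[sample.int(length(rows), size = k_test)]
  }), use.names = FALSE)
  sort(i_test)
}

# Toy data (assumed for illustration): 3 individuals, 4 rows each
df <- data.frame(id = factor(rep(1:3, each = 4)), y = seq_len(12))

s <- split_by_factor_sketch(df, test = c("2", "3"))
set.seed(123)
i_k <- pick_k_per_level(df, k_test = 1)
```

In this sketch the levels are passed as character strings so that matching against the factor does not depend on type coercion.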
a named list with names train, test, i_train and i_test
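The shape of the returned list can be illustrated with a small base-R sketch of a uniformly random split. The function `split_random_sketch` and the toy data are hypothetical, and the rounding rule used to turn `p_test` into a count is an assumption, so the package's own behaviour may differ in detail.

```r
# Hypothetical sketch of a uniformly random train/test split.
split_random_sketch <- function(data, p_test = 0.2, n_test = NULL) {
  n <- nrow(data)
  if (is.null(n_test)) {
    # Assumption: derive the test set size from the proportion by rounding
    n_test <- round(p_test * n)
  }
  i_test <- sort(sample.int(n, size = n_test))
  i_train <- setdiff(seq_len(n), i_test)
  list(
    train = data[i_train, , drop = FALSE],
    test = data[i_test, , drop = FALSE],
    i_train = i_train,
    i_test = i_test
  )
}

# Toy data (assumed for illustration): 5 individuals, 4 rows each
set.seed(1)
df <- data.frame(
  id = factor(rep(1:5, each = 4)),
  age = rep(1:4, times = 5),
  y = rnorm(20)
)

s <- split_random_sketch(df, p_test = 0.2)
names(s)
```

The index vectors `i_train` and `i_test` partition the row numbers of the input, so the original data frame can be recovered from the two parts.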
Other data frame handling functions: add_dis_age(), add_factor_crossing(), add_factor(), adjusted_c_hat(), new_x()