The function foci implements the variable selection algorithm Feature Ordering by Conditional Independence (FOCI), introduced in the paper A Simple Measure of Conditional Dependence. It produces an ordering of the predictors according to their predictive power. This ordering is then used for variable selection without any assumption on the distribution of the data or on the underlying model. Because the conditional dependence coefficient is simple to estimate, the method is efficient for variable ordering and variable selection and can be used in high-dimensional settings.
In the following example, \(Y\) is a nonlinear function of only the first four columns of \(X\); the remaining 96 columns are pure noise.
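library(FOCI)
# Simulate n observations of p independent standard normal predictors
n = 2000
p = 100
X = matrix(rnorm(n * p), ncol = p)
colnames(X) = paste0("X", seq_len(p))
# Y depends only on the first four columns of X
Y = X[, 1] * X[, 2] + sin(X[, 1] * X[, 3]) + X[, 4]^2
# Run FOCI with its default stopping rule on a single core
result1 = foci(Y, X, numCores = 1)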
result1
#> $selectedVar
#>    index names
#> 1:     4    X4
#> 2:     1    X1
#> 3:     2    X2
#> 4:     3    X3
#> 
#> $stepT
#> [1] 0.3290468 0.3732953 0.6273279 0.7596504
#> 
#> attr(,"class")
#> [1] "foci"

In the previous example, the default values of the arguments stop and num_features let the function terminate according to the FOCI algorithm's stopping rule. Alternatively, the user can fix in advance how many variables should be selected. In this example, the user decides to stop the process after exactly 5 variables have been selected.
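A call of this kind might look like the following minimal sketch. It assumes that the FOCI package is installed and that the relevant arguments are named num_features and stop, as in the package documentation. The data are regenerated so that the block runs on its own; the seed and the name result2 are illustrative choices and not part of the example above.

```r
library(FOCI)

# Illustrative seed (an assumption, not used in the example above)
set.seed(1)

# Same simulated data as in the first example
n = 2000
p = 100
X = matrix(rnorm(n * p), ncol = p)
colnames(X) = paste0("X", seq_len(p))
Y = X[, 1] * X[, 2] + sin(X[, 1] * X[, 3]) + X[, 4]^2

# Turn off the automatic stopping rule and request exactly 5 variables
result2 = foci(Y, X, num_features = 5, stop = FALSE, numCores = 1)

# Selected variables, in the order in which they were picked
result2$selectedVar
```

Because the stopping rule is switched off, the selection is expected to contain five entries even if only four predictors are truly informative, so the last entry should be read with caution.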