Manual Symbolic Regression: Testing Hypotheses

Introduction

In addition to automated symbolic regression, leaf allows users to define their own candidate equations using the "manual" engine. This enables direct testing of hypotheses and incorporation of prior knowledge, while still leveraging leaf’s tools for parameter fitting, evaluation, and multi-view modeling.

Installation

Before using leaf, ensure that the Python backend is installed:

leaf::install_leaf()

Load package

library(leaf)
if (!backend_available()) {
  message("Install backend with leaf::install_leaf()")
}

Define the formula and custom equations

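User-defined equations are specified as character strings. As the examples below show, an equation can refer to the terms of the formula by name (log(A), T), by placeholder (x1, x2, x3), or by a mix of both, with the free parameters to be fitted written as u1, u2, u3. Any variable used directly in an equation must also be listed in the formula.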

model_formula <- "y ~ f(log(A), T, T**2, A | Archipelago, species)"
eqs <- c(
  "T**2*(u1 + u2*log(A) + u3*T)",
  "x3*(u1 + u2*x1 + u3*x2)",  # same as above, written with placeholders
  "exp(u1 + u2*log(T) + u3*A*x2)"  # names and placeholders can be mixed; a variable used directly (here A) must also be listed in the formula
)
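
The equivalence of the first two equations can be checked in base R, without the leaf backend. This is only a sketch: it assumes that the placeholders x1, x2, x3, x4 refer to the formula terms in the order they are listed (log(A), T, T**2, A), which is what the "same as above" comment suggests. The input and parameter values are made up for illustration, and the helper evaluate() is hypothetical, not part of leaf.

```r
# Hypothetical inputs and parameter values, chosen only for illustration.
vals <- list(A = 12.5, T = 3, u1 = 0.4, u2 = -1.2, u3 = 0.05)

# Assumed positional mapping: x1..x4 follow the term order in the formula.
formula_terms <- c(x1 = "log(A)", x2 = "T", x3 = "T**2", x4 = "A")
placeholders <- lapply(formula_terms, function(tm) eval(parse(text = tm), vals))

eq_named      <- "T**2*(u1 + u2*log(A) + u3*T)"
eq_positional <- "x3*(u1 + u2*x1 + u3*x2)"

# R parses ** as ^, so the equation strings can be evaluated directly.
evaluate <- function(eq) eval(parse(text = eq), c(vals, placeholders))

# TRUE when both spellings give the same value for these inputs.
same <- isTRUE(all.equal(evaluate(eq_named), evaluate(eq_positional)))
print(same)
```

Under this mapping the two strings describe one model, so registering both mainly serves to show the two notations side by side.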

Load the data

train_data <- leaf_data("GMD")
#> Warning in leaf_data("GMD"): Invalid data name. Run leaf_data() for a
#> full list of options.
head(train_data)
#> NULL

Register equations

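Even in manual mode, search_equations() must be called: it registers and preprocesses the user-defined equations, but no search is performed. The call below assumes that regressor is a regressor object already created with the "manual" engine.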

regressor$search_equations(
  data = train_data,
  formula = model_formula
)
#> Error in `py_call_impl()`:
#> ! TypeError: object of type 'NoneType' has no len()
#> Run `reticulate::py_last_error()` for details.

Fit parameters and inspect results

# Only one equation gets a finite loss
fit_results <- regressor$fit(data = train_data)
#> Error in `py_call_impl()`:
#> ! RuntimeError: You must run equation_search() before fitting parameters.
#> Run `reticulate::py_last_error()` for details.
pareto_front <- regressor$evaluate(metrics = c("RMSE", "PseudoR2"))
#> Error in `py_call_impl()`:
#> ! RuntimeError: You must run equation_search() before scoring.
#> Run `reticulate::py_last_error()` for details.
head(pareto_front)
#> Error:
#> ! object 'pareto_front' not found
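
To make the fitting step concrete, the sketch below fits the parameters u1, u2, u3 of the first equation to synthetic data and computes an RMSE by hand. It uses stats::nls() from base R as a stand-in; leaf's own optimizer, loss, and metric definitions may differ. The data, the "true" parameter values, and the starting values are all invented for illustration.

```r
# Synthetic data (hypothetical): the columns A and T mimic the formula variables.
set.seed(1)
n <- 40
d <- data.frame(A = runif(n, 1, 100), T = runif(n, 0.5, 3))

# Invented "true" parameters used to generate the response, plus small noise.
true_u <- c(u1 = 2, u2 = 0.5, u3 = -0.3)
d$y <- with(d, T^2 * (true_u[["u1"]] + true_u[["u2"]] * log(A) + true_u[["u3"]] * T)) +
  rnorm(n, sd = 0.05)

# Least-squares fit of the first manual equation (stand-in for leaf's fitting step).
fit <- nls(
  y ~ T^2 * (u1 + u2 * log(A) + u3 * T),
  data = d,
  start = list(u1 = 1, u2 = 1, u3 = 1)
)

u_hat <- coef(fit)                  # fitted parameter values
rmse <- sqrt(mean(resid(fit)^2))    # root mean squared error on the training data

print(round(u_hat, 2))
print(rmse)
```

Because this equation is linear in its parameters, the fit is well behaved from almost any starting point. An equation such as the third one, where the parameters sit inside exp(), is more sensitive to the starting values.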