leaf

leaf (Learning Equations for Automated Function discovery) is an R package for symbolic regression designed for ecological modelling. It discovers interpretable mathematical equations from data using a four-stage workflow that separates equation discovery from model evaluation, so that the selected models are accurate, parsimonious, and generalizable.

leaf wraps symbolic regression (SR) engines, including PySR and RSRM, and is built around Multi-view Symbolic Regression (MvSR), which searches for a single equation structure whose parameters are then fitted independently to each of several groups (e.g., species, sites, archipelagos).
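To make the idea of one shared structure with group-specific parameters concrete, here is a minimal base R sketch. It does not call leaf or any of its engines, and it is not how MvSR searches for equations. It assumes a hypothetical power-law structure y = a * x^b (the form of the classic species-area relationship) and simulated data for three made-up groups; in practice the structure would come out of the search rather than being chosen by hand.

```r
# Illustration only: base R, no leaf functions are used.
# Assumed shared structure: y = a * x^b, fitted on the log scale as
# log(y) = log(a) + b * log(x).
set.seed(1)

# Simulated data for three hypothetical groups with different true parameters
true_params <- data.frame(
  group = c("A", "B", "C"),
  a = c(2, 5, 10),
  b = c(0.20, 0.30, 0.25)
)
sim <- do.call(rbind, lapply(seq_len(nrow(true_params)), function(i) {
  x <- runif(50, min = 1, max = 100)
  noise <- exp(rnorm(50, sd = 0.05))
  data.frame(
    group = true_params$group[i],
    x = x,
    y = true_params$a[i] * x^true_params$b[i] * noise
  )
}))

# Fit the same structure to each group independently
fits <- lapply(split(sim, sim$group), function(d) {
  m <- lm(log(y) ~ log(x), data = d)
  c(a = exp(unname(coef(m)[1])), b = unname(coef(m)[2]))
})
group_params <- do.call(rbind, fits)
print(round(group_params, 2))
```

Each row of group_params holds the parameters for one group, while the functional form is identical across groups.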

Installation

You can install leaf from CRAN with:

install.packages("leaf")

After installing, you need to install the Python backend once:

library(leaf)
leaf::install_leaf()

Example

library(leaf)

# Initialize a symbolic regressor
regressor <- SymbolicRegressor$new(
  engine = "rsrm",
  loss = "MSE"
)

# Search for candidate equations
train_data <- leaf_data("eg_train")
regressor$search_equations(
  data = train_data,
  formula = "y ~ f(x1, x2)"
)

# Fit parameters and evaluate
regressor$fit(data = train_data)
regressor$evaluate(metrics = c("RMSE", "R2"), data = leaf_data("eg_test"))

# Inspect results
print(regressor$get_pareto_front())
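
The last two steps of the example weigh held-out accuracy against model complexity. The base R sketch below illustrates that trade-off only; it is not leaf's internal implementation and uses no leaf functions. The simulated data, the polynomial candidates, the choice of "number of coefficients" as the complexity measure, and the helper pareto_front() are all assumptions made for this illustration.

```r
# Illustration only: base R, no leaf functions are used.
set.seed(42)

# Simulated train/test data from a quadratic signal (made up for this sketch)
make_data <- function(n) {
  x <- runif(n, -2, 2)
  data.frame(x = x, y = 1 + 2 * x - 1.5 * x^2 + rnorm(n, sd = 0.3))
}
train <- make_data(100)
test <- make_data(100)

# Standard definitions of the two metrics named in the example
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
r2 <- function(obs, pred) 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

# Candidates: polynomials of degree 1 to 5, fitted on train, scored on test
candidates <- do.call(rbind, lapply(1:5, function(deg) {
  m <- lm(y ~ poly(x, deg, raw = TRUE), data = train)
  pred <- predict(m, newdata = test)
  data.frame(
    degree = deg,
    complexity = deg + 1,  # assumed complexity measure: number of coefficients
    RMSE = rmse(test$y, pred),
    R2 = r2(test$y, pred)
  )
}))

# Hypothetical helper: keep candidates not dominated in (complexity, RMSE)
pareto_front <- function(df) {
  dominated <- vapply(seq_len(nrow(df)), function(i) {
    any(df$complexity <= df$complexity[i] & df$RMSE <= df$RMSE[i] &
          (df$complexity < df$complexity[i] | df$RMSE < df$RMSE[i]))
  }, logical(1))
  df[!dominated, ]
}

print(pareto_front(candidates))
```

A candidate stays on the front only if no other candidate is at least as simple and at least as accurate, with a strict improvement in one of the two.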

Core features

Documentation

See the package vignettes for detailed usage.

Citation

A citation will be available once the accompanying paper is published.

Acknowledgements

This work was developed at INESC-ID and supported by national funds through FCT - Fundação para a Ciência e a Tecnologia, under project [REF].