README

idionomics is an R toolkit for idionomic science, a research philosophy that places the unit of the ensemble (individual, couple, or group) at the center of analysis. Rather than assuming a common distribution and a sufficiently similar process for every unit, and then fitting a single model to the whole ensemble, idionomic methods model each unit separately and aggregate upward only if sensible. The group-level picture emerges from individual results, not the other way around, and the methods explicitly evaluate whether aggregation is reasonable given the measured heterogeneity of effects.

The package is built around intensive longitudinal data where each participant contributes a time series. It provides a pipeline from preprocessing through modeling to group-level summaries.

The idionomic science principle

Classical longitudinal methods (e.g., multilevel models) estimate one set of parameters that is shared, or partially shared, across all units of an ensemble. If the focus of interest is the trajectory of individuals, this is only sensible under hard-to-meet assumptions, such as exchangeability and/or ergodicity. If these assumptions are not met, ensemble averages may systematically obscure individual differences: a positive average effect may coexist with a substantial subset of individuals for whom the effect is negative, nonsignificant, or zero.

Modeling each unit separately preserves the individual’s data structure, produces person-specific estimates that can be reported back to participants or explored as a basis for personalized intervention, and provides honest group-level summaries that distinguish “the average effect is X” from “most people show effect X”, “the average effect is null but there are significant effects in both directions”, and similar patterns that a single pooled estimate might drastically obscure.
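The point about pooled estimates hiding opposite-signed individual effects can be illustrated with a small base R simulation. This is a hypothetical sketch that does not use the package: the sample sizes, slopes, and variable names are made up for illustration, and plain lm() stands in for any per-unit model.

```r
# Hypothetical simulation: 20 units, 50 observations each.
set.seed(1)
n_units <- 20
n_obs   <- 50
# Assumed truth: 14 units have a slope of +0.5, 6 units have a slope of -0.5.
true_slope <- c(rep(0.5, 14), rep(-0.5, 6))

sim <- do.call(rbind, lapply(seq_len(n_units), function(i) {
  x <- rnorm(n_obs)
  data.frame(id = i, x = x, y = true_slope[i] * x + rnorm(n_obs, sd = 0.5))
}))

# Pooled view: a single slope for the whole ensemble.
pooled_slope <- unname(coef(lm(y ~ x, data = sim))["x"])

# Idionomic view: one slope per unit.
unit_slopes <- sapply(split(sim, sim$id), function(d) {
  unname(coef(lm(y ~ x, data = d))["x"])
})

pooled_slope              # positive on average
table(sign(unit_slopes))  # yet a subset of units has a negative slope
```

The pooled slope is positive, but it describes neither subgroup well; the per-unit slopes make the split visible.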

Installation

# Install from a local source directory (development version)
install.packages("devtools")
devtools::install("/path/to/idionomics")

# Or install directly from GitHub:
devtools::install_github("cristobalehc/idionomics")

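Recommended analysis pipeline

The functions are meant to be run in this order: i_screener() on raw data, then pmstandardize(), then i_detrender(), then iarimax() or looping_machine(), followed by i_pval() and sden_test() on the iarimax() result. The full pipeline example later in this document runs every step.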

Function reference

i_screener() — Pre-pipeline data quality filter

i_screener(df, cols, id_var,
         min_n_subject = 20,
         min_sd       = NULL,
         max_mode_pct = NULL,
         filter_type  = "joint",
         mode         = "filter",
         verbose      = FALSE)

Screens subjects for data quality on raw (unstandardized) data before it enters the pipeline. After pmstandardize(), all non-constant series have within-person variance = 1 by construction, making iarimax()’s minvar filter ineffective. Running i_screener() on raw data catches low-quality subjects at the right stage.

Criterion	What it catches	Default
`min_n_subject`	Subjects with too few observations	20
`min_sd`	Near-constant series (floor/ceiling, low range)	`NULL` (off)
`max_mode_pct`	“Stuck” responders (e.g. ≥ 80 % of responses identical)	`NULL` (off)

filter_type = "joint" (default) excludes a subject if they fail any criterion on any variable — consistent with iarimax()’s AND filter. filter_type = "per_column" evaluates each variable independently.

mode controls the output format:

- "filter": returns the dataframe with failing subjects removed (joint) or their failing column values set to NA (per_column).
- "flag": appends a logical pass_overall column (joint) or <col>_pass columns (per_column).
- "report": returns a per-subject quality summary table.
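To make the three criteria concrete, here is a minimal base R sketch of the per-subject statistics they refer to. The helper screen_stats() and the toy data are hypothetical and are not part of idionomics; how the package handles missing values and threshold boundaries may differ from what is assumed here.

```r
# Hypothetical helper (not part of idionomics): per-subject quality
# statistics for one column of a long-format data frame.
screen_stats <- function(df, col, id_var) {
  do.call(rbind, lapply(split(df, df[[id_var]]), function(d) {
    x <- d[[col]][!is.na(d[[col]])]
    data.frame(
      id       = d[[id_var]][1],
      n        = length(x),
      sd       = if (length(x) > 1) sd(x) else NA_real_,
      # Share of responses equal to the most frequent value
      mode_pct = if (length(x) > 0) max(table(x)) / length(x) else NA_real_,
      stringsAsFactors = FALSE
    )
  }))
}

toy <- data.frame(
  id = rep(c("ok", "flat", "short"), times = c(30, 30, 5)),
  x  = c(rep(1:5, 6), rep(3, 30), 1:5),
  stringsAsFactors = FALSE
)

stats <- screen_stats(toy, col = "x", id_var = "id")
# Assumed thresholds, mirroring min_n_subject = 20, min_sd = 0.5,
# max_mode_pct = 0.80; a subject must pass all three.
stats$pass <- stats$n >= 20 & stats$sd >= 0.5 & stats$mode_pct <= 0.80
stats
```

In this toy data, "flat" fails on SD and mode percentage, "short" fails on the number of observations, and "ok" passes all three checks.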

library(idionomics)

set.seed(42)
panel <- do.call(rbind, lapply(1:9, function(id) {
  a <- rnorm(50)
  b <- 0.4 * a + rnorm(50)
  c <- 0.4 * b + rnorm(50)
  data.frame(id = as.character(id), time = seq_len(50),
             a = a, b = b, c = c,
             y = 0.5 * a + rnorm(50),
             stringsAsFactors = FALSE)
}))

# Subject 10: constant "a" (SD = 0), will be caught by i_screener(min_sd = 0.5)
s10 <- data.frame(id = "10", time = seq_len(50),
                  a = rep(3, 50), b = rnorm(50), c = rnorm(50),
                  y = rnorm(50), stringsAsFactors = FALSE)

# Subject 11: full positive loop (a -> b -> c -> a)
a11 <- rnorm(50)
b11 <- 0.6 * a11 + rnorm(50, sd = 0.5)
c11 <- 0.6 * b11 + rnorm(50, sd = 0.5)
s11 <- data.frame(id = "11", time = seq_len(50),
                  a = a11 + 0.4 * c11, b = b11, c = c11,
                  y = 0.5 * a11 + rnorm(50), stringsAsFactors = FALSE)

# Subject 12: negative a -> y effect
a12 <- rnorm(50)
s12 <- data.frame(id = "12", time = seq_len(50),
                  a = a12, b = 0.4 * a12 + rnorm(50),
                  c = rnorm(50), y = -0.5 * a12 + rnorm(50),
                  stringsAsFactors = FALSE)

panel <- rbind(panel, s10, s11, s12)

# Remove subjects with too few obs or low raw variance, before standardizing
panel_clean <- i_screener(panel, cols = c("a", "b", "c", "y"), id_var = "id",
                        min_n_subject = 20, min_sd = 0.5, verbose = TRUE)

# Inspect quality without committing to removal
report <- i_screener(panel, cols = c("a", "b", "c", "y"), id_var = "id",
                   min_n_subject = 20, min_sd = 0.5, max_mode_pct = 0.80,
                   mode = "report", verbose = TRUE)
print(report)

# Flag subjects for inspection, then decide
flagged <- i_screener(panel, cols = c("a", "b", "c", "y"), id_var = "id",
                    min_sd = 0.5, mode = "flag")
table(flagged$pass_overall)

pmstandardize() — Within-person z-scoring

pmstandardize(df, cols, id_var, verbose = FALSE, append = TRUE)

Computes (x - person_mean) / person_sd for each person × column combination. Output columns are named <col>_psd.
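The formula can be reproduced by hand in base R, which is a useful check on what the _psd columns mean. This is a sketch on made-up data, not the package's implementation; it ignores missing values and constant series, which pmstandardize() may treat specially.

```r
# Toy data: two persons measured on very different raw scales.
toy <- data.frame(
  id = rep(c("p1", "p2"), each = 4),
  x  = c(1, 2, 3, 4, 10, 20, 30, 40)
)

# ave() applies the function within each level of id and returns a vector
# aligned with the original rows.
toy$x_psd <- ave(toy$x, toy$id, FUN = function(v) (v - mean(v)) / sd(v))
toy

# Within each person, the standardized column has mean 0 and SD 1.
tapply(toy$x_psd, toy$id, mean)
tapply(toy$x_psd, toy$id, sd)
```

Because the second person's values are ten times the first person's, both end up with identical z-scores: within-person standardization removes between-person differences in level and scale.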

# Standardize all four variables within each person
panel_std <- pmstandardize(panel_clean, cols = c("a", "b", "c", "y"), id_var = "id",
                          verbose = TRUE)
head(panel_std)

i_detrender() — Within-person linear detrending

i_detrender(df, cols, id_var, timevar,
            min_n_subject = 20, minvar = 0.01,
            verbose = FALSE, append = TRUE)

Fits lm(col ~ time) within each subject and appends the column with the residuals (<col>_dt). Subjects with too few observations, insufficient pre-detrend variance, or near-zero post-detrend variance receive NA — independently for each column.
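The core operation, residuals of a per-subject linear regression on time, can be sketched in base R as follows. The data are simulated for illustration, and the sketch omits the min_n_subject and minvar checks that the function applies.

```r
set.seed(2)
toy <- data.frame(
  id   = rep(c("p1", "p2"), each = 30),
  time = rep(seq_len(30), times = 2),
  stringsAsFactors = FALSE
)
# p1 drifts upward over time, p2 drifts downward.
toy$x <- ifelse(toy$id == "p1", 0.1, -0.1) * toy$time + rnorm(60)

# Fit lm(x ~ time) within each subject and keep the residuals.
toy$x_dt <- NA_real_
for (s in unique(toy$id)) {
  rows <- toy$id == s
  toy$x_dt[rows] <- residuals(lm(x ~ time, data = toy[rows, ]))
}

# After detrending, the within-subject correlation with time is
# numerically zero.
trend_left <- sapply(split(toy, toy$id), function(d) cor(d$x_dt, d$time))
trend_left
```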

panel_dt <- i_detrender(panel_std, cols = c("a_psd", "b_psd", "c_psd", "y_psd"),
                        id_var = "id", timevar = "time", verbose = TRUE)
head(panel_dt)

iarimax() — Core I-ARIMAX algorithm

iarimax(dataframe, min_n_subject = 20, minvar = 0.01,
        y_series, x_series, focal_predictor = NULL,
        id_var, timevar, fixed_d = NULL,
        correlation_method = "pearson",
        keep_models = FALSE, verbose = FALSE)

Fits one forecast::auto.arima() model per subject, extracts coefficients via broom::tidy(), and pools the focal predictor’s coefficients with metafor::rma(). The fixed_d argument optionally fixes the differencing order across all subjects to ensure coefficients are on the same scale (e.g., fixed_d = 0 for levels, fixed_d = 1 for changes); AR and MA orders are always selected automatically per subject.
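The per-subject logic can be approximated with base R alone. The sketch below is an illustration under stated assumptions, not the package's code: it uses stats::arima() with a fixed (1, 0, 0) order instead of forecast::auto.arima(), and it pools with a simple fixed-effect inverse-variance weighted mean instead of the random-effects model fitted by metafor::rma(). The simulated data and column names are made up.

```r
set.seed(3)
toy <- do.call(rbind, lapply(1:5, function(i) {
  x <- rnorm(80)
  # AR(1) noise, so there is autocorrelation for the ARIMA part to absorb.
  e <- as.numeric(arima.sim(list(ar = 0.3), n = 80))
  data.frame(id = i, time = seq_len(80), x = x, y = 0.5 * x + e)
}))

# One ARIMAX fit per subject. The order is fixed here purely for
# illustration; the package selects AR and MA orders automatically.
per_subject <- do.call(rbind, lapply(split(toy, toy$id), function(d) {
  fit <- arima(d$y, order = c(1, 0, 0), xreg = cbind(x = d$x))
  data.frame(id       = d$id[1],
             estimate = unname(coef(fit)["x"]),
             se       = unname(sqrt(diag(vcov(fit)))["x"]))
}))
per_subject

# Stand-in for the pooling step: fixed-effect inverse-variance weighting.
# This is NOT the random-effects meta-analysis the package runs.
w <- 1 / per_subject$se^2
pooled <- sum(w * per_subject$estimate) / sum(w)
pooled
```

Each row of per_subject plays the role of one subject's focal coefficient and standard error, which is the information a meta-analytic pooling step needs.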

result <- iarimax(panel_dt,
                  y_series  = "y_psd_dt",
                  x_series  = "a_psd_dt",
                  id_var    = "id",
                  timevar   = "time",
                  verbose   = TRUE)

summary(result)   # prints subject counts, direction/significance counts, REMA
plot(result)      # caterpillar plot with REMA overlay

What the return value contains

Field	Description
`$results_df`	Per-subject ARIMA orders, estimates, SEs, `n_valid`, `n_params`, `raw_cor`
`$meta_analysis`	`metafor::rma` object (or `NULL` if rma failed)
`$case_number_detail`	Subject counts: original, filtered, ARIMA-failed, analyzed
`$models`	Raw `Arima` objects (only if `keep_models = TRUE`)

i_pval() — Per-subject p-values

i_pval(iarimax_object, feature = NULL)

Attaches a pval_<feature> column to results_df using the two-tailed t-distribution with ML-based degrees of freedom (n_valid - n_params).
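The calculation described above amounts to a standard two-tailed t-test on each coefficient. A minimal sketch with made-up numbers follows; the estimate and se column names are simplified placeholders, while n_valid and n_params follow the results_df description.

```r
# Hypothetical per-subject results.
res <- data.frame(
  id       = c("s1", "s2", "s3"),
  estimate = c(0.50, -0.10, 0.30),
  se       = c(0.10,  0.12, 0.15),
  n_valid  = c(50, 48, 50),
  n_params = c(3, 2, 4)
)

t_stat <- res$estimate / res$se
df     <- res$n_valid - res$n_params

# Two-tailed p-value from the t distribution with n_valid - n_params
# degrees of freedom.
res$pval <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)
res
```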

result_pval <- i_pval(result)
result_pval$results_df[, c("id", "estimate_a_psd_dt", "pval_a_psd_dt")]

sden_test() — Sign Divergence / Equisyncratic Null test

sden_test(iarimax_object, alpha_arimax = 0.05, alpha_binom = NULL,
          test = "auto", feature = NULL)

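A binomial test on the count of individually significant effects. Two test variants are available: the Sign Divergence test and the Equisyncratic Null test (ENT). With the default test = "auto", the variant is chosen based on the random-effects meta-analysis (REMA) result; passing test = "ENT" forces the ENT variant.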

sden <- sden_test(result)
summary(sden)

# Force ENT regardless of REMA
sden_ent <- sden_test(result, test = "ENT")
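The general idea of a binomial test on a count of significant effects can be sketched with base R's binom.test(). Everything specific here is an assumption made for illustration: the p-values are invented, the null probability is set to alpha_arimax, and the alternative is one-sided. The two variants in sden_test() may count different subsets of effects and use different null probabilities.

```r
# Hypothetical per-subject p-values (for example, as produced by i_pval()).
pvals <- c(0.001, 0.020, 0.300, 0.040, 0.700,
           0.004, 0.150, 0.030, 0.600, 0.010)
alpha_arimax <- 0.05

n_sig <- sum(pvals < alpha_arimax)  # individually significant effects
n_tot <- length(pvals)

# Assumption: under the null, each subject is significant with probability
# alpha_arimax, and the alternative is "more often than that".
bt <- binom.test(n_sig, n_tot, p = alpha_arimax, alternative = "greater")
bt$p.value
```

A small p-value here says that more subjects show individually significant effects than false positives alone would be expected to produce.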

looping_machine() — Directed loop detection

looping_machine(dataframe, a_series, b_series, c_series, id_var, timevar,
                covariates = NULL, include_third_as_covariate = FALSE,
                min_n_subject = 20, minvar = 0.01, fixed_d = NULL,
                correlation_method = "pearson",
                alpha = 0.05, keep_models = FALSE, verbose = FALSE)

Fits three I-ARIMAX legs (a→b, b→c, c→a), applies i_pval() to each, and computes Loop_positive_directed: a 0/1 indicator that is 1 only when all three focal betas are positive and significant at alpha.
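The indicator rule can be written out directly. The sketch below uses invented betas and p-values with hypothetical column names; only the rule itself (all three betas positive and significant at alpha) comes from the description above.

```r
# Hypothetical per-subject focal betas and p-values for the three legs.
legs <- data.frame(
  id      = c("s1", "s2", "s3"),
  beta_ab = c(0.40, 0.35,  0.30), p_ab = c(0.001, 0.010, 0.020),
  beta_bc = c(0.30, 0.25,  0.20), p_bc = c(0.004, 0.200, 0.030),
  beta_ca = c(0.50, 0.45, -0.30), p_ca = c(0.002, 0.010, 0.010)
)
alpha <- 0.05

# 1 only when every leg is both positive and significant at alpha.
legs$Loop_positive_directed <- as.integer(
  legs$beta_ab > 0 & legs$p_ab < alpha &
  legs$beta_bc > 0 & legs$p_bc < alpha &
  legs$beta_ca > 0 & legs$p_ca < alpha
)
legs[, c("id", "Loop_positive_directed")]
```

Subject s2 fails because one leg is not significant, and s3 fails because one leg is significant but negative, so only s1 is flagged.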

loop_result <- looping_machine(panel_dt,
                               a_series = "a_psd_dt", b_series = "b_psd_dt",
                               c_series = "c_psd_dt",
                               id_var = "id", timevar = "time",
                               verbose = TRUE)

# Proportion of subjects with detected positive directed loop
mean(loop_result$loop_df$Loop_positive_directed, na.rm = TRUE)

# Per-leg I-ARIMAX results are also returned
summary(loop_result$iarimax_a_to_b)

Full pipeline example

library(idionomics)

set.seed(42)
panel <- do.call(rbind, lapply(1:9, function(id) {
  a <- rnorm(50)
  b <- 0.4 * a + rnorm(50)
  c <- 0.4 * b + rnorm(50)
  data.frame(
    id   = as.character(id),
    time = seq_len(50),
    a = a, b = b, c = c,
    y = 0.5 * a + rnorm(50),
    stringsAsFactors = FALSE
  )
}))

# Manually created subjects.
s10 <- data.frame(id = "10", time = seq_len(50),
                  a = rep(3, 50), b = rnorm(50), c = rnorm(50),
                  y = rnorm(50), stringsAsFactors = FALSE)
a11 <- rnorm(50)
b11 <- 0.6 * a11 + rnorm(50, sd = 0.5)
c11 <- 0.6 * b11 + rnorm(50, sd = 0.5)
s11 <- data.frame(id = "11", time = seq_len(50),
                  a = a11 + 0.4 * c11, b = b11, c = c11,
                  y = 0.5 * a11 + rnorm(50), stringsAsFactors = FALSE)
a12 <- rnorm(50)
s12 <- data.frame(id = "12", time = seq_len(50),
                  a = a12, b = 0.4 * a12 + rnorm(50),
                  c = rnorm(50), y = -0.5 * a12 + rnorm(50),
                  stringsAsFactors = FALSE)
panel <- rbind(panel, s10, s11, s12)

# Step 1: Quality screening on raw data (before standardization)
panel_clean <- i_screener(panel, cols = c("a", "b", "c", "y"), id_var = "id",
                        min_n_subject = 20, min_sd = 0.5, max_mode_pct = 0.80,
                        verbose = TRUE)

# Step 2: Within-person standardization
panel_std <- pmstandardize(panel_clean, cols = c("a", "b", "c", "y"), id_var = "id",
                          verbose = TRUE)

# Step 3: Linear detrending
panel_dt <- i_detrender(panel_std, cols = c("a_psd", "b_psd", "c_psd", "y_psd"),
                        id_var = "id", timevar = "time", verbose = TRUE)

# Step 4a: I-ARIMAX (single predictor)
result <- iarimax(panel_dt,
                  y_series = "y_psd_dt", x_series = "a_psd_dt",
                  id_var = "id", timevar = "time", verbose = TRUE)

summary(result)
plot(result)

# Step 4b: Directed loop detection
loop_result <- looping_machine(panel_dt,
                               a_series = "a_psd_dt", b_series = "b_psd_dt",
                               c_series = "c_psd_dt",
                               id_var = "id", timevar = "time",
                               verbose = TRUE)

mean(loop_result$loop_df$Loop_positive_directed, na.rm = TRUE)

# Step 5: Per-subject p-values
result_pval <- i_pval(result)

# Step 6: SDEN test
sden <- sden_test(result_pval)
summary(sden)

Key dependencies

Statistical software can produce results that are technically valid but analytically inappropriate for a given context. Users are encouraged to review the methods and code, inspect their data, and exercise independent statistical judgment before reporting findings.

Package	Purpose
`forecast`	`auto.arima()` — per-subject ARIMA model selection
`metafor`	`rma()` — random-effects meta-analysis
`broom`	`tidy()` — coefficient extraction from Arima objects
`ggplot2`, `forcats`	Caterpillar plot
`dplyr`, `tidyr`, `tibble`, `rlang`	Data manipulation
