This vignette explains how to compute performance measures for publication bias correction methods across all benchmark conditions. This is the final step after computing method results (see Computing Method Results) and allows you to evaluate and compare method performance systematically.
If you are contributing a new method to the package, the package maintainers will compute and update the precomputed measures upon your submission. This vignette is primarily for advanced users who want to compute custom measures or update measures for their own analyses.
After computing and storing method results for all DGMs, you need to:

1. Compute the standard performance measures for each DGM.
2. Compute the replacement performance measures, which substitute a fallback method's results wherever your method fails to converge.

This process creates the performance summaries that allow systematic comparison of methods across conditions.
Before computing measures, ensure that the method results have been computed and stored for all DGMs and conditions (see Computing Method Results).
The package computes various performance measures defined in the measures()
documentation. Each measure is computed separately for each
method-setting-condition combination.
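As an illustration only (the exact definitions used by the benchmark are those in the measures() documentation), here is a minimal R sketch of how a few of these measures are conventionally computed from the per-repetition estimates and the condition's true effect:

# Illustration only -- the package computes these measures internally.
# `estimates` is a vector of per-repetition effect size estimates,
# `ci_lower`/`ci_upper` are the corresponding interval bounds, and
# `true_effect` is the condition's true mean effect.
bias         <- function(estimates, true_effect) mean(estimates - true_effect)
mse          <- function(estimates, true_effect) mean((estimates - true_effect)^2)
rmse         <- function(estimates, true_effect) sqrt(mse(estimates, true_effect))
empirical_se <- function(estimates) sd(estimates)
coverage     <- function(ci_lower, ci_upper, true_effect)
  mean(ci_lower <= true_effect & true_effect <= ci_upper)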
Some methods may fail to converge for certain datasets. The method replacement strategy handles these cases by substituting the results of a fallback method for the repetitions where the primary method did not converge. For example, if RMA (random-effects meta-analysis) fails to converge, it might be replaced with results from the simpler FMA (fixed-effect meta-analysis) for those specific cases. Method replacement mimics an analyst who, faced with a method failure, would switch to a different method. See this article for a detailed description of different strategies for handling non-convergence in simulation studies.
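Conceptually, the replacement happens per repetition: wherever the primary method did not converge, its result is swapped for the fallback method's result before the measures are aggregated. The sketch below only illustrates this idea (it is not the package's internal implementation) and assumes two hypothetical data frames, primary and fallback, with one row per repetition and a logical convergence column:

# Illustration of the replacement idea, not the package's implementation:
# use the fallback method's results for repetitions where the primary
# method failed to converge.
replace_nonconverged <- function(primary, fallback) {
  failed <- !primary$convergence
  primary[failed, ] <- fallback[failed, ]
  primary
}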
Specify which DGMs to process and which methods to compute measures for:
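A minimal sketch of this specification; the DGM names below are placeholders and "myNewMethod" stands in for your own method, so adjust them to the DGMs and methods you actually want to process:

# Placeholder DGM names -- use the benchmark's actual DGM names
dgm_names <- c("dgm_1", "dgm_2")

# Methods and settings to compute measures for ("myNewMethod" is a
# stand-in for your own method)
methods_settings <- data.frame(
  method          = "myNewMethod",
  method_setting  = "default",
  power_test_type = "p_value"
)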
Process each DGM to compute both standard and replacement performance measures:
for (dgm_name in dgm_names) {

  # Download precomputed results for the existing methods (needed for the replacements)
  download_dgm_results(dgm_name)

  ### Simple performance metrics ----
  # Compute primary measures (not dependent on CI or power)
  compute_measures(
    dgm_name = dgm_name,
    method = methods_settings$method,
    method_setting = methods_settings$method_setting,
    power_test_type = methods_settings$power_test_type,
    measures = c("bias", "relative_bias", "mse", "rmse",
                 "empirical_variance", "empirical_se", "convergence"),
    verbose = TRUE,
    estimate_col = "estimate",
    true_effect_col = "mean_effect",
    ci_lower_col = "ci_lower",
    ci_upper_col = "ci_upper",
    p_value_col = "p_value",
    bf_col = "BF",
    convergence_col = "convergence",
    n_repetitions = 1000,
    overwrite = FALSE
  )

  # If your method does not return a CI or a hypothesis test, skip these measures
  compute_measures(
    dgm_name = dgm_name,
    method = methods_settings$method,
    method_setting = methods_settings$method_setting,
    power_test_type = methods_settings$power_test_type,
    measures = c("power", "coverage", "mean_ci_width", "interval_score",
                 "negative_likelihood_ratio", "positive_likelihood_ratio"),
    verbose = TRUE,
    estimate_col = "estimate",
    true_effect_col = "mean_effect",
    ci_lower_col = "ci_lower",
    ci_upper_col = "ci_upper",
    p_value_col = "p_value",
    bf_col = "BF",
    convergence_col = "convergence",
    n_repetitions = 1000,
    overwrite = FALSE
  )

  ### Replacement performance metrics ----
  # Specify the method replacement strategy
  # The most common one: random-effects meta-analysis -> fixed-effect meta-analysis
  RMA_replacement <- list(
    method = c("RMA", "FMA"),
    method_setting = c("default", "default"),
    power_test_type = c("p_value", "p_value")
  )
  method_replacements <- list(
    "myNewMethod-default" = RMA_replacement
  )

  compute_measures(
    dgm_name = dgm_name,
    method = methods_settings$method,
    method_setting = methods_settings$method_setting,
    power_test_type = methods_settings$power_test_type,
    method_replacements = method_replacements,
    measures = c("bias", "relative_bias", "mse", "rmse",
                 "empirical_variance", "empirical_se", "convergence"),
    verbose = TRUE,
    estimate_col = "estimate",
    true_effect_col = "mean_effect",
    ci_lower_col = "ci_lower",
    ci_upper_col = "ci_upper",
    p_value_col = "p_value",
    bf_col = "BF",
    convergence_col = "convergence",
    n_repetitions = 1000,
    overwrite = FALSE
  )

  # If your method does not return a CI or a hypothesis test, skip these measures
  compute_measures(
    dgm_name = dgm_name,
    method = methods_settings$method,
    method_setting = methods_settings$method_setting,
    power_test_type = methods_settings$power_test_type,
    method_replacements = method_replacements,
    measures = c("power", "coverage", "mean_ci_width", "interval_score",
                 "negative_likelihood_ratio", "positive_likelihood_ratio"),
    verbose = TRUE,
    estimate_col = "estimate",
    true_effect_col = "mean_effect",
    ci_lower_col = "ci_lower",
    ci_upper_col = "ci_upper",
    p_value_col = "p_value",
    bf_col = "BF",
    convergence_col = "convergence",
    n_repetitions = 1000,
    overwrite = FALSE
  )
}

The key arguments of compute_measures() are:

- method and method_setting: the methods and the setting used for each method
- power_test_type: the type of test used for the power computation ("p_value" or "bayes_factor")
- measures: which performance measures to compute (see measures() for available options)
- estimate_col: the results column containing the effect size estimate ("estimate")
- true_effect_col: the column containing the true effect ("mean_effect")
- ci_lower_col and ci_upper_col: the columns containing the confidence interval bounds ("ci_lower", "ci_upper")
- p_value_col: the column containing the p-value ("p_value")
- bf_col: the column containing the Bayes factor ("BF")
- convergence_col: the column containing the convergence indicator ("convergence")

The package maintainers will compute and update the precomputed measures when you contribute a new method.
Do you want the benchmark to include additional measures or evaluate different parameters? Open an issue or contact the benchmark maintainers; we will be happy to incorporate your suggestions!