hce package introThe package supports the simulation and analysis of univariate hierarchical composite endpoints (HCEs). The rationale for constructing a univariate endpoint, in contrast to the multivariate version of Generalized Pairwise Comparisons, which typically yields summary statistics rather than a single endpoint, is described in Gasparyan, Koch, and Brunner (Gasparyan, Koch, and Brunner 2026). The framework is broad: univariate HCEs can be derived from different outcome types, including time-to-event, continuous, and binary outcomes. However, it applies only to fixed follow-up designs without early dropout.
In this setting, several approaches can be used to derive a patient-level ordinal outcome, which is the essence of a univariate HCE. Gasparyan, Koch, and Brunner (Gasparyan, Koch, and Brunner 2026) derived the distribution for the setting with multiple time-to-event outcomes and a single continuous (or ordinal) outcome assessed at the end of follow-up; in that construction, the event times also contribute to the HCE. Accordingly, our definition of an HCE focuses on this setting. Although the theoretical distribution is not required for estimation, understanding the distribution remains important for interpreting summary statistics in greater detail and for avoiding the Condorcet non-transitivity paradox (De Condorcet et al. 2014).
These endpoints have been implemented in clinical trials across multiple therapeutic areas. For example, see Gasparyan et al. (2022) for an implementation in a COVID-19 setting and for practical considerations in constructing hierarchical composite endpoints. For chronic kidney disease (CKD) applications, see (Little et al. 2023; Heerspink et al. 2023; Kondo et al. 2024).
The primary analysis method is win odds. Other win statistics, including win ratio and net benefit (Dong et al. 2023), are also implemented when there is no censoring. Win odds relies on the DeLong-DeLong-Clarke-Pearson formula (DeLong, DeLong, and Clarke-Pearson 1988) for the variance of the win proportion and is based on the Brunner-Munzel test (Brunner and Munzel 2000). We also include the Brunner-Konietschke version (Brunner and Konietschke 2025) of the Bamber estimator (Bamber 1975) for the variance of the win proportion. In addition, the package provides Wilson-type, range-preserving confidence intervals for the win proportion and win odds, following Schüürhuis, Konietschke, and Brunner (Schüürhuis, Konietschke, and Brunner 2025).
The power and sample size formulas cover several classes of
alternatives: "shifted", "ordered", and
"max" for the maximum value of the standard deviation. All
formulas are derived from Bamber (Bamber 1975). By default,
Noether’s formula (Noether 1987) is used for
shifted distributions. Additional discussion of power calculations for
shifted distributions is available in (Gasparyan, Kowalewski, et al. 2021;
Gasparyan, Kowalewski, and Koch
2022).
For reviews of HCE design in clinical trials, see (Gasparyan et al. 2022), (Gasparyan et al. 2023), and (Little et al. 2023). The Basic Data Structure (BDS), which follows Analysis Data Model (ADaM) (CDISC 2009) principles for hierarchical composite endpoints, is described in Gasparyan et al. (2024). For visualization, the maraca plot (Karpefors, Lindholm, and Gasparyan 2023) can be used.
Stratified and adjusted win odds are calculated using randomization-based covariate adjustment theory (Koch et al. 1998); for a review, see (Gasparyan, Folkvaljon, et al. 2021).
Thresholds for pairwise comparisons involving the continuous component of the HCE are implemented according to the theory in Gasparyan, Koch, and Brunner (Gasparyan, Koch, and Brunner 2026).
All implementations are rank-based. For a methodological review, see (Brunner, Bathke, and Konietschke 2018).
Load the package hce and check the version
For citing the package, run citation("hce") (Gasparyan 2026).
List the functions and the datasets in the package
ls("package:hce")
#> [1] "ADET" "ADLB" "ADSL" "COVID19" "COVID19b"
#> [6] "COVID19plus" "HCE1" "HCE2" "HCE3" "HCE4"
#> [11] "HGLL" "IWP" "KHCE" "as_formulae" "as_hce"
#> [16] "calcWINS" "calcWO" "dGLL" "deltaWO" "hGLL"
#> [21] "hce" "minWO" "pGLL" "powerWO" "propWINS"
#> [26] "qGLL" "rGLL" "regWO" "rweibullGF" "simHCE"
#> [31] "simKHCE" "simORD" "simTTE" "sizeWO" "sizeWR"
#> [36] "stratWO" "summaryWO"The package contains the following Datasets: use
data(package = "hce") for the list of all datasets included
in the package.
Simulated datasets: HCE1 - HCE4
contain two treatment groups and analysis values AVAL for a
hierarchical composite endpoint.
COVID-19 datasets: COVID19,
COVID19b, and COVID19plus contain COVID-19
ordinal scale outcomes (Beigel et al. 2020; Kalil et al. 2021).
Synthetic kidney disease datasets:
ADET contains event-time data, ADLB contains
laboratory data, and ADSL contains subject-level baseline
characteristics. Based on these data, KHCE provides the
derived kidney hierarchical composite endpoint for the same patients
(Heerspink et al.
2023).
The main functions for calculating the so-called win
statistics—win probability, win odds,
win ratio, and net benefit—are
calcWO() for win odds and calcWINS() for all
win statistics, including win odds. To summarize wins, losses, and ties,
use summaryWO(), which also reports the win probability,
win odds, and win ratio, together with the standard errors of the win
probability and win odds. By default, all of these functions use the
DeLong-DeLong-Clarke-Pearson formula (DeLong, DeLong, and
Clarke-Pearson 1988) to estimate the standard error of the
win probability and win odds.
In calcWINS(), the SE_WP_Type argument can
be set to unbiased to use the
Brunner-Konietschke version (Brunner and Konietschke 2025)
of the Bamber estimator (Bamber 1975) for the variance of
the win proportion, or to biased to use the
DeLong-DeLong-Clarke-Pearson variance estimator, which
is the same method used in calcWO() and may be biased in
small samples. In the same function, the CI_WP_Type
argument can be set to Wilson to obtain
Wilson-type, range-preserving confidence intervals for
the win proportion and win odds, following Schüürhuis, Konietschke, and
Brunner (Schüürhuis, Konietschke, and Brunner
2025). The function calcWINS() also implements
Goodman-Kruskal’s gamma (Goodman and Kruskal 1954, 1963). All reported
statistics include confidence intervals.
The functions regWO() and stratWO() compute
adjusted and stratified win odds, respectively, based on the
randomization-based covariate adjustment theory
developed in Koch et al. (Koch et al. 1998); for a review,
see Gasparyan et al. (Gasparyan, Folkvaljon, et al.
2021).
In principle, all of these functions operate on ordinal outcomes when
each patient contributes a single analysis value. Therefore, the input
data should be at the patient level. Any patient-level dataset with a
numeric analysis value can be used to calculate these statistics. In
this setting, the use of win statistics is equivalent to ordinal
analysis. Specifically, win odds corresponds to the classical
Mann-Whitney odds, win ratio is the number of
concordances divided by the number of discordances and can be derived
from Goodman-Kruskal’s gamma, and net benefit
corresponds to Somers’ D C/R. Therefore, any software
that computes these statistics should produce the same results. For
example, although the implementation here is fully rank-based, the
estimate of Goodman-Kruskal’s gamma and its confidence
interval agree with DescTools::GoodmanKruskalGamma() (Signorell 2025).
All calculations are rank-based and therefore computationally
efficient. The basic building blocks are the individual win
proportions, defined for each patient as the proportion of wins
plus one-half of the proportion of ties. The function IWP()
computes these individual win proportions for each patient in the
dataset and returns an updated dataset with a new column. Any
win-odds-related calculation can be based on these quantities (Gasparyan,
Folkvaljon, et al. 2021).
The idea behind univariate hierarchical composite endpoints (Gasparyan, Koch, and Brunner 2026) is to derive a single ordinal outcome from multiple outcomes of different types, such as time-to-event, continuous, and binary outcomes, by prioritizing the most clinically important outcome for each patient. The resulting ordinal outcome can then be used to calculate win statistics.
To support this, the package provides helper functions for deriving
hce objects from different types of outcomes. In
particular, hce() and as_hce() check the
structure of the input data and attempt to convert it to an
hce object. Once this conversion is complete, that is, once
the univariate HCE has been derived, the subsequent theory and analysis
are the same as for any other ordinal outcome.
The function deltaWO() calculates threshold-adjusted win
odds (Gasparyan, Koch, and Brunner
2026).
The functions powerWO(), sizeWO(), and
minWO() provide tools for calculating
power, sample size, and the
minimum detectable treatment effect for win odds under
different classes of alternatives: "shifted",
"ordered", and "max", where "max"
refers to the maximum value of the standard deviation. All formulas are
based on Bamber (Bamber 1975). For shifted
distributions, the default is Noether’s formula (Noether
1987). See also (Gasparyan, Kowalewski, et al. 2021;
Gasparyan, Kowalewski, and Koch
2022) for further discussion of shifted distributions. The
function sizeWR() provides sample size calculations for the
win ratio (Yu and Ganju 2022).
The function propWINS() is a convenient tool for
deriving the proportions of wins,
losses, and ties for each treatment
group from the win odds and win ratio.
A plot method for hce objects (created by
as_hce()) to provide the ordinal dominance graph (Bamber
1975).
The generalized log-logistic (GLL) distributions
arise as the marginal distributions of the Weibull-Gamma frailty
model. For comparison, one can use rweibullGF(),
which implements the random frailty formulation, and compare its output
with the generalized log-logistic distribution functions: density
dGLL(), distribution function pGLL(), quantile
function qGLL(), random generation rGLL(),
hazard function hGLL(), and cumulative hazard function
HGLL().
These distributions form the basis for constructing two-outcome
hierarchical composite endpoints with death and hospitalization in the
generalized illness-death model implemented in
simTTE().
To facilitate visualization and review of sample size and power
calculations, print() and plot() methods are
implemented for hce_results objects generated by the
functions powerWO(), sizeWO(), and
minWO().
The function as_formulae() is an internal helper that
decomposes formula objects into treatment, response, covariate, and
grouping variables using the standardized names TRTP,
AVAL, COVAR, and GROUP,
respectively.
hce Objectshce() Functionhce objects can be constructed using the helper function
hce(), which has the following arguments:
We see that the required arguments are GROUP, which
specifies the clinically most important event of a patient to be
included in the analysis, and TRTP, which specifies the
(planned) treatment group of a patient (exactly two treatment groups
should be present). Note that:
The hce structure assumes that only one event per
patient is present for the analysis, meaning that the resulting
hce object created by the hce() function is a
patient-level dataset. The function hce() does not select
the clinically most important event of the patient but requires it to be
already done when calling it.
The argument TRTP should have exactly two
levels.
Consider the following example of ordinal outcomes ‘I’, ‘II’, and ‘III’:
set.seed(2022)
n <- 100
dat <- hce(GROUP = rep(x = c("I", "II", "III"), each = 100),
TRTP = sample(x = c("Active", "Control"), size = n*3, replace = TRUE))
class(dat)
#> [1] "hce" "data.frame"This dataset has the appropriate structure of hce
objects, but its class inherits from an object of class
data.frame. This means that all functions available for
data frames can be applied to hce objects, for example, the
function head():
head(dat)
#> TRTP GROUP PARAMN AVAL
#> 1 Control I 1 1
#> 2 Active I 1 1
#> 3 Control I 1 1
#> 4 Active I 1 1
#> 5 Active I 1 1
#> 6 Control I 1 1We see that the dataset has a very specific structure. The column
PARAMN shows how the function hce() generated
the order of given events (it uses usual alphabetic order for the unique
values in the GROUP column to determine the clinical
importance of events):
In the class hce, higher values for the ordering mean
clinically less important events. For example, death, which is the most
important event, should always get the lowest ordinal value. If there is
a need to specify the order of outcomes, then the GROUP
argument can be provided as a factor with the levels specifying the
necessary order:
set.seed(2022)
n <- 100
GROUP = rep(x = c("I", "II", "III"), each = 100)
GROUP <- factor(GROUP, levels = c("III", "II", "I"))
dat <- hce(GROUP = GROUP,
TRTP = sample(x = c("Active", "Control"), size = n*3, replace = TRUE))
unique(dat[, c("GROUP", "PARAMN")])
#> GROUP PARAMN
#> 1 I 3
#> 101 II 2
#> 201 III 1This means that the clinically most important event is ‘III’ instead
of ‘I’. The argument AVAL0 is meant to help in cases where
we want to introduce sub-ordering within each GROUP
category. For example, if two events in the group ‘I’ can be compared
based on other parameters, then the AVAL0 argument can be
specified to take that into account.
Below we use the built-in data frame HCE1 to construct
an hce object. Before specifying the order of events, it is
a good idea to check what are the unique events included in the
GROUP column:
Therefore, we can construct the following object using the
hce() function:
HCE <- hce(GROUP = factor(HCE1$GROUP, levels = c("TTE1", "TTE2", "TTE3", "TTE4", "C")),
TRTP = HCE1$TRTP, AVAL0 = HCE1$AVAL0, PADY = 1080)
class(HCE)
#> [1] "adhce" "hce" "data.frame"
head(HCE)
#> TRTP GROUP AVAL0 PARAMN AVAL PADY GROUPN
#> 1 A C -2.21 5 5446.32 1080 5400
#> 2 P C 17.06 5 5465.59 1080 5400
#> 3 A TTE4 966.00 4 5286.00 1080 4320
#> 4 P TTE4 352.00 4 4672.00 1080 4320
#> 5 A C -12.45 5 5436.08 1080 5400
#> 6 P C 44.62 5 5493.15 1080 5400The object’s class is adhce, which inherits from
hce. This design indicates that the object includes
additional columns.
hce Object from a Data FrameConsider the dataset HCE1, which is part of the package
hce:
data(HCE1, package = "hce")
class(HCE1)
#> [1] "data.frame"
head(HCE1)
#> ID TRTP GROUP GROUPN AVALT AVAL0 AVAL PADY
#> 1 1 A C 5400 1080 -2.21 5446.32 1080
#> 2 2 P C 5400 1080 17.06 5465.59 1080
#> 3 3 A TTE4 4320 966 966.00 5286.00 1080
#> 4 4 P TTE4 4320 352 352.00 4672.00 1080
#> 5 5 A C 5400 1080 -12.45 5436.08 1080
#> 6 6 P C 5400 1080 44.62 5493.15 1080This dataset has the appropriate structure of hce
objects, but its class is data.frame. A generic way of
coercing data structures to an hce object is to use the
function as_hce(). This function performs checks (using an
internal validator function) and creates an hce object from
the given data structure (using an internal constructor function). If
coercion is not possible, it will throw an error explaining the
issue.
dat1 <- as_hce(HCE2)
str(dat1)
#> Classes 'adhce', 'hce' and 'data.frame': 1000 obs. of 9 variables:
#> $ ID : int 1 2 3 4 5 6 7 8 9 10 ...
#> $ TRTP : chr "A" "P" "A" "P" ...
#> $ GROUP : Factor w/ 5 levels "TTE1","TTE2",..: 1 5 5 1 2 3 5 2 1 5 ...
#> $ GROUPN: num 1080 5400 5400 1080 2160 3240 5400 2160 1080 5400 ...
#> $ AVALT : num 120 1080 1080 577 782 985 1080 293 313 1080 ...
#> $ AVAL0 : num 120 3.35 22.8 577 782 ...
#> $ AVAL : num 1200 5461 5480 1657 2942 ...
#> $ PADY : num 1080 1080 1080 1080 1080 1080 1080 1080 1080 1080 ...
#> $ PARAMN: num 1 5 5 1 2 3 5 2 1 5 ...hce Objects Using simHCE()To simulate values from a hierarchical composite endpoint, we use the
function simHCE(), which has the following arguments:
args("simHCE")
#> function (n, n0 = n, TTE_A, TTE_P, CM_A, CM_P, CSD_A = 1, CSD_P = CSD_A,
#> fixedfy = 1, yeardays = 360, pat = 100, shape = 1, theta = 1,
#> logC = FALSE, seed = NULL, dec = 2, all_data = FALSE)
#> NULLThe vector arguments TTE_A and TTE_P
specify the event rates per year for time-to-event outcomes in the
active and control groups, respectively. The function assumes a Weibull
survival function with the same shape parameter for simulating all
time-to-event outcomes in both treatment groups (by default
shape = 1, which assumes an exponential survival function).
These two vectors should have the same length, which indicates the
number of time-to-event outcomes.
By default, the event rates are presented per 100 patient-years
(pat = 100), which can be changed using the argument
pat. The function simulates event times in days, and the
yeardays = 360 argument can be used to change the number of
days in a year (e.g., 365 or 365.25).
The function simulates events during a fixed follow-up period
only, and the fixedfy argument can be used to change the
length of the follow-up (in years).
The function simulates the continuous outcome from a normal
(default) or log-normal (if logC = TRUE) distribution with
given means and standard deviations for two treatment groups.
Rates_A <- c(1.72, 1.74, 0.58, 1.5, 1)
Rates_P <- c(2.47, 2.24, 2.9, 4, 6)
dat3 <- simHCE(n = 2500, n0 = 1500, TTE_A = Rates_A,
TTE_P = Rates_P,
CM_A = -3, CM_P = -6,
CSD_A = 16, CSD_P = 15,
fixedfy = 3, seed = 2023)class(dat3)
#> [1] "adhce" "hce" "data.frame"
head(dat3)
#> ID TRTP GROUP GROUPN AVALT AVAL0 AVAL seed PADY PARAMN
#> 1 1 A C 6480 1080 -9.32 6526.85 2023 1080 6
#> 2 2 A C 6480 1080 0.11 6536.28 2023 1080 6
#> 3 3 A C 6480 1080 27.06 6563.23 2023 1080 6
#> 4 4 A C 6480 1080 -4.78 6531.39 2023 1080 6
#> 5 5 A TTE2 2160 390 390.00 2550.00 2023 1080 2
#> 6 6 A C 6480 1080 4.56 6540.73 2023 1080 6hce ObjectsAs we see, the function simHCE() creates an object of
type hce, which inherits from the built-in class
data.frame. We can check all implemented methods for this
new class as follows:
methods(class = "hce")
#> [1] calcWINS calcWO plot summaryWO
#> see '?methods' for accessing help and source codeThe function calcWO() calculates the win odds and its
confidence interval, while summaryWO() provides a more
detailed calculation of win odds, including the number of wins, losses,
and ties by GROUP categories.
HCE <- hce(GROUP = factor(HCE3$GROUP, levels = c("TTE1", "TTE2", "TTE3", "TTE4", "C")),
TRTP = HCE3$TRTP, AVAL0 = HCE3$AVAL0, PADY = 1080)
calcWO(HCE)
#> WO LCL UCL SE WOnull alpha Pvalue WP
#> 1 1.332829 1.15078 1.543678 0.07493202 1 0.05 0.0001014229 0.571336
#> LCL_WP UCL_WP SE_WP SD_WP N
#> 1 0.5353673 0.6073047 0.01835169 0.5803314 1000
calcWINS(HCE)
#> $summary
#> WIN LOSS TIE TOTAL Pties
#> 1 142824 107156 20 250000 8e-05
#>
#> $WP
#> WP LCL UCL Pvalue
#> 1 0.571336 0.5353673 0.6073047 0.0001014229
#>
#> $NetBenefit
#> NetBenefit LCL UCL Pvalue
#> 1 0.142672 0.0707347 0.2146093 0.0001014229
#>
#> $WO
#> WO LCL UCL Pvalue
#> 1 1.332829 1.15078 1.543678 0.0001014229
#>
#> $WR1
#> WR LCL1 UCL1 Pvalue1
#> 1 1.332861 1.150793 1.543733 0.000125974
#>
#> $WR2
#> WR LCL2 UCL2 Pvalue2
#> 1 1.332861 1.155092 1.537987 8.351682e-05
#>
#> $gamma
#> gamma LCL UCL Pvalue
#> 1 0.1426834 0.07074057 0.2146263 0.000101418
#>
#> $SE
#> WP_SE NetBenefit_SE logWR_SE1 logWR_SE2 gamma_SE
#> 1 0.01835169 0.03670338 0.07493804 0.07303552 0.03670621
#>
#> $ref
#> [1] "A vs P"
#>
#> $Input
#> alpha WOnull
#> 1 0.05 1
HCE$TRTP <- factor(HCE$TRTP, levels = c("P", "A"))
plot(HCE, fill = TRUE, col = "#865A4F", type = 'l', lwd = 2)
abline(a = 0, b = 1, lwd = 2, col = "#999999", lty = 2)To check the generic functions available for the adhce
class by running the following command:
methods(class = "adhce")
#> [1] deltaWO summaryWO
#> see '?methods' for accessing help and source codeThe main difference between summaryWO.hce() and
summaryWO.adhce() is that summaryWO.adhce()
also provides a group-wise summary, in addition to the overall
summary:
summaryWO(HCE)
#> $summary
#> TRTP WIN LOSS TIE TOTAL WR WO
#> 1 A 142824 107156 20 250000 1.3328605 1.3328294
#> 2 P 107156 142824 20 250000 0.7502661 0.7502835
#>
#> $summary_by_GROUP
#> TRTP GROUP WIN LOSS TIE TOTAL
#> 1 A TTE1 2473 39525 2 42000
#> 2 P TTE1 2985 29513 2 32500
#> 3 A TTE2 6847 22648 5 29500
#> 4 P TTE2 13857 45138 5 59000
#> 5 A TTE3 13251 16745 4 30000
#> 6 P TTE3 13356 25140 4 38500
#> 7 A TTE4 9056 6443 1 15500
#> 8 P TTE4 15383 19616 1 35000
#> 9 A C 111197 21795 8 133000
#> 10 P C 61575 23417 8 85000
#>
#> $WO
#> WO SE WP SE_WP
#> 1 1.332829 0.07493202 0.571336 0.01835169
#>
#> $cumsummary_by_GROUP
#> GROUPN WINS COUNT TOTAL PROP
#> 1 1 A wins 23417 250000 0.093668
#> 6 1 P wins 21795 250000 0.087180
#> 11 1 Ties 204788 250000 0.819152
#> 2 2 A wins 52930 250000 0.211720
#> 7 2 P wins 61320 250000 0.245280
#> 12 2 Ties 135750 250000 0.543000
#> 3 3 A wins 98068 250000 0.392272
#> 8 3 P wins 83968 250000 0.335872
#> 13 3 Ties 67964 250000 0.271856
#> 4 4 A wins 123208 250000 0.492832
#> 9 4 P wins 100713 250000 0.402852
#> 14 4 Ties 26079 250000 0.104316
#> 5 5 A wins 142824 250000 0.571296
#> 10 5 P wins 107156 250000 0.428624
#> 15 5 Ties 20 250000 0.000080An important addition to summaryWO.adhce() is the
cumsummary_by_GROUP output, which reports cumulative wins,
losses, and ties as outcomes are added sequentially according to the
priority order:
res0 <- summaryWO(HCE, ref = "P")
res <- res0$cumsummary_by_GROUP
barplot(PROP ~ WINS + GROUPN, data = res,
col = c("darkgreen", "darkred", "darkblue"),
xlab = "Proportions", xlim = c(0, 1),
ylab = "Cumulative components by prioritization",
legend.text = unique(res$WINS), beside = TRUE, horiz = TRUE)
grid()