insurancerating provides building blocks for common
actuarial pricing tasks in GLM-based tariff analysis. The package does
not prescribe a single pricing method. Instead, it supports practical
steps that often appear in insurance pricing work: portfolio analysis,
model interpretation, tariff refinement and model validation.
This vignette gives a compact overview of those building blocks and how they can be combined.
A pricing analysis often starts by checking how the observed portfolio behaves by risk factor. This is useful before modelling, but also later when reviewing whether fitted relativities are plausible.
factor_analysis() summarises exposure, claim frequency,
average severity, risk premium and related metrics by one or more risk
factors.
fa <- factor_analysis(
MTPL,
risk_factors = "zip",
claim_count = "nclaims",
claim_amount = "amount",
exposure = "exposure"
)
head(fa)
#> zip amount nclaims exposure frequency average_severity risk_premium
#> 1 1 116178669 1593 11080.6274 0.1437644 72930.74 10484.846
#> 2 2 59751985 1008 7782.6301 0.1295192 59277.76 7677.608
#> 3 3 58988962 1038 7587.5644 0.1368028 56829.44 7774.427
#> 4 0 821510 29 206.8438 0.1402024 28327.93 3971.644The output helps answer practical questions such as:
For numeric variables with long or skewed tails,
outlier_histogram() can help inspect extreme observations
before fitting severity models or constructing tariff segments.
Large claims can dominate severity and pure premium analysis. In capped severity workflows, it is often useful to assess a cap first, decompose the historical claim amounts, and then decide how the excess burden should be allocated. A low threshold increases pricing responsiveness but introduces volatility. A high threshold improves stability but may understate structural differences between segments.
thresholds <- assess_excess_threshold(
claims,
claim_amount = "claim_amount",
thresholds = c(50000, 100000, 150000),
exposure = "earned_exposure",
group = "sector"
)
autoplot(thresholds, y = "premium_impact")After choosing a threshold, calculate_excess_loss()
creates a deterministic historical decomposition. It does not bootstrap
or allocate anything.
The allocation step is where sharing and uncertainty are handled. Portfolio allocation is stable but ignores group experience. Risk-factor allocation is responsive but can be volatile. Partial allocation balances portfolio stability, group responsiveness and the credibility of observed excess experience.
allocation <- allocate_excess_loss(
excess,
excess_amount = "excess_claim_amount",
allocation_weight = "earned_exposure",
risk_factor = "sector",
allocation = "partial",
preserve_total_excess = TRUE
)
summary(allocation, compare_to_empirical = TRUE)
autoplot(allocation, y = "allocated_loading")
autoplot(allocation, y = "credibility")In the allocation output, allocated_excess_loss is the
absolute monetary burden assigned to a row.
allocated_loading is the corresponding loading per unit of
the chosen weight, such as earned exposure. This distinction matters
when the output is added back to pricing data.
The allocated loading can then be added to the pricing data.
excess$base_premium <- excess$technical_premium
priced <- apply_excess_loading(
excess,
allocation,
base_premium = "base_premium"
)This excess loading is part of the technical risk premium. It is not intended as a commercial margin.
Many tariffs use grouped versions of continuous variables such as
age, vehicle age or insured value. risk_factor_gam() can be
used to inspect the fitted shape of a continuous risk factor.
derive_tariff_segments() can then derive candidate segment
boundaries from that pattern.
age_gam <- risk_factor_gam(
data = MTPL,
claim_count = "nclaims",
risk_factor = "age_policyholder",
exposure = "exposure"
)
age_segments <- derive_tariff_segments(age_gam)
age_segments
#> Tariff segment boundaries:
#> [1] 18 25 32 39 51 58 65 84 95The derived segments can be added back to the portfolio with
add_tariff_segments().
portfolio <- MTPL |>
add_tariff_segments(age_segments, name = "age_policyholder_segment")
head(portfolio[, c("age_policyholder", "age_policyholder_segment")])
#> # A tibble: 6 × 2
#> age_policyholder age_policyholder_segment
#> <int> <fct>
#> 1 70 (65,84]
#> 2 40 (39,51]
#> 3 78 (65,84]
#> 4 49 (39,51]
#> 5 59 (58,65]
#> 6 71 (65,84]These functions are intended to support actuarial judgement, not replace it. Candidate segment boundaries should still be reviewed for credibility, stability and practical usability.
GLMs are widely used in insurance pricing because they provide an
interpretable multiplicative structure. After fitting a model,
rating_table() expresses the coefficients in tariff-table
form.
portfolio$zip <- as.factor(portfolio$zip)
freq_model <- glm(
nclaims ~ zip + age_policyholder_segment + offset(log(exposure)),
family = poisson(),
data = portfolio
)
rt <- rating_table(
freq_model,
model_data = portfolio,
exposure = "exposure"
)
head(rt$df)
#> risk_factor level est_freq_model exposure
#> 1 (Intercept) (Intercept) 0.2743790 NA
#> 2 zip 0 1.0000000 207
#> 3 zip 1 0.9944341 11081
#> 4 zip 2 0.8960053 7783
#> 5 zip 3 0.9493475 7588
#> 6 age_policyholder_segment [18,25] 1.0000000 1331Observed portfolio experience from factor_analysis() can
be attached to the rating table with
add_observed_experience(). This makes the comparison
between model relativities and observed experience explicit.
zip_experience <- factor_analysis(
portfolio,
risk_factors = "zip",
claim_count = "nclaims",
exposure = "exposure"
)
rt |>
add_observed_experience(zip_experience, metric = "frequency") |>
autoplot(risk_factors = "zip")Raw model output may be statistically valid but still unsuitable for direct tariff use. Sparse levels, noisy estimates or non-monotonic adjacent effects can make a tariff hard to explain or maintain.
The refinement workflow makes these adjustments explicit:
refined_model <- prepare_refinement(freq_model) |>
add_smoothing(
model_variable = "age_policyholder_segment",
source_variable = "age_policyholder",
weights = "exposure"
) |>
add_restriction(restrictions) |>
refit()Common refinement tasks include:
These tools are most useful when the statistical model already captures the main risk structure and the remaining work is tariff refinement.
Pricing models should be checked before their output is used in a
tariff. insurancerating contains helpers for several common
checks:
check_overdispersion() for Poisson frequency
modelscheck_residuals() for simulation-based residual
diagnostics using DHARMabootstrap_performance() for predictive stability with
metrics such as RMSErating_grid() to inspect observed rating-grid
combinationsFor example:
check_overdispersion(freq_model)
#> Dispersion ratio = 1.185
#> Pearson's Chi-squared = 35522.367
#> p-value = < 0.001
#> Overdispersion detected.Validation does not make a tariff decision by itself. It gives evidence about model fit, stability and areas that may need further review.
One possible workflow is:
factor_analysis() and
outlier_histogram().assess_excess_threshold() where capped severity or
excess-loss loadings are relevant.calculate_excess_loss() and
allocate_excess_loss().risk_factor_gam().derive_tariff_segments().rating_table().add_observed_experience().prepare_refinement(), add_smoothing(),
add_restriction() or add_relativities().The exact order and choice of functions depends on the portfolio, product, data quality and pricing objective.
For a worked example, see:
For coefficient refinement:
For validation: