With informative processes, the naive estimator of learning, the difference between post- and pre-process scores, underestimates actual learning. A heuristic account of why the naive estimator is negatively biased runs as follows: people know as much or more after exposure to an informative process as before it. And the less people know, the more items they don't know, and hence the greater the opportunity to guess.
Guessing, even when random, only increases the proportion correct. Thus, the bias due to guessing in naive measures of knowledge is always positive. On average, then, there is more positive bias in pre-process scores than in post-process scores, and subtracting pre-process scores from post-process scores yields an attenuated estimate of actual learning. For a fuller treatment of the issue, read this paper by Ken Cor and Gaurav Sood.
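The arithmetic behind the attenuation can be sketched with a toy example (hypothetical numbers, assuming purely random guessing on four-option items):

```r
# Toy example: 4-option items, so a random guess is right with probability 1/4
n_items <- 20
guess_p <- 1 / 4

# Hypothetical true knowledge: 5 items known pre-process, 12 known post-process
known_pre  <- 5
known_post <- 12
true_learning <- (known_post - known_pre) / n_items   # 0.35

# Expected observed scores: known items plus lucky guesses on unknown items
obs_pre  <- (known_pre  + guess_p * (n_items - known_pre))  / n_items  # 0.4375
obs_post <- (known_post + guess_p * (n_items - known_post)) / n_items  # 0.70

# Naive estimator: 0.2625, attenuated relative to the true 0.35
naive_learning <- obs_post - obs_pre
```

Because there are more unknown items in the pre-process wave, guessing inflates the pre-process score more, and the difference shrinks.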
We provide a few different ways to adjust estimates of learning for guessing. For now, we limit our attention to cases where the same battery of knowledge questions has been asked in both the pre- and the post-process wave, and to cases where closed-ended questions have been asked. (Guessing is not a serious issue with open-ended items; see more evidence for that in DK Means DK by Robert Luskin and John Bullock.) More generally, the package implements the methods for adjusting learning for guessing discussed in this paper.
To get the current release version from CRAN:
```r
install.packages("guess")
```
To get the current development version from GitHub:
```r
# install.packages("devtools")
library(devtools)
devtools::install_github("finite-sample/guess", build_vignettes = TRUE)
```
To learn about how to use the package, see the vignette:
```r
# Overview of the package
vignette("using_guess", package = "guess")
```
The package provides several methods for adjusting learning estimates:

- `lca_cor()`: Latent Class Analysis correction using transition matrices
- `stnd_cor()`: Standard guessing correction based on the number of incorrect responses
- `group_adj()`: Group-level adjustment accounting for the propensity to guess
- `fit_model()`: Unified model fitting function with goodness-of-fit testing

Important: The `group_adj()` function handles different groups implicitly through their gamma values (propensity to guess), rather than requiring explicit group identifiers. Groups with different guessing behaviors should use different gamma parameters:
```r
# Example: Different groups with different guessing rates
high_ability_gamma <- 0.15  # Lower guessing rate
low_ability_gamma  <- 0.35  # Higher guessing rate

# Apply adjustment with group-specific gamma
group_adj(pre_test, post_test, gamma = c(high_ability_gamma, low_ability_gamma, ...))
```
This design allows for flexible modeling where gamma can vary by item, respondent characteristics, or any other grouping structure.
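As a point of reference, the standard correction for guessing can be sketched in plain R. This is the classic formula (on k-option items, adjusted score = right minus wrong divided by k minus 1); the exact interface of `stnd_cor()` may differ, so treat this as an illustration of the idea rather than the package's implementation:

```r
# Classic correction for guessing on k-option closed-ended items.
# Each wrong answer implies (k - 1) failed guesses for every lucky one,
# so the implied number of items actually known is right - wrong / (k - 1).
correct_for_guessing <- function(right, wrong, k) {
  right - wrong / (k - 1)
}

# A respondent with 14 right and 6 wrong on four-option items:
correct_for_guessing(14, 6, k = 4)  # 12 items actually known
```

Applying this correction separately to pre- and post-process scores before differencing removes the positive guessing bias from each wave.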
Scripts are released under the MIT License.
The project welcomes contributions from everyone; in fact, it depends on them. To maintain this welcoming atmosphere, and to collaborate in a fun and productive way, we expect contributors to the project to abide by the Contributor Code of Conduct.