ankiR provides a tidy interface for reading Anki flashcard databases in R. This vignette shows common workflows for analyzing your Anki learning data.
ankiR can automatically detect your Anki installation:
library(ankiR)
# Auto-detect (uses first profile found)
col <- anki_collection()
# Specify a profile
col <- anki_collection(profile = "User 1")
# Or provide a path directly
col <- anki_collection(path = "/path/to/collection.anki2")
The collection object provides methods to access different data:
For one-off queries, use the standalone functions. They handle connection cleanup automatically:
Notes contain the actual content of your flashcards:
Cards are generated from notes. One note can produce multiple cards:
cards <- anki_cards()
# cid: Card ID
# nid: Note ID (links to notes table)
# did: Deck ID
# type: 0=new, 1=learning, 2=review, 3=relearning
# queue: -1=suspended, 0=new, 1=learning, 2=review
# due: Due date/position
# ivl: Current interval in days
# reps: Number of reviews
# lapses: Number of times forgotten
If you use FSRS (Free Spaced Repetition Scheduler), ankiR can extract the memory state parameters:
cards_fsrs <- anki_cards_fsrs()
# Additional columns:
# stability: Time in days for recall probability to drop to 90%
# difficulty: How hard the card is (1-10)
# desired_retention: Target recall probability
# decay: FSRS-6 decay parameter (w20)
Retrievability is the probability you’ll recall a card right now:
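As a rough illustration, the calculation can be sketched with the FSRS power forgetting curve. The helper name `retrievability_sketch` is hypothetical and not an ankiR function, and the code assumes that `decay` is expressed as a positive number (0.5 was the fixed value before FSRS-6 made it a per-preset parameter).

```r
# Hypothetical helper (not part of ankiR): FSRS power forgetting curve.
# elapsed_days: days since the card was last reviewed
# stability:    days for recall probability to fall to 90%
# decay:        positive decay exponent (assumed convention); 0.5 is the
#               fixed pre-FSRS-6 value, FSRS-6 uses the w20 parameter
retrievability_sketch <- function(elapsed_days, stability, decay = 0.5) {
  # The factor is chosen so that the result is exactly 0.9
  # when elapsed_days equals stability
  factor <- 0.9^(-1 / decay) - 1
  (1 + factor * elapsed_days / stability)^(-decay)
}

retrievability_sketch(0, 10)   # just reviewed: 1
retrievability_sketch(10, 10)  # elapsed time equals stability: 0.9
```

Applying this to the `stability` and `decay` columns from `anki_cards_fsrs()` would also need the number of days since each card's last review, which this sketch leaves to the caller.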
library(ankiR)
library(dplyr)
library(ggplot2)
# Get data
reviews <- anki_revlog()
cards <- anki_cards()
decks <- anki_decks()
# Daily review count
daily_reviews <- reviews |>
  count(review_date, name = "reviews")
ggplot(daily_reviews, aes(review_date, reviews)) +
  geom_col(fill = "steelblue") +
  labs(title = "Daily Reviews", x = NULL, y = "Reviews") +
  theme_minimal()
# Card maturity by deck
cards |>
  left_join(decks, by = "did") |>
  filter(type == 2) |> # Review cards only
  group_by(name) |>
  summarise(
    cards = n(),
    avg_interval = mean(ivl),
    mature = sum(ivl >= 21), # Cards with 21+ day interval
    .groups = "drop"
  ) |>
  arrange(desc(cards))
cards_fsrs <- anki_cards_fsrs()
# Distribution of stability values
cards_fsrs |>
  filter(!is.na(stability), stability > 0) |>
  ggplot(aes(stability)) +
  geom_histogram(bins = 50, fill = "steelblue") +
  scale_x_log10() +
  labs(
    title = "Distribution of Card Stability",
    x = "Stability (days, log scale)",
    y = "Count"
  ) +
  theme_minimal()
# Difficulty vs Stability
cards_fsrs |>
  filter(!is.na(stability), !is.na(difficulty)) |>
  ggplot(aes(difficulty, stability)) +
  geom_point(alpha = 0.3) +
  scale_y_log10() +
  labs(
    title = "Card Difficulty vs Stability",
    x = "Difficulty (1-10)",
    y = "Stability (days, log scale)"
  ) +
  theme_minimal()
Close connections: Always call col$close() when using anki_collection() directly, or use the convenience functions which handle this automatically.
Anki must be closed: The database is locked while Anki is running. Close Anki before reading the database.
Backup first: While ankiR only reads data (never writes), it’s good practice to back up your collection before any analysis.
Large collections: For very large collections, consider running SQL queries directly via DBI::dbGetQuery(col$con, "SELECT ...") so that filtering and aggregation happen in SQLite and only the result is loaded into R.
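A minimal sketch of the direct-SQL approach follows. So that it runs without an Anki installation, it builds a toy in-memory SQLite table as a stand-in for `col$con`; the table and its sample rows are made up for illustration, with column names taken from Anki's `cards` table. It assumes the RSQLite package is installed. With a real collection you would pass `col$con` to `dbGetQuery()` and close the collection with `col$close()` afterwards.

```r
library(DBI)

# Toy in-memory database standing in for col$con (illustrative data only).
# Only the columns used in the query below are created.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE cards (id INTEGER, did INTEGER, type INTEGER, ivl INTEGER)")
dbExecute(con, "INSERT INTO cards VALUES (1, 10, 2, 30), (2, 10, 2, 5), (3, 20, 0, 0)")

# Filter and aggregate inside SQLite so that only one row per deck
# is returned to R, instead of every card
deck_summary <- dbGetQuery(con, "
  SELECT did,
         COUNT(*)       AS cards,
         AVG(ivl)       AS avg_interval,
         SUM(ivl >= 21) AS mature
  FROM cards
  WHERE type = 2
  GROUP BY did
  ORDER BY cards DESC
")
deck_summary

# Release the connection once the query result is in memory
dbDisconnect(con)
```

The query mirrors the dplyr maturity summary earlier in the vignette, grouped by deck ID rather than deck name.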