Help for package highlightr

Title:

Highlight Conserved Edits Across Versions of a Document

Version:

1.1.2

Description:

Input multiple versions of a source document, and receive HTML code for a highlighted version of the source document indicating the frequency of occurrence of phrases in the different versions. This method is described in Chapter 3 of Rogers (2024) https://digitalcommons.unl.edu/dissertations/AAI31240449/.

License:

MIT + file LICENSE

Encoding:

UTF-8

RoxygenNote:

7.3.2

Imports:

dplyr, ggplot2, magrittr, purrr, quanteda, quanteda.textstats, stringi, stringr, tibble, tidyr, tm, zoomerjoin

Depends:

R (≥ 2.10)

LazyData:

true

URL:

https://rachelesrogers.github.io/highlightr/, https://github.com/rachelesrogers/highlightr

Suggests:

knitr, rmarkdown, testthat (≥ 3.0.0)

VignetteBuilder:

knitr

Config/testthat/edition:

BugReports:

https://github.com/rachelesrogers/highlightr/issues

NeedsCompilation:

Packaged:

2025-06-26 23:34:56 UTC; 165086

Author:

Center for Statistics and Applications in Forensic Evidence [aut, cph, fnd], Rachel Rogers [aut, cre], Susan VanderPlas [aut]

Maintainer:

Rachel Rogers <rrogers.rpackages@gmail.com>

Repository:

CRAN

Date/Publication:

2025-06-26 23:50:02 UTC

Collocation of Comments

Description

This function provides the frequency of collocations in comments that correspond to the provided transcript.

Usage

collocate_comments(transcript_token, note_token, collocate_length = 5)

Arguments

transcript_token

tokenized transcript that serves as the baseline for the notes, as returned by token_transcript()

note_token

tokenized document of notes, resulting from token_comments()

collocate_length

the length of the collocation. Default is 5

Value

data frame of the transcript and corresponding note frequency

Examples

comment_example_rename <- dplyr::rename(comment_example, page_notes=Notes)
toks_comment <- token_comments(comment_example_rename[1:100,])
transcript_example_rename <- dplyr::rename(transcript_example, text=Text)
toks_transcript <- token_transcript(transcript_example_rename)
collocation_object <- collocate_comments(toks_transcript, toks_comment)
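The idea of counting collocations of a fixed length can be illustrated with a minimal base R sketch. This is not the package's implementation: word_ngrams() is a hypothetical helper, the transcript and notes are made-up strings, and exact string equality stands in for whatever matching the package performs internally.

```r
# Hypothetical helper (not part of highlightr): split a text into lowercase
# word sequences ("collocations") of a fixed length n.
word_ngrams <- function(text, n = 5) {
  words <- strsplit(tolower(gsub("[[:punct:]]", "", text)), "\\s+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < n) return(character(0))
  vapply(
    seq_len(length(words) - n + 1),
    function(i) paste(words[i:(i + n - 1)], collapse = " "),
    character(1)
  )
}

# Made-up data for illustration only.
transcript <- "The detective found a gun under the seat of the car."
notes <- c(
  "detective found a gun under the seat",
  "a gun under the seat of the car",
  "nothing relevant here at all today"
)

transcript_grams <- word_ngrams(transcript, n = 5)
note_grams <- unlist(lapply(notes, word_ngrams, n = 5))

# For each transcript collocation, count how often it appears in the notes.
counts <- vapply(transcript_grams, function(g) sum(note_grams == g), integer(1))
result <- data.frame(collocation = transcript_grams, note_count = unname(counts))
print(result)
```

A larger collocate_length demands longer verbatim overlap before a note counts toward the transcript, so fewer but more specific matches are expected.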

Collocate Comments Fuzzy

Description

This function provides the frequency of collocations in comments that correspond to the provided transcript, using fuzzy matching.

Usage

collocate_comments_fuzzy(
  transcript_token,
  note_token,
  collocate_length = 5,
  n_bands = 50,
  threshold = 0.7
)

Arguments

transcript_token

tokenized transcript that serves as the baseline for the notes, as returned by token_transcript()

note_token

tokenized document of notes, resulting from token_comments()

collocate_length

the length of the collocation. Default is 5

n_bands

number of bands used in MinHash algorithm passed to zoomerjoin::jaccard_right_join(). Default is 50

threshold

Jaccard similarity threshold above which a pair is considered a match, passed to zoomerjoin::jaccard_right_join(). Default is 0.7

Value

data frame of the transcript and corresponding note frequency

Examples

comment_example_rename <- dplyr::rename(comment_example[1:10,], page_notes=Notes)
toks_comment <- token_comments(comment_example_rename)
transcript_example_rename <- dplyr::rename(transcript_example, text=Text)
toks_transcript <- token_transcript(transcript_example_rename)
fuzzy_object <- collocate_comments_fuzzy(toks_transcript, toks_comment)
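To give a feel for what the threshold means, the following base R sketch computes an exact Jaccard similarity between short phrases. It rests on assumptions: shingles() and jaccard_similarity() are hypothetical helpers, character bigrams are assumed as the unit of comparison, and the similarity is computed exactly, whereas a MinHash-based join only approximates it.

```r
# Hypothetical helper: the set of overlapping character n-grams ("shingles")
# in a string. A width of 2 is an assumption made for this illustration.
shingles <- function(x, width = 2) {
  x <- tolower(x)
  if (nchar(x) < width) return(x)
  unique(substring(x, seq_len(nchar(x) - width + 1), width:nchar(x)))
}

# Jaccard similarity: size of the intersection divided by size of the union.
jaccard_similarity <- function(a, b, width = 2) {
  sa <- shingles(a, width)
  sb <- shingles(b, width)
  length(intersect(sa, sb)) / length(union(sa, sb))
}

reference <- "found a gun under the"
sim_exact <- jaccard_similarity(reference, "found a gun under the")
sim_close <- jaccard_similarity(reference, "found the gun under the")
sim_far   <- jaccard_similarity(reference, "drove home late at night")

# With a threshold of 0.7, only pairs at or above 0.7 would count as matches.
threshold <- 0.7
print(c(exact = sim_exact, close = sim_close, far = sim_far) >= threshold)
```

Lowering threshold admits looser paraphrases as matches; raising it toward 1 approaches exact matching.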

Map collocation to ggplot object

Description

This assigns colors based on frequency to the words in the transcript.

Usage

collocation_plot(
  frequency_doc,
  n_scenario = 1,
  colors = c("#f251fc", "#f8ff1b")
)

Arguments

frequency_doc

document of frequencies (returned from transcript_frequency())

n_scenario

number of scenarios in which this transcript appeared. Default is 1

colors

vector of colors specifying the gradient. Default is c("#f251fc","#f8ff1b")

Value

list of plot, plot object, and frequency

Examples

comment_example_rename <- dplyr::rename(comment_example, page_notes=Notes)
toks_comment <- token_comments(comment_example_rename)
transcript_example_rename <- dplyr::rename(transcript_example, text=Text)
toks_transcript <- token_transcript(transcript_example_rename)
collocation_object <- collocate_comments(toks_transcript, toks_comment)
merged_frequency <- transcript_frequency(transcript_example_rename, collocation_object)
freq_plot <- collocation_plot(merged_frequency)
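How a two-color gradient can encode frequency is sketched below using only grDevices. This is an illustration under assumptions, not the package's plotting code: the frequencies are made up, and it is assumed that the first color marks the lowest frequency and the second the highest.

```r
# Made-up per-word frequencies for illustration.
freq <- c(0, 2, 5, 10)

# Rescale frequencies to the unit interval.
scaled <- (freq - min(freq)) / (max(freq) - min(freq))

# Interpolate between the two default colors of collocation_plot().
ramp <- grDevices::colorRamp(c("#f251fc", "#f8ff1b"))
rgb_matrix <- ramp(scaled)

# Convert the interpolated RGB values to hex color codes, one per word.
hex <- grDevices::rgb(
  rgb_matrix[, 1], rgb_matrix[, 2], rgb_matrix[, 3],
  maxColorValue = 255
)
print(data.frame(freq = freq, color = hex))
```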

Comment Example Dataset

Description

Participant comments for the initial description used in the jury perception study

Usage

comment_example

Format

`comment_example`

A data frame with 125 rows and 2 columns:

ID: Participant Identifier
Notes: Participant notes

Source

Jury Perception Study (see Rogers (2024) https://digitalcommons.unl.edu/dissertations/AAI31240449/)

Create Highlighted Testimony

Description

Adds HTML tags to the transcript text to create highlighted testimony, with highlight color corresponding to word frequency.

Usage

highlighted_text(plot_object, labels = c("", ""))

Arguments

plot_object

plot object resulting from collocation_plot()

labels

lower and upper labels for the gradient scale

Value

HTML code for highlighted text

Examples

comment_example_rename <- dplyr::rename(comment_example, page_notes=Notes)
toks_comment <- token_comments(comment_example_rename)
transcript_example_rename <- dplyr::rename(transcript_example, text=Text)
toks_transcript <- token_transcript(transcript_example_rename)
collocation_object <- collocate_comments(toks_transcript, toks_comment)
merged_frequency <- transcript_frequency(transcript_example_rename, collocation_object)
freq_plot <- collocation_plot(merged_frequency)
page_highlight <- highlighted_text(freq_plot)
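The general shape of highlighted HTML output can be sketched in base R. The markup below is an assumption chosen for illustration: the tags and attributes that highlighted_text() actually emits may differ, and the words and fill colors are made up.

```r
# Made-up words and fill colors; in practice the colors would come from the
# frequency gradient.
words <- c("found", "a", "gun")
fill  <- c("#F251FC", "#F5A88C", "#F8FF1B")

# Wrap each word in a span whose background color encodes its frequency.
spans <- sprintf('<span style="background-color: %s">%s</span>', fill, words)
html  <- paste(spans, collapse = " ")
cat(html, "\n")
```

HTML produced this way can be written to a file or embedded in an R Markdown document that renders raw HTML.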

Tokenize comments

Description

This function tokenizes comments that are to be used in collocate_comments_fuzzy() or collocate_comments().

Usage

token_comments(comment_document)

Arguments

comment_document

document containing notes by individual, where the column containing the notes is named page_notes

Value

tokenized comments

Examples

comment_example_rename <- dplyr::rename(comment_example, page_notes=Notes)
toks_comment <- token_comments(comment_example_rename)

Tokenize Transcript

Description

This function tokenizes a transcript document that is to be used in collocate_comments_fuzzy() or collocate_comments().

Usage

token_transcript(transcript_file)

Arguments

transcript_file

data frame of the transcript, where the transcript text is in a column named text.

Value

a tokenized object

Examples

transcript_example_rename <- dplyr::rename(transcript_example, text=Text)
toks_transcript <- token_transcript(transcript_example_rename)
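When working with data other than the bundled examples, the inputs need the column names described above. The sketch below builds two such data frames in base R; the object names and the text are made up for illustration, and no highlightr function is called.

```r
# Hypothetical source document: one row, with the text in a column named `text`
# (the name expected by token_transcript()).
my_source <- data.frame(
  text = "The detective found a gun under the seat of the car.",
  stringsAsFactors = FALSE
)

# Hypothetical versions or notes: one row per version, with the text in a
# column named `page_notes` (the name expected by token_comments()).
my_versions <- data.frame(
  page_notes = c(
    "detective found a gun under the seat",
    "a gun under the seat of the car"
  ),
  stringsAsFactors = FALSE
)

# Check the column names before passing the data frames on.
stopifnot("text" %in% names(my_source))
stopifnot("page_notes" %in% names(my_versions))
```

Data frames shaped like these can then be passed to token_transcript() and token_comments() respectively.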

Transcript Example

Description

Text corresponding to participant comments

Usage

transcript_example

Format

`transcript_example`

A data frame with 1 row and 1 column:

Text: Transcript text corresponding to the jury perception study

Source

Jury Perception Study (see Rogers (2024) https://digitalcommons.unl.edu/dissertations/AAI31240449/ and Garrett et al. (2020) doi:10.1037/lhb0000423)

Mapping Collocation Frequency to Transcript Document

Description

This function connects the collocation frequency calculated in collocate_comments_fuzzy() or collocate_comments() to the base transcript.

Usage

transcript_frequency(transcript, collocate_object)

Arguments

transcript

transcript document

collocate_object

collocation object (returned from collocate_comments_fuzzy() or collocate_comments())

Value

a data frame of the transcript document with collocation values by word

Examples

comment_example_rename <- dplyr::rename(comment_example, page_notes=Notes)
toks_comment <- token_comments(comment_example_rename)
transcript_example_rename <- dplyr::rename(transcript_example, text=Text)
toks_transcript <- token_transcript(transcript_example_rename)
collocation_object <- collocate_comments(toks_transcript, toks_comment)
merged_frequency <- transcript_frequency(transcript_example_rename, collocation_object)
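One plausible way to turn collocation counts into a value per word is to average the counts of every collocation window that covers the word. The base R sketch below illustrates that idea only. It is an assumption, not a description of how transcript_frequency() computes its values, and the words and counts are made up.

```r
# Made-up transcript words and one count per 5-word window (7 windows for
# 11 words).
words <- c("the", "detective", "found", "a", "gun", "under",
           "the", "seat", "of", "the", "car")
n <- 5
gram_counts <- c(0, 1, 1, 2, 1, 1, 1)

# Word i is covered by the windows starting at positions i - n + 1 through i,
# clipped to the range of valid window starts. Average their counts.
word_value <- vapply(seq_along(words), function(i) {
  starts <- max(1, i - n + 1):min(i, length(gram_counts))
  mean(gram_counts[starts])
}, numeric(1))

print(data.frame(word = words, value = word_value))
```

Averaging over windows gives words at the start and end of the transcript, which are covered by fewer windows, values on the same scale as those in the middle.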

Wikipedia Edit History for "Highlighter"

Description

Text corresponding to versions of the Wikipedia article for Highlighter

Usage

wiki_pages

Format

`wiki_pages`

A data frame with 50 rows and 1 column:

page_notes: text of the Wikipedia page for Highlighter

Source

Wikipedia: https://en.wikipedia.org/w/index.php?title=Highlighter&action=history