Title: | Process Data from Wearable Light Loggers and Optical Radiation Dosimeters |
Version: | 0.9.2 |
Description: | Import, processing, validation, and visualization of personal light exposure measurement data from wearable devices. The package implements features such as the import of data and metadata files, conversion of common file formats, validation of light logging data, verification of crucial metadata, calculation of common parameters, and semi-automated analysis and visualization. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/tscnlab/LightLogR, https://tscnlab.github.io/LightLogR/, https://zenodo.org/doi/10.5281/zenodo.11562600 |
BugReports: | https://github.com/tscnlab/LightLogR/issues |
Imports: | cowplot, dplyr, ggplot2, ggsci, ggtext, hms, janitor, lubridate, magrittr, plotly, purrr, readr, rlang, scales, slider, stats, stringr, suntools, tibble, tidyr, utils |
Depends: | R (≥ 4.3) |
LazyData: | true |
Suggests: | covr, flextable, gghighlight, gt, gtsummary, knitr, patchwork, pkgload, rmarkdown, rsconnect, testthat (≥ 3.0.0), tidyverse |
Config/testthat/edition: | 3 |
Config/Needs/website: | rmarkdown |
NeedsCompilation: | no |
Packaged: | 2025-06-10 10:51:50 UTC; zauner |
Author: | Johannes Zauner |
Maintainer: | Johannes Zauner <johannes.zauner@tum.de> |
Repository: | CRAN |
Date/Publication: | 2025-06-10 11:10:02 UTC |
LightLogR: Process Data from Wearable Light Loggers and Optical Radiation Dosimeters
Description
Import, processing, validation, and visualization of personal light exposure measurement data from wearable devices. The package implements features such as the import of data and metadata files, conversion of common file formats, validation of light logging data, verification of crucial metadata, calculation of common parameters, and semi-automated analysis and visualization.
Author(s)
Maintainer: Johannes Zauner johannes.zauner@tum.de (ORCID)
Authors:
Manuel Spitschan manuel.spitschan@tum.de (ORCID)
Steffen Hartmeyer steffen.hartmeyer@epfl.ch (ORCID)
Other contributors:
MeLiDos [funder]
EURAMET (European Association of National Metrology Institutes. Website: www.euramet.org. Grant Number: 22NRM05 MeLiDos. Grant Statement: The project (22NRM05 MeLiDos) has received funding from the European Partnership on Metrology, co-financed from the European Union’s Horizon Europe Research and Innovation Programme and by the Participating States.) [funder]
European Union (Co-funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or EURAMET. Neither the European Union nor the granting authority can be held responsible for them.) [funder]
TSCN-Lab (www.tscnlab.org) [copyright holder]
See Also
Useful links:
https://github.com/tscnlab/LightLogR
https://tscnlab.github.io/LightLogR/
Report bugs at https://github.com/tscnlab/LightLogR/issues
Pipe operator
Description
See magrittr::%>% for details.
Usage
lhs %>% rhs
Arguments
lhs |
A value or the magrittr placeholder. |
rhs |
A function call using the magrittr semantics. |
Value
The result of calling rhs(lhs).
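Examples
#a minimal sketch of piping, assuming only base R functions: the
#left-hand side is passed as the first argument of the right-hand side
c(1, 5, 9) %>% mean()
#equivalent to mean(c(1, 5, 9))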
Add Brown et al. (2022) reference illuminance to a dataset
Description
Adds several columns to a light logger dataset. It requires a column that contains the Brown states, e.g. "day", "evening", and "night". From these, the function will add a column with the recommended illuminance, a column that checks whether the illuminance of the dataset is within the recommended illuminance levels, and a column that gives a label to the reference.
Usage
Brown2reference(
dataset,
MEDI.colname = MEDI,
Brown.state.colname = State.Brown,
Brown.rec.colname = Reference,
Reference.label = "Brown et al. (2022)",
overwrite = FALSE,
...
)
Arguments
dataset |
A dataframe that contains a column with the Brown states |
MEDI.colname |
The name of the column that contains the MEDI values which are used for checks against the Brown reference illuminance. Must be part of the dataset. |
Brown.state.colname |
The name of the column that contains the Brown states. Must be part of the dataset. |
Brown.rec.colname |
The name of the column that will contain the recommended illuminance. Must not be part of the dataset, otherwise it will throw an error. |
Reference.label |
The label that will be used for the reference. Expects a character scalar. |
overwrite |
If TRUE (defaults to FALSE), the function will overwrite the Brown.rec.colname column if it already exists. |
... |
Additional arguments that will be passed to Brown_check() and Brown_rec(). |
Details
On a lower level, the function uses Brown_rec() and Brown_check() to create the required information.
Value
A dataframe on the basis of the dataset that contains the added columns.
References
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001571
See Also
Other Brown: Brown_check(), Brown_cut(), Brown_rec(), sleep_int2Brown()
Examples
#add Brown reference illuminance to some sample data
testdata <- tibble::tibble(MEDI = c(100, 10, 1, 300),
State.Brown = c("day", "evening", "night", "day"))
Brown2reference(testdata)
Check whether a value is within the recommended illuminance/MEDI levels by Brown et al. (2022)
Description
This is a lower level function. It checks a given value against a threshold for the states given by Brown et al. (2022). The function is vectorized. For day the threshold is a lower limit; for evening and night the threshold is an upper limit.
Usage
Brown_check(
value,
state,
Brown.day = "day",
Brown.evening = "evening",
Brown.night = "night",
Brown.day.th = 250,
Brown.evening.th = 10,
Brown.night.th = 1
)
Arguments
value |
Illuminance value to check against the recommendation. Needs to be numeric; can be a vector. |
state |
The state from Brown et al. (2022). Needs to be a character vector with the same length as value. |
Brown.day , Brown.evening , Brown.night |
The names of the states from Brown et al. (2022). These are the default values ("day", "evening", "night"). |
Brown.day.th , Brown.evening.th , Brown.night.th |
The thresholds for the states from Brown et al. (2022). These are the default values (250, 10, 1). |
Value
A logical vector with the same length as value that indicates whether the value is within the recommended illuminance levels.
References
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001571
See Also
Other Brown: Brown2reference(), Brown_cut(), Brown_rec(), sleep_int2Brown()
Examples
states <- c("day", "evening", "night", "day")
values <- c(100, 10, 1, 300)
Brown_check(values, states)
Brown_check(values, states, Brown.day.th = 100)
Create a state column that cuts light levels into sections by Brown et al. (2022)
Description
This is a convenience wrapper around cut() and dplyr::mutate(). It creates a state column dividing a light column into recommended levels by Brown et al. (2022). Cuts can be adjusted or extended with vector_cuts and vector_labels.
Usage
Brown_cut(
dataset,
MEDI.colname = MEDI,
New.state.colname = state,
vector_cuts = c(-Inf, 1, 10, 250, Inf),
vector_labels = "default",
overwrite = TRUE
)
Arguments
dataset |
A light exposure dataframe |
MEDI.colname |
The colname containing melanopic EDI values (or, alternatively, Illuminance). Defaults to MEDI. |
New.state.colname |
Name of the new column that will contain the cut data. Expects a symbol. |
vector_cuts |
Numeric vector of breaks for the cuts. |
vector_labels |
Vector of labels for the cuts. Must be one entry shorter than vector_cuts. |
overwrite |
Logical. Should the New.state.colname column be overwritten if it already exists in the dataset? Defaults to TRUE. |
Value
The input dataset with an additional (or overwritten) column containing a cut light vector
References
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001571
See Also
Other Brown: Brown2reference(), Brown_check(), Brown_rec(), sleep_int2Brown()
Examples
sample.data.environment |>
Brown_cut(vector_labels = c("0-1lx", "1-10lx", "10-250lx", "250lx-Inf")) |>
dplyr::count(state)
Set the recommended illuminance/MEDI levels by Brown et al. (2022)
Description
This is a lower level function. It sets the recommended illuminance/MEDI levels by Brown et al. (2022) for a given state. The function is vectorized.
Usage
Brown_rec(
state,
Brown.day = "day",
Brown.evening = "evening",
Brown.night = "night",
Brown.day.th = 250,
Brown.evening.th = 10,
Brown.night.th = 1
)
Arguments
state |
The state from Brown et al. (2022). Needs to be a character vector. |
Brown.day , Brown.evening , Brown.night |
The names of the states from Brown et al. (2022). These are the default values ("day", "evening", "night"). |
Brown.day.th , Brown.evening.th , Brown.night.th |
The thresholds for the states from Brown et al. (2022). These are the default values (250, 10, 1). |
Value
A dataframe with the same length as state that contains the recommended illuminance/MEDI levels.
References
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001571
See Also
Other Brown: Brown2reference(), Brown_check(), Brown_cut(), sleep_int2Brown()
Examples
states <- c("day", "evening", "night")
Brown_rec(states)
Brown_rec(states, Brown.day.th = 100)
Convert Datetime columns to Time columns
Description
Convert Datetime columns to Time columns
Usage
Datetime2Time(
dataset,
cols = dplyr::where(lubridate::is.POSIXct),
silent = FALSE
)
Arguments
dataset |
A data.frame with POSIXct columns. |
cols |
The column names to convert. Expects a tidyselect specification. By default, all POSIXct columns are selected. |
silent |
Logical. If TRUE, no message is shown when input and output are identical. Defaults to FALSE. |
Value
The input dataset with converted POSIXct columns as time (hms) columns. With the default settings, if no POSIXct column exists, input and output will be identical.
Examples
sample.data.environment |> Datetime2Time()
#more than one POSIX col
sample.data.environment |>
dplyr::mutate(Datetime2 = lubridate::POSIXct(1)) |>
Datetime2Time()
#only converting one of them
sample.data.environment |>
dplyr::mutate(Datetime2 = lubridate::POSIXct(1)) |>
Datetime2Time(Datetime)
#if uncertain whether column exists
sample.data.environment |>
Datetime2Time(dplyr::any_of("Datetime3"))
Create a (shifted) sequence of Datetimes for axis breaks
Description
Take a vector of Datetimes and create a sequence of Datetimes with a given shift and interval. This is a helper function to create breaks for plotting, e.g. in gg_days(), and is best used in conjunction with Datetime_limits(). The function is a thin wrapper around seq().
Usage
Datetime_breaks(x, shift = lubridate::duration(12, "hours"), by = "1 day")
Arguments
x |
a vector of Datetimes. |
shift |
a duration by which the sequence is shifted. Defaults to lubridate::duration(12, "hours"). |
by |
an increment for the sequence, passed to seq(). Defaults to "1 day". |
Value
a vector of Datetimes
Examples
dataset <- c("2023-08-15", "2023-08-20")
Datetime_breaks(dataset)
Datetime_breaks(dataset, shift = 0)
Datetime_breaks(dataset, by = "12 hours")
Find or set sensible limits for Datetime axis
Description
Take a vector of Datetimes and return the start of the first and end of the last day of data. The start and the length can be adjusted by durations, like lubridate::ddays(). It is used in the gg_days() function to return a sensible x-axis. This function is a thin wrapper around lubridate::floor_date() and lubridate::ceiling_date().
Usage
Datetime_limits(
x,
start = NULL,
length = NULL,
unit = "1 day",
midnight.rollover = FALSE,
...
)
Arguments
x |
a vector of Datetimes. |
start |
optional duration that adjusts the start of the limits. |
length |
optional duration that sets the length of the limits. |
unit |
a character scalar giving the unit for rounding, passed to lubridate::floor_date() and lubridate::ceiling_date(). Defaults to "1 day". |
midnight.rollover |
a logical. Defaults to FALSE. |
... |
other arguments passed to lubridate::floor_date() and lubridate::ceiling_date(). |
Value
a 2 item vector of Datetimes with the (adjusted) start and end of the input vector.
Examples
dataset <- c("2023-08-15", "2023-08-20")
breaks <- Datetime_breaks(dataset)
Datetime_limits(breaks)
Datetime_limits(breaks, start = lubridate::ddays(1))
Datetime_limits(breaks, length = lubridate::ddays(2))
Create a Date column in the dataset
Description
Create a Date column in the dataset
Usage
add_Date_col(
dataset,
Date.colname = Date,
group.by = FALSE,
as.wday = FALSE,
Datetime.colname = Datetime
)
Arguments
dataset |
A light logger dataset. Expects a dataframe. |
Date.colname |
Name of the newly created column. Expects a symbol. |
group.by |
Logical whether the output should be (additionally) grouped by the new column |
as.wday |
Logical of whether the added column should calculate day of the week instead of date. If TRUE, the column will be a factor with weekday levels. |
Datetime.colname |
column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol. |
Value
a data.frame object identical to dataset but with the added column of Date data.
Examples
sample.data.environment %>% add_Date_col()
#days of the week
sample.data.environment %>%
add_Date_col(as.wday = TRUE, group.by = TRUE) |>
summarize_numeric(remove = c("Datetime"))
Create a Time-of-Day column in the dataset
Description
Create a Time-of-Day column in the dataset
Usage
add_Time_col(
dataset,
Datetime.colname = Datetime,
Time.colname = Time,
output.dataset = TRUE
)
Arguments
dataset |
A light logger dataset. Expects a dataframe. |
Datetime.colname |
column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol. |
Time.colname |
Name of the newly created column. Expects a symbol. |
output.dataset |
should the output be a data.frame (TRUE, the default) or a vector with the Time-of-Day data (FALSE)? |
Value
a data.frame object identical to dataset but with the added column of Time-of-Day data, or a vector with the Time-of-Day data.
Examples
sample.data.environment %>% add_Time_col()
Add states to a dataset based on groups and start/end times
Description
add_states() brings states to a time series dataset. It uses the States.dataset to add states to the dataset. The States.dataset must at least contain the same variables as the dataset grouping, as well as a start and end time. Beware if both datasets operate on different time zones, and consider setting force.tz = TRUE.
Usage
add_states(
dataset,
States.dataset,
Datetime.colname = Datetime,
start.colname = start,
end.colname = end,
force.tz = FALSE,
leave.out = c("duration", "epoch")
)
Arguments
dataset |
A light logger dataset. Needs to be a dataframe. |
States.dataset |
A light logger dataset. Needs to be a dataframe. This dataset must contain the same variables as the dataset grouping, as well as a start and end time. |
Datetime.colname |
The column that contains the datetime. Needs to be a POSIXct and part of the dataset. |
start.colname , end.colname |
The columns that contain the start and end time. Need to be POSIXct and part of the States.dataset. |
force.tz |
If TRUE, the start and end times of the States.dataset are forced into the time zone of the dataset. Defaults to FALSE. |
leave.out |
A character vector of columns that should not be carried over to the dataset. Defaults to c("duration", "epoch"). |
Details
Beware if columns in the dataset and States.dataset have the same name (other than grouping variables). The underlying function, dplyr::left_join(), will mark the columns in the dataset with a suffix .x, and in the States.dataset with a suffix .y.
Value
a modified dataset with the states added. The states are added as new columns to the dataset. The columns are named after the columns in the States.dataset, except for the start and end times, which are removed.
Examples
states <-
sample.data.environment |>
filter_Date(length = "1 day") |>
extract_states(Daylight, MEDI > 1000)
states |> head(2)
#add states to a dataset and plot them - as we only looked for states on the
# first day (see above), only the first day will show up in the plot
sample.data.environment |>
filter_Date(length = "2 day") |>
add_states(states) |>
gg_days() |>
gg_state(Daylight)
Aggregate dates to a single day
Description
Condenses a dataset by aggregating the data to a single day per group, with a resolution of choice unit. aggregate_Date() is opinionated in the sense that it sets default handlers for each data type of numeric, character, logical, and factor. These can be overwritten by the user. Columns that do not fall into one of these categories need to be handled individually by the user (... argument) or will be removed during aggregation. If no unit is specified, the data will simply be aggregated to the most common interval (dominant.epoch) in every group. aggregate_Date() is especially useful for summary plots that show an average day.
Usage
aggregate_Date(
dataset,
Datetime.colname = Datetime,
unit = "none",
type = c("round", "floor", "ceiling"),
date.handler = stats::median,
numeric.handler = mean,
character.handler = function(x) names(which.max(table(x, useNA = "ifany"))),
logical.handler = function(x) mean(x) >= 0.5,
factor.handler = function(x) factor(names(which.max(table(x, useNA = "ifany")))),
datetime.handler = stats::median,
duration.handler = function(x) lubridate::duration(mean(x)),
time.handler = function(x) hms::as_hms(mean(x)),
...
)
Arguments
dataset |
A light logger dataset. Expects a dataframe. |
Datetime.colname |
column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol. |
unit |
Unit of binning. See lubridate::round_date() for examples. Defaults to "none". |
type |
One of "round" (the default), "floor", or "ceiling", deciding how datetimes are assigned to the unit. |
date.handler |
A function that calculates the aggregated day for each group. By default, this is set to stats::median. |
numeric.handler , character.handler , logical.handler , factor.handler , datetime.handler , duration.handler , time.handler |
functions that handle the respective data types. The default handlers calculate the mean (numeric, duration, time), the median (datetime), or the most frequent value (character, logical, factor); see the Usage defaults. |
... |
arguments given over to dplyr::summarize() to create summary statistics. |
Details
Summary values for type POSIXct are calculated as the median, because the mean can be nonsensical at times (e.g., the mean of Day1 18:00 and Day2 18:00 is Day2 6:00, which can be the desired result, but if the focus is on time rather than on datetime, it is recommended that values are converted to times via hms::as_hms() before applying the function - the mean of 18:00 and 18:00 is still 18:00, not 6:00). Using the median as a default handler ensures a more sensible datetime.
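A minimal sketch of this behavior, assuming lubridate is available:
x <- lubridate::as_datetime(c("2023-01-01 18:00:00",
                              "2023-01-02 18:00:00",
                              "2023-01-02 18:00:00"))
mean(x)          #2023-01-02 10:00:00 UTC - a shifted, unobserved time of day
stats::median(x) #2023-01-02 18:00:00 UTC - an actually observed datetime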
aggregate_Date() splits the Datetime column into a Date.data and a Time column. It will create subgroups for each Time present in a group, aggregate each group into a single day, and then remove the subgrouping.
Use the ... argument to create summary statistics for each group, e.g. maximum or minimum values for each time point group.
Performing aggregate_Datetime() with any unit and then aggregate_Date() with a unit of "none" is equivalent to just using aggregate_Date() with that unit directly (provided the other arguments are set the same between the functions). Disentangling the two functions can be useful to split the computational cost for very small instances of unit in large datasets. It can also be useful to apply different handlers when aggregating data to the desired unit of time, before further aggregation to a single day, as these handlers as well as ... are used twice if the unit is not set to "none".
Value
A tibble with aggregated Datetime data, at maximum one day per group. If the handler arguments capture all column types, the number of columns will be the same as in the input dataset.
Examples
library(ggplot2)
#gg_days without aggregation
sample.data.environment %>%
gg_days()
#with daily aggregation
sample.data.environment %>%
aggregate_Date() %>%
gg_days()
#with daily aggregation and a different time aggregation
sample.data.environment %>%
aggregate_Date(unit = "15 mins", type = "floor") %>%
gg_days()
#adding further summary statistics about the range of MEDI
sample.data.environment %>%
aggregate_Date(unit = "15 mins", type = "floor",
MEDI_max = max(MEDI),
MEDI_min = min(MEDI)) %>%
gg_days() +
geom_ribbon(aes(ymin = MEDI_min, ymax = MEDI_max), alpha = 0.5)
Aggregate Datetime data
Description
Condenses a dataset by aggregating the data to a given (shorter) interval unit. aggregate_Datetime() is opinionated in the sense that it sets default handlers for each data type of numeric, character, logical, factor, duration, time, and datetime. These can be overwritten by the user. Columns that do not fall into one of these categories need to be handled individually by the user (... argument) or will be removed during aggregation. If no unit is specified, the data will simply be aggregated to the most common interval (dominant.epoch), which is most often not an aggregation but a rounding.
Usage
aggregate_Datetime(
dataset,
unit = "dominant.epoch",
Datetime.colname = Datetime,
type = c("round", "floor", "ceiling"),
numeric.handler = mean,
character.handler = function(x) names(which.max(table(x, useNA = "ifany"))),
logical.handler = function(x) mean(x) >= 0.5,
factor.handler = function(x) factor(names(which.max(table(x, useNA = "ifany")))),
datetime.handler = mean,
duration.handler = function(x) lubridate::duration(mean(x)),
time.handler = function(x) hms::as_hms(mean(x)),
...
)
Arguments
dataset |
A light logger dataset. Expects a dataframe. |
unit |
Unit of binning. See lubridate::round_date() for examples. Defaults to "dominant.epoch". |
Datetime.colname |
column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol. |
type |
One of "round" (the default), "floor", or "ceiling", deciding how datetimes are assigned to the unit. |
numeric.handler , character.handler , logical.handler , factor.handler , datetime.handler , duration.handler , time.handler |
functions that handle the respective data types. The default handlers calculate the mean (numeric, datetime, duration, time) or the most frequent value (character, logical, factor); see the Usage defaults. |
... |
arguments given over to dplyr::summarize() to create summary statistics. |
Details
Summary values for type POSIXct are calculated as the mean, which can be nonsensical at times (e.g., the mean of Day1 18:00 and Day2 18:00 is Day2 6:00, which can be the desired result, but if the focus is on time rather than on datetime, it is recommended that values are converted to times via hms::as_hms() before applying the function - the mean of 18:00 and 18:00 is still 18:00, not 6:00).
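A minimal sketch of the recommended conversion, assuming lubridate and hms are available:
x <- lubridate::as_datetime(c("2023-01-01 18:00:00", "2023-01-02 18:00:00"))
mean(x)                           #2023-01-02 06:00:00 UTC - mean datetime
hms::as_hms(mean(hms::as_hms(x))) #18:00:00 - mean time of day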
Value
A tibble with aggregated Datetime data. Usually the number of rows will be smaller than in the input dataset. If the handler arguments capture all column types, the number of columns will be the same as in the input dataset.
Examples
#dominant epoch without aggregation
sample.data.environment %>%
dominant_epoch()
#dominant epoch with 5 minute aggregation
sample.data.environment %>%
aggregate_Datetime(unit = "5 mins") %>%
dominant_epoch()
#dominant epoch with 1 day aggregation
sample.data.environment %>%
aggregate_Datetime(unit = "1 day") %>%
dominant_epoch()
Alphaopic (+ photopic) action spectra
Description
A dataframe of alphaopic action spectra plus the photopic action spectrum. The alphaopic action spectra follow the CIE S 026/E:2018 standard and are for a 32-year-old standard observer. The photopic action spectrum is for a 2° standard observer.
Usage
alphaopic.action.spectra
Format
alphaopic.action.spectra
A dataframe with 471 rows and 7 columns:
- wavelength
integer of wavelength, from 360 to 830 nm. Unit is nm
- melanopic
numeric melanopic action spectrum
- l_cone_opic
numeric L-cone opic action spectrum
- m_cone_opic
numeric M-cone opic action spectrum
- s_cone_opic
numeric S-cone opic action spectrum
- rhodopic
numeric rhodopic action spectrum
- photopic
numeric photopic action spectrum
Source
https://cie.co.at/datatable/cie-spectral-luminous-efficiency-photopic-vision
<https://files.cie.co.at/CIE S 026 alpha-opic Toolbox.xlsx>
References
CIE (2019). ISO/CIE 11664-1:2019(E). Colorimetry — Part 1: CIE standard colorimetric observers. Vienna, CIE
CIE (2018). CIE S 026/E:2018. CIE system for metrology of optical radiation for ipRGC-influenced responses of light. Vienna, CIE
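Examples
#a brief sketch for inspecting the dataset, assuming only base R:
head(alphaopic.action.spectra)
#wavelength range covered by the action spectra
range(alphaopic.action.spectra$wavelength)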
Circadian lighting metrics from Barroso et al. (2014)
Description
This function calculates the metrics proposed by Barroso et al. (2014) for light-dosimetry in the context of research on the non-visual effects of light. The following metrics are calculated:
Usage
barroso_lighting_metrics(
Light.vector,
Time.vector,
epoch = "dominant.epoch",
loop = FALSE,
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
epoch |
The epoch at which the data was sampled. Can be either a
duration or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid duration string, e.g., "1 day" or "10 sec". |
loop |
Logical. Should the data be looped? Defaults to FALSE. |
na.rm |
Logical. Should missing values (NA) be removed for the calculation? Defaults to FALSE. |
as.df |
Logical. Should a data frame be returned? If TRUE, a data frame with seven columns will be returned. Defaults to FALSE. |
Details
bright_threshold: The maximum light intensity for which at least six hours of measurements are at the same or higher level.
dark_threshold: The minimum light intensity for which at least eight hours of measurements are at the same or lower level.
bright_mean_level: The 20% trimmed mean of all light intensity measurements equal or above the bright_threshold.
dark_mean_level: The 20% trimmed mean of all light intensity measurements equal or below the dark_threshold.
bright_cluster: The longest continuous time interval above the bright_threshold.
dark_cluster: The longest continuous time interval below the dark_threshold.
circadian_variation: A measure of periodicity of the daily lighting schedule over a given set of days. Calculated as the coefficient of variation of the input light data.
Value
List or dataframe with the seven values: bright_threshold, dark_threshold, bright_mean_level, dark_mean_level, bright_cluster, dark_cluster, circadian_variation. The output type of bright_cluster and dark_cluster is a duration object.
References
Barroso, A., Simons, K., & Jager, P. de. (2014). Metrics of circadian lighting for clinical investigations. Lighting Research & Technology, 46(6), 637–649. doi:10.1177/1477153513502664
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
Examples
dataset1 <-
tibble::tibble(
Id = rep("B", 60 * 24),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)),
MEDI = c(rep(sample(seq(0,1,0.1), 60*8, replace = TRUE)),
rep(sample(1:1000, 16, replace = TRUE), each = 60))
)
dataset1 %>%
dplyr::reframe(barroso_lighting_metrics(MEDI, Datetime, as.df = TRUE))
Brightest or darkest continuous period
Description
This function finds the brightest or darkest continuous period of a given timespan and calculates its mean light level, as well as the timing of the period's onset, midpoint, and offset. It is defined as the period with the maximum or minimum mean light level. Note that the data need to be regularly spaced (i.e., no gaps) for correct results.
Usage
bright_dark_period(
Light.vector,
Time.vector,
period = c("brightest", "darkest"),
timespan = "10 hours",
epoch = "dominant.epoch",
loop = FALSE,
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
period |
String indicating the type of period to look for. Can be either "brightest" (the default) or "darkest". |
timespan |
The timespan across which to calculate. Can be either a duration or a duration string, e.g., "1 day" or "10 sec". Defaults to "10 hours". |
epoch |
The epoch at which the data was sampled. Can be either a
duration or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid duration string, e.g., "1 day" or "10 sec". |
loop |
Logical. Should the data be looped? If TRUE, a full copy of the data will be concatenated at the end of the data. Makes only sense for 24 h data. Defaults to FALSE. |
na.rm |
Logical. Should missing values be removed for the calculation? Defaults to FALSE. |
as.df |
Logical. Should the output be returned as a data frame? Defaults to FALSE. |
Details
Assumes regular 24h light data. Otherwise, results may not be meaningful. Looping the data is recommended for finding the darkest period.
Value
A named list with the mean, onset, midpoint, and offset of the calculated brightest or darkest period, or, if as.df == TRUE, a data frame with columns named {period}_{timespan}_{metric}. The output type corresponds to the type of Time.vector, e.g., if Time.vector is HMS, the timing metrics will also be HMS, and vice versa for POSIXct.
References
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
See Also
Other metrics: centroidLE(), disparity_index(), dose(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
Examples
# Dataset with light > 250lx between 06:00 and 18:00
dataset1 <-
tibble::tibble(
Id = rep("A", 24),
Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23),
MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
dataset1 %>%
dplyr::reframe(bright_dark_period(MEDI, Datetime, "brightest", "10 hours",
as.df = TRUE))
dataset1 %>%
dplyr::reframe(bright_dark_period(MEDI, Datetime, "darkest", "7 hours",
loop = TRUE, as.df = TRUE))
# Dataset with duration as Time.vector
dataset2 <-
tibble::tibble(
Id = rep("A", 24),
Datetime = lubridate::dhours(0:23),
MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
dataset2 %>%
dplyr::reframe(bright_dark_period(MEDI, Datetime, "brightest", "10 hours",
as.df = TRUE))
dataset2 %>%
dplyr::reframe(bright_dark_period(MEDI, Datetime, "darkest", "5 hours",
loop = TRUE, as.df = TRUE))
Centroid of light exposure
Description
This function calculates the centroid of light exposure as the mean of the time vector weighted in proportion to the corresponding binned light intensity.
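Conceptually, the centroid is a weighted mean of time, with light as the weight. A minimal sketch with base R and hypothetical values (not the package implementation):
#24 hourly measurements, bright between 06:00 and 18:00
hour <- 0:23
light <- c(rep(1, 6), rep(250, 13), rep(1, 5))
weighted.mean(hour, w = light) #close to 12, pulled toward the bright hours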
Usage
centroidLE(
Light.vector,
Time.vector,
bin.size = NULL,
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
bin.size |
Value specifying size of bins to average the light data over. Must be either a duration or a duration string, e.g., "1 day" or "10 sec". If nothing is provided, no binning will be performed. |
na.rm |
Logical. Should missing values be removed for the calculation? Defaults to FALSE. |
as.df |
Logical. Should the output be returned as a data frame? If TRUE, a data frame with a single column will be returned. Defaults to FALSE. |
Value
Single column data frame or vector.
References
Phillips, A. J. K., Clerx, W. M., O’Brien, C. S., Sano, A., Barger, L. K., Picard, R. W., Lockley, S. W., Klerman, E. B., & Czeisler, C. A. (2017). Irregular sleep/wake patterns are associated with poorer academic performance and delayed circadian and sleep/wake timing. Scientific Reports, 7(1), 3216. doi:10.1038/s41598-017-03171-4
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
See Also
Other metrics: bright_dark_period(), disparity_index(), dose(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
Examples
# Dataset with POSIXct time vector
dataset1 <-
tibble::tibble(
Id = rep("A", 24),
Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23),
MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
dataset1 %>%
dplyr::reframe(
"Centroid of light exposure" = centroidLE(MEDI, Datetime, "2 hours")
)
# Dataset with hms time vector
dataset2 <-
tibble::tibble(
Id = rep("A", 24),
Time = hms::as_hms(lubridate::as_datetime(0) + lubridate::hours(0:23)),
MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
dataset2 %>%
dplyr::reframe(
"Centroid of light exposure" = centroidLE(MEDI, Time, "2 hours")
)
# Dataset with duration time vector
dataset3 <-
tibble::tibble(
Id = rep("A", 24),
Hour = lubridate::duration(0:23, "hours"),
MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
dataset3 %>%
dplyr::reframe(
"Centroid of light exposure" = centroidLE(MEDI, Hour, "2 hours")
)
Counts the Time differences (epochs) per group (in a grouped dataset)
Description
Counts the Time differences (epochs) per group (in a grouped dataset)
Usage
count_difftime(dataset, Datetime.colname = Datetime)
Arguments
dataset |
A light logger dataset. Expects a dataframe. |
Datetime.colname |
column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol. |
Value
a tibble with the number of occurrences of each time difference per group.
Examples
#count_difftime returns the number of occurrences of each time difference
#and is more comprehensive in terms of a summary than `gap_finder` or
#`dominant_epoch`
count_difftime(sample.data.irregular)
dominant_epoch(sample.data.irregular)
gap_finder(sample.data.irregular)
#irregular data can be regularized with `aggregate_Datetime`
sample.data.irregular |>
aggregate_Datetime(unit = "15 secs") |>
count_difftime()
create_Timedata
Description
create_Timedata
Usage
create_Timedata(...)
Arguments
... |
Input arguments to add_Time_col(). |
Value
a data.frame object identical to dataset but with the added column of Time-of-Day data, or a vector with the Time-of-Day data.
Examples
sample.data.environment %>% create_Timedata()
Create Datetime bins for visualization and calculation
Description
cut_Datetime is a wrapper around lubridate::round_date() (and friends) combined with dplyr::mutate(), to create a new column in a light logger dataset with a specified binsize. This can be "3 hours", "15 secs", or "0.5 days". It is a useful step between a dataset and a visualization or summary step.
Usage
cut_Datetime(
dataset,
unit = "3 hours",
type = c("round", "floor", "ceiling"),
Datetime.colname = Datetime,
New.colname = Datetime.rounded,
group_by = FALSE,
...
)
Arguments
dataset |
A light logger dataset. Expects a dataframe. |
unit |
Unit of binning. See lubridate::round_date() for examples. Defaults to "3 hours". |
type |
One of "round" (the default), "floor", or "ceiling", deciding how datetimes are assigned to the unit. |
Datetime.colname |
column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol. |
New.colname |
Column name for the added column in the dataset. Defaults to Datetime.rounded. Expects a symbol. |
group_by |
Should the data be grouped by the new column? Defaults to FALSE. |
... |
Parameter handed over to lubridate::round_date() and siblings. |
Value
a data.frame object identical to dataset but with the added column of binned datetimes.
Examples
#compare Datetime and Datetime.rounded
sample.data.environment %>%
cut_Datetime() %>%
dplyr::slice_sample(n = 5)
Create reference data from other data
Description
Create reference data from almost any other data that has a datetime column and a data column. The reference data can even be created from subsets of the same data. Examples are that one participant can be used as a reference for all other participants, or that the first (second, ...) day of every participant's data is the reference for any other day. This function needs to be handled carefully when the reference data time intervals are shorter than the data time intervals. In that case, use aggregate_Datetime() on the reference data beforehand to lengthen the interval.
Usage
data2reference(
dataset,
Reference.data = dataset,
Datetime.column = Datetime,
Data.column = MEDI,
Id.column = Id,
Reference.column = Reference,
overwrite = FALSE,
filter.expression.reference = NULL,
across.id = FALSE,
shift.start = FALSE,
length.restriction.seconds = 60,
shift.intervals = "auto",
Reference.label = NULL
)
Arguments
dataset |
A light logger dataset |
Reference.data |
The data that should be used as reference. By default the dataset itself will be used. |
Datetime.column |
Datetime column of the dataset and Reference.data. Defaults to Datetime. |
Data.column |
Data column in the dataset and Reference.data. Defaults to MEDI. |
Id.column |
Name of the Id column in both datasets. Defaults to Id. |
Reference.column |
Name of the reference column that will be added to the dataset. Defaults to Reference. Must not already be part of the dataset unless overwrite = TRUE. |
overwrite |
If TRUE (defaults to FALSE), the function will overwrite an existing Reference.column. |
filter.expression.reference |
Expression that is used to filter the Reference.data before it is applied. Defaults to NULL. |
across.id |
Grouping variables that should be ignored when creating the reference data. Default is FALSE. Can be TRUE (ignore all grouping variables) or a vector of grouping variable names to ignore. |
shift.start |
If TRUE, the reference data is shifted to the start of the respective group. Defaults to FALSE. |
length.restriction.seconds |
Restricts the application of reference data to a maximum length in seconds. Default is 60 seconds. |
shift.intervals |
Time shift in seconds that is applied to every data point in the reference data. Default is "auto", which determines the shift automatically from the data. |
Reference.label |
Label that is added to the reference data. If NULL (the default), no label is added. |
Details
To use subsets of data, use the filter.expression.reference argument to specify the subsets of data. The across.id argument specifies whether the reference data should be used across all or some grouping variables (e.g., across participants). The shift.start argument enables a shift of the reference data start time to the start of the respective group. See the examples for more information. The filter expression is evaluated within dplyr::filter().
Value
A dataset with a new column Reference that contains the reference data.
Examples
library(dplyr)
library(lubridate)
library(ggplot2)
gg_reference <- function(dataset) {
dataset %>%
ggplot(aes(x = Datetime, y = MEDI, color = Id)) +
geom_line(linewidth = 1) +
geom_line(aes(y = Reference), color = "black", linewidth = 0.25, linetype = "dashed") +
theme_minimal() + facet_wrap(~ Id, scales = "free_y")
}
#in this example, each data point is its own reference
sample.data.environment %>%
data2reference() %>%
gg_reference()
#in this example, the first day of each ID is the reference for the other days
#this requires grouping of the Data by Day, which is then specified in across.id
#also, shift.start needs to be set to TRUE, to shift the reference data to the
#start of the groupings
sample.data.environment %>% group_by(Id, Day = as_date(Datetime)) %>%
data2reference(
filter.expression.reference = as_date(Datetime) == min(as_date(Datetime)),
shift.start = TRUE,
across.id = "Day") %>%
gg_reference()
#in this example, the Environment Data will be used as a reference
sample.data.environment %>%
data2reference(
filter.expression.reference = Id == "Environment",
across.id = TRUE) %>%
gg_reference()
Disparity index
Description
This function calculates the continuous disparity index as described in Fernández-Martínez et al. (2018).
Usage
disparity_index(Light.vector, na.rm = FALSE, as.df = FALSE)
Arguments
Light.vector |
Numeric vector containing the light data. |
na.rm |
Logical. Should missing values be removed? Defaults to FALSE. |
as.df |
Logical. Should the output be returned as a data frame? If TRUE, a data frame with a single column will be returned. Defaults to FALSE. |
Value
Single column data frame or vector.
References
Fernández-Martínez, M., Vicca, S., Janssens, I. A., Carnicer, J., Martín-Vide, J., & Peñuelas, J. (2018). The consecutive disparity index, D: A measure of temporal variability in ecological studies. Ecosphere, 9(12), e02527. doi:10.1002/ecs2.2527
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
See Also
Other metrics: bright_dark_period(), centroidLE(), dose(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
Examples
dataset1 <-
tibble::tibble(
Id = rep("A", 24),
Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23),
MEDI = sample(0:1000, 24),
)
dataset1 %>%
dplyr::reframe(
"Disparity index" = disparity_index(MEDI)
)
Determine the dominant epoch/interval of a dataset
Description
Calculate the dominant epoch/interval of a dataset. The dominant epoch/interval is the epoch/interval that is most frequent in the dataset. The calculation is done per group, so that you might get multiple variables. If two or more epochs/intervals are equally frequent, the first one (shortest one) is chosen.
Usage
dominant_epoch(dataset, Datetime.colname = Datetime)
Arguments
dataset |
A light logger dataset. Needs to be a dataframe. |
Datetime.colname |
The column that contains the datetime. Needs to be a POSIXct and part of the dataset. |
Value
A tibble with one row per group and a column with the dominant.epoch as a lubridate::duration(). Also a column with the group.indices, which is helpful for referencing the dominant.epoch across dataframes of equal grouping.
See Also
Other regularize: extract_gaps(), gap_finder(), gap_handler(), gapless_Datetimes(), has_gaps(), has_irregulars()
Examples
dataset <-
tibble::tibble(Id = c("A", "A", "A", "B", "B", "B"),
Datetime = lubridate::as_datetime(1) +
lubridate::days(c(0:2, 4, 6, 8)))
dataset
#get the dominant epoch by group
dataset %>%
dplyr::group_by(Id) %>%
dominant_epoch()
#get the dominant epoch of the whole dataset
dataset %>%
dominant_epoch()
Calculate the dose (value·hours)
Description
This function calculates the dose from a time series. For light, this is equal to the actual definition of light exposure (CIE term luminous exposure). Output will always be provided in value·hours (e.g., for light, lx·hours).
Usage
dose(
Light.vector,
Time.vector,
epoch = "dominant.epoch",
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
epoch |
The epoch at which the data was sampled. Can be either a
duration or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid duration string, e.g., "1 day" or "10 sec". |
na.rm |
Logical. Should missing values (NA) be removed for the calculation? Defaults to FALSE. |
as.df |
Logical. Should a data frame be returned? If TRUE, a data frame with a single column will be returned. Defaults to FALSE. |
Details
The time series does not have to be regular; however, it will be aggregated to a regular timeseries of the given epoch. Implicit gaps (i.e., no observations) will be converted to NA values (which can be ignored with na.rm = TRUE).
Value
A numeric object as single value, or single column data frame with the dose in value·hours
References
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
See Also
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
Examples
dose(c(1,1,1,1), lubridate::dhours(c(1:4)), na.rm = TRUE)
#with gaps
dose(c(1,1,1), lubridate::dhours(c(1,3:4)), na.rm = TRUE)
#gaps can be aggregated to a coarser interval, which can be sensible
#if they are still representative
dose(c(1,1,1), lubridate::dhours(c(1,3:4)), na.rm = TRUE, epoch = "2 hours")
Handle jumps in Daylight Savings (DST) that are missing in the data
Description
When data is imported through LightLogR and a timezone applied, it is assumed that the timestamps are correct - which is the case, e.g., if timestamps are stored in UTC, or if they are in local time. Some, if not most, measurement devices are set to local time before a recording interval starts. If a daylight savings jump happens during the recording (in either direction), the device might not adjust timestamps for this change. This results in an unwanted shift in the data, starting at the time of the DST jump and likely continuing until the end of a file. dst_change_handler is used to detect such jumps within a group and apply the correct shift in the data (i.e., the shift that should have been applied by the device).
Important: this function is only useful if the time stamp in the raw data deviates from the actual date-time. Note also that this function results in a gap during the DST jump, which should be handled by gap_handler() afterwards. It will also potentially result in doubled timestamps during the jump back from DST to standard time. This will cause inconsistencies with some functions, so we recommend using aggregate_Datetime() afterwards with a unit equal to the dominant epoch. Finally, the function is not equipped to handle more than one jump per group. The jump is based on whether the group starts out with DST or not. The function will remove datetime rows with NA values.
Usage
dst_change_handler(
dataset,
Datetime.colname = Datetime,
filename.colname = NULL
)
Arguments
dataset |
dataset to be summarized, must be a dataframe. |
Datetime.colname |
name of the column that contains the Datetime data, expects a symbol. |
filename.colname |
(optional) column name that contains the filename. If provided, it will use this column as a temporary grouping variable additionally to the dataset grouping. |
Details
The detection of a DST jump is based on the function lubridate::dst()
and jumps are only applied within a group. During import, this function is used if dst_adjustment = TRUE
is set and includes by default the filename as the grouping variable, additionally to Id
.
Value
A tibble with the same columns as the input dataset, but shifted.
See Also
Other DST: dst_change_summary()
Examples
#create some data that crosses a DST jump
data <-
tibble::tibble(
Datetime = seq.POSIXt(from = as.POSIXct("2023-03-26 01:30:00", tz = "Europe/Berlin"),
to = as.POSIXct("2023-03-26 03:00:00", tz = "Europe/Berlin"),
by = "30 mins"),
Value = 1)
#as can be seen next, there is a gap in the data - this is necessary when
#using a timezone with DST.
data$Datetime
#Let us say now, that the device did not adjust for the DST - thus the 03:00
#timestamp is actually 04:00 in local time. This can be corrected for by:
data %>% dst_change_handler() %>% .$Datetime
Get a summary of groups where a daylight saving time change occurs.
Description
Get a summary of groups where a daylight saving time change occurs.
Usage
dst_change_summary(dataset, Datetime.colname = Datetime)
Arguments
dataset |
dataset to be summarized, must be a dataframe. |
Datetime.colname |
name of the column that contains the Datetime data, expects a symbol. |
Value
a tibble with the groups where a DST change occurs. The column dst_start is a boolean that indicates whether the start of this group occurs during daylight savings.
See Also
Other DST: dst_change_handler()
Examples
sample.data.environment %>%
dplyr::mutate(Datetime =
Datetime + lubridate::dweeks(8)) %>%
dst_change_summary()
Duration above/below threshold or within threshold range
Description
This function calculates the duration spent above/below a specified threshold light level or within a specified range of light levels.
Usage
duration_above_threshold(
Light.vector,
Time.vector,
comparison = c("above", "below"),
threshold,
epoch = "dominant.epoch",
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
comparison |
String specifying whether the time above or below threshold should be calculated. Can be either "above" (the default) or "below". If a vector with two values is provided for threshold, this argument will be ignored. |
threshold |
Single numeric value or two numeric values specifying the threshold light level(s) to compare with. If a vector with two values is provided, the time within the two thresholds will be calculated. |
epoch |
The epoch at which the data was sampled. Can be either a
duration or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid duration string, e.g., "1 day" or "10 sec". |
na.rm |
Logical. Should missing values (NA) be removed for the calculation? Defaults to FALSE. |
as.df |
Logical. Should a data frame be returned? If TRUE, a data frame with a single column will be returned. Defaults to FALSE. |
Value
A duration object as single value, or single column data frame.
References
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
See Also
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), dose(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
Examples
N <- 60
# Dataset with epoch = 1min
dataset1 <-
tibble::tibble(
Id = rep("A", N),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(1:N),
MEDI = sample(c(sample(1:249, N / 2), sample(250:1000, N / 2))),
)
# Dataset with epoch = 30s
dataset2 <-
tibble::tibble(
Id = rep("B", N),
Datetime = lubridate::as_datetime(0) + lubridate::seconds(seq(30, N * 30, 30)),
MEDI = sample(c(sample(1:249, N / 2), sample(250:1000, N / 2))),
)
dataset.combined <- rbind(dataset1, dataset2)
dataset1 %>%
dplyr::reframe("TAT >250lx" = duration_above_threshold(MEDI, Datetime, threshold = 250))
dataset1 %>%
dplyr::reframe(duration_above_threshold(MEDI, Datetime, threshold = 250, as.df = TRUE))
# Group by Id to account for different epochs
dataset.combined %>%
dplyr::group_by(Id) %>%
dplyr::reframe("TAT >250lx" = duration_above_threshold(MEDI, Datetime, threshold = 250))
Calculate duration of data in each group
Description
This function calculates the total duration of data in each group of a dataset, based on a datetime column and a variable column. It uses the dominant epoch (interval) of each group to calculate the duration.
Usage
durations(
dataset,
Variable.colname = Datetime,
Datetime.colname = Datetime,
count.NA = FALSE,
show.missing = FALSE,
show.interval = FALSE,
FALSE.as.NA = FALSE
)
Arguments
dataset |
A light logger dataset. Expects a dataframe. If not imported by LightLogR, take care to choose sensible variables for the Datetime.colname and Variable.colname. |
Variable.colname |
Column name that contains the variable for which to calculate the duration. Expects a symbol. Needs to be part of the dataset. |
Datetime.colname |
Column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol. Needs to be part of the dataset. Must be of type POSIXct. |
count.NA |
Logical. Should NA values in Variable.colname be counted as part of the duration? Defaults to FALSE. |
show.missing |
Logical. Should the duration of NAs be provided in a separate column "Missing"? Defaults to FALSE. |
show.interval |
Logical. Should the dominant epoch (interval) be shown in a column "interval"? Defaults to FALSE. |
FALSE.as.NA |
Logical. Should FALSE values in the Variable.colname be treated as NA (i.e., missing)? |
Value
A tibble with one row per group and a column "duration" containing the duration of each group as a lubridate::duration(). If show.missing = TRUE, a column "missing" is added with the duration of NAs, and a column "total" with the total duration. If show.interval = TRUE, a column "interval" is added with the dominant epoch of each group.
Examples
# Calculate the duration of a dataset
durations(sample.data.environment)
# create artificial gaps in the data
gapped_data <-
sample.data.environment |>
dplyr::filter(MEDI >= 10) |>
gap_handler(full.days = TRUE)
#by default, the Datetime column is selected for the `Variable.colname`,
#basically ignoring NA measurement values
gapped_data |>
durations(count.NA = TRUE)
# Calculate the duration where MEDI are available
durations(gapped_data, MEDI)
# Calculate the duration, show the duration of NAs separately
durations(gapped_data, MEDI, show.missing = TRUE)
# Calculate the duration, show the dominant epoch
durations(gapped_data, Variable.colname = MEDI, show.interval = TRUE)
# Calculate durations for day and night separately
gapped_data |>
add_photoperiod(coordinates = c(48.52, 9.06)) |>
dplyr::group_by(photoperiod.state, .add = TRUE) |>
durations(Variable.colname = MEDI, show.interval = TRUE, show.missing = TRUE)
Exponential moving average filter (EMA)
Description
This function smoothes the data using an exponential moving average filter with a specified decay half-life.
Usage
exponential_moving_average(
Light.vector,
Time.vector,
decay = "90 min",
epoch = "dominant.epoch"
)
Arguments
Light.vector |
Numeric vector containing the light data. Missing values are replaced by 0. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
decay |
The decay half-life controlling the exponential smoothing. Can be either a duration or a string. If it is a string, it needs to be a valid duration string, e.g., "1 day" or "10 sec". Defaults to "90 min". |
epoch |
The epoch at which the data was sampled. Can be either a
duration or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid duration string, e.g., "1 day" or "10 sec". |
Details
The timeseries is assumed to be regular. Missing values in the light data will be replaced by 0.
Value
A numeric vector containing the smoothed light data. The output has the same length as Light.vector.
References
Price, L. L. A. (2014). On the Role of Exponential Smoothing in Circadian Dosimetry. Photochemistry and Photobiology, 90(5), 1184-1192. doi:10.1111/php.12282
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
See Also
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), dose(), duration_above_threshold(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
Examples
sample.data.environment.EMA = sample.data.environment %>%
dplyr::filter(Id == "Participant") %>%
filter_Datetime(length = lubridate::days(2)) %>%
dplyr::mutate(MEDI.EMA = exponential_moving_average(MEDI, Datetime))
# Plot to compare results
sample.data.environment.EMA %>%
ggplot2::ggplot(ggplot2::aes(x = Datetime)) +
ggplot2::geom_line(ggplot2::aes(y = MEDI), colour = "black") +
ggplot2::geom_line(ggplot2::aes(y = MEDI.EMA), colour = "red")
Find and extract clusters from a dataset
Description
extract_clusters() searches for and summarizes clusters where data meets a certain condition. Clusters have a specified duration and can be interrupted while still counting as one cluster. The variable can either be a column in the dataset or an expression that gets evaluated in a dplyr::mutate() call.
Cluster start and end times are shifted by half of the epoch each. E.g., a state lasting for four measurement points will have a duration of four measurement intervals, and a state occurring only once, of one interval. This deviates from simply using the time difference between the first and last occurrence, which would be one epoch shorter (e.g., the start and end points of a state lasting a single point are identical, i.e., zero duration).
Groups will not be dropped, meaning that summaries based on the clusters will account for groups without clusters.
For correct cluster identification, there can be no gaps in the data!
Gaps can inadvertently be introduced to a gapless dataset through grouping. E.g., when grouping by photoperiod (day/night) within a participant, this introduces gaps between the individual days and nights that together form the group. To avoid this, either group by individual days and nights (e.g., by using number_states() before grouping), which will make sure a cluster cannot extend beyond any grouping, or set handle.gaps = TRUE (at computational cost), as sketched below.
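A minimal sketch of the handle.gaps guard, assuming grouping may have introduced gaps:
sample.data.environment |>
  extract_clusters(MEDI > 250, handle.gaps = TRUE) |>
  head(3)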
add_clusters() identifies clusters and adds them back into the dataset through a rolling join. This is a convenience function built on extract_clusters().
Usage
extract_clusters(
data,
Variable,
Datetime.colname = Datetime,
cluster.duration = "30 mins",
duration.type = c("min", "max"),
interruption.duration = 0,
interruption.type = c("max", "min"),
cluster.colname = state.count,
return.only.clusters = TRUE,
drop.empty.groups = TRUE,
handle.gaps = FALSE,
add.label = FALSE
)
add_clusters(
data,
Variable,
Datetime.colname = Datetime,
cluster.duration = "30 mins",
duration.type = c("min", "max"),
interruption.duration = 0,
interruption.type = c("max", "min"),
cluster.colname = state,
handle.gaps = FALSE
)
Arguments
data |
A light logger dataset. Expects a dataframe. |
Variable |
The variable or condition to be evaluated for clustering. Can be a column name or an expression. |
Datetime.colname |
Column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol. |
cluster.duration |
The minimum or maximum duration of a cluster. Defaults to 30 minutes. Expects a lubridate duration object (or a numeric in seconds). |
duration.type |
Type of the duration requirement for clusters. Either "min" (minimum duration) or "max" (maximum duration). Defaults to "min". |
interruption.duration |
The duration of allowed interruptions within a cluster. Defaults to 0 (no interruptions allowed). |
interruption.type |
Type of the interruption duration. Either "max" (maximum interruption) or "min" (minimum interruption). Defaults to "max". |
cluster.colname |
Name of the column to use for the cluster identification. Defaults to "state.count". Expects a symbol. |
return.only.clusters |
Whether to return only the identified clusters (TRUE) or also include non-clusters (FALSE). Defaults to TRUE. |
drop.empty.groups |
Logical. Should empty groups be dropped? Only works
if |
handle.gaps |
Logical whether the data shall be treated with gap_handler() first. Defaults to FALSE. |
add.label |
Logical. Option to add a label to the output containing the condition, e.g., "MEDI > 250". Defaults to FALSE. |
Value
For extract_clusters() a dataframe containing the identified clusters or all time periods, depending on return.only.clusters. For add_clusters() a dataframe containing the original data with an additional column for cluster identification.
Examples
dataset <-
sample.data.environment |>
dplyr::filter(Id == "Participant") |>
filter_Date(length = "1 day")
# Extract clusters with minimum duration of 1 hour and interruptions of up to 5 minutes
dataset |>
extract_clusters(
MEDI > 250,
cluster.duration = "1 hour",
interruption.duration = "5 mins"
)
# Add clusters to a dataset where lux values are above 20 for at least 30 minutes
dataset_with_clusters <-
dataset %>% add_clusters(MEDI > 20)
#peak into the dataset
dataset_with_clusters[4500:4505,]
Extract gap episodes from the data
Description
Finds and extracts gap episodes from a dataset. If no variable is provided, it will look for implicit gaps (gaps in the regular interval); if a variable is provided, it will look for implicit and explicit gaps (NA in the variable).
Usage
extract_gaps(
dataset,
Variable.colname = NULL,
Datetime.colname = Datetime,
epoch = "dominant.epoch",
full.days = TRUE,
include.implicit.gaps = TRUE
)
Arguments
dataset |
A light logger dataset. Needs to be a dataframe. |
Variable.colname |
Column name of the variable to check for NA values. Expects a symbol or NULL (only implicit gaps). |
Datetime.colname |
The column that contains the datetime. Needs to be a
|
epoch |
The epoch to use for the gapless sequence. Can be either a
|
full.days |
If |
include.implicit.gaps |
Logical. Whether to expand the datetime sequence
and search for implicit gaps, or not. Default is |
Value
A dataframe containing gap times per grouping variable
See Also
Other regularize:
dominant_epoch()
,
gap_finder()
,
gap_handler()
,
gapless_Datetimes()
,
has_gaps()
,
has_irregulars()
Examples
#removing some data to create gaps
sample.data.environment |>
dplyr::filter(MEDI <= 50000) |>
extract_gaps() |> head()
#not searching for implicit gaps
sample.data.environment |>
dplyr::filter(MEDI <= 50000) |>
extract_gaps(MEDI, include.implicit.gaps = FALSE)
#making implicit gaps explicit changes the summary
sample.data.environment |>
dplyr::filter(MEDI <= 50000) |>
gap_handler()|>
extract_gaps(MEDI, include.implicit.gaps = FALSE) |> head()
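#extracted gap episodes can be summarized further with standard dplyr verbs.
#A short sketch, assuming the extract carries a duration column (as state
#extracts do):
sample.data.environment |>
  dplyr::filter(MEDI <= 50000) |>
  extract_gaps() |>
  dplyr::summarize(total.gap.duration = sum(duration))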
Add metrics to extracted summary
Description
This helper function adds metric values to an extract, like from
extract_states()
or extract_clusters()
. E.g., the average value of a
variable during a cluster or state instance might be of interest. The metrics
must be specified by the user using the ...
argument.
Usage
extract_metric(
extracted_data,
data,
identifying.colname = state.count,
Datetime.colname = Datetime,
...
)
Arguments
extracted_data |
A dataframe containing cluster or state summaries,
typically from |
data |
The original dataset that produced |
identifying.colname |
Name of the column in |
Datetime.colname |
Column name that contains the datetime in |
... |
Arguments specifying the metrics to add to the summary. For example:
|
Details
The original data
does not have to have the cluster/state information, but
it will be computationally faster if it does.
Value
A dataframe containing the extracted data with added metrics.
Examples
# Extract clusters and add mean MEDI value
sample.data.environment |>
filter_Date(length = "2 days") |>
extract_clusters(MEDI > 1000) |>
extract_metric(
sample.data.environment,
"mean_medi" = mean(MEDI, na.rm = TRUE)
) |>
dplyr::select(Id, state.count, duration, mean_medi)
# Extract states and add mean MEDI value
dataset <-
sample.data.environment |>
filter_Date(length = "2 days") |>
add_photoperiod(c(48.5, 9))
dataset |>
extract_states(photoperiod.state) |>
extract_metric(dataset, mean_lux = mean(MEDI)) |>
dplyr::select(state.count, duration, mean_lux)
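#several metrics can be supplied in a single call through the ... argument.
#A brief sketch (the metric names are arbitrary):
sample.data.environment |>
  filter_Date(length = "2 days") |>
  extract_clusters(MEDI > 1000) |>
  extract_metric(
    sample.data.environment,
    mean_medi = mean(MEDI, na.rm = TRUE),
    max_medi = max(MEDI, na.rm = TRUE)
  ) |>
  dplyr::select(Id, state.count, duration, mean_medi, max_medi)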
Extract summaries of states
Description
Extracts states from a dataset and provides their start and end times, as well as duration and epoch. The state does not have to exist in the dataset, but can be dynamically created. Extracted states can have group-dropping disabled, meaning that summaries based on the extracted states show empty groups as well.
Usage
extract_states(
data,
State.colname,
State.expression = NULL,
Datetime.colname = Datetime,
handle.gaps = FALSE,
epoch = "dominant.epoch",
drop.empty.groups = TRUE,
group.by.state = TRUE
)
Arguments
data |
A light logger dataset. Expects a dataframe. |
State.colname |
The variable or condition to be evaluated for state
extraction. Expects a symbol. If it is not part of the data, a
|
State.expression |
If |
Datetime.colname |
Column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol. |
handle.gaps |
Logical whether the data shall be treated with
|
epoch |
The epoch to use for the gapless sequence. Can be either a
|
drop.empty.groups |
Logical. Should empty groups be dropped? Only works
if |
group.by.state |
Logical. Should the output automatically be grouped by the new state? |
Value
A dataframe with one row per state instance. Each row contains the original dataset grouping, the state column, a state.count column, start and end Datetimes, and the duration of the state.
Examples
#summarizing states "photoperiod"
states <-
sample.data.environment |>
add_photoperiod(c(48.52, 9.06)) |>
extract_states(photoperiod.state)
states |> head(2)
states |> tail(2)
states |> summarize_numeric(c("state.count", "epoch"))
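#because the state does not have to exist in the dataset, it can also be
#created dynamically via State.expression. A minimal sketch, with `bright`
#as an arbitrary name for the new state:
sample.data.environment |>
  extract_states(bright, MEDI > 1000) |>
  head(2)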
Filter Datetimes in a dataset.
Description
Filtering a dataset based on Dates or Datetimes may often be necessary prior
to calculation or visualization. The functions allow for filtering based on
simple strings or Datetime scalars, or by specifying a length. They also
support prior dplyr grouping, which is useful, e.g., when you only want to
filter the first two days of measurement data for every participant,
regardless of the actual date. If you want to filter based on times of the
day, see filter_Time().
Usage
filter_Datetime(
dataset,
Datetime.colname = Datetime,
start = NULL,
end = NULL,
length = NULL,
length_from_start = TRUE,
full.day = FALSE,
tz = NULL,
only_Id = NULL,
filter.expr = NULL
)
filter_Date(..., start = NULL, end = NULL)
Arguments
dataset |
A light logger dataset. Expects a |
Datetime.colname |
column name that contains the datetime. Defaults to
|
start , end |
For
|
length |
Either a Period or Duration from lubridate. E.g., |
length_from_start |
A |
full.day |
A |
tz |
Timezone of the start/end times. If |
only_Id |
An expression of |
filter.expr |
Advanced filtering conditions. If not |
... |
Parameter handed over to |
Value
a data.frame
object identical to dataset
but with only the
specified Dates/Times.
See Also
Other filter:
filter_Time()
Examples
library(lubridate)
library(dplyr)
#baseline
range.unfiltered <- sample.data.environment$Datetime %>% range()
range.unfiltered
#setting the start of a dataset
sample.data.environment %>%
filter_Datetime(start = "2023-08-31 12:00:00") %>%
pull(Datetime) %>%
range()
#setting the end of a dataset
sample.data.environment %>%
filter_Datetime(end = "2023-08-31 12:00:00") %>% pull(Datetime) %>% range()
#setting a period of a dataset
sample.data.environment %>%
filter_Datetime(end = "2023-08-31 12:00:00", length = days(2)) %>%
pull(Datetime) %>% range()
#setting only the period of a dataset
sample.data.environment %>%
filter_Datetime(length = days(2)) %>%
pull(Datetime) %>% range()
#advanced filtering based on grouping (second day of each group)
sample.data.environment %>%
#shift the "Environment" group by one day
mutate(
Datetime = ifelse(Id == "Environment", Datetime + ddays(1), Datetime) %>%
as_datetime()) -> sample
sample %>% summarize(Daterange = paste(min(Datetime), max(Datetime), sep = " - "))
#now we can use the `filter.expr` argument to filter from the second day of each group
sample %>%
filter_Datetime(filter.expr = Datetime > Datetime[1] + days(1)) %>%
summarize(Daterange = paste(min(Datetime), max(Datetime), sep = " - "))
sample.data.environment %>% filter_Date(end = "2023-08-31")
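#the only_Id argument restricts the filter to matching Ids and leaves all
#other groups untouched. A brief sketch:
sample.data.environment %>%
  filter_Datetime(start = "2023-09-01 12:00:00", only_Id = Id == "Participant") %>%
  summarize(Daterange = paste(min(Datetime), max(Datetime), sep = " - "))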
Filter multiple times based on a list of arguments.
Description
filter_Datetime_multiple()
is a wrapper around filter_Datetime()
or
filter_Date()
that allows the cumulative filtering of Datetimes
based on
varying filter conditions. It is most useful in conjunction with the
only_Id
argument, e.g., to selectively cut off dates depending on
participants (see examples)
Usage
filter_Datetime_multiple(
dataset,
arguments,
filter_function = filter_Datetime,
...
)
Arguments
dataset |
A light logger dataset |
arguments |
A list of arguments to be passed to |
filter_function |
The function to be used for filtering, either
|
... |
Additional arguments passed to the filter function. If the
|
Value
A dataframe with the filtered data
Examples
arguments <- list(
list(start = "2023-08-31", only_Id = quote(Id == "Participant")),
list(end = "2023-08-31", only_Id = quote(Id == "Environment")))
#compare the unfiltered dataset
sample.data.environment %>% gg_overview(Id.colname = Id)
#compare the filtered dataset
sample.data.environment %>%
filter_Datetime_multiple(arguments = arguments, filter_Date) %>%
gg_overview(Id.colname = Id)
Filter Times in a dataset.
Description
Filter Times in a dataset.
Usage
filter_Time(
dataset,
Datetime.colname = Datetime,
start = NULL,
end = NULL,
length = NULL
)
Arguments
dataset |
A light logger dataset. Expects a |
Datetime.colname |
column name that contains the datetime. Defaults to
|
start , end , length |
a
|
Value
a data.frame
object identical to dataset
but with only the
specified Times.
See Also
Other filter:
filter_Datetime()
Examples
sample.data.environment %>%
filter_Time(start = "4:00:34", length = "12:00:00") %>%
dplyr::pull(Time) %>% range() %>% hms::as_hms()
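#start and end can also be combined to cut out a fixed window of the day.
#A minimal sketch:
sample.data.environment %>%
  filter_Time(start = "18:00:00", end = "23:00:00") %>%
  dplyr::pull(Time) %>% range() %>% hms::as_hms()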
Frequency of crossing light threshold
Description
This function calculates the number of times a given threshold light level is crossed.
Usage
frequency_crossing_threshold(
Light.vector,
threshold,
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector |
Numeric vector containing the light data. |
threshold |
Single numeric value specifying the threshold light level to compare with. |
na.rm |
Logical. Should missing light values be removed? Defaults to |
as.df |
Logical. Should the output be returned as a data frame? If |
Value
Data frame or matrix with pairs of threshold and calculated values.
References
Alvarez, A. A., & Wildsoet, C. F. (2013). Quantifying light exposure patterns in young adult students. Journal of Modern Optics, 60(14), 1200–1208. doi:10.1080/09500340.2013.845700
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
See Also
Other metrics:
bright_dark_period()
,
centroidLE()
,
disparity_index()
,
dose()
,
duration_above_threshold()
,
exponential_moving_average()
,
interdaily_stability()
,
intradaily_variability()
,
midpointCE()
,
nvRC()
,
nvRD()
,
nvRD_cumulative_response()
,
period_above_threshold()
,
pulses_above_threshold()
,
threshold_for_duration()
,
timing_above_threshold()
Examples
N = 60
dataset1 <-
tibble::tibble(
Id = rep("A", N),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(1:N),
MEDI = sample(c(sample(1:249, N / 2), sample(250:1000, N / 2))),
)
dataset1 %>%
dplyr::reframe("Frequency crossing 250lx" = frequency_crossing_threshold(MEDI, threshold = 250))
dataset1 %>%
dplyr::reframe(frequency_crossing_threshold(MEDI, threshold = 250, as.df = TRUE))
Gain / Gain-ratio tables to normalize counts
Description
A list of tables containing gain and gain-ratios to normalize counts across different sensor gains.
Usage
gain.ratio.tables
Format
gain.ratio.tables
A list containing two-column tibbles
- TSL2585
gain table for the ambient light sensor TSL2585
- Info
A named
character
vector specifying the version and date a sensor was added
Details
Utility: Some sensors provide raw counts and gain levels as part of their output. In some cases it is desirable to compare counts between sensors, e.g., to gauge daylight outside by comparing UV counts to photopic counts (a high ratio of UV/Pho indicates outside daylight), or to gauge daylight inside by comparing IR counts to photopic counts (a high ratio of IR/Pho with a low ratio of UV/Pho indicates daylight in the context of LED or fluorescent lighting).
Check for and output gaps in a dataset
Description
Quickly check for implicit missing Datetime
data. Outputs a message with a
short summary, and can optionally return the gaps as a tibble
. Uses
gap_handler()
internally.
Usage
gap_finder(
dataset,
Datetime.colname = Datetime,
epoch = "dominant.epoch",
gap.data = FALSE,
silent = FALSE,
full.days = FALSE
)
Arguments
dataset |
A light logger dataset. Needs to be a dataframe. |
Datetime.colname |
The column that contains the datetime. Needs to be a
|
epoch |
The epoch to use for the gapless sequence. Can be either a
|
gap.data |
Logical. If |
silent |
Logical. If |
full.days |
If |
Details
The gap_finder()
function is a wrapper around gap_handler()
with the
behavior
argument set to "gaps"
. The main difference is that
gap_finder()
returns a message with a short summary of the gaps in the
dataset, and that the tibble
with the gaps contains a column gap.id
that
indicates the gap number, which is useful to determine, e.g., the consecutive
number of gaps between measurement data.
Value
Prints message with a short summary of the gaps in the dataset. If
gap.data = TRUE
, returns a tibble
of the gaps in the dataset.
See Also
Other regularize:
dominant_epoch()
,
extract_gaps()
,
gap_handler()
,
gapless_Datetimes()
,
has_gaps()
,
has_irregulars()
Examples
dataset <-
tibble::tibble(Id = c("A", "A", "A", "B", "B", "B"),
Datetime = lubridate::as_datetime(1) +
lubridate::days(c(0:2, 4, 6, 8)) +
lubridate::hours(c(0,12,rep(0,4)))) %>%
dplyr::group_by(Id)
dataset
#look for gaps assuming the epoch is the dominant epoch of each group
gap_finder(dataset)
#return the gaps as a tibble
gap_finder(dataset, gap.data = TRUE)
#assuming the epoch is 1 day, we have different gaps, and the datapoint at noon is now `irregular`
gap_finder(dataset, epoch = "1 day")
Fill implicit gaps in a light logger dataset
Description
Datasets from light loggers often have implicit gaps. These gaps are implicit
in the sense that consecutive timestamps (Datetimes
) might not follow a
regular epoch/interval. This function fills these implicit gaps by creating a
gapless sequence of Datetimes
and joining it to the dataset. The gapless
sequence is determined by the minimum and maximum Datetime
in the dataset
(per group) and an epoch. The epoch can either be guessed from the dataset or
specified by the user. A sequence of gapless Datetimes
can be created with
the gapless_Datetimes()
function, whereas the dominant epoch in the data
can be checked with the dominant_epoch()
function. The behaviour
argument
specifies how the data is combined. By default, the data is joined with a
full join, which means that all rows from the gapless sequence are kept, even
if there is no matching row in the dataset.
Usage
gap_handler(
dataset,
Datetime.colname = Datetime,
epoch = "dominant.epoch",
behavior = c("full_sequence", "regulars", "irregulars", "gaps"),
full.days = FALSE
)
Arguments
dataset |
A light logger dataset. Needs to be a dataframe. |
Datetime.colname |
The column that contains the datetime. Needs to be a
|
epoch |
The epoch to use for the gapless sequence. Can be either a
|
behavior |
The behavior of the join of the |
full.days |
If |
Value
A modified tibble
similar to dataset
but with handling of implicit gaps, depending on the behavior
argument:
- "full_sequence" adds timestamps to the dataset that are missing based on a full sequence of Datetimes (i.e., the gapless sequence). The dataset is thus equal (no gaps) or greater in the number of rows than the input. One column is added: is.implicit indicates whether the row was added (TRUE) or not (FALSE). This helps differentiating measurement values from values that might be imputed later on.
- "regulars" keeps only rows from the gapless sequence that have a matching row in the dataset. This can be interpreted as a row-reduced dataset with only regular timestamps according to the epoch. In case of no gaps this tibble has the same number of rows as the input.
- "irregulars" keeps only rows from the dataset that do not follow the regular sequence of Datetimes according to the epoch. In case of no gaps this tibble has 0 rows.
- "gaps" returns a tibble of all implicit gaps in the dataset. In case of no gaps this tibble has 0 rows.
See Also
Other regularize:
dominant_epoch()
,
extract_gaps()
,
gap_finder()
,
gapless_Datetimes()
,
has_gaps()
,
has_irregulars()
Examples
dataset <-
tibble::tibble(Id = c("A", "A", "A", "B", "B", "B"),
Datetime = lubridate::as_datetime(1) +
lubridate::days(c(0:2, 4, 6, 8)) +
lubridate::hours(c(0,12,rep(0,4)))) %>%
dplyr::group_by(Id)
dataset
#assuming the epoch is 1 day, we can add implicit data to our dataset
dataset %>% gap_handler(epoch = "1 day")
#we can also check whether there are irregular Datetimes in our dataset
dataset %>% gap_handler(epoch = "1 day", behavior = "irregulars")
#to get to the gaps, we can use the "gaps" behavior
dataset %>% gap_handler(epoch = "1 day", behavior = "gaps")
#finally, we can also get just the regular Datetimes
dataset %>% gap_handler(epoch = "1 day", behavior = "regulars")
Tabular summary of data and gaps in all groups
Description
gap_table()
creates a gt::gt()
with one row per group, summarizing key
gap and gap-related information about the dataset. These include the
available data, total duration, number of gaps, missing implicit and explicit
data, and, optionally, irregular data.
Usage
gap_table(
dataset,
Variable.colname = MEDI,
Variable.label = "melanopic EDI",
title = "Summary of available and missing data",
Datetime.colname = Datetime,
epoch = "dominant.epoch",
full.days = TRUE,
include.implicit.gaps = TRUE,
check.irregular = TRUE,
get.df = FALSE
)
Arguments
dataset |
A light logger dataset. Needs to be a dataframe. |
Variable.colname |
Column name of the variable to check for NA values. Expects a symbol. |
Variable.label |
Clear name of the variable. Expects a string |
title |
Title string for the table |
Datetime.colname |
The column that contains the datetime. Needs to be a
|
epoch |
The epoch to use for the gapless sequence. Can be either a
|
full.days |
If |
include.implicit.gaps |
Logical. Whether to expand the datetime sequence
and search for implicit gaps, or not. Default is |
check.irregular |
Logical on whether to include irregular data in the summary, i.e. data points that do not fall on the regular sequence. |
get.df |
Logical whether the dataframe should be returned instead of a
|
Value
A gt table about data and gaps in the dataset
Examples
sample.data.environment |> dplyr::filter(MEDI <= 50000) |> gap_table()
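#with get.df = TRUE, the underlying summary dataframe is returned instead of
#a gt table, which is handy for further processing. A brief sketch:
sample.data.environment |>
  dplyr::filter(MEDI <= 50000) |>
  gap_table(get.df = TRUE)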
Create a gapless sequence of Datetimes
Description
Create a gapless sequence of Datetimes. The Datetimes are determined by the minimum and maximum Datetime in the dataset and an epoch. The epoch can either be guessed from the dataset or specified by the user.
Usage
gapless_Datetimes(
dataset,
Datetime.colname = Datetime,
epoch = "dominant.epoch",
full.days = FALSE
)
Arguments
dataset |
A light logger dataset. Needs to be a dataframe. |
Datetime.colname |
The column that contains the datetime. Needs to be a
|
epoch |
The epoch to use for the gapless sequence. Can be either a
|
full.days |
If |
Value
A tibble
with a gapless sequence of Datetime
as specified by
epoch
.
See Also
Other regularize:
dominant_epoch()
,
extract_gaps()
,
gap_finder()
,
gap_handler()
,
has_gaps()
,
has_irregulars()
Examples
dataset <-
tibble::tibble(Id = c("A", "A", "A", "B", "B", "B"),
Datetime = lubridate::as_datetime(1) +
lubridate::days(c(0:2, 4, 6, 8))) %>%
dplyr::group_by(Id)
dataset %>% gapless_Datetimes()
dataset %>% dplyr::ungroup() %>% gapless_Datetimes()
dataset %>% gapless_Datetimes(epoch = "1 day")
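#setting full.days = TRUE extends the sequence to whole days. A minimal
#sketch on the toy dataset above:
dataset %>% gapless_Datetimes(full.days = TRUE)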
Create a simple Time-of-Day plot of light logger data, faceted by Date
Description
gg_day()
will create a simple ggplot of each day of data in a dataset, faceted by
date. The result can be further manipulated like any ggplot, which is useful
for refining styling or guides.
Usage
gg_day(
dataset,
start.date = NULL,
end.date = NULL,
x.axis = Datetime,
y.axis = MEDI,
aes_col = NULL,
aes_fill = NULL,
group = Id,
geom = "point",
scales = c("fixed", "free_x", "free_y", "free"),
x.axis.breaks = hms::hms(hours = seq(0, 24, by = 3)),
y.axis.breaks = c(-10^(5:0), 0, 10^(0:5)),
y.scale = "symlog",
y.scale.sc = FALSE,
x.axis.label = "Time of Day",
y.axis.label = "Illuminance (lx, MEDI)",
format.day = "%d/%m",
title = NULL,
subtitle = NULL,
interactive = FALSE,
facetting = TRUE,
jco_color = TRUE,
...
)
Arguments
dataset |
A light logger dataset. Expects a |
start.date , end.date |
Choose an optional start or end date within your
|
x.axis , y.axis |
column name that contains the datetime (x, defaults to
|
aes_col , aes_fill |
optional arguments that define separate sets and
colors or fills them. Expects anything that works with the layer data
|
group |
Optional column name that defines separate sets. Useful for
certain geoms like |
geom |
What geom should be used for visualization? Expects a
|
scales |
For |
x.axis.breaks , y.axis.breaks |
Where should breaks occur on the x and
y.axis? Expects a |
y.scale |
How should the y-axis be scaled?
|
y.scale.sc |
|
x.axis.label , y.axis.label |
labels for the x- and y-axis. Expects a
|
format.day |
Label for each day. Default is |
title |
Plot title. Expects a |
subtitle |
Plot subtitle. Expects a |
interactive |
Should the plot be interactive? Expects a |
facetting |
Should an automated facet by day be applied? Default is
|
jco_color |
Should the |
... |
Other options that get passed to the main geom function. Can be used to adjust size, linewidth, or linetype. |
Details
Besides plotting, the function creates two new variables from the given
Datetime
:
- Day.data is a factor that is used for facetting with ggplot2::facet_wrap(). Make sure to use this variable if you change the faceting manually. Also, the function checks whether this variable already exists. If it does, it will only convert it to a factor and do the faceting on that variable.
- Time is an hms created with hms::as_hms() that is used for the x-axis.
scale, which is a logarithmic
scale that only starts scaling after a given threshold (default = 0). This
enables values of 0 in the plot, which are common in light logger data, and
even enables negative values, which might be sensible for non-light data. See
symlog_trans()
for details on tweaking this scale. The scale can also be
changed to a normal or logarithmic scale - see the y.scale argument for more.
The default scaling of the color and fill scales is discrete, with the
ggsci::scale_color_jco()
and ggsci::scale_fill_jco()
scales. To use a
continuous scale, use the jco_color = FALSE
setting. Both fill
and
color
aesthetics are set to NULL
by default. For most geoms, this is not
important, but geoms that automatically use those aesthetics (like
geom_bin2d, where fill = stat(count)) are affected by this. Manually adding
the required aesthetic (like aes_fill = ggplot2::stat(count)) will fix this.
Value
A ggplot object
Examples
#use `col`for separation of different sets
plot <- gg_day(
sample.data.environment,
scales = "fixed",
end.date = "2023-08-31",
y.axis.label = "mEDI (lx)",
aes_col = Id)
plot
#you can easily overwrite the color scale afterwards
plot + ggplot2::scale_color_discrete()
#or change the facetting
plot + ggplot2::facet_wrap(~Day.data + Id)
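#as mentioned in the details, the y-axis can be rescaled away from the symlog
#default. A sketch, assuming "log10" is accepted by the y.scale argument
#(zero values are then dropped from the plot):
gg_day(
  sample.data.environment,
  end.date = "2023-08-31",
  y.scale = "log10",
  y.axis.breaks = 10^(0:5)
)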
Create a simple datetime plot of light logger data, faceted by group
Description
gg_days()
will create a simple ggplot along the timeline. The result can be further
manipulated like any ggplot, which is useful for refining styling or guides.
Through the x.axis.limits argument, the plot can be refined to align several
groups of differing datetime ranges. It uses the Datetime_limits() function
to calculate the limits of the x-axis, and Datetime_breaks() to calculate
the breaks of the x-axis.
Usage
gg_days(
dataset,
x.axis = Datetime,
y.axis = MEDI,
aes_col = NULL,
aes_fill = NULL,
group = NULL,
geom = "line",
scales = c("free_x", "free_y", "fixed", "free"),
x.axis.breaks = Datetime_breaks,
y.axis.breaks = c(-10^(5:0), 0, 10^(0:5)),
y.scale = "symlog",
y.scale.sc = FALSE,
x.axis.label = "Local Date/Time",
y.axis.label = "Illuminance (lx, MEDI)",
x.axis.limits = Datetime_limits,
x.axis.format = "%a %D",
title = NULL,
subtitle = NULL,
interactive = FALSE,
facetting = TRUE,
jco_color = TRUE,
...
)
Arguments
dataset |
A light logger dataset. Expects a |
x.axis , y.axis |
column name that contains the datetime (x, defaults to
|
aes_col , aes_fill |
optional input that defines separate sets and colors
or fills them. Expects anything that works with the layer data
|
group |
Optional column name that defines separate sets. Useful for
certain geoms like |
geom |
What geom should be used for visualization? Expects a
|
scales |
For |
x.axis.breaks |
The (major) breaks of the x-axis. Defaults to
|
y.axis.breaks |
Where should breaks occur on the y.axis? Expects a
|
y.scale |
How should the y-axis be scaled?
|
y.scale.sc |
|
x.axis.label , y.axis.label |
labels for the x- and y-axis. Expects a
|
x.axis.limits |
The limits of the x-axis. Defaults to
|
x.axis.format |
The format of the x-axis labels. Defaults to |
title |
Plot title. Expects a |
subtitle |
Plot subtitle. Expects a |
interactive |
Should the plot be interactive? Expects a |
facetting |
Should an automated facet by grouping be applied? Default is
|
jco_color |
Should the |
... |
Other options that get passed to the main geom function. Can be used to adjust size, linewidth, or linetype. |
Details
The default scaling of the y-axis is a symlog
scale, which is a logarithmic
scale that only starts scaling after a given threshold (default = 0). This
enables values of 0 in the plot, which are common in light logger data, and
even enables negative values, which might be sensible for non-light data. See
symlog_trans()
for details on tweaking this scale. The scale can also be
changed to a normal or logarithmic scale - see the y.scale argument for more.
Value
A ggplot object
Examples
dataset <-
sample.data.environment %>%
aggregate_Datetime(unit = "5 mins")
dataset %>% gg_days()
#restrict the x-axis to 3 days
dataset %>%
gg_days(
x.axis.limits = \(x) Datetime_limits(x, length = lubridate::ddays(3))
)
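#the x-axis label format can likewise be adjusted through x.axis.format.
#A minimal sketch:
dataset %>% gg_days(x.axis.format = "%d/%m")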
Double Plots
Description
The function is by default opinionated, and will automatically select the best way to display the double date plot. However, the user can also manually select the type of double date plot to be displayed: repeating each day (default when there is only one day in all of the groups), or displaying consecutive days (default when there are multiple days in the groups).
Usage
gg_doubleplot(
dataset,
Datetime.colname = Datetime,
type = c("auto", "repeat", "next"),
geom = "ribbon",
alpha = 0.5,
col = "grey40",
fill = "#EFC000FF",
linewidth = 0.4,
x.axis.breaks.next = Datetime_breaks,
x.axis.format.next = "%a %D",
x.axis.breaks.repeat = ~Datetime_breaks(.x, by = "6 hours", shift =
lubridate::duration(0, "hours")),
x.axis.format.repeat = "%H:%M",
...
)
Arguments
dataset |
A light logger dataset. Expects a |
Datetime.colname |
column name that contains the datetime. Defaults to
|
type |
One of "auto", "repeat", or "next". The default is "auto", which will automatically select the best way to display the double date plot based on the amount of days in the dataset ( |
geom |
The type of geom to be used in the plot. The default is "ribbon". |
alpha , linewidth |
The alpha and linewidth setting of the geom. The default is 0.5 and 0.4, respectively. |
col , fill |
The color and fill of the geom. The default is "#EFC000FF". If the parameters |
x.axis.breaks.next , x.axis.breaks.repeat |
Datetime breaks when consecutive days are displayed ( |
x.axis.format.next , x.axis.format.repeat |
Datetime label format when consecutive days are displayed ( |
... |
Arguments passed to |
Details
gg_doubleplot()
is a wrapper function for gg_days()
, combined with an internal function to duplicate and reorganize dates in a dataset for a double plot view. This means that the same day is displayed multiple times within the plot in order to reveal patterns across days.
Value
a ggplot object
Examples
#take only the Participant data from sample data, and three days
library(dplyr)
library(lubridate)
library(ggplot2)
sample.data <-
sample.data.environment %>%
dplyr::filter(Id == "Participant") %>%
filter_Date(length = ddays(3))
#create a double plot with the default settings
sample.data %>% gg_doubleplot()
#repeat the same day in the plot
sample.data %>% gg_doubleplot(type = "repeat")
#more examples that are not executed for computation time:
#use the function with more than one Id
sample.data.environment %>%
filter_Date(length = ddays(3)) %>%
gg_doubleplot(aes_fill = Id, aes_col = Id) +
facet_wrap(~ Date.data, ncol = 1, scales = "free_x", strip.position = "left")
#if data is already grouped by days, type = "repeat" will be automatic
sample.data.environment %>%
dplyr::group_by(Date = date(Datetime), .add = TRUE) %>%
filter_Date(length = ddays(3)) %>%
gg_doubleplot(aes_fill = Id, aes_col = Id) +
guides(fill = "none", col = "none") + #remove the legend
facet_wrap(~ Date.data, ncol = 1, scales = "free_x", strip.position = "left")
#combining `aggregate_Date()` with `gg_doubleplot()` easily creates a good
#overview of the data
sample.data.environment %>%
aggregate_Date() %>%
gg_doubleplot()
Visualize gaps and irregular data
Description
gg_gaps()
is built upon gg_days()
, gap_finder()
, and gg_state()
to
visualize where gaps and irregular data in a dataset are. The function does
not differentiate between implicit gaps, which are missing timestamps of the
regular interval, and explicit gaps, which are NA values. Optionally, the
function shows irregular data, which are datapoints that fall outside the
regular interval.
Usage
gg_gaps(
dataset,
Variable.colname = MEDI,
Datetime.colname = Datetime,
fill.gaps = "red",
col.irregular = "red",
alpha = 0.5,
on.top = FALSE,
epoch = "dominant.epoch",
full.days = TRUE,
show.irregulars = FALSE,
group.by.days = FALSE,
include.implicit.gaps = TRUE,
...
)
Arguments
dataset |
A light logger dataset. Expects a |
Variable.colname |
Variable that becomes the basis for gap analysis. expects a symbol |
Datetime.colname |
The column that contains the datetime. Needs to be a
|
fill.gaps |
Fill color for the gaps |
col.irregular |
Dot color for irregular data |
alpha |
A numerical value between 0 and 1 representing the transparency of the gaps. Default is 0.5. |
on.top |
Logical scalar. If |
epoch |
The epoch to use for the gapless sequence. Can be either a
|
full.days |
Logical. Whether full days are expected, even for the first and last measurement |
show.irregulars |
Logical. Show irregular data points. Default is
|
group.by.days |
Logical. Whether data should be grouped by days. This can make sense if only very few days from large groups are affected |
include.implicit.gaps |
Logical. Whether the time series should be expanded, or only the current observations taken. |
... |
Additional arguments given to |
Value
a ggplot
object with all gaps and optionally irregular data.
Groups that have neither gaps nor irregular data will be removed for
clarity. Returns NULL if no groups remain.
Examples
#calling gg_gaps on a healthy dataset is pointless
sample.data.environment |> gg_gaps()
#creating a gapped and irregular dataset
bad_dataset <-
sample.data.environment |>
aggregate_Datetime(unit = "5 mins") |>
dplyr::filter(Id == "Participant") |>
filter_Date(length = "2 days") |>
dplyr::mutate(
Datetime = dplyr::if_else(
lubridate::date(Datetime) == max(lubridate::date(Datetime)),
Datetime, Datetime + 1
)
) |>
dplyr::filter(MEDI <250)
bad_dataset |> has_gaps()
bad_dataset |> has_irregulars()
#by default, gg_gaps() only shows gaps
bad_dataset |> gg_gaps()
#it can also show irregular data
bad_dataset |> gg_gaps(show.irregulars = TRUE)
Plot a heatmap across days and times of day
Description
This function plots a heatmap of binned values across the day over all days
in a group. It also allows doubleplot functionality. gg_heatmap() does
does
not work with the additive functions gg_photoperiod()
and gg_state()
.
Usage
gg_heatmap(
dataset,
Variable.colname = MEDI,
Datetime.colname = Datetime,
unit = "1 hour",
doubleplot = c("no", "same", "next"),
date.title = "Date",
date.breaks = 1,
date.labels = "%d/%m",
time.title = "Local time (HH:MM)",
time.breaks = hms::hms(hours = seq(0, 48, by = 6)),
time.labels = "%H:%M",
fill.title = "Illuminance\n(lx, mel EDI)",
fill.scale = "symlog",
fill.labels = function(x) format(x, scientific = FALSE, big.mark = " "),
fill.breaks = c(-10^(5:0), 0, 10^(0:5)),
fill.limits = c(0, 10^5),
fill.remove = FALSE,
...
)
Arguments
dataset |
A light dataset |
Variable.colname |
The column name of the variable to display. Defaults
to |
Datetime.colname |
The column name of the datetime column. Defaults to
|
unit |
level of aggregation for |
doubleplot |
Should the data be plotted as a doubleplot. Default is "no". "next" will plot the respective next day after the first, "same" will plot the same day twice. |
date.title |
Title text of the y-axis. Defaults to |
date.breaks |
Spacing of date breaks. Defaults to |
date.labels |
Formatting code of the date labels |
time.title |
Title text of the x-axis. Defaults to |
time.breaks |
Spacing of time breaks. Defaults to every six hours. |
time.labels |
Formatting code of the time labels |
fill.title |
Title text of the value (fill) scale. |
fill.scale |
Scaling of the value (fill) scale. Defaults to |
fill.labels |
Formula to format the label values. |
fill.breaks |
Breaks in the fill scale |
fill.limits |
Limits of the fill scale. A length-2 numeric with the
lower and upper scale. If one is replaced with |
fill.remove |
Logical. Should the fill scale be removed? Handy when the fill scale is to be replaced by another scale without the console messages warning about existing scale |
... |
Other arguments to provide to the underlying
|
Details
The function uses ggplot2::scale_fill_viridis_c()
for the fill scale. The
scale can be substituted by any other scale via the standard +
command of
ggplot2. It is recommended to set fill.remove = TRUE
to reduce warnings.
Value
A ggplot object
Examples
sample.data.environment |> gg_heatmap()
#heatmap with doubleplot
sample.data.environment |> gg_heatmap(doubleplot = "next")
#change the unit of aggregation
sample.data.environment |> gg_heatmap(unit = "5 mins")
#change the limits of the fill scale
sample.data.environment |> gg_heatmap(fill.limits = c(0, 10^4))
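#as described in the details, the fill scale can be swapped out;
#fill.remove = TRUE suppresses the warning about replacing an existing
#scale. A brief sketch:
sample.data.environment |>
  gg_heatmap(fill.remove = TRUE) +
  ggplot2::scale_fill_viridis_c(option = "inferno")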
Plot an overview of dataset intervals with implicit missing data
Description
Plot an overview of dataset intervals with implicit missing data
Usage
gg_overview(
dataset,
Datetime.colname = Datetime,
Id.colname = Id,
gap.data = NULL,
...,
interactive = FALSE
)
Arguments
dataset |
A light logger dataset. Expects a |
Datetime.colname |
column name that contains the datetime. Defaults to
|
Id.colname |
The column name of the Id column (default is |
gap.data |
Optionally provide a |
... |
Additional arguments given to the main |
interactive |
Should the plot be interactive? Expects a |
Value
A ggplot
object
Examples
sample.data.environment %>% gg_overview()
Add photoperiods to gg_day() or gg_days() plots
Description
gg_photoperiod()
is a helper function to add photoperiod information to
plots generated with gg_day()
or gg_days()
. The function can either draw
on the dawn
and dusk
columns of the dataset or use the coordinates
and
solarDep
arguments to calculate the photoperiods. The time series must be
based on a column called Datetime
.
Usage
gg_photoperiod(
ggplot_obj,
coordinates = NULL,
alpha = 0.2,
solarDep = 6,
on.top = FALSE,
...
)
Arguments
ggplot_obj |
A |
coordinates |
A two element numeric vector representing the latitude and
longitude of the location. If |
alpha |
A numerical value between 0 and 1 representing the transparency of the photoperiods. Default is 0.2. |
solarDep |
A numerical value representing the solar depression angle
between 90 and -90. This means a value of 6 corresponds to 6 degrees below
the horizon. Default is 6, equalling |
on.top |
Logical scalar. If |
... |
Additional arguments given to the |
Details
If used in combination with gg_doubleplot()
, with that function in the
type = "repeat"
setting (either manually set, or because there is only one
day of data per group present), photoperiods need to be added separately
through add_photoperiod()
, or the second photoperiod in each panel will be
off by one day. See the examples for more information.
In general, if the photoperiod setup is more complex, it makes sense to add it prior to plotting and make sure the photoperiods are correct.
Value
a modified ggplot
object with the photoperiods added.
See Also
Other photoperiod:
photoperiod()
Examples
coordinates <- c(48.521637, 9.057645)
#adding photoperiods to a ggplot
sample.data.environment |>
gg_days() |>
gg_photoperiod(coordinates)
#adding photoperiods prior to plotting
sample.data.environment |>
add_photoperiod(coordinates, solarDep = 0) |>
gg_days() |>
gg_photoperiod()
#more examples that are not executed for computation time:
#plotting photoperiods automatically works for both gg_day() and gg_days()
sample.data.environment |>
gg_day() |>
gg_photoperiod(coordinates)
#plotting for gg_doubleplot mostly works fine
sample.data.environment |>
filter_Date(length = "2 days") |>
gg_doubleplot() |>
gg_photoperiod(coordinates)
#however, in cases where only one day of data per group is available, or the
#type = "repeat" setting is used, the photoperiods need to be added
#separately. Otherwise the second day will be off by one day in each panel.
#The visual difference is subtle, and might not be visible at all, as
#photoperiod only ever changes by a few minutes per day.
#WRONG
sample.data.environment |>
filter_Date(length = "1 days") |>
gg_doubleplot() |>
gg_photoperiod(coordinates)
#CORRECT
sample.data.environment |>
filter_Date(length = "1 days") |>
add_photoperiod(coordinates) |>
gg_doubleplot() |>
gg_photoperiod()
Add states to gg_day() or gg_days() plots
Description
gg_state()
is a helper function to add state information to plots generated
with gg_day()
, gg_days()
, or gg_doubleplot()
. The function can draw on
any column in the dataset, but factor-like or logical columns make the most
sense. The time series must be based on a column called Datetime
.
Usage
gg_state(
ggplot_obj,
State.colname,
aes_fill = NULL,
aes_col = NULL,
alpha = 0.2,
on.top = FALSE,
ignore.FALSE = TRUE,
...
)
Arguments
ggplot_obj |
A |
State.colname |
The colname of the state to add to the plot. Must be
part of the dataset. Expects a |
aes_fill , aes_col |
conditional aesthetics for |
alpha |
A numerical value between 0 and 1 representing the transparency of the states. Default is 0.2. |
on.top |
Logical scalar. If |
ignore.FALSE |
Logical that drops |
... |
Additional arguments given to the |
Value
a modified ggplot
object with the states added.
Examples
#creating a simple TRUE/FALSE state in the sample data: Light above 250 lx mel EDI
#and a second state that cuts data into chunks relating to the Brown et al. 2022 thresholds
#(+aggregating Data to 5 minute intervals & reducing it to three days)
state_data <-
sample.data.environment |>
dplyr::mutate(state = MEDI > 250) |>
Brown_cut(MEDI, state2) |>
aggregate_Datetime(unit = "5 mins") |>
filter_Datetime(length = "3 days")
state_data |>
gg_days() |>
gg_state(state)
#state 2 has more than one valid state, thus we need to assign a fill aesthetic
state_data |>
gg_days() |>
gg_state(state2, aes_fill = state2) +
ggplot2::scale_fill_manual(values=c("#868686FF", "#EFC000FF", "#0073C2FF"))
#this line is simply for sensible colors
#same, but with gg_day()
state_data |>
dplyr::filter(Id == "Participant") |>
gg_day(geom = "line") |>
gg_state(state, fill = "red")
#more complex state
state_data |>
dplyr::filter(Id == "Participant") |>
gg_day(geom = "line") |>
gg_state(state2, aes_fill = state2)
#with gg_doubleplot
state_data |>
dplyr::filter(Id == "Participant") |>
gg_doubleplot() |>
gg_state(state2, aes_fill = state2)
Does a dataset have implicit gaps
Description
Returns TRUE
if there are implicit gaps in the dataset and FALSE
if it is
gapless. Gaps can make sense depending on the grouping structure, but the
general sequence of Datetimes within a dataset should always be gapless.
Usage
has_gaps(dataset, Datetime.colname = Datetime, epoch = "dominant.epoch")
Arguments
dataset |
A light logger dataset. Needs to be a dataframe. |
Datetime.colname |
The column that contains the datetime. Needs to be a
|
epoch |
The epoch to use for the gapless sequence. Can be either a
|
Value
logical
See Also
Other regularize:
dominant_epoch()
,
extract_gaps()
,
gap_finder()
,
gap_handler()
,
gapless_Datetimes()
,
has_irregulars()
Examples
#The sample dataset does not have gaps
sample.data.environment |> has_gaps()
#removing some of the data creates gaps
sample.data.environment |> dplyr::filter(MEDI <= 50000) |> has_gaps()
#having a grouped dataframe where the groups span multiple unconnected parts
#is considered a gap, which can be relevant, e.g., when searching for clusters
sample.data.environment |>
add_photoperiod(c(47.1, 10)) |>
dplyr::group_by(photoperiod.state) |>
has_gaps()
#to avoid this, use `number_states()` for grouping
sample.data.environment |>
add_photoperiod(c(48.52, 9.06)) |>
number_states(photoperiod.state) |>
dplyr::group_by(photoperiod.state.count, .add = TRUE) |>
has_gaps()
Does a dataset have irregular data
Description
Returns TRUE
if there are irregular data in the dataset and FALSE
if not.
Irregular data can make sense if two datasets within a single group are
shifted relative to one another, e.g., if the group contains data from two separate recording
sessions. The second session will be unlikely to have started at the exact
interval timing of the first session. While this is not problematic in itself, it is
still recommended to rectify the Datetimes to a common timestamp if time
precision permits it, e.g., through aggregate_Datetime()
or
cut_Datetime()
.
Usage
has_irregulars(dataset, Datetime.colname = Datetime, epoch = "dominant.epoch")
Arguments
dataset |
A light logger dataset. Needs to be a dataframe. |
Datetime.colname |
The column that contains the datetime. Needs to be a
|
epoch |
The epoch to use for the gapless sequence. Can be either a
|
Value
logical
See Also
Other regularize:
dominant_epoch()
,
extract_gaps()
,
gap_finder()
,
gap_handler()
,
gapless_Datetimes()
,
has_gaps()
Examples
#the sample dataset does not have any irregular data
sample.data.environment |> has_irregulars()
#even removing some data does not make it irregular, as all the Datetimes
#still fall in the regular interval
sample.data.environment |> dplyr::filter(MEDI <= 50000) |> has_irregulars()
#shifting some of the data will create irregular data
sample.data.environment |>
dplyr::mutate(
Datetime = dplyr::if_else(
sample(c(TRUE, FALSE), dplyr::n(), replace = TRUE), Datetime, Datetime + 1
)
) |>
has_irregulars()
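#as recommended above, rectifying the Datetimes to a common timestamp removes
#the irregularity. A sketch using aggregate_Datetime() on the shifted data
#(the aggregation unit here is an assumption; choose one matching your epoch):
sample.data.environment |>
  dplyr::mutate(
    Datetime = dplyr::if_else(
      sample(c(TRUE, FALSE), dplyr::n(), replace = TRUE), Datetime, Datetime + 1
    )
  ) |>
  aggregate_Datetime(unit = "1 min") |>
  has_irregulars()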
Import a light logger dataset or related data
Description
Imports a dataset and does the necessary transformations to get the right
column formats. Unless specified otherwise, the function will set the
timezone of the data to UTC
. It will also enforce an Id
to separate
different datasets and will order/arrange the dataset within each Id
by
Datetime. See the Details and Devices section for more information and the
full list of arguments.
Usage
import_Dataset(device, ...)
import
Arguments
device |
From what device do you want to import? For a few devices,
there is a sample data file that you can use to test the function (see the
examples). See |
... |
Parameters that get handed down to the specific import functions |
Format
An object of class list
of length 19.
Details
There are specific import functions and a general import function. The general import
function is described below, whereas the specific import functions take the
form of import$device()
. The general import function is a thin wrapper
around the specific import functions. The specific import functions take
the following arguments:
- filename: Filename(s) for the Dataset. Can also contain the filepath, but path must then be NULL. Expects a character. If the vector is longer than 1, multiple files will be read into one Tibble.
- path: Optional path for the dataset(s). NULL is the default. Expects a character.
- n_max: maximum number of lines to read. Default is Inf.
- tz: Timezone of the data. "UTC" is the default. Expects a character. You can look up the supported timezones with OlsonNames().
- Id.colname: Lets you specify a column for the id of a dataset. Expects a symbol (default is Id). This column will be used for grouping (dplyr::group_by()).
- auto.id: If the Id.colname column is not part of the dataset, the Id can be automatically extracted from the filename. The argument expects a regular expression (regex) and will by default just use the whole filename without file extension.
- manual.id: If this argument is not NULL, and no Id column is part of the dataset, this character scalar will be used. We discourage the use of this argument when importing more than one file.
- silent: If set to TRUE, the function will not print a summary message of the import or plot the overview. Default is FALSE.
- locale: The locale controls defaults that vary from place to place.
- not.before: Remove data prior to this date. This argument is provided to start of filter_Date(). Data will be filtered out before any of the summaries are shown.
- dst_adjustment: If a file crosses daylight savings time, but the device does not adjust time stamps accordingly, you can set this argument to TRUE to apply this shift manually. It is selective, so it will only be done in files that cross between DST and standard time. Default is FALSE. Uses dst_change_handler() to do the adjustment. Look there for more info. It is not equipped to handle two jumps in one file (so back and forth between DST and standard time), but will work fine if jumps occur in separate files.
- auto.plot: a logical on whether to call gg_overview() after import. Default is TRUE, but it is set to FALSE if the argument silent is set to TRUE.
- ...: supply additional arguments to the readr import functions, like na. Might also be used to supply arguments to the specific import functions, like column_names for Actiwatch_Spectrum devices. Those devices will always throw a helpful error message if you forget to supply the necessary arguments. If the Id column is already part of the dataset, it will just use this column. If the column is not present, it will add this column and fill it with the filename of the import file (see param auto.id).
- print_n: can be used if you want to see more rows from the observation intervals.
- remove_duplicates: can be used if identical observations are present within or across multiple files. The default is FALSE. The function keeps only unique observations (=rows) if set to TRUE. This is a convenience implementation of dplyr::distinct().
Value
Tibble/Dataframe with a POSIXct column for the datetime
Devices
The set of import functions provides a convenient way to
import light logger data that is then perfectly formatted to add metadata,
make visualizations and analyses. There are a number of devices supported,
where import should just work out of the box. To get an overview, you can
simply call the supported_devices()
function. The list will grow
continuously as the package is maintained.
supported_devices()
#>  [1] "ActLumus"              "ActTrust"              "Actiwatch_Spectrum"
#>  [4] "Actiwatch_Spectrum_de" "Circadian_Eye"         "Clouclip"
#>  [7] "DeLux"                 "GENEActiv_GGIR"        "Kronowise"
#> [10] "LIMO"                  "LYS"                   "LiDo"
#> [13] "LightWatcher"          "MotionWatch8"          "OcuWEAR"
#> [16] "Speccy"                "SpectraWear"           "VEET"
#> [19] "nanoLambda"
ActLumus
Manufacturer: Condor Instruments
Model: ActLumus
Implemented: Sep 2023
A sample file is provided with the package; it can be accessed through
system.file("extdata/205_actlumus_Log_1020_20230904101707532.txt.zip", package = "LightLogR")
. It does not need to be unzipped to be imported.
This sample file is a good example for a regular dataset without gaps
LYS
Manufacturer: LYS Technologies
Model: LYS Button
Implemented: Sep 2023
A sample file is provided with the package; it can be accessed through
system.file("extdata/sample_data_LYS.csv", package = "LightLogR")
. This
sample file is a good example for an irregular dataset.
Actiwatch_Spectrum & Actiwatch_Spectrum_de
Manufacturer: Philips Respironics
Model: Actiwatch Spectrum
Implemented: Nov 2023 / July 2024
Important note: The Actiwatch_Spectrum
function is for an international/english formatting. The Actiwatch_Spectrum_de
function is for a german formatting, which slightly differs in the datetime format, the column names, and the decimal separator.
ActTrust
Manufacturer: Condor Instruments
Model: ActTrust1, ActTrust2
Implemented: Mar 2024
This function works for both ActTrust 1 and 2 devices
Speccy
Manufacturer: Monash University
Model: Speccy
Implemented: Feb 2024
DeLux
Manufacturer: Intelligent Automation Inc
Model: DeLux
Implemented: Dec 2023
LiDo
Manufacturer: University of Lucerne
Model: LiDo
Implemented: Nov 2023
SpectraWear
Manufacturer: University of Manchester
Model: SpectraWear
Implemented: May 2024
NanoLambda
Manufacturer: NanoLambda
Model: XL-500 BLE
Implemented: May 2024
LightWatcher
Manufacturer: Object-Tracker
Model: LightWatcher
Implemented: June 2024
VEET
Manufacturer: Meta Reality Labs
Model: VEET
Implemented: July 2024
Required Argument: modality
A character scalar describing the
modality to be imported from. Can be one of "ALS"
(Ambient light sensor),
"IMU"
(Inertial Measurement Unit), "INF"
(Information), "PHO"
(Spectral Sensor), "TOF"
(Time of Flight)
Circadian_Eye
Manufacturer: Max-Planck-Institute for Biological Cybernetics, Tübingen
Model: melanopiQ Circadian Eye (Prototype)
Implemented: July 2024
Kronowise
Manufacturer: Kronohealth
Model: Kronowise
Implemented: July 2024
GENEActiv with GGIR preprocessing
Manufacturer: Activeinsights
Model: GENEActiv
Note: This import function takes GENEActiv data that was preprocessed
through the GGIR package. By default, GGIR aggregates light data into
intervals of 15 minutes. This can be set by the windowsizes argument in
GGIR, which is a three-value vector, where the second value is set to 900
seconds by default. To import the preprocessed data with LightLogR, the
filename argument requires a path to the parent directory of the GGIR output
folders, specifically the meta folder, which contains the light exposure
data. Multiple filenames can be specified, each of which needs to be a path
to a different GGIR parent directory. GGIR exports can contain data from
multiple participants; these will always be imported fully by providing the
parent directory. Use the pattern argument to extract sensible Ids from the
.RData filenames within the meta/basic/ folder. As per the author, Dr.
Vincent van Hees, GGIR preprocessed data are always in local time, provided
the desiredtz/configtz arguments are properly set in GGIR. LightLogR still
requires a timezone to be set, but will not timeshift the import data.
MotionWatch 8
Manufacturer: CamNtech
Implemented: September 2024
LIMO
Manufacturer: ENTPE
Implemented: September 2024
LIMO exports LIGHT
data and IMU
(inertia measurements, also UV) in
separate files. Both can be read in with this function, but not at the same
time. Please decide what type of data you need and provide the respective
filenames.
OcuWEAR
Manufacturer: Ocutune
Implemented: September 2024
OcuWEAR data contains spectral data. Due to the format of the data file,
the spectrum is not directly part of the tibble, but rather a list column
of tibbles within the imported data, containing a Wavelength
(nm) and
Intensity
(mW/m^2) column.
Clouclip
Manufacturer: Clouclip
Implemented: April 2025
Clouclip export files have the ending .xls
, but are not real Microsoft Excel files; rather, they are tab-separated
text files. LightLogR thus does
not read them in with an excel import routine. The measurement columns
Lux
and Dis
contain sentinel values. -1
(Dis
and Lux
) indicates
sleep mode, whereas 204
(only Dis
) indicates an out of range
measurement. These values will be set to NA
, and an additional column is
added that translates these status codes. The columns carry the name
{.col}_status
.
Examples
Imports made easy
To import a file, simply specify the filename (and path) and feed it to the
import_Dataset
function. There are sample datasets for all devices.
The import functions provide a basic overview of the data after import, such as the intervals between measurements or the start and end dates.
filepath <- system.file("extdata/205_actlumus_Log_1020_20230904101707532.txt.zip", package = "LightLogR")
dataset <- import_Dataset("ActLumus", filepath, auto.plot = FALSE)
#>
#> Successfully read in 61'016 observations across 1 Ids from 1 ActLumus-file(s).
#> Timezone set is UTC.
#> The system timezone is Europe/Berlin. Please correct if necessary!
#>
#> First Observation: 2023-08-28 08:47:54
#> Last Observation: 2023-09-04 10:17:04
#> Timespan: 7.1 days
#>
#> Observation intervals:
#>   Id                                          interval.time     n pct
#> 1 205_actlumus_Log_1020_20230904101707532.txt 10s           61015 100%
Import functions can also be called directly:
dataset <- import$ActLumus(filepath, auto.plot = FALSE)
#>
#> Successfully read in 61'016 observations across 1 Ids from 1 ActLumus-file(s).
#> Timezone set is UTC.
#> The system timezone is Europe/Berlin. Please correct if necessary!
#>
#> First Observation: 2023-08-28 08:47:54
#> Last Observation: 2023-09-04 10:17:04
#> Timespan: 7.1 days
#>
#> Observation intervals:
#>   Id                                          interval.time     n pct
#> 1 205_actlumus_Log_1020_20230904101707532.txt 10s           61015 100%
dataset %>% dplyr::select(Datetime, TEMPERATURE, LIGHT, MEDI, Id) %>% dplyr::slice(1500:1505)
#> # A tibble: 6 x 5
#> # Groups:   Id [1]
#>   Datetime            TEMPERATURE LIGHT  MEDI Id
#>   <dttm>                    <dbl> <dbl> <dbl> <fct>
#> 1 2023-08-28 12:57:44        26.9  212.  202. 205_actlumus_Log_1020_20230904101~
#> 2 2023-08-28 12:57:54        26.9  208.  199. 205_actlumus_Log_1020_20230904101~
#> 3 2023-08-28 12:58:04        26.9  205.  196. 205_actlumus_Log_1020_20230904101~
#> 4 2023-08-28 12:58:14        26.8  204.  194. 205_actlumus_Log_1020_20230904101~
#> 5 2023-08-28 12:58:24        26.9  203.  194. 205_actlumus_Log_1020_20230904101~
#> 6 2023-08-28 12:58:34        26.8  204.  195. 205_actlumus_Log_1020_20230904101~
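The same pattern works for the other supported devices. A brief sketch with the LYS sample file mentioned above:
filepath <- system.file("extdata/sample_data_LYS.csv", package = "LightLogR")
dataset <- import$LYS(filepath, auto.plot = FALSE)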
Import data that contain Datetimes
of Statechanges
Description
Auxiliary data greatly enhances data analysis. This function allows the
import of files that contain Statechanges
, i.e., specific time points of
when a State
(like sleep
or wake
) starts.
Usage
import_Statechanges(
filename,
path = NULL,
sep = ",",
dec = ".",
structure = c("wide", "long"),
Datetime.format = "ymdHMS",
tz = "UTC",
State.colnames,
State.encoding = State.colnames,
Datetime.column = Datetime,
Id.colname,
State.newname = State,
Id.newname = Id,
keep.all = FALSE,
silent = FALSE
)
Arguments
filename |
Filename(s) for the Dataset. Can also contain the filepath,
but |
path |
Optional path for the dataset(s). |
sep |
String that separates columns in the import file. Defaults to
|
dec |
String that indicates a decimal separator in the import file.
Defaults to |
structure |
String that specifies whether the import file is in the
|
Datetime.format |
String that specifies the format of the |
tz |
Timezone of the data. |
State.colnames |
Column name or vector of column names (the latter only
in the
|
State.encoding |
In the |
Datetime.column |
Symbol of the
|
Id.colname |
Symbol of the column that contains the |
State.newname |
Symbol of the column that will contain the |
Id.newname |
Column name used for renaming the |
keep.all |
Logical that specifies whether all columns should be
kept in the output. Defaults to |
silent |
Logical that specifies whether a summary of the
imported data should be shown. Defaults to |
Details
Data can be present in the long or wide format.
In the
wide
format, multipleDatetime
columns indicate the state through the column name. These get pivoted to thelong
format and can be recoded through theState.encoding
argument.In the
long
format, one column indicates theState
, while the other gives theDatetime
.
Value
a dataset with the ID
, State
, and Datetime
columns. May contain
additional columns if keep.all
is TRUE
.
Examples
#get the example file from within the package
path <- system.file("extdata/",
package = "LightLogR")
file.sleep <- "205_sleepdiary_all_20230904.csv"
#import Data in the wide format (sleep/wake times)
import_Statechanges(file.sleep, path,
Datetime.format = "dmyHM",
State.colnames = c("sleep", "offset"),
State.encoding = c("sleep", "wake"),
Id.colname = record_id,
sep = ";",
dec = ",")
#import in the long format (Comments on sleep)
import_Statechanges(file.sleep, path,
Datetime.format = "dmyHM",
State.colnames = "comments",
Datetime.column = sleep,
Id.colname = record_id,
sep = ";",
dec = ",", structure = "long")
Adjust device imports or make your own
Description
Adjust device imports or make your own
Usage
import_adjustment(import_expr)
Arguments
import_expr |
A named list of import expressions. The basis for
|
Details
This function should only be used with some knowledge of how
expressions work in R. For an expression to work as expected, it must at
minimum lead to a data frame containing a Datetime
column with
the correct time zone. It has access to all arguments defined in the
description of import_Dataset()
. The ...
argument should be passed to
whatever csv reader function is used, so that it works as expected. Look at
ll_import_expr()$ActLumus
for a quite minimal example.
Value
A list of import functions
Examples
#create a new import function for the ActLumus device, same as the old
new_import <- import_adjustment(ll_import_expr())
#the new one is identical to the old one in terms of the function body
identical(body(import$ActLumus), body(new_import$ActLumus))
#change the import expression for the ActLumus device to add a message at the top
new_import_expr <- ll_import_expr()
new_import_expr$ActLumus[[4]] <-
rlang::expr({ cat("**This is a new import function**\n")
data
})
new_import <- import_adjustment(new_import_expr)
filepath <-
system.file("extdata/205_actlumus_Log_1020_20230904101707532.txt.zip",
package = "LightLogR")
#Now, a message is printed when the import function is called
data <- new_import$ActLumus(filepath, auto.plot = FALSE)
Interdaily stability (IS)
Description
This function calculates the variability of 24h light exposure patterns across multiple days. Calculated as the ratio of the variance of the average daily pattern to the total variance across all days. Calculated with mean hourly light levels. Ranges between 0 (Gaussian noise) and 1 (perfect stability).
Usage
interdaily_stability(
Light.vector,
Datetime.vector,
use.samplevar = FALSE,
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector |
Numeric vector containing the light data. |
Datetime.vector |
Vector containing the time data. Must be POSIXct. |
use.samplevar | Logical. Should the sample variance be used (divide by N-1)? By default (FALSE), the population variance (divide by N) is used. |
na.rm | Logical. Should missing values be removed? Defaults to FALSE. |
as.df | Logical. Should the output be returned as a data frame? Defaults to FALSE. |
Details
Note that this metric will always be 1 if the data contains only one 24 h day.
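For reference, the nonparametric definition from Van Someren et al. (1999), restated here for clarity: with x_i the N hourly means across all days, \bar{x}_h the p = 24 hourly means averaged across days, and \bar{x} the grand mean,
IS = \frac{N \sum_{h=1}^{p} (\bar{x}_h - \bar{x})^2}{p \sum_{i=1}^{N} (x_i - \bar{x})^2}
With use.samplevar = TRUE, sample variances (N-1 denominators) are used instead.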
Value
Numeric value or dataframe with column 'IS'.
References
Van Someren, E. J. W., Swaab, D. F., Colenda, C. C., Cohen, W., McCall, W. V., & Rosenquist, P. B. (1999). Bright Light Therapy: Improved Sensitivity to Its Effects on Rest-Activity Rhythms in Alzheimer Patients by Application of Nonparametric Methods. Chronobiology International, 16(4), 505–518. doi:10.3109/07420529908998724
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
See Also
Other metrics:
bright_dark_period()
,
centroidLE()
,
disparity_index()
,
dose()
,
duration_above_threshold()
,
exponential_moving_average()
,
frequency_crossing_threshold()
,
intradaily_variability()
,
midpointCE()
,
nvRC()
,
nvRD()
,
nvRD_cumulative_response()
,
period_above_threshold()
,
pulses_above_threshold()
,
threshold_for_duration()
,
timing_above_threshold()
Examples
set.seed(1)
N <- 24 * 7
# Calculate metric for seven 24 h days with two measurements per hour
dataset1 <-
tibble::tibble(
Id = rep("A", N * 2),
Datetime = lubridate::as_datetime(0) + c(lubridate::minutes(seq(0, N * 60 - 30, 30))),
MEDI = sample(1:1000, N * 2)
)
dataset1 %>%
dplyr::summarise(
"Interdaily stability" = interdaily_stability(MEDI, Datetime)
)
Adds a state column to a dataset from interval data
Description
This function can make use of Interval data that contain States (like "sleep", "wake", "wear") and add a column to a light logger dataset, where the State of every Datetime is specified, based on the participant's Id.
Usage
interval2state(
dataset,
State.interval.dataset,
Datetime.colname = Datetime,
State.colname = State,
Interval.colname = Interval,
Id.colname.dataset = Id,
Id.colname.interval = Id,
overwrite = FALSE,
output.dataset = TRUE
)
Arguments
dataset | A light logger dataset. Expects a dataframe. |
State.interval.dataset | Name of the dataset that contains the State and Interval columns. |
Datetime.colname | column name that contains the datetime. Defaults to Datetime, which is automatically correct for data imported with LightLogR. |
State.colname , Interval.colname | Column names of the State and Interval columns in the State.interval.dataset. Default to State and Interval. |
Id.colname.dataset , Id.colname.interval | Column names of the participant's Id in both the dataset and the State.interval.dataset. Both default to Id. |
overwrite | If TRUE (defaults to FALSE), the function will overwrite an existing State column in the dataset. |
output.dataset | should the output be a data.frame (Default TRUE) or a vector with the states (FALSE)? |
Value
One of
- a data.frame object identical to dataset but with the state column added
- a vector with the states
Examples
#create an interval dataset
library(tibble)
library(dplyr)
library(lubridate)
library(rlang)
library(purrr)
states <- tibble::tibble(Datetime = c("2023-08-15 6:00:00",
"2023-08-15 23:00:00",
"2023-08-16 6:00:00",
"2023-08-16 22:00:00",
"2023-08-17 6:30:00",
"2023-08-18 1:00:00",
"2023-08-18 6:00:00",
"2023-08-18 22:00:00",
"2023-08-19 6:00:00",
"2023-08-19 23:00:00",
"2023-08-20 6:00:00",
"2023-08-20 22:00:00"),
State = rep(c("wake", "sleep"), 6),
Wear = rep(c("wear", "no wear"), 6),
Performance = rep(c(100, 0), 6),
Id = "Participant")
intervals <- sc2interval(states)
#create a dataset with states
dataset_with_states <-
sample.data.environment %>%
interval2state(State.interval.dataset = intervals)
#visualize the states - note that the states are only added to the respective ID in the dataset
library(ggplot2)
ggplot(dataset_with_states, aes(x = Datetime, y = MEDI, color = State)) +
geom_point() +
facet_wrap(~Id, ncol = 1)
#import multiple State columns from the interval dataset
#interval2state will only add a single State column to the dataset,
#which represents sleep/wake in our case
dataset_with_states[8278:8283,]
#if we want to add multiple columns we can either perform the function
#multiple times with different states:
dataset_with_states2 <-
dataset_with_states %>%
interval2state(State.interval.dataset = intervals, State.colname = Wear)
dataset_with_states2[8278:8283,]
#or we can use `purrr::reduce` to add multiple columns at once
dataset_with_states3 <-
syms(c("State", "Wear", "Performance")) %>%
reduce(\(x,y) interval2state(x, State.interval.dataset = intervals, State.colname = !!y),
.init = sample.data.environment)
#Note:
# - the State.colnames have to be provided as symbols (`rlang::syms`)
# - the reduce function requires a two argument function `\(x,y)`, where `x`
#   is the dataset to be continuously modified and `y` is the symbol of the
# State column name to be added
# - the `!!` operator from `rlang` is used to exchange `y` with each symbol
# - the `.init` argument is the initial dataset to be modified
#this results in all states being applied
dataset_with_states3[8278:8283,]
Intradaily variability (IV)
Description
This function calculates the variability of consecutive Light levels within a 24h day. Calculated as the ratio of the variance of the differences between consecutive Light levels to the total variance across the day. Calculated with mean hourly Light levels. Higher values indicate more fragmentation.
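For reference, the nonparametric definition from Van Someren et al. (1999), restated here for clarity: with x_i the N hourly means and \bar{x} the grand mean,
IV = \frac{N \sum_{i=2}^{N} (x_i - x_{i-1})^2}{(N-1) \sum_{i=1}^{N} (x_i - \bar{x})^2}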
Usage
intradaily_variability(
Light.vector,
Datetime.vector,
use.samplevar = FALSE,
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector |
Numeric vector containing the light data. |
Datetime.vector |
Vector containing the time data. Must be POSIXct. |
use.samplevar | Logical. Should the sample variance be used (divide by N-1)? By default (FALSE), the population variance (divide by N) is used. |
na.rm | Logical. Should missing values be removed? Defaults to FALSE. |
as.df | Logical. Should the output be returned as a data frame? Defaults to FALSE. |
Value
Numeric value or dataframe with column 'IV'.
References
Van Someren, E. J. W., Swaab, D. F., Colenda, C. C., Cohen, W., McCall, W. V., & Rosenquist, P. B. (1999). Bright Light Therapy: Improved Sensitivity to Its Effects on Rest-Activity Rhythms in Alzheimer Patients by Application of Nonparametric Methods. Chronobiology International, 16(4), 505–518. doi:10.3109/07420529908998724
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
See Also
Other metrics:
bright_dark_period()
,
centroidLE()
,
disparity_index()
,
dose()
,
duration_above_threshold()
,
exponential_moving_average()
,
frequency_crossing_threshold()
,
interdaily_stability()
,
midpointCE()
,
nvRC()
,
nvRD()
,
nvRD_cumulative_response()
,
period_above_threshold()
,
pulses_above_threshold()
,
threshold_for_duration()
,
timing_above_threshold()
Examples
set.seed(1)
N <- 24 * 2
# Calculate metric for two 24 h days with two measurements per hour
dataset1 <-
tibble::tibble(
Id = rep("A", N * 2),
Datetime = lubridate::as_datetime(0) + c(lubridate::minutes(seq(0, N * 60 - 30, 30))),
MEDI = sample(1:1000, N * 2)
)
dataset1 %>%
dplyr::summarise(
"Intradaily variability" = intradaily_variability(MEDI, Datetime)
)
Join similar Datasets
Description
Join Light logging datasets that have a common structure. The minimal commonality is identical Datetime and Id columns across all sets.
Usage
join_datasets(
...,
Datetime.column = Datetime,
Id.column = Id,
add.origin = FALSE,
debug = FALSE
)
Arguments
... | LightLogR datasets that need to be joined. |
Datetime.column , Id.column | Column names for the Datetime and Id columns. The defaults (Datetime, Id) are automatically correct for data imported with LightLogR. |
add.origin | Should a column named dataset be added, indicating the dataset of origin for each row? Defaults to FALSE. |
debug | Output changes to a tibble indicating which dataset is missing the respective Datetime or Id column. Defaults to FALSE. |
Value
One of
- a data.frame of joined datasets
- a tibble of datasets with missing columns. Only if debug = TRUE
Examples
#load in two datasets
path <- system.file("extdata",
package = "LightLogR")
file.LL <- "205_actlumus_Log_1020_20230904101707532.txt.zip"
file.env <- "cyepiamb_CW35_Log_1431_20230904081953614.txt.zip"
dataset.LL <- import$ActLumus(file.LL, path = path, auto.id = "^(\\d{3})")
dataset.env <- import$ActLumus(file.env, path = path, manual.id = "CW35")
#join the datasets
joined <- join_datasets(dataset.LL, dataset.env)
#compare the number of rows
nrow(dataset.LL) + nrow(dataset.env) == nrow(joined)
#debug, when set to TRUE, will output a tibble of datasets with missing necessary columns
dataset.LL <- dataset.LL %>% dplyr::select(-Datetime)
join_datasets(dataset.LL, dataset.env, debug = TRUE)
Get the import expression for a device
Description
Returns the import expressions for all devices in LightLogR.
Usage
ll_import_expr()
Details
These expressions are used to import and prepare data from specific devices.
The list is made explicit, so that a user, requiring slight changes to the
import functions, (e.g., because a timestamp is formatted differently) can
modify or add to the list. The list can be turned into a fully functional
import function through import_adjustment()
.
Value
A list of import expressions for all supported devices
See Also
import_Dataset
Examples
ll_import_expr()[1]
Add a defined number to a numeric and log transform it
Description
Frequently, light exposure data need to be log-transformed. Because light exposure data frequently also contain many zero-values, adding a small value avoids losing those observations. Must be applied with care and reported. exp_zero_inflated() is the reverse function to log_zero_inflated().
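Conceptually, the pair of functions amounts to the following (a sketch inferred from the arguments and description, not the package source):
# log_zero_inflated(x, offset, base) presumably computes log(x + offset, base),
# and exp_zero_inflated(y, offset, base) the inverse, base^y - offset:
x <- c(0, 1, 10, 100)
y <- log(x + 0.1, base = 10) # forward transformation
10^y - 0.1                   # recovers 0, 1, 10, 100 (up to floating point)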
Usage
log_zero_inflated(x, offset = 0.1, base = 10)
exp_zero_inflated(x, offset = 0.1, base = 10)
Arguments
x |
A numeric vector |
offset | the amount to add to x. Defaults to 0.1. |
base | The logarithmic base. Defaults to 10. |
Value
a transformed numeric vector
References
Johannes Zauner, Carolina Guidolin, Manuel Spitschan (2025) How to deal with darkness: Modelling and visualization of zero-inflated personal light exposure data on a logarithmic scale. bioRxiv. doi: https://doi.org/10.1101/2024.12.30.630669
Examples
c(0, 1, 10, 100, 1000, 10000) |> log_zero_inflated()
#For use in a function
sample.data.environment |>
dplyr::filter(Id == "Participant") |>
dplyr::group_by(Date = lubridate::wday(Datetime, label = TRUE, week_start = 1)) |>
dplyr::summarize(
TAT250 = duration_above_threshold(log_zero_inflated(MEDI),
Datetime,
threshold = log_zero_inflated(250)
)
)
#Calling exp_zero_inflated on data transformed with log_zero_inflated yields the original values
c(0, 1, 10, 100, 1000, 10000) |> log_zero_inflated() |> exp_zero_inflated()
Calculate mean daily metrics from daily summary
Description
mean_daily
calculates a three-row summary of metrics showing average
weekday, weekend, and mean daily values of all non-grouping numeric columns.
The basis is a dataframe that contains metrics per weekday, or per date (with
calculate.from.Date = Datetime
). The function requires a column specifying
the day of the week as a factor (with Monday as the weekstart), or it can
calculate this from a date column if provided.
Usage
mean_daily(
data,
Weekend.type = Date,
na.rm = TRUE,
calculate.from.Date = NULL,
prefix = "average_",
filter.empty = FALSE,
sub.zero = FALSE,
Datetime2Time = TRUE
)
Arguments
data | A dataframe containing the metrics to summarize |
Weekend.type | A column in the dataframe that specifies the day of the week as a factor, where the week starts on Monday (so weekends are 6 and 7 in numeric representation). If it is a date, it will be converted to this factor. |
na.rm | Logical, whether to remove NA values when calculating means. Defaults to TRUE. |
calculate.from.Date | Optional. A column in the dataframe containing dates from which to calculate the Weekend.type. If provided, Weekend.type will be generated from this column. |
prefix | String that is the prefix on summarized values |
filter.empty | Filter out empty rows. Defaults to FALSE. |
sub.zero | Logical. Should missing values be replaced by zero? Defaults to FALSE. |
Datetime2Time | Logical of whether POSIXct columns should be transformed into hms (time) columns, which is usually sensible for averaging. Defaults to TRUE. |
Details
Summary values for type POSIXct are calculated as the mean, which can be nonsensical at times (e.g., the mean of Day1 18:00 and Day2 18:00 is Day2 6:00). This can be the desired result, but if the focus is on time rather than on datetime, it is recommended to convert values to times via hms::as_hms() before applying the function (the mean of 18:00 and 18:00 is then 18:00, not 6:00).
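A minimal illustration of the pitfall described above (plain base R plus hms, not part of the package):
# two identical clock times on consecutive days
x <- as.POSIXct(c("2023-01-01 18:00:00", "2023-01-02 18:00:00"), tz = "UTC")
mean(x)                           # "2023-01-02 06:00:00 UTC": midpoint in absolute time
hms::as_hms(mean(hms::as_hms(x))) # 18:00:00: mean of the times of day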
Value
A dataframe with three rows representing average weekday, weekend, and mean daily values of all numeric columns
Examples
# Create sample data
sample_data <- data.frame(
Date = factor(c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
levels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")),
lux = c(250, 300, 275, 280, 290, 350, 320),
duration = lubridate::as.duration(c(120, 130, 125, 135, 140, 180, 160))
)
# Calculate mean daily metrics
mean_daily(sample_data)
# With a Date column
sample_data_with_date <- data.frame(
Date = seq(as.Date("2023-05-01"), as.Date("2023-05-07"), by = "day"),
lux = c(250, 300, 275, 280, 290, 350, 320),
duration = lubridate::as.duration(c(120, 130, 125, 135, 140, 180, 160))
)
mean_daily(sample_data_with_date)
Calculate mean daily metrics from Time Series
Description
mean_daily_metric
is a convenience wrapper around mean_daily
that
summarizes data imported with LightLogR per weekday and calculates mean daily
values for a specific metric. Examples include duration_above_threshold()
(the default), or durations()
.
Usage
mean_daily_metric(
data,
Variable,
Weekend.type = Date,
Datetime.colname = Datetime,
metric_type = duration_above_threshold,
prefix = "average_",
filter.empty = FALSE,
Datetime2Time = TRUE,
...
)
Arguments
data | A dataframe containing light logger data imported with LightLogR |
Variable | The variable column to analyze. Expects a symbol. |
Weekend.type | A (new) column in the dataframe that specifies the day of the week as a factor |
Datetime.colname | Column name containing datetime values. Defaults to Datetime. |
metric_type | The metric function to apply. Defaults to duration_above_threshold(). |
prefix | String that is the prefix on summarized values |
filter.empty | Filter out empty rows. Defaults to FALSE. |
Datetime2Time | Logical of whether POSIXct columns should be transformed into hms (time) columns, which is usually sensible for averaging. Defaults to TRUE. |
... | Additional arguments passed to the metric function |
Value
A dataframe with three rows representing average weekday, weekend, and mean daily values for the specified metric
Examples
# Calculate mean daily duration above threshold. As the data only contains
# data for two days, Weekend and Mean daily will throw NA
sample.data.irregular |>
aggregate_Datetime(unit = "1 min") |>
mean_daily_metric(
Variable = lux,
threshold = 100
)
# again with another dataset
sample.data.environment |>
mean_daily_metric(
Variable = MEDI,
threshold = 250)
# by default, datetime columns are converted to time
sample.data.environment |>
mean_daily_metric(
Variable = MEDI,
metric_type = timing_above_threshold,
threshold = 250)
Midpoint of cumulative light exposure.
Description
This function calculates the timing corresponding to half of the cumulative light exposure within the given time series.
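The underlying idea can be sketched in a few lines of base R (illustrative only, not the package's implementation):
light <- c(rep(1, 6), rep(250, 13), rep(1, 5)) # hourly light levels
hour <- 0:23
# first hour at which half of the total exposure has been accumulated
hour[which(cumsum(light) >= sum(light) / 2)[1]] # 12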
Usage
midpointCE(Light.vector, Time.vector, na.rm = FALSE, as.df = FALSE)
Arguments
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
na.rm | Logical. Should missing values be removed for the calculation? Defaults to FALSE. |
as.df | Logical. Should the output be returned as a data frame? Defaults to FALSE. |
Value
Single column data frame or vector.
References
Shochat, T., Santhi, N., Herer, P., Flavell, S. A., Skeldon, A. C., & Dijk, D.-J. (2019). Sleep Timing in Late Autumn and Late Spring Associates With Light Exposure Rather Than Sun Time in College Students. Frontiers in Neuroscience, 13. doi:10.3389/fnins.2019.00882
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
See Also
Other metrics:
bright_dark_period()
,
centroidLE()
,
disparity_index()
,
dose()
,
duration_above_threshold()
,
exponential_moving_average()
,
frequency_crossing_threshold()
,
interdaily_stability()
,
intradaily_variability()
,
nvRC()
,
nvRD()
,
nvRD_cumulative_response()
,
period_above_threshold()
,
pulses_above_threshold()
,
threshold_for_duration()
,
timing_above_threshold()
Examples
dataset1 <-
tibble::tibble(
Id = rep("A", 24),
Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23),
MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
dataset1 %>%
dplyr::reframe(
"Midpoint of cmulative exposure" = midpointCE(MEDI, Datetime)
)
# Dataset with HMS time vector
dataset2 <-
tibble::tibble(
Id = rep("A", 24),
Time = hms::as_hms(lubridate::as_datetime(0) + lubridate::hours(0:23)),
MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
dataset2 %>%
dplyr::reframe(
"Midpoint of cmulative exposure" = midpointCE(MEDI, Time)
)
# Dataset with duration time vector
dataset3 <-
tibble::tibble(
Id = rep("A", 24),
Hour = lubridate::duration(0:23, "hours"),
MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
dataset3 %>%
dplyr::reframe(
"Midpoint of cmulative exposure" = midpointCE(MEDI, Hour)
)
Normalize counts between sensor outputs
Description
This is a niche helper function to normalize counts. Some sensors provide raw
counts and gain levels as part of their output. In some cases it is desirable
to compare counts between sensors, e.g., to gauge daylight outside by
comparing UV counts to photopic counts (a high ratio of UV/Pho indicates
outside daylight). Or to gauge daylight inside by comparing IR counts to
photopic counts (a high ratio of IR/Pho with a low ratio of UV/Pho indicates
daylight in the context of LED or fluorescent lighting). The user can provide their own gain ratio table, or use a table provided for a sensor in the gain.ratio.tables dataset from LightLogR.
Usage
normalize_counts(dataset, gain.columns, count.columns, gain.ratio.table)
Arguments
dataset | a dataframe that contains gain columns and count columns. |
gain.columns | a character vector of the column names that contain the gain values. Must have the same length as count.columns. |
count.columns | a character vector of the column names that contain the counts to be normalized. |
gain.ratio.table | a two-column tibble containing gain and gain.ratio information. |
Value
an extended dataset with new columns containing normalized counts
See Also
Other Spectrum:
spectral_integration()
,
spectral_reconstruction()
Examples
example.table <-
tibble::tibble(
uvGain = c(4096, 1024, 2),
visGain = c(4096, 4096, 4096),
irGain = c(2,2,2),
uvValue = c(692, 709, 658),
visValue = c(128369, 129657, 128609),
irValue = c(122193, 127113, 124837))
gain.columns = c("uvGain", "visGain", "irGain")
count.columns = c("uvValue", "visValue", "irValue")
example.table |>
normalize_counts(gain.columns, count.columns, gain.ratio.tables$TSL2585)
Number non-consecutive state occurrences
Description
number_states()
creates a new column in a dataset that takes a state column
and assigns a count value to each state, rising every time a state is
replaced by another state. E.g., a column with the states "day" and "night"
will produce a column indicating whether this is "day 1", "day 2", and so
forth, as will the "night" state with "night 1", "night 2", etc. Grouping
within the input dataset is respected, i.e., the count will reset for each
group.
Usage
number_states(
dataset,
state.colname,
colname.extension = ".count",
use.original.state = TRUE
)
Arguments
dataset | A dataframe that contains the state column. |
state.colname | Column name that contains the state. Expects a symbol. Needs to be part of the dataset. |
colname.extension | The extension that is added to the state name to create the new column. Defaults to ".count". |
use.original.state | Logical, whether the original state should be part of the output column. |
Details
The state column is not limited to two states, but can have as many states as
needed. Also, it does not matter in which time frames these states change, so
they do not necessarily conform to a 24-hour day. NA
values will be treated
as their own state.
Gaps in the data can lead to non-sensible outcomes, e.g., if there is no in-between state/observation between a day state at "18:00:00" and a day state at "6:00:00" - this would still be counted as day 1. In these cases, the gap_handler() function can be useful to add observations a priori.
Value
The input dataset with an additional column that counts the occurrences of each state. The new column will be of type character if use.original.state = TRUE and integer otherwise.
Examples
dataset <- tibble::tibble(
state =
c("day", "day", "day", "night", "night", "day", "day", "night",
"night", "night", "day", "night")
)
number_states(dataset, state)
number_states(dataset, state, use.original.state = FALSE)
#example with photoperiods, calculating the mean values for each day and night
coordinates <- c(48.52, 9.06)
sample.data.environment |>
add_photoperiod(coordinates) |>
number_states(photoperiod.state) |>
dplyr::group_by(photoperiod.state.count, .add = TRUE) |>
dplyr::summarize(mean_MEDI = mean(MEDI)) |>
tail(13)
Non-visual circadian response
Description
This function calculates the non-visual circadian response (nvRC). It takes into account the assumed response dynamics of the non-visual system and the circadian rhythm and processes the light exposure signal to quantify the effective circadian-weighted input to the non-visual system (see Details).
Usage
nvRC(
MEDI.vector,
Illuminance.vector,
Time.vector,
epoch = "dominant.epoch",
sleep.onset = NULL
)
Arguments
MEDI.vector |
Numeric vector containing the melanopic EDI data. |
Illuminance.vector |
Numeric vector containing the Illuminance data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
epoch | The epoch at which the data was sampled. Can be either a duration or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid duration string, e.g., "1 day". |
sleep.onset |
The time of habitual sleep onset. Can be HMS, numeric, or NULL.
If NULL (the default), then the data is assumed to start at habitual sleep onset.
If |
Details
The timeseries is assumed to be regular. Missing values in the light data will be replaced by 0.
Value
A numeric vector containing the nvRC data. The output has the same
length as Time.vector
.
References
Amundadottir, M.L. (2016). Light-driven model for identifying indicators of non-visual health potential in the built environment [Doctoral dissertation, EPFL]. EPFL infoscience. doi:10.5075/epfl-thesis-7146
See Also
Other metrics:
bright_dark_period()
,
centroidLE()
,
disparity_index()
,
dose()
,
duration_above_threshold()
,
exponential_moving_average()
,
frequency_crossing_threshold()
,
interdaily_stability()
,
intradaily_variability()
,
midpointCE()
,
nvRD()
,
nvRD_cumulative_response()
,
period_above_threshold()
,
pulses_above_threshold()
,
threshold_for_duration()
,
timing_above_threshold()
Examples
dataset1 <-
tibble::tibble(
Id = rep("B", 60 * 48),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*48-1)),
Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60),
rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60)),
MEDI = Illuminance * rep(sample(0.5:1.5, 48, replace = TRUE), each = 60)
)
# Time.vector as POSIXct
dataset1.nvRC <- dataset1 %>%
dplyr::mutate(
nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = hms::as_hms("22:00:00"))
)
# Time.vector as difftime
dataset2 <- dataset1 %>%
dplyr::mutate(Datetime = Datetime - lubridate::as_datetime(lubridate::dhours(22)))
dataset2.nvRC <- dataset2 %>%
dplyr::mutate(
nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = lubridate::dhours(0))
)
Performance metrics for circadian response
Description
These functions compare the non-visual circadian response (see nvRC
)
for measured personal light exposure to the nvRC for a reference light exposure pattern,
such as daylight.
Usage
nvRC_circadianDisturbance(nvRC, nvRC.ref, as.df = FALSE)
nvRC_circadianBias(nvRC, nvRC.ref, as.df = FALSE)
nvRC_relativeAmplitudeError(nvRC, nvRC.ref, as.df = FALSE)
Arguments
nvRC |
Time series of non-visual circadian response
(see |
nvRC.ref | Time series of non-visual circadian response for a reference light exposure pattern (see nvRC). |
as.df | Logical. Should the output be returned as a data frame? Defaults to FALSE. |
Details
nvRC_circadianDisturbance()
calculates the circadian disturbance (CD).
It is expressed as
CD(i,T) = \frac{1}{T}\int_{t_{i}}^{t_{i}+T} \lvert r_{C}(t)-r_{C}^{ref}(t)\rvert \, dt,
and quantifies the total difference between the measured circadian response and the circadian response to a reference profile.
nvRC_circadianBias()
calculates the circadian bias (CB).
It is expressed as
CB(i,T) = \frac{1}{T}\int_{t_{i}}^{t_{i}+T} \left( r_{C}(t)-r_{C}^{ref}(t) \right) dt,
and provides a measure of the overall trend for the difference in circadian response, i.e. positive values for overestimating and negative for underestimating between the measured circadian response and the circadian response to a reference profile.
nvRC_relativeAmplitudeError()
calculates the relative amplitude error (RAE).
It is expressed as
RAE(i,T)=r_{C,max}-r_{C,max}^{ref},
and quantifies the difference between the maximum response achieved in a period and that of the reference signal.
Value
A numeric value or single column data frame.
References
Amundadottir, M.L. (2016). Light-driven model for identifying indicators of non-visual health potential in the built environment [Doctoral dissertation, EPFL]. EPFL infoscience. doi:10.5075/epfl-thesis-7146
Examples
dataset1 <-
tibble::tibble(
Id = rep("B", 60 * 24),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)),
Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60)),
MEDI = Illuminance * rep(sample(0.5:1.5, 24, replace = TRUE), each = 60),
) %>%
dplyr::mutate(
nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = hms::as_hms("22:00:00"))
)
dataset.reference <-
tibble::tibble(
Id = rep("Daylight", 60 * 24),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)),
Illuminance = c(rep(0, 60*6), rep(10000, 12*60), rep(0, 60*6)),
MEDI = Illuminance
) %>%
dplyr::mutate(
nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = hms::as_hms("22:00:00"))
)
# Circadian disturbance
nvRC_circadianDisturbance(dataset1$nvRC, dataset.reference$nvRC)
# Circadian bias
nvRC_circadianBias(dataset1$nvRC, dataset.reference$nvRC)
# Relative amplitude error
nvRC_relativeAmplitudeError(dataset1$nvRC, dataset.reference$nvRC)
Non-visual direct response
Description
This function calculates the non-visual direct response (nvRD). It takes into account the assumed response dynamics of the non-visual system and processes the light exposure signal to quantify the effective direct input to the non-visual system (see Details).
Usage
nvRD(MEDI.vector, Illuminance.vector, Time.vector, epoch = "dominant.epoch")
Arguments
MEDI.vector |
Numeric vector containing the melanopic EDI data. |
Illuminance.vector |
Numeric vector containing the Illuminance data. |
Time.vector | Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
epoch | The epoch at which the data was sampled. Can be either a duration or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid duration string, e.g., "1 day". |
Details
The timeseries is assumed to be regular. Missing values in the light data will be replaced by 0.
Value
A numeric vector containing the nvRD data. The output has the same
length as Time.vector
.
References
Amundadottir, M.L. (2016). Light-driven model for identifying indicators of non-visual health potential in the built environment [Doctoral dissertation, EPFL]. EPFL infoscience. doi:10.5075/epfl-thesis-7146
See Also
Other metrics:
bright_dark_period()
,
centroidLE()
,
disparity_index()
,
dose()
,
duration_above_threshold()
,
exponential_moving_average()
,
frequency_crossing_threshold()
,
interdaily_stability()
,
intradaily_variability()
,
midpointCE()
,
nvRC()
,
nvRD_cumulative_response()
,
period_above_threshold()
,
pulses_above_threshold()
,
threshold_for_duration()
,
timing_above_threshold()
Examples
# Dataset 1 with 24h measurement
dataset1 <-
tibble::tibble(
Id = rep("A", 60 * 24),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)),
Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60)),
MEDI = Illuminance * rep(sample(0.5:1.5, 24, replace = TRUE), each = 60)
)
# Dataset 2 with 48h measurement
dataset2 <-
tibble::tibble(
Id = rep("B", 60 * 48),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*48-1)),
Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60),
rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60)),
MEDI = Illuminance * rep(sample(0.5:1.5, 48, replace = TRUE), each = 60)
)
# Combined datasets
dataset.combined <- rbind(dataset1, dataset2)
# Calculate nvRD per ID
dataset.combined.nvRD <- dataset.combined %>%
dplyr::group_by(Id) %>%
dplyr::mutate(
nvRD = nvRD(MEDI, Illuminance, Datetime)
)
Cumulative non-visual direct response
Description
This function calculates the cumulative non-visual direct response (nvRD). This is basically the integral of the nvRD over the provided time period in hours. The unit of the resulting value thus is "nvRD*h".
Usage
nvRD_cumulative_response(
nvRD,
Time.vector,
epoch = "dominant.epoch",
as.df = FALSE
)
Arguments
nvRD | Numeric vector containing the non-visual direct response. See nvRD. |
Time.vector | Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
epoch | The epoch at which the data was sampled. Can be either a duration or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid duration string, e.g., "1 day". |
as.df | Logical. Should a data frame be returned? Defaults to FALSE. |
Value
A numeric value or single column data frame.
References
Amundadottir, M.L. (2016). Light-driven model for identifying indicators of non-visual health potential in the built environment [Doctoral dissertation, EPFL]. EPFL infoscience. doi:10.5075/epfl-thesis-7146
See Also
Other metrics:
bright_dark_period()
,
centroidLE()
,
disparity_index()
,
dose()
,
duration_above_threshold()
,
exponential_moving_average()
,
frequency_crossing_threshold()
,
interdaily_stability()
,
intradaily_variability()
,
midpointCE()
,
nvRC()
,
nvRD()
,
period_above_threshold()
,
pulses_above_threshold()
,
threshold_for_duration()
,
timing_above_threshold()
Examples
dataset1 <-
tibble::tibble(
Id = rep("A", 60 * 24),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)),
Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 14, replace = TRUE), each = 60), rep(0, 60*2)),
MEDI = Illuminance * rep(sample(0.5:1.5, 24, replace = TRUE), each = 60)
) %>%
dplyr::mutate(
nvRD = nvRD(MEDI, Illuminance, Datetime)
)
dataset1 %>%
dplyr::summarise(
"cumulative nvRD" = nvRD_cumulative_response(nvRD, Datetime)
)
Length of longest continuous period above/below threshold
Description
This function finds the length of the longest continuous period above/below a specified threshold light level or within a specified range of light levels.
Usage
period_above_threshold(
Light.vector,
Time.vector,
comparison = c("above", "below"),
threshold,
epoch = "dominant.epoch",
loop = FALSE,
na.replace = FALSE,
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
comparison | String specifying whether the period of light levels above or below threshold should be calculated. Can be either "above" (the default) or "below". |
threshold | Single numeric value or two numeric values specifying the threshold light level(s) to compare with. If a vector with two values is provided, the period of light levels within the two thresholds will be calculated. |
epoch | The epoch at which the data was sampled. Can be either a duration or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid duration string, e.g., "1 day". |
loop | Logical. Should the data be looped? Defaults to FALSE. |
na.replace | Logical. Should missing values (NA) be replaced for the calculation? Defaults to FALSE. |
na.rm | Logical. Should missing values (NA) be removed for the calculation? Defaults to FALSE. |
as.df | Logical. Should a data frame be returned? Defaults to FALSE. |
Value
A duration object (see duration
) as single value,
or single column data frame.
See Also
Other metrics:
bright_dark_period()
,
centroidLE()
,
disparity_index()
,
dose()
,
duration_above_threshold()
,
exponential_moving_average()
,
frequency_crossing_threshold()
,
interdaily_stability()
,
intradaily_variability()
,
midpointCE()
,
nvRC()
,
nvRD()
,
nvRD_cumulative_response()
,
pulses_above_threshold()
,
threshold_for_duration()
,
timing_above_threshold()
Examples
N <- 60
# Dataset with continuous period of >250lx for 35min
dataset1 <-
tibble::tibble(
Id = rep("A", N),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(1:N),
MEDI = c(sample(1:249, N-35, replace = TRUE),
sample(250:1000, 35, replace = TRUE))
)
dataset1 %>%
dplyr::reframe("Period >250lx" = period_above_threshold(MEDI, Datetime, threshold = 250))
dataset1 %>%
dplyr::reframe("Period <250lx" = period_above_threshold(MEDI, Datetime, "below", threshold = 250))
# Dataset with continuous period of 100-250lx for 20min
dataset2 <-
tibble::tibble(
Id = rep("B", N),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(1:N),
MEDI = c(sample(c(1:99, 251:1000), N-20, replace = TRUE),
sample(100:250, 20, replace = TRUE)),
)
dataset2 %>%
dplyr::reframe("Period 250lx" = period_above_threshold(MEDI, Datetime, threshold = c(100,250)))
# Return data frame
dataset1 %>%
dplyr::reframe(period_above_threshold(MEDI, Datetime, threshold = 250, as.df = TRUE))
Calculate photoperiod and boundary times
Description
A family of functions to extract and add photoperiod information.
photoperiod() creates a tibble with the calculated times of dawn and dusk for the given location and date. The function is a convenience wrapper for suntools::crepuscule() to calculate the times of dawn and dusk. By default, civil dawn and dusk are calculated, but the function can be used to calculate other times by changing the solarDep parameter (e.g., 0 for sunrise/sunset, 12 for nautical, and 18 for astronomical).
Taking a light exposure dataset as input, extract_photoperiod()
calculates
the photoperiods and their boundary times for each unique day in the dataset,
given a location and boundary condition (i.e., the solar depression angle).
Basically, this is a convenience wrapper for photoperiod()
that takes a
light logger dataset and extracts unique dates and the time zone from the
dataset.
add_photoperiod() adds photoperiod information to a light logger dataset. Beyond the photoperiod information, it will categorize the photoperiod.state as "day" or "night". If overwrite is set to TRUE, the function will overwrite any columns with the same name.
solar_noon() calculates the solar noon for a given location and date. The function is a convenience wrapper for suntools::solarnoon(). The function has no companions like extract_photoperiod() or add_photoperiod(), but will be extended, if there is sufficient interest.
Usage
photoperiod(coordinates, dates, tz, solarDep = 6)
extract_photoperiod(
dataset,
coordinates,
Datetime.colname = Datetime,
solarDep = 6
)
add_photoperiod(
dataset,
coordinates,
Datetime.colname = Datetime,
solarDep = 6,
overwrite = FALSE
)
solar_noon(coordinates, dates, tz)
Arguments
coordinates |
A two element numeric vector representing the latitude and longitude of the location. Important note: Latitude is the first element and Longitude is the second element. |
dates | A date of class Date, or a character vector that can be converted to dates. |
tz | Timezone of the data. Expects a character. You can look up the supported timezones with OlsonNames(). |
solarDep | A numerical value representing the solar depression angle between 90 and -90. This means a value of 6 equals -6 degrees above the horizon. Default is 6, equalling civil dawn/dusk. |
dataset | A light logger dataset. Expects a dataframe. |
Datetime.colname | column name that contains the datetime. Defaults to Datetime, which is automatically correct for data imported with LightLogR. |
overwrite | Logical scalar. If TRUE, the function will overwrite any columns with the same name. Defaults to FALSE. |
Details
Please note that all functions of the photoperiod
family work with one
coordinate pair at a time. If you have multiple locations (and multiple time
zones), you need to run the function for each location separately. We suggest
using a nested dataframe structure, and employ the purrr
package to iterate
over the locations.
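A hedged sketch of such an iteration (the site names and the second coordinate pair are made up for illustration):
locations <- tibble::tibble(
  site = c("Tuebingen", "Lausanne"),
  coordinates = list(c(48.52, 9.06), c(46.52, 6.63)),
  tz = c("Europe/Berlin", "Europe/Zurich")
)
locations |>
  dplyr::mutate(
    photoperiods = purrr::map2(
      coordinates, tz,
      \(coords, zone) photoperiod(coords, dates = "2023-06-01", tz = zone)
    )
  ) |>
  dplyr::select(site, photoperiods) |> # drop the outer tz column to avoid a name clash
  tidyr::unnest(photoperiods)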
Value
photoperiod() returns a tibble with the calculated times of dawn and dusk for the given location and date, with the length equal to the dates input parameter. The tibble contains the following columns:
- date with the date of the calculation, stored as class Date
- tz with the timezone of the output, stored as class character
- lat and lon with the latitude and longitude of the location, stored as class numeric
- solar.angle with the negative solar depression angle, i.e., the sun elevation above the horizon, stored as class numeric
- dawn and dusk with the calculated datetimes, stored as class POSIXct
- photoperiod with the calculated photoperiod, stored as class difftime.
extract_photoperiod() returns a tibble of the same structure as photoperiod(), but with a length equal to the number of unique dates in the dataset.
add_photoperiod() returns the input dataset with the added photoperiod information. The information is appended with the following columns: dawn, dusk, photoperiod, and photoperiod.state.
solar_noon() returns a tibble with the calculated solar noon.
See Also
Other photoperiod:
gg_photoperiod()
Examples
#example for Tübingen, Germany
coordinates <- c(48.521637, 9.057645)
dates <- c("2023-06-01", "2025-08-23")
tz <- "Europe/Berlin"
#civil dawn/dusk
photoperiod(coordinates, dates, tz)
#sunrise/sunset
photoperiod(coordinates, dates, tz, solarDep = 0)
#extract_photoperiod
sample.data.environment |>
extract_photoperiod(coordinates)
#add_photoperiod
added_photoperiod <-
sample.data.environment |>
add_photoperiod(coordinates)
added_photoperiod |> head()
added_photoperiod |>
filter_Date(length = "3 days") |>
gg_days(aes_col = photoperiod.state,
group = dplyr::consecutive_id(photoperiod.state),
jco_color = TRUE
)
added_photoperiod |>
filter_Date(length = "3 days") |>
gg_day(aes_col = Id) +
ggplot2::geom_rect(
data = \(x) x |> dplyr::ungroup(Id) |> dplyr::summarize(dawn = mean(dawn) |> hms::as_hms()),
ggplot2::aes(xmin = 0, xmax = dawn, ymin = -Inf, ymax = Inf),
alpha = 0.1
) +
ggplot2::geom_rect(
data = \(x) x |> dplyr::ungroup(Id) |> dplyr::summarize(dusk = mean(dusk) |> hms::as_hms()),
ggplot2::aes(xmin = dusk, xmax = 24*60*60, ymin = -Inf, ymax = Inf),
alpha = 0.1
)
added_photoperiod |> dplyr::summarize(dawn = mean(dawn) |> hms::as_hms())
#solar_noon()
solar_noon(coordinates, dates, tz)
Pulses above threshold
Description
This function clusters the light data into continuous clusters (pulses) of light above/below a given threshold. Clustering may be fine-tuned by setting the minimum length of the clusters and by allowing brief interruptions to be included in a single cluster, with a specified maximum length of interruption episodes and proportion of total amount of interruptions to light above threshold.
Usage
pulses_above_threshold(
Light.vector,
Time.vector,
comparison = c("above", "below"),
threshold,
min.length = "2 mins",
max.interrupt = "8 mins",
prop.interrupt = 0.25,
epoch = "dominant.epoch",
return.indices = FALSE,
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector | Numeric vector containing the light data. Missing values will be replaced by 0. |
Time.vector | Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
comparison | String specifying whether the time above or below threshold should be calculated. Can be either "above" (the default) or "below". |
threshold | Single numeric value or two numeric values specifying the threshold light level(s) to compare with. If a vector with two values is provided, the timing corresponding to light levels between the two thresholds will be calculated. |
min.length | The minimum length of a pulse. Can be either a duration or a string. If it is a string, it needs to be a valid duration string, e.g., "1 day" or "10 sec". Defaults to "2 mins". |
max.interrupt | Maximum length of each episode of interruptions. Can be either a duration or a string. If it is a string, it needs to be a valid duration string, e.g., "1 day" or "10 sec". Defaults to "8 mins". |
prop.interrupt | Numeric value between 0 and 1 specifying the maximum proportion of the total pulse length that may consist of interruptions. Defaults to 0.25. |
epoch | The epoch at which the data was sampled. Can be either a duration or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid duration string, e.g., "1 day". |
return.indices | Logical. Should the cluster indices be returned? Only works if as.df = FALSE. Defaults to FALSE. |
na.rm | Logical. Should missing values be removed for the calculation of pulse metrics? Defaults to FALSE. |
as.df | Logical. Should a data frame be returned? Defaults to FALSE. |
Details
The timeseries is assumed to be regular. Missing values in the light data will be replaced by 0.
Value
List or data frame with calculated values.
References
Wilson, J., Reid, K. J., Braun, R. I., Abbott, S. M., & Zee, P. C. (2018). Habitual light exposure relative to circadian timing in delayed sleep-wake phase disorder. Sleep, 41(11). doi:10.1093/sleep/zsy166
See Also
Other metrics:
bright_dark_period()
,
centroidLE()
,
disparity_index()
,
dose()
,
duration_above_threshold()
,
exponential_moving_average()
,
frequency_crossing_threshold()
,
interdaily_stability()
,
intradaily_variability()
,
midpointCE()
,
nvRC()
,
nvRD()
,
nvRD_cumulative_response()
,
period_above_threshold()
,
threshold_for_duration()
,
timing_above_threshold()
Examples
# Sample data
data = sample.data.environment %>%
dplyr::filter(Id == "Participant") %>%
filter_Datetime(length = lubridate::days(1)) %>%
dplyr::mutate(
Time = hms::as_hms(Datetime),
)
# Time vector as datetime
data %>%
dplyr::reframe(pulses_above_threshold(MEDI, Datetime, threshold = 250, as.df = TRUE))
# Time vector as hms time
data %>%
dplyr::reframe(pulses_above_threshold(MEDI, Time, threshold = 250, as.df = TRUE))
# Pulses below threshold
data %>%
dplyr::reframe(pulses_above_threshold(MEDI, Datetime, "below", threshold = 250, as.df = TRUE))
# Pulses within threshold range
data %>%
dplyr::reframe(pulses_above_threshold(MEDI, Datetime, threshold = c(250,1000), as.df = TRUE))
Remove groups that have too few data points
Description
This function removes groups from a dataframe that do not have sufficient
data points. Groups of one data point will automatically be removed. Single
data points are common after using aggregate_Datetime()
.
Usage
remove_partial_data(
dataset,
Variable.colname = Datetime,
threshold.missing = 0.2,
by.date = FALSE,
Datetime.colname = Datetime,
show.result = FALSE,
handle.gaps = FALSE
)
Arguments
dataset |
A light logger dataset. Expects a dataframe. If not imported by LightLogR, take care to choose sensible variables for the Datetime.colname and Variable.colname. |
Variable.colname | Column name that contains the variable for which to assess sufficient datapoints. Expects a symbol. Needs to be part of the dataset. Default is Datetime. |
threshold.missing | either a numeric value between 0 and 1, indicating the maximum allowed fraction of missing data per group (default is 0.2, i.e., 20%), or a duration (or duration string, e.g., "2 days") of the maximum allowed missing time. |
by.date | Logical. Should the data be (additionally) grouped by day? Defaults to FALSE. |
Datetime.colname | Column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol. Needs to be part of the dataset. Must be of type POSIXct. |
show.result | Logical, whether the output of the function is a summary of the data (TRUE), or the reduced dataset (FALSE, the default) |
handle.gaps | Logical, whether the data shall be treated with gap_handler() first. Defaults to FALSE. |
Value
if show.result = FALSE
(default), a reduced dataframe without the
groups that did not have sufficient data
Examples
#create sample data with gaps
gapped_data <-
sample.data.environment |>
dplyr::filter(MEDI < 30000)
#check their status, based on the MEDI variable
gapped_data |> remove_partial_data(MEDI, handle.gaps = TRUE, show.result = TRUE)
#the function will produce a warning if implicit gaps are present
gapped_data |> remove_partial_data(MEDI, show.result = TRUE)
#one group (Environment) does not make the cut of 20% missing data
gapped_data |> remove_partial_data(MEDI, handle.gaps = TRUE) |> dplyr::count(Id)
#for comparison
gapped_data |> dplyr::count(Id)
#If the threshold is set differently, e.g., to 2 days allowed missing, results vary
gapped_data |>
remove_partial_data(MEDI, handle.gaps = TRUE, threshold.missing = "2 days") |>
dplyr::count(Id)
#The removal can be automatically switched to daily detections within groups
gapped_data |>
remove_partial_data(MEDI, handle.gaps = TRUE, by.date = TRUE, show.result = TRUE) |>
head()
Create a reverse transformation function specifically for date scales
Description
This helper function is exclusive to gg_heatmap(), to get a reversed date sequence.
Usage
reverse2_trans()
Value
a transformation function
Source
from https://github.com/tidyverse/ggplot2/issues/4014
Examples
reverse2_trans()
Sample of wearable data combined with environmental data
Description
A subset of data from a study at the TSCN-Lab using the ActLumus light logger. This dataset contains personal light exposure information for one participant over the course of six full days. The dataset is measured with a 10 second epoch and is complete (no missing values). Additionally, environmental light data were captured with a second light logger mounted horizontally at the TUM university roof, without any obstructions (besides a transparent plastic half-dome). The epoch for this data is 30 seconds. This dataset allows for some interesting calculations based on available daylight at a given point in time.
Usage
sample.data.environment
Format
sample.data.environment
A tibble with 69,120 rows and 3 columns:
- Datetime: POSIXct Datetime
- MEDI: melanopic EDI measurement data. Unit is lux.
- Id: A character vector indicating whether the data is from the Participant or from the Environment.
Sample of highly irregular wearable data
Description
A dataset collected with a wearable device that has a somewhat irregular recording pattern. Overall, the data are recorded every 15 seconds. Every tenth or so measurement takes 16 seconds, every hundredth 17 seconds, every thousandth 18 seconds, and so on. This makes the dataset a prime example for handling and dealing with irregular data.
Usage
sample.data.irregular
Format
sample.data.irregular
A tibble with 11,422 rows and 13 columns:
- Id: A character vector indicating the participant (only P1).
- Datetime: POSIXct Datetime
- lux: numeric illuminance. Unit is lux.
- kelvin: numeric correlated colour temperature (CCT). Unit is Kelvin.
- rgbR: numeric red sensor channel output. Unit is W/m2/nm.
- rgbG: numeric green sensor channel output. Unit is W/m2/nm.
- rgbB: numeric blue sensor channel output. Unit is W/m2/nm.
- rgbIR: numeric infrared sensor channel output. Unit is W/m2/nm.
- movement: numeric indicator for movement (intensity) of the device. Movement is given in discrete counts correlating to the number of instances the accelerometer records instances greater than 0.1875g per 15s sampling interval.
- MEDI: melanopic EDI measurement data. Unit is lux.
- R.: Unknown, but likely direct or derived output from the red sensor channel
- G.: Unknown, but likely direct or derived output from the green sensor channel
- B.: Unknown, but likely direct or derived output from the blue sensor channel
Statechange (sc) Timestamps to Intervals
Description
Takes an input of datetimes and Statechanges and creates a column with Intervals. If full = TRUE, it will also create intervals for the day prior to the first state change and after the last. If output.dataset = FALSE it will give a named vector, otherwise a tibble. The state change info requires a description or name of the state (like "sleep" or "wake", or "wear") that goes into effect at the given Datetime. Works for grouped data so that it does not mix up intervals between participants. Missing data should be explicit if at all possible. Also, the maximum allowed length of an interval can be set, so that implicit missing timestamps after a set period of time can be enforced.
Usage
sc2interval(
dataset,
Datetime.colname = Datetime,
Statechange.colname = State,
State.colname = State,
Interval.colname = Interval,
full = TRUE,
starting.state = NA,
output.dataset = TRUE,
Datetime.keep = FALSE,
length.restriction = 60 * 60 * 24
)
Arguments
dataset | A light logger dataset. Expects a dataframe. |
Datetime.colname | column name that contains the datetime. Defaults to Datetime, which is automatically correct for data imported with LightLogR. |
Statechange.colname , Interval.colname , State.colname | Column names that do contain the name/description of the Statechange, the Interval, and the State. |
full , starting.state | These arguments handle the state on the first day before the first state change and after the last state change on the last day. If full = TRUE (the default), these intervals are created, with starting.state (default NA) as the state of the first interval. |
output.dataset | should the output be a data.frame (Default TRUE) or a named vector of intervals (FALSE)? |
Datetime.keep | If TRUE, the original Datetime column is kept in the output. Defaults to FALSE. |
length.restriction | If the length between intervals is too great, the interval state can be set to NA. Defaults to 60 * 60 * 24 seconds (one day). |
Value
One of
- a data.frame object identical to dataset but with the interval instead of the datetime. The original Statechange column now indicates the State during the Interval.
- a named vector with the intervals, where the names are the states
Examples
library(tibble)
library(lubridate)
library(dplyr)
sample <- tibble::tibble(Datetime = c("2023-08-15 6:00:00",
"2023-08-15 23:00:00",
"2023-08-16 6:00:00",
"2023-08-16 22:00:00",
"2023-08-17 6:30:00",
"2023-08-18 1:00:00"),
State = rep(c("wake", "sleep"), 3),
Id = "Participant")
#intervals from sample
sc2interval(sample)
#compare sample (y) and intervals (x)
sc2interval(sample) %>%
mutate(Datetime = int_start(Interval)) %>%
dplyr::left_join(sample, by = c("Id", "State"),
relationship = "many-to-many") %>%
head()
Recode Sleep/Wake intervals to Brown state intervals
Description
Takes a dataset with sleep/wake intervals and recodes them to Brown state intervals. Specifically, it recodes the sleep intervals to night, reduces wake intervals by a specified evening.length and recodes them to evening and day intervals. The evening.length is the time between day and night. The result can be used as input for interval2state() and might be used subsequently with Brown2reference().
Usage
sleep_int2Brown(
dataset,
Interval.colname = Interval,
Sleep.colname = State,
wake.state = "wake",
sleep.state = "sleep",
Brown.day = "day",
Brown.evening = "evening",
Brown.night = "night",
evening.length = lubridate::dhours(3),
Brown.state.colname = State.Brown,
output.dataset = TRUE
)
Arguments
dataset | A dataset with sleep/wake intervals. |
Interval.colname | The name of the column with the intervals. Defaults to Interval. |
Sleep.colname | The name of the column with the sleep/wake states. Defaults to State. |
wake.state , sleep.state | The names of the wake and sleep states in the Sleep.colname. Default to "wake" and "sleep". |
Brown.day , Brown.evening , Brown.night | The names of the Brown states that will be used. Default to "day", "evening", and "night". |
evening.length | The length of the evening interval in seconds. Can also use lubridate duration or period objects. Defaults to 3 hours. |
Brown.state.colname | The name of the column with the newly created Brown states. Defaults to State.Brown. |
output.dataset | Whether to return the whole dataset or only a vector of the Brown states. Defaults to TRUE. |
Details
The function will filter out any non-sleep intervals that are shorter than the specified evening.length. This prevents problematic behaviour when the evening.length is longer than the wake intervals or, e.g., when the first state is sleep after midnight and there is a prior NA interval from midnight till sleep. This behavior might, however, result in problematic results for specialized experimental setups with ultra-short wake/sleep cycles. The sleep_int2Brown() function would not be applicable in those cases anyway.
Value
A dataset with the Brown states or a vector with the Brown states. The Brown states are created in a new column with the name specified in Brown.state.colname. The dataset will have more rows than the original dataset, because the wake intervals are split into day and evening intervals.
References
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001571
See Also
Other Brown:
Brown2reference()
,
Brown_check()
,
Brown_cut()
,
Brown_rec()
Examples
#create a sample dataset
sample <- tibble::tibble(Datetime = c("2023-08-15 6:00:00",
"2023-08-15 23:00:00",
"2023-08-16 6:00:00",
"2023-08-16 22:00:00",
"2023-08-17 6:30:00",
"2023-08-18 1:00:00"),
State = rep(c("wake", "sleep"), 3),
Id = "Participant")
#intervals from sample
sc2interval(sample)
#recoded intervals
sc2interval(sample) %>% sleep_int2Brown()
Integrate spectral irradiance with optional weighting
Description
Integrates over a given spectrum, optionally over only a portion of the spectrum, optionally with a weighting function. Can be used to calculate spectral contributions in certain wavelength ranges, or to calculate (alphaopically equivalent daylight) illuminance.
Usage
spectral_integration(
spectrum,
wavelength.range = NULL,
action.spectrum = NULL,
general.weight = 1
)
Arguments
spectrum |
Tibble with spectral data (1st col: wavelength, 2nd col: SPD values) |
wavelength.range |
Optional integration bounds (length-2 numeric) |
action.spectrum | Either: a string naming a built-in action spectrum ("photopic", "melanopic", "rhodopic", "l_cone_opic", "m_cone_opic", "s_cone_opic"), or a dataframe with wavelength and weight columns. |
general.weight |
Scalar multiplier or "auto" for built-in efficacies |
Details
The function uses trapezoidal integration and recognizes differing step-widths in the spectrum. If an action spectrum is used, values of the action spectrum at the spectral wavelengths are interpolated with stats::approx().
The efficacies used for the auto-weighting are:
photopic: 683.0015478
melanopic: 1/0.0013262
rhodopic: 1/0.0014497
l_cone_opic: 1/0.0016289
m_cone_opic: 1/0.0014558
s_cone_opic: 1/0.0008173
This requires input values in W/(m^2) for the spectrum. If it is provided in other units, the result has to be rescaled afterwards.
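To make the integration concrete, here is a minimal trapezoidal rule in plain R (a sketch of the principle, not the package's internal code):
trapz <- function(wl, values) {
  # sum of trapezoid areas between adjacent wavelength steps;
  # this naturally handles differing step-widths
  sum(diff(wl) * (utils::head(values, -1) + utils::tail(values, -1)) / 2)
}
spd <- data.frame(wl = 380:780, values = 1)
trapz(spd$wl, spd$values) # 400, the width of the wavelength range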
Value
Numeric integrated value
See Also
Other Spectrum:
normalize_counts()
,
spectral_reconstruction()
Examples
# creating an equal energy spectrum of value 1
spd <- data.frame(wl = 380:780, values = 1)
#integrating over the full spectrum
spectral_integration(spd)
#integrating over wavelengths 400-500 nm
spectral_integration(spd, wavelength.range = c(400, 500))
#calculating the photopic illuminance of an equal energy spectrum with 1 W/(m^2*nm)
spectral_integration(spd, action.spectrum = "photopic", general.weight = "auto")
#calculating the melanopic EDI of an equal energy spectrum with 1 W/(m^2*nm)
spectral_integration(spd, action.spectrum = "melanopic", general.weight = "auto")
# Custom action spectrum
custom_act <- data.frame(wavelength = 400:700, weight = 0.5)
spectral_integration(spd, wavelength.range = c(400,700),
action.spectrum = custom_act, general.weight = 2)
#using a spectrum that is broader than the action spectrum will not change the
#output, as the action spectrum will use zeros beyond its range
Reconstruct spectral irradiance from sensor counts
Description
This function takes sensor data in the form of (normalized) counts and reconstructs a spectral power distribution (SPD) through a calibration matrix. The matrix takes the form of sensor channel x wavelength, and the spectrum results from a linear combination of counts x calibration-value for any wavelength in the matrix. Handles multiple sensor readings by returning a list of spectra.
Usage
spectral_reconstruction(
sensor_channels,
calibration_matrix,
format = c("long", "wide")
)
Arguments
sensor_channels |
Named numeric vector or dataframe with sensor readings. Names must match calibration matrix columns. |
calibration_matrix |
Matrix or dataframe with sensor-named columns and wavelength-indexed rows |
format |
Output format: "long" (list of tibbles) or "wide" (dataframe) |
Details
Please note that calibration matrices are not provided by LightLogR, but can be provided by a wearable device manufacturer. Counts can be normalized with the normalize_counts() function, provided that the output also contains a gain column.
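The linear algebra behind the reconstruction can be written out directly (an illustrative sketch using the toy calibration values from the examples below):
calib <- matrix(1:12, ncol = 3, dimnames = list(400:403, c("R", "G", "B")))
counts <- c(R = 1, G = 2, B = 3)
# one irradiance value per wavelength row: sum over channels of count x calibration value
as.vector(calib %*% counts) # 38 44 50 56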
Value
"long": List of tibbles (wavelength, irradiance)
"wide": Dataframe with wavelength columns and one row per spectrum
See Also
Other Spectrum:
normalize_counts()
,
spectral_integration()
Examples
# Calibration matrix example
calib <- matrix(1:12, ncol=3, dimnames = list(400:403, c("R", "G", "B")))
# Named vector input
spectral_reconstruction(c(R=1, G=2, B=3), calib)
# Dataframe input
df <- data.frame(R=1, G=2, B=3, other_col=10)
spectral_reconstruction(dplyr::select(df, R:B), calib)
# Multiple spectra: as list columns
df <- data.frame(Measurement = c(1,2), R=c(1,2), G=c(2,4), B=c(3,6))
df <-
df |>
dplyr::mutate(
Spectrum = spectral_reconstruction(dplyr::pick(R:B), calib)
)
df |> tidyr::unnest(Spectrum)
# Multiple spectra: as extended dataframes
df |>
dplyr::mutate(
Spectrum = spectral_reconstruction(dplyr::pick(R:B), calib, "wide"))
Summarize numeric columns in dataframes to means
Description
This simple helper function was created to summarize episodes of gaps, clusters, or states, focusing on numeric variables. It calculates mean values for all numeric columns and handles Duration objects appropriately.
Despite its name, the function summarizes all columns of type double, which is more inclusive than strictly numeric-class columns: it also covers classes such as durations and times that are stored as doubles.
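The core pattern is a grouped mean over all double columns with a name prefix. The following sketch is an assumption about the general approach, not the package's actual implementation:
library(dplyr)
df <- tibble::tibble(g = c("a", "a", "b"), x = c(1, 2, 3), y = c(10, 20, 30))
df |>
group_by(g) |>
summarize(across(where(is.double), \(col) mean(col, na.rm = TRUE),
.names = "mean_{.col}")) #yields mean_x and mean_y per group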
Usage
summarize_numeric(
data,
remove = NULL,
prefix = "mean_",
na.rm = TRUE,
complete.groups.on = NULL,
add.total.duration = TRUE,
durations.dec = 0,
Datetime2Time = TRUE
)
summarise_numeric(
data,
remove = NULL,
prefix = "mean_",
na.rm = TRUE,
complete.groups.on = NULL,
add.total.duration = TRUE,
durations.dec = 0,
Datetime2Time = TRUE
)
Arguments
data |
A dataframe containing numeric data, typically from one of the extract_* functions, such as extract_clusters(). |
remove |
Character vector of columns removed from the summary. |
prefix |
A prefix to add to the column names of summarized metrics. Defaults to "mean_". |
na.rm |
Whether to remove NA values when calculating means. Defaults to TRUE. |
complete.groups.on |
Column name that, together with grouping variables, can be used to provide a complete set of groups, filling in combinations that are otherwise absent. |
add.total.duration |
Logical, whether the total duration for a given
group should be calculated. Only relevant if a duration column is present. Defaults to TRUE. |
durations.dec |
Numeric. Number of decimals for the mean calculation of durations and times. Defaults to 0. |
Datetime2Time |
Logical of whether POSIXct columns should be transformed into hms (time) columns, which is usually sensible for averaging (default is TRUE). |
Value
A dataframe containing the summarized metrics.
Examples
# Extract clusters and summarize them
dataset <-
sample.data.environment %>%
aggregate_Datetime(unit = "15 mins") |>
extract_clusters(MEDI > 1000)
#input to summarize_numeric
dataset |> utils::head()
#output of summarize_numeric (removing state.count and epoch from the summary)
dataset |> summarize_numeric(c("state.count", "epoch"))
Get all the supported devices in LightLogR
Description
Returns a vector of all the supported devices in LightLogR.
Usage
supported_devices()
Details
These are all supported devices for which there is a dedicated import function. Import functions can be called either through import_Dataset() with the respective device = "device" argument, or directly, e.g., import$ActLumus().
Value
A character vector of all supported devices
Examples
supported_devices()
Scale positive and negative values on a log scale
Description
To create a plot with positive and negative (unscaled) values on a log-transformed axis, the values need to be scaled accordingly. Neither R nor ggplot2 has a built-in function for this, but the following function can be used to create a transformation function for that purpose. The function is based on a post on Stack Overflow. The symlog transformation is the standard transformation used, e.g., in gg_day().
Usage
symlog_trans(base = 10, thr = 1, scale = 1)
Arguments
base |
Base for the logarithmic transformation. The default is 10. |
thr |
Threshold after which a logarithmic transformation is applied. If the absolute value is below this threshold, values are scaled linearly. Defaults to 1. |
scale |
Scaling factor for logarithmically transformed values above the threshold. Defaults to 1. |
Details
The symlog transformation can be accessed either via the trans = "symlog" argument in a scaling function, or via trans = symlog_trans(). The latter allows setting the individual arguments.
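For intuition, a symlog-style forward mapping is linear inside the threshold and logarithmic outside it. A minimal sketch follows (illustrative; the exact form inside symlog_trans() may differ):
symlog_fwd <- function(x, base = 10, thr = 1, scale = 1) {
ifelse(abs(x) < thr,
x, #linear region around zero
sign(x) * (thr + scale * log(abs(x) / thr, base))) #log region beyond thr
}
symlog_fwd(c(-1000, -1, 0, 1, 1000)) #-4 -1 0 1 4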
Value
a transformation function that can be used in ggplot2 or plotly to scale positive and negative values on a log scale.
References
This function's code is a straight copy from a post on Stack Overflow. The author of the answer is Julius Vainora, and the author of the question is Brian B.
Examples
dataset <-
sample.data.environment %>%
filter_Date(end = "2023-08-29") %>%
dplyr::mutate(MEDI = dplyr::case_when(
Id == "Environment" ~ -MEDI,
.default = MEDI))
#basic application where only the transformation is set
dataset %>%
gg_day(aes_col = Id) +
ggplot2::scale_y_continuous(
trans = "symlog")
#the same plot, but with breaks and labels set manually
dataset %>%
gg_day(aes_col = Id) +
ggplot2::scale_y_continuous(
trans = "symlog",
breaks = c(-10^(5:0), 0, 10^(0:5)),
labels = function(x) format(x, scientific = FALSE, big.mark = " "))
#setting individual arguments of the symlog function manually allows
#e.g., to emphasize values smaller than 1
dataset %>%
gg_day(aes_col = Id) +
ggplot2::scale_y_continuous(
trans = symlog_trans(thr = 0.01),
breaks = c(-10^(5:-1), 0, 10^(-1:5)),
labels = function(x) format(x, scientific = FALSE, big.mark = " "))
Find threshold for given duration
Description
This function finds the threshold for which light levels are above/below for a given duration. It can be considered the inverse of duration_above_threshold().
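Conceptually (a sketch of the idea, not necessarily the package internals): with a regular epoch, the threshold above which light stays for a given duration is close to the k-th largest light value, where k is the duration expressed in epochs:
light <- runif(60, min = 0, max = 1000) #one hour of 1-minute samples
sort(light, decreasing = TRUE)[30] #candidate threshold for 30 mins "above"
sort(light)[30] #candidate threshold for 30 mins "below"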
Usage
threshold_for_duration(
Light.vector,
Time.vector,
duration,
comparison = c("above", "below"),
epoch = "dominant.epoch",
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
duration |
The duration for which the threshold should be found. Can be either a duration or a string. If it is a string, it needs to be a valid duration string, e.g., "30 mins". |
comparison |
String specifying whether light levels above or below the threshold should be considered. Can be either "above" (the default) or "below". |
epoch |
The epoch at which the data was sampled. Can be either a duration or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid duration string, e.g., "1 day". |
na.rm |
Logical. Should missing values (NA) be removed for the calculation? Defaults to FALSE. |
as.df |
Logical. Should a data frame be returned? If TRUE, a single-column data frame is returned. Defaults to FALSE. |
Value
Single numeric value or single column data frame.
See Also
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), dose(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), timing_above_threshold()
Examples
N <- 60
# Dataset with 30 min < 250 lx and 30 min > 250 lx
dataset1 <-
tibble::tibble(
Id = rep("A", N),
Datetime = lubridate::as_datetime(0) + lubridate::minutes(1:N),
MEDI = sample(c(sample(1:249, N / 2, replace = TRUE),
sample(250:1000, N / 2, replace = TRUE))),
)
dataset1 %>%
dplyr::reframe("Threshold above which for 30 mins" =
threshold_for_duration(MEDI, Datetime, duration = "30 mins"))
dataset1 %>%
dplyr::reframe("Threshold below which for 30 mins" =
threshold_for_duration(MEDI, Datetime, duration = "30 mins",
comparison = "below"))
dataset1 %>%
dplyr::reframe(threshold_for_duration(MEDI, Datetime, duration = "30 mins",
as.df = TRUE))
Mean/first/last timing above/below threshold.
Description
This function calculates the mean, first, and last timepoint (MLiT, FLiT, LLiT) where light levels are above or below a given threshold intensity within the given time interval.
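Conceptually (an illustrative sketch; boundary handling in the package may differ), the three metrics reduce to the mean, minimum, and maximum of the time points whose light level passes the threshold:
t <- 1:24 #hours as numeric time
l <- c(rep(1, 6), rep(250, 13), rep(1, 5)) #light levels
idx <- l > 100 #epochs above a 100 lx threshold
c(mean = mean(t[idx]), first = min(t[idx]), last = max(t[idx])) #13 7 19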
Usage
timing_above_threshold(
Light.vector,
Time.vector,
comparison = c("above", "below"),
threshold,
na.rm = FALSE,
as.df = FALSE
)
Arguments
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
comparison |
String specifying whether the time above or below threshold should be calculated. Can be either "above" (the default) or "below". |
threshold |
Single numeric value or two numeric values specifying the threshold light level(s) to compare with. If a vector with two values is provided, the timing corresponding to light levels between the two thresholds will be calculated. |
na.rm |
Logical. Should missing values be removed for the calculation? Defaults to FALSE. |
as.df |
Logical. Should a data frame be returned? If TRUE, a data frame with one column per metric is returned. Defaults to FALSE. |
Value
List or dataframe with the three values: mean, first, and last timing above threshold. The output type corresponds to the type of Time.vector, e.g., if Time.vector is HMS, the timing metrics will also be HMS, and vice versa for POSIXct and numeric.
References
Reid, K. J., Santostasi, G., Baron, K. G., Wilson, J., Kang, J., & Zee, P. C. (2014). Timing and Intensity of Light Correlate with Body Weight in Adults. PLOS ONE, 9(4), e92251. doi:10.1371/journal.pone.0092251
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
See Also
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), dose(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration()
Examples
# Dataset with light > 250lx between 06:00 and 18:00
dataset1 <-
tibble::tibble(
Id = rep("A", 24),
Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23),
MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
# Above threshold
dataset1 %>%
dplyr::reframe(timing_above_threshold(MEDI, Datetime, "above", 250, as.df = TRUE))
# Below threshold
dataset1 %>%
dplyr::reframe(timing_above_threshold(MEDI, Datetime, "below", 10, as.df = TRUE))
# Input = HMS -> Output = HMS
dataset1 %>%
dplyr::reframe(timing_above_threshold(MEDI, hms::as_hms(Datetime), "above", 250, as.df = TRUE))