Title: Create HIDECAN Plots for Visualising Genome-Wide Association Studies and Differential Expression Results
Version: 1.1.0
Description: Generates HIDECAN plots that summarise and combine the results of genome-wide association studies (GWAS) and transcriptomics differential expression analyses (DE), along with manually curated candidate genes of interest. The HIDECAN plot is presented in Angelin-Bonnet et al. (2023) (currently in review).
License: MIT + file LICENSE
URL: https://plantandfoodresearch.github.io/hidecan/, https://github.com/PlantandFoodResearch/hidecan
BugReports: https://github.com/PlantandFoodResearch/hidecan/issues
Encoding: UTF-8
RoxygenNote: 7.2.3
Depends: R (≥ 3.5.0)
Imports: dplyr, ggnewscale, ggplot2, ggrepel, purrr, shiny, tibble, tidyr, viridis, vroom
Suggests: knitr, rmarkdown, stringr, testthat (≥ 3.0.0)
Config/testthat/edition: 3
VignetteBuilder: knitr
Language: en-GB
NeedsCompilation: no
Packaged: 2023-02-09 02:07:11 UTC; HRPOAB
Author: Olivia Angelin-Bonnet
Maintainer: Olivia Angelin-Bonnet <olivia.angelin-bonnet@plantandfood.co.nz>
Repository: CRAN
Date/Publication: 2023-02-10 09:40:02 UTC
Checks whether some columns are present in a tibble
Description
Checks whether some columns are present in a tibble
Usage
.check_cols(x, col_names, param_name = "Input data-frame")
Arguments
x |
Tibble |
col_names |
character vector of column names |
param_name |
Character, name of the dataframe to use in the error message. |
Value
invisible NULL
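The following base R sketch illustrates the kind of check described above. It is not the package's implementation: the function name check_cols_sketch and the wording of the error message are assumptions made for illustration only.

```r
# Hypothetical stand-in for a column check; name and message are illustrative.
check_cols_sketch <- function(x, col_names, param_name = "Input data-frame") {
  # Columns requested but absent from the input
  missing_cols <- setdiff(col_names, names(x))
  if (length(missing_cols) > 0) {
    stop(
      param_name, " is missing the following column(s): ",
      paste0("'", missing_cols, "'", collapse = ", "), ".",
      call. = FALSE
    )
  }
  invisible(NULL)
}

gwas <- data.frame(chromosome = "chr1", position = 100, score = 5)

# All requested columns are present: returns NULL invisibly
check_cols_sketch(gwas, c("chromosome", "position"))

# A missing column triggers an error that names the offending input
res <- tryCatch(
  check_cols_sketch(gwas, c("chromosome", "padj"), "GWAS input"),
  error = function(e) conditionMessage(e)
)
```

Passing param_name lets the error message point to the specific input table that failed the check.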
Computes chromosomes' length for a tibble of genes
Description
Computes the length (in bp) of each chromosome as the maximum position of genes on the chromosome.
Usage
.compute_chrom_length_genes(x)
Arguments
x |
Either a |
Value
A tibble with two columns: chromosome (chromosome name) and length (chromosome length in base pair).
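As a rough illustration of the computation, the base R sketch below derives a chromosome length table from a toy set of genes. It assumes that the end coordinate of each gene is the position used for the maximum, which the text above does not confirm; the toy data are invented.

```r
# Toy gene table (invented values)
genes <- data.frame(
  chromosome = c("chr1", "chr1", "chr2"),
  start = c(100, 5000, 250),
  end = c(900, 7200, 1300)
)

# Assumption: the chromosome length is approximated by the largest gene end
chrom_length <- aggregate(end ~ chromosome, data = genes, FUN = max)
names(chrom_length) <- c("chromosome", "length")
chrom_length
```

Lengths obtained this way are lower bounds on the true chromosome lengths, since nothing is known beyond the last annotated gene.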
Creates a CAN_data object
Description
Creates a CAN_data object from a tibble or data-frame of candidate genes.
Usage
CAN_data(dat, keep_rownames_as = NULL)
Arguments
dat |
Tibble, set of candidate genes of interest. See Details. |
keep_rownames_as |
Character, the name of the column in which to save the
rownames of the input data-frame. Default value is |
Details
The input data should have one row per gene, and at least the following columns:
- chromosome: character column, chromosome on which the gene is located.
- start and end: numeric, starting and end position of the gene (in bp). A column position will be constructed as the middle value (mean) between start and end.
- name: character, the name of the candidate genes to be displayed.
Value
A CAN_data object, i.e. a tibble.
Examples
x <- get_example_data()
CAN_data(x[["CAN"]])
Creates a DE_data object
Description
Creates a DE_data object from a tibble or data-frame of differential expression results.
Usage
DE_data(dat, keep_rownames_as = NULL)
Arguments
dat |
Tibble, results from a differential expression analysis. See Details. |
keep_rownames_as |
Character, the name of the column in which to save the
rownames of the input data-frame. Default value is |
Details
The input data should have one row per gene or transcript, and at least the following columns:
- chromosome: character column, chromosome on which the gene/transcript is located.
- start and end: numeric, starting and end position of the gene/transcript (in bp). A column position will be constructed as the middle value (mean) between start and end.
- score or padj: numeric, the DE score or adjusted p-value of the gene/transcript. If column score is missing, it will be constructed as -log10(padj).
- foldChange or log2FoldChange: numeric, the fold-change or log2(fold-change) of the gene/transcript. If column log2FoldChange is missing, it will be constructed as log2(foldChange).
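The derived columns described above can be reproduced by hand in base R. This is only a sketch of the arithmetic on invented toy data, not the code that DE_data() runs internally.

```r
# Toy DE results providing padj and foldChange rather than score and log2FoldChange
de <- data.frame(
  gene = c("g1", "g2"),
  chromosome = c("chr1", "chr2"),
  start = c(100, 2000),
  end = c(300, 2600),
  padj = c(0.01, 0.5),
  foldChange = c(4, 0.5),
  stringsAsFactors = FALSE
)

# Middle of the gene, used as its position on the chromosome
de$position <- (de$start + de$end) / 2

# Score on the -log10 scale: smaller adjusted p-values give larger scores
de$score <- -log10(de$padj)

# Fold-change on the log2 scale: up- and down-regulation become symmetric
de$log2FoldChange <- log2(de$foldChange)
```

With these values, a fold-change of 4 and one of 0.5 map to log2 fold-changes of 2 and -1 respectively.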
Value
A DE_data object, i.e. a tibble.
Examples
x <- get_example_data()
DE_data(x[["DE"]])
Creates a GWAS_data object
Description
Creates a GWAS_data object from a tibble or data-frame of GWAS results.
Usage
GWAS_data(dat, keep_rownames_as = NULL)
Arguments
dat |
Tibble, results from a GWAS analysis. See Details. |
keep_rownames_as |
Character, the name of the column in which to save the
rownames of the input data-frame. Default value is |
Details
The input data should have one row per marker, and at least the following columns:
- chromosome: character column, chromosome on which the marker is located.
- position: numeric, the physical position of the marker along the chromosome (in bp).
- score or padj: numeric, the GWAS score or adjusted p-value of the marker. If column score is missing, it will be constructed as -log10(padj).
Value
A GWAS_data object, i.e. a tibble.
Examples
x <- get_example_data()
GWAS_data(x[["GWAS"]])
Extracts information from GWASpoly output
Description
Extracts GWAS results and chromosome length from GWASpoly output.
Usage
GWAS_data_from_gwaspoly(gwaspoly_output, traits = NULL, models = NULL)
Arguments
gwaspoly_output |
A |
traits |
Character vector, traits for which GWAS results should be
extracted. If |
models |
Character vector, genetic models for which GWAS results should be
extracted. If |
Value
A list with the following elements:
- gwas_data_list: A named list of GWAS_data objects, giving the marker scores for each possible trait/genetic model combination. The names of the list are in the form trait (genetic model).
- gwas_data_thr_list: if the input data is a GWASpoly.thresh object (from the GWASpoly::set.threshold() function), a named list of GWAS_data_thr objects, giving the scores of the significant markers for each possible trait/genetic model combination. The names of the list are in the form trait (genetic model).
- chrom_length: A tibble with one row per chromosome, giving the length (in bp) of each chromosome.
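To make the naming scheme concrete, the sketch below builds names of the form trait (genetic model) for every trait/model combination. The trait names are invented, and the order in which the function arranges the combinations is not stated above, so the ordering shown here is an assumption.

```r
# Hypothetical traits; "additive" and "general" are used as example genetic models
traits <- c("tuber_shape", "tuber_eye_depth")
models <- c("additive", "general")

# Every trait/model combination (the first variable varies fastest)
combos <- expand.grid(trait = traits, model = models, stringsAsFactors = FALSE)

# Names in the form "trait (genetic model)"
list_names <- paste0(combos$trait, " (", combos$model, ")")
list_names
```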
Filters GWAS or DE results based on a threshold
Description
Filters markers or genes/transcripts based on a threshold applied to their GWAS or DE score, and log2(fold-change) (if applicable). For a set of candidate genes, simply returns the list. Note that markers or genes with a missing score or log2(fold-change) will be removed from the dataset.
Usage
apply_threshold(x, score_thr = 0, log2fc_thr = 0)
## S3 method for class 'GWAS_data'
apply_threshold(x, score_thr = 0, log2fc_thr = 0)
## S3 method for class 'DE_data'
apply_threshold(x, score_thr = 0, log2fc_thr = 0)
## S3 method for class 'CAN_data'
apply_threshold(x, score_thr = 0, log2fc_thr = 0)
## Default S3 method:
apply_threshold(x, score_thr = 0, log2fc_thr = 0)
Arguments
x |
Either a |
score_thr |
Numeric, threshold to use on markers' or genes/transcripts' score.
Only markers or genes with a score equal to or higher than this threshold
will be retained. Default value is 0. Ignored for |
log2fc_thr |
Numeric, threshold to use on the absolute value of genes/
transcripts' log2(fold-change). Only genes/transcripts with an absolute
log2(fold-change) equal to or higher than this threshold will be retained.
Ignored for |
Value
A filtered tibble (of class GWAS_data_thr, DE_data_thr or CAN_data_thr).
Examples
x <- get_example_data()
## For GWAS results
apply_threshold(GWAS_data(x[["GWAS"]]), score_thr = 4)
## For DE results - in second line, no threshold is applied
## on the log2(fold-change)
apply_threshold(DE_data(x[["DE"]]), score_thr = -log10(0.05), log2fc_thr = 1)
apply_threshold(DE_data(x[["DE"]]), score_thr = -log10(0.05), log2fc_thr = 0)
## No effect on the Candidate genes
apply_threshold(CAN_data(x[["CAN"]]))
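The filtering rule for DE results can be written out in base R as follows. This sketch only mirrors the behaviour described above (scores and absolute log2 fold-changes equal to or higher than their thresholds, missing values removed) on invented toy data; it is not the package's code.

```r
# Toy DE results (invented values), including a missing score
de <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  score = c(3.2, 0.8, 2.5, NA),
  log2FoldChange = c(1.5, 2.0, -0.2, 3.0),
  stringsAsFactors = FALSE
)

score_thr <- -log10(0.05)  # about 1.3
log2fc_thr <- 1

# Keep rows with no missing values that pass both thresholds
keep <- !is.na(de$score) & !is.na(de$log2FoldChange) &
  de$score >= score_thr & abs(de$log2FoldChange) >= log2fc_thr
de_thr <- de[keep, ]
de_thr
```

Here only g1 is retained: g2 fails the score threshold, g3 fails the fold-change threshold and g4 has a missing score.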
Computes chromosomes' length from list
Description
Computes the length (in bp) of each chromosome from a list of GWAS and DE results as well as candidate gene lists.
Usage
combine_chrom_length(x)
Arguments
x |
A list of |
Value
A tibble with two columns: chromosome (chromosome name) and length (chromosome length in base pair).
Examples
x <- get_example_data()
y <- list("GWAS" = GWAS_data(x[["GWAS"]]),
          "DE" = DE_data(x[["DE"]]),
          "CAN" = CAN_data(x[["CAN"]]))
combine_chrom_length(y)
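One plausible way to combine per-dataset chromosome lengths is to keep, for each chromosome, the largest length seen in any dataset. The base R sketch below shows that idea on invented numbers; whether combine_chrom_length() uses exactly this rule is an assumption.

```r
# Per-dataset chromosome lengths (invented values)
lengths_gwas <- data.frame(chromosome = c("chr1", "chr2"), length = c(9000, 4000))
lengths_de <- data.frame(chromosome = c("chr1", "chr3"), length = c(9500, 2000))

# Stack the tables, then keep the maximum length per chromosome
all_lengths <- rbind(lengths_gwas, lengths_de)
combined <- aggregate(length ~ chromosome, data = all_lengths, FUN = max)
combined
```

Chromosomes that appear in only one dataset (chr2 and chr3 here) are kept with the length from that dataset.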
Computes chromosomes' length
Description
Computes the length (in bp) of each chromosome as the maximum position of markers or genes on the chromosome.
Usage
compute_chrom_length(x)
## S3 method for class 'GWAS_data'
compute_chrom_length(x)
## S3 method for class 'DE_data'
compute_chrom_length(x)
## S3 method for class 'CAN_data'
compute_chrom_length(x)
Arguments
x |
Either a |
Value
A tibble with two columns: chromosome (chromosome name) and length (chromosome length in base pair).
Examples
x <- get_example_data()
compute_chrom_length(GWAS_data(x[["GWAS"]]))
compute_chrom_length(DE_data(x[["DE"]]))
compute_chrom_length(CAN_data(x[["CAN"]]))
Creates a HIDECAN plot
Description
Creates a HIDECAN plot from a list of filtered GWAS or DE results and/or candidate genes.
Usage
create_hidecan_plot(
x,
chrom_length,
colour_genes_by_score = TRUE,
remove_empty_chrom = FALSE,
chroms = NULL,
chrom_limits = NULL,
title = NULL,
subtitle = NULL,
n_rows = NULL,
n_cols = 2,
legend_position = "bottom",
point_size = 3,
label_size = 3.5,
label_padding = 0.15
)
Arguments
x |
A list of |
chrom_length |
Tibble with columns |
colour_genes_by_score |
Logical, whether to colour the genes by score
( |
remove_empty_chrom |
Logical, should chromosomes with no significant
markers/genes nor candidate genes be removed from the plot? Default value
is |
chroms |
Character vector, name of chromosomes to include in the plot. |
chrom_limits |
Integer vector of length 2, or named list where the
elements are integer vectors of length 2. If vector, gives the lower and upper
limit of the chromosomes (in bp) to use in the plot. If a named list, names
should correspond to chromosome names. Gives for each chromosome the lower
and upper limits (in bp) to use in the plot. Doesn't have to be specified
for all chromosomes. Default value is |
title |
Character, title of the plot. Default value is |
subtitle |
Character, subtitle of the plot. Default value is |
n_rows |
Integer, number of rows of chromosomes to create in the plot.
Default value is |
n_cols |
Integer, number of columns of chromosomes to create in the plot.
Default value is 2. Will be set to |
legend_position |
Character, position of the legend in the plot. Can be
|
point_size |
Numeric, size of the points in the plot. Default value is 3. |
label_size |
Numeric, size of the gene labels in the plot. Default value is
3.5 (for |
label_padding |
Numeric, amount of padding around gene labels in the plot, as unit or number. Default value is 0.15 (for geom_label_repel). |
Value
A ggplot.
Examples
if (interactive()) {
  x <- get_example_data()
  y <- list("GWAS" = GWAS_data(x[["GWAS"]]),
            "DE" = DE_data(x[["DE"]]),
            "CAN" = CAN_data(x[["CAN"]]))
  chrom_length <- combine_chrom_length(y)
  z <- list(
    apply_threshold(y[["GWAS"]], score_thr = 4),
    apply_threshold(y[["DE"]], score_thr = 1.3, log2fc_thr = 0.5),
    apply_threshold(y[["CAN"]])
  )
  create_hidecan_plot(z,
                      chrom_length,
                      label_size = 2)
  ## Colour genes according to their fold-change
  create_hidecan_plot(z,
                      chrom_length,
                      colour_genes_by_score = FALSE,
                      label_size = 2)
  ## Add names to the datasets
  create_hidecan_plot(setNames(z, c("Genomics", "RNAseq", "My list")),
                      chrom_length,
                      colour_genes_by_score = FALSE,
                      label_size = 2)
  ## Add names to some of the datasets only (e.g. not for GWAS results)
  create_hidecan_plot(setNames(z, c(" ", "RNAseq", "My list")),
                      chrom_length,
                      colour_genes_by_score = FALSE,
                      label_size = 2)
  ## Set limits on all chromosomes (to "zoom in" to the 10-20Mb region)
  create_hidecan_plot(z,
                      chrom_length,
                      label_size = 2,
                      chrom_limits = c(10e6, 20e6))
  ## Set limits on some chromosomes only
  create_hidecan_plot(z,
                      chrom_length,
                      label_size = 2,
                      chrom_limits = list("ST4.03ch00" = c(10e6, 20e6),
                                          "ST4.03ch02" = c(15e6, 25e6)))
}
Example dataset
Description
Returns a list of example datasets.
Usage
get_example_data()
Value
A list with the following elements:
- GWAS: a tibble of GWAS results, with columns id, chromosome, position and score.
- DE: a tibble of differential expression results, with columns gene, chromosome, padj, log2FoldChange, start, end and label.
- CAN: a tibble of candidate genes, with columns id, chromosome, start, end, name and gene_name.
Wrapper to create a HIDECAN plot
Description
Wrapper function to create a HIDECAN plot from GWAS results, DE results or candidate genes.
Usage
hidecan_plot(
gwas_list = NULL,
de_list = NULL,
can_list = NULL,
score_thr_gwas = 4,
score_thr_de = 2,
log2fc_thr = 1,
chrom_length = NULL,
colour_genes_by_score = TRUE,
remove_empty_chrom = FALSE,
chroms = NULL,
chrom_limits = NULL,
title = NULL,
subtitle = NULL,
n_rows = NULL,
n_cols = 2,
legend_position = "bottom",
point_size = 3,
label_size = 3.5,
label_padding = 0.15
)
Arguments
gwas_list |
Data-frame or list of data-frames containing GWAS results,
each with at least a |
de_list |
Data-frame or list of data-frames containing DE results,
each with at least a |
can_list |
Data-frame or list of data-frames containing candidate genes,
each with at least a |
score_thr_gwas |
Numeric, the score threshold for GWAS results that will be used to select which markers will be plotted. Default value is 4. |
score_thr_de |
Numeric, the score threshold for DE results that will be used to select which genes/transcripts will be plotted. Default value is 2. |
log2fc_thr |
Numeric, the log2(fold-change) threshold that will be used to select which genes will be plotted. Default value is 1. |
chrom_length |
Optional, tibble with columns |
colour_genes_by_score |
Logical, whether to colour the genes by score
( |
remove_empty_chrom |
Logical, should chromosomes with no significant
markers/genes nor candidate genes be removed from the plot? Default value
is |
chroms |
Character vector, name of chromosomes to include in the plot. |
chrom_limits |
Integer vector of length 2, or named list where the
elements are integer vectors of length 2. If vector, gives the lower and upper
limit of the chromosomes (in bp) to use in the plot. If a named list, names
should correspond to chromosome names. Gives for each chromosome the lower
and upper limits (in bp) to use in the plot. Doesn't have to be specified
for all chromosomes. Default value is |
title |
Character, title of the plot. Default value is |
subtitle |
Character, subtitle of the plot. Default value is |
n_rows |
Integer, number of rows of chromosomes to create in the plot.
Default value is |
n_cols |
Integer, number of columns of chromosomes to create in the plot.
Default value is 2. Will be set to |
legend_position |
Character, position of the legend in the plot. Can be
|
point_size |
Numeric, size of the points in the plot. Default value is 3. |
label_size |
Numeric, size of the gene labels in the plot. Default value is
3.5 (for |
label_padding |
Numeric, amount of padding around gene labels in the plot, as unit or number. Default value is 0.15 (for geom_label_repel). |
Value
A ggplot.
Examples
if (interactive()) {
  x <- get_example_data()
  ## Typical example with one GWAS result table, one DE result table and
  ## one table of candidate genes
  hidecan_plot(gwas_list = x[["GWAS"]],
               de_list = x[["DE"]],
               can_list = x[["CAN"]],
               score_thr_gwas = -log10(0.0001),
               score_thr_de = -log10(0.005),
               log2fc_thr = 0,
               label_size = 2)
  ## Example with two sets of GWAS results
  hidecan_plot(gwas_list = list(x[["GWAS"]], x[["GWAS"]]),
               score_thr_gwas = 4)
  ## Example with two sets of DE results, with names
  hidecan_plot(de_list = list("X vs Y" = x[["DE"]],
                              "X vs Z" = x[["DE"]]),
               score_thr_de = -log10(0.05),
               log2fc_thr = 0)
  ## Set limits on all chromosomes (to "zoom in" to the 10-20Mb region)
  hidecan_plot(gwas_list = x[["GWAS"]],
               de_list = x[["DE"]],
               can_list = x[["CAN"]],
               score_thr_gwas = -log10(0.0001),
               score_thr_de = -log10(0.005),
               log2fc_thr = 0,
               label_size = 2,
               chrom_limits = c(10e6, 20e6))
  ## Set limits on some chromosomes only
  hidecan_plot(gwas_list = x[["GWAS"]],
               de_list = x[["DE"]],
               can_list = x[["CAN"]],
               score_thr_gwas = -log10(0.0001),
               score_thr_de = -log10(0.005),
               log2fc_thr = 0,
               label_size = 2,
               chrom_limits = list("ST4.03ch00" = c(10e6, 20e6),
                                   "ST4.03ch02" = c(15e6, 25e6)))
}
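The examples above express the score thresholds as -log10 of a p-value cut-off. The short base R sketch below shows that conversion for the two cut-offs used in the examples; the vector names are chosen for illustration only.

```r
# p-value cut-offs used in the examples above
p_thr <- c(gwas = 1e-4, de = 0.005)

# Corresponding thresholds on the score (-log10) scale
score_thr <- -log10(p_thr)
score_thr

# Going back from a score threshold to a p-value cut-off
p_back <- 10^(-score_thr)
```

A GWAS p-value cut-off of 0.0001 therefore corresponds to the default score_thr_gwas of 4.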
Creates a HIDECAN plot from GWASpoly output
Description
Creates a HIDECAN plot from the GWAS results from GWASpoly.
Usage
hidecan_plot_from_gwaspoly(gwaspoly_output, traits = NULL, models = NULL, ...)
Arguments
gwaspoly_output |
A |
traits |
Character vector, traits for which GWAS results should be
extracted. If |
models |
Character vector, genetic models for which GWAS results should be
extracted. If |
... |
Further arguments passed to the |
Value
A ggplot.
CAN_data constructor
Description
CAN_data constructor
Usage
new_CAN_data(dat)
Arguments
dat |
Tibble, containing information about genes of interest, with at least columns
|
Value
A CAN_data object, i.e. a tibble.
DE_data constructor
Description
DE_data constructor
Usage
new_DE_data(dat)
Arguments
dat |
Tibble, results from a differential expression analysis, with at least columns
|
Value
A DE_data object, i.e. a tibble.
GWAS_data constructor
Description
GWAS_data constructor
Usage
new_GWAS_data(dat)
Arguments
dat |
Tibble, results from a GWAS analysis, with at least columns
|
Value
A GWAS_data object, i.e. a tibble.
Launches the HIDECAN shiny app
Description
Starts the HIDECAN shiny app. The app reads in CSV data to produce a HIDECAN plot.
Usage
run_hidecan_shiny()
Value
No return value, called for side effects (launching the shiny app).
Checks validity of input for CAN_data constructor
Description
Checks validity of input for CAN_data constructor
Usage
validate_CAN_data(x)
Arguments
x |
a |
Value
A CAN_data object, i.e. a tibble.
Checks validity of input for DE_data constructor
Description
Checks validity of input for DE_data constructor
Usage
validate_DE_data(x)
Arguments
x |
a |
Value
A DE_data object, i.e. a tibble.
Checks validity of input for GWAS_data constructor
Description
Checks validity of input for GWAS_data constructor
Usage
validate_GWAS_data(x)
Arguments
x |
a |
Value
A GWAS_data object, i.e. a tibble.