
pathdb is an R package that provides access to the South Dakota State University (SDSU) bioinformatics database (used in iDEP) and performs essential data preparation tasks for gene expression analysis.
It lets users retrieve species-specific gene and pathway information and process RNA-Seq data for downstream analysis with standard Bioconductor workflows.

You can install the development version of pathdb from GitHub:

# install.packages("remotes")
remotes::install_github("aidanfred24/pathdb")

Instead of walking through a complete data analysis pipeline (which you can find in our vignettes), here is a brief look at how pathdb can be used to access genomic information for your species of interest:

library(pathdb)
# 1. Check if your species is supported
species_info <- search_species(query = "Human", name_type = "primary")
human_id <- species_info$id[1] # ID is 96
# 2. Standardize gene IDs in your expression data
data(hypoxia_reads)
clean_data <- convert_id(
  genes = rownames(hypoxia_reads),
  data = hypoxia_reads,
  species_id = human_id
)
# 3. Process data for downstream analysis
processed_data <- process_data(
  data = clean_data,
  missing_value = "geneMedian",
  min_cpm = 0.5
)
# 4. Retrieve pathways for your genes of interest
pathways <- get_pathways(
  species_id = human_id,
  genes = rownames(processed_data),
  category = "GOBP"
)

For detailed, step-by-step tutorials on how to use pathdb, please refer to the package vignettes:

vignette("data-access", package = "pathdb")
vignette("path-enrichment", package = "pathdb")

Dependencies: DBI, dplyr, edgeR, R.utils, RSQLite, stats, utils, tools