| Title: | Database-Backed Matrix Classes and Operations |
| Version: | 0.1.0 |
| Description: | Provides S4 classes and methods for storing dense and sparse matrices in 'DuckDB' databases. The package supports constructing database-backed matrices from base R and 'Matrix' objects, extracting slices and summaries, performing arithmetic and selected linear algebra operations, and materializing results for larger-than-memory workflows. It integrates with 'dbProject' to keep database paths, live connections, and lazy matrix tables synchronized across interactive analyses. |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| URL: | https://github.com/dbverse-org/dbmatrix-r, https://dbverse-org.github.io/dbmatrix-r/ |
| BugReports: | https://github.com/dbverse-org/dbmatrix-r/issues |
| RoxygenNote: | 7.3.3 |
| Depends: | R (≥ 4.1.0) |
| Imports: | Matrix (≥ 1.6-5), MatrixGenerics (≥ 1.12.3), methods, DBI, dplyr, dbplyr, duckdb (≥ 1.4.0), data.table (≥ 1.12.2), glue, bit64, cli, Rcpp, arrow, nanoarrow, dbProject, rlang |
| LinkingTo: | Rcpp, RcppEigen, RSpectra, nanoarrow |
| Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0), irlba, crayon, R.utils, checkmate, reticulate, sparseMatrixStats, RSpectra |
| VignetteBuilder: | knitr |
| Config/testthat/edition: | 3 |
| NeedsCompilation: | yes |
| Packaged: | 2026-05-14 02:35:49 UTC; ecruiz |
| Author: | Edward C. Ruiz |
| Maintainer: | Edward C. Ruiz <ecr7407@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-05-19 08:30:02 UTC |
Value Matching
Description
Implements the %in% operator for dbMatrix objects. This operator checks if
elements from the left operand are contained in the right operand, returning
a logical vector.
Usage
## S4 method for signature 'dbDenseMatrix,ANY'
x %in% table
## S4 method for signature 'ANY,dbDenseMatrix'
x %in% table
## S4 method for signature 'dbSparseMatrix,ANY'
x %in% table
Arguments
x |
A dbMatrix object or any other object |
table |
Any object or a dbMatrix object |
Details
This is a method for the standard %in% operator for dbMatrix objects.
It follows R's standard behavior for the %in% operator:
- When x is a dbDenseMatrix, it returns a logical vector with the same length as the total number of elements in the matrix.
- When table is a dbDenseMatrix, it allows checking if elements in x are in the matrix.
- For dbSparseMatrix objects, it throws an error to match the behavior of dgCMatrix.
Value
A logical vector of the same length as x, indicating which elements of x are in table.
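For reference, the standard behavior that this method is documented to follow can be seen with plain base R objects. The sketch below uses no database connection and does not call the package; it only shows the result shape of %in% when a matrix appears on either side.

```r
# Base R reference behavior: a matrix on the left is treated as a vector
# in column-major order, so the result has length nrow * ncol.
mat <- matrix(1:9, nrow = 3, ncol = 3)
lhs_result <- mat %in% c(1, 3, 5, 7, 9)

# With the matrix on the right, the result has the length of the left operand.
rhs_result <- c(1, 3, 5, 7, 10) %in% mat

length(lhs_result)  # one logical per matrix element
rhs_result          # membership of each left-hand value in the matrix
```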
Examples
con <- DBI::dbConnect(duckdb::duckdb(), ":memory:")
mat <- matrix(1:9, nrow = 3, ncol = 3)
dbmat <- dbMatrix(
value = mat,
con = con,
name = "example_matrix",
class = "dbDenseMatrix",
overwrite = TRUE
)
dbmat %in% c(1, 3, 5, 7, 9)
c(1, 3, 5, 7, 9) %in% dbmat
DBI::dbDisconnect(con, shutdown = TRUE)
Input validation for the value argument
Description
Validates the value argument.
Usage
.check_value(value)
Arguments
value |
A |
Value
No return value. Called for input validation and throws an error if
value is invalid.
Evaluate if a dbSparseMatrix should be densified
Description
Evaluate if a dbSparseMatrix should be densified
Usage
.eval_op_densify(generic_char, vec_matrix)
Arguments
generic_char |
A character string representing the operation to be performed. |
vec_matrix |
A |
Details
Evaluates whether a dbSparseMatrix should be
densified for Arith operations with specific scalar values, where
the operands are in the order dbSparseMatrix, vector.
Value
TRUE if the operation should densify the sparse matrix before
evaluation, otherwise FALSE.
Join a dbSparseMatrix with a dbMatrix object
Description
Join a dbSparseMatrix with a dbMatrix object
Usage
.join_dbm_vect(dbm, vec_matrix, op, swap_arith_order = FALSE)
Arguments
dbm |
A |
vec_matrix |
A |
op |
A character string representing the operation to be performed. |
swap_arith_order |
Logical. Whether to swap the order of the arguments for the operation. Default: FALSE |
Value
A dbMatrix object containing the result of applying op between
dbm and vec_matrix.
Convert a dbSparseMatrix to dbDenseMatrix
Description
Internal function to convert a dbSparseMatrix to
dbDenseMatrix.
Usage
.to_db_dense(x, chunk_size = NULL)
Arguments
x |
A |
chunk_size |
integer. Number of columns to process per chunk during densification.
If NULL (default), the function first checks the global option |
Value
A dbDenseMatrix object
Arith dbMatrix, e2
Description
See methods::Arith for more details.
See methods::Ops for more details.
Usage
## S4 method for signature 'dbMatrix,ANY'
Arith(e1, e2)
## S4 method for signature 'ANY,dbMatrix'
Arith(e1, e2)
## S4 method for signature 'dbMatrix,dbMatrix'
Arith(e1, e2)
## S4 method for signature 'dbMatrix,ANY'
Ops(e1, e2)
## S4 method for signature 'ANY,dbMatrix'
Ops(e1, e2)
## S4 method for signature 'dbMatrix,dbMatrix'
Ops(e1, e2)
## S4 method for signature 'DBIConnection'
dbLoad(conn, name, class)
## S4 method for signature 'dbMatrix'
writeMM(obj, file, ...)
Arguments
e1 |
First operand. |
e2 |
Second operand. |
conn |
DBIConnection object |
name |
valid name value (character) |
class |
character, class of the dbMatrix object (e.g. "dbDenseMatrix" or "dbSparseMatrix") |
obj |
dbMatrix object |
file |
path to file |
... |
additional arguments |
Value
- Arithmetic and logical group methods return a dbMatrix object of the appropriate dense or sparse subclass, with the same dimensions as the input and transformed values stored in DuckDB.
- dbLoad() returns a dbDenseMatrix or dbSparseMatrix pointing to an existing DuckDB table.
- writeMM() writes a Matrix Market file to file and returns invisible(TRUE) on success.
Math Operations for dbMatrix Objects
Description
Implements the Math S4groupGeneric functions
for dbMatrix objects. This includes various mathematical operations such as
logarithms, exponentials, trigonometric functions, and other transformations.
Usage
## S4 method for signature 'dbMatrix'
Math(x)
Arguments
x |
A |
Details
This method provides implementations for the following Math functions:
Arithmetic and rounding:
- abs(), sign(), sqrt(), ceiling(), floor(), trunc()
Cumulative operations:
- cummax(), cummin(), cumprod(), cumsum()
- Note: cumprod() is not supported
Logarithmic:
- log(), log10(), log2(), log1p()
DuckDB Log Function Mappings:
| R Function | DuckDB Function | Notes |
| log(x) | LN(x) | Natural logarithm |
| log10(x) | LOG10(x) | Base-10 logarithm |
| log2(x) | LOG2(x) | Base-2 logarithm |
| log1p(x) | LN(x + 1) | log(1+x), computed as LN |
Sparsity-Preserving Log:
For dbSparseMatrix with pending operations, log(x + 1) operations preserve sparsity
since log(0 + 1) = 0. The multiplicative component is applied first,
then the log transformation is applied to sparse values only.
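The sparsity argument can be checked on a small triplet table. The base R sketch below is only an illustration of why transforming stored values alone is enough for log(x + 1); the data frame layout and variable names are assumptions and it does not reproduce the package's SQL.

```r
# Non-zero entries of a 3 x 3 matrix in triplet (i, j, x) form.
ijx <- data.frame(i = c(1L, 2L, 3L), j = c(1L, 3L, 2L), x = c(4, 9, 1))

# Transform only the stored values; implicit zeros stay zero
# because log1p(0) == 0.
ijx_log <- transform(ijx, x = log1p(x))

# Build the dense matrix and compare with the dense computation.
dense <- matrix(0, nrow = 3, ncol = 3)
dense[cbind(ijx$i, ijx$j)] <- ijx$x
rebuilt <- matrix(0, nrow = 3, ncol = 3)
rebuilt[cbind(ijx_log$i, ijx_log$j)] <- ijx_log$x

all.equal(rebuilt, log1p(dense))
```

The same shortcut would not hold for a function such as exp(), since exp(0) is 1 and every implicit zero would become a stored value.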
Trigonometric:
- cos(), sin(), tan(), acos(), asin(), atan()
- cosh(), sinh(), tanh(), acosh(), asinh(), atanh()
- cospi(), sinpi(), tanpi()
- Note: acosh(), asinh(), atanh() are not supported
Exponential:
- exp(), expm1()
- Note: expm1() is not supported
Special functions:
- gamma(), lgamma(), digamma(), trigamma()
- Note: digamma(), trigamma() are not supported
The function applies the specified mathematical operation to each element
of the dbMatrix object.
Value
A dbMatrix object with the mathematical operation applied to each element.
Examples
mat <- matrix(1, nrow = 3, ncol = 3)
dbmat <- as.dbMatrix(mat)
log(dbmat)
sqrt(dbmat)
sin(dbmat)
Summary Methods for dbMatrix Objects
Description
Implements the S4groupGeneric group generic functions for dbMatrix objects.
Usage
## S4 method for signature 'dbMatrix'
Summary(x, ..., na.rm = TRUE)
Arguments
x |
A dbMatrix object. |
... |
Additional arguments (not used, but included for compatibility with the generic). |
na.rm |
Logical. If TRUE, remove NA values before computation. Always set to TRUE for this implementation. |
Details
This method provides implementations for the following S4groupGeneric functions:
- max(): Maximum value
- min(): Minimum value
- range(): Not supported
- prod(): Product of all values
- sum(): Sum of all values
- any(): Returns TRUE if any value is TRUE
- all(): Returns TRUE if all values are TRUE
Value
The result of applying the respective summary function to the dbMatrix object. The type of the return value depends on the specific function called.
Examples
mat <- matrix(1, nrow = 3, ncol = 3)
dbmat <- as.dbMatrix(mat)
max(dbmat)
min(dbmat)
prod(dbmat)
sum(dbmat)
any(dbmat > 0)
all(dbmat > 0)
Extract or replace values in database-backed matrices
Description
Methods for subsetting and replacing values in dbMatrix
objects.
Usage
## S4 method for signature 'dbMatrix,dbIndex,missing,ANY'
x[i, j, ..., drop = TRUE]
## S4 method for signature 'dbMatrix,missing,dbIndex,ANY'
x[i, j, ..., drop = TRUE]
## S4 method for signature 'dbMatrix,dbIndex,dbIndex,ANY'
x[i, j, ..., drop = FALSE]
## S4 method for signature 'dbMatrix,dbMatrix,missing,ANY'
x[i, j, ..., drop = TRUE]
## S4 replacement method for signature 'dbMatrix,dbMatrix,missing,ANY'
x[i, j] <- value
## S4 method for signature 'dbMatrix,dbDenseMatrix,missing,ANY'
x[i, j, ..., drop = TRUE]
## S4 method for signature 'dbMatrix,missing,dbDenseMatrix,ANY'
x[i, j, ..., drop = FALSE]
## S4 method for signature 'dbMatrix,dbDenseMatrix,dbDenseMatrix,ANY'
x[i, j, ..., drop = FALSE]
Arguments
x |
A |
i |
Row, logical matrix, or matrix-style index. |
j |
Column index. |
... |
Additional arguments. |
drop |
Ignored; included for matrix API compatibility. |
value |
Replacement value. |
Value
A subsetted or modified dbMatrix, or an extracted vector for
matrix-style indexing.
Convert Matrix::Matrix to dbMatrix
Description
Converts in-memory matrix, Matrix::dgeMatrix, or
Matrix::dgCMatrix into a dbMatrix object.
Generic function to convert in-memory objects to dbMatrix objects.
Usage
as.dbMatrix(x, con = NULL, name = "dbMatrix", overwrite = FALSE, ...)
Arguments
x |
Object to convert (e.g., matrix, dgCMatrix) |
con |
DBI or duckdb connection object |
name |
Table name to assign within database |
overwrite |
Whether to overwrite if table already exists |
... |
Additional arguments passed to methods |
Details
If no con is provided, a temporary in-memory database connection is created.
If no name is provided, a unique table name is generated.
Value
A dbDenseMatrix for dense matrix inputs or a dbSparseMatrix
for sparse matrix inputs. The returned object keeps the input dimensions
and dimnames while storing matrix values in DuckDB.
Convert dbMatrix to in-memory matrix
Description
Converts a dbMatrix object into an in-memory matrix or sparse matrix.
Usage
## S3 method for class 'dbMatrix'
as.matrix(x, ..., sparse = FALSE, names = TRUE)
Arguments
x |
A |
... |
Additional arguments (not used) |
sparse |
Logical indicating if the output should be a sparse matrix |
names |
Logical indicating if the output should have dimnames. |
Details
This method converts a dbMatrix object into an in-memory
Matrix::dgCMatrix (sparse = TRUE) or matrix() (default, sparse = FALSE).
Warning: This function can cause memory issues for large dbMatrix objects.
Set sparse = TRUE to convert to a sparse matrix.
Set names = TRUE to keep dimnames.
Value
A Matrix::dgCMatrix or matrix
Coerce dbMatrix to dgCMatrix
Description
Coercion methods to convert dbMatrix objects to in-memory dgCMatrix objects.
Respects dbMatrix.max_mem_convert option to prevent OOM errors.
Value
A Matrix::dgCMatrix object containing the collected matrix
values. Dense inputs are converted to sparse Matrix format after collection.
Coerce dbMatrix to matrix
Description
Coercion methods to convert dbMatrix objects to in-memory matrix objects.
Respects dbMatrix.max_mem_convert option to prevent OOM errors.
Value
A base R matrix containing the collected matrix values with the
same dimensions and dimnames as the source object.
Coerce matrix to dbMatrix
Description
Coercion methods to convert in-memory matrix objects to dbMatrix objects.
Creates a new in-memory DuckDB connection.
Value
A database-backed matrix object. Dense inputs return a
dbDenseMatrix, while sparse Matrix::dgCMatrix inputs return a
dbSparseMatrix.
Row (column) standard deviations for dbMatrix objects
Description
Calculates the standard deviation for each row (column) of a matrix-like object.
Usage
## S4 method for signature 'dbDenseMatrix'
colSds(
x,
rows = NULL,
cols = NULL,
na.rm = FALSE,
center = NULL,
...,
useNames = TRUE
)
## S4 method for signature 'dbSparseMatrix'
colSds(
x,
rows = NULL,
cols = NULL,
na.rm = FALSE,
center = NULL,
...,
useNames = TRUE
)
## S4 method for signature 'dbDenseMatrix'
rowSds(
x,
rows = NULL,
cols = NULL,
na.rm = TRUE,
center = NULL,
...,
useNames = TRUE
)
## S4 method for signature 'dbSparseMatrix'
rowSds(
x,
rows = NULL,
cols = NULL,
na.rm = TRUE,
center = NULL,
...,
useNames = TRUE
)
Arguments
x |
A |
rows |
Always NULL for |
cols |
Always NULL for |
na.rm |
Always TRUE for |
center |
Always NULL for |
... |
Additional arguments (not used, but included for compatibility with the generic). |
useNames |
Always TRUE for |
Value
A named numeric vector containing one sample standard deviation per
row or column of x.
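When only non-zero entries are stored, a sample standard deviation can still be obtained from per-row sums and sums of squares, with the implicit zeros counted in the denominator. The base R sketch below shows that arithmetic on a tiny triplet table; it is an illustrative assumption and is not taken from the package's source. Rows with no stored values would be absent from the tapply() results and would need separate handling.

```r
# Non-zero entries of a 2 x 4 matrix in triplet form; all other entries are 0.
ijx <- data.frame(i = c(1L, 1L, 2L), j = c(1L, 3L, 4L), x = c(2, 4, 3))
n_cols <- 4L

# Per-row sums and sums of squares over the stored values only.
s1 <- tapply(ijx$x, ijx$i, sum)
s2 <- tapply(ijx$x^2, ijx$i, sum)

# Sample variance with implicit zeros included:
# var = (sum(x^2) - n * mean^2) / (n - 1), where n is the number of columns.
row_mean <- s1 / n_cols
row_sd <- sqrt((s2 - n_cols * row_mean^2) / (n_cols - 1))

# Compare with the dense computation.
dense <- matrix(0, nrow = 2, ncol = n_cols)
dense[cbind(ijx$i, ijx$j)] <- ijx$x
all.equal(as.numeric(row_sd), apply(dense, 1, sd))
```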
Force computation of a dbMatrix
Description
Explicitly compute a dbMatrix and save it to a table in the database.
This overrides the default dplyr::compute to use a direct CREATE TABLE AS
statement, which is more robust for large tables in DuckDB.
Usage
## S3 method for class 'dbMatrix'
compute(
x,
name = NULL,
temporary = TRUE,
dimnames = TRUE,
overwrite = FALSE,
...
)
Arguments
x |
A |
name |
Name of the table to create. If NULL, a random name is generated. |
temporary |
Logical. If TRUE (default), create a temporary table. |
dimnames |
default = TRUE. If TRUE, the rownames and colnames will be
saved in the database. This allows full reconstruction of the dbMatrix object
using |
overwrite |
Logical. If TRUE, overwrite the table if it already exists. Default is FALSE. |
... |
Additional arguments passed to methods (ignored). |
Value
A dbMatrix object pointing to the new table.
S4 Class for dbDenseMatrix
Description
Representation of dense matrices using an on-disk database. Inherits from dbMatrix.
Value
Objects of class dbDenseMatrix store all matrix entries explicitly
in DuckDB. They are typically returned by dbMatrix() or as.dbMatrix()
for dense inputs.
S4 virtual class for dbMatrix
Description
Representation of sparse and dense matrices in a database. Each object
is used as a connection to a single table that exists within the database.
Inherits from dbData.
Create an S4 dbMatrix object in sparse or dense triplet vector format.
Usage
dbMatrix(
value,
class = NULL,
con = NULL,
overwrite = FALSE,
name = "dbMatrix",
dims = NULL,
dim_names = NULL,
mtx_rowname_file_path,
mtx_rowname_col_idx = 1,
mtx_colname_file_path,
mtx_colname_col_idx = 1,
...
)
Arguments
value |
data to be added to the database. See details for supported data types |
class |
class of the dbMatrix: |
con |
DBI or duckdb connection object |
overwrite |
whether to overwrite if table already exists in database |
name |
table name to assign within database |
dims |
dimensions of the matrix |
dim_names |
dimension names of the matrix |
mtx_rowname_file_path |
path to .mtx rowname file to be read into
database. by default, no header is assumed. |
mtx_rowname_col_idx |
column index of row name file |
mtx_colname_file_path |
path to .mtx colname file to be read into
database. by default, no header is assumed. |
mtx_colname_col_idx |
column index of column name file |
... |
additional params to pass to |
Details
This function reads data into a pre-existing DuckDB database.
Supported value data types:
- Matrix::dgCMatrix: In-memory sparse matrix from the Matrix::Matrix package
- Matrix::dgTMatrix: In-memory triplet vector or COO matrix
- matrix: In-memory dense matrix from base R
- .mtx: Path to .mtx file
- .csv: Path to .csv file
- tbl_duckdb_connection: Table in duckdb::duckdb database in ijx format from existing dbMatrix object. dims and dim_names must be specified if value is tbl_duckdb_connection.
Value
dbMatrix() returns an initialized dbDenseMatrix or
dbSparseMatrix S4 object, depending on class, that points to matrix
data stored in DuckDB. The object records the matrix dimensions and
dimension names.
Slots
- dim_names: row (1) and col (2) names
- dims: dimensions of the matrix
- init: logical. Whether the object is fully initialized
Examples
dgc <- readRDS(system.file("extdata", "dgc.rds", package = "dbMatrix"))
con <- DBI::dbConnect(duckdb::duckdb(), ":memory:")
dbSparse <- dbMatrix(
value = dgc,
con = con,
name = "sparse_matrix",
class = "dbSparseMatrix",
overwrite = TRUE
)
dbSparse
dbMatrix_from_tbl
Description
Constructs a dbSparseMatrix object from a tbl_duckdb_connection object.
Usage
dbMatrix_from_tbl(
tbl,
rownames_colName,
colnames_colName,
value_colName = NULL,
name = "dbMatrix",
overwrite = FALSE,
row_names = NULL,
col_names = NULL,
i_col = NULL,
j_col = NULL
)
Arguments
tbl |
|
rownames_colName |
|
colnames_colName |
|
value_colName |
|
name |
table name to assign within database |
overwrite |
whether to overwrite if table already exists in database |
row_names |
|
col_names |
|
i_col |
|
j_col |
|
Details
The tbl_duckdb_connection object must contain dimension names as columns in long format.
If value_colName is provided, the function uses pre-aggregated counts from that column.
This is useful when the input table already contains aggregated counts (e.g., from a GROUP BY + SUM operation).
If value_colName is NULL (default), the function counts occurrences of each row-column pair.
When row_names and/or col_names are provided, the function uses these directly
instead of querying distinct values from the table. This can significantly improve performance
when the input table is a complex lazy query (e.g., result of spatial joins).
When i_col and j_col are provided, the function uses these pre-computed integer
indices directly, skipping expensive string-to-index encoding. This is the fastest path.
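The two steps described above, counting row-column pairs and encoding names as integer indices, can be pictured with an in-memory data frame. The base R sketch below is only an illustration: the column names gene and cell are hypothetical, the choice to sort the unique names is an assumption, and the package performs the equivalent work lazily in DuckDB.

```r
# Long-format observations: one row per (row name, column name) occurrence.
long <- data.frame(
  gene = c("g1", "g1", "g2", "g3", "g1"),
  cell = c("c1", "c1", "c2", "c1", "c2"),
  stringsAsFactors = FALSE
)

# Without a value column: count occurrences of each row-column pair.
counts <- aggregate(n ~ gene + cell, data = cbind(long, n = 1L), FUN = sum)

# String-to-index encoding: map each name to its position among the
# sorted unique names (sorting is an assumption made for this sketch).
row_names <- sort(unique(long$gene))
col_names <- sort(unique(long$cell))
ijx <- data.frame(
  i = match(counts$gene, row_names),
  j = match(counts$cell, col_names),
  x = counts$n
)
ijx
```

Supplying row_names, col_names, or precomputed index columns corresponds to skipping the unique() and match() steps in this picture, which is where the described speedups would come from.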
Value
dbMatrix object
dbMatrix Package Global Options
Description
The following global options can be modified to control the
behavior of the dbMatrix package.
Details
Use options() to set the below options.
Value
No return value. This documentation page describes package options.
Options
- dbMatrix.digits: integer. Number of digits to round to in the show function of dbMatrix objects. Default is 7.
- dbMatrix.max_mem_convert: numeric. Maximum size (in bytes) allowed for implicit conversion of dbMatrix to in-memory matrix. Default is 8 * 1024^3 (8 GB).
- dbMatrix.chunk_size: integer. Number of columns to process per chunk during densification (.to_db_dense). Smaller chunks reduce memory usage but may increase execution time. Default is NULL (automatically calculated based on dbMatrix.max_mem_convert).
- dbMatrix.max_chunks: integer. Maximum number of chunks allowed during densification. This prevents query parser errors caused by excessive UNION ALL branches. Default is 10000.
- dbMatrix.verbose: logical. If TRUE (default), prints informative messages during implicit coercion.
- dbMatrix.precomp_db: character. Path to an external DuckDB file containing precomputed table(s). If set, dbMatrix will automatically attach this database (read-only) and look for a suitable precomputed table to speed up densification.
- dbMatrix.allow_densify: logical. If FALSE (default), automatic sparse-to-dense conversion is disabled. This prevents unexpected disk spilling and memory issues when operations would require densification (e.g., division by zero, scalar addition to sparse matrix). Set to TRUE to enable on-disk dense conversion. Warning: Dense conversion can cause massive disk usage for large matrices.
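A short sketch of setting these options for a session and restoring the previous values afterwards follows. It relies only on base R's options() and getOption(); the specific values are arbitrary examples, not recommendations.

```r
# Set a few options and keep the previous settings for later restoration.
old <- options(
  dbMatrix.digits = 3L,                   # fewer digits when printing
  dbMatrix.max_mem_convert = 2 * 1024^3,  # cap implicit conversion at 2 * 1024^3 bytes
  dbMatrix.allow_densify = FALSE          # keep automatic densification disabled
)

# Read the current values back.
digits_now <- getOption("dbMatrix.digits")
max_mem_now <- getOption("dbMatrix.max_mem_convert")

# Restore the previous values.
options(old)
```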
S4 Class for dbSparseMatrix
Description
Representation of sparse matrices using an on-disk database. Inherits from dbMatrix.
Value
Objects of class dbSparseMatrix store only non-zero matrix entries
in DuckDB. They are typically returned by dbMatrix() or as.dbMatrix()
for sparse inputs.
Perform Streaming SVD on a dbMatrix
Description
Perform Streaming SVD on a dbMatrix
Usage
db_svd(
dbm,
k = 10,
center = TRUE,
scale = FALSE,
center_rows = NULL,
memory_limit = getOption("dbMatrix.svd_memory", 8 * 1024^3),
return_format = c("svd", "pca")
)
Arguments
dbm |
A dbSparseMatrix object |
k |
Number of singular values to compute |
center |
Logical, center rows (default TRUE) |
scale |
Logical, scale rows (default FALSE) |
center_rows |
Logical, center rows vs columns (default TRUE for standard PCA) |
memory_limit |
Bytes for Fast Path. Default 8 GB. |
return_format |
"svd" (d, u, v) or "pca" (eigenvalues, loadings, coords) |
Value
List with SVD or PCA components
Dimensions of an Object
Description
Retrieve the dimension of an object.
Usage
## S4 method for signature 'dbMatrix'
dim(x)
Arguments
x |
|
Value
An integer vector of length 2 giving the number of rows and columns
in x.
get_MM_dim
Description
Internal function to read dimensions of a .mtx file
Usage
get_MM_dim(mtx_file_path)
Arguments
mtx_file_path |
path to .mtx file to be read into database |
Details
Scans for the header of an mtx file (starting with %) and takes one more line representing the dimensions and number of nonzero values.
Note: the header size can vary depending on the .mtx file.
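The header scan described above can be sketched in base R. The helper read_mm_size() below is hypothetical and is not the package's internal function; it assumes a coordinate-format file whose size line holds three integers (rows, columns, stored entries).

```r
# Hypothetical helper (not part of the package): read the size line of a
# Matrix Market file by skipping the leading '%' comment lines.
read_mm_size <- function(path, max_header = 100L) {
  lines <- readLines(path, n = max_header)
  first_data <- which(!startsWith(lines, "%"))[1]
  if (is.na(first_data)) {
    stop("no size line found in the first ", max_header, " lines")
  }
  size <- as.integer(strsplit(trimws(lines[first_data]), "[[:space:]]+")[[1]])
  stats::setNames(size, c("nrow", "ncol", "nnz"))
}

# Small example file written to a temporary location.
mtx <- tempfile(fileext = ".mtx")
writeLines(c(
  "%%MatrixMarket matrix coordinate real general",
  "% a comment line",
  "3 4 2",
  "1 1 1.5",
  "3 4 2.0"
), mtx)

mm_size <- read_mm_size(mtx)
mm_size
```

Because the number of comment lines varies between files, the sketch searches for the first line that does not start with % rather than assuming a fixed offset.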
Value
integer vector of dimensions
get_MM_dimnames
Description
Internal function to read row and column names of a .mtx file
Usage
get_MM_dimnames(
mtx_file_path,
mtx_rowname_file_path,
mtx_rowname_col_idx = 1,
mtx_colname_file_path,
mtx_colname_col_idx = 1,
...
)
Arguments
mtx_file_path |
path to .mtx file to be read into database |
mtx_rowname_file_path |
path to .mtx rowname file to be read into database. by default, no header is assumed. |
mtx_rowname_col_idx |
column index of row name file |
mtx_colname_file_path |
path to .mtx colname file to be read into database. by default, no header is assumed. |
mtx_colname_col_idx |
column index of column name file |
... |
additional params to pass to |
Details
Can be used to read row and column names from .mtx files. Note: these files must not contain a header (colnames).
The mtx_rowname_col_idx and mtx_colname_col_idx can be used to specify the column index of the row and column name files, respectively. By default, the first column is used for both.
TODO: Support for reading in only rownames or colnames.
Value
list of row and column name character vectors
get_con
Description
get_con
Usage
get_con(dbMatrix)
Arguments
dbMatrix |
A database-backed object inheriting from |
Value
A live DBI connection associated with the database-backed object.
get_dbdir
Description
get_dbdir
Usage
get_dbdir(dbMatrix)
Arguments
dbMatrix |
A database-backed object inheriting from |
Value
A character scalar giving the DuckDB database directory used by the object.
get_tblName
Description
get_tblName
Usage
get_tblName(dbMatrix)
Arguments
dbMatrix |
A database-backed object inheriting from |
Value
A character scalar giving the DuckDB table name associated with the object.
Return the First or Last Parts of an Object
Description
Returns the first or last parts of a vector, matrix, array, table, data frame
or function. Since head() and tail() are generic
functions, they have been extended to other classes, including
"ts" from stats.
Usage
## S4 method for signature 'dbMatrix'
head(x, n = 6L, ...)
## S4 method for signature 'dbMatrix'
tail(x, n = 6L, ...)
Arguments
x |
an object |
n |
an integer vector of length up to |
... |
arguments to be passed to or from other methods. |
Value
A dbMatrix object containing the first or last n rows of x,
with updated dimensions and row names.
Element-wise is.na for dbMatrix
Description
Returns a dbMatrix with numeric values indicating NA positions (1 = NA, 0 = not NA).
Usage
## S4 method for signature 'dbMatrix'
is.na(x)
Arguments
x |
A dbMatrix object. |
Value
A dbMatrix with same dimensions, containing 1 where the original value was NA and 0 otherwise.
Examples
mat <- matrix(c(1, NA, 3, NA), nrow = 2)
dbmat <- as.dbMatrix(mat)
is.na(dbmat)
Length of a dbMatrix Object
Description
Get or set the length of vectors (including lists) and factors, and of any other R object for which a method has been defined.
Usage
## S4 method for signature 'dbMatrix'
length(x)
Arguments
x |
|
Value
A length-one integer giving the number of stored elements in x.
Map dimnames to i,j indices
Description
Map dimnames to i,j indices
Usage
map_ijx_dimnames(dbMatrix, colName_i, colName_j)
Arguments
dbMatrix |
dbMatrix object |
colName_i |
name of column rownames to add to database |
colName_j |
name of column colnames to add to database. default: 'FALSE'. |
Details
Constructs a table in a database that contains the accompanying dimnames for a dbMatrix. The resulting columns in the table:
- i (row index)
- colName_i (rownames)
- j (col index)
- j_names (colnames)
- x (counts of i,j occurrences)
Value
A lazy tbl_dbi with i, j, x, and the mapped row/column name
columns.
Arithmetic Mean for dbMatrix objects
Description
Generic function for the (trimmed) arithmetic mean.
Usage
## S4 method for signature 'dbDenseMatrix'
mean(x, ...)
## S4 method for signature 'dbSparseMatrix'
mean(x, ...)
Arguments
x |
|
... |
further arguments passed to or from other methods. |
Value
A length-one numeric vector giving the arithmetic mean of all entries
in x.
The names of a dbMatrix Object
Description
The names of a dbMatrix Object
Usage
## S4 method for signature 'dbDenseMatrix'
names(x)
Arguments
x |
A dbMatrix object |
Value
A character vector of the names of the 1D dbMatrix object (1D matrices only)
The Number of Rows/Columns of a dbMatrix Object
Description
nrow and ncol return the number of rows or columns present in x.
Usage
nrow.dbMatrix(x)
ncol.dbMatrix(x)
Arguments
x |
|
Value
A length-one integer giving the number of rows or columns in x.
Compute a dense COO table in a database connection
Description
Precomputes a COO list table in a specified database connection in
column-major order.
This can speed up operations that involve breaking
sparsity of a dbSparseMatrix,
such as in cases when performing + or - arithmetic operations.
Usage
precompute(conn, m, n, verbose = FALSE)
Arguments
conn |
duckdb database connection |
m |
number of rows of precomputed dbMatrix table |
n |
number of columns of precomputed dbMatrix table |
verbose |
logical, print progress messages. default: FALSE. |
Details
The m and n parameters must exceed the
maximum row and column indices of the dbMatrix in order to be used for
densifying any dbMatrix. If these params are less than the maximum
row and column indices, a new precomputed table will be automatically
generated with the name 'precomp_mXn'.
In such cases, run this function again with larger
m and n, or, to manually remove the precomputed
table, set options(dbMatrix.precomp = NULL) in the R console.
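To picture how a precomputed dense grid can help densification, the base R sketch below builds a full (i, j) grid in column-major order and left-joins sparse triplets onto it. This is an illustration under assumptions: the variable names are hypothetical and the package's actual DuckDB queries may differ.

```r
# Dense (i, j) grid for an m x n matrix; expand.grid varies i fastest,
# which corresponds to column-major order.
m <- 2L
n <- 3L
grid <- expand.grid(i = seq_len(m), j = seq_len(n))

# Sparse triplets to densify.
ijx <- data.frame(i = c(1L, 2L), j = c(1L, 3L), x = c(5, 7))

# Left join onto the grid and fill the missing entries with zero.
dense_ijx <- merge(grid, ijx, by = c("i", "j"), all.x = TRUE)
dense_ijx$x[is.na(dense_ijx$x)] <- 0

# merge() reorders rows by the join keys, so restore column-major order.
dense_ijx <- dense_ijx[order(dense_ijx$j, dense_ijx$i), ]

dense_mat <- matrix(dense_ijx$x, nrow = m, ncol = n)
dense_mat
```

The grid must cover every index present in the sparse table, which matches the requirement that m and n be at least as large as the matrix being densified.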
Value
A tbl_dbi object referencing the newly created precomputed lookup
table in DuckDB.
Generate array for pretty printing of matrix values
Description
Generate array for pretty printing of matrix values
Usage
print_array(
i = NULL,
j = NULL,
x = NULL,
dims,
rownames = rep("", dims[1]),
class = c("sparse", "dense"),
fill = ".",
digits = 5L
)
Arguments
i, j, x |
matched vectors of integers in i and j, with value in x |
dims |
dimensions of the array (integer vector of 2) |
fill |
fill character |
digits |
default = 5. If numeric, round to this number of digits |
Value
No return value. Called for its side effect of printing a formatted matrix preview to the console.
Row (column) means for dbMatrix objects
Description
Calculates the mean for each row (column) of a matrix-like object.
Usage
## S4 method for signature 'dbMatrix'
rowMeans(x, na.rm = FALSE, dims = 1, ...)
## S4 method for signature 'dbMatrix'
colMeans(x, na.rm = FALSE, dims = 1, ...)
Arguments
x |
An NxK matrix-like object, a numeric data frame, or an array-like object of two or more dimensions. |
na.rm |
Always TRUE for |
dims |
Always 1 for |
... |
Additional arguments passed to specific methods. |
Value
A named numeric vector containing one mean per row or column of x.
Row (column) sums for dbMatrix objects
Description
Calculates the sum for each row (column) of a matrix-like object.
Usage
## S4 method for signature 'dbDenseMatrix'
rowSums(x, na.rm = FALSE, dims = 1, ..., memory = FALSE)
## S4 method for signature 'dbSparseMatrix'
rowSums(x, na.rm = FALSE, dims = 1, ...)
## S4 method for signature 'dbDenseMatrix'
colSums(x, na.rm = FALSE, dims = 1, ...)
## S4 method for signature 'dbSparseMatrix'
colSums(x, na.rm = FALSE, dims = 1, ...)
Arguments
x |
An NxK matrix-like object, a numeric data frame, or an array-like object of two or more dimensions. |
na.rm |
Always TRUE for |
dims |
Always 1 for |
... |
Additional arguments passed to specific methods. |
memory |
logical. If FALSE (default), results returned as dbDenseMatrix. This is recommended for large computations. Set to TRUE to return the results as a vector. |
Value
A named numeric vector containing one sum per row or column of x.
Row (column) variances for dbMatrix objects
Description
Calculates the variance for each row (column) of a matrix-like object.
Usage
## S4 method for signature 'dbDenseMatrix'
rowVars(
x,
rows = NULL,
cols = NULL,
na.rm = TRUE,
center = NULL,
...,
useNames = TRUE
)
## S4 method for signature 'dbSparseMatrix'
rowVars(
x,
rows = NULL,
cols = NULL,
na.rm = TRUE,
center = NULL,
...,
useNames = TRUE
)
## S4 method for signature 'dbDenseMatrix'
colVars(
x,
rows = NULL,
cols = NULL,
na.rm = TRUE,
center = NULL,
...,
useNames = TRUE
)
## S4 method for signature 'dbSparseMatrix'
colVars(
x,
rows = NULL,
cols = NULL,
na.rm = TRUE,
center = NULL,
...,
useNames = TRUE
)
Arguments
x |
A |
rows |
Always NULL for |
cols |
Always NULL for |
na.rm |
Always TRUE for |
center |
Always NULL for |
... |
Additional arguments (not used, but included for compatibility with the generic). |
useNames |
Always TRUE for |
Value
A named numeric vector containing one sample variance per row or
column of x.
Retrieve and Set Row (Column) Dimension Names of dbMatrix Objects
Description
Retrieve and Set Row (Column) Dimension Names of dbMatrix Objects
Usage
rownames.dbMatrix(x, do.NULL = TRUE, prefix = "row")
## S3 replacement method for class 'dbMatrix'
rownames(x) <- value
colnames.dbMatrix(x, do.NULL = TRUE, prefix = "col")
## S3 replacement method for class 'dbMatrix'
colnames(x) <- value
## S4 method for signature 'dbMatrix'
dimnames(x)
## S4 replacement method for signature 'dbMatrix,list'
dimnames(x) <- value
Arguments
x |
a matrix-like R object, with at least two dimensions for
|
do.NULL |
Not used for this method. Included for compatibility with the generic. |
prefix |
Not used for this method. Included for compatibility with the generic. |
value |
a valid value for that component of
|
Value
rownames() and colnames() return character vectors of dimension
names. dimnames() returns a length-2 list containing row and column name
vectors. The replacement forms return the modified dbMatrix object.
sim_dgc
Description
Simulate a dbSparseMatrix in memory
Simulate a dbDenseMatrix in memory.
Usage
sim_duckdb(value = datasets::iris, name = "test", con = NULL, memory = TRUE)
sim_dgc(num_rows = 50, num_cols = 50, n_vals = 50)
sim_denseMat(num_rows = 50, num_cols = 50)
sim_ijx_matrix(mat_type = NULL, num_rows = 50, num_cols = 50, seed_num = 42)
sim_dbSparseMatrix(
num_rows = 50,
num_cols = 50,
seed_num = 42,
name = "sparse_test",
memory = FALSE
)
sim_dbDenseMatrix(
num_rows = 50,
num_cols = 50,
seed_num = 42,
name = "dense_test",
memory = FALSE
)
Arguments
num_rows |
The number of rows in the matrix (default: 50) |
num_cols |
The number of columns in the matrix (default: 50) |
seed_num |
The seed number for reproducibility (default: 42) |
Details
This function generates a simulated sparse matrix (dgCMatrix) with a specified number of rows and columns and sets n_vals random values to a non-zero value.
This function generates a simulated dense matrix object with a specified number of rows and columns.
This function generates an ijx representation of a simulated dgCMatrix object with a specified number of rows and columns and sets 50 random values to a non-zero value.
Value
A dgCMatrix object
Functions
- sim_duckdb(): Simulate a duckdb connection dplyr tbl_Pool in memory
- sim_dgc(): Simulate a dgCMatrix
- sim_denseMat(): Simulate a dense matrix
- sim_ijx_matrix(): Simulate a duckdb connection dplyr tbl_Pool in memory
- sim_dbSparseMatrix(): Simulate a dbSparseMatrix in memory
- sim_dbDenseMatrix(): Simulate a dbDenseMatrix in memory
Matrix Transpose
Description
Given a dbMatrix x, t returns the transpose of x.
Usage
## S4 method for signature 'dbMatrix'
t(x)
Arguments
x |
|
Value
dbMatrix object
to_ijx_disk
Description
to_ijx_disk
Usage
to_ijx_disk(con, name)
Arguments
con |
duckdb connection |
name |
name of table to convert to ijx on disk |
Value
remote table in long format unpivoted from wide format matrix
Convert dbMatrix to named ijx table
Description
Converts a dbMatrix to a lazy long table where row and column indices are
replaced by dimension names.
Usage
to_named_ijx_tbl(
x,
row_col = "row_name",
col_col = "col_name",
compute = FALSE
)
Arguments
x |
A dbMatrix object (dbSparseMatrix or dbDenseMatrix) |
row_col |
Name for the row-name column (default: "row_name") |
col_col |
Name for the column-name column (default: "col_name") |
compute |
Whether to materialize as temp table (default: FALSE) |
Value
A lazy tbl with columns: row_col, col_col, x