Title: Calculate Pairwise Distances
Version: 0.0.5
Description: A common framework for calculating distance matrices.
Depends: R (≥ 3.2.2)
License: GPL-2 | GPL-3 [expanded from: GPL]
URL: https://github.com/blasern/rdist
BugReports: https://github.com/blasern/rdist/issues
Encoding: UTF-8
LazyData: true
LinkingTo: Rcpp, RcppArmadillo
Imports: Rcpp, methods
RoxygenNote: 7.1.0
Suggests: testthat
NeedsCompilation: yes
Packaged: 2020-05-04 12:51:18 UTC; nbl003
Author: Nello Blaser [aut, cre]
Maintainer: Nello Blaser <nello.blaser@uib.no>
Repository: CRAN
Date/Publication: 2020-05-04 16:00:02 UTC

Farthest point sampling

Description

Farthest point sampling returns a reordering of the points of a metric space P = {p_1, ..., p_k} such that each p_i is the point farthest from the first i-1 points, i.e. the point whose distance to the nearest of them is largest.
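
The greedy procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: fps_sketch is a hypothetical name, the input is assumed to be a precomputed symmetric distance matrix, and the points are assumed to be distinct.

```r
# Hypothetical helper: greedy farthest point sampling on a distance matrix.
fps_sketch <- function(dist_mat, k = nrow(dist_mat), initial_point_index = 1L) {
  dist_mat <- as.matrix(dist_mat)
  selected <- integer(k)
  selected[1] <- initial_point_index
  # distance from every point to the nearest point selected so far
  min_dist <- dist_mat[, initial_point_index]
  for (i in seq_len(k)[-1]) {
    nxt <- which.max(min_dist)                   # farthest from the current selection
    selected[i] <- nxt
    min_dist <- pmin(min_dist, dist_mat[, nxt])  # update nearest-selected distances
  }
  selected
}

set.seed(1)
pts <- matrix(runif(40), ncol = 2)
d <- as.matrix(dist(pts))   # base R Euclidean distances
ord <- fps_sketch(d)        # a permutation of 1:20 starting at point 1
```

Keeping a running vector of nearest-selected distances means each step only needs one column of the distance matrix.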

Usage

farthest_point_sampling(
  mat,
  metric = "precomputed",
  k = nrow(mat),
  initial_point_index = 1L,
  return_clusters = FALSE
)

Arguments

mat

Distance matrix (if metric = "precomputed"), or a data matrix from which distances are computed with the given metric

metric

Distance metric to use (either "precomputed" or a metric from rdist)

k

Number of points to sample

initial_point_index

Index of p_1

return_clusters

Should the indices of the closest farthest points be returned?

Examples


# generate data
df <- matrix(runif(200), ncol = 2)
dist_mat <- pdist(df)
# farthest point sampling
fps <- farthest_point_sampling(dist_mat)
fps2 <- farthest_point_sampling(df, metric = "euclidean")
all.equal(fps, fps2)
# have a look at the fps distance matrix
rdist(df[fps[1:5], ])
dist_mat[fps, fps][1:5, 1:5]

Metric and triangle inequality

Description

Checks whether a distance matrix comes from a metric.

Usage

is_distance_matrix(mat, tolerance = .Machine$double.eps^0.5)

triangle_inequality(mat, tolerance = .Machine$double.eps^0.5)

Arguments

mat

The matrix to evaluate

tolerance

Differences smaller than tolerance are not reported.
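
To make the triangle inequality check concrete, here is a minimal base R sketch. triangle_ok is a hypothetical helper, not part of rdist, and it only illustrates the condition d(i, j) <= d(i, k) + d(k, j) up to a tolerance; the package functions may work differently.

```r
# Hypothetical helper: TRUE if d[i, j] <= d[i, k] + d[k, j] for all i, j, k.
triangle_ok <- function(d, tolerance = sqrt(.Machine$double.eps)) {
  d <- as.matrix(d)
  for (k in seq_len(nrow(d))) {
    # detour[i, j] = d[i, k] + d[k, j]
    detour <- outer(d[, k], d[k, ], "+")
    if (any(d > detour + tolerance)) return(FALSE)
  }
  TRUE
}

set.seed(1)
pts <- matrix(rnorm(20), ncol = 2)
d <- as.matrix(dist(pts))
ok_before <- triangle_ok(d)        # Euclidean distances satisfy the inequality

d_bad <- d
d_bad[1, 2] <- d_bad[2, 1] <- 10 * max(d)   # inflate one distance symmetrically
ok_after <- triangle_ok(d_bad)     # the direct path 1 -> 2 is now longer than any detour
```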

Examples

data <- matrix(rnorm(20), ncol = 2)
dm <- pdist(data)
is_distance_matrix(dm)
triangle_inequality(dm)

dm[1, 2] <- 1.1 * dm[1, 2]
is_distance_matrix(dm)

Product metric

Description

Returns the p-product metric of two metric spaces. Works for output of 'rdist', 'pdist' or 'cdist'.

Usage

product_metric(..., p = 2)

Arguments

...

Distance matrices or dist objects

p

The power of the Minkowski distance
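
For two distance matrices d1 and d2 on the same points, the standard p-product metric is (d1^p + d2^p)^(1/p), with the elementwise maximum as the limiting case p = Inf. The base R sketch below illustrates that formula on plain matrices; p_product is a hypothetical helper and is not the package's implementation.

```r
# Hypothetical helper: p-product of two distance matrices of equal dimension.
p_product <- function(d1, d2, p = 2) {
  if (is.infinite(p)) pmax(d1, d2) else (d1^p + d2^p)^(1 / p)
}

set.seed(1)
pts <- matrix(runif(20), ncol = 2)
d_x <- as.matrix(dist(pts[, 1]))   # distances along the first coordinate
d_y <- as.matrix(dist(pts[, 2]))   # distances along the second coordinate

# p = 2 recovers the Euclidean distance, p = 1 the Manhattan distance
eq_euclid <- all.equal(p_product(d_x, d_y, p = 2), as.matrix(dist(pts)))
eq_manhat <- all.equal(p_product(d_x, d_y, p = 1),
                       as.matrix(dist(pts, method = "manhattan")))
```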

Examples

# generate data
df <- matrix(runif(200), ncol = 2)
# distance matrices
dist_mat <- pdist(df)
dist_1 <- pdist(df[, 1])
dist_2 <- pdist(df[, 2])
# product distance matrix
dist_prod <- product_metric(dist_1, dist_2)
# check equality
all.equal(dist_mat, dist_prod)

rdist: an R package for distances

Description

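rdist provides a common framework for calculating distances. There are three main functions:

rdist computes the pairwise distances between observations in one matrix and returns a dist object,

pdist computes the pairwise distances between observations in one matrix and returns a matrix, and

cdist computes the distances between observations in two matrices and returns a matrix.

In particular, a function like cdist is often missing from other distance packages. All calculations involving NA values consistently return NA.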

Usage

rdist(X, metric = "euclidean", p = 2L)

pdist(X, metric = "euclidean", p = 2)

cdist(X, Y, metric = "euclidean", p = 2)

Arguments

X, Y

A matrix

metric

The distance metric to use

p

The power of the Minkowski distance
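
As an illustration of what a cross-distance between two matrices means, the base R sketch below computes Euclidean distances between every row of X and every row of Y. cross_dist is a hypothetical helper written for this example, not the package's cdist; it also shows how an NA in an observation propagates to every distance involving that observation.

```r
# Hypothetical helper: Euclidean distances between rows of X and rows of Y.
cross_dist <- function(X, Y) {
  stopifnot(ncol(X) == ncol(Y))
  out <- matrix(NA_real_, nrow(X), nrow(Y))
  for (i in seq_len(nrow(X))) {
    diff <- sweep(Y, 2, X[i, ])          # each row of Y minus row i of X
    out[i, ] <- sqrt(rowSums(diff^2))    # NA in a row yields NA for that distance
  }
  out
}

X <- matrix(c(0, 0,
              3, 4), ncol = 2, byrow = TRUE)
Y <- matrix(c(0, 0,
              0, 4,
              NA, 1), ncol = 2, byrow = TRUE)
cd <- cross_dist(X, Y)
# cd is 2 x 3: row 1 is 0, 4, NA and row 2 is 5, 3, NA
```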

Details

Available distance measures are (written for two vectors v and w):