Package 'dispRity'

Title:	Measuring Disparity
Description:	A modular package for measuring disparity (multidimensional space occupancy). Disparity can be calculated from any matrix defining a multidimensional space. The package provides a set of implemented metrics to measure properties of the space and allows users to provide and test their own metrics. The package also provides functions for looking at disparity in a serial way (e.g. disparity through time) or per groups as well as visualising the results. Finally, this package provides several statistical tests for disparity analysis.
Authors:	Thomas Guillerme [aut, cre, cph] , Mark Puttick [aut, cph], Jack Hatfield [aut, cph]
Maintainer:	Thomas Guillerme <[email protected]>
License:	GPL-3 \| file LICENSE
Version:	1.9.3
Built:	2025-03-06 16:32:25 UTC
Source:	https://github.com/tguillerme/disprity

Help Index

Measuring Disparity in R
Add, remove or get trees (or subtrees)
adonis dispRity (from vegan::adonis2)
Apply inapplicable characters to a matrix.
as.covar
Beck and Lee 2014 datasets
BeckLee_disparity
Bhattacharyya Coefficient
Bootstraps and rarefies data.
Character differences
Charadriiformes
Check a morphological matrix consistency levels.
Separating data in chronological subsets.
Imports data from Claddis
Cleaning phylogenetic data
covar.plot
Utilities for a dispRity object with covariance matrices
Separates stem and crown species
Separating data into custom subsets.
Demo datasets
disparity
Calculates disparity from a matrix.
Covar projection analyses wrapper
Fast dispRity
Disparity metrics
Disparity in different groups.
Disparity through time.
Randtest distance
dtt dispRity (from geiger::dtt)
Getting the time subsets before and after an extinction event
Imports data from geomorph
Get time bins ages
Generates a contrast matrix.
Extract elements from a dispRity object.
Extracts or modify subsets from a dispRity object.
Make and fill dispRity.
Creating disparity metrics
Match tips or nodes edge vector
MCMCglmm.subsets
MCMCglmm object utility functions
Model Test
Simulate Model Test
Model test wrapper
Ancestral states estimations with multiple trees
Testing a null hypothesis on multidimensional data.
Plots pairwise comparisons
phylolm dispRity (from phylolm::phylolm)
Plots character differences
dispRity object plotting
Prints a dispRity object.
Random circle
Random (permutation) test
Reduce a matrix
Reduce space
Remove zero branch length
Rescaling and centering disparity results.
Selects ordination axes
Adds root time to a tree
Simulates morphological data.
Time slicing a tree.
Stretching a tree
Sorting or ordering a dispRity object.
Creating multidimensional spaces
dispRity object summary
Testing disparity hypotheses
Test disparity metric
Extracting the age of nodes and tips in a tree.

A modular package for measuring disparity (multidimensional space occupancy). Disparity can be calculated from any matrix defining a multidimensional space. The package provides a set of implemented metrics to measure properties of the space and allows users to provide and test their own metrics (Guillerme (2018) <doi:10.1111/2041-210X.13022>). The package also provides functions for looking at disparity in a serial way (e.g. disparity through time - Guillerme and Cooper (2018) <doi:10.1111/pala.12364>) or per groups as well as visualising the results. Finally, this package provides several statistical tests for disparity analysis.

Author(s)

Thomas Guillerme <[email protected]>

Add, remove or get trees (or subtrees)

Description

Adding, extracting or removing the tree component from a dispRity object.

Usage

add.tree(data, tree, replace = FALSE)

get.tree(data, subsets = FALSE, to.root = FALSE)

remove.tree(data)
add.tree(data, tree, replace = FALSE)

get.tree(data, subsets = FALSE, to.root = FALSE)

remove.tree(data)

Arguments

`data`	A `dispRity` object.
`tree`	A `phylo` or `mutiPhylo` object.
`replace`	Logical, whether to replace any existing tree (`TRUE`) or add to it (`FALSE`; default).
`subsets`	Either a logical whether to extract the tree for each subset (`TRUE`) or not (`FALSE`; default) or specific subset names or numbers.
`to.root`	Logical, whether to return the subset tree including the root of the tree (`TRUE`) or only containing the elements in the subset (and their most recent common ancestor; `FALSE`; default). If `data` contains time bins (from `chrono.subsets` with `method = "discrete"`), and `to.root = FALSE` it returns the subtrees containing only what's in the bin.

Details

get.tree allows to extract the trees specific to each subsets.

Author(s)

Thomas Guillerme and Jack Hadfield

Examples

## Loading a dispRity object
data(disparity)
## Loading a tree
data(BeckLee_tree)

## Removing  the tree from the dispRity object
(tree_data <- remove.tree(disparity))

## Extracting the tree
get.tree(tree_data) # is null

## Adding a tree to the disparity object
tree_data <- add.tree(tree_data, tree = BeckLee_tree)

## Extracting the tree
get.tree(tree_data) # is a "phylo" object

## Adding the same tree again
tree_data <- add.tree(tree_data, tree = BeckLee_tree)
get.tree(tree_data) # is a "multiPhylo" object (2 trees)

## Replacing the two trees by one tree
tree_data <- add.tree(tree_data, tree = BeckLee_tree, replace = TRUE)
get.tree(tree_data) # is a "phylo" object

## Loading a dispRity object
data(disparity)
## Loading a tree
data(BeckLee_tree)

## Removing  the tree from the dispRity object
(tree_data <- remove.tree(disparity))

## Extracting the tree
get.tree(tree_data) # is null

## Adding a tree to the disparity object
tree_data <- add.tree(tree_data, tree = BeckLee_tree)

## Extracting the tree
get.tree(tree_data) # is a "phylo" object

## Adding the same tree again
tree_data <- add.tree(tree_data, tree = BeckLee_tree)
get.tree(tree_data) # is a "multiPhylo" object (2 trees)

## Replacing the two trees by one tree
tree_data <- add.tree(tree_data, tree = BeckLee_tree, replace = TRUE)
get.tree(tree_data) # is a "phylo" object

adonis dispRity (from `vegan::adonis2`)

Description

Passing dispRity objects to the adonis2 function from the vegan package.

Usage

adonis.dispRity(
  data,
  formula = matrix ~ group,
  method = "euclidean",
  ...,
  warn = TRUE,
  matrix = 1
)
adonis.dispRity(
  data,
  formula = matrix ~ group,
  method = "euclidean",
  ...,
  warn = TRUE,
  matrix = 1
)

Arguments

`data`	A `dispRity` object with subsets
`formula`	The model formula (default is `matrix ~ group`, see details)
`method`	The distance method to be passed to `adonis2` and eventually to `vegdist` (see details - default `method ="euclidean"`)
`...`	Any optional arguments to be passed to `adonis2`
`warn`	`logical`, whether to print internal warnings (`TRUE`; default - advised) or not (`FALSE`).
`matrix`	`numeric`, which matrix to use (default is `1`).

Details

The first element of the formula (the response) must be called matrix and the predictors must be existing in the subsets of the dispRity object.

If data$matrix[[1]] is not a distance matrix, distance is calculated using the dist function. The type of distance can be passed via the standard method argument that will be recycled by adonis2.

If the dispRity data has custom subsets with a single group, the formula is set to matrix ~ group.

If the dispRity data has custom subsets with multiple group categories (separated by a dot, e.g. c("group1.cat1", "group1.cat2", "group2.catA", "group2.catB") being two groups with two categories each), the default formula is matrix ~ first_group but can be set to any combination (e.g. matrix ~ first_group + second_group).

If the dispRity data has time subsets, the predictor is automatically set to time.

Author(s)

Thomas Guillerme

References

Oksanen J, Simpson G, Blanchet F, Kindt R, Legendre P, Minchin P, O'Hara R, Solymos P, Stevens M, Szoecs E, Wagner H, Barbour M, Bedward M, Bolker B, Borcard D, Carvalho G, Chirico M, De Caceres M, Durand S, Evangelista H, FitzJohn R, Friendly M, Furneaux B, Hannigan G, Hill M, Lahti L, McGlinn D, Ouellette M, Ribeiro Cunha E, Smith T, Stier A, Ter Braak C, Weedon J (2024). vegan: Community Ecology Package_. R package version 2.6-8,

Examples

## Adonis with one groups 

## Generating a random character matrix
character_matrix <- sim.morpho(rtree(20), 50, rates = c(rnorm, 1, 0))
## Calculating the distance matrix
distance_matrix <- as.matrix(dist(character_matrix))
## Creating two groups
random_groups <- list("group1" = 1:10, "group2" = 11:20)

## Generating a dispRity object
random_disparity <- custom.subsets(distance_matrix, random_groups)
## Running a default NPMANOVA
adonis.dispRity(random_disparity)


## Adonis with multiple groups

## Creating a random matrix
random_matrix <- matrix(data = rnorm(90), nrow = 10, 
                     dimnames = list(letters[1:10]))
## Creating two groups with two states each
groups <- as.data.frame(matrix(data = c(rep(1,5), rep(2,5), rep(c(1,2), 5)),
         nrow = 10, ncol = 2, dimnames = list(letters[1:10], c("g1", "g2"))))

## Creating the dispRity object
multi_groups <- custom.subsets(random_matrix, groups)

## Running the NPMANOVA
adonis.dispRity(multi_groups, matrix ~ g1 + g2)

## Adonis with time

## Creating time series
data(BeckLee_mat50)
data(BeckLee_tree)
data(BeckLee_ages)
time_subsets <- chrono.subsets(BeckLee_mat50, BeckLee_tree, 
     method = "discrete", inc.nodes = FALSE, time = c(100, 85, 65, 0),
     FADLAD = BeckLee_ages)

## Running the NPMANOVA with time as a predictor
adonis.dispRity(time_subsets, matrix ~ time)

## Running the NPMANOVA with each time bin as a predictor
adonis.dispRity(time_subsets, matrix ~ chrono.subsets)


## Adonis with one groups 

## Generating a random character matrix
character_matrix <- sim.morpho(rtree(20), 50, rates = c(rnorm, 1, 0))
## Calculating the distance matrix
distance_matrix <- as.matrix(dist(character_matrix))
## Creating two groups
random_groups <- list("group1" = 1:10, "group2" = 11:20)

## Generating a dispRity object
random_disparity <- custom.subsets(distance_matrix, random_groups)
## Running a default NPMANOVA
adonis.dispRity(random_disparity)


## Adonis with multiple groups

## Creating a random matrix
random_matrix <- matrix(data = rnorm(90), nrow = 10, 
                     dimnames = list(letters[1:10]))
## Creating two groups with two states each
groups <- as.data.frame(matrix(data = c(rep(1,5), rep(2,5), rep(c(1,2), 5)),
         nrow = 10, ncol = 2, dimnames = list(letters[1:10], c("g1", "g2"))))

## Creating the dispRity object
multi_groups <- custom.subsets(random_matrix, groups)

## Running the NPMANOVA
adonis.dispRity(multi_groups, matrix ~ g1 + g2)

## Adonis with time

## Creating time series
data(BeckLee_mat50)
data(BeckLee_tree)
data(BeckLee_ages)
time_subsets <- chrono.subsets(BeckLee_mat50, BeckLee_tree, 
     method = "discrete", inc.nodes = FALSE, time = c(100, 85, 65, 0),
     FADLAD = BeckLee_ages)

## Running the NPMANOVA with time as a predictor
adonis.dispRity(time_subsets, matrix ~ time)

## Running the NPMANOVA with each time bin as a predictor
adonis.dispRity(time_subsets, matrix ~ chrono.subsets)

Apply inapplicable characters to a matrix.

Description

Apply inapplicable characters to discrete morphological matrix.

Usage

apply.NA(matrix, NAs, tree, invariant = FALSE, verbose = FALSE)
apply.NA(matrix, NAs, tree, invariant = FALSE, verbose = FALSE)

Arguments

`matrix`	A discrete morphological matrix.
`NAs`	Either a numeric value of how many characters to make inapplicable or vector of characters inapplicability source (either `"character"` or `"clade"`; see details). The length of this vector must be at maximum half the total number of characters.
`tree`	If any inapplicable source is `"clade"`, a tree from where to select the clades.
`invariant`	Whether to allow invariant sites among the characters with inapplicable data. If `invariant = FALSE` the algorithm will try to remove such characters (if possible).
`verbose`	Whether to be verbose or not.

Details

If the NAs argument is a numeric value n, generates n characters with inapplicable data based on the "clade" source.

The NAs argument intakes a vector of character inapplicability source rendering a number of characters inapplicable using the following sources:

"character" draws inapplicable characters directly from the character matrix, ignoring the phylogeny (i.e. for a random character X, an other random character Y will have inapplicable characters for each character states 0 for character X).

"clade" draws inapplicable characters from the phylogeny: it will randomly apply inapplicable characters states for some characters by randomly selecting clades from the provided tree. The algorithm randomly assigns an inapplicable token for this character for all taxa in this clade or all taxa outside this clade.

For example NAs = c(rep("character", 2), rep("clade", 2)) will generate 4 characters with inapplicable data, two using previous characters and two other using random clades.

Author(s)

Thomas Guillerme

Examples

set.seed(4)
## A random tree with 15 tips
tree <- rcoal(15)
## setting up the parameters
my_rates = c(rgamma, rate = 10, shape = 5)
my_substitutions = c(runif, 2, 2)

## A Mk matrix (10*50)
matrixMk <- sim.morpho(tree, characters = 100, model = "ER",
     states = c(0.85, 0.15), rates = my_rates, invariant = FALSE)

## Setting the number and source of inapplicable characters
my_inapplicables <- c(rep("character", 5), rep("clade", 5))

## Apply some inapplicable characters to the matrix
matrix <- apply.NA(matrixMk, my_inapplicables, tree, verbose = TRUE)

set.seed(4)
## A random tree with 15 tips
tree <- rcoal(15)
## setting up the parameters
my_rates = c(rgamma, rate = 10, shape = 5)
my_substitutions = c(runif, 2, 2)

## A Mk matrix (10*50)
matrixMk <- sim.morpho(tree, characters = 100, model = "ER",
     states = c(0.85, 0.15), rates = my_rates, invariant = FALSE)

## Setting the number and source of inapplicable characters
my_inapplicables <- c(rep("character", 5), rep("clade", 5))

## Apply some inapplicable characters to the matrix
matrix <- apply.NA(matrixMk, my_inapplicables, tree, verbose = TRUE)

as.covar

Description

Changes a dispRity metric to use the covar element from a dispRity object.

Usage

as.covar(fun, ..., VCV = TRUE, loc = FALSE)
as.covar(fun, ..., VCV = TRUE, loc = FALSE)

Arguments

`fun`	a `function` to apply to the `$covar` element of `dispRity`.
`...`	any additional arguments to pass to fun.
`VCV`	logical, whether to use the `$VCV` component of the elements in `dispRity$covar` (`TRUE`; default) or not (`FALSE`) (see details).
`loc`	logical, whether to use the `$loc` component of the elements in `dispRity$covar` (`TRUE`) or not (`FALSE`; default) (see details).

Details

This function effectively transforms the input argument from matrix (or matrix2) to matrix = matrix$VCV and adds a evaluation after the return call to indicate that the function works on a $covar element. Note that if the function does not have an argument called matrix, the first argument is estimated as being the one to be transformed (e.g. if the function has its first argument x, it will transform it to x = x$VCV).

You can toggle between using the $VCV or the $loc argument in the $covar matrix by using either VCV = TRUE, loc = FALSE (to access only fun(matrix = matrix$VCV, ...)), VCV = FALSE, loc = TRUE (to access only matrix = matrix(matrix$loc, nrow = 1), ...) or VCV = TRUE, loc = TRUE (to access fun(matrix = matrix$VCV, loc = matrix$loc, ...); provided fun has an extra loc argument).

For between.groups metrics with matrix and matrix2 arguments, you can provide multiple logicals for VCV and loc to be applied repspectively to matrix and matrix2. For example VCV = TRUE will reinterpret matrix and matrix2 as matrix$VCV and matrix2$VCV but loc = c(TRUE, FALSE) will only reinterpret matrix as matrix$loc (and matrix2 will not be reinterpreted).

Author(s)

Thomas Guillerme

Examples

## Creating a dispRity
data(charadriiformes)

## Creating a dispRity object from the charadriiformes model
covar_data <- MCMCglmm.subsets(data       = charadriiformes$data,
                               posteriors = charadriiformes$posteriors)

## Get one matrix and one covar matrix
one_matrix <- get.matrix(covar_data, subsets = 1)
one_covar  <- get.covar(covar_data, subsets = 1, n = 1)[[1]][[1]]

## Measure the centroids
centroids(one_matrix)

## Measure the centroids on the covar matrix
as.covar(centroids)(one_covar)
## Is the same as:
centroids(one_covar$VCV)

## Apply the measurement on a dispRity object:
## On the traitspace:
summary(dispRity(covar_data, metric = c(sum, centroids))) 
## On the covariance matrices:
summary(dispRity(covar_data, metric = c(sum, as.covar(centroids))))
## The same but with additional options (centre = 100)
summary(dispRity(covar_data,
                 metric = c(sum, as.covar(centroids)),
                 centre = 100))

## Example with the VCV and loc options
## A metric that works with both VCV and loc
## (the sum of the variances minus the distance from the location)
sum.var.dist <- function(matrix, loc = rep(0, ncol(matrix))) {
    ## Get the sum of the diagonal of the matrix
    sum_diag <- sum(diag(matrix))
    ## Get the distance between 0 and the loc
    dist_loc <- dist(matrix(c(rep(0, ncol(matrix)), loc), nrow = 2, byrow = TRUE))[1]
    ## Return the sum of the diagonal minus the distance
    return(sum_diag - dist_loc)
}
## Changing the $loc on one_covar for the demonstration
one_covar$loc <- c(1, 2, 3)
## Metric on the VCV part only
as.covar(sum.var.dist, VCV = TRUE, loc = FALSE)(one_covar)
## Metric on the loc part only
as.covar(sum.var.dist, VCV = FALSE, loc = TRUE)(one_covar)
## Metric on both parts
as.covar(sum.var.dist, VCV = TRUE, loc = TRUE)(one_covar)

## Creating a dispRity
data(charadriiformes)

## Creating a dispRity object from the charadriiformes model
covar_data <- MCMCglmm.subsets(data       = charadriiformes$data,
                               posteriors = charadriiformes$posteriors)

## Get one matrix and one covar matrix
one_matrix <- get.matrix(covar_data, subsets = 1)
one_covar  <- get.covar(covar_data, subsets = 1, n = 1)[[1]][[1]]

## Measure the centroids
centroids(one_matrix)

## Measure the centroids on the covar matrix
as.covar(centroids)(one_covar)
## Is the same as:
centroids(one_covar$VCV)

## Apply the measurement on a dispRity object:
## On the traitspace:
summary(dispRity(covar_data, metric = c(sum, centroids))) 
## On the covariance matrices:
summary(dispRity(covar_data, metric = c(sum, as.covar(centroids))))
## The same but with additional options (centre = 100)
summary(dispRity(covar_data,
                 metric = c(sum, as.covar(centroids)),
                 centre = 100))

## Example with the VCV and loc options
## A metric that works with both VCV and loc
## (the sum of the variances minus the distance from the location)
sum.var.dist <- function(matrix, loc = rep(0, ncol(matrix))) {
    ## Get the sum of the diagonal of the matrix
    sum_diag <- sum(diag(matrix))
    ## Get the distance between 0 and the loc
    dist_loc <- dist(matrix(c(rep(0, ncol(matrix)), loc), nrow = 2, byrow = TRUE))[1]
    ## Return the sum of the diagonal minus the distance
    return(sum_diag - dist_loc)
}
## Changing the $loc on one_covar for the demonstration
one_covar$loc <- c(1, 2, 3)
## Metric on the VCV part only
as.covar(sum.var.dist, VCV = TRUE, loc = FALSE)(one_covar)
## Metric on the loc part only
as.covar(sum.var.dist, VCV = FALSE, loc = TRUE)(one_covar)
## Metric on both parts
as.covar(sum.var.dist, VCV = TRUE, loc = TRUE)(one_covar)

Beck and Lee 2014 datasets

Description

Example datasets from Beck and Lee 2014.

Format

three matrices and one phylogenetic tree.

Details

BeckLee_tree A phylogenetic tree with 50 living and fossil taxa
BeckLee_mat50 The ordinated matrix based on the 50 taxa cladistic distances
BeckLee_mat99 The ordinated matrix based on the 50 taxa + 49 nodes cladistic distances
BeckLee_ages A list of first and last occurrence data for fossil taxa
BeckLee_disparity a dispRity object with estimated sum of variances in 120 time bins, boostrapped 100 times from the Beck and Lee data

References

Beck RMD & Lee MSY. 2014. Ancient dates or accelerated rates? Morphological clocks and the antiquity of placental mammals. Proc. R. Soc. B 2014 281 20141278; DOI: 10.1098/rspb.2014.1278

BeckLee_disparity

Description

An example of a dispRity object.

Format

one dispRity object.

Details

This matrix is based on the BeckLee dataset and split into 120 continuous subsets (chrono.subsets). It was bootstrapped 100 times (boot.matrix) with four rarefaction levels. Disparity was calculated as the sum of the variances (dispRity).

Examples

## Not run: 
## Loading the data
data(BeckLee_mat99)
data(BeckLee_tree)
data(BeckLee_ages)

## Creating the 7 subsets
subsets <- chrono.subsets(BeckLee_mat99, BeckLee_tree,
                          time = seq(from = 0, to = 120, by = 1),
                          method = "continuous", model = "proximity",
                          FADLAD = BeckLee_ages)

## Bootstrapping and rarefying
bootstraps <- boot.matrix(subsets, bootstraps = 100)

## Calculating disparity
BeckLee_disparity <- dispRity(bootstraps, metric = c(sum, variances))

## End(Not run)
## Not run: 
## Loading the data
data(BeckLee_mat99)
data(BeckLee_tree)
data(BeckLee_ages)

## Creating the 7 subsets
subsets <- chrono.subsets(BeckLee_mat99, BeckLee_tree,
                          time = seq(from = 0, to = 120, by = 1),
                          method = "continuous", model = "proximity",
                          FADLAD = BeckLee_ages)

## Bootstrapping and rarefying
bootstraps <- boot.matrix(subsets, bootstraps = 100)

## Calculating disparity
BeckLee_disparity <- dispRity(bootstraps, metric = c(sum, variances))

## End(Not run)

Bhattacharyya Coefficient

Description

Calculates the probability of overlap between two distributions.

Usage

bhatt.coeff(x, y, bw = bw.nrd0, ...)
bhatt.coeff(x, y, bw = bw.nrd0, ...)

Arguments

`x`, `y`	two distributions.
`bw`	the bandwidth size, either a `numeric` or a `function` (see `bw.nrd0`).
`...`	optional arguments to be passed to the `bw` argument.

Author(s)

Thomas Guillerme

References

Bhattacharyya A. 1943. On a measure of divergence between two statistical populations defined by their probability distributions. Bull. Calcutta Math. Soc., 35, pp. 99-109

Examples

## Two dummy distributions
x <- rnorm(1000, 0, 1)
y <- rnorm(1000, 1, 2)

## What is the probability of overlap of these distributions?
bhatt.coeff(x, y)

## Two dummy distributions
x <- rnorm(1000, 0, 1)
y <- rnorm(1000, 1, 2)

## What is the probability of overlap of these distributions?
bhatt.coeff(x, y)

Bootstraps and rarefies data.

Description

Bootstraps and rarefies either a matrix or a list of matrices.

Usage

boot.matrix(
  data,
  bootstraps = 100,
  boot.type = "full",
  boot.by = "rows",
  rarefaction = FALSE,
  verbose = FALSE,
  prob = NULL
)
boot.matrix(
  data,
  bootstraps = 100,
  boot.type = "full",
  boot.by = "rows",
  rarefaction = FALSE,
  verbose = FALSE,
  prob = NULL
)

Arguments

`data`	A `matrix` or a list of matrices (typically output from chrono.subsets or custom.subsets - see details).
`bootstraps`	The number of bootstrap pseudoreplicates (`default = 100`).
`boot.type`	The bootstrap algorithm to use (`default = "full"`; see details).
`boot.by`	Which dimension of the data to bootstrap: either `"rows"` to bootstrap the elements (default), `"columns"` for the dimensions or `"dist"` for bootstrapping both equally (e.g. for distance matrices).
`rarefaction`	Either a `logical` value whether to fully rarefy the data, a set of `numeric` values used to rarefy the data or `"min"` to rarefy at the minimum level (see details).
`verbose`	A `logical` value indicating whether to be verbose or not.
`prob`	Optional, a `matrix` or a `vector` of probabilities for each element to be selected during the bootstrap procedure. The `matrix` or the `vector` must have a row names or names attribute that corresponds to the elements in `data`.

Details

data: The data is considered as the multidimensional space and is not transformed (e.g. if ordinated with negative eigen values, no correction is applied to the matrix).

rarefaction: when the input is numeric, the number of elements is set to the value(s) for each bootstrap. If some subsets have fewer elements than the rarefaction value, the subsets is not rarefied. When the input is "min", the smallest number of elements is used (or 3 if some subsets have less than 3 elements).

boot.type: the different bootstrap algorithms are:

"full": resamples all the rows of the matrix and replaces them with a new random sample of rows (with replace = TRUE, meaning all the elements can be duplicated in each bootstrap).
"single": resamples only one row of the matrix and replaces it with a new randomly sampled row (with replace = FALSE, meaning that only one element can be duplicated in each bootstrap).
"null": resamples all rows of the matrix across subsets. I.e. for each subset of n elements, this algorithm resamples n elements across ALL subsets. If only one subset (or none) is used, this does the same as the "full" algorithm.

prob: This option allows to attribute specific probability to each element to be drawn. A probability of 0 will never sample the element, a probability of 1 will always allow it to be sampled. This can also be useful for weighting elements: an element with a weight of 10 will be sampled ten times more. If the argument is a matrix, it must have rownames attributes corresponding to the element names. If the argument is a vector, it must have names attributes corresponding to the element names.

Multiple trees: If the given data is a chrono.subsets based on multiple trees, the sampling is proportional to the presence of each element in each tree: $\sum (1/n) / T$ (with n being the maximum number of elements among the trees and T being the total numbers of trees). For example, for a slice through two trees resulting in the selection of elements A and B in the first tree and A, B and C in the second tree, the "full" bootstrap algorithm will select three elements (with replacement) between A, B and C with a probability of respectively $p(A) = 1/3$ ( $p(A) = (1/3 + 1/3) / 2$ ), $p(B) = 1/3$ and $p(C) = 1/6$ ( $p(C) = (0 + 1/3) / 2$ ).

Value

This function outputs a dispRity object containing:

`matrix`	the multidimensional space (a `matrix`).
`call`	A `list` containing the called arguments.
`subsets`	A `list` containing matrices pointing to the elements present in each subsets.

Use summary.dispRity to summarise the dispRity object.

Author(s)

Thomas Guillerme

Examples

## Load the Beck & Lee 2014 matrix
data(BeckLee_mat50)

## Bootstrapping a matrix
## Bootstrapping an ordinated matrix 20 times
boot.matrix(BeckLee_mat50, bootstraps = 20)
## Bootstrapping an ordinated matrix with rarefaction
boot.matrix(BeckLee_mat50, bootstraps = 20, rarefaction = TRUE)
## Bootstrapping an ordinated matrix with only elements 7, 10 and 11 sampled
boot.matrix(BeckLee_mat50, bootstraps = 20, rarefaction = c(7, 10, 11))
## Bootstrapping an the matrix but without sampling Cimolestes and sampling Maelestes 10x more
boot.matrix(BeckLee_mat50, bootstraps = 20, prob = c("Cimolestes" = 0, "Maelestes" = 10))

## Bootstrapping a subsets of matrices
## Generating a dummy subsets of matrices
ordinated_matrix <- matrix(data = rnorm(90), nrow = 10, ncol = 9,
                           dimnames = list(letters[1:10]))
matrix_list <- custom.subsets(ordinated_matrix, list(A = 1:5, B = 6:10))
## Bootstrapping the subsets of matrices 20 times (each)
boot.matrix(matrix_list, bootstraps = 20)

## Load the Beck & Lee 2014 matrix
data(BeckLee_mat50)

## Bootstrapping a matrix
## Bootstrapping an ordinated matrix 20 times
boot.matrix(BeckLee_mat50, bootstraps = 20)
## Bootstrapping an ordinated matrix with rarefaction
boot.matrix(BeckLee_mat50, bootstraps = 20, rarefaction = TRUE)
## Bootstrapping an ordinated matrix with only elements 7, 10 and 11 sampled
boot.matrix(BeckLee_mat50, bootstraps = 20, rarefaction = c(7, 10, 11))
## Bootstrapping an the matrix but without sampling Cimolestes and sampling Maelestes 10x more
boot.matrix(BeckLee_mat50, bootstraps = 20, prob = c("Cimolestes" = 0, "Maelestes" = 10))

## Bootstrapping a subsets of matrices
## Generating a dummy subsets of matrices
ordinated_matrix <- matrix(data = rnorm(90), nrow = 10, ncol = 9,
                           dimnames = list(letters[1:10]))
matrix_list <- custom.subsets(ordinated_matrix, list(A = 1:5, B = 6:10))
## Bootstrapping the subsets of matrices 20 times (each)
boot.matrix(matrix_list, bootstraps = 20)

Character differences

Description

Calculates the character difference from a discrete matrix

Usage

char.diff(
  matrix,
  method = "hamming",
  translate = TRUE,
  special.tokens,
  special.behaviours,
  order = FALSE,
  by.col = TRUE,
  correction
)
char.diff(
  matrix,
  method = "hamming",
  translate = TRUE,
  special.tokens,
  special.behaviours,
  order = FALSE,
  by.col = TRUE,
  correction
)

Arguments

`matrix`	A discrete matrix or a list containing discrete characters. The differences is calculated between the columns (usually characters). Use `t(matrix)` or `by.col = FALSE` to calculate the differences between the rows.
`method`	The method to measure difference: `"hamming"` (default; Hamming 1950), `"manhattan"`, `"comparable"`, `"euclidean"`, `"maximum"`, `"mord"` (Lloyd 2016), `"none"` or `"binary"`.
`translate`	`logical`, whether to translate the characters following the xyz notation (`TRUE` - default; see details - Felsenstein 2004) or not (`FALSE`). Translation works for up to 26 tokens per character.
`special.tokens`	optional, a named `vector` of special tokens to be passed to `grep` (make sure to protect the character with `"\\"`). By default `special.tokens <- c(missing = "\\?", inapplicable = "\\-", polymorphism = "\\&", uncertainty = "\\/")`. Note that `NA` values are not compared and that the symbol "@" is reserved and cannot be used.
`special.behaviours`	optional, a `list` of one or more functions for a special behaviour for `special.tokens`. See details.
`order`	`logical`, whether the character should be treated as order (`TRUE`) or not (`FALSE` - default). This argument can be a `logical` vector equivalent to the number of rows or columns in `matrix` (depending on `by.col`) to specify ordering for each character.
`by.col`	`logical`, whether to measure the distance by columns (`TRUE` - default) or by rows (`FALSE`).
`correction`	optional, an eventual `function` to apply to the matrix after calculating the distance.

Details

Each method for calculating distance is expressed as a function of $d(x, y)$ where $x$ and $y$ are a pair of columns (if by.col = TRUE) or rows in the matrix and n is the number of comparable rows (if by.col = TRUE) or columns between them and i is any specific pair of rows (if by.col = TRUE) or columns. The different methods are:

"hamming" The relative distance between characters. This is equal to the Gower distance for non-numeric comparisons (e.g. character tokens; Gower 1966). $d(x,y) = \sum[i,n](abs(x[i] - y[i])/n$
"manhattan" The "raw" distance between characters: $d(x,y) = \sum[i,n](abs(x[i] - y[i])$
"comparable" The number of comparable characters (i.e. the number of tokens that can be compared): $d(x,y) = \sum[i,n]((x[i] - y[i])/(x[i] - y[i]))$
"euclidean" The euclidean distance between characters: $d(x,y) = \sqrt(\sum[i,n]((x[i] - y[i])^2))$
"maximum" The maximum distance between characters: $d(x,y) = max(abs(x[i] - y[i]))$
"mord" The maximum observable distance between characters (Lloyd 2016): $d(x,y) = \sum[i,n](abs(x[i] - y[i])/\sum[i,n]((x[i] - y[i])/(x[i] - y[i])$
"none" Returns the matrix with eventual converted and/or translated tokens.
"binary" Returns the matrix with the binary characters.

When using translate = TRUE, the characters are translated following the xyz notation where the first token is translated to 1, the second to 2, etc. For example, the character 0, 2, 1, 0 is translated to 1, 2, 3, 1. In other words when translate = TRUE, the character tokens are not interpreted as numeric values. When using translate = TRUE, scaled metrics (i.e "hamming" and "gower") are divide by $n-1$ rather than $n$ due to the first character always being equal to 1.

special.behaviours allows to generate a special rule for the special.tokens. The functions should can take the arguments character, all_states with character being the character that contains the special token and all_states for the character (which is automatically detected by the function). By default, missing data returns and inapplicable returns NA, and polymorphisms and uncertainties return all present states.

missing = function(x,y) NA
inapplicable = function(x,y) NA
polymorphism = function(x,y) strsplit(x, split = "\\&")[[1]]
uncertainty = function(x,y) strsplit(x, split = "\\/")[[1]]

Functions in the list must be named following the special token of concern (e.g. missing), have only x, y as inputs and a single output a single value (that gets coerced to integer automatically). For example, the special behaviour for the special token "?" can be coded as: special.behaviours = list(missing = function(x, y) return(y) to make all comparisons containing the special token containing "?" return any character state y.

IMPORTANT: Note that for any distance method, NA values are skipped in the distance calculations (e.g. distance(A = {1, NA, 2}, B = {1, 2, 3}) is treated as distance(A = {1, 2}, B = {1, 3})).

IMPORTANT: Note that the number of symbols (tokens) per character is limited by your machine's word-size (32 or 64 bits). If you have more than 64 tokens per character, you might want to use continuous data.

Value

A character difference value or a matrix of class char.diff

Author(s)

Thomas Guillerme

References

Felsenstein, J. 2004. Inferring phylogenies vol. 2. Sinauer Associates Sunderland. Gower, J.C. 1966. Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53:325-338. Hamming, R.W. 1950. Error detecting and error correcting codes. The Bell System Technical Journal. DOI: 10.1002/j.1538-7305.1950.tb00463.x. Lloyd, G.T. 2016. Estimating morphological diversity and tempo with discrete character-taxon matrices: implementation, challenges, progress, and future directions. Biological Journal of the Linnean Society. DOI: 10.1111/bij.12746.

Examples

## Comparing two binary characters
char.diff(list(c(0, 1, 0, 1), c(0, 1, 1, 1)))

## Pairwise comparisons in a morphological matrix
morpho_matrix <- matrix(sample(c(0,1), 100, replace = TRUE), 10)
char.diff(morpho_matrix)

## Adding special tokens to the matrix
morpho_matrix[sample(1:100, 10)] <- c("?", "0&1", "-")
char.diff(morpho_matrix)

## Modifying special behaviours for tokens with "&" to be treated as NA
char.diff(morpho_matrix,
          special.behaviours = list(polymorphism = function(x,y) return(NA)))

## Adding a special character with a special behaviour (count "%" as "100")
morpho_matrix[sample(1:100, 5)] <- "%"
char.diff(morpho_matrix,
          special.tokens = c("paragraph" = "\\%"),
          special.behaviours = list(paragraph = function(x,y) as.integer(100)))

## Comparing characters with/without translation
char.diff(list(c(0, 1, 0, 1), c(1, 0, 1, 0)), method = "manhattan")
# no character difference
char.diff(list(c(0, 1, 0, 1), c(1, 0, 1, 0)), method = "manhattan",
          translate = FALSE)
# all four character states are different

## Comparing two binary characters
char.diff(list(c(0, 1, 0, 1), c(0, 1, 1, 1)))

## Pairwise comparisons in a morphological matrix
morpho_matrix <- matrix(sample(c(0,1), 100, replace = TRUE), 10)
char.diff(morpho_matrix)

## Adding special tokens to the matrix
morpho_matrix[sample(1:100, 10)] <- c("?", "0&1", "-")
char.diff(morpho_matrix)

## Modifying special behaviours for tokens with "&" to be treated as NA
char.diff(morpho_matrix,
          special.behaviours = list(polymorphism = function(x,y) return(NA)))

## Adding a special character with a special behaviour (count "%" as "100")
morpho_matrix[sample(1:100, 5)] <- "%"
char.diff(morpho_matrix,
          special.tokens = c("paragraph" = "\\%"),
          special.behaviours = list(paragraph = function(x,y) as.integer(100)))

## Comparing characters with/without translation
char.diff(list(c(0, 1, 0, 1), c(1, 0, 1, 0)), method = "manhattan")
# no character difference
char.diff(list(c(0, 1, 0, 1), c(1, 0, 1, 0)), method = "manhattan",
          translate = FALSE)
# all four character states are different

Charadriiformes

Description

An example of a MCMCglmm model.

Format

one data.frame, one phylo and one MCMCglmm.

Details

This dataset is based on a random subset of 359 Charadriiformes (gulls, plovers and sandpipers) from Cooney et al 2017 and trees from Jetz et al 2012. It contains:

data A "data.frame" .
tree A consensus tree of 359 charadriiformes species ("phylo").
posteriors The posteriors from a "MCMCglmm" model (see example below).
tree_distribution A random distribution of 10 trees of the 359 charadriiformes species ("multiPhylo").

References

Cooney CR, Bright JA, Capp EJ, Chira AM,Hughes EC, Moody CJ, Nouri LO, Varley ZK, Thomas GH. Mega-evolutionary dynamics of the adaptive radiation of birds. Nature. 2017 Feb;542(7641):344-7.

Jetz W, Thomas GH, Joy JB, Hartmann K, Mooers AO. The global diversity of birds in space and time. Nature. 2012 Nov;491(7424):444-8.

Examples

## Not run: 
## Reproducing the MCMCglmm model
require(MCMCglmm)
data(charadriiformes)

## Setting up the model parameters:
## 1 - The formula (the first three PC axes)
model_formula <- cbind(PC1, PC2, PC3) ~ trait:clade-1
## 2 - The residual term
model_residuals <- ~us(trait):units
## 3 - The random terms
## (one per clade and one for the whole phylogeny)
model_randoms <- ~ us(at.level(clade,1):trait):animal
                 + us(at.level(clade,2):trait):animal
                 + us(at.level(clade,3):trait):animal
                 + us(trait):animal

## Flat priors for the residuals and random terms
flat_priors <- list(
     ## The residuals priors
     R = list(
         R1 = list(V = diag(3), nu = 0.002)), 
     ## The random priors (the phylogenetic terms)
     G = list(
         G1 = list(V = diag(3), nu = 0.002),
         G2 = list(V = diag(3), nu = 0.002),
         G3 = list(V = diag(3), nu = 0.002),
         G4 = list(V = diag(3), nu = 0.002)))

## Run the model for 110000 iterations
## sampled every 100 with a burnin (discard)
## of the first 10000 iterations)
model <- MCMCglmm(formula  = model_formula,
                  rcov     = model_residual,
                  random   = model_randoms,
                  family   = rep("gaussian", 3),
                  prior    = flat_priors,
                  nitt     = 110000,
                  burnin   = 10000,
                  thin     = 100,
                  pedigree = charadriiformes$tree,
                  data     = charadriiformes$data)

## End(Not run)
## Not run: 
## Reproducing the MCMCglmm model
require(MCMCglmm)
data(charadriiformes)

## Setting up the model parameters:
## 1 - The formula (the first three PC axes)
model_formula <- cbind(PC1, PC2, PC3) ~ trait:clade-1
## 2 - The residual term
model_residuals <- ~us(trait):units
## 3 - The random terms
## (one per clade and one for the whole phylogeny)
model_randoms <- ~ us(at.level(clade,1):trait):animal
                 + us(at.level(clade,2):trait):animal
                 + us(at.level(clade,3):trait):animal
                 + us(trait):animal

## Flat priors for the residuals and random terms
flat_priors <- list(
     ## The residuals priors
     R = list(
         R1 = list(V = diag(3), nu = 0.002)), 
     ## The random priors (the phylogenetic terms)
     G = list(
         G1 = list(V = diag(3), nu = 0.002),
         G2 = list(V = diag(3), nu = 0.002),
         G3 = list(V = diag(3), nu = 0.002),
         G4 = list(V = diag(3), nu = 0.002)))

## Run the model for 110000 iterations
## sampled every 100 with a burnin (discard)
## of the first 10000 iterations)
model <- MCMCglmm(formula  = model_formula,
                  rcov     = model_residual,
                  random   = model_randoms,
                  family   = rep("gaussian", 3),
                  prior    = flat_priors,
                  nitt     = 110000,
                  burnin   = 10000,
                  thin     = 100,
                  pedigree = charadriiformes$tree,
                  data     = charadriiformes$data)

## End(Not run)

Check a morphological matrix consistency levels.

Description

Performs a fast check of the phylogenetic signal in a morphological matrix using parsimony.

Usage

check.morpho(
  matrix,
  orig.tree,
  parsimony = "fitch",
  first.tree = c(phangorn::dist.hamming, phangorn::NJ),
  distance = phangorn::RF.dist,
  ...,
  contrast.matrix,
  verbose = FALSE
)
check.morpho(
  matrix,
  orig.tree,
  parsimony = "fitch",
  first.tree = c(phangorn::dist.hamming, phangorn::NJ),
  distance = phangorn::RF.dist,
  ...,
  contrast.matrix,
  verbose = FALSE
)

Arguments

`matrix`	A discrete morphological matrix.
`orig.tree`	Optional, the input tree to measure the distance between the parsimony and the original tree.
`parsimony`	Either the parsimony algorithm to be passed to `optim.parsimony` or a parsimony function that can take a `phyDat` object as an input (`default = "fitch"`).
`first.tree`	A list of functions to generate the first most parsimonious tree (default = `c(dist.hamming, NJ)`; see details).
`distance`	Optional, if orig.tree is provided, the function to use for measuring distance between the trees (default = `link[phangorn]{RF.dist}`).
`...`	Any additional arguments to be passed to the parsimony algorithm.
`contrast.matrix`	An optional contrast matrix. By default, the function recognises any character state token as different apart from `?` that is treated as all characters.
`verbose`	Whether to be verbose or not (`default = FALSE`).

Details

The first.tree argument must be a list of functions to be used in a cascade to transform the matrix (as a phyDat object) into a tree using the functions iteratively. For example the default c(dist.hamming, NJ) will apply the following to the matrix: NJ(dist.hamming(matrix))

Value

Returns the parsimony score (using parsimony), the consistency and retention indices (using CI and RI) from the most parsimonious tree obtained from the matrix. Can also return the topological distance from the original tree if provided.

Author(s)

Thomas Guillerme

Examples

## Generating a random tree
random_tree <- rcoal(10)

## Generating a random matrix
random_matrix <- sim.morpho(random_tree, characters = 50, model = "ER",
     rates = c(rgamma, 1, 1))

## Checking the matrix scores
check.morpho(random_matrix, orig.tree = random_tree)

## Generating a random tree
random_tree <- rcoal(10)

## Generating a random matrix
random_matrix <- sim.morpho(random_tree, characters = 50, model = "ER",
     rates = c(rgamma, 1, 1))

## Checking the matrix scores
check.morpho(random_matrix, orig.tree = random_tree)

Separating data in chronological subsets.

Description

Splits the data into a chronological (time) subsets list.

Usage

chrono.subsets(
  data,
  tree = NULL,
  method,
  time,
  model,
  inc.nodes = FALSE,
  FADLAD = NULL,
  verbose = FALSE,
  t0 = FALSE,
  bind.data = FALSE,
  dist.data = FALSE
)
chrono.subsets(
  data,
  tree = NULL,
  method,
  time,
  model,
  inc.nodes = FALSE,
  FADLAD = NULL,
  verbose = FALSE,
  t0 = FALSE,
  bind.data = FALSE,
  dist.data = FALSE
)

Arguments

`data`	A `matrix`, a `list` of matrices or a `dispRity` object.
`tree`	`NULL` (default) or an optional `phylo` or `multiPhylo` object matching the data and with a `root.time` element. This argument can be left missing if `method = "discrete"` and all elements are present in the optional `FADLAD` argument or if `data` has a `$tree` component.
`method`	The time subsampling method: either `"discrete"` (or `"d"`) or `"continuous"` (or `"c"`).
`time`	Either a single `integer` for the number of discrete or continuous samples or a `vector` containing the age of each sample.
`model`	One of the following models: `"acctran"`, `"deltran"`, `"random"`, `"proximity"`, `"equal.split"` or `"gradual.split"`. Is ignored if `method = "discrete"`.
`inc.nodes`	A `logical` value indicating whether nodes should be included in the time subsets. Is ignored if `method = "continuous"`.
`FADLAD`	`NULL` (default) or an optional `data.frame` or `list` of `data.frame`s containing the first and last occurrence data.
`verbose`	A `logical` value indicating whether to be verbose or not. Is ignored if `method = "discrete"`.
`t0`	If `time` is a number of samples, whether to start the sampling from the `tree$root.time` (`TRUE`), or from the first sample containing at least three elements (`FALSE` - default) or from a fixed time point (if `t0` is a single `numeric` value).
`bind.data`	If `data` contains multiple matrices and `tree` contains the same number of trees, whether to bind the pairs of matrices and the trees (`TRUE`) or not (`FALSE` - default).
`dist.data`	A `logical` value indicating whether to treat the data as distance data (`TRUE`) or not (`FALSE` - default).

Details

The data is considered as the multidimensional space with rows as elements and columns as dimensions and is not transformed (e.g. if ordinated with negative eigen values, no correction is applied to the matrix).

If method = "continuous" and when the sampling is done along an edge of the tree, the data selected for the time subsets can be one of the following:

Punctuated models:
- "acctran": always selecting the value from the ancestral node.
- "deltran": always selecting the value from the descendant node or tip.
- "random": randomly selecting between the ancestral node or the descendant node/tip.
- "proximity": selecting the ancestral node or the descendant node/tip with a probability relative to branch length.
Gradual models:
- "equal.split": randomly selecting from the ancestral node or the descendant node or tip with a 50% probability each.
- "gradual.split": selecting the ancestral node or the descendant with a probability relative to branch length.

N.B. "equal.split" and "gradual.split" differ from the punctuated models by outputting a node/tip probability table rather than simply the node and the tip selected. In other words, when bootstrapping using boot.matrix, the two former models will properly integrate the probability to the bootstrap procedure (i.e. different tips/nodes can be drawn) and the two latter models will only use the one node/tip determined by the model before the bootstrapping.

Author(s)

Thomas Guillerme

References

Guillerme T. & Cooper N. 2018. Time for a rethink: time sub-sampling methods in disparity-through-time analyses. Palaeontology. DOI: 10.1111/pala.12364.

Examples

## Load the Beck & Lee 2014 data
data(BeckLee_tree) ; data(BeckLee_mat50)
data(BeckLee_mat99) ; data(BeckLee_ages)

## Time binning (discrete method)
## Generate two discrete time bins from 120 to 40 Ma every 40 Ma
chrono.subsets(data = BeckLee_mat50, tree = BeckLee_tree, method = "discrete",
     time = c(120, 80, 40), inc.nodes = FALSE, FADLAD = BeckLee_ages)
## Generate the same time bins but including nodes
chrono.subsets(data = BeckLee_mat99, tree = BeckLee_tree, method = "discrete",
     time = c(120, 80, 40), inc.nodes = TRUE, FADLAD = BeckLee_ages)

## Time slicing (continuous method)
## Generate five equidistant time slices in the dataset assuming a proximity
## evolutionary model
chrono.subsets(data = BeckLee_mat99, tree = BeckLee_tree,
     method = "continuous", model = "acctran", time = 5,
     FADLAD = BeckLee_ages)

## Load the Beck & Lee 2014 data
data(BeckLee_tree) ; data(BeckLee_mat50)
data(BeckLee_mat99) ; data(BeckLee_ages)

## Time binning (discrete method)
## Generate two discrete time bins from 120 to 40 Ma every 40 Ma
chrono.subsets(data = BeckLee_mat50, tree = BeckLee_tree, method = "discrete",
     time = c(120, 80, 40), inc.nodes = FALSE, FADLAD = BeckLee_ages)
## Generate the same time bins but including nodes
chrono.subsets(data = BeckLee_mat99, tree = BeckLee_tree, method = "discrete",
     time = c(120, 80, 40), inc.nodes = TRUE, FADLAD = BeckLee_ages)

## Time slicing (continuous method)
## Generate five equidistant time slices in the dataset assuming a proximity
## evolutionary model
chrono.subsets(data = BeckLee_mat99, tree = BeckLee_tree,
     method = "continuous", model = "acctran", time = 5,
     FADLAD = BeckLee_ages)

Imports data from Claddis

Description

Takes Claddis data and computes both the distance and the ordination matrix

Usage

Claddis.ordination(data, distance = "mord", ..., k, add = TRUE, arg.cmdscale)
Claddis.ordination(data, distance = "mord", ..., k, add = TRUE, arg.cmdscale)

Arguments

`data`	Data from `read_nexus_matrix` or the path to a file to be read by `read.nexus.data` (see details).
`distance`	Distance type to be computed by `calculate_morphological_distances`. Can be either `"gc"`, `"ged"`, `"red"`, `"mord"`. `distance` can also be set to `NULL` to convert a matrix in `read_nexus_matrix` list type (see details).
`...`	Any optional arguments to be passed to `calculate_morphological_distances`.
`k`	The number of dimensions in the ordination. If left empty, the number of dimensions is set to number of rows - 1.
`add`	whether to use the Cailliez correction for negative eigen values (`add = TRUE`; default - see `cmdscale`) or not (`add = FALSE`).
`arg.cmdscale`	Any optional arguments to be passed to `cmdscale` (as a named list such as `list(x.ret = TRUE)`).

Details

If data is a file path, the function will use a modified version of read.nexus.data (that handles polymorphic and ambiguous characters). The file content will then be converted into a read_nexus_matrix type list treating all characters as unordered. If the distance is set to NULL, data will be only converted into a read_nexus_matrix type list.

Author(s)

Thomas Guillerme

Examples

## Not run: 
require(Claddis)

## Ordinating the distance matrix of Claddis example data
Claddis.ordination(Claddis::michaux_1989)

## Creating simple discrete morphological matrix (with polymorphisms)
cat(
"#NEXUS
BEGIN DATA;
DIMENSIONS  NTAX=5 NCHAR=5;
FORMAT SYMBOLS= \" 0 1 2\" MISSING=? GAP=- ;
MATRIX
     t1  {01}1010
     t2  02120
     t3  1210(01)
     t4  01111
     t5  00101
;
END;", file = "morpho_matrix.nex")

## Ordinating the matrix (using a distance matrix)
Claddis.ordination("morpho_matrix.nex")

## Only converting the nexus matrix into a Claddis format
Claddis_data <- Claddis.ordination("morpho_matrix.nex", distance = NULL)

file.remove("morpho_matrix.nex")

## End(Not run)

## Not run: 
require(Claddis)

## Ordinating the distance matrix of Claddis example data
Claddis.ordination(Claddis::michaux_1989)

## Creating simple discrete morphological matrix (with polymorphisms)
cat(
"#NEXUS
BEGIN DATA;
DIMENSIONS  NTAX=5 NCHAR=5;
FORMAT SYMBOLS= \" 0 1 2\" MISSING=? GAP=- ;
MATRIX
     t1  {01}1010
     t2  02120
     t3  1210(01)
     t4  01111
     t5  00101
;
END;", file = "morpho_matrix.nex")

## Ordinating the matrix (using a distance matrix)
Claddis.ordination("morpho_matrix.nex")

## Only converting the nexus matrix into a Claddis format
Claddis_data <- Claddis.ordination("morpho_matrix.nex", distance = NULL)

file.remove("morpho_matrix.nex")

## End(Not run)

Cleaning phylogenetic data

Description

Cleans a table/tree to match with a given table/tree

Usage

clean.data(data, tree, inc.nodes = FALSE)
clean.data(data, tree, inc.nodes = FALSE)

Arguments

`data`	A `data.frame` or `matrix` with the elements names as row names.
`tree`	A `phylo` or `multiPhylo` object.
`inc.nodes`	Logical, whether to check if the nodes in `tree` are also present in `data` (`TRUE`) or not (`FALSE`; default).

Details

Note if inc.nodes is set to TRUE, the function outputs an error message if there is no matching.

Value

A list containing the cleaned data and tree(s) and information on the eventual dropped tips and rows.

Author(s)

Thomas Guillerme

Examples

##Creating a set of different trees
trees_list <- list(rtree(5, tip.label = LETTERS[1:5]), rtree(4,
     tip.label = LETTERS[1:4]), rtree(6, tip.label = LETTERS[1:6]))
class(trees_list) <- "multiPhylo"

##Creating a matrix
dummy_data <- matrix(c(rnorm(5), runif(5)), 5, 2,
    dimnames = list(LETTERS[1:5], c("var1", "var2")))

##Cleaning the trees and the data
cleaned <- clean.data(data = dummy_data, tree = trees_list)
##The taxa that where dropped (tips and rows):
c(cleaned$dropped_tips, cleaned$dropped_rows)
##The cleaned trees:
cleaned$tree
##The cleaned data set:
cleaned$data

##Creating a set of different trees
trees_list <- list(rtree(5, tip.label = LETTERS[1:5]), rtree(4,
     tip.label = LETTERS[1:4]), rtree(6, tip.label = LETTERS[1:6]))
class(trees_list) <- "multiPhylo"

##Creating a matrix
dummy_data <- matrix(c(rnorm(5), runif(5)), 5, 2,
    dimnames = list(LETTERS[1:5], c("var1", "var2")))

##Cleaning the trees and the data
cleaned <- clean.data(data = dummy_data, tree = trees_list)
##The taxa that where dropped (tips and rows):
c(cleaned$dropped_tips, cleaned$dropped_rows)
##The cleaned trees:
cleaned$tree
##The cleaned data set:
cleaned$data

covar.plot

Description

Visualising components of a dispRity object with covar.

Usage

covar.plot(
  data,
  n,
  points = TRUE,
  major.axes = FALSE,
  ellipses = FALSE,
  level = 0.95,
  dimensions = c(1, 2),
  centres = colMeans,
  scale,
  transparent.scale,
  add = FALSE,
  apply.to.VCV = FALSE,
  ...
)
covar.plot(
  data,
  n,
  points = TRUE,
  major.axes = FALSE,
  ellipses = FALSE,
  level = 0.95,
  dimensions = c(1, 2),
  centres = colMeans,
  scale,
  transparent.scale,
  add = FALSE,
  apply.to.VCV = FALSE,
  ...
)

Arguments

`data`	an `dispRity` object with a covar component.
`n`	optional, a number of random posteriors to use.
`points`	logical, whether to plot the observed elements (`TRUE`; default) or not (`FALSE`).
`major.axes`	can be either logical for plotting all (or `n`) major.axes (`TRUE`) or none (`FALSE`; default) or a `function` for displaying one summarised major axis. See details.
`ellipses`	can be either logical for plotting all (or `n`) ellipses (`TRUE`) or none (`FALSE`; default) or a `function` for displaying one summarised ellipse. See details.
`level`	the confidence interval level of the major axes and ellipses (default is `0.95`).
`dimensions`	which dimensions (default is `c(1,2)`).
`centres`	optional, a way to determine ellipses or major axes positions. Can be either a `function` (default is `colMeans`), a `vector` or a `list` of coordinates vectors or `"intercept"`. See details.
`scale`	optional, the name of a group from `data` on which to scale the ellipses and major axis to be the same size.
`transparent.scale`	optional, if multiple major axes and/or ellipses are plotted, a scaling factor for the transparency. If left empty, the transparency is set to `1/n` or `0.1` (whichever is higher).
`add`	logical, whether to add the plot to an existing plot (`TRUE`) or not (`FALSE`; default).
`apply.to.VCV`	logical, if `ellipse` and/or `major.axes` is a `function`, whether to apply it on all the estimated ellipses/major axes (`FALSE`; default) or on the variance covariance matrices directly (`TRUE`). In other words, whether to apply the function to the ellipses/major axis or the the VCV first (e.g. the average ellipses or the ellipse of the average VCV).
`...`	any graphical options to be passed to `plot`, `lines` or `points`. See details.

Details

When specifying optional arguments with ... in a graph with multiple elements (e.g. points, lines, etc...) you can specify which specific element to affect using the syntax <element>.<argument>. For example if you want everything in the plot to be in blue at the exception of the points to be red, you can use covar.plot(..., col = "blue", points.col = "red").

The arguments major.axes and ellipses can intake a function for summarising the display of multiple variance covariance matrices (if n is missing or greater than one). This can be any central tendency function such as mean, median or mode.val.

The argument centres allows to determine how to calculate the centre of each ellipses or major axes. The argument can be either:

A function to calculate the centre from a group like the default colMeans function that calculates the centroid coordinates of each group;
A numeric value to be replicated as the coordinates for the centre of each group (e.g. centres = 0 sets all the centres at the coordinates c(0,0,0,...)); or a vector of numeric values to be directly used as the coordinates for each group (e.g. centres = c(1,2,3) sets all the centres at the coordinates c(1,2,3)); or a list of numeric values or numeric vectors to be used as the coordinates for the centres of each group;
"intercept" for using the estimated posterior intercept for each sample.

NOTE that if the input contains more dimensions than the visualised dimensions (by default dimensions = c(1,2)) the ellipses and major axes are projected from an n-dimensional space onto a 2D space which might make them look incorrect. NOTE also that the ellipses and major axes are measured independently, when summarising both parameters (e.g. by using ellipses = mean and major.axes = mean), the displayed summarised major axes is not calculated from the summarised ellipse but from the coordinates of all major axes (and therefore might not match the coordinates of the ellipse).

Author(s)

Thomas Guillerme

Examples

data(charadriiformes)

## Creating a dispRity object from the charadriiformes model
covar <- MCMCglmm.subsets(data       = charadriiformes$data,
                          posteriors = charadriiformes$posteriors,
                          group      = MCMCglmm.levels(
                                         charadriiformes$posteriors)[1:4],
                          rename.groups = c("gulls", "plovers",
                                            "sandpipers", "phylogeny"))

## Default plot
covar.plot(covar)

## Same plot with more options
covar.plot(covar, n = 50, ellipses = mean, major.axes = TRUE, 
           col = c("orange", "blue", "darkgreen", "grey", "grey"),
           legend = TRUE, points = TRUE, points.cex = 0.2,
           main = "Charadriiformes shapespace")

data(charadriiformes)

## Creating a dispRity object from the charadriiformes model
covar <- MCMCglmm.subsets(data       = charadriiformes$data,
                          posteriors = charadriiformes$posteriors,
                          group      = MCMCglmm.levels(
                                         charadriiformes$posteriors)[1:4],
                          rename.groups = c("gulls", "plovers",
                                            "sandpipers", "phylogeny"))

## Default plot
covar.plot(covar)

## Same plot with more options
covar.plot(covar, n = 50, ellipses = mean, major.axes = TRUE, 
           col = c("orange", "blue", "darkgreen", "grey", "grey"),
           legend = TRUE, points = TRUE, points.cex = 0.2,
           main = "Charadriiformes shapespace")

Utilities for a dispRity object with covariance matrices

Description

Different utility functions to extract aspects of a MCMCglmm object.

Usage

get.covar(data, subsets, sample, n, dimensions)

axis.covar(data, subsets, sample, n, dimensions, level = 0.95, axis = 1)
get.covar(data, subsets, sample, n, dimensions)

axis.covar(data, subsets, sample, n, dimensions, level = 0.95, axis = 1)

Arguments

`data`	a `dispRity` object with a `covar` element.
`subsets`	optional, a `numeric` or `character` for which subsets to get (if missing, the value for all subsets are given).
`sample`	optional, one or more specific posterior sample IDs (is ignored if n is used) or a function to summarise all axes.
`n`	optional, a random number of covariance matrices to sample (if left empty, all are used).
`dimensions`	optional, which dimensions to use. If missing the dimensions from `data` are used.
`level`	which confidence interval level to use (default is `0.95`).
`axis`	which major axis to calculate (default is `1`, the first one).

Author(s)

Thomas Guillerme

Examples

## Load the Charadriiformes dataset
data(charadriiformes)
## Making a dispRity object with covar data
covar_data <- MCMCglmm.subsets(data       = charadriiformes$data,
                               posteriors = charadriiformes$posteriors)

## Get the two first covar matrices for each level
get.covar(covar_data, sample = c(1,2))
## Get 2 random covar matrices in 2D for each level
get.covar(covar_data, n = 2, dimensions = c(1,2))
## Get mean covar matrix for each level
get.covar(covar_data, sample = mean)

## Get the 0.95 major axis for the 42th covar matrix
axis.covar(covar_data, sample = 42)
## Get the 0.5 major axis for 2 random samples
axis.covar(covar_data, n = 1, level = 0.5)
## Get the median 0.95 minor axis of the 2D ellipse
axis.covar(covar_data, sample = mean, dimensions = c(1,2), axis = 2)

## Load the Charadriiformes dataset
data(charadriiformes)
## Making a dispRity object with covar data
covar_data <- MCMCglmm.subsets(data       = charadriiformes$data,
                               posteriors = charadriiformes$posteriors)

## Get the two first covar matrices for each level
get.covar(covar_data, sample = c(1,2))
## Get 2 random covar matrices in 2D for each level
get.covar(covar_data, n = 2, dimensions = c(1,2))
## Get mean covar matrix for each level
get.covar(covar_data, sample = mean)

## Get the 0.95 major axis for the 42th covar matrix
axis.covar(covar_data, sample = 42)
## Get the 0.5 major axis for 2 random samples
axis.covar(covar_data, n = 1, level = 0.5)
## Get the median 0.95 minor axis of the 2D ellipse
axis.covar(covar_data, sample = mean, dimensions = c(1,2), axis = 2)

Separates stem and crown species

Description

Selects the crown

Usage

crown.stem(tree, inc.nodes = TRUE, output.names = TRUE)
crown.stem(tree, inc.nodes = TRUE, output.names = TRUE)

Arguments

`tree`	a `"phylo"` object
`inc.nodes`	whether to include the nodes (`TRUE`; default) or not (`FALSE`) in the output.
`output.names`	whether to output the taxa names (`TRUE`; default) or two phylogenetic trees (`FALSE`).

Author(s)

Thomas Guillerme

Examples

## A tree with fossil taxa
data(BeckLee_tree)

## Getting both crown and stem taxa lists
crown.stem(BeckLee_tree)

## Splitting the tree into two subtrees
crown_stem_trees <- crown.stem(BeckLee_tree, output.names = FALSE)
## Graphical parameters
op <- par(mfrow = c(1,3))
## Plotting the trees
plot(BeckLee_tree, main = "Full tree")
plot(crown_stem_trees$crown, main = "Crown group")
plot(crown_stem_trees$stem, main = "Stem group")

## A tree with fossil taxa
data(BeckLee_tree)

## Getting both crown and stem taxa lists
crown.stem(BeckLee_tree)

## Splitting the tree into two subtrees
crown_stem_trees <- crown.stem(BeckLee_tree, output.names = FALSE)
## Graphical parameters
op <- par(mfrow = c(1,3))
## Plotting the trees
plot(BeckLee_tree, main = "Full tree")
plot(crown_stem_trees$crown, main = "Crown group")
plot(crown_stem_trees$stem, main = "Stem group")

Separating data into custom subsets.

Description

Splits the data into a customized subsets list.

Usage

custom.subsets(data, group, tree = NULL, dist.data = FALSE)
custom.subsets(data, group, tree = NULL, dist.data = FALSE)

Arguments

`data`	A `matrix` or a `list` of matrices.
`group`	Either a `list` of row numbers or names to be used as different groups, a `data.frame` with the same $k$ elements as in `data` as rownames, a `factor` or a `logical` vector. If `group` is a `phylo` object matching `data`, groups are automatically generated as clades (and the tree is attached to the resulting `dispRity` object).
`tree`	`NULL` (default) or an optional `phylo` or `multiPhylo` object to be attached to the data.
`dist.data`	A `logical` value indicating whether to treat the data as distance data (`TRUE`) or not (`FALSE` - default).

Details

Note that every element from the input data can be assigned to multiple groups!

Author(s)

Thomas Guillerme

Examples

## Generating a dummy ordinated matrix
ordinated_matrix <- matrix(data = rnorm(90), nrow = 10)

## Splitting the ordinated matrix into two groups using row numbers
custom.subsets(ordinated_matrix, list(c(1:4), c(5:10)))

## Splitting the ordinated matrix into three groups using row names
ordinated_matrix <- matrix(data = rnorm(90), nrow = 10,
     dimnames = list(letters[1:10]))
custom.subsets(ordinated_matrix,
     list("A" = c("a", "b", "c", "d"), "B" = c("e", "f", "g", "h", "i", "j"),
          "C" = c("a", "c", "d", "f", "h")))

## Splitting the ordinated matrix into four groups using a dataframe
groups <- as.data.frame(matrix(data = c(rep(1,5), rep(2,5), rep(c(1,2), 5)),
     nrow = 10, ncol = 2, dimnames = list(letters[1:10], c("g1", "g2"))))
custom.subsets(ordinated_matrix, groups)

## Splitting a matrix by clade
data(BeckLee_mat50)
data(BeckLee_tree)
custom.subsets(BeckLee_mat50, group = BeckLee_tree)


## Generating a dummy ordinated matrix
ordinated_matrix <- matrix(data = rnorm(90), nrow = 10)

## Splitting the ordinated matrix into two groups using row numbers
custom.subsets(ordinated_matrix, list(c(1:4), c(5:10)))

## Splitting the ordinated matrix into three groups using row names
ordinated_matrix <- matrix(data = rnorm(90), nrow = 10,
     dimnames = list(letters[1:10]))
custom.subsets(ordinated_matrix,
     list("A" = c("a", "b", "c", "d"), "B" = c("e", "f", "g", "h", "i", "j"),
          "C" = c("a", "c", "d", "f", "h")))

## Splitting the ordinated matrix into four groups using a dataframe
groups <- as.data.frame(matrix(data = c(rep(1,5), rep(2,5), rep(c(1,2), 5)),
     nrow = 10, ncol = 2, dimnames = list(letters[1:10], c("g1", "g2"))))
custom.subsets(ordinated_matrix, groups)

## Splitting a matrix by clade
data(BeckLee_mat50)
data(BeckLee_tree)
custom.subsets(BeckLee_mat50, group = BeckLee_tree)

Demo datasets

Description

A set six trait spaces with different groups and different dimensions.

Details

The content of these datasets and the pipeline to build them is described in details in Guillerme et al 2020.

beck A palaeobiology study of mammals. The data is a 105 dimensions ordination (PCO) of the distances between 106 mammals based on discrete morphological characters.
wright A palaeobiology study of crinoids. The data is a 41 dimensions ordination (PCO) of the distances between 42 crinoids based on discrete morphological characters.
marcy A geometric morphometric study of gophers (rodents). The data is a 134 dimensions ordination (PCA) the Procrustes superimposition of landmarks from 454 gopher skulls.
hopkins A geometric morphometric study of trilobites. The data is a 134 dimensions ordination (PCA) the Procrustes superimposition of landmarks from 46 trilobites cephala.
jones An ecological landscape study. The data is a 47 dimensions ordination (PCO) of the Jaccard distances between 48 field sites based on species composition.
healy A life history analysis of the pace of life in animals. The data is a 6 dimensions ordination (PCA) of 6 life history traits from 285 animal species.

Source

doi:10.1002/ece3.6452

References

Guillerme T, Puttick MN, Marcy AE, Weisbecker V. 2020 Shifting spaces: Which disparity or dissimilarity measurement best summarize occupancy in multidimensional spaces?. Ecol Evol. 2020;00:1-16. (doi:10.1002/ece3.6452)

Beck, R. M., & Lee, M. S. (2014). Ancient dates or accelerated rates? Morphological clocks and the antiquity of placental mammals. Proceedings of the Royal Society B: Biological Sciences, 281(1793), 20141278.

Wright, D. F. (2017). Bayesian estimation of fossil phylogenies and the evolution of early to middle Paleozoic crinoids (Echinodermata). Journal of Paleontology, 91(4), 799-814.

Marcy, A. E., Hadly, E. A., Sherratt, E., Garland, K., & Weisbecker, V. (2016). Getting a head in hard soils: convergent skull evolution and divergent allometric patterns explain shape variation in a highly diverse genus of pocket gophers (Thomomys). BMC evolutionary biology, 16(1), 207.

Hopkins, M.J. and Pearson, J.K., 2016. Non-linear ontogenetic shape change in Cryptolithus tesselatus (Trilobita) using three-dimensional geometric morphometrics. Palaeontologia Electronica, 19(3), pp.1-54.

Jones, N. T., Germain, R. M., Grainger, T. N., Hall, A. M., Baldwin, L., & Gilbert, B. (2015). Dispersal mode mediates the effect of patch size and patch connectivity on metacommunity diversity. Journal of Ecology, 103(4), 935-944.

Healy, K., Ezard, T.H., Jones, O.R., Salguero-Gomez, R. and Buckley, Y.M., 2019. Animal life history is shaped by the pace of life and the distribution of age-specific mortality and reproduction. Nature ecology & evolution, p.1.

Examples

data(demo_data)

## Loading the Beck and Lee 2014 demo data
demo_data$beck

## Loading the Wright 2017 demo data
demo_data$wright

## Loading the Marcy et al. 2015 demo data
demo_data$marcy

## Loading the Hopkins and Pearson 2016 demo data
demo_data$hopkins

## Loading the Jones et al. 2015 demo data
demo_data$jones

## Loading the Healy et al. 2019 demo data
demo_data$healy
data(demo_data)

## Loading the Beck and Lee 2014 demo data
demo_data$beck

## Loading the Wright 2017 demo data
demo_data$wright

## Loading the Marcy et al. 2015 demo data
demo_data$marcy

## Loading the Hopkins and Pearson 2016 demo data
demo_data$hopkins

## Loading the Jones et al. 2015 demo data
demo_data$jones

## Loading the Healy et al. 2019 demo data
demo_data$healy

disparity

Description

An example of a dispRity object.

Format

one dispRity object.

Details

This matrix is based on the BeckLee dataset and split into seven continuous subsets (chrono.subsets). It was bootstrapped 100 times (boot.matrix) with four rarefaction levels. Disparity was calculated as the median of the centroids (dispRity).

Examples

## Not run: 
## Loading the data
data(BeckLee_mat99)
data(BeckLee_tree)
data(BeckLee_ages)

## Creating the 7 subsets
subsets <- chrono.subsets(BeckLee_mat99, BeckLee_tree,
                          time = seq(from = 30, to = 90, by = 10),
                          method = "continuous", model = "ACCTRAN",
                          FADLAD = BeckLee_ages)

## Bootstrapping and rarefying
bootstraps <- boot.matrix(subsets, bootstraps = 100,
                          rarefaction = c(20, 15, 10, 5))

## Calculating disparity
disparity <- dispRity(bootstraps, metric = c(median, centroids))

## End(Not run)
## Not run: 
## Loading the data
data(BeckLee_mat99)
data(BeckLee_tree)
data(BeckLee_ages)

## Creating the 7 subsets
subsets <- chrono.subsets(BeckLee_mat99, BeckLee_tree,
                          time = seq(from = 30, to = 90, by = 10),
                          method = "continuous", model = "ACCTRAN",
                          FADLAD = BeckLee_ages)

## Bootstrapping and rarefying
bootstraps <- boot.matrix(subsets, bootstraps = 100,
                          rarefaction = c(20, 15, 10, 5))

## Calculating disparity
disparity <- dispRity(bootstraps, metric = c(median, centroids))

## End(Not run)

Calculates disparity from a matrix.

Description

Calculates disparity from a matrix, a list of matrices or subsets of a matrix, where the disparity metric can be user specified.

Usage

dispRity(
  data,
  metric,
  dimensions = NULL,
  ...,
  between.groups = FALSE,
  dist.data = NULL,
  verbose = FALSE,
  tree = NULL
)
dispRity(
  data,
  metric,
  dimensions = NULL,
  ...,
  between.groups = FALSE,
  dist.data = NULL,
  verbose = FALSE,
  tree = NULL
)

Arguments

`data`	A matrix or a `dispRity` object (see details).
`metric`	A vector containing one to three functions. At least of must be a dimension-level 1 or 2 function (see details).
`dimensions`	Optional, a vector of `numeric` value(s) or the proportion of the dimensions to keep.
`...`	Optional arguments to be passed to the metric.
`between.groups`	A `logical` value indicating whether to run the calculations between groups (`TRUE`) or not (`FALSE` - default) or a `numeric` list of pairs of groups to run (see details).
`dist.data`	A `logical` value indicating whether to treat the data as distance data (`TRUE`) or not (`FALSE`). By default it is set to `NULL` and interprets whether to use distance data from `data`.
`verbose`	A `logical` value indicating whether to be verbose or not.
`tree`	`NULL` (default) or an optional `phylo` or `multiPhylo` object to be attached to the data. If this argument is not null, it will be recycled by `metric` when possible.

Details

The dispRity object given to the data argument can be: a list of matrices (typically output from the functions chrono.subsets or custom.subsets), a bootstrapped matrix output from boot.matrix, a list of disparity measurements calculated from the dispRity function or a matrix object with rows as elements and columns as dimensions. In any of these cases, the data is considered as the multidimensional space and is not transformed (e.g. if ordinated with negative eigen values, no correction is applied to the matrix).

metric should be input as a vector of functions. The functions are sorted and used by dimension-level from 3 to 1 (see dispRity.metric and make.metric). Typically dimension-level 3 functions take a matrix and output a matrix; dimension-level 2 functions take a matrix and output a vector and dimension-level 1 functions take a matrix or a vector and output a single value. When more than one function is input, they are treated first by dimension-level (i.e. 3, 2 and finally 1). Note that the functions can only take one metric of each dimension-level and thus can only take a maximum of three arguments!

Some metric functions are built into the dispRity package: see dispRity.metric For user specified metrics, please use make.metric to ensure that the metric will work.

HINT: if using more than three functions you can always create your own function that uses more than one function (e.g. my_function <- function(matrix) cor(var(matrix)) is perfectly valid and allows one to use two dimension-level 3 functions - the correlation of the variance-covariance matrix in this case).

The between.groups argument runs the disparity between groups rather within groups. If between.groups = TRUE, the disparity will be calculated using the following inputs:

if the input is an output from custom.subsets, the series are run in a pairwise manner using metric(matrix, matrix2). For example for a custom.subset contains 3 subsets m1, m2 and m3, the code loops through: metric(m1, m2), metric(m2, m3) and metric(m1, m3) (looping through list(c(1,2), c(2,3), c(3,1))).
if the input is an output from chrono.subsets, the series are run in a paired series manner using metric(matrix, matrix2). For example for a chrono.subsets contains 3 subsets m1, m2, m3 and m4, the code loops through: metric(m1, m2) and metric(m2, m3) (looping through list(c(1,2), c(2,3), c(3,4))).

In both cases it is also possible to specify the input directly by providing the list to loop through. For example using between.groups = list(c(1,2), c(2,1), c(4,8)) will apply the metric to the 1st and 2nd subsets, the 2nd and first and the 4th and 8th (in that specific order).

Value

This function outputs a dispRity object containing at least the following:

`matrix`	the multidimensional space (a list of `matrix`).
`call`	A `list` containing the called arguments.
`subsets`	A `list` containing matrices pointing to the elements present in each subsets.
`disparity`	A `list` containing the disparity in each subsets.

Use summary.dispRity to summarise the dispRity object.

Author(s)

Thomas Guillerme

Examples

## Load the Beck & Lee 2014 data
data(BeckLee_mat50)

## Calculating the disparity as the sum of variances from a single matrix
sum_of_variances <- dispRity(BeckLee_mat50, metric = c(sum, variances))
summary(sum_of_variances)
## Bootstrapping this value
bootstrapped_data <- boot.matrix(BeckLee_mat50, bootstraps = 100)
dispRity(bootstrapped_data, metric = c(sum, variances))

## Calculating the disparity from a customised subset
## Generating the subsets
customised_subsets <- custom.subsets(BeckLee_mat50,
     list(group1 = 1:(nrow(BeckLee_mat50)/2),
          group2 = (nrow(BeckLee_mat50)/2):nrow(BeckLee_mat50)))
## Bootstrapping the data
bootstrapped_data <- boot.matrix(customised_subsets, bootstraps = 100)
## Calculating the sum of variances
sum_of_variances <- dispRity(bootstrapped_data, metric = c(sum, variances))
summary(sum_of_variances)

## Calculating disparity with different metrics of different dimension-levels
## Disparity is calculated as the distribution of the variances in each
## dimension (output are distributions)
disparity_level2 <- dispRity(BeckLee_mat50, metric = variances)
## Disparity is calculated as the mean of the variances in each dimension 
## (output are single values)
disparity_level1 <- dispRity(disparity_level2, metric = mean)
## Both disparities have the same means but dimension-level 1 has no quantiles
summary(disparity_level2)
summary(disparity_level1)


## Load the Beck & Lee 2014 data
data(BeckLee_mat50)

## Calculating the disparity as the sum of variances from a single matrix
sum_of_variances <- dispRity(BeckLee_mat50, metric = c(sum, variances))
summary(sum_of_variances)
## Bootstrapping this value
bootstrapped_data <- boot.matrix(BeckLee_mat50, bootstraps = 100)
dispRity(bootstrapped_data, metric = c(sum, variances))

## Calculating the disparity from a customised subset
## Generating the subsets
customised_subsets <- custom.subsets(BeckLee_mat50,
     list(group1 = 1:(nrow(BeckLee_mat50)/2),
          group2 = (nrow(BeckLee_mat50)/2):nrow(BeckLee_mat50)))
## Bootstrapping the data
bootstrapped_data <- boot.matrix(customised_subsets, bootstraps = 100)
## Calculating the sum of variances
sum_of_variances <- dispRity(bootstrapped_data, metric = c(sum, variances))
summary(sum_of_variances)

## Calculating disparity with different metrics of different dimension-levels
## Disparity is calculated as the distribution of the variances in each
## dimension (output are distributions)
disparity_level2 <- dispRity(BeckLee_mat50, metric = variances)
## Disparity is calculated as the mean of the variances in each dimension 
## (output are single values)
disparity_level1 <- dispRity(disparity_level2, metric = mean)
## Both disparities have the same means but dimension-level 1 has no quantiles
summary(disparity_level2)
summary(disparity_level1)

Covar projection analyses wrapper

Description

Wrapper function for a covar projection analyses on dispRity objects

Usage

dispRity.covar.projections(
  data,
  type,
  base,
  sample,
  n,
  major.axis = 1,
  level = 0.95,
  output = c("position", "distance", "degree"),
  inc.base = FALSE,
  ...,
  verbose = FALSE
)
dispRity.covar.projections(
  data,
  type,
  base,
  sample,
  n,
  major.axis = 1,
  level = 0.95,
  output = c("position", "distance", "degree"),
  inc.base = FALSE,
  ...,
  verbose = FALSE
)

Arguments

`data`	a `dispRity` object containing a `$covar` component(e.g. from `MCMCglmm.subsets`)
`type`	either `"groups"` for the projections between groups or `"elements"` for the projections of elements onto groups.
`base`	optional, a specific group to project the elements or the groups onto or a list of pairs of groups to compare (see `between.groups` argument in `dispRity`). If left empty, the groups are projected onto each other in a pairwise manner and the elements are projected onto their respective groups.
`sample`	optional, one or more specific posterior sample IDs (is ignored if n is used) or a function to summarise all axes.
`n`	optional, a random number of covariance matrices to sample (if left empty, all are used).
`major.axis`	which major axis to use (default is `1`; see `axis.covar` for more details).
`level`	the confidence interval to estimate the major axis (default is `0.95`; see `axis.covar` for more details)).
`output`	which values to output from the projection. By default, the three values `c("position", "distance", "degree")` are used to respectively output the projection, rejection and angle values (see `projections` for more details). The argument `"orthogonality"` can also be added to this vector.
`inc.base`	logical, when using `type = "elements"` with a supplied `base` argument, whether to also calculate the projections for the base group (`TRUE`) or not (`FALSE`; default).
`...`	any optional arguments to pass to `projections` (such as `centre` or `abs`). NOTE that this function uses by default `centre = TRUE` and `abs = TRUE` which are not the defaults for `projections`.
`verbose`	logical, whether to be verbose (`TRUE`) or not (`FALSE`, default).

Details

Effectively, the wrapper runs either of the following function (simplified here):

if type = "groups": dispRity(data, metric = as.covar(projections.between), between.groups = TRUE, ) for the projections group in data onto each other.
if type = "elements": dispRity(data, metric = as.covar(projections), ...) for the projections of each element in data onto their main axis.

If base is specified:

type = "groups" will run pairs elements each subset and base (instead of the full pairwise analyses).
type = "elements" will run the projection of each subset onto the major axis from base rather than its own.

Value

A list of class "dispRity" and "projection" which contains dispRity objects corresponding to each projection value from output. The elements of the list can be accessed and analysed individually by selecting them by name (e.g. output$position) or by ID (e.g. output[[1]]). Alternatively, the list can be summarised and plotted using summary.dispRity plot.dispRity.

Author(s)

Thomas Guillerme

References

Guillerme T, Bright JA, Cooney CR, Hughes EC, Varley ZK, Cooper N, Beckerman AP, Thomas GH. 2023. Innovation and elaboration on the avian tree of life. Science Advances. 9(43):eadg1641.

Examples

data(charadriiformes)

## Creating a dispRity object with a covar component
my_covar <-MCMCglmm.subsets(
                 data       = charadriiformes$data,
                 posteriors = charadriiformes$posteriors,
                 tree       = charadriiformes$tree,
                 group      = MCMCglmm.levels(
                                 charadriiformes$posteriors)[1:4],
                 rename.groups = c("gulls", "plovers", "sandpipers", "phylo"))

## Running a projection analyses between groups (on 100 random samples)
between_groups <- dispRity.covar.projections(my_covar, type = "groups", base = "phylo", n = 100)
## Summarising the results
summary(between_groups)

## Measuring the projection of the elements on their own average major axis
elements_proj <- dispRity.covar.projections(my_covar, type = "elements", sample = mean,
                                            output = c("position", "distance"))
## Visualising the results
plot(elements_proj)

## Visualising the correlation
plot(elements_proj, speicfic.args = list(correlation.plot = c("position", "distance")))

data(charadriiformes)

## Creating a dispRity object with a covar component
my_covar <-MCMCglmm.subsets(
                 data       = charadriiformes$data,
                 posteriors = charadriiformes$posteriors,
                 tree       = charadriiformes$tree,
                 group      = MCMCglmm.levels(
                                 charadriiformes$posteriors)[1:4],
                 rename.groups = c("gulls", "plovers", "sandpipers", "phylo"))

## Running a projection analyses between groups (on 100 random samples)
between_groups <- dispRity.covar.projections(my_covar, type = "groups", base = "phylo", n = 100)
## Summarising the results
summary(between_groups)

## Measuring the projection of the elements on their own average major axis
elements_proj <- dispRity.covar.projections(my_covar, type = "elements", sample = mean,
                                            output = c("position", "distance"))
## Visualising the results
plot(elements_proj)

## Visualising the correlation
plot(elements_proj, speicfic.args = list(correlation.plot = c("position", "distance")))

Fast dispRity

Description

Fast disparity calculations. THIS FUNCTION IS LESS SAFE TO USE THAN dispRity (see details).

Usage

dispRity.fast(group, space, metric, ...)
dispRity.fast(group, space, metric, ...)

Arguments

`group`	a logical vector for grouping
`space`	a matrix
`metric`	a metric dispRity style (up to two levels)
`...`	any additional arguments to be passed to `metric`

Details

IN DOUBT, USE dispRity INSTEAD OF THIS FUNCTION. This function should only be used in very specific cases requiring advanced optimisation or embedded customised functions. This function is simply applying metric(space[group,]) for each group and returns a list of results. It does not check the validity of the data, metric and groups. It does not handle specific data (e.g. multiple matrices), specific metrics (e.g. no optional arguments), does not produce meaningful error messages and does not intake nor returns a dispRity object.

Author(s)

Thomas Guillerme

Examples

## A random space
space <- matrix(rnorm(25), 5, 5)

## A metric
metric <- c(sum, variances)

## A group of four observations
group <- c(TRUE, TRUE, TRUE, TRUE, FALSE)

## The disparity
dispRity.fast(group, space, metric)


## A random space
space <- matrix(rnorm(25), 5, 5)

## A metric
metric <- c(sum, variances)

## A group of four observations
group <- c(TRUE, TRUE, TRUE, TRUE, FALSE)

## The disparity
dispRity.fast(group, space, metric)

Disparity metrics

Description

Different implemented disparity metrics.

Usage

dimension.level3.fun(matrix, ...)
dimension.level2.fun(matrix, ...)
dimension.level1.fun(matrix, ...)
between.groups.fun(matrix, matrix2, ...)
dimension.level3.fun(matrix, ...)
dimension.level2.fun(matrix, ...)
dimension.level1.fun(matrix, ...)
between.groups.fun(matrix, matrix2, ...)

Arguments

`matrix`	A matrix.
`...`	Optional arguments to be passed to the function. Usual optional arguments are `method` for specifying the method for calculating distance passed to `vegdist` (e.g. `method = "euclidean"` - default - or `method = "manhattan"`) or `k.root` to scale the result using the $kth$ root. See details below for available optional arguments for each function.
`matrix2`	Optional, a second matrix for metrics between groups.

Details

These are inbuilt functions for calculating disparity. See make.metric for details on dimension.level3.fun, dimension.level2.fun, dimension.level1.fun and between.groups.fun. The dimensions levels (1, 2 and 3) can be seen as similar to ranks in linear algebra.

The currently implemented dimension-level 1 metrics are:

convhull.volume: calculates the convex hull hypervolume of a matrix (calls convhulln(x, options = "FA")$vol).
- Both convhull functions call the convhulln function with the "FA" option (computes total area and volume).
- WARNING: both convhull functions can be computationally intensive above 10 dimensions!
convhull.surface: calculates the convex hull hypersurface of a matrix (calls convhulln(x, options = "FA")$area).
diagonal: calculates the longest distance in the ordinated space.
- WARNING: This function is the generalisation of Pythagoras' theorem and thus works only if each dimensions are orthogonal to each other.
ellipsoid.volume: calculates the ellipsoid volume of a matrix. This function tries to determine the nature of the input matrix and uses one of these following methods to calculate the volume. You can always specify the method using method = "my_choice" to overrun the automatic method choice.
- "eigen": this method directly calculates the eigen values from the input matrix (using eigen). This method is automatically selected if the input matrix is "distance like" (i.e. square with two mirrored triangles and a diagonal).
- "pca": this method calculates the eigen values as the sum of the variances of the matrix (abs(apply(var(matrix),2, sum))). This is automatically selected if the input matrix is NOT "distance like". Note that this method is faster than "eigen" but only works if the input matrix is an ordinated matrix from a PCA, PCO, PCoA, NMDS or MDS.
- "axes": this method calculates the actual semi axes length using the input matrix. It is never automatically selected. By default this method calculates the length of the major axes based on the 0.95 confidence interval ellipse but this can be modified by providing additional arguments from axis.covar.
- <a numeric vector>: finally, you can directly provide a numeric vector of eigen values. This method is never automatically selected and overrides any other options.
func.div: The functional divergence (Villeger et al. 2008): the ratio of deviation from the centroid (this is similar to FD::dbFD()$FDiv).
func.eve: The functional evenness (Villeger et al. 2008): the minimal spanning tree distances evenness (this is similar to FD::dbFD()$FEve). If the matrix used is not a distance matrix, the distance method can be passed using, for example method = "euclidean" (default).
mode.val: calculates the modal value of a vector.
n.ball.volume: calculate the volume of the minimum n-ball (if sphere = TRUE) or of the ellipsoid (if sphere = FALSE).
roundness: calculate the roundness of an elliptical representation of a variance-covariance matrix as the integral of the ranked distribution of the major axes. A value of 1 indicates a sphere, a value between 1 and 0.5 indicates a more pancake like representation and a value between 0.5 and 0 a more cigar like representation. You can force the variance-covariance calculation by using the option vcv = TRUE (default) that will calculate the variance-covariance matrix if the input is not one.

See also mean, median, sum or prod for commonly used summary metrics.

The currently implemented dimension-level 2 metrics are:

ancestral.dist: calculates the distance between each elements coordinates in the matrix and their ancestors' coordinates (if to.root = FALSE; default) or to the root coordinates (if to.root = TRUE) for a given tree. The distance is calculate as Euclidean by default but can be changed through the methods argument (method = "euclidean"; default). Note that the matrix must contain data for both tips and nodes in the tree, otherwise you must provide a matrix to the argument reference.data that contains them. Note that if the function is used in dispRity, both the tree and reference.data can be automatically recycled from the dispRity object (if present).
angles: calculates the angles of the main axis of variation per dimension in a matrix. The angles are calculated using the least square algorithm from the lm function. The unit of the angle can be changed through the unit argument (either "degree" (default), radian or slope) and a base angle to measure the angle from can be passed through the base argument (by default base = 0, measuring the angle from the horizontal line (note that the base argument has to be passed in the same unit as unit). When estimating the slope through lm, you can use the option significant to only consider significant slopes (TRUE) or not (FALSE - default).
centroids: calculates the distance between each row and the centroid of the matrix (Laliberte 2010). This function can take an optional arguments centroid for defining the centroid (if missing (default), the centroid of the matrix is used). This argument can be either a subset of coordinates matching the matrix's dimensions (e.g. c(0, 1, 2) for a matrix with three columns) or a single value to be the coordinates of the centroid (e.g. centroid = 0 will set the centroid coordinates to c(0, 0, 0) for a three dimensional matrix). NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
count.neighbours: counts the number of other elements neigbhouring each element within a certain radius. This function can take the optional arguments radius that is the radius for counting the neighbours. This can be either missing (by default this is half the longest distance), a function to calculate the distance taking x as the sole argument (e.g. sd or function(x) sum(x, na.rm = TRUE)/length(x)) or a numeric or integer value. The other option is relative to make the counts relative to the number of elements (relative = TRUE; default) or not (relative = FALSE). NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
deviations: calculates the minimal Euclidean distance between each element in and the hyperplane (or line if 2D, or a plane if 3D). You can specify equation of hyperplane of d dimensions in the $intercept + ax + by + ... + nd = 0$ format. For example the line $y = 3x + 1$ should be entered as c(1, 3, -1) or the plane $x + 2y - 3z = 44$ as c(44, 1, 2, -3). If missing the hyperplane (default) is calculated using a least square regression using a gaussian glm. Extra arguments can be passed to glm through .... When estimating the hyperplane, you can use the option significant to only consider significant slopes (TRUE) or not (FALSE - default).
displacements: calculates the ratio between the distance to the centroid (see centroids above) and the distance from a reference (by default the origin of the space). The reference can be changed through the reference argument. NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
edge.length.tree: calculates the edge length from a given tree for each elements present in the matrix. Each edge length is either measured between the element and the root of the tree (to.root = TRUE ; default) or between the element and its last ancestor (to.root = FALSE))
neighbours: calculates the distance to a neighbour (Foote 1990). By default this is the distance to the nearest neighbour (which = min) but can be set to any dimension level - 1 function (e.g. which = mean gives the distance to the most average neighbour). NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
pairwise.dist: calculates the pairwise distance between elements - calls vegdist(matrix, method = method, diag = FALSE, upper = FALSE, ...). The distance type can be changed via the method argument (see vegdist - default: method = "euclidean"). This function outputs a vector of pairwise comparisons in the following order: d(A,B), d(A,C), d(B,C) for three elements A, B and C. NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
projections: projects each element on a vector defined as (point1, point2) and measures some aspect of this projection. The different aspects that can be measured are:
- measure = "position" (default), the distance of each element on the vector (point1, point2). Negative values means the element projects on the opposite direction of the vector (point1, point2).
- measure = "distance", the euclidean distance of each element from the vector (point1, point2).
- measure = "degree", the angle between the vector (point1, point2) and any vector (point1, element) in degrees.
- measure = "radian", the angle between the vector (point1, point2) and any vector (point1, element) in radians.
- measure = "orthogonality", the angle between the vector (point1, point2) and any vector (point1, element) expressed in right angle ranging between 0 (non angle) and 1 (right angle).
By default, point1 is the centre of the space (coordinates 0, 0, 0, ...) and point2 is the centroid of the space (coordinates colMeans(matrix)). Coordinates for point1 and point2 can be given as a single value to be repeated (e.g. point1 = 1 is translated into point1 = c(1, 1, ...)) or a specific set of coordinates. Furthermore, by default, the space is scaled so that the vector (point1, point2) becomes the unit vector (distance (point1, point2) is set to 1; option scale = TRUE; default). You can use the unit vector of the space using the option scale = FALSE. Other options include the centering of the projections on 0.5 (centre = TRUE; default is set to FALSE) ranging the projection onto the vector (point1, point2) between -1 and 1 (higher or lower values project beyond the vector); and whether to output the projection values as absolute values (abs = FALSE; default is set to FALSE). These two last options only affect the results from measure = "position".
projections.tree: calculates the projections metric but drawing the vectors from a phylogenetic tree. This metric can intake any argument from projections (see above) but for point1 and point2 that are replaced by the argument type. type is a vector or a list of two elements that designates which vector to draw and can be any pair of the following options (the first element being the origin of the vector and the second where the vector points to):
- "root": the root of the tree (the first element in tree$node.label);
- "ancestor": the element's most recent ancestor;
- "tips": the centroid of the tips;
- "nodes": the centroid of all the nodes;
- "livings": the centroid of the tips the furthest from the root;
- "fossils": the centroid of all the tips that are not the furthest from the root;
- any numeric values that can be interpreted as point1 and point2 in projections;
- or a user defined function that with the inputs matrix and tree and row (the element's ID, i.e. the row number in matrix).
NOTE: the elements to calculate the origin and end points of the vector are calculated by default on the provided input matrix which can be missing data from the tree if used with custom.subsets or chrono.subsets. You can always provide the full matrix using the option reference.data = my_matrix. Additional arguments include any arguments to be passed to projections (e.g. centre or abs).
quantiles: calculates the quantile range of each axis of the matrix. The quantile can be changed using the quantile argument (default is quantile = 95, i.e. calculating the range on each axis that includes 95% of the data). An optional argument, k.root, can be set to TRUE to scale the ranges by using its $kth$ root (where $k$ are the number of dimensions). By default, k.root = FALSE.
radius: calculates a distance from the centre of each axis. The type argument is the function to select which distance to calculate. By default type = max calculates the maximum distance between the elements and the centre for each axis (i.e. the radius for each dimensions)
ranges: calculates the range of each axis of the matrix (Wills 2001). An optional argument, k.root, can be set to TRUE to scale the ranges by using its $kth$ root (where $k$ are the number of dimensions). By default, k.root = FALSE.
variances: calculates the variance of each axis of the matrix (Wills 2001). This function can also take the k.root optional argument described above.
span.tree.length: calculates the length of the minimum spanning tree (see spantree). This function can get slow with big matrices. To speed it up, one can directly use distance matrices as the multidimensional space.

The currently implemented between.groups metrics are:

disalignment: calculates the rejection of a point from matrix from the major axis of matrix2. Options are, axis to choose which major axis to reject from (default is axis = 1); level for the ellipse' confidence interval (to calculate the axis) (default is level = 0.95) and point.to.reject, a numeric value for designating which point in matrix to use or a function for calculating it (default is point.to.reject = colMeans for matrix's centroid).
group.dist: calculates the distance between two groups (by default, this is the minimum euclidean vector norm distance between groups). Negative distances are considered as 0. This function must intake two matrices (matrix and matrix2) and the quantiles to consider. For the minimum distance between two groups, the 100th quantiles are considered (default: probs = c(0,1)) but this can be changed to any values (e.g. distance between the two groups accounting based on the 95th CI: probs = c(0.025, 0.975); distance between centroids: probs = c(0.5), etc...). This function is the linear algebra equivalent of the hypervolume::hypervolume_distance function.
point.dist: calculates the distance between matrix and a point calculated from matrix2. By default, this point is the centroid of matrix2. This can be changed by passing a function to be applied to matrix2 through the point argument (for example, for the centroid: point.dist(..., point = colMeans)). NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
projections.between: calculates the projection of the major axis between two matrices. It allows the same arguments as projections. This function measures the major axis from both input matrices, centre their origins and projects the end of the vector of matrix onto the vector from matrix2. Which axis to measure can be changed with the option axis (for the major axis, axis = 1; default) and the confidence interval can be changed using level (for the 95 confidence interval, level = 0.95; default - see axis.covar for more details).

When used in the dispRity function, optional arguments are declared after the metric argument: for example dispRity(data, metric = centroids, centroid = 0, method = "manhattan")

Author(s)

Thomas Guillerme

References

Donohue I, Petchey OL, Montoya JM, Jackson AL, McNally L, Viana M, Healy K, Lurgi M, O'Connor NE, Emmerson MC. 2013. On the dimensionality of ecological stability. Ecology letters. 16(4):421-9.

Lalibert'e E, Legendre P. 2010. A distance-based framework for measuring functional diversity from multiple traits. Ecology, 91(1), pp.299-305.

Vill'eger S, Mason NW, Mouillot D. 2008. New multidimensional functional diversity indices for a multifaceted framework in functional ecology. Ecology. 89(8):2290-301.

Wills MA. 2001. Morphological disparity: a primer. In Fossils, phylogeny, and form (pp. 55-144). Springer, Boston, MA.

Foote, M. 1990. Nearest-neighbor analysis of trilobite morphospace. Systematic Zoology, 39(4), pp.371-382.

Guillerme T, Puttick MN, Marcy AE, Weisbecker V. 2020. Shifting spaces: Which disparity or dissimilarity measurement best summarize occupancy in multidimensional spaces?. Ecology and evolution. 10(14):7261-75.

Examples

## A random matrix
dummy_matrix <- matrix(rnorm(90), 9, 10)

## ancestral.dist
## A random tree with node labels
rand_tree <- rtree(5) ; rand_tree$node.label <- paste0("n", 1:4)
## Adding the tip and node names to the matris
rownames(dummy_matrix) <- c(rand_tree$tip.label, rand_tree$node.label)
## Calculating the distances to the ancestors
ancestral.dist(dummy_matrix, tree = rand_tree)
## Calculating the manhattan distances to the root
ancestral.dist(dummy_matrix, tree = rand_tree,
               to.root = TRUE, method = "manhattan")

## angles
## The angles in degrees of each axis
angles(dummy_matrix)
## The angles in slope from the 1:1 slope (Beta = 1)
angles(dummy_matrix, unit = "slope", base = 1)

## centroids
## Distances between each row and centroid of the matrix
centroids(dummy_matrix)
## Distances between each row and an arbitrary point
centroids(dummy_matrix, centroid = c(1,2,3,4,5,6,7,8,9,10))
## Distances between each row and the origin
centroids(dummy_matrix, centroid = 0)

## convhull.surface
## Making a matrix with more elements than dimensions (for convhull)
thinner_matrix <- matrix(rnorm(90), 18, 5)
## Convex hull hypersurface of a matrix
convhull.surface(thinner_matrix)

## convhull.volume
## Convex hull volume of a matrix
convhull.volume(thinner_matrix)

## count.neigbhours
## Counting the number of neighbours within a radius of half the traitspace
count.neighbours(dummy_matrix)
## The absolute number of neighbours within a radius of 3
count.neighbours(dummy_matrix, radius = 3, relative = FALSE)
## The relative number of neighbours within a radius of one standard deviation
count.neighbours(dummy_matrix, radius = sd, relative = FALSE)

## deviations
## The deviations from the least square hyperplane
deviations(dummy_matrix)
## The deviations from the plane between the x and y axis
deviations(dummy_matrix, hyperplane = c(0,1,1,0,0,0,0,0,0,0,0))

## diagonal
## Matrix diagonal
diagonal(dummy_matrix) # WARNING: only valid if the dimensions are orthogonal

## disalignment
## Two dummy matrices
matrix_1 <- matrix(rnorm(16), 4, 4)
matrix_2 <- matrix(rnorm(16), 4, 4)
## Measuring the disalignment of matrix_1 from matrix_2
disalignment(matrix_1, matrix_2)
## Same but using the 2nd major axis of the 0.75 CI ellipse
## from matrix_2 and the first point from matrix_1.
disalignment(matrix_1, matrix_2,
             axis = 2, level = 0.75,
             point.to.reject = 1)

## displacements
## displacement ratios (from the centre)
displacements(dummy_matrix)
## displacement ratios (from an arbitrary point)
displacements(dummy_matrix, reference = c(1,2,3,4,5,6,7,8,9,10))
## displacement ratios from the centre (manhattan distance)
displacements(dummy_matrix, method = "manhattan")

## edge.length.tree
## Making a dummy tree with node labels
dummy_tree <- makeNodeLabel(rtree((nrow(dummy_matrix)/2)+1))
## Naming the elements in the matrix
named_matrix <- dummy_matrix
rownames(named_matrix) <- c(dummy_tree$tip.label,
                            dummy_tree$node.label)
## The total edge length of each element in the matrix (to the root)
edge.length.tree(named_matrix, tree = dummy_tree)

## The edge lengths for each edge leading to the elements in the matrix
edge.length.tree(named_matrix, tree = dummy_tree, to.root = FALSE)

## ellipsoid.volume
## Ellipsoid volume of a matrix
ellipsoid.volume(dummy_matrix)
## Calculating the same volume with provided eigen values
ordination <- prcomp(dummy_matrix)
## Calculating the ellipsoid volume by providing your own eigen values
ellipsoid.volume(ordination$x, method = ordination$sdev^2)

## func.div
## Functional divergence
func.div(dummy_matrix)

## func.eve
## Functional evenness
func.eve(dummy_matrix) 
## Functional evenness (based on manhattan distances)
func.eve(dummy_matrix, method = "manhattan")

## group.dist
## The distance between groups
dummy_matrix2 <- matrix(runif(40, min = 2, max = 4), 4, 10)
## The minimum distance between both groups
group.dist(dummy_matrix, dummy_matrix2)
## The distance between both groups' centroids
group.dist(dummy_matrix, dummy_matrix2, probs = 0.5)
## The minimum distance between the 50% CI of each group
group.dist(dummy_matrix, dummy_matrix2, probs = c(0.25, 0.75))

## mode.val
## Modal value of a vector
mode.val(dummy_matrix)

## neighbours
## The nearest neighbour euclidean distances
neighbours(dummy_matrix)
## The furthest neighbour manhattan distances
neighbours(dummy_matrix, which = max, method = "manhattan")

## pairwise.dist
## The pairwise distance
pairwise.dist(dummy_matrix)
## The average squared pairwise distance
mean(pairwise.dist(dummy_matrix)^2)
## equal to:
# geiger::disparity(data = dummy_matrix)

## point.dist
## The distances from the rows dummy_matrix
## to the centroids of dummy_matrix2
point.dist(dummy_matrix, dummy_matrix2)
## The average distances from dummy_matrix
## to the centroids of dummy_matrix2
mean(point.dist(dummy_matrix, dummy_matrix2))
## The manhattan distance from the rows dummy_matrix
## to the standard deviation of dummy_matrix2
point.dist(dummy_matrix, dummy_matrix2, point = sd, method = "manhattan")

## projections
## The distances on the vector defined from the centre of
## the matrix to its centroid (default)
projections(dummy_matrix)
## The distances from the vector defined from the third
## element of the matrix to the point of coordinated
## c(1,1,1, ...) the matrix to its centroid (default)
projections(dummy_matrix, measure = "distance",
            point1 = dummy_matrix[3, ],
            point2 = 1)

## projections.tree
## Making a dummy tree with node labels
dummy_tree <- makeNodeLabel(rtree((nrow(dummy_matrix)/2)+1))
## Naming the elements in the matrix
named_matrix <- dummy_matrix
rownames(named_matrix) <- c(dummy_tree$tip.label,
                            dummy_tree$node.label)
## The projection on the vector defined from the root of
## the tree to the ancestor of each element in the matrix
projections.tree(named_matrix, dummy_tree,
                  type = c("root", "ancestor"))
## The rejection from the vector defined from the centroid
## of the nodes to the centroids of the tips
projections.tree(named_matrix, dummy_tree,
                  type = c("nodes", "tips"),
                  measure = "distance")
## A user function that define coordinates based on the 
## centroid of the three first nodes
user.fun <- function(matrix, tree, row = NULL) {
     return(colMeans(matrix[tree$node.label[1:3], ]))
}
## The projection on the vector defined by the coordinates
## 0,0,0 and a user defined function
projections.tree(named_matrix, dummy_tree,
                  type = c(0, user.fun))

## projections.between
## Two dummy matrices
matrix_1 <- matrix(rnorm(16), 4, 4)
matrix_2 <- matrix(rnorm(16), 4, 4)
## Projecting the major axis of matrix_2 onto the one from matrix_1
projections.between(matrix_1, matrix_2)
## Projecting both second major 0.75 axes
## and getting the rejections (see projections() for option details)
projections.between(matrix_1, matrix_2,
                    measure = "distance",
                    axis = 2, level = 0.75)

## quantiles
## The 95 quantiles
quantiles(dummy_matrix)
## The 100 quantiles (which are equal to the ranges)
quantiles(dummy_matrix, quantile = 100) == ranges(dummy_matrix) # All TRUE

## radius
## The maximal radius of each axis (maximum distance from centre of each axis)
radius(dummy_matrix)

## ranges
## ranges of each column in a matrix
ranges(dummy_matrix)
## ranges of each column in the matrix corrected using the kth root
ranges(dummy_matrix, k.root = TRUE)

## roundness
## calculating the variance-covariance of the dummy_matrix
vcv <- var(dummy_matrix)
## calculating the roundness of it
roundness(vcv)
## calculating the roundness of the dummy matrix by calculating the vcv
roundness(dummy_matrix, vcv = TRUE)

## span.tree.length
## Minimum spanning tree length (default)
span.tree.length(dummy_matrix)
## Minimum spanning tree length from a distance matrix (faster)
distance <- as.matrix(dist(dummy_matrix))
span.tree.length(distance)
## Minimum spanning tree length based on Manhattan distance
span.tree.length(dummy_matrix, method = "manhattan")
span.tree.length(as.matrix(dist(dummy_matrix, method = "manhattan"))) # Same

## variances
## variances of a each column in the matrix
variances(dummy_matrix)
## variances of a each column in the matrix corrected using the kth root
variances(dummy_matrix, k.root = TRUE)


## A random matrix
dummy_matrix <- matrix(rnorm(90), 9, 10)

## ancestral.dist
## A random tree with node labels
rand_tree <- rtree(5) ; rand_tree$node.label <- paste0("n", 1:4)
## Adding the tip and node names to the matris
rownames(dummy_matrix) <- c(rand_tree$tip.label, rand_tree$node.label)
## Calculating the distances to the ancestors
ancestral.dist(dummy_matrix, tree = rand_tree)
## Calculating the manhattan distances to the root
ancestral.dist(dummy_matrix, tree = rand_tree,
               to.root = TRUE, method = "manhattan")

## angles
## The angles in degrees of each axis
angles(dummy_matrix)
## The angles in slope from the 1:1 slope (Beta = 1)
angles(dummy_matrix, unit = "slope", base = 1)

## centroids
## Distances between each row and centroid of the matrix
centroids(dummy_matrix)
## Distances between each row and an arbitrary point
centroids(dummy_matrix, centroid = c(1,2,3,4,5,6,7,8,9,10))
## Distances between each row and the origin
centroids(dummy_matrix, centroid = 0)

## convhull.surface
## Making a matrix with more elements than dimensions (for convhull)
thinner_matrix <- matrix(rnorm(90), 18, 5)
## Convex hull hypersurface of a matrix
convhull.surface(thinner_matrix)

## convhull.volume
## Convex hull volume of a matrix
convhull.volume(thinner_matrix)

## count.neigbhours
## Counting the number of neighbours within a radius of half the traitspace
count.neighbours(dummy_matrix)
## The absolute number of neighbours within a radius of 3
count.neighbours(dummy_matrix, radius = 3, relative = FALSE)
## The relative number of neighbours within a radius of one standard deviation
count.neighbours(dummy_matrix, radius = sd, relative = FALSE)

## deviations
## The deviations from the least square hyperplane
deviations(dummy_matrix)
## The deviations from the plane between the x and y axis
deviations(dummy_matrix, hyperplane = c(0,1,1,0,0,0,0,0,0,0,0))

## diagonal
## Matrix diagonal
diagonal(dummy_matrix) # WARNING: only valid if the dimensions are orthogonal

## disalignment
## Two dummy matrices
matrix_1 <- matrix(rnorm(16), 4, 4)
matrix_2 <- matrix(rnorm(16), 4, 4)
## Measuring the disalignment of matrix_1 from matrix_2
disalignment(matrix_1, matrix_2)
## Same but using the 2nd major axis of the 0.75 CI ellipse
## from matrix_2 and the first point from matrix_1.
disalignment(matrix_1, matrix_2,
             axis = 2, level = 0.75,
             point.to.reject = 1)

## displacements
## displacement ratios (from the centre)
displacements(dummy_matrix)
## displacement ratios (from an arbitrary point)
displacements(dummy_matrix, reference = c(1,2,3,4,5,6,7,8,9,10))
## displacement ratios from the centre (manhattan distance)
displacements(dummy_matrix, method = "manhattan")

## edge.length.tree
## Making a dummy tree with node labels
dummy_tree <- makeNodeLabel(rtree((nrow(dummy_matrix)/2)+1))
## Naming the elements in the matrix
named_matrix <- dummy_matrix
rownames(named_matrix) <- c(dummy_tree$tip.label,
                            dummy_tree$node.label)
## The total edge length of each element in the matrix (to the root)
edge.length.tree(named_matrix, tree = dummy_tree)

## The edge lengths for each edge leading to the elements in the matrix
edge.length.tree(named_matrix, tree = dummy_tree, to.root = FALSE)

## ellipsoid.volume
## Ellipsoid volume of a matrix
ellipsoid.volume(dummy_matrix)
## Calculating the same volume with provided eigen values
ordination <- prcomp(dummy_matrix)
## Calculating the ellipsoid volume by providing your own eigen values
ellipsoid.volume(ordination$x, method = ordination$sdev^2)

## func.div
## Functional divergence
func.div(dummy_matrix)

## func.eve
## Functional evenness
func.eve(dummy_matrix) 
## Functional evenness (based on manhattan distances)
func.eve(dummy_matrix, method = "manhattan")

## group.dist
## The distance between groups
dummy_matrix2 <- matrix(runif(40, min = 2, max = 4), 4, 10)
## The minimum distance between both groups
group.dist(dummy_matrix, dummy_matrix2)
## The distance between both groups' centroids
group.dist(dummy_matrix, dummy_matrix2, probs = 0.5)
## The minimum distance between the 50% CI of each group
group.dist(dummy_matrix, dummy_matrix2, probs = c(0.25, 0.75))

## mode.val
## Modal value of a vector
mode.val(dummy_matrix)

## neighbours
## The nearest neighbour euclidean distances
neighbours(dummy_matrix)
## The furthest neighbour manhattan distances
neighbours(dummy_matrix, which = max, method = "manhattan")

## pairwise.dist
## The pairwise distance
pairwise.dist(dummy_matrix)
## The average squared pairwise distance
mean(pairwise.dist(dummy_matrix)^2)
## equal to:
# geiger::disparity(data = dummy_matrix)

## point.dist
## The distances from the rows dummy_matrix
## to the centroids of dummy_matrix2
point.dist(dummy_matrix, dummy_matrix2)
## The average distances from dummy_matrix
## to the centroids of dummy_matrix2
mean(point.dist(dummy_matrix, dummy_matrix2))
## The manhattan distance from the rows dummy_matrix
## to the standard deviation of dummy_matrix2
point.dist(dummy_matrix, dummy_matrix2, point = sd, method = "manhattan")

## projections
## The distances on the vector defined from the centre of
## the matrix to its centroid (default)
projections(dummy_matrix)
## The distances from the vector defined from the third
## element of the matrix to the point of coordinated
## c(1,1,1, ...) the matrix to its centroid (default)
projections(dummy_matrix, measure = "distance",
            point1 = dummy_matrix[3, ],
            point2 = 1)

## projections.tree
## Making a dummy tree with node labels
dummy_tree <- makeNodeLabel(rtree((nrow(dummy_matrix)/2)+1))
## Naming the elements in the matrix
named_matrix <- dummy_matrix
rownames(named_matrix) <- c(dummy_tree$tip.label,
                            dummy_tree$node.label)
## The projection on the vector defined from the root of
## the tree to the ancestor of each element in the matrix
projections.tree(named_matrix, dummy_tree,
                  type = c("root", "ancestor"))
## The rejection from the vector defined from the centroid
## of the nodes to the centroids of the tips
projections.tree(named_matrix, dummy_tree,
                  type = c("nodes", "tips"),
                  measure = "distance")
## A user function that define coordinates based on the 
## centroid of the three first nodes
user.fun <- function(matrix, tree, row = NULL) {
     return(colMeans(matrix[tree$node.label[1:3], ]))
}
## The projection on the vector defined by the coordinates
## 0,0,0 and a user defined function
projections.tree(named_matrix, dummy_tree,
                  type = c(0, user.fun))

## projections.between
## Two dummy matrices
matrix_1 <- matrix(rnorm(16), 4, 4)
matrix_2 <- matrix(rnorm(16), 4, 4)
## Projecting the major axis of matrix_2 onto the one from matrix_1
projections.between(matrix_1, matrix_2)
## Projecting both second major 0.75 axes
## and getting the rejections (see projections() for option details)
projections.between(matrix_1, matrix_2,
                    measure = "distance",
                    axis = 2, level = 0.75)

## quantiles
## The 95 quantiles
quantiles(dummy_matrix)
## The 100 quantiles (which are equal to the ranges)
quantiles(dummy_matrix, quantile = 100) == ranges(dummy_matrix) # All TRUE

## radius
## The maximal radius of each axis (maximum distance from centre of each axis)
radius(dummy_matrix)

## ranges
## ranges of each column in a matrix
ranges(dummy_matrix)
## ranges of each column in the matrix corrected using the kth root
ranges(dummy_matrix, k.root = TRUE)

## roundness
## calculating the variance-covariance of the dummy_matrix
vcv <- var(dummy_matrix)
## calculating the roundness of it
roundness(vcv)
## calculating the roundness of the dummy matrix by calculating the vcv
roundness(dummy_matrix, vcv = TRUE)

## span.tree.length
## Minimum spanning tree length (default)
span.tree.length(dummy_matrix)
## Minimum spanning tree length from a distance matrix (faster)
distance <- as.matrix(dist(dummy_matrix))
span.tree.length(distance)
## Minimum spanning tree length based on Manhattan distance
span.tree.length(dummy_matrix, method = "manhattan")
span.tree.length(as.matrix(dist(dummy_matrix, method = "manhattan"))) # Same

## variances
## variances of a each column in the matrix
variances(dummy_matrix)
## variances of a each column in the matrix corrected using the kth root
variances(dummy_matrix, k.root = TRUE)

Disparity in different groups.

Description

Performs a disparity analysis between groups.

Usage

dispRity.per.group(data, group, metric = c(median, centroids), ...)
dispRity.per.group(data, group, metric = c(median, centroids), ...)

Arguments

`data`	An ordinated `matrix`.
`group`	A `list` of row numbers for each group.
`metric`	A vector containing one to three functions (default = `c(median, centroids)`) (see `dispRity` for details).
`...`	Optional arguments to be passed to `custom.subsets`, `boot.matrix` and `dispRity`.

Details

Note that this is a wrapper function that allows users to run a basic disparity among groups analysis without too much effort. As such it has a lot of defaults described in the functions that make up the analysis. See custom.subsets, boot.matrix, dispRity.metric, summary.dispRity, plot.dispRity for more details of the defaults used in each of these functions. Note that any of these defaults can be changed within the disparity.through.time function.

Value

A dispRity object that can be passed to summary or plot.

Author(s)

Thomas Guillerme

Examples

## Load the Beck & Lee 2014 data
data(BeckLee_mat50)

## Run a simple disparity per group analysis comparing stem and crown mammals
result <- dispRity.per.group(BeckLee_mat50, list(crown = c(16, 19:41, 45:50),
                             stem = c(1:15, 17:18, 42:44)))
summary(result) ; plot(result)

## This is equivalent to run the following decomposed code
dispRity(boot.matrix(custom.subsets(BeckLee_mat50, list(crown = c(16, 19:41, 45:50),
                                                        stem = c(1:15, 17:18, 42:44))),
                     bootstraps = 100),
         metric = c(median, centroids))

## Load the Beck & Lee 2014 data
data(BeckLee_mat50)

## Run a simple disparity per group analysis comparing stem and crown mammals
result <- dispRity.per.group(BeckLee_mat50, list(crown = c(16, 19:41, 45:50),
                             stem = c(1:15, 17:18, 42:44)))
summary(result) ; plot(result)

## This is equivalent to run the following decomposed code
dispRity(boot.matrix(custom.subsets(BeckLee_mat50, list(crown = c(16, 19:41, 45:50),
                                                        stem = c(1:15, 17:18, 42:44))),
                     bootstraps = 100),
         metric = c(median, centroids))

Disparity through time.

Description

Performs a disparity through time analysis.

Usage

dispRity.through.time(data, tree, time, metric = c(median, centroids), ...)
dispRity.through.time(data, tree, time, metric = c(median, centroids), ...)

Arguments

`data`	An ordinated `matrix`.
`tree`	A `phylo` object.
`time`	A `numeric` value for the number of subsets to create.
`metric`	A vector containing one to three functions (default = `c(median, centroids)`) (see `dispRity` for details).
`...`	Optional arguments to be passed to `chrono.subsets`, `boot.matrix` and `dispRity`.

Details

By default the time subsets use method = "discrete", the matrix is bootstrapped 100 times.

Note that this is a wrapper function that allows users to run a basic disparity-through-time analysis without too much effort. As such it has a lot of defaults described in the functions that make up the analysis. See chrono.subsets, boot.matrix, dispRity.metric, summary.dispRity, plot.dispRity for more details of the defaults used in each of these functions. Note that any of these defaults can be changed within the disparity.through.time function.

Value

A dispRity object that can be passed to summary or plot.

Author(s)

Thomas Guillerme

Examples

## Load the Beck & Lee 2014 data
data(BeckLee_mat50) ; data(BeckLee_tree)

## Run a simple disparity through time analysis (with three time bins)
result <- dispRity.through.time(BeckLee_mat50, BeckLee_tree, 3)
summary(result) ; plot(result)

## This is equivalent to run the following decomposed code
dispRity(boot.matrix(chrono.subsets(BeckLee_mat50, BeckLee_tree, time = 3, method = "discrete"),
                     bootstraps = 100),
         metric = c(median, centroids))

## Load the Beck & Lee 2014 data
data(BeckLee_mat50) ; data(BeckLee_tree)

## Run a simple disparity through time analysis (with three time bins)
result <- dispRity.through.time(BeckLee_mat50, BeckLee_tree, 3)
summary(result) ; plot(result)

## This is equivalent to run the following decomposed code
dispRity(boot.matrix(chrono.subsets(BeckLee_mat50, BeckLee_tree, time = 3, method = "discrete"),
                     bootstraps = 100),
         metric = c(median, centroids))

Randtest distance

Description

Measures the distance between the observed statistic from a "randtest" object and some specific quantile of the simulated data.

Usage

distance.randtest(xtest, quantile = c(0.025, 0.975), abs = FALSE)
distance.randtest(xtest, quantile = c(0.025, 0.975), abs = FALSE)

Arguments

`xtest`	an object of class `"randtest"`
`quantile`	a `numeric` value for the quantile edges to compare the observed data to on either sides (by default `quantile = c(0.025. 0.975)`).
`abs`	`logical`, whether to calculate the distance as an absolute value (`TRUE`) or not (`FALSE` - default).

Details

To compare the observed value to the simulated median value, you can use quantile = 0.5. Also note that when using abs = FALSE (default), a negative value means that the observed statistic is within the request quantiles.

Author(s)

Thomas Guillerme

Examples

## Simple example
dummy_matrix <- matrix(rnorm(500), 100, 5)

## Testing whether the mean of a random subset
## is different than the means of 100 subsets
dummy_test <- randtest.dispRity(dummy_matrix,
                                subset = sample(1:100, 20),
                                metric = mean)
dummy_test ; plot(dummy_test)

## The distance between the observed data and the 95% quantile
distance.randtest(dummy_test)

## The absolute distance from the median
distance.randtest(dummy_test, quantile = 0.5, abs = TRUE)

## Simple example
dummy_matrix <- matrix(rnorm(500), 100, 5)

## Testing whether the mean of a random subset
## is different than the means of 100 subsets
dummy_test <- randtest.dispRity(dummy_matrix,
                                subset = sample(1:100, 20),
                                metric = mean)
dummy_test ; plot(dummy_test)

## The distance between the observed data and the 95% quantile
distance.randtest(dummy_test)

## The absolute distance from the median
distance.randtest(dummy_test, quantile = 0.5, abs = TRUE)

dtt dispRity (from `geiger::dtt`)

Description

A wrapper for the geiger::dtt function working with any disparity metric.

Usage

dtt.dispRity(
  data,
  metric,
  tree,
  nsim = 0,
  model = "BM",
  alternative = "two-sided",
  scale.time = TRUE,
  ...
)
dtt.dispRity(
  data,
  metric,
  tree,
  nsim = 0,
  model = "BM",
  alternative = "two-sided",
  scale.time = TRUE,
  ...
)

Arguments

`data`	A `dispRity` object or a `matrix`
`metric`	The disparity metric to be passed to `dispRity`.
`tree`	A `phylo` object matching the data and with a `root.time` element. Can be missing if `data` has a `tree` component.
`nsim`	The number of simulations to calculate null disparity-through-time.
`model`	A evolutionary model for the simulations (see `geiger::sim.char` - default is `"BM"`).
`alternative`	The H1 alternative (for calculating the p-value). Can be `"two-sided"` (default), `"greater"` or `"lesser"`; see details.
`scale.time`	Optional, whether to scale the time (between 0 and 1; `TRUE`, default) or not (`FALSE`).
`...`	Any other arguments to be passed to `geiger::dtt`.

Details

See geiger::dtt for details.

Author(s)

Thomas Guillerme

Examples


## Loading morphological data and a tree
data(BeckLee_mat50)
data(BeckLee_tree)

## The average squared pairwise distance metric (used in geiger::dtt)
average.sq <- function(X) mean(pairwise.dist(X)^2)

## Calculate the disparity of the dataset using dtt.dispRity
dispRity_dtt <- dtt.dispRity(data = BeckLee_mat50, metric = average.sq,
                             tree = BeckLee_tree, nsim = 20)

## Plotting the results
plot(dispRity_dtt)


## Loading morphological data and a tree
data(BeckLee_mat50)
data(BeckLee_tree)

## The average squared pairwise distance metric (used in geiger::dtt)
average.sq <- function(X) mean(pairwise.dist(X)^2)

## Calculate the disparity of the dataset using dtt.dispRity
dispRity_dtt <- dtt.dispRity(data = BeckLee_mat50, metric = average.sq,
                             tree = BeckLee_tree, nsim = 20)

## Plotting the results
plot(dispRity_dtt)

Getting the time subsets before and after an extinction event

Description

Getting the reference (pre-extinction) and the comparison (post-extinction) time subsets

Usage

extinction.subsets(data, extinction, lag = 1, names = FALSE, as.list = FALSE)
extinction.subsets(data, extinction, lag = 1, names = FALSE, as.list = FALSE)

Arguments

`data`	a `dispRity` object.
`extinction`	`numerical`, the time at the extinction event.
`lag`	`numerical`, the lag effect (i.e. how many subsets after the extinction to consider - default = `1`).
`names`	`logical`, whether to display the bins names (`TRUE`) or not (`FALSE` - default).
`as.list`	`logical`, whether to output the results as a list for `test.dispRity` (`TRUE`) or not (`FALSE` - default).

Author(s)

Thomas Guillerme

Examples

## Loading some disparity data
data(disparity)

## Time subsets for the K-Pg extinction (66 Mya)
extinction.subsets(disparity, 66, names = TRUE)

## Extinction with a lag effect of 3 slices
extinction_time <- extinction.subsets(disparity, 66, lag = 3, as.list = TRUE)

## Testing the extinction effect with a lag
test.dispRity(disparity, wilcox.test, comparisons = extinction_time,
              correction = "bonferroni")

## Loading some disparity data
data(disparity)

## Time subsets for the K-Pg extinction (66 Mya)
extinction.subsets(disparity, 66, names = TRUE)

## Extinction with a lag effect of 3 slices
extinction_time <- extinction.subsets(disparity, 66, lag = 3, as.list = TRUE)

## Testing the extinction effect with a lag
test.dispRity(disparity, wilcox.test, comparisons = extinction_time,
              correction = "bonferroni")

Imports data from geomorph

Description

Takes geomorph Procrustes object or a geomorph.data.frame object and ordinates it.

Usage

geomorph.ordination(data, ordinate = TRUE, ...)
geomorph.ordination(data, ordinate = TRUE, ...)

Arguments

`data`	An array (p x k x n) typically obtained from a Procrustes superimposition `geomorph::gpagen` or a `geomorph::geomorph.data.frame` object.
`ordinate`	Logical, whether to ordinate the data using `prcomp` (`TRUE`; default) or not (`FALSE`; the code then returns the raw coordinates matrix).
`...`	Any optional arguments to be passed to `prcomp` (is ignored if `ordinate = FALSE`).

Details

If data is a geomorph.data.frame object containing factors, directly performs a custom.subsets using these factors.

Value

A matrix or a dispRity object.

Examples

## Not run: 
require(geomorph)
## Loading the plethodon dataset
data(plethodon)

## Performing a Procrustes transform
procrustes <- geomorph::gpagen(plethodon$land, PrinAxes = FALSE)

## Obtaining the ordination matrix
geomorph.ordination(procrustes)

## Using a geomorph.data.frame
geomorph_df <- geomorph.data.frame(procrustes, species = plethodon$species)

geomorph.ordination(geomorph_df)

## Calculating disparity from dispRity or geomorph::morphol.disparity
geomorph_disparity <- geomorph::morphol.disparity(coords ~ 1,
                      groups= ~ species, data = geomorph_df)
dispRity_disparity <- dispRity(geomorph.ordination(geomorph_df),
                      metric = function(X) return(sum(X^2)/nrow(X)))

## Extracting the raw disparity values
geomorph_val <- round(as.numeric(geomorph_disparity$Procrustes.var), 15)
dispRity_val <- as.vector(summary(dispRity_disparity, digits = 15)$obs)

## Comparing the values (to the 15th decimal!)
geomorph_val == dispRity_val # all TRUE

## End(Not run)

## Not run: 
require(geomorph)
## Loading the plethodon dataset
data(plethodon)

## Performing a Procrustes transform
procrustes <- geomorph::gpagen(plethodon$land, PrinAxes = FALSE)

## Obtaining the ordination matrix
geomorph.ordination(procrustes)

## Using a geomorph.data.frame
geomorph_df <- geomorph.data.frame(procrustes, species = plethodon$species)

geomorph.ordination(geomorph_df)

## Calculating disparity from dispRity or geomorph::morphol.disparity
geomorph_disparity <- geomorph::morphol.disparity(coords ~ 1,
                      groups= ~ species, data = geomorph_df)
dispRity_disparity <- dispRity(geomorph.ordination(geomorph_df),
                      metric = function(X) return(sum(X^2)/nrow(X)))

## Extracting the raw disparity values
geomorph_val <- round(as.numeric(geomorph_disparity$Procrustes.var), 15)
dispRity_val <- as.vector(summary(dispRity_disparity, digits = 15)$obs)

## Comparing the values (to the 15th decimal!)
geomorph_val == dispRity_val # all TRUE

## End(Not run)

Get time bins ages

Description

Gets time bins for a specific tree using stratigraphy

Usage

get.bin.ages(tree, what = "End", type = "Age", ICS = 2015)
get.bin.ages(tree, what = "End", type = "Age", ICS = 2015)

Arguments

`tree`	A `phylo` object with a `$root.time` component
`what`	Which data to output. Can be `"Start"`, `"End"` (default), `"Range"` or `"Midpoint"`.
`type`	The type of stratigraphic frame. Can be `"Age"` (default), `"Eon"`, `"Epoch"`, `"Era"` or `"Period"`.
`ICS`	The reference year of the International Commission on Stratigraphy (default = `2015`).

Author(s)

Thomas Guillerme

Examples

## Loading the data
data(BeckLee_tree)
data(BeckLee_mat50)

## Getting the stratigraphic data
stratigraphy <- get.bin.ages(BeckLee_tree)

## Making stratigraphic time subsets
chrono.subsets(BeckLee_mat50, tree = BeckLee_tree, method = "discrete",
                time = stratigraphy)

## Loading the data
data(BeckLee_tree)
data(BeckLee_mat50)

## Getting the stratigraphic data
stratigraphy <- get.bin.ages(BeckLee_tree)

## Making stratigraphic time subsets
chrono.subsets(BeckLee_mat50, tree = BeckLee_tree, method = "discrete",
                time = stratigraphy)

Generates a contrast matrix.

Description

Creates a contrast matrix using the observed character states in an input matrix.

Usage

get.contrast.matrix(matrix)
get.contrast.matrix(matrix)

Arguments

matrix

a discrete morphological character matrix.

Author(s)

Thomas Guillerme

Examples

## A random multistate matrix
random_matrix <- matrix(sample(c(0,1,2), 100, TRUE), 10, 10)

## Get the contrast matrix
get.contrast.matrix(random_matrix)

## Adding inapplicable and missing data to the matrix
random_matrix[sample(1:100, 10)] <- "?"
random_matrix[sample(1:100, 10)] <- "-"

## Get the contrast matrix
get.contrast.matrix(random_matrix)

## A random multistate matrix
random_matrix <- matrix(sample(c(0,1,2), 100, TRUE), 10, 10)

## Get the contrast matrix
get.contrast.matrix(random_matrix)

## Adding inapplicable and missing data to the matrix
random_matrix[sample(1:100, 10)] <- "?"
random_matrix[sample(1:100, 10)] <- "-"

## Get the contrast matrix
get.contrast.matrix(random_matrix)

Extract elements from a `dispRity` object.

Description

Extract a matrix or the disparity results from a dispRity.

Usage

get.matrix(data, subsets, rarefaction, bootstrap, matrix)

get.disparity(data, subsets, rarefaction, observed, concatenate)
get.matrix(data, subsets, rarefaction, bootstrap, matrix)

get.disparity(data, subsets, rarefaction, observed, concatenate)

Arguments

`data`	A `dispRity` object.
`subsets`	Optional, a `numeric` or `character` for which subsets to get (if missing, the value for all subsets are given).
`rarefaction`	Optional, a single `numeric` value corresponding to the rarefaction level (as the number of elements; if missing, the non-rarefied values are output).
`bootstrap`	Optional, a `numeric` value to select a specific bootstrap draw (`0` is no bootstrap).
`matrix`	A `numeric` value of which matrix to select (default is `1`).
`observed`	A `logical` value indicating whether to output the observed (`TRUE` (default)) or the bootstrapped values (`FALSE`).
`concatenate`	When the disparity metric is a distribution, whether to concatenate it returning the median (`TRUE`; default) or to return each individual values.

Author(s)

Thomas Guillerme

Examples

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## To get the original matrix
get.matrix(disparity)

## To get the un-bootstrapped matrix from the subset called "80"
get.matrix(disparity, subsets = "80")

## To get the 52nd bootstrap draw of the second rarefaction level (15) of the
## same subset
get.matrix(disparity, subsets = 2, rarefaction = 2, bootstrap = 52)

## Extracting the observed disparity
get.disparity(disparity)

## Extracting the bootstrapped disparity
boot_disp <- get.disparity(disparity, observed = FALSE)
str(boot_disp)
## Or only the rarefied (5) data
boot_disp_rare <- get.disparity(disparity, observed = FALSE,
     rarefaction = 5)

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## To get the original matrix
get.matrix(disparity)

## To get the un-bootstrapped matrix from the subset called "80"
get.matrix(disparity, subsets = "80")

## To get the 52nd bootstrap draw of the second rarefaction level (15) of the
## same subset
get.matrix(disparity, subsets = 2, rarefaction = 2, bootstrap = 52)

## Extracting the observed disparity
get.disparity(disparity)

## Extracting the bootstrapped disparity
boot_disp <- get.disparity(disparity, observed = FALSE)
str(boot_disp)
## Or only the rarefied (5) data
boot_disp_rare <- get.disparity(disparity, observed = FALSE,
     rarefaction = 5)

Extracts or modify subsets from a `dispRity` object.

Description

Extracting or modify some subsets' data and information from a dispRity object.

Usage

n.subsets(data)

name.subsets(data)

size.subsets(data)

get.subsets(data, subsets)

combine.subsets(data, subsets)
n.subsets(data)

name.subsets(data)

size.subsets(data)

get.subsets(data, subsets)

combine.subsets(data, subsets)

Arguments

`data`	A `dispRity` object.
`subsets`	Either a `vector` of the number or name of the subsets to merge or a single. But see details for `combine.subsets`.

Details

For the function combine.subsets, the argument subsets can ALSO be a numeric value of the minimum of elements for each series. If subset is a vector, the subsets are merged in the given input order. c(1, 3, 4) will merge subsets 1 and 3 into 4, while the opposite, c(3, 4, 1) will merge subsets 3 and 4 into 1. When a single numeric value is given, subsets are merged with the next subset until the correct number of elements for each subset is reached (apart from the last subset that gets merged with the previous one).

Author(s)

Thomas Guillerme

Examples

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## How many subsets are in disparity?
n.subsets(disparity)

## What are the subset names
name.subsets(disparity)

## What are the number of elements per subsets?
size.subsets(disparity)

## Get one subset
get.subsets(disparity, "60")

## Get two subsets
get.subsets(disparity, c(1,5))

## Generate subsets from a dummy matrix
dummy_matrix <- matrix(rnorm(120), 40, dimnames = list(c(1:40)))
dummy_subsets <- custom.subsets(dummy_matrix,
     group = list("a" = c(1:5), "b" = c(6:10), "c" = c(11:20),
                  "d" = c(21:24), "e" = c(25:30), "f" = c(31:40)))

## Merging the two first subsets
combine.subsets(dummy_subsets, c(1,2))

## Merging the three subsets by name
combine.subsets(dummy_subsets, c("d", "c", "e"))

## Merging the subsets to contain at least 20 taxa
combine.subsets(dummy_subsets, 10)

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## How many subsets are in disparity?
n.subsets(disparity)

## What are the subset names
name.subsets(disparity)

## What are the number of elements per subsets?
size.subsets(disparity)

## Get one subset
get.subsets(disparity, "60")

## Get two subsets
get.subsets(disparity, c(1,5))

## Generate subsets from a dummy matrix
dummy_matrix <- matrix(rnorm(120), 40, dimnames = list(c(1:40)))
dummy_subsets <- custom.subsets(dummy_matrix,
     group = list("a" = c(1:5), "b" = c(6:10), "c" = c(11:20),
                  "d" = c(21:24), "e" = c(25:30), "f" = c(31:40)))

## Merging the two first subsets
combine.subsets(dummy_subsets, c(1,2))

## Merging the three subsets by name
combine.subsets(dummy_subsets, c("d", "c", "e"))

## Merging the subsets to contain at least 20 taxa
combine.subsets(dummy_subsets, 10)

Make and fill `dispRity`.

Description

Creating an empty dispRity object from a matrix

Usage

make.dispRity(data, tree, call, subsets)

fill.dispRity(data, tree, check)

remove.dispRity(data, what)
make.dispRity(data, tree, call, subsets)

fill.dispRity(data, tree, check)

remove.dispRity(data, what)

Arguments

`data`	A `matrix`.
`tree`	Optional, a `phylo` or `multiPhylo` object.
`call`	Optional, a `list` to be a `dispRity` call.
`subsets`	Optional, a `list` to be a `dispRity` subsets list.
`check`	Logical, whether to check the data (`TRUE`; default, highly advised) or not (`FALSE`).
`what`	Which elements to remove. Can be any of the following: `"subsets"`, `"bootstraps"`, `"covar"`, `"tree"`, `"disparity"`. See details.

Details

When using remove.dispRity, the function recursively removes any other data depending on "what". For example, for a data with disparity calculated for bootstrapped subsets, removing the subsets (what = "subsets") also removes the bootstraps and the disparity data. But removing the bootstraps (what = "bootstraps") removes only the bootstraps draws and the disparity relating to the bootstraps (but keeps the subsets and the non-bootstrapped disparity values).

Author(s)

Thomas Guillerme

Examples

## An empty dispRity object
make.dispRity()

## Still an empty dispRity object (with a matrix)
(empty <- make.dispRity(data = matrix(rnorm(12), ncol = 3)))

## A dispRity object with a matrix of 4*3
fill.dispRity(empty)

## A dispRity object with a tree
my_tree <- rtree(4, tip.label = c(1:4))
fill.dispRity(empty, tree = my_tree)

## An empty dispRity object
make.dispRity()

## Still an empty dispRity object (with a matrix)
(empty <- make.dispRity(data = matrix(rnorm(12), ncol = 3)))

## A dispRity object with a matrix of 4*3
fill.dispRity(empty)

## A dispRity object with a tree
my_tree <- rtree(4, tip.label = c(1:4))
fill.dispRity(empty, tree = my_tree)

Creating disparity metrics

Description

Testing the dimension-level of disparity metrics

Usage

make.metric(
  fun,
  ...,
  silent = FALSE,
  check.between.groups = FALSE,
  data.dim,
  tree = NULL,
  covar = FALSE,
  get.help = FALSE
)
make.metric(
  fun,
  ...,
  silent = FALSE,
  check.between.groups = FALSE,
  data.dim,
  tree = NULL,
  covar = FALSE,
  get.help = FALSE
)

Arguments

`fun`	A `function`.
`...`	Some arguments to be passed to `fun`.
`silent`	`logical`; if `FALSE` (default), the function will be verbose and give no output; if `TRUE`, the function will only output the function's dimension-level.
`check.between.groups`	`logical`; if `TRUE`, the function will output a named list containing the metric level and a logical indicating whether the metric can be used between groups or not. If `FALSE` (default) the function only outputs the metric level.
`data.dim`	optional, two `numeric` values for the dimensions of the matrix to run the test function testing. If missing, a default 5 rows by 4 columns matrix is used.
`tree`	optional, a `phylo` object.
`covar`	`logical`, whether to treat the metric as applied the a `data$covar` component (`TRUE`) or not (`FALSE`; default).
`get.help`	`logical`, whether to also output the `dist.helper` if the metric has a `dist.help` argument (`TRUE`) or not (`FALSE`; default).

Details

This function tests:

1: if your function can deal with a matrix as an input.
2: which dimension-level is your function (1, 2 or 3, see dispRity.metric).
3: whether the function can properly be implemented in the dispRity function.

The three different metric levels correspond to the dimensions of the output and are:

"dimension-level 1": for functions that decompose a matrix into a single value.
"dimension-level 2": for functions that decompose a matrix into a vector.
"dimension-level 3": for functions that transform the matrix into another matrix.

For example, the disparity metric sum of variances is composed of two metric dimension-levels:

The variances (dimension-level 2) that calculates the variances for each column in a matrix (aggregates a matrix into a vector).
The sum (dimension-level 1) that transforms the vector of variances into a single value.

See function example for a concrete illustration (three different dimension-levels of the function sum).

HINT: it is better practice to name the first argument of fun matrix to avoid potential argument conflicts down the line (the dispRity function assumes the matrix argument for the parsing the metrics).

The input fun can be a "normal" metric function (i.e. that takes a matrix as first argument) or a "between.groups" metric (i.e. that takes two matrix as arguments). If the arguments are named matrix and matrix2, the metric will be assumed to be "between.groups" and be run in a for loop rather than a apply loop in dispRity.

Author(s)

Thomas Guillerme

Examples

## A dimension-level 1 function
my_fun <- function(matrix) sum(matrix)
make.metric(my_fun)

## A dimension-level 2 function
my_fun <- function(matrix) apply(matrix, 2, sum)
make.metric(my_fun)

## A dimension-level 3 function
my_fun <- function(matrix) (matrix + sum(matrix))
make.metric(my_fun)

## A dimension-level 1 function
my_fun <- function(matrix) sum(matrix)
make.metric(my_fun)

## A dimension-level 2 function
my_fun <- function(matrix) apply(matrix, 2, sum)
make.metric(my_fun)

## A dimension-level 3 function
my_fun <- function(matrix) (matrix + sum(matrix))
make.metric(my_fun)

Match tips or nodes edge vector

Description

Match a vector of tips or tips and nodes with the an edge list from a "phylo" or "multiPhylo".

Usage

match.tip.edge(
  vector,
  phylo,
  replace.na,
  use.parsimony = TRUE,
  to.root = FALSE
)
match.tip.edge(
  vector,
  phylo,
  replace.na,
  use.parsimony = TRUE,
  to.root = FALSE
)

Arguments

`vector`	a vector of variables (equal to the number of tips or to the number of tips and nodes) or a vector of tips and nodes names or IDs.
`phylo`	a phylo or multiPhylo object.
`replace.na`	optional, what to replace NAs with.
`use.parsimony`	logical, whether to also colour internal edges parsimoniously (`TRUE` - default; i.e. if two nodes have the same unique ancestor node and the same variable, the ancestor node is assume to be the of the same value as its descendants) or not (`FALSE`).
`to.root`	logical, if `vector` is a list of tips and nodes, whether to colour internal edges all the way to the root (`TRUE`) or not (`FALSE` - default).

Value

If the input vector is a vector of variables, the function returns a vector of variables equal to the number of edges in the tree (or a list of vectors if the phylo input is of class "multiPhylo"). Else it returns an integer vector for the selected edges.

Author(s)

Thomas Guillerme

Examples

## A random tree
tree <- rtree(20)

## A random vector of two variables for each tips
tip_values <- sample(c("blue", "red"), 20, replace = TRUE)

## Matching the colors (blue and red) to the tips descendants
edge_colors <- match.tip.edge(tip_values, tree, replace.na = "grey")

## Plotting the results
plot(tree, show.tip.label = FALSE, edge.color = edge_colors)
tiplabels(1:20, bg = tip_values)

## Same but without assuming parsimony for the internal nodes
plot(tree, show.tip.label = FALSE,
     edge.color = match.tip.edge(tip_values, tree,
                                 use.parsimony = FALSE,
                                 replace.na = "grey"))

## Matching the tips and nodes colors with the edges
node_values <- sample(c("blue", "red"), 19, replace = TRUE)
edge_colors <- match.tip.edge(c(tip_values, node_values), tree)
plot(tree, show.tip.label = FALSE, edge.color = edge_colors) 
tiplabels(1:20, bg = tip_values)
nodelabels(1:19, bg = node_values)

## Matching the tips and nodes colours to the root
data(bird.orders)

## Getting the bird orders starting with a "C"
some_orders <- sort(bird.orders$tip.label)[4:9]

## Get the edges linking these orders
edges_of_interest <- match.tip.edge(vector = some_orders,
                                    phylo  = bird.orders)

## Create a colour vector for all edges
all_edges <- rep("grey", Nedge(bird.orders))
## Replacing the edges of interest by another colour
all_edges[edges_of_interest] <- "black"

## Plot the results
plot(bird.orders, edge.color = all_edges)

## A random tree
tree <- rtree(20)

## A random vector of two variables for each tips
tip_values <- sample(c("blue", "red"), 20, replace = TRUE)

## Matching the colors (blue and red) to the tips descendants
edge_colors <- match.tip.edge(tip_values, tree, replace.na = "grey")

## Plotting the results
plot(tree, show.tip.label = FALSE, edge.color = edge_colors)
tiplabels(1:20, bg = tip_values)

## Same but without assuming parsimony for the internal nodes
plot(tree, show.tip.label = FALSE,
     edge.color = match.tip.edge(tip_values, tree,
                                 use.parsimony = FALSE,
                                 replace.na = "grey"))

## Matching the tips and nodes colors with the edges
node_values <- sample(c("blue", "red"), 19, replace = TRUE)
edge_colors <- match.tip.edge(c(tip_values, node_values), tree)
plot(tree, show.tip.label = FALSE, edge.color = edge_colors) 
tiplabels(1:20, bg = tip_values)
nodelabels(1:19, bg = node_values)

## Matching the tips and nodes colours to the root
data(bird.orders)

## Getting the bird orders starting with a "C"
some_orders <- sort(bird.orders$tip.label)[4:9]

## Get the edges linking these orders
edges_of_interest <- match.tip.edge(vector = some_orders,
                                    phylo  = bird.orders)

## Create a colour vector for all edges
all_edges <- rep("grey", Nedge(bird.orders))
## Replacing the edges of interest by another colour
all_edges[edges_of_interest] <- "black"

## Plot the results
plot(bird.orders, edge.color = all_edges)

MCMCglmm.subsets

Description

Creating a dispRity object from a MCMCglmm posterior output

Usage

MCMCglmm.subsets(
  data,
  posteriors,
  group,
  tree,
  rename.groups,
  set.loc = TRUE,
  ...
)
MCMCglmm.subsets(
  data,
  posteriors,
  group,
  tree,
  rename.groups,
  set.loc = TRUE,
  ...
)

Arguments

`data`	The `data.frame` or `matrix` used for the `MCMCglmm` model.
`posteriors`	A `MCMCglmm` object, the posteriors of the model.
`group`	Optional, a named vector of which group to include from the posteriors (if left empty the random and residual terms are used). See details.
`tree`	Optional, the tree(s) used in the MCMCglmm analyses.
`rename.groups`	Optional, a vector of group names for renaming them. See details.
`set.loc`	Optional, if no location is available for a subset (`$Sol = 0`), set the location of the subset from `data` (the centroid of the group) (default is `TRUE`).
`...`	Optional arguments to be passed to `MCMCglmm.covars`.

Details

For the group option, the group names must be ones found in the posteriors formula in the format <Type = Term:FactorLevel> as returned by MCMCglmm.levels(posteriors). For example, for returning two random effect, the phylogenetic one ("animal") and one for a specific clade (say the 2nd clade) as well as two residual terms for a specific factor (say level 1 and 4) you can use group = c(random = "animal", random = "animal:clade2", residual = "units:myfactor1", residual = "units:myfactor4").
For the rename.groups option, the vector must be of class "character" and must of the same length as the number of random and residual terms in posteriors or of group argument (if used). If the group argument is left empty, the groups are extracted from the posteriors in the following order: the random terms first then the residual terms as specified in the posteriors object formulas (respectively posteriors$Random$formula and posteriors$Residual$formula).

NOTE that the output dispRity inherits the dimensions used in the posteriors argument. You can always check the selected dimensions using: data$call$dimensions

Author(s)

Thomas Guillerme

Examples

data(charadriiformes)

## Creating a dispRity object from the charadriiformes model
MCMCglmm.subsets(data       = charadriiformes$data,
                 posteriors = charadriiformes$posteriors)

## Same but selecting only the three first random terms
MCMCglmm.subsets(data       = charadriiformes$data,
                 posteriors = charadriiformes$posteriors,
                 tree       = charadriiformes$tree,
                 group      = MCMCglmm.levels(
                                 charadriiformes$posteriors)[1:3],
                 rename.groups = c("gulls", "plovers", "sandpipers"))


data(charadriiformes)

## Creating a dispRity object from the charadriiformes model
MCMCglmm.subsets(data       = charadriiformes$data,
                 posteriors = charadriiformes$posteriors)

## Same but selecting only the three first random terms
MCMCglmm.subsets(data       = charadriiformes$data,
                 posteriors = charadriiformes$posteriors,
                 tree       = charadriiformes$tree,
                 group      = MCMCglmm.levels(
                                 charadriiformes$posteriors)[1:3],
                 rename.groups = c("gulls", "plovers", "sandpipers"))

MCMCglmm object utility functions

Description

Different utility functions to extract aspects of a MCMCglmm object.

Usage

MCMCglmm.traits(MCMCglmm)

MCMCglmm.levels(MCMCglmm, convert)

MCMCglmm.sample(MCMCglmm, n)

MCMCglmm.covars(MCMCglmm, n, sample)

MCMCglmm.variance(MCMCglmm, n, sample, levels, scale)
MCMCglmm.traits(MCMCglmm)

MCMCglmm.levels(MCMCglmm, convert)

MCMCglmm.sample(MCMCglmm, n)

MCMCglmm.covars(MCMCglmm, n, sample)

MCMCglmm.variance(MCMCglmm, n, sample, levels, scale)

Arguments

`MCMCglmm`	A `MCMCglmm` object.
`convert`	Logical, whether to return the raw term names names as expressed in the model column names (`FALSE`) or to convert it to something more reader friendly (`TRUE`; default).
`n`	Optional, a number of random samples to extract.
`sample`	Optional, the specific samples to extract (is ignored if `n` is present).
`levels`	Optional, a vector `"character"` values (matching `MCMCglmm.levels(..., convert = TRUE)`) or of `"numeric"` values designating which levels to be used to calculate the variance (if left empty, all the levels are used).
`scale`	Logical, whether to scale the variance relative to all the levels (`TRUE`; default) or not (`FALSE`)/

Details

MCMCglmm.levels returns the different random and residual terms levels of a MCMCglmm object. This function uses the default option convert = TRUE to convert the names into something more readable. Toggle to convert = FALSE for the raw names.
MCMCglmm.traits returns the column names of the different traits of a MCMCglmm formula object.
MCMCglmm.sample returns a vector of sample IDs present in the MCMCglmm object. If n is missing, all the samples IDs are returned. Else, a random series of sample IDs are returned (with replacement if n greater than the number of available samples).
MCMCglmm.covars returns a list of covariance matrices and intercepts from a MCMCglmm object (respectively from MCMCglmm$VCV and MCMCglmm$Sol). By default, all the covariance matrices and intercepts are returned but you can use either of the arguments sample to return specific samples (e.g. MCMCglmm.covars(data, sample = c(1, 42)) for returning the first and 42nd samples) or n to return a specific number of random samples (e.g. MCMCglmm.covars(data, n = 42) for returning 42 random samples).
MCMCglmm.variance returns a list of covariance matrices and intercepts from a MCMCglmm object (respectively from MCMCglmm$VCV and MCMCglmm$Sol). By default, all the covariance matrices and intercepts are returned but you can use either of the arguments sample to return specific samples (e.g. MCMCglmm.covars(data, sample = c(1, 42)) for returning the first and 42nd samples) or n to return a specific number of random samples (e.g. MCMCglmm.covars(data, n = 42) for returning 42 random samples).

Author(s)

Thomas Guillerme

Examples

## Loading the charadriiformes model
data(charadriiformes)
model <- charadriiformes$posteriors
class(model) # is MCMCglmm

## Get the list of levels from the model
MCMCglmm.levels(model)
## The raw levels names (as they appear in the MCMCglmm object)
MCMCglmm.levels(model, convert = FALSE)

## Get the traits names from the model
MCMCglmm.traits(model)

## Get all the available samples in the model
length(MCMCglmm.sample(model))
## Get 5 random sample IDs from the model
MCMCglmm.sample(model, n = 5)

## Get one specific samples from the model
MCMCglmm.covars(model, sample = 42)
## Get two random samples from the model
MCMCglmm.covars(model, n = 2)

## Get the variance for each terms in the model
terms_variance <- MCMCglmm.variance(model)
boxplot(terms_variance, horizontal = TRUE, las = 1)

## Loading the charadriiformes model
data(charadriiformes)
model <- charadriiformes$posteriors
class(model) # is MCMCglmm

## Get the list of levels from the model
MCMCglmm.levels(model)
## The raw levels names (as they appear in the MCMCglmm object)
MCMCglmm.levels(model, convert = FALSE)

## Get the traits names from the model
MCMCglmm.traits(model)

## Get all the available samples in the model
length(MCMCglmm.sample(model))
## Get 5 random sample IDs from the model
MCMCglmm.sample(model, n = 5)

## Get one specific samples from the model
MCMCglmm.covars(model, sample = 42)
## Get two random samples from the model
MCMCglmm.covars(model, n = 2)

## Get the variance for each terms in the model
terms_variance <- MCMCglmm.variance(model)
boxplot(terms_variance, horizontal = TRUE, las = 1)

Model Test

Description

Fit models of disparity change through time

Usage

model.test(
  data,
  model,
  pool.variance = NULL,
  time.split = NULL,
  fixed.optima = FALSE,
  control.list = list(fnscale = -1),
  verbose = TRUE
)
model.test(
  data,
  model,
  pool.variance = NULL,
  time.split = NULL,
  fixed.optima = FALSE,
  control.list = list(fnscale = -1),
  verbose = TRUE
)

Arguments

`data`	A `dispRity` object used to test models of evolution through time.
`model`	The model(s) of evolution to allow for changes in disparity-through-time using a homogenous or hetergenous model, either using a single input or a list containing different models (See Details). If a vector with multiple modes is supplied then the model will test for shifts in modes at the time supplied by `time.split`.
`pool.variance`	If `NULL` (default) the difference in variances will be calculated using `bartlett.test` of equal variances. If there is no significant difference among variances, then variance in samples will be pooled and the same variance will be used for all samples. A significance difference will not pool variances and the original variance will be used for model-testing. If argument `TRUE` or `FALSE` are used, Bartlett's test will be ignored and the analyses will use the user-set pooling of variances.
`time.split`	The age of the change in mode (`numeric`). The age is measured in positive units as the time before the most recent sample, and multiple ages can be supplied in a vector. If no age is supplied for models then all possible time shifts are fit in the model, and the highest likelihood model is returned. Note this only applies to heterogenous models (See Details).
`fixed.optima`	A `logical` value, whether to use an estimated optimum value in OU models (`FALSE` - default), or whether to set the OU optimum to the ancestral value (`TRUE`).
`control.list`	A `list` of fine-tune control inputs for the `optim` function.
`verbose`	`logical`, whether to display the model results while they are computed (`TRUE` - default).

Details

DISCLAIMER: this function is working properly (i.e. it does what it is supposed to do), however, the interpretation of the results has not yet been thought through, discussed and peer-reviewed (what does a Brownian motion like disparity curve means biologically?).

The models are fit using maximum likelihood optimisation using the function optim. Fine-tuning of the search algorithms can be applied using the control.list argument. Models can be fit using a homogenous model with the same process applied to the entire sequence or models with time splits that represent a change in parameters or a shift in mode. When a heterogeneous and/or a time-shift model is specified with a specified time.split then the shift is tested at that value only. If no time shift is supplied then multiple shift times are tested, with all bins that allow for at least 10 bins either side of the split. If the entire sample is fewer than 30 samples long then no time splits are searched for (unless a time split is supplied by the user). Parameters are shared across different modes. For example, c("BM", "OU") would fit a model in which the process starts with a BM model and shifts to an OU process. The ancestral value at the start of the sequence and sigma squared value are shared across the models. Any combination of the following homogenous models (with the exception of "multi.OU") can be fit to the data:

BM: Fits a unbiased random walk model of Brownian motion evolution (Felsenstein 1973; 1985; Hunt 2006). The model optimises the ancestral state and the 'step-variance' (sigma-squared).
OU: The Ornstein-Uhlenbeck model of evolution in which the change in variance is constrained to an optimum value (Hansen 1997). In this model there are three parameters: optima, alpha, and ancestral state. The strength of attraction based on the parameter alpha and the ancestral state is estimated from the data. The optima value is estimated from the data, and this can lead to optima being found outside the known data values, and thus the model can resemble a trend. If the argument fixed.optima = TRUE, the model will not estimate optima but constrain it to the first (ancestral) value in the sequence as is done in phylogenetic OU models.
Trend: Fits a Brownian motion model with a directional component. This model is also known as the General Random Walk (Hunt 2006). This model has three parameters: the ancestral state, the 'step-variance' (sigma-squared), and the positive or negative trend.
Stasis: Fits a model in which traits evolve with variance (omega) around a mean (theta). This model is time-independent in that the model is guided only by the variance and attraction to the mean (Hunt 2006).
EB: Early-Burst, trait variance accumulates early in the evolution of a trait and decreases exponentially through time (Blomberg et al. 2003; Harmon et al. 2010). This model has three parameters: ancestral state, sigma-squared, and the exponential rate of decrease. Note this model expects the mean to remain unchanged through the model, so does not explicitly model a rapid change to a new mean or optimum value.
multi.OU: Fits a model in which the value of the optima shifts at one or more time splits. The values of the 'step-variance' (sigma squared) and attraction to the optima (alpha) are shared across all the samples. This model can not be fit with other models - the multi.OU system can be fit to the model only.

Value

A list of class dispRity and model.test that can be plotted and summarised via summary.dispRity and plot.dispRity. The list is composed of:

$aic.models summary for each model's small sample Akaike Information Criterion (AICc), delta AICc, and AICc weight
$full.models the list of the full models outputs from optim with the estimated parameters, log-likelihood, convergence statistics, and the split.time if applicable
$call the model input
$models.data input data used by the model(s)
$fixed.optima Logical indicating whether a fixed optima was assumed for OU model(s)

Author(s)

Mark N Puttick and Thomas Guillerme

References

Blomberg SP, Garland T Jr, & Ives AR. 2003. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution. 57, 717-745.

Hansen TF. 1997. Stabilizing selection and the comparative analysis of adaptation. Evolution. 51, 1341-1351.

Harmon LJ, et al. 2010. Early bursts of body size and shape evolution are rare in comparative data. 64, 2385-2396.

Hunt G. 2006. Fitting and comparing models of phyletic evolution: random walks and beyond. Paleobiology. 32, 578-601. DOI: 10.1666/05070.1.

Hunt G, Hopkins MJ & Lidgard S. 2015. Simple versus complex models of trait evolution and stasis as a response to environmental change. Proceedings of the National Academy of Sciences. 112, 4885-4890. DOI: 10.1073/pnas.1403662111

Felsenstein J. 1973. Maximum-likelihood estimation of evolutionary trees from continuous characters. American Journal of Human Genetics. 25, 471-492.

Felsenstein J. 1985. Phylogenies and the comparative method. The American Naturalist. 51, 1-15.

Murrell DJ. 2018. A global envelope test to detect non-random bursts of trait evolution. Methods in Ecology and Evolution. DOI: 10.1111/2041-210X.13006

Examples

## Not run: 
## Mammal disparity through time
data(BeckLee_disparity)

## The four models to fit
models <- list("BM", "OU", "multi.OU", c("BM", "OU"))

## Fitting the four models to the disparity data
tests <- model.test(BeckLee_disparity, models, time.split = 66)

## Summarising the models
summary(tests)

## Plotting only the models support
plot(tests)

## End(Not run)

## Not run: 
## Mammal disparity through time
data(BeckLee_disparity)

## The four models to fit
models <- list("BM", "OU", "multi.OU", c("BM", "OU"))

## Fitting the four models to the disparity data
tests <- model.test(BeckLee_disparity, models, time.split = 66)

## Summarising the models
summary(tests)

## Plotting only the models support
plot(tests)

## End(Not run)

Simulate Model Test

Description

Simulate models of disparity change through time

Usage

model.test.sim(
  sim = 1,
  model,
  model.rank = 1,
  alternative = "two-sided",
  time.split = NULL,
  time.span = 100,
  variance = 1,
  sample.size = 100,
  parameters = list(),
  fixed.optima = FALSE
)
model.test.sim(
  sim = 1,
  model,
  model.rank = 1,
  alternative = "two-sided",
  time.split = NULL,
  time.span = 100,
  variance = 1,
  sample.size = 100,
  parameters = list(),
  fixed.optima = FALSE
)

Arguments

`sim`	The number of separate simulations to run.
`model`	Either (i) the named model of evolution to simulate for changes in disparity-through-time using a homogenous or hetergenous model (see list in `model.test`) or (ii) an object of class `dispRity` returned from `model.test` function. If a `dispRity` object is supplied, all remaining arguments apart from `sim` and `model.rank` and `alternative` are ignored as the model specified by the input model is used.
`model.rank`	If a `dispRity` object is supplied, which model is used for simulation. The rank refers to the order of models as specified by AICc, so if `model.rank = 1` (default) the best-fitting model is used for simulation.
`alternative`	If the simulation is based on a `dispRity` object, what is the alternative hypothesis: can be `"two-sided"` (default), `"greater"` or `"lesser"`.
`time.split`	The age of the change in mode. The age is measured as the time before the most recent sample, and multiple ages can be supplied in a vector. Note this only applies to heterogenous models.
`time.span`	The length of the sequence (`numeric`). If one number is supplied this is treated as the length of the sequence and the time span is treated as sequence from 0 to `time.span` in unit increments. If a vector of length > 1 is supplied, this is treated as the the age of each sample in the sequence.
`variance`	The variance of each sample (`numeric`). If one number is supplied this is the variance for all samples in the sequence. If a vector of equal length to the `time.span` vector is supplied, this is used for the variance of each sample in the sequence
`sample.size`	The sample size of each sample (`numeric`). If one number is supplied this is the sample size for all samples in the sequence. If a vector of equal length to the `time.span` vector is supplied, this is used for the sample size of each sample in the sequence
`parameters`	A `list` of model parameters used for simulations. See details.
`fixed.optima`	A `logical` value, whether to use an estimated optimum value in OU models (`FALSE` - default), or whether to set the OU optimum to the ancestral value (`TRUE`).

Details

The parameters is a list of arguments to be passed to the models. These arguments can be:

ancestral.state, ancestral value of the disparity applicable to all models (default = 0.01).
sigma.squared, rate of step variance to all models except Stasis (default = 1).
alpha, strength of attraction to the optimum in OU models (default = 1).
optima.1, the value of the optimum in a OU model, or the first bin optimum in a multi-OU model (default = 0.15).
optima.2, the second bin optimum in a multi-OU model (default = 0.15).
optima.3, the third bin optimum in a multi-OU model (default = 0.15).
theta.1, the mean in a Stasis model, or the first bin mean in a multi-Stasis model (default = 1).
theta.2, the second bin optimum in a multi-OU model (default = 1).
theta.3, the third bin optimum in a multi-OU model (default = 1).
omega, the variance in a Stasis model (default = 1).
trend, the trend parameter in the Trend model (default = 0.5).
eb.rate, the rate of exponential rate decrease in the EB model (default = -0.1).

Value

A list of class dispRity and model.sim. Each list element contains the simulated central tendency, as well as the variance, sample size, and subsets used to simulate the data.

Author(s)

Mark N Puttick and Thomas Guillerme

References

Blomberg SP, Garland T Jr, & Ives AR. 2003. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution. 57, 717-745.

Hansen TF. 1997. Stabilizing selection and the comparative analysis of adaptation. Evolution. 51, 1341-1351.

Harmon LJ, et al. 2010. Early bursts of body size and shape evolution are rare in comparative data. 64, 2385-2396.

Hunt G. 2006. Fitting and comparing models of phyletic evolution: random walks and beyond. Paleobiology. 32, 578-601. DOI: 10.1666/05070.1.

Felsenstein J. 1973. Maximum-likelihood estimation of evolutionary trees from continuous characters. American Journal of Human Genetics. 25, 471-492.

Felsenstein J. 1985. Phylogenies and the comparative method. The American Naturalist. 51, 1-15.

Murrell DJ. 2018. A global envelope test to detect non-random bursts of trait evolution. Methods in Ecology and Evolution. DOI: 10.1111/2041-210X.13006

Citation for the envelope code:

Examples

## Not run: 
## Disparity through time data
data(BeckLee_disparity)

## List of models to test
models <- list("Trend", "BM")

## Testing the models on the observed disparity
model_test_output <- model.test(BeckLee_disparity, models, time.split = 66)
 
## simulations using the output from model.test
model_test_sim_output <- model.test.sim(sim = 100, model= model_test_output)
 
## Plot the simulated best model
plot(model_test_sim_output)
## Add the observed data
plot(BeckLee_disparity, add = TRUE, col = c("pink", "#ff000050", "#ff000050"))

## Simulating a specific model with specific parameters parameters
model_simulation <- model.test.sim(sim = 100, model = "BM", time.span = 120, variance = 0.1,
                                   sample.size = 100, parameters = list(ancestral.state = 0,
                                   sigma.squared = 0.1))

## Summarising the results
plot(model_simulation, main = "A simple Brownian motion")

## End(Not run)

## Not run: 
## Disparity through time data
data(BeckLee_disparity)

## List of models to test
models <- list("Trend", "BM")

## Testing the models on the observed disparity
model_test_output <- model.test(BeckLee_disparity, models, time.split = 66)
 
## simulations using the output from model.test
model_test_sim_output <- model.test.sim(sim = 100, model= model_test_output)
 
## Plot the simulated best model
plot(model_test_sim_output)
## Add the observed data
plot(BeckLee_disparity, add = TRUE, col = c("pink", "#ff000050", "#ff000050"))

## Simulating a specific model with specific parameters parameters
model_simulation <- model.test.sim(sim = 100, model = "BM", time.span = 120, variance = 0.1,
                                   sample.size = 100, parameters = list(ancestral.state = 0,
                                   sigma.squared = 0.1))

## Summarising the results
plot(model_simulation, main = "A simple Brownian motion")

## End(Not run)

Model test wrapper

Description

A wrapper function for model.test to perform a model fitting analysis on disparity through time data.

Usage

model.test.wrapper(
  data,
  model,
  pool.variance = NULL,
  time.split = NULL,
  fixed.optima = FALSE,
  control.list = list(fnscale = -1),
  verbose = TRUE,
  sim = 1000,
  plot.sim = TRUE,
  col.sim,
  col.obs = "hotpink",
  lwd.obs = 2,
  show.p = FALSE,
  cex.p,
  legend = FALSE,
  ...
)
model.test.wrapper(
  data,
  model,
  pool.variance = NULL,
  time.split = NULL,
  fixed.optima = FALSE,
  control.list = list(fnscale = -1),
  verbose = TRUE,
  sim = 1000,
  plot.sim = TRUE,
  col.sim,
  col.obs = "hotpink",
  lwd.obs = 2,
  show.p = FALSE,
  cex.p,
  legend = FALSE,
  ...
)

Arguments

`data`	A `dispRity` object used to test models of evolution through time.
`model`	The model(s) of evolution to allow for changes in disparity-through-time using a homogenous or hetergenous model, either using a single input or a list containing different models (see list in `model.test`). If a vector with multiple modes is supplied then the model will test for shifts in modes at the time supplied by `time.split`.
`pool.variance`	If `NULL` (default) the difference in variances will be calculated using `bartlett.test` of equal variances. If there is no significant difference among variances, then variance in samples will be pooled and the same variance will be used for all samples. A significance difference will not pool variances and the original variance will be used for model-testing. If argument `TRUE` or `FALSE` are used, Bartlett's test will be ignored and the analyses will use the user-set pooling of variances.
`time.split`	The age of the change in mode (`numeric`). The age is measured in positive units as the time before the most recent sample, and multiple ages can be supplied in a vector. If no age is supplied for models then all possible time shifts are fit in the model, and the highest likelihood model is returned. Note this only applies to heterogenous models (See Details).
`fixed.optima`	A `logical` value, whether to use an estimated optimum value in OU models (`FALSE` - default), or whether to set the OU optimum to the ancestral value (`TRUE`).
`control.list`	A `list` of fine-tune control inputs for the optim function.
`verbose`	`logical`, whether to display the model results as computed (`TRUE` - default).
`sim`	The number of separate simulations (default = 1000).
`plot.sim`	Logical. If `TRUE` (default) the plots of the simulated and observed disparity are returned for all models.
`col.sim`	Colour options used for the plotting of simulated values. See `plot.dispRity` for more details. If missing, the default colours `c("black", "lightgrey", "grey")` are used.
`col.obs`	Colour of the observed data on the plot. Default colour is `"hotpink"`
`lwd.obs`	Line width of the observed value.
`show.p`	Logical, when `plot.sim = TRUE`, whether to display the p-value of rank envelope tests (`TRUE`) or not (`FALSE` - default).
`cex.p`	A numerical value for the the font size of the displayed p-value (if `show.p = TRUE`). If missing, the value is set to 1.
`legend`	Logical, when `plot.sim = TRUE`, whether to display the legend in the first panel (`TRUE`) or not (`FALSE` - default).
`...`	Any additional arguments to be passed to `plot.dispRity` or `summary.dispRity`.

Details

This function gives the relative fit of model.test output using log-likelihood and AICc values, as well as the Rank Envelope Test significance to elucidate if empirical data is significantly different to simulated data modelled using the estimated model parameters from model.test.sim. This is equivalent to running test <- model.test.sim(sim = 1000, model = model.test(data, model)); summary(test) ; plot(test) ; plot(data, add = TRUE).

Value

A matrix with the relative fit, parameter values, and Rank Envelope test p values for each model, and a plot of simulated data from each model alongside observed data for each model if plot.sim is TRUE

Author(s)

Mark N Puttick and Thomas Guillerme

References

Blomberg SP, Garland T Jr, & Ives AR. 2003. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution. 57, 717-745.

Hansen TF. 1997. Stabilizing selection and the comparative analysis of adaptation. Evolution. 51, 1341-1351.

Harmon LJ, et al. 2010. Early bursts of body size and shape evolution are rare in comparative data. 64, 2385-2396.

Hunt G. 2006. Fitting and comparing models of phyletic evolution: random walks and beyond. Paleobiology. 32, 578-601. DOI: 10.1666/05070.1.

Felsenstein J. 1973. Maximum-likelihood estimation of evolutionary trees from continuous characters. American Journal of Human Genetics. 25, 471-492.

Felsenstein J. 1985. Phylogenies and the comparative method. The American Naturalist. 51, 1-15.

Murrell DJ. 2018. A global envelope test to detect non-random bursts of trait evolution. Methods in Ecology and Evolution. DOI: 10.1111/2041-210X.13006

Examples


## Not run: 
## Mammal disparity through time
data(BeckLee_disparity)

## The models to be fit to disparity data
models <- list("BM", "OU", "multi.OU", "Trend")

## test all models, and assess the significance of simulated data
## against the empirical distribution for each
model.test.wrapper(data = BeckLee_disparity, model = models, fixed.optima = TRUE,
                   time.split = 66, show.p = TRUE)

## End(Not run)

## Not run: 
## Mammal disparity through time
data(BeckLee_disparity)

## The models to be fit to disparity data
models <- list("BM", "OU", "multi.OU", "Trend")

## test all models, and assess the significance of simulated data
## against the empirical distribution for each
model.test.wrapper(data = BeckLee_disparity, model = models, fixed.optima = TRUE,
                   time.split = 66, show.p = TRUE)

## End(Not run)

Ancestral states estimations with multiple trees

Description

Fast ancestral states estimations run on multiple trees using the Mk model from castor::asr_mk_model.

Usage

multi.ace(
  data,
  tree,
  models,
  sample = 1,
  sample.fun = list(fun = runif, param = list(min = min, max = max)),
  threshold = TRUE,
  special.tokens,
  special.behaviours,
  brlen.multiplier,
  verbose = FALSE,
  parallel = FALSE,
  output,
  options.args,
  estimation.details = NULL
)
multi.ace(
  data,
  tree,
  models,
  sample = 1,
  sample.fun = list(fun = runif, param = list(min = min, max = max)),
  threshold = TRUE,
  special.tokens,
  special.behaviours,
  brlen.multiplier,
  verbose = FALSE,
  parallel = FALSE,
  output,
  options.args,
  estimation.details = NULL
)

Arguments

`data`	A `matrix`, `data.frame` or `list` with the characters for each taxa.
`tree`	A `phylo` or `mutiPhylo` object (if the `tree` argument contains node labels, they will be used to name the output).
`models`	A `character` vector, unambiguous named `list` or `matrix` to be passed as model arguments to `castor::asr_mk_model` or `ape::ace` (see details).
`sample`	An `integer` for the number of matrices to sample per tree (default is `1`). See details.
`sample.fun`	If `sample > 1`, a named list containing the following elements: `fun` the sampling distribution for continuous characters; and `param` (optional) a named list of parameters and their estimation function (default is `sample.fun = list(fun = runif, param = list(min = min, max = max))`). See details.
`threshold`	Is ignored if `sample > 1`, else either a `logical` for applying a relative threshold (`TRUE` - default) or no threshold (`FALSE`) or a `numeric` value of the threshold (e.g. 0.95). See details.
`special.tokens`	optional, a named `vector` of special tokens to be passed to `grep` (make sure to protect the character with `"\\"`). By default `special.tokens <- c(missing = "\\?", inapplicable = "\\-", polymorphism = "\\&", uncertainty = "\\/")`. Note that `NA` values are not compared and that the symbol "@" is reserved and cannot be used.
`special.behaviours`	optional, a `list` of one or more functions for a special behaviour for `special.tokens`. See details.
`brlen.multiplier`	optional, a vector of branch length modifiers (e.g. to convert time branch length in changes branch length) or a list of vectors (the same length as `tree`).
`verbose`	`logical`, whether to be verbose (`TRUE`) or not (`FALSE` - default).
`parallel`	Either a `logical`, whether to use parallel algorithm (`TRUE`) or not (`FALSE` - default); or directly an `integer` indicating the number of cores to use (note that if `parallel = 1`, one core will be used but the parallel integration will still be called).
`output`	optional, see `Value` section below.
`options.args`	optional, a named list of options to be passed to function called by `castor::asr_mk_model`.
`estimation.details`	optional, whether to also return the details for each estimation as returned by `castor::asr_mk_model` or `ape::ace`. This argument can be left `NULL` (default) or be any combination of the elements returned by `castor::asr_mk_model` or `ape::ace` (e.g. `c("loglikelihood", "transition_matrix", "CI95")`).

Details

Depending on the type of characters models argument can be either:

the name of a single model to apply to all characters (if all characters are discrete or all are continuous); see below for the list of available names. For example models = "ER" applies the Equal Rates model to all characters (assuming they are all discrete characters).
a vector of model names to apply to different type of characters (see below for the list). For example models = c("ER", "ER", "BM") applies the Equal Rates model to the two first characters (discrete) and the "BM" model to the third character (continuous).
a transition "matrix" to be applied to all characters (if discrete). For example models = matrix(0.2, 2, 2).
an single named list of arguments to be applied to all characters by passing it to ape::ace (if continuous). For example models = list(method = "GLS", corStruct = corBrownian(1, my_tree)).
an un-ambiguous list of arguments to be passed to either castor::asr_mk_model (discrete characters) or ape::ace (continuous characters). For example models = list("char1" = list(transition_matrix = matrix(0.2, 2, 2)), "char2" = list(method = "GLS", corStruct = corBrownian(1, my_tree))) to be specifically passed to the characters named "char1" and "char2".

The available built-in models for discrete characters in castor::asr_mk_model are:

"ER" for all equal rates
"SYM" for symmetric rates
"ARD" all rates are different
"SUEDE" equal stepwise transitions (e.g. for meristic/counting characters)
"SRD" different stepwise transitions

See directly castor::asr_mk_model for more models.

The available built-in models and methods for continuous characters in ape::ace are:

"BM" model: for a default Brownian Motion with the "REML" method
"REML" method: for a default Brownian Motion with the "REML" method (same as above)
"ML" method: for a default Brownian Motion with the "ML" method
"pic" method: for a default Brownian Motion with the "pic" (least squared) method

special.behaviours allows to generate a special rule for the special.tokens. The functions should can take the arguments character, all_states with character being the character that contains the special token and all_states for the character (which is automatically detected by the function). By default, missing data returns and inapplicable returns all states, and polymorphisms and uncertainties return all present states.

missing = function(x,y) y
inapplicable = function(x,y) y
polymorphism = function(x,y) strsplit(x, split = "\\&")[[1]]
uncertainty = function(x,y) strsplit(x, split = "\\/")[[1]]

Functions in the list must be named following the special token of concern (e.g. missing), have only x, y as inputs and a single output a single value (that gets coerced to integer automatically). For example, the special behaviour for the special token "?" can be coded as: special.behaviours = list(missing = function(x, y) return(NA) to make ignore the character for taxa containing "?".

When using the parallel option (either through using parallel = TRUE by using the number of available cores minus on or manually setting the number of cores - e.g. parallel = 5), the castor::asr_mk_model function will use the designated number of cores (using the option Nthreads = <requested_number_of_cores>). Additionally, if the input tree is a "multiPhylo" object, the trees will be run in parallel for each number of cores, thus decreasing computation time accordingly (e.g. if 3 cores are requested and tree contains 12 "phylo" objects, 4 different "phylo" objects will be run in parallel on the 3 cores making the calculation around 3 times faster).

When using the default sample = 1, only one estimation is sampled per tree:

For continuous characters, this estimation is the average estimated ancestral value;
For discrete characters, this estimation is the one calculated using the threshold option (see details below).

When using sample > 1, multiple estimations are sampled per tree:

For continuous characters, this estimation is sample from the 95
For discrete characters, the estimations are sampled using their scaled likelihood.

The threshold option allows to convert ancestral states likelihoods into discrete states. When threshold = FALSE, the ancestral state estimated is the one with the highest likelihood (or at random if likelihoods are equal). When threshold = TRUE, the ancestral state estimated are all the ones that are have a scaled likelihood greater than the maximum observed scaled likelihood minus the inverse number of possible states (i.e. select_state >= (max(likelihood) - 1/n_states)). This option makes the threshold selection depend on the number of states (i.e. if there are more possible states, a lower scaled likelihood for the best state is expected). Finally using a numerical value for the threshold option (e.g. threshold = 0.95) will simply select only the ancestral states estimates with a scaled likelihood equal or greater than the designated value. This option makes the threshold selection absolute. Regardless, if more than one value is select, the uncertainty token (special.tokens["uncertainty"]) will be used to separate the states. If no value is selected, the uncertainty token will be use between all observed characters (special.tokens["uncertainty"]).

The sample.fun option allows to specify a function and parameters for the sampling of the continuous traits. The default is sample.fun = list(fun = runif, param = list(min = min, max = max)) for applying a random uniform sampling (runif) with the parameters (the minimum and the maximum are applied using respectively the min and max functions on the estimated data). For applying different samplings to different traits, you can use a list of arguments in the sample format as sample.fun (e.g. sample.fun = list(trait_uniform = list(fun = runif, param = list(min = min, max = max)), trait_normal = list(fun = rnorm, param = list(mean = mean, sd = function(x)return(diff(range(x))/4))) - here the standard deviation is calculated as a quarter of the 95

Value

Returns a "matrix" or "list" of ancestral states. By default, the function returns the ancestral states in the same format as the input matrix. This can be changed using the option output = "matrix" or "list" to force the class of the output. To output the combined ancestral states and input, you can use "combined" (using the input format) or "combined.matrix" or "combined.list". If using continuous characters only, you can use the output option "dispRity" to directly output a usable dispRity object with all trees and all the data (estimated and input). NOTE that if the input data had multiple character types (continuous and discrete) and that "matrix" or "combined.matrix" output is requested, the function returns a "data.frame".

Author(s)

Thomas Guillerme

Examples

set.seed(42)
## A simple example:
## A random tree with 10 tips
tree <- rcoal(10)
## Setting up the parameters
my_rates = c(rgamma, rate = 10, shape = 5)

## A random Mk matrix (10*50)
matrix_simple <- sim.morpho(tree, characters = 50, model = "ER", rates = my_rates,
                            invariant = FALSE)

## Run a basic ancestral states estimations
ancestral_states <- multi.ace(matrix_simple, tree)
ancestral_states[1:5, 1:5]

## A random continuous trait matrix
matrix_continuous <- space.maker(elements = 10, dimensions = 2,
                                 distribution = rnorm, 
                                 elements.names = tree$tip.label)
## A basic estimations for discrete data
ancestral_traits <- multi.ace(matrix_continuous, tree)

## A more complex example for discrete data
## Create a multiple list of 5 trees
multiple_trees <- rmtree(5, 10)

## Modify the matrix to contain missing and special data
matrix_complex <- matrix_simple
matrix_complex[sample(1:length(matrix_complex), 50)] <- "-"
matrix_complex[sample(1:length(matrix_complex), 50)] <- "0%2"
matrix_complex[sample(1:length(matrix_complex), 50)] <- "?"
matrix_complex[1:5,1:5]

## Set a list of extra special tokens
my_spec_tokens <- c("weirdtoken" = "%")

## Set some special behaviours for the "weirdtoken" and for "-" and "?"
my_spec_behaviours <- list()
## Inapplicable tokens "-" are ignored
my_spec_behaviours$inapplicable <- function(x,y) return(NA)
## Missing tokens "?" are considered as all states
my_spec_behaviours$missing      <- function(x,y) return(y)
## Weird tokens are considered as state 0 and 3
my_spec_behaviours$weirdtoken   <- function(x,y) return(c(1,2))

## Create a random branch length modifier to apply to each tree
branch_lengths <- rnorm(18)^2

## Setting a list of model ("ER" for the 25 first characters and then "SYM")
my_models <- c(rep("ER", 25), rep("SYM", 25))

## Run the ancestral states on all the tree with multiple options
ancestral_states <- multi.ace(matrix_complex, multiple_trees,
                              verbose = TRUE,
                              models = my_models,
                              threshold = 0.95,
                              special.tokens = my_spec_tokens,
                              special.behaviours = my_spec_behaviours,
                              brlen.multiplier = branch_lengths,
                              output = "combined.matrix")

## The results for the the two first characters for the first tree
ancestral_states[[1]][, 1:2]

## Not run: 
## The same example but running in parallel
ancestral_states <- multi.ace(matrix_complex, multiple_trees,
                              verbose = TRUE,
                              models = my_models,
                              threshold = 0.95,
                              special.tokens = my_spec_tokens,
                              special.behaviours = my_spec_behaviours,
                              brlen.multiplier = branch_lengths,
                              output = "combined.matrix",
                              parallel = TRUE)

## End(Not run)

## Estimating ancestral states by sampling distributions
## Discrete characters:
sampled_ancestors <- multi.ace(matrix_simple, multiple_trees, sample = 25)
## A list of 125 matrices sampled across 25 estimations and 5 trees
sampled_ancestors <- unlist(sampled_ancestors, recursive = FALSE)

## Continuous characters:
## Sampling 100 matrices from the ancestral estimations
sampled_traits <- multi.ace(matrix_continuous, tree, sample = 100)
## Setting up two specific sampling functions
sample_fun <- list(
     unif_sampl = list(fun = runif,
                       param = list(min = min, max = max)),
     norm_sampl = list(fun = rnorm,
                       param = list(mean = mean,
                                    sd = function(x)return(diff(range(x))/4)))
)
## Sampling 100 matrices with different distribution functions
sampled_traits <- multi.ace(matrix_continuous, tree, sample = 100,
                            sample.fun = sample_fun)

set.seed(42)
## A simple example:
## A random tree with 10 tips
tree <- rcoal(10)
## Setting up the parameters
my_rates = c(rgamma, rate = 10, shape = 5)

## A random Mk matrix (10*50)
matrix_simple <- sim.morpho(tree, characters = 50, model = "ER", rates = my_rates,
                            invariant = FALSE)

## Run a basic ancestral states estimations
ancestral_states <- multi.ace(matrix_simple, tree)
ancestral_states[1:5, 1:5]

## A random continuous trait matrix
matrix_continuous <- space.maker(elements = 10, dimensions = 2,
                                 distribution = rnorm, 
                                 elements.names = tree$tip.label)
## A basic estimations for discrete data
ancestral_traits <- multi.ace(matrix_continuous, tree)

## A more complex example for discrete data
## Create a multiple list of 5 trees
multiple_trees <- rmtree(5, 10)

## Modify the matrix to contain missing and special data
matrix_complex <- matrix_simple
matrix_complex[sample(1:length(matrix_complex), 50)] <- "-"
matrix_complex[sample(1:length(matrix_complex), 50)] <- "0%2"
matrix_complex[sample(1:length(matrix_complex), 50)] <- "?"
matrix_complex[1:5,1:5]

## Set a list of extra special tokens
my_spec_tokens <- c("weirdtoken" = "%")

## Set some special behaviours for the "weirdtoken" and for "-" and "?"
my_spec_behaviours <- list()
## Inapplicable tokens "-" are ignored
my_spec_behaviours$inapplicable <- function(x,y) return(NA)
## Missing tokens "?" are considered as all states
my_spec_behaviours$missing      <- function(x,y) return(y)
## Weird tokens are considered as state 0 and 3
my_spec_behaviours$weirdtoken   <- function(x,y) return(c(1,2))

## Create a random branch length modifier to apply to each tree
branch_lengths <- rnorm(18)^2

## Setting a list of model ("ER" for the 25 first characters and then "SYM")
my_models <- c(rep("ER", 25), rep("SYM", 25))

## Run the ancestral states on all the tree with multiple options
ancestral_states <- multi.ace(matrix_complex, multiple_trees,
                              verbose = TRUE,
                              models = my_models,
                              threshold = 0.95,
                              special.tokens = my_spec_tokens,
                              special.behaviours = my_spec_behaviours,
                              brlen.multiplier = branch_lengths,
                              output = "combined.matrix")

## The results for the the two first characters for the first tree
ancestral_states[[1]][, 1:2]

## Not run: 
## The same example but running in parallel
ancestral_states <- multi.ace(matrix_complex, multiple_trees,
                              verbose = TRUE,
                              models = my_models,
                              threshold = 0.95,
                              special.tokens = my_spec_tokens,
                              special.behaviours = my_spec_behaviours,
                              brlen.multiplier = branch_lengths,
                              output = "combined.matrix",
                              parallel = TRUE)

## End(Not run)

## Estimating ancestral states by sampling distributions
## Discrete characters:
sampled_ancestors <- multi.ace(matrix_simple, multiple_trees, sample = 25)
## A list of 125 matrices sampled across 25 estimations and 5 trees
sampled_ancestors <- unlist(sampled_ancestors, recursive = FALSE)

## Continuous characters:
## Sampling 100 matrices from the ancestral estimations
sampled_traits <- multi.ace(matrix_continuous, tree, sample = 100)
## Setting up two specific sampling functions
sample_fun <- list(
     unif_sampl = list(fun = runif,
                       param = list(min = min, max = max)),
     norm_sampl = list(fun = rnorm,
                       param = list(mean = mean,
                                    sd = function(x)return(diff(range(x))/4)))
)
## Sampling 100 matrices with different distribution functions
sampled_traits <- multi.ace(matrix_continuous, tree, sample = 100,
                            sample.fun = sample_fun)

Testing a null hypothesis on multidimensional data.

Description

Testing the difference between the observed disparity and disparity under a null model.

Usage

null.test(
  data,
  replicates = 100,
  null.distrib,
  null.args = NULL,
  null.cor = NULL,
  null.scree = NULL,
  alter = "two-sided",
  scale = FALSE,
  ...
)
null.test(
  data,
  replicates = 100,
  null.distrib,
  null.args = NULL,
  null.cor = NULL,
  null.scree = NULL,
  alter = "two-sided",
  scale = FALSE,
  ...
)

Arguments

`data`	a `dispRity` object.
`replicates`	the number of replicates for the test (default = `100`).
`null.distrib`	one or more distribution functions to generate the null model to be passed to `space.maker`.
`null.args`	any additional distribution arguments to be passed to `space.maker` (see `arguments` within; `default = NULL`).
`null.cor`	an additional correlation matrix to be passed to `space.maker` (see `cor.matrix` within; `default = NULL`).
`null.scree`	an additional vector of variance per axis (equivalent to screeplot output); `default = NULL`).
`alter`	the type of alternative hypothesis (H1) as used in `randtest` (`default = "two-sided"`).
`scale`	whether to scale the simulated and the observed data.
`...`	optional arguments to be passed to `as.randtest`.

Author(s)

Thomas Guillerme

References

Diaz, S., Kattge, J., Cornelissen, J.H., Wright, I.J., Lavorel, S., Dray, S., Reu, B., Kleyer, M., Wirth, C., Prentice, I.C. and Garnier, E., 2016. The global spectrum of plant form and function. Nature, 529(7585), pp.167-171.

Examples

## Load the Beck & Lee 2014 data
data(BeckLee_mat50)
## Calculating the disparity as the ellipsoid volume
obs_disparity <- dispRity(BeckLee_mat50, metric = ellipsoid.volume)
## Testing against normal distribution
results <- null.test(obs_disparity, replicates = 100, null.distrib = rnorm)
results ; plot(results)

## Running the test on multiple subsets (may take some time!)
## Generating the subsets
groups <- as.data.frame(matrix(data = c(rep(1, 12), rep(2, 13), rep(3, 12),
     rep(4, 13)), dimnames = list(rownames(BeckLee_mat50))), ncol = 1)
customised_subsets <- custom.subsets(BeckLee_mat50, groups)
## Bootstrapping the data
bootstrapped_data <- boot.matrix(customised_subsets, bootstraps = 100)
## Calculating variances of each dimension
sum_variances <- dispRity(bootstrapped_data, metric = c(sum, variances))
## Testing against normal distribution
results <- null.test(sum_variances, replicates = 100, null.distrib = rnorm)
summary(results) ; plot(results)

## Load the Beck & Lee 2014 data
data(BeckLee_mat50)
## Calculating the disparity as the ellipsoid volume
obs_disparity <- dispRity(BeckLee_mat50, metric = ellipsoid.volume)
## Testing against normal distribution
results <- null.test(obs_disparity, replicates = 100, null.distrib = rnorm)
results ; plot(results)

## Running the test on multiple subsets (may take some time!)
## Generating the subsets
groups <- as.data.frame(matrix(data = c(rep(1, 12), rep(2, 13), rep(3, 12),
     rep(4, 13)), dimnames = list(rownames(BeckLee_mat50))), ncol = 1)
customised_subsets <- custom.subsets(BeckLee_mat50, groups)
## Bootstrapping the data
bootstrapped_data <- boot.matrix(customised_subsets, bootstraps = 100)
## Calculating variances of each dimension
sum_variances <- dispRity(bootstrapped_data, metric = c(sum, variances))
## Testing against normal distribution
results <- null.test(sum_variances, replicates = 100, null.distrib = rnorm)
summary(results) ; plot(results)

Plots pairwise comparisons

Description

Plots pairwise comparisons from a data frame (typically output from test.dispRity).

Usage

pair.plot(
  data,
  what,
  col = c("black", "white"),
  legend = FALSE,
  binary,
  diag,
  add,
  lower = TRUE,
  ...
)
pair.plot(
  data,
  what,
  col = c("black", "white"),
  legend = FALSE,
  binary,
  diag,
  add,
  lower = TRUE,
  ...
)

Arguments

`data`	A `matrix` or a `data.frame` object with comparisons' pair names as row names. The number of rows must be equal to a pairwise combination of `n` elements (see details).
`what`	A `numeric` or `character` value designating which column to plot.
`col`	The two extremes of a color gradient (default = `c("black", "white")`).
`legend`	Logical, whether to plot the legend or not.
`binary`	Optional, if the results must be binary, a `numeric` value for the threshold of acceptance (values greater will be 1, lower will be 0).
`diag`	Optional, can be `"max"` or `"min"` or a single `numeric` value.
`add`	Optional, whether to add significance tokens can be `numeric` for a point type to print (`pch`) or `"character"` to print (e.g. `"*"`).
`lower`	Optional, logical, whether to add tokens for values lower than `binary` (default is `TRUE`; `FALSE` will add tokens for values bigger than `binary`).
`...`	Any other options to be passed to `plot`.

Details

The number of rows (i.e. comparisons) in matrix must be equal to the results of a pairwise combination. In general, the number of rows x must satisfy the equation: $x = n^2 / 2 - n / 2$ where n must be an integer greater or equal than 2.

Author(s)

Thomas Guillerme

Examples

## A small matrix of two pairwise comparisons of seven elements (2*21 comparisons)
data <- matrix(data = runif(42), ncol = 2)

## Plotting the first column as a pairwise comparisons
pair.plot(data, what = 1, col = c("orange", "blue"), legend = TRUE, diag = 1)

## Adding some tokens for each value below 0.2 in the second column
pair.plot(data, what = 2, binary = 0.2, add = "*", cex = 2)

## Loading disparity data
data(disparity)

## Testing the pairwise difference between slices
tests <- test.dispRity(disparity, test = wilcox.test, correction = "bonferroni")

## Plotting the significance
pair.plot(as.data.frame(tests), what = "p.value", binary = 0.05)

## A small matrix of two pairwise comparisons of seven elements (2*21 comparisons)
data <- matrix(data = runif(42), ncol = 2)

## Plotting the first column as a pairwise comparisons
pair.plot(data, what = 1, col = c("orange", "blue"), legend = TRUE, diag = 1)

## Adding some tokens for each value below 0.2 in the second column
pair.plot(data, what = 2, binary = 0.2, add = "*", cex = 2)

## Loading disparity data
data(disparity)

## Testing the pairwise difference between slices
tests <- test.dispRity(disparity, test = wilcox.test, correction = "bonferroni")

## Plotting the significance
pair.plot(as.data.frame(tests), what = "p.value", binary = 0.05)

phylolm dispRity (from `phylolm::phylolm`)

Description

Passing dispRity objects to the phylolm function from the phylolm package. Typically to run some PGLS.

Usage

pgls.dispRity(data, tree, formula, model = "BM", ..., optim = list())
pgls.dispRity(data, tree, formula, model = "BM", ..., optim = list())

Arguments

`data`	A `dispRity` object with a metric of dimension level 2 at least
`tree`	If `data` does not contain a tree component, a `"phylo"` or `"multiPhylo"` object to be used as the tree. If `data` already contains a tree component and the `tree` argument is not missing, the provided `tree` will replace any contained in `data`.
`formula`	The PGLS formula. If left empty, runs either `disparity ~ 1` or `disparity ~ subsets` if `data` contains subsets.
`model`	The covariance model (default is `"BM"`). For more details (including the models available) see the manual for `phylolm`.
`...`	Any optional arguments to be passed to `phylolm`
`optim`	An optional named list of arguments to be passed to the function `optim`

Details

The formula needs to be expressed by always naming the response variable disparity to use the calculated disparity data from data.

Optional arguments ... correspond to all the non-ambiguous named arguments from the phylolm. Optional arguments for the internal optim function can be passed as a named list to the optim argument.

Author(s)

Thomas Guillerme

Examples

## Simple example
data(BeckLee_mat50)
data(BeckLee_tree)
disparity <- dispRity(BeckLee_mat50, metric = centroids, tree = BeckLee_tree)

## Running a simple PGLS
model <- pgls.dispRity(disparity)
summary(model)

## More complex example running a PGLS
## on multiple trees and using groups as a predictor 

## Simple example
data(BeckLee_mat50)
data(BeckLee_tree)
disparity <- dispRity(BeckLee_mat50, metric = centroids, tree = BeckLee_tree)

## Running a simple PGLS
model <- pgls.dispRity(disparity)
summary(model)

## More complex example running a PGLS
## on multiple trees and using groups as a predictor

Plots character differences

Description

Plots a character difference matrix from a discrete character matrix or its character differences density profile.

Usage

## S3 method for class 'char.diff'
plot(
  x,
  ...,
  type = "matrix",
  legend = TRUE,
  legend.title = "Difference",
  legend.pos = "topleft",
  legend.round = 0,
  axis = TRUE,
  xlim,
  ylim,
  xlab,
  ylab,
  col,
  main
)
## S3 method for class 'char.diff'
plot(
  x,
  ...,
  type = "matrix",
  legend = TRUE,
  legend.title = "Difference",
  legend.pos = "topleft",
  legend.round = 0,
  axis = TRUE,
  xlim,
  ylim,
  xlab,
  ylab,
  col,
  main
)

Arguments

`x`	A discrete matrix or an already computed character difference matrix of class `char.diff`.
`...`	Any additional graphical arguments to be passed to `image`.
`type`	Either `"matrix"` (or `"m"`) or `"density"` (or `"d"`) for respectively plotting the matrix of character differences or its character differences density profile.
`legend`	A logical value stating whether to print the legend or not (default = `TRUE`).
`legend.title`	A `character` string to be displayed as the title of the legend (default = `Difference`).
`legend.pos`	The position of the legend. Can be two `numeric`. Default is `"topleft"`.
`legend.round`	A `numeric` value for digits up legend values. Default is `0`.
`axis`	A logical value stating whether to print the axis or not (default = `TRUE`).
`xlim`	Two `numeric` values to determine the x axis limits. If missing (default), the limits are calculated automatically to fit the plot window.
`ylim`	Two `numeric` values to determine the y axis limits. If missing (default), the limits are calculated automatically to fit the plot window.
`xlab`	A `character` string for the the x axis. Can be missing.
`ylab`	A `character` string for the the y axis. Can be missing.
`col`	Two colors for forming the gradient if `type = "correlation"` or for the density lines colors if `type = "density"`.
`main`	An overall title for the plot.

Author(s)

Thomas Guillerme

Examples

## Comparing two characters
char.diff(list(c(0, 1, 0, 1), c(0, 1, 1, 1)))

## Pairwise comparisons in a morphological matrix
morpho_matrix <- matrix(sample(c(0,1), 100, replace = TRUE), 10)

## Plotting a matrix
plot.char.diff(morpho_matrix)

## Plotting the density profile of a char.diff object
char.diff_matrix <- char.diff(morpho_matrix)
plot(char.diff_matrix, type = "density")

## Comparing two characters
char.diff(list(c(0, 1, 0, 1), c(0, 1, 1, 1)))

## Pairwise comparisons in a morphological matrix
morpho_matrix <- matrix(sample(c(0,1), 100, replace = TRUE), 10)

## Plotting a matrix
plot.char.diff(morpho_matrix)

## Plotting the density profile of a char.diff object
char.diff_matrix <- char.diff(morpho_matrix)
plot(char.diff_matrix, type = "density")

dispRity object plotting

Description

Plots a dispRity object.

Usage

## S3 method for class 'dispRity'
plot(
  x,
  ...,
  type,
  quantiles = c(50, 95),
  cent.tend = median,
  rarefaction = NULL,
  elements = FALSE,
  observed = FALSE,
  add = FALSE,
  density = NULL,
  specific.args
)
## S3 method for class 'dispRity'
plot(
  x,
  ...,
  type,
  quantiles = c(50, 95),
  cent.tend = median,
  rarefaction = NULL,
  elements = FALSE,
  observed = FALSE,
  add = FALSE,
  density = NULL,
  specific.args
)

Arguments

`x`	A `dispRity` object.
`...`	Any optional arguments to be passed to `plot`. See details.
`type`	Either `"continuous"`, `"box"`, `"line"`, `"polygon"` or `"space"`. When unspecified, if no disparity was calculated, `"preview"` is used. If disparity was calculated, `"continuous"` is used for `chrono.subsets` and `"box"` for `custom.subsets`. See details.
`quantiles`	The quantiles to display (default is `quantiles = c(50, 95)`; is ignored if the `dispRity` object is not bootstrapped).
`cent.tend`	A function for summarising the bootstrapped disparity values (default is `median`).
`rarefaction`	Either `NULL` (default) or `FALSE` for not using the rarefaction scores; a `numeric` value of the level of rarefaction to plot; or `TRUE` for plotting the rarefaction curves.
`elements`	`logical` whether to plot the number of elements per subsets (default is `FALSE`) or a `list` of any of the graphical arguments `"col"`, `"pch"` and/or `"cex"`.
`observed`	`logical` whether to add the observed values on the plot as crosses (default is `FALSE`) or a `list` of any of the graphical arguments `"col"`, `"pch"` and/or `"cex"`.
`add`	`logical` whether to add the new plot an existing one (default is `FALSE`).
`density`	the density of shading lines to be passed to `polygon`. Is ignored if `type = "box"` or `type = "line"`.
`specific.args`	optional, a named list of arguments to be passed for some specific plot types. See details.

Details

The different type arguments are:

"continuous": plots the results as a continuous line.
"box": plots the results as discrete box plots (note that this option ignores the user set quantiles and central tendency).
"line": plots the results as discrete vertical lines with the user's set quantiles and central tendency.
"polygon": identical as "line" but using polygons rather than vertical lines.
"preview": plots two dimensional preview of the space (default is c(1,2)). WARNING: the plotted dimensions might not be representative of the full multi-dimensional space!

The different specific.args arguments for the following options are:

if type = "preview", the specific arguments can be:
- dimensions: two specific dimensions to plot (default is specific.args = list(dimensions = c(1,2));
- matrix: which specific matrix to plot the data from (by default, all the matrices are used).
- tree: whether to plot the underlying tree(s) or not. Can be either logical, whether to plot no tree (default is specific.args = list(tree = FALSE)), all trees (specific.args = list(tree = TRUE)) or a specific set of trees (e.g. specific.args = list(tree = c(1,2)))
if data is a "test.metric" result that was obtained with the option save.steps = TRUE (see test.metric), it is possible to specify which steps to by specifying the following specific argument: specific.args = list(visualise.steps = c(1,4,5)) for visualising steps 1, 4 and 5 of the different shifts. By default, if the "test.metric" was obtained with the option save.steps = TRUE, four steps are displayed.
if data is a "dispRity" and "projection" object (from dispRity.covar.projections), it is possible to plot either the boxplot of disparity values for each projection (using correlation.plot = NULL; default) or to plot the correlation between two calculated elements (e.g. correlation.plot = c("position", "distance")).

When plotting "randtest" or "test.metric" data or when using type = "preview" a legend is plotted by default. To remove the legend you can use the argument legend = FALSE. You can control specific arguments for the legend using the ... optional arguments preceded by legend.. For example, to change the legend position you can use legend.x = "topleft" or legend.x = 4.2 and legend.y = 1.23. General legend arguments such as col, legend, pch, etc... are recycled by the function but can always be specified using this syntax.

Author(s)

Thomas Guillerme

Examples

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## Discrete plotting
plot(disparity, type = "box")

## Using polygons rather than boxes (quantiles and central tendency can be
## set by the user)
plot(disparity, type = "polygon", quantiles = c(10, 50, 95),
     cent.tend = mean)

## Using different options
plot(disparity, type = "line", elements = TRUE, ylim = c(0, 3),
     xlab = ("Time (Ma)"), ylab = "disparity")

## Continuous plotting (all default options)
plot(disparity, type = "continuous")
 
## Rarefactions plots
plot(disparity, rarefaction = TRUE)

## Observed data
plot(disparity, observed = TRUE)

## Observed data with graphical details
plot(disparity, observed = list("pch" = 19, col = "blue", cex = 4))

## For plotting dispRity objects with the dual classes "randtest", "dtt",
## "model.test", "model.sim" and "test.metric" see the examples
## in the specific function manuals from the "See also" section above

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## Discrete plotting
plot(disparity, type = "box")

## Using polygons rather than boxes (quantiles and central tendency can be
## set by the user)
plot(disparity, type = "polygon", quantiles = c(10, 50, 95),
     cent.tend = mean)

## Using different options
plot(disparity, type = "line", elements = TRUE, ylim = c(0, 3),
     xlab = ("Time (Ma)"), ylab = "disparity")

## Continuous plotting (all default options)
plot(disparity, type = "continuous")
 
## Rarefactions plots
plot(disparity, rarefaction = TRUE)

## Observed data
plot(disparity, observed = TRUE)

## Observed data with graphical details
plot(disparity, observed = list("pch" = 19, col = "blue", cex = 4))

## For plotting dispRity objects with the dual classes "randtest", "dtt",
## "model.test", "model.sim" and "test.metric" see the examples
## in the specific function manuals from the "See also" section above

Prints a `dispRity` object.

Description

Summarises the content of a dispRity object.

Usage

## S3 method for class 'dispRity'
print(x, all = FALSE, ...)
## S3 method for class 'dispRity'
print(x, all = FALSE, ...)

Arguments

`x`	A `dispRity` object.
`all`	`logical`; whether to display the entire object (`TRUE`) or just summarise its contents (`FALSE` - default).
`...`	further arguments to be passed to `print` or to `print.dispRity`.

Author(s)

Thomas Guillerme

Examples

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## Displaying the summary of the object content
disparity
print(disparity) # the same
print.dispRity(disparity) # the same

## Displaying the full object
print.dispRity(disparity, all = TRUE)

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## Displaying the summary of the object content
disparity
print(disparity) # the same
print.dispRity(disparity) # the same

## Displaying the full object
print.dispRity(disparity, all = TRUE)

Random circle

Description

Creates coordinates for a random circle

Usage

random.circle(n, distribution, inner = 0, outer = Inf, ...)
random.circle(n, distribution, inner = 0, outer = Inf, ...)

Arguments

`n`	The number of pairs x,y of coordinates.
`distribution`	The distribution from which the coordinates are sampled.
`inner`	Optional, the radius for an empty inner circle.
`outer`	Optional, the maximum radius for the circle.
`...`	Any additional argument to be passed to `distribution`.

Author(s)

Thomas Guillerme

Examples

## A simple uniform circle
plot(random.circle(1000, runif), pch = 20)

## A normal ring with inner and outer boundaries
plot(random.circle(1000, rnorm, inner = 0.5, outer = 5), pch = 20)

## A simple uniform circle
plot(random.circle(1000, runif), pch = 20)

## A normal ring with inner and outer boundaries
plot(random.circle(1000, rnorm, inner = 0.5, outer = 5), pch = 20)

Random (permutation) test

Description

Performs a random test (aka permutation test) on a matrix or a dispRity object.

Usage

## S3 method for class 'dispRity'
randtest(
  xtest,
  subsets,
  metric,
  replicates = 100,
  resample = TRUE,
  alter = "two-sided",
  ...
)
## S3 method for class 'dispRity'
randtest(
  xtest,
  subsets,
  metric,
  replicates = 100,
  resample = TRUE,
  alter = "two-sided",
  ...
)

Arguments

`xtest`	The `matrix` or a `dispRity` object to draw from.
`subsets`	A `vector` of elements to test (or a `list` of `vectors` - see details).
`metric`	A `function` to be the statistic to apply to the subset.
`replicates`	A `numeric` value for the number of replicates (`default = 100`).
`resample`	`logical` whether to resample the full distribution (`TRUE`; default) or the distribution without the subset (`FALSE`).
`alter`	The alternative hypothesis. Can be `"two-sided"` (default), `"greater"` or `"lesser"`.
`...`	optional arguments to be passed to `metric`.

Details

This test checks whether the metric calculated on a given subset of the data is significantly different from the metric calculated on any random subset of the same size. In other words: does the given subset have a clearly different disparity value than the rest of the data?

First, the metric (statistic) is applied to the subset sampled from the data (population). Second, the metric is applied to random equally sized subsets from the data. If the observed difference falls out of the random differences distribution, the differences are significant. This algorithm is based on a similar procedure than in link[ade4]{rantest}.

If data is a dispRity object, the subsets, metric and replicates can be left missing and are automatically inherited from the dispRity if it contains respectively subsets (from chrono.subsets or custom.subsets) a metric (from dispRity) and bootstrap draws (from boot.matrix).

If data is a dispRity object subsets can be a list of subsets to compare for example list(c("A", "B"), c("B", "A")) will run two tests comparing respectively sample A to B and B to A. Note that it will only compare these two samples and use their combined size as the population size, if you want to compare a subset to all the subsets you can use list(c("A") or write down the specific subsets to be used.

Value

This function returns a "randtest" object that can be passed to the generic S3 functions print.randtest or plot.randtest. The output also contains to extra elements output$observed and output$random containing the raw results of respectively the observed and random tests.

Author(s)

Thomas Guillerme

Examples

## Simple example
dummy_matrix <- matrix(rnorm(500), 100, 5)

## Testing whether the mean of a random subset
## is different than the means of 100 subsets
dummy_test <- randtest.dispRity(dummy_matrix,
                                subset = sample(1:100, 20),
                                metric = mean)
dummy_test ; plot(dummy_test)

## Applying this on dispRity objects
data(disparity)
test_disparity <- test.dispRity(disparity,
                                test = randtest.dispRity)

## The summarised results
summary(test_disparity)

## Plotting the results
plot(test_disparity)

## Applying this on a dispRity object with specific subset comparisons
test_disparity2 <- randtest.dispRity(disparity, subsets = list(
     ## Comparing subset 90 to the whole population (traitspace)
     c(observed = "90"),
     ## Comparing subset "70" to "90", "70" and "30"
     c(observed = "70", random = c("90", "70", "30"))))

## Summarising and plotting the results
summary(test_disparity2)
plot(test_disparity2)

## Simple example
dummy_matrix <- matrix(rnorm(500), 100, 5)

## Testing whether the mean of a random subset
## is different than the means of 100 subsets
dummy_test <- randtest.dispRity(dummy_matrix,
                                subset = sample(1:100, 20),
                                metric = mean)
dummy_test ; plot(dummy_test)

## Applying this on dispRity objects
data(disparity)
test_disparity <- test.dispRity(disparity,
                                test = randtest.dispRity)

## The summarised results
summary(test_disparity)

## Plotting the results
plot(test_disparity)

## Applying this on a dispRity object with specific subset comparisons
test_disparity2 <- randtest.dispRity(disparity, subsets = list(
     ## Comparing subset 90 to the whole population (traitspace)
     c(observed = "90"),
     ## Comparing subset "70" to "90", "70" and "30"
     c(observed = "70", random = c("90", "70", "30"))))

## Summarising and plotting the results
summary(test_disparity2)
plot(test_disparity2)

Reduce a matrix

Description

Reduce the number of rows/columns in a matrix to optimise overlap

Usage

reduce.matrix(matrix, distance = "gower", by.row = TRUE, verbose = FALSE)
reduce.matrix(matrix, distance = "gower", by.row = TRUE, verbose = FALSE)

Arguments

`matrix`	A matrix
`distance`	which distance to consider (passed to `vegdist`, default = `"gower"`)
`by.row`	Whether to do it by rows (`TRUE` - default), or by columns (`FALSE`)
`verbose`	Whether to do be verbose (`TRUE`) or not (`FALSE` - default)

Author(s)

Thomas Guillerme

Examples

set.seed(1)
## A 10*5 matrix
na_matrix <- matrix(rnorm(50), 10, 5)
## Making sure some rows don't overlap
na_matrix[1, 1:2] <- NA
na_matrix[2, 3:5] <- NA
## Adding 50% NAs
na_matrix[sample(1:50, 25)] <- NA
## Illustrating the gappy matrix
image(t(na_matrix), col = "black")

## Reducing the matrix by row
(reduction <- reduce.matrix(na_matrix))
## Illustrating the overlapping matrix
image(t(na_matrix[-as.numeric(reduction$rows.to.remove), ]), col = "black")

## Reducing the matrix by columns (and being verbose)
reduce.matrix(na_matrix, by.row = FALSE, verbose = TRUE)

set.seed(1)
## A 10*5 matrix
na_matrix <- matrix(rnorm(50), 10, 5)
## Making sure some rows don't overlap
na_matrix[1, 1:2] <- NA
na_matrix[2, 3:5] <- NA
## Adding 50% NAs
na_matrix[sample(1:50, 25)] <- NA
## Illustrating the gappy matrix
image(t(na_matrix), col = "black")

## Reducing the matrix by row
(reduction <- reduce.matrix(na_matrix))
## Illustrating the overlapping matrix
image(t(na_matrix[-as.numeric(reduction$rows.to.remove), ]), col = "black")

## Reducing the matrix by columns (and being verbose)
reduce.matrix(na_matrix, by.row = FALSE, verbose = TRUE)

Reduce space

Description

Remove elements from a multidimensional space

Usage

reduce.space(
  space,
  type,
  remove,
  parameters,
  tuning,
  verbose = FALSE,
  return.optim = FALSE
)
reduce.space(
  space,
  type,
  remove,
  parameters,
  tuning,
  verbose = FALSE,
  return.optim = FALSE
)

Arguments

`space`	the trait space
`type`	how to reduce the space (either `"random"`, `"size"`, `"density"`, `"evenness"` or `"position"`)
`remove`	the proportion of elements to be removed (in probability)
`parameters`	the parameter(s) for removal selection (see details). If left empty, the `parameters` is estimated to reach the amount set by `remove`.
`tuning`	Optinal parameters for tuning the parameter estimations (if remove is required and parameters is missing) a list of three parameters: "max" for the maximum of operations, "tol" for the tuning (e.g. 0.1 close), "inc.steps" for the initial increment value during optimisation (default = 2 - the bigger the value, the slower the increment).
`verbose`	wether to be verbose or not
`return.optim`	logical, whether to also return the optimal value.

Details

The type of reductions algorithms select the proportion of elements to remove (from the remove parameter). The different algorithms are:

"random" for randomly selecting a proportion of data points (using sample(..., replace = FALSE)).
"size" for selecting the proportion of data points closer to the centre.
"density" for selecting the proportion of data points with the lower nearest neigbhour distances.
"evenness" for randomly selecting the proportion of data points from the regions with most density.
"position" for selecting the proportion of data closer to a corner of the the space.

The parameters for each reduction type algorithms are:

"size" parameters: a list of parameters$centre, the centre from which to count the radius (if missing, is set to 0); and parameters$radius, the radius for removal.
"density" parameters: a list of parameters$what "close" (default) for close neighbours or "distant" for distant ones; parameters$diameter the diameter for considering closeness or distance; parameters$output either "singles" or "pairs" to return the pairs of neighbours or one of them only (the first).
"position" parameters: a list of parameters$value, value the threshold value from which to remove elements.
"evenness" parameters: a list of parameters$bw, a bandwith selector function (bw.nrd0 by default); and parameters$power a scaling factor for exaggerating the flatting/narrowing of the curve (the counts are set to this parameter exponent: default is 1).

See Guillerme et al. 2020 and https://github.com/TGuillerme/moms for details.

Value

A vector of logical values of the rows to remove selected by the function. TRUE corresponds to the following (and FALSE to the opposite):

"random": the randomly selected points.
"size": the points closer to the centre of the space.
"density": the points closer to each other.
"position": the points on the "positive" side of the space (typically upper right corner in 2D).
"evenness": the randomly select points from the higher density regions.

Author(s)

Thomas Guillerme

References

Examples

set.seed(1)
## Creating a two dimensional space
space <- space.maker(100, 2, distribution = stats::rnorm)

## Generating the four types of reductions
random <- reduce.space(space, "random", remove = 0.5)
size <- reduce.space(space, "size", remove = 0.5)
density <- reduce.space(space, "density", remove = 0.5)
position <- reduce.space(space, "position", remove = 0.5)
evenness <- reduce.space(space, "evenness", remove = 0.5)

## Plotting the four different results
par(mfrow = c(3,2))
plot(space, pch = 19, col = c("grey", "black")[as.factor(random)],
     main = "Random removal") 
plot(space, pch = 19, col = c("grey", "black")[as.factor(size)],
     main = "Size removal")
plot(space, pch = 19, col = c("grey", "black")[as.factor(density)],
     main = "Density removal")
plot(space, pch = 19, col = c("grey", "black")[as.factor(position)],
     main = "Position removal")
plot(space, pch = 19, col = c("grey", "black")[as.factor(evenness)],
     main = "Evenness removal")

## The space reduction with specific parameters:
# Using the point with coordinates (2,2) as the centre 
# Running over a maximum of 300 iterations
# With a tolerance of 0.05 (5%)
reduce.space(space, "size", remove = 0.2,
             parameters = list("centre" = c(2,2)), 
             tuning = list("max" = 300, "tol" = 0.05))

## Remove a specific amount to match a specific parameter
reduce.space(space, type = "size", parameters = list("radius" = 1.206866))


set.seed(1)
## Creating a two dimensional space
space <- space.maker(100, 2, distribution = stats::rnorm)

## Generating the four types of reductions
random <- reduce.space(space, "random", remove = 0.5)
size <- reduce.space(space, "size", remove = 0.5)
density <- reduce.space(space, "density", remove = 0.5)
position <- reduce.space(space, "position", remove = 0.5)
evenness <- reduce.space(space, "evenness", remove = 0.5)

## Plotting the four different results
par(mfrow = c(3,2))
plot(space, pch = 19, col = c("grey", "black")[as.factor(random)],
     main = "Random removal") 
plot(space, pch = 19, col = c("grey", "black")[as.factor(size)],
     main = "Size removal")
plot(space, pch = 19, col = c("grey", "black")[as.factor(density)],
     main = "Density removal")
plot(space, pch = 19, col = c("grey", "black")[as.factor(position)],
     main = "Position removal")
plot(space, pch = 19, col = c("grey", "black")[as.factor(evenness)],
     main = "Evenness removal")

## The space reduction with specific parameters:
# Using the point with coordinates (2,2) as the centre 
# Running over a maximum of 300 iterations
# With a tolerance of 0.05 (5%)
reduce.space(space, "size", remove = 0.2,
             parameters = list("centre" = c(2,2)), 
             tuning = list("max" = 300, "tol" = 0.05))

## Remove a specific amount to match a specific parameter
reduce.space(space, type = "size", parameters = list("radius" = 1.206866))

Remove zero branch length

Description

Remove zero or negative branch lengths on trees by sliding nodes randomly in a postorder traversal based on slide.nodes.

Usage

remove.zero.brlen(tree, slide, verbose = FALSE)
remove.zero.brlen(tree, slide, verbose = FALSE)

Arguments

`tree`	A `"phylo"` or `"multiPhylo"` object with edge lengths
`slide`	An optional sliding `numeric` values. If left empty, 1% of the shortest branch length is used.
`verbose`	A `logical` value indicating whether to be verbose or not.

Details

The sliding value will be used to slide the nodes up and down to remove zero branch lengths by minimising the amount of branch changes. The algorithm slides the nodes up and down (when possible) on each node in a recursive way while there is still zero or negative branch lengths. If two recursions produce the same series of zero branches (e.g. by sliding node A towards node B equally so that the distance A:B becomes 0), the sliding value is divided by two until the next slide.

Value

A "phylo" object with a postorder edge table and no zero branch lengths.

Author(s)

Thomas Guillerme

Examples

set.seed(42)
## Generating a tree
tree <- rtree(20)
## Adding some zero branch lengths (5)
tree$edge.length[sample(1:Nedge(tree), 5)] <- 0
any(tree$edge.length == 0) # TRUE

## And now removing these zero branch lengths!
tree_no_zero <- remove.zero.brlen(tree)
any(tree_no_zero$edge.length == 0) # FALSE

## Exaggerating the removal (to make it visible)
tree_exaggerated <- remove.zero.brlen(tree, slide = 1)

## Plot the differences
par(mfrow = c(3,1))
plot(tree, main = "zero branch length")
plot(tree_no_zero, main = "no zero branch length")
plot(tree_exaggerated, main = "exaggerated slidding")

## Removing negative branch lengths
## Generating a tree with negative branch length
set.seed(3)
tree_negative <- chronoMPL(rtree(10))
## Removing the negative branch length (and make it non-zero)
tree_positive <- remove.zero.brlen(tree_negative)
## Plot the differences
par(mfrow = c(2, 1))
plot(tree_negative, main = "Negative branch lengths")
plot(tree_positive, main = "Positive branch lengths")

set.seed(42)
## Generating a tree
tree <- rtree(20)
## Adding some zero branch lengths (5)
tree$edge.length[sample(1:Nedge(tree), 5)] <- 0
any(tree$edge.length == 0) # TRUE

## And now removing these zero branch lengths!
tree_no_zero <- remove.zero.brlen(tree)
any(tree_no_zero$edge.length == 0) # FALSE

## Exaggerating the removal (to make it visible)
tree_exaggerated <- remove.zero.brlen(tree, slide = 1)

## Plot the differences
par(mfrow = c(3,1))
plot(tree, main = "zero branch length")
plot(tree_no_zero, main = "no zero branch length")
plot(tree_exaggerated, main = "exaggerated slidding")

## Removing negative branch lengths
## Generating a tree with negative branch length
set.seed(3)
tree_negative <- chronoMPL(rtree(10))
## Removing the negative branch length (and make it non-zero)
tree_positive <- remove.zero.brlen(tree_negative)
## Plot the differences
par(mfrow = c(2, 1))
plot(tree_negative, main = "Negative branch lengths")
plot(tree_positive, main = "Positive branch lengths")

Rescaling and centering disparity results.

Description

Scales or/and centers the disparity measurements.

Usage

## S3 method for class 'dispRity'
scale(x, center = FALSE, scale = TRUE, ...)
## S3 method for class 'dispRity'
scale(x, center = FALSE, scale = TRUE, ...)

Arguments

`x`	a `dispRity` object.
`center`	either a `logical` value or a `numeric` vector of length equal to the number of elements of `data` (default is `FALSE`).
`scale`	either a `logical` value or a `numeric` vector of length equal to the number of elements of `data` (default is `TRUE`).
`...`	optional arguments to be passed to `scale`.

Details

To scale or and center using the full distribution (i.e. all the disparity values) or only the distribution within each subsets of bootstraps you can use the optional argument use.all as a logical. By default is use.all = TRUE and uses all the disparity values not only the ones in the subset.

Author(s)

Thomas Guillerme

Examples

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## Scaling the data
summary(scale.dispRity(disparity, scale = TRUE)) # Dividing by the maximum
## Multiplying by 10 (dividing by 0.1)
summary(scale.dispRity(disparity, scale = 0.1))

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## Scaling the data
summary(scale.dispRity(disparity, scale = TRUE)) # Dividing by the maximum
## Multiplying by 10 (dividing by 0.1)
summary(scale.dispRity(disparity, scale = 0.1))

Selects ordination axes

Description

Selects the axes required to explain a cumulative threshold amount of variance in an ordination (e.g. > 95%).

Usage

select.axes(data, group, threshold = 0.95, inc.threshold = TRUE)
select.axes(data, group, threshold = 0.95, inc.threshold = TRUE)

Arguments

`data`	The trait space to analyse. This can be either a `"matrix"`, `"prcomp"`, `"princomp"` or a `"dispRity"` object.
`group`	Optional, either a `list` of row numbers or names to be used as different groups or a `data.frame` with the same $k$ elements as in `data` as rownames. If `data` is a `"dispRity"` object that already contains groups, the `group` argument is recycled.
`threshold`	The arbitrary threshold amount of variance (by default this is `0.95`).
`inc.threshold`	Logical, whether to output the axes that contain the threshold value (`TRUE`; default) or not (`FALSE`). See details. , i.e. the axes necessary to include at least the threshold value

Details

If inc.threshold = TRUE, the returned axes are the ones that contains at least the threshold value (e.g. if the threshold is 0.95, all the returned axes contain at least 0.95 of the variance, potentially more). If inc.threshold = FALSE, the returned axes are the ones before reaching this threshold (e.g. the cumulative variance returned is strictly less or equal to 0.95).

Value

A "dispRity", "axes" object that can be printed, summarised and plot through generic print, summary and plot functions. The object is a list containing:

$dimensions: the maximum number of dimensions selected across all groups;
$dim.list: the selected dimensions per group;
$var: the variance per axes per group;
$scaled.var: the variance scaled variance per axes per group;
$cumsum.var: the cumulative scaled variance per axes per group;
$call: a list containing the $threshold value and the $inc.threshold option used.

Author(s)

Thomas Guillerme

Examples

## Ordinating the USArrests dataset
ordination <- princomp(USArrests, cor = TRUE)
## Which dimensions to select?
(selected <- select.axes(ordination))
## The selected dimensions
selected$dimensions
## Visualising the results
plot(selected)

## Same but by grouping the data into three groups
states_groups <- list("Group1" = c("Mississippi","North Carolina",
                                   "South Carolina", "Georgia", "Alabama",
                                   "Alaska", "Tennessee", "Louisiana"),
                      "Group2" = c("Florida", "New Mexico", "Michigan",
                                   "Indiana", "Virginia", "Wyoming", "Montana",
                                   "Maine", "Idaho", "New Hampshire", "Iowa"),
                      "Group3" = c("Rhode Island", "New Jersey", "Hawaii",
                                   "Massachusetts"))
(selected <- select.axes(ordination, group = states_groups))
## Note that the required number of axes is now 4 (instead of 3)
plot(selected)

## Loading some example dispRity data
data(demo_data)
## How many axes are required to explain 99% of the variance
## for each group in the Healy et al 2019 data?
(how_many <- select.axes(demo_data$healy, threshold = 0.99))
summary(how_many)
plot(how_many)


## Ordinating the USArrests dataset
ordination <- princomp(USArrests, cor = TRUE)
## Which dimensions to select?
(selected <- select.axes(ordination))
## The selected dimensions
selected$dimensions
## Visualising the results
plot(selected)

## Same but by grouping the data into three groups
states_groups <- list("Group1" = c("Mississippi","North Carolina",
                                   "South Carolina", "Georgia", "Alabama",
                                   "Alaska", "Tennessee", "Louisiana"),
                      "Group2" = c("Florida", "New Mexico", "Michigan",
                                   "Indiana", "Virginia", "Wyoming", "Montana",
                                   "Maine", "Idaho", "New Hampshire", "Iowa"),
                      "Group3" = c("Rhode Island", "New Jersey", "Hawaii",
                                   "Massachusetts"))
(selected <- select.axes(ordination, group = states_groups))
## Note that the required number of axes is now 4 (instead of 3)
plot(selected)

## Loading some example dispRity data
data(demo_data)
## How many axes are required to explain 99% of the variance
## for each group in the Healy et al 2019 data?
(how_many <- select.axes(demo_data$healy, threshold = 0.99))
summary(how_many)
plot(how_many)

Adds root time to a tree

Description

Adds or replace root time to a tree by calculating it's root's depth

Usage

set.root.time(tree, present = 0)
set.root.time(tree, present = 0)

Arguments

`tree`	A `phylo`, `mutiPhylo` or `dispRity` object that contains trees.
`present`	The age of the most recent tip. By default this is set to `0`.

Examples

## A random tree with no root.time
my_tree <- rtree(10)
my_tree$root.time # is NULL
## Adding a root time
my_tree <- set.root.time(my_tree)
my_tree$root.time # is not NULL
## Rewrite the root time with a different present
my_tree <- set.root.time(my_tree, present = 10)
my_tree$root.time # is older

## A random tree with no root.time
my_tree <- rtree(10)
my_tree$root.time # is NULL
## Adding a root time
my_tree <- set.root.time(my_tree)
my_tree$root.time # is not NULL
## Rewrite the root time with a different present
my_tree <- set.root.time(my_tree, present = 10)
my_tree$root.time # is older

Simulates morphological data.

Description

Generates a morphological matrix using rTraitDisc or gen.seq.HKY functions.

Usage

sim.morpho(
  tree,
  characters,
  states = 1,
  model = "ER",
  rates,
  substitution = c(stats::runif, 2, 2),
  invariant = TRUE,
  verbose = FALSE
)
sim.morpho(
  tree,
  characters,
  states = 1,
  model = "ER",
  rates,
  substitution = c(stats::runif, 2, 2),
  invariant = TRUE,
  verbose = FALSE
)

Arguments

`tree`	A phylogenetic tree to use for generating the characters.
`characters`	The number of morphological characters to generate.
`states`	A `numeric` string of probabilities for the number of states for each character (`default = 1`; i.e. 100% binary state characters; see details).
`model`	Either an implemented (`"ER"`, `"HKY"` or `"MIXED"`) or user defined model (see details).
`rates`	A function an its parameters for the rates distribution (see details).
`substitution`	A function an its parameters for the substitutions distribution (see details; `default = c(runif, 2, 2)`).
`invariant`	`logical`, whether to allow any invariant sites (`default = TRUE`).
`verbose`	Whether to be verbose or not (`default = FALSE`).

Details

The model arguments must be either a user's defined function for generating the discrete morphological characters (that takes the states, rates and substitution arguments) or one of the two following:
- "ER" uses the ape::rTraitDisc function with the "ER" model argument (= Mk model).
- "HKY" uses the phyclust::gen.seq.HKY function with kappa sampled from the substitution argument, pi = runif(4) (divided by sum(runif(4))), rate.scale sampled from the rates distribution and L being the number of characters and transforms the purines (A, G) into 0 and the pyrimidines (C, T) into 1.
- "MIXED" randomly uses "ER" or "HKY" for binary characters and "ER" for any character with more than two states.
- the user defined model must be a function that generates a single discrete morphological character and takes one element from at least the following arguments: tree, states, rates, substitution.
The states argument attributes a number of states to each character by using the given probability vector for each number of states starting from two. For example states = c(0.7, 0.2, 0.1) will generate 70% of characters with two states, 20% of characters with three states and 10% of characters with four states.
The rates and substitution arguments require a function that outputs a distribution and its optional parameters. For example rates = c(runif, 1, 10) creates a uniform distribution between 1 and 10 for the rates distribution.

Author(s)

Thomas Guillerme

Examples


set.seed(4)
## A random tree with 15 tips
tree <- rcoal(15)
## Setting up the parameters
my_rates = c(rgamma, rate = 10, shape = 5)
my_substitutions = c(runif, 2, 2)

## HKY binary (15*50)
matrixHKY <- sim.morpho(tree, characters = 50, model = "HKY",
     rates = my_rates, substitution = my_substitutions)

## Mk matrix (15*50) (for Mkv models)
matrixMk <- sim.morpho(tree, characters = 50, model = "ER", rates = my_rates) 

## Mk invariant matrix (15*50) (for Mk models)
matrixMk <- sim.morpho(tree, characters = 50, model = "ER", rates = my_rates,
     invariant = FALSE)

## MIXED model invariant matrix (15*50)
matrixMixed <- sim.morpho(tree, characters = 50, model = "MIXED",
     rates = my_rates, substitution = my_substitutions,  invariant = FALSE,
     verbose = TRUE)

set.seed(4)
## A random tree with 15 tips
tree <- rcoal(15)
## Setting up the parameters
my_rates = c(rgamma, rate = 10, shape = 5)
my_substitutions = c(runif, 2, 2)

## HKY binary (15*50)
matrixHKY <- sim.morpho(tree, characters = 50, model = "HKY",
     rates = my_rates, substitution = my_substitutions)

## Mk matrix (15*50) (for Mkv models)
matrixMk <- sim.morpho(tree, characters = 50, model = "ER", rates = my_rates) 

## Mk invariant matrix (15*50) (for Mk models)
matrixMk <- sim.morpho(tree, characters = 50, model = "ER", rates = my_rates,
     invariant = FALSE)

## MIXED model invariant matrix (15*50)
matrixMixed <- sim.morpho(tree, characters = 50, model = "MIXED",
     rates = my_rates, substitution = my_substitutions,  invariant = FALSE,
     verbose = TRUE)

Time slicing a tree.

Description

Time slicing through a phylogenetic tree.

Usage

slice.tree(tree, age, model, FAD, LAD, keep.all.ancestors = FALSE)
slice.tree(tree, age, model, FAD, LAD, keep.all.ancestors = FALSE)

Arguments

`tree`	A `phylo` object with a `root.time` element.
`age`	A single `numeric` value indicating where to perform the slice.
`model`	One of the following models: `"acctran"`, `"deltran"`, `"random"`, `"proximity"`, `"equal.split"` or `"gradual.split"`. Is ignored if `method = "discrete"`. See `chrono.subsets` for the models description.
`FAD`, `LAD`	The first and last occurrence data.
`keep.all.ancestors`	Optional, whether to also include the ancestors of the tree slice (`TRUE`) or just the ones linking the elements present at the slice (`FALSE`; default)

Author(s)

Thomas Guillerme

References

Guillerme T. & Cooper N. 2018. Time for a rethink: time sub-sampling methods in disparity-through-time analyses. Palaeontology. DOI: 10.1111/pala.12364.

Examples

set.seed(1)
## Generate a random ultrametric tree
tree <- rtree(20)

## Add some node labels
tree$node.label <- letters[1:19]

## Add its root time
tree$root.time <- max(tree.age(tree)$ages)

## Slice the tree at age 1.5
tree_slice <- slice.tree(tree, age = 1.5, "deltran")

## The slice at age 0.5 but keeping all the ancestors
deep_slice <- slice.tree(tree, age = 0.5, "deltran",
                            keep.all.ancestors = TRUE)

## Visualising the trees
old_par <- par(mfrow = c(2,2))
plot(ladderize(tree), main = "full tree"); axisPhylo()
abline(v =  tree$root.time - 1.5)
plot(ladderize(tree_slice), main = "tree slice"); axisPhylo()
plot(ladderize(deep_slice), main = "slice with ancestors"); axisPhylo()

par(old_par)

set.seed(1)
## Generate a random ultrametric tree
tree <- rtree(20)

## Add some node labels
tree$node.label <- letters[1:19]

## Add its root time
tree$root.time <- max(tree.age(tree)$ages)

## Slice the tree at age 1.5
tree_slice <- slice.tree(tree, age = 1.5, "deltran")

## The slice at age 0.5 but keeping all the ancestors
deep_slice <- slice.tree(tree, age = 0.5, "deltran",
                            keep.all.ancestors = TRUE)

## Visualising the trees
old_par <- par(mfrow = c(2,2))
plot(ladderize(tree), main = "full tree"); axisPhylo()
abline(v =  tree$root.time - 1.5)
plot(ladderize(tree_slice), main = "tree slice"); axisPhylo()
plot(ladderize(deep_slice), main = "slice with ancestors"); axisPhylo()

par(old_par)

Stretching a tree

Description

Stretches a phylogenetic tree at a particular node

Usage

slide.nodes(nodes, tree, slide, allow.negative.root = FALSE)
slide.nodes(nodes, tree, slide, allow.negative.root = FALSE)

Arguments

`nodes`	A list of the ID nodes to slide (`"integer"`) or names (`"character"`). The first node is `ape::Ntip(tree) + 1`, etc.
`tree`	a `"phylo"` object.
`slide`	the sliding value.
`allow.negative.root`	logical, whether to allow negative branch lengths and moving the root node (`TRUE`) or not (`FALSE`; default).

Details

The sliding works by subtracting the slide value to the branch leading to the node and adding it to the descendant branches. Note that the slide value can be negative to slide nodes the other way (up); the only requirement is that the slide does not lead to negative branch length values.

Value

A "phylo" object.

Author(s)

Thomas Guillerme

Examples

set.seed(42)
## Generating a coalescent tree
tree <- rcoal(5)

## Stretching node 8 up and down
tree_slide_up <- slide.nodes(8, tree, slide = 0.075)
tree_slide_down <- slide.nodes(8, tree, slide = -0.075)

## Display the results
par(mfrow = c(3,1))
plot(tree) ; axisPhylo() ; nodelabels()
plot(tree_slide_up) ; axisPhylo() ; nodelabels()
plot(tree_slide_down) ; axisPhylo() ; nodelabels()

## Stretching many nodes
set.seed(42)
tree <- rtree(50)
move_nodes <- c(99, 93, 53, 86, 58, 63, 60, 84)
tree_slided <- slide.nodes(move_nodes, tree, slide = 0.07)

## Display the results
par(mfrow = c(2, 1))
node_colors <- c("lightblue", "orange")[((1:Nnode(tree))+Ntip(tree)) %in% move_nodes + 1]
plot(tree, show.tip.label = FALSE) ; axisPhylo()
nodelabels(bg = node_colors, cex = 0.5)
plot(tree_slided, show.tip.label = FALSE) ; axisPhylo()
nodelabels(bg = node_colors, cex = 0.5)

set.seed(42)
## Generating a coalescent tree
tree <- rcoal(5)

## Stretching node 8 up and down
tree_slide_up <- slide.nodes(8, tree, slide = 0.075)
tree_slide_down <- slide.nodes(8, tree, slide = -0.075)

## Display the results
par(mfrow = c(3,1))
plot(tree) ; axisPhylo() ; nodelabels()
plot(tree_slide_up) ; axisPhylo() ; nodelabels()
plot(tree_slide_down) ; axisPhylo() ; nodelabels()

## Stretching many nodes
set.seed(42)
tree <- rtree(50)
move_nodes <- c(99, 93, 53, 86, 58, 63, 60, 84)
tree_slided <- slide.nodes(move_nodes, tree, slide = 0.07)

## Display the results
par(mfrow = c(2, 1))
node_colors <- c("lightblue", "orange")[((1:Nnode(tree))+Ntip(tree)) %in% move_nodes + 1]
plot(tree, show.tip.label = FALSE) ; axisPhylo()
nodelabels(bg = node_colors, cex = 0.5)
plot(tree_slided, show.tip.label = FALSE) ; axisPhylo()
nodelabels(bg = node_colors, cex = 0.5)

Sorting or ordering a `dispRity` object.

Description

Sort (or order) the subsets of a dispRity object.

Usage

## S3 method for class 'dispRity'
sort(x, decreasing = FALSE, sort, ...)
## S3 method for class 'dispRity'
sort(x, decreasing = FALSE, sort, ...)

Arguments

`x`	A `dispRity` object.
`decreasing`	`logical`. Should the sort be in ascending or descending order? Is ignored if `sort` is used.
`sort`	An optional `vector` of `numeric` values corresponding to the order in which to return the subsets.
`...`	optional arguments to be passed to `sort`.

Author(s)

Thomas Guillerme

Examples

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## Sorting the data
summary(disparity)
summary(sort(disparity, decreasing = TRUE))
summary(sort(disparity, sort = c(7,1,3,4,5,2,6)))

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## Sorting the data
summary(disparity)
summary(sort(disparity, decreasing = TRUE))
summary(sort(disparity, sort = c(7,1,3,4,5,2,6)))

Creating multidimensional spaces

Description

Creates a multidimensional space with a given number of elements and dimensions

Usage

space.maker(
  elements,
  dimensions,
  distribution,
  arguments = NULL,
  cor.matrix = NULL,
  scree = NULL,
  elements.names = NULL,
  replicates = NULL
)
space.maker(
  elements,
  dimensions,
  distribution,
  arguments = NULL,
  cor.matrix = NULL,
  scree = NULL,
  elements.names = NULL,
  replicates = NULL
)

Arguments

`elements`	An `numeric` value.
`dimensions`	An `numeric` value smaller than `elements`.
`distribution`	One or more `functions` to determine the distribution of the `elements` along each `dimension`. The function must have a single input: `elements`.
`arguments`	Optional `list` of arguments to be passed to the distributions functions in the order they appear (`default = NULL`, see details).
`cor.matrix`	An optional correlation `matrix` of size `dimensions * dimensions` (`default = NULL`, see details).
`scree`	An optional proportional `numeric` vector for approximating the `dimensions` variance (`default = NULL`, see details).
`elements.names`	Optional, a `character` or `integer` string for naming the elements in the matrix.
`replicates`	Optional, an `integer` to replicate the simulations and generating multiple spaces.

Details

When passing additional arguments to different distributions, these must be given as a list to each function in the order they appear. For example if distribution = c(runif, rnorm, rgamma) and one wants the distributions to be runif(elements, min = 1, max = 10), rnorm(elements, mean = 8) and rgamma(elements, shape = 1, log = TRUE), the additional arguments should be passed as c(list(min = 1, max = 10), list(mean = 8), list(shape = 1, log = TRUE). If no arguments have to be passed to a certain function, it can be left as NULL (e.g. c(list(min = 1, max = 10), list(NULL), list(shape = 1, log = TRUE)).

The cor.matrix argument should be a correlation matrix between the dimensions. If not NULL, the multidimensional space is multiplied by the the Choleski decomposition (chol) of the correlation matrix. The scree argument is simply a value multiplier for each dimension to adjust their variance to approximate the scree one.

Author(s)

Thomas Guillerme

Examples

## A square space
plot(space.maker(5000, 2, runif), pch = 20)

## A circular space
plot(space.maker(5000, 2, rnorm), pch = 20)

## A 2-dimensional cylindrical space
plot(space.maker(5000, 2, c(rnorm, runif)), pch = 20)

## A 4-dimensional space with different distributions
space.maker(5, 4, c(runif, runif, rnorm, rgamma),
     arguments = list(list(min = 1, max = 10), list(min = 1, max = 2),
     list(mean = 8), list(shape = 1)))

## A 3-dimensional correlated space
cor_matrix <- matrix(cbind(1, 0.8 ,0.2, 0.8, 1, 0.7, 0.2, 0.7, 1), nrow = 3)
space <- space.maker(10000, 3, rnorm, cor.matrix = cor_matrix)
round(cor(space), 1) ; cor_matrix ## Both should be really similar matrices

## A 3-dimensional space with a priori approximated variance for each dimension
space <- space.maker(10000, 3, rnorm, scree = c(0.6, 0.3, 0.1))
## The resulting screeplot
barplot(apply(space, 2, var))

## Generate 3 2D normal spaces with rownames 
space.maker(10, 2, rnorm, elements.names = letters[1:10], replicates = 3)

## Not run: 
require(scatterplot3d)
## A cube space
scatterplot3d(space.maker(5000, 3, runif), pch = 20)

## A plane space
scatterplot3d(space.maker(5000, 3, c(runif, runif, runif),
     arguments = list(list(min = 0, max = 0), NULL, NULL)), pch = 20)

## A sphere space
scatterplot3d(space.maker(5000, 3, rnorm), pch = 20)

## A 3D cylindrical space
scatterplot3d(space.maker(5000, 3, c(rnorm, rnorm, runif)), pch = 20)

## Generating a doughnut space
doughnut <- space.maker(5000, 3, c(rnorm, random.circle),
     arguments = list(list(mean = 0), list(runif, inner = 0.5, outer = 1)))
## Reodering the axis for projecting the dougnut in 2D
scatterplot3d(doughnut[,c(2,1,3)], pch = 20)

## End(Not run)

## A square space
plot(space.maker(5000, 2, runif), pch = 20)

## A circular space
plot(space.maker(5000, 2, rnorm), pch = 20)

## A 2-dimensional cylindrical space
plot(space.maker(5000, 2, c(rnorm, runif)), pch = 20)

## A 4-dimensional space with different distributions
space.maker(5, 4, c(runif, runif, rnorm, rgamma),
     arguments = list(list(min = 1, max = 10), list(min = 1, max = 2),
     list(mean = 8), list(shape = 1)))

## A 3-dimensional correlated space
cor_matrix <- matrix(cbind(1, 0.8 ,0.2, 0.8, 1, 0.7, 0.2, 0.7, 1), nrow = 3)
space <- space.maker(10000, 3, rnorm, cor.matrix = cor_matrix)
round(cor(space), 1) ; cor_matrix ## Both should be really similar matrices

## A 3-dimensional space with a priori approximated variance for each dimension
space <- space.maker(10000, 3, rnorm, scree = c(0.6, 0.3, 0.1))
## The resulting screeplot
barplot(apply(space, 2, var))

## Generate 3 2D normal spaces with rownames 
space.maker(10, 2, rnorm, elements.names = letters[1:10], replicates = 3)

## Not run: 
require(scatterplot3d)
## A cube space
scatterplot3d(space.maker(5000, 3, runif), pch = 20)

## A plane space
scatterplot3d(space.maker(5000, 3, c(runif, runif, runif),
     arguments = list(list(min = 0, max = 0), NULL, NULL)), pch = 20)

## A sphere space
scatterplot3d(space.maker(5000, 3, rnorm), pch = 20)

## A 3D cylindrical space
scatterplot3d(space.maker(5000, 3, c(rnorm, rnorm, runif)), pch = 20)

## Generating a doughnut space
doughnut <- space.maker(5000, 3, c(rnorm, random.circle),
     arguments = list(list(mean = 0), list(runif, inner = 0.5, outer = 1)))
## Reodering the axis for projecting the dougnut in 2D
scatterplot3d(doughnut[,c(2,1,3)], pch = 20)

## End(Not run)

dispRity object summary

Description

Creates a summary of a dispRity object.

Usage

## S3 method for class 'dispRity'
summary(
  object,
  ...,
  quantiles = c(50, 95),
  cent.tend = median,
  recall = FALSE,
  digits
)
## S3 method for class 'dispRity'
summary(
  object,
  ...,
  quantiles = c(50, 95),
  cent.tend = median,
  recall = FALSE,
  digits
)

Arguments

`object`	A `dispRity` object.
`...`	Additional arguments to be passed to `summary` or `cent.tend`.
`quantiles`	The quantiles to display (default is `quantiles = c(50, 95)`; is ignored if the `dispRity` object is not bootstrapped).
`cent.tend`	A function for summarising the bootstrapped disparity values (default is `median`).
`recall`	`logical` value specifying whether to recall the `dispRity` parameters input (default = `FALSE`).
`digits`	Optional, a value for digits the values in the output table (default = 2).

Details

If the dispRity object to summarise comes from a chrono.subsets using a "multiPhylo" object, the displayed number of observations (n) corresponds to the maximum number of observation at the specific time slice (some slices through some trees might have less observations).

Value

A data.frame with:

`subsets`	the subset names.
`n`	the maximum number of elements in each subset (see details).
`observed`	the observed disparity or the the observed central tendency (<cent_tend>) of disparity (`obs.<cent_tend>`).
`bootstraps...`	if `data` is bootstrapped, the bootstrapped disparity's central tendency (`bs.<cent_tend>`) and the quantiles of the bootstrapped disparities (or, if `data` is not bootstrapped but disparity is calculated as a distribution - see `dispRity`) - the quantiles of the observed disparity are displayed).

Author(s)

Thomas Guillerme

Examples

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## Summarising the results
summary(disparity) # default
## Using different options
summary(disparity, quantiles = 75, cent.tend = mean, digits = 8,
     recall = TRUE)

## Load the disparity data based on Beck & Lee 2014
data(disparity)

## Summarising the results
summary(disparity) # default
## Using different options
summary(disparity, quantiles = 75, cent.tend = mean, digits = 8,
     recall = TRUE)

Testing disparity hypotheses

Description

Applying statistical tests to dispRity objects

Usage

test.dispRity(
  data,
  test,
  comparisons = "pairwise",
  rarefaction = NULL,
  correction = "none",
  concatenate = TRUE,
  conc.quantiles = c(mean, c(95, 50)),
  details = FALSE,
  ...
)
test.dispRity(
  data,
  test,
  comparisons = "pairwise",
  rarefaction = NULL,
  correction = "none",
  concatenate = TRUE,
  conc.quantiles = c(mean, c(95, 50)),
  details = FALSE,
  ...
)

Arguments

`data`	A `dispRity` object.
`test`	A test `function` to apply to the data.
`comparisons`	If data contains more than two subsets, the type of comparisons to apply: either `"pairwise"` (default), `"referential"`, `"sequential"`, `"all"` or a list of pairs of subset names/number to compare (see details).
`rarefaction`	A `numeric` value indicating whether to use a specific rarefaction level (default = `NULL`).
`correction`	Which p-value correction to apply to `htest` category test (see `p.adjust`; default = `"none"`).
`concatenate`	Logical, whether to concatenate bootstrapped disparity values (`TRUE`; default) or to apply the test to each bootstrapped value individually (`FALSE`).
`conc.quantiles`	If `concatenate = TRUE`, must be a central tendency function and a vector of quantiles (default = `c(mean, c(95, 50))`).
`details`	Whether to output the details of each test (non-formatted; default = `FALSE`).
`...`	Additional options to pass to the test `function`.

Details

The comparison argument can be:

"pairwise": pairwise comparisons of all the subsets (default).
"referential": compares the first subset to all the others.
"sequential": compares each subset sequentially (e.g. first against second, second against third, etc.).
"all": compares all the subsets simultaneously to the data (i.e. bootstrapped disparity ~ subsets names). This argument is used for lm or glm type tests.
A list of pairs of number of subsets to compare. Each element of the list must contain two elements (e.g. list(c("a","b"), ("b", "a")) to compare "a" to "b" and then "b" to "a").
If the called test is null.test, the comparison argument is ignored.

IMPORTANT: if you are performing multiple comparisons (e.g. when using "pairwise", "referential" or "sequential"), don't forget about the Type I error rate inflation. You might want to use a p-value correction (see p.adjust).

Author(s)

Thomas Guillerme

Examples

## Load the Beck & Lee 2014 data
data(BeckLee_mat50)
data(BeckLee_tree)

## Calculating the disparity from customised subsets
## Generating the subsets
groups <- crown.stem(BeckLee_tree, inc.nodes = FALSE)
customised_subsets <- custom.subsets(BeckLee_mat50, groups)
## Bootstrapping the data
bootstrapped_data <- boot.matrix(customised_subsets, bootstraps = 100)
## Calculating the sum of variances
sum_of_variances <- dispRity(bootstrapped_data, metric = c(sum, variances))

## Measuring the subset overlap
test.dispRity(sum_of_variances, bhatt.coeff, "pairwise")

## Measuring differences from a reference subset
test.dispRity(sum_of_variances, wilcox.test, "referential")

## Measuring disparity as a distribution
disparity_var <- dispRity(bootstrapped_data, metric = variances)
## Differences between the concatenated bootstrapped values of the subsets
test.dispRity(disparity_var, test = t.test, comparisons = "pairwise",
     concatenate = TRUE, correction = "bonferroni")
## Differences between the subsets bootstrapped
test.dispRity(disparity_var, test = t.test, comparisons = "pairwise",
     concatenate = FALSE, correction = "bonferroni",
     conc.quantiles = c(mean, c(95, 5)))

## Load the Beck & Lee 2014 data
data(BeckLee_mat50)
data(BeckLee_tree)

## Calculating the disparity from customised subsets
## Generating the subsets
groups <- crown.stem(BeckLee_tree, inc.nodes = FALSE)
customised_subsets <- custom.subsets(BeckLee_mat50, groups)
## Bootstrapping the data
bootstrapped_data <- boot.matrix(customised_subsets, bootstraps = 100)
## Calculating the sum of variances
sum_of_variances <- dispRity(bootstrapped_data, metric = c(sum, variances))

## Measuring the subset overlap
test.dispRity(sum_of_variances, bhatt.coeff, "pairwise")

## Measuring differences from a reference subset
test.dispRity(sum_of_variances, wilcox.test, "referential")

## Measuring disparity as a distribution
disparity_var <- dispRity(bootstrapped_data, metric = variances)
## Differences between the concatenated bootstrapped values of the subsets
test.dispRity(disparity_var, test = t.test, comparisons = "pairwise",
     concatenate = TRUE, correction = "bonferroni")
## Differences between the subsets bootstrapped
test.dispRity(disparity_var, test = t.test, comparisons = "pairwise",
     concatenate = FALSE, correction = "bonferroni",
     conc.quantiles = c(mean, c(95, 5)))

Test disparity metric

Description

Test whether a metric captures changes trait space size, density and position.

Usage

test.metric(
  data,
  metric,
  ...,
  shifts,
  shift.options,
  model,
  replicates = 3,
  steps = 10,
  dimensions,
  verbose = FALSE,
  save.steps = FALSE
)
test.metric(
  data,
  metric,
  ...,
  shifts,
  shift.options,
  model,
  replicates = 3,
  steps = 10,
  dimensions,
  verbose = FALSE,
  save.steps = FALSE
)

Arguments

`data`	A matrix or a `dispRity` object (see details).
`metric`	A vector containing one to three functions. At least of must be a dimension-level 1 or 2 function (see details). If `data` is a `dispRity` object with disparity already calculated, this argument can be left empty (and the one from `data` is recycled)
`...`	Optional arguments to be passed to the metric.
`shifts`	The types of shits to test, can be `"random"`, `"size"`, `"density"`, `"evenness"` and `"position"`. See details.
`shift.options`	Optional, a `list` of named arguments to be passed to `reduce.space`
`model`	Optional, which model to fit for testing the metric. See details.
`replicates`	A `numeric` number of replicates to increase variance. By default `replicates = 3`. If `replicates = 1`, the `model` is not run.
`steps`	The number of steps in the space reduction to output between 10% and 100%. By default `steps = 10`.
`dimensions`	Optional, a `numeric` value or proportion of the dimensions to keep.
`verbose`	A `logical` value indicating whether to be verbose (`TRUE`) or not (`FALSE`; default).
`save.steps`	A `logical` value indicating whether to save the data for visualising the the shift steps if plotting the results (`TRUE`) or not (`FALSE`; default).

Details

For the three non-random shifts: "size", "density", "evenness" and "position", the function returns both of shifts as:

"size.inner" and "size.outer" removing data from the edges or the centre respectively (contracting the size and "hollowing" it respectively).
"density.higher" and "density.lower" removing data to increase or decrease density respectively (increasing/decreasing nearest neighbour distance).
"evenness.flattened" and "evenness.compacted" removing data to from the centre of the distribution or from the edges to resepectively "flatten" or "condense" the distribution.
"position.top" and "position.bottom" removing data from one side or the other of the space (the sides are selected from the point with lowest/highest scores on each dimensions respectively).

See figure 2 in Guillerme et al. 2020 for more details.

The default model is a linear model using the following function: model = function(data) lm(disparity ~ reduction, data) You can provide your own as long as it is a single function with data as a single argument. The two terms from data should be called reduction for the variable on the x axis and disparity for the variable on the y axis. For example: model = function(data) nls(disparity ~ a*reduction/(b+reduction), data) Note that models (like this example) should be specific to the dataset. Any type of model can be fitted but only the ones with an associated summary function will be correctly displayed by summary.dispRity. To not run any model, use model = NULL.

Value

This function outputs a dispRity object containing a list of simulated reductions in trait space. The results can be accessed through the usual S3 methods (print, summary, plot) or accessed directly through x$<name_of_the_shift> (e.g. x$random for the random shift results).

Author(s)

Thomas Guillerme

References

Examples

## Creating a 2D uniform space
space <- space.maker(300, 2, runif)

## A simple test with only 1 replicate for two shifts (random and size):
simple_test <- test.metric(space, metric = c(prod, ranges),
                           replicates = 1, shifts = c("random", "size")) 

## Summarising the tests
summary(simple_test)

## Visualising the test
plot(simple_test)

## Applying the test directly on a disparity object
data(disparity)
median_centroid_test <- test.metric(disparity, shifts = "size")

## Summarising the tests
summary(median_centroid_test)

## Visualising the test
plot(median_centroid_test)

## Not run: 
## Note that the tests can take several minutes to run.

## Testing the sum of variance on all shifts 
sum_var_test <- test.metric(space, metric = c(sum, variances),
                            shifts = c("random", "size", "density", "position"))

## Summarising the tests
summary(sum_var_test)

## Visualising the test
plot(sum_var_test)

## Creating a 2D uniform space
space <- space.maker(300, 2, runif)

## Re-running the test on two shifts with data saving for visualisation
median_centroid_test <- test.metric(space,
                                    metric = c(median, centroids),
                                    shifts = c("random", "size"),
                                    save.steps = TRUE)

## Visualising the tests results and display the shifts visualisation
plot(median_centroid_test)

## End(Not run)
 
## Creating a 2D uniform space
space <- space.maker(300, 2, runif)

## A simple test with only 1 replicate for two shifts (random and size):
simple_test <- test.metric(space, metric = c(prod, ranges),
                           replicates = 1, shifts = c("random", "size")) 

## Summarising the tests
summary(simple_test)

## Visualising the test
plot(simple_test)

## Applying the test directly on a disparity object
data(disparity)
median_centroid_test <- test.metric(disparity, shifts = "size")

## Summarising the tests
summary(median_centroid_test)

## Visualising the test
plot(median_centroid_test)

## Not run: 
## Note that the tests can take several minutes to run.

## Testing the sum of variance on all shifts 
sum_var_test <- test.metric(space, metric = c(sum, variances),
                            shifts = c("random", "size", "density", "position"))

## Summarising the tests
summary(sum_var_test)

## Visualising the test
plot(sum_var_test)

## Creating a 2D uniform space
space <- space.maker(300, 2, runif)

## Re-running the test on two shifts with data saving for visualisation
median_centroid_test <- test.metric(space,
                                    metric = c(median, centroids),
                                    shifts = c("random", "size"),
                                    save.steps = TRUE)

## Visualising the tests results and display the shifts visualisation
plot(median_centroid_test)

## End(Not run)

Extracting the age of nodes and tips in a tree.

Description

Extracting the age of each node and tip in a tree give the height of the tree or some specified age.

Usage

tree.age(tree, age, order = "past", fossil = TRUE, digits = 4)
tree.age(tree, age, order = "past", fossil = TRUE, digits = 4)

Arguments

`tree`	A `phylo` object.
`age`	The age of the tree. If missing the age is set to be the tree height.
`order`	Either "past" if the units express time since the present (e.g. million years ago), or "present" if the unit is expressed in time since the root.
`fossil`	`logical`, whether to always consider the tree as containing at least one living taxa (`TRUE`) or allowing only fossil taxa (`FALSE` - default), see details.
`digits`	A `numeric` value or `integer` for the precision of the output.

Details

When fossil = TRUE, if the tree contains a tree$root.time element (for tree's root age), and that order is set to "past", the output ages are adjusted to be starting from the root.time. Else, if no tree$root.time exists or fossil = FALSE, tips and nodes age is relative from the tip furthest away from the root. THIS FUNCTION DOES NOT ESTIMATE TREE AGES, it just extracts branch length information and converts it into time units. For basic dating functions in R, check chronos, chronopl, chronoMPL or use more specialised dating software (e.g. MrBayes, BEAST, RAxML, etc.).

Author(s)

Thomas Guillerme

Examples

## A dated random phylogeny with a root 50 units of time old.
tree.age(rtree(10), age = 50)
## A random tree with the distance since the root.
tree.age(rtree(10), order = 'present')

## A dated random phylogeny with a root 50 units of time old.
tree.age(rtree(10), age = 50)
## A random tree with the distance since the root.
tree.age(rtree(10), order = 'present')

Package 'dispRity'

Help Index

Measuring Disparity in R

Description

Author(s)

Add, remove or get trees (or subtrees)

Description

Usage

Arguments

Details

Author(s)

See Also

Examples

adonis dispRity (from vegan::adonis2)

Description

Usage

Arguments

Details

Author(s)

References

See Also

Examples

Apply inapplicable characters to a matrix.

Description

Usage

Arguments

Details

Author(s)

See Also

Examples

as.covar

Description

Usage

Arguments

Details

Author(s)

See Also

Examples

Beck and Lee 2014 datasets

Description

Format

Details

References

See Also

BeckLee_disparity

Description

Format

Details

See Also

Examples

Bhattacharyya Coefficient

Description

Usage

Arguments

Author(s)

References

See Also

Examples

Bootstraps and rarefies data.

Description

Usage

Arguments

Details

Value

Author(s)

See Also

Examples

Character differences

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Charadriiformes

Description

Format

adonis dispRity (from `vegan::adonis2`)