Skip to contents

Maximum likehood estimation under Several examined and concealed States-dependent Speciation and Extinction (SecSSE) where some parameters are functions of other parameters and/or factors.

Usage

secsse_ml_func_def_pars(
  phy,
  traits,
  num_concealed_states,
  idparslist,
  idparsopt,
  initparsopt,
  idfactorsopt,
  initfactors,
  idparsfix,
  parsfix,
  idparsfuncdefpar,
  functions_defining_params = NULL,
  cond = "proper_cond",
  root_state_weight = "proper_weights",
  sampling_fraction,
  tol = c(1e-04, 1e-05, 1e-07),
  maxiter = 1000 * round((1.25)^length(idparsopt)),
  optimmethod = "simplex",
  num_cycles = 1,
  loglik_penalty = 0,
  is_complete_tree = FALSE,
  take_into_account_root_edge = FALSE,
  num_threads = 1,
  atol = 1e-08,
  rtol = 1e-06,
  method = "odeint::runge_kutta_cash_karp54",
  return_root_state = FALSE
)

Arguments

phy

phylogenetic tree of class phylo, rooted and with branch lengths. Alternatively, multiple phylogenetic trees can be provided as the multiPhylo class.

traits

vector with trait states for each tip in the phylogeny. The order of the states must be the same as the tree tips. For help, see vignette("starting_secsse", package = "secsse"). When providing a multiPhylo set of multiple phylognies, traits should be a list where each entry in the list corresponds to the matching phylogeny on that position.

num_concealed_states

number of concealed states, generally equivalent to the number of examined states in the dataset.

idparslist

overview of parameters and their values.

idparsopt

a numeric vector with the ID of parameters to be estimated.

initparsopt

a numeric vector with the initial guess of the parameters to be estimated.

idfactorsopt

id of the factors that will be optimized. There are not fixed factors, so use a constant within functions_defining_params.

initfactors

the initial guess for a factor (it should be set to NULL when no factors).

idparsfix

a numeric vector with the ID of the fixed parameters.

parsfix

a numeric vector with the value of the fixed parameters.

idparsfuncdefpar

id of the parameters which will be a function of optimized and/or fixed parameters. The order of id should match functions_defining_params.

functions_defining_params

a list of functions. Each element will be a function which defines a parameter e.g. id_3 <- (id_1 + id_2) / 2. See example.

cond

condition on the existence of a node root: "maddison_cond", "proper_cond" (default). For details, see vignette.

root_state_weight

the method to weigh the states: "maddison_weights", "proper_weights" (default) or "equal_weights". It can also be specified for the root state: the vector c(1, 0, 0) indicates state 1 was the root state. When using a multiPhylo object, root_state_weight should be list where each entry in the list corresponds to the root_state_weight for each tree.

sampling_fraction

vector that states the sampling proportion per trait state. It must have as many elements as there are trait states. When using a multiPhylo object, sampling fraction should be list where each entry in the list corresponds to the sampling proportion for each tree.

tol

A numeric vector with the maximum tolerance of the optimization algorithm. Default is c(1e-04, 1e-05, 1e-05).

maxiter

max number of iterations. Default is 1000 * round((1.25) ^ length(idparsopt)).

optimmethod

A string with method used for optimization. Default is "simplex". Alternative is "subplex". Both are called from DDD::optimizer(), simplex is implemented natively in DDD, while subplex is ultimately called from subplex::subplex().

num_cycles

Number of cycles of the optimization. When set to Inf, the optimization will be repeated until the result is, within the tolerance, equal to the starting values, with a maximum of 10 cycles.

loglik_penalty

the size of the penalty for all parameters; default is 0 (no penalty).

is_complete_tree

logical specifying whether or not a tree with all its extinct species is provided. If set to TRUE, it also assumes that all all extinct lineages are present on the tree. Defaults to FALSE.

take_into_account_root_edge

if TRUE, the LL integration is continued along the root edge. This also affects conditioning (as now, conditioning no longer needs to assume a speciation event at the start of the tree)

num_threads

number of threads to be used. Default is one thread.

atol

A numeric specifying the absolute tolerance of integration.

rtol

A numeric specifying the relative tolerance of integration.

method

integration method used, available are: "odeint::runge_kutta_cash_karp54", "odeint::runge_kutta_fehlberg78", "odeint::runge_kutta_dopri5", "odeint::bulirsch_stoer" and "odeint::runge_kutta4". Default method is: "odeint::runge_kutta_cash_karp54".

return_root_state

if TRUE, returns the state of the system at the root, this can be useful to use as the starting point of a simulation. When used in ML, after finishing the ML optimization, the found optimum is evaluated one more time to retrieve the root state (to avoid having to store the root state every ML evaluation).

Value

Parameter estimates and maximum likelihood

Examples

# Example of how to set the arguments for an ML search. The ML search is stopped
# after 10 iterations to keep run time short.
library(secsse)
library(DDD)
set.seed(16)
phylotree <- ape::rbdtree(0.07,0.001,Tmax=50)
startingpoint<-bd_ML(brts = ape::branching.times(phylotree))
#> You are optimizing lambda0 mu0 
#> You are fixing lambda1 mu1 
#> Optimizing the likelihood - this may take a while. 
#> The loglikelihood for the initial parameter values is -106.425581129189.
#> 
#> Maximum likelihood parameter estimates: lambda0: 0.056537, mu0: 0.000000, lambda1: 0.000000, mu1: 0.000000: 
#> Maximum loglikelihood: -104.592844
intGuessLamba <- startingpoint$lambda0
intGuessMu <- startingpoint$mu0
traits <- sample(c(0,1,2), ape::Ntip(phylotree),replace=TRUE) #get some traits
num_concealed_states<-3
idparslist<-id_paramPos(traits, num_concealed_states)
idparslist[[1]][c(1,4,7)] <- 1
idparslist[[1]][c(2,5,8)] <- 2
idparslist[[1]][c(3,6,9)] <- 3
idparslist[[2]][] <- 4
masterBlock <- matrix(c(5,6,5,6,5,6,5,6,5),ncol = 3,nrow = 3,byrow = TRUE)
diag(masterBlock) <- NA
diff.conceal <- FALSE
idparslist[[3]] <- q_doubletrans(traits,masterBlock,diff.conceal)
idparsfuncdefpar <- c(3,5,6)
idparsopt <- c(1,2)
idparsfix <- c(0,4)
initparsopt <- c(rep(intGuessLamba,2))
parsfix <- c(0,0)
idfactorsopt <- 1
initfactors <- 4
# functions_defining_params is a list of functions. Each function has no
# arguments and to refer
# to parameters ids should be indicated as "par_" i.e. par_3 refers to
# parameter 3. When a function is defined, be sure that all the parameters
# involved are either estimated, fixed or
# defined by previous functions (i.e, a function that defines parameter in
# 'functions_defining_params'). The user is responsible for this. In this
# exampl3, par_3 (i.e., parameter 3) is needed to calculate par_6. This is
# correct because par_3 is defined in
# the first function of 'functions_defining_params'. Notice that factor_1
# indicates a value that will be estimated to satisfy the equation. The same
# factor can be shared to define several parameters.
functions_defining_params <- list()
functions_defining_params[[1]] <- function(){
 par_3 <- par_1 + par_2
}
functions_defining_params[[2]] <- function(){
 par_5 <- par_1 * factor_1
}
functions_defining_params[[3]] <- function(){
 par_6 <- par_3 * factor_1
}

tol = c(1e-03, 1e-03, 1e-03)
optimmethod = "simplex"
cond<-"proper_cond"
root_state_weight <- "proper_weights"
sampling_fraction <- c(1,1,1)
maxiter <- 10
model <- secsse_ml_func_def_pars(phylotree,
traits,
num_concealed_states,
idparslist,
idparsopt,
initparsopt,
idfactorsopt,
initfactors,
idparsfix,
parsfix,
idparsfuncdefpar,
functions_defining_params,
cond,
root_state_weight,
sampling_fraction,
tol,
maxiter,
optimmethod,
num_cycles = 1)
#> Note: you set some transitions as impossible to happen.
#> 1 0.0565371031208431 0.0565371031208431 4 -138.298934553027 initial 
#> 2 0.0593639582768853 0.0509284124003145 4.20000000000001 -137.992309626735 expand 
#> 3 0.0537252944328596 0.0509284124003145 4.41666666666667 -137.649434791941 expand 
#> 4 0.0565371031208431 0.0453789553777978 4.90909090909091 -137.365580447333 expand 
#> 5 0.0565371031208431 0.0344540282301516 5.19047619047621 -136.872942801612 expand 
#> 6 0.051859055237338 0.0362589643945071 5.61016949152544 -136.832868258171 reflect 
#> 7 0.056223939100232 0.0267047506917309 6.31250000000001 -136.693662720223 reflect 
#> 8 0.056223939100232 0.0267047506917309 6.31250000000001 -136.693662720223 contract outside 
#> 9 0.056223939100232 0.0267047506917309 6.31250000000001 -136.693662720223 reflect 
#> 10 0.056223939100232 0.0267047506917309 6.31250000000001 -136.693662720223 contract inside 
#> 11 0.056223939100232 0.0267047506917309 6.31250000000001 -136.693662720223 contract outside 
#> Maximum number of iterations has been exceeded. 
#> Warning: Optimization has not converged. Try again with different initial values or increase the number of iterations.
print(model$ML)
#> NULL
# ML -136.45265 if run till completion