Evaluates prediction goodness of fit — evaluateModelFit • webSDM

Evaluate goodness of fit by comparing a true versus a predicted dataset of species distribution. Ypredicted is typically predicted using a prediction method of trophicSDM (in cross-validation if trophicSDM_CV() is used).

evaluateModelFit(tSDM, Ynew = NULL, Ypredicted = NULL)

Arguments

tSDM: A trophicSDMfit object obtained with trophicSDM().
Ynew: A sites x species matrix containing the true species occurrences state. If set to NULL (default), it is set to the species distribution data Y on which the model is fitted.
Ypredicted: A sites x species matrix containing the predicted species occurrences state. If set to NULL (default), it is set to the fitted values, i.e. predictions on the dataset used to train the model.

Value

A table specifying the goodness of fit metrics for each species. For presence-absence data, the model computes TSS and AUC. For Gaussian data, the R2.

References

Grace, J. B., Johnson, D. J., Lefcheck, J. S., and Byrnes, J. E. K.. 2018. Quantifying relative importance: computing standardized effects in models with binary outcomes. Ecosphere 9(6):e02283.

Author

Giovanni Poggiato

Examples

data(Y, X, G)
# define abiotic part of the model
env.formula = "~ X_1 + X_2"
# Run the model with bottom-up control using stan_glm as fitting method and no penalisation
# (set iter = 1000 to obtain reliable results)
m = trophicSDM(Y, X, G, env.formula, iter = 20,
               family = binomial(link = "logit"), penal = NULL, 
               mode = "prey", method = "stan_glm")
#> Warning: The largest R-hat is 1.69, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
#> Warning: Markov chains did not converge! Do not analyze results!
#> Warning: The largest R-hat is 1.53, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
#> Warning: Markov chains did not converge! Do not analyze results!
#> Warning: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See
#> https://mc-stan.org/misc/warnings.html#bfmi-low
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is 3.01, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
#> Warning: Markov chains did not converge! Do not analyze results!
#> Warning: There were 2 chains where the estimated Bayesian Fraction of Missing Information was low. See
#> https://mc-stan.org/misc/warnings.html#bfmi-low
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is 2.44, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
#> Warning: Markov chains did not converge! Do not analyze results!
#> Warning: There were 2 chains where the estimated Bayesian Fraction of Missing Information was low. See
#> https://mc-stan.org/misc/warnings.html#bfmi-low
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is 3.01, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
#> Warning: Markov chains did not converge! Do not analyze results!
#> Warning: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See
#> https://mc-stan.org/misc/warnings.html#bfmi-low
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is 2.37, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
#> Warning: Markov chains did not converge! Do not analyze results!
# Evaluate the quality of model predictions on the training
# Predict (fullPost = FALSE) as we used stan_glm to fit the model
# but here we are only intested in the posterior mean
Ypred = predict(m, fullPost = FALSE)
# format predictions to obtain a sites x species dataset whose
# columns are ordered as Ynew
Ypred = do.call(cbind,
                lapply(Ypred, function(x) x$predictions.mean))
                
Ypred = Ypred[,colnames(Y)]
evaluateModelFit(m, Ynew = Y, Ypredicted = Ypred)
#>         auc        tss species
#> 1 0.6733574 0.28030560      Y1
#> 2 0.7047471 0.31164779      Y2
#> 3 0.6686958 0.27789398      Y3
#> 4 0.5997744 0.16842425      Y4
#> 5 0.5710803 0.14031106      Y5
#> 6 0.5226098 0.07784933      Y6

# Note that this is equivalent to `evaluateModelFit(m)`
# If we fitted the model using "glm"
m = trophicSDM(Y, X, G, env.formula,
               family = binomial(link = "logit"), penal = NULL, 
               mode = "prey", method = "glm")
Ypred = predict(m, fullPost = FALSE)
# format predictions to obtain a sites x species dataset whose
# columns are ordered as Ynew
Ypred = do.call(cbind, Ypred)
Ypred = Ypred[,colnames(Y)]

evaluateModelFit(m, Ynew = Y, Ypredicted = Ypred)
#>         auc        tss species
#> 1 0.6735526 0.27860781      Y1
#> 2 0.7048508 0.31312676      Y2
#> 3 0.6690564 0.27931242      Y3
#> 4 0.6007825 0.18707494      Y4
#> 5 0.5965700 0.18247163      Y5
#> 6 0.5219250 0.07426284      Y6
# Note that this is equivalent to:
# \donttest{
evaluateModelFit(m)
#> You did not provide Ynew, the observed species distribution Y is used as default.
#> You did not provide Ypredicted, species predictions are obtained using predict()
#>         auc       tss species
#> 1 0.6735526 0.2786078      Y1
#> 2 0.7048508 0.3131268      Y2
#> 3 0.6690564 0.2793124      Y3
#> 4 0.6035269 0.1740171      Y4
#> 5 0.5758344 0.1360235      Y5
#> 6 0.5393439 0.0921997      Y6
# }