A model for presence-only data

This is the set-up of a conditional auto-logistic regressive model (CAR) for predicting species presence using a sample signal and a presence-only data.

Set-up

Let \(Sp \) be a species and \(Y \) and \(X \) two random variables corresponding to the events of: \(Sp \) is in location \(x_i\) and: location \(x_i \) has been sampled. (The variable \(X\) and \(x_i\) are not related)

\(Y \) and \(X \) are consider to be independent binary (Bernoulli) variables conditional to the latent processes \(S \) and \(P\) respectively.

Latent processes

These variables are modeled as this:

$$ logit(S_k) = \beta_s^t d_s(x_k) + \psi_k +O_k $$

$$ logit(P_k) = \beta_p^t d_p(x_k) + \psi_k + O_k $$

Where \(O_k\) is an offset term, \(d_p(x_k), d_s(x_k)\) are the covariates for p and s respectively; and \(\psi_k\) is modeled as a Gaussian Markov Random Field.

Common Gaussian Markov Field (CGMRF)

This process is modeled as this:

$$ \psi_k = \phi_k + \theta_k $$

$$ \phi_k | \phi_{-k}, \mathbb{W},\tau^2 \sim N \left( \frac{\sum_{i=1}^{K} w_{ki} \phi_i}{\sum_{i=1}^{K}w_{ki}}, \frac{\tau^2}{\sum_{i=1}^{K}w_{ki}}\right) $$
$$\theta_k \sim N\left(0, \sigma^2\right)$$

$$\tau^2, \sigma^2 \sim^{iid} Inv.Gamma(a,b) $$

Where \(\mathbb{W}\) is the adjacency matrix of the lattice, \(\theta_k\) is an independent noise term with constant variance. \(sigma^2\) and \(\tau^2\) are independent and identically distributed hyperparameters sampled from an inverse gamma distribution.

The corresponding Directed acyclic Graph can be seen in the next figure.

Implementation

A current implementation of this model can be found here:
Of particular interest is the file: joint.binomial.bymCAR.R where you can find the joint sample between line 113 and 149.


Published by

Juan Escamilla Mólgora

I'm a mathematical and computational statistical ecologist working at the intersection of Spatial Statistics, Software Development, Machine Learning and Cloud Computing. I'm researching novel methods for integration, harmonization and modelling of big environmental data. I developed the Wild Fire Alert and Monitoring System for Mexico and Central America

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.