A model for presence-only data

This is the set-up of a conditional auto-logistic regressive model (CAR) for predicting species presence using a sample signal and a presence-only data.


Let \(Sp \) be a species and \(Y \) and \(X \) two random variables corresponding to the events of: \(Sp \) is in location \(x_i\) and: location \(x_i \) has been sampled. (The variable \(X\) and \(x_i\) are not related)

\(Y \) and \(X \) are consider to be independent binary (Bernoulli) variables conditional to the latent processes \(S \) and \(P\) respectively.

Latent processes

These variables are modeled as this:

$$ logit(S_k) = \beta_s^t d_s(x_k) + \psi_k +O_k $$

$$ logit(P_k) = \beta_p^t d_p(x_k) + \psi_k + O_k $$

Where \(O_k\) is an offset term, \(d_p(x_k), d_s(x_k)\) are the covariates for p and s respectively; and \(\psi_k\) is modeled as a Gaussian Markov Random Field.

Common Gaussian Markov Field (CGMRF)

This process is modeled as this:

$$ \psi_k = \phi_k + \theta_k $$

$$ \phi_k | \phi_{-k}, \mathbb{W},\tau^2 \sim N \left( \frac{\sum_{i=1}^{K} w_{ki} \phi_i}{\sum_{i=1}^{K}w_{ki}}, \frac{\tau^2}{\sum_{i=1}^{K}w_{ki}}\right) $$
$$\theta_k \sim N\left(0, \sigma^2\right)$$

$$\tau^2, \sigma^2 \sim^{iid} Inv.Gamma(a,b) $$

Where \(\mathbb{W}\) is the adjacency matrix of the lattice, \(\theta_k\) is an independent noise term with constant variance. \(sigma^2\) and \(\tau^2\) are independent and identically distributed hyperparameters sampled from an inverse gamma distribution.

The corresponding Directed acyclic Graph can be seen in the next figure.


A current implementation of this model can be found here:
Of particular interest is the file: joint.binomial.bymCAR.R where you can find the joint sample between line 113 and 149.