The “inside cell” ratio

Let \( C_s = \{c_1, c_2, \dots, c_n\} \) be the
set of cells at scale \(s\) where the occurrences of a node X were found. Then \( C_{s-1} = \{d_1, d_2, \dots, d_k\} \) is
the corresponding set of cells at the upper scale (the ancestor of \(s\)) where the occurrences of node X were found.

Note that the ratio:
\(r_s = \frac{\#C_{s-1}}{\#C_s}\)

gives us an indicator of how dispersed the occurrences are in space.

Because every cell at scale \(s\) lies inside exactly one cell of the upper scale, \(\#C_{s-1} \le \#C_s\) and therefore \(0 < r_s \le 1\).
If \(r_s\) is low, the spatial distribution is constrained to a region as small as the unit area of the upper scale, whereas if \(r_s\) is close to 1, the occurrences are as spatially distributed as the cells in the upper scale.
The method can be applied recursively to successive scales to obtain a list of ratios \(r_1, r_2, \dots, r_s, \dots\) that can be fitted to a model to estimate geographic extents.
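
The recursion over scales can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text above: occurrences are plain (x, y) points, each scale is a square grid, and the grids are nested (every cell size divides the previous one). The function names `cell_of`, `occupied_cells` and `inside_cell_ratios` are hypothetical.

```python
from math import floor


def cell_of(point, cell_size):
    """Index of the square grid cell (assumed layout) that contains the point."""
    x, y = point
    return (floor(x / cell_size), floor(y / cell_size))


def occupied_cells(points, cell_size):
    """The set C_s: distinct cells at one scale that hold at least one occurrence."""
    return {cell_of(p, cell_size) for p in points}


def inside_cell_ratios(points, cell_sizes):
    """Ratios r_s = #C_{s-1} / #C_s for consecutive scales.

    cell_sizes must be ordered from the coarsest (upper) scale to the finest.
    """
    counts = [len(occupied_cells(points, size)) for size in cell_sizes]
    return [upper / lower for upper, lower in zip(counts, counts[1:])]


# Four occurrences packed into a single unit cell.
occurrences = [(0.1, 0.1), (0.6, 0.1), (0.1, 0.6), (0.6, 0.6)]
ratios = inside_cell_ratios(occurrences, [4, 2, 1, 0.5])
print(ratios)
```

In this toy example the occurrences occupy a single cell at sizes 4, 2 and 1, and four cells at size 0.5, so the last ratio drops to 0.25: the occurrences are confined to one unit cell of the upper scale.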

A model for presence-only data

This is the set-up of a conditional auto-logistic regressive model (CAR) for predicting species presence using a sample signal and presence-only data.


Let \(Sp\) be a species, and let \(Y\) and \(X\) be two random variables corresponding to the events "\(Sp\) is in location \(x_i\)" and "location \(x_i\) has been sampled", respectively. (The variable \(X\) and the location \(x_i\) are not related.)

\(Y\) and \(X\) are considered to be independent binary (Bernoulli) variables conditional on the latent processes \(S\) and \(P\), respectively.

Latent processes

The latent processes are modeled as follows:

$$ \operatorname{logit}(S_k) = \beta_s^t d_s(x_k) + \psi_k + O_k $$

$$ \operatorname{logit}(P_k) = \beta_p^t d_p(x_k) + \psi_k + O_k $$

where \(O_k\) is an offset term, \(d_s(x_k)\) and \(d_p(x_k)\) are the covariates for \(S\) and \(P\) respectively, and \(\psi_k\) is a spatial random effect shared by both processes and modeled as a Gaussian Markov Random Field.
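
To make the two equations concrete, here is a minimal Python sketch for a single cell \(k\). It assumes that \(S_k\) and \(P_k\) are the success probabilities of the Bernoulli variables \(Y\) and \(X\), which is how the conditional independence statement above is read here. The helper names (`inv_logit`, `latent_probability`, `simulate_cell`) and all numeric values are made up for illustration.

```python
import math
import random


def inv_logit(z):
    """Inverse of the logit link: maps a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def latent_probability(beta, covariates, psi_k, offset_k=0.0):
    """Solve logit(q) = beta^t d(x_k) + psi_k + O_k for q."""
    linear = sum(b * d for b, d in zip(beta, covariates)) + psi_k + offset_k
    return inv_logit(linear)


def simulate_cell(beta_s, d_s, beta_p, d_p, psi_k, offset_k=0.0, rng=random):
    """Draw Y (presence) and X (sampled) for one cell, given the latent processes."""
    s_k = latent_probability(beta_s, d_s, psi_k, offset_k)
    p_k = latent_probability(beta_p, d_p, psi_k, offset_k)
    # Assumption: Y ~ Bernoulli(S_k) and X ~ Bernoulli(P_k), drawn independently.
    y_k = 1 if rng.random() < s_k else 0
    x_k = 1 if rng.random() < p_k else 0
    return s_k, p_k, y_k, x_k


# Illustrative values only: two covariates per process, shared spatial effect psi_k.
rng = random.Random(0)
print(simulate_cell([0.5, -1.0], [1.0, 0.2], [0.1, 0.3], [1.0, 2.0], psi_k=0.4, rng=rng))
```

Note that the same \(\psi_k\) enters both linear predictors, which is what couples the presence and sampling processes.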

Common Gaussian Markov Random Field (CGMRF)

This process is modeled as follows:

$$ \psi_k = \phi_k + \theta_k $$

$$ \phi_k | \phi_{-k}, \mathbb{W},\tau^2 \sim N \left( \frac{\sum_{i=1}^{K} w_{ki} \phi_i}{\sum_{i=1}^{K}w_{ki}}, \frac{\tau^2}{\sum_{i=1}^{K}w_{ki}}\right) $$
$$\theta_k \sim N\left(0, \sigma^2\right)$$

$$ \tau^2, \sigma^2 \overset{iid}{\sim} \text{Inv-Gamma}(a,b) $$

where \(\mathbb{W}\) is the adjacency matrix of the lattice and \(\theta_k\) is an independent noise term with constant variance. \(\sigma^2\) and \(\tau^2\) are independent and identically distributed hyperparameters sampled from an inverse gamma distribution.
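
The conditional distribution of \(\phi_k\) is just a neighbor average with a variance that shrinks as the number of neighbors grows. The following Python sketch computes that mean and variance and draws one value of \(\psi_k\). It assumes a binary, symmetric adjacency matrix with a zero diagonal and at least one neighbor per cell; the function names and the three-cell lattice are hypothetical and are not taken from the implementation mentioned below.

```python
import math
import random


def phi_conditional(k, phi, W, tau2):
    """Mean and variance of phi_k given all other phi_i, adjacency W and tau^2."""
    row = W[k]
    weight_sum = sum(row)  # number of neighbors when W is binary with zero diagonal
    mean = sum(w * p for w, p in zip(row, phi)) / weight_sum
    variance = tau2 / weight_sum
    return mean, variance


def sample_psi_k(k, phi, W, tau2, sigma2, rng=random):
    """One draw of psi_k = phi_k + theta_k (structured plus unstructured effect)."""
    mean, variance = phi_conditional(k, phi, W, tau2)
    phi_k = rng.gauss(mean, math.sqrt(variance))
    theta_k = rng.gauss(0.0, math.sqrt(sigma2))
    return phi_k + theta_k


# Three cells in a row: cell 1 is adjacent to cells 0 and 2.
W = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
phi = [1.0, 0.0, 3.0]
print(phi_conditional(1, phi, W, tau2=2.0))
print(sample_psi_k(1, phi, W, tau2=2.0, sigma2=0.5, rng=random.Random(0)))
```

For the middle cell, the conditional mean is the average of its two neighbors, (1.0 + 3.0) / 2, and the conditional variance is \(\tau^2 / 2\).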

The corresponding directed acyclic graph (DAG) can be seen in the next figure.


A current implementation of this model can be found here:
Of particular interest is the file joint.binomial.bymCAR.R, where the joint sampling can be found between lines 113 and 149.