How to add a new spatial data source in Biospytial

Biospytial is a Knowledge Engine that merges different data sources in order to model aspects of biodiversity using a geostatistical framework. Biospytial is still in testing mode, which means that many features for easing maintenance, provisioning and configuration have not been developed yet. However, it is already a powerful tool to extract, analyse and model complex macro-ecological patterns.

Aims of this tutorial

In this tutorial you will find a rough guide on how to install new data sources. We will install a vector-based data source called ‘global_ecoregions’ and a raster-based data source called ‘World Population for Latin America’.

Assumptions

Here I will assume that you have a fully installed and running Biospytial Suite, i.e. the containers:

  • Geoprocessing-Backend (GPB)
  • Graph-Computing-Engine (GCE)
  • Biospytial-Client (BPE)

I also assume that the data sources have been downloaded and placed in a path that is accessible from the Biospytial Client.

Converting the data to Django models

To handle the data, Biospytial uses the Django ORM to access the geospatial data stored in the Geoprocessing-Backend. To achieve this, we need to define a Model class for each data source. In other words, every data source needs its own class specification, which is used to communicate with the relational database manager.

Vector data

We will use the ‘ogrinspect’ tool to generate the model definition for a shapefile. Follow these steps:

  1. Log in to a Biospytial-Client session (the bash shell, not the IPython environment).
  2. Locate the path where the data is stored. In this case we are interested in adding the data source ‘terr-ecoregions-TNC’, which is in ESRI Shapefile format.

Ingest the shapefile into the GPB

  • We will make use of the ‘LayerMapping’ utility.
  • Use the ‘ogrinspect’ command, available through the manage.py module inside the folder ‘/apps’, where all the Biospytial source is located. The general syntax of this command is:
    python manage.py ogrinspect <data_source> <model_name> [options]

For this example:

python manage.py ogrinspect /mnt/data1/maps/terr-ecoregions-TNC/tnc_terr_ecoregions.shp TerrEcoregions --srid=4326 --mapping --multi

Where:

  • --srid=4326 option sets the SRID for the geographic field.
  • --mapping option tells ogrinspect to also generate a mapping dictionary for use with LayerMapping.
  • --multi option is specified so that the geographic field is a MultiPolygonField instead of just a PolygonField.

More information at: https://docs.djangoproject.com/en/2.0/ref/contrib/gis/tutorial/

The command prints the class definition for this dataset to standard output (the screen). If we use the ‘--mapping’ option, it also prints a dictionary that maps the model field names to the column names in the shapefile.
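If you prefer to script this step, the call can be assembled in Python. The following is only a minimal sketch: the helper names `build_ogrinspect_command` and `save_ogrinspect_output` are hypothetical (they are not part of Biospytial or Django), and it assumes the script runs from the folder that contains manage.py.

```python
import subprocess


def build_ogrinspect_command(shapefile, model_name, srid=4326,
                             mapping=True, multi=True):
    """Return the argument list for `python manage.py ogrinspect`."""
    cmd = ["python", "manage.py", "ogrinspect", shapefile, model_name,
           "--srid={}".format(srid)]
    if mapping:
        cmd.append("--mapping")
    if multi:
        cmd.append("--multi")
    return cmd


def save_ogrinspect_output(cmd, target_path):
    """Run the command and write the printed model definition to a file."""
    output = subprocess.check_output(cmd)
    with open(target_path, "wb") as handle:
        handle.write(output)


# Build (but do not run) the command used in this tutorial.
cmd = build_ogrinspect_command(
    "/mnt/data1/maps/terr-ecoregions-TNC/tnc_terr_ecoregions.shp",
    "TerrEcoregions",
)
print(" ".join(cmd))
```

Calling `save_ogrinspect_output(cmd, "generated_models.py")` would store the generated class in a file (the file name is just an example) so it can be reviewed before copying it into an app.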

Export the shapefile into the database (Geoprocessing Container)

We will use the ‘LayerMapping’ utility to make this process faster. The first step is to edit or create the file ‘load_shapefiles.py’ inside the ‘ecoregions’ app.

Here we will define the mapping dictionary (see above) and the code needed to insert the shapefile into the database.

This is the content of the file ‘load_shapefiles.py’:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Functions for exporting shapefiles into the PostGIS database."""

from __future__ import absolute_import, division, print_function, unicode_literals

import os

from django.contrib.gis.utils import LayerMapping

from .models import TerrEcoregions
from biospytial import settings

__author__ = "Juan Escamilla Mólgora"
__copyright__ = "Copyright 2018, JEM"
__license__ = "GPL"
__maintainer__ = "Juan"
__email__ = "molgor@gmail.com"

# Generated by ogrinspect: model field name -> shapefile column name
terrecoregions_mapping = {
    'eco_id_u': 'ECO_ID_U',
    'eco_code': 'ECO_CODE',
    'eco_name': 'ECO_NAME',
    'eco_num': 'ECO_NUM',
    'ecode_name': 'ECODE_NAME',
    'cls_code': 'CLS_CODE',
    'eco_notes': 'ECO_NOTES',
    'wwf_realm': 'WWF_REALM',
    'wwf_realm2': 'WWF_REALM2',
    'wwf_mhtnum': 'WWF_MHTNUM',
    'wwf_mhtnam': 'WWF_MHTNAM',
    'realmmht': 'RealmMHT',
    'er_update': 'ER_UPDATE',
    'er_date_u': 'ER_DATE_U',
    'er_ration': 'ER_RATION',
    'sourcedata': 'SOURCEDATA',
    'geom': 'MULTIPOLYGON',
}

# Absolute path to the shapefile inside the raw data sources folder
file_shp = os.path.abspath(
    os.path.join(settings.PATH_RAWDATASOURCES, 'terr-ecoregions-TNC', 'tnc_terr_ecoregions.shp'),
)

def run(verbose=True):
    # transform=False: the data is stored without reprojection
    lm = LayerMapping(TerrEcoregions, file_shp, terrecoregions_mapping, transform=False)
    lm.save(strict=True, verbose=verbose)

To load the layer, start the Biospytial IPython environment:

python manage.py shell

and then run:

from ecoregions import load_shapefiles
load_shapefiles.run()
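Before calling run(), it can help to confirm that the shapefile is complete, since a shapefile is really a set of files (.shp, .shx and .dbf at minimum) that share one base name. The helper below is a small sketch of such a check; `missing_shapefile_parts` is a hypothetical name, not something Biospytial provides, and the check is case-sensitive with respect to the extensions.

```python
import os

# The mandatory components of an ESRI Shapefile.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the mandatory shapefile components that do not exist on disk."""
    base, _ = os.path.splitext(shp_path)
    return [base + ext for ext in REQUIRED_EXTENSIONS
            if not os.path.isfile(base + ext)]


# Example usage inside the Django shell, with file_shp as defined in
# load_shapefiles.py:
#
#     from ecoregions import load_shapefiles
#     missing = missing_shapefile_parts(load_shapefiles.file_shp)
#     if not missing:
#         load_shapefiles.run()
```

An empty list means all three components were found next to each other.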

Example 2: add Roads Layer

I’ve downloaded the roads in Shapefile format from: http://www.conabio.gob.mx/informacion/gis/maps/geo/carre1mgw.zip

Using the ‘ogrinspect’ tool we obtain the following:

# This is an auto-generated Django model module created by ogrinspect.
from django.contrib.gis.db import models

class MexRoads(models.Model):
    fnode_field = models.BigIntegerField()
    tnode_field = models.BigIntegerField()
    lpoly_field = models.BigIntegerField()
    rpoly_field = models.BigIntegerField()
    length = models.FloatField()
    cov_field = models.BigIntegerField()
    cov_id = models.BigIntegerField()
    geom = models.MultiLineStringField(srid=4326)

# Auto-generated LayerMapping dictionary for MexRoads model
mexroads_mapping = {
    'fnode_field': 'FNODE_',
    'tnode_field': 'TNODE_',
    'lpoly_field': 'LPOLY_',
    'rpoly_field': 'RPOLY_',
    'length': 'LENGTH',
    'cov_field': 'COV_',
    'cov_id': 'COV_ID',
    'geom': 'MULTILINESTRING',
}
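The field names in the generated mapping follow a visible pattern: the shapefile column name is lower-cased, and a column that ends with an underscore (e.g. FNODE_) gets the suffix ‘field’. The sketch below reproduces only that pattern, which can be handy for checking a mapping by hand. The function names are hypothetical, and ogrinspect itself may apply further rules that are not covered here.

```python
def model_field_name(column_name):
    """Derive a model field name from a shapefile column name."""
    name = column_name.lower()
    # A trailing underscore is not a valid ending for a Django field name,
    # so the generated name is completed with 'field'.
    if name.endswith("_"):
        name += "field"
    return name


def build_mapping(column_names, geom_type, geom_field="geom"):
    """Build a LayerMapping-style dictionary: model field -> column name."""
    mapping = {model_field_name(col): col for col in column_names}
    mapping[geom_field] = geom_type
    return mapping


columns = ["FNODE_", "TNODE_", "LPOLY_", "RPOLY_", "LENGTH", "COV_", "COV_ID"]
mapping = build_mapping(columns, "MULTILINESTRING")
for field, column in sorted(mapping.items()):
    print(field, "->", column)
```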

 

For ease of use, this layer needs to be included in an app.

We will load it in the ‘sketches’ app, i.e. in sketches/models.py

Load the shapefile into the PostGIS database using shp2pgsql (alternative method; it works!)

These were the commands used, inside the PostGIS container:

shp2pgsql -I -s 4326 carre1mgw.shp MexRoads > mex_roads.sql
psql -d biospytial -U biospytial -h localhost -f mex_roads.sql
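When several layers have to be loaded, the same two commands can be driven from Python. This is a sketch under the assumption that shp2pgsql and psql are on the PATH of the container and that psql can authenticate without a prompt. The function names are hypothetical; the flags and connection values are the ones used above.

```python
import subprocess


def build_shp2pgsql_command(shapefile, table, srid=4326):
    """Argument list that converts a shapefile to SQL (-I adds a spatial index)."""
    return ["shp2pgsql", "-I", "-s", str(srid), shapefile, table]


def build_psql_command(sql_file, database="biospytial", user="biospytial",
                       host="localhost"):
    """Argument list that imports an SQL file into the database."""
    return ["psql", "-d", database, "-U", user, "-h", host, "-f", sql_file]


def load_shapefile(shapefile, table, sql_file, srid=4326):
    """Write the SQL dump to sql_file and then import it with psql."""
    with open(sql_file, "wb") as handle:
        subprocess.check_call(build_shp2pgsql_command(shapefile, table, srid),
                              stdout=handle)
    subprocess.check_call(build_psql_command(sql_file))


# Show the commands for the roads layer without running them.
print(" ".join(build_shp2pgsql_command("carre1mgw.shp", "MexRoads")))
print(" ".join(build_psql_command("mex_roads.sql")))
```

Running `load_shapefile("carre1mgw.shp", "MexRoads", "mex_roads.sql")` inside the PostGIS container would perform both steps in order and stop at the first failure.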

Adding Raster data

We will use the raster support available in PostGIS 2.x and later, through the script migrateToPostgis.bash located in: /apps/raster_api/bash_raster_tools/bash_scripts

However, the tools for ingesting data into the database are stored in the Geoprocessing Container, so we need to log into that container and run the script there.
You can copy the ‘bash_raster_tools’ folder into this container and run the ‘migrateToPostgis.bash’ command.

Example:

/mnt/data1/bash_raster_tools/bash_scripts/migrateToPostgis.bash [RasterData.tif]

After this, the raster data is stored in the database.
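The contents of migrateToPostgis.bash are not reproduced in this post. Purely as a point of reference, PostGIS ships the raster2pgsql loader, and a manual load could be sketched as below. Everything specific here is an assumption made for illustration and not taken from the script: the SRID, the tile size, the function names, and the convention of naming the table after the file.

```python
import os


def default_table_name(raster_path):
    """Derive a lowercase table name from the file name (assumed convention)."""
    stem = os.path.splitext(os.path.basename(raster_path))[0]
    return stem.lower()


def build_raster2pgsql_command(raster_path, table=None, srid=4326,
                               tile_size="100x100"):
    """Argument list for raster2pgsql.

    -s sets the SRID, -I creates a spatial index, -C applies the raster
    constraints and -t splits the raster into tiles of the given size.
    """
    table = table or default_table_name(raster_path)
    return ["raster2pgsql", "-s", str(srid), "-I", "-C", "-t", tile_size,
            raster_path, table]


# raster2pgsql writes SQL to standard output, so in practice the result is
# redirected to a file or piped into psql.
print(" ".join(build_raster2pgsql_command("/mnt/data1/RasterData.tif")))
```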

How to add a class definition for a Loaded Raster Layer

We need to explicitly add the model class definition inside the file ‘raster_api/models.py’.

The base class is GenericRaster. We need to extend this class with a new definition.

Follow this example:

class DistanceToRoadMex(GenericRaster):
    """
    ..
    Abstract model for all the Distance to Road datasource.
    
    Attributes
    ==========
    I'll use the default attributes given by raster2pgsql
    id : int Unique primary key
        This is the identification number of each element in the mesh.
    
    """
    number_bands = 1
    neo_label_name = 'Dist_to_road_mex'
    link_type_name = 'HAS_A_DISTANCE_OF'
    units = '(meters)'
    
    class Meta:
        # The table already exists in the database, so Django must not manage it
        managed = False
        db_table = 'dist_map_wgs84_clip'
    
    def __str__(self):
        # Fill the %s placeholder with the primary key of this raster tile
        c = "< Distance to Road Raster Data: %s >" % self.pk
        return c
 

Add the new model to the ‘raster_models_dic’ dictionary:

raster_models_dic = {
    'WindSpeed': raster_models[7],
    'Elevation': raster_models[0],
    'Vapor': raster_models[6],
    'MaxTemperature': raster_models[5],
    'MinTemperature': raster_models[4],
    'MeanTemperature': raster_models[3],
    'SolarRadiation': raster_models[2],
    'Precipitation': raster_models[1],
    'WorldPopLatam2010': raster_models[8],
    'DistanceToRoadMex': raster_models[9],
}
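Positional indices such as raster_models[9] have to be kept in sync by hand every time a model is added. One possible alternative, shown here only as a sketch, is to build the dictionary from the class names. It assumes that raster_models is a list of model classes and that the class name is an acceptable key, which is not true for every entry above (e.g. ‘Vapor’ may belong to a class with a different name). The classes below are empty stand-ins, not the real Biospytial models.

```python
class GenericRasterStandIn(object):
    """Placeholder for the real GenericRaster base class."""


class Elevation(GenericRasterStandIn):
    pass


class Precipitation(GenericRasterStandIn):
    pass


class DistanceToRoadMex(GenericRasterStandIn):
    pass


# Stand-in for the list of raster model classes.
raster_models = [Elevation, Precipitation, DistanceToRoadMex]


def build_registry(models):
    """Map each class name to its class, refusing duplicated names."""
    registry = {}
    for model in models:
        name = model.__name__
        if name in registry:
            raise ValueError("Duplicated raster model name: %s" % name)
        registry[name] = model
    return registry


registry = build_registry(raster_models)
print(sorted(registry))
```

With this approach, adding a new raster layer only requires appending its class to the list.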

The data is ready to be used in Biospytial.

Testing

See the notebook inside the ‘raster_api’ module and run it interactively 🙂

 

 

Published by

Juan Escamilla Mólgora

I'm a mathematical and computational statistical ecologist working at the intersection of Spatial Statistics, Software Development, Machine Learning and Cloud Computing. I'm researching novel methods for integration, harmonization and modelling of big environmental data. I developed the Wild Fire Alert and Monitoring System for Mexico and Central America.

3 thoughts on “How to add new spatial data source in Biospytial”

  1. Didn’t work for me.
    I needed to use the shp2pgsql tool like this.
    1. First export to SQL format with:
    shp2pgsql -I -s 4326 tnc_terr_ecoregions.shp TerrEco > terr_eco.sql
    2. Import it with:
    psql -d biospytial -U biospytial -h localhost -f terr_eco.sql

    Inside the DBMS container.

    1. It works now. You need to load the new layers/data inside the Geoprocessing Container, not the Biospytial Client.
      Cheers!

  2. I needed to add an id field to the class.
    # This is an auto-generated Django model module created by ogrinspect.
    from django.contrib.gis.db import models

    class LandUseConabio(models.Model):
        id = models.AutoField(primary_key=True, db_column="gid")
        area = models.FloatField()
        perimeter = models.FloatField()
        cov_field = models.BigIntegerField()
        cov_id = models.BigIntegerField()
        agrupado = models.CharField(max_length=40)
        tipos = models.CharField(max_length=80)
        geom = models.MultiPolygonField(srid=4326)

    # Auto-generated `LayerMapping` dictionary for LandUseConabio model
    landuseconabio_mapping = {
        'area': 'AREA',
        'perimeter': 'PERIMETER',
        'cov_field': 'COV_',
        'cov_id': 'COV_ID',
        'agrupado': 'AGRUPADO',
        'tipos': 'TIPOS',
        'geom': 'MULTIPOLYGON',
    }
