H Setting up an SQL API

Last updated: 2023-01-05 18:18:21

H.1 Introduction

This tutorial summarizes the steps for setting up a dynamic server with an SQL API (Section 9.2.3). The SQL API exposes a PostGIS database through HTTP, which makes it possible to get subsets of spatial layers and display them on a Leaflet map (Chapters 9–11) or to submit drawn layers for permanent storage (Chapter 13).

Setting up the SQL API requires going through the following steps:

  • Creating a cloud server (Section H.2)
  • Installing PostgreSQL and PostGIS (Section H.3)
  • Loading data into the database (Section H.4)
  • Adding a read-only database user (Section H.5)
  • Installing Node.js and required modules (Section H.6)
  • Writing the server-side script (Section H.7)
  • Creating a service (Section H.8)
  • Testing the SQL API (Section H.9)
  • Domain name and HTTPS (optional) (Section H.10)

These instructions are more advanced than the material given in the main part of the book. Importantly, the tutorial assumes you are comfortable with the Linux command line: logging in to a remote Linux machine via SSH, navigating through directories, installing software, and so on. If you are not familiar with the software components being used, be prepared for some amount of trial and error, which is part of the learning process:

  • If you already have experience with the types of tasks listed above, completing the entire tutorial may take at least 2-3 hours.
  • If you are new to these types of tasks, allocate at least a few days.

H.2 Creating a cloud server

The first thing we need to take care of is having a computer with a fixed IP address, where the SQL API will run and to which clients will send their requests. You can use a physical machine for this; however, due to technical considerations (see Section 5.6.2.5), the best option is to use a cloud service.

Among cloud service providers, one of the simplest, and the one we are using in this tutorial, is DigitalOcean. We can launch the smallest virtual machine, or droplet in DigitalOcean terminology, for as little as $6 a month. The machine will have just 1 CPU, 1GB of RAM, and 25GB of disk space, but this is enough for demonstration purposes (Figure H.1). The default operating system—Ubuntu 22.04 at the time of writing—can be used.


FIGURE H.1: Selecting DigitalOcean droplet size

Choose a root password, which you will use to log in to the droplet. Once the droplet is initiated, write down its IP address. From now on, everything is done through the command line, after connecting to the droplet via SSH.

Connecting through SSH will look as follows:

ssh root@46.101.193.143

where 46.101.193.143 is the droplet/server IP address. You will be prompted for the root password (the one you entered when creating the droplet).

The first thing to do is create a new regular (i.e., non-root) user:

adduser michael

And add it to the sudo group:

usermod -aG sudo michael
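
To verify that the new user was indeed added to the sudo group, you can run the following command, which should list sudo among the user's groups:

groups michael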

Log out of the server (type exit). Now, log in with the regular user through SSH (Figure H.2):

ssh michael@46.101.193.143

FIGURE H.2: Logging in to the droplet through SSH

H.3 Installing PostgreSQL and PostGIS

The first step is to install the PostgreSQL database and enable the PostGIS extension. The instructions for installing PostgreSQL are based on the DigitalOcean tutorial How To Install PostgreSQL on Ubuntu 22.04.

First, we install the required packages:

sudo apt install postgresql postgresql-contrib
sudo apt install postgis postgresql-14-postgis-3

Log in to the database as the postgres user:

sudo -u postgres psql

Create a database named gisdb:

CREATE DATABASE gisdb;
\connect gisdb;

To enable remote connections to the database, we need to do the following.

Log in (if you are not logged in already):

sudo -u postgres psql

then run:

ALTER SYSTEM SET listen_addresses='*';
\q

Edit the pg_hba.conf file:

sudo nano /etc/postgresql/14/main/pg_hba.conf

Scroll to the bottom of the pg_hba.conf file and add a line like this (right after the local rules):

hostssl    all             all             0.0.0.0/0               md5

Press Ctrl+X to exit, then Y to save your changes, and Enter to confirm the file name.
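
Note that 0.0.0.0/0 in the rule above permits connection attempts from any IP address. If you know in advance which address you will be connecting from, a safer option is to restrict the rule accordingly. For example, replacing the placeholder address 203.0.113.7 with your own:

hostssl    all             all             203.0.113.7/32          md5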

Restart PostgreSQL:

sudo service postgresql restart 
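
You can confirm that the setting took effect, i.e., that listen_addresses is now '*', with:

sudo -u postgres psql -c "SHOW listen_addresses;"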

Log in as the postgres user once more (sudo -u postgres psql) and create a superuser:

CREATE ROLE mysuperuser LOGIN PASSWORD 'whatever' SUPERUSER;
\q

Log in as the new user:

psql -h localhost -U mysuperuser gisdb

Set up PostGIS by enabling the extension:

CREATE EXTENSION postgis;
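
You can check which PostGIS version was installed with:

SELECT PostGIS_Version();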

Try running a spatial query to see if PostGIS is operational:

SELECT 
  (ST_SetSRID(ST_MakePoint(0, 0), 4326)::geography <->
  ST_SetSRID(ST_MakePoint(1, 0), 4326)::geography) / 1000
AS dist_km;

If PostGIS is working, you should see the following output, i.e., the distance (in km) between two points on the equator that are one degree of longitude apart:

      dist_km       
--------------------
 111.19507973463158
(1 row)

H.4 Loading data into the database

Now that the database is set up, you need to add some data. There are many ways to do that; here, we will use R.

Assuming R is installed, open the R console on a different computer, the one with the data you want to insert into the database:

R

Read the plants.geojson file and import it into the database from R using the following script:

library(sf)
library(RPostgreSQL) 

# Connect
con = dbConnect(
  PostgreSQL(), 
  dbname = "gisdb", 
  host = "46.101.193.143", 
  port = 5432, 
  user = "mysuperuser", 
  password = "whatever"
)

# Import table to database
dat = st_read("plants.geojson")
st_geometry(dat) = "geom"  ## According to PostGIS convention
st_write(obj = dat, dsn = con, layer = "plants", delete_layer = TRUE)

# Disconnect
dbDisconnect(con)

# Quit R
q()
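
Alternatively, assuming the GDAL command line tools are installed on your computer, the same import can be done without R, using ogr2ogr with the same connection details:

ogr2ogr -f PostgreSQL \
  PG:"host=46.101.193.143 port=5432 dbname=gisdb user=mysuperuser password=whatever" \
  plants.geojson -nln plants -lco GEOMETRY_NAME=geom -overwrite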

Let’s test if the database works properly.

Log in to the database on the server (if you are not already):

psql -h localhost -U mysuperuser gisdb

Then run the following query:

SELECT id, name_lat, ST_AsText(geom) FROM plants LIMIT 5;

If all works well, you should see a result such as the following (Figure H.3):

   id   |    name_lat    |         st_astext          
--------+----------------+----------------------------
 321432 | Iris haynei    | POINT(35.679761 32.770133)
 321433 | Iris haynei    | POINT(35.654005 32.741372)
 321456 | Iris atrofusca | POINT(35.193367 31.44711)
 321457 | Iris atrofusca | POINT(35.189142 31.514754)
 321459 | Iris vartanii  | POINT(35.139696 31.474149)
(5 rows)

FIGURE H.3: PostGIS database query example

H.5 Adding a read-only database user

To make sure users cannot make harmful changes to the database through the SQL API, it is useful to create a read-only user.

Log in to the database:

psql -h localhost -U mysuperuser gisdb

Then create the new role:

CREATE ROLE readonlyuser WITH LOGIN PASSWORD 'whatever2'
NOSUPERUSER INHERIT NOCREATEDB NOCREATEROLE NOREPLICATION VALID UNTIL 'infinity';

Assign permissions to the read-only user:

GRANT CONNECT ON DATABASE gisdb TO readonlyuser;
GRANT USAGE ON SCHEMA public TO readonlyuser;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonlyuser;
GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO readonlyuser;
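
Note that GRANT ... ON ALL TABLES applies only to tables that already exist. If you later add more tables that readonlyuser should be able to read, either repeat the GRANT, or set default privileges in advance (this affects tables subsequently created by the current user, mysuperuser):

ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonlyuser;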

Now, log out from the database, and log in again as the new user:

psql -h localhost -U readonlyuser gisdb

and try:

DROP TABLE plants;

If all works well, you should see an error:

ERROR:  must be owner of relation plants

Also try:

DELETE FROM plants WHERE name_lat='Iris haynei';

And you should see:

ERROR:  permission denied for relation plants

Also make sure that PostGIS works for the new user, using the same query shown above:

SELECT 
  (ST_SetSRID(ST_MakePoint(0, 0), 4326)::geography <->
  ST_SetSRID(ST_MakePoint(1, 0), 4326)::geography) / 1000
AS dist_km;

H.6 Installing Node.js and required modules

Next, we need to install Node.js. This is based on the How To Install Node.js on Ubuntu 22.04 tutorial.

Run the following commands to get the required packages:

sudo apt update
sudo apt install nodejs
sudo apt install npm

Check that Node.js is installed with the following command, which should print the installed version:

node -v

H.7 Writing the server-side script

We will use a Node.js script to create an SQL API for our database.

This is based on the postgis-preview example.

Create a new folder for your server app:

mkdir node
cd node

Run the following commands to initialize the project and install the required Node.js packages:

npm init
npm install express --save
npm install pg-promise --save
npm install dbgeo --save

Now, inside the directory create a new file named index.js with:

nano index.js

Then paste the following Node.js script for the SQL API inside it:

// Dependencies
let express = require("express");
let pgp = require("pg-promise")();
let dbgeo = require("dbgeo");

// Create express app and prepare db connection
let app = express();
let port = 4000;
let db = pgp("postgres://readonlyuser:whatever2@localhost:5432/gisdb");

// Settings
app.use(function(req, res, next) {
    res.header("Access-Control-Allow-Origin", "*");
    res.header("Access-Control-Allow-Methods", "GET, POST");
    res.header("Access-Control-Allow-Headers", "Origin, X-Requested-With, Content-Type, Accept");
    next();
});
app.use(express.json());                             // to support JSON-encoded bodies
app.use(express.urlencoded( { extended: true } ));   // to support URL-encoded bodies

// SQL API (GET + POST)
function sql_handler(req, res) {
    // Get SQL query
    let sql;
    if(req.method == "GET") {
        sql = req.query.q;
    }
    if(req.method == "POST") {
        sql = req.body.q;
    }
    console.log("Executing SQL: " + sql);
    // Query using 'pg-promise'
    db.any(sql)
        // Use 'dbgeo' to convert WKB from PostGIS into GeoJSON
        .then(function(data) {
            if(data.length > 0 && "geom" in data[0]) {
                return parse_to_geojson(data);
            } else {
                return data;
            }
        })
        .then(function(data) {
            res.send(data);
        })
        // Send the error message if the query didn't work
        .catch(function(err) {
            let msg = err.message || err;
            console.log("ERROR:", msg);
            res.send({
                error: msg
            });
        });
}
app.get("/sql", sql_handler);
app.post("/sql", sql_handler);

// Parse to GeoJSON
function parse_to_geojson(data) {
    return new Promise(function(resolve, reject) {
        dbgeo.parse(data, {
            outputFormat: "geojson"
        }, function (err, result) {
            if (err) {
                reject(err);
            } else {
                resolve(result);
            }
        });
    });
}

// Start the server
let server = app.listen(port, "0.0.0.0", function () {
  let host = server.address().address;
  let port = server.address().port;
  console.log("App listening at http://%s:%s", host, port);
});

While still inside the directory, you can type the following command to “manually” start the server:

node index.js

However, we are not going to do that. Instead, we will set up a service that starts the server automatically (see below).

H.8 Creating a service

To run the server indefinitely, restarting it whenever errors are raised or the machine reboots, you need to create a service. This can be done using systemd, which is installed by default on Ubuntu. First, create a service file, such as index.service:

nano index.service

and paste the following contents:

[Service]
ExecStart=/usr/bin/node /home/michael/node/index.js
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=index
User=michael
Group=nogroup
Environment=NODE_ENV=production
WorkingDirectory=/home/michael/

[Install]
WantedBy=multi-user.target

Note that ExecStart is composed of the node executable and the path to the application .js file.

Copy the index.service file to /etc/systemd/system/:

sudo cp index.service /etc/systemd/system/

Then, start the service:

sudo systemctl enable index
sudo systemctl start index
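
If you later edit index.service, copy the file again and reload the systemd configuration before restarting the service:

sudo cp index.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl restart index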

You can check the service status with:

sudo systemctl status index

If the service works, you should see a printout such as the following (note the Active entry):

 index.service
     Loaded: loaded (/etc/systemd/system/index.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2022-08-03 07:23:18 UTC; 5s ago
   Main PID: 21122 (node)
      Tasks: 7 (limit: 1119)
     Memory: 19.9M
        CPU: 327ms
     CGroup: /system.slice/index.service
             └─21122 /usr/bin/node /home/michael/node/index.js

Aug 03 07:23:18 test systemd[1]: Started index.service.
Aug 03 07:23:18 test index[21122]: App listening at http://0.0.0.0:4000

You can also print the server logs with:

sudo journalctl -u index
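
Adding the -f flag keeps the log open, printing new messages as they arrive, which is useful when testing the API:

sudo journalctl -u index -f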

H.9 Testing the SQL API

Enter the following URL in your browser:

http://46.101.193.143:4000/sql?q=SELECT id, name_lat, geom FROM plants LIMIT 2

If the SQL API is working properly, you should see the following GeoJSON content in the response (Figure H.4). At the same time you should see a message in the Node.js server log (Figure H.5).

{
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [
                    35.679761,
                    32.770133
                ]
            },
            "properties": {
                "id": 321432,
                "name_lat": "Iris haynei"
            }
        },
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [
                    35.654005,
                    32.741372
                ]
            },
            "properties": {
                "id": 321433,
                "name_lat": "Iris haynei"
            }
        }
    ]
}

FIGURE H.4: Testing the SQL API in the browser


FIGURE H.5: Node.js server log

You can also test a non-spatial query with:

http://46.101.193.143:4000/sql?q=SELECT id, name_lat FROM plants LIMIT 2

In this case, the result should be:

[
  {
    "id": 321432,
    "name_lat": "Iris haynei"
  },
  {
    "id": 321433,
    "name_lat": "Iris haynei"
  }
]
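
The API can also be queried from the command line with curl; the --data-urlencode option, combined with --get, takes care of encoding the query into the URL:

curl --get "http://46.101.193.143:4000/sql" \
  --data-urlencode "q=SELECT id, name_lat FROM plants LIMIT 2"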

This setup is sufficient to reproduce the examples from Chapters 9–11, where the SQL API is used to query the plants table from within a Leaflet map in several ways.

For reproducing the examples in Chapter 13, you need to create the additional table beer_sheva as instructed in Section 13.6.1. Type the following SQL expressions when logged in as the superuser:

CREATE TABLE beer_sheva(
    geom geometry,
    description text,
    name text
);

GRANT INSERT (geom, description, name) 
  ON beer_sheva 
  TO readonlyuser;

GRANT SELECT ON beer_sheva TO readonlyuser;

Now, execute an INSERT query through the SQL API by pasting the following URL in your browser:

http://46.101.193.143:4000/sql?q=INSERT INTO beer_sheva (geom, description, name) VALUES (ST_SetSRID(ST_GeomFromGeoJSON('{"type":"Point","coordinates":[34.838848,31.296301]}'),4326),'Point 1','Michael')
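
Since the server handles POST requests as well (see the sql_handler function in Section H.7), the same query can alternatively be sent as a URL-encoded POST body, for example with curl:

curl "http://46.101.193.143:4000/sql" \
  --data-urlencode "q=INSERT INTO beer_sheva (geom, description, name) VALUES (ST_SetSRID(ST_GeomFromGeoJSON('{\"type\":\"Point\",\"coordinates\":[34.838848,31.296301]}'),4326),'Point 1','Michael')"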

On the server, check if the beer_sheva table was indeed updated:

SELECT ST_AsText(geom) AS geom, description, name FROM beer_sheva;

If all worked fine, you should see the following printout:

            geom            | description |  name   
----------------------------+-------------+---------
 POINT(34.838848 31.296301) | Point 1     | Michael
(1 row)

H.10 Domain name and HTTPS (optional)

The final step is setting up HTTPS (Section 5.3.1) on the server, which requires associating it with a domain name. There are two main advantages to communicating through HTTPS:

  • The communication is secure (this is mainly important in data collection apps)
  • The browser does not display a “not secure” warning

A domain name can be purchased from providers such as Namecheap, where the https://geobgu.xyz/ domain with the demonstration service was purchased (for $14 per year, at the time of writing). The server IP address can then be associated with the domain name using the web interface of the provider (Figure H.6).


FIGURE H.6: Domain settings interface on www.namecheap.com.

Then, certificate files need to be generated on the server. This can be done for free using Let's Encrypt. Follow the Let's Encrypt instructions, which involve stopping the server and running a command such as:

sudo certbot certonly --standalone
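
Let's Encrypt certificates are valid for 90 days; with the standalone method, renewing works the same way, stopping the server and running:

sudo certbot renew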

Finally, the certificate files need to be imported into the server script (Section H.7), and the server needs to be started a little differently, as follows (where mydomain.com is the domain name):

let fs = require("fs");
let https = require("https");

// ...
// Rest of the script goes here...
// ...

// Read certificate
let privateKey = fs.readFileSync("/etc/letsencrypt/live/mydomain.com/privkey.pem", "utf8");
let certificate = fs.readFileSync("/etc/letsencrypt/live/mydomain.com/cert.pem", "utf8");
let ca = fs.readFileSync("/etc/letsencrypt/live/mydomain.com/chain.pem", "utf8");
let credentials = {key: privateKey, cert: certificate, ca: ca};

// Start HTTPS server
let httpsServer = https.createServer(credentials, app);
httpsServer.listen(443, "0.0.0.0", function () {
  console.log("App listening at port 443");
});
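
Optionally, plain HTTP requests can be redirected to HTTPS by listening on port 80 as well. This is not part of the original script; a minimal sketch would be:

// Redirect HTTP (port 80) to HTTPS
let http = require("http");
http.createServer(function(req, res) {
    res.writeHead(301, {"Location": "https://" + req.headers.host + req.url});
    res.end();
}).listen(80, "0.0.0.0");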

You may also have to run the following commands to allow Node.js to bind to privileged ports (such as 443) when running as a non-root user:

sudo apt-get install libcap2-bin
sudo setcap cap_net_bind_service=+ep `readlink -f \`which node\``

and make the certificate files readable by the user running the server, by modifying their owner with an expression such as:

sudo chown michael -R /etc/letsencrypt