SpatialDB Advisor
The people over at the Free and Open Source Software Learning Centre on 19-22nd May 2009 hosted an event that included three sessions on PostGIS (on 21st May):
They are worth taking a look at.


















I find having a vectorising function in my database kit-bag almost indispensable in my work. I have written and revised my Oracle PL/SQL functions that do this many times over the years and so, when I first started using PostGIS, I decided to learn how to program in PL/pgSQL by implementing something familiar: a GetVector() function.
Now, my PostgreSQL/PostGIS is not bad, but I am always willing to learn from others who are more expert than me: so don’t take the following function (which , for example, doesn’t handle Curves as they are still in their infancy in PostGIS) as being perfect or correct in every manner.
Firstly, we need to create two data types that we can use.
A Coordinate Type
CREATE TYPE coordtype AS (
x double precision,
y double precision,
z double precision,
m double precision
);
Query returned successfully with no result in 78 ms.
A Vector Type based on the Coordinate Type
CREATE TYPE vectortype AS (
startcoord coordtype,
endcoord coordtype
);
Query returned successfully with no result in 16 ms.
What I tried to do in the following GetVector() function is handle all geometry types in the one function. Initially I used one cursor per geometry type (and a single refCursor for processing) but in this implementation I have chosen to use a single cursor with UNION ALLs between the SQL that processes the vertices of each geometry type. I am hoping that the PostgreSQL query optimiser does its equivalent of Oracle’s partition pruning. Perhaps someone may test this aspect and report if it is not performing as fast as one would hope.
CREATE OR REPLACE FUNCTION ST_GetVector(p_geometry geometry)
RETURNS SETOF VectorType IMMUTABLE
AS $$
DECLARE
v_GeometryType varchar(1000);
v_rec RECORD;
v_vector VectorType;
v_start CoordType;
v_end CoordType;
c_points CURSOR ( p_geom geometry,
p_Geometrytype text )
IS
SELECT ST_X(sp) as sx,ST_Y(sp) as sy,ST_Z(sp) as sz,ST_M(sp) as sm,
ST_X(ep) as ex,ST_Y(ep) as ey,ST_Z(ep) as ez,ST_M(ep) as em
FROM ( SELECT pointn(p_geom, generate_series(1, npoints(p_geom)-1)) as sp,
pointn(p_geom, generate_series(2, npoints(p_geom) )) as ep
WHERE ( p_GeometryType = 'ST_LineString' )
) AS linestring
UNION ALL
SELECT ST_X(sp),ST_Y(sp),ST_Z(sp),ST_M(sp),ST_X(ep),ST_Y(ep),ST_Z(ep),ST_M(ep)
FROM ( SELECT ST_PointN(b.geom, generate_series(1, ST_NPoints(b.geom)-1)) as sp,
ST_PointN(b.geom, generate_series(2, ST_NPoints(b.geom) )) as ep
FROM (SELECT ST_GeometryN(p_geom,generate_series(1,ST_NumGeometries(p_geom))) as geom) as b
WHERE ( p_GeometryType = 'ST_MultiLineString' )
) AS multiline
UNION ALL
SELECT ST_X(sp),ST_Y(sp),ST_Z(sp),ST_M(sp),ST_X(ep),ST_Y(ep),ST_Z(ep),ST_M(ep)
FROM ( SELECT ST_PointN(a.geom, generate_series(1, ST_NPoints(a.geom)-1)) as sp,
ST_PointN(a.geom, generate_series(2, ST_NPoints(a.geom) )) as ep
FROM ( SELECT ST_ExteriorRing(p_geom) as geom
WHERE ( p_GeometryType = 'ST_Polygon' )
UNION ALL
SELECT ST_InteriorRingN(p_geom,generate_series(1,ST_NumInteriorRings(p_geom))) as geom
WHERE ( p_GeometryType = 'ST_Polygon' )
) a
) as polygon
UNION ALL
SELECT ST_X(sp),ST_Y(sp),ST_Z(sp),ST_M(sp),ST_X(ep),ST_Y(ep),ST_Z(ep),ST_M(ep)
FROM ( SELECT ST_pointn(a.geom, generate_series(1, ST_npoints(a.geom)-1)) as sp,
ST_pointn(a.geom, generate_series(2, ST_npoints(a.geom) )) as ep
FROM ( SELECT ST_exteriorring(b.geom) as geom
FROM (select ST_geometryn(p_geom,generate_series(1,ST_numgeometries(p_geom))) as geom ) b
WHERE ( p_GeometryType = 'ST_MultiPolygon' )
UNION ALL
SELECT ST_InteriorRingn(c.geom,generate_series(1,ST_NumInteriorRings(c.geom))) as geom
FROM (select ST_GeometryN(p_geom,generate_series(1,ST_NumGeometries(p_geom))) as geom ) c
WHERE ( p_GeometryType = 'ST_MultiPolygon' )
) a
) as multipoly;
Begin
If ( p_geometry is NULL ) Then
return;
End If;
v_GeometryType := ST_GeometryType(p_geometry);
If ( v_GeometryType in ('ST_Point','ST_MultiPoint') ) Then
return;
End If;
IF ( v_GeometryType = 'ST_Geometry' ) THEN
-- Could be anything... use native PostGIS function
v_GeometryType = GeometryType(p_geometry);
IF ( v_geometryType = 'GEOMETRYCOLLECTION' ) THEN
FOR v_geom IN 1..ST_NumGeometries(p_geometry) LOOP
FOR v_rec IN SELECT * FROM ST_GetVector(ST_GeometryN(p_geometry,v_geom)) LOOP
RETURN NEXT v_rec;
END LOOP;
END LOOP;
ELSE
-- Probably CURVED something...
RETURN;
END IF;
END IF;
IF ( v_geometryType NOT IN ('ST_Geometry','GEOMETRYCOLLECTION') ) THEN
OPEN c_points(p_geometry,v_Geometrytype);
LOOP
FETCH c_points INTO
v_start.x, v_start.y, v_start.z, v_start.m,
v_end.x, v_end.y, v_end.z, v_end.m;
v_vector.startcoord := v_start;
v_vector.endcoord := v_end;
EXIT WHEN NOT FOUND;
RETURN NEXT v_vector;
END LOOP;
CLOSE c_points;
END IF;
end;
$$ LANGUAGE 'plpgsql';
Query returned successfully with no result in 156 ms.
Now, so that we can use this function in nested table Select statements we construct a “wrapper”.
CREATE OR REPLACE FUNCTION ST_GetVectorSQL(p_geometry geometry)
RETURNS SETOF VectorType IMMUTABLE
AS
$$
SELECT * FROM ST_GetVector($1);
$$
LANGUAGE 'sql';
Query returned successfully with no result in 31 ms.
Now let’s conduct some simple tests.
select * from ST_GetVector('GEOMETRYCOLLECTION(POINT(2 3 4),LINESTRING(2 3 4,3 4 5))'::geometry);
| startcoord coordtype |
endcoord coordtype |
|---|---|
| (2,3,4,) | (3,4,5,) |
select * from ST_GetVector('LINESTRING(0 0, 1 1, 2 2, 3 3)'::geometry) as geom;
| startcoord coordtype |
endcoord coordtype |
|---|---|
| (0,0,,) | (1,1,,) |
| (1,1,,) | (2,2,,) |
| (2,2,,) | (3,3,,) |
select * from ST_GetVector('MULTILINESTRING((0 0,1 1,1 2),(2 3,3 2,5 4))'::geometry) As GV;
| startcoord coordtype |
endcoord coordtype |
|---|---|
| (0,0,,) | (1,1,,) |
| (1,1,,) | (1,2,,) |
| (2,3,,) | (3,2,,) |
| (3,2,,) | (5,4,,) |
select * from ST_GetVector('POLYGON((326454.7 5455793.7,326621.3 5455813.7,326455.4 5455796.6,326454.7 5455793.7))'::geometry);
| startcoord coordtype |
endcoord coordtype |
|---|---|
| (326454.7,5455793.7,,) | (326621.3,5455813.7,,) |
| (326621.3,5455813.7,,) | (326455.4,5455796.6,,) |
| (326455.4,5455796.6,,) | (326454.7,5455793.7,,) |
select * from ST_GetVector('MULTIPOLYGON(((326454.7 5455793.7,326621.3 5455813.7,326455.4 5455796.6,326454.7 5455793.7)),((326771.6 5455831.6,326924.1 5455849.9,326901.9 5455874.2,326900.7 5455875.8,326888.9 5455867.3,326866 5455853.1,326862 5455851.2,326847.4 5455845.8,326827.7 5455841.2,326771.6 5455831.6)))'::geometry);
| startcoord coordtype |
endcoord coordtype |
|---|---|
| (326454.7,5455793.7,,) | (326621.3,5455813.7,,) |
| (326621.3,5455813.7,,) | (326455.4,5455796.6,,) |
| (326455.4,5455796.6,,) | (326454.7,5455793.7,,) |
| (326771.6,5455831.6,,) | (326924.1,5455849.9,,) |
| (326924.1,5455849.9,,) | (326901.9,5455874.2,,) |
| (326901.9,5455874.2,,) | (326900.7,5455875.8,,) |
| (326900.7,5455875.8,,) | (326888.9,5455867.3,,) |
| (326888.9,5455867.3,,) | (326866,5455853.1,,) |
| (326866,5455853.1,,) | (326862,5455851.2,,) |
| (326862,5455851.2,,) | (326847.4,5455845.8,,) |
| (326847.4,5455845.8,,) | (326827.7,5455841.2,,) |
| (326827.7,5455841.2,,) | (326771.6,5455831.6,,) |
Table Tests
Now we can test our function against some real tables and SQL.
DROP TABLE v_polygons;
| addgeometrycolumn text |
|---|
| public.v_polygons.geom SRID:28355 TYPE:MULTIPOLYGON DIMS:2 |
ALTER TABLE v_polygons DROP CONSTRAINT enforce_geotype_geom;
| oid integer |
startcoord coordtype |
endcoord coordtype |
|---|---|---|
| 1 | (326454.7,5455793.7,,) | (326621.3,5455813.7,,) |
| 1 | (326621.3,5455813.7,,) | (326455.4,5455796.6,,) |
| 1 | (326455.4,5455796.6,,) | (326454.7,5455793.7,,) |
| 1 | (326771.6,5455831.6,,) | (326924.1,5455849.9,,) |
| 1 | (326924.1,5455849.9,,) | (326901.9,5455874.2,,) |
| 1 | (326901.9,5455874.2,,) | (326900.7,5455875.8,,) |
| 1 | (326900.7,5455875.8,,) | (326888.9,5455867.3,,) |
| 1 | (326888.9,5455867.3,,) | (326866,5455853.1,,) |
| 1 | (326866,5455853.1,,) | (326862,5455851.2,,) |
| 1 | (326862,5455851.2,,) | (326847.4,5455845.8,,) |
| 1 | (326847.4,5455845.8,,) | (326827.7,5455841.2,,) |
| 1 | (326827.7,5455841.2,,) | (326771.6,5455831.6,,) |
| 2 | (326667.8,5455760.3,,) | (326669.7,5455755.6,,) |
| 2 | (326669.7,5455755.6,,) | (326688.1,5455766,,) |
| 2 | (326688.1,5455766,,) | (326685.3,5455770.5,,) |
| 2 | (326685.3,5455770.5,,) | (326667.8,5455760.3,,) |
Let’s visualise this by moving the resultant vectors parallel 10 meters from the original polygon boundaries.
We will use Regina Obe’s upgis_lineshift function to move individual polygon vectors to the right by 10m.
CREATE OR REPLACE FUNCTION upgis_lineshift(centerline geometry, dist double precision)
RETURNS geometry AS
$$
DECLARE
delx float;
dely float;
x0 float;
y0 float;
x1 float;
y1 float;
az float;
dir integer;
line geometry;
BEGIN
az := ST_Azimuth (ST_StartPoint(centerline), ST_EndPoint(centerline));
dir := CASE WHEN az < pi() THEN -1 ELSE 1 END;
delx := ABS(COS(az)) * dist * dir;
dely := ABS(SIN(az)) * dist * dir;
Then we need a table to hold the vectors we create and shift.
DROP TABLE polygon_vectors;
| addgeometrycolumn text |
|---|
| public.polygon_vectors.geom SRID:28355 TYPE:LINESTRING DIMS:2 |
Finally, let’s create, shift and write the vectors.
INSERT INTO polygon_vectors (geom)
SELECT upgis_lineshift(ST_SetSRID(ST_MakeLine(ST_MakePoint((startcoord).x,(startcoord).y)::geometry,
ST_MakePoint((endcoord).x, (endcoord).y)::geometry),
28355),
10)
FROM (SELECT (ST_GetVector(v_polygons.geom)).*
FROM v_polygons ) as v;
| count bigint |
|---|
| 16 |
Finally, let’s visualise the polygons and the shifted vectors.

I hope this function, and article, is of use to someone.


















As I have pointed out in other blog articles, spatial data quality should not engender either/or solutions when building business applications. That is, if I can only create points for parcel centroids that fall within land parcels, then I don’t just build the rule in to the editing software that implements the end-user edit capabilities of the system: I should also add such constraints to the database that is holding the application data model and datat. The ability of databases in the area of spatial data quality is variable: this is because of the comparative youthful age of current spatial object relational databases.
One of the benefits of the mathematics behind relational theory is that, where the prescriptions of the model are implemented and used, data quality is ensured. We have many relational and object-relational databases available today (both commercially in Oracle, SQL Server etc and open source, MySQL, PostgreSQL etc) but these databases all implement relational theory (or the SQL standards that have taken over from the pure science) to lesser or greater extent. A complaint of C.J. Date is that many of todays databases that purport to be relational are not; he even argues persuasively that SQL is not relational.
Prosaically, the sorts of constraints in relational databases that are of use in ensuring data quality are:
In the SQL-92 standard there is the little know ASSERTION (constraint) that, as far as I know, no commercial database today implements.
While all these constraints are based on attribute data and not complex data types such as spatial data, only some can be used to control spatial data quality. These are the Column and Table versions of the CHECK constraint.
PostGIS’s AddGeometryColumn() Function
PostGIS can ensure spatial data quality through both an ordinary and advanced use of the CHECK constraint.
One of the nice things about registering a table/column using PostGIS’s AddGeometryColumn() function is that it automatically adds in CHECK constraints that test that the entered geometry:
The following statements show how this is done.
-- 0 Start with a clean data model
DROP TABLE simon.parcel_centroid;
Query returned successfully with no result in 16 ms.
| addgeometry text |
|---|
| simon.parcel.geom SRID:28355 TYPE:POLYGON DIMS:2 |
/* Note that the table now has the following structure.
CREATE TABLE simon.parcel
(
gid serial NOT NULL,
geom geometry,
CONSTRAINT parcel_pkey PRIMARY KEY (gid),
CONSTRAINT enforce_dims_geom CHECK (st_ndims(geom) = 2),
CONSTRAINT enforce_geotype_geom CHECK (geometrytype(geom) = 'POLYGON'::text OR geom IS NULL),
CONSTRAINT enforce_srid_geom CHECK (st_srid(geom) = 28355)
)
*/
There is something important to note about these simple CHECK constraints: they use functions to extract a value from the geometry object that can be tested in a first order predicate expression that returns true or false: “function() == value”.
We can now conduct some tests on the table created above to show how these CHECK constraints work.
-- 3. try and insert a POLYGON with wrong dimensionality
insert into simon.parcel(gid,geom) values (1,ST_GeomFromEWKT('POLYGON ((100 0 -9,120 0 -9,120 20 -9,100 20 -9,100 0 -9))'));
ERROR: new row for relation "parcel" violates check constraint "enforce_dims_geom"
| gid integer |
st_astext text |
|---|---|
| 1 | “POLYGON ((100 0,120 0,120 20,100 20,100 0))” |
| 2 | “POLYGON ((0 0,20 0,20 20,0 20,0 0), (10 10,10 11,11 11,11 10,10 10), (5 5,5 7,7 7,7 5,5 5))” |
| 3 | “MULTIPOLYGON (((0 0,20 0,20 20,0 20,0 0), (10 10,10 11,11 11,11 10,10 10), (5 5,5 7,7 7,7 5,5 5)), ((50 5,50 7,70 7,70 5,50 5)))” |
Parcel Centroid Points within Parcel Polygons
Let’s add a new rule that a land parcel has to have a (single) centroid point that falls within the polygon representing the parcel. We can implement this in a number of ways.
Two Column Table
Firstly, let’s start by adding a new column to our existing table. Tables with multiple geometric columns are not common mainly because GIS packages have been unable to deal with them: though some nowadays do. However, in this situation we have a single centroid point with no other attributes describing the land parcel so it makes sense to add it to the parcel table. Note: If your GIS can’t cope with a multi-geometry columned table try using views for both visualisation and update.
SELECT addGeometryColumn('simon','parcel','centroid','28355','POINT','2');
| addgeometry text |
|---|
| simon.parcel.centroid SRID:28355 TYPE:POINT DIMS:2 |
Our table now looks like this.
CREATE TABLE simon.parcel
(
gid serial NOT NULL,
geom geometry,
centroid geometry,
CONSTRAINT parcel_pkey PRIMARY KEY (gid),
CONSTRAINT enforce_dims_centroid CHECK (st_ndims(centroid) = 2),
CONSTRAINT enforce_dims_geom CHECK (st_ndims(geom) = 2),
CONSTRAINT enforce_geotype_centroid CHECK (geometrytype(centroid) = 'POINT'::text OR centroid IS NULL),
CONSTRAINT enforce_geotype_geom CHECK ((geometrytype(geom) = ANY (ARRAY['MULTIPOLYGON'::text, 'POLYGON'::text])) OR geom IS NULL),
CONSTRAINT enforce_srid_centroid CHECK (st_srid(centroid) = 28355),
CONSTRAINT enforce_srid_geom CHECK (st_srid(geom) = 28355)
)
WITH (
OIDS=FALSE
);
How can we implement a spatial constraint to ensure that any point added/updated is checked to ensure it falls within the host polygon? Well, we can use the ST_Covers() function directly in the check constraint. However, we cannot apply this constraint as we have rows in the table with no centroids and PostgreSQL does not have a NOVALIDATE (as Oracle does – this means no not apply the constraint to existing records only inserts/updates from now on) option when adding a constraint.
-- 1 Add centroids to existing polygons
UPDATE simon.parcel SET centroid = ST_Centroid(geom);
Query returned successfully: 3 rows affected, 62 ms execution time.
| gid integer |
st_covers boolean |
|---|---|
| 1 | t |
| 2 | f |
| 3 | t |
-- 4 Fix it
UPDATE simon.parcel SET centroid = ST_PointOnSurface(geom) WHERE gid = 2;
Query returned successfully: 1 rows affected, 63 ms execution time.
| gid integer |
st_covers boolean |
|---|---|
| 2 | t |
-- 6. Now we can apply the constraint
ALTER TABLE simon.parcel ADD CONSTRAINT centroid_in_parcel CHECK (centroid IS NOT NULL AND ST_Covers(geom,centroid) = True);
Query returned successfully with no result in 16 ms.
Two Table Solution with Foreign Key
It is not common to see the above modelling. Mostly one sees the parcel polygons in one table with another table being used to hold the parcel centroids. We can still do our centroid_in_parcel check constraint, however, but there is a little bit more involved.
-- 1. Create a table to hold the centroid
DROP TABLE simon.parcel_centroid;
Query returned successfully with no result in 47 ms.
| addgeometry text |
|---|
| simon.parcel_centroid.geom SRID:28355 TYPE:POINT DIMS:2 |
Our parcel_centroid table now looks like this:
CREATE TABLE simon.parcel_centroid
(
gid serial NOT NULL,
gid_parcel integer NOT NULL,
geom geometry,
CONSTRAINT parcel_centroid_pkey PRIMARY KEY (gid),
CONSTRAINT parcel_gid_fk FOREIGN KEY (gid_parcel)
REFERENCES simon.parcel (gid) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT enforce_dims_geom CHECK (st_ndims(geom) = 2),
CONSTRAINT enforce_geotype_geom CHECK (geometrytype(geom) = 'POINT'::text OR geom IS NULL),
CONSTRAINT enforce_srid_geom CHECK (st_srid(geom) = 28355)
)
WITH (
OIDS=FALSE
);
*/
The only way we can check that these centroids fall inside the geometry polygons in the related, parcel, table is by constructing a function and using it instead of ST_Covers in our CHECK constraint
-- 4. Construction function
CREATE OR REPLACE FUNCTION simon.centroid_in_parcel(geometry)
RETURNS integer AS
$BODY$
SELECT count(*)::integer FROM simon.parcel p WHERE ST_Covers(p.geom,$1);
$BODY$
LANGUAGE 'sql' IMMUTABLE
COST 100;
Query returned successfully with no result in 16 ms.
| gid integer |
st_covers boolean |
|---|---|
| 1 | t |
| 2 | t |
| 3 | f |
-- 6. Fix gid = 3
UPDATE simon.parcel_centroid c SET geom = (SELECT ST_PointOnSurface(geom) FROM simon.parcel p WHERE p.gid = c.gid_parcel) WHERE gid = 3;
Query returned successfully: 1 rows affected, 15 ms execution time.
| gid integer |
st_covers boolean |
|---|---|
| 3 | t |
-- 8. Now apply constraint
ALTER TABLE simon.parcel_centroid ADD CONSTRAINT centroid_in_parcel_ck CHECK (simon.centroid_in_parcel(geom) > 0);
Query returned successfully with no result in 16 ms.
Excellent it works exactly as we wish: no-one can now write a centroid point that falls outside its land parcel!
Because PostgreSQL allows functions to be called from within a table’s CHECK constraints, powerful spatial data quality constraints can be added to a data model with spatial data embedded with it.
I hope this little article is of use to someone.


















My good friend Regina Obe has taken my article Loading and Processing GPX 1.1 files using Oracle XMLDB and implemented it using PostgreSQL’s XML implementation.
For those who are interested the article is called Loading and Processing GPX XML files using PostgreSQL

















