Go to content Go to navigation and search

Home

PostGIS Blog Articles

Search

RSS / Atom

Email me

textpattern

How to apply spatial constraints to PostGIS tables

· May 14, 05:54 pm by Simon Greener

As I have pointed out in other blog articles, spatial data quality should not engender either/or solutions when building business applications. That is, if I can only create points for parcel centroids that fall within land parcels, then I don’t just build the rule in to the editing software that implements the end-user edit capabilities of the system: I should also add such constraints to the database that is holding the application data model and datat. The ability of databases in the area of spatial data quality is variable: this is because of the comparative youthful age of current spatial object relational databases.

One of the benefits of the mathematics behind relational theory is that, where the prescriptions of the model are implemented and used, data quality is ensured. We have many relational and object-relational databases available today (both commercially in Oracle, SQL Server etc and open source, MySQL, PostgreSQL etc) but these databases all implement relational theory (or the SQL standards that have taken over from the pure science) to lesser or greater extent. A complaint of C.J. Date is that many of todays databases that purport to be relational are not; he even argues persuasively that SQL is not relational.

Prosaically, the sorts of constraints in relational databases that are of use in ensuring data quality are:

  1. Primary/Unique Constraints;
  2. Not Null
  3. Foreign Key Constraints
  4. Column Check Constraints
  5. Table Check Constraints.

In the SQL-92 standard there is the little know ASSERTION (constraint) that, as far as I know, no commercial database today implements.

While all these constraints are based on attribute data and not complex data types such as spatial data, only some can be used to control spatial data quality. These are the Column and Table versions of the CHECK constraint.

PostGIS’s AddGeometryColumn() Function

PostGIS can ensure spatial data quality through both an ordinary and advanced use of the CHECK constraint.

One of the nice things about registering a table/column using PostGIS’s AddGeometryColumn() function is that it automatically adds in CHECK constraints that test that the entered geometry:

  1. Is of the right type ie POLYGON and not POINT
  2. Has right SRID (if supplied)
  3. Has right dimensionality

The following statements show how this is done.

-- 0 Start with a clean data model
DROP   TABLE simon.parcel_centroid;
Query returned successfully with no result in 16 ms.

DROP   TABLE simon.parcel;
Query returned successfully with no result in 16 ms.

-- 1 Create bare bones table without geometry column
CREATE TABLE simon.parcel
(
  gid serial NOT NULL Primary Key
)
WITH (
  OIDS=FALSE
);
NOTICE:  CREATE TABLE will create implicit sequence "parcel_gid_seq" for serial column "parcel.gid"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "parcel_pkey" for table "parcel"
Query returned successfully with no result in 109 ms.

/* Table looks like this:
** 
CREATE TABLE simon.parcel
(
  gid serial NOT NULL,
  CONSTRAINT parcel_pkey PRIMARY KEY (gid)
)
WITH (
  OIDS=FALSE
);
*/

-- Grant ownership 
ALTER TABLE simon.parcel OWNER TO postgres;
Query returned successfully with no result in 16 ms.

-- 2. Add geometry column and CHECK constraints on geometry
SELECT addGeometryColumn('simon','parcel','geom','28355','POLYGON','2'); 

addgeometry
text
simon.parcel.geom SRID:28355 TYPE:POLYGON DIMS:2

/* Note that the table now has the following structure.
CREATE TABLE simon.parcel
(
  gid serial NOT NULL,
  geom geometry,
  CONSTRAINT parcel_pkey PRIMARY KEY (gid),
  CONSTRAINT enforce_dims_geom CHECK (st_ndims(geom) = 2),
  CONSTRAINT enforce_geotype_geom CHECK (geometrytype(geom) = 'POLYGON'::text OR geom IS NULL),
  CONSTRAINT enforce_srid_geom CHECK (st_srid(geom) = 28355)
)
*/

There is something important to note about these simple CHECK constraints: they use functions to extract a value from the geometry object that can be tested in a first order predicate expression that returns true or false: “function() == value”.

We can now conduct some tests on the table created above to show how these CHECK constraints work.

-- 3. try and insert a POLYGON with wrong dimensionality
insert into simon.parcel(gid,geom) values (1,ST_GeomFromEWKT('POLYGON ((100 0 -9,120 0 -9,120 20 -9,100 20 -9,100 0 -9))'));
ERROR:  new row for relation "parcel" violates check constraint "enforce_dims_geom"

-- 4. Try and insert POLYGON with right dimensionality
insert into simon.parcel(gid,geom) values (1,ST_PolygonFromText('POLYGON ((100 0,120 0,120 20,100 20,100 0))'));
ERROR:  new row for relation "parcel" violates check constraint "enforce_srid_geom"

-- 5. Try and insert geometry with right SRID
insert into simon.parcel(gid,geom) values (1,ST_PolygonFromText('POLYGON ((100 0,120 0,120 20,100 20,100 0))',28355));
 Query returned successfully: 1 rows affected, 16 ms execution time.

-- 6. Insert Another POLYGON
insert into simon.parcel(gid,geom) values (2,ST_PolygonFromText('POLYGON ((0 0,20 0,20 20,0 20,0 0),(10 10,10 11,11 11,11 10,10 10),(5 5,5 7,7 7,7 5,5 5))',28355));
Query returned successfully: 1 rows affected, 16 ms execution time.

-- 7. Now try and insert a MULTIPOLYGON
insert into simon.parcel(gid,geom) values (3,ST_MultiPolygonFromText('MULTIPOLYGON (((0 0,20 0,20 20,0 20,0 0),(10 10,10 11,11 11,11 10,10 10),(5 5,5 7,7 7,7 5,5 5)),((50 5,50 7,70 7,70 5,50 5)))',28355));
ERROR:  new row for relation "parcel" violates check constraint "enforce_geotype_geom"

-- How do we fix this if we want both POLYGON and MULTIPOLYGONS in our table?
-- We could do this back at the original call to AddGeometryColumn but here we will show how to do it post-factum.

-- 8. Modify the constraint directly
ALTER TABLE simon.parcel DROP CONSTRAINT enforce_geotype_geom;
Query returned successfully with no result in 15 ms.

ALTER TABLE simon.parcel ADD  CONSTRAINT enforce_geotype_geom CHECK ((geometrytype(geom) = ANY (ARRAY['MULTIPOLYGON'::text, 'POLYGON'::text])) OR geom IS NULL);
Query returned successfully with no result in 31 ms.

-- 9. Try again
insert into simon.parcel(gid,geom) values (3,ST_MultiPolygonFromText('MULTIPOLYGON (((0 0,20 0,20 20,0 20,0 0), (10 10,10 11,11 11,11 10,10 10), (5 5,5 7,7 7,7 5,5 5)), ((50 5,50 7,70 7,70 5,50 5)))',28355));
Query returned successfully: 1 rows affected, 32 ms execution time.

select gid, ST_AsText(geom)
  from simon.parcel;

gid
integer
st_astext
text
1 POLYGON ((100 0,120 0,120 20,100 20,100 0))”
2 POLYGON ((0 0,20 0,20 20,0 20,0 0), (10 10,10 11,11 11,11 10,10 10), (5 5,5 7,7 7,7 5,5 5))”
3 MULTIPOLYGON (((0 0,20 0,20 20,0 20,0 0), (10 10,10 11,11 11,11 10,10 10), (5 5,5 7,7 7,7 5,5 5)), ((50 5,50 7,70 7,70 5,50 5)))”


Parcel Centroid Points within Parcel Polygons
Let’s add a new rule that a land parcel has to have a (single) centroid point that falls within the polygon representing the parcel. We can implement this in a number of ways.

Two Column Table
Firstly, let’s start by adding a new column to our existing table. Tables with multiple geometric columns are not common mainly because GIS packages have been unable to deal with them: though some nowadays do. However, in this situation we have a single centroid point with no other attributes describing the land parcel so it makes sense to add it to the parcel table. Note: If your GIS can’t cope with a multi-geometry columned table try using views for both visualisation and update.

SELECT addGeometryColumn('simon','parcel','centroid','28355','POINT','2'); 

addgeometry
text
simon.parcel.centroid SRID:28355 TYPE:POINT DIMS:2

Our table now looks like this.

CREATE TABLE simon.parcel
(
  gid serial NOT NULL,
  geom geometry,
  centroid geometry,
  CONSTRAINT parcel_pkey PRIMARY KEY (gid),
  CONSTRAINT enforce_dims_centroid CHECK (st_ndims(centroid) = 2),
  CONSTRAINT enforce_dims_geom CHECK (st_ndims(geom) = 2),
  CONSTRAINT enforce_geotype_centroid CHECK (geometrytype(centroid) = 'POINT'::text OR centroid IS NULL),
  CONSTRAINT enforce_geotype_geom CHECK ((geometrytype(geom) = ANY (ARRAY['MULTIPOLYGON'::text, 'POLYGON'::text])) OR geom IS NULL),
  CONSTRAINT enforce_srid_centroid CHECK (st_srid(centroid) = 28355),
  CONSTRAINT enforce_srid_geom CHECK (st_srid(geom) = 28355)
)
WITH (
  OIDS=FALSE
);

How can we implement a spatial constraint to ensure that any point added/updated is checked to ensure it falls within the host polygon? Well, we can use the ST_Covers() function directly in the check constraint. However, we cannot apply this constraint as we have rows in the table with no centroids and PostgreSQL does not have a NOVALIDATE (as Oracle does – this means no not apply the constraint to existing records only inserts/updates from now on) option when adding a constraint.

-- 1 Add centroids to existing polygons
UPDATE simon.parcel SET centroid = ST_Centroid(geom);
Query returned successfully: 3 rows affected, 62 ms execution time.

-- 2. Now, apply centroid constraint
ALTER TABLE simon.parcel ADD  CONSTRAINT centroid_in_parcel CHECK (centroid IS NOT NULL AND ST_Covers(geom,centroid) = True);
ERROR:  check constraint "centroid_in_parcel" is violated by some row

-- 3. Find which rows fail
SELECT gid, ST_Covers(geom,centroid)
  FROM simon.parcel;

gid
integer
st_covers
boolean
1 t
2 f
3 t

-- 4 Fix it
UPDATE simon.parcel SET centroid = ST_PointOnSurface(geom) WHERE gid = 2;
Query returned successfully: 1 rows affected, 63 ms execution time.

-- 5. Check
SELECT gid, ST_Covers(geom,centroid)
  FROM simon.parcel
 WHERE gid = 2;

gid
integer
st_covers
boolean
2 t

-- 6. Now we can apply the constraint 
ALTER TABLE simon.parcel ADD  CONSTRAINT centroid_in_parcel CHECK (centroid IS NOT NULL AND ST_Covers(geom,centroid) = True);
Query returned successfully with no result in 16 ms.

/* Table now looks like this
CREATE TABLE simon.parcel
(
  gid serial NOT NULL,
  geom geometry,
  centroid geometry,
  CONSTRAINT parcel_pkey PRIMARY KEY (gid),
  CONSTRAINT centroid_in_parcel CHECK (centroid IS NOT NULL AND st_covers(geom, centroid) = true),
  CONSTRAINT enforce_dims_centroid CHECK (st_ndims(centroid) = 2),
  CONSTRAINT enforce_dims_geom CHECK (st_ndims(geom) = 2),
  CONSTRAINT enforce_geotype_centroid CHECK (geometrytype(centroid) = 'POINT'::text OR centroid IS NULL),
  CONSTRAINT enforce_geotype_geom CHECK ((geometrytype(geom) = ANY (ARRAY['MULTIPOLYGON'::text, 'POLYGON'::text])) OR geom IS NULL),
  CONSTRAINT enforce_srid_centroid CHECK (st_srid(centroid) = 28355),
  CONSTRAINT enforce_srid_geom CHECK (st_srid(geom) = 28355)
)
WITH (
  OIDS=FALSE
);
*/

Two Table Solution with Foreign Key
It is not common to see the above modelling. Mostly one sees the parcel polygons in one table with another table being used to hold the parcel centroids. We can still do our centroid_in_parcel check constraint, however, but there is a little bit more involved.

-- 1. Create a table to hold the centroid
DROP   TABLE simon.parcel_centroid;
Query returned successfully with no result in 47 ms.

CREATE TABLE simon.parcel_centroid
(
  gid        serial NOT NULL PRIMARY KEY,
  gid_parcel integer not null,
  CONSTRAINT parcel_gid_fk FOREIGN KEY (gid_parcel) REFERENCES simon.parcel(gid)
)
WITH (
  OIDS=FALSE
);
NOTICE:  CREATE TABLE will create implicit sequence "parcel_centroid_gid_seq" for serial column "parcel_centroid.gid"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "parcel_centroid_pkey" for table "parcel_centroid"
Query returned successfully with no result in 156 ms.

/* Table looks like this
**
CREATE TABLE simon.parcel_centroid
(
  gid serial NOT NULL,
  gid_parcel integer NOT NULL,
  geom geometry,
  CONSTRAINT parcel_centroid_pkey PRIMARY KEY (gid),
  CONSTRAINT parcel_gid_fk FOREIGN KEY (gid_parcel)
      REFERENCES simon.parcel (gid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
  OIDS=FALSE
);
*/

-- Change ownership
ALTER TABLE simon.parcel_centroid OWNER TO postgres;
Query returned successfully with no result in 16 ms.

-- 2. Add geometry column and CHECK constraints on geometry
SELECT addGeometryColumn('simon','parcel_centroid','geom','28355','POINT','2'); 

addgeometry
text
simon.parcel_centroid.geom SRID:28355 TYPE:POINT DIMS:2


Our parcel_centroid table now looks like this:

CREATE TABLE simon.parcel_centroid
(
  gid serial NOT NULL,
  gid_parcel integer NOT NULL,
  geom geometry,
  CONSTRAINT parcel_centroid_pkey PRIMARY KEY (gid),
  CONSTRAINT parcel_gid_fk FOREIGN KEY (gid_parcel)
      REFERENCES simon.parcel (gid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT enforce_dims_geom CHECK (st_ndims(geom) = 2),
  CONSTRAINT enforce_geotype_geom CHECK (geometrytype(geom) = 'POINT'::text OR geom IS NULL),
  CONSTRAINT enforce_srid_geom CHECK (st_srid(geom) = 28355)
)
WITH (
  OIDS=FALSE
);
*/

-- 3. Now, insert data via a query against the parcel table
insert into simon.parcel_centroid(gid_parcel, geom) 
  select gid,ST_Centroid(geom)
    from simon.parcel;
Query returned successfully: 3 rows affected, 16 ms execution time.

The only way we can check that these centroids fall inside the geometry polygons in the related, parcel, table is by constructing a function and using it instead of ST_Covers in our CHECK constraint

-- 4. Construction function
CREATE OR REPLACE FUNCTION simon.centroid_in_parcel(geometry)
  RETURNS integer AS
$BODY$
SELECT count(*)::integer FROM simon.parcel p WHERE ST_Covers(p.geom,$1);
$BODY$
  LANGUAGE 'sql' IMMUTABLE
  COST 100;
Query returned successfully with no result in 16 ms.

ALTER FUNCTION simon.centroid_in_parcel(geometry) OWNER TO postgres;
Query returned successfully with no result in 16 ms.

-- 5. Check for validity before application
SELECT c.gid, ST_Covers(p.geom,c.geom)
  FROM simon.parcel p
       INNER JOIN
       simon.parcel_centroid c
       ON ( c.gid_parcel = p.gid );

gid
integer
st_covers
boolean
1 t
2 t
3 f

-- 6. Fix gid = 3
UPDATE simon.parcel_centroid c SET geom = (SELECT ST_PointOnSurface(geom) FROM simon.parcel p WHERE p.gid = c.gid_parcel) WHERE gid = 3;
Query returned successfully: 1 rows affected, 15 ms execution time.

-- 7. Check again for validity before application
SELECT c.gid, ST_Covers(p.geom,c.geom)
  FROM simon.parcel p
       INNER JOIN
       simon.parcel_centroid c
       ON ( c.gid_parcel = p.gid )
 WHERE c.gid = 3;

gid
integer
st_covers
boolean
3 t

-- 8. Now apply constraint
ALTER TABLE simon.parcel_centroid ADD CONSTRAINT centroid_in_parcel_ck CHECK (simon.centroid_in_parcel(geom) > 0);
Query returned successfully with no result in 16 ms.

/* Table looks like this
**
CREATE TABLE simon.parcel_centroid
(
  gid serial NOT NULL,
  gid_parcel integer NOT NULL,
  geom geometry,
  CONSTRAINT parcel_centroid_pkey PRIMARY KEY (gid),
  CONSTRAINT parcel_gid_fk FOREIGN KEY (gid_parcel)
      REFERENCES simon.parcel (gid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT centroid_in_parcel_ck CHECK (simon.centroid_in_parcel(geom) > 0),
  CONSTRAINT enforce_dims_geom CHECK (st_ndims(geom) = 2),
  CONSTRAINT enforce_geotype_geom CHECK (geometrytype(geom) = 'POINT'::text OR geom IS NULL),
  CONSTRAINT enforce_srid_geom CHECK (st_srid(geom) = 28355)
)
WITH (
  OIDS=FALSE
);
*/

-- 9. Now test with an invalid centroid
UPDATE simon.parcel_centroid c SET geom = (SELECT ST_Centroid(geom) FROM simon.parcel p WHERE p.gid = c.gid_parcel) WHERE gid = 3;
ERROR:  new row for relation "parcel_centroid" violates check constraint "centroid_in_parcel_ck"

Excellent it works exactly as we wish: no-one can now write a centroid point that falls outside its land parcel!

Because PostgreSQL allows functions to be called from within a table’s CHECK constraints, powerful spatial data quality constraints can be added to a data model with spatial data embedded with it.

I hope this little article is of use to someone.

post this at del.icio.uspost this at Diggpost this at Technoratipost this at Newsvinepost this at Ma.gnoliapost this at Furlpost this at Blinklistpost this at Spurlpost this at Wistspost this at Simpypost this at Redditpost this at Farkpost this at Blogmarkspost this at Yahoo! my webpost this at Mr. Wongpost this at Windows Livepost this at Google Bookmarkspost this to Twitter

Loading and Processing GPX 1.1 files using PostgreSQL XML

· Apr 30, 06:53 pm by Simon Greener

My good friend Regina Obe has taken my article Loading and Processing GPX 1.1 files using Oracle XMLDB and implemented it using PostgreSQL’s XML implementation.

For those who are interested the article is called Loading and Processing GPX XML files using PostgreSQL

post this at del.icio.uspost this at Diggpost this at Technoratipost this at Newsvinepost this at Ma.gnoliapost this at Furlpost this at Blinklistpost this at Spurlpost this at Wistspost this at Simpypost this at Redditpost this at Farkpost this at Blogmarkspost this at Yahoo! my webpost this at Mr. Wongpost this at Windows Livepost this at Google Bookmarkspost this to Twitter

Converting Oracle Optimized Rectangles to PostGIS

· Apr 27, 10:39 pm by Simon Greener

Someone on the PostGIS discussion list asked about a problem converting an Oracle database to PostGIS and he had run into some difficulties with Oracle’s Optimized Rectangles.

Is there a way to store rectangles in postGIS in a similar fashion?

With the “similar fashion” being all about converting to equivalent 5 vertex POLYGONS:

MDSYS.SDO_GEOMETRY(2003,NULL,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,1003,3),MDSYS.SDO_ORDINATE_ARRAY(topleftlat,topleftlon,bottomrightlat,bottomrightlon)),null)

I understand this loads a rectangle in 2 points. I’ve written this postGIS code:

GeomFromText(‘POLYGON (” + topleftlat + “ “ + topleftlon + “ “ + bottomrightlat + “ “ + bottomrightlon + “))’, -1)

Putting aside the inversion of the rectangle above, and the incorrect specification of the WKT Polygon, I endeavoured to help him within some SQL scripting.

Here ‘tis.

Create Optimized Rectangle Data

So that you can replicate the conversion, let’s first start by creating a SQL script that will generate 500 Oracle Spatial optimized rectangles.

DROP   TABLE Conversion PURGE;
CREATE TABLE Conversion ( gid integer, geom mdsys.sdo_geometry );
SET FEEDBACK OFF
    INSERT INTO Conversion
      SELECT rownum,
             mdsys.sdo_geometry(2003,4326,NULL,
                   MDSYS.SDO_ELEM_INFO_ARRAY(1,1003,3), -- optimized rectangle
                   MDSYS.SDO_ORDINATE_ARRAY(
                         ROUND(lon,6),
                         ROUND(lat,6),
                         ROUND(lon+dbms_random.value(0.1,1.0),6),
                         ROUND(lat+dbms_random.value(0.1,1.0),6)
                         ))
       FROM (SELECT dbms_random.value(147,149) as lon,
                    dbms_random.value(-44,-42) as lat
               FROM DUAL)
     CONNECT BY LEVEL <= 500;
commit;
select distinct sdo_geom.validate_geometry(geom,0.005) from conversion;

-- Write CSV header and data
SET ECHO OFF
@write_csv

The last @write_csv call is done to ensure that “SET ECHO OFF” does not reprint the SELECT statements in write_csv.sql into the spooled CSV file.

Export from Oracle as 4 numeric columns

Now, to create a CSV file we need a second file called “write_csv.sql” into which we will write the optimized rectangle’s 2 coordinates.

SET NEWPAGE 0
SET SPACE 0
SET LINESIZE 9999
SET PAGESIZE 0
SET FEEDBACK OFF
SET VERIFY OFF
SET HEADING OFF
SET MARKUP HTML OFF
SET TRIMSPOOL ON
SET LONG 4000
SET LONGCHUNKSIZE 500
SET TERMOUT OFF
SET SQLPROMPT OFF
spool c:\temp\conversion.csv
select 'gid,bottomleftlon,bottomleftlat,toprightlon,toprightlat' from dual;
select gid || ',' || bottomleftlon || ',' || bottomleftlat || ',' || toprightlon || ',' ||  toprightlat
  from (select a.gid, 
               sum(case when MOD(rownum,2) = 1 then v.x else null end) as bottomleftlon, 
               sum(case when MOD(rownum,2) = 1 then v.y else null end) as bottomleftlat, 
               sum(case when MOD(rownum,2) = 0 then v.x else null end) as toprightlon, 
               sum(case when MOD(rownum,2) = 0 then v.y else null end) as toprightlat
          from conversion a, 
               table(sdo_util.getvertices(a.geom)) v
        group by a.gid
        order by 1
       );
spool off
SET SQLPROMPT 'SQL> '

The conversion.csv file looks like this:

gid,bottomleftlon,bottomleftlat,toprightlon,toprightlat
1,148.629941,-42.548192,149.541498,-42.20901
2,148.629941,-42.548192,148.855305,-42.255013
3,148.629941,-42.548192,148.990817,-41.829754
4,148.629941,-42.548192,149.515169,-42.380816

Load into PostGIS as BBOX

The CSV file holds all the data for creating a PostGIS BBOX. The following script shows how this would be done.

DROP   TABLE Conversion;

Query returned successfully with no result in 16 ms.

CREATE TABLE Conversion (
  GID           integer,
  bottomLeftLon double precision,
  bottomLeftLat double precision,
  topRightLon   double precision,
  topRightLat   double precision);

Query returned successfully with no result in 15 ms.

COPY Conversion ( gid,bottomLeftLon,bottomLeftLat,topRightLon, topRightLat )
    FROM 'c:/temp/conversion.csv'
    WITH 
          DELIMITER AS ','
          CSV HEADER ;

Query returned successfully: 500 rows affected, 16 ms execution time.

SELECT addGeometryColumn('postgis','conversion','bbox','4326','POLYGON','2'); 

addgeometry
text
------------------------------------------------------
"public.conversion.bbox SRID:4326 TYPE:POLYGON DIMS:2"

UPDATE Conversion set bbox = ST_SetSRID(ST_MakeBox2D(ST_MakePoint(bottomLeftLon,bottomLeftLat), 
                                        ST_MakeBox2D(ST_MakePoint(topRightLon, topRightLat)),
                                        4326);

Query returned successfully: 500 rows affected, 32 ms execution time.

SELECT gid, ST_AsText(bbox)
  FROM Conversion
  LIMIT 10;

gid integer st_astext text
1 POLYGON ((148.789993286133 -43.4000015258789,148.789993286133 149.660003662109,149.660003662109 149.660003662109,149.660003662109 -43.4000015258789,148.789993286133 -43.4000015258789))”
2 POLYGON ((148.789993286133 -43.4000015258789,148.789993286133 149.600006103516,149.600006103516 149.600006103516,149.600006103516 -43.4000015258789,148.789993286133 -43.4000015258789))”
3 POLYGON ((148.789993286133 -43.4000015258789,148.789993286133 149.240005493164,149.240005493164 149.240005493164,149.240005493164 -43.4000015258789,148.789993286133 -43.4000015258789))”
4 POLYGON ((148.789993286133 -43.4000015258789,148.789993286133 149.380004882813,149.380004882813 149.380004882813,149.380004882813 -43.4000015258789,148.789993286133 -43.4000015258789))”
5 POLYGON ((148.789993286133 -43.4000015258789,148.789993286133 148.990005493164,148.990005493164 148.990005493164,148.990005493164 -43.4000015258789,148.789993286133 -43.4000015258789))”
6 POLYGON ((148.789993286133 -43.4000015258789,148.789993286133 149.419998168945,149.419998168945 149.419998168945,149.419998168945 -43.4000015258789,148.789993286133 -43.4000015258789))”
7 POLYGON ((148.789993286133 -43.4000015258789,148.789993286133 149.470001220703,149.470001220703 149.470001220703,149.470001220703 -43.4000015258789,148.789993286133 -43.4000015258789))”
8 POLYGON ((148.789993286133 -43.4000015258789,148.789993286133 149.009994506836,149.009994506836 149.009994506836,149.009994506836 -43.4000015258789,148.789993286133 -43.4000015258789))”
9 POLYGON ((148.789993286133 -43.4000015258789,148.789993286133 148.919998168945,148.919998168945 148.919998168945,148.919998168945 -43.4000015258789,148.789993286133 -43.4000015258789))”
10 POLYGON ((148.789993286133 -43.4000015258789,148.789993286133 149.169998168945,149.169998168945 149.169998168945,149.169998168945 -43.4000015258789,148.789993286133 -43.4000015258789))”

-- Create a spatial index for faster querying
CREATE INDEX conversion_bbox ON conversion USING GIST ( bbox );

Query returned successfully with no result in 0 ms.

-- Now, use the newly indexed spatial column in the spatial equivalent of the above query
SELECT count(*)
  FROM conversion
 WHERE bbox && SetSRID('BOX3D(148.7 -42.1, 148.8 -42.3)'::box3d,4326) ;

count
bigint
------
438

OK, so that is all based on CSV holding the 4 ordinate values of the optimized rectangle. You could, of course, export the data to a shapfile, or we could alter the above process to export the data as an OGC WKT Geometry.

Export from Oracle as OGC WKT

At 10g, Oracle Spatial has a .GET_WKT() method on the SDO_GEOMETRY class. We can use this to export the WKT description of an optimized rectangle. (You will notice in the output below that Oracle automatically converts this to a 5 point POLYGON object.)

All we need to do is change the SQL in the write_csv.sql script above.

SET NEWPAGE 0
SET SPACE 0
SET LINESIZE 9999
SET PAGESIZE 0
SET FEEDBACK OFF
SET VERIFY OFF
SET HEADING OFF
SET MARKUP HTML OFF
SET TRIMSPOOL ON
SET LONG 4000
SET LONGCHUNKSIZE 500
SET TERMOUT OFF
SET SQLPROMPT OFF
spool c:\temp\conversion.csv
select 'gid,geom_wkt' from dual;
select a.gid || ',"' || CAST(a.geom.get_wkt() as VARCHAR2(1000)) || '"'
  from conversion a
order by a.gid;
spool off
SET SQLPROMPT 'SQL> '

The conversion.csv file looks like this:

gid,geom_wkt
1,“POLYGON ((148.738888 -42.967608, 149.122624 -42.967608, 149.122624 -42.505418, 148.738888 -42.505418, 148.738888 -42.967608))”
2,“POLYGON ((148.738888 -42.967608, 149.143449 -42.967608, 149.143449 -42.078775, 148.738888 -42.078775, 148.738888 -42.967608))”
3,“POLYGON ((148.738888 -42.967608, 149.406925 -42.967608, 149.406925 -42.584677, 148.738888 -42.584677, 148.738888 -42.967608))”
4,“POLYGON ((148.738888 -42.967608, 149.727345 -42.967608, 149.727345 -42.703013, 148.738888 -42.703013, 148.738888 -42.967608))”

Load into PostGIS as POLYGON

Here is our, revamped, load script for loading the OGC WKT into PostGIS.

DROP   TABLE Conversion;

Query returned successfully with no result in 16 ms.

CREATE TABLE Conversion (
  GID       integer,
  GEOM_WKT  text);

Query returned successfully with no result in 94 ms.

COPY Conversion ( gid,geom_wkt)
    FROM 'c:/temp/conversion.csv'
    WITH 
          DELIMITER AS ','
          CSV HEADER ;

Query returned successfully: 500 rows affected, 16 ms execution time.

SELECT gid, ST_PolygonFromText(geom_wkt), geom_wkt
  FROM Conversion
  LIMIT 10;

gid geom_wkt
1 “01030000000100000005000000056D72F8A49762408A….”;“POLYGON ((148.738888 -42.967608, 149.122624 -42.967608, 149.122624 -42.505418, 148.738888 -42.505418, 148.738888 -42.967608))”
2 “01030000000100000005000000056D72F8A49762408A….”;“POLYGON ((148.738888 -42.967608, 149.143449 -42.967608, 149.143449 -42.078775, 148.738888 -42.078775, 148.738888 -42.967608))”
3 “01030000000100000005000000056D72F8A49762408A….”;“POLYGON ((148.738888 -42.967608, 149.406925 -42.967608, 149.406925 -42.584677, 148.738888 -42.584677, 148.738888 -42.967608))”
4 “01030000000100000005000000056D72F8A49762408A….”;“POLYGON ((148.738888 -42.967608, 149.727345 -42.967608, 149.727345 -42.703013, 148.738888 -42.703013, 148.738888 -42.967608))”
5 “01030000000100000005000000056D72F8A49762408A….”;“POLYGON ((148.738888 -42.967608, 149.245611 -42.967608, 149.245611 -41.96837, 148.738888 -41.96837, 148.738888 -42.967608))”
6 “01030000000100000005000000056D72F8A49762408A….”;“POLYGON ((148.738888 -42.967608, 149.472138 -42.967608, 149.472138 -42.211894, 148.738888 -42.211894, 148.738888 -42.967608))”
7 “01030000000100000005000000056D72F8A49762408A….”;“POLYGON ((148.738888 -42.967608, 149.443473 -42.967608, 149.443473 -42.32729, 148.738888 -42.32729, 148.738888 -42.967608))”
8 “01030000000100000005000000056D72F8A49762408A….”;“POLYGON ((148.738888 -42.967608, 149.277115 -42.967608, 149.277115 -42.075974, 148.738888 -42.075974, 148.738888 -42.967608))”
9 “01030000000100000005000000056D72F8A49762408A….”;“POLYGON ((148.738888 -42.967608, 149.21169 -42.967608, 149.21169 -42.264614, 148.738888 -42.264614, 148.738888 -42.967608))”
10 “01030000000100000005000000056D72F8A49762408….”;“POLYGON ((148.738888 -42.967608, 149.300831 -42.967608, 149.300831 -42.01735, 148.738888 -42.01735, 148.738888 -42.967608))”

SELECT addGeometryColumn('postgis','conversion','geom','4326','POLYGON','2'); 

addgeometry
text
------------------------------------------------------
"public.conversion.geom SRID:4326 TYPE:POLYGON DIMS:2"

UPDATE Conversion set geom = ST_SetSRID(ST_PolygonFromText(geom_wkt),4326);

Query returned successfully: 500 rows affected, 78 ms execution time.

SELECT gid, ST_AsText(geom)
  FROM Conversion
  LIMIT 10;

gid ST_AsText text
1 POLYGON ((148.738888 -42.967608,149.122624 -42.967608,149.122624 -42.505418,148.738888 -42.505418,148.738888 -42.967608))”
2 POLYGON ((148.738888 -42.967608,149.143449 -42.967608,149.143449 -42.078775,148.738888 -42.078775,148.738888 -42.967608))”
3 POLYGON ((148.738888 -42.967608,149.406925 -42.967608,149.406925 -42.584677,148.738888 -42.584677,148.738888 -42.967608))”
4 POLYGON ((148.738888 -42.967608,149.727345 -42.967608,149.727345 -42.703013,148.738888 -42.703013,148.738888 -42.967608))”
5 POLYGON ((148.738888 -42.967608,149.245611 -42.967608,149.245611 -41.96837,148.738888 -41.96837,148.738888 -42.967608))”
6 POLYGON ((148.738888 -42.967608,149.472138 -42.967608,149.472138 -42.211894,148.738888 -42.211894,148.738888 -42.967608))”
7 POLYGON ((148.738888 -42.967608,149.443473 -42.967608,149.443473 -42.32729,148.738888 -42.32729,148.738888 -42.967608))”
8 POLYGON ((148.738888 -42.967608,149.277115 -42.967608,149.277115 -42.075974,148.738888 -42.075974,148.738888 -42.967608))”
9 POLYGON ((148.738888 -42.967608,149.21169 -42.967608,149.21169 -42.264614,148.738888 -42.264614,148.738888 -42.967608))”
10 POLYGON ((148.738888 -42.967608,149.300831 -42.967608,149.300831 -42.01735,148.738888 -42.01735,148.738888 -42.967608))”

-- Create a spatial index for faster querying
CREATE INDEX conversion_geom ON conversion USING GIST ( geom );

Query returned successfully with no result in 62 ms.

-- Now, use the newly indexed spatial column in the spatial equivalent of the above query
SELECT count(*)
  FROM conversion
 WHERE geom && SetSRID('BOX3D(148.7 -42.1, 148.8 -42.3)'::box3d,4326) ;

count
bigint
------
183

I hope this is of use to someone.

post this at del.icio.uspost this at Diggpost this at Technoratipost this at Newsvinepost this at Ma.gnoliapost this at Furlpost this at Blinklistpost this at Spurlpost this at Wistspost this at Simpypost this at Redditpost this at Farkpost this at Blogmarkspost this at Yahoo! my webpost this at Mr. Wongpost this at Windows Livepost this at Google Bookmarkspost this to Twitter

Building polygons from overlapping linestrings requiring intersection

· Apr 11, 02:48 pm by Simon Greener

I received an email a few weeks ago asking:

I was wondering if you could post an article explaining how to create a polygon from overlapping lines, if this is possible.

I am new as in 1 week into exploring this product and am finding the documentation for ST_BuildArea and the like a little hard to understand at the moment.

This is an example of the sort of thing I would like to be able to do.

The correspondant provided me with an ascii image of 4 overlapping lines. Here is an image of the actual test data I used as it is the same as what he asked (except I have added a mid-point vertex in each linestring).

PostGIS ST_BuildArea: Starting Four Lines

First off, let’s build a table and populate it with linestrings.

drop table test;

create table test ( 
  gid serial );

SELECT AddGeometryColumn('public', 'test', 'geom', 28355, 'GEOMETRY', 2);

insert into test (geom)
select geom
from (select ST_SetSRID(ST_GeomFromText('LINESTRING( 1 0, 1 5, 1 11 )'),28355) as geom
      union all
      select ST_SetSRID(ST_GeomFromText('LINESTRING( 10 0, 10 5, 10 11 )'),28355) as geom
      union all
      select ST_SetSRID(ST_GeomFromText('LINESTRING( 0 1, 5 1, 11 1 )'),28355) as geom
      union all
      select ST_SetSRID(ST_GeomFromText('LINESTRING( 0 10, 5 10, 11 10 )'),28355) as geom) as g;

select gid, ST_AsText(geom)
  from test;

gid integer st_astext text
1 LINESTRING (1 0,1 5,1 11)”
2 LINESTRING (10 0,10 5,10 11)”
3 LINESTRING (0 1,5 1,11 1)”
4 LINESTRING (0 10,5 10,11 10)”
6 POINT (1 5)”
7 POINT (1 11)”
1 POINT (1 0)”
9 POINT (10 5)”
10 POINT (10 11)”
2 POINT (10 0)”
12 POINT (5 1)”
13 POINT (11 1)”
3 POINT (0 1)”
15 POINT (5 10)”
16 POINT (11 10)”
4 POINT (0 10)”

The steps to build a polygon from the four linestrings is:

Here is the SQL that does all this.

insert into test (geom)
select ST_SetSRID(ST_BuildArea(ST_Collect(ST_GeomFromText(p.geom))),28355) as geom
  from (select DISTINCT ST_AsText(g.geom) as geom, g.iPoint
	  from (select (ST_Dump(e.geom)).geom,
		       f.ig as iPoint
		  from (SELECT /* This query is the set of each line with each and every line that intersects it */
			       ST_SymDifference((select a.geom from test a where a.gid = c.gid),c.geom) as geom
			  FROM (select a.gid as gid, ST_Collect(b.geom) as geom
				  from test a, 
				       test b
				 where a.gid <> b.gid
				   and ST_Intersects(a.geom,b.geom)
				 group by a.gid
				) as c
			) e, 
		       (select /* Collect the set of all intersection points in a single multipoint geometry */
		               ST_Collect(i.point) as ig
			  from ( select ST_SetSRID(ST_Point(ST_X(a.pnt),ST_Y(a.pnt)),28355) as point
				    from (select ST_Intersection(a.geom,b.geom) as pnt
					    from test a, 
					         test b
					   where a.gid <> b.gid
 					     and ST_Intersects(a.geom,b.geom)
				         ) as a
				 group by ST_X(a.pnt), ST_Y(a.pnt)
				 having count(*) > 1
				) i
			) f
		) g
  where /* We only want those linestrings that have an intersection point (see d3) at both ends */
        ST_Intersects(g.ipoint,ST_StartPoint(g.geom)) 
    and ST_Intersects(g.ipoint,ST_EndPoint(g.geom))
     ) as p;

Execution of this SQL gives us a lovely, square polygon with all the mid-point vertices in tact (I added them in to show that any internal vertices, not matter how many describing a complex line, are maintained).

PostGIS ST_BuildArea: Resultant -Orange - Polygon formed from 4 lines

Now, ST_BuildArea will work with more lines that just 4 straight lines forming a square. I will demonstrate by processing a hexagon (I won’t include the processing SQL as it is the same.)

-- Test Hexagon
drop table test;

create table test ( 
  gid serial );
SELECT AddGeometryColumn('public', 'test', 'geom', 28355, 'GEOMETRY', 2);

insert into test (geom)
values 
(ST_SetSRID(ST_GeomFromText('LINESTRING(5 14, 13 10)'),28355)),
(ST_SetSRID(ST_GeomFromText('LINESTRING(13 2, 5 -1)'),28355)),
(ST_SetSRID(ST_GeomFromText('LINESTRING(-1 10, 7 14)'),28355)),
(ST_SetSRID(ST_GeomFromText('LINESTRING(1 12, -2 5)'),28355)),
(ST_SetSRID(ST_GeomFromText('LINESTRING(-2 7, 1 0)'),28355)),
(ST_SetSRID(ST_GeomFromText('LINESTRING(-1 2, 7 -1)'),28355)),
(ST_SetSRID(ST_GeomFromText('LINESTRING(11 12, 14 5)'),28355)),
(ST_SetSRID(ST_GeomFromText('LINESTRING(11 0, 14 7)'),28355));

select gid, ST_AsText(geom)
  from test;

gid integer st_astext text
1 LINESTRING (5 14,13 10)”
2 LINESTRING (13 2,5 -1)”
3 LINESTRING (-1 10,7 14)”
4 LINESTRING (1 12,-2 5)”
5 LINESTRING (-2 7,1 0)”
6 LINESTRING (-1 2,7 -1)”
7 LINESTRING (11 12,14 5)”
8 LINESTRING (11 0,14 7)”

The formed Hexagonal polygon is as follows.

PostGIS ST_BuildArea: Starting With 8 Lines

It is not possible have ST_BuildArea create two polygons where those polygons share a simple line.

Here is the data I used.

-- Test 5 lines 2 polygons
drop table test;

create table test ( 
  gid serial );

SELECT AddGeometryColumn('public', 'test', 'geom', 28355, 'GEOMETRY', 2);

insert into test (geom)
select geom
from (select ST_SetSRID(ST_GeomFromText('LINESTRING( 1 0, 1 11 )'),28355) as geom
      union all
      select ST_SetSRID(ST_GeomFromText('LINESTRING( 10 0, 10 11 )'),28355) as geom
      union all
      select ST_SetSRID(ST_GeomFromText('LINESTRING( 5.5 0, 5.5 11 )'),28355) as geom
      union all
      select ST_SetSRID(ST_GeomFromText('LINESTRING( 0 1, 11 1 )'),28355) as geom
      union all
      select ST_SetSRID(ST_GeomFromText('LINESTRING( 0 10, 11 10 )'),28355) as geom) as g;

select gid, ST_AsText(geom)
  from test;

gid integer st_astext text
1 LINESTRING (1 0,1 11)”
2 LINESTRING (10 0,10 11)”
3 LINESTRING (5.5 0,5.5 11)”
4 LINESTRING (0 1,11 1)”
5 LINESTRING (0 10,11 10)”

And you can see from the shading that there is only one polygon even though 5 linestrings went in to ST_BuildArea.

PostGIS BuildArea: 5 Lines 2 Polygons Do Not Make

However, even though this is an article about ST_BuildArea, ST_Polygonize will do what we want (thanks to Regina Obe for pointing this out in the comments below).

Here is a modified version of the SQL that finishes with the MULTIPOLYGON that ST_Polygonize produces being “exploded” into individual polygons for writing back to the table.

-- ST_BuildArea doesn't produce two polygons so let's try ST_Polygonize
insert into test (geom)
SELECT geom
FROM ST_Dump(
(SELECT ST_SetSRID(ST_Polygonize(ST_GeomFromText(p.geom)),28355) 
  from (select DISTINCT ST_AsText(g.geom) as geom, g.iPoint
	  from (select (ST_Dump(e.geom)).geom,
		       f.ig as iPoint
		  from (SELECT /* This query is the set of each line with each and every line that intersects it */
			       ST_SymDifference((select a.geom from test a where a.gid = c.gid),c.geom) as geom
			  FROM (select a.gid as gid, ST_Collect(b.geom) as geom
				  from test a, 
				       test b
				 where a.gid <> b.gid
				   and ST_Intersects(a.geom,b.geom)
				 group by a.gid
				) as c
			) e, 
		       (select /* Collect the set of all intersection points in a single multipoint geometry */
		               ST_Collect(i.point) as ig
			  from ( select ST_SetSRID(ST_Point(ST_X(a.pnt),ST_Y(a.pnt)),28355) as point
				    from (select ST_Intersection(a.geom,b.geom) as pnt
					    from test a, 
					         test b
					   where a.gid <> b.gid
 					     and ST_Intersects(a.geom,b.geom)
				         ) as a
				 group by ST_X(a.pnt), ST_Y(a.pnt)
				 having count(*) > 1
				) i
			) f
		) g
  where /* We only want those linestrings that have an intersection point (see d3) at both ends */
        ST_Intersects(g.ipoint,ST_StartPoint(g.geom)) 
    and ST_Intersects(g.ipoint,ST_EndPoint(g.geom))
     ) As p)) As final;

The following image shows that two polygons are created.

ST_Polygonize builds two areas as expected

Finally, what happens if we give ST_BuildArea the linework for two disjoint polygons? Here is the linework.

PostGIS ST_BuildArea: Multipolygon linework

Here is the code to create a table with the linework.

-- Can we create multiple polygons?
drop table test;

create table test ( 
  gid serial );

SELECT AddGeometryColumn('public', 'test', 'geom', 28355, 'GEOMETRY', 2);

insert into test (geom)
values
( ST_SetSRID(ST_GeomFromText('LINESTRING(-4 8, -4 14)'),28355)),
( ST_SetSRID(ST_GeomFromText('LINESTRING(-5 13, 2 13)'),28355)),
( ST_SetSRID(ST_GeomFromText('LINESTRING(-5 9, 2 9)'),28355)),
( ST_SetSRID(ST_GeomFromText('LINESTRING(1 14, 1 8)'),28355)),
( ST_SetSRID(ST_GeomFromText('LINESTRING(5 14, 5 8)'),28355)),
( ST_SetSRID(ST_GeomFromText('LINESTRING(4 9, 11 9)'),28355)),
( ST_SetSRID(ST_GeomFromText('LINESTRING(4 13, 11 13)'),28355)),
( ST_SetSRID(ST_GeomFromText('LINESTRING(10 14, 10 8)'),28355));

select gid, ST_AsText(geom)
  from test;

gid integer st_astext text
1 LINESTRING
2 LINESTRING
3 LINESTRING
4 LINESTRING
5 LINESTRING
6 LINESTRING
7 LINESTRING
8 LINESTRING

Here is a modified version of the SQL that, as with ST_Polygonize, finishes with the MULTIPOLYGON that ST_BuildArea produces being “exploded” into individual polygons for writing back to the table.

insert into test (geom)
SELECT (ST_Dump(polys.geom)).geom
  FROM (select ST_SetSRID(ST_BuildArea(ST_Collect(p.geom)),28355) as geom
	  from (select DISTINCT g.geom, g.iPoint
		  from (select (ST_Dump(e.geom)).geom,
			       f.ig as iPoint
			  from (select /* Intersect each line witheach and every line that intersects it */
				       ST_SymDifference((select a.geom from test a where a.gid = c.gid),c.geom) as geom
				  FROM (select a.gid as gid, ST_Collect(b.geom) as geom
					  from test a, 
					       test b
					 where a.gid <> b.gid
					   and ST_Intersects(a.geom,b.geom)
					 group by a.gid
					) as c
				) e, 
			       (select ST_Collect(i.point) as ig
				  from ( select ST_SetSRID(ST_Point(ST_X(a.pnt),ST_Y(a.pnt)),28355) as point
					    from (select ST_Intersection(a.geom,b.geom) as pnt
						   from test a, 
						       test b
						 where a.gid <> b.gid
						   and ST_Intersects(a.geom,b.geom)
					       ) as a
					 group by ST_X(a.pnt), ST_Y(a.pnt)
					 having count(*) > 1
					) i
				) f
			) g
	  where ST_Intersects(g.ipoint,ST_StartPoint(g.geom)) /* We only want those linestrings that have an intersection point (see d3) at both ends */
	    and ST_Intersects(g.ipoint,ST_EndPoint(g.geom))
	     ) as p
     ) as polys;

And here are the polygons:

PostGIS ST_BuildArea: Resultant Polygons

I hope this article was useful to someone.

post this at del.icio.uspost this at Diggpost this at Technoratipost this at Newsvinepost this at Ma.gnoliapost this at Furlpost this at Blinklistpost this at Spurlpost this at Wistspost this at Simpypost this at Redditpost this at Farkpost this at Blogmarkspost this at Yahoo! my webpost this at Mr. Wongpost this at Windows Livepost this at Google Bookmarkspost this to Twitter