Loading Point Data from a CSV File in PostGIS
Friday February 20 2009 at 05:48
I recently wrote an article on the use of External File with Oracle tables to load point data into Oracle.
I thought I would have a look on how to do this in PostGIS.
A bit of research came up with the PostgreSQL COPY SQL command.
There appears to be no equivalent to Oracle’s Organisation External for a table. In PostgreSQL you simply create your table and then insert data into it from an external file. You don’t seem to need to concern yourself with special permissions to do this: just get in and do it.
Which is what we will do right now.
-- First off let's create our target table
DROP TABLE lidar;
Query returned successfully with no result in 15 ms.
CREATE TABLE lidar(
Query returned successfully with no result in 16 ms.
-- Now we can execute the COPY command to load our data (PostgreSQL or perhaps PGAdminIII) gets
-- a little upset if we use Windows backslashes between the folders in the filename.
-- Yes, you can use a double backslash which will work but PostgreSQL will continue to complain.
-- Solution: simply use Linux/Unix forward slash filenames.
COPY lidar ( easting, northing, height )
DELIMITER AS ','
CSV HEADER ;
Query returned successfully: 40401 rows affected, 297 ms execution time.
-- Now let's query the loaded data.
FROM lidar l
WHERE l.easting BETWEEN 478040 AND 478060
AND l.northing BETWEEN 5228050 AND 5228070
AND l.height BETWEEN 200 AND 600;
-- Now let's add a 3D point geometry column to the table...
SELECT AddGeometryColumn('public', 'lidar', 'geom', 28355, 'POINT', 3);
"public.lidar.geom SRID:28355 TYPE:POINT DIMS:3"
-- Now populate the added column
update lidar set geom = ST_SetSRID(ST_MakePoint(easting,northing,height),28355);
Query returned successfully: 40401 rows affected, 359 ms execution time.
-- Create a spatial index for faster querying
CREATE INDEX lidar_geom ON lidar USING GIST ( geom );
Query returned successfully with no result in 1344 ms.
-- Now, use the newly indexed spatial column in the spatial equivalent of the above query
WHERE geom && SetSRID('BOX3D(478040 5228050,478060 5228070)'::box3d,28355)
AND height BETWEEN 200 AND 600;
Now, that was pretty simple. There are certainly advantages to the PostgreSQL way of loading CSV organised data. I am not sure about the lack of security with regards to external file access though. One thing in Oracle’s favour is that externally organised tables can be used without having to load the data which can be useful in many situations.
Anyway, I hope this article is of use to someone.