Creative Commons License
All Blog Articles, Data Models and Free Source Code by Simon Greener, The SpatialDB Advisor is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Loading Spatial Data from an external CSV file in Oracle

Friday February 20 2009 at 12:14

There have been many, many articles written on the web describing how to access external data (data held in operating system files not under the control of Oracle and not stored within its own datafiles and tablespaces) from within Oracle such that one can query the data using SQL.

In short, Oracle allows a user to access external data by wrapping a table definition over the top of the external data files and using its SQL*Loader and Data Pump technologies to convert the data “on the fly” as users query it. As the Oracle documentation notes, this is a read-only table whose metadata is stored in the database but whose data is stored outside the database.

The table “wrapper” is achieved by using the CREATE TABLE statement to define the columns of the table in the normal manner, and then appending its physical_properties to the definition, in particular the ORGANIZATION EXTERNAL clause.

This clause is made up of two dependent sub-clauses, external_table_clause and external_data_properties, which allow you to specify the properties so described. For example, the TYPE clause of external_table_clause indicates the access driver used to read the external table’s data; Oracle Database provides two access drivers, ORACLE_LOADER and ORACLE_DATAPUMP. The ACCESS PARAMETERS clause of external_data_properties, in turn, describes the physical layout of the file (type of delimiter, number of fields, etc.).

The best way to describe the external data loading capability of Oracle is to show you how to use it to load a small sample of LiDAR data (after it has been post-processed).

Firstly, we need to place our data in an operating system directory which Oracle can access (in this example I have placed a file called ground.csv in C:\Temp\Lidar). We then create an Oracle Directory object and grant the user that will create and load the data (here my standard “CODESYS” user) read access on that directory.

-- Create directory that points to actual data file.
CREATE OR REPLACE DIRECTORY EXTERNAL_DATA_DIRECTORY AS 'C:\temp\lidar';

CREATE OR REPLACE DIRECTORY succeeded.

-- If the directory was created by a DBA (e.g. as SYSTEM), the user doing the processing must be granted the ability to read from that directory via:
GRANT READ ON DIRECTORY EXTERNAL_DATA_DIRECTORY TO codesys;

GRANT READ succeeded.
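To confirm that the directory object and the grant are in place, one can query the data dictionary. This is only a minimal sketch: it assumes the querying user can see the directory through the standard ALL_DIRECTORIES and ALL_TAB_PRIVS views (directory grants are listed in the latter with the directory name in the TABLE_NAME column).

```sql
-- Where does the directory object point?
SELECT directory_name, directory_path
  FROM all_directories
 WHERE directory_name = 'EXTERNAL_DATA_DIRECTORY';

-- Which users have been granted access to it?
SELECT grantee, privilege
  FROM all_tab_privs
 WHERE table_name = 'EXTERNAL_DATA_DIRECTORY';
```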

Now we need to define our table object in Oracle. We also need to describe its external properties as part of the one definition.

DROP   TABLE lidar;

DROP   succeeded.

CREATE TABLE lidar(
  easting NUMBER,
 northing NUMBER,
   height NUMBER
)
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER
DEFAULT DIRECTORY DATA_PUMP_DIR
ACCESS PARAMETERS (
         RECORDS DELIMITED BY NEWLINE 
         FIELDS TERMINATED BY ',' 
         MISSING FIELD VALUES ARE NULL 
         (easting, northing, height)
	)
LOCATION (EXTERNAL_DATA_DIRECTORY:'ground.csv'))
REJECT LIMIT UNLIMITED;

CREATE TABLE succeeded.

--Note that if our external file was TAB delimited we would use the following clause:
-- FIELDS TERMINATED BY 0x'09' 
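The ACCESS PARAMETERS clause accepts further record options. The sketch below is a hypothetical variant, not something run for this article: it assumes a file called ground_hdr.csv that carries one header line, invents the table name LIDAR_HDR and the log and bad file names, and assumes the user has write access on DATA_PUMP_DIR so that those files can be created there.

```sql
-- Hypothetical variant: ground_hdr.csv starts with a single header line.
-- SKIP 1 ignores that line; BADFILE and LOGFILE name where rejects and messages go.
CREATE TABLE lidar_hdr(
  easting  NUMBER,
  northing NUMBER,
  height   NUMBER
)
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER
 DEFAULT DIRECTORY DATA_PUMP_DIR
 ACCESS PARAMETERS (
   RECORDS DELIMITED BY NEWLINE
   SKIP 1
   BADFILE DATA_PUMP_DIR:'ground_hdr.bad'
   LOGFILE DATA_PUMP_DIR:'ground_hdr.log'
   FIELDS TERMINATED BY ','
   MISSING FIELD VALUES ARE NULL
   (easting, northing, height)
 )
 LOCATION (EXTERNAL_DATA_DIRECTORY:'ground_hdr.csv'))
REJECT LIMIT UNLIMITED;

-- Check which file and directory the external table points at.
SELECT table_name, directory_name, location
  FROM user_external_locations
 WHERE table_name = 'LIDAR_HDR';
```

Because rows are only parsed when the table is queried, any rejected records show up in the bad file after the first SELECT, not at CREATE TABLE time.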

Now we execute SQL SELECT statements against the table.

SELECT count(*) 
  FROM lidar;

COUNT(*)               
---------------------- 
40401                  

1 rows selected

SELECT count(*)
  FROM lidar l
 WHERE l.easting  BETWEEN 478040  AND 478060
   AND l.northing BETWEEN 5228050 AND 5228070
   AND l.height   BETWEEN 200     AND 600;

COUNT(*)               
---------------------- 
301                    

1 rows selected

I will now demonstrate that you cannot create spatial indexes on this table. Firstly, note that the table has no SDO_GEOMETRY column that I can index. However, I can try creating a function-based index, even though I doubt it will work: Oracle indexes normally contain ROWIDs that link each index entry to the actual row in the table holding the data, and external tables do not have ROWIDs.

It is worth noting the following from the Oracle documentation.

Restrictions on External Tables
External tables are subject to the following restrictions:

  • An external table cannot be a temporary table.
  • You cannot specify constraints on an external table.
  • An external table cannot have object type, varray, or LONG columns.

(I will look at the last limitation at the end of this article.)

Let’s create the function-based index in the common manner, including the metadata.

-- Create Function that returns a point geometry so we can create a function-based index
--
create or replace function observation(p_srid in number,
                                       p_x    in number, 
                                       p_y    in number, 
                                       p_z    in number )
       return mdsys.sdo_geometry deterministic
Is
Begin
  Return MDSYS.SDO_GEOMETRY(3001,p_srid,mdsys.sdo_point_type(p_x,p_y,p_z),NULL,NULL);
End observation;
/
show errors

function observation(p_srid Compiled.
No Errors.
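Before building metadata and an index on top of the function, it can be worth checking that it returns what is expected. A minimal sketch, assuming the OBSERVATION function above compiled and the LIDAR external table is readable; the literal coordinates in the first query are illustrative values only.

```sql
-- Build a 3D point from literal values (coordinates are illustrative).
SELECT observation(28355, 478050, 5228060, 250) AS geom
  FROM dual;

-- Apply the function to a handful of rows from the external table.
SELECT observation(28355, l.easting, l.northing, l.height) AS geom
  FROM lidar l
 WHERE ROWNUM <= 5;
```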

-- Generate sdo_geom_metadata entry
--
DELETE FROM USER_SDO_GEOM_METADATA
      WHERE table_name = 'LIDAR';

1 rows deleted
COMMIT;

COMMIT succeeded.

-- Generate USER_SDO_GEOM_METADATA entry based on actual extent of data in external table.
DECLARE
  v_minx Number;
  v_maxx Number;
  v_miny Number;
  v_maxy Number;
  v_minz Number;
  v_maxz Number;
BEGIN
  SELECT min(easting),max(easting),min(northing),max(northing),min(height),max(height)
    INTO      v_minx,      v_maxx,       v_miny,       v_maxy,     v_minz,     v_maxz
    FROM LIDAR;
  INSERT INTO USER_SDO_GEOM_METADATA(
    TABLE_NAME,
    COLUMN_NAME,
    DIMINFO,
    SRID
    ) VALUES (
    'LIDAR',
    'OBSERVATION(28355,EASTING,NORTHING,HEIGHT)',
    MDSYS.SDO_DIM_ARRAY(
      MDSYS.SDO_DIM_ELEMENT('X',v_minx,v_maxx,0.005),
      MDSYS.SDO_DIM_ELEMENT('Y',v_miny,v_maxy,0.005),
      MDSYS.SDO_DIM_ELEMENT('Z',v_minz,v_maxz,0.005)
      ),
    28355
  );
END;
/
SHOW ERRORS

anonymous block completed
No Errors.

COMMIT;

COMMIT succeeded.

-- Let's check to make sure the entry has been created
SELECT TABLE_NAME,COLUMN_NAME,DIMINFO 
  FROM USER_SDO_GEOM_METADATA 
 WHERE TABLE_NAME = 'LIDAR';

TABLE_NAME COLUMN_NAME                                DIMINFO
---------- ------------------------------------------ --------------------------------------------------------------------------------------------------------------------------------------------------------------
LIDAR      OBSERVATION(28355,EASTING,NORTHING,HEIGHT) MDSYS.SDO_DIM_ARRAY(MDSYS.SDO_DIM_ELEMENT(X,478000,478100,0.005),MDSYS.SDO_DIM_ELEMENT(Y,5228000,5228100,0.005),MDSYS.SDO_DIM_ELEMENT(Z,1.019,1999.959,0.005))

1 rows selected

-- Now try and create an index on this function

DROP INDEX LIDAR_GEOM;

Error starting at line 84 in command:
DROP INDEX LIDAR_GEOM
Error report:
SQL Error: ORA-01418: specified index does not exist
01418. 00000 -  "specified index does not exist"

CREATE INDEX LIDAR_GEOM
       ON LIDAR(OBSERVATION(28355,EASTING,NORTHING,HEIGHT))
       INDEXTYPE is mdsys.spatial_index 
       parameters('sdo_indx_dims=2, layer_gtype=point');

Error starting at line 85 in command:
CREATE INDEX LIDAR_GEOM
       ON LIDAR(OBSERVATION(28355,EASTING,NORTHING,HEIGHT))
       INDEXTYPE is mdsys.spatial_index 
       parameters('sdo_indx_dims=2, layer_gtype=point')
Error at Command Line:86 Column:10
Error report:
SQL Error: ORA-30657: operation not supported on external organized table
30657.0000 -  "operation not supported on external organized table"
*Cause:    User attempted on operation on an external table which is
           not supported.
*Action:   Don't do that!

Don’t you love the recommended Action!

OK, so our external LiDAR points can really only be read via full-table scans, as one cannot create an index on an externally organized table.
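The full-scan behaviour can be observed by asking the optimizer for its plan. This is a sketch, assuming the LIDAR external table defined earlier still exists and that a plan table is available to the user (as it is by default in recent releases). The plan should show a full access of the external table and no index operation; the exact operation name may vary by version.

```sql
-- Ask the optimizer how it would run the earlier range query.
EXPLAIN PLAN FOR
SELECT count(*)
  FROM lidar l
 WHERE l.easting  BETWEEN 478040  AND 478060
   AND l.northing BETWEEN 5228050 AND 5228070
   AND l.height   BETWEEN 200     AND 600;

-- Display the plan that was just generated.
SELECT plan_table_output
  FROM TABLE(DBMS_XPLAN.DISPLAY);
```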

If we want a spatial index, we have to bring the data inside the database, which we will now do.

-- Rename our existing LiDAR table to LiDAR_EXT
DROP TABLE lidar_ext;

Error starting at line 90 in command:
drop table lidar_ext
Error report:
SQL Error: ORA-00942: table or view does not exist
00942. 00000 -  "table or view does not exist"

RENAME lidar TO lidar_ext;

rename lidar succeeded.

CREATE TABLE lidar
AS 
SELECT sdo_geometry(3001,28355,sdo_point_type(easting,northing,height),null,null) as geom
  FROM lidar_ext;

create table succeeded.
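While the external table still exists, a quick sanity check on the copy is possible. The following is a sketch under stated assumptions: it compares row counts between LIDAR_EXT and the new LIDAR table, then counts geometries that fail validation using the same 0.005 tolerance as the metadata entry above.

```sql
-- Row counts should match if every record in the file was loaded.
SELECT (SELECT count(*) FROM lidar_ext) AS external_rows,
       (SELECT count(*) FROM lidar)     AS loaded_rows
  FROM dual;

-- Count geometries that fail validation at a tolerance of 0.005.
SELECT count(*) AS invalid_geoms
  FROM lidar l
 WHERE sdo_geom.validate_geometry_with_context(l.geom, 0.005) <> 'TRUE';
```

A non-zero invalid count would most likely point at records with missing values, since MISSING FIELD VALUES ARE NULL lets incomplete lines through as NULL ordinates.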

-- Change column name in existing USER_SDO_GEOM_METADATA
UPDATE user_sdo_geom_metadata
   SET column_name = 'GEOM'
 WHERE table_name = 'LIDAR';

1 rows updated

COMMIT;

commit succeeded.

-- Now, rebuild index.
  DROP INDEX LIDAR_GEOM;

Error starting at line 104 in command:
DROP INDEX LIDAR_GEOM
Error report:
SQL Error: ORA-01418: specified index does not exist
01418. 00000 -  "specified index does not exist"

CREATE INDEX LIDAR_GEOM
       ON LIDAR(GEOM)
       INDEXTYPE is mdsys.spatial_index 
       parameters('sdo_indx_dims=2, layer_gtype=point');

CREATE INDEX succeeded.

DROP TABLE LIDAR_EXT;

DROP TABLE LIDAR_EXT succeeded.

So, now we can query the data.

SELECT count(*)
  FROM lidar l
 WHERE SDO_ANYINTERACT(l.geom,sdo_geometry(2003,28355,null,sdo_elem_info_array(1,1003,3),sdo_ordinate_array(478040,5228050,478060,5228070))) = 'TRUE'
   AND l.geom.sdo_point.z BETWEEN 200 AND 600;

COUNT(*)
---------------------- 
301

1 rows selected
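With the spatial index in place, other index-assisted operators become available as well. As an illustration only (this query was not part of the original test run, so no result is shown), the sketch below counts points within 10 metres of the centre of the search rectangle used above. It assumes SRID 28355 is a projected system in metres, so the distance is interpreted in metres.

```sql
-- Points within 10 m (2D) of the centre of the earlier search window,
-- still restricted by height through the point's Z value.
SELECT count(*)
  FROM lidar l
 WHERE SDO_WITHIN_DISTANCE(
         l.geom,
         sdo_geometry(2001,28355,sdo_point_type(478050,5228060,NULL),NULL,NULL),
         'distance=10') = 'TRUE'
   AND l.geom.sdo_point.z BETWEEN 200 AND 600;
```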

Data Other than Points

Our example is based on point data that is loaded into three numeric attributes rather than to an SDO_GEOMETRY object directly. As noted above:

  • An external table cannot have object type, varray, or LONG columns.

Instead of modifying my existing example to try and define the SDO_GEOMETRY object directly on the X, Y and Z fields in the CSV, I thought I would attempt to load the linear data associated with my article on executing shp2sdo from within the database.

The shp2sdo utility creates a SQL*Loader control file with the actual data stored either in-line inside the control file or externally in a data file. This is controlled by the “-d” parameter. I re-ran the tool so that I got the shapefile’s data in a separate data file. I then opened the CTL file, grabbed the load parameters and constructed an appropriate CREATE TABLE statement from them. Here is that statement:

CREATE OR REPLACE DIRECTORY TEMP_DATA_DIRECTORY AS 'C:\temp';

-- GRANT READ ON DIRECTORY TEMP_DATA_DIRECTORY TO codesys;

CREATE TABLE GEODATA250K_TAS_ROADS 
   (	GID NUMBER(38,0), 
	CLASS VARCHAR2(20 BYTE), 
	FEATWIDTH VARCHAR2(20 BYTE), 
	NRN VARCHAR2(20 BYTE), 
	PID VARCHAR2(20 BYTE), 
	FEATREL VARCHAR2(20 BYTE), 
	FORMATION VARCHAR2(20 BYTE), 
	ATTRREL VARCHAR2(20 BYTE), 
	RETIRED VARCHAR2(20 BYTE), 
	SRN VARCHAR2(20 BYTE), 
	SOURCE VARCHAR2(20 BYTE), 
	SHAPE_LENG VARCHAR2(20 BYTE), 
	NAME VARCHAR2(20 BYTE), 
	TEXTNOTE VARCHAR2(20 BYTE), 
	FEATTYPE VARCHAR2(20 BYTE), 
	PLANACC VARCHAR2(20 BYTE), 
	SYMBOL VARCHAR2(20 BYTE), 
	CREATED VARCHAR2(20 BYTE), 
	PEANO NUMBER, 
	GEOM MDSYS.SDO_GEOMETRY
)
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER
DEFAULT DIRECTORY TEMP_DATA_DIRECTORY
ACCESS PARAMETERS (
 CONTINUEIF NEXT(1:1) = '#'
 FIELDS TERMINATED BY '|'
 TRAILING NULLCOLS (
   GID INTEGER EXTERNAL,
   CLASS 	NULLIF CLASS = BLANKS,
   FEATWIDTH 	NULLIF FEATWIDTH = BLANKS,
   NRN 	NULLIF NRN = BLANKS,
   PID 	NULLIF PID = BLANKS,
   FEATREL 	NULLIF FEATREL = BLANKS,
   FORMATION 	NULLIF FORMATION = BLANKS,
   ATTRREL 	NULLIF ATTRREL = BLANKS,
   RETIRED 	NULLIF RETIRED = BLANKS,
   SRN 	NULLIF SRN = BLANKS,
   SOURCE 	NULLIF SOURCE = BLANKS,
   SHAPE_LENG 	NULLIF SHAPE_LENG = BLANKS,
   NAME 	NULLIF NAME = BLANKS,
   TEXTNOTE 	NULLIF TEXTNOTE = BLANKS,
   FEATTYPE 	NULLIF FEATTYPE = BLANKS,
   PLANACC 	NULLIF PLANACC = BLANKS,
   SYMBOL 	NULLIF SYMBOL = BLANKS,
   CREATED 	NULLIF CREATED = BLANKS,
   PEANO,
   GEOM COLUMN OBJECT 
   (
     SDO_GTYPE       INTEGER EXTERNAL, 
     SDO_SRID        INTEGER EXTERNAL, 
     SDO_ELEM_INFO   VARRAY TERMINATED BY '|/' 
       (X            FLOAT EXTERNAL), 
     SDO_ORDINATES   VARRAY TERMINATED BY '|/' 
       (X            FLOAT EXTERNAL) 
   )
)
)
LOCATION (TEMP_DATA_DIRECTORY:'GeoData250K_Tas_roads.dat'))
REJECT LIMIT UNLIMITED;

On execution this is what happened.

Error starting at line 4 in command:
CREATE TABLE GEODATA250K_TAS_ROADS 
   (	GID NUMBER(38,0), 
	CLASS VARCHAR2(20 BYTE), 
	FEATWIDTH VARCHAR2(20 BYTE), 
	NRN VARCHAR2(20 BYTE), 
	PID VARCHAR2(20 BYTE), 
	FEATREL VARCHAR2(20 BYTE), 
	FORMATION VARCHAR2(20 BYTE), 
	ATTRREL VARCHAR2(20 BYTE), 
	RETIRED VARCHAR2(20 BYTE), 
	SRN VARCHAR2(20 BYTE), 
	SOURCE VARCHAR2(20 BYTE), 
	SHAPE_LENG VARCHAR2(20 BYTE), 
	NAME VARCHAR2(20 BYTE), 
	TEXTNOTE VARCHAR2(20 BYTE), 
	FEATTYPE VARCHAR2(20 BYTE), 
	PLANACC VARCHAR2(20 BYTE), 
	SYMBOL VARCHAR2(20 BYTE), 
	CREATED VARCHAR2(20 BYTE), 
	PEANO NUMBER, 
	GEOM MDSYS.SDO_GEOMETRY
)
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER
DEFAULT DIRECTORY TEMP_DATA_DIRECTORY
ACCESS PARAMETERS (
 CONTINUEIF NEXT(1:1) = '#'
 FIELDS TERMINATED BY '|'
 TRAILING NULLCOLS (
   GID INTEGER EXTERNAL,
   CLASS 	NULLIF CLASS = BLANKS,
   FEATWIDTH 	NULLIF FEATWIDTH = BLANKS,
   NRN 	NULLIF NRN = BLANKS,
   PID 	NULLIF PID = BLANKS,
   FEATREL 	NULLIF FEATREL = BLANKS,
   FORMATION 	NULLIF FORMATION = BLANKS,
   ATTRREL 	NULLIF ATTRREL = BLANKS,
   RETIRED 	NULLIF RETIRED = BLANKS,
   SRN 	NULLIF SRN = BLANKS,
   SOURCE 	NULLIF SOURCE = BLANKS,
   SHAPE_LENG 	NULLIF SHAPE_LENG = BLANKS,
   NAME 	NULLIF NAME = BLANKS,
   TEXTNOTE 	NULLIF TEXTNOTE = BLANKS,
   FEATTYPE 	NULLIF FEATTYPE = BLANKS,
   PLANACC 	NULLIF PLANACC = BLANKS,
   SYMBOL 	NULLIF SYMBOL = BLANKS,
   CREATED 	NULLIF CREATED = BLANKS,
   PEANO,
   GEOM COLUMN OBJECT 
   (
     SDO_GTYPE       INTEGER EXTERNAL, 
     SDO_SRID        INTEGER EXTERNAL, 
     SDO_ELEM_INFO   VARRAY TERMINATED BY '|/' 
       (X            FLOAT EXTERNAL), 
     SDO_ORDINATES   VARRAY TERMINATED BY '|/' 
       (X            FLOAT EXTERNAL) 
   )
)
)
LOCATION (TEMP_DATA_DIRECTORY:'GeoData250K_Tas_roads.dat'))
REJECT LIMIT UNLIMITED
Error at Command Line:24 Column:1
Error report:
SQL Error: ORA-30656: column type not supported on external organized table
30656.0000 -  "column type not supported on external organized table"
*Cause:    Attempt to create an external organized table with a column
           of type LONG, LOB, BFILE, ADT, or VARRAY.
*Action:   These column types are not supported, change the DDL.

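Note that, as the error’s recommended action says, “These column types are not supported”. The error is reported at line 24 of the script, which appears to correspond to the GEOM MDSYS.SDO_GEOMETRY column: SDO_GEOMETRY is an object type (ADT) that itself contains VARRAY attributes. For non-point geometry data you must use other methods for loading.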
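One possible alternative, offered as an untested sketch and not as the method used in this article, is to keep the external table free of object columns by reading each geometry as Well-Known Text in a plain VARCHAR2 column and converting it on the way into a normal table. Everything named here is hypothetical: the file roads_wkt.txt, its two-field layout (gid|wkt) and the tables ROADS_WKT_EXT and ROADS. It also assumes write access on TEMP_DATA_DIRECTORY for the log file, and that each WKT string fits within 4000 characters, which rules out long linestrings.

```sql
-- Hypothetical input, one record per line:
--   1|LINESTRING (478000 5228000, 478100 5228100)
CREATE TABLE roads_wkt_ext (
  gid NUMBER,
  wkt VARCHAR2(4000)
)
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER
 DEFAULT DIRECTORY TEMP_DATA_DIRECTORY
 ACCESS PARAMETERS (
   RECORDS DELIMITED BY NEWLINE
   FIELDS TERMINATED BY '|'
   MISSING FIELD VALUES ARE NULL
   (gid, wkt CHAR(4000))
 )
 LOCATION ('roads_wkt.txt'))
REJECT LIMIT UNLIMITED;

-- Convert the text to SDO_GEOMETRY while copying into an ordinary table.
CREATE TABLE roads
AS
SELECT e.gid,
       SDO_GEOMETRY(e.wkt, 28355) AS geom
  FROM roads_wkt_ext e;
```

The resulting ROADS table is an ordinary heap table, so it can be registered in USER_SDO_GEOM_METADATA and spatially indexed in the same way as the LIDAR table above.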

I hope this is useful to someone.


Comment [2]

Hello, I am looking for list of companies worldwide who are feeding lidar data to Oracle Spatial.

It would be great if you could help me out. I am new to Oracle Spatial and need to research how this product is being used by companies to provide better services, and how successful they have been in doing so.

Thanks,

Victor Nair

Victor Nair · 19 September 2009, 23:48

Victor,

I would suggest you ask your question on the Oracle Spatial Forum.

I have heard of people loading LiDAR into Oracle, but I cannot see what business value it would have for a lot of organisations. In most of the work I have been involved with, the processed results of LiDAR data (e.g. a Kriged surface represented as a 2D grid) are of more use inside an Oracle database, as they can then be combined with vector data, such as a polygon defining a business entity, to calculate new variables or measures such as average elevation.

Sorry I cannot help you in any other way.

regards
Simon

Simon · 20 September 2009, 18:50