Spatial Sorting of Data via Morton Key
Thursday September 17 2009 at 07:21
I have often advocated the use of a spatial sort when loading static spatial data into Oracle (See 1. Spatially sort read-only data. in my article on performance tips). The idea here is to try and place spatial data that is close together in space (called spatial autocorrelation), close together on disk. Then, when the hard-disk’s head goes down on the disk to retrieve one or more records, the head will not have to travel far to retrieve the geographically close data. Imagine you zooming in to a particular area in your GIS. The search will be a minimum bounding rectangle based query that is specifically requesting data that is close together geographically. If disk activity can be minimised by spatial autocorrelation then performance will improve.
But really, I had never gotten around to producing a robust solution for use solely within the Oracle database.
Until now.
Object Relational Databases like Oracle are ideally suited to this task through the SELECT statement’s ORDER BY clause such that we can create a new table and populate it as follows:
CREATE TABLE spatially_sorted
AS
SELECT id, geom
FROM spatially_unsorted
ORDER BY <some sort of spatial sort on the geom column>;
(Of course we are assuming that when Oracle creates the table it will use contiguous blocks on disk. This will only be true is the tablespace is new or has been degragmented.)
Now, the problem is that a sort order is not defined on an Oracle Sdo_Geometry object such that one could sort as follows:
CREATE TABLE spatially_sorted
AS
SELECT id, geom
FROM spatially_unsorted
ORDER BY geom -- geom is an SDO_Geometry object
Oracle’s lack of a sort of any kind based on the SDO_Geometry object is why I have not provided a practical example before now. (Though it is worth noting that PostGIS can sort by geometry with Regina Obe observing that the sort is a btree sort: not rtree as in PostGIS’s CLUSTER ON geometry).
The standard way of sorting based on the Oracle RTree is to simply do an SDO_Filter search using the MBR of the whole of the data as follows:
CREATE TABLE spatially_sorted
AS
SELECT id, geom
FROM spatially_unsorted su,
(SELECT sdo_aggr_mbr(su2.geom) as mbr
FROM spatially_unsorted su2) su3
WHERE SDO_Filter(su.geom,su3.mbr) = 'TRUE'
One could also sort simply by X and Y via use of any point in a geometry. Thus, for Oracle point data, this would also work:
CREATE TABLE spatially_sorted
AS
SELECT id, geom
FROM spatially_unsorted su,
ORDER BY su.geom.sdo_point.x, su.geom.sdo_point.y;
Suitability of Sort
Of course both the PostGIS “ORDER BY geometry” sort and the Oracle RTree (SDO_Filter…) and XY based sorts will sort the data, but they will do so in strips based on Y and X.
What other theory can be used to define a spatial sort function?
Space curves
There is a body of theory that describes a space-filling curve such as in Z-Order) or Peano (my favourite). These curves, I would assert, when implemented, provide a better method for taking advantage of the spatially autocorrelation that is implicit to spatial data.
PostGIS 1.4 has implemented the ST_GeoHash function which I assume is a form of a space curve – see GeoHash Website – but this only for geodetic – longitude/latitude data.
Oracle is supposed to have implemented the Z-Order access method. From what I recall this would have been part of the original MultiDimension implementation in the 1990s. Now, the only place I have can find an implementation is MD.HHENCODE. I have tried to use this over the years but have been unsatisfied that it was working as I wanted. (The MD.HHENCODE function is undocumented.)
So, to implement a spatial sort that best represents spatial autocorrelation we must implement space curve key generation function on SDO_Geometry.
Implementation
I have Peano and Morton space-filling curves implemented in C but an implementation in Oracle (as EXPROC) based on these, while technically not difficult, does required changes to the LISTENER which is not popular with many DBAs.
I have also converted both these algorithms into Java and have successfully deployed them into the Oracle Java Virtual Machine with PL/SQL wrappers. While not difficult it does require DBA involvement to implement.
So, today, after having received an email from a colleague (whom I admire a lot) asking the following question, I decided to try and create a solution based entirely on PL/SQL:
We have a production instance in Oracle for our spatial data and another instance for web viewing to which we regularly copy our production data.
Normally we run a script which first removes indices, empties the tables, fills them with an INSERT … SELECT … and recreates the indices.
Was wondering if it is possible (and a good idea) to do the INSERT in such a manner that the data is spatially clustered when copied, maybe by using some sort of ORDER BY in the INSERT … SELECT.
Morton Implementation
The original C code for generating a Morton space key is as follows:
unsigned xy_to_morton (col,row)
/* this procedure calculates the Morton number of a cell
at the given row and col[umn]
Written: D.M. Mark, Jan 1984;
Converted to Vax/VMS: July 1985
*/
unsigned col,row;
{
unsigned key;
int level, left_bit, right_bit, quadrant;
key = 0;
level = 0;
while ((row>0) || (col>0))
{
/* split off the row (left_bit) and column (right_bit) bits and
then combine them to form a bit-pair representing the
quadrant */
left_bit = row % 2;
right_bit = col % 2;
quadrant = right_bit + 2*left_bit;
key += quadrant<<(2*level);
/* row, column, and level are then modified before the loop
continues */
row /= 2;
col /= 2;
level++;
}
return (key);
}
Now there are a few difficulties to over-come to get a successful implementation. The first is the use of unsigned integers. To be frank, I am going to ignore this and worry only it if I have to use all 32 bits in generating a key value. The second is the need for a bitwise left << operator. This can be done via the following PL/SQL:
Function Left_Shift( p_val Integer, p_shift Integer)
Return Integer
As
Begin
Return trunc(p_val * power(2,p_shift));
End;
OK, now let’s do the translation to PL/SQL.
create or replace Function Morton (p_col Natural, p_row Natural)
Return INTEGER
As
v_row Natural := ABS(p_row);
v_col Natural := ABS(p_col);
v_key Natural := 0;
v_level BINARY_INTEGER := 0;
v_left_bit BINARY_INTEGER;
v_right_bit BINARY_INTEGER;
v_quadrant BINARY_INTEGER;
Function Left_Shift( p_val Natural, p_shift Natural)
Return PLS_Integer
As
Begin
Return trunc(p_val * power(2,p_shift));
End;
Begin
while ((v_row>0) Or (v_col>0)) Loop
/* split off the row (left_bit) and column (right_bit) bits and
then combine them to form a bit-pair representing the
quadrant */
v_left_bit := MOD(v_row,2);
v_right_bit := MOD(v_col,2);
v_quadrant := v_right_bit + ( 2 * v_left_bit );
v_key := v_key + Left_Shift( v_quadrant,( 2 * v_level ) );
/* row, column, and level are then modified before the loop
continues */
v_row := trunc( v_row / 2 );
v_col := trunc( v_col / 2 );
v_level := v_level + 1;
End Loop;
return (v_key);
End Morton;
A quick test:
SQL> select codesys.morton(100,50) from dual;
CODESYS.MORTON(100,50)
-----------------------
7704
Morton Restrictions
There is a restriction to how a Morton space curve can be applied: the grid must be regular. This means that while the X and Y dimensions of the grid may be different (ie giving a rectangle not a square) the number of grids either side must be the same (ie a 100×100 grid not a 150×100 grid). Another requirement that affects the generation of a correct space curve is that the (row,column) index values must always be the same. So, in a 100×100 grid the lower left grid should be referenced as (0,0) or (1234,1234) and not (0,1) or (1234, 3456).
Apply to Real Data
First off let’s generate something that is ideal to our task: random point data. Note that the area, 5000m x 5000m, can be divided in to 100 × 100 × 500m grid cells.
DROP TABLE MORTONED_P;
DROP TABLE MORTON_P;
CREATE TABLE MORTON_P (
id integer,
morton_key integer,
geom mdsys.sdo_geometry
);
-- Now create the points and write then to the table
--
SET FEEDBACK OFF
INSERT INTO MORTON_P
SELECT rownum,
0,
mdsys.sdo_geometry(2001,NULL,
MDSYS.SDO_POINT_TYPE(
ROUND(dbms_random.value(350000 - ( 10000 / 2 ),
350000 + ( 10000 / 2 )),2),
ROUND(dbms_random.value(5400000 - ( 10000 / 2 ),
5400000 + ( 10000 / 2 )),2),
NULL),
NULL,NULL)
FROM DUAL
CONNECT BY LEVEL <= 1000;
COMMIT;
SET FEEDBACK ON
SET HEADING OFF
SELECT 'Inserted '||count(*)||' records into MORTON_P' FROM MORTON_P;
SET HEADING ON
Inserted 1000 records into MORTON_P
-- Create User_Sdo_Geom_Metadata
--
DELETE FROM USER_SDO_GEOM_METADATA WHERE TABLE_NAME = 'MORTON_P';
DECLARE
v_shape mdsys.sdo_geometry;
BEGIN
SELECT SDO_AGGR_MBR(geom) INTO v_shape FROM MORTON_P;
INSERT INTO USER_SDO_GEOM_METADATA(TABLE_NAME,COLUMN_NAME,DIMINFO,SRID)
VALUES ('MORTON_P','GEOM',MDSYS.SDO_DIM_ARRAY(
MDSYS.SDO_DIM_ELEMENT('X',V_SHAPE.SDO_ORDINATES(1),V_SHAPE.SDO_ORDINATES(3),0.05),
MDSYS.SDO_DIM_ELEMENT('Y',V_SHAPE.SDO_ORDINATES(2),V_SHAPE.SDO_ORDINATES(4),0.05)),NULL);
END;
/
SHOW ERRORS
COMMIT;
We are not going to create a spatial index as it is not needed.
What does this data look like? If we select against the table with no sort we can assume we are getting data as it is physically on disk. Note also, that we want to ensure that our column/row addresses are normalised to the start of the grid.
SQL> set pagesize 1000 linesize 131
SQL> Prompt Inspect rows to show random ordering...
Inspect rows to show random ordering...
SQL> list
1 SELECT rowid,
2 id,
3 a.geom.sdo_point.y as y,
4 a.geom.sdo_point.x as x,
5 FLOOR(a.geom.sdo_point.y/500) - MIN(FLOOR(a.geom.sdo_point.y/500)) OVER (ORDER BY FLOOR(a.geom.sdo_point.y/500)) as morton_gx,
6 FLOOR(a.geom.sdo_point.x/500) - MIN(FLOOR(a.geom.sdo_point.x/500)) OVER (ORDER BY FLOOR(a.geom.sdo_point.x/500)) as morton_gy
7 FROM MORTON_P a
8* WHERE rownum < 20
SQL> /
ROWID ID Y X MORTON_GX MORTON_GY
------------------ ---------- ---------- ---------- ---------- ----------
AAAvuoAAEAADF30AAP 253 5398446.61 345393.81 5 0
AAAvuoAAEAADF30AAD 241 5398851.27 345741.16 6 1
AAAvuoAAEAADF30AAR 255 5402238.68 348014.59 13 6
AAAvuoAAEAADF30AAH 245 5401531.45 348593.19 12 7
AAAvuoAAEAADF30AAL 249 5401286.57 348614.63 11 7
AAAvuoAAEAADF30AAS 256 5397924.49 349504.67 4 9
AAAvuoAAEAADF30AAK 248 5401027.36 349820.31 11 9
AAAvuoAAEAADF30AAG 244 5403216.65 350262.56 15 10
AAAvuoAAEAADF30AAO 252 5398864.63 350892.57 6 11
AAAvuoAAEAADF30AAQ 254 5398583.57 350732.86 6 11
AAAvuoAAEAADF30AAJ 247 5404038.95 351249.54 17 12
AAAvuoAAEAADF30AAN 251 5404272.95 351816.35 17 13
AAAvuoAAEAADF30AAC 240 5401672.14 352233.77 12 14
AAAvuoAAEAADF30AAF 243 5404463.24 352343.83 17 14
AAAvuoAAEAADF30AAA 238 5402457.6 352071.85 13 14
AAAvuoAAEAADF30AAM 250 5403813.73 353353.01 16 16
AAAvuoAAEAADF30AAI 246 5403652.16 353834.18 16 17
AAAvuoAAEAADF30AAB 239 5395604.75 353661.09 0 17
AAAvuoAAEAADF30AAE 242 5401454.78 354527.93 11 19
19 rows selected.
Note how the data is pretty unsorted.
The following image was generated by linking to the MORTON_P table using Manifold GIS and rendering the data by the ID column such that sequential ID column values have roughly the same colour. Note that the colours are displaying randomly which befits the way the data was created and written.

Now, let’s compute the morton key and add it to the base table.
SQL> Prompt Now Apply Morton Key ....
Now Apply Morton Key ....
SQL> UPDATE MORTON_P A
2 SET a.morton_key = codesys.Morton(
3 (SELECT FLOOR(b.geom.sdo_point.y/500) - MIN(FLOOR(b.geom.sdo_point.y/500)) OVER (ORDER BY FLOOR(b.geom.sdo_point.y/500)) FROM Morton_P b WHERE b.id = a.ID ),
4 (SELECT FLOOR(c.geom.sdo_point.x/500) - MIN(FLOOR (c.geom.sdo_point.x/500)) OVER (ORDER BY FLOOR(c.geom.sdo_point.x/500)) FROM Morton_P c WHERE c.id = a.ID ));
1000 rows updated.
SQL> COMMIT;
Commit complete.
Now let’s create our sorted table using the morton_key value on the base table.
SQL> drop table mortoned_p;
Table dropped.
SQL> Create table mortoned_p
2 as
3 select rownum as id, morton_key, geom
4 from (select morton_key, geom
5 from morton_p mp
6 order by mp.morton_key desc
7 );
Table created.
Now select against the table with no sort (assume we are getting data as physically on disk).
SQL> Prompt Inspect rows to show ordering with morton attribute assigned...
Inspect rows to show ordering with morton attribute assigned...
SQL> SELECT rowid,
2 id,
3 a.geom.sdo_point.x as x,
4 a.geom.sdo_point.y as x,
5 a.morton_key
6 FROM MORTONED_P a
7 WHERE rownum < 20;
ROWID ID X X MORTON_KEY
------------------ ---------- ---------- ---------- ----------
AAAvxSAAEAADJIUAAA 1 354863.68 5404756.97 143415955
AAAvxSAAEAADJIUAAB 2 354149.72 5404920.69 143415954
AAAvxSAAEAADJIUAAC 3 354258.89 5404744.74 143415954
AAAvxSAAEAADJIUAAD 4 354683.14 5404263.14 143415953
AAAvxSAAEAADJIUAAE 5 354737.04 5404003.85 143415953
AAAvxSAAEAADJIUAAF 6 354075.34 5404131.93 143415952
AAAvxSAAEAADJIUAAG 7 353977.42 5404713.1 143415943
AAAvxSAAEAADJIUAAH 8 353564.38 5404992.3 143415943
AAAvxSAAEAADJIUAAI 9 353807.76 5404522.37 143415943
AAAvxSAAEAADJIUAAJ 10 353691.9 5404567.27 143415943
AAAvxSAAEAADJIUAAK 11 353124.72 5404703.87 143415942
AAAvxSAAEAADJIUAAL 12 353317.99 5404915.13 143415942
AAAvxSAAEAADJIUAAM 13 353256.95 5404421.45 143415940
AAAvxSAAEAADJIUAAN 14 353358.12 5404093.68 143415940
AAAvxSAAEAADJIUAAO 15 353276.15 5404120.26 143415940
AAAvxSAAEAADJIUAAP 16 352671.65 5404985.66 143415939
AAAvxSAAEAADJIUAAQ 17 352068.16 5404722.24 143415938
AAAvxSAAEAADJIUAAR 18 352317.38 5404995.05 143415938
AAAvxSAAEAADJIUAAS 19 352019.32 5404902.57 143415938
19 rows selected.
Yes, the data is nicely sorted as we want.
And visually?
Visualising the Morton Key
The best way to understand what a space curve is doing is to visualise it via the actual grid and also the actual space curve that passes through the key points in the grid.
First, let’s create the grid and visualise it.
SQL> drop table morton_grid;
Table dropped.
SQL> create table morton_grid
2 as
3 select rownum as id,
4 c.geometry as geom,
5 codesys.Morton(FLOOR(c.centroid.sdo_point.y/500) - MIN(FLOOR(c.centroid.sdo_point.y/500)) OVER (ORDER BY FLOOR(c.centroid.sdo_point.y/500)),
6 FLOOR(c.centroid.sdo_point.x/500) - MIN(FLOOR(c.centroid.sdo_point.x/500)) OVER (ORDER BY FLOOR(c.centroid.sdo_point.x/500))) as morton_key
7 from (select m.geometry,
8 codesys.centroid.sdo_centroid(codesys.centroid.convertGeometry(m.geometry,0.5),0.5,0) as centroid
9 from table(codesys.tesselate.regularGrid((select sdo_aggr_mbr(p.geom) as mbr
10 from morton_p p),
11 mdsys.sdo_point_type(500,500,null))) m
12 ) c;
Table created.
Now, let’s look at it. The following image was generated by linking to the MORTONED_GRID table using Manifold GIS and rendering and labelling the data by the MORTON_KEY value.

Notice how the data is colouring shows the relatedness of spatial data next to each grid.
But to get a real sense as to how the space curve sort is going to work, let’s create a linestring that passes through the centroids of each grid order by the Morton space curve value.
SQL> Prompt Create linestring representing the morton space curve
Create linestring representing the morton space curve
SQL> drop table Morton_L;
Table dropped.
SQL> create table Morton_L
2 as
3 SELECT rownum as id,
4 mdsys.sdo_geometry(2002,NULL,NULL,mdsys.sdo_elem_info_array(1,2,1),
5 CAST(MULTISET(SELECT ord
6 FROM (select m.morton_Key,
7 v.x as ord
8 from morton_grid m,
9 table(mdsys.sdo_util.getvertices(codesys.centroid.sdo_centroid(codesys.centroid.convertGeometry(m.geom,0.5),0.5,0))) v
10 union all
11 select m.morton_Key,
12 v.y as ord
13 from morton_grid m,
14 table(mdsys.sdo_util.getvertices(codesys.centroid.sdo_centroid(codesys.centroid.convertGeometry(m.geom,0.5),0.5,0))) v
15 )
16 ORDER BY morton_key, ord
17 ) as mdsys.sdo_ordinate_array)) as geom
18 from dual;
Table created.
Here is the space curve superimposed on the labelled Morton grid:

We can also look at the actual point data rendered by the MORTON_KEY and ID columns. Note that the colours are displaying the ordering of the Morton space-curve.


Finally, let’s superimpose the actual point (geometry) data on the 500m x 500m grid with everything rendered by the Morton Key.

I think we can safely assume we have achieved our goal.
I hope this is useful to someone.
       
|
Comment [13]
Simon,
Not sure if I misunderstood you. In PostGIS you can do an ORDER BY ageometrycolumn;
What it ends up doing is doing some sort of sort against the bounding box of the geometry.
I haven’t played around to see how that sorting compares to what you have above. It is a common practice to improve performance in PostGIS, by doing a
CLUSTER ON the gist_spatial_index
and that physically sorts the data in the order of the index.
— Regina · 18 September 2009, 19:05 · #
Regina,
Yes, but you can’t ORDER BY sdo_geometry in Oracle!
It is good that you can do it in PostGIS. Most useful in achieving a spatial sort that can help performance of applications like MapServer.
However, the Morton key is a space curve and is more sophisticated object than just coordinates from an MBR. Sorting by X and Y gives you “narrow” stripes of data (still very useful) but it doesn’t enable one to take advantage of the “sideways” relationships (or multidimensional “clusters”) between spatial data in any location. Space-curves give you that better way to represent those spatial correlations and so group the data.
How much better the performance of applications accessing data after a PostGIS or Morton sort is unknown. Perhaps when I get time.
Similarly, one could cluster Oracle data on a Morton key but, again, discussion on that is for another day.
regards
Simon
— Simon · 20 September 2009, 08:59 · #
Simon,
Ah okay. Yes would be interesting to compare the 3.
I think the ORDER BY in PostGIS uses a btree sort (not rtree), where as the CLUSTER ON uses rtree. So I think the CLUSTER ON produces better performance. How that all compares with Morton would be interesting to test.
— Regina · 21 September 2009, 13:18 · #
Regina,
I have revised the article to add in the visualisation of the Mortoned grid and the linestring described by the Morton Key points.
I hope this helps you see how different a space curve can be to a sort based on the coordinates or MBR of a geometry.
regards
Simon
— Simon · 22 September 2009, 23:25 · #
Hi,i’m searching for a java translation of morton code..I’d like to test it for a case study.Could you send it to me?Thanks
— John · 28 October 2010, 14:16 · #
Great article Simon. I have a relatively large spatial table in Oracle (2 million records)with point (address) data. I am in the process of partitioning it to improve performance. Are you aware of any utilities that can prescribe the optimum number of partitions to create the table?
Ideally, such a utility would take into account the max number of desired points in a given partition then extrapolate the required partition ranges.
E.g.
CREATE TABLE new_partn_table (in_date DATE,
geom SDO_GEOMETRY, x_value NUMBER)
PARTITION BY RANGE (X_VALUE)
(PARTITION P_LT_90W VALUES LESS THAN (-90),
PARTITION P_LT_0 VALUES LESS THAN (0),
PARTITION P_LT_90E VALUES LESS THAN (90),
PARTITION P_MAX VALUES LESS THAN (MAXVALUE)
);
— Ross · 22 March 2011, 17:09 · #
Very interesting idea this Morton Key. I have learned a lot with this article. Unfortunately, I did’nt succeded to realize your example using the same code as you on random points. The morton_key update with a 0 value for each point. I dont know what im doing wrong. Im using locator, not spatial.
Also CODESYS seems to be unrecognized so I used the command select morton(100,50) from dual; and it works. Any hint of what could be wrong?
Thank you,
Max
— Max Demars · 17 August 2012, 16:10 · #
Max,
> Also CODESYS seems to be unrecognized so I used the command select morton(100,50) from dual; and it works. Any hint of what could be wrong?
CODESYS is the name of the schema in which the Morton function was created. Replace it with the name of your schema that holds the Morton function. Or, execute the function without the prefix (but only when connected as the schema user) as you have done.
> The morton_key update with a 0 value for each point. I dont know what im doing wrong.
No idea what you mean by “update with a 0 value”. Can you provide me with the SQL that is generating this 0 value.
> Im using locator, not spatial.
Irrelevant as we are not using any Oracle functions.
regards
Simon
— Simon Greener · 18 August 2012, 01:30 · #
Hello Simon,
Thank you for your answer.
I used the SQL you provided in this page, step by step. The update of morton_key column in Morton_P is not working as the values stay at 0 for each points. There is no sql error when executing
UPDATE MORTON_P A
SET a.morton_key = Morton(
(SELECT FLOOR – MIN) OVER (ORDER BY FLOOR) FROM Morton_P b WHERE b.id = a.ID ),
(SELECT FLOOR – MIN) OVER (ORDER BY FLOOR) FROM Morton_P c WHERE c.id = a.ID ));
COMMIT;
No error neither when I created the function Morton. Its working perfectly. The problem is with the UPDATE.
Max
— Max Demars · 20 August 2012, 15:25 · #
Maybe the problem is the decimal separator which is the coma ‘, ‘ in my oracle. But im not sure about that. What do you think?
— Max Demars · 21 August 2012, 15:23 · #
Unfortunately the lack of consideration to unsigned ints lead to issues in postgresql (int out of range). It is easily fixed by changing the quantum & key variable declaration, and return type to int8 (but keep supplied parameters to int4).
— tishtosh · 8 February 2015, 10:12 · #
P.S. I talk about http://spatialdbadvisor.com/sql_server_blog/260/creating-a-morton-number-space-key-generator-for-sql-server procedure for postgresql!
— tishtosh · 8 February 2015, 10:14 · #
in fact, the postgresql implementation is rather ill-defined for large inputs in the thousands/tens of thousands. i had to replace every instance of int4 to int8, and like oracle postgres has no bitwise operator for large integer, so i stole the function from this page (adapted slightly):
CREATE or replace Function Left_Shift( p_val int8, p_shift int8)
Returns int8
As
$$
DECLARE
BEGIN Return trunc(p_val * power(2,p_shift));
END;
$$ LANGUAGE ‘plpgsql’ IMMUTABLE;
— tishtosh · 8 February 2015, 12:16 · #