Go to content Go to navigation and search

Current Oracle Spatial Blogs

Search

RSS / Atom

Email me

textpattern

Using Oracle's SDO_NN Operator - Some examples · Feb 3, 04:01 pm by Simon Greener

This little article was occasioned by someone emailing me and asking:

Actually I do have a topic for you – clustering. Before we moved to Oracle Spatial, we had a request for a client to: “Show us all the stores that are close to each other”. I’m somewhat embarrassed to say that we did this by coordinate match.

What would be nice is to understand how to implement the query: Show me all stores within x distance of each other.

Well, there is nothing to be embarrassed about because you did this before you had Oracle Spatial (actually all the functionality used below is available to Locator users – thanks Oracle)! What I will do is outline the Oracle functionality that can do what the person wanted and give a practical example I used recently to solve a particular problem.

First off the function we need in Oracle is the SDO_NN spatial operator. (My preference for everything I do in Oracle is to try and push as much processing into a SQL statement before launching into PL/SQL ie programming.) The Oracle documentation on SDO_NN is quite thorough so I recommend that you read it in consort with this article: For example, the only thing I will say say about the SDO_NN_DISTANCE ancillary operator (used in the first example) is that it returns the actual distance Oracle computed between the searched for object and the search object supplied to the SDO_NN operator. The examples below will clarify its use. For a detailed discussion on the operators read the documentation. For now, I will limit myself to a few comments and examples.

SDO_BATCH_SIZE vs SDO_NUM_RES

One of the first things to understand is use of the sdo_batch_size and sdo_num_res parameters. (Again the documentation is quite thorough on these parameters.)

sdo_num_res=<N>
simply returns exactly N nearest objects to the search object. So, in the example that follows the nearest object only is returned.
gis@XE> select id,
  2            storetype,
  3            sdo_nn_distance(1)
  4       from store s
  5      where sdo_nn(s.geom,
  6                   mdsys.sdo_geometry(2003,NULL,NULL,
  7                         mdsys.sdo_elem_info_array(1,1003,3),
  8                         mdsys.sdo_ordinate_array(380004,5100003,390000,5160000)),
  9*                  'sdo_num_res=1',1) = 'TRUE'
gis@XE> /
ID STORETYPE  SDO_NN_DISTANCE(1)
---------- ---------- ------------------
       271 SHOE                215195.12

(The script to create the STORE table can be accessed here.)

Note that the store is a SHOE store. Let’s “widen” the search a little (note the use of the ORDER BY clause):

gis@XE> select id,
  2         storetype,
  3         sdo_nn_distance(1)
  4    from store s
  5   where sdo_nn(s.geom,
  6                mdsys.sdo_geometry(2003,NULL,NULL,
  7                      mdsys.sdo_elem_info_array(1,1003,3),
  8                      mdsys.sdo_ordinate_array(380004,5100003,390000,5160000)),
  9                'sdo_num_res=5',1) = 'TRUE'
 10*  order by 3
gis@XE> /
        ID STORETYPE  SDO_NN_DISTANCE(1)
---------- ---------- ------------------
       271 SHOE                215195.12
       100 COMPUTER           215228.103
        60 SHOE                215307.04
       494 VIDEO              215341.161
       493 FURNITURE          215347.298

So, what if we only want the nearest three SHOE stores to our search point? Can we just add the “storetype = ‘SHOE’”?

gis>XE>  select id,
  2         storetype,
  3         sdo_nn_distance(1)
  4    from store s
  5   where sdo_nn(s.geom,
  6                mdsys.sdo_geometry(2003,NULL,NULL,
  7                      mdsys.sdo_elem_info_array(1,1003,3),
  8                      mdsys.sdo_ordinate_array(380004,5100003,390000,5160000)),
  9                'sdo_num_res=5',1) = 'TRUE'
 10     and s.storetype = 'SHOE'
 11*  order by 3
gis@XE> /
        ID STORETYPE  SDO_NN_DISTANCE(1)
---------- ---------- ------------------
       271 SHOE                215195.12
        60 SHOE                215307.04

No this will not do what we want. Why? Because the sdo_num_res parameter returns the objects based on their geometry not their attributes: the 5 stores returned by SDO_NN with sdo_num_res=5 are the 5 nearest. No more are returned. These 5 objects are then passed to the added predicate resulting in only 2 of the 5 passing the test. Can we move the predicate around? No because the SDO_NN will still only return the same 5 objects to the SELECT statement. We have to find a way to increase the set that can be returned by SDO_NN. Why not just increase the sdo_num_res to, say, 10?

gis@XE> select id,
  2         storetype,
  3         sdo_nn_distance(1)
  4    from store s
  5   where sdo_nn(s.geom,
  6                mdsys.sdo_geometry(2003,NULL,NULL,
  7                      mdsys.sdo_elem_info_array(1,1003,3),
  8                      mdsys.sdo_ordinate_array(380004,5100003,390000,5160000)),
  9                'sdo_num_res=10',1) = 'TRUE'
 10     and s.storetype = 'SHOE'
 11*  order by 3
gis@XE> /
        ID STORETYPE  SDO_NN_DISTANCE(1)
---------- ---------- ------------------
       271 SHOE                215195.12
        60 SHOE                215307.04
       380 SHOE                215909.35

Yes, it works, but it is subject to one guessing a value for sdo_num_res that will work in all cases. For example, sdo_num_res=10 does not work for COMPUTER stores:

gis@XE>  select id,
  2         storetype,
  3         sdo_nn_distance(1)
  4    from store s
  5   where sdo_nn(s.geom,
  6                mdsys.sdo_geometry(2003,NULL,NULL,
  7                      mdsys.sdo_elem_info_array(1,1003,3),
  8                      mdsys.sdo_ordinate_array(380004,5100003,390000,5160000)),
  9                'sdo_num_res=10',1) = 'TRUE'
 10     and s.storetype = 'COMPUTER'
 11*  order by 3
gis@XE> /
        ID STORETYPE  SDO_NN_DISTANCE(1)
---------- ---------- ------------------
       100 COMPUTER           215228.103
       465 COMPUTER            215717.65

See, we only got two! Sure, we would increase the sdo_num_res batch size again but we are playing a guessing game.

This is why the sdo_batch_size parameter exists. It exists, as the documentation says,”If any geometries in the table might be nearer than the geometries specified in the WHERE clause”. We know SHOE stores in the table are nearer than our COMPUTER stores but it is COMPUTER stores that we want.

gis@XE>  select id,
  2         storetype,
  3         sdo_nn_distance(1)
  4    from store s
  5   where sdo_nn(s.geom,
  6                mdsys.sdo_geometry(2003,NULL,NULL,
  7                      mdsys.sdo_elem_info_array(1,1003,3),
  8                      mdsys.sdo_ordinate_array(380004,5100003,390000,5160000)),
  9                'sdo_batch_size=0',1) = 'TRUE'
 10     and s.storetype = 'COMPUTER'
 11     and rownum < 4
 12*  order by 3
gis@XE> /
        ID STORETYPE  SDO_NN_DISTANCE(1)
---------- ---------- ------------------
       100 COMPUTER           215228.103
       465 COMPUTER            215717.65
         1 COMPUTER           216857.982

A correct result. I will leave you to read up on sdo_batch_size parameter value setting in the documentation.

How do we find the 3 nearest SHOE stores to every SHOE store in the STORE table? As follows?

gis@XE> select /*+ ORDERED USE_NL(s,s2)*/
  2         s.id,
  3         s2.id as nearestStoreId,
  4         sdo_nn_distance(1) as distance
  5    from store s,
  6         store s2
  7   where s.storetype = 'SHOE'
  8     and sdo_nn(s2.geom,
  9                s.geom,
 10                'sdo_batch_size=10',1) = 'TRUE'
 11     and s2.storetype = 'SHOE'
 12     and s2.id <> s.id
 13     and rownum < 4
 14   order by 1,3
 15  /
        ID NEARESTSTOREID   DISTANCE
---------- -------------- ----------
       394             39 5631.82065
       394            413  6387.1917
       394            462 6927.07224

Anyone spot the error? The rownum < 4 is applied to the final rowset and not to each s.store and its 3 nearest neighbours!

What we need to do is order the query such that the rownum test is applied to the neighbours of each and every store. One way to do this is via a correlated subquery as follows:

gis@XE> select /*+ ORDERED USE_NL(s,s2)*/
  2         s.id,
  3         s2.id as nearestStoreId,
  4         mdsys.sdo_geom.sdo_distance(s.geom,s2.geom,0.05) as distance
  5    from store s,
  6         store s2
  7   where s.storetype = 'SHOE'
  8     AND s2.id in (select id
  9                from store s3
 10               where sdo_nn(s3.geom,s.geom,'sdo_batch_size=10',1) = 'TRUE'
 11                      and s3.storetype = 'SHOE'
 12                      and s3.id <> s.id
 13                      and rownum < 4)
 14*  order by 1,2
gis@XE> /
        ID NEARESTSTOREID   DISTANCE
---------- -------------- ----------
....
       490             82 7933.95789
       490            233 7704.35301
       490            480 5899.48249
       495            159 4850.15973
       495            243 5073.66333
       495            318 7008.11984
       499             41 2635.73385
       499             74 4104.74433
       499            430 4327.98326

A correct result.

Note: Because of limitations with returned values from correlated sub-queries, we cannot access any sdo_nn_distance() values so we simply use the sdo_geom.sdo_distance function to get out required distance. If I can come up with a better query (or perhaps my readers may have a better approach) I will amend this blog posting.

Quality of Returned Distance

What I want to turn to the types of distances SDO_NN calculates. It is an approximation to the actual geometric distance done so that the speed of the SDO_NN operator remains fast (which it is) through mainly RTree processing or is it computed by looking at the actual geometries? All I will do is conduct a very simple test using a point and a line.

In the Codesys.ProjLine2D table there is a simple, straight line geometry with linetype of ‘STRAIGHTVERTEX’ as follows:

INSERT INTO ProjLine2D VALUES(
'STRAIGHTVERTEX', 
SDO_GEOMETRY(2002, NULL, NULL, SDO_ELEM_INFO_ARRAY(1, 2, 1), SDO_ORDINATE_ARRAY( 380000.0, 5100000.0, 380000.0, 5160000.0)));

You will note that it is a horizontal line of length 60,000 meters. If I search this feature using two points to show that the SDO_NN distance calculation is quite accurate.

gis@XE> select sdo_nn_distance(1)
  2    from codesys.projline2d p
  3    where p.linetype = 'STRAIGHTVERTEX'
  4    and sdo_nn(p.geom,mdsys.sdo_geometry(2001,NULL,mdsys.sdo_point_type(380010.0, 5160000.0,NULL),NULL,NULL),
  5*  'sdo_num_res=1',1) = 'TRUE'
gis@XE> /
SDO_NN_DISTANCE(1)
------------------
                10
gis@XE> select sdo_nn_distance(1)
  2    from codesys.projline2d p
  3    where p.linetype = 'STRAIGHTVERTEX'
  4    and sdo_nn(p.geom,mdsys.sdo_geometry(2001,NULL,mdsys.sdo_point_type(380010.0, 5155550.0,NULL),NULL,NULL),
  5*  'sdo_num_res=1',1) = 'TRUE'
gis@XE> /
SDO_NN_DISTANCE(1)
------------------
                10

So, SDO_NN must look at the geometries of the objects it deals with to calculate its distances and not rely on MBRs and object vertices.

Hints

You will have noted that that all the queries above that involve a single table require no hints (as expected). However, the query against the two STORE tables above used the ORDERED and USE_NL (USE_Nested_Loops) hints. Why is this? Occasionally, queries just won’t run. For example, here is a query that does not work for the data I ship with my free PL/SQL packages on the copy of XE (10gr) I am running:

gis@XE> select p.id,
  2         l.linetype,
  3         sdo_nn_distance(1)
  4    from codesys.projpoint2d sample (0.1) p,
  5         codesys.projline2d l
  6   where sdo_nn(l.geom,p.geom,'sdo_num_res=5',1) = 'TRUE'
  7     and l.linetype like '<span>VERTEX</span>'
  8  /
select p.id,
*
ERROR at line 1:
ORA-13249: SDO_NN cannot be evaluated without using index
ORA-06512: at "MDSYS.MD", line 1723
ORA-06512: at "MDSYS.MDERR", line 17
ORA-06512: at "MDSYS.PRVT_IDX", line 9

I have note that this can happen where a non-spatial attribute used in a predicate has its own index (and sometimes not). This issue is documented in the Spatial Operators chapter of the Oracle Spatial documentation and also in the excellent book Pro Oracle Spatial. The recommended solution is to include hints. Sometimes adding the /*+ORDERED*/ hint on its own can work but in other cases not. The ones recommended in the book involve knowing about the name of the RTRee and ordinary indexes over the CODESYS.PROJLINE2D table (/*+INDEX*/). Similarly for the documentation though it does suggested the LEADING hint in cases involving a join between the two tables in the FROM clause used in the SDO_NN operator. I have found that, at times, the ORDERED, INDEX and NO_INDEX hints are not enough: I have had to specify the LEADING or USE_NL (USE_Nested_Loops) hints as in the following examples:

gis@XE> select /*+ORDERED USE_NL(p,l) */
  2         p.id,
  3         l.linetype,
  4         sdo_nn_distance(1)
  5    from codesys.projpoint2d sample (0.1) p,
  6         codesys.projline2d l
  7   where sdo_nn(l.geom,p.geom,'sdo_num_res=5',1) = 'TRUE'
  8     and l.linetype like '<span>VERTEX</span>'
  9  /
        ID LINETYPE                                 SDO_NN_DISTANCE(1)
---------- ---------------------------------------- ------------------
       393 VERTEXARC                                        156738.398
       393 VERTEX                                           241507.531
       393 STRAIGHTVERTEX                                   249322.965
       393 45DEGREEVERTEX                                   300109.767

You should read the documentation and be prepared to experiment with different hints.

A Real Example

Finally, I thought I would leave you with a real example derived from something I had to do for a customer recently. What we will do is find the nearest vertex-to-vertex segment (greater than 5 meters in length) of a land parcel polygon to a road centreline with the same name. Here is a picture of some data and a road centreline. Note that the distance to the road centreline will be the same for a parcel side boundary where it joins a front boundary as both share the same (corner) vertex. To get around this we will use the mid-point of each segment.

Before sdo_nn/min-point processing

The identifiers of the hightlighted land parcels are: oid in ( 26388, 26386, 26387, 26392, 26390, 26391, 26389 )

Here is a query that achieves what we want to do.

gis@XE> select /*+ ORDERED USE_NL(a,s)*/
  2           a.oid,
  3             sdo_geom.sdo_length(sdo_geometry(2002,8307,null,
  4                                              sdo_elem_info_array(1,2,1),
  5                                              sdo_ordinate_array(b.startcoord.x,b.startcoord.y,b.endcoord.x,b.endcoord.y)),
  6                                    0.05) length
  7          from land_parcel a,
  8               table(codesys.geom.getpipedvector2d(a.geom)) b,
  9               road_centreline s
 10          where a.oid in ( 26388, 26386, 26387, 26392, 26390, 26391, 26389 )
 11            and sdo_nn(s.geom,
 12                       sdo_geometry(2001,8307, sdo_point_type((b.startcoord.x+b.endcoord.x)/2,(b.startcoord.y+b.endcoord.y)/2,null),
 13                                    NULL,NULL),
 14                       'sdo_num_res=1') = 'TRUE'
 15            and s.street_name = a.street_name
 16            and sdo_geom.sdo_length(sdo_geometry(2002,8307,null,
 17                                                 sdo_elem_info_array(1,2,1),
 18                                                 sdo_ordinate_array(b.startcoord.x,b.startcoord.y,b.endcoord.x,b.endcoord.y)),
 19                                    0.05) > 5
 20  order by 1,2
 21  /
       OID     LENGTH
---------- ----------
     26386  16.598107
     26386 35.6314592
     26386 35.6419791
     26387 15.5455833
     26387 17.5226582
     26387 32.3342913
     26387 35.6419791
     26388 11.9649317
     26388 32.3342913
     26388 37.0525111
     26389 13.0049383
     26389 36.2842051
     26389 37.0525111
     26390 16.9069198
     26390 36.2708158
     26390 36.2842051
     26391 18.5583617
     26391 19.6084577
     26391 36.2663495
     26391 36.6570959
     26392  18.900781
     26392 36.2663495
     26392 36.2708158

23 rows selected.

But we want the nearest vector for each OID. The SQL is:

gis@XE> select /*+ ORDERED USE_NL(a,s) <strong>/
  2             a.oid,
  3             min(sdo_geom.sdo_length(sdo_geometry(2002,8307,null,
  4                                              sdo_elem_info_array(1,2,1),
  5                                              sdo_ordinate_array(b.startcoord.x,b.startcoord.y,b.endcoord.x,b.endcoord.y)),
  6                                    0.05) ) as length
  7          from land_parcel a,cad_poly
  8               table(codesys.geom.getpipedvector2d(a.geom)) b,
  9               road_centreline s
 10          where a.oid in ( 26388, 26386, 26387, 26392, 26390, 26391, 26389 )
 11            and sdo_nn(s.geom,
 12                         sdo_geometry(2001,8307, sdo_point_type((b.startcoord.x+b.endcoord.x)/2,(b.startcoord.y+b.endcoord.y)/2,null),
 13                                      NULL,NULL),
 14                       'sdo_num_res=1') = 'TRUE'
 15            and s.street_name = a.street_name
 16            and sdo_geom.sdo_length(sdo_geometry(2002,8307,null,
 17                                                 sdo_elem_info_array(1,2,1),
 18                                                 sdo_ordinate_array(b.startcoord.x,b.startcoord.y,b.endcoord.x,b.endcoord.y)),
 19                                    0.05) > 5
 20        group by a.oid
 21  /
	

<code id="source_code">

       OID     LENGTH
---------- ---------- 26388 11.9649317 26392 18.900781 26391 18.5583617 26387 15.5455833 26390 16.9069198 26386 16.598107 26389 13.0049383

7 rows selected.

Finally, while we could use the MIN function with an appropriate GROUP BY clause to return the vector with the minimum distance to the road centreline, we cannot easily return the midpoint of the nearest vector of each OID using these operators. To do so requires some tricky SQL using a rank/partition analytic to find the first vector whose distance to the road centreline is the minimum of all vectors that make up any one land parcel (rank = 1). Here is the final result.

gis@XE> select oid,cldist,midpoint,parcel_vector
  2   from (select oid,cldist,midpoint,parcel_vector,
  3                RANK() OVER (PARTITION BY oid ORDER BY cldist) as oidrank
  4           from ( select /</strong>+ ORDERED USE_NL(a,s)*/
  5                         a.oid,
  6                         sdo_geom.sdo_distance(s.geom,
  7                                               sdo_geometry(2001,8307,
  8                                               sdo_point_type((b.startcoord.x+b.endcoord.x)/2,(b.startcoord.y+b.endcoord.y)/2,null),
  9                                               NULL,NULL),
 10                                               0.05) as cldist,
 11                         sdo_geometry(2002,8307,null,
 12                                      sdo_elem_info_array(1,2,1),
 13                                      sdo_ordinate_array(b.startcoord.x,b.startcoord.y,b.endcoord.x,b.endcoord.y)) as parcel_vector,
 14                         sdo_geometry(2001,8307,
 15                                      sdo_point_type((b.startcoord.x+b.endcoord.x)/2,(b.startcoord.y+b.endcoord.y)/2,null),
 16                                      NULL,NULL) as midpoint
 17                    from land_parcel a,
 18                         table(codesys.geom.getpipedvector2d(a.geom)) b,
 19                         road_centreline s
 20                   where a.oid in ( 26388, 26386, 26387, 26392, 26390, 26391, 26389 )
 21                     and sdo_nn(s.geom,
 22                                sdo_geometry(2001,8307,sdo_point_type((b.startcoord.x+b.endcoord.x)/2,(b.startcoord.y+b.endcoord.y)/2,null),NULL,NULL),
 23                                'sdo_num_res=1') = 'TRUE'
 24                     and s.street_name = a.street_name
 25                     and sdo_geom.sdo_length(sdo_geometry(2002,8307,null,
 26                                                          sdo_elem_info_array(1,2,1),
 27                                                          sdo_ordinate_array(b.startcoord.x,b.startcoord.y,b.endcoord.x,b.endcoord.y)),
 28                                             0.05) > 5
 29               )
 30        )
 31   where oidrank = 1
 32  /
       OID     CLDIST
---------- ----------
MIDPOINT(SDO_GTYPE, SDO_SRID, SDO_POINT(X, Y, Z), SDO_ELEM_INFO, SDO_ORDINATES)
-------------------------------------------------------------------------------
PARCEL_VECTOR(SDO_GTYPE, SDO_SRID, SDO_POINT(X, Y, Z), SDO_ELEM_INFO, SDO_ORDINATES)
------------------------------------------------------------------------------------
     26386 5.72710973
SDO_GEOMETRY(2001, 4030, SDO_POINT_TYPE(144.673904, -37.862795, NULL), NULL, NULL)
SDO_GEOMETRY(2002, 4030, NULL, SDO_ELEM_INFO_ARRAY(1, 2, 1), SDO_ORDINATE_ARRAY(144.673898, -37.862869, 144.673909, -37.86272))
     26387 10.1054417
SDO_GEOMETRY(2001, 4030, SDO_POINT_TYPE(144.67386, -37.862933, NULL), NULL, NULL)
SDO_GEOMETRY(2002, 4030, NULL, SDO_ELEM_INFO_ARRAY(1, 2, 1), SDO_ORDINATE_ARRAY(144.673822, -37.862996, 144.673898, -37.862869))
     26388 16.3701483
SDO_GEOMETRY(2001, 4030, SDO_POINT_TYPE(144.673864, -37.863038, NULL), NULL, NULL)
SDO_GEOMETRY(2002, 4030, NULL, SDO_ELEM_INFO_ARRAY(1, 2, 1), SDO_ORDINATE_ARRAY(144.673905, -37.863081, 144.673822, -37.862996))
     26389  11.580563
SDO_GEOMETRY(2001, 4030, SDO_POINT_TYPE(144.673974, -37.863058, NULL), NULL, NULL)
SDO_GEOMETRY(2002, 4030, NULL, SDO_ELEM_INFO_ARRAY(1, 2, 1), SDO_ORDINATE_ARRAY(144.674042, -37.863036, 144.673905, -37.863081))
     26390 6.43017772
SDO_GEOMETRY(2001, 4030, SDO_POINT_TYPE(144.674138, -37.863039, NULL), NULL, NULL)
SDO_GEOMETRY(2002, 4030, NULL, SDO_ELEM_INFO_ARRAY(1, 2, 1), SDO_ORDINATE_ARRAY(144.674234, -37.863043, 144.674042, -37.863036))
     26391 5.70850051
SDO_GEOMETRY(2001, 4030, SDO_POINT_TYPE(144.674554, -37.863055, NULL), NULL, NULL)
SDO_GEOMETRY(2002, 4030, NULL, SDO_ELEM_INFO_ARRAY(1, 2, 1), SDO_ORDINATE_ARRAY(144.674659, -37.863059, 144.674449, -37.863051))
     26392 6.07655561
SDO_GEOMETRY(2001, 4030, SDO_POINT_TYPE(144.674341, -37.863047, NULL), NULL, NULL)
SDO_GEOMETRY(2002, 4030, NULL, SDO_ELEM_INFO_ARRAY(1, 2, 1), SDO_ORDINATE_ARRAY(144.674449, -37.863051, 144.674234, -37.863043))

7 rows selected.

Mapping these midpoints (large gray circles) produces the following.

After sdo_nn/min-point processing

If you reached the end of this article, thanks for persevering with the material. I hope you got something out of it.

  Textile Help

<<Converting distances and units of measure in Oracle Locator >>Spatial Pipelining