Radius Studio and ESRI (Part 2) · 175 days ago by Simon Greener

This article has been a long time in gestation. Sure, I could have written it anytime in the past 12 months, but it just has not been in my top 10 jobs: paid work always takes precedence over “labours of love”.

The ESRI value-add conundrum

For anyone selling software in the GIS industry one question always seems to be foremost:

“How does the software integrate with what ESRI offers?”

with an important corollary question then arising:

“How does this add value to ESRI’s software and customers?”

(And in a way that doesn’t cause ESRI to enter your market.)

The FDO Value-Add

In Part One, “Radius Studio and FDO” I explained how quality open source software, FDO, could be used to add value to 1Spatial’s Radius Studio. That value can be summarised as follows:

The use of FDO liberates 1Spatial programmers from reinventing the GIS data access wheel, allowing them to concentrate on developing new functionality for Radius Studio.

In summary, this is the pre-eminent “value add”. To be able to concentrate on what you do best, correctly relegating generic data access componentry to where it should be: foundational.

ESRI Integration

This article is all about the question of how Radius Studio might value-add ESRI’s software and customers now that it can use FDO to access ESRI’s myriad data formats.

Recently 1Spatial discovered that having embedded FDO within Radius Studio they could now access ArcSDE based data with little or no engineering effort. (The first integration they conducted with FDO was with shapefiles.) As Chris Tagg observed in an email to me: “It just works”. I suspect they also will discover that, via the FDO OGR provider (and any others that may get created), they can access personal geodatabases as well. The latter is important because it is likely that the greatest amount of spatial data being managed by ESRI software is in personal geodatabases.

But does being able to read someone else’s proprietary format data holdings add any value to ESRI?

From ESRI’s perspective the answer is “not really”. Why? Well, will having such capability sell any more ESRI licenses to customers? No. Being able to read data is also not a basis for being a business partner (even Manifold GIS can read shapefiles, personal and enterprise geodatabases – without the use of FDO – and they are very much in the anti-ESRI camp). And has such a capability helped current ESRI customers improve the return on their ESRI software investment? Unlikely.

Is being able to write the data back to ESRI database formats a value add?

If writing back the data occurs via back-door methods (eg reverse-engineering of the format), then the answer is definitely “no” as it places the customer and vendor in a difficult situation with respect to ESRI support. For example, writing an invalid shape into an Enterprise Geodatabase can cause queries against that database by ArcSDE to fail (as all geometries extracted in a query are validated by ArcSDE). If the customer then lodged a support call and ESRI Inc discovered that a third-party application caused the issue they would rightly insist that the offending software be removed.

Fortunately, FDO’s ArcSDE provider uses ESRI’s published method for accessing ArcSDE data: the ArcSDE API.

But writing data can be seen as a value-add where the product vendor provides a product that ESRI itself does not. So, being able to write clean GPS data from, say, Trimble Pathfinder Office directly to an ArcSDE database would be seen as a value add if it used the ArcSDE API and is obviously from a market segment ESRI does not directly compete in.

1Spatial and ESRI: more than just data

But what about 1Spatial? It can read, and write, ESRI ArcSDE and other proprietary format data. And, fortunately, for ArcSDE, that writing occurs via ESRI’s own API (so is “support friendly”). So they have capability in data interoperability but they are not an ESRI business partner; they own no ESRI software; they have not created any “extension” products for ArcGIS or ArcGIS Server; in fact they directly compete against ESRI in a host of areas: database cartography, object oriented GIS, topology.

So, how can they value-add ESRI in a way that is non-threatening?

Technology Stack: The Geodatabase

Integration of Radius Studio and ESRI data technologies is more than just the reading and writing of particular proprietary storage formats through particular APIs, i.e. ArcSDE. Rather, integration needs to consider the whole of the ESRI “technology stack” and ESRI’s positioning of that technology in the marketplace.

The ESRI “technology stack” is predominantly a style of Model Driven Architecture (MDA) based on their “intelligent feature” Geodatabase technologies.

The Geodatabase is more than just a bucket of geographic data, it also implements sophisticated business logic that, for example, builds relationships between data types, such as topologies and geometric networks; validates data; and controls access.

Within the ESRI system, the proprietary spatial data formats (shapefiles, SDEBINARY, etc) and data access technologies (ArcSDE, Jet, APIs) are glued together through a common generic geospatially aware semantic model that is stored and managed within the metadata repository components of a Geodatabase. A fully defined Geodatabase should include all the rules that are necessary for defining, controlling and validating data quality and integrity. ESRI client technologies such as ArcGIS, ArcEngine and ArcServer dynamically use those rules (through generated code) to control data editing.

ESRI’s preferred method for the construction of Geodatabase models is to use a UML CASE tool such as Microsoft Visio or Rational Rose, which it has extended with a set of modeling templates that enable customers to model some of the spatial aspects of a Geodatabase. Once modelled, a design can be “forward engineered” to the Geodatabase persistent data store (via the ArcCatalog Schema wizard) with “behaviours” being generated for use within the ArcGIS technology via the Code Generation Wizard.

However, it would appear that while many ESRI customers have purchased all the software components necessary to build an intelligent Geodatabase, they do not have the in-house skills to use UML to define a Geodatabase structure, or the skills to conformance check and migrate their data. Because of this, ESRI and other “domain experts” have developed, and made available for download, “starter data models” which customers can use as templates to “kick start” their Geodatabase creation (see http://support.esri.com/datamodels). These models are already “spatially aware”. However, when a customer uses one of these models, converting existing spatial data such that it conforms to the model is a manual process.

Until an ESRI customer has a fully specified, semantically rich, Geodatabase populated with quality data, they cannot leverage the “intelligent features” promise in the ESRI stack of technology. Until they do, their Return On Investment is sub-optimal.

The Radius Studio Value-Add.

This understanding of the Geodatabase technology points to a number of possible areas in which Radius Studio could deliver significant return on investment (ROI) for ESRI customers.

Firstly, Radius Studio is designed to be “domain expert” friendly. One does not need to know anything about UML to be able to discover and define rules.

Secondly, by reading in the semantics of a “starter model”, Radius Studio could conform a customer’s existing spatial data to a chosen ESRI data model.

Thirdly, since no generic “starter data model” reflects the idiosyncrasies of any one business’s rules or operating environment, Radius Studio, through its ability to automatically discover new rules, could enhance a “starter model” through examination of a customer’s specific data.

Finally, having conformed the data and discovered new rules, if Radius Studio had the ability to write the rules and conformed data to the ESRI Geodatabase metadata catalog (via the XML Schema of the Geodatabase) this could give ESRI customers an enormous “leg up” in their migration activities. More importantly, it would enable the customer owning the ESRI technology to use all the “smart feature” capability of the whole technology stack, delivering much faster ROI than they could otherwise expect.
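
To make the idea of “conforming data to rules” a little more concrete, here is a minimal Python sketch. It is not how Radius Studio or the Geodatabase actually work: the rule list, the feature attributes (parcel_id, area_m2) and the check_conformance function are all invented for illustration. It shows only the general shape of the task, which is to apply a set of declarative rules to each feature and report which ones it breaks.

```python
# Each rule is a (description, predicate) pair. Both rules are invented examples.
RULES = [
    ("parcel area must be positive", lambda f: f["area_m2"] > 0),
    ("parcel must carry a non-empty identifier", lambda f: bool(f["parcel_id"])),
]


def check_conformance(features, rules):
    """Return {feature index: [descriptions of the rules it breaks]}."""
    failures = {}
    for i, feature in enumerate(features):
        broken = [desc for desc, ok in rules if not ok(feature)]
        if broken:
            failures[i] = broken
    return failures


# Invented sample features: the second and third each break one rule.
features = [
    {"parcel_id": "P1", "area_m2": 812.5},
    {"parcel_id": "", "area_m2": 430.0},
    {"parcel_id": "P3", "area_m2": 0.0},
]
report = check_conformance(features, RULES)
print(report)
```

Because the rules are data rather than code, a domain expert could in principle add to the list without touching the checking logic.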

The Sad State of GIS SQL Standards · 175 days ago by Simon Greener

The background to this blog comes from two separate sources: the first is my testing of SQL Server 2008 “Katmai”; the second is a blog article that Charlie Savage wrote a while back on The Sad State of GIS Web Standards.

Web Mapping Standards

As an ex-GIS Manager who monitored GIS standards for possible deployment within the business I worked for I can only concur with Charlie’s comments.

In particular, Web Mapping Service (WMS) was pretty useless as the basis of a feature-rich, middle-tier, internet standards based, rendering engine given it was designed by the standards committee as a “raster representation” of spatial data when what the industry wanted was an IMS standard that enabled the querying of individual objects within individual layers and the visualisation of selections against the geographic objects within those layers (not just what occurred at a pixel). For those who say that Web Feature Service (WFS) can be used for just such a selection I can only retort that it took years to become available and, even now, selection and visualisation is still a custom use of WMS/WFS.

In the end I only found two uses for OGC web standards: one was for the display of a dumb raster image in Trimble Pathfinder Office behind GPS data before uploading into the corporate database (a good visual check the data was in the right area); the other was for accessing external public data in either WMS or WFS form. But for internal business-centric use where higher functionality was needed they were, and are, useless.

SQL Standards

But my main contribution to Charlie’s blog article is to outline a few issues with the current geospatial SQL standards: in particular the OpenGIS® Simple Features Implementation Specification for SQL and SQL/MM standards. Microsoft’s SQL Server Spatial (“Katmai”) functionality is based on the OGC 1.1 standard.

Generally, I don’t have much of an issue with these standards as they implement a workable set of basic methods needed for the storage, search, geoprocessing and retrieval of spatial data.

However, I do take issue with the lack of support within either standard for Minimum Bounding Rectangle (MBR) objects, MBR-based search/retrieval and spatial aggregates. (After all, so much of what we do with spatial data depends on MBR processing.)

Search Operators

In the OGC 1.1 and SQL/MM standards, the only search operators are:

There is no ST_MBR() or ST_Envelope() search operator. (There is a ST_Envelope() operator that returns the MBR of a geometry object as a polygon but there are no methods for “envelopes” or search operators.)

Currently, the best-performing operator that the creator of a geospatial application can use to get shapes out of an OGC/SQLMM compliant database for a “find all objects inside the current display extent” query is ST_Intersects():

Example from SQL Server 2008: note STIntersects() and not ST_Intersects().
SELECT d.delaunay_num, d.geom.STAsText()
  FROM Delaunay d
 WHERE d.geom.STIntersects(geometry::STPolyFromText('POLYGON((350000 5400000,355000 5400000,355000 5405000,350000 5405000,350000 5400000))',28355)) = 1;

Two comments on MBR filtering and MBR objects are needed.

1. Construction of Search Object

The construction of the search area (MBR) via a 5 vertex polygon is a pain. Another method one might use is to generate a search polygon via use of the ST_Envelope() OGC 1.1/SQLMM method as in the following from SQL Server “Katmai”:

SELECT d.delaunay_num, d.geom.STAsText()
  FROM Delaunay d
 WHERE d.geom.STIntersects(geometry::STLineFromText('LINESTRING(350000 5400000,355000 5405000)',28355).STEnvelope()) = 1;

This is far more elegant a way to express the search rectangle.

But why can’t we use some sort of MBR (cf Oracle’s optimized rectangle) object more directly?

SELECT d.delaunay_num, d.geom.STAsText()
  FROM Delaunay d
 WHERE d.geom.STIntersects(geometry::STEnvelope(350000 5400000,355000 5405000,28355)) = 1;

So, that’s one suggestion: a rectangle/MBR type and operators.
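
As a rough illustration of what such a type might offer, here is a minimal Python sketch. It is written in Python rather than SQL because no such type exists in the standards. The MBR class and its from_corners, intersects and to_wkt_polygon methods are hypothetical names, not part of any standard or product. The sketch shows that two corner points and an SRID are enough to run an interaction test, and that the five-vertex polygon can be derived mechanically when an API insists on one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MBR:
    """Hypothetical minimum bounding rectangle: two corners plus an SRID."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float
    srid: int

    @classmethod
    def from_corners(cls, x1, y1, x2, y2, srid):
        # Accept the two corners in any order and normalise them.
        return cls(min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2), srid)

    def intersects(self, other):
        # Rectangles interact unless one lies wholly to one side of the other.
        if self.srid != other.srid:
            raise ValueError("MBRs must share an SRID")
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)

    def to_wkt_polygon(self):
        # Expand to the closed five-vertex ring needed when only polygons are accepted.
        pts = [(self.min_x, self.min_y), (self.max_x, self.min_y),
               (self.max_x, self.max_y), (self.min_x, self.max_y),
               (self.min_x, self.min_y)]
        return "POLYGON((" + ",".join(f"{x} {y}" for x, y in pts) + "))"


window = MBR.from_corners(350000, 5400000, 355000, 5405000, 28355)
print(window.to_wkt_polygon())
```

With the coordinates used in this article, the printed polygon is the same five-vertex ring that had to be typed by hand in the first SQL Server example.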

2. Efficient and effective searching.

Earlier this year I was asked to look at why Oracle Spatial was running so slowly at a client site. (One felt that Oracle was being “blamed” as it is “common knowledge” that Oracle performs badly.) When I got to that site I discovered that Oracle’s MapViewer was running very, very fast: fast enough for me to observe that it was the performance benchmark against which all other GIS access should be measured – to which the customer agreed. Now, Oracle’s MapViewer executes SDO_FILTER() to get its data for map rendering while Deegree (one of the geospatial applications accessing the database) was using SDO_RELATE(... ‘mask=ANYINTERACT’...). SDO_RELATE(... ‘mask=ANYINTERACT’...) is roughly equivalent to STIntersects(). In executing some performance measures I discovered the following performance differences:

Dataset      Deegree vs MapViewer (%)   Notes
Parcels      19.7                       Polygon data
Contours     30.3                       Contours have a complex spatial description
Transport    48.2                       Transport spatial geometries are relatively simple (small avg vertices)
PlaceNames   86.8                       No geometric comparisons needed for points: Sdo_Relate effectively an Sdo_Filter

So, by not having a Filter operator, the OGC effectively slows down data access for 98% (a guesstimate based on experience) of all spatial queries against a spatial database! The following SQL statement shows just what a filter that compares MBRs of geometries and returns those for which there is an interaction might look like:

SELECT d.delaunay_num, d.geom.STAsText()
  FROM Delaunay d
 WHERE d.geom.ST_MBR(350000,5400000,355000,5405000,28355) = 1;

The above query in PostGIS is (note the BOX3D operator cf my suggestion of an MBR object in the relevant standards):

SELECT d.delaunay_num, ST_AsText(d.geom)
  FROM Delaunay d
 WHERE d.geom && ST_SetSRID('BOX3D(350000 5400000,355000 5405000)'::box3d,28355);

And in Oracle is:

SELECT d.delaunay_num, d.geom.Get_WKT()
  FROM Delaunay d
 WHERE sdo_filter(d.geom,sdo_geometry(2003,28355,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,1003,3), SDO_ORDINATE_ARRAY(350000,5400000,355000,5405000))) = 'TRUE';

So, both Oracle and PostGIS have Box/MBR/optimised rectangle object support and a fast primary filter/search capability.
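
The difference between a primary (MBR) filter and an exact interaction test can be sketched in a few lines of Python. This is only an illustration of the two-pass idea: it is not how Oracle or PostGIS implement their operators (they use spatial indexes), the function names are hypothetical, and the three sample lines are invented. Liang-Barsky clipping stands in here for the exact geometry comparison.

```python
def mbr_of(points):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def mbrs_interact(a, b):
    """Primary filter: cheap rectangle-versus-rectangle comparison."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def segment_hits_window(p, q, window):
    """Exact test of one segment against the window (Liang-Barsky clipping)."""
    min_x, min_y, max_x, max_y = window
    dx, dy = q[0] - p[0], q[1] - p[1]
    t0, t1 = 0.0, 1.0
    for delta, lo, hi in ((dx, min_x - p[0], max_x - p[0]),
                          (dy, min_y - p[1], max_y - p[1])):
        if delta == 0:
            if lo > 0 or hi < 0:  # no movement on this axis and outside its range
                return False
            continue
        ta, tb = lo / delta, hi / delta
        if ta > tb:
            ta, tb = tb, ta
        t0, t1 = max(t0, ta), min(t1, tb)
        if t0 > t1:
            return False
    return True


def line_intersects_window(line, window):
    """Secondary filter: exact comparison, segment by segment."""
    return any(segment_hits_window(p, q, window) for p, q in zip(line, line[1:]))


def search(lines, window, exact=True):
    """Filter on MBRs first; refine with the exact test only if requested."""
    candidates = {k: v for k, v in lines.items() if mbrs_interact(mbr_of(v), window)}
    if not exact:
        return sorted(candidates)
    return sorted(k for k, v in candidates.items() if line_intersects_window(v, window))


window = (350000, 5400000, 355000, 5405000)
lines = {
    "crosses": [(349000, 5402000), (356000, 5402000)],
    "corner_miss": [(354000, 5407000), (357000, 5404000)],
    "far_away": [(360000, 5410000), (361000, 5411000)],
}
print(search(lines, window, exact=False))  # MBR filter only
print(search(lines, window, exact=True))   # MBR filter plus exact refinement
```

The MBR-only pass returns a superset: the diagonal line that passes just outside the window's corner is kept because its rectangle overlaps the window. For drawing a map that is usually harmless, since the renderer clips to the display extent anyway, which is why a filter-only operator is so useful.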

Spatial Aggregates

The relevant spatial SQL standards do not mention one of the most important functions that a user executes: group by aggregates. An example from Oracle:

-- SDOAGGRTYPE takes a geometry and a tolerance; 0.005 is an assumed value
SELECT d.table_attribute,
       SDO_AGGR_UNION(SDOAGGRTYPE(d.geom, 0.005))
  FROM Delaunay d
 WHERE sdo_filter(d.geom,sdo_geometry(2003,28355,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,1003,3), SDO_ORDINATE_ARRAY(350000,5400000,355000,5405000))) = 'TRUE'
 GROUP BY d.table_attribute;

As you can see the custom aggregate functions have type signatures that are specific to Oracle limiting cross-database portability (more on portability in another blog). But if we take Oracle’s approach and apply it to the standard (though this is probably invalid as the Oracle approach is dictated by the architecture of the Oracle database) perhaps a standards compliant SQL statement might look like this:

SELECT d.table_attribute,
       d.geom.STAggrUnion()
  FROM Delaunay d
 WHERE d.geom.ST_MBR(350000,5400000,355000,5405000,28355) = 1
 GROUP BY d.table_attribute;

Though I would think that d.geom.STUnion() might still be acceptable with the context (ie the GROUP BY clause) telling the database query engine that the version of the STUnion() (no supplied geometry) function that is needed is an aggregate function and not just a union of two shapes.
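
To show how a GROUP BY clause drives a user-defined aggregate, here is a small sketch using Python's built-in sqlite3 module. Two assumptions need stating loudly. SQLite is not a spatial database, and the aggregate merges bounding rectangles rather than performing a true geometry union. The mbr_union function, the MbrUnion class, the table layout and the sample rows are all invented for illustration.

```python
import sqlite3


class MbrUnion:
    """Hypothetical aggregate: grows one rectangle to cover every row's MBR."""

    def __init__(self):
        self.box = None

    def step(self, min_x, min_y, max_x, max_y):
        # Called once for each row in the current group.
        if self.box is None:
            self.box = [min_x, min_y, max_x, max_y]
        else:
            self.box = [min(self.box[0], min_x), min(self.box[1], min_y),
                        max(self.box[2], max_x), max(self.box[3], max_y)]

    def finalize(self):
        # Called once per group; returns the merged rectangle as text.
        if self.box is None:
            return None
        return "BOX({} {},{} {})".format(*self.box)


con = sqlite3.connect(":memory:")
con.create_aggregate("mbr_union", 4, MbrUnion)
con.execute(
    "CREATE TABLE delaunay (table_attribute TEXT, "
    "min_x INTEGER, min_y INTEGER, max_x INTEGER, max_y INTEGER)"
)
con.executemany("INSERT INTO delaunay VALUES (?, ?, ?, ?, ?)", [
    ("A", 350000, 5400000, 351000, 5401000),
    ("A", 350500, 5400500, 352000, 5402000),
    ("B", 354000, 5404000, 355000, 5405000),
])
rows = con.execute(
    "SELECT table_attribute, mbr_union(min_x, min_y, max_x, max_y) "
    "FROM delaunay GROUP BY table_attribute ORDER BY table_attribute"
).fetchall()
print(rows)
```

The query engine decides when to call step() and finalize() from the GROUP BY context alone, which is the behaviour argued for above: the same function name could serve as an aggregate when no second geometry is supplied.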

Conclusion

Why do the relevant standards not implement or suggest fast search operators, MBR types and aggregate operators? Is it because the relevant committees were more interested in formal (theoretical) aspects of spatial data (cf WMS) or because it leaves vendors (who have representatives on those standards bodies) wiggle room to add in the necessary additional functionality they know is needed to make the products useable in real world situations (with a claim that they have to implement their custom extensions for reasons of speed/functionality)?

Comments?

Microsoft to release their own spatial capability for SQL Server · 372 days ago by Simon Greener

My friend, Andrew Hallam, just pointed out something that I missed. Microsoft is developing a new release of SQL Server that includes capability for “geographic” data.

Note the “geographic” down near the bottom of this press release on Katmai

Also this note on the future version of SQL Server

Supposedly scheduled to ship in 2008 but Microsoft is not great at hitting targets.

I for one will certainly be getting a copy!

BTW I have now started to get some serious PostGIS experience under my belt.

What fun!

Spatial-in-the-database?

Let’s Roll!

Radius Studio and FDO · 475 days ago by Simon Greener

In 2006 I spent 3 months conducting some research and development for 1Spatial under a UK Department of Trade and Industry grant. That research was predominantly aimed at investigating methods whereby their flagship product, Radius Studio, could integrate with ESRI GeoDatabase technology in a way that adds value to both companies’ products. The research concluded with 3 main recommendations. One of those was the subject of a recent 1Spatial news release wherein they announced that they had integrated the OSGeo’s Feature Data Objects (FDO) technology (thanks to AutoDesk) into Radius Studio.

OK, so how does FDO add value to Radius Studio and how does this integration help with adding value to ESRI GeoDatabase technology?

First one first.

I wrote a blog piece a while ago about Feature Data Objects and have included links on the technology on my home page. In that blog piece I said:

“Secondly it provides a standardised interface that programmers can use to access a range of existing geospatial data formats, not just one. The lack of a production quality standardised access API for the myriad of geospatial formats that exist, has been one of the great anchors on the good boat GIS for too many years. (That is not to say that Safe Software haven’t done a good job of making spatial data more accessible.)”

While I would love to see real Spatial SQL based data access drivers around (eg ADO, ODBC, JDBC etc) so that I could, one day, Start>Control Panel>Administrative Tools>Data Sources (ODBC) and select the appropriate driver (eg shapefile, TAB file etc etc) that day has not arrived.

But FDO’s day has.

Radius Studio initially read and wrote Oracle Spatial/Locator and nothing else. To be able to connect to ESRI data-sources, 1Spatial was faced with having to write, from scratch, a data access layer for each of the data sources in the ESRI stable that customers are using to manage their data.

This is a non-trivial task, particularly because of the many proprietary formats ESRI provides. And ESRI customers use a large range of them when managing their data:

Developing one’s own, proprietary, data access layer is also an expensive option because it would require the purchase of ESRI software and, more importantly, 1Spatial would have to dedicate valuable staff time to learning, configuring and programming ESRI’s technology (or AutoDesk’s, MapInfo’s etc etc).

More importantly, these staff would not be able to be used to value-add 1Spatial’s own software through creating new, or enhancing existing, functionality: these staff would be wasting time “reinventing the geodata wheel” by developing low-level data access drivers.

To do this properly, so that the access layer could be reused for other GIS companies’ proprietary geodata formats, would take many, many hours of programming. In fact, once done one would end up wanting to get a return on that investment by trying to sell it in its own right. But who wants to compete with Safe Software? And, from 1Spatial’s perspective, would slow down the time to market for what Radius Studio does best: rules discovery, creation, conformance checking, correction and certification.

A “data access layer” product, would also confuse staff and customers as to what 1Spatial’s real “value add” in the data quality sector actually was!

Fortunately, FDO came along at the right time and, though many disagree as to the merits of (or motives behind) making FDO open source, it provided a number of immediate benefits:

(Many other providers are being written for FDO as I write.)

A trial integrating FDO into Radius Studio last year was very successful (for minimal effort). Thus, through ONE piece of engineering, MULTIPLE benefits could be gained.

OK, so Radius Studio + FDO provides access to ESRI geodata. So how does this value-add Radius Studio in an ESRI world?

In a second blog posting I will explain.
