Oracle8 Spatial Cartridge Oracle8 Spatial Cartridge User's Guide and Reference Release 8.0.5 A53264-03 |
|
This appendix provides supplemental information to aid in setup, maintenance, and tuning of a spatial database. The scripts and tuning suggestions provided are intended as guidelines that can be adapted to the specific needs of your database.
Spatial Cartridge provides sample SQL script files to show how to use dynamic SQL in a PL/SQL block to create layer tables for spatially indexed data or to administer and manipulate all the partitions of a partitioned spatial table. The scripts are available after installation in the ORACLE_HOME/md/admin directory.
This section describes the cr_spatial_index.sql and crlayer.sql scripts.
The cr_spatial_index.sql script file shows an example of updating the spatial index for a layer, and executing a commit after every 50 GIDs have been entered.
The procedures SDO_ADMIN.POPULATE_INDEX() and SDO_ADMIN.POPULATE_INDEX_FIXED() operate as a single transaction. To reduce the amount of rollback required to execute these procedures, you can write a routine similar to that in cr_spatial_index.sql. This script loops and calls SDO_ADMIN.UPDATE_INDEX_FIXED()
for each GID, committing after every 50 GIDs.
-- cr_spatial_index.sql -- -- Note: if geometries do not span more than 1 row, you can remove -- the DISTINCT qualifier from the SELECT statement -- declare cursor c1 is SELECT DISTINCT sdo_gid from POLYGON_SDOGEOM; gid number; i number; begin i := 0; for r in c1 loop begin gid:= r.sdo_gid; sdo_admin.update_index_fixed('POLYGON', gid, 15, FALSE, FALSE, FALSE); exception when others then dbms_output.put_line('error for gid'||to_char(gid)||': '||SQLERRM ); end; i:= i + 1; if i = 50 then commit; i:= 0; end if; end loop; commit; end; /
When you call the SDO_ADMIN.UPDATE_INDEX_FIXED()
procedure for a large data set, you may get a "snapshot too old" error message from the Oracle server. You can avoid this error by creating more or larger rollback segments. You can also try to increase the number of GIDs before committing the transaction.
Note:
The cr_spatial_index.sql script is not available in your ORACLE_HOME/md/admin directory after installation. You must create this script yourself. |
The crlayer.sql script file is a template used to create all the tables for a layer and populate the metadata in the <layername>_SDODIM and <layername>_SDOLAYER tables.
This section describes the following scripts:
Although the scripts described in this section are available, the recommended approach is to use Oracle8 partitioning and spatial indexing.
The altpart.sql script file shows how to use dynamic SQL in a PL/SQL procedure to modify all partitions of a Spatial Cartridge partitioned table.
The Spatial Cartridge data dictionary view used in this SQL script requires that a registered Spatial Cartridge partitioned table is specified. If the table is not registered, you can use the USER_TABLES view to select all the partition tables from the user's schema. To use the USER_TABLES view, use the following syntax:
SQL> SELECT TABLENAME FROM user_tables WHERE TABLENAME LIKE
2> '%tablename_P%';
The drppart.sql script file shows how to use dynamic SQL in a PL/SQL procedure to drop (remove) all partitions of a Spatial Cartridge partitioned table. After running this procedure, you must run the SDO_ADMIN.DROP_PARTITION_INFO()
procedure.
The Spatial Cartridge data dictionary view used in this SQL script requires that a registered Spatial Cartridge partitioned table is specified. If the table is not registered, you can use the USER_TABLES view to select all the partition tables from the user's schema. To use the USER_TABLES view, use the following syntax:
SQL> SELECT TABLENAME FROM user_tables WHERE TABLENAME LIKE
2> '%tablename_P%';
The sdogrant.sql script file contains an administrative procedure, PROPAGATE_GRANTS(), which is used after calling the SDO_ADMIN.PARTITION() or SDO_ADMIN.REPARTITION() procedures.
This procedure must first be compiled by running the sdogrant.sql file. The PROPAGATE_GRANTS() procedure is only callable by the user who compiled it.
The following information can be used as a guideline for tuning a spatial database.
Data modeling is very important when designing a spatial database. You should group geometries into layers based on the similarity of their attributes. Assume your data model uses line strings to represent both roads and rivers. The attributes of a road and the attributes of a river are different. Therefore, these geometries should be modeled in two different layers.
In practice, however, if the user of your application will always ask to see both the roads and rivers in a particular region (area of interest), then it may be appropriate to model roads and rivers in the same layer with a common set of attributes.
It is equally important to understand how the data in the various layers will be queried. If the user of your application is interested in querying the data based on a relationship between the layers, then you should index the layers with the same tiling level. For example, a query such as, "which roads cross rivers?" can achieve better performance if the roads and rivers layers are tiled at the same level.
It is not always critical to tile all your layers to the same level. You may find times when you have two layers that are optimally tiled to different levels (for example zipcode boundaries may be tiled to level 5 and and major roads may be tiled to level 7). If you want to ask the question, give me all the major roads that intersect a particular postal code boundary, it is not necessary to re-tile all the zipcode boundaries to level 7. You can move the postal code boundary of interest to a temporary layer and just re-tile that one geometry to level 7. You can then perform the query.
The following example explains how tiling is used in Spatial Cartridge.
Assume you want all the roads (line strings) that overlap a county boundary (polygon) in a spatial database containing 10 million roads. Ignoring Spatial Cartridge features for a moment, in purely mathematical terms, the problem translates into comparing all the line segments that make up each road, to the line segments and area of the county boundary to see if there is any intersection. This geometry-to-
geometry comparison is very expensive.
Spatial Cartridge simplifies this calculation by approximating each geometry with fixed-sized tiles. The primary filter in Spatial Cartridge translates the problem to show all the roads that have a tile equal to a tile that approximates the polygon. The result of this is a superset of the final answer.
The secondary filter (a true geometry-to-geometry comparison) can now be applied to the candidates that returned from the Spatial Cartridge primary filter, instead of to every road in the database.
Picking the correct tile size for fixed tiling is one of the most important factors in attaining good performance. If the tile size you select is too small, you could end up generating thousands of tiles per geometry. Also, the process of tiling a query window (like the county boundary mentioned previously) may become very time consuming.
At the same time, you do not want to choose tiles that are too big. This would defeat the purpose of the Spatial Cartridge primary filter. If the tiles are too big, then too many geometries are returned from the primary filter and are sent to the more costly secondary filter.
Keep in mind that the tile size you choose should also depend on if the query window (area of interest) is already defined in the database. If the query window is defined in the database, (that is, if the spatial tables and spatial indexes already exist), then you should choose a smaller tile size. Assume the State layer and the Highway layer are already defined in the database. You could perform a spatial join query such as, "which interstate highways go through the state?" without incurring the overhead of tiling because the query window is already defined in the database. If, on the other hand, you are creating the query window dynamically, you have to factor in the time it takes to define and index the query window. In this case, you should choose a larger tile size to reduce the time it takes to define and index the query window.
Oracle recommends running the SDO_TUNE.ESTIMATE_TILING_LEVEL()
function on your data set to get an initial tiling level estimate. This may not be your final answer, but it will be a good level to start your analysis. In general, it is also recommended that you take a random sample of your data and check the query performance at different levels of tiling. This would give an indication of what is the best tiling level for the total data set.
Properly choosing rollback segments and tablespaces are important for getting good performance from Spatial Cartridge. Therefore, it is very important to read the Oracle8 Administrator's Guide and understand the concepts of tablespaces and rollbacks.
Here are some general guidelines to consider:
SDO_ADMIN.POPULATE_INDEX_FIXED()
, SDO_ADMIN.UPDATE_INDEX_FIXED()
, SDO_ADMIN.POPULATE_INDEX()
, or SDO_ADMIN.UPDATE_INDEX()
procedure.
Point data, unlike line and polygon data, has the unique characteristic of containing one tile per point. This section describes how to improve the performance of queries on point data.
When querying point data with a rectangular query window, you can take advantage of the nature of these geometries to improve performance.
A rectangle can be defined by its lower-left and upper-right coordinates (Xmin, Ymin and Xmax, Ymax). A point has a single set of coordinates (Px, Py). When your area-of-interest is a rectangle, instead of using the SDO_GEOM.RELATE()function in the secondary filter, you can use simple SQL comparison operators as follows:
SELECT sdo_gid, sdo_x1, sdo_y1 FROM cities_sdogeom, (SELECT a.sdo_gid gid1 FROM cities_sdoindex a, window_sdoindex b WHERE b.sdo_gid = [area of interest id] AND a.sdo_code = b.sdo_code) WHERE sdo_gid = gid1 AND sdo_x1 BETWEEN Xmin AND Xmax AND sdo_y1 BETWEEN Ymin AND Ymax;
The DISTINCT clause is not necessary in the primary filter of the query because a point contains only a single tile in the spatial index.
Because a point is always referenced by only one tile in a spatial index, for additional performance, you can place the columns normally found in the <layername>_SDOINDEX table in the <layername>_SDOGEOM table. This will save you the cost of joining the <layername>_SDOINDEX and <layername>_SDOGEOM tables.
You still need to create an updatable view for the <layername>_SDOINDEX table that selects the appropriate columns from the <layername>_SDOGEOM table. This is because functions such as SDO_ADMIN.UPDATE_INDEX_FIXED()
and SDO_ADMIN.POPULATE_INDEX_FIXED()
expect a <layername>_SDOINDEX table to exist. Create the view using "instead of" triggers for insert, delete, and update such that the appropriate columns in the <layername>_SDO_GEOM table are updated. The following example shows how to use "instead of" triggers:
CREATE OR REPLACE TRIGGER mytrig INSTEAD OF INSERT ON points_sdoindex REFERENCING new AS n FOR EACH ROW BEGIN UPDATE points_sdogeom SET points_sdogeom.sdo_code = :n.sdo_gid; END; CREATE OR REPLACE TRIGGER mydeltrig INSTEAD OF DELETE ON points_sdoindex REFERENCING old AS n FOR EACH ROW BEGIN UPDATE points_sdogeom SET points_sdogeom.sdo_code = NULL WHERE points_sdogeom.sdo_gid = :n.sdo_gid; END;
The following example shows a window query of a layer containing point data when the window layer contains one rectangle:
SELECT sdo_gid, sdo_x1, sdo_y1 FROM points_sdogeom a, window_sdoindex b WHERE b.sdo_gid = [area of interest id] AND a.sdo_code = b.sdo_code) AND sdo_x1 BETWEEN Xmin AND Xmax AND sdo_y1 BETWEEN Ymin AND Ymax;
There are some helpful hints you can place in your spatial join queries to improve performance. The remainder of this section describes some of the hints you can use. For more information on hints, see the Oracle8 Tuning manual.
A spatial join takes place between two layers. When the two layers being joined are line or polygon layers, the spatial join query contains two DISTINCT clauses: one in the inner SELECT clause and the other in the outer SELECT clause. The Oracle optimizer ignores the inner DISTINCT clause to save on the cost of sorting. However, if the inner DISTINCT clause is ignored, the secondary filter gets called many more times than it needs to be. This can have a significant impact on performance because the secondary filter is an expensive operation. Use the NO_MERGE hint to prevent the optimizer from ignoring the inner DISTINCT clause.
In a spatial join, all the tiles from one layer are compared to all the tiles from another layer. The Oracle server performs a full table scan on one <layername>_SDOINDEX table, (preferably the smaller of the two), and an index lookup on the other <layername>_SDOINDEX table. Use the INDEX and USE_NL hints to force the optimizer to perform the full table scan on the smaller of the two <layername>_SDOINDEX tables being compared.
The following example shows a spatial join between line (road) and polygon (county) data. The query answers the question, "which counties intersect major roads?"
SELECT /*+ cost ordered use_nl(COUNTY_sdogeom) index (COUNTY_sdogeom NAME_OF_SDO_GID_INDEX) */ COUNTY_sdogeom.SDO_GID, COUNTY_sdogeom.SDO_ESEQ, COUNTY_sdogeom.SDO_SEQ, COUNTY_sdogeom.SDO_X1,COUNTY_sdogeom.SDO_Y1, COUNTY_sdogeom.SDO_X2,COUNTY_sdogeom.SDO_Y2, COUNTY_sdogeom.SDO_X3,COUNTY_sdogeom.SDO_Y3, COUNTY_sdogeom.SDO_X4,COUNTY_sdogeom.SDO_Y4, COUNTY_sdogeom.SDO_X5,COUNTY_sdogeom.SDO_Y5, COUNTY_sdogeom.SDO_X6,COUNTY_sdogeom.SDO_Y6, COUNTY_sdogeom.SDO_X7,COUNTY_sdogeom.SDO_Y7, COUNTY_sdogeom.SDO_X8,COUNTY_sdogeom.SDO_Y8 FROM (SELECT DISTINCT gid_a gid1 FROM (SELECT /*+ index (a NAME_OF_SDO_CODE_INDEX) index (b NAME_OF_SDO_CODE_INDEX) use_nl (a b) no_merge */ DISTINCT a.sdo_gid gid_a, b.sdo_gid gid_b FROM COUNTY_SDOINDEX a, MAJOR_ROAD_SDOINDEX b WHERE a.sdo_code = b.sdo_code) WHERE sdo_geom.relate('COUNTY', gid_a, 'ANYINTERACT', 'MAJOR_ROAD',gid_b) <> 'FALSE'), COUNTY_sdogeom WHERE COUNTY_sdogeom.sdo_gid = gid1;
The following example shows a spatial join between line (road) and point (street address) data. The query answers the question, "which addresses are on a major road?"
SELECT /*+ cost ordered use_nl (STREET_ADDRESS_sdogeom) index (STREET_ADDRESS_sdogeom NAME_OF_SDO_GID_INDEX) */ STREET_ADDRESS_sdogeom.SDO_GID, STREET_ADDRESS_sdogeom.SDO_X1, STREET_ADDRESS_sdogeom.SDO_Y1 FROM (SELECT DISTINCT gid_a gid1 FROM (SELECT /*+ index (a NAME_OF_SDO_CODE_INDEX) index (b NAME_OF_SDO_CODE_INDEX) use_nl (a b) */ a.sdo_gid gid_a, b.sdo_gid gid_b FROM STREET_ADDRESS_SDOINDEX a, MAJOR_ROAD_SDOINDEX b WHERE a.sdo_code = b.sdo_code) WHERE sdo_geom.relate('STREET_ADDRESS', gid_a, 'ANYINTERACT', 'MAJOR_ROAD',gid_b) <> 'FALSE'), COUNTY_sdogeom WHERE COUNTY_sdogeom.sdo_gid = gid1;
The inner DISTINCT clause is not necessary for spatial joins where one of the layers contains point data. Therefore, the NO_MERGE hint is not necessary. This is because points contain only one tile in the spatial index.
The following example shows a spatial join between polygon (county) and point (street address) data. The query generates a report that displays how many addresses are associated with each county.
If you can assume that each street address is associated with a single county, you can significantly speed up this query. Because points contain only a single tile in the spatial index, any street address tile that matches only one county tile in the primary filter does not need to go through the expensive secondary filter.
SELECT county_gid, count(street_gid) FROM (SELECT poly.sdo_gid county_gid, street.sdo_gid street_gid FROM STREET_ADDRESS_sdoindex street, (SELECT sdo_code county_sdo_code, count(sdo_gid) interacts FROM CENSUS_COUNTY_sdoindex GROUP by sdo_code ) counts, CENSUS_COUNTY_sdoindex poly WHERE street.sdo_code = counts.county_sdo_code AND poly.sdo_code = street.sdo_code AND (counts.interacts = 1 OR sdo_geom.relate('STREET_ADDRESS', street.sdo_gid, 'ANYINTERACT', 'CENSUS_COUNTY',poly.sdo_gid) <> 'FALSE' ) ) GROUP BY county_gid;
Spatial Cartridge supports three geometry types: points, lines, and polygons. If your data contains another type, such as a circle or arc, then you must choose the supported type that best approximates your desired type. For example, a circle can be defined as a multi-sided polygon. Obviously, the more coordinates in the element, the better the approximation will be.
Although customized types are not supported, you do not have to lose your knowledge of the type. After storing the approximated element, create another element in that geometry with ETYPE=0. Spatial Cartridge ignores elements of ETYPE=0. You can then write your own routines to handle your specialized geometry type.
When the Oracle server processes SQL statements that require sorting, such as statements containing an ORDER BY or DISTINCT clause, the Oracle server stores the result set in a temporary storage area. The result set is then sorted. If the SORT_AREA_SIZE is insufficient for holding the result set in memory, then some data may be written to disk and an entry is written in the redo log.
The RELATE() and INTERACT() secondary filters issue SQL statements internally that contain DISTINCT and ORDER BY clauses. If the SORT_AREA_SIZE initialization parameter is too small for processing the secondary filters, then some sorting may occur on disk, which causes entries to be written in the redo log. This may affect performance. For better performance, increase the SORT_AREA_SIZE parameter to force sorting to occur in memory.
To select an appropriate tiling level, it may help to visualize the tiles covering your geometries. Through visualization, you can determine how many tiles are used for each object, the size of the tiles, and how well the edges of your geometry are covered. The basic algorithm is:
The Spatial Cartridge spatial index is represented internally as a linear quadtree. The structure used to represent the linear quadtree is composed of two components: a data component and a metadata component. The data component of the linear quadtree is stored in the SDO_CODE column, and the metadata component is stored in the SDO_META column.
The SDO_META column is not required for spatial queries. However, by combining the SDO_META column with the SDO_CODE column, the tiles of any geometry or of the entire data set can be decoded. This capability allows the tiles to be visualized.
Two Spatial Cartridge internal functions have been made visible in order to describe the tiles. These functions were part of a previous release of Oracle Spatial Data Option, and are currently reserved for internal use only. The functions are not recommended for general use, except for this visualization example. Use the following syntax for the internal functions:
hhcellbndry (sdo_code || sdo_meta, sdo_dimnum, sdo_lb, sdo_ub,
hhlength(sdo_code || sdo_meta) {'MIN' | 'MAX'})
In the following examples, the dimension boundaries were assumed to be -180 to 180, and -90 and 90. The dimensional information is stored in the <layername>_SDODIM table.
If you used SDO_ADMIN.UPDATE_INDEX_FIXED()
or SDO_ADMIN.POPULATE_INDEX_FIXED() to generate your spatial index, replace "sdo_code || sdo_meta" with sdo_tile in the SQL statements that follow.
The following SQL query can be used to decode all the index entries in a
<layername>_SDOINDEX table. The example returns the coordinates of the lower-left and upper-right corners of each tile.
SELECT hhcellbndry (sdo_code || sdo_meta, 1, -180.000000000, 180.000000000,
hhlength (sdo_code || sdo_meta), 'MIN') min_x, hhcellbndry (sdo_code || sdo_meta, 1, -180.000000000, 180.000000000,
hhlength (sdo_code || sdo_meta), 'MAX') max_x, hhcellbndry (sdo_code || sdo_meta, 2, -90.000000000, 90.000000000,
hhlength (sdo_code || sdo_meta), 'MIN') min_y, hhcellbndry (sdo_code || sdo_meta, 2, -90.000000000, 90.000000000,
hhlength (sdo_code || sdo_meta), 'MAX') max_y FROM (SELECT DISTINCT sdo_code, sdo_meta FROM <layername>_sdoindex);
The following SQL query can be used to decode the index entries for a specific geometry stored in a <layername>_SDOINDEX table:
SELECT hhcellbndry (sdo_code || sdo_meta, 1, -180.000000000, 180.000000000,
hhlength (sdo_code || sdo_meta), 'MIN') min_x, hhcellbndry (sdo_code || sdo_meta, 1, -180.000000000, 180.000000000,
hhlength (sdo_code || sdo_meta), 'MAX') max_x, hhcellbndry (sdo_code || sdo_meta, 2, -90.000000000, 90.000000000,
hhlength (sdo_code || sdo_meta), 'MIN') min_y, hhcellbndry (sdo_code || sdo_meta, 2, -90.000000000, 90.000000000,
hhlength (sdo_code || sdo_meta), 'MAX') max_y FROM <layername>_sdoindex WHERE sdo_gid = <geometry id>;