Oracle8 Spatial Cartridge User's Guide and Reference Release 8.0.4 A53264-02 |
|
This chapter describes how to load spatial data into a database, including storing the data in a table and creating a spatial index for it.
There are two steps involved in loading raw data into a spatial database such that it can be queried efficiently:
Table 2-1 through Table 2-4 show the format of the tables needed to store and index spatial data.
Table 2-1 <layername>_SDOLAYERSDO_ORDCNT | SDO_LEVEL | SDO_NUMTILES | SDO_COORDSYS |
---|---|---|---|
<number> | <number> | <number> | <varchar> |
SDO_DIMNUM | SDO_LB | SDO_UB | SDO_TOLERANCE | SDO_DIMNAME |
---|---|---|---|---|
<number> | <number> | <number> | <number> | <varchar> |
SDO_GID | SDO_ESEQ | SDO_ETYPE | SDO_SEQ | SDO_X1 | SDO_Y1 | ... | SDO_Xn | SDO_Yn |
---|---|---|---|---|---|---|---|---|
<number> | <number> | <number> | <number> | <number> | <number> | ... | <number> | <number> |
SDO_GID | SDO_CODE | SDO_MAXCODE | SDO_GROUPCODE | SDO_META |
---|---|---|---|---|
<number> | <raw> | <raw> | <raw> | <raw> |
The process of loading data can be classified into two categories:
This process is used to load large volumes of data into the database and uses the SQL*Loader to load the data.
This process is used to insert relatively small amounts of data into the database and is analogous to the INSERT statement in SQL.
Bulk loading can be used to import large amounts of legacy or ASCII data into a spatial database. Bulk loading is accomplished using the SQL*Loader1.
Example 2-1 shows the format of the raw data and control file that would be required to load the data into the SDOGEOM table with the layer name ROADS. You can choose any format of ASCII data as long you can write a SQL*Loader control file to load that data into the tables.
Assume that the ASCII data consists of a file with delimited columns, and separate rows fixed by the limits of the table with the following format.
geometry rows: GID, ESEQ, ETYPE, SEQ, LON1, LAT1, LON2, LAT2
The coordinates in the geometry rows represent the end points of line segments, which taken together, represent a polygon. Example 2-2 shows the control file for loading the data into the geometry table.
LOAD DATA INFILE * INTO TABLE ROADS_SDOGEOM FIELDS TERMINATED BY WHITESPACE TRAILING NULLCOLS (SDO_GID INTEGER EXTERNAL, SDO_ESEQ INTEGER EXTERNAL, SDO_ETYPE INTEGER EXTERNAL, SDO_SEQ INTEGER EXTERNAL, SDO_X1 FLOAT EXTERNAL, SDO_Y1 FLOAT EXTERNAL, SDO_X2 FLOAT EXTERNAL, SDO_Y2 FLOAT EXTERNAL) BEGINDATA 1 0 3 0 -122.401200 37.805200 -122.401900 37.805200 1 0 3 1 -122.401900 37.805200 -122.402400 37.805500 1 0 3 2 -122.402400 37.805500 -122.403100 37.806000 1 0 3 3 -122.403100 37.806000 -122.404400 37.806800 1 0 3 4 -122.404400 37.806800 -122.401200 37.805200 1 1 3 0 -122.405900 37.806600 -122.407549 37.806394 1 1 3 1 -122.407549 37.806394 -122.408300 37.806300 1 1 3 2 -122.408300 37.806300 -122.409100 37.806200 1 1 3 3 -122.409100 37.806200 -122.405900 37.806600 2 0 2 0 -122.410800 37.806000 -122.412300 37.805800 2 0 2 1 -122.412300 37.805800 -122.414100 37.805600 2 0 2 2 -122.414100 37.805600 -122.412300 37.805800 2 0 2 3 -122.412300 37.805800 -122.410800 37.806000 3 0 1 0 -122.567474 38.643564 3 0 1 1 -126.345345 39.345345
Note that table ROADS_SDOGEOM exists in the schema before attempting the load.
In Example 2-3, the data resides in a single flat file and the data set consists of point, line string, and polygon data. The data uses fixed-position columns and overloaded table rows.
SDO_GID SDO_ESEQ SDO_ETYPE SDO_SEQ SDO_X1 SDO_Y1 SDO_X2 SDO_Y2
The corresponding control file for this format of input data would be:
LOAD DATA INFILE * INTO TABLE NEW_SDOGEOM (SDO_GID POSITION (1:5) INTEGER EXTERNAL, SDO_ESEQ POSITION (7:10) INTEGER EXTERNAL, SDO_ETYPE POSITION (12:15) INTEGER EXTERNAL, SDO_SEQ POSITION (17:21) INTEGER EXTERNAL, SDO_X1 POSITION (23:35) FLOAT EXTERNAL, SDO_Y1 POSITION (37:48) FLOAT EXTERNAL, SDO_X2 POSITION (50:62) FLOAT EXTERNAL, SDO_Y2 POSITION (64:75) FLOAT EXTERNAL) BEGINDATA 1 0 3 0 -122.401200 37.805200 -122.401900 37.805200 1 0 3 1 -122.401900 37.805200 -122.402400 37.805500 1 0 3 2 -122.402400 37.805500 -122.403100 37.806000 1 0 3 3 -122.403100 37.806000 -122.404400 37.806800 1 0 3 4 -122.404400 37.806800 -122.401200 37.805200 1 1 3 0 -122.405900 37.806600 -122.407549 37.806394 1 1 3 1 -122.407549 37.806394 -122.408300 37.806300 1 1 3 2 -122.408300 37.806300 -122.409100 37.806200 1 1 3 3 -122.409100 37.806200 -122.405900 37.806600 2 0 2 0 -122.410800 37.806000 -122.412300 37.805800 2 0 2 1 -122.412300 37.805800 -122.414100 37.805600 2 0 2 2 -122.414100 37.805600 -122.412300 37.805800 2 0 2 3 -122.412300 37.805800 -122.410800 37.806000 3 0 1 0 -122.567474 38.643564 3 0 1 1 -126.345345 39.345345
Spatial Cartridge uses standard Oracle8 tables that can be accessed or loaded with standard SQL syntax. Example 2-4 loads data for a geometry (GID 17) consisting of a polygon with four sides that contains both a hole and point. Notice that the first coordinate of the polygon (5, 20) is repeated at the end to close the polygon.
INSERT INTO SAMPLE_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (17, 0, 3, 0, 5, 20, 5, 30, 10, 30, 10, 20, 5, 20); -- hole INSERT INTO SAMPLE_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (17, 1, 3, 0, 8, 21, 8, 24, 9, 24, 9, 21, 8, 21); -- point INSERT INTO SAMPLE_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1) VALUES (17, 2, 1, 0, 9, 29);
The SQL INSERT statement inserts one row of data per call. In Example 2-4, the table had enough columns to store the polygon in a single row. However, if your table had fewer columns (or your polygon had more points), you would have to perform mulitple inserts in order to match the table structure; the data would not wrap automatically to the next row. To load a large geometry, repeat the SDO_GID, SDO_ESEQ, and SDO_ETYPE, and increment the SDO_SEQ for each line as shown in Example 2-5.
INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (18, 0, 3, 0, 1, 15, 1, 16, 2, 17, 3, 17, 4, 18); INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (18, 0, 3, 1, 4, 18, 5, 18, 6, 19, 7, 18, 6, 17); INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (18, 0, 3, 2, 6, 17, 7, 16, 7, 15, 6, 14, 7, 13); INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5) VALUES (18, 0, 3, 3, 7, 13, 6, 12, 5, 13, 4, 13, 3, 14); INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_Y3) VALUES (18, 0, 3, 4, 3, 14, 2, 14, 1, 15);
Spatial Cartridge provides two functions to facilitate inserting data into spatial tables. A benefit to using these functions is that the issue of row-wrapping when loading elements with multiple points is handled automatically by these functions.
There are two steps to incrementally add data to the spatial tables:
The SDO_GEOM.INIT_ELEMENT() function returns the sequence number of the element in the geometry. This sequence number is required as a parameter to the SDO_GEOM.ADD_NODES() procedure.
In Example 2-6, a simple polygon, geometry number 1234, consisting of five vertices needs to be stored. The first step is to call SDO_GEOM.INIT_ELEMENT() to initialize the element.
elem_value := sdo_geom.init_element('ROADS', 1234);
Next, call SDO_GEOM.ADD_NODES() to fill in the attributes of the polygon. The vertices can be added in either clockwise or counter-clockwise order.
sdo_geom.add_nodes('ROADS', 1234, elem_value, sdo_geom.polygon_type, Ax, Ay, Bx, By, Cx, Cy, Dx, Dy, Ex, Ey, Ax, Ay);
Close the polygon by repeating the first vertex (Ax,Ay) as the last vertex.
In Example 2-7, assume that the geometry shown in Figure 2-1 needs to be stored. The geometry consists of a polygon with a hole in it. Note that both calls to the SDO_GEOM.ADD_NODES() procedure are made with the same GID (6789) because this is a single object even though it is composed of two elements.
val1 := sdo_geom.init_element('PARKS', 6789); sdo_geom.add_nodes('PARKS', 6789, val1, SDO_GEOM.POLYGON_TYPE, P1x, P1y, P2x, P2y, P3x, P3y, P4x, P4y, P5x, P5y, P6x, P6y, P1x, P1y); val2 := sdo_geom.init_element('PARKS', 6789); sdo_geom.add_nodes('PARKS', 6789, val2, SDO_GEOM.POLYGON_TYPE, G1x, G1y, G2x, G2y, G3x, G3y, G4x, G4y, G1x, G1y);
Once data has been loaded into the spatial tables through either bulk or transactional loading, a spatial index needs to be created on the tables for efficient access to the data.
Create an Oracle8 table called <layername>_SDOINDEX as follows:
SQL> create table <layername>_SDOINDEX 2 ( 3 SDO_GID integer, 4 SDO_CODE raw(255) 5 );
For a bulk load, you can call the SDO_ADMIN.POPULATE_INDEX() procedure once to tessellate the geometry table and add the generated tiles to the spatial index table. The argument to this procedure is simply the name of the layer. The level to which the geometry should be tessellated, and whether to use the fixed-size or variable-sized tile indexing technique is determined by values in the
<layername>_SDOLAYER table.
If data is updated in or deleted from a specific geometry table, you can call SDO_ADMIN.UPDATE_INDEX() to update the index for one SDO_GID. The arguments to this procedure are the name of the layer and the SDO_GID of the designated geometry.
See Chapter 5, "Administrative Procedures" for a complete description of the SDO_ADMIN.POPULATE_INDEX() and SDO_ADMIN.UPDATE_INDEX() procedures.
Spatial Cartridge provides two methods for spatial indexing. Fixed-size tiling is recommended for all production applications. For advanced development applications, you may want to experiment with variable-sized tiling, which theoretically could provide better selectivity in some data sets.
Which tessellation algorithm is used by the SDO_ADMIN.POPULATE_INDEX() and SDO_ADMIN.UPDATE_INDEX() procedures is determined by the values of the SDO_LAYER and SDO_NUMTILES columns in the <layername>_SDOLAYER table as follows:
Oracle recommends using fixed-size cover tiles for indexing a geometry.
The fixed-size tile algorithm is expressed as a level referring to the number of tessellations performed. To use fixed-size tile indexing, set the SDO_NUMTILES column in the <layername>_SDOLAYER table to NULL and the SDO_LEVEL column to the desired tiling level. The relationship between the tiling level and the resulting size of the tiles is dependent on the domain of the layer.
The domain used for indexing is defined by the upper and lower boundaries of each dimension stored in the <layername>_SDODIM table. A typical domain in a GIS application could be -90 to 90 degrees for latitude, and -180 to 180 degrees for longitude2, as represented in Figure 2-2.
If the SDO_LEVEL column is set to 1, then the tiles created by the indexing mechanism are the same size as tiles at the first level of tessellation. Each tile would be 180 degrees by 90 degrees as shown in Figure 2-3.
The formula for the number of fixed-size tiles is 4n where n is the number of tessellations, stored in the SDO_LEVEL column. Figure 2-4 shows fixed-size tiling at level 2. In this figure, each tile is 90 degrees by 45 degrees.
The size of a tile can be determined by applying the following formula to each dimension:
length = (upper_bound - lower_bound) / 2 ^ sdo_level
The length refers to the length of the tile along the specified dimension. Applying this formula to the tiling shown in Figure 2-4 yields the following sizes:
length for dimension X = (180 - (-180) ) / 2^2 = (360) / 4 = 90 length for dimension Y = (90 - (-90) ) / 2^2 = (180) / 4 = 45
Thus, at level 2 the tiles are 90x45 degrees in size. As the number of levels increases, the tiles become smaller and smaller. Smaller tiles provide a more precise fit of the tiles over the geometry being indexed. However, because the number of tiles generated is unbounded, you must take into account the performance implications of using higher levels. The SDO_TUNE.ESTIMATE_TILING_LEVEL() function can be used to determine an appropriate level for indexing with fixed-size tiles. See Chapter 6 for a description of this procedure.
Besides the performance aspects related to selecting a fixed-size tile, tessellating the geometry into fixed-size tiles might have benefits related to the type of data being stored, such as using tiles sized to represent 1-acre farm plots, city blocks, or individual pixels on a display. Data modeling is an important part any database design, and is essential in a spatial database where the data often represents actual physical locations.
Assume that data has been loaded into a layer called ROADS. To create a spatial index, create a table ROADS_SDOINDEX and invoke the following procedure:
sdo_admin.populate_index('ROADS');
The value in the SDO_LEVEL column of ROADS_SDOLAYER can be used as a tuning parameter while tessellating objects. Increasing the level increases the number of tiles to provide a more precise fit of the tiles over the object. See the description of the ESTIMATE_TILING_LEVEL()function in Chapter 6 for information on estimating the tiling level in several different ways.
After SDO_ADMIN.POPULATE_INDEX() has been called to fill the spatial index, you should also create standard indexes on the <layername>_
SDOINDEX.SDO_CODE column.
If a geometry with an SDO_GID 5944 has been added to the spatial tables, update the index with the following procedure:
sdo_admin.update_index('ROADS', 5944);
SDO_ADMIN.POPULATE_INDEX()and SDO_ADMIN.UPDATE_INDEX() behave differently than the CREATE INDEX statement in SQL. An implicit commit is not executed after the procedures are called. Therefore these transactions can be rolled back.
SDO_ADMIN.POPULATE_INDEX operates as a single transaction. To reduce the amount of rollback required to execute this procedure, you can write a routine that loops and calls SDO_ADMIN.UPDATE_INDEX(). See Section A.1.1.1, "cr_spatial_index.sql Script" for more information.
Spatial indexing with variable-sized tiles is not recommended for production systems. Variable-sized tiling is included in Spatial Cartridge primarily for experimentation purposes.
To use variable-sized tiling, the SDO_LEVEL and SDO_NUMTILES columns must be set in the <layername>_SDOLAYER table.
The SDO_NUMTILES column determines the number of tiles that will be used to cover a geometry being indexed. Typically this value is small, such as 4 or 8 tiles. However, the larger the number of tiles, the better the tiles will fit the geometry being covered. This increases the selectivity of the primary filter. See Section 3.3.2 and Section 3.3.3 for a discussion of primary and secondary filters.
The SDO_LEVEL column indicates the spatial partitioning level for the generated tiles. See Section 1.5.3 for a description of the spatial partitioning utilized by Spatial Cartridge when using variable-sized tiles.
Setting the proper SDO_LEVEL value is more art than science. One approach would be use the SDO_TUNE.ESTIMATE_TILING_LEVEL() function to determine an appropriate starting SDO_LEVEL value, and then compare the performance with slightly higher or lower values.
1
See the Oracle Server Utilities User's Guide for information on the SQL*Loader.
2
The transference of the domain onto a sphere or Mercator projection is left to GIS (or other) application programmers. Spatial Cartridge treats the domain as a conventional X by Y rectangle.