Oracle® Spatial User's Guide and Reference
10g Release 1 (10.1)

Part Number B10826-01

5 Geocoding Address Data

Geocoding is the process of associating spatial locations (longitude and latitude coordinates) with postal addresses. This chapter describes the concepts for geocoding, the data types used for geocoding, and how to use the geocoding capabilities.

5.1 Concepts for Geocoding

This section describes concepts that you must understand before you use the Spatial geocoding capabilities.

5.1.1 Address Representation

Addresses to be geocoded can be represented either as formatted addresses or unformatted addresses.

A formatted address is described by a set of attributes for various parts of the address, which can include some or all of those shown in Table 5-1.

Table 5-1 Attributes for Formal Address Representation

| Address Attribute | Description |
| --- | --- |
| Name | Place name (optional). |
| Intersecting street | Intersecting street name (optional). |
| Street | Street address, including the house or building number, street name, street type (Street, Road, Blvd, and so on), and possibly other information. In the current release, the first four characters of the street name must match a street name in the geocoding data for there to be a potential street name match. |
| Settlement | The lowest-level administrative area to which the address belongs. In most cases it is the city. In some European countries, the settlement can be an area within a large city, in which case the large city is the municipality. |
| Municipality | The administrative area above settlement. Municipality is not used for United States addresses. In European countries where cities contain settlements, the municipality is the city. |
| Region | The administrative area above municipality (if applicable), or above settlement if municipality does not apply. In the United States, the region is the state; in some other countries, the region is the province. |
| Postal code | Postal code (optional if administrative area information is provided). In the United States, the postal code is the 5-digit ZIP code. |
| Postal add-on code | String appended to the postal code. In the United States, the postal add-on code is typically the last four numbers of a 9-digit ZIP code specified in "5-4" format. |
| Country | The country name or ISO country code. |

Formatted addresses are specified using the SDO_GEO_ADDR data type, which is described in Section 5.2.1.

An unformatted address is described using lines with information in the postal address format for the relevant country. The address lines must contain information essential for geocoding, and they might also contain information that is not needed for geocoding (something that is common in unprocessed postal addresses). An unformatted address is stored as an array of strings. For example, an address might consist of the following strings: '22 Monument Square' and 'Concord, MA 01742'.

Unformatted addresses are specified using the SDO_KEYWORDARRAY data type, which is described in Section 5.2.3.
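
As a minimal illustration of the array-of-strings representation, the following query expands the two address lines from the example above into rows. It assumes only that the SDO_KEYWORDARRAY type is installed and accessible to the current user; COLUMN_VALUE is the standard Oracle pseudocolumn for the elements of a collection of scalar values.

```sql
-- Each element of the SDO_KEYWORDARRAY is one line of the unformatted address.
SELECT COLUMN_VALUE AS address_line
  FROM TABLE(SDO_KEYWORDARRAY('22 Monument Square', 'Concord, MA 01742'));
```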

5.1.2 Match Modes

The match mode for a geocoding operation determines how closely the attributes of an input address must match the data being used for the geocoding. Input addresses can include different ways of representing the same thing (such as Street and the abbreviation St), and they can include minor errors (such as the wrong postal code, even though the street address and city are correct and the street address is unique within the city).

You can require an exact match between the input address and the data used for geocoding, or you can relax the requirements for some attributes so that geocoding can be performed despite certain discrepancies or errors in the input addresses. Table 5-2 lists the match modes and their meanings. Use a value from this table with the MatchMode attribute of the SDO_GEO_ADDR data type (described in Section 5.2.1) and for the match_mode parameter of a geocoding function or procedure.

Table 5-2 Match Modes for Geocoding Operations

| Match Mode | Description |
| --- | --- |
| EXACT | All attributes of the input address must match the data used for geocoding. However, if the house or building number, base name (street name), street type, street prefix, and street suffix do not all match the geocoding data, a location in the first match found in the following is returned: postal code, city or town (settlement) within the state, and state. For example, if the street name is incorrect but a valid postal code is specified, a location in the postal code is returned. |
| RELAX_STREET_TYPE | The street type can be different from the data used for geocoding. For example, if Main St is in the data used for geocoding, Main Street would also match that, as would Main Blvd if there was no Main Blvd and no other street type named Main in the relevant area. |
| RELAX_POI_NAME | The name of the point of interest does not have to match the data used for geocoding. For example, if Jones State Park is in the data used for geocoding, Jones State Pk and Jones Park would also match as long as there were no ambiguities or other matches in the data. |
| RELAX_HOUSE_NUMBER | The house or building number and street type can be different from the data used for geocoding. For example, if 123 Main St is in the data used for geocoding, 123 Main Lane and 124 Main St would also match as long as there were no ambiguities or other matches in the data. |
| RELAX_BASE_NAME | The base name of the street, the house or building number, and the street type can be different from the data used for geocoding. For example, if Pleasant Valley is the base name of a street in the data used for geocoding, Pleasant Vale would also match as long as there were no ambiguities or other matches in the data. |
| RELAX_POSTAL_CODE | The postal code (if provided), base name, house or building number, and street type can be different from the data used for geocoding. |
| RELAX_BUILTUP_AREA | The address can be outside the city specified as long as it is within the same county. Also includes the characteristics of RELAX_POSTAL_CODE. |
| RELAX_ALL | Equivalent to RELAX_BUILTUP_AREA. |
| DEFAULT | Equivalent to RELAX_BASE_NAME. |
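
The effect of the match mode can be explored by geocoding the same address under two different modes and comparing the resulting match codes. The following sketch reuses the SDO_GCDR.GEOCODE call form shown in Example 5-1 (Section 5.2.1) and assumes, as that example does, that United States geocoding data is loaded in the SCOTT schema. The postal code 94199 in the second query is a deliberately altered value chosen only for illustration. No output is shown because the results depend on the geocoding data that is loaded.

```sql
-- Strict matching: every attribute of the input address is expected to match.
SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'EXACT').MatchCode FROM DUAL;

-- Relaxed matching: the postal code (altered here on purpose) may differ
-- from the geocoding data without preventing a street-level match.
SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94199'),
  'US', 'RELAX_POSTAL_CODE').MatchCode FROM DUAL;
```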

5.1.3 Match Codes

The match code is a number indicating which input address attributes matched the data used for geocoding. The match code is stored in the MatchCode attribute of the output SDO_GEO_ADDR object (described in Section 5.2.1).

Table 5-3 lists the possible match code values.

Table 5-3 Match Codes for Geocoding Operations

| Match Code | Description |
| --- | --- |
| 1 | Exact match: the city name, postal code, street base name, street type (and suffix or prefix or both, if applicable), and house or building number match the data used for geocoding. |
| 2 | The city name, postal code, street base name, and house or building number match the data used for geocoding, but the street type, suffix, or prefix does not match. |
| 3 | The city name, postal code, and street base name match the data used for geocoding, but the house or building number does not match. |
| 4 | The city name and postal code match the data used for geocoding, but the street address does not match. |
| 10 | The city name matches the data used for geocoding, but the postal code does not match. |
| 11 | The postal code matches the data used for geocoding, but the city name does not match. |
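
An application that reports geocoding quality might translate the match code into a short message. The following PL/SQL block is a minimal sketch of one way to do that; the variable names and the abbreviated message texts are illustrative, and the sample value 3 stands in for a value that would normally be read from the MatchCode attribute of a geocoded address.

```sql
SET SERVEROUTPUT ON

DECLARE
  match_code   NUMBER := 3;  -- sample value for illustration only
  description  VARCHAR2(100);
BEGIN
  -- Abbreviated wording of the descriptions in Table 5-3.
  description := CASE match_code
    WHEN 1  THEN 'Exact match'
    WHEN 2  THEN 'Street type, suffix, or prefix does not match'
    WHEN 3  THEN 'House or building number does not match'
    WHEN 4  THEN 'Street address does not match'
    WHEN 10 THEN 'Postal code does not match'
    WHEN 11 THEN 'City name does not match'
    ELSE 'Value not listed in Table 5-3'
  END;
  DBMS_OUTPUT.PUT_LINE(match_code || ': ' || description);
END;
/
```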

5.1.4 Error Messages for Output Geocoded Addresses

For an output geocoded address, the ErrorMessage attribute of the SDO_GEO_ADDR object (described in Section 5.2.1) contains a string that indicates which address attributes have been matched against the data used for geocoding. Before the geocoding operation begins, the string is set to the value ???????????281C??; during the operation, the value is modified to reflect which attributes have been matched.

Table 5-4 lists the character positions in the string and the address attribute corresponding to each position. It also lists the character value that the position is set to if the attribute is matched.

Table 5-4 Geocoded Address Error Message Interpretation

| Position | Attribute | Value If Matched |
| --- | --- | --- |
| 1-4 | (Reserved for future use.) | ???? |
| 5 | House or building number | # |
| 6 | Street prefix | E |
| 7 | Street base name | N |
| 8 | Street suffix | U |
| 9 | Street type | T |
| 10 | Secondary unit | S |
| 11 | Built-up area or city | B |
| 14 | Region | 1 |
| 15 | Country | C |
| 16 | Postal code | P |
| 17 | Postal add-on code | A |
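
The positional encoding in Table 5-4 can be decoded with ordinary string functions. The following PL/SQL block is a sketch, not part of the SDO_GCDR package: the local procedure report and its parameter names are hypothetical, and the sample string is the ErrorMessage value that appears in the output of Example 5-1 (Section 5.2.1).

```sql
SET SERVEROUTPUT ON

DECLARE
  -- ErrorMessage value copied from the sample output in Example 5-1.
  err_msg VARCHAR2(20) := 'nul?#ENUT?B281CP?';

  -- Hypothetical helper: reports whether the character at position pos
  -- equals the "Value If Matched" character from Table 5-4.
  PROCEDURE report (pos          IN PLS_INTEGER,
                    matched_char IN VARCHAR2,
                    attr_name    IN VARCHAR2) IS
  BEGIN
    DBMS_OUTPUT.PUT_LINE(RPAD(attr_name, 26) ||
      CASE WHEN SUBSTR(err_msg, pos, 1) = matched_char
           THEN 'matched' ELSE 'not matched' END);
  END report;
BEGIN
  report(5,  '#', 'House or building number');
  report(6,  'E', 'Street prefix');
  report(7,  'N', 'Street base name');
  report(8,  'U', 'Street suffix');
  report(9,  'T', 'Street type');
  report(10, 'S', 'Secondary unit');
  report(11, 'B', 'Built-up area or city');
  report(14, '1', 'Region');
  report(15, 'C', 'Country');
  report(16, 'P', 'Postal code');
  report(17, 'A', 'Postal add-on code');
END;
/
```

For the sample value, positions 10 and 17 still hold a question mark, so the secondary unit and the postal add-on code are reported as not matched and the remaining attributes as matched.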

5.2 Data Types for Geocoding

This section describes the data types specific to geocoding functions and procedures.

5.2.1 SDO_GEO_ADDR Type

The SDO_GEO_ADDR object type is used to describe an address. When a geocoded address is output by an SDO_GCDR function or procedure, it is stored as an object of type SDO_GEO_ADDR.

Table 5-5 lists the attributes of the SDO_GEO_ADDR type. Not all attributes will be relevant in any given case. The attributes used for a returned geocoded address depend on the geographical context of the input address, especially the country.

Table 5-5 SDO_GEO_ADDR Type Attributes

| Attribute | Data Type | Description |
| --- | --- | --- |
| Id | NUMBER | (Not used.) |
| AddressLines | SDO_KEYWORDARRAY | Address lines. (The SDO_KEYWORDARRAY type is described in Section 5.2.3.) |
| PlaceName | VARCHAR2(200) | (Not used.) |
| StreetName | VARCHAR2(200) | Street name, including street type. Example: MAIN ST |
| IntersectStreet | VARCHAR2(200) | Intersecting street. |
| SecUnit | VARCHAR2(200) | Secondary unit, such as an apartment number or building number. |
| Settlement | VARCHAR2(200) | Lowest-level administrative area to which the address belongs. (See Table 5-1.) |
| Municipality | VARCHAR2(200) | Administrative area above settlement. (See Table 5-1.) |
| Region | VARCHAR2(200) | Administrative area above municipality (if applicable), or above settlement if municipality does not apply. (See Table 5-1.) |
| Country | VARCHAR2(100) | Country name or ISO country code. |
| PostalCode | VARCHAR2(20) | Postal code (optional if administrative area information is provided). In the United States, the postal code is the 5-digit ZIP code. |
| PostalAddOnCode | VARCHAR2(20) | String appended to the postal code. In the United States, the postal add-on code is typically the last four numbers of a 9-digit ZIP code specified in "5-4" format. |
| FullPostalCode | VARCHAR2(20) | Full postal code, including the postal code and postal add-on code. |
| POBox | VARCHAR2(100) | Post Office box number. |
| HouseNumber | VARCHAR2(100) | House or building number. Example: 123 in 123 MAIN ST |
| BaseName | VARCHAR2(200) | Base name of the street. Example: MAIN in 123 MAIN ST |
| StreetType | VARCHAR2(20) | Type of the street. Example: ST in 123 MAIN ST |
| StreetTypeBefore | VARCHAR2(1) | (Not used.) |
| StreetTypeAttached | VARCHAR2(1) | (Not used.) |
| StreetPrefix | VARCHAR2(20) | Prefix for the street. Example: S in 123 S MAIN ST |
| StreetSuffix | VARCHAR2(20) | Suffix for the street. Example: NE in 123 MAIN ST NE |
| Side | VARCHAR2(1) | Side of the street (L for left or R for right) that the house is on when you are traveling from lower to higher numbered addresses. |
| Percent | NUMBER | Number from 0 to 1 (multiply by 100 to get a percentage value) indicating how far along the street you are when traveling from lower to higher numbered addresses. |
| EdgeID | NUMBER | Edge ID of the road segment. |
| ErrorMessage | VARCHAR2(20) | Error message (see Section 5.1.4). |
| MatchCode | NUMBER | Match code (see Section 5.1.3). |
| MatchMode | VARCHAR2(30) | Match mode (see Section 5.1.2). |
| Longitude | NUMBER | Longitude coordinate value. |
| Latitude | NUMBER | Latitude coordinate value. |

You can return the entire SDO_GEO_ADDR object, or you can specify an attribute using standard "dot" notation. Example 5-1 contains statements that geocode the address of the San Francisco City Hall; the first statement returns the entire SDO_GEO_ADDR object, and the remaining statements return some specific attributes.

Example 5-1 Geocoding, Returning Address Object and Specific Attributes

SELECT SDO_GCDR.GEOCODE('SCOTT', 
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'), 
    'US', 'RELAX_BASE_NAME') FROM DUAL;
 
SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
--------------------------------------------------------------------------------
SDO_GEO_ADDR(0, SDO_KEYWORDARRAY(), NULL, 'CARLTON B GOODLETT PL', NULL, NULL, '
SAN FRANCISCO', NULL, 'CA', 'US', '94102', NULL, '94102', NULL, '1', 'CARLTON B 
GOODLETT', 'PL', 'F', 'F', NULL, NULL, 'L', .01, 23614360, 'nul?#ENUT?B281CP?',
1, 'DEFAULT', -122.41815, 37.7784183) 

SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'RELAX_BASE_NAME').StreetType  FROM DUAL;
 
SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
--------------------------------------------------------------------------------
PL                                                                              
 
SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'RELAX_BASE_NAME').Side  FROM DUAL;
 
S                                                                               
-                                                                               
L                                                                               
 
SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'RELAX_BASE_NAME').Percent  FROM DUAL;
 
SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
--------------------------------------------------------------------------------
                                                                             .01
 
SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'RELAX_BASE_NAME').EdgeID  FROM DUAL;
 
SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
--------------------------------------------------------------------------------
                                                                        23614360
 
SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'RELAX_BASE_NAME').MatchCode  FROM DUAL;
 
SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
--------------------------------------------------------------------------------
                                                                               1
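
Each statement in Example 5-1 geocodes the address again to read a single attribute. When several attributes are needed, one alternative is to geocode once in PL/SQL and read the attributes from a local SDO_GEO_ADDR variable. The following block is a sketch under the same assumptions as Example 5-1 (geocoding data loaded in the SCOTT schema); the variable name addr is illustrative.

```sql
SET SERVEROUTPUT ON

DECLARE
  addr SDO_GEO_ADDR;
BEGIN
  -- Geocode once, using the same call as in Example 5-1.
  addr := SDO_GCDR.GEOCODE('SCOTT',
            SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
            'US', 'RELAX_BASE_NAME');

  -- Read individual attributes with dot notation.
  DBMS_OUTPUT.PUT_LINE('Street type: ' || addr.StreetType);
  DBMS_OUTPUT.PUT_LINE('Side:        ' || addr.Side);
  DBMS_OUTPUT.PUT_LINE('Match code:  ' || addr.MatchCode);
  DBMS_OUTPUT.PUT_LINE('Location:    ' || addr.Longitude || ', ' || addr.Latitude);
END;
/
```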

5.2.2 SDO_ADDR_ARRAY Type

The SDO_ADDR_ARRAY type is a VARRAY of SDO_GEO_ADDR objects (described in Section 5.2.1) used to store geocoded address results. Multiple address objects can be returned when multiple addresses are matched as a result of a geocoding operation.

The SDO_ADDR_ARRAY type is defined as follows:

CREATE TYPE sdo_addr_array AS VARRAY(1000) OF sdo_geo_addr;
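
The following PL/SQL block sketches how an SDO_ADDR_ARRAY result might be processed. It assumes that the SDO_GCDR.GEOCODE_ALL function (see Chapter 20) accepts the same four arguments as SDO_GCDR.GEOCODE in Example 5-1 and returns an SDO_ADDR_ARRAY, and that geocoding data is loaded in the SCOTT schema. Verify the signature against Chapter 20 before relying on it.

```sql
SET SERVEROUTPUT ON

DECLARE
  addrs SDO_ADDR_ARRAY;
BEGIN
  -- Assumed signature: (username, address lines, country, match mode).
  addrs := SDO_GCDR.GEOCODE_ALL('SCOTT',
             SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
             'US', 'DEFAULT');

  -- Guard against a null collection before using COUNT.
  IF addrs IS NOT NULL THEN
    FOR i IN 1 .. addrs.COUNT LOOP
      DBMS_OUTPUT.PUT_LINE(i || ': ' || addrs(i).StreetName || ', ' ||
        addrs(i).Settlement || ' (match code ' || addrs(i).MatchCode || ')');
    END LOOP;
  END IF;
END;
/
```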

5.2.3 SDO_KEYWORDARRAY Type

The SDO_KEYWORDARRAY type is a VARRAY of VARCHAR2 strings used to store address lines for unformatted addresses. (Formatted and unformatted addresses are described in Section 5.1.1.)

The SDO_KEYWORDARRAY type is defined as follows:

CREATE TYPE sdo_keywordarray AS VARRAY(10000) OF VARCHAR2(9000);
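
Because SDO_KEYWORDARRAY is an ordinary VARRAY type, it can also be built up in PL/SQL with the standard collection methods rather than with a single constructor call. The following block is a minimal sketch; the variable name addr_lines is illustrative, and the address lines are those from the example in Section 5.1.1.

```sql
SET SERVEROUTPUT ON

DECLARE
  -- Start with an empty (but not null) array.
  addr_lines SDO_KEYWORDARRAY := SDO_KEYWORDARRAY();
BEGIN
  -- Make room for two address lines, then assign them.
  addr_lines.EXTEND(2);
  addr_lines(1) := '22 Monument Square';
  addr_lines(2) := 'Concord, MA 01742';

  -- COUNT is the number of lines stored; LIMIT is the VARRAY maximum.
  DBMS_OUTPUT.PUT_LINE(addr_lines.COUNT || ' of at most ' ||
    addr_lines.LIMIT || ' lines used');
END;
/
```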

5.3 Using the Geocoding Capabilities

To use the Oracle Spatial geocoding capabilities, you must use data provided by a geocoding vendor, and the data must be in the format supported by the Oracle Spatial geocoding feature. For information about getting and loading this data, go to the Spatial page of the Oracle Technology Network (OTN):

http://otn.oracle.com/products/spatial/

Find the link for geocoding, and follow the instructions.

To geocode an address using the geocoding data, use the SDO_GCDR PL/SQL package subprograms, which are documented in Chapter 20.
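
As a sketch of how these subprograms might be applied to stored data, the following query geocodes each row of a table and returns the match code for each address. The customers table and its columns (id, street_line, city_line) are hypothetical and are introduced only for this illustration; the call form follows Example 5-1 and assumes United States geocoding data loaded in the SCOTT schema.

```sql
-- customers is a hypothetical table with columns id, street_line, city_line.
SELECT c.id,
       SDO_GCDR.GEOCODE('SCOTT',
         SDO_KEYWORDARRAY(c.street_line, c.city_line),
         'US', 'DEFAULT').MatchCode AS match_code
  FROM customers c;
```

Rows whose match code is not 1 can then be reviewed against Table 5-3 to see which part of the address failed to match.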