Oracle8
ConText Cartridge Workbench User's Guide
Release 2.4 A63822-01 |
|
This chapter describes how to use the TextServer3 Dictionary
Migration Tool.
The following topics are covered in this chapter:
The TextServer3 Dictionary Migration tool has been developed to assist users who are migrating from TextServer3 to ConText. It is not a complete migration tool, but it does allow you to:
The tool is delivered as a PL/SQL package called CTXWMG_MIGRATE.
The package generates SQL scripts that you can customize to your requirements
before running them.
This package must be created in the ConText database on the
server after the ConText Workbench is installed on a client machine.
The Migration Tool is a PL/SQL package, CTXWMG_MIGRATE,
that is installed on the same database as ConText. It queries the TextServer3
dictionary to find as much useful information as possible that can be used
in the creation of column policies. This is done using a database link
to the database where the TextServer3 text dictionary resides. It also
uses the database link to access any TextServer3 thesauri or section groups
that are to be migrated.
The PL/SQL package that implements the tool makes use of
the UTL_FILE package. This package adds File I/O capabilities to PL/SQL
in Oracle8, release 8.0.
The Migration Tool is intended for use by application developers or TextServer3/ConText administrators. It is assumed that the user has an understanding of the following:
Familiarity with the Oracle8 IMP/EXP utility is also assumed.
All references to migration refer to the TextServer3/ConText
data dictionary migration process. All references to the Migration Tool
refer to the TextServer3/ConText Dictionary Migration Tool.
See
Also:
For more information about policies, preferences, attributes, and thesauri in ConText, see the Oracle8 ConText Cartridge Administrator's Guide |
Although TextServer3 and ConText share similar functionality,
there are differences that prevent the automatic migration of text tables,
thesauri and section groups. The user or application developer needs to
make decisions about preferences and policies using their knowledge about
the text application.
For this reason, the Migration Tool does not directly create column policies and thesauri in the ConText data dictionary. Instead, it generates:
The generated scripts are edited by the user in order to
customize the migration to suit the needs of the application. The user
needs to alter the creation of some preferences because of differences
in ConText functionality (eg. filters supported internally) or to take
advantage of new functionality.
The Migration Tool is aimed at migrating from any version
of TextServer3 to Oracle8 ConText Cartridge 2.4.5. The database used by
each product must be accessible to the user carrying out the migration.
The table that is a TextServer3 text table must exist on
the target database (where ConText is installed). It need not be populated,
but must have a column that has a unique or primary key constraint on it
- this column must be the TextServer3 textkey column. The exp/imp utilities
can be used for this purpose, or the TextServer3 database can be upgraded
to 8.0.5.
Note: A separate 7.2 ORACLE_HOME must be maintained if the user wants to carry on using the TextServer3 application. |
ConText 2.4.5 must be installed on a 8.0.5 database. The
user who uses the Migration Tool to generate scripts and load files need
not be the user who ultimately owns the column policies and thesauri. The
Migration Tool user must have privileges that allow the creation and dropping
of database links for each TextServer3 user whose text tables and thesauri
are to be migrated.
TextServer3 has areas of functionality that are available
in ConText, but the ConText functionality does not exactly parallel the
TextServer3 functionality. One of these areas is Stored Query Expressions
(SQEs). In TextServer3, an SQE can be a query against multiple text columns.
These text columns do not have to be in the same text table. In addition,
the query also supports the full 'SELECT ... FROM ... WHERE ...' clauses
in a SQL statement.
The ConText implementation binds an SQE to a single column
policy and specifies only the query expression. For these reasons, no attempt
is made to migrate TextServer3 SQEs to ConText.
Another area of TextServer3 functionality that differs from
the ConText implementation is the thesaurus. A term in a TextServer3 thesaurus
or synonym ring can be a proper term, a reference to another thesaurus
or synonym ring, or a reference to an SQE. In ConText, a term can only
be a proper term or phrase. When doing the thesaurus migration, the user
has control over what happens when a term is a thesaurus or synonym ring,
but no attempt to process a term is made when that term is an SQE.
If section groups are to be migrated, the section groups
must be created in the ConText data dictionary before policy creation
and indexing is carried out.
This section provides details for installing the Migration
Tool.
During installation of the ConText Workbench, the creation
scripts for the CTXWMG_MIGRATE package are copied to the client
machine and an icon is created for executing the scripts; however the scripts
are not executed to create the package.
Installing the Migration Tool involves using the icon to
create the CTXWMG_MIGRATE package in the database on the server
machine.
Before installing the Migration Tool, you must perform the following tasks:
The Migration Tool, once installed on the ConText database, can generate files for the tables and thesauri that are to be migrated. These files are owned by the owner of the Oracle8 shadow process. Therefore, choose a directory that can be accessed by the Oracle Server user who runs SQL*Plus and ctxload. This is because the generated files may need to be customized.
Accessible directories must be specified in the INIT.ORA file for the 8.0 instance. Each accessible directory is specified by the following parameter:
UTL_FILE_DIR = <directory name> UTL_FILE_DIR = <directory name> ...
The named director(ies) are then available to the PL/SQL
File I/O package.
You may require that the names of database links are different from the name of the database where they are linking. To enforce this, add the following parameter to INIT.ORA:
GLOBAL_NAMES = FALSE
The database instance where ConText is installed must be restarted so that the UTL_FILE_DIR and GLOBAL_NAMES parameters can take effect.
The Migration Tool uses a database link to access the TextServer3
dictionary. The link is called ts3db.
For every TextServer3 user that owns a text table or thesaurus
to be migrated, the link must be dropped, and then recreated.
Links should be created by the Oracle user that runs the Migration Tool. From SQL*Plus, use the following syntax:
SQL> create database link ts3db connect to <user> 2 identified by <pw> using '<connect_string>';
When the database link must be created for another TextServer3 user, drop the existing ts3db link:
SQL> drop database link ts3db;
To install the CTXWMG_MIGRATE package, double-click
on the TextServer3 Migration Tool Installation icon.
During text table migration, a script is generated that creates,
as closely as possible, a policy that reflects an indexable column in a
TextServer3 text table. The user must specify a prefix for the naming of
the preferences in the policy. If a TextServer3 text table contains multiple
indexable columns, then a script is created for each column.
ConText has a concept of preferences and policies. A preference specifies one of the indexing options that are necessary for creating a text index for a column:
The preferences are grouped into a policy and the policy
is assigned to a column. Then, a text index is created for the column policy.
A preference consists of a Tile, which identifies the indexing
option for the preference, and a number of attributes for the Tile. The
attributes are set for a Tile using the CTX_DDL.SET_ATTRIBUTE procedure.
See
Also:
For more information about policies, preferences, and attributes, see Oracle8 ConText Cartridge Administrator's Guide |
A brief outline of how the Migration Tool handles each type
of preference is given here.
If the TextServer3 column is EXTERNAL, then the OSFILE Tile
is used to create a Data Store preference. If an External File Path for
EXTERNAL columns is specified, then the path attribute of OSFILE
is set appropriately. Otherwise, the predefined CTXSYS.DEFAULT_DIRECT_DATASTORE
preference is used.
If the TextServer3 text table contains documents of a single
format supported internally by ConText, then the FILTER attribute is set
appropriately. If multiple formats are used, that are all supported internally
by ConText, the predefined CTXSYS.AUTOB preference is used. However, if
unsupported formats are used, the tool generates a commented-out list of
CTX_DDL.SET_ATTRIBUTE calls, for the executable attribute, to match
the formats that are recognized by Autorec. The user should then uncomment
the appropriate ones and edit the executable name.
The BASIC LEXER Tile is used to create the Lexer preference.
If the TextServer3 text column has column-level options defined, then these
are used when setting lexer attributes, otherwise table-level options are
used. The options map as follows:
TextServer3 Option | BASIC LEXER Tile Attribute(s) |
---|---|
Alpha Join |
printjoins |
Numeric Join |
numjoin, numgroup |
Continuation |
continuation |
case Conversion = BASE |
BASE_LETTER set to 1 |
If a particular TextServer3 option is not set, then the equivalent
BASIC LEXER attribute value is not set.
Additionally, if the TextServer3 text column references a
section group, then the following two BASIC LEXER Tile attributes are set:
Attribute Name | Value(s) |
---|---|
startjoin |
'<' and '/' |
endjoin |
'>' |
The user may wish to edit these.
The GENERIC ENGINE Tile is used to create the Engine preference.
The resulting script contains a call to CTX_DDL.SET_ATTRIBUTE for each
attribute of GENERIC ENGINE, but they are commented out. The values need
to be edited by the user as appropriate, and the line uncommented.
The GENERIC WORDLIST tile is used to create the Wordlist
preference. For the stclause and instclause attributes, a
commented-out call to CTX_DDL.SET_ATTRIBUTE() exists.
stemmer is set to 1 (English) and fuzzy_match
is set to 1 (English).
If the TextServer3 text column references a section group,
then the section_group attribute is set appropriately.
If there are no stop words in the appropriate TextServer3
wordlist, then the predefined CTXSYS.NO_STOPLIST preference is used.
If there are less than 4096 lowercase stop words in the TextServer3
table wordlist with a length of less than 65 characters, then these words
are used to make up a stoplist.
If there are more than 4096 stop words, the CTXSYS.DEFAULT_STOPLIST
preference is used.
The generation of a script is straightforward after the tool
has been installed and the UTL_FILE_DIR initialization parameter has been
set.
The user needs to decide which table to migrate and what
the prefix for the preference names are going to be. Then, the user calls
the CTXWMG_MIGRATE.MIGRATE procedure to create the necessary scripts for
the migration.
The preference names that the tool generates in the scripts
are a concatenation of the prefix supplied by the user, the name of the
preference category, and the name of the column for which each script is
created.
The generated scripts contain calls to CTX_DDL.SET_ATTRIBUTE
that have been commented out with the SQL*Plus command rem. The user must
examine the script to determine whether such calls need uncommenting. The
comments are used for attributes where the Migration Tool is unable to
determine a reasonable value.
An example of this is for the storage parameters used by
various tables and indexes in the Engine preference.
This example uses the MIGRATE procedure to create column
policy scripts.
In this example, a TextServer3 table, owned by jbloggs,
is called resume and has two indexable columns called cv
and address. The directory that has been chosen as the target directory
for the scripts is called /usr/home/joebloggs/migrate. The prefix convention
chosen is RESUME_APP. A database link called ts3db has been created
for jbloggs.
To generate scripts, the procedure MIGRATE is called:
begin ctxwmg_migrate.migrate (tabname=>'RESUME', pref_conv=>'RESUME_APP', scriptdir=>'/usr/home/joebloggs/migrate'); end;
From this example, two scripts would be generated, one for each of the indexable columns in resume:
The migration of TextServer3 thesauri and synonym rings to
ConText is not as straightforward as the migration of TextServer3 tables.
This is because in ConText each term in a thesaurus must be a phrase. In
TextServer3, narrower terms can be another thesaurus or a synonym ring.
A related term can also be a thesaurus or synonym ring. This means that
the functionality of a TextServer3 thesaurus is determined just as much
by its structure and its relationship with other thesauri/synonym rings,
as by the query operators made available.
Another difference is that a thesaurus in TextServer3 can
contain only one hierarchy. In ConText, a thesaurus can contain multiple
hierarchies. A ConText thesaurus can also contain synonym rings. The load
files create Standard Narrower Term hierarchies.
A term in a ConText synonym ring must be unique to that synonym
ring within a particular thesaurus. All synonym ring terms are given a
qualifier so that conflicts do not arise. The qualifier is the name of
the TextServer3 synonym ring.
The Migration Tool provides four procedures for use in the
migration of thesauri and synonym rings. Each of these procedures generates
a load file that can be used with ctxload. The ctxload utility can be used
for loading thesauri into the ConText dictionary.
See
Also:
For more information about thesauri and synonym rings, see Oracle8 ConText Cartridge Administrator's Guide |
Section groups can be migrated from TextServer3 to ConText.
The generated script should be run from SQL*Plus to create the section
group and add sections to it.
See
Also:
For more information about sections and section groups, see Oracle8 ConText Cartridge Administrator's Guide |
The CTXWMG_MIGRATE PL/SQL package contains the following
stored procedures:
The MIGRATE procedure creates a SQL script for a TextServer3
text table. When the script is run, it generates a ConText policy for an
existing table in a ConText database.
CTXWMG_MIGRATE.MIGRATE_(tabname IN VARCHAR2, pref_conv IN VARCHAR2, scriptdir IN VARCHAR2);
Specify the name of the TextServer3 text table that contains one or more indexable columns for which you want to create a ConText policy.
Specify the string of characters used to generate a prefix for the preference names in the script.
Specify the directory for the script generated by MIGRATE.
The preference names that the tool generates in the script
are a concatenation of the prefix supplied by the user, the name of the
preference category, and the name of the column for which the script is
being created.
For example, if the tool is creating a script for the doc
column, and the user specifies a prefix of myapp, then the Engine
preference is called myapp_engine_doc.
If the preference is being created for an internal filter,
then the name of the preference category is substituted with four letters
representing the format (e.g. WP51 for WordPerfect 5.1).
The MIGRATE_ALL_SYNS procedure creates a single load file
for all the synonym rings owned by the user.
CTXWMG_MIGRATE.MIGRATE_ALL_SYNS(loaddir IN VARCHAR2);
Specify the directory for load file generated by MIGRATE_ALL_SYNS.
No 'in-place' expansion occurs.
When the load file generated by MIGRATE_ALL_SYNS is used
to generate a ConText thesaurus, all the synonym rings from TextServer3
are created within one ConText thesaurus.
The name of the generated load file is based on the user
ID for the TextServer3 user. For example, for a user with a TextServer3
user ID of 3, the generated file is called ALL_SYNONYMS_3.syn.
The MIGRATE_ALL_THES procedure creates a single load file
for all the thesauri owned by the user.
CTXWMG_MIGRATE.MIGRATE_ALL_THES(loaddir IN VARCHAR2);
Specify the directory for load file generated by MIGRATE_ALL_THES.
No 'in-place' expansion occurs.
When the load file generated by MIGRATE_ALL_THES is used
to generate a ConText thesaurus, all the hierarchies from the TextServer3
thesauri are created within one ConText thesaurus.
The name of the generated load file is based on the user
ID for the TextServer3 user. For example, for a user with a TextServer3
user ID of 3, the generated file is called ALL_THESAURI_3.ths.
The MIGRATE_SG procedure generates a script for a named section
group.
CTXWMG_MIGRATE.MIGRATE_SG(group_name IN VARCHAR2, loaddir IN VARCHAR2);
Specify the name of the section group to be migrated.
Specify the directory for script generated by MIGRATE_SG.
The name of the generated script is based on the name of
the TextServer3 section group. For example, for a TextServer3 section group
called MYSECTIONGROUP, the generated script is called SG_MYSECTIONGROUP.sql.
The MIGRATE_SYN procedure generates a load file for a named
TextServer3 synonym ring.
CTXWMG_MIGRATE.MIGRATE_SYN(syn_sname IN VARCHAR2, loaddir IN VARCHAR2);
Specify the name of the synonym ring to be migrated.
Specify the directory for load file generated by MIGRATE_SYN.
No 'in-place' expansion occurs.
The name of the TextServer3 synonym ring is used as a qualifier
for each term.
The name of the generated load file is based on the thesaurus
name and the user ID for the TextServer3 user. For example, for a TextServer3
synonym ring called MYSYNRING belonging to a user with a TextServer3 user
ID of 3, the generated file is called MYSYNRING_3.ths.
The MIGRATE_THES procedure generates a load file for a named
TextServer3 thesaurus.
CTXWMG_MIGRATE.MIGRATE_THES(thesname IN VARCHAR2, t_expand_mode IN INTEGER DEFAULT 0, s_expand_mode IN INTEGER DEFAULT 0, dump_syns IN INTEGER DEFAULT 0, loaddir IN VARCHAR2);
Specify the name of the thesaurus to be migrated.
Specify the expansion mode for referenced thesauri:
Specify the expansion for referenced synonym rings:
1 Expand single user synonym ring in-place
2 Expand multiple user synonym rings in-place
3 Expand single user or public synonym ring in-place
4 Expand multiple user or public synonym rings in-place
Specify to dump all synonym rings owned by current user.
Specify the directory for load file generated by MIGRATE_THES.
The t_expand_mode and s_expand_mode parameters
allow users to control what happens when the thesaurus hierarchy contains
a narrower term that is a synonym ring.
If thesaurus expansion is used, a referenced thesaurus is
expanded 'in-place'. This means that instead of the term being a reference
to a thesaurus, it becomes the top-term of the referenced thesaurus. This
effectively makes the referenced hierarchy part of the main hierarchy.
If a thesaurus that has been made part of the main hierarchy references
other thesauri itself, then those thesauri also become part of the main
hierarchy. Referenced thesauri may be owned by the user or may be public.
Expansion of referenced thesauri can be limited to those owned by the user
if required.
In-place synonym ring expansion is similar to thesaurus expansion,
but the terms in the referenced ring become narrower terms of the referencing
term. The user can limit expansion to those synonym rings owned by the
user. It is possible for a synonym ring term to be a reference to another
synonym ring. Expansion of such a synonym ring would result in those terms
also being narrower terms of the term in the thesaurus. The user can control
this, as well as being able to limit the expansion to synonym rings owned
by the user.
If a value of 1 is specified for the dump_syns parameter,
MIGRATE_THES dumps all synonym rings into the same load file. When the
load file is used to create a thesaurus in ConText, all the dumped TextServer3
synonym rings are part of that ConText thesaurus.
The name of the generated load file is based on the thesaurus
name and the user ID for the TextServer3 user. For example, for a TextServer3
thesaurus called MYTHES belonging to a user with a TextServer3 user ID
of 3, the generated file is called MYTHES_3.ths.
A sample script is supplied with the Migration Tool. It can
be altered to suit the user's requirements. It prompts the user for values.
The following example makes use of the sample script to demonstrate
how a user's text table and thesaurus can be migrated from TextServer3
to ConText.
In this example, the TextServer3 user is jbloggs on a TextServer3 database using a connect string 'prod_db'. The jbloggs user has a TextServer3 user ID of 3 and one text table called odc_papers:
The summary column and the documents in the paper
column are in the same format, which is either Word for Windows 6 or WordPerfect
6 format. Both of these formats are supported internally by ConText. Neither
of these two columns are external.
The application uses a single thesaurus, called MY_THESAURUS,
but it does contain terms that are references to other thesauri and synonym
rings. Some of these are owned by the user, others are public. The user
requires the same terms to be returned from operations on the thesaurus
using ConText.
The directory chosen for creating the column policy script
and load files is /usr/home/joebloggs/migrate. The script creates
the database link, and drops it after finishing the script generation.
The steps for migrating the text table and thesaurus are:
% sqlplus jbloggs/jbloggs @samp_mig
Enter the following information at the prompts:
jbloggs jbloggs prod_db /usr/home/joebloggs/migrate odc_papers joes_app 2 4 1
The Migration Tool creates three script files in the directory /usr/home/joebloggs/migrate:
If these scripts are then run from SQL*Plus, they create
column policies for each of the text columns. Both column policies use
the Autorec functionality to invoke the internal filters since the Migration
Tool determines that all the formats used are supported by ConText. Before
running the scripts, the user may choose to edit the preference name or
any of the attribute values.
The thesaurus is migrated to ConText by using the ctxload utility. Assuming that the thesaurus should keep the name 'MY_THESAURUS', the following command would import the thesaurus:
% ctxload -user jbloggs/jbloggs -name MY_THESAURUS -file MY_THESAURUS_3.ths -thes
This completes the migration of the TextServer3 Dictionary
information for the text table, ODC_PAPERS. The user should now import
the data from the text table in the TextServer3 database and create a ConText
index for the data.