
2
Streams Capture Process

This chapter explains the concepts and architecture of the Streams capture process.


The Redo Log and a Capture Process

Every Oracle database has a set of two or more redo log files. The redo log files for a database are collectively known as the database's redo log. The primary function of the redo log is to record all changes made to the database.

Redo logs are used to guarantee recoverability in the event of human error or media failure. A capture process is an optional Oracle background process that scans the database redo log to capture DML and DDL changes made to database objects. When a capture process is configured to capture changes from a redo log, the database where the changes were generated is called the source database.

A capture process may run on the source database or on a remote database. When a capture process runs on the source database, the capture process is a local capture process. When a capture process runs on a remote database, the remote database is called the downstream database. If a capture process runs on a downstream database, then archived redo log files from the source database are copied to the downstream database, and the capture process captures changes from these files at the downstream database.

Logical Change Records (LCRs)

A capture process reformats changes captured from the redo log into LCRs. An LCR is an object with a specific format that describes a database change. A capture process captures two types of LCRs: row LCRs and DDL LCRs. Row LCRs and DDL LCRs are described in detail later in this section.

After capturing an LCR, a capture process enqueues an event containing the LCR into a queue. A capture process is always associated with a single SYS.AnyData queue, and it enqueues events into this queue only. For improved performance, captured events always are stored in a buffered queue, which is System Global Area (SGA) memory associated with a SYS.AnyData queue. You can create multiple queues and associate a different capture process with each queue. Figure 2-1 shows a capture process capturing LCRs.


Note:

A capture process can be associated only with a SYS.AnyData queue, not with a typed queue.


Figure 2-1 The Capture Process



Row LCRs

A row LCR describes a change to the data in a single row or a change to a single LONG, LONG RAW, or LOB column in a row. The change results from a data manipulation language (DML) statement or a piecewise update to a LOB. For example, a single DML statement may insert or merge multiple rows into a table, may update multiple rows in a table, or may delete multiple rows from a table.

Therefore, a single DML statement can produce multiple row LCRs. That is, a capture process creates an LCR for each row that is changed by the DML statement. In addition, an update to a LONG, LONG RAW, or LOB column in a single row may result in more than one row LCR.

Each row LCR is encapsulated in an object of LCR$_ROW_RECORD type and contains the following attributes:

A captured row LCR also may contain transaction control statements. These row LCRs contain directives such as COMMIT and ROLLBACK. Such row LCRs are internal and are used by an apply process to maintain transaction consistency between a source database and a destination database.

DDL LCRs

A DDL LCR describes a data definition language (DDL) change. A DDL statement changes the structure of the database. For example, a DDL statement may create, alter, or drop a database object.

Each DDL LCR contains the following information:

Extra Information in LCRs

In addition to the information discussed in the previous sections, row LCRs and DDL LCRs optionally may include the following extra information (or LCR attributes):

You can use the INCLUDE_EXTRA_ATTRIBUTE procedure in the DBMS_CAPTURE_ADM package to instruct a capture process to capture one or more extra attributes.
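
For example, the following call instructs a capture process to capture the name of the user who performed each change. This is a sketch; the capture process name strm01_capture is illustrative.

BEGIN
  DBMS_CAPTURE_ADM.INCLUDE_EXTRA_ATTRIBUTE(
    capture_name   => 'strm01_capture',  -- illustrative capture process name
    attribute_name => 'username',
    include        => true);
END;
/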


Capture Process Rules

A capture process either captures or discards changes based on rules that you define. Each rule specifies the database objects and types of changes for which the rule evaluates to TRUE. You can place these rules in a positive or negative rule set for the capture process.

If a rule evaluates to TRUE for a change, and the rule is in the positive rule set for a capture process, then the capture process captures the change. If a rule evaluates to TRUE for a change, and the rule is in the negative rule set for a capture process, then the capture process discards the change. If a capture process has both a positive and a negative rule set, then the negative rule set is always evaluated first.
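
For example, the following sketch, with illustrative table, capture process, and queue names, adds a rule to the positive rule set of a capture process so that DML changes to the hr.employees table are captured. Setting inclusion_rule to false instead would add the rule to the negative rule set.

BEGIN
  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name     => 'hr.employees',             -- illustrative table
    streams_type   => 'capture',
    streams_name   => 'strm01_capture',           -- illustrative capture process
    queue_name     => 'strmadmin.streams_queue',  -- illustrative queue
    include_dml    => true,
    include_ddl    => false,
    inclusion_rule => true);  -- true adds the rule to the positive rule set
END;
/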

You can specify capture process rules at the following levels:

Datatypes Captured

When capturing the row changes resulting from DML changes made to tables, a capture process can capture changes made to columns of the following datatypes:

A capture process does not capture the results of DML changes to columns of the following datatypes: BFILE, ROWID, and user-defined types (including object types, REFs, varrays, nested tables, and Oracle-supplied types). A capture process raises an error if it tries to create a row LCR for a DML change to a table containing a column of an unsupported datatype.

When a capture process raises an error, it writes the LCR that caused the error into its trace file, raises an ORA-00902 error, and becomes disabled. In this case, modify the rules used by the capture process to avoid the error, and restart the capture process.
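
For example, the following sketch, with an illustrative capture process name, stops the capture process so that its rules can be modified and then restarts it:

BEGIN
  DBMS_CAPTURE_ADM.STOP_CAPTURE(capture_name => 'strm01_capture');
  -- Modify the capture process rules here to exclude the table
  -- with the unsupported datatype, then restart the capture process.
  DBMS_CAPTURE_ADM.START_CAPTURE(capture_name => 'strm01_capture');
END;
/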


Note:
  • You may add rules to a negative rule set for a capture process that instruct the capture process to discard changes to tables with columns of unsupported datatypes. However, if these rules are not simple rules, then a capture process may create a row LCR for the change and continue to process it. In this case, a change that includes an unsupported datatype may cause the capture process to raise an error, even if the change does not satisfy the rule sets used by the capture process. The DBMS_STREAMS_ADM package creates only simple rules.
  • Some of the datatypes listed previously in this section may not be supported by Streams in earlier releases of Oracle. If your Streams environment includes one or more databases from an earlier release of Oracle, then make sure row LCRs do not flow into a database that does not support all of the datatypes in the row LCRs. See the Streams documentation for the earlier Oracle release for information about supported datatypes.


Types of Changes Captured

A capture process can capture only certain types of changes made to a database and its objects. The following sections describe the types of DML and DDL changes that can be captured.


Note:

A capture process never captures changes in the SYS, SYSTEM, or CTXSYS schemas.


See Also:

Chapter 4, "Streams Apply Process" for information about the types of changes an apply process can apply

Types of DML Changes Captured

When you specify that DML changes made to certain tables should be captured, a capture process captures the following types of DML changes made to these tables:

The following are considerations for capturing DML changes:

Types of DDL Changes Ignored by a Capture Process

A capture process captures the DDL changes that satisfy its rule sets, except for the following types of DDL changes:

A capture process can capture DDL statements, but not the results of DDL statements, unless the DDL statement is a CREATE TABLE AS SELECT statement. For example, when a capture process captures an ANALYZE statement, it does not capture the statistics generated by the ANALYZE statement. However, when a capture process captures a CREATE TABLE AS SELECT statement, it captures the statement itself and all of the rows selected (as INSERT row LCRs).

Some types of DDL changes that are captured by a capture process cannot be applied by an apply process. If an apply process receives a DDL LCR that specifies an operation that cannot be applied, then the apply process ignores the DDL LCR and records information about it in the trace file for the apply process.


Other Types of Changes Ignored by a Capture Process

The following types of changes are ignored by a capture process:

In addition, online table redefinition using the DBMS_REDEFINITION package is not supported on a table or schema for which a capture process captures changes.

NOLOGGING and UNRECOVERABLE Keywords for SQL Operations

If you use the NOLOGGING or UNRECOVERABLE keyword for a SQL operation, then the changes resulting from the SQL operation cannot be captured by a capture process. Therefore, do not use these keywords if you want to capture the changes that result from a SQL operation.

If the object for which you are specifying the logging attributes resides in a database or tablespace in FORCE LOGGING mode, then Oracle ignores any NOLOGGING or UNRECOVERABLE setting until the database or tablespace is taken out of FORCE LOGGING mode. You can determine the current logging mode for a database by querying the FORCE_LOGGING column in the V$DATABASE dynamic performance view. You can determine the current logging mode for a tablespace by querying the FORCE_LOGGING column in the DBA_TABLESPACES static data dictionary view.
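
For example, the following queries display these logging modes:

SELECT FORCE_LOGGING FROM V$DATABASE;

SELECT TABLESPACE_NAME, FORCE_LOGGING FROM DBA_TABLESPACES;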


Note:

The UNRECOVERABLE keyword is deprecated and has been replaced with the NOLOGGING keyword in the logging_clause. Although UNRECOVERABLE is supported for backward compatibility, Oracle Corporation strongly recommends that you use the NOLOGGING keyword, when appropriate.


See Also:

Oracle Database SQL Reference for more information about the NOLOGGING and UNRECOVERABLE keywords, FORCE LOGGING mode, and the logging_clause

UNRECOVERABLE Clause for Direct Path Loads

If you use the UNRECOVERABLE clause in the SQL*Loader control file for a direct path load, then the changes resulting from the direct path load cannot be captured by a capture process. Therefore, if the changes resulting from a direct path load should be captured by a capture process, then do not use the UNRECOVERABLE clause.

If you perform a direct path load without logging changes at a source database, but you do not perform a similar direct path load at the destination databases of the source database, then apply errors may result at these destination databases when changes are made to the loaded objects at the source database. In this case, a capture process at the source database can capture changes to these objects, and one or more propagations can propagate the changes to the destination databases. However, the loaded objects may not exist at a destination database, or they may exist there without the rows to which the propagated changes relate.

Therefore, if you use the UNRECOVERABLE clause for a direct path load and a capture process is configured to capture changes to the loaded objects, then make sure any destination databases contain the loaded objects and the loaded data to avoid apply errors. One way to make sure that these objects exist at the destination databases is to perform a direct path load at each of these destination databases that is similar to the direct path load performed at the source database.

If you load objects into a database or tablespace that is in FORCE LOGGING mode, then Oracle ignores any UNRECOVERABLE clause during a direct path load, and the loaded changes are logged. You can determine the current logging mode for a database by querying the FORCE_LOGGING column in the V$DATABASE dynamic performance view. You can determine the current logging mode for a tablespace by querying the FORCE_LOGGING column in the DBA_TABLESPACES static data dictionary view.

See Also:

Oracle Database Utilities for information about direct path loads and SQL*Loader

Supplemental Logging in a Streams Environment

Supplemental logging places additional column data into a redo log whenever an operation is performed. A capture process captures this additional information and places it in LCRs. Supplemental logging is always configured at a source database, regardless of location of the capture process that captures changes to the source database.

Typically, supplemental logging is required in Streams replication environments. In these environments, an apply process needs the additional information in the LCRs to properly apply DML changes and DDL changes that are replicated from a source database to a destination database. However, supplemental logging may also be required in environments where changes are not applied to database objects directly by an apply process. In such environments, an apply handler may process the changes without applying them to the database objects, and the supplemental information may be needed by the apply handlers.

See Also:

Oracle Streams Replication Administrator's Guide for detailed information about when supplemental logging is required

Instantiation in a Streams Environment

In a Streams environment that shares a database object within a single database or between multiple databases, a source database is the database where changes to the object are generated in the redo log, and a destination database is the database where these changes are dequeued by an apply process. If a capture process captures or will capture such changes, and the changes will be applied locally or propagated to other databases and applied at destination databases, then you must instantiate these source database objects before these changes can be dequeued and processed by an apply process. If a database where changes to the source database objects will be applied is a different database than the source database, then the destination database must have a copy of these database objects.

In Streams, the following general steps instantiate a database object:

  1. Prepare the object for instantiation at the source database.
  2. If a copy of the object does not exist at the destination database, then create an object physically at the destination database based on an object at the source database. You can use export/import, transportable tablespaces, or RMAN to copy database objects for instantiation. If the database objects already exist at the destination database, then this step is not necessary.
  3. Set the instantiation SCN for the database object at the destination database. An instantiation SCN instructs an apply process at the destination database to apply only changes that committed at the source database after the specified SCN.

In some cases, Step 1 and Step 3 are completed automatically. For example, when you add rules for an object to the positive rule set for a capture process by running a procedure in the DBMS_STREAMS_ADM package, the object is prepared for instantiation automatically. Also, when you use export/import or transportable tablespaces to copy database objects from a source database to a destination database, instantiation SCNs may be set for these objects automatically. Instantiation is required whenever an apply process dequeues captured LCRs, even if the apply process sends the LCRs to an apply handler that does not execute them.
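
The following sketch illustrates Step 1 and Step 3 for a single table, using illustrative object, database, and database link names. The SET_TABLE_INSTANTIATION_SCN procedure runs at the destination database; here it is invoked through a hypothetical database link named dest.net.

DECLARE
  iscn NUMBER;  -- instantiation SCN
BEGIN
  -- Step 1: Prepare the table for instantiation at the source database.
  DBMS_CAPTURE_ADM.PREPARE_TABLE_INSTANTIATION(
    table_name => 'hr.employees');
  -- Step 3: Record the current source SCN and set it at the destination.
  iscn := DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER();
  DBMS_APPLY_ADM.SET_TABLE_INSTANTIATION_SCN@dest.net(
    source_object_name   => 'hr.employees',
    source_database_name => 'src.net',  -- illustrative source global name
    instantiation_scn    => iscn);
END;
/

Step 2, copying the object, would be performed between these calls if the object does not already exist at the destination database.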


Note:

You can use either Data Pump export/import or original export/import for Streams instantiations. General references to export/import in this document refer to both Data Pump and original export/import. This document distinguishes between Data Pump and original export/import when necessary.


See Also:

Oracle Streams Replication Administrator's Guide for detailed information about instantiation in a Streams replication environment

Local Capture and Downstream Capture

You can configure a capture process to run locally on a source database or remotely on a downstream database. The following sections describe these options in detail.

Local Capture

Local capture means that a capture process runs on the source database. Figure 2-1 shows a database using local capture.

The Source Database Performs All Change Capture Actions

If you configure local capture, then the following actions are performed at the source database:

Advantages of Local Capture

The following are the advantages of using local capture:

Downstream Capture

Downstream capture means that a capture process runs on a database other than the source database. Archived redo log files from the source database are copied to the downstream database, and the capture process captures changes in these files at the downstream database. You can copy the archived redo log files to the downstream database using log transport services, the DBMS_FILE_TRANSFER package, file transfer protocol (FTP), or some other mechanism.

Figure 2-2 Downstream Capture



Note:

As illustrated in Figure 2-2, the source database for a change captured by a downstream capture process is the database where the change was recorded in the redo log, not the database running the downstream capture process.


You can configure multiple capture processes at a downstream database to capture changes from a single source database. You also can copy redo log files from multiple source databases and configure multiple capture processes to capture changes in these redo log files at a single downstream database.

In addition, a single database may have one or more capture processes that capture local changes and other capture processes that capture changes from a remote source database. That is, you can configure a single database to perform both local capture and downstream capture.

See Also:

Oracle Data Guard Concepts and Administration for more information about log transport services

The Downstream Database Performs Most Change Capture Actions

If you configure downstream capture, then the following actions are performed at the downstream database:

In a downstream capture configuration, the following actions are performed at the source database:

In addition, the redo log files must be copied from the computer system running the source database to the computer system running the downstream database. Typically, log transport services copies these redo log files to the downstream database.

See Also:

Chapter 6, "How Rules Are Used In Streams" for more information about rule sets for Streams clients and for information about how events satisfy rule sets

Advantages of Downstream Capture

The following are the advantages of using downstream capture:

Optional Database Link from the Downstream Database to the Source Database

When you create or alter a downstream capture process, you can optionally specify the use of a database link from the downstream database to the source database. This database link must have the same name as the global name of the source database. Such a database link simplifies the creation and administration of a downstream capture process. You specify that a downstream capture process uses a database link by setting the use_database_link parameter to true when you run CREATE_CAPTURE or ALTER_CAPTURE on the downstream capture process.
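
For example, the following sketch creates a downstream capture process that uses a database link, with illustrative queue, capture process, rule set, and database names. The source_database value must be the global name of the source database, which must match the database link name.

BEGIN
  DBMS_CAPTURE_ADM.CREATE_CAPTURE(
    queue_name        => 'strmadmin.streams_queue',   -- illustrative queue
    capture_name      => 'strm02_capture',            -- illustrative name
    rule_set_name     => 'strmadmin.strm02_rule_set', -- illustrative rule set
    source_database   => 'src.net',                   -- illustrative global name
    use_database_link => true);
END;
/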

When a downstream capture process uses a database link to the source database, the capture process connects to the source database to perform the following administrative actions automatically:

If a downstream capture process does not use a database link, then you must perform these actions manually.

See Also:

"Creating a Downstream Capture Process That Uses a Database Link" for information about when the DBMS_CAPTURE_ADM.BUILD procedure is run automatically during capture process creation if the downstream capture process uses a database link

Operational Requirements for Downstream Capture

The following are operational requirements for using downstream capture:

In a downstream capture environment, the source database can be a single instance database or a multi-instance Real Application Clusters (RAC) database. The downstream database can be a single instance database or a multi-instance RAC database, regardless of whether the source database is single instance or multi-instance.

SCN Values Relating to a Capture Process

This section describes system change number (SCN) values that are important for a capture process. You can query the ALL_CAPTURE data dictionary view to display these values for one or more capture processes.

Captured SCN and Applied SCN

The captured SCN is the SCN that corresponds to the most recent change scanned in the redo log by a capture process. The applied SCN for a capture process is the SCN of the most recent event dequeued by the relevant apply processes. All events lower than this SCN have been dequeued by all apply processes that apply changes captured by the capture process. The applied SCN for a capture process is equivalent to the low-watermark SCN for an apply process that applies changes captured by the capture process.
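
For example, the following query displays these SCN values for each capture process:

SELECT CAPTURE_NAME, CAPTURED_SCN, APPLIED_SCN FROM ALL_CAPTURE;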

First SCN and Start SCN

This section describes the first SCN and start SCN for a capture process.

First SCN

The first SCN is the lowest SCN in the redo log from which a capture process can capture changes. If you specify a first SCN during capture process creation, then the database must be able to access redo log information from the SCN specified and higher.

The DBMS_CAPTURE_ADM.BUILD procedure extracts the source database data dictionary to the redo log. When you create a capture process, you can specify a first SCN that corresponds to this data dictionary build in the redo log. Specifically, the first SCN for the capture process being created can be set to any value returned by the following query:

COLUMN FIRST_CHANGE# HEADING 'First SCN' FORMAT 999999999
COLUMN NAME HEADING 'Log File Name' FORMAT A50

SELECT DISTINCT FIRST_CHANGE#, NAME FROM V$ARCHIVED_LOG
  WHERE DICTIONARY_BEGIN = 'YES';

The value returned for the NAME column is the name of the redo log file that contains the SCN corresponding to the first SCN. This redo log file, and subsequent redo log files, must be available to the capture process. If this query returns multiple distinct values for FIRST_CHANGE#, then the DBMS_CAPTURE_ADM.BUILD procedure has been run more than once on the source database. In this case, choose the first SCN value that is most appropriate for the capture process you are creating.

In some cases, the DBMS_CAPTURE_ADM.BUILD procedure is run automatically when a capture process is created. When this happens, the first SCN for the capture process corresponds to this data dictionary build.

Start SCN

The start SCN is the SCN from which a capture process begins to capture changes. You can specify a start SCN that is different from the first SCN during capture process creation, or you can alter a capture process to set its start SCN. The start SCN does not need to be modified for normal operation of a capture process.
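
For example, the following sketch, with an illustrative capture process name and SCN value, sets the start SCN for an existing capture process:

BEGIN
  DBMS_CAPTURE_ADM.ALTER_CAPTURE(
    capture_name => 'strm01_capture',  -- illustrative name
    start_scn    => 750338);           -- illustrative SCN value
END;
/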

Start SCN Must Be Greater Than or Equal to First SCN

If you specify a start SCN when you create or alter a capture process, then the start SCN specified must be greater than or equal to the first SCN for the capture process. A capture process always scans any unscanned redo log records that have higher SCN values than the first SCN, even if the redo log records have lower SCN values than the start SCN. So, if you specify a start SCN that is greater than the first SCN, then the capture process may scan redo log records for which it cannot capture changes, because these redo log records have a lower SCN than the start SCN.

Scanning redo log records before the start SCN should be avoided if possible because it may take some time. Therefore, Oracle Corporation recommends that the difference between the first SCN and start SCN be as small as possible during capture process creation to keep the initial capture process startup time to a minimum.


Attention:

When a capture process is started or restarted, it may need to scan redo log files with a FIRST_CHANGE# value that is lower than the start SCN. Removing required redo log files before they are scanned by a capture process causes the capture process to abort. You can query the DBA_CAPTURE data dictionary view to determine the first SCN, start SCN, and required checkpoint SCN. A capture process needs the redo log file that includes the required checkpoint SCN, and all subsequent redo log files. See "Capture Process Creation" for more information about the first SCN and start SCN for a capture process.


A Start SCN Setting That Is Prior to Preparation for Instantiation

If you want to capture changes to a database object and apply these changes using an apply process, then only changes that occurred after the database object has been prepared for instantiation can be applied. Therefore, if you set the start SCN for a capture process lower than the SCN that corresponds to the time when a database object was prepared for instantiation, then any captured changes to this database object prior to the prepare SCN cannot be applied by an apply process.

This limitation may be important during capture process creation. If a database object was never prepared for instantiation prior to the time of capture process creation, then an apply process cannot apply any captured changes to the object from a time before capture process creation time.

In some cases, database objects may have been prepared for instantiation before a new capture process is created. For example, if you want to create a new capture process for a source database whose changes are already being captured by one or more existing capture processes, then some or all of the database objects may have been prepared for instantiation before the new capture process is created. If you want to capture changes to a certain database object with a new capture process from a time before the new capture process was created, then the following conditions must be met for an apply process to apply these captured changes:

Streams Capture Processes and RESTRICTED SESSION

When you enable restricted session during system startup by issuing a STARTUP RESTRICT statement, capture processes do not start, even if they were running when the database shut down. When the restricted session is disabled, each capture process that was running when the database shut down is started.

When the restricted session is enabled in a running database by the SQL statement ALTER SYSTEM with the ENABLE RESTRICTED SESSION clause, it does not affect any running capture processes. These capture processes continue to run and capture changes. If a stopped capture process is started in a restricted session, then the capture process does not start until the restricted session is disabled.

Streams Capture Processes and Oracle Real Application Clusters

You can configure a Streams capture process to capture changes in a Real Application Clusters (RAC) environment. If you use one or more capture processes and RAC in the same environment, then all archived logs that contain changes to be captured by a capture process must be available to all instances in the RAC environment. In a RAC environment, a capture process reads changes made by all instances.

Each capture process is started on the owner instance for its SYS.AnyData queue, even if the start procedure is run on a different instance. Also, a capture process will follow its queue to a different instance if the current owner instance becomes unavailable. The queue itself follows the rules for primary instance and secondary instance ownership. If the owner instance for a queue table containing a queue used by a capture process becomes unavailable, then queue ownership is transferred automatically to another instance in the cluster. In addition, if the capture process was enabled when the owner instance became unavailable, then the capture process is restarted automatically on the new owner instance. If the capture process was disabled when the owner instance became unavailable, then the capture process remains disabled on the new owner instance.

The DBA_QUEUE_TABLES data dictionary view contains information about the owner instance for a queue table. Also, any parallel execution servers used by a single capture process run on a single instance in a RAC environment.
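
For example, the following query displays the owner instance for each queue table:

SELECT OWNER, QUEUE_TABLE, OWNER_INSTANCE FROM DBA_QUEUE_TABLES;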


Capture Process Architecture

A capture process is an optional Oracle background process whose process name is cnnn, where nnn is a capture process number. Valid capture process names include c001 through c999. A capture process captures changes from the redo log by using the infrastructure of LogMiner. Streams configures LogMiner automatically. You can create, alter, start, stop, and drop a capture process, and you can define capture process rules that control which changes a capture process captures.

Changes are captured by a capture user. The capture user captures all changes that satisfy the capture process rule sets. In addition, the capture user runs all rule-based transformations specified by the rules in these rule sets. The capture user must have the necessary privileges to perform these actions, including execute privilege on the rule sets used by the capture process, execute privilege on all rule-based transformation functions specified for rules in the positive rule set, and privileges to enqueue events into the capture process queue. A capture process can be associated with only one user, but one user may be associated with many capture processes.

See Also:

"Configuring a Streams Administrator" for information about the required privileges

This section discusses the following topics:

Capture Process Components

A capture process consists of the following components:

Each reader server, preparer server, and builder server is a parallel execution server. The capture process (cnnn) is an Oracle background process.


Capture Process States

The state of a capture process describes what the capture process is doing currently. You can view the state of a capture process by querying the STATE column in the V$STREAMS_CAPTURE dynamic performance view. The following capture process states are possible:
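
For example, the following query displays the current state of each capture process:

SELECT CAPTURE_NAME, STATE FROM V$STREAMS_CAPTURE;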

Multiple Capture Processes in a Single Database

If you run multiple capture processes on a single database, consider increasing the size of the System Global Area (SGA) for each instance. Use the SGA_MAX_SIZE initialization parameter to increase the SGA size. Also, you should increase the size of the Streams pool by 10 MB for each unit of capture process parallelism on the database. For example, if two capture processes are running on a database, with the parallelism parameter set to 4 for one and 1 for the other, then increase the Streams pool by 50 MB (4 + 1 = 5 units of parallelism).
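
For example, the following statement, a sketch that assumes the database uses a server parameter file, sets the Streams pool to 50 MB; the new value takes effect when the instance is restarted:

ALTER SYSTEM SET STREAMS_POOL_SIZE = 50M SCOPE=SPFILE;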


Note:
  • Oracle Corporation recommends that each capture process use a separate queue to keep LCRs from different capture processes separate.
  • If the size of the Streams pool is zero, then Streams may use up to 10% of the shared pool. The STREAMS_POOL_SIZE initialization parameter controls the size of the Streams pool.
  • Each capture process uses one LogMiner session.

See Also:

"Setting Initialization Parameters Relevant to Streams" for more information about the STREAMS_POOL_SIZE initialization parameter

Capture Process Checkpoints

At regular intervals, a capture process tries to record a checkpoint. At a checkpoint, the capture process records its current state persistently in the data dictionary of the database running the capture process.

The required checkpoint SCN is the lowest checkpoint SCN for which a capture process requires redo information. The redo log file that contains the required checkpoint SCN, and all subsequent redo log files, must be available to the capture process. If a capture process is stopped and restarted, then it starts scanning the redo log from the SCN that corresponds to its required checkpoint SCN. The required checkpoint SCN is important for recovery if a database stops unexpectedly. Also, if the first SCN is reset for a capture process, then it must be set to a value that is less than or equal to the required checkpoint SCN for the capture process. You can determine the required checkpoint SCN for a capture process by querying the REQUIRED_CHECKPOINT_SCN column in the DBA_CAPTURE data dictionary view.

Also, the SCN value that corresponds to the last checkpoint recorded by a capture process is the maximum checkpoint SCN. If you create a capture process that captures changes from a source database, and other capture processes already exist which capture changes from the same source database, then the maximum checkpoint SCNs of the existing capture processes can help you decide whether the new capture process should create a new LogMiner data dictionary or share one of the existing LogMiner data dictionaries. You can determine the maximum checkpoint SCN for a capture process by querying the MAX_CHECKPOINT_SCN column in the DBA_CAPTURE data dictionary view.
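
For example, the following query displays both checkpoint SCN values for each capture process:

SELECT CAPTURE_NAME, REQUIRED_CHECKPOINT_SCN, MAX_CHECKPOINT_SCN
  FROM DBA_CAPTURE;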


Capture Process Creation

You can create a capture process using the DBMS_STREAMS_ADM package or the DBMS_CAPTURE_ADM package. Using the DBMS_STREAMS_ADM package to create a capture process is simpler because defaults are used automatically for some configuration options. In addition, when you use the DBMS_STREAMS_ADM package, a rule set is created for the capture process and rules may be added to the rule set automatically. The rule set is a positive rule set if the inclusion_rule parameter is set to true (the default), or it is a negative rule set if the inclusion_rule parameter is set to false.

Alternatively, using the DBMS_CAPTURE_ADM package to create a capture process is more flexible, and you create one or more rule sets and rules for the capture process either before or after it is created. You can use the procedures in the DBMS_STREAMS_ADM package or the DBMS_RULE_ADM package to add rules to a rule set for the capture process.

When you create a capture process using a procedure in the DBMS_STREAMS_ADM package and generate one or more rules in the positive rule set for the capture process, the objects for which changes are captured are prepared for instantiation automatically, unless it is a downstream capture process and there is no database link from the downstream database to the source database.

When you create a capture process using the CREATE_CAPTURE procedure in the DBMS_CAPTURE_ADM package, you should prepare for instantiation any objects for which you plan to capture changes as soon as possible after capture process creation. You can prepare objects for instantiation using one of the following procedures in the DBMS_CAPTURE_ADM package:
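
  • PREPARE_TABLE_INSTANTIATION
  • PREPARE_SCHEMA_INSTANTIATION
  • PREPARE_GLOBAL_INSTANTIATION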

The LogMiner Data Dictionary for a Capture Process

A capture process requires a data dictionary that is separate from the primary data dictionary for the source database. This separate data dictionary is called a LogMiner data dictionary. There may be more than one LogMiner data dictionary for a particular source database. If there are multiple capture processes capturing changes from the source database, then two or more capture processes may share a LogMiner data dictionary, or each capture process may have its own LogMiner data dictionary. If the LogMiner data dictionary needed by a capture process does not exist, then the capture process populates it using information in the redo log when the capture process is started for the first time.

The DBMS_CAPTURE_ADM.BUILD procedure extracts data dictionary information to the redo log, and this procedure must be run at least once on the source database before any capture process capturing changes originating at the source database is started. The extracted data dictionary information in the redo log is consistent with the primary data dictionary at the time when the DBMS_CAPTURE_ADM.BUILD procedure is run. This procedure also identifies a valid first SCN value that can be used to create a capture process.

You may perform a build of data dictionary information in the redo log multiple times, and a particular build may or may not be used by a capture process to create a LogMiner data dictionary. The amount of information extracted to a redo log when you run the BUILD procedure depends on the number of database objects in the database. Typically, the BUILD procedure generates a large amount of redo information that a capture process must scan subsequently. Therefore, you should run the BUILD procedure only when necessary.
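
For example, the following sketch runs the BUILD procedure and displays the valid first SCN value that it identifies:

SET SERVEROUTPUT ON
DECLARE
  scn NUMBER;
BEGIN
  DBMS_CAPTURE_ADM.BUILD(first_scn => scn);
  DBMS_OUTPUT.PUT_LINE('First SCN Value = ' || scn);
END;
/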

A capture process requires a LogMiner data dictionary because the information in the primary data dictionary may not apply to the changes being captured from the redo log. These changes may have occurred minutes, hours, or even days before they are captured by a capture process. For example, consider the following scenario:

  1. A capture process is configured to capture changes to tables.
  2. A database administrator stops the capture process. When the capture process is stopped, it records the SCN of the change it was currently capturing.
  3. User applications continue to make changes to the tables while the capture process is stopped.
  4. The capture process is restarted three hours after it was stopped.

In this case, to ensure data consistency, the capture process must begin capturing changes in the redo log at the time when it was stopped. The capture process starts capturing changes at the SCN that it recorded when it was stopped.

The redo log contains raw data. It does not contain database object names and column names in tables. Instead, it uses object numbers and internal column numbers for database objects and columns, respectively. Therefore, when a change is captured, a capture process must reference a data dictionary to determine the details of the change.

Because a LogMiner data dictionary may be populated when a capture process is started for the first time, it might take some time to start capturing changes. The amount of time required depends on the number of database objects in the database. You can query the STATE column in the V$STREAMS_CAPTURE dynamic performance view to monitor the progress while a capture process is processing a data dictionary build.

Scenario Illustrating Why a Capture Process Needs a LogMiner Data Dictionary

Consider a scenario in which a capture process has been configured to capture changes to table t1, which has columns a and b, and the following changes are made to this table at three different points in time:

Time 1: Insert values a=7 and b=15.

Time 2: Add column c.

Time 3: Drop column b.

If for some reason the capture process is capturing changes from an earlier time, then the primary data dictionary and the relevant version in the LogMiner data dictionary contain different information. Table 2-1 illustrates how the information in the LogMiner data dictionary is used when the current time is different than the change capturing time.

Table 2-1 Information About Table t1 in the Primary and LogMiner Data Dictionaries

Current Time   Change Capturing Time   Primary Data Dictionary             LogMiner Data Dictionary
------------   ---------------------   ---------------------------------   ---------------------------------------
1              1                       Table t1 has columns a and b.       Table t1 has columns a and b at time 1.
2              1                       Table t1 has columns a, b, and c.   Table t1 has columns a and b at time 1.
3              1                       Table t1 has columns a and c.       Table t1 has columns a and b at time 1.

Assume that the capture process captures the change resulting from the insert at time 1 when the actual time is time 3. If the capture process used the primary data dictionary, then it might assume that a value of 7 was inserted into column a and a value of 15 was inserted into column c, because those are the two columns for table t1 at time 3 in the primary data dictionary. However, a value of 15 actually was inserted into column b, not column c.

Because the capture process uses the LogMiner data dictionary, the error is avoided. The LogMiner data dictionary is synchronized with the capture process and continues to record that table t1 has columns a and b at time 1. So, the captured change specifies that a value of 15 was inserted into column b.

Multiple Capture Processes for the Same Source Database

If one or more capture processes are capturing changes made to a source database, and you want to create a new capture process that captures changes to the same source database, then the new capture process either can create a new LogMiner data dictionary or share one of the existing LogMiner data dictionaries with one or more other capture processes. Whether a new LogMiner data dictionary is created for a new capture process depends on the setting for the first_scn parameter when you run CREATE_CAPTURE to create a capture process:

If multiple LogMiner data dictionaries exist, and you specify NULL for the first_scn parameter during capture process creation, then the new capture process automatically attempts to share the LogMiner data dictionary of one of the existing capture processes that has taken at least one checkpoint. You can view the maximum checkpoint SCN for all existing capture processes by querying the MAX_CHECKPOINT_SCN column in the DBA_CAPTURE data dictionary view.

If multiple LogMiner data dictionaries exist, and you specify a non-NULL value for the first_scn parameter during capture process creation, then the new capture process creates a new LogMiner data dictionary the first time it is started. In this case, before you create the new capture process, you must run the BUILD procedure in the DBMS_CAPTURE_ADM package on the source database. The BUILD procedure generates a corresponding valid first SCN value that you can specify when you create the new capture process. You can find a first SCN generated by the BUILD procedure by running the following query:

COLUMN FIRST_CHANGE# HEADING 'First SCN' FORMAT 999999999
COLUMN NAME HEADING 'Log File Name' FORMAT A50

SELECT DISTINCT FIRST_CHANGE#, NAME FROM V$ARCHIVED_LOG
  WHERE DICTIONARY_BEGIN = 'YES';

This query may return more than one row if the BUILD procedure was run more than once.

The most important factor to consider when deciding whether a new capture process should share an existing LogMiner data dictionary or create a new one is the difference between the maximum checkpoint SCN values of the existing capture processes and the start SCN of the new capture process. If the new capture process shares a LogMiner data dictionary, then it must scan the redo log from the point of the maximum checkpoint SCN of the shared LogMiner data dictionary onward, even though the new capture process cannot capture changes prior to its first SCN. If the start SCN of the new capture process is much higher than the maximum checkpoint SCN of the existing capture process, then the new capture process must scan a large amount of redo log information before it reaches its start SCN.

A capture process creates a new LogMiner data dictionary when the first_scn parameter is non-NULL during capture process creation. Follow these guidelines when you decide if a new capture process should share an existing LogMiner data dictionary or create a new one:

Figure 2-3 illustrates these guidelines.

Figure 2-3 Deciding Whether to Share a LogMiner Data Dictionary



Note:
  • If you create a capture process using one of the procedures in the DBMS_STREAMS_ADM package, then it is the same as specifying NULL for the first_scn and start_scn parameters in the CREATE_CAPTURE procedure.
  • You must prepare database objects for instantiation if a new capture process will capture changes made to these database objects. This requirement holds even if the new capture process shares a LogMiner data dictionary with one or more other capture processes for which these database objects have been prepared for instantiation.


First SCN and Start SCN Specifications During Capture Process Creation

When you create a capture process using the CREATE_CAPTURE procedure in the DBMS_CAPTURE_ADM package, you can specify the first SCN and start SCN for the capture process. The first SCN is the lowest SCN in the redo log from which a capture process can capture changes, and it should be obtained through a data dictionary build or a query on the V$ARCHIVED_LOG dynamic performance view. The start SCN is the SCN from which a capture process begins to capture changes. The start SCN must be equal to or greater than the first SCN.

A capture process scans the redo information from the first SCN or an existing capture process checkpoint forward, even if the start SCN is higher than the first SCN or the checkpoint SCN. In this case, the capture process does not capture any changes in the redo information before the start SCN. Oracle Corporation recommends that, at capture process creation time, the difference between the first SCN and start SCN be as small as possible to keep the amount of redo scanned by the capture process to a minimum.
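
For example, the following sketch, with illustrative queue, capture process, and SCN values, creates a local capture process that begins capturing changes at the specified start SCN:

BEGIN
  DBMS_CAPTURE_ADM.CREATE_CAPTURE(
    queue_name   => 'strmadmin.streams_queue',  -- illustrative queue
    capture_name => 'strm03_capture',           -- illustrative name
    first_scn    => 750338,   -- illustrative; must match a dictionary build
    start_scn    => 750400);  -- must be greater than or equal to first_scn
END;
/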

In some cases, the behavior of the capture process is different depending on the settings of these SCN values and on whether the capture process is local or downstream.


Note:

When you create a capture process using the DBMS_STREAMS_ADM package, both the first SCN and the start SCN are set to NULL during capture process creation.


The following sections describe capture process behavior for SCN value settings:

Non-NULL First SCN and NULL Start SCN for a Local or Downstream Capture Process

The new capture process is created at the local database with a new LogMiner session starting from the value specified for the first_scn parameter. The start SCN is set to the specified first SCN value automatically, and the new capture process does not capture changes that were made before this SCN.

The BUILD procedure in the DBMS_CAPTURE_ADM package is not run automatically. This procedure should have been called at least once before on the source database, and the specified first SCN must correspond to the SCN value of a previous build that is still available in the redo log. When the new capture process is started for the first time, it creates a new LogMiner data dictionary using the data dictionary information in the redo log. If the BUILD procedure in the DBMS_CAPTURE_ADM package has not been run at least once on the source database, then an error is raised when the capture process is started.

Capture process behavior is the same for a local capture process and a downstream capture process created with these SCN settings, except that a local capture process is created at the source database and a downstream capture process is created at the downstream database.

Non-NULL First SCN and Non-NULL Start SCN for a Local or Downstream Capture Process

If the specified value for the start_scn parameter is greater than or equal to the specified value for the first_scn parameter, then the new capture process is created at the local database with a new LogMiner session starting from the specified first SCN. In this case, the new capture process does not capture changes that were made before the specified start SCN. If the specified value for the start_scn parameter is less than the specified value for the first_scn parameter, then an error is raised.

The BUILD procedure in the DBMS_CAPTURE_ADM package is not run automatically. This procedure must have been called at least once before on the source database, and the specified first_scn must correspond to the SCN value of a previous build that is still available in the redo log. When the new capture process is started for the first time, it creates a new LogMiner data dictionary using the data dictionary information in the redo log. If the BUILD procedure in the DBMS_CAPTURE_ADM package has not been run at least once on the source database, then an error is raised.

Capture process behavior is the same for a local capture process and a downstream capture process created with these SCN settings, except that a local capture process is created at the source database and a downstream capture process is created at the downstream database.

NULL First SCN and Non-NULL Start SCN for a Local Capture Process

The new capture process creates a new LogMiner data dictionary if either one of the following conditions is true:

In either of these cases, the BUILD procedure in the DBMS_CAPTURE_ADM package is run during capture process creation. The new capture process uses the resulting build of the source data dictionary in the redo log to create a LogMiner data dictionary the first time it is started, and the first SCN corresponds to the SCN of the data dictionary build.

However, if there is at least one existing local capture process for the local source database that has taken a checkpoint, then the new capture process shares an existing LogMiner data dictionary with one or more of the existing capture processes. In this case, a capture process with a first SCN that is lower than or equal to the specified start SCN must have been started successfully at least once.

If there is no existing capture process for the local source database (or if no existing capture processes have taken a checkpoint yet), and the specified start SCN is less than the current SCN for the database, then an error is raised.

NULL First SCN and Non-NULL Start SCN for a Downstream Capture Process

If the use_database_link parameter is set to true during capture process creation, then the database link is used to obtain the current SCN of the source database. In this case, the new capture process creates a new LogMiner data dictionary if either one of the following conditions is true:

In either of these cases, the BUILD procedure in the DBMS_CAPTURE_ADM package is run during capture process creation. The first time you start the new capture process, it uses the resulting build of the source data dictionary in the redo log files copied to the downstream database to create a LogMiner data dictionary. Here, the first SCN for the new capture process corresponds to the SCN of the data dictionary build.

However, if there is at least one existing capture process that has taken a checkpoint and captures changes to the source database at the downstream database, then the new capture process shares an existing LogMiner data dictionary with one or more of these existing capture processes, regardless of the use_database_link parameter setting. In this case, one of these existing capture processes with a first SCN that is lower than or equal to the specified start SCN must have been started successfully at least once.

If the use_database_link parameter is set to true during capture process creation, there is no existing capture process that captures changes to the source database at the downstream database (or no existing capture process has taken a checkpoint), and the specified start_scn parameter value is less than the current SCN for the source database, then an error is raised.

If the use_database_link parameter is set to false during capture process creation and there is no existing capture process that captures changes to the source database at the downstream database (or no existing capture process has taken a checkpoint), then an error is raised.

NULL First SCN and NULL Start SCN

The behavior is the same as setting the first_scn parameter to NULL and setting the start_scn parameter to the current SCN of the source database.


A New First SCN Value and Purged LogMiner Dictionary Information

When you reset the first SCN value for an existing capture process, Oracle automatically purges LogMiner data dictionary information prior to the new first SCN setting. If the start SCN for a capture process corresponds to information that has been purged, then Oracle automatically resets the start SCN to the same value as the first SCN. However, if the start SCN is higher than the new first SCN setting, then the start SCN remains unchanged.
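
For example, the following sketch, with an illustrative capture process name and SCN value, resets the first SCN for an existing capture process:

BEGIN
  DBMS_CAPTURE_ADM.ALTER_CAPTURE(
    capture_name => 'strm01_capture',  -- illustrative name
    first_scn    => 779347);           -- illustrative SCN value
END;
/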

Figure 2-4 shows how Oracle automatically purges LogMiner data dictionary information prior to a new first SCN setting, and how the start SCN is not changed if it is higher than the new first SCN setting.

Figure 2-4 Start SCN Higher Than Reset First SCN


Given this example, if the first SCN is reset again to a value higher than the start SCN value for a capture process, then the start SCN no longer corresponds to existing information in the LogMiner data dictionary. Figure 2-5 shows how Oracle resets the start SCN automatically if it is lower than a new first SCN setting.

Figure 2-5 Start SCN Lower Than Reset First SCN


As you can see, the first SCN and start SCN for a capture process may continually increase over time, and, as the first SCN moves forward, it may no longer correspond to an SCN established by the DBMS_CAPTURE_ADM.BUILD procedure.


The Streams Data Dictionary

Propagations and apply processes use a Streams data dictionary to keep track of the database objects from a particular source database. A Streams data dictionary is populated whenever one or more database objects are prepared for instantiation at a source database. Specifically, when a database object is prepared for instantiation, it is recorded in the redo log. When a capture process scans the redo log, it uses this information to populate the local Streams data dictionary for the source database. In the case of local capture, this Streams data dictionary is at the source database. In the case of downstream capture, this Streams data dictionary is at the downstream database.

When you prepare a database object for instantiation, you are informing Streams that information about the database object is needed by propagations that propagate changes to the database object and apply processes that apply changes to the database object. Any database that propagates or applies these changes requires a Streams data dictionary for the source database where the changes originated.

After an object has been prepared for instantiation, the local Streams data dictionary is updated when a DDL statement on the object is processed by a capture process. In addition, an internal message containing information about this DDL statement is captured and placed in the queue for the capture process. Propagations may then propagate these internal messages to destination queues at databases.

A Streams data dictionary is multiversioned. If a database has multiple propagations and apply processes, then all of them use the same Streams data dictionary for a particular source database. A database can contain only one Streams data dictionary for a particular source database, but it can contain multiple Streams data dictionaries if it propagates or applies changes from multiple source databases.


ARCHIVELOG Mode and a Capture Process

A local capture process reads online redo logs whenever possible and archived redo log files otherwise. A downstream capture process always reads archived redo log files from its source database. For this reason, the source database must be running in ARCHIVELOG mode when a capture process is configured to capture changes. You must keep an archived redo log file available until you are certain that no capture process will need that file.

You can query the REQUIRED_CHECKPOINT_SCN column in the DBA_CAPTURE data dictionary view to determine the required checkpoint SCN for a capture process. When the capture process is restarted, it scans the redo log from the required checkpoint SCN forward. Therefore, the redo log file that includes the required checkpoint SCN, and all subsequent redo log files, must be available to the capture process.

The first SCN for a capture process can be reset to a higher value, but it cannot be reset to a lower value. Therefore, a capture process will never need the redo log files that contain information prior to its first SCN. Query the DBA_LOGMNR_PURGED_LOG data dictionary view to determine which archived redo log files will never be needed by any capture process.
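
For example, the following query lists the archived redo log files that are no longer needed by any capture process:

SELECT * FROM DBA_LOGMNR_PURGED_LOG;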

When a local capture process falls behind, there is a seamless transition from reading an online redo log to reading an archived redo log, and, when a local capture process catches up, there is a seamless transition from reading an archived redo log to reading an online redo log.


Capture Process Parameters

After creation, a capture process is disabled so that you can set the capture process parameters for your environment before starting it for the first time. Capture process parameters control the way a capture process operates. For example, the time_limit capture process parameter can be used to specify the amount of time a capture process runs before it is shut down automatically.
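For example, the following call sets the time_limit parameter so that a capture process shuts down automatically after running for one hour; the capture process name strm01_capture is hypothetical:

BEGIN
  -- Parameter values are passed as strings; 3600 is in seconds
  DBMS_CAPTURE_ADM.SET_PARAMETER(
    capture_name => 'strm01_capture',
    parameter    => 'time_limit',
    value        => '3600');
END;
/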

Capture Process Parallelism

The parallelism capture process parameter controls the number of preparer servers used by a capture process. The preparer servers concurrently format changes found in the redo log into LCRs. Each reader server, preparer server, and builder server is a parallel execution server, and the number of preparer servers equals the number specified for the parallelism capture process parameter. So, if parallelism is set to 5, then a capture process uses a total of seven parallel execution servers, assuming seven parallel execution servers are available: one reader server, five preparer servers, and one builder server.


Note:
  • Resetting the parallelism parameter automatically stops and restarts the capture process.
  • Setting the parallelism parameter to a number higher than the number of available parallel execution servers might disable the capture process. Make sure the PROCESSES and PARALLEL_MAX_SERVERS initialization parameters are set appropriately when you set the parallelism capture process parameter.
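For example, the following call sets parallelism to 5 for a hypothetical capture process named strm01_capture, which stops and restarts the capture process automatically:

BEGIN
  -- Requires at least seven available parallel execution servers:
  -- one reader, five preparers, and one builder
  DBMS_CAPTURE_ADM.SET_PARAMETER(
    capture_name => 'strm01_capture',
    parameter    => 'parallelism',
    value        => '5');
END;
/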

See Also:

"Capture Process Components" for more information about preparer servers

Automatic Restart of a Capture Process

You can configure a capture process to stop automatically when it reaches certain limits. The time_limit capture process parameter specifies the amount of time a capture process runs, and the message_limit capture process parameter specifies the number of events a capture process can capture. The capture process stops automatically when it reaches one of these limits.

The disable_on_limit parameter controls whether a capture process becomes disabled or restarts when it reaches a limit. If you set the disable_on_limit parameter to y, then the capture process is disabled when it reaches a limit and does not restart until you restart it explicitly. If, however, you set the disable_on_limit parameter to n, then the capture process stops and restarts automatically when it reaches a limit.
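For example, the following calls configure a hypothetical capture process named strm01_capture to stop after capturing 10000 events and then restart automatically:

BEGIN
  DBMS_CAPTURE_ADM.SET_PARAMETER(
    capture_name => 'strm01_capture',
    parameter    => 'message_limit',
    value        => '10000');
  -- n means the capture process restarts automatically at the limit
  DBMS_CAPTURE_ADM.SET_PARAMETER(
    capture_name => 'strm01_capture',
    parameter    => 'disable_on_limit',
    value        => 'n');
END;
/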

When a capture process is restarted, it starts to capture changes at the point where it last stopped. A restarted capture process gets a new session identifier, and the parallel execution servers associated with the capture process also get new session identifiers. However, the capture process number (cnnn) remains the same.
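You can observe the session of a running capture process with a query such as the following; after a restart, the SID and SERIAL# values change, but the capture process name does not:

-- Display session information for running capture processes
SELECT CAPTURE_NAME, SID, SERIAL#, STATE
  FROM V$STREAMS_CAPTURE;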

Capture Process Rule Evaluation

A capture process evaluates changes it finds in the redo log against its positive and negative rule sets, evaluating each change against the negative rule set first. If one or more rules in the negative rule set evaluate to TRUE for the change, then the change is discarded. If no rule in the negative rule set evaluates to TRUE for the change, then the change satisfies the negative rule set, and the capture process evaluates the change against its positive rule set. If one or more rules in the positive rule set evaluate to TRUE for the change, then the change satisfies the positive rule set. If no rule in the positive rule set evaluates to TRUE for the change, then the change is discarded. If a capture process has only one rule set, then it evaluates changes against that rule set only.

A running capture process completes the following series of actions to capture changes:

  1. Finds changes in the redo log.
  2. Performs prefiltering of the changes in the redo log. During this step, a capture process evaluates rules in its rule sets at the object level and schema level to place changes found in the redo log into two categories: changes that should be converted into LCRs and changes that should not be converted into LCRs.

    Prefiltering is a safe optimization done with incomplete information. This step identifies relevant changes to be processed subsequently, such that:

    • A capture process converts a change into an LCR if the change satisfies the capture process rule sets.
    • A capture process does not convert a change into an LCR if the change does not satisfy the capture process rule sets.
    • Regarding MAYBE evaluations, the capture process may not have enough information during prefiltering to determine whether a change will definitely satisfy both of its rule sets. Evaluation proceeds as follows:
      • If the change evaluates to MAYBE against both the positive and negative rule sets, then the change is converted into an LCR for further evaluation.
      • If the change evaluates to FALSE against the negative rule set and MAYBE against the positive rule set, then the change is converted into an LCR for further evaluation.
      • If the change evaluates to MAYBE against the negative rule set and TRUE against the positive rule set, then the change is converted into an LCR for further evaluation.
      • If the change evaluates to TRUE against the negative rule set and MAYBE against the positive rule set, then the change is discarded.
      • If the change evaluates to MAYBE against the negative rule set and FALSE against the positive rule set, then the change is discarded.
  3. Converts changes that satisfy, or may satisfy, the capture process rule sets into LCRs based on prefiltering.
  4. Performs LCR filtering. During this step, a capture process evaluates rules regarding information in each LCR to separate the LCRs into two categories: LCRs that should be enqueued and LCRs that should be discarded.
  5. Discards the LCRs that should not be enqueued because they did not satisfy the capture process rule sets.
  6. Enqueues the remaining captured LCRs into the queue associated with the capture process.

For example, suppose the following rule is defined in the positive rule set for a capture process: Capture changes to the hr.employees table where the department_id is 50. No other rules are defined for the capture process, and the parallelism parameter for the capture process is set to 1.
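One way to create a rule of this kind is with the and_condition parameter of the DBMS_STREAMS_ADM.ADD_TABLE_RULES procedure. The following sketch assumes a queue named strmadmin.streams_queue; the specified condition is appended to the system-created rule condition with an AND operator, and :lcr is converted to :dml for a DML rule:

BEGIN
  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name    => 'hr.employees',
    streams_type  => 'capture',
    streams_name  => 'strm01_capture',
    queue_name    => 'strmadmin.streams_queue',
    include_dml   => true,
    include_ddl   => false,
    and_condition => ':lcr.GET_VALUE(''NEW'', ''DEPARTMENT_ID'').AccessNumber() = 50');
END;
/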

Given this rule, suppose an UPDATE statement on the hr.employees table changes 50 rows in the table. The capture process performs the following series of actions for each row change:

  1. Finds the next change resulting from the UPDATE statement in the redo log.
  2. Determines that the change resulted from an UPDATE statement to the hr.employees table and must be captured. If the change was made to a different table, then the capture process ignores the change.
  3. Captures the change and converts it into an LCR.
  4. Filters the LCR to determine whether it involves a row where the department_id is 50.
  5. Either enqueues the LCR into the queue associated with the capture process if it involves a row where the department_id is 50, or discards the LCR if it involves a row where the department_id is not 50 or is missing.

Figure 2-6 illustrates capture process rule evaluation in a flowchart.

Figure 2-6 Flowchart Showing Capture Process Rule Evaluation


Persistent Capture Process Status Upon Database Restart

A capture process maintains a persistent status when the database running the capture process is shut down and restarted. For example, if a capture process is enabled when the database is shut down, then the capture process automatically starts when the database is restarted. Similarly, if a capture process is disabled or aborted when a database is shut down, then the capture process is not started and retains the disabled or aborted status when the database is restarted.
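For example, you can check the persistent status of each capture process with the following query, and you can restart a disabled capture process explicitly with the DBMS_CAPTURE_ADM.START_CAPTURE procedure:

-- STATUS is ENABLED, DISABLED, or ABORTED
SELECT CAPTURE_NAME, STATUS FROM DBA_CAPTURE;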