A Troubleshooting

This appendix explains how to use trace and log files in Oracle Real Application Clusters (RAC). It also includes information about the alert file that is specific to each instance. The topics in this appendix are:

Monitoring Trace Files in Real Application Clusters
Using Log Files in Real Application Clusters
Enabling Additional Tracing for Real Application Clusters High Availability
Using Instance-Specific Alert Files in Real Application Clusters

Monitoring Trace Files in Real Application Clusters

Oracle records information about important events that occur in your RAC environment in trace files. The trace files for RAC are the same as those in single-instance Oracle databases. As a best practice, monitor trace files frequently and back them up regularly for all instances. Doing so preserves their content for future troubleshooting.
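The backup practice described above could be scripted along the following lines. This is a minimal sketch, assuming that the trace files end in .trc and sit directly in a single dump directory. The function name archive_traces and the archive naming scheme are hypothetical and are not part of any Oracle tool.

```shell
# Archive every *.trc file in a dump directory into a timestamped tarball.
# archive_traces is a hypothetical helper, not an Oracle-supplied command.
# Usage: archive_traces <dump_dir> <backup_dir>
archive_traces() (
    dump_dir=$1
    mkdir -p "$2" || exit 1
    # Resolve the backup directory to an absolute path before changing directory.
    backup_dir=$(cd "$2" && pwd) || exit 1
    archive="$backup_dir/traces_$(date +%Y%m%d%H%M%S).tar.gz"

    cd "$dump_dir" || exit 1

    # Stop early if there is nothing to back up.
    set -- *.trc
    [ -e "$1" ] || { echo "no trace files in $dump_dir" >&2; exit 1; }

    # Create the archive and print its path on success.
    tar -czf "$archive" "$@" && echo "$archive"
)
```

You would run it once for each instance, for example `archive_traces "$ORACLE_BASE/admin/db_name/bdump" /backup/traces`, where the backup path is a placeholder.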

Where to Find Files for Analyzing Errors

Information about ORA-600 errors appears in the alert_SID.log file. For troubleshooting, you may also need to provide files from these bdump locations:

$ORACLE_HOME/admin/db_name/bdump on UNIX-based systems
%ORACLE_HOME%\admin\db_name\bdump on Windows-based systems

Some files may also be in the udump directory.
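To locate the relevant entries quickly, you could search the alert file for the error code. The following is a minimal sketch. The function name find_ora600 is hypothetical, and the pattern assumes that the error is written either as ORA-600 or in the zero-padded form ORA-00600.

```shell
# Print alert-file lines (with line numbers) that mention ORA-600.
# find_ora600 is a hypothetical helper name.
# Usage: find_ora600 <path to alert_SID.log>
find_ora600() {
    # Match ORA-600 and the zero-padded ORA-00600,
    # but not longer codes such as ORA-6000.
    grep -n -E 'ORA-0*600([^0-9]|$)' "$1"
}
```

The line numbers in the output show where to look in the alert file for the surrounding context, such as the names of any trace files written at the same time.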

Background Thread Trace Files in Real Application Clusters

RAC background threads use trace files to record database operations and database errors. These trace files help you troubleshoot problems and enable Oracle Support to debug cluster database configuration problems more efficiently.

Background thread trace files are created regardless of whether the BACKGROUND_DUMP_DEST parameter is set in the server parameter file (SPFILE). If you set BACKGROUND_DUMP_DEST, then the trace files are stored in the directory specified. If you do not set the parameter, then the trace files are stored as follows:

$ORACLE_BASE/admin/db_name/bdump on UNIX-based systems
%ORACLE_BASE%\admin\db_name\bdump on Windows-based systems
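The lookup rule above can be expressed as a small helper for UNIX-based systems. This is a sketch only: bdump_dir is a hypothetical name, and the caller is assumed to pass in the current BACKGROUND_DUMP_DEST value (or nothing if the parameter is not set), because the script does not query the database itself.

```shell
# Print the directory where background trace files are expected on UNIX.
# bdump_dir is a hypothetical helper; the caller supplies the
# BACKGROUND_DUMP_DEST value, or omits it if the parameter is not set.
# Usage: bdump_dir <db_name> [background_dump_dest]
bdump_dir() {
    if [ -n "${2:-}" ]; then
        # BACKGROUND_DUMP_DEST is set: trace files go there.
        printf '%s\n' "$2"
    elif [ -n "${ORACLE_BASE:-}" ]; then
        # Parameter not set: use the default location.
        printf '%s\n' "$ORACLE_BASE/admin/$1/bdump"
    else
        echo "ORACLE_BASE is not set" >&2
        return 1
    fi
}
```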

The Oracle database creates a different trace file for each background thread. On UNIX-based systems, the trace files for the background processes are named SID_<process name>_<process identifier>.trc, for example:

SID_dbwr_1202.trc
SID_smon_4560.trc
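When you sort through many trace files, it can help to split each name into its parts. The following sketch does this with shell parameter expansion. The function name parse_trace_name is hypothetical, and the code assumes that the name is well formed, with the process name and process identifier as the last two underscore-separated fields.

```shell
# Split a background trace file name of the form
# SID_<process name>_<process identifier>.trc into its three parts.
# parse_trace_name is a hypothetical helper name.
parse_trace_name() (
    base=${1##*/}          # drop any leading directory
    base=${base%.trc}      # drop the extension
    pid=${base##*_}        # text after the last underscore
    rest=${base%_*}        # everything before the last underscore
    proc=${rest##*_}       # text after the second-to-last underscore
    sid=${rest%_*}         # everything before that
    printf 'sid=%s process=%s pid=%s\n' "$sid" "$proc" "$pid"
)
```

For example, `parse_trace_name SID_dbwr_1202.trc` prints `sid=SID process=dbwr pid=1202`.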

User Process Trace Files in Real Application Clusters

Trace files are created for user processes if you set the USER_DUMP_DEST parameter in the initialization parameter file. The trace files for the user processes have the form oraxxxxx.trc, where xxxxx is a 5-digit number indicating the process identifier (PID) on UNIX-based systems or the thread number on Windows-based systems.
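The naming pattern above translates directly into a shell glob. The sketch below lists only files whose names consist of ora, exactly five digits, and .trc. The function name list_user_traces is hypothetical, and the directory argument stands in for whatever USER_DUMP_DEST points to.

```shell
# List user-process trace files (oraxxxxx.trc, xxxxx = five digits)
# found directly in the given directory, in name order.
# list_user_traces is a hypothetical helper name.
# Usage: list_user_traces <user_dump_dest>
list_user_traces() (
    cd "$1" || exit 1
    for f in ora[0-9][0-9][0-9][0-9][0-9].trc; do
        # Skip the unexpanded pattern when nothing matches.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
)
```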

Using Log Files in Real Application Clusters

RAC provides several types of log files that record processing information as described in this section.

Clusterware Log Files

The following sections describe the locations of the clusterware log files.

Cluster Ready Services Log Files

Cluster Ready Services (CRS) has daemon processes that generate log information. Log files for the CRS daemon (crsd) can be found in the following directories:

<CRS Home>/crs/init
<CRS Home>/crs/<node name>.log

Oracle Cluster Registry Log Files

The Oracle Cluster Registry (OCR) records log information in the following location:

<CRS Home>/srvm/log/

Cluster Synchronization Services (CSS) Log Files

You can find CSS information that the OCSSD generates in log files in the following locations:

<CRS Home>/css/log/ocssd<number>.log
<CRS Home>/css/init/<node_name>.log

Event Manager Log Files

Event Manager (EVM) information generated by evmd is recorded in log files in the following locations:

<CRS Home>/evm/log/evmdaemon.log
<CRS Home>/evm/init/<node_name>.log

Oracle High Availability Log Files

The Oracle RAC high availability trace files are located in:

$ORACLE_BASE/<database_name>/admin/hdump

This location applies when $ORACLE_BASE is configured. When $ORACLE_BASE is not available, the trace files are located in $ORACLE_HOME/racg/log.
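The choice between the two high availability trace locations can be sketched as follows. The function name ha_trace_dir is hypothetical, and the code assumes that "available" means the ORACLE_BASE environment variable is set and not empty in the current shell.

```shell
# Print the expected high availability trace directory.
# ha_trace_dir is a hypothetical helper name.
# Usage: ha_trace_dir <database_name>
ha_trace_dir() {
    if [ -n "${ORACLE_BASE:-}" ]; then
        printf '%s\n' "$ORACLE_BASE/$1/admin/hdump"
    elif [ -n "${ORACLE_HOME:-}" ]; then
        # Fallback when ORACLE_BASE is not available.
        printf '%s\n' "$ORACLE_HOME/racg/log"
    else
        echo "neither ORACLE_BASE nor ORACLE_HOME is set" >&2
        return 1
    fi
}
```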

Enabling Additional Tracing for Real Application Clusters High Availability

When working with Oracle Support, you may be asked to enable tracing to capture additional information for problem resolution. Because the procedures described in this section may adversely affect performance, perform them only with the assistance of Oracle Support.

Generating Additional Trace Information for a Running Resource

To generate additional trace information for a running resource, set the resource attribute _USR_ORA_DEBUG to the value 1 using one of the following two methods:

For an individual resource, modify the resource profile (a text file named <resource_name>.cap) that you can generate with the following command:
```
crs_stat -p <resource_name> > \
<CRS Home>/crs/profile/<resource_name>.cap
```
Only create cap files in this directory for node-level resources. Use the <CRS Home>/crs/public/ directory for other resources. Then set the _USR_ORA_DEBUG value to 1 in this cap file and issue the following command:
```
crs_register -u <resource_name>
```
For all resources, add the following lines to the script, racgwrap, in $ORACLE_HOME/bin:
```
_USR_ORA_DEBUG=1
export _USR_ORA_DEBUG
```

Only node-level resources (VIP, Enterprise Manager, and so on) should have their cap file created in the directory <CRS Home>/crs/profile.
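The edit to the cap file in the first method could be made with a short script instead of a text editor. The following sketch assumes that the profile holds one NAME=value pair per line and that a _USR_ORA_DEBUG line is already present. Neither assumption is confirmed here, so inspect the generated file first. The function name enable_resource_debug is hypothetical.

```shell
# Set _USR_ORA_DEBUG=1 in a resource profile (.cap) file.
# Assumes one NAME=value pair per line (an assumption about the format);
# enable_resource_debug is a hypothetical helper name.
# Usage: enable_resource_debug <path to resource_name.cap>
enable_resource_debug() (
    cap=$1
    tmp="$cap.tmp.$$"
    # Refuse to guess if the attribute line is not already in the file.
    grep -q '^_USR_ORA_DEBUG=' "$cap" || {
        echo "_USR_ORA_DEBUG not found in $cap" >&2
        exit 1
    }
    # Rewrite the value in a temporary copy, then replace the original.
    sed 's/^_USR_ORA_DEBUG=.*/_USR_ORA_DEBUG=1/' "$cap" > "$tmp" &&
        mv "$tmp" "$cap"
)
```

After the file is updated, register the change with the crs_register command shown above.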

Verifying Event Manager Daemon Communications

The event manager daemons (evmd) running on separate nodes communicate through specific ports. To determine whether the evmd for a node can send and receive messages, perform the test described in this section while running session 1 in the background.

On node 1, in session 1, enter:

$ evmwatch -A -t "@timestamp @@"

On node 2, in session 2, enter:

$ evmpost -u "hello" [-h nodename]

Session 1 should show output similar to the following:

$ 21-Aug-2002 08:04:26 hello

Ensure that each node can both send and receive messages by executing this test in several permutations.
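To keep track of the permutations, you could generate a checklist of every ordered pair of nodes. The sketch below only prints the pairings and does not run evmwatch or evmpost itself. The function name evm_test_plan and the node names in the example are hypothetical.

```shell
# Print one line for every ordered pair of distinct nodes, so that each
# node is tested both as the watcher (evmwatch) and as the poster (evmpost).
# evm_test_plan is a hypothetical helper; it only prints a checklist.
# Usage: evm_test_plan <node>...
evm_test_plan() {
    for watcher in "$@"; do
        for poster in "$@"; do
            if [ "$watcher" != "$poster" ]; then
                printf 'watch on %s, post from %s\n' "$watcher" "$poster"
            fi
        done
    done
}
```

For a cluster of n nodes this yields n × (n − 1) tests. For example, `evm_test_plan node1 node2` prints two lines, one for each direction.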

Enabling Additional Debugging Information for Cluster Ready Services

The crsd daemon can produce additional debugging information if you set the variable CRS_DEBUG to the value 1 by performing the following procedures:

In the file /etc/init.d/init.crsd add the entry:

CRS_DEBUG=1
export CRS_DEBUG

Then kill the crsd daemon with the command:

$ kill -9 <crsd process id>

Allow the init process to restart the crsd. Oracle will write additional information to the standard log files.
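Before you kill the daemon, it may be worth confirming that both entries are in place in the init script. The following check is a sketch. The function name has_crs_debug is hypothetical, and it assumes that each entry is on its own line, as shown above.

```shell
# Succeed only if the given init script contains both CRS_DEBUG entries.
# has_crs_debug is a hypothetical helper name.
# Usage: has_crs_debug /etc/init.d/init.crsd
has_crs_debug() {
    grep -q '^[[:space:]]*CRS_DEBUG=1[[:space:]]*$' "$1" &&
        grep -q '^[[:space:]]*export[[:space:]][[:space:]]*CRS_DEBUG' "$1"
}
```

A typical use is `has_crs_debug /etc/init.d/init.crsd && echo "CRS_DEBUG is enabled"`.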

Enabling Tracing for Java-Based Tools and Utilities in Real Application Clusters

All of the Java-based tools and utilities available in RAC are invoked by executing scripts of the same name as the tool or utility. This includes the Database Configuration Assistant (DBCA), the Net Configuration Assistant (NetCA), the Virtual Internet Protocol Configuration Assistant (VIPCA), Service Control (SRVCTL), and the Global Services Daemon (GSD). For example, to run the DBCA, enter the command dbca.

To enable additional debugging information, edit the command file (such as dbca in this example) and add the following parameters to the $JRE_DIR/bin/jre command line:

-DTRACING.ENABLED=true -DTRACING.LEVEL=2

For example, the file $ORACLE_HOME/bin/dbca contains the line:

$JRE_DIR/bin/jre -DORACLE_HOME=$OH -DJDBC_PROTOCOL=thin -mx64 -classpath $CLASSPATH oracle.sysman.dbca.Dbca $ARGUMENTS

Change this line to:

$JRE_DIR/bin/jre -DORACLE_HOME=$OH -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -DJDBC_PROTOCOL=thin -mx64 -classpath $CLASSPATH oracle.sysman.dbca.Dbca $ARGUMENTS

When you run the DBCA, trace information is recorded in the DBCA log file.
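The same change can be made with sed rather than by hand. The sketch below writes a modified copy of the script to standard output so that the original stays untouched. It assumes that the launcher invokes $JRE_DIR/bin/jre followed by a space, and it places the tracing flags immediately after the jre command instead of after -DORACLE_HOME, on the basis that the order of -D options does not matter. The function name add_tracing_flags is hypothetical.

```shell
# Write a copy of a launcher script with the tracing flags added
# to its $JRE_DIR/bin/jre command line. Lines that already carry
# -DTRACING.ENABLED are left unchanged, so the edit is repeatable.
# add_tracing_flags is a hypothetical helper name.
# Usage: add_tracing_flags <script> > <script>.trace
add_tracing_flags() {
    sed '/-DTRACING\.ENABLED=/!s|\$JRE_DIR/bin/jre |&-DTRACING.ENABLED=true -DTRACING.LEVEL=2 |' "$1"
}
```

Review the generated copy before using it in place of the original script, and keep the original so that you can turn tracing off again.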

To debug SRVCTL, add the argument -D to the command line. For example, to generate tracing information for an add database operation, execute the following command:

$ srvctl -D add database -d mydatabase

Using Instance-Specific Alert Files in Real Application Clusters

Each instance in the cluster database has one alert file. The alert file for each instance, alert_SID.log, contains important information about error messages and exceptions that occur during database operations. Information is appended to the alert file each time you start the instance. All process threads can write to the alert file for the instance.

The alert_SID.log file is in the directory specified by the BACKGROUND_DUMP_DEST parameter in the initdb_name.ora initialization parameter file. If you do not set the BACKGROUND_DUMP_DEST parameter, the alert_SID.log file is generated in the following locations:

$ORACLE_BASE/admin/db_name/bdump on UNIX-based systems
%ORACLE_BASE%\admin\db_name\bdump on Windows-based systems
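Because every instance has its own alert file, a quick check that each expected file is present can save time during troubleshooting. The following sketch assumes that the alert files for the listed SIDs are all reachable under one directory from the node where you run it, which may not hold if each node keeps its dump directory on local storage. The function name check_alert_logs is hypothetical.

```shell
# Report, for each SID, whether alert_<SID>.log exists in the dump directory.
# check_alert_logs is a hypothetical helper name.
# Usage: check_alert_logs <dump_dir> <SID>...
check_alert_logs() (
    dir=$1
    shift
    for sid in "$@"; do
        f="$dir/alert_$sid.log"
        if [ -f "$f" ]; then
            printf 'found   %s\n' "$f"
        else
            printf 'missing %s\n' "$f"
        fi
    done
)
```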

Resolving Pending Shutdown Issues

In some situations, a SHUTDOWN IMMEDIATE may be pending and Oracle does not respond quickly to repeated shutdown requests, because Cluster Ready Services (CRS) may still be processing the current shutdown request. In such cases, issue a SHUTDOWN ABORT using SQL*Plus for subsequent shutdown requests.