Oracle® Text Reference
10g Release 1 (10.1)

Part Number B10730-01

B Supported Document Formats

This appendix lists the document formats supported by the Inso filtering technology. It covers the following topics:

  • About Document Filtering Technology

  • Supported Document Formats

  • Restrictions on Format Support

B.1 About Document Filtering Technology

Oracle Text uses document filtering technology licensed from Stellent Chicago, Inc. With this technology, you can index most document formats and convert documents to HTML for presentation with the CTX_DOC package. The software is based in part on the work of the Independent JPEG Group.


See Also:

For a list of supported formats, see "Supported Document Formats".

To use Inso filtering for indexing and DML processing, you must specify the INSO_FILTER object in your filter preference.

To use Inso filtering technology for converting documents to HTML with the CTX_DOC package, you need not use the INSO_FILTER indexing preference, but you must still set up your environment to use this filtering technology as described in this appendix.

To convert documents to HTML format, Inso filtering technology relies on shared libraries and data files licensed from Stellent Chicago, Inc.

The following sections discuss the supported platforms and how to enable Inso filtering on the different platforms.

B.1.1 Latest Updates for Patch Releases

The supported platforms and formats listed in this appendix apply to this release. The list of supported formats is updated for patch releases. For the latest list, refer to the Oracle Technology Network:

http://otn.oracle.com/products/text.101/content.html

B.1.2 Supported Platforms

Several platforms can take advantage of Inso filter technology.

B.1.2.1 Supported Platforms

Inso filter technology is supported on the following platforms:

  • Sun Solaris on SPARC 32-bit and 64-bit (6 - 9.0)

  • IBM AIX 32-bit and 64-bit (4.3, 5.1, 5.2)

  • HP-UX 32-bit and 64-bit (10.0 - 11.0)

  • Red Hat Linux on Intel x86 (7.1, 7.2, 8.0, 9.0)

  • SuSE Linux on Intel x86 (7.x and 8.x)

  • Microsoft Windows (32-bit)

    • Windows NT (4.0 and above)

    • Windows 95

    • Windows 98

    • Windows 98SE

    • Windows ME

    • Windows 2000

    • Windows XP

    • Windows 2003

  • Microsoft Windows (64-bit)

    • Windows .Net Server 2003 Enterprise Edition

B.1.3 Environment Variables

All environment variables related to Inso filtering must be made visible to Oracle Text.

B.1.4 Requirements for UNIX Platforms

The following requirements apply to Solaris, IBM AIX, HP-UX, and Linux platforms:

  • Set the $HOME environment variable so that Inso technology can write files to the .oit subdirectory of the $HOME directory.
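
The $HOME requirement above can be checked before indexing with a short script. The following Python sketch is illustrative only and is not part of Oracle Text: the function name check_inso_home is hypothetical, and the check is only meaningful if the script runs as the same operating system user, with the same environment, as the Oracle processes that invoke the filter. It reports whether $HOME is set, points to a directory, and is writable (or, if the .oit subdirectory already exists, whether that subdirectory is writable).

```python
import os
from pathlib import Path


def check_inso_home(env=None):
    """Return (ok, message) describing whether $HOME looks usable.

    Hypothetical helper for illustration; not an Oracle-supplied tool.
    `env` defaults to the current process environment.
    """
    env = os.environ if env is None else env

    home = env.get("HOME")
    if not home:
        return False, "HOME is not set"

    home_dir = Path(home)
    if not home_dir.is_dir():
        return False, f"HOME does not point to a directory: {home}"

    # The filter writes to the .oit subdirectory of $HOME. If it already
    # exists, test it directly; otherwise test $HOME itself, since the
    # subdirectory would have to be created there.
    oit_dir = home_dir / ".oit"
    target = oit_dir if oit_dir.is_dir() else home_dir

    # Writing into a directory needs both write and search permission.
    if not os.access(target, os.W_OK | os.X_OK):
        return False, f"no write permission on {target}"

    return True, f"{target} is writable"


if __name__ == "__main__":
    ok, message = check_inso_home()
    print(("OK: " if ok else "PROBLEM: ") + message)
```

The check deliberately does not create the .oit subdirectory itself, so running it leaves the environment unchanged.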

B.2 Supported Document Formats

The tables in this section list the document formats that Oracle Text supports for filtering. Document filtering is used for indexing, DML processing, and converting documents to HTML with the CTX_DOC package. This filtering technology is based on Outside In HTML Export and Outside In Viewer Technology, licensed from Stellent Chicago, Inc.


Note:

These lists do not cover every format that Oracle Text is able to process. The external filter framework enables Oracle Text to process any document format, provided an external filter exists that can convert that format to text.

B.2.1 Word Processing Formats - Generic Text

Format Version
ASCII Text 7- & 8-bit
ANSI Text 7- & 8-bit
Unicode Text All versions
HTML Versions through 3.0 (some limitations)
IBM Revisable Form Text All versions
IBM FFT All versions
Microsoft Rich Text Format (RTF) All versions
WML Version 5.2

B.2.2 Word Processing Formats - DOS

Format Version
DEC WPS Plus (WPL) Versions through 4.1
DEC WPS Plus (DX) Versions through 4.0
DisplayWrite 2 & 3 (TXT) All versions
DisplayWrite 4 & 5 Versions through Release 2.0
Enable Versions 3.0, 4.0 and 4.5
First Choice Versions through 3.0
Framework Version 3.0
IBM Writing Assistant Version 1.01
Lotus Manuscript Version 2.0
MASS11 Versions through 8.0
Microsoft Word Versions through 6.0
Microsoft Works Versions through 2.0
MultiMate Versions through 4.0
Navy DIF All versions
Nota Bene Version 3.0
Novell WordPerfect Versions through 6.1
Office Writer Versions 4.0 to 6.0
PC-File Letter Versions through 5.0
PC-File+ Letter Versions through 3.0
PFS:Write Versions A, B, and C
Professional Write Versions through 2.1
Q&A Version 2.0
Samna Word Versions through Samna Word IV+
SmartWare II Version 1.02
Sprint Versions through 1.0
Total Word Version 1.2
Volkswriter 3 & 4 Versions through 1.0
Wang PC (IWP) Versions through 2.6
WordMARC Versions through Composer Plus
WordStar Versions through 7.0
WordStar 2000 Versions through 3.0
XyWrite Versions through III Plus

B.2.3 Word Processing Formats - Windows

Format Version
Hangul Version 97
Novell/Corel WordPerfect for Windows Versions through 10
JustWrite Versions through 3.0
JustSystems Ichitaro Versions 5.0, 6.0, 8.0, 9.0, and 10.0
Legacy Versions through 1.1
Lotus AMI/AMI Professional Versions through 3.1
Lotus WordPro (Non-32-bit-Windows platforms are Text-only) Versions 96 through Millennium Edition 9.6
Microsoft Works for Windows Versions through 4.0
Microsoft Windows Write Versions through 3.0
Microsoft Word for Windows Versions through 2002
Microsoft WordPad All versions
Novell Perfect Works Version 2.0
Professional Write Plus Version 1.0
Q&A Write for Windows Version 3.0
StarOffice Writer for Windows and UNIX (Text only) Version 5.2
WordStar for Windows Version 1.0
Adobe FrameMaker (MIF) Version 6.0

B.2.4 Word Processing Formats - Macintosh

Format Version
Microsoft Word for Mac Versions 3.0 - 4.0, 98, 2001
Novell WordPerfect Versions 1.02 through 3.0
Microsoft Works for Mac Versions through 2.0
MacWrite II Version 1.1

B.2.5 Spreadsheet Formats

Format Version
Enable Versions 3.0, 4.0 and 4.5
First Choice Versions through 3.0
Framework Version 3.0
Lotus 1-2-3 (DOS & Windows) Versions through 5.0
Lotus 1-2-3 for SmartSuite Versions 97 - Millennium 9.6
Lotus 1-2-3 Charts (DOS & Windows) Versions through 5.0
Lotus 1-2-3 (OS/2) Versions through 2.0
Lotus Symphony Versions 1.0, 1.1 and 2.0
Microsoft Excel Windows Versions 2.2 through 2002
Microsoft Excel Macintosh Versions 3.0 - 4.0, 98 and 2001
Microsoft Excel Charts Versions 2.x - 7.0
Microsoft Multiplan Version 4.0
Microsoft Works for Windows Versions through 4.0
Microsoft Works (DOS) Versions through 2.0
Microsoft Works (Mac) Versions through 2.0
Mosaic Twin Version 2.5
Novell Perfect Works Version 2.0
Quattro Pro for DOS Versions through 5.0
Quattro Pro for Windows Versions through 10
PFS:Professional Plan Version 1.0
SuperCalc 5 Version 4.0
SmartWare II Version 1.02
StarOffice Calc for Windows and UNIX Version 5.2
VP Planner 3D Version 1.0

B.2.6 Database Formats

Format Version
Access Versions through 2.0
dBASE Versions through 5.0
DataEase Version 4.x
dBXL Version 1.3
Enable Versions 3.0, 4.0 and 4.5
First Choice Versions through 3.0
FoxBase Version 2.1
Framework Version 3.0
Microsoft Works for Windows Versions through 4.0
Microsoft Works (DOS) Versions through 2.0
Microsoft Works (Mac) Versions through 2.0
Paradox (DOS) Versions through 4.0
Paradox (Windows) Versions through 1.0
Personal R:BASE Version 1.0
R:BASE 5000 Versions through 3.1
R:BASE System V Version 1.0
Reflex Version 2.0
Q & A Versions through 2.0
SmartWare II Version 1.02

B.2.7 Display Formats

Format Version
PDF - Portable Document Format Adobe Acrobat Versions through 5.0 including Chinese (simplified and traditional), Japanese, Korean, and read-only PDF

Encrypted (password protected) PDF is not supported.

PDF containing embedded fonts without included character mapping is partially supported: characters that are represented by means of embedded fonts without included character mapping show up as meaningless output; however, all remaining characters (if any) in such a PDF document are still filtered correctly.


B.2.8 Presentation Formats

Format Version
Corel/Novell Presentations Versions through 10
Harvard Graphics for DOS Versions 2.x & 3.x
Harvard Graphics for Windows Windows versions
Freelance for Windows Versions through Millennium 9.6
Freelance for OS/2 Versions through 2.0
Microsoft PowerPoint for Windows Versions 3.0 through 2002
Microsoft PowerPoint for Macintosh Versions 4.0 and 2001
StarOffice Impress for Windows and UNIX Version 5.2

B.2.9 Graphic Formats

The following table lists the graphic formats that the INSO filter recognizes. Indexing a text column that contains any of these formats produces no error, so it is safe for the column to contain them.


Note:

The INSO filter cannot extract textual information from graphics.

Table B-1 Supported Graphics Formats for INSO Filter

Graphics Format Version
Adobe Photoshop (PSD) Version 4.0
Adobe Illustrator Versions through 7.0, 9.0
Adobe FrameMaker graphics (FMV) Vector/raster through 5.0
Ami Draw (SDW) Ami Draw
AutoCAD Interchange and Native Drawing formats (DXF and DWG) AutoCAD Drawing Versions 2.5-2.6, 9.0 - 14.0, 2000i and 2002
AutoShade Rendering (RND) Version 2.0
Binary Group 3 Fax All versions
Bitmap (BMP, RLE, ICO, CUR, OS/2 DIB & WARP) No specific version
CALS Raster (GP4) Type I and Type II
Corel Clipart format (CMX) Versions 5 through 6
Corel Draw (CDR) Versions 6.0 - 8.0
Corel Draw (CDR with TIFF header) Versions 2.0 - 9.0
Computer Graphics Metafile (CGM) ANSI, CALS NIST version 3.0
Encapsulated PostScript (EPS) TIFF header only
Graphics Environment Manager (GEM) Bitmap & vector
GEM Paint (IMG) No specific version
Graphics Interchange Format (GIF) No specific version
Hewlett Packard Graphics Language (HPGL) Version 2
IBM Graphics Data Format (GDF) Version 1.0
IBM Picture Interchange Format (PIF) Version 1.0
Initial Graphics Exchange Spec (IGES) Version 5.1
JFIF (JPEG not in TIFF format) All versions
JPEG (Including EXIF) No specific version
Kodak Flash Pix (FPX) No specific version
Kodak Photo CD (PCD) Version 1.0
Lotus Snapshot All versions
Lotus PIC No specific version
Macintosh PICT1 & PICT2 Bitmap only
MacPaint (PNTG) No specific version
Micrografx Draw (DRW) Versions through 4.0
Micrografx Designer (DRW) Versions through 3.1
Micrografx Designer (DSF) Windows 95, version 6.0
Novell PerfectWorks (Draw) Version 2.0
OS/2 PM Metafile (MET) Version 3.0
Paint Shop Pro 6 (PSP) (Windows platform only) Versions 5.0 - 6.0
PC Paintbrush (PCX and DCX) No specific version
Portable Bitmap (PBM) All versions
Portable Graymap (PGM) No specific version
Portable Network Graphics (PNG) Version 1.0
Portable Pixmap (PPM) No specific version
Postscript (PS) Level II
Progressive JPEG No specific version
Sun Raster (SRS) No specific version
TIFF Versions through 6
TIFF CCITT Group 3 & 4 Versions through 6
Truevision TGA (TARGA) Version 2
Visio (Preview) Version 4
Visio Versions 5, 2000 and 2002
WBMP No specific version
Windows Enhanced Metafile (EMF) No specific version
Windows Metafile (WMF) No specific version
WordPerfect Graphics (WPG & WPG2) Versions through 2.0
X-Windows Bitmap (XBM) x10 compatible
X-Windows Dump (XWD) x10 compatible
X-Windows Pixmap (XPM) x10 compatible

B.2.10 Other Document Formats

Format Version
Executable (EXE, DLL) No specific version
Executable for Windows NT No specific version
Microsoft Project (Text only) Version 98
Microsoft Outlook Message (MSG): (Text only) No specific version
vCard Version 2.1

B.3 Restrictions on Format Support

Password-protected documents and documents with password-protected content are not supported by the Inso filter.