
How to use the Qt SAX2 classes


For a general discussion of the XML topics in Qt and an introduction to SAX2 see the document "Qt XML Module".

Quick start

Before reading this section you should have read the "Introduction to SAX2".

In this section we present a small example of a reader that outputs the names of all start tags of an XML document. The start tags are indented according to their nesting level.

First, implement the functions of the handler classes you are interested in. In our case there are only three: QXmlContentHandler::startDocument(), QXmlContentHandler::startElement() and QXmlContentHandler::endElement(). For this purpose we subclass QXmlDefaultHandler (remember that the handler classes are abstract, and the default handler class provides a default implementation that does not change the parsing behavior):

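#include <iostream.h>   // assumed source of cout and endl (pre-standard header)
#include <qstring.h>
#include <qxml.h>

class StructureParser : public QXmlDefaultHandler
{
public:
    // Called once at the start of the document: reset the indentation.
    bool startDocument()
    {
        indent = "";
        return TRUE;
    }
    // Called for every start tag: print its qualified name, then indent one level deeper.
    bool startElement( const QString&, const QString&, const QString& qName, const QXmlAttributes& )
    {
        cout << indent << qName << endl;
        indent += " ";
        return TRUE;
    }
    // Called for every end tag: go back one indentation level.
    bool endElement( const QString&, const QString&, const QString& )
    {
        indent.remove( 0, 1 );
        return TRUE;
    }
private:
    QString indent;   // one space per nesting level
};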

The next step is to make an instance of the handler. Then you create a QXmlInputSource for the XML that should be parsed. After that set up the reader (in our case we simply have to set the content handler) and start the parsing:

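#include <qfile.h>
#include <qxml.h>

int main( int argc, char **argv )
{
    // Parse every file named on the command line, each with a fresh handler.
    for ( int i=1; i<argc; i++ ) {
        StructureParser handler;
        QFile xmlFile( argv[i] );
        QXmlInputSource source( xmlFile );
        QXmlSimpleReader reader;
        reader.setContentHandler( &handler );
        reader.parse( source );   // the return value is ignored here
    }
    return 0;
}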

Consider the following XML file:

<animals>
<mammals>
  <monkeys> <gorilla/> <orang-utan/> </monkeys>
</mammals>
<birds> <pigeon/> <penguin/> </birds>
</animals>

The program will produce the following output:

animals
 mammals
  monkeys
   gorilla
   orang-utan
 birds
  pigeon
  penguin
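
To see why the indentation comes out this way, the handler logic can be replayed without Qt. The following is a minimal sketch in standard C++, not part of the Qt example: the Event type and the functions replay() and sampleEvents() are hypothetical helpers invented for this illustration, and the event list is written out by hand instead of being produced by a parser.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the SAX2 callbacks: one entry per
// startElement() or endElement() call the reader would make.
struct Event {
    bool start;          // true: startElement(), false: endElement()
    std::string qName;   // qualified name of the element
};

// Applies the same indentation logic as StructureParser and
// returns the text the handler would have printed.
std::string replay(const std::vector<Event>& events)
{
    std::string indent;  // empty, as after startDocument()
    std::string out;
    for (const Event& e : events) {
        if (e.start) {
            out += indent + e.qName + "\n";
            indent += " ";
        } else {
            indent.erase(0, 1);
        }
    }
    return out;
}

// The events for the animals file, in document order.
// An empty element such as <gorilla/> yields a start and an end event.
std::vector<Event> sampleEvents()
{
    return {
        {true, "animals"},
        {true, "mammals"},
        {true, "monkeys"},
        {true, "gorilla"},    {false, "gorilla"},
        {true, "orang-utan"}, {false, "orang-utan"},
        {false, "monkeys"},
        {false, "mammals"},
        {true, "birds"},
        {true, "pigeon"},     {false, "pigeon"},
        {true, "penguin"},    {false, "penguin"},
        {false, "birds"},
        {false, "animals"}
    };
}
```

Calling replay( sampleEvents() ) yields the eight lines listed above. Every start event adds one space and every end event removes one, so siblings such as gorilla and orang-utan end up at the same depth.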

This small example is incomplete in that it does no error handling. You should always install an error handler (with QXmlReader::setErrorHandler()) so that parsing errors can be reported to the user.

Namespaces

Namespaces are a concept introduced to XML to allow a more modular design. Details on namespaces can be found at http://www.w3.org/TR/REC-xml-names/.

Namespaces do not change the parsing behavior. They are only reported through the handler.

You have to declare the namespace first. After that you can apply the namespace to element names or attribute names.

Namespaces are declared like attributes; strictly speaking, they are attributes. You can bind the namespace prefix fnord to the namespace name http://trolltech.com/fnord/ with the attribute xmlns:fnord="http://trolltech.com/fnord/". (Namespace names are URI references. This does not mean that any data must be available at that address; they are simply unique names, nothing more.) There is also one default namespace, which is declared with the attribute xmlns.

Namespaces can be used for element names and attribute names by prepending the prefix with a ":" to the name. If an element name does not have a prefix, the default namespace is applied (the default namespace is not applied to attributes).

Example:

<fnord:document xmlns:fnord = 'http://trolltech.com/fnord/'
                xmlns       = 'http://trolltech.com/' >
    <element1 a = '42' fnord:b = '23'>Eris</element1>
    <fnord:element2 c = '42' fnord:d = '23'>Discordia</fnord:element2>
</fnord:document>

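In this example the elements and attributes have the following namespaces:

Name                        Namespace
fnord:document (element)    http://trolltech.com/fnord/
element1 (element)          http://trolltech.com/
a (attribute)               none
fnord:b (attribute)         http://trolltech.com/fnord/
fnord:element2 (element)    http://trolltech.com/fnord/
c (attribute)               none
fnord:d (attribute)         http://trolltech.com/fnord/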
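
The namespace of each name in the example follows mechanically from the rules given earlier. The following is a minimal sketch of that lookup in standard C++, not Qt code: the function namespaceOf() and its parameters are hypothetical names chosen for this illustration, and returning an empty string for an undeclared prefix is a simplification.

```cpp
#include <map>
#include <string>

// Returns the namespace name that applies to a qualified name,
// or an empty string if no namespace applies.
//   prefixes:  prefix -> namespace name, from xmlns:prefix declarations
//   defaultNs: namespace name from a plain xmlns declaration ("" if absent)
std::string namespaceOf(const std::string& qName, bool isAttribute,
                        const std::map<std::string, std::string>& prefixes,
                        const std::string& defaultNs)
{
    const std::string::size_type colon = qName.find(':');
    if (colon == std::string::npos) {
        // No prefix: the default namespace applies to elements only.
        return isAttribute ? std::string() : defaultNs;
    }
    // Prefix present: look up the namespace it was declared for.
    const std::map<std::string, std::string>::const_iterator it =
        prefixes.find(qName.substr(0, colon));
    return it == prefixes.end() ? std::string() : it->second;
}
```

With fnord mapped to http://trolltech.com/fnord/ and http://trolltech.com/ as the default namespace, the element element1 resolves to http://trolltech.com/, while its unprefixed attribute a resolves to no namespace at all.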

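The following terms are used to distinguish the parts of names with namespaces:

  1. The qualified name is the name as it appears in the XML document (in the example, fnord:document is a qualified name).
  2. The namespace prefix is the part of a qualified name to the left of the ":" (fnord in fnord:document).
  3. The local part is the part of a qualified name to the right of the ":" (document in fnord:document).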

If an element name does not contain a ":", there is no namespace prefix and the local part is the same as the qualified name (in the example, element1 has no namespace prefix, so its qualified name and its local part are both element1).
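
The relationship between qualified name, namespace prefix and local part can be sketched in a few lines of standard C++. This is an illustration only: splitQName() is a hypothetical helper, not a Qt function, and the reader does this splitting for you before it calls the handler.

```cpp
#include <string>
#include <utility>

// Splits a qualified name at the first ":" into
// (namespace prefix, local part). Without a ":" the prefix is
// empty and the local part equals the qualified name.
std::pair<std::string, std::string> splitQName(const std::string& qName)
{
    const std::string::size_type colon = qName.find(':');
    if (colon == std::string::npos)
        return std::make_pair(std::string(), qName);
    return std::make_pair(qName.substr(0, colon), qName.substr(colon + 1));
}
```

For fnord:document this gives the prefix fnord and the local part document; for element1 it gives an empty prefix and the local part element1.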

You can configure how the reader processes namespaces. This is done with the features http://xml.org/sax/features/namespaces and http://xml.org/sax/features/namespace-prefixes (for details on features see the section "Features and properties").

There are four reporting behaviors that are influenced by these features:

  1. The namespace prefix and local part of elements and attributes are reported.
  2. The qualified name of elements and attributes is reported.
  3. The startPrefixMapping() and endPrefixMapping() functions of the QXmlContentHandler are called.
  4. Attributes that declare namespaces (i.e. the attribute xmlns and attributes starting with xmlns:) are reported.

SAX2 requires the following behavior:

namespaces  namespace-prefixes  Namespace prefix and local part  Qualified names  Prefix mapping  xmlns attributes
TRUE        FALSE               Yes                              Unknown          Yes             No
TRUE        TRUE                Yes                              Yes              Yes             Yes
FALSE       TRUE                Unknown                          Yes              Unknown         Yes

Setting both features (namespaces and namespace-prefixes) to FALSE is an illegal combination.

QXmlSimpleReader implements the following behavior:

namespaces  namespace-prefixes  Namespace prefix and local part  Qualified names  Prefix mapping  xmlns attributes
TRUE        FALSE               Yes                              Yes              Yes             No
TRUE        TRUE                Yes                              Yes              Yes             Yes
FALSE       TRUE                No                               Yes              No              Yes

By default, http://xml.org/sax/features/namespaces is TRUE and http://xml.org/sax/features/namespace-prefixes is FALSE.
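
The QXmlSimpleReader table above can also be written as a small function, which may help when deciding which feature values a handler needs. This is a sketch in standard C++ transcribed from that table, not Qt code: the Reporting struct and simpleReaderReporting() are hypothetical names, and marking the FALSE/FALSE combination as invalid is an assumption based on the SAX2 rule, since the table has no row for it.

```cpp
// What gets reported for a given combination of the two features.
struct Reporting {
    bool valid;             // false for the FALSE/FALSE combination (assumption)
    bool prefixAndLocal;    // namespace prefix and local part
    bool qualifiedNames;    // qualified names
    bool prefixMapping;     // startPrefixMapping() / endPrefixMapping() calls
    bool xmlnsAttributes;   // xmlns and xmlns:* attributes
};

Reporting simpleReaderReporting(bool namespaces, bool namespacePrefixes)
{
    Reporting r = { false, false, false, false, false };
    if (!namespaces && !namespacePrefixes)
        return r;           // SAX2 calls this combination illegal
    r.valid           = true;
    r.prefixAndLocal  = namespaces;          // rows 1 and 2: Yes, row 3: No
    r.qualifiedNames  = true;                // Yes in every row
    r.prefixMapping   = namespaces;          // rows 1 and 2: Yes, row 3: No
    r.xmlnsAttributes = namespacePrefixes;   // row 1: No, rows 2 and 3: Yes
    return r;
}
```

With the default settings (TRUE, FALSE) everything except the xmlns attributes is reported.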

Features and properties

SAX2 allows you to set and query features and properties of an XML reader.

Features are simply options that change the behavior of the reader. Every feature has a unique name, represented as a URI, and a value which can be TRUE or FALSE.

Properties are a more general concept. They also have a unique name, represented as a URI, but their value is a void*, so nearly anything can be used as a property value. This concept involves some danger, though: there is no way to ensure type safety, so the user has to take care to pass the correct type. Properties are useful if a reader supports special handler classes.
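
The type-safety risk of void* values can be illustrated with a small sketch in standard C++. Nothing here is Qt API: PropertyStore is a hypothetical, simplified class that only mirrors the idea of named void* properties, and the URI http://example.com/properties/depth-limit and the function readDepthLimit() are made up for the example.

```cpp
#include <map>
#include <string>

// Hypothetical property store: names are URIs, values are untyped pointers.
class PropertyStore {
public:
    bool hasProperty(const std::string& name) const
    {
        return values.find(name) != values.end();
    }
    void setProperty(const std::string& name, void* value)
    {
        values[name] = value;
    }
    // Returns a null pointer for unknown names.
    void* property(const std::string& name) const
    {
        const std::map<std::string, void*>::const_iterator it = values.find(name);
        return it == values.end() ? nullptr : it->second;
    }
private:
    std::map<std::string, void*> values;
};

// The caller has to know the real type behind the pointer.
// Casting to any other type would still compile, but the
// behaviour would be undefined.
int readDepthLimit(const PropertyStore& store)
{
    void* raw = store.property("http://example.com/properties/depth-limit");
    return raw ? *static_cast<int*>(raw) : -1;
}
```

The compiler cannot check that the code setting the property and the code reading it agree on int; that agreement exists only by convention, which is exactly the danger described above.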

The URIs used for features and properties are often URLs like http://xml.org/sax/features/namespaces. This does not mean that there must be any data at that address. It is simply a way to define unique names, nothing more.

Anyone can define and use new SAX2 features and properties for their own readers. A SAX2 reader must support only two features and no properties: http://xml.org/sax/features/namespaces and http://xml.org/sax/features/namespace-prefixes.

Features can be set or queried with the following three functions: QXmlReader::setFeature(), QXmlReader::feature() and QXmlReader::hasFeature().

Properties can be set or queried with similar functions: QXmlReader::setProperty(), QXmlReader::property() and QXmlReader::hasProperty().


Copyright © 2000 Trolltech
Qt version 2.2.1