
Internationalization with Qt


Internationalization of software is the process of allowing the software to be used efficiently by all people of the world. This means adapting to user and locality preferences such as language, input techniques, character encodings, and presentation conventions.

Step by Step

Writing cross-platform international software with Qt is a gentle, incremental process. Your software can become internationalized in the following stages:

  1. Use QString for all user-visible text.

    Since QString uses the Unicode encoding internally, all the languages of the world can be processed transparently using familiar text processing operations. Also, since all Qt functions that present text to the user take a QString as a parameter, no time is spent converting char* to QString.

    Strings that are in "programmer space" (such as QObject names and file format texts) need not use QString; the traditional char* or the QCString class will suffice.

    You're unlikely to notice that you are using Unicode - QString and QChar are just like easier versions of the clumsy const char* and char from traditional C.

  2. Use tr() for all literal text.

    Where your program uses "quoted text" for text that will be presented to the user, ensure it goes through the QApplication::translate() function. Usually this simply means using QObject::tr(). For example, assuming LoginWidget is a subclass of QWidget:

            LoginWidget::LoginWidget()
            {
                QLabel *label = new QLabel( tr("Password:"), this );
                ...
            }
    

    This accounts for 99% of the user-visible strings you're likely to write.

    If the quoted text is not in a member function of a QObject/QWidget subclass, use either the tr() function of an appropriate class, or the QApplication::translate() function directly:

            void some_global_function( LoginWidget * logwid )
            {
                QLabel *label = new QLabel(
                        LoginWidget::tr("Password:"), logwid );
            }
    
            void same_global_function( LoginWidget * logwid )
            {
                QLabel *label = new QLabel(
                        qApp->translate("LoginWidget", "Password:"),
                        logwid );
            }
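
    Both forms above identify a string by a context ("LoginWidget") together with the source text ("Password:"). The following is a minimal plain C++ sketch of that lookup, not Qt's implementation: the Catalog type, lookup() and exampleCatalog() are hypothetical names, and the single Finnish entry is illustrative data. It assumes that a missing entry simply yields the source text.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a loaded message file: maps
// (context, source text) to a translated string.
typedef std::map<std::pair<std::string, std::string>, std::string> Catalog;

// Returns the translation of sourceText in the given context, or
// sourceText itself when the catalog has no entry for that pair.
std::string lookup(const Catalog& catalog,
                   const std::string& context,
                   const std::string& sourceText)
{
    Catalog::const_iterator it =
        catalog.find(std::make_pair(context, sourceText));
    return it == catalog.end() ? sourceText : it->second;
}

// Builds a tiny catalog with one Finnish entry (illustrative data only).
Catalog exampleCatalog()
{
    Catalog catalog;
    catalog[std::make_pair(std::string("LoginWidget"),
                           std::string("Password:"))] = "Salasana:";
    return catalog;
}
```

    Because the context is part of the key, the same source text can receive different translations in different classes, and untranslated strings still display in the source language.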
    

    Finally, if you need to have translatable text completely outside a function, there are two macros to help: QT_TR_NOOP() and QT_TRANSLATE_NOOP(). They merely mark the text for extraction by the lupdate utility described below - the macros expand to just the text (without the scope). Example usages are shown below.

            QString FriendlyConversation::greeting( int greet_type )
            {
                static const char* greeting_strings[] = {
                    QT_TR_NOOP( "Hello" ),
                    QT_TR_NOOP( "Goodbye" )
                };
                return tr( greeting_strings[greet_type] );
            }
    
            static const char* greeting_strings[] = {
                QT_TRANSLATE_NOOP( "FriendlyConversation", "Hello" ),
                QT_TRANSLATE_NOOP( "FriendlyConversation", "Goodbye" )
            };
            QString FriendlyConversation::greeting( int greet_type )
            {
                return tr( greeting_strings[greet_type] );
            }
    

    If you disable the const char* to QString automatic conversion by compiling your software with the macro QT_NO_CAST_ASCII defined, you'll be very likely to catch any strings you are missing. See QString::fromLatin1() for more details. Note, however, that disabling the conversion can make programming cumbersome.

  3. Use QString::arg() for simple arguments.

    The printf() style of inserting arguments in strings is a poor choice for internationalized text, as it is sometimes necessary to change the order of arguments when translating. The QString::arg() functions offer a simple means for substituting arguments:

            void FileCopier::showProgress( int done, int total,
                                           const QString& current_file )
            {
                label.setText( tr("%1 of %2 files copied.\nCopying: %3")
                                .arg(done)
                                .arg(total)
                                .arg(current_file)
                             );
            }
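
    To see why numbered markers help translators, here is a simplified plain C++ model of the substitution. It is a sketch, not Qt's implementation: fillLowest() and progressText() are hypothetical names, only the markers %1 to %9 are handled, and every occurrence of the lowest-numbered marker is replaced on each call.

```cpp
#include <cassert>
#include <string>

// Replaces every occurrence of the lowest-numbered marker (%1 .. %9)
// found in text with value. Text without markers is returned unchanged.
std::string fillLowest(const std::string& text, const std::string& value)
{
    char lowest = 0;
    for (std::string::size_type i = 0; i + 1 < text.size(); ++i) {
        char d = text[i + 1];
        if (text[i] == '%' && d >= '1' && d <= '9' &&
            (lowest == 0 || d < lowest))
            lowest = d;
    }
    if (lowest == 0)
        return text;

    std::string result;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] == '%' && i + 1 < text.size() && text[i + 1] == lowest) {
            result += value;
            ++i; // skip the marker digit
        } else {
            result += text[i];
        }
    }
    return result;
}

// Fills %1 with done and %2 with total, whatever order they appear in.
std::string progressText(const std::string& translated,
                         const std::string& done,
                         const std::string& total)
{
    return fillLowest(fillLowest(translated, done), total);
}
```

    With this model, a translation that places %2 before %1 still receives the right values, because the calling code always supplies the arguments in marker order rather than in sentence order.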
    

  4. Produce translations.

    Once you are using tr() sufficiently, you can start producing translations of the user-visible text in your program.

    Translation of a Qt application is a three-step process:

    1. Run lupdate to extract translatable text from the C++ source code of the Qt application, resulting in a message file for translators (a .ts file). The utility recognizes the tr() construct and the QT_TR_NOOP() and QT_TRANSLATE_NOOP() macros described above, and produces .ts files (usually one per language).
    2. Provide translations for the source texts in the .ts file, using Qt Linguist. Since .ts files are in XML format, you can also edit them by hand.
    3. Run lrelease to obtain a light-weight message file (a .qm file) from the .ts file, suitable only for end use. You can see the .ts files as "source files", and .qm as "object files". The translator edits the .ts files, but the users of your application only need the .qm files. Both kinds of files are platform and locale independent.

    Typically, you will repeat these steps for every release of your application. The lupdate utility does its best to reuse the translations from the previous release.

    Before you run lupdate, you should prepare a project file. Here's an example project file (or .pro file):

    HEADERS         = funnydialog.h \
                      wackywidget.h
    SOURCES         = funnydialog.cpp \
                      main.cpp \
                      wackywidget.cpp
    TRANSLATIONS    = superapp_dk.ts \
                      superapp_fi.ts \
                      superapp_no.ts \
                      superapp_se.ts
    

    When you invoke lupdate or lrelease, you have to give the name of the project file as a command-line argument.

    In this example, four exotic languages are supported: Danish, Finnish, Norwegian and Swedish. If you use tmake, you don't need an extra project file for lupdate; your tmake project file will do, if you add the TRANSLATIONS lines.

    In your application, you must load the translation files appropriate for the user's language with QTranslator::load() and install them using QApplication::installTranslator().

    If you have been using the old Qt tools (findtr, msg2qm and mergetr), you can use qm2ts to convert your old .qm files.

    To get started, you should read at least the first chapter of the translation tutorial.

    While these utilities offer a convenient way to create .qm files, any system that writes .qm files is sufficient. You could make an application that adds translations to a QTranslator with QTranslator::insert() and then writes a .qm file with QTranslator::save(). This way the translations can come from any source you choose.

    Qt itself contains a small number of strings that will also need to be translated to the languages that you are targeting. In the near future Qt will ship with translations for some languages. We recommend that if you need to translate the Qt strings now, put the translations in separate .ts and .qm files. This will simplify transition to the official Qt translations.

  5. Support encodings.

    The QTextCodec class and the facilities in QTextStream make it easy to support many input and output encodings for your users' data. When the application starts, the locale of the machine will determine the 8-bit encoding used when dealing with 8-bit data - such as for font selection, text display, 8-bit text I/O, and character input.

    The application may occasionally need encodings other than the default local 8-bit encoding. For example, an application in a Cyrillic KOI8-R locale (the de facto standard locale in Russia) might need to output Cyrillic in the ISO 8859-5 encoding. Code for this would be:

            QString string = ...; // Some Unicode text.
    
            QTextCodec* codec = QTextCodec::codecForName("ISO 8859-5");
            QCString encoded_string = codec->fromUnicode(string);
    
            ...; // Use encoded_string in 8-bit operations
    

    For converting Unicode to local 8-bit encodings, a shortcut is available: the local8Bit() method of QString returns such 8-bit data. Another useful shortcut is the utf8() method, which returns text in the 8-bit UTF-8 encoding - interesting in that it perfectly preserves Unicode information while looking like plain US-ASCII if the Unicode is wholly US-ASCII.
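
    The US-ASCII property of UTF-8 follows directly from its byte layout. The sketch below is plain C++ and independent of Qt; encodeUtf8() is a hypothetical helper name, and the function assumes a valid code point without checking for surrogate values.

```cpp
#include <cassert>
#include <string>

// Encodes one Unicode code point (0 .. 0x10FFFF) as UTF-8 bytes.
// Surrogate code points are not rejected in this sketch.
std::string encodeUtf8(unsigned long cp)
{
    assert(cp <= 0x10FFFFUL);
    std::string out;
    if (cp < 0x80) {
        // US-ASCII range: a single byte identical to the ASCII code.
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

    Every byte of a multi-byte sequence has its high bit set, so such sequences can never be mistaken for US-ASCII characters.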

    For converting the other way, there are the QString::fromUtf8() and QString::fromLocal8Bit() convenience functions, or the general code, demonstrated by this conversion from ISO 8859-5 Cyrillic to Unicode:

            QCString encoded_string = ...; // Some ISO 8859-5 encoded text.
    
            QTextCodec* codec = QTextCodec::codecForName("ISO 8859-5");
            QString string = codec->toUnicode(encoded_string);
    
            ...; // Use string in all of Qt's QString operations.
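
    For a single-byte encoding such as ISO 8859-5, the conversion to Unicode amounts to a per-byte mapping. The plain C++ sketch below illustrates that mapping only; it is not Qt's codec, and iso8859_5ToUnicode() is a hypothetical name.

```cpp
#include <cassert>

// Maps one ISO 8859-5 byte to its Unicode code point.
unsigned int iso8859_5ToUnicode(unsigned char byte)
{
    if (byte <= 0xA0)
        return byte;            // US-ASCII, C1 controls, no-break space
    switch (byte) {
    case 0xAD: return 0x00AD;   // soft hyphen
    case 0xF0: return 0x2116;   // numero sign
    case 0xFD: return 0x00A7;   // section sign
    default:   return byte + 0x0360u; // Cyrillic: 0xA1 -> U+0401, 0xFF -> U+045F
    }
}
```

    Encodings without such a regular layout would typically need a full 256-entry table instead of an offset.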
    

    Ideally Unicode I/O should be used as this maximizes the portability of documents between users around the world, but in reality it is useful to support all the appropriate encodings that your users will need to process existing documents. In general, Unicode (UTF-16 or UTF-8) is the best for information transferred between arbitrary people, while within a language or national group, a local standard is often more appropriate. The most important encoding to support is the one returned by QTextCodec::codecForLocale(), as this is the one the user is most likely to need for communicating with other people and applications (this is the codec used by local8Bit()).

    Since most Unix systems do not have built-in support for converting between local 8-bit encodings and Unicode, it may be necessary to write your own QTextCodec subclass. Depending on the urgency, it may be useful to contact Trolltech technical support or ask on the qt-interest mailing list to see if someone else is already working on supporting the encoding. A useful interim measure can be to use the QTextCodec::loadCharmapFile() function to build a data-driven codec; this has a memory and speed penalty, especially with dynamically loaded libraries. For details of writing your own QTextCodec, see the main QTextCodec class documentation.

  6. Localization.

    Localization is the process of adapting to local conventions such as date and time presentations. Such localizations can be accomplished using appropriate tr() strings, even "magic" words, as this somewhat contrived example shows:

            void Clock::setTime(const QTime& t)
            {
                if ( tr("AMPM") == "AMPM" ) {
                    // 12-hour clock
                } else {
                    // 24-hour clock
                }
            }
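
    The two branches above could be filled in along the following lines. This is a plain C++ sketch with no Qt dependency: formatTime() is a hypothetical helper, and its translatedMarker parameter stands for whatever tr("AMPM") returned.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a time of day. translatedMarker is the translated magic word:
// if it is still "AMPM", a 12-hour clock is used, otherwise a 24-hour clock.
std::string formatTime(int hour, int minute,
                       const std::string& translatedMarker)
{
    assert(hour >= 0 && hour < 24 && minute >= 0 && minute < 60);
    char buf[32];
    if (translatedMarker == "AMPM") {
        int h12 = (hour % 12 == 0) ? 12 : hour % 12;
        std::sprintf(buf, "%d:%02d %s", h12, minute,
                     hour < 12 ? "AM" : "PM");
    } else {
        std::sprintf(buf, "%02d:%02d", hour, minute);
    }
    return std::string(buf);
}
```

    A translator for a 24-hour locale then only has to translate "AMPM" to any other word to switch the format.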
    

    In general, it is recommended that you do not attempt to localize images - choose clear icons that are appropriate for all localities, rather than relying on local puns or stretched metaphors.

System Support

Operating systems and window systems supporting Unicode are still in the early stages of development. The level of support available in the underlying system influences the support Qt provides on that platform, but applications written with Qt need not generally be too concerned with the actual limitations.

Unix/X11
Windows 95/98/NT

Supporting more Input Methods

While Trolltech doesn't have the resources or expertise in all the languages of the world to immediately include support in Qt, we are very keen to work with people who do have the expertise. Over the next few minor version numbers, we hope to add support for your language of choice, until everyone can use Qt and all the programs developed with Qt, regardless of their language.

Initially, languages with uni-directional single-byte encodings (European Latin-1 and KOI8-R, etc.) and the uni-directional multi-byte encodings (East Asian EUC-JP, etc.) will be supported. Later, support for the "complex" encodings - those requiring right-to-left input or complex character composition (e.g. Arabic, Hebrew, and Thai script) - will be implemented. The current state of activity is:

All encodings on Windows
    On Windows, the local encoding is always supported.
ISO standard encodings ISO 8859-1, ISO 8859-2, ISO 8859-3, ISO 8859-4, ISO 8859-5, ISO 8859-7, ISO 8859-9, and ISO 8859-15
    Fully supported. The Arabic (ISO 8859-6-I) and Hebrew (ISO 8859-8-I) encodings are not supported, but are under development externally.
KOI8-R
    Fully supported.
eucJP, JIS, and ShiftJIS
    Fully supported. Uses eucJP with the XIM protocol on X11, and the IME in Japanese Windows NT. Serika Kurusugawa and others are assisting with this effort. kinput2 is the tested input method for X11.
eucKR
    Under external development; Mizi Research is assisting with this effort. hanIM is the tested input method.
Big5
    Qt contains a Big5 codec developed by Ming Che-Chuang. Testing is underway with the xcin (2.5.x) XIM server.
eucTW
    Under external development.

If you are interested in contributing to existing efforts, or supporting new encodings beyond the more standard ones above, your work can be considered for inclusion in the official Qt distribution, or just included with your application.

Eventually, we hope to help Unix become as Unicode-oriented as Windows NT is becoming. This means better font support in the font servers, with new developments like the TrueType font servers xfsft, xfstt, and x-tt, as well as UTF-8 (a Unicode encoding) filenames such as with the Unicode support in Solaris™ 7.

Notes about locales on X11

Many Unix distributions contain only partial support for some locales - for example, if you have a /usr/share/locale/ja_JP.EUC directory, this does not necessarily mean you can display Japanese text - you also need JIS encoded fonts (or Unicode fonts), and that /usr/share/locale/ja_JP.EUC directory needs to be complete. For best results, use complete locales from your OS vendor.


Copyright © 2000 Trolltech
Qt version 2.2.1