I18n Guide
Common Desktop Environment: Internationalization Programmer’s Guide

Copyright 1994, 1995 Hewlett-Packard Company
Copyright 1994, 1995 International Business Machines Corp.
Copyright 1994, 1995 Sun Microsystems, Inc.
Copyright 1994, 1995 Novell, Inc.

All rights reserved. This product and related documentation are protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or related documentation may be reproduced in any form by any means without prior written authorization.

RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the United States Government is subject to the restrictions set forth in DFARS 252.227-7013 (c)(1)(ii) and FAR 52.227-19.

THIS PUBLICATION IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.

The code and documentation for the DtComboBox and DtSpinBox widgets were contributed by Interleaf, Inc. Copyright 1993, Interleaf, Inc.

Portions of Chapter 4, “Motif Dependencies,” are derived from the OSF/Motif Programmer’s Guide and are subject to the following copyright: Copyright 1989, 1990, 1993 Open Software Foundation, Inc.

Portions of Chapter 5, “Xt and Xlib Dependencies,” are derived from XLIB - C Language Interface Version X11 Release 5 and X Toolkit Intrinsics - C Language Interfaces Version X11 Release 5 and are subject to the following copyright: Copyright 1985, 1986, 1987, 1988, 1989, 1991 Massachusetts Institute of Technology, Cambridge, Massachusetts and Digital Equipment Corporation, Maynard, Massachusetts. All rights reserved.

Permission to use, copy, modify, and distribute this documentation for any purpose and without fee is hereby granted, provided that this copyright, permission, and disclaimer notice appear on all copies and that the names of M.I.T.
or Digital not be used in advertising or publicity pertaining to this documentation without specific prior permission. M.I.T. and Digital make no representations about the suitability of this documentation for any purpose. It is provided “as is” without express or implied warranty.

UNIX is a trademark exclusively licensed through X/Open Company, Ltd. OSF/Motif and Motif are trademarks of Open Software Foundation, Ltd. X Window System is a trademark of X Consortium, Inc.

Contents

Preface
1. Introduction to Internationalization
    Overview of Internationalization
        Current State of Internationalization
        Internationalization Standards
        Common Internationalization System
    Locales
    Fonts, Font Sets, and Font Lists
        Font Specification
        Font Set Specification
        Font List Specification
            Example Font List Specification
        Base Font Name List Specification
    Text Drawing
    Input Methods
        Preedit Area
            OffTheSpot
            OverTheSpot (Default)
            Root
        Status Area
        Auxiliary Area
        MainWindow Area
        Focus Area
    Interclient Communications Conventions (ICCC)
2. Internationalization and the Common Desktop Environment
    Locale Management
    Font Management
        Matching Fonts to Character Sets
        Font Objects
            Font Sets
            Fonts
            Font Lists
        Font Set and Font List Syntax
        Font Functions
        Font Charsets
        Default Font Set Per Language Group
            Latin ISO8859-1 Fonts
            Other ISO8859 Fonts
            JIS Japanese Font
            KSC Korean Font
            CNS Traditional Chinese Font
            GB Simplified Chinese Font
    Drawing Localized Text
        Simple Text
        XmString (Compound String)
    Inputting Localized Text
        Basic Prompts and Dialogs
        Input within a DrawingArea Widget
        Application-Specific and Language-Specific Intermediate Feedbacks
        Text and TextField Widget
        Character Input within Customized Widgets Not Using Text[Field] Widgets
        XIM Management
        XIM Event Handling
        XIM Callback
    Extracting Localized Text
        Resource Files
        Message Catalogs
        Private Files
    Message Guidelines
    Message Extraction Functions
        XPG4/Universal UNIX Messaging Functions
        XPG4 Messaging Examples
        Xlib Messaging Functions
        Xlib Message and Resource Facilities
    Localized Resources
        Labels and Buttons
        List Resources
        Title
        Text Widget
        Input Method (Keyboards)
        Pixmap (Icon) Resources
        Font Resources
    Operating System Internationalized Functions
3. Internationalization and Distributed Networks
    Interchange Concepts
        iconv Interface
        Stateful and Stateless Conversions
            Stateful Encodings
            Stateless Encodings
    Simple Text Basic Interchange
        iconv Conversion Functions
        X Interclient (ICCCM) Conversion Functions
        Window Titles
    Mail Basic Interchange
    Encodings and Code Sets
        Code Set Strategy
        Code Set Structure
            Control Characters
            Graphic Characters
            Single-Byte Code Sets
            Multibyte Code Sets
            Extended UNIX Code (EUC) Code Set
        ISO EUC Code Sets
            ISO646-IRV
            ISO8859-1
            Other ISO8859 Code Sets
            eucJP
            eucTW
            eucKR
4. Motif Dependencies
    Locale Management
    Font Management
        Font List Structure
        Font Lists Examples
            Obtaining a Font
            Obtaining a Font Set
            Specifying a Font When the Font List Element Tag Is Absent
            Specifying a Font Set When the Font List Element Tag Is Absent
        Font List Syntax
    Drawing Localized Text
        Compound String Components
            Compound Strings and Resources
            Setting a Compound String Programmatically
            Setting a Compound String in a Defaults File
        Compound Strings and Font Lists
        Text and TextField Widgets and Font Lists
    Inputting Localized Text
        Geometry Management
        Focus Management
    Internationalized User Interface Language
        Programming for Internationalized User Interface Language
            String Literals
            Font Sets
            Font Lists
            Creating Resource Files
            Setting the Environment
        default_charset Character Set in UIL
            Example: uil_sample
        Compound Strings in UIL
5. Xt and Xlib Dependencies
    Locale Management
        X Locale Management
        Locale and Modifier Dependencies
        Xt Locale Management
            XtSetLanguageProc
            XtDisplayInitialize
    Font Management
        Creating and Freeing a Font Set
        Obtaining Font Set Metrics
    Drawing Localized Text
    Inputting Localized Text
        Xlib Input Method Overview
            Input Method Architecture
            Input Contexts
            Keyboard Input
            Xlib Focus Management
            Xlib Geometry Management
            Event Filtering
        Callbacks
        X Server Keyboard Protocol
    Interclient Communications Conventions for Localized Text
        Owner of Selection
        Requester of Selection
        XmClipboard
        Passing Window Title and Icon Name to Window Managers
    Messages
A. Message Guidelines
    File-Naming Conventions
    Cause and Recovery Information
    Comment Lines for Translators
    Programming Format
    Writing Style
    Usage Statements
    Standard Messages
    Regular Expression Standard Messages
    Sample Messages

Figures

Figure 1-1  Information external to the application
Figure 1-2  Common internationalized system
Figure 1-3  Example of VendorShell widget with auxiliary (Japanese)
Figure 1-4  Example of OffTheSpot preediting with the VendorShell widget (Japanese)
Figure 1-5  Example of OverTheSpot preediting with the VendorShell widget (Japanese)
Figure 1-6  Example of Root preediting with the VendorShell widget (Japanese)
Figure 4-1  Relationships between compound strings, font sets, and font lists when the font list element tag is not XmFONTLIST_DEFAULT_TAG
Figure 4-2  Relationships between compound strings, font sets, and font lists when a font list element tag is set to XmFONTLIST_DEFAULT_TAG
Figure 4-3  Japanese preediting example
Figure 4-4  Sample UIL program on English and Japanese environments
Figure 5-1  Input method and input contexts

Tables

Table 1-1   Locale Categories
Table 2-1   Font Set and Font List Syntax
Table 2-2   XIM Callbacks
Table 2-3   Localized Resources
Table 2-4   Resources Used for Reading Lists
Table 2-5   Resources Used for Setting Titles and Icon Names
Table 2-6   Locale-Sensitive Text[Field] Resources
Table 2-7   Localized Resources for Input Method Customization
Table 2-8   Pixmap Resources
Table 2-9   Localized Font Resources
Table 2-10  Base Operating System Internationalized Functions
Table 3-1   Using iconv to Perform Conversions
Table 3-2   Code Set Overview
Table 3-3   Encoding for eucJP
Table 3-4   Encoding for eucTW
Table 3-5   16 Planes of the CNS 11643-1992 Standard
Table 3-6   Encoding for eucKR
Table 5-1   Locale and Modifier Dependencies

Preface

The Common Desktop Environment: Internationalization Programmer’s Guide provides information for internationalizing the desktop, enabling applications to support various languages and cultural conventions in a consistent user interface.

Specifically, this guide:
• Provides guidelines and hints for developers on how to write applications for worldwide distribution.
• Provides an overall view of internationalization topics that span different layers within the desktop.
• Provides pointers to reference and more detailed documentation. In some cases, standard documentation is referenced.

This guide is not intended to duplicate the existing reference or conceptual documentation but rather to provide guidelines and conventions on specific internationalization topics. This document focuses on internationalization topics and not on any specific component or layer in an open software environment.

Who Should Use This Book

This book provides various levels of information for application programmers, developers, and those in related fields.

How This Book Is Organized

Explanations of the contents of this book follow:

Chapter 1, “Introduction to Internationalization,” provides an overview of internationalization and localization within the desktop, including locales, fonts, drawing, inputting, interclient communication, and extracting user-visible text. Information on the significance of internationalization standards is also provided.

Chapter 2, “Internationalization and the Common Desktop Environment,” covers the set of topics that developers commonly need to consider when internationalizing their applications, including locale management, localized resources, font management, localized text tasks, interclient communication for localized text, and internationalized functions.

Chapter 3, “Internationalization and Distributed Networks,” discusses topics related to handling encoded characters in distributed networks. Basic principles and examples for interclient interoperability are provided to guide developers in internationalized distributed environments.

Chapter 4, “Motif Dependencies,” covers internationalized applications, locale management, localized text, international User Interface Language (UIL), and localized applications.

Chapter 5, “Xt and Xlib Dependencies,” covers locale management, localized text tasks, font set metrics, interclient communications conventions for localized text, and charset and font set encoding and registry information.

Appendix A, “Message Guidelines,” is a set of guidelines for writing messages.

Related Publications

See the following documentation for additional information on topics presented in this book:
• ISO C: ISO/IEC 9899:1990, Programming Languages - C (technically identical to ANS X3.159-1989, Programming Language C).
• ISO/IEC 9945-1: 1990, (IEEE Standard 1003.1) Information Technology - Portable Operating System Interface (POSIX) - Part 1: System Application Program Interface (API) [C Language].
• ISO/IEC DIS 9945-2: 1992, (IEEE Standard 1003.2-Draft) Information Technology - Portable Operating System Interface (POSIX) - Part 2: Shell and Utilities.
• OSF/Motif 1.2: OSF Motif 1.2 Programmer’s Reference, Revision 1.2, Open Software Foundation, Prentice Hall, 1992, ISBN: 0-13-643115-1.
• Scheifler, W.
  R., X Window System, The Complete Reference to Xlib, Xprotocol, ICCCM, XLFD - X Version 11, Release 5, Digital Press, 1992, ISBN: 1-55558-088-2.
• X/Open: X/Open CAE Specification System Interface Definition, Issue 4, X/Open Company Ltd., 1992, ISBN: 1-872630-46-4.
• X/Open: X/Open CAE Specification Commands and Utilities, Issue 4, X/Open Company Ltd., 1992, ISBN: 1-872630-48-0.
• X/Open: X/Open CAE Specification System Interface and Headers, Issue 4, X/Open Company Ltd., 1992, ISBN: 1-872630-47-2.
• X/Open: X/Open Internationalization Guide, X/Open Company Ltd., 1992, ISBN: 1-872630-20-0.
• ISO/IEC 10646-1:1993 (E): Information Technology - Universal Multi-Octet Coded Character Set (UCS). Part 1: Architecture and Basic Multilingual Plane.

What Typographic Changes and Symbols Mean

Table P-1 describes the type changes and symbols used in this book.

Table P-1  Typographic Conventions

Typeface or Symbol   Meaning                                      Example
AaBbCc123            The names of commands, files, and            Edit your .login file.
                     directories; on-screen computer output       Use ls -a to list all files.
                                                                  system% You have mail.
AaBbCc123            Command-line placeholder: replace with       To delete a file, type rm filename.
                     a real name or value
AaBbCc123            Book titles, new words or terms, or          Read Chapter 6 in User’s Guide.
                     words to be emphasized                       These are called class options.
                                                                  You must be root to do this.

Code samples are included in boxes and may display the following:

Symbol   Meaning                               Example
%        UNIX C shell prompt                   system%
$        UNIX Bourne and Korn shell prompt     system$
#        Superuser prompt, all shells          system#

1. Introduction to Internationalization

Internationalization is the designing of computer systems and applications for users around the world. Such users have different languages and may have different requirements for the functionality and user interface of the systems they operate.
In spite of these differences, users want to be able to implement enterprise-wide applications that run at their sites worldwide. These applications must be able to interoperate across country boundaries, run on a variety of hardware configurations from multiple vendors, and be localized to meet local users’ needs. This open, distributed computing environment is the rationale for common open software environments. The internationalization technology identified within this specification provides these benefits to a global market.

This chapter contains the following sections:
    Overview of Internationalization
    Locales
    Fonts, Font Sets, and Font Lists
    Text Drawing
    Input Methods
    Interclient Communications Conventions (ICCC)

Overview of Internationalization

Multiple environments may exist within a common open system for support of different national languages. Each of these national environments is called a locale, which takes into account the language, its characters, fonts, and the customs used to input and format data. The Common Desktop Environment is fully internationalized such that any application can run using any locale installed in the system.

A locale defines the behavior of a program at run time according to the language and cultural conventions of a user’s geographical area.
Throughout the system, locales affect the following:
• Encoding and processing of text data
• Identifying the language and encoding of resource files and their text values
• Rendering and layout of text strings
• Interchanging text that is used for interclient text communication
• Selecting the input method (which code set will be generated) and the processing of text data
• Encoding and decoding for interclient text communication
• Bitmap/icon files
• Actions and file types
• User Interface Definition (UID) files

An internationalized application contains no code that is dependent on the user’s locale, the characters needed to represent that locale, or any formats (such as date and currency) that the user expects to see and interact with. The desktop accomplishes this by separating language- and culture-dependent information from the application and saving it outside the application.

Figure 1-1 shows the kinds of information that should be external to an application to simplify internationalization.

[Figure 1-1: Information external to the application. The diagram places the application source code at the center, surrounded by the information kept outside it: any string to be displayed, icons, geometry, menu items, help text, prompts, labels, bitmaps, and the data presentation formats (date format, time format, numeric format, currency format, and collation order).]

By keeping the language- and culture-dependent information separate from the application source code, the application does not need to be rewritten or recompiled to be marketed in different countries. Instead, the only requirement is for the external information to be localized to accommodate local language and customs.

An internationalized application is also adaptable to the requirements of different native languages, local customs, and character-string encodings. The process of adapting the operation to a particular native language, local custom, or string encoding is called localization.
A goal of internationalization is to permit localization without program source modifications or recompilation.

For a quick overview of internationalization, refer to X/Open CAE Specification System Interface Definition, Issue 4, X/Open Company Ltd., 1992, ISBN: 1-872630-46-4.

Current State of Internationalization

Previously, the industry supplied many variants of internationalization, ranging from proprietary functions to the new set of standard functions published by X/Open. Also, there have been different levels of enabling, such as simple ASCII support, Latin/European support, Asian multibyte support, and Arabic/Hebrew bidirectional support. The interfaces defined within the X/Open specification are capable of supporting a large set of languages and territories, including:

Script            Description
Latin Language    Americas, Eastern/Western European
Greek             Greece
Turkish           Turkey
East Asia         Japanese, Korean, and Chinese
Indic             Thai
Bidirectional     Arabic and Hebrew

Furthermore, the goal of the Common Desktop Environment is that localization of these technologies (translation of messages and documentation and other adaptation for local needs) be done in a consistent way, so that a supported user anywhere in the world will find the same common localized environment from vendor to vendor. End users and administrators can expect a consistent set of localization features that provide a complete application environment for support of global software.

Internationalization Standards

Through the work of many companies, the functionality of the internationalization application program interface has been standardized over time to include additional requirements and languages, particularly those of East Asia. This work has been centered primarily in the Portable Operating
The original X/Open specification was published in the second edition of the X/Open Portability Guide (XPG2) and was based on the Native Language Support product released by Hewlett-Packard. The latest published X/Open internationalization standard is referred to as XPG4. It is important that each layer within the desktop use the proper set of standards interfaces defined for internationalization to ensure end users get a consistent, localized interface. The definition of a locale and the common open set of locale-dependent functions are based on the following specifications: • X Window System, The Complete Reference to Xlib, Xprotocol, ICCCM, XLFD - X Version, Release 5, Digital Press, 1992, ISBN 1-55558-088-2. • ANSI/IEEE Standard Portable Operating System Interface for Computer Environments, IEEE. • OSF Motif 1.2 Programmer’ Reference, Revision 1.2, Open Software Foundation, Prentice Hall, 1992, ISBN 0-13-643115-1. • X/Open CAE Specification Commands and Utilities, Issue 4, X/Open Company Ltd., 1992, ISBN 1-872630-48-0. Within this environment, software developers can expect to develop worldwide applications that are portable, can interoperate across distributed systems (even from different vendors), and can meet the diverse language and cultural requirements of multinational users supported by the desktop standard locales. Common Internationalization System Figure 1-2 on page 6 shows a view of how internationalization is pervasive across a specific single-host system. The goal is that the applications (clients) are built to be shipped worldwide for the set of locales supported in the underlying system. Using standard interfaces improves access to global markets and minimizes the amount of localization work needed by application developers. In addition, country representatives can be ensured of consistent Introduction to Internationalization 5 1 localization within systems adhering to the principles of the desktop. 
Figure 1-2  Common internationalized system (diagram not reproduced; it shows applications and desktop managers layered over libXm, Xt, and Xlib, which rest on the locale, input method, output method, and code conversion subsystems of the internationalization framework)

Locales

Most single-display clients operate in a single locale that is determined at run time from the setting of an environment variable, usually $LANG, or from the xnlLanguage resource. Locale environment variables, such as LC_ALL, LC_CTYPE, and LANG, can be used to control the environment. See “Xt Locale Management” on page 104 for more information.

The LC_CTYPE category of the locale is used by the environment to identify the locale-specific features used at run time. The fonts and input method loaded by the toolkit are determined by the LC_CTYPE category.

Programs that are enabled for internationalization are expected to call the XtSetLanguageProc() function (whose default language procedure calls setlocale()) to set the locale desired by the user. None of the libraries call the setlocale() function to set the locale, so it is the responsibility of the application to call XtSetLanguageProc() with either a specific locale or some value loaded at run time.
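As a rough illustration of how the locale environment variables interact, the sketch below resolves the locale name for one category by checking LC_ALL first, then the category's own variable, then LANG. The helper names pick_locale() and effective_locale() are hypothetical and are not part of Xt, Xlib, or the C library; the fallback to "C" when nothing is set is an assumption, since the real default is implementation-defined.

```c
#include <stdlib.h>

/* Return the first candidate that is set and non-empty, else "C".
 * Hypothetical helper; the "C" fallback is an assumption. */
const char *pick_locale(const char *lc_all, const char *lc_category,
                        const char *lang)
{
    if (lc_all != NULL && lc_all[0] != '\0')
        return lc_all;
    if (lc_category != NULL && lc_category[0] != '\0')
        return lc_category;
    if (lang != NULL && lang[0] != '\0')
        return lang;
    return "C";
}

/* Look up the locale name that would apply to one category,
 * for example "LC_CTYPE" or "LC_MESSAGES". */
const char *effective_locale(const char *category)
{
    return pick_locale(getenv("LC_ALL"), getenv(category), getenv("LANG"));
}
```

A program would normally let setlocale(LC_ALL, "") do this work; a helper like effective_locale("LC_CTYPE") is only useful for diagnostics, such as logging which variable decided the font and input method selection.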
If an application is internationalized and does not use XtSetLanguageProc(), it should obtain the locale name from one of the following prioritized sources and pass it to the setlocale() function:

• A command-line option
• A resource
• The empty string (“”)

The empty string makes the setlocale() function use the $LC_* and $LANG environment variables to determine locale settings. Specifically, setlocale(LC_ALL, "") specifies that the locale should be checked and taken from environment variables in the order shown in Table 1-1 for the various locale categories.

Table 1-1  Locale Categories

    Category       1st Env. Var.   2nd Env. Var.   3rd Env. Var.
    LC_CTYPE:      LC_ALL          LC_CTYPE        LANG
    LC_COLLATE:    LC_ALL          LC_COLLATE      LANG
    LC_TIME:       LC_ALL          LC_TIME         LANG
    LC_NUMERIC:    LC_ALL          LC_NUMERIC      LANG
    LC_MONETARY:   LC_ALL          LC_MONETARY     LANG
    LC_MESSAGES:   LC_ALL          LC_MESSAGES     LANG

The toolkit already defines a standard command-line option (-lang) and a resource (xnlLanguage). Also, the resource value can be set in the server RESOURCE_MANAGER, which may affect all clients that connect to that server.

Fonts, Font Sets, and Font Lists

All X clients use fonts for drawing text. The basic object used in drawing text is XFontStruct, which identifies the font that contains the images to be drawn.

The desktop already supports fonts by way of the XFontStruct data structure defined by Xlib; however, the encoding of the characters within the font must be known to an internationalized application. To communicate this information, the program expects that all fonts at the server are identified by an X Logical Font Description (XLFD) name. The XLFD name enables users to describe both the base characteristics and the charset (encoding of font glyphs). The term charset is used to denote the encoding of glyphs within the font, while the term code set means the encoding of characters within the locale.
The charset for a given font is determined by the CharSetRegistry and CharSetEncoding fields of the XLFD name. Text and symbols are drawn as defined by the codes in the fonts.

A font set (for example, an XFontSet data structure defined by Xlib) is a collection of one or more fonts that enables all characters defined for a given locale to be drawn. Internationalized applications may be required to draw text encoded in the code sets of the locale where the value of an encoded character is not identical to the glyph index. Additionally, multiple fonts may be required to render all characters of the locale, using one or more fonts whose encodings may be different from the code set of the locale. Since both code sets and charsets may vary from locale to locale, the concept of a font set is introduced through XFontSet.

While fonts are identified by their XLFD name, font sets are identified by a list of XLFD names. The list can consist of one or more XLFD names with the exception that only the base characteristics are significant; the encoding of the desired fonts is determined from the locale. Any charsets specified in the XLFD base name list are ignored and users need only concentrate on specifying the base characteristics, such as point size, style, and weight. A font set is said to be locale-sensitive and is used to draw text that is encoded in the code set of the locale. Internationalized applications should use font sets instead of font structs to render text data.

A font list is a libXm Toolkit object that is a collection of one or more font list entries. Font sets can be specified within a font list. Each font list entry designates either a font or a font set and is tagged with a name. If there is no tag in a font list entry, a default tag (XmFONTLIST_DEFAULT_TAG) is used. The font list can be used with the XmString functions found in the libXm Toolkit library.
A font list enables drawing of compound strings that consist of one or more segments, each identified by a tag. This allows the drawing of strings with different base characteristics (for example, drawing a bold and italic string within one operation). Some non-XmString-based widgets, such as XmText of the libXm library, use only one font list entry in the font list. Motif font lists use the suffix : (colon) to identify a font set within a font list.

The user is generally asked to specify either a font list (which may contain either a font or font set) or a font set. In an internationalized environment, the user must be able to specify fonts that are independent of the code set because the specification can be used under various locales with different code sets than the character set (charset) of the font. Therefore, it is recommended that all font lists be specified with a font set.

Font Specification

The font specification can be either an X Logical Font Description (XLFD) name or an alias for the XLFD name. For example, the following are valid font specifications for a 14-point font:

    -dt-application-medium-r-normal-serif-*-*-*-*-p-*-iso8859-1

OR

    -*-r-*-14-*iso8859-1

Font Set Specification

The font set specification is a list of names (XLFD names or their aliases) and is sometimes called a base name list. All names are separated by commas, with any blank spaces before or after the comma being ignored. Pattern-matching (wildcard) characters can be specified to help shorten XLFD names.

Remember that a font set specification is determined by the locale that is running. For example, the ja_JP Japanese locale defines three fonts (character sets) necessary to display all of its characters; the following identifies the set of Gothic fonts needed.
• Example of full XLFD name list:

    -dt-mincho-medium-r-normal--14-*-*-m-*-jisx0201.1976-0,
    -dt-mincho-medium-r-normal--28-*-*-*-m-*-jisx0208.1983-0:

• Example of single XLFD pattern name:

    -dt-*-medium-*-24-*-m-*:

The preceding two cases can be used with a Japanese locale as long as fonts exist that match the base name list.

Font List Specification

A font list specification can consist of one or more entries, each of which can be either a font specification or a font set specification. Each entry can be tagged with a name that is used when drawing a compound string. The tags are application-defined and are usually names representing the expected style of font; for example, bold, italic, bigbold. A null tag is used to denote the default entry and is associated with the XmFONTLIST_DEFAULT_TAG identifier used in XmString functions.

A font tag is identified when it is prefixed with an = (equal sign); for example, =bigbold (this matches the first font defined at the server). If an = is specified but there is no name following it, the specification is considered the default font list entry.

A font set tag is identified when it is prefixed with a : (colon); for example, :bigbold (this matches the first server set of fonts that satisfy the locale). If a : is specified but no name is given, the specification is considered the default font list entry. Within a font list entry specification, a base name list is separated by ; (semicolons) rather than by , (commas).

Example Font List Specification

For the Latin 1 locales, enter:

    -*-r-*-14-*:,\             # default font list entry
    -*-b-*-18-*:bigbold        # Large Bold fonts

Base Font Name List Specification

The base font name list is a list of base font names associated with a font set as defined by the locale.
The base font names are in a comma-separated list and are assumed to be characters from the portable character set; otherwise, the result is undefined. Blank space immediately on either side of a separating comma is ignored.

Use of XLFD font names permits international applications to obtain the fonts needed for a variety of locales from a single locale-independent base font name. The single base font name specifies a family of fonts whose members are encoded in the various charsets needed by the locales of interest.

An XLFD base font name can explicitly name the font’s charset needed for the locale. This enables the user to specify an exact font for use with a charset required by a locale, fully controlling the font selection.

If a base font name is not an XLFD name, an attempt is made to obtain an XLFD name from the font properties for the font.

The following algorithm is used to select the fonts that are used to display text with font sets. For each charset required by the locale, the base font name list is searched for the first of the following cases that names a set of fonts that exist at the server.

• The first XLFD-conforming base font name that specifies the required charset or a superset of the required charset in its CharSetRegistry and CharSetEncoding fields.
• The first set of one or more XLFD-conforming base font names that specify one or more charsets that can be remapped to support the required charset. The Xlib implementation can recognize various mappings from a required charset to one or more other charsets and use the fonts for those charsets. For example, JIS Roman is ASCII with the \ (backslash) and ~ (tilde) characters replaced by the yen and overbar characters, respectively; Xlib can load an ISO8859-1 font to support this character set if a JIS Roman font is not available.
• The first XLFD-conforming font name, or the first non-XLFD font name for which an XLFD font name can be obtained, combined with the required charset (replacing the CharSetRegistry and CharSetEncoding fields in the XLFD font name). In the first instance, the implementation can use a charset that is a superset of the required charset.
• The first font name that can be mapped in some locale-dependent manner to one or more fonts that support imaging text in the charset.

For example, assume a locale requires the following charsets:

• ISO8859-1
• JISX0208.1983
• JISX0201.1976
• GB2312-1980.0

You can supply a base font name list that explicitly specifies the charsets, ensuring that specific fonts are used if they exist, as shown in the following example:

    "-dt-mincho-Medium-R-Normal-*-*-*-*-*-M-*-JISX0208.1983-0,\
    -dt-mincho-Medium-R-Normal-*-*-*-*-*-M-*-JISX0201.1976-1,\
    -dt-song-Medium-R-Normal-*-*-*-*-*-M-*-GB2312-1980.0,\
    -*-default-Bold-R-Normal-*-*-*-*-M-*-ISO8859-1"

You can supply a base font name list that omits the charsets, which selects fonts for each required code set, as shown in the following example:

    "-dt-Fixed-Medium-R-Normal-*-*-*-*-*-M-*,\
    -dt-Fixed-Medium-R-Normal-*-*-*-*-*-M-*,\
    -dt-Fixed-Medium-R-Normal-*-*-*-*-*-M-*,\
    -*-Courier-Bold-R-Normal-*-*-*-*-M-*"

Alternatively, the user can supply a single base font name that selects from all available fonts that meet certain minimum XLFD property requirements, as shown in the following example:

    "-*-*-*-R-Normal--*-*-*-*-*-M-*"

Text Drawing

The desktop provides various functions for rendering localized text, including simple text, compound strings, and some widgets. These include functions within the Xlib and Motif libraries.

Input Methods

The Common Desktop Environment provides the ability to enter localized input for an internationalized application that is using the Xm Toolkit.
Specifically, the XmText[Field] widgets are enabled to interface with input methods provided by each locale. In addition, the dtterm client is enabled to use input methods.

By default, each internationalized client that uses the libXm Toolkit uses the input method associated with a locale specified by the user. The XmNinputMethod resource is provided as a modifier on the locale name to allow a user to specify any alternative input method.

The user interface of the input method consists of several elements. The need for these areas is dependent on the input method being used. They are usually needed by input methods that require complex input processing and dialogs. See Figure 1-3 for an illustration of these areas.

Figure 1-3  Example of VendorShell widget with auxiliary (Japanese) (diagram not reproduced; it labels the Application = ApplicationShell widget (VendorShell), the MainWindow area with its Label and Text widgets, the preedit area, the Status area, and the Auxiliary (ZENKOUHO) area)

Preedit Area

A preedit area is used to display the string being preedited. The input method supports four modes of preediting: OffTheSpot, OverTheSpot (default), Root, and None.

Note – A string that has been committed cannot be reconverted. The status of the string is moved from the preedit area to the location where the user is entering characters.

OffTheSpot

In OffTheSpot mode, the location of preediting is fixed just below the MainWindow area and on the right side of the status area, as shown in Figure 1-4. A Japanese input method is used for the example.

Figure 1-4  Example of OffTheSpot preediting with the VendorShell widget (Japanese)

In the system environment, when preediting using an input method, the string being preedited may be highlighted in some form depending on the input method.
To use OffTheSpot mode, set the XmNpreeditType resource of the VendorShell widget either with the XtSetValues() function or with a resource file. The XmNpreeditType resource can also be set as the resource of a TopLevelShell, ApplicationShell, or DialogShell widget, all of which are subclasses of the VendorShell widget class.

OverTheSpot (Default)

In OverTheSpot mode, the location of the preedit area is set to where the user is trying to enter characters (for example, the insert cursor position of the Text widget that has the current focus). The characters in a preedit area are displayed at the cursor position as an overlay window, and they can be highlighted depending on the input method.

Although a preedit area may consist of multiple lines in OverTheSpot mode, it is always within the MainWindow area and cannot cross its edges in any direction.

Keep in mind that although the preedit string under construction may be displayed as though it were part of the Text widget’s text, it is not passed to the client and displayed in the underlying edit screen until preedit ends. See Figure 1-5 on page 17 for an illustration.

To use OverTheSpot mode explicitly, set the XmNpreeditType resource of the VendorShell widget either with the XtSetValues() function or with a resource file. The XmNpreeditType resource can be set as the resource of a TopLevelShell, ApplicationShell, or DialogShell widget because these are subclasses of the VendorShell widget class.

Figure 1-5  Example of OverTheSpot preediting with the VendorShell widget (Japanese)

Root

In Root mode, the preedit and status areas are located separately from the client’s window. The Root mode behavior is similar to OffTheSpot. See Figure 1-6 for an illustration.
Figure 1-6  Example of Root preediting with the VendorShell widget (Japanese)

Status Area

A status area reports the input or keyboard status of the input method to the users. For OverTheSpot and OffTheSpot styles, the status area is located at the lower left corner of the VendorShell window.

• If Root style, the status area is placed outside the client window.
• If the preedit style is OffTheSpot mode, the preedit area is displayed to the right of the status area.

The VendorShell widget provides geometry management so that a status area is rearranged at the bottom corner of the VendorShell window if the VendorShell window is resized.

Auxiliary Area

An auxiliary area helps the user with preediting. Depending on the particular input method, an auxiliary area can be created. The Japanese input method in Figure 1-3 on page 14 creates the following types of auxiliary areas:

• ZENKOUHO
• JIS NUMBER
• Switching conversion method
    • SAKIYOMI-REN-BUNSETSU
    • IKKATSU-REN-BUNSETSU
    • TAN-BUNSETSU
    • FUKUGOU-GO

MainWindow Area

A MainWindow area is the widget used as the working area of the input method. In the system environment, the sole child of the VendorShell widget is the MainWindow widget. It can be any container widget, such as a RowColumn widget. The user creates the container widget as the child of the VendorShell widget.

Focus Area

A focus area is any descendant widget under the MainWindow widget subtree that currently has focus. The Motif application programmer using existing widgets does not need to worry about the focus area. The important information to remember is that only one widget can have input method processing at a time. The input method processing moves to the window (widget) that currently has the focus.

Interclient Communications Conventions (ICCC)

The Interclient Communications Conventions (ICCC) defines the mechanism used to pass text between clients.
Because the system is capable of supporting multiple code sets, it may be possible that two applications that are communicating with each other are using different code sets. ICCC defines how these two clients agree on how the data is passed between them. If two clients have incompatible character sets (for example, Latin1 and Japanese (JIS)), some data may be lost when characters are transported.

However, if two clients have different code sets but compatible character sets, ICCC enables these clients to pass information with no data lost. If code sets of the two clients are not identical, CompoundText encoding is used as the interchange with the COMPOUND_TEXT atom used. If data being communicated involves only portable characters (7-bit, ASCII, and others) or the ISO8859-1 code set, the data is communicated as is with no conversion by way of the XA_STRING atom.

Titles and icon names need to be communicated to the Window Manager using the COMPOUND_TEXT atom if nonportable characters are used; otherwise, the XA_STRING atom can be used. Any other encoding is limited to the ability to convert to the locale of the Window Manager. The Window Manager runs in a single locale and supports only titles and icon names that are convertible to the code set of the locale under which it is running.

The libXm library and all desktop clients should follow these conventions.

Chapter 2: Internationalization and the Common Desktop Environment

Multiple environments may exist within a common open system for support of different national languages. Each of these national environments is called a locale, which considers the language, its characters, fonts, and the customs used to input and format data. The Common Desktop Environment is fully internationalized such that any application can run using any locale installed in the system.
    Locale Management                                22
    Font Management                                  23
    Drawing Localized Text                           30
    Inputting Localized Text                         34
    Extracting Localized Text                        40
    Localized Resources                              45
    Operating System Internationalized Functions     51

Locale Management

For the desktop, most single-display clients operate in a single locale that is determined at run time from the setting of the environment variable, which is usually $LANG. The Xm library (libXm) can only support a single locale that is used at the time each widget is instantiated. Changing the locale after the Xm library has been initialized may cause unpredictable behavior.

All internationalized programs should set the locale desired by the user as defined in the locale environment variables. Programs using the desktop toolkit call the XtSetLanguageProc() function prior to calling any toolkit initialization function; for example, XtAppInitialize(). This function does all of the initialization necessary prior to the toolkit initialization. Nondesktop programs call the setlocale() function at the beginning of the program to set the locale desired by the user.

Locale environment variables (for example, LC_ALL, LC_CTYPE, and LANG) are used to control the environment. Users should be aware that the LC_CTYPE category of the locale is used by the X and Xm libraries to identify the locale-specific features used at run time, whereas the LC_MESSAGES category is used by the message catalog services to load locale-specific text. Refer to “Extracting Localized Text” on page 40 for more information. Specifically, the fonts and input method loaded by the toolkit are determined by the setting of the LC_CTYPE category.

String encoding (for example, ISO8859-1 or Extended UNIX Code (EUC)) in an application’s source code, resource files, and User Interface Language (UIL) files should be the same as the code set of the locale where the application runs. If not, code conversion is required.
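For a nondesktop program, the startup step described above might look like the following sketch, which uses only standard C library calls. The helper names init_locale() and count_characters() are hypothetical; the second function shows why the LC_CTYPE setting matters, because the number of characters in a multibyte string depends on the code set of the locale in effect.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Adopt the user's locale from the environment; if that locale is not
 * installed, fall back to "C". Returns the name of the locale in effect. */
const char *init_locale(void)
{
    const char *name = setlocale(LC_ALL, "");
    if (name == NULL)
        name = setlocale(LC_ALL, "C");
    return name;
}

/* Count characters (not bytes) in a multibyte string encoded in the
 * code set of the current LC_CTYPE locale.
 * Returns (size_t)-1 if the string contains an invalid sequence. */
size_t count_characters(const char *mbs)
{
    mbstate_t state;
    const char *src = mbs;

    memset(&state, 0, sizeof state);
    /* With a null destination, mbsrtowcs() only computes the length. */
    return mbsrtowcs(NULL, &src, 0, &state);
}
```

A desktop program would call XtSetLanguageProc() instead of init_locale(); the character-counting helper applies in either case once the locale has been set.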
All components are shipped as a single, worldwide executable and are required to support the R5 sample implementation set of locales: US, Western/Eastern Europe, Japan, Korea, China, and Taiwan.

Applications should be written so that they are code-set-independent and include support for any multibyte code set.

The following are the functions used for locale management:

• XtSetLanguageProc()
• setlocale()
• XSupportsLocale()
• XSetLocaleModifiers()

Font Management

When rendering text in an X Windows™ client, at least two aspects are sensitive to internationalization:

• Obtaining the localized text itself
• Selecting the one or more fonts that contain all the glyphs needed to render the characters in the localized text.

“Extracting Localized Text” on page 40 describes how to choose the correct fonts to render localized text.

Matching Fonts to Character Sets

A font contains a set of glyphs used to render the characters of a locale. However, you may also want to do the following for a given locale:

• Determine the fonts needed
• Specify the necessary fonts
• Determine the charset of a font in a resource file
• Choose multiple fonts per locale

The last two fields of a font XLFD identify which glyphs are contained in a font and which value is used to obtain a specific glyph from the set. These last two fields identify the encoding of the glyphs contained in the font. For example:

    -adobe-courier-medium-r-normal--24-240-75-75-m-150-iso8859-1

The last two fields of this XLFD name are iso8859 and 1. These fields specify that the ISO8859-1 standard glyphs are contained in the font. Further, they specify that the character code values of the ISO8859-1 standard are used to index the corresponding glyph for each character.

The font charset used by the application to render data depends on the locale you select.
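The rule that the last two XLFD fields carry the charset can be illustrated with a few lines of string handling. The sketch below is only an illustration: xlfd_charset() is a hypothetical helper, not an Xlib function, and it assumes a fully specified XLFD name rather than a wildcard pattern or an alias.

```c
#include <stddef.h>
#include <string.h>

/* Copy the "registry-encoding" suffix of a full XLFD name into out.
 * Returns 0 on success, -1 if the name has fewer than two hyphens
 * or the result does not fit in outlen bytes. */
int xlfd_charset(const char *xlfd, char *out, size_t outlen)
{
    const char *last = strrchr(xlfd, '-');   /* hyphen before the encoding */
    const char *p;

    if (last == NULL || last == xlfd)
        return -1;
    /* Walk back to the hyphen that precedes the registry field. */
    for (p = last - 1; p > xlfd && *p != '-'; --p)
        ;
    if (*p != '-')
        return -1;
    if (strlen(p + 1) >= outlen)
        return -1;
    strcpy(out, p + 1);
    return 0;
}
```

For the Adobe Courier name shown above, the helper yields iso8859-1. Applications that follow the advice in this chapter rarely need such a function, because a font set lets Xlib fill in the charset fields from the locale.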
Because the font charset of the data changes based on the choice of locale, the font specification must not be hardcoded by the application. Instead, it should be placed in a locale-specific app-defaults file, allowing localized versions of the app-defaults file to be created.

Further, the font should be specified as a fontset. A fontset is an Xlib concept in which an XLFD is used to specify the fonts. The font charset fields of the XLFD are specified by the Xlib code that creates the fontset and fills in these fields based on the locale that the user has specified.

For many languages (such as Japanese, Chinese, and Korean), multiple font charsets are combined to support a single encoding. In these cases, multiple fonts must be opened to render the character data. Further, the data must be parsed into segments that correspond to each font, and in some cases, these segments must be transformed to convert the character values into glyph indexes. The XFontSet, which is a collection of all fonts necessary to render character data in a given locale, also deals with this set of problems.

Further, a set of rendering and metric routines is provided that internally takes care of breaking strings into character-set-consistent segments and transforming values into glyph indexes. These routines relieve the burden on the application developer, who needs only to use fontsets and the new X11R5 rendering and metric application program interfaces (APIs).

Font Objects

This section describes the following font objects:

• Font sets
• Fonts
• Font lists

Font Sets

Generally, all internationalized programs expecting to draw localized text using Xlib are required to use an XFontSet for specifying the locale-dependent fonts. Specific fonts within a font set should be specified using XLFD naming conventions without the charset field specified. The resource name for an XFontSet is *fontSet.
Refer to “Localized Resources” on page 45 for a list of font resources.

Applications directly using Xlib to render text (as opposed to using XmString functions or widgets) may take advantage of the string-to-fontSet converter provided by Xt. For example, the following code fragment shows how to obtain a fontset when using Xt and when not using Xt:

    /* pardon the double negative... means "If using Xt..." */
    #ifndef NO_XT
    typedef struct {
        XFontSet fontset;
        char *foo;
    } ApplicationData, *ApplicationDataPtr;

    static XtResource my_resources[] = {
        { XtNfontSet, XtCFontSet, XtRFontSet, sizeof (XFontSet),
          XtOffset (ApplicationDataPtr, fontset), XtRString, "*-18-*" }
    };
    #endif /* NO_XT */
    ...
    #ifdef NO_XT
    fontset = XCreateFontSet (dpy, "*-18-*", &missing_charsets,
                              &num_missing_charsets, &default_string);
    if (num_missing_charsets > 0) {
        (void) fprintf(stderr, "%s: missing charsets.\n", program_name);
        XFreeStringList(missing_charsets);
    }
    #else
    XtGetApplicationResources(toplevel, &data, my_resources,
                              XtNumber(my_resources), NULL, 0);
    fontset = data.fontset;
    #endif /* NO_XT */

Fonts

Internationalized programs should avoid using fonts directly, that is, XFontStruct, unless they are being used for a specific charset and a specific character set. Use of XFontStruct may be limiting if the server you are connecting to does not support the specific charsets needed by a locale. The resource name for an XFontStruct is *font.

Font Lists

All programs using widgets or XmString to draw localized text are required to specify an XmFontList name for specifying fonts. A font list is a list of one or more fontsets or fonts, or both. It is used to convey the list of fonts and fontsets a widget should use to render text. For more complicated applications, a font list may specify multiple font sets with each font set being tagged with a name; for example, Bold, Large, Small, and so on.
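To make the tagging idea concrete, the following sketch classifies one font list entry written in the resource syntax introduced in Chapter 1, where the names are followed by =tag for a single font or :tag for a font set. It is a simplified illustration, not the parser Motif uses: classify_entry() and its types are hypothetical, surrounding blanks are not trimmed, and treating an entry with no separator as a font with the default tag is an assumption.

```c
#include <stddef.h>
#include <string.h>

enum entry_kind { ENTRY_FONT, ENTRY_FONTSET };

struct entry_info {
    enum entry_kind kind;   /* font (=) or font set (:) */
    size_t names_len;       /* length of the font name or base name list */
    const char *tag;        /* points into entry; "" means the default tag */
};

/* Split "names:tag" or "names=tag" at the first ':' or '='. */
struct entry_info classify_entry(const char *entry)
{
    struct entry_info info;
    size_t n = strcspn(entry, ":=");

    info.names_len = n;
    info.kind = (entry[n] == ':') ? ENTRY_FONTSET : ENTRY_FONT;
    /* Skip the separator if there is one; otherwise the tag is empty. */
    info.tag = (entry[n] != '\0') ? entry + n + 1 : entry + n;
    return info;
}
```

Given the entry -*-b-*-18-*:bigbold, the helper reports a font set tagged bigbold; given -*-r-*-14-*: it reports a font set with an empty tag, which corresponds to the default font list entry.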
The tags are associated with the tag of an XmString segment. A tag may be used to identify a specific font or fontset within a font list.

Font Set and Font List Syntax

Table 2-1 shows the syntax for a font set and font list.

Table 2-1  Font Set and Font List Syntax

    Resource Type         XLFD Separator   Terminator   FontEntry Separator
    *fontSet: (Xlib)      comma            None         None
    *fontList: (Motif)    semicolon        colon        comma

Here are some examples of font resource specifications:

    app_foo*fontList: -adobe-courier-medium-r-normal--24-240-75-75-m-\
    150-*:

The preceding fontList specifies a fontset, consisting of one or more 24-point Adobe Courier fonts, as appropriate for the user’s locale.

    app_foo*fontList: -adobe-courier-medium-r-normal--18-*; *-gothic-\
    *-18-*:

This fontList specifies a fontset consisting of an 18-point Courier font (if available) for some characters in the user’s data, and an 18-point Gothic font for the others.

Motif-based applications sometimes need direct access to the font set contained in a font list. For example, an application that uses a DrawingArea widget may want to label one of the images drawn there. The following sample code shows how to extract a font set from a font list. In this example, the code looks for the font set tagged XmFONTLIST_DEFAULT_TAG because this is the tag that says “codeset of the locale.” Applications should use the tag XmFONTLIST_DEFAULT_TAG for any string that could contain localized data.
    XFontSet FontList2FontSet(XmFontList fontlist)
    {
        XmFontContext context;
        XmFontListEntry next_entry;
        XmFontType type_return = XmFONT_IS_FONT;
        char *font_tag;
        XFontSet fontset;
        XFontSet first_fontset;
        Boolean have_font_set = False;

        if (!XmFontListInitFontContext(&context, fontlist)) {
            XtWarning("fl2fs: can't create fontlist context...");
            exit(0);
        }
        while ((next_entry = XmFontListNextEntry(context)) != NULL) {
            fontset = (XFontSet) XmFontListEntryGetFont(next_entry,
                                                        &type_return);
            if (type_return == XmFONT_IS_FONTSET) {
                font_tag = XmFontListEntryGetTag(next_entry);
                if (!strcmp(XmFONTLIST_DEFAULT_TAG, font_tag)) {
                    return fontset;
                }
                /* Remember the 1st fontset, just in case... */
                if (!have_font_set) {
                    first_fontset = fontset;
                    have_font_set = True;
                }
            }
        }
        if (have_font_set)
            return first_fontset;
        return (XFontSet)NULL;
    }

Font Functions

The following Xlib font management API functions are available:

• XCreateFontSet()
• XLocaleOfFontSet()
• XFontsOfFontSet()
• XBaseFontNameListOfFontSet()
• XFreeFontSet()

The following Motif FontList API functions are available:

• XmFontListEntryCreate()
• XmFontListAppendEntry()
• XmFontListEntryFree()
• XmFontListEntryGetTag()
• XmFontListEntryGetFont()
• XmFontListEntryLoad()

Font Charsets

To improve basic interchange, fonts are organized according to the standard X Consortium font charsets.

Default Font Set Per Language Group

Selecting base font names of a font set associated with a developer’s language is usually easy because the developer is familiar with the language and the set of fonts needed. However, selecting the base font names of a font set for various locales can be difficult because an XLFD font specification consists of 14 fields.
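To show how the fields of an XLFD name line up, the sketch below splits a full name at its hyphens and returns one field by its standard XLFD field name. It is an illustration only: xlfd_get_field() is a hypothetical helper, not an Xlib call, and it assumes a fully specified name in which every field is introduced by a hyphen.

```c
#include <stddef.h>
#include <string.h>

/* Field names in the order defined by the XLFD conventions. */
static const char *const xlfd_field_names[] = {
    "FOUNDRY", "FAMILY_NAME", "WEIGHT_NAME", "SLANT", "SETWIDTH_NAME",
    "ADD_STYLE_NAME", "PIXEL_SIZE", "POINT_SIZE", "RESOLUTION_X",
    "RESOLUTION_Y", "SPACING", "AVERAGE_WIDTH", "CHARSET_REGISTRY",
    "CHARSET_ENCODING"
};
#define XLFD_NUM_FIELDS \
    (sizeof xlfd_field_names / sizeof xlfd_field_names[0])

/* Copy the field called name from a full XLFD name into out.
 * Returns 0 on success, -1 if the field is missing, the name is
 * unknown, or the value does not fit in outlen bytes. */
int xlfd_get_field(const char *xlfd, const char *name,
                   char *out, size_t outlen)
{
    const char *p = xlfd;
    size_t i;

    for (i = 0; i < XLFD_NUM_FIELDS; i++) {
        const char *end;
        size_t len;

        if (*p != '-')
            return -1;              /* every field starts with a hyphen */
        p++;
        end = strchr(p, '-');
        if (end == NULL)
            end = p + strlen(p);    /* last field runs to end of string */
        len = (size_t)(end - p);
        if (strcmp(xlfd_field_names[i], name) == 0) {
            if (len >= outlen)
                return -1;
            memcpy(out, p, len);
            out[len] = '\0';
            return 0;
        }
        p = end;
    }
    return -1;
}
```

With the name -adobe-courier-medium-r-normal--24-240-75-75-m-150-iso8859-1, asking for FAMILY_NAME gives courier and asking for SPACING gives m; the empty ADD_STYLE_NAME field between the two adjacent hyphens comes back as an empty string.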
For localized usage, the following fields are critical for selecting font sets:

• FAMILY_NAME    %F
• WEIGHT_NAME    %W
• SLANT          %S
• ADD_STYLE      %A
• SPACING        %SP

This reduces the number of fields to consider, yet the possible values for each of these fields may vary per locale. The actual point size (POINT_SIZE) may vary across platforms. Throughout this documentation, the following convention should be used when specifying localized fonts:

    -dt-%F-%W-%S-normal-%A-*-*-*-%SP-*

The following describes the minimum set of recommended values for the critical fields to be used within the desktop when specifying font sets in resource (app-defaults) files.

Latin ISO8859-1 Fonts

    FOUNDRY        ‘dt’
    FAMILY_NAME    ‘interface user’, ‘interface system’, ‘application’
    WEIGHT_NAME    medium or bold
    SLANT          r or i
    ADD_STYLE      sans or serif
    SPACING        p or m

Other ISO8859 Fonts

The same values defined for ISO8859-1 are recommended.

JIS Japanese Font

    FOUNDRY        ‘dt’
    FAMILY_NAME    Gothic or Mincho
    WEIGHT_NAME    medium or bold
    SLANT          r
    ADD_STYLE      *
    SPACING        m

KSC Korean Font

    FOUNDRY        ‘dt’
    FAMILY_NAME    Totum or Pathang
    WEIGHT_NAME    medium or bold
    SLANT          r
    ADD_STYLE      *
    SPACING        m

Note – The FAMILY_NAME values may change depending on the official romanization of the two common font families in use. As background, Totum corresponds to fonts typically shipped as Gothic, Kodig, or Dotum; Pathang corresponds to fonts typically shipped as Myungo or Myeongjo.

CNS Traditional Chinese Font

    FOUNDRY        ‘dt’
    FAMILY_NAME    Sung and Kai
    WEIGHT_NAME    medium or bold
    SLANT          r
    ADD_STYLE      *
    SPACING        m

GB Simplified Chinese Font

    FOUNDRY        ‘dt’
    FAMILY_NAME    Song and Kai
    WEIGHT_NAME    medium or bold
    SLANT          r
    ADD_STYLE      *
    SPACING        m

Drawing Localized Text

There are several mechanisms provided to render a localized string, depending on the Motif or Xlib library being used.
The following discusses the interfaces that are recommended for internationalized applications. However, it is recommended that all localized data be externalized from the program using the simple text.

Simple Text

The following Xlib multibyte (char*) drawing functions are available for internationalization:
• XmbDrawImageString()
• XmbDrawString()
• XmbDrawText()

The following Xlib wide character (wchar_t*) drawing functions are available for internationalization:
• XwcDrawImageString()
• XwcDrawString()
• XwcDrawText()

The following Xlib multibyte (char*) font metric functions are available for internationalization:
• XExtentsOfFontSet()
• XmbTextEscapement()
• XmbTextExtents()
• XmbTextPerCharExtents()

The following Xlib wide character (wchar_t*) font metric functions are available for internationalization:
• XExtentsOfFontSet()
• XwcTextEscapement()
• XwcTextExtents()
• XwcTextPerCharExtents()

XmString (Compound String)

For the Xm library, localized text should be inserted into XmString segments using XmStringCreateLocalized(). The tag associated with localized text is XmFONTLIST_DEFAULT_TAG, which is used to match an entry in a font list. Applications that mix several fonts within a compound string using XmStringCreate() should use XmFONTLIST_DEFAULT_TAG as the tag for any localized string.

More importantly, for interclient communications, the XmCvtXmStringToCT() function treats a segment tagged with XmFONTLIST_DEFAULT_TAG as being encoded in the code set of the locale. With any other tag name, the Xm library may not be able to properly identify the encoding of text data in interclient communications.

A localized string segment inside an XmString can be drawn with a font list having a font set with XmFONTLIST_DEFAULT_TAG. Use of a localized string is recommended for portability.
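The Xmb drawing and metric functions listed under Simple Text take a char* string with a length in bytes, while the Xwc functions take a wchar_t* string with a length in wide characters. The following is a minimal sketch, using only the standard C function mbsrtowcs(), of converting locale-encoded text into the form the Xwc functions expect. The helper name mb_to_wc_dup() is hypothetical, and the sketch assumes the locale has already been set (for example, through XtSetLanguageProc() or setlocale()).

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert a multibyte string in the current locale
 * into a newly allocated wide character string.  On success, stores the
 * number of wide characters (excluding the terminator) in *out_len and
 * returns the string, which the caller releases with free().
 * Returns NULL on an invalid multibyte sequence or allocation failure.
 */
wchar_t *mb_to_wc_dup(const char *mbs, size_t *out_len)
{
    mbstate_t state;
    const char *src = mbs;
    wchar_t *wcs;
    size_t n;

    /* First pass: a NULL destination only counts the wide characters. */
    memset(&state, 0, sizeof state);
    n = mbsrtowcs(NULL, &src, 0, &state);
    if (n == (size_t) -1)
        return NULL;

    wcs = malloc((n + 1) * sizeof *wcs);
    if (wcs == NULL)
        return NULL;

    /* Second pass: convert, including the terminating null. */
    memset(&state, 0, sizeof state);
    src = mbs;
    if (mbsrtowcs(wcs, &src, n + 1, &state) == (size_t) -1) {
        free(wcs);
        return NULL;
    }
    if (out_len != NULL)
        *out_len = n;
    return wcs;
}
```

The count stored in *out_len is what an Xwc function such as XwcDrawString() would take as its length argument, whereas the corresponding Xmb call on the original string would take strlen(mbs), a byte count.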
The following is an example of creating a font list for drawing a localized string:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <Xm/Xm.h>

    XFontSet FontList2FontSet(XmFontList fontlist);  /* shown earlier */

    /* Load a font set from an XLFD pattern and wrap it in a font list
       tagged with XmFONTLIST_DEFAULT_TAG. */
    XmFontList CreateFontList(Display *dpy, char *pattern)
    {
        XmFontListEntry font_entry;
        XmFontList fontlist;

        font_entry = XmFontListEntryLoad(dpy, pattern,
                                         XmFONT_IS_FONTSET,
                                         XmFONTLIST_DEFAULT_TAG);
        fontlist = XmFontListAppendEntry(NULL, font_entry);
        /* XmFontListEntryFree(&font_entry); */
        if (fontlist == NULL) {
            XtWarning("fl2fs: can't create fontlist...");
            exit(1);
        }
        return fontlist;
    }

    int main(int argc, char **argv)
    {
        Display *dpy;               /* Display */
        XtAppContext app_context;   /* Application context */
        XmFontList fontlist;
        XFontSet fontset;
        XFontStruct **fontstructs;
        char **fontnames;
        int i, n;
        char *progname;             /* program name without the full pathname */

        if ((progname = strrchr(argv[0], '/')) != NULL) {
            progname++;
        } else {
            progname = argv[0];
        }

        /* Initialize toolkit and open display. */
        XtSetLanguageProc(NULL, NULL, NULL);
        XtToolkitInitialize();
        app_context = XtCreateApplicationContext();
        dpy = XtOpenDisplay(app_context, NULL, progname, "XMdemos",
                            NULL, 0, &argc, argv);
        if (!dpy) {
            XtWarning("fl2fs: can't open display, exiting...");
            exit(1);
        }
        if (argc < 2) {
            XtWarning("fl2fs: no font pattern given, exiting...");
            exit(1);
        }

        fontlist = CreateFontList(dpy, argv[1]);
        fontset = FontList2FontSet(fontlist);
        if (fontset == NULL) {
            XtWarning("fl2fs: no font set in fontlist, exiting...");
            exit(1);
        }

        /* Print out the base font names of the font set. */
        n = XFontsOfFontSet(fontset, &fontstructs, &fontnames);
        printf("Fonts for %s is %d\n", argv[1], n);
        for (i = 0; i < n; ++i)
            printf("font[%d] - %s\n", i, fontnames[i]);
        exit(0);
    }

A localized string can be written in resource files because a compound string specified in resource files has a locale-encoded segment with XmFONTLIST_DEFAULT_TAG. For example, the fontList resource in the following example is automatically associated with XmFONTLIST_DEFAULT_TAG.
    labelString: Japanese string
    *fontList:-dt-interface system-medium-r-normal-L*-*-*-*-*-*-*:

The following set of XmString functions is recommended for internationalization:
• XmStringCreateLocalized()
• XmStringDraw()
• XmStringDrawImage()
• XmStringDrawUnderline()

The following set of XmString functions is not recommended for internationalization because it takes a direction that may not work with languages not covered:
• XmStringCreateLtoR()
• XmStringSegmentCreate()

Inputting Localized Text

Input for localized text is typically done by using either the local input method or the network-based input method. The local input method means that the input method is built into Xlib. It is typically used for a language that can be composed using simple rules and that does not require language-specific features. The network-based input method means that the actual input method is provided by separate servers, and Xlib communicates with them through the XIM protocol to do the language-specific composition.

Basic Prompts and Dialogs

It is strongly recommended that applications use the Text widget to do all text input.

Input within a DrawingArea Widget

Many applications do their own drawing within a widget based on input. To provide consistency within the desktop environment, XmIm functions are recommended because the style and geometry management needed for an input method is managed by the VendorShell widget class. The application need only worry about handling key events, focus, and communicating the current input location within the drawing area. Using these functions requires some basic knowledge of the underlying Xlib input method architecture, but a developer need only be concerned with the XmIm pieces of information.
Application-Specific and Language-Specific Intermediate Feedbacks

Some applications may need to display intermediate feedback directly during preediting, such as when an application needs more than the functions supplied by Xlib. Examples include PostScript rendering and vertical writing.

The core Xlib provides the common set of interfaces that allow an application to display intermediate feedback during preediting. By registering the application’s callbacks and setting the preediting style to XNPreeditCallbacks, an application can get the intermediate preediting data from the input method and can draw whatever it needs.

Applications intended to do sophisticated language processing may recognize extensions within a specific XIM implementation and its input method engines. Such applications are on the leading edge and will require familiarity with details of the XIM functions.

Text and TextField Widget

For basic prompts and dialogs, the Text or TextField widget is recommended. Besides resources, all of the XmTextField and XmText functions are available for getting and for setting localized text inside a Text[Field] widget.

Most XmText functions are based on the number of characters, not on the number of bytes. For example, all XmTextPosition values are character positions, not byte positions. The XmTextGetMaxLength() function returns the number of bytes. When in doubt, remember that positions are always in character units.

The width of a Text or TextField widget is determined by the resource value of XmNcolumns. However, this value means the number of the widest characters in the font set, not the number of bytes or columns. For example, suppose that you have selected a variable-width font for the Text widget. The character i may have a width of 1 pixel, while the character W may have a width of 7 pixels.
When a value of 10 is set for XmNcolumns, this is considered a request to make the Text widget wide enough to be able to display at least 10 characters. So the Text widget must use the width of the widest character to determine the pixel width of its core widget. With this example, it may be able to display 10 W characters in the widget, or 70 i characters. This structure for XmNcolumns may cause problems in locales whose code set is a multibyte and multicolumn encoding. As such, this value should be set within a localized resource.

The following section identifies the set of functions available to applications for managing input methods. For applications that use the Text and TextField widgets, refer to “Input Method (Keyboards).”

Character Input within Customized Widgets Not Using Text[Field] Widgets

In some cases, an application may obtain character input from the user but does not use a TextField or Text widget to do so. For example, an application using a DrawingArea widget may allow the user to type text directly into the DrawingArea. In this case, the application could use the Xlib XIM functions as described in later sections, or alternatively, the application may use the XmIm functions of Motif 1.2. The XmIm functions allow an application to connect to and interact with an input method with a minimum of code. Further, they allow the Motif VendorShell widget to take care of geometry management for the input method on the application’s behalf.

Although the XmIm functions are shipped in all implementations of Motif 1.2, the functions are not documented in Motif 1.2. OSF has announced its intention to augment and document the XmIm functions for Motif 2.0. The functions described here are the Motif 1.2 XmIm functions.

Note: The Motif 1.2 XmIm functions do not support preedit callback style or status callback style input methods.
The preedit callback can be used by the Xlib API. For more information, see “XIM Management.”

Following are the XmIm functions you can safely use in a Motif 1.2-based application. The formal description of the parameters and types can be found in the Xm.h header file.

    Function Name           Description
    XmImRegister()          Performs XOpenIM() and queries the input method
                            for supported styles.
    XmImSetValues()         Negotiates and selects the preedit and status
                            styles.
    XmImSetFocusValues()    Creates the XIC, if one does not exist. Notifies
                            the input method that the widget has gained the
                            focus. Sets the values passed to the XIC.
    XmImUnsetFocus()        Notifies the input method that the widget has
                            lost the focus.
    XmImMbLookupString()    Xm equivalent of XmbLookupString(); converts one
                            or more key events into a character. Return
                            value is identical to XmbLookupString().
    XmImUnregister()        Disconnects the input method and the widget,
                            allowing connection to a new input method. Does
                            not necessarily close the input method
                            (implementation-dependent).

The XmImSetValues() and XmImSetFocusValues() functions allow the application to pass information needed by the input method. It is important for the application to pass all values, even though not all values are needed for each supported preedit and status style. This is because the application can never be sure which style has been selected by the user or the VendorShell widget. Following are the arguments and data types of each value that should be passed in each call to the XmImSet[Focus]Values() function.

    Argument Name           Data Type
    XmNbackground           Pixel
    XmNforeground           Pixel
    XmNbackgroundPixmap     Pixmap
    XmNspotLocation         XPoint
    XmNfontList             Motif fontlist
    XmNlineSpace            int (pixel height between consecutive baselines)

The XmIm functions are used in the following manner:
• Before initializing the toolkit, the application should call XtSetLanguageProc(NULL, NULL, NULL) to initialize the locale.
• After creating the widget where character input is desired, the application should call XmImRegister(widget) to open the input method and establish a connection.
• After establishing a connection to the input method, the application should pass the initial XIC values to the input method by calling XmImSetValues() and passing all of the values listed above. This function takes an arg_list and a number_args argument. The arg_list is loaded by calling XtSetArg().
• Add an event handler, through the XtAddEventHandler() function, for the manager widget of the widget obtaining input from the input method. The event handler is for the FocusChangeMask mask. The handler should call XmImSetFocusValues() when gaining focus and should call XmImUnsetFocus() when losing focus. When setting focus for the input method, pass the full set of values listed above.
• Add a DestroyCallback for the widget obtaining input from the input method. In the destroy callback, call XmImUnregister() to notify the input method that you are breaking the connection between the widget and the input method.
• Use XmImSetValues() to notify the input method any time one or more of the input method values listed above change (for example, spotLocation).

XIM Management

Following are the XIM management functions.

    Function Name       Description
    XOpenIM()           Establishes a connection to an input method.
    XCloseIM()          Removes a connection to an input method previously
                        established with a call to XOpenIM().
    XGetIMValues()      Queries the input method for a list of properties.
                        Currently, the only standard argument in Xlib is
                        XNQueryInputStyle.
    XDisplayOfIM()      Returns the display associated with an input method.
    XLocaleOfIM()       Returns a string identifying the locale of the input
                        method. There are no standard strings; the value
                        returned by this call is implementation-defined.
    XCreateIC()         Creates an input context. The input context contains
                        both the data required (if any) by an input method
                        and the information required to display that data.
    XDestroyIC()        Destroys an input context, freeing any associated
                        memory.
    XIMOfIC()           Returns the input method currently associated with a
                        given input context.
    XSetICValues()      Passes zero or more values to an input context to
                        control input of character data, or control display
                        of preedit or status information. A table of all
                        valid input context value arguments can be found in
                        the X11R5 specification.
    XGetICValues()      Queries an input context to get zero or more input
                        context values. A table of all valid input context
                        value arguments can be found in the X11R5
                        specification.

XIM Event Handling

Following are the XIM event handling functions:

    Function Name       Description
    XmbLookupString()   Converts keypress events into characters.
    XwcLookupString()   Converts keypress events into wide characters.
    XmbResetIC()        Resets an input context to its initial state. Any
                        input pending on that context is deleted. Returns
                        the current preedit value as a char* string.
                        Depending on the implementation of the input method,
                        the return value may be NULL.
    XwcResetIC()        Resets an input context to its initial state. Any
                        input pending on that context is deleted. Returns
                        the current preedit value as a wchar_t* string.
    XFilterEvent()      Allows the input method to process any incoming
                        events to the clients before the application
                        processes them.
    XSetICFocus()       Notifies the input method that the focus window
                        attached to the specified input context has received
                        keyboard focus.
    XUnsetICFocus()     Notifies the input method that the specified input
                        context has lost the keyboard focus and that no more
                        input is expected on the focus window attached to
                        that context.

XIM Callback

X Input Methods (XIMs) provide three categories of callbacks.
One is preedit callbacks, which allow applications to display the intermediate feedback during preediting. The second is geometry callbacks, which allow applications and XIM to negotiate the geometry to be used for XIM. The third is status callbacks, which allow applications to display the internal status of XIM.

Table 2-2 XIM Callbacks

    XIM Preedit Callbacks        (*PreeditStartCallback)()
                                 (*PreeditDoneCallback)()
                                 (*PreeditDrawCallback)()
    XIM Status Callbacks         (*StatusStartCallback)()
                                 (*StatusDoneCallback)()
                                 (*StatusDrawCallback)()
    XIM Preedit Caret Callbacks  (*PreeditCaretCallback)()
    XIM Geometry Callbacks       (*GeometryCallback)()

Extracting Localized Text

Although there are different methods to localize an application, the general rule is that any language-dependent information is kept outside the application and is stored in separate directories identified by a locale name.

This section describes how the user, the application developer, and the implementation combine to establish the language environment of the application. Two general approaches to localizing applications are also discussed.

The following three methods can be used:
• Resource files
• Message catalogs
• Private files

Resource Files

This is the GUI toolkit mechanism for customizing all sorts of information about an application. The Intrinsics library (libXt) provides a sophisticated mechanism for merging the command-line options, application-defined resources, and user-defined resources. Resource files can be used for extracting localized text. The difference between resource files and message catalogs is that the resource database is compiled each time it is loaded. As such, care should be taken when deciding which strings to place in resource files and which to place in message catalogs. Also note that the Xm library functions do not depend on the LC_MESSAGES category when specifying the location from which localized resources are loaded.
Refer to the XtSetLanguageProc() man page for more information.

Message Catalogs

This is the traditional operating system mechanism for accessing external databases containing localized text. These functions load a precompiled catalog file that is ready to be accessed. They also provide defaults within the actual program for cases when no catalogs can be found. The messaging support is based on both the XPG4 and System V Release 4 (SVR4) interfaces for accessing message catalogs.

Private Files

Private databases can be used by applications to provide generic, customized databases for more than just localization text. Usually, such databases do contain text. If the database is to be spread out over many files, it is recommended that some run-time indirect access of localized text be provided. Without this access, localization for the average user is a difficult effort. Generally, such private file formats are discouraged by groups doing localization, but problems are reduced if a tool is provided specifically for localization of text only.

Message Guidelines

Message guidelines foster consistent formatting of message and help information. They also promote creation and maintenance of messages that can be easily understood by inexperienced English-speaking end users, as well as by inexperienced translators. Use these guidelines to create message files that are consistent in language and clear in meaning. Distribution of these guidelines enables programmers and writers to coordinate their message-writing efforts. Default messages, external message files, and planned delivery of translatable messages are required for each executable to fully implement international language support.

Message Extraction Functions

One of the requirements of internationalizing programs (basic commands and utilities inclusive) is that the messages displayed on the output devices be in the language of the user.
As these programs may be used in many countries (international locales), the messages must be translated into the various languages of these countries.

There are two sets of message extraction functions in the desktop environment: XPG4 functions and Xlib functions.

XPG4/Universal UNIX Messaging Functions

The XPG4 message facility consists of several components: message source files, catalog generation facilities, and programming interfaces. Following are the XPG4/Universal UNIX message functions:
• catopen()
• catgets()
• catclose()

XPG4 Messaging Examples

This example has three parts that demonstrate how to retrieve a message from a catalog. The first part shows the message source file, and the second part shows the method used to generate the catalog file. The third part shows an example program using this catalog.

Message Source File

The message catalog can be specified as follows:

example.msg file:

    $quote "
    $ every message catalog should have a beginning set number.
    $set 1 This is the set 1 of messages
    1 "Hello world\n"
    2 "Good Morning\n"
    3 "example: 1000.220 Read permission is denied for the file %s.\n"
    $set 2
    1 "Howdy\n"

Generation of Catalog File

This file is input to the gencat utility to generate the message catalog example.cat as follows:

    gencat example.cat example.msg

Accessing the Catalog in a Program

    #include <stdio.h>
    #include <locale.h>
    #include <nl_types.h>

    char *MF_EXAMPLE = "example.cat";

    int main(void)
    {
        nl_catd catd;

        (void) setlocale(LC_ALL, "");
        catd = catopen(MF_EXAMPLE, 0);

        /* Get the message number 1 from the first set. */
        printf("%s", catgets(catd, 1, 1, "Hello world\n"));

        /* Get the message number 1 from the second set. */
        printf("%s", catgets(catd, 2, 1, "Howdy\n"));

        /* Display an error message; the message is used as a format. */
        printf(catgets(catd, 1, 3,
               "example: 1000.220 Read permission is denied for the file %s.\n"),
               MF_EXAMPLE);

        catclose(catd);
        return 0;
    }

Xlib Messaging Functions

The following Xlib messaging functions provide a similar input/output (I/O) operation to the resources.
• XrmPutFileDatabase()
• XrmGetFileDatabase()
• XrmGetStringDatabase()
• XrmLocaleOfDatabase()

They are described in X Window System, The Complete Reference to Xlib, Xprotocol, ICCCM, XLFD - X Version 11, Release 5.

Xlib Message and Resource Facilities

Part of internationalizing a system environment toolkit-based application is not having any locale-specific data hardcoded within the application source. One common locale-specific item is messages (error and warning) returned by the application to the standard I/O.

In general, for any error or warning messages to be displayed to the user through a system environment toolkit widget or gadget, externalize the messages through message catalogs.

For dialog messages to be displayed through a toolkit component, externalize the messages through localized resource files. This is done in the same way as localizing resources, such as the XmLabel and XmPushButton classes’ XmNlabelString resource or window titles.

For example, if a warning message is to be displayed through an XmMessageBox widget class, the XmNmessageString resource cannot be hardcoded within the application source code. Instead, the value of this resource must be retrieved from a message catalog. For an internationalized application expected to run in different locales, a distinct localized catalog must exist for each of the locales to be supported. In this way, the application need not be rebuilt.

Localized resource files can be put in the /usr/lib/X11/%L/app-defaults subdirectories, or they can be pointed to by the XENVIRONMENT environment variable. The %L variable is replaced with the name of the locale used at run time.

Localized Resources

This section describes which widget and gadget resources are locale-sensitive. The information is organized by related functionality.
For example, the first section describes those resources that are locale-sensitive for widgets used to display labels or to provide push-button functionality. Labels and Buttons Table 2-3 lists the localized resources that are used as labels. Many of them are of type XmString. The rest are of type color or char*. See the Motif 1.2 Reference Manual for detailed descriptions of these resources. In each case, the application should not hardcode these resources. If resource values need to be specified by the application, it should be done with the app-defaults file, ensuring that the resource can be localized. Only the widget class resources are listed here; subclasses of these widgets are not listed. For example, the XmDrawnButton widget class does not introduce any new resources that are localized. However, it is a subclass of the XmLabelWidget widget class; therefore, its accelerator resource, acceleratorText resource, and so on, are also localized and should not be hardcoded by an application. 
Table 2-3 Localized Resources

    Widget Class                Resource Name
    Core                        *background: (1)
    XmCommand                   *command:
    XmCommand                   *promptString:
    XmFileSelectionBox          *dirListLabelString:
    XmFileSelectionBox          *fileListLabelString:
    XmFileSelectionBox          *filterLabelString:
    XmFileSelectionBox          *noMatchString:
    XmLabel[Gadget]             *accelerator:
    XmLabel[Gadget]             *acceleratorText:
    XmLabel[Gadget]             *labelString:
    XmLabel[Gadget]             *mnemonic:
    XmList                      *stringDirection:
    XmManager                   *stringDirection:
    XmMessageBox                *cancelLabelString:
    XmMessageBox                *helpLabelString:
    XmMessageBox                *messageString:
    XmMessageBox                *okLabelString:
    XmPrimitive                 *foreground: (1)
    XmRowColumn                 *labelString:
    XmRowColumn                 *menuAccelerator:
    XmRowColumn                 *mnemonic:
    XmRowColumn(SimpleMenu*)    *buttonAccelerators:
    XmSelectionBox              *applyLabelString:
    XmSelectionBox              *cancelLabelString:
    XmSelectionBox              *helpLabelString:
    XmSelectionBox              *listLabelString:
    XmSelectionBox              *okLabelString:
    XmSelectionBox              *selectionLabelString:
    XmSelectionBox              *textAccelerators:

    (1) The foreground and background colors are not localized due to restrictions in the X protocol that require color names to be limited to the portable character set. Localized color names are left to applications, which must provide a localized database that maps each one to a name encoded with the portable character set.

Note that the XmRowColumn widget has additional string resources that may be localized. These resources are listed in the XmRowColumn man page, under the heading “Simple Menu Creation Resource Set.” As the title implies, these resources affect only RowColumn widgets created with the XmCreateSimpleMenu() function. The resources affected are: *buttonAccelerators, *buttonAcceleratorText, *buttonMnemonics, *optionLabel, and *optionMnemonic.
These resources are not included in Table 2-3 because they are rarely used and apply to RowColumn only when creating a simple menu.

List Resources

Several widgets allow applications to set or read lists of items in the widget. Table 2-4 shows which widgets allow this and the resources they use to set or read these lists. Because the list items may need to be localized, do not hardcode these lists. Rather, they should be set as resources in app-defaults files, allowing them to be localized. The type for each list is XmStringList.

Table 2-4 Resources Used for Reading Lists

    Widget Class        Resource Name
    XmList              *items:
    XmList              *selectedItems:
    XmSelectionBox      *listItems:

Title

Table 2-5 lists the resources used for setting titles and icon names. Normally, an application need only set the *title: and *iconName: resources. The encoding of each is automatically detected for clients doing proper locale management. All of these are of type char or XmString.

Table 2-5 Resources Used for Setting Titles and Icon Names

    Widget Class        Resource Name
    TopLevelShell       *iconName:
    TopLevelShell       *iconNameEncoding: (1)
    WMShell             *title:
    WMShell             *titleEncoding: (1)
    XmBulletinBoard     *dialogTitle:
    XmScale             *titleString:

    (1) This resource should not be set by the application. If the application calls XtSetLanguageProc, the default value (None) of this resource will automatically be set, ensuring that localized text can be used for the title.

Text Widget

Table 2-6 lists the Text[Field] resources that are locale-sensitive or about which the developer of an internationalized application should know.
Table 2-6 Locale-Sensitive Text[Field] Resources

    Widget Class        Resource Name
    XmSelectionBox      *textColumns: (1)
    XmSelectionBox      *textString:
    XmText              *columns: (1)
    XmText              *modifyVerifyCallback:
    XmText              *modifyVerifyCallbackWcs:
    XmText              *value:
    XmText              *valueWcs:
    XmTextField         *columns: (1)
    XmTextField         *modifyVerifyCallback:
    XmTextField         *modifyVerifyCallbackWcs:
    XmTextField         *value:
    XmTextField         *valueWcs:

    (1) The *columns resource specifies the initial width of the Text[Field] widget in terms of the number of characters to be displayed. In the case of a variable-width font or in a locale where the size of a character varies significantly, a column is the amount of space required to display the widest character in that locale’s character repertoire. For example, a column width of 10 guarantees that at least 10 characters of the current locale can be displayed; it is possible (likely) that more than that number of characters can be displayed in the allocated space.

Input Method (Keyboards)

Table 2-7 lists localized resources for customizing the input method. These resources allow the user or the application to control which input method will be used for the specified locale and which preedit style (if applicable and available) will be used.

Table 2-7 Localized Resources for Input Method Customization

    Widget Class        Resource Name
    VendorShell         *inputMethod:
    VendorShell         *preeditType:

Pixmap (Icon) Resources

Table 2-8 lists pixmap resources. In some cases, a different pixmap may be needed for a given locale.
Table 2-8 Pixmap Resources

    Widget Class              Resource Name
    Core                      *backgroundPixmap:
    WMShell                   *iconPixmap:
    XmDragIcon                *pixmap:
    XmDropSite                *animation[Mask|Pixmap]:
    XmLabel[Gadget]           *labelInsensitivePixmap:
    XmLabel[Gadget]           *labelPixmap:
    XmMessageBox              *symbolPixmap:
    XmPushButton[Gadget]      *armPixmap:
    XmToggleButton[Gadget]    *selectInsensitivePixmap:
    XmToggleButton[Gadget]    *selectPixmap:

A pixmap is a screen image that is stored in memory so that it can be recalled and displayed when needed. The desktop has a number of pixmap resources that allow the application to supply pixmaps for backgrounds, borders, shadows, label and button faces, drag icons, and other uses. As with text, some pixmaps may be specific to particular language environments; these pixmaps must be localized.

The desktop maintains caches of pixmaps and images. The XmGetPixmapByDepth() function searches these caches for a requested pixmap. If the requested pixmap is not in the pixmap cache and a corresponding image is not in the image cache, the XmGetPixmapByDepth() function searches for an X bitmap file whose name matches the requested image name. The XmGetPixmapByDepth() function calls the XtResolvePathname() function to search for the file. If the requested image name is an absolute path name, that path name is the search path for the XtResolvePathname() function. Otherwise, the XmGetPixmapByDepth() function constructs a search path in the following way:
• If the XBMLANGPATH environment variable is set, the value of that variable is the search path.
• If XBMLANGPATH is not set but XAPPLRESDIR is set, the XmGetPixmapByDepth() function uses a default search path with entries that include $XAPPLRESDIR, the user’s home directory, and vendor-dependent system directories.
• If neither XBMLANGPATH nor XAPPLRESDIR is set, the XmGetPixmapByDepth() function uses a default search path with entries that include the user’s home directory and vendor-dependent system directories.

These paths may include the %B substitution field. In each call to the XtResolvePathname() function, the XmGetPixmapByDepth() function substitutes the requested image name for %B. The paths may also include other substitution fields accepted by the XtResolvePathname() function. In particular, the XtResolvePathname() function substitutes the display’s language string for %L, and it substitutes the components of the display’s language string (in a vendor-dependent way) for %l, %t, and %c. The substitution field %T is always mapped to bitmaps, and %S is always mapped to Null.

Because there is no string-to-pixmap converter supplied by default, pixmaps are generally set by the application at creation time by first retrieving the pixmap with a call to XmGetPixmap(). XmGetPixmap() uses the current locale to determine where to locate the pixmap. (See the XmGetPixmap() man page for a description of how locale is used to locate the pixmap.)

Font Resources

Table 2-9 lists the localized font resources. All XmFontList resources are of type XmFontList. In almost all cases, a fontset should be used when specifying a fontlist element. The only exception is when displaying character data that does not appear in the character set of the user (for example, displaying math symbols or dingbats).
Table 2-9 Localized Font Resources

Widget Class        Resource Name
VendorShell         *buttonFontList:
VendorShell         *defaultFontList:
VendorShell         *labelFontList:
VendorShell         *textFontList:
XmBulletinBoard     *buttonFontList:
XmBulletinBoard     *defaultFontList:
XmBulletinBoard     *labelFontList:
XmBulletinBoard     *textFontList:
XmLabel[Gadget]     *fontList:
XmList              *fontList:
XmMenuShell         *buttonFontList:
XmMenuShell         *defaultFontList:
XmMenuShell         *labelFontList:
XmText              *fontList:
XmTextField         *fontList:

Operating System Internationalized Functions

Table 2-10 lists the base operating system internationalized functions in a common open software environment.

Applications should perform proper locale management with the assumption that a locale may have from 1 to 4 bytes per coded character.

Table 2-10 Base Operating System Internationalized Functions

Locale Management      Single-byte                  Multibyte                    Wide Character
Convert mb <-> wc                                   mbtowc mbstowcs
                                                    wctomb wcstombs
Classification         isalpha is*                                               iswalpha isw* wctype
Case Mapping           tolower toupper                                           towlower towupper
Format Miscellaneous   localeconv nl_langinfo
Format of Numeric      strtol strtod                                             wcstol wcstod wcstoi
Format Time/Monetary   strftime strptime strfmon                                 wcsftime
String Copy            strcat strcpy                                             wcscat wcsncat
                       strncat strncpy                                           wcscpy wcsncpy
String Collate         strcoll                                                   wcscoll wcsxfrm
String Misc            strlen                       mblen                        wcscmp wcsncmp
String Search          strchr strcspn strpbrk                                    wcschr wcscspn wcspbrk
                       strrchr strspn strtok                                     wcsrchr wcsspn wcstok wcswcs
I/O Display Width                                                                wcwidth [1] wcswidth
I/O Printf             printf vprintf sprintf       printf vprintf sprintf
                       vsprintf fprintf vfprintf    vsprintf fprintf vfprintf
I/O Scan               scanf sscanf fscanf          scanf sscanf fscanf
I/O Character          getc gets putc puts                                       fgetwc fgetws fputwc fputws ungetwc
Message                gettxt catopen catgets catclose
Convert Codeset        iconv_open iconv iconv_close

1. These functions are provided for applications using terminals. Graphical user interface (GUI) applications should not use these functions; instead, they should use font metric functions listed on page 31 to determine spacing.

3  Internationalization and Distributed Networks

This chapter discusses tasks related to internationalization and distributed networks.

Interchange Concepts
Simple Text Basic Interchange
Mail Basic Interchange
Encodings and Code Sets

Interchange Concepts

This section describes the way 8-bit user names and 8-bit data can be communicated on a network for communications utilities, such as ftp, mail, or interclient communication between the desktop clients.

There are three primary considerations for communicating data:

• Sender’s code set and the receiver’s code set.
• Whether the communications protocol allows 8-bit data or is limited to 7-bit coded data (for example, the Japanese JUNET passes Japanese Industrial Standard (JIS) coded data over 7-bit protocols).
• Type of interchange encoding available, per protocol rules.

The actual conversion needed is dependent on the specific protocol used.

If the remote host uses the same code set as the local host, the following is true:

• If the protocol allows 8-bit data, no conversions are needed.
• If the protocol allows only 7-bit data, a method is needed to map the 8-bit code points to 7-bit ASCII values. This could be accomplished using the iconv framework and one of the following types of 7-bit encoding methods:
    • Map 8-bit data as specified in the POSIX.2 specification for the uuencode and uudecode algorithms.
    • Optionally, map the 8-bit data to a 7-bit interchange encoding as defined by the protocol; for example, 7-bit ISO2022 in Xlib or base64 in Multipurpose Internet Mail Extensions (MIME).
If the remote host’s code set is different from that of the local host, the following two cases may apply. The conversion needed is dependent on the specific protocol used.

• If the protocol allows 8-bit data, the protocol must specify which side does the iconv conversion and which encoding is used on the wire. Some protocols recommend an 8-bit interchange encoding that is capable of encoding all possible code sets and identifying the character repertoire.
• If the protocol allows only 7-bit data, a 7-bit interchange encoding is needed, as is identification of the character repertoire.

iconv Interface

In a network environment, the code sets of the communicating systems and the protocols of communication determine the transformation of user-specified data so that it can be sent to the remote system in a meaningful way. The user data (not user names) may need to be transformed from the sender’s code set to the receiver’s code set, or 8-bit data may need to be transformed into a 7-bit form to conform to protocols. A uniform interface is needed to accomplish this.

The following examples illustrate the iconv interface by showing how to use iconv_open(), iconv(), and iconv_close(). To do the conversion, iconv_open() must be followed by iconv(). The terms 7-bit interchange and 8-bit interchange are used to refer to any interchange encoding used for 7-bit and 8-bit data, respectively.

Sender and Receiver Use the Same Code Sets:

• If the protocol allows 8-bit data, use 8-bit data because the same code set is being used. No conversion is needed.
• If the protocol allows only 7-bit data, use iconv:
    • Sender
        cd = iconv_open("uucode", locale_codeset);
    • Receiver
        cd = iconv_open(locale_codeset, "uucode");

Sender and Receiver Use Different Code Sets:

• If the protocol allows 8-bit data:
    • Sender
        cd = iconv_open(8-bitinterchange, locale_codeset);
    • Receiver
        cd = iconv_open(locale_codeset, 8-bitinterchange);
• If the protocol allows only 7-bit data, do the following:
    • Sender
        cd = iconv_open(7-bitinterchange, locale_codeset);
    • Receiver
        cd = iconv_open(locale_codeset, 7-bitinterchange);

In each call, iconv_open() takes the target code set as its first argument and the source code set as its second. The locale_codeset refers to the code set being used locally by the application. Note that while the nl_langinfo(CODESET) function may be used to obtain the code set associated with the current locale, it is implementation-dependent whether any conversion names match the return from the nl_langinfo(CODESET) function.

Table 3-1 outlines how iconv can be used to perform conversions for various conditions. Specific protocols may dictate other conversions needed.

Table 3-1 Using iconv to Perform Conversions

                                    Communication with system using      Communication with system using different code
                                    the same code set (for example, XYZ)  sets or receiver’s code set is unknown
Conversion to Use                   7-bit Protocol    8-bit Protocol      7-bit Protocol          8-bit Protocol
code XYZ                            Invalid           Best choice         Invalid                 Invalid if remote code set is unknown
7-bit Interchange (ISO2022)         OK                OK                  Best choice             OK
8-bit Interchange (ISO2022,         Invalid [1]       OK                  Invalid                 Best choice
  ISO 10646)
7-bit Untagged (quoted-printable,   OK                OK                  Requires code set       Requires code set
  uucode, base64)                                                         identification          identification
8-bit Untagged                      Invalid           OK                  Requires code set       Requires code set
                                                                          identification          identification

1. Invalid means the interchange encoding should not be used for the choice of code set and type of protocol.
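The sender and receiver cases above share one open, convert, close sequence. The following is a minimal sketch of that sequence, not a definitive implementation; the helper name convert_buffer() and its parameters are hypothetical and do not come from this guide, and the code set names passed to it must be names that the local iconv implementation actually recognizes.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: convert inlen bytes of `in` from code set `from`
 * to code set `to`. Returns the number of bytes written to `out`, or
 * (size_t)-1 if no converter exists or the conversion fails. */
size_t convert_buffer(const char *to, const char *from,
                      const char *in, size_t inlen,
                      char *out, size_t outcap)
{
    /* iconv_open() takes the target code set first, then the source. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /* iconv() takes char **, so the const qualifier is cast away here;
     * the input buffer itself is not modified. */
    char *inp = (char *)in;
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1) {
        /* For stateful encodings, emit any sequence needed to return
         * the output to its initial shift state. */
        rc = iconv(cd, NULL, NULL, &outp, &outleft);
    }

    iconv_close(cd);
    return (rc == (size_t)-1) ? (size_t)-1 : outcap - outleft;
}
```

A sender might pass the interchange encoding name as `to` and the value of nl_langinfo(CODESET) as `from`; a receiver would swap the two. Because converter names are implementation-dependent, the (size_t)-1 return from a failed iconv_open() should be treated as an expected case rather than a fatal error.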
Stateful and Stateless Conversions

Code sets can be classified into two categories: stateful encodings and stateless encodings.

Stateful Encodings

Stateful encoding uses sequences of control codes, such as shift-in/shift-out, to change character sets associated with specific code values. For instance, under compound text, the control sequence "ESC$(B" can be used to indicate the start of Japanese 16-bit data in a data stream of characters, and "ESC(B" can be used to indicate the end of this double-byte character data and the start of 8-bit ASCII data. Under this stateful encoding, the byte value 0x43 could not be interpreted without knowing the shift state. The EBCDIC Asian code sets use shift-in/shift-out controls to switch between double-byte and single-byte encodings.

Converters that convert stateful encodings to other code sets tend to be somewhat complex because of the extra processing needed.

Stateless Encodings

Stateless code sets are those that can be classified as one of two types:

• Single-byte code sets, such as the ISO8859 family
• Multibyte code sets, such as PC codes for Japanese and Shift-JIS (SJIS)

The term multibyte code sets is also used to refer to any code set that needs one or more bytes to encode a character; multibyte code sets are considered stateless.

Note – Conversions are meaningful only if the code sets represent the same character set.

Simple Text Basic Interchange

When a program communicates data to another program residing on a remote host, a need may arise for conversion of data from the code set of the source machine to that of the receiver. For example, this happens when a PC system using PC codes needs to communicate with a workstation using an International Organization for Standardization/Extended UNIX Code (ISO/EUC) encoding.
Another example occurs when a program obtains data in one code set but has to display this data in another code set. To support these conversions, a standard program interface is provided based on the XPG4 iconv() function definitions. All components doing code set conversion should use the iconv functions as their interface to conversions. Systems are expected to provide a wide variety of conversions, as well as a mechanism to customize the default set of conversions.

iconv Conversion Functions

The common method of converting from one code set to another is table-driven. In some cases, these tables may be too large, and an algorithmic method may be more desirable. To accommodate such diverse requirements, a framework is defined in XPG4 for code set conversions. In this framework, to convert from one code set to another, open a converter, perform the conversions, and close the converter. The iconv functions are iconv_open(), iconv(), and iconv_close().

Code set converters are brought under the framework of this set of functions. With these functions, it is possible to provide and to use several different types of converters. Applications can call these functions to convert characters in one code set into characters in another code set. With the advent of the iconv framework, converters can be provided in a uniform manner. The access and use of these converters is being standardized under X/Open XPG4.

X Interclient (ICCCM) Conversion Functions

Xlib provides the following functions for doing conversions.
X ICCCM Multibyte Functions

XmbTextPropertyToTextList()
XmbTextListToTextProperty()

X ICCCM Wide Character Functions

XwcTextPropertyToTextList()
XwcTextListToTextProperty()

Note – The libXm library does provide the XmStringConvertToCT() and XmStringConvertFromCT() functions; however, these are not recommended because there are some hardcoded assumptions about certain XmString tags. For example, if the tag is bold, the behavior of XmStringConvertToCT() is implementation-dependent. Across various platforms, the behavior of this function cannot be guaranteed in all international regions.

Refer to “Interclient Communications Conventions for Localized Text” on page 122 for more information.

Window Titles

The standard way of setting titles is to use resources. But for applications that set the titles of their windows directly, a localized title must be sent to the Window Manager. Use the XCompoundTextStyle encoding defined in XICCEncodingStyle, as well as the following guidelines:

• Compound text can be created by either XmbTextListToTextProperty() or XwcTextListToTextProperty().
• Localized titles can be displayed using the XmNtitle and XmNtitleEncoding resources of the WMShell widget. Localized icon names can be displayed using the XmNiconName and XmNiconNameEncoding resources of the TopLevelShell widget.
• Localized titles of dialog boxes can also be displayed using the XmNdialogTitle resource of the XmBulletinBoard widget.
• The Window Manager should have an appropriate font list for displaying localized strings.

Following is an example of displaying a localized title and icon name. Compound text is made from the localized string in this example.
    #include <nl_types.h>

    Widget          toplevel;
    Arg             al[10];
    int             ac;
    XTextProperty   title;
    char            *localized_string;
    nl_catd         fd;

    /* Install the default language procedure before initializing the toolkit. */
    XtSetLanguageProc( NULL, NULL, NULL );

    /* Fetch the localized title from the message catalog. */
    fd = catopen( "my_prog", 0 );
    localized_string = catgets(fd, set_num, mes_num, "defaulttitle");

    /* Convert the localized string to compound text. */
    XmbTextListToTextProperty( XtDisplay(toplevel), &localized_string, 1,
                               XCompoundTextStyle, &title);

    ac = 0;
    XtSetArg(al[ac], XmNtitle, title.value); ac++;
    XtSetArg(al[ac], XmNtitleEncoding, title.encoding); ac++;
    XtSetValues(toplevel, al, ac);

If you are using a window rather than widgets, the XmbSetWMProperties() function automatically converts a localized string into the proper XICCEncodingStyle.

Mail Basic Interchange

In general, the electronic mail (email) strategy has been to turn email into a canonical, labeled format, as opposed to optimizing a message given knowledge of the receiver’s locale. This means that in the email world, you should always assume that the receiver may be in a different locale. In the desktop world, the default email transport is Simple Mail Transfer Protocol (SMTP), which supports only 7-bit transmission channels.

With this understanding, the email strategy for the desktop is as follows:

• The sending agent, by default (unless instructed otherwise by the user), converts a body part into a standard format for the sending transmission channel and labels the body part with the character encoding used.
• The receiving agent looks at the body part to see if it can support the character encoding; if it can, it converts the body part into the local character set.

In addition, because the MIME format is used for messages, any 8-bit to 7-bit transformations are done using the built-in MIME transport encodings (base64 or quoted-printable). See the Request for Comments (RFC) 1521 MIME standard specification.

Encodings and Code Sets

To understand code sets, it is necessary to first understand character sets.
A character set is a collection of predefined characters based on the specific needs of one or more languages without regard to the encoding values used to represent the characters. The choice of which code set to use depends on the user's data processing requirements. A particular character set can be encoded using different encoding schemes. For example, the ASCII character set defines the set of characters found in the English language. The Japanese Industrial Standard (JIS) character set defines the set of characters used in the Japanese language. Both the English and Japanese character sets can be encoded using different code sets.

The ISO2022 standard defines a coded character set as a group of precise rules that defines a character set and the one-to-one relationship between each character and its bit pattern. A code set defines the bit patterns that the system uses to identify characters.

A code page is similar to a code set with the limitation that a code-page specification is based on a 16-column by 16-row matrix. The intersection of each column and row defines a coded character.

Code Set Strategy

The common open software environment code set support is based on International Organization for Standardization (ISO) and industry-standard code sets that satisfy the data processing needs of users. Each locale in the system defines which code set it uses and how the characters within the code set are manipulated. Because multiple locales can be installed on the system, multiple code sets can be used by different users on the system. While the system can be configured with locales using different code sets, all system utilities assume that the system is running under a single code set.

Most commands have no knowledge of the underlying code set being used by the locale.
The knowledge of code sets is hidden by the code-set-independent library subroutines (Internationalization libraries), which pass information to the code-set-dependent subroutines.

Because many programs rely on ASCII, all code sets include the 7-bit ASCII code set as a proper subset. Because the 7-bit ASCII code set is common to all supported code sets, its characters are sometimes referred to as the portable character set. The 7-bit ASCII code set is based on the ISO646 definition and contains the control characters, punctuation characters, digits (0-9), and the English alphabet in uppercase and lowercase.

Code Set Structure

Each code set is divided into two principal areas:

• Graphic Left (GL)     Columns 0-7
• Graphic Right (GR)    Columns 8-F

The first two columns of each area are reserved by ISO standards for control characters. The terms C0 and C1 are used to denote the control characters for the Graphic Left and Graphic Right areas, respectively.

Note – The PC code sets use the C1 control area to encode graphic characters.

The remaining six columns are used to encode graphic characters (see Table 3-2 on page 65). Graphic characters are considered to be printable characters, while the control characters are used by devices and applications to indicate some special function.

Table 3-2 Code Set Overview

Columns    Contents              Area
0-1        C0 controls           Graphic Left: 7-bit ASCII
2-7        Graphic characters    Graphic Left: 7-bit ASCII
8-9        C1 controls           Graphic Right: code set unique
A-F        Graphic characters    Graphic Right: code set unique

(Each column contains rows 0 through F.)

Control Characters

Based on the ISO definition, a control character initiates, modifies, or stops a control operation. A control character is not a graphic character, but can have graphic representation in some instances. The control characters in the ISO646-IRV character set are present in all supported code sets, and the encoded values of the C0 control characters are consistent throughout the code sets.
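The column layout described under Code Set Structure maps directly onto the high four bits of a byte. The following is a small illustrative sketch, assuming an ISO-style 8-bit code set; the enum and the function name classify_byte() are hypothetical and are not part of any desktop library.

```c
/* Hypothetical names: the four areas of an 8-bit code set. */
enum code_area { AREA_C0, AREA_GL, AREA_C1, AREA_GR };

/* Classify a byte by its column, which is the high nibble:
 * columns 0-1 are C0 controls, 2-7 are Graphic Left graphics,
 * 8-9 are C1 controls, and A-F are Graphic Right graphics. */
enum code_area classify_byte(unsigned char b)
{
    unsigned int column = b >> 4;

    if (column <= 0x1)
        return AREA_C0;
    if (column <= 0x7)
        return AREA_GL;
    if (column <= 0x9)
        return AREA_C1;
    return AREA_GR;
}
```

This sketch follows the column boundaries only; it does not single out individual positions such as space or DEL, and it would not describe PC code sets, which place graphic characters in the C1 area.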
Graphic Characters

Each code set can be considered to be divided into one or more character sets, such that each character is given a unique coded value. The ISO standard reserves six columns for encoding characters and does not allow graphic characters to be encoded in the control character columns.

Single-Byte Code Sets

Code sets that use all 8 bits of a byte can support European, Middle Eastern, and other alphabetic languages. Such code sets are called single-byte code sets. This provides a limit of encoding 191 characters, not including control characters.

Multibyte Code Sets

The term multibyte code sets is used to refer to all possible code sets regardless of the number of bytes needed to encode any specific character. Because the operating system should be capable of supporting any number of bits to encode a character, a multibyte code set may contain characters that are encoded with 8, 16, 32, or more bits. Even single-byte code sets are considered to be multibyte code sets.

Extended UNIX Code (EUC) Code Set

The EUC code set uses control characters to identify and separate characters in some of the character sets. The encoding rules are based on the ISO2022 definition for the encoding of 7-bit and 8-bit data.

The term EUC denotes these general encoding rules. A code set based on EUC conforms to the EUC encoding rules but also identifies the specific character sets associated with the specific instances. For example, eucJP for Japanese refers to the encoding of the JIS characters according to the EUC encoding rules.

The first set (CS0) always contains an ISO646 character set. All of the other sets must have the most-significant bit (MSB) set to 1, and they can use any number of bytes to encode the characters.
In addition, all characters within a set must have:

• Same number of bytes to encode all characters
• Same column display width (number of columns on a fixed-width terminal)

Each character in the third set (CS2) is always preceded with the control character SS2 (single-shift 2, 0x8e). Code sets that conform to EUC do not use the SS2 control character other than to identify the third set.

Each character in the fourth set (CS3) is always preceded with the control character SS3 (single-shift 3, 0x8f). Code sets that conform to EUC do not use the SS3 control character other than to identify the fourth set.

ISO EUC Code Sets

The following code sets are based on definitions set by the International Organization for Standardization (ISO).

• ISO646-IRV
• ISO8859-1
• ISO8859-x
• eucJP
• eucTW
• eucKR

ISO646-IRV

The ISO646-IRV code set defines the code set used for information processing based on a 7-bit encoding. The character set associated with this code set is derived from the ASCII characters.

ISO8859-1

ISO8859-1 encoding is a single-byte encoding that is based on and is compatible with other ISO, American National Standards Institute (ANSI), and European Computer Manufacturer's Association (ECMA) code extension techniques. The ISO8859 encoding defines a family of code sets with each member containing its own unique character sets. The 7-bit ASCII code set is a proper subset of each of the code sets in the ISO8859 family.

The ISO8859-1 code set is called the ISO Latin-1 code set and consists of two character sets:

• ISO646-IRV Graphic Left, 7-bit ASCII character set
• ISO8859-1 Graphic Right (Latin) character set

These character sets combined include the characters necessary for Western European languages such as Danish, Dutch, English, Finnish, French, German, Icelandic, Italian, Norwegian, Portuguese, Spanish, and Swedish.
While the ASCII code set defines an order for the English alphabet, the Graphic Right (GR) characters are not ordered according to any specific language. The language-specific ordering is defined by the locale.

Other ISO8859 Code Sets

This section lists the other significant ISO8859 code sets. Each code set includes the ASCII character set plus its own unique characters.

ISO8859-2 Latin alphabet, No. 2, Eastern Europe

• Albanian
• Czechoslovakian
• English
• German
• Hungarian
• Polish
• Rumanian
• Serbo-Croatian
• Slovak
• Slovene

ISO8859-5 Latin/Cyrillic alphabet

• Bulgarian
• Byelorussian
• English
• Macedonian
• Russian
• Ukrainian

ISO8859-6 Latin/Arabic alphabet

• English
• Arabic

ISO8859-7 Latin/Greek alphabet

• English
• Greek

ISO8859-8 Latin/Hebrew alphabet

• English
• Hebrew

ISO8859-9 Latin/Turkish alphabet

• Danish
• Dutch
• English
• Finnish
• French
• German
• Irish
• Italian
• Norwegian
• Portuguese
• Spanish
• Swedish
• Turkish

eucJP

The EUC for Japanese consists of single-byte and multibyte characters (2 and 3 bytes). The encoding conforms to ISO2022 and is based on JIS and EUC definitions; see Table 3-3.

Table 3-3 Encoding for eucJP

CS     Encoding                       Character Set
cs0    0xxxxxxx                       ASCII
cs1    1xxxxxxx 1xxxxxxx              JIS X0208-1990
cs2    0x8E 1xxxxxxx                  JIS X0201-1976
cs3    0x8F 1xxxxxxx 1xxxxxxx         JIS X0212-1990

JIS X0208-1990
A code of the Japanese graphic character set for information interchange (1990 version) that contains 147 special characters, 10 numeric digits, 83 Hiragana characters, 86 Katakana characters, 52 Latin characters, 48 Greek characters, 66 Cyrillic characters, 32 line-drawing elements, and 6355 Kanji characters.

JIS X0201
A code for information interchange that contains 63 Katakana characters.
JIS X0212-1990
A code of the supplementary Japanese graphic character set for information interchange (1990 version) that contains 21 additional special characters, 21 additional Greek characters, 26 additional Cyrillic characters, 27 additional Latin characters, 171 Latin characters with diacritical marks, and 5801 additional Kanji characters.

eucTW

The EUC for Traditional Chinese is an encoding consisting of single-byte and multibyte (2 and 4 bytes) characters. The EUC encoding conforms to ISO2022 and is based on the Chinese National Standard (CNS) as defined by the Republic of China and the EUC definition; see Table 3-4.

Table 3-4 Encoding for eucTW

CS     Encoding                         Character Set
cs0    0xxxxxxx                         ASCII
cs1    1xxxxxxx 1xxxxxxx                CNS 11643-1992 - plane 1
cs2    0x8EA2 1xxxxxxx 1xxxxxxx         CNS 11643-1992 - plane 2
cs3    0x8EA3 1xxxxxxx 1xxxxxxx         CNS 11643-1992 - plane 3
       0x8EB0 1xxxxxxx 1xxxxxxx         CNS 11643-1992 - plane 16

CNS 11643-1992 defines 16 planes for the Chinese Standard Interchange Code; each plane can support up to 8836 characters (94x94). Currently, only planes 1 through 7 have characters assigned. Table 3-5 shows the 16 planes of the CNS 11643-1992 standard.
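Because every eucTW character announces its own length in its first byte, a byte stream can be walked character by character without any shift state. The sketch below illustrates this under the layout given in Table 3-4; the function names are hypothetical, and the code checks only lead bytes, not whether the trailing bytes are valid.

```c
#include <stddef.h>

/* Length in bytes of the eucTW character that starts with `lead`:
 * ASCII is 1 byte, a plane 1 character is 2 bytes, and a character
 * introduced by 0x8E (plane byte plus two code bytes) is 4 bytes. */
static size_t euctw_char_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if (lead == 0x8E)
        return 4;
    return 2;
}

/* Count the characters in buf[0..n). Returns -1 if the buffer ends
 * in the middle of a character. */
long euctw_count_chars(const unsigned char *buf, size_t n)
{
    long count = 0;
    size_t i = 0;

    while (i < n) {
        size_t len = euctw_char_len(buf[i]);
        if (len > n - i)
            return -1;
        i += len;
        count++;
    }
    return count;
}

/* Plane number of the character at `ch`: the byte after 0x8E encodes
 * the plane (0xA2 is plane 2, 0xB0 is plane 16). Two-byte characters
 * are in plane 1; ASCII has no plane, reported here as 0. */
int euctw_plane(const unsigned char *ch)
{
    if (ch[0] == 0x8E)
        return ch[1] - 0xA0;
    return (ch[0] >= 0x80) ? 1 : 0;
}
```

The same approach applies to eucJP with different lengths (2 bytes after 0x8E, 3 bytes after 0x8F, per Table 3-3). This self-describing property is what makes EUC code sets stateless.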
Table 3-5 16 Planes of the CNS 11643-1992 Standard

Plane   Definition                      # of Characters   EUC Encoding
1       Most frequently used            6085              A1A1 - FDCB
2       Secondary frequently used       7650              8EA2 A1A1 - 8EA2 F2C4
3       Exec. Yuen EDP center [1]       6148              8EA3 A1A1 - 8EA3 E2C6
4       RIS [2], vendor defined         7298              8EA4 A1A1 - 8EA4 EEDC
5       Rarely used by MOE [3]          8603              8EA5 A1A1 - 8EA5 FCD1
6       Variation char set 1 by MOE     6388              8EA6 A1A1 - 8EA6 E4FA
7       Variation char set 2 by MOE     6539              8EA7 A1A1 - 8EA7 E6D5
8       Undefined                       0                 8EA8 A1A1 - 8EA8 FEFE
9       Undefined                       0                 8EA9 A1A1 - 8EA9 FEFE
10      Undefined                       0                 8EAA A1A1 - 8EAA FEFE
11      Undefined                       0                 8EAB A1A1 - 8EAB FEFE
12      User Defined Character (UDC)    0                 8EAC A1A1 - 8EAC FEFE
13      UDC                             0                 8EAD A1A1 - 8EAD FEFE
14      UDC                             0                 8EAE A1A1 - 8EAE FEFE
15      UDC                             0                 8EAF A1A1 - 8EAF FEFE
16      UDC                             0                 8EB0 A1A1 - 8EB0 FEFE

1. EDP: Center of Directorate, General of Budget, Accounting, and Statistics
2. RIS: Residence Information System
3. MOE: Ministry of Education

eucKR

The EUC for Korean is an encoding consisting of single-byte and multibyte characters (shown in Table 3-6). The encoding conforms to ISO2022 and is based on the Korean Standard Code (KSC) set and EUC definitions.

Table 3-6 Encoding for eucKR

CS     Encoding                 Character Set
cs0    0xxxxxxx                 ASCII
cs1    1xxxxxxx 1xxxxxxx        KSC 5601-1992
cs2    Not used
cs3    Not used

KSC 5601-1992 (code of the Korean character set for information interchange, 1992 version) contains 432 special characters, 30 Arabic and Roman numeral characters, 94 Hangul alphabet characters, 52 Roman characters, 48 Greek characters, 27 Latin characters, 169 Japanese characters, 66 Russian characters, 68 line-drawing elements, 2344 precomposed Hangul characters, and 4888 Hanja characters.

One Hangul character can be composed of several consonants and vowels. Most Hangul words can be expressed in Hanja words. Hanja is a set of Traditional Chinese characters that is currently used by Korean people.
Each Hanja character has its own meaning and is thus more specific than Hangul most of the time.

4  Motif Dependencies

This chapter discusses tasks related to internationalizing with Motif.

Locale Management
Font Management
Font List Syntax
Drawing Localized Text
Inputting Localized Text
Internationalized User Interface Language

Locale Management

The term language environment refers to the set of localized data that the application needs to run correctly in the user-specified locale. A language environment supplies the rules associated with a specific language. In addition, the language environment consists of any externally stored data, such as localized strings or text used by the application. For example, the menu items displayed by an application might be stored in separate files for each language supported by the application. This type of data can be stored in resource files, User Interface Definition (UID) files, or message catalogs (on XPG3-compliant systems).

A single language environment is established when an application runs. The language environment in which an application operates is specified by the application user, often either by setting an environment variable (LANG or LC_* on POSIX-based systems) or by setting the xnlLanguage resource. The application then sets the language environment based on the user’s specification. The application can do this by using the setlocale() function in a language procedure established by the XtSetLanguageProc() function. This causes Xt to cache a per-display language string that is used by the XtResolvePathname() function to find resource, bitmap, and User Interface Language (UIL) files.

An application that supplies a language procedure can either provide its own procedure or use an Xt default procedure.
In either case, the application establishes the language procedure by calling the XtSetLanguageProc() function before initializing the toolkit and before loading the resource databases (such as by calling the XtAppInitialize() function). When a language procedure is installed, Xt calls it in the process of constructing the initial resource database. Xt uses the value returned by the language procedure as its per-display language string.

The default language procedure performs the following tasks:

• Sets the locale. This is done by using:

      setlocale(LC_ALL, language);

  where language is the value of the xnlLanguage resource, or the empty string ("") if the xnlLanguage resource is not set. When the xnlLanguage resource is not set, the locale is generally derived from an environment variable (LANG on POSIX-based systems).
• Calls the XSupportsLocale() function to verify that the locale just set is supported. If not, a warning message is issued and the locale is set to C.
• Calls the XSetLocaleModifiers() function specifying the empty string.
• Returns the value of the current locale. On ANSI C-based systems, this is the result of calling:

      setlocale(LC_ALL, NULL);

The application can use the default language procedure by making the call to the XtSetLanguageProc() function in the following manner:

    XtSetLanguageProc(NULL, NULL, NULL);
    .
    .
    toplevel = XtAppInitialize(...);

By default, Xt does not install any language procedure. If the application does not call the XtSetLanguageProc() function, Xt uses as its per-display language string the value of the xnlLanguage resource if it is set. If the xnlLanguage resource is not set, Xt derives the language string from the LANG environment variable.

Note – The per-display language string that results from this process is implementation-dependent, and Xt provides no public means of examining the language string once it is established.
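The locale-handling steps of the default language procedure can be sketched in plain C. The sketch below is an illustration only: the function name set_language() is hypothetical, and the Xlib steps (XSupportsLocale() and XSetLocaleModifiers()) are deliberately left out so that the example depends only on the C library. Here an unsupported locale is detected through the return value of setlocale() rather than through XSupportsLocale().

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper mirroring the locale steps of the default
 * language procedure, without the Xlib calls. `language` plays the
 * role of the xnlLanguage value; NULL is treated as "not set". */
const char *set_language(const char *language)
{
    if (language == NULL)
        language = "";          /* empty string: use the environment (LANG) */

    /* Set the locale; setlocale() returns NULL if it cannot honor it. */
    if (setlocale(LC_ALL, language) == NULL) {
        fprintf(stderr, "warning: locale \"%s\" not supported; using C\n",
                language);
        setlocale(LC_ALL, "C");
    }

    /* Return the name of the locale now in effect. */
    return setlocale(LC_ALL, NULL);
}
```

An application-supplied language procedure registered through XtSetLanguageProc() could wrap logic of this kind and add the Xlib checks; the string it returns becomes the per-display language string.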
By supplying its own language procedure, an application can use any procedure it wants for setting the language string.

Font Management

The desktop uses font lists to display text. A font defines a set of glyphs that represent the characters in a given character set. A font set is a group of fonts that are needed to display text for a given locale or language. A font list is a list of fonts, font sets, or a combination of the two, that may be used. Motif has convenience functions to create a font list.

Font List Structure

The desktop requires a font list for text display. A font list is a list of font structures, font sets, or both, each of which has a tag to identify it. A font set ensures that all characters in the current language can be displayed. With font structures, the responsibility for ensuring that all characters can be displayed rests with the programmer (including converting from the code set of the locale to glyph indexes).

Each entry in a font list is in the form of a {tag, element} pair, where element can be either a single font or a font set. The application can create a font list entry from either a single font or a font set. For example, the following code segment creates a font list entry for a font set:

    char font1[] = "-adobe-courier-medium-r-normal--10-100-75-75-M-60";

    font_list_entry = XmFontListEntryLoad (displayID, font1,
                                           XmFONT_IS_FONTSET, "font_tag");

The XmFontListEntryLoad() function loads a font or creates and loads a font set. The following are the four arguments to the function:

displayID    Display on which the font list is to be used.
fontname     A string that represents either a font name or a base font name list, depending on the nametype argument.
nametype     A value that specifies whether the fontname argument refers to a font name or a base font name list.
tag          A string that represents the tag for this font list entry.
If the nametype argument is XmFONT_IS_FONTSET, the XmFontListEntryLoad() function creates a font set in the current locale from the value in the fontname argument. The character sets of the fonts specified in the font set are dependent on the locale. If nametype is XmFONT_IS_FONT, the XmFontListEntryLoad() function opens the font found in fontname. In either case, the font or font set is placed into a font list entry.

The following code example creates a new font list and appends the entry font_list_entry to it:

    XmFontList font_list;
    XmFontListEntry font_list_entry;
    .
    .
    font_list = XmFontListAppendEntry (NULL, font_list_entry);
    /* XmFontListEntryFree() takes the address of the entry */
    XmFontListEntryFree (&font_list_entry);

Once a font list has been created, the XmFontListAppendEntry() function adds a new entry to it. The following example uses the XmFontListEntryCreate() function to create a new font list entry for an existing font list.

    XFontSet font2;
    char *font_tag;
    XmFontListEntry font_list_entry2;
    .
    .
    font_list_entry2 = XmFontListEntryCreate (font_tag,
                                              XmFONT_IS_FONTSET,
                                              (XtPointer)font2);

The font2 parameter specifies an XFontSet returned by the XCreateFontSet() function. The arguments to the XmFontListEntryCreate() function are font_tag, XmFONT_IS_FONTSET, and font2, which are the tag, type, and font, respectively. The tag and the font set are the {tag, element} pair of the font list entry.

To add this entry to the font list, use the XmFontListAppendEntry() function again, only this time, its first parameter specifies the existing font list.

    font_list = XmFontListAppendEntry (font_list, font_list_entry2);
    XmFontListEntryFree (&font_list_entry2);

Font Lists Examples

The syntax for specifying a font list in a resource file depends on whether the list contains fonts, font sets, or both.

Obtaining a Font

To obtain a font, specify a font and an optional font list element tag.

• If the tag is present, it should be preceded by an = (equal sign).
• If the tag is not present, do not use an = (equal sign).

Entries specifying more than one font are separated by a , (comma).

Obtaining a Font Set

To obtain a font set, specify a base font list and an optional font list element tag.

• If the tag is present, it should be preceded by a : (colon) instead of an = (equal sign).

• If the tag is not present, the colon must still be present as this is what distinguishes a font from a font set in the resource declaration.

Fonts specified in the base font list are separated by a ; (semicolon). Entries specifying more than one font set are separated by a , (comma).

Specifying a Font When the Font List Element Tag Is Absent

If the font list element tag is not present, the default XmFONTLIST_DEFAULT_TAG is used. Here are some examples.

• Specifying a font using the default font list element tag:

    *fontList: fixed
    *fontList: \
        -adobe-courier-medium-r-normal--10-100-75-75-M-60-iso8859-1

• Specifying a font list element tag:

    *fontList: fixed=ROMAN, 8x13bold=BOLD

• Specifying two fonts, one with the default font list element tag and one with an explicit tag:

    *fontList: fixed, 8x13bold=BOLD

Specifying a Font Set When the Font List Element Tag Is Absent

If the font list element tag is not present, the default XmFONTLIST_DEFAULT_TAG is used. Here are some examples of specifying a font set.
• Let Xlib select the fonts without specifying a font list element tag:

    *fontList: -dt-application-medium-r-normal-*-m*-*-*-*-m-*

• Let Xlib select the fonts and specify a font list element tag as MY_TAG:

    *fontList: -dt-application-medium-r-normal-*-m*-*-*-*-m*:MY_TAG

• Let Xlib select the fonts, specify a font list element tag for bold fonts, and use the default font list element tag for the others:

    *fontList:-dt-application-medium-r-normal-*-m*-*-*-*-m-*:,\
        -dt-application-medium-r-normal-style2-m*-*-*-*-m-*:BOLD

Font List Syntax

The XmFontList data type can contain one or more entries that are associated with one of the following elements:

XFontStruct   An X font that can be used to draw text encoded in the charset of the font, that is, font-encoded text.
XFontSet      A collection of XFontStruct fonts used to draw text encoded in a locale, that is, localized text.

The following syntax is used by the string-to-XmFontList converter:

    XmFontList     := <fontentry> {',' <fontentry>}
    fontentry      := <fontname><fontid> | <baselist><fontsetid>
    baselist       := <fontname> {';' <fontname>}
    fontsetid      := ':' <string> | <defaultfontset>
    fontname       := <XLFD string>
    fontid         := '=' <string> | <defaultfont>
    XLFD string    := refer to XLFD Specification
    defaultfont    := NULL
    defaultfontset := ':' NULL
    string         := any character from ISO646IRV, except newline

A fontentry within a given XmFontList can specify either a font or a font set. In either case, the ID (fontid or fontsetid) can be referenced by a segment within a compound string (XmString).

Both defaultfont and defaultfontset can define the default fontentry, yet there can only be one default per XmFontList. The XmFONTLIST_DEFAULT_TAG identifier always references the default fontentry when XmString is drawn. If the default fontentry is not specified, the first fontentry is used to draw.

The resource converter operates under a single locale so that all font sets created are associated with the same locale.
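The separator rules above (a colon marks a font set and introduces its tag, an equal sign introduces the tag of a single font, and an absent tag means the default) can be illustrated with a small classifier for one comma-delimited entry. This is a sketch in plain C and not the Motif converter: parse_font_entry(), struct font_entry, and the DEFAULT_TAG placeholder are hypothetical names, and splitting the resource value at commas is left to the caller.

```c
#include <stddef.h>
#include <string.h>

#define DEFAULT_TAG "<default>"  /* placeholder for XmFONTLIST_DEFAULT_TAG */

struct font_entry {
    int  is_fontset;   /* 1 if the entry names a font set, 0 for a font */
    char names[256];   /* font name, or ';'-separated base font list */
    char tag[64];      /* font list element tag */
};

/* Copy len bytes from start into dst, trimming blanks on both sides.
   Returns the number of bytes copied (truncated to fit dst). */
static size_t copy_trimmed(char *dst, size_t cap, const char *start, size_t len)
{
    while (len > 0 && (*start == ' ' || *start == '\t')) { start++; len--; }
    while (len > 0 && (start[len - 1] == ' ' || start[len - 1] == '\t')) { len--; }
    if (len >= cap) len = cap - 1;
    memcpy(dst, start, len);
    dst[len] = '\0';
    return len;
}

/* Classify one entry of a fontList resource value. */
void parse_font_entry(const char *entry, struct font_entry *out)
{
    size_t n = strlen(entry);
    const char *sep = strchr(entry, ':');   /* ':' distinguishes a font set */

    out->is_fontset = (sep != NULL);
    if (sep == NULL)
        sep = strchr(entry, '=');           /* '=' introduces a font's tag */

    if (sep == NULL) {                      /* single font, no explicit tag */
        copy_trimmed(out->names, sizeof out->names, entry, n);
        strcpy(out->tag, DEFAULT_TAG);
        return;
    }

    size_t head = (size_t)(sep - entry);
    copy_trimmed(out->names, sizeof out->names, entry, head);
    if (copy_trimmed(out->tag, sizeof out->tag, sep + 1, n - head - 1) == 0)
        strcpy(out->tag, DEFAULT_TAG);      /* "names:" with nothing after ':' */
}
```

Under these assumptions, the entry fixed=ROMAN is classified as a font tagged ROMAN, while an entry ending in a bare colon is classified as a font set carrying the default tag.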
Note – Some implementations reserve the code set name of a locale as a special charset ID (fontsetid and fontid) within an XmFontList string. For this reason, application developers are cautioned not to use code set names if they want their applications to be portable across platforms.

Drawing Localized Text

A compound string is a means of encoding text so that it can be displayed in many different fonts without changing anything in the program. The desktop uses compound strings to display all text except that in the Text and TextField widgets. This section explains the structure of a compound string, the interaction between it and a font list (which determines how the compound string is displayed), and focuses on those aspects that are important to the internationalization process.

Compound String Components

A compound string is an internal encoding, consisting of tag-length-value segments. Semantically, a compound string has components that contain the text to be displayed, a tag (called a font list element tag) that is matched with an element of a font list, and an indicator denoting the direction in which it is to be displayed.

A compound string component can be one of the following four types:

• A font list element tag.

  • The font list element tag XmFONTLIST_DEFAULT_TAG indicates that the text is encoded in the code set of the current locale.

  • Other font list element tags are used later to match text with particular entries in a font list.

• A direction identifier.

• A separator.

• The text of the string. For internationalized applications, the text falls into two broad categories: either the text requires localized treatment or it does not.

The following describes each of the compound string components:

Font list element tag   Indicates a string value that correlates the text component of a compound string to a font or a font set in a font list.
Direction   Indicates the relationship between the order in which characters are entered on the keyboard and the order in which the characters are displayed on the screen. For example, the display order is left-to-right in English, French, German, and Italian, and right-to-left in Hebrew and Arabic.
Text        Indicates the text to be displayed.
Separator   Indicates a special form of a compound string component that has no value. It is used to separate other segments.

The desktop uses the specified font list element tag identified in the text component to display the compound string. A specified font list element tag is used until a new font list element tag is encountered. The desktop provides a special font list element tag, XmFONTLIST_DEFAULT_TAG, that matches a font that is correct for the current code set. It identifies the default entry in a font list. See “Compound Strings and Font Lists” for more information.

The direction segment of a compound string specifies the direction in which the text is displayed. Direction can be left-to-right or right-to-left.

Compound Strings and Resources

Compound strings are used to display all text except that in the Text and TextField widgets. The compound string is set into the appropriate widget resource so that it can be displayed. For example, the label for the PushButton widget is inherited from the Label widget, and the resource is XmNlabelString, which is of type XmString. This means that the resource expects a value that is a compound string. A compound string can be created with a program or defined in a resource file.

Setting a Compound String Programmatically

An application can set this resource programmatically by creating the compound string using the XmStringCreateLocalized() compound string convenience function. This function creates a compound string in the encoding of the current locale and automatically sets the font list entry tag to XmFONTLIST_DEFAULT_TAG.
The following code segment shows one way to set the XmNlabelString resource for a push button using a program.

    #include <nl_types.h>    /* catgets(), nl_catd */
    #include <Xm/Xm.h>
    #include <Xm/PushB.h>    /* XmCreatePushButton() */

    Widget button;
    Arg args[10];
    int n;
    XmString button_label;
    nl_catd my_catd;

    (void)XtSetLanguageProc(NULL, NULL, NULL);
    .
    .
    /* XmStringCreateLocalized() takes only the text; the tag is
       set to XmFONTLIST_DEFAULT_TAG automatically */
    button_label = XmStringCreateLocalized (catgets(my_catd, 1, 1,
                                            "default label"));

    /* Create an argument list for the button */
    n = 0;
    XtSetArg (args[n], XmNlabelString, button_label); n++;

    /* Create and manage the button */
    button = XmCreatePushButton (toplevel, "button", args, n);
    XtManageChild (button);
    XmStringFree (button_label);

Setting a Compound String in a Defaults File

In an internationalized program, the label string for the button label should be obtained from an external source. For example, the button label can come from a resource file instead of the program. For this example, assume that the push button is a child of a Form widget called form1.

    *form1.button.labelString: Push Here

Here, the desktop’s string-to-compound-string converter produces a compound string from the resource file text. This converter always uses XmFONTLIST_DEFAULT_TAG.

Compound Strings and Font Lists

When the desktop displays a compound string, it associates each segment with a font or font set by means of the font list element tag for that segment. The application must have loaded the desired font or font set, created a font list that contains that font or font set and its associated font list element tag, and created the compound string segment with the same tag.

The desktop follows a set search procedure when it binds a compound string to a font list entry in this way:

1. The desktop searches the font list for an exact match with the font list element tag specified in the compound string. If it finds a match, the compound string is bound to that font list entry.

2. If this does not provide a binding between the compound string and the font list, the desktop binds the compound string to the first element in the font list, regardless of its font list element tag.

For backward compatibility, if an exact match is not found, a value of XmFONTLIST_DEFAULT_TAG in either a compound string or a font list matches the tag that results from creating a compound string or font list entry with a tag of XmSTRING_DEFAULT_CHARSET.

Figure 4-1 shows the relationships between a compound string, a font set, and a font list when the font list element tag is set to something other than XmFONTLIST_DEFAULT_TAG.

[Figure: a compound string whose font list element tag is tagb and whose text is “Push Here”, and a font list holding Font_Set_A through Font_Set_D with the tags taga through tagd.]

Figure 4-1 Relationships between compound strings, font sets, and font lists when the font list element tag is not XmFONTLIST_DEFAULT_TAG

The following example shows how to use a tag called tagb.

    XFontSet font1;
    XmFontListEntry font_list_entry;
    XmFontList font_list;
    XmString label_text;
    char **missing;
    int missing_cnt;
    char *def_string;
    char *tagb;           /* Font list element tag */
    char *fontx;          /* Initialize to XLFD or font alias */
    char *button_label;   /* Contains button label text */
    .
    .
    font1 = XCreateFontSet (XtDisplay(toplevel), fontx, &missing,
                            &missing_cnt, &def_string);
    font_list_entry = XmFontListEntryCreate (tagb, XmFONT_IS_FONTSET,
                                             (XtPointer)font1);
    font_list = XmFontListAppendEntry (NULL, font_list_entry);
    /* XmFontListEntryFree() takes the address of the entry */
    XmFontListEntryFree (&font_list_entry);
    label_text = XmStringCreate (button_label, tagb);

The XCreateFontSet() function loads the font set and the XmFontListEntryCreate() function creates a font list entry. The application must create an entry and append it to an existing font list or create a new font list. In either case, use the XmFontListAppendEntry() function.
Because there is no font list in place, the preceding code example has a NULL value for the font list argument. The XmFontListAppendEntry() function creates a new font list called font_list with a single entry, font_list_entry. To add another entry to font_list, follow the same procedure but supply a nonnull font list argument.

Figure 4-2 shows the relationships between a compound string, a font set, and a font list when the font list element tag is set to XmFONTLIST_DEFAULT_TAG. In this case, the value field is locale text.

[Figure: a compound string whose font list element tag is XmFONTLIST_DEFAULT_TAG and whose text is “Push Here”, and a font list holding Font_Set_A through Font_Set_D, in which Font_Set_C carries the tag XmFONTLIST_DEFAULT_TAG and comprises font1C, font2C, and font3C.]

Figure 4-2 Relationships between compound strings, font sets, and font lists when a font list element tag is set to XmFONTLIST_DEFAULT_TAG

Here, the default tag points to Font_Set_C, which in turn identifies the fonts needed to display the characters in the language.

Text and TextField Widgets and Font Lists

The Text and TextField widgets display text information. To do so, they must be able to select the correct font in which to display the information. The Text and TextField widgets follow a set search pattern to find the correct font as follows:

1. The widget searches the font list for an entry that is a font set and has a font list element tag of XmFONTLIST_DEFAULT_TAG. If a match is found, it uses that font list entry. No further searching occurs.

2. The widget searches the font list for an entry that specifies a font set. It uses the first one found.

3. If no font set is found, the widget uses the first font in the font list.

Using a font set ensures that there are glyphs for every character in the locale.

Inputting Localized Text

In the system environment, the VendorShell widget class is enhanced to provide the interface to the input method.
While the VendorShell class controls only one child widget in its geometry management, an extension has been added to the VendorShell class to enhance it for managing all components necessary in the interface to an input method. These components include the status area, preedit area, and the MainWindow area.

When the input method requires a status area or a preedit area or both, the VendorShell widget automatically instantiates the status and preedit areas and manages their geometry layout. Any status area or preedit area is managed by the VendorShell widget internally and is not accessible by the client. The widget instantiated as the child of the VendorShell widget is called the MainWindow area.

The input method to be used by the VendorShell widget is determined by the XmNinputMethod resource; for example, @im=alt. The default value of NULL indicates that the default input method associated with the locale at the time the VendorShell is created is to be chosen. As such, the user can affect which input method is selected by setting the locale, setting the XmNinputMethod resource, or setting both. The locale name is concatenated with the XmNinputMethod resource to determine the input method name. The locale name must not be specified in this resource. The modifier name for the XmNinputMethod resource needs to be in the form @im=modifier, where modifier is the string used to qualify which input method is selected.

The VendorShell widget can support multiple widgets that share the input method. Yet only one widget can have the keyboard focus (for example, receive key press events and send them to an input method) at any given time. To support multiple widgets (such as Text widgets), the widgets need to be descendants of the VendorShell widget.

Note – The VendorShell widget class is a superclass of the TransientShell and TopLevelShell widget classes.
As such, an instantiation of a TopLevelShell or a DialogShell is essentially an instantiation of a VendorShell widget class.

The VendorShell widget behaves as an input manager only if one of its descendants is an XmText[Field] instance. As soon as an XmText[Field] instance is created as a descendant of the VendorShell widget, VendorShell creates the necessary areas required by the particular input methods dictated by the current locale. Even if an XmText[Field] instance is not mapped but just created, VendorShell has the geometry management behavior as described previously.

A VendorShell widget does the following:

• Enables applications to process multibyte character input and output that is supported by the locales installed in the system.

• Manages an input method instance as defined in the XmIm reference functions.

• Supports preediting within a preedit area in either OffTheSpot, OverTheSpot, Root, or None mode. Localized text can be entered into any Text child widget in a multiple Text children widget tree by changing the focus.

• Provides geometry management for descendant child widgets.

Geometry Management

The VendorShell widget provides geometry management and focus management for the input method’s user interface components, as necessary. If the locale warrants it (for example, if the locale is a Japanese Extended UNIX Code (EUC) locale), the VendorShell widget automatically allocates and manages the geometry of any required preedit area or status area or both.

Depending on the current preediting being done, an auxiliary area may be required. If so, the VendorShell widget also instantiates and manages the auxiliary area. Typically, the child of the VendorShell widget is a container widget (such as the XmBulletinBoard or XmRowColumn widgets) that can manage multiple Text and TextField widgets, which allow multibyte character input from the user. In this scenario, all Text widgets share the same input method.
Note – The status, preedit, and auxiliary areas are not accessible to the application programmer. For example, it is not intended for the application programmer to access the window ID of the status area. The application programmer does not need to worry about the instantiation or management of these components, as they are managed as required by the VendorShell widget class.

The application programmer has some control over the behavior of the input method user interface components through the XmNpreeditType resource of the VendorShell widget class. See “Input Methods” for a description of OffTheSpot and OverTheSpot modes.

Geometry management extends to all input method user interface components. When the application program window (a TopLevelShell widget) is resized, the input method user interface components are resized accordingly, and the preedited strings in them are rearranged as required. This assumes that the shell window has a resize policy of True.

When the VendorShell widget is created, if a specific input method requires a status area, preedit area, or both, the size of the VendorShell takes into account the areas required by these components. The extra areas required by the preedit and status areas are part of the VendorShell widget’s area. They are also managed by the VendorShell widget, if resizing is necessary.

Because of the potential instantiation of these areas (status and preedit), depending on the input method currently being used, the size of the VendorShell widget area does not necessarily grow or shrink to accommodate exactly the size of its child. The size of the VendorShell widget area grows or shrinks to accommodate both its child’s geometry and the geometry of these input method user interface areas. There may be a difference (for example, of 20 pixels) in height between the VendorShell widget and its child widget (the MainWindow area). The width geometry is not affected by the input method user interface components.
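The height relationship just described amounts to simple arithmetic, sketched below in plain C. The struct and function names are hypothetical, and im_area_height is an assumed single number standing for the combined height of the status and preedit areas; the real areas are internal to the VendorShell widget and cannot be queried by the application.

```c
/* Hypothetical width/height pair in pixels. */
struct size {
    int width;
    int height;
};

/*
 * Size the shell needs in order to show its child (the MainWindow area)
 * together with the input method areas. Only the height grows; the width
 * is not affected by the input method user interface components.
 */
struct size shell_size_for_child(struct size child, int im_area_height)
{
    struct size shell;

    shell.width  = child.width;
    shell.height = child.height + im_area_height;
    return shell;
}
```

With the 20-pixel difference used as an example in the text, a child of 100x100 would call for a shell of 100x120.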
In summary, the requested size of the child is honored if possible; the actual size of the VendorShell may be larger than its child.

The requests to specify the geometry of the VendorShell widget and its child are honored as long as they do not conflict with each other and are within the constraint of the VendorShell widget’s ability to resize. When they do conflict, the child widget’s geometry request has higher precedence. For example, if the size of the child widget and the size of VendorShell are both specified as 100x100, the resulting VendorShell has a size of 100x120, while its child widget gets a size of 100x100. If the size of the child widget is not specified, the VendorShell shrinks its child widget if necessary to honor its own size specification. For example, if the size of VendorShell is specified as 100x100 and no size is specified for its child, the child widget has a size of 100x80. If the VendorShell widget is disabled from resizing, regardless of what the geometry request of its child is, the VendorShell widget honors only its own geometry specification.

Focus Management

Languages with large numbers of characters (such as Japanese and Chinese) require an input method that allows the user to compose characters in that language interactively. This is because, for these languages, there are many more characters than can be reasonably mapped to a terminal keyboard.

The interactive process of composing characters in such languages is called preediting. The preediting itself is handled by the input method. However, the user interface of the preediting is determined by the system environment. An interface needs to exist between the input method and the system environment. This is done through the VendorShell widget of the system environment.

Figure 4-3 illustrates a case with Japanese preediting. The string shown in reverse video is the string in preediting.
This string can be moved across different windows by giving focus to the particular window. However, only one preediting session can occur at one time.

Figure 4-3 Japanese preediting example

For an example of focus management, suppose a TopLevelShell widget (a subclass of the VendorShell widget) has an XmBulletinBoard widget child (MainWindow area), which has five XmText widgets as children. Assume the locale requires the preedit area, and assume the OverTheSpot mode is specified. Because the VendorShell widget manages only one instance of an input method, you can run only one preedit area at a time inside the TopLevelShell widget. If the focus is moved from one Text widget to another, the current preedit string under construction is also moved on top of the Text widget that currently has focus. Processing of keys to the old Text widget is suspended temporarily. Subsequent interface of the input method, such as the delivery of the string at preedit completion, is made to the new, focused Text widget.

The string being preedited can be moved to the location of the focus; for example, by clicking the mouse.

A string that the end user is finished preediting and that is already confirmed cannot be reconverted. Once the string is composed, it is committed. Committing a string means that it is moved from the preedit area to the focus point of the client.

Internationalized User Interface Language

The capability to parse a multibyte character string as a string literal has been added to the User Interface Language (UIL). Creation of a UIL file is performed by using the characteristics of the target language and writing the User Interface Definition (UID) file.

Programming for Internationalized User Interface Language

The UIL compiler parses nonstandard charsets as locale text. This requires the UIL compiler to be run in the same locale as any locale text.
If the locale text of a widget requires a font set (more than one font), the font set must be specified within the resource file. The font parameter does not support font sets.

To use a specific language with UIL, a UIL file is written according to characteristics of the target language and compiled into a UID file. The UIL file that contains localized text needs to be compiled in the locale in which it is to run.

String Literals

The following shows examples of literal strings. The cur_charset value is always set to the default_charset value, which allows the string literal to contain locale text.

To set locale text in the string literal with the default_charset value, enter the following:

    XmNlabelString = 'XXXXXX';

OR

    XmNlabelString = #default_charset"XXXXXX";

Compile the UIL file with the LANG environment variable matching the encoding of the locale text. Otherwise, the string literal is not compiled properly.

Font Sets

The font set cannot be set through UIL source programming. Whenever the font set is required, you must set it in the resource file as the following example shows:

    *fontList: -*-r-*-20-*:

Font Lists

UIL has three functions that are used to create font lists: FONT, FONTSET, and FONT_TABLE. The FONT and FONTSET functions create font list entries. The FONT_TABLE function creates a font list from these font list entries.

The FONT function creates a font list entry containing a font specification. The argument is a string representing an XLFD font name. The FONTSET function creates a font list entry containing a font set specification. The argument is a comma-separated list of XLFD font names representing a base name font list.

Both FONT and FONTSET have optional CHARACTER_SET declaration parameters that specify the font list element tag for the font list entry.
In both cases, if no CHARACTER_SET declaration parameter is specified, UIL determines the font list element tag as follows:

• If the module contains no CHARACTER_SET declaration and if the uil command was called with the -s option or the Uil() function was started with use_setlocale_flag set, the font list element tag is XmFONTLIST_DEFAULT_TAG.

• Otherwise, the font list element tag is the code set component of the LANG environment variable, if it is set in the UIL compilation environment; or it is the value of XmFALLBACK_CHARSET if the LANG environment variable is not set or has no code set.

The FONT_TABLE function creates a font list from a comma-separated list of font list entries created by FONT or FONTSET. The resulting font list can be used as the value of a font list resource. If a single font list entry is supplied as the value for such a resource, UIL converts the entry to a font list.

Creating Resource Files

If necessary, set the input method-related resources in the resource file as shown in the following example:

    *preeditType: OverTheSpot, OffTheSpot, Root, or None

Setting the Environment

For a locale-sensitive application, place the UID file in the appropriate directory. Set the UIDPATH or XAPPLRESDIR environment variable to the appropriate value.

For example, to run the uil_sample program with an English environment (LANG environment variable is en_US), place uil_sample.uid with Latin characters in the $HOME/en_US directory, or place uil_sample.uid in a directory and set the UIDPATH environment variable to the full path name of the uil_sample.uid file.

To run the uil_sample program with a Japanese environment (LANG environment variable is ja_JP), create a uil_sample.uid file with Japanese (multibyte) characters in the $HOME/ja_JP directory, or place uil_sample.uid in a unique directory and set the UIDPATH environment variable to the full path name of the uil_sample.uid file.
The following list specifies the possible variables:

%U   Specifies the UID file string.
%N   Specifies the class name of the application.
%L   Specifies the value of the xnlLanguage resource or LC_CTYPE category.
%l   Specifies the language component of the xnlLanguage resource or the LC_CTYPE category.

If the XAPPLRESDIR environment variable is set, the MrmOpenHierarchy() function searches for the UID file in the following order:

1. UID file path name
2. $UIDPATH
3. %U
4. $XAPPLRESDIR/%L/uid/%N/%U
5. $XAPPLRESDIR/%l/uid/%N/%U
6. $XAPPLRESDIR/uid/%N/%U
7. $XAPPLRESDIR/%L/uid/%U
8. $XAPPLRESDIR/%l/uid/%U
9. $XAPPLRESDIR/uid/%U
10. $HOME/uid/%U
11. $HOME/%U
12. /usr/lib/X11/%L/uid/%N/%U
13. /usr/lib/X11/%l/uid/%N/%U
14. /usr/lib/X11/uid/%N/%U
15. /usr/lib/X11/%L/uid/%U
16. /usr/lib/X11/%l/uid/%U
17. /usr/lib/X11/uid/%U
18. /usr/include/X11/uid/%U

If the XAPPLRESDIR environment variable is not set, the MrmOpenHierarchy() function uses $HOME instead of the XAPPLRESDIR environment variable.

default_charset Character Set in UIL

With the default_charset string literal, any characters can be set as a valid string literal. For example, if the LANG environment variable is el_GR, the string literal with default_charset can contain any Greek character. If the LANG environment variable is ja_JP, the default_charset string literal can contain any Japanese character encoded in Japanese EUC.

If no character set is set to a string literal, the character set of the string literal is set as cur_charset. And, in the system environment, the cur_charset value is always set as default_charset.

Example: uil_sample

Figure 4-4 shows a UIL sample program on English and Japanese environments.

Figure 4-4 Sample UIL program on English and Japanese environments

In the following sample program, LLL indicates locale text, which can be Japanese, Korean, Traditional Chinese, Greek, French, or others.

uil_sample.uil

    !
    ! sample uil file - uil_sample.uil
    !
    ! C source file - uil_sample.c
    !
    ! Resource file - uil-sample.resource
    !
    module Test
        version = 'v1.0'
        names = case_sensitive
        objects = {
            XmPushButton = gadget;
        }

    !************************************
    ! declare callback procedure
    !************************************
    procedure exit_CB ;

    !***************************************************************
    ! declare BulletinBoard as parent of PushButton and Text
    !***************************************************************
    object bb : XmBulletinBoard {
        arguments{
            XmNwidth = 500;
            XmNheight = 200;
        };
        controls{
            XmPushButton pb1;
            XmText text1;
        };
    };

    !****************************
    ! declare PushButton
    !****************************
    object pb1 : XmPushButton {
        arguments{
            XmNlabelString = #Normal "LLLexit buttonLLL";
            XmNx = 50;
            XmNy = 50;
        };
        callbacks{
            XmNactivateCallback = procedure exit_CB;
        };
    };

    !*********************
    ! declare Text
    !*********************
    object text1 : XmText {
        arguments{
            XmNx = 50;
            XmNy = 150;
        };
    };
    end module;

uil_sample.c

    /*
     * C source file - uil_sample.c
     *
     */
    #include <Mrm/MrmAppl.h>
    #include <locale.h>

    void exit_CB();
    static MrmHierarchy hierarchy;
    static MrmType *class;

    /******************************************/
    /* specify the UID hierarchy list         */
    /******************************************/
    static char *aray_file[] = { "uil_sample.uid" };
    static int num_file = (sizeof aray_file / sizeof aray_file[0]);

    /******************************************************/
    /* define the mapping between UIL procedure names     */
    /* and their addresses                                */
    /******************************************************/
    static MrmRegisterArg reglist[] = {
        {"exit_CB", (caddr_t) exit_CB}
    };

Compound Strings in UIL

Three mechanisms exist for specifying strings in UIL files:

• As string literals, which may be stored in UID files as either null-terminated strings or compound strings

• As compound strings

• As wide character strings

Both string literals and compound strings consist
of text, a character set, and a writing direction. For string literals and for compound strings with no explicit direction, UIL infers the writing direction from the character set. The UIL concatenation operator (&) concatenates both string literals and compound strings.

Regardless of whether UIL stores string literals in UID files as null-terminated strings or as compound strings, it stores information about each string’s character set and writing direction along with the text. In general, UIL stores string literals or string expressions as compound strings in UID files under the following conditions:

• When a string expression consists of two or more literals with different character sets or writing directions
• When the literal or expression is used as a value that has a compound string data type (such as the value of a resource whose data type is compound string)

UIL recognizes a number of keywords specifying character sets. UIL associates parsing rules, including parsing direction and whether characters have 8 or 16 bits, with each character set it recognizes. It is also possible to define a character set using the UIL CHARACTER_SET function.

The syntax of a string literal is one of the following:

• ’[character_string]’
• [#char_set] “[character_string]”

For each syntax, the character set of the string is determined as follows:

• For a string declared as ’character_string’, the character set is the code set component of the LANG environment variable, if it is set in the UIL compilation environment; or it is the value of XmFALLBACK_CHARSET if the LANG environment variable is not set or has no code set. By default, the value of XmFALLBACK_CHARSET is ISO8859-1, but vendors may supply different values.
• For a string declared as #char_set “string”, the character set is char_set.
• For a string declared as “character_string”, the character set depends on whether the module has a CHARACTER_SET clause and whether the UIL compiler’s use_setlocale_flag is set.
    • If the module has a CHARACTER_SET clause, the character set is the one specified in that clause.
    • If the module has no CHARACTER_SET clause but the uil command was started with the -s option, or if the Uil() function was called with use_setlocale_flag set, UIL calls the setlocale() function and parses the string in the current locale. The character set of the resulting string is XmFONTLIST_DEFAULT_TAG.
    • If the module has no CHARACTER_SET clause and the uil command was started without the -s option, or if the Uil() function was called without use_setlocale_flag, the character set is the code set component of the LANG environment variable, if it is set in the UIL compilation environment, or the character set is the value of XmFALLBACK_CHARSET if LANG is not set or has no code set.

UIL always stores a string specified using the COMPOUND_STRING function as a compound string. This function takes as arguments a string expression and optional specifications of a character set, direction, and whether to append a separator to the string. If no character set or direction is specified, UIL derives it from the string expression, as described in the preceding section.

Note – Certain predefined escape sequences, beginning with a \ (backslash), may appear in string literals, with the following exceptions:

– A string in single quotation marks can span multiple lines, with each new line character escaped by a backslash. A string in double quotation marks cannot span multiple lines.
– Escape sequences are processed literally inside a string that is parsed in the current locale (a localized string).

Xt and Xlib Dependencies

This chapter discusses tasks related to internationalizing with Xt and Xlib.
Locale Management
Font Management
Drawing Localized Text
Inputting Localized Text
Interclient Communications Conventions for Localized Text
Messages

Locale Management

The following defines support for the locale mechanism that controls all locale-dependent Xlib and Common Desktop Environment functions.

X Locale Management

X locale supports one or more of the locales defined by the host environment. Xlib conforms to the American National Standards Institute (ANSI) C library, and the locale announcement method is the setlocale() function. This function configures the locale operation of both the host C library and Xlib. The operation of Xlib is governed by the LC_CTYPE category; this is called the current locale.

The XSupportsLocale() function is used to determine whether the current locale is supported by X.

The client is responsible for selecting its locale and X modifiers. Clients should provide a means for the user to override the clients’ locale selection at client invocation. Most single-display X clients operate in a single locale for both X and the host-processing environment. They configure the locale by calling three functions: setlocale(), XSupportsLocale(), and XSetLocaleModifiers().

The semantics of certain categories of X internationalization capabilities can be configured by setting modifiers. Modifiers are named by implementation-dependent and locale-specific strings. The only standard use for this capability at present is selecting one of several styles of keyboard input methods.

The XSetLocaleModifiers() function is used to configure Xlib locale modifiers for the current locale.

The recommended procedure for clients initializing their locale and modifiers is to obtain locale and modifier announcers separately from one of the following prioritized sources:

1. A command-line option
2. A resource
3. The empty string (“”)

The first of these that is defined should be used.
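The prioritized selection above can be sketched in standard C. This is a minimal illustration, not the method any particular client uses: the function names choose_announcer() and announce_locale() are hypothetical, and the two arguments stand for values the client has already extracted from its command line and resource database (NULL when a source is not defined).

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: return the first announcer that is defined, in the
 * priority order command-line option, resource, empty string.
 * Pass NULL for a source that is not defined. */
const char *choose_announcer(const char *cmdline_value,
                             const char *resource_value)
{
    if (cmdline_value != NULL)
        return cmdline_value;
    if (resource_value != NULL)
        return resource_value;
    return "";  /* empty string: take the locale from the host environment */
}

/* Hypothetical helper: announce the chosen locale to the C library.
 * Returns 0 on success, -1 if the host does not accept the locale. */
int announce_locale(const char *cmdline_value, const char *resource_value)
{
    const char *announcer = choose_announcer(cmdline_value, resource_value);

    /* LC_ALL sets every category to the announced locale. */
    return setlocale(LC_ALL, announcer) != NULL ? 0 : -1;
}
```

In an X client, a successful call would typically be followed by XSupportsLocale() and then XSetLocaleModifiers(), with the modifier announcer chosen by the same priority rule.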
Note – When a locale command-line option or locale resource is defined, the effect should be to set all categories to the specified locale, overriding any category-specific settings in the local host environment.

Locale and Modifier Dependencies

The internationalized Xlib functions operate in the current locale configured by the host environment and in the X locale modifiers set by the XSetLocaleModifiers() function, or in the locale and modifiers configured at the time some object supplied to the function was created. For each locale-dependent function, Table 5-1 lists locale and modifier dependencies.

Table 5-1 Locale and Modifier Dependencies

Locale from...   Affects the Function...       In the...

Locale Query/Configuration
setlocale        XSupportsLocale               Locale queried
                 XSetLocaleModifiers           Locale modified

Resources
setlocale        XrmGetFileDatabase            Locale of XrmDatabase
                 XrmGetStringDatabase
XrmDatabase      XrmPutFileDatabase            Locale of XrmDatabase
                 XrmLocaleOfDatabase

Setting Standard Properties
setlocale        XmbSetWMProperties            Encoding of supplied/returned text
                                               (some WM_ property text in
                                               environment locale)
setlocale        XmbTextPropertyToTextList     Encoding of supplied/returned text
                 XwcTextPropertyToTextList
                 XmbTextListToTextProperty
                 XwcTextListToTextProperty

Text Input
setlocale        XOpenIM                       XIM input method
XIM              XCreateIC                     XIC input method configuration
                 XLocaleOfIM, etc.             Queried locale
XIC              XmbLookupString               Keyboard layout
                 XwcLookupString               Encoding of returned text

Text Drawing
setlocale        XCreateFontSet                Charsets of fonts in XFontSet
XFontSet         XmbDrawText,                  Locale of supplied text
                 XwcDrawText, etc.             Locale of supplied text
                 XExtentsOfFontSet, etc.       Locale-dependent metrics
                 XmbTextExtents,
                 XwcTextExtents, etc.

Xlib Errors
setlocale        XGetErrorDatabaseText         Locale of error message
                 XGetErrorText

Clients can assume that a locale-encoded text string returned by an X function can be passed to a C library function, or the string result of a C library function can be passed to an X function, if the locale is the same at the two calls.

All text strings processed by internationalized Xlib functions are assumed to begin in the initial state of the encoding of the locale, if the encoding is state-dependent.

All Xlib functions behave as if they do not change the current locale or X modifier setting. (This means that any function, provided within a library either by Xlib or by the application, that changes the locale or calls the XSetLocaleModifiers() function with a nonnull argument, must save and restore the current locale state on entry and exit.) Also, Xlib functions on implementations that conform to the ANSI C library do not alter the global state associated with the mblen(), mbtowc(), wctomb(), and strtok() ANSI C functions.

Xt Locale Management

Xt locale management includes the following two functions:

• XtSetLanguageProc()
• XtDisplayInitialize()

XtSetLanguageProc

Before the initialization of the Xt Toolkit, applications should normally call the XtSetLanguageProc() function as follows:

    XtSetLanguageProc(NULL, NULL, NULL)

Note – The locale is not actually set until the toolkit is initialized (for example, by way of the XtAppInitialize() function). Therefore, the setlocale() function may be needed after the XtSetLanguageProc() function and before the initializing of the toolkit (for example, if calling the catopen() function).

Resource databases are created in the current process locale.
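The save-and-restore rule stated earlier for library functions that change the locale can be sketched in standard C. This is only an illustration under stated assumptions: with_locale() and record_locale() are hypothetical names, and only the C library locale is handled here; a function that also calls XSetLocaleModifiers() would have to save and restore the modifier string in the same way.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: run fn(arg) with LC_ALL temporarily switched to
 * `locale`, then restore the caller's locale on exit.
 * Returns 0 on success, -1 if the locale could not be saved or set. */
int with_locale(const char *locale, void (*fn)(void *), void *arg)
{
    /* setlocale(LC_ALL, NULL) only queries; the returned string may be
     * overwritten by later setlocale() calls, so keep a private copy. */
    const char *current = setlocale(LC_ALL, NULL);
    char *saved;
    size_t len;

    if (current == NULL)
        return -1;
    len = strlen(current) + 1;
    saved = malloc(len);
    if (saved == NULL)
        return -1;
    memcpy(saved, current, len);

    if (setlocale(LC_ALL, locale) == NULL) {
        free(saved);
        return -1;              /* the locale is left unchanged on failure */
    }

    fn(arg);                    /* locale-dependent work happens here */

    setlocale(LC_ALL, saved);   /* restore the caller's locale */
    free(saved);
    return 0;
}

/* Example callback: copies the name of the active locale into a
 * caller-supplied buffer of at least 64 bytes. */
void record_locale(void *arg)
{
    char *buf = arg;
    const char *name = setlocale(LC_ALL, NULL);

    strncpy(buf, name != NULL ? name : "", 63);
    buf[63] = '\0';
}
```

Copying the queried locale string before switching is the important step, because the C library is allowed to reuse the storage that setlocale() returns.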
During display initialization prior to creating the per-screen resource database, the Intrinsics call out to a specified application procedure to set the locale according to options found on the command line or in the per-display resource specifications.

The callout procedure provided by the application is of type XtLanguageProc, as in the following syntax:

    typedef String (*XtLanguageProc)(displayID, languageID, clientdata);
        Display *displayID;
        String languageID;
        XtPointer clientdata;

displayID     Passes the display.
languageID    Passes the initial language value obtained from the command line or server per-display resource specifications.
clientdata    Passes the additional client data specified in the call to the XtSetLanguageProc() function.

The language procedure allows an application to set the locale to the value of the language resource determined by the XtDisplayInitialize() function. The function returns a new language string that is subsequently used by the XtDisplayInitialize() function to establish the path for loading resource files. This string is cached and is the locale of the display.

Initially, no language procedure is set by the Intrinsics. To set the language procedure for use by the XtDisplayInitialize() function, use the XtSetLanguageProc() function:

    XtLanguageProc XtSetLanguageProc(applicationcontext, procedure, clientdata)
        XtAppContext applicationcontext;
        XtLanguageProc procedure;
        XtPointer clientdata;

applicationcontext    Specifies the application context in which the language procedure is to be used or specifies a null value.
procedure             Specifies the language procedure.
clientdata            Specifies additional client data to be passed to the language procedure when it is called.

The XtSetLanguageProc() function sets the language procedure that is called from the XtDisplayInitialize() function for all subsequent displays initialized in the specified application context. If the applicationcontext parameter is null, the specified language procedure is registered in all application contexts created by the calling process, including any future application contexts that may be created. If the procedure parameter is null, a default language procedure is registered. The XtSetLanguageProc() function returns the previously registered language procedure. If a language procedure has not yet been registered, the return value is unspecified; but if this return value is used in a subsequent call to the XtSetLanguageProc() function, it causes the default language procedure to be registered.

The default language procedure does the following:

• Sets the locale according to the environment. On ANSI C-based systems, this is done by calling the setlocale(LC_ALL, language) function. If an error is encountered, a warning message is issued with the XtWarning() function.
• Calls the XSupportsLocale() function to verify that the current locale is supported. If the locale is not supported, a warning message is issued with the XtWarning() function and the locale is set to "C".
• Calls the XSetLocaleModifiers() function specifying the empty string.
• Returns the value of the current locale. On ANSI C-based systems, this is the return value from a final call to the setlocale(LC_CTYPE, NULL) function.

A client can use this mechanism to establish a locale by calling the XtSetLanguageProc() function prior to the XtDisplayInitialize() function, as in the following example.

    Widget top;
    XtSetLanguageProc(NULL, NULL, NULL);
    top = XtAppInitialize( ... );
    ...

XtDisplayInitialize

The XtDisplayInitialize() function first determines the language string to be used for the specified display and loads the application’s resource database for this display-host-application combination from the following sources in order of precedence:

1. Application command line (argv)
2. Per-host user environment resource file on the local host
3. Resource property on the server or user-preference resource file on the local host
4. Application-specific user resource file on the local host
5. Application-specific class resource file on the local host

The XtDisplayInitialize() function creates a unique resource database for each display parameter specified. When a database is created, a language string is determined for the display parameter in a manner equivalent to the following sequence of actions.

The XtDisplayInitialize() function initially creates two temporary databases. The first database is constructed by parsing the command line. The second database is constructed from the string returned by the XResourceManagerString() function or, if the XResourceManagerString() function returns a null value, the contents of a resource file in the user’s home directory. The name for this user-preference resource file is $HOME/.Xdefaults.

The database constructed from the command line is then queried for the resource name.xnlLanguage, class class.XnlLanguage, where name and class are the specified application name and application class. If this database query is unsuccessful, the server resource database is queried; if this query is also unsuccessful, the language is determined from the environment. This is done by retrieving the value of the LANG environment variable. If no language string is found, the empty string is used.

The application-specific class resource file name is constructed from the class name of the application. It points to a localized resource file that is usually installed by the site manager when the application is installed. The file is found by calling the XtResolvePathname() function with the parameters (displayID, applicationdefaults, NULL, NULL, NULL, NULL, 0, NULL).
This file should be provided by the developer of the application because it may be required for the application to function properly. A simple application that needs a minimal set of resources in the absence of its class resource file can declare fallback resource specifications with the XtAppSetFallbackResources() function.

The application-specific user resource file name points to a user-specific resource file and is constructed from the class name of the application. This file is owned by the application and typically stores user customizations. Its name is found by calling the XtResolvePathname() function with the parameters (displayID, NULL, NULL, NULL, path, NULL, 0, NULL), where path is defined in an operating-system-specific manner. The path variable is defined to be the value of the XUSERFILESEARCHPATH environment variable if this is defined. Otherwise, the default is vendor-defined. If the resulting resource file exists, it is merged into the resource database. This file can be provided with the application or created by the user.

The temporary database created from the server resource property or user resource file during language determination is then merged into the resource database. The server resource file is created entirely by the user and contains both display-independent and display-specific user preferences.

If one exists, a user’s environment resource file is then loaded and merged into the resource database. This file name is user- and host-specific. The user’s environment resource file name is constructed from the value of the user’s XENVIRONMENT environment variable for the full path of the file. If this environment variable does not exist, the XtDisplayInitialize() function searches the user’s home directory for the .Xdefaults-host file, where host is the name of the machine on which the application is running. If the resulting resource file exists, it is merged into the resource database.
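The naming rule for the environment resource file can be sketched in standard C. This is a minimal illustration and not the Intrinsics implementation: env_resource_path() is a hypothetical helper, the caller supplies the values of XENVIRONMENT and HOME (for example from getenv()) and the host name, and treating an empty XENVIRONMENT value as unset is an assumption of this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the environment resource file name in buf.
 * xenvironment and home are the values of the XENVIRONMENT and HOME
 * environment variables (NULL when unset); host is the local host name.
 * Returns 0 on success, -1 if no name can be built or buf is too small. */
int env_resource_path(const char *xenvironment, const char *home,
                      const char *host, char *buf, size_t size)
{
    int n;

    if (xenvironment != NULL && strlen(xenvironment) > 0) {
        /* XENVIRONMENT holds the full path of the file. */
        n = snprintf(buf, size, "%s", xenvironment);
    } else {
        if (home == NULL || host == NULL)
            return -1;
        /* Fall back to the .Xdefaults-host file in the home directory. */
        n = snprintf(buf, size, "%s/.Xdefaults-%s", home, host);
    }
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

A caller would pass getenv("XENVIRONMENT") and getenv("HOME") directly, and then check whether the resulting file exists before merging it.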
The environment resource file is expected to contain process-specific resource specifications that are to supplement those user-preference specifications in the server resource file.

Font Management

International text drawing is done using a set of one or more fonts, as needed for the locale of the text. Within the system environment, clients can draw internationalized text in two ways: by using one of the static output widgets (for example, XmLabel), or by using the DrawingArea widget and drawing with any other primitive function. Static output widgets require that text be converted to XmString.

The following information explains the mechanism for managing fonts using the Xlib routines and functions.

Creating and Freeing a Font Set

Xlib international text drawing is done using a set of one or more fonts, as needed for the locale of the text. Fonts are loaded according to a list of base font names supplied by the client and the charsets required by the locale. The XFontSet is an opaque type.

• The XCreateFontSet() function is used to create an international text drawing font set.
• The XFontsOfFontSet() function is used to obtain a list of XFontStruct structures and full font names given an XFontSet.
• To obtain the base font name list and the selected font name list given an XFontSet, use the XBaseFontNameListOfFontSet() function.
• To obtain the locale name given an XFontSet, use the XLocaleOfFontSet() function. It returns the name of the locale bound to the specified XFontSet as a null-terminated string.
• The XFreeFontSet() function frees the specified font set. The associated base font name list, font name list, XFontStruct list, and XFontSetExtents, if any, are freed.
Obtaining Font Set Metrics

Metrics for the internationalized text drawing functions are defined in terms of a primary draw direction, which is the default direction in which the character origin advances for each succeeding character in the string. The Xlib interface is currently defined to support only a left-to-right primary draw direction. The drawing origin is the position passed to the drawing function when the text is drawn. The baseline is a line drawn through the drawing origin parallel to the primary draw direction. Character ink is the pixels painted in the foreground color and does not include interline or intercharacter spacing or image text background pixels.

The drawing functions are allowed to implement implicit text direction control, reversing the order in which characters are rendered along the primary draw direction in response to locale-specific lexical analysis of the string.

Regardless of the character rendering order, the origins of all characters are on the primary draw direction side of the drawing origin. The screen location of a particular character image may be determined with the XmbTextPerCharExtents() or XwcTextPerCharExtents() functions.

The drawing functions are allowed to implement context-dependent rendering, where the glyphs drawn for a string are not simply a combination of the glyphs that represent each individual character. A string of two characters drawn with the XmbDrawString() function may render differently than if the two characters were drawn with separate calls to the XmbDrawString() function. If the client adds or inserts a character in a previously drawn string, the client may need to redraw some adjacent characters to obtain proper rendering.

The drawing functions do not interpret newline characters, tabs, or other control characters. The behavior when nonprinting characters are drawn (other than spaces) is implementation-dependent.
It is the client’s responsibility to interpret control characters in a text stream.

To find out about context-dependent rendering, use the XContextDependentDrawing() function. The XExtentsOfFontSet() function obtains the maximum extents structure given an XFontSet. The XmbTextEscapement() and XwcTextEscapement() functions obtain the escapement in pixels of the specified text as a value. The XmbTextExtents() and XwcTextExtents() functions obtain the overall bounding box of the string’s image and a logical bounding box (overall_ink_return and overall_logical_return arguments respectively). The XmbTextPerCharExtents() and XwcTextPerCharExtents() functions return the text dimensions of each character of the specified text, using the fonts loaded for the specified font set.

Drawing Localized Text

The functions defined in this section draw text at a specified location in a drawable. They are similar to the XDrawText(), XDrawString(), and XDrawImageString() functions except that they work with font sets instead of single fonts, and they interpret the text based on the locale of the font set instead of treating the bytes of the string as direct font indexes. If a BadFont error is generated, characters prior to the offending character may have been drawn.

The text is drawn using the fonts loaded for the specified font set; the font in the graphics context (GC) is ignored and may be modified by the functions. No validation that all fonts conform to some width rule is performed.

Use the XmbDrawText() or XwcDrawText() function to draw text using multiple font sets in a given drawable. To draw text using a single font set in a given drawable, use the XmbDrawString() or XwcDrawString() function. To draw image text using a single font set in a given drawable, use the XmbDrawImageString() or XwcDrawImageString() function.
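Because the drawing functions do not interpret newline characters, a client that wants multi-line output has to split the text itself and position each line. The following is a minimal sketch of that idea in standard C, under stated assumptions: draw_multiline(), draw_line_fn, and record_last_y() are hypothetical names, the callback stands in for a real call such as XmbDrawString(), and scanning for the newline byte assumes an encoding in which that byte never occurs inside a multibyte character.

```c
#include <string.h>

/* Hypothetical callback type standing in for a drawing call such as
 * XmbDrawString(); the closure carries whatever the real call needs. */
typedef void (*draw_line_fn)(const char *text, int nbytes,
                             int x, int y, void *closure);

/* Hypothetical helper: split text at newline characters and pass each
 * line to draw, advancing y by line_height for every line.
 * Returns the number of lines passed to the callback. */
int draw_multiline(const char *text, int x, int y, int line_height,
                   draw_line_fn draw, void *closure)
{
    int lines = 0;
    const char *start = text;

    if (text == NULL)
        return 0;

    for (;;) {
        const char *nl = strchr(start, '\n');
        size_t len = (nl != NULL) ? (size_t)(nl - start) : strlen(start);

        draw(start, (int)len, x, y + lines * line_height, closure);
        lines++;
        if (nl == NULL)
            break;
        start = nl + 1;         /* continue after the newline */
    }
    return lines;
}

/* Example callback: records the y position of the last line drawn. */
void record_last_y(const char *text, int nbytes, int x, int y, void *closure)
{
    (void)text;
    (void)nbytes;
    (void)x;
    *(int *)closure = y;
}
```

In a real client, line_height could plausibly be taken from the logical extents returned by XExtentsOfFontSet(), and tabs or other control characters would need similar client-side handling.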
Inputting Localized Text

The following discusses the Xlib and desktop mechanisms used for international text input. If you are using Motif Text[Field] widgets or you are using the XmIm APIs for text input, this section provides background information. However, it does not affect your application design or coding practice. If you are not interested in how character input is achieved from the keyboard with low-level Xlib calls, you can proceed to “Interclient Communications Conventions for Localized Text.”

Xlib Input Method Overview

This section provides definitions for terms and concepts used for internationalized text input and a brief overview of the intended use of the mechanisms provided by Xlib.

A large number of languages in the world use alphabets consisting of a small set of symbols (letters) to form words. To enter text into a computer in an alphabetic language, a user usually has a keyboard on which there are key symbols corresponding to the alphabet. Sometimes, a few characters of an alphabetic language are missing on the keyboard. Many computer users who speak a Latin-alphabet-based language only have an English-based keyboard. They need to press a combination of keystrokes to enter a character that does not exist directly on the keyboard. A number of algorithms have been developed for entering such characters, known as European input methods, the compose input method, or the dead-keys input method.

Japanese is an example of a language with a phonetic symbol set, where each symbol represents a specific sound. There are two phonetic symbol sets in Japanese: Katakana and Hiragana. In general, Katakana is used for words that are of foreign origin, and Hiragana for writing native Japanese words. Collectively, the two systems are called Kana. Hiragana consists of 83 characters; Katakana, 86 characters.

Korean also has a phonetic symbol set, called Hangul.
Each of the 24 basic phonetic symbols (14 consonants and 10 vowels) represents a specific sound. A syllable is composed of two or three parts: the initial consonants, the vowels, and the optional last consonants. With Hangul, syllables can be treated as the basic units on which text processing is done. For example, a delete operation may work on a phonetic symbol or a syllable. Korean code sets include several thousands of these syllables. A user types the phonetic symbols that make up the syllables of the words to be entered. The display may change as each phonetic symbol is entered. For example, when the second phonetic symbol of a syllable is entered, the first phonetic symbol may change its shape and size. Likewise, when the third phonetic symbol is entered, the first two phonetic symbols may change their shape and size.

Not all languages rely solely on alphabetic or phonetic systems. Some languages, including Japanese and Korean, employ an ideographic writing system. In an ideographic system, rather than taking a small set of symbols and combining them in different ways to create words, each word consists of one unique symbol (or, occasionally, several symbols). The number of symbols may be very large: approximately 50,000 have been identified in Hanzi, the Chinese ideographic system.

Two aspects of ideographic systems matter for their computer usage. First, the standard computer character sets in Japan, China, and Korea include roughly 8,000 characters, while sets in Taiwan have between 15,000 and 30,000 characters, which makes it necessary to use more than one byte to represent a character. Second, it is obviously impractical to have a keyboard that includes all of a given language’s ideographic symbols. Therefore a mechanism is required for entering characters so that a keyboard with a reasonable number of keys can be used. Those input methods are usually based on phonetics, but there are also methods based on the graphical properties of characters.
In Japan, both Kana and Kanji are used. In Korea, Hangul and sometimes Hanja are used. Now, consider entering ideographs in Japan, Korea, China, and Taiwan.

In Japan, either Kana or English characters are entered and a region is selected (sometimes automatically) for conversion to Kanji. Several Kanji characters can have the same phonetic representation. If that is the case, with the string entered, a menu of characters is presented and the user must choose the appropriate option. If no choice is necessary or a preference has been established, the input method does the substitution directly. When Latin characters are converted to Kana or Kanji, it is called a Romaji conversion.

In Korea, it is usually acceptable to keep Korean text in Hangul form, but some people may choose to write Hanja-originated words in Hanja rather than in Hangul. To change Hangul to Hanja, a region is selected for conversion and the user follows the same basic method as described for Japanese.

Probably because there are well-accepted phonetic writing systems for Japanese and Korean, computer input methods in these countries for entering ideographs are fairly standard. Keyboard keys have both English characters and phonetic symbols engraved on them, and the user can switch between the two sets.

The situation is different for Chinese. While there is a phonetic system called Pinyin promoted by authorities, there is no consensus for entering Chinese text. Some vendors use a phonetic decomposition (Pinyin or another), others use ideographic decomposition of Chinese words, with various implementations and keyboard layouts. There are about 16 known methods, none of which is a clear standard.

Also, there are actually two ideographic sets used: Traditional Chinese (the original written Chinese) and Simplified Chinese.
Several years ago, the People’s Republic of China launched a campaign to simplify some ideographic characters and eliminate redundancies altogether. Under the plan, characters would be streamlined every five years. Characters have been revised several times now, resulting in the smaller, simpler set that makes up Simplified Chinese.

Input Method Architecture

As shown in the previous section, there are many different input methods used today, each varying with language, culture, and history. A common feature of many input methods is that the user can type multiple keystrokes to compose a single character (or set of characters). The process of composing characters from keystrokes is called preediting. It may require complex algorithms and large dictionaries involving substantial computer resources.

Input methods may require one or more areas in which to show the feedback of the actual keystrokes, to show ambiguities to the user, to list dictionaries, and so on. The following are the input method areas of concern.

Status area       Intended to be a logical extension of the light-emitting diodes (LEDs) that exist on the physical keyboard. It is a window that is intended to present the internal state of the input method that is critical to the user. The status area may consist of text data and bitmaps or some combination.

Preedit area      Intended to display the intermediate text for those languages that are composing prior to the client handling the data.

Auxiliary area    Used for pop-up menus and customizing dialog boxes that may be required for an input method. There may be multiple auxiliary areas for any input method. Auxiliary areas are managed by the input method independent of the client. Auxiliary areas are assumed to be a separate dialog that is maintained by the input method.

There are various user interaction styles used for preediting. The following are the preediting styles supported by Xlib.
OnTheSpot      Data is displayed directly in the application window. Application data is moved to allow preedit data to be displayed at the point of insertion.

OverTheSpot    Data is displayed in a preedit window that is placed over the point of insertion.

OffTheSpot     Preedit window is displayed inside the application window but not at the point of insertion. Often, this type of window is placed at the bottom of the application window.

Root window    Preedit window is the child of RootWindow.

It would require a lot of computing resources if portable applications had to include input methods for all the languages in the world. To avoid this, a goal of the Xlib design is to allow an application to communicate with an input method placed in a separate process. Such a process is called an input server. The server to which the application should connect is dependent on the environment when the application is started up: what the user language is and the actual encoding to be used for it. The input method connection is said to be locale-dependent. It is also user-dependent; for a given language, the user can choose, to some extent, the user-interface style of input method (if there are several choices).

Using an input server implies communications overhead, but applications can be migrated without relinking. Input methods can be implemented either as a token communicating to an input server or as a local library.

The abstraction used by a client to communicate with an input method is an opaque data structure represented by the XIM data type. This data structure is returned by the XOpenIM() function, which opens an input method on a given display. Subsequent operations on this data structure encapsulate all communication between client and input method. There is no need for an X client to use any networking library or natural language package to use an input method.
A single input server can be used for one or more languages, supporting one or more encoding schemes. But the strings returned from an input method are always encoded in the (single) locale associated with the XIM object.

Input Contexts

Xlib provides the ability to manage a multithreaded state for text input. A client may be using multiple windows, each window with multiple text-entry areas, with the user possibly switching among them at any time. The abstraction for representing the state of a particular input thread is called an input context. The Xlib representation of an input context is an XIC. See Figure 5-1 for an illustration.

Figure 5-1 Input method and input contexts (diagram labels: Input Method, Input Context, Application Window)

An input context is the abstraction retaining the state, properties, and semantics of communication between a client and an input method. An input context is a combination of an input method, a locale specifying the encoding of the character strings to be returned, a client window, internal state information, and various layout or appearance characteristics. The input context is to text input roughly what the graphics context abstraction is to graphics output.

One input context belongs to exactly one input method. Different input contexts can be associated with the same input method, possibly with the same client window. An XIC is created with the XCreateIC() function, providing an XIM argument, affiliating the input context to the input method for its lifetime. When an input method is closed with the XCloseIM() function, no affiliated input contexts should be used again (and should preferably be deleted before closing the input method).
Considering the example of a client window with multiple text-entry areas, the application programmer can choose to implement the following:

• As many input contexts are created as text-entry areas. The client can get the input accumulated on each context each time it looks up that context.
• A single context is created for a top-level window in the application. If such a window contains several text-entry areas, each time the user moves to another text-entry area, the client has to indicate changes in the context.

Application designers can choose a range of single or multiple input contexts, according to the needs of their applications.

Keyboard Input

To obtain characters from an input method, a client must call the XmbLookupString() function or XwcLookupString() function with an input context created from that input method. Both a locale and a display are bound to an input method when it is opened, and an input context inherits this locale and display. Any strings returned by the XmbLookupString() or XwcLookupString() function are encoded in that locale.

Xlib Focus Management

For each text-entry area in which the XmbLookupString() or XwcLookupString() function is used, there is an associated input context.

When the application focus moves to a text-entry area, the application must set the input context focus to the input context associated with that area. The input context focus is set by calling the XSetICFocus() function with the appropriate input context.

Also, when the application focus moves out of a text-entry area, the application should unset the focus for the associated input context by calling the XUnsetICFocus() function. As an optimization, if the XSetICFocus() function is called successively on two different input contexts, setting the focus on the second automatically unsets the focus on the first.

Note – To set and unset the input context focus correctly, it is necessary to track application-level focus changes.
Such focus changes do not necessarily correspond to X server focus changes.

If a single input context is used to do input for multiple text-entry areas, it is also necessary to set the focus window of the input context whenever the focus window changes.

Xlib Geometry Management

In most input method architectures (OnTheSpot being the notable exception), the input method performs the display of its own data. To provide better visual locality, it is often desirable to have the input method areas embedded within a client. To do this, the client may need to allocate space for an input method. Xlib provides support that allows the client to provide the size and position of input method areas. The input method areas that are supported for geometry management are the status area and the preedit area.

The fundamental concept on which geometry management for input method windows is based is the proper division of responsibilities between the client (or toolkit) and the input method. The division of responsibilities is the following:

• The client is responsible for the geometry of the input method window.
• The input method is responsible for the contents of the input method window. It is also responsible for creating the input method window per the geometry constraints given to it by the client.

An input method can suggest a size to the client, but it cannot suggest a placement. The input method can only suggest a size: it does not determine the size, and it must accept the size it is given.

Before a client provides geometry management for an input method, it must determine if geometry management is needed. The input method indicates the need for geometry management by setting the XIMPreeditArea or XIMStatusArea flag in its XIMStyles value returned by the XGetIMValues() function.
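The style check described above can be sketched in C. This is a minimal illustration, not the toolkit's implementation: pick_style() and needs_geometry_management() are hypothetical helper names, and the stand-in definitions are only used when <X11/Xlib.h> has not been included (their values are assumed to mirror that header). A real client would include <X11/Xlib.h> and obtain the supported styles from XGetIMValues().

```c
#include <stddef.h>

/* Stand-ins so the sketch compiles without X11 headers; a real client
 * includes <X11/Xlib.h>, which defines these names itself. */
#ifndef XIMPreeditArea
typedef unsigned long XIMStyle;
#define XIMPreeditArea     0x0001L
#define XIMPreeditNothing  0x0008L
#define XIMStatusArea      0x0100L
#define XIMStatusNothing   0x0400L
#endif

/* Return the first style in `wanted` (ordered by client preference) that the
 * input method also lists in `supported`, or 0 if there is no common style. */
XIMStyle pick_style(const XIMStyle *supported, size_t n_supported,
                    const XIMStyle *wanted, size_t n_wanted)
{
    size_t i, j;

    for (i = 0; i < n_wanted; i++)
        for (j = 0; j < n_supported; j++)
            if (wanted[i] == supported[j])
                return wanted[i];
    return 0;
}

/* Nonzero when the style requires the client to manage the geometry of the
 * preedit area or the status area. */
int needs_geometry_management(XIMStyle style)
{
    return (style & (XIMPreeditArea | XIMStatusArea)) != 0;
}
```

A client would typically pass the chosen style as the XNInputStyle value when it creates the input context, and negotiate geometry only when needs_geometry_management() reports that it is required.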
When a client decides to provide geometry management for an input method, it indicates that decision by setting the XNInputStyle value in the XIC.

After a client has established with the input method that it will do geometry management, the client must negotiate the geometry with the input method. The geometry is negotiated by the following steps:

• The client suggests an area to the input method by setting the XNAreaNeeded value for that area. If the client has no constraints for the input method, it either does not suggest an area or sets the width and height to 0 (zero). Otherwise, it sets one of the values.
• The client gets the XIC XNAreaNeeded value. The input method returns its suggested size in this value. The input method should pay attention to any constraints suggested by the client.
• The client sets the XIC XNArea value to inform the input method of the geometry of the input method’s window. The client should try to honor the geometry requested by the input method. The input method must accept this geometry.

Clients performing geometry management must be aware that setting other IC values may affect the geometry desired by an input method. For example, the XNFontSet and XNLineSpacing values may change the geometry desired by the input method. It is the responsibility of the client to renegotiate the geometry of the input method window when it is needed.

In addition, a geometry management callback is provided by which an input method can initiate a geometry change.

Event Filtering

A filtering mechanism is provided to allow input methods to capture X events transparently to clients. It is expected that toolkits (or clients) using the XmbLookupString() or XwcLookupString() function call this filter at some point in the event processing mechanism to make sure that events needed by an input method can be filtered by that input method.
If there is no filter, a client can receive and discard events that are necessary for the proper functioning of an input method. The following provides a few examples of such events:

• Expose events that are on a preedit window in local mode.
• Key events, which can be sent to a filter before they are bound to translations such as Xt provides.
• Events used by an input method to communicate with an input server. Such input server protocol-related events have to be intercepted if client code is not to be disturbed.

Clients are expected to get the XIC XNFilterEvents value and augment the event mask for the client window with that event mask. This mask can be 0.

Callbacks

When an OnTheSpot input method is implemented, only the client can insert or delete preedit data in place and possibly scroll existing text. This means the echo of the keystrokes has to be achieved by the client itself, tightly coupled with the input method logic.

When a keystroke is entered, the client calls the XmbLookupString() or XwcLookupString() function. At this point, in the OnTheSpot case, the echo of the keystroke in the preedit has not yet been done. Before returning to the client logic that handles the input characters, the lookup function must call the echoing logic for inserting the new keystroke. If the keystrokes entered so far make up a character, the keystrokes entered need to be deleted, and the composed character is returned. The result is that, while being called by client code, input method logic has to call back to the client before it returns. The client code, that is, a callback routine, is called from the input method logic.

There are a number of cases where the input method logic has to call back the client. Each of those cases is associated with a well-defined callback action. It is possible for the client to specify, for each input context, which callback is to be called for each action.
There are also callbacks provided for feedback of status information and a callback to initiate a geometry request for an input method.

X Server Keyboard Protocol

This section discusses the server and keyboard groups.

A keysym is the encoding of a symbol on a keycap. The goal of the server’s keysym mapping is to reflect the actual keycaps on the physical keyboards. The user can redefine the keyboard by running the xmodmap command with the new mapping desired.

X Version 11 Release 4 (X11R4) allows for definition of a bilingual keyboard at the server. The following describes this capability.

A list of keysyms is associated with each key code. The following list discusses the set of symbols on the corresponding key:

• If the list (ignoring trailing NoSymbol entries) is a single keysym K, the list is treated as if it were the list K NoSymbol K NoSymbol.
• If the list (ignoring trailing NoSymbol entries) is a pair of keysyms K1 K2, the list is treated as if it were the list K1 K2 K1 K2.
• If the list (ignoring trailing NoSymbol entries) is three keysyms K1 K2 K3, the list is treated as if it were the list K1 K2 K3 NoSymbol.

When an explicit void element is desired in the list, the VoidSymbol value can be used.

The first four elements of the list are split into two groups of keysyms. Group 1 contains the first and second keysyms; Group 2 contains the third and fourth keysyms. Within each group, if the second element of the group is NoSymbol, the group is treated as if the second element were the same as the first element, except when the first element is an alphabetic keysym K for which both lowercase and uppercase forms are defined. In that case, the group is treated as if the first element is the lowercase form of K and the second element is the uppercase form of K.
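The list-expansion and group rules above can be sketched in C. This is a minimal illustration under stated assumptions: expand_keysym_list(), struct keysym_groups, and NO_SYMBOL are hypothetical names, keysyms are held as plain unsigned long values with 0 standing in for NoSymbol, and case pairing is limited to ASCII letters (whose Latin-1 keysym values equal their character codes).

```c
#include <ctype.h>
#include <stddef.h>

#define NO_SYMBOL 0UL            /* stand-in for the X NoSymbol value */

/* sym[0], sym[1] form Group 1; sym[2], sym[3] form Group 2. */
struct keysym_groups {
    unsigned long sym[4];
};

/* Expand the keysym list of one key code into two complete groups. */
struct keysym_groups expand_keysym_list(const unsigned long *list, size_t len)
{
    struct keysym_groups g;
    size_t i;

    /* Ignore trailing NoSymbol entries. */
    while (len > 0 && list[len - 1] == NO_SYMBOL)
        len--;

    /* Start from the first four entries, padding with NoSymbol. */
    for (i = 0; i < 4; i++)
        g.sym[i] = (i < len) ? list[i] : NO_SYMBOL;

    if (len == 1) {                /* K      ->  K NoSymbol K NoSymbol */
        g.sym[2] = list[0];
    } else if (len == 2) {         /* K1 K2  ->  K1 K2 K1 K2 */
        g.sym[2] = list[0];
        g.sym[3] = list[1];
    }                              /* three keysyms are already K1 K2 K3 NoSymbol */

    /* Within each group, fill in a NoSymbol second element. */
    for (i = 0; i < 4; i += 2) {
        unsigned long first = g.sym[i];

        if (g.sym[i + 1] != NO_SYMBOL)
            continue;
        if (first < 128 && isalpha((int)first)) {
            /* Alphabetic: lowercase form first, uppercase form second. */
            g.sym[i] = (unsigned long)tolower((int)first);
            g.sym[i + 1] = (unsigned long)toupper((int)first);
        } else {
            g.sym[i + 1] = first;
        }
    }
    return g;
}
```

For example, a key whose list holds only the keysym for the letter a expands to a A a A, while the pair 1 ! expands to 1 ! 1 !.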
The standard rules for obtaining a keysym from an event make use of the Group 1 and Group 2 keysyms only; no interpretation of other keysyms in the list is given here. The modifier state determines which group to use. Switching between groups is controlled by the keysym named MODE SWITCH by attaching that keysym to some key code and attaching that key code to any one of the modifiers Mod1 through Mod5. This modifier is called the group modifier. For any key code, Group 1 is used when the group modifier is off, and Group 2 is used when the group modifier is on.

Within a group, the keysym to use is also determined by the modifier state. The first keysym is used when the Shift and Lock modifiers are off. The second keysym is used when the Shift modifier is on, or when the Lock modifier is on and the second keysym is uppercase alphabetic, or when the Lock modifier is on and is interpreted as ShiftLock. Otherwise, when the Lock modifier is on and is interpreted as CapsLock, the state of the Shift modifier is applied first to select a keysym; if that keysym is lowercase alphabetic, the corresponding uppercase keysym is used instead.

No spatial geometry of the symbols on the key is defined by their order in the keysym list, although a geometry might be defined on a vendor-specific basis. The server does not use the mapping between key codes and keysyms. Rather, it stores it merely for reading and writing by clients.

The KeyMask modifier named Lock is intended to be mapped to either a CapsLock or a ShiftLock key, but which one it is mapped to is left as an application-specific decision, user-specific decision, or both. However, it is suggested that the mapping be determined according to the associated keysyms of the corresponding key code.
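The selection rules above can be sketched in C. This is a minimal illustration, not server or Xlib code: select_keysym() and enum lock_meaning are hypothetical names, the four-element array holds the Group 1 and Group 2 keysyms in order, and the alphabetic tests are limited to ASCII letters.

```c
#include <ctype.h>

/* How the client interprets the Lock modifier. */
enum lock_meaning { LOCK_IS_CAPS, LOCK_IS_SHIFT };

static int is_upper_keysym(unsigned long ks) { return ks < 128 && isupper((int)ks); }
static int is_lower_keysym(unsigned long ks) { return ks < 128 && islower((int)ks); }

/* syms[0..1] are Group 1 and syms[2..3] are Group 2; the three flags give
 * the state of the group, Shift, and Lock modifiers. */
unsigned long select_keysym(const unsigned long syms[4], int group_on,
                            int shift_on, int lock_on, enum lock_meaning lock)
{
    /* The group modifier chooses between Group 1 and Group 2. */
    const unsigned long *g = syms + (group_on ? 2 : 0);

    if (!shift_on && !lock_on)
        return g[0];

    if (shift_on ||
        (lock_on && is_upper_keysym(g[1])) ||
        (lock_on && lock == LOCK_IS_SHIFT))
        return g[1];

    /* Lock is on and read as CapsLock, and Shift is off: Shift selects the
     * first keysym, which is then raised to uppercase if it is lowercase. */
    if (is_lower_keysym(g[0]))
        return (unsigned long)toupper((int)g[0]);
    return g[0];
}
```

With the groups a A b B, no modifiers select a, Shift selects A, and the group modifier with Shift selects B. With the group 1 !, CapsLock leaves 1 unchanged, whereas ShiftLock selects !.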
Interclient Communications Conventions for Localized Text

The following information explains how components use Interclient Communications Conventions (ICCC) to communicate text data and is offered as a guideline to understand how ICCC selections are performed. The XmText widget, XmTextField widget, and the dtterm command adhere to these guidelines.

The toolkit is enhanced for internationalized ICCC compliance. The selection mechanism of XmText, XmTextField, and dtterm is enhanced to ensure proper matching of data and data encoding in any selection transaction. This includes standard cut-and-paste operations.

For developers who use the toolkit to write their applications, the toolkit enables the application to be ICCC-compliant. However, for developers who may use another non-ICCC-compliant toolkit to develop applications that communicate with toolkit-based applications, the following may be helpful.

Owner of Selection

Any owner returns at least the following atom list when XA_TARGETS is requested on some localized text:

• Atom for the code set of the current locale
• COMPOUND_TEXT
• XA_STRING

When XA_TEXT is requested, the owner returns its text as is with the encoding type of the property set to the code set of the current locale (no data conversion). An atom is created, representing the name of the code set of the locale.

When COMPOUND_TEXT is requested, the owner converts its localized text to compound text and passes it with the property type of COMPOUND_TEXT.

When XA_STRING is requested, the owner attempts to convert the localized text to XA_STRING. If the text string contains characters that cannot be converted to XA_STRING, the operation is unsuccessful.

Note – XA_STRING is defined to be ISO8859-1.

Requester of Selection

A requester first requests XA_TARGETS when text data is to be communicated with the selection owner.
The requester then searches for one of the following atoms in priority order:

• Atom for the code set of the requester’s locale
• COMPOUND_TEXT
• XA_STRING
• XA_TEXT

If the code set of the requester’s locale matches one of the targets, the requester makes a request using the atom representing that code set. The XA_TEXT atom is used only if none of the other atoms is found. Because the owner returns a property with a type representing its encoding, the requester attempts to convert to the code set of its locale.

If the type COMPOUND_TEXT or XA_STRING is requested, the requester attempts to convert the text property to the code set of its current locale by using the XmbTextPropertyToTextList() or XwcTextPropertyToTextList() functions. These are used when the owner client and requester client are running under different code sets.

When converting from COMPOUND_TEXT or XA_STRING, not all text data is guaranteed to be converted; only those characters that are in common between the owner and the requester will be converted.

XmClipboard

XmClipboard is also enhanced to be ICCC-compliant in conjunction with the XmText and XmTextField widgets. When text is being put on the clipboard by way of the XmText and XmTextField widgets, the following ICCC protocol is implemented:

When text is being retrieved from the clipboard by way of the XmText and XmTextField widgets, the text from the clipboard is converted to the encoding of the current locale from either COMPOUND_TEXT or XA_STRING. All text on the clipboard is assumed to be in either the compound text format or the string format.

Note – If text is put directly on the clipboard, the application needs to specify the format, or encoding type in the form of an atom, along with the text to put on the clipboard.
Similarly, if text is retrieved directly from the clipboard, the retrieving application needs to check the format to see what encoding the data on the clipboard is encoded in and take the appropriate action.

Passing Window Title and Icon Name to Window Managers

The default of the XtNtitleEncoding and XtNiconNameEncoding resources for the VendorShell class is set to None. This is done only when using the libXm.a library. The libXt.a library still retains XA_STRING as the default for the resources. This is done so that, as a default case, the XmNtitle and XmNiconName resources are converted to a standard ICCC interchange, such as compound text, based on the assumption the text (title and icon name) is localized text.

It is recommended that the user not set the XtNtitleEncoding and XtNiconNameEncoding resources. Instead, ensure that the XtNtitle and XtNiconName resources are strings encoded in the encoding of the currently active locale of the running client. If the None value is used, the toolkit converts the localized text to the standard ICCC style. (The encoding communicated is COMPOUND_TEXT or XA_STRING.) If the XtNtitleEncoding and XtNiconNameEncoding resources are set, the XtNtitle and XtNiconName resources are not converted in any way and are communicated to the Window Manager with the encoding specified.

Assuming the Window Manager being communicated with is ICCC-compliant, that Window Manager is able to use the encoding type of COMPOUND_TEXT or XA_STRING, or both.

When setting the XmNdialogTitle resource of the XmBulletinBoard widget class, remember that there is a restriction on the charset segment. For charsets that are not X Consortium-standard compound text encodings or XmFONTLIST_DEFAULT_TAG-associated, the text segment is treated as localized text. Localized text is converted to either compound text or ISO8859-1 before being communicated to the Window Manager.
The Window Manager is enhanced so that it always converts the client title and icon name passed from clients to the encoding of its current locale, and an XmString is created using the XmFONTLIST_DEFAULT_TAG identifier. Thus, the client title and icon name are always drawn with the default font list entry of the Window Manager font list.

Note – This allows clients running with different code sets but with similar character sets to communicate their titles to the Window Manager. For example, both a PC code client and an ISO8859-1 client can display their titles regardless of the code set of the Window Manager.

Messages

Part of internationalizing a system environment toolkit-based application is not to have any locale-specific data hardcoded within the application source. One common locale-specific item is messages (error and warning) written by the application to standard I/O (input/output).

In general, for any error or warning messages to be displayed to the user through a system environment toolkit widget or gadget, the messages need to be externalized through message catalogs.

For dialog messages to be displayed through a toolkit component, the messages need to be externalized through localized resource files. This is done in the same way as localizing resources, such as the XmLabel and XmPushButton classes’ XmNlabelString resource or window titles.

For example, if a warning message is to be displayed through an XmMessageBox widget class, the XmNmessageString resource cannot be hardcoded within the application source code. Instead, the value of this resource needs to be retrieved from a message catalog. For an internationalized application expected to run in different locales, a distinct localized catalog must exist for each of the locales to be supported. In this way, the application need not be rebuilt.
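For messages that are externalized through message catalogs, one possible retrieval path is the X/Open catopen() and catgets() interface. The following is a minimal sketch, not code from the toolkit: open_catalog() and get_message() are hypothetical helper names, and the set and message numbers passed to them are whatever the application's own catalog defines.

```c
#include <locale.h>
#include <nl_types.h>

/* Open the named message catalog for the user's locale.
 * Returns (nl_catd)-1 when no catalog can be found. */
nl_catd open_catalog(const char *name)
{
    setlocale(LC_ALL, "");                  /* adopt the locale from the environment */
    return catopen(name, NL_CAT_LOCALE);    /* locate the catalog through LC_MESSAGES */
}

/* Return message msg_id of set set_id, or the fallback text when the catalog
 * is unavailable or does not contain the message. */
const char *get_message(nl_catd catd, int set_id, int msg_id, const char *fallback)
{
    if (catd == (nl_catd)-1)
        return fallback;
    return catgets(catd, set_id, msg_id, fallback);
}
```

The string returned by get_message() stays valid only while the catalog is open, so the catalog is normally closed with catclose() at exit. A Motif client could convert the returned text with XmStringCreateLocalized() before setting it as the XmNmessageString resource.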
The localized resource files can be put in the /opt/dt/app-defaults/%L subdirectories or they can be pointed to by the XENVIRONMENT environment variable. The %L variable indicates the locale used at run time.

The preceding two choices are left as design decisions for the application developer.

Appendix A Message Guidelines

Refer to the information in this appendix to write messages that are easily internationalized.

• File-Naming Conventions
• Cause and Recovery Information
• Comment Lines for Translators
• Writing Style
• Usage Statements
• Regular Expression Standard Messages
• Sample Messages

File-Naming Conventions

The conventions used in naming files with user messages are discussed here. Usually, the message source file has the suffix .msg; the generated message catalog has the suffix .cat. There may be other such files related to messages. The following criteria must be met for a file to have these suffixes:

• It is X/Open-compliant.
• It becomes a *.cat file through the use of the gencat command.

Cause and Recovery Information

Whenever possible, explain to users exactly what has happened and what they can do to remedy the situation. The message Bad arg is not very helpful. However, the following message tells users exactly what to do to make the command work:

Do not specify more than 2 files on the command line

Similarly, the message Line too long does not give users recovery information. However, the following message gives users more specific recovery information:

Line cannot exceed 20 characters

If detailed recovery information is necessary for a given error message, add it to the appropriate place in online information or help.

See “Sample Messages” for samples of original and rewritten messages.
Comment Lines for Translators

A message source file should contain comments to help the translator in the process of translation. These comments will not be part of the generated message catalog. The comments are similar to the C language comments that help document a program. A dollar sign ($) followed by a space is interpreted by the translation tool and the gencat command as the start of a comment. The following is an example of a comment line in a message source file.

$ This is a comment

Use comment lines to tell translators and writers what variables, such as %s, %c, and %d, represent. For example, note whether the variable refers to such things as a user, file, directory, or flag. Place the comment line directly beneath the message to which it refers, rather than at the bottom of the message catalog. Global comments for an entire set can be placed directly below the $set directive in the source file.

Specify in a comment line any messages within the message catalog that are obsolete.

Programming Format

For the programming format of messages, see the following list.

• Do not construct messages from clauses. Use flags or other means within the program to pass information so that a complete message can be issued at the proper time.
• Do not use hardcoded English text as a variable for a %s string in an existing message. This is also a form of message construction and is not translatable.
• Capitalize the first word of the sentence, and use a period at the end of the sentence or phrase.
• End the last line of the message with \n (backslash followed by a lowercase n, indicating a new line). This also applies to one-line messages.
• Begin the second and remaining lines of a message with \t (backslash followed by a lowercase t, indicating a tab).
• End all other lines with \n\ (backslash followed by a lowercase n, followed by another backslash, indicating a new line).
• If, for some reason, the message should not end with a new line, use a comment to tell the writers.
• Precede each message with the name of the command that called the message, followed by a colon. The command name should precede the component number in error messages. The command name is shown in the following example as it should appear in a message:

OPIE “foo: Opening the file.”

Writing Style

The following guidelines on the writing style of messages include terminology, punctuation, mood, voice, tense, capitalization, and other usage questions.

• Use sentence format. One-line and one-sentence messages are preferable.
• Add articles (a, an, the) when necessary to eliminate ambiguity.
• Capitalize the first word of the sentence and use a period at the end.
• Use the present tense. Do not allow future tense in a message. For example, use the sentence:

The foo command displays a calendar.

Instead of:

The foo command will display a calendar.

• Do not use the first person (I or we) anywhere in messages.
• Avoid using the second person. Do not use the word you except in help and interactive text.
• Use active voice. The first line is the original message. The second line is the preferred wording.

MYNUM “Month and year must be entered as numbers.”
MYNUM “foo: 7777-222 Enter month and year as numbers.\n”

7777-222 is the message ID.

• Use the imperative mood (command phrase) and active verbs: specify, use, check, choose, and wait are examples.
• State messages in a positive tone. The first line is the original message. The second line is the preferred wording.

BADL “Don’t use the f option more than once.”
BADL “foo: 7777-009 Use the -f flag only once.\n”

• Do not use nouns as verbs. Use words only in the grammatical categories shown in the dictionary. If a word is shown only as a noun, do not use it as a verb. For example, do not solution a problem (or, for that matter, architect a system).
• Do not use prefixes or suffixes.
Translators may not understand words beginning with re-, un-, in-, or non-, and the translations of messages that use these prefixes or suffixes may not have the meaning you intended. Exceptions to this rule occur when the prefix is an integral part of a commonly used word. The words previous and premature are acceptable; the word nonexistent is not.
• Do not use plurals. Do not use parentheses to show singular or plural, as in error(s), which cannot be translated. If you must show singular and plural, write error or errors. A better way is to condition the code so that two different messages are issued depending on whether the singular or plural of a word is required.
• Do not use contractions. Use the single word cannot to denote something the system is unable to do.
• Do not use quotation marks. This includes both single and double quotation marks. For example, do not use quotation marks around variables such as %s, %c, and %d or around commands. Users may take the quotation marks literally.
• Do not hyphenate words at the end of lines.
• Do not use and/or. This construction does not exist in other languages. Usually it is better to say or to indicate that it is not necessary to do both.
• Use the 24-hour clock. Do not use a.m. or p.m. to specify time. For example, write 1:00 p.m. as 1300.
• Avoid acronyms. Only use acronyms that are better known to your audience than their spelled-out versions. To make a plural of an acronym, add a lowercase s, without an apostrophe. Verify that it is not a trademark before using it.
• Avoid the “no-no” words. Examples are abort, argument, and execute. See the project glossary.
• Retain meaningful terminology. Keep as much of the original message text as possible while ensuring that the message is meaningful and translatable.

Do not use the standard highlighting guidelines in messages, and do not substitute initial or all caps for other highlighting practices.
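The advice on plurals, conditioning the code so that two complete messages exist rather than one message assembled from parts, can be sketched in C. This is a minimal illustration: error_count_format() and report_error_count() are hypothetical names, the message IDs 7777-101 and 7777-102 are invented for the example, and a real application would retrieve both strings from its message catalog instead of hardcoding them.

```c
#include <stdio.h>

/* Select one of two complete, separately translatable messages.
 * Neither message is assembled from clauses or a shared "error(s)" text. */
const char *error_count_format(int count)
{
    if (count == 1)
        return "foo: 7777-101 The file contains %d error.\n";
    return "foo: 7777-102 The file contains %d errors.\n";
}

/* Write the message that matches the count to the given stream. */
void report_error_count(FILE *out, int count)
{
    fprintf(out, error_count_format(count), count);
}
```

Calling report_error_count(stderr, 3) writes the plural message. Because each variant is a whole sentence, a translator can reword either one freely.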
Usage Statements

The usage statement is generated by commands when at least one flag that is not valid has been included in the command line. The usage statement must not be used if only the data associated with a flag is missing or incorrect. If this occurs, an error message unique to the problem is used.

• Show the command syntax in the usage statement. For example, a possible usage statement for the del command reads:

Usage: del {File ...|-}

• Clauses defining the purpose of a command are to be removed.
• Do not abbreviate parameters on the command line. It may be perfectly obvious to experienced users that Num means Number, but spell it out to ensure correct translation.
• Capitalize the first letter of such words (parameters) as File, Directory, String, Number, and so on only when used in a usage statement.
• Use only the following delimiters in usage statements:

Delimiter   Description
[]          Parameter is optional.
{}          There is more than one parameter choice, but one of the parameters is required. (See the following text.)
|           Choose one parameter only. [a|b] indicates that you can choose a or b or neither a nor b. {a|b} indicates that you must choose either a or b.
...         Parameter can be repeated on the command line. (Note that there is a space before the ellipsis.)
-           Standard input.

• A usage statement parameter does not require square brackets or braces if it is required and is the only choice, as in the following:

banner String

• In usage statements, put a space between flags that must be separated on the command line. For example:

unget [-n] [-rSID] [-s] {File|-}

• If flags can be used together without a separating space, do not separate them with a space on the command line. For example:

wc [-cwl] {File ...|-}

• When the order of flags on the command line does not make a difference, put them in alphabetical order.
If the case is mixed, put lowercase versions first:

get -aAijlmM

• Some usage statements can be long and involved. Use your best judgment to determine where you should end lines in the usage statement. The following example shows an old-style usage statement for the get command:

Usage: get [-e|-k] [-cCutoff] [-iList] [-rSID] [-wString] [-xList] [-b] [-gmnpst] [-l[p]] File ...
Retrieves a specified version of a Source Code Control System (SCCS) file.

Standard Messages

Certain commands have standard errors defined in POSIX.2 documentation. Follow the guidelines set up in POSIX.2, if applicable.

• Tell the user to Press the ------ key to select a key on the keyboard, including the specific key to press (such as, Press Ctrl-D).
• Unless the system is overloaded, there is no need to tell the user to Try again later. That should be obvious from the message.
• When writing message text, use the word parameter to describe text on the command line; use the word value to indicate numeric data.
• Use the word flag rather than the words command option.
• Do not use commas to set off the one-thousandth place in values. Do not use 1,000. Use 1000.
• If a message must be set off with an asterisk, use two asterisks at the beginning of the message and two asterisks at the end of the message.

** Total **

• Use log in and log off as verbs.

Log in to the system; enter the data; then log off.

• Use user name, group name, and login as nouns.

The user name is sam. The group name is staff. The login directory is /u/sam.

• User number and group number refer to the number associated with the user’s name and group.
• Do not use the term superuser. The root user may not have all privileges.
• Use the words command string to describe the command with its parameters.

Many of the same messages occur frequently. Table A-1 lists the new standard messages that replace the old messages.
Table A-1 New Standard Messages

Use the Following Standard Messages       Instead of These Messages
Cannot find or open the file.             Can’t open filename.
Cannot find or access the file.           Can’t access
The syntax of a parameter is not valid.   syntax error

Regular Expression Standard Messages

Table A-2 lists the standard regular expression error messages, including the message number associated with each regular expression error:

Table A-2 Regular Expression Standard Messages

Number  Use These Standard Messages                                           Instead of These Messages
11      Specify a range end point that is less than 256.                      Range end point too large.
16      The character or characters between \{ and \} must be numeric.        Bad number.
25      Specify a \digit between 1 and 9 that is not greater than the         \digit out of range.
        number of subpatterns.
36      A delimiter is not correct or is missing.                             Illegal or missing delimiter.
41      There is no remembered search string.                                 No remembered search string.
42      There is a missing \( or \).                                          \(\) imbalance.
43      Do not use \( more than 9 times.                                      Too many \(.
44      Do not specify more than 2 numbers between \{ and \}.                 More than two numbers given in \{ and \}.
45      An opening \{ must have a closing \}.                                 } expected after \.
46      The first number cannot exceed the second number between \{ and \}.   First number exceeds second in \{ and \}.
48      Specify a valid end point to the range.                               Invalid end point in range expression.
49      For each [ there must be a ].                                         [ ] imbalance.
50      The regular expression is too large for internal memory storage.      Regular expression overflow.
        Simplify the regular expression.

Sample Messages

These are examples of original messages and rewritten messages. The rewritten message follows each original message.
AFLGKEYLTRS "Too Many -a Keyletters (Ad9)"
AFLGKEYLTRS "foo: 7777-007 Use the -a flag less than 11 times.\n"

FLGTWICE "Flag %c Twice (Ad4)"
FLGTWICE "foo: 7777-004 Use the %c header flag once.\n"

ESTAT "can't access %s.\n"
ESTAT "foo: 7777-031 Cannot find or access %s.\n"

EMODE "foo: invalid mode\n"
EMODE "foo: 7777-033 A mode flag or value is not correct.\n"

DNORG "-d has no argument (ad1)"
DNORG "foo: 7777-001 Specify a parameter after the -d flag.\n"

FLOORRNG "floor out of range (ad23)"
FLOORRNG "foo: 7777-021 Specify a floor value greater than 0\n\
\tand less than 10000.\n"

AFLGARG "bad -a argument (ad8)"
AFLGARG "foo: 7777-006 Specify a user name, group name, or\n\
\tgroup number after the -a flag.\n"

BADLISTFMT "bad list format (ad27)"
BADLISTFMT "foo: 7777-025 Use numeric version and release\
\tnumbers.\n"
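The rule at the start of Usage Statements, that a flag that is not valid produces the usage statement while missing flag data produces a message of its own, can be sketched in C. This is only an illustration under stated assumptions: the command foo, its flags -d and -n, and the helper names check_flags and flag_message are hypothetical, and the hand-written flag scan stands in for whatever argument parser a real command would use. The numbered message copies the layout of the rewritten DNORG sample.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical command name and usage text; neither describes a real command. */
#define COMMAND "foo"
#define USAGE   "Usage: foo [-d Parameter] [-n] {File ...|-}\n"

/* Outcome of checking the flags on a command line. */
enum flag_status { FLAGS_OK, FLAG_NOT_VALID, FLAG_DATA_MISSING };

/*
 * Check every argument that begins with '-' and is longer than one character.
 * Flags assumed for this sketch: -n takes no data, -d takes one parameter in
 * the next argument. A lone "-" stands for standard input and is not a flag.
 */
enum flag_status check_flags(int argc, char *argv[])
{
    for (int i = 1; i < argc; i++) {
        const char *arg = argv[i];

        if (arg[0] != '-' || arg[1] == '\0')
            continue;                      /* file name or "-" */
        if (strcmp(arg, "-n") == 0)
            continue;
        if (strcmp(arg, "-d") == 0) {
            if (i + 1 >= argc)
                return FLAG_DATA_MISSING;  /* -d given without its parameter */
            i++;                           /* skip the parameter */
            continue;
        }
        return FLAG_NOT_VALID;             /* any other flag is not valid */
    }
    return FLAGS_OK;
}

/*
 * Write the text to show the user into buf: the usage statement for a flag
 * that is not valid, a specific numbered message for missing flag data, or
 * an empty string when the flags are acceptable.
 */
void flag_message(enum flag_status status, char *buf, size_t size)
{
    switch (status) {
    case FLAG_NOT_VALID:
        snprintf(buf, size, "%s", USAGE);
        break;
    case FLAG_DATA_MISSING:
        /* Layout follows the DNORG sample: command, message number, text. */
        snprintf(buf, size, "%s: %04d-%03d %s\n", COMMAND, 7777, 1,
                 "Specify a parameter after the -d flag.");
        break;
    default:
        if (size > 0)
            buf[0] = '\0';
        break;
    }
}
```

For a command line such as foo -x File, check_flags returns FLAG_NOT_VALID and flag_message yields the usage statement; for foo -d with nothing after it, the result is the numbered message instead. In an internationalized command, both strings would normally be retrieved from a message catalog rather than compiled in.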