002093 A00_Domain_C_Language_Reference_Jul88 A00 Domain C Language Reference Jul88
User Manual: 002093-A00_Domain_C_Language_Reference_Jul88
Page Count: 554
Domain C Language Reference
Order No. 002093-A00

Apollo Computer Inc.
330 Billerica Road
Chelmsford, MA 01824

Confidential and Proprietary. Copyright © 1988 Apollo Computer, Inc., Chelmsford, Massachusetts. Unpublished -- rights reserved under the Copyright Laws of the United States. All Rights Reserved.

First Printing: October 1982
Latest Printing: July 1988

This document was produced using the Interleaf Technical Publishing Software (TPS) and the InterCAP Illustrator I Technical Illustrating System, a product of InterCAP Graphics Systems Corporation. Interleaf and TPS are trademarks of Interleaf, Inc.

Apollo and Domain are registered trademarks of Apollo Computer Inc.

ETHERNET is a registered trademark of Xerox Corporation.

Personal Computer AT and Personal Computer XT are registered trademarks of International Business Machines Corporation.

Copyright 1979, 1980, 1983, 1986 Regents of the University of California and 1979, AT&T Bell Laboratories, Incorporated.

UNIX is a registered trademark of AT&T in the USA and other countries.

3DGMR, Aegis, D3M, DGR, Domain/Access, Domain/Ada, Domain/Bridge, Domain/C, Domain/ComController, Domain/CommonLISP, Domain/CORE, Domain/Debug, Domain/DFL, Domain/Dialogue, Domain/DQC, Domain/IX, Domain/Laser-26, Domain/LISP, Domain/PAK, Domain/PCC, Domain/PCI, Domain/SNA, Domain X.25, DPSS, DPSS/Mail, DSEE, FPX, GMR, GPR, GSR, NLS, Network Computing Kernel, Network Computing System, Network License Server, Open Dialogue, Open Network Toolkit, Open System Toolkit, Personal Supercomputer, Personal Super Workstation, Personal Workstation, Series 3000, Series 4000, Series 10000, and VCD-8 are trademarks of Apollo Computer Inc.

Apollo Computer Inc. reserves the right to make changes in specifications and other information contained in this publication without prior notice, and the reader should in all cases consult Apollo Computer Inc. to determine whether any such changes have been made.
THE TERMS AND CONDITIONS GOVERNING THE SALE OF APOLLO COMPUTER INC. HARDWARE PRODUCTS AND THE LICENSING OF APOLLO COMPUTER INC. SOFTWARE PROGRAMS CONSIST SOLELY OF THOSE SET FORTH IN THE WRITTEN CONTRACTS BETWEEN APOLLO COMPUTER INC. AND ITS CUSTOMERS.

NO REPRESENTATION OR OTHER AFFIRMATION OF FACT CONTAINED IN THIS PUBLICATION, INCLUDING BUT NOT LIMITED TO STATEMENTS REGARDING CAPACITY, RESPONSE-TIME PERFORMANCE, SUITABILITY FOR USE OR PERFORMANCE OF PRODUCTS DESCRIBED HEREIN SHALL BE DEEMED TO BE A WARRANTY BY APOLLO COMPUTER INC. FOR ANY PURPOSE, OR GIVE RISE TO ANY LIABILITY BY APOLLO COMPUTER INC. WHATSOEVER.

IN NO EVENT SHALL APOLLO COMPUTER INC. BE LIABLE FOR ANY INCIDENTAL, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES WHATSOEVER (INCLUDING BUT NOT LIMITED TO LOST PROFITS) ARISING OUT OF OR RELATING TO THIS PUBLICATION OR THE INFORMATION CONTAINED IN IT, EVEN IF APOLLO COMPUTER INC. HAS BEEN ADVISED, KNEW OR SHOULD HAVE KNOWN OF THE POSSIBILITY OF SUCH DAMAGES.

THE SOFTWARE PROGRAMS DESCRIBED IN THIS DOCUMENT ARE CONFIDENTIAL INFORMATION AND PROPRIETARY PRODUCTS OF APOLLO COMPUTER INC. OR ITS LICENSORS.

Preface

The Domain C Language Reference manual describes the Domain C programming language and the Domain programming environment relevant to C programmers.

We've organized this manual as follows:

Chapter 1    Presents an overview of the Domain C compiler.
Chapter 2    Describes the lexical components of a C program (such as identifiers, comments, and keywords), and describes the general organization of C programs.
Chapter 3    Describes data types and storage classes, and the syntax and semantics of declaring variables.
Chapter 4    Provides encyclopedic descriptions of all C language statements and operators, as well as descriptions of general C programming concepts.
Chapter 5    Provides details about declaring and invoking functions.
Chapter 6    Describes compiler options and the compilation/linking process.
Chapter 7    Describes how to call FORTRAN and Pascal routines from a C program, and how to share global data with routines written in other languages.
Chapter 8    Provides an overview of input and output operations that can be performed with the standard C run-time library and the UNIX system library.
Chapter 9    Describes the types of diagnostic messages that the compiler issues, and lists each message along with its probable cause.
Appendix A   Lists the ISO Latin-1 code values.
Appendix B   Lists Domain extensions to the C programming language.
Appendix C   Describes the BSD version of the lint utility.
Appendix D   Describes the SysV version of the lint utility.
Appendix E   Describes the std_$call keyword, which is now obsolete.

Revision History

Because this manual has been extensively revised, we have not used marginal change bars to indicate each modification. See the C Compiler Release Document for a list of functional changes to the C compiler.

Related Manuals

For more information about the standard C run-time library and UNIX system calls, see the BSD Programmer's Reference manual (005801) and the SysV Programmer's Reference manual (005799).

For more information about system calls, see the Domain/OS Call Reference manual (007196) and Programming with Domain/OS Calls (005506).

For more information about the programming environment and software tools, see the Domain/OS Programming Environment Reference manual (011010).

For more information about the Domain Pascal programming language, see the Domain Pascal Programming Language Reference manual (000792).

For information about the Domain FORTRAN programming language, see the Domain FORTRAN Programming Language Reference manual (000530).

For more information about the binder (bind), link editor (ld), librarian (lbr), and archiver (ar), see the Domain/OS Programming Environment Reference manual (011010).
For more information about the Domain high-level debugger, see the Domain Distributed Debugging Environment Reference manual (011024).

For more information about DSEE, see the Domain Software Engineering Environment (DSEE) Reference manual (003016).

Problems, Questions, and Suggestions

We appreciate comments from the people who use our system. To make it easy for you to communicate with us, we provide the Apollo Product Reporting (APR) system for comments related to hardware, software, and documentation. By using this formal channel, you make it easy for us to respond to your comments.

You can get more information about how to submit an APR by consulting the appropriate Command Reference manual for your environment (Aegis, BSD, or SysV). Refer to the mkapr (make apollo product report) shell command description. You can view the same description online by typing:

    $ man mkapr     (in the SysV environment)
    % man mkapr     (in the BSD environment)
    $ help mkapr    (in the Aegis environment)

Alternatively, you may use the Reader's Response Form at the back of this manual to submit comments about the manual.

Documentation Conventions

Unless otherwise noted in the text, this manual uses the following symbolic conventions.

literal values          Bold words or characters in formats and command descriptions represent commands or keywords that you must use literally. Pathnames are also in bold. Bold words in text indicate the first use of a new term.
user-supplied values    Italic words or characters in formats and command descriptions represent values that you must supply.
Domain extensions       Domain-specific features of C appear in color.
sample user input       In samples, information that the user enters appears in color.
output                  Information that the system displays appears in this typeface.
[ ]                     Square brackets enclose optional items in formats and command descriptions.
{ }                     Braces enclose a list from which you must choose an item in formats and command descriptions.
                        In sample Pascal statements, braces assume their Pascal meanings.
|                       A vertical bar separates items in a list of choices.
< >                     Angle brackets enclose the name of a key on the keyboard.
CTRL/                   The notation CTRL/ followed by the name of a key indicates a control character sequence. Hold down <CTRL> while you press the key.
...                     Horizontal ellipsis points indicate that you can repeat the preceding item one or more times.
(vertical ellipsis)     Vertical ellipsis points mean that irrelevant parts of a figure or example have been omitted.
(change bar)            Because this manual has been extensively revised, we have not used marginal change bars to indicate each modification.
----88----              This symbol indicates the end of a chapter.

Contents

Chapter 1  Overview of Domain C

1.1  History of C
1.2  C Standards
1.3  Two Ways to Call C
    1.3.1  Two Preprocessors
    1.3.2  Two Styles of Object Code
    1.3.3  Two Command Line Syntaxes
1.4  A Sample Program
    1.4.1  Compiling and Executing
1.5  Online Sample Programs
    1.5.1  Accessing Sample Programs with getcc
    1.5.2  Accessing Sample Programs with Domain/Delphi

Chapter 2  Program Organization

2.1  Lexical Elements
    2.1.1  White Space and Newlines
    2.1.2  Comments
    2.1.3  Spreading Source Code Across Multiple Lines
    2.1.4  Identifiers
    2.1.5  Case Sensitivity
    2.1.6  Keywords
2.2  Constants
    2.2.1  Integer Constants
    2.2.2  Floating-Point Constants
    2.2.3  Character Constants
    2.2.4  String Constants
2.3  Program Organization
    2.3.1  Functions
    2.3.2  The Begin and End Symbols: {}
    2.3.3  Statements
    2.3.4  Preprocessor Directives
2.4  Declarations
    2.4.1  Typedef Declarations
    2.4.2  Name Spaces

Chapter 3  Data Types and Storage Classes

3.1  Data Type Overview
    3.1.1  Scalar Types
    3.1.2  Aggregate Types
3.2  Overview of Variable Initialization
    3.2.1  Old-Style Initialization
3.3  Integer Data Types
    3.3.1  32-Bit Integers
    3.3.2  16-Bit Integers
    3.3.3  8-Bit Integers (Character Data Type)
    3.3.4  Initializing Integer Variables
    3.3.5  Integer Overflow
3.4  Floating-Point Data Types
    3.4.1  Single-Precision Floating-Point
    3.4.2  Double-Precision Floating-Point
    3.4.3  Initializing Floating-Point Variables
3.5  Enumerated Data Types
    3.5.1  The Values of Enumerated Constants
    3.5.2  Initializing Enumerated Variables
    3.5.3  Sized enums - Domain Extension
3.6  The void Data Type
3.7  Pointer Data Types
    3.7.1  Internal Representation of Pointers
    3.7.2  Initializing Pointers
    3.7.3  Generic Pointers
3.8  Structure and Union Data Types
    3.8.1  Declaring a Structure or Union
    3.8.2  Internal Representation of Structures
    3.8.3  Internal Representation of Unions
    3.8.4  Bit Fields in Structures and Unions
    3.8.5  struct and union Name Spaces
    3.8.6  Initializing Structures
    3.8.7  Initializing Unions
3.9  Arrays
    3.9.1  Omitting the Array Size
    3.9.2  Initializing Arrays
    3.9.3  Multidimensional Arrays
    3.9.4  Storage of Arrays
    3.9.5  Strings
3.10  Abstract Declarators
3.11  Complex Declarations
    3.11.1  Deciphering Complex Declarations
    3.11.2  Composing Complex Declarations
3.12  Storage Classes
    3.12.1  Declaration Position
    3.12.2  Scope of a Variable Declaration
    3.12.3  Duration of a Variable
    3.12.4  Storage Class Specifiers
    3.12.5  The register Specifier
3.13  Global Variables
    3.13.1  Definitions and Allusions
    3.13.2  Defining Global Variables
    3.13.3  Portability Considerations Regarding Global Variables
    3.13.4  Sections
3.14  Storage Class of Functions
    3.14.1  Function Definitions
    3.14.2  Function Allusions
3.15  Reference Variables - Domain Extension
    3.15.1  Declaring Reference Variables
3.16  The #attribute Modifier - Domain Extension
    3.16.1  Inheritance of Declaration Modifiers
    3.16.2  #attribute and Pointer Types
    3.16.3  The volatile Specifier
    3.16.4  The device Specifier
    3.16.5  The address Specifier
    3.16.6  The section Specifier

Chapter 4  Code

4.1  Statements
    4.1.1  Null Statement
    4.1.2  Simple Statement
    4.1.3  Compound Statement or Block
    4.1.4  Branching Statements
    4.1.5  Looping Statements
4.2  Overview: Operators
    4.2.1  Pointer Operators
    4.2.2  Increment and Decrement Operators
    4.2.3  Cast Operator
    4.2.4  sizeof Operator
    4.2.5  Arithmetic Operators
    4.2.6  Comparison (Relational) Operators
    4.2.7  Bit Operators
    4.2.8  Logical Operators
    4.2.9  Conditional Expression Operator
    4.2.10  Comma Operator
    4.2.11  Assignment Operators
    4.2.12  Precedence and Associativity of Operators
    4.2.13  Parentheses
    4.2.14  Order of Evaluation
4.3  Type Conversions
4.4  Overview: Preprocessor Directives
4.5  Encyclopedia of Domain C Code
    arithmetic operators
    array operations
    assignment operators
    _BFMT__COFF
    bit operators
    break
    cast operations
    comma operator
    conditional expression operator
    continue
    __DATE__ and __TIME__ (predefined symbols)
    #debug (preprocessor directive)
    default
    #define and #undef (preprocessor directives)
    do/while
    #eject (preprocessor directive)
    else
    #else
    #endif
    enum operations
    expressions
    __FILE__
    for
    goto
    if
    #if, #ifdef, #ifndef, #elif, #else, #endif (preprocessor directives)
    #ifdef
    #ifndef
    #include (preprocessor directive)
    increment and decrement operators
    __LINE__ and __FILE__ (predefined symbols)
    #line (preprocessor directive)
    #list and #nolist (preprocessor directives)
    logical operators
    #module (preprocessor directive)
    #nolist
    pointer operations
    predefined macros
    relational operators
    return
    #section (preprocessor directive)
    sizeof
    __STDC__ and _BFMT__COFF (predefined names)
    structure and union operations
    switch
    #systype (preprocessor directive) and the systype() macro
    __TIME__ (predefined symbol)
    while

Chapter 5  Functions

5.1  Function Definitions
    5.1.1  Function Preamble
    5.1.2  The Body of the Function
5.2  Function Allusions
    5.2.1  Forward References and Backward References
5.3  Function Calls
    5.3.1  Call by Value
    5.3.2  Passing Arguments By Reference
5.4  Function Prototypes
    5.4.1  Function Definitions
    5.4.2  Prototyping a Variable Number of Arguments
    5.4.3  Backwards Compatibility
    5.4.4  Using Prototypes to Write More Efficient Functions
5.5  Returning a Value Back to the Caller
    5.5.1  Returning Values By Reference
    5.5.2  The #options Specifier - Domain Extension
5.6  Recursive Functions
5.7  Pointers to Functions
    5.7.1  Assigning a Value to a Function Pointer
    5.7.2  Return Type Agreement
    5.7.3  Calling a Function Using Pointers
    5.7.4  Passing a Pointer to a Function as an Argument
5.8  The main() Function

Chapter 6  Program Development

6.1  Program Development in a Domain/OS Environment
6.2  Compiling
    6.2.1  Compiling with /bin/cc
    6.2.2  Compiling with /com/cc
    6.2.3  /com/cc Compiler Errors
6.3  Domain Compiler Options
    6.3.1  Absolute Code in User Space: -ac (/com/cc)
    6.3.2  Longword Alignment: -align and -nalign (/com/cc)
    6.3.3  Displaying Messages about Alignment: -alnchk and -nalnchk (/com/cc)
    6.3.4  Binary Output: -b|-nb (/com/cc) -o (/bin/cc)
    6.3.5  Global Variables in .bss Section: -bss|-nbss (/com/cc)
    6.3.6  Comment Checking: -comchk|-ncomchk (/com/cc)
    6.3.7  Conditional Compilation: -cond|-ncond (/com/cc)
    6.3.8  Target Node Selection: -cpu cpu (/com/cc) -M cpu (/bin/cc)
    6.3.9  Debugger Output: -db|-ndb|-dbs|-dba (/com/cc) -g (/bin/cc)
    6.3.10  Name Definition: -def name[=value] (/com/cc) -Dname[=value] (/bin/cc)
    6.3.11  Preprocessor Options: -es|-esf (/com/cc) -E|-P (/bin/cc)
    6.3.12  Expanded Code Listing: -exp|-nexp (/com/cc) -S (/bin/cc)
    6.3.13  Floating-Point Accuracy: -frnd (/com/cc only)
    6.3.14  Include Directories: -idir (/com/cc)
    6.3.15  Array Reference Index: -indexl|-nindexl (/com/cc)
    6.3.16  Informational Messages: -info|-ninfo (/com/cc)
    6.3.17  Installed Libraries: -inlib (/com/cc)
    6.3.18  Listing File: -l|-nl (/com/cc)
    6.3.19  Symbol Map: -map|-nmap (/com/cc)
    6.3.20  Error and Warning Summary: -msgs|-nmsgs (/com/cc)
    6.3.21  Optimization Levels: -opt [n] (/com/cc) -O [n] (/bin/cc)
    6.3.22  Position-Independent Code: -pic (/com/cc)
    6.3.23  Profiling: -prof (/com/cc) -p (/bin/cc)
    6.3.24  Nonportable References: -std|-nstd (/com/cc)
    6.3.25  Run-Time UNIX Version Selection: -runtype systype (/com/cc)
    6.3.26  UNIX Version Selection: -systype systype (/com/cc) -T systype (/bin/cc)
    6.3.27  Function Prototypes: -type|-ntype (/com/cc)
    6.3.28  Line Numbers: -uline|-nuline (/com/cc)
    6.3.29  Version Number: -version (/com/cc)
    6.3.30  Warning Messages: -warn|-nwarn (/com/cc) -w (/bin/cc)
6.4  Linking in a Domain Environment
    6.4.1  The /bin/ld Utility
    6.4.2  The bind Command
6.5  Archiving in a Domain Environment
6.6  System Libraries
    6.6.1  The Standard C Library
    6.6.2  Built-in Routines
6.7  Executing Programs in a Domain/OS Environment
    6.7.1  Executing in a UNIX Environment
    6.7.2  Executing in an Aegis Environment
6.8  Debugging Programs in a Domain Environment
    6.8.1  The dde Utility
    6.8.2  The dbx Utility
6.9  Program Development Tools
    6.9.1  tb (Traceback)

Chapter 7  Cross-Language Communication

7.1  Suppressing Automatic Type Promotions of Arguments
7.2  Data Type Agreement in C, Pascal and FORTRAN
    7.2.1  Non-C Data Types
    7.2.2  Non-FORTRAN Data Types
7.3  Data Type Agreement of Return Value
    7.3.1  Functions Returning Pointers
7.4  Argument Passing Conventions
7.5  Pascal Examples
    7.5.1  Passing Integers and Floating-Point Numbers
    7.5.2  Passing Character Arrays
    7.5.3  Passing Pointers
    7.5.4  Simulating the BOOLEAN Type
7.6  FORTRAN Examples
    7.6.1  Names of FORTRAN Routines
    7.6.2  Passing Integers and Floating-Point Data
    7.6.3  Passing Character Data
    7.6.4  Passing Arrays
    7.6.5  Passing Pointers
    7.6.6  Simulating the LOGICAL Types
    7.6.7  Simulating the COMPLEX Types
7.7  Data Sharing
    7.7.1  Global Variable Declarations Using /com/cc
    7.7.2  Global Variable Declarations Using /bin/cc
    7.7.3  Case Sensitivity and Global Names
    7.7.4  Data Sharing Between C and Pascal
    7.7.5  Data Sharing Between FORTRAN and C
7.8  System Service Routines
    7.8.1  Insert Files
    7.8.2  Returned Status Code
    7.8.3  Linking and Execution

Chapter 8  Input and Output

8.1  General Remarks
    8.1.1  File Types
    8.1.2  Streams and File Descriptors
8.2  The Standard I/O Library
    8.2.1  Buffering
    8.2.2  The <stdio.h> Header File
    8.2.3  Macros and Functions
    8.2.4  Error Handling
    8.2.5  File Position Indicators
    8.2.6  I/O to Standard Devices
    8.2.7  I/O to Files
    8.2.8  Opening and Closing a File
    8.2.9  Reading and Writing Data
8.3  UNIX Unbuffered I/O Functions
    8.3.1  UNIX I/O Error-Handling

Chapter 9  Diagnostic Messages

9.1  Common C Programming Mistakes
9.2  Domain C Compiler Messages

Appendix A  ISO Latin-1 Codes

ISO Latin-1 Code

Appendix B  Domain C Extensions

Domain C Extensions

Appendix C  The lint Utility (BSD)

C.1  Introduction
C.2  Summary of lint Options
    C.2.1  Usage
    C.2.2  Unused Variables and Functions
    C.2.3  Set/Used Information
    C.2.4  Flow of Control
    C.2.5  Function Values
    C.2.6  Type Checking
    C.2.7  Type Casts
    C.2.8  Nonportable Character Use
    C.2.9  Assignments of "longs" to "ints"
    C.2.10  Unorthodox Constructions
    C.2.11  Antiquated Syntax
    C.2.12  Pointer Alignment
    C.2.13  Multiple Uses and Side Effects
C.3  Implementation Details
    C.3.1  Portability
    C.3.2  Suppressing Unwanted Output
    C.3.3  Library Declaration Files

Appendix D  The lint Utility (SysV)

D.1  Usage
D.2  lint Message Types
    D.2.1  Unused Variables and Functions
    D.2.2  Set/Used Information
    D.2.3  Flow of Control
    D.2.4  Function Values
    D.2.5  Type Checking
    D.2.6  Type Casts
    D.2.7  Nonportable Character Use
    D.2.8  Assignments of longs to ints
    D.2.9  Strange Constructions
    D.2.10  Old Syntax
    D.2.11  Pointer Alignment
    D.2.12  Multiple Subexpressions and Side Effects

Appendix E  Using std_$call

E.1  Data Type Agreement of Arguments
E.2  Data Types of Constant Arguments
    E.2.1  Integer Constants
    E.2.2  Floating-Point Constants
    E.2.3  Character Constants
    E.2.4  String Constants
E.3  Data Type Agreement of Function Declarations
    E.3.1  Functions Returning Pointers
    E.3.2  Using std_$call
E.4  Pascal Examples
    E.4.1  Passing Integers
    E.4.2  Passing Floating-Point Numbers
    E.4.3  Passing Character Data
    E.4.4  Passing Character Arrays
    E.4.5  Passing Pointers
    E.4.6  Simulating the BOOLEAN Type
E.5  FORTRAN Examples
    E.5.1  Passing Integers
    E.5.2  Passing Floating-Point Numbers
    E.5.3  Passing Character Data
    E.5.4  Passing Arrays
    E.5.5  Passing Pointers
    E.5.6  Simulating the LOGICAL Types
    E.5.7  Simulating the COMPLEX Type
E-l E-2 E-2 E-2 E-2 E-3 E-3 E-3 E-4 E-S E-6 E-7 E-9 E-l0 E-12 E-14 E-1S E-16 E-17 E-19 E-21 E-24 E-2S E-27 Figures 2-1 2-2 Domain C Keywords ........................................ . Organization of a File of C Source Code ........................ . 2-S 2-11 3-1 3-2 3-3 Hierarchy of C Data Types ................................... . Scalar Type Keywords ....................................... . 32-Bit Integer Format ....................................... . 16-Bit Integer Format ..................... : ................. . Internal Representation of Character Variables ................... . Single-Precision Floating-Point Format ......................... . Internal Representation of +100.5 ............................. . Double-Precision Floating-Point Format ........................ . Pointer Variable Format ..................................... . Default Layout of Structure S 1 ................................ . Layout of Structure S2 ...................................... . Naturally Aligned Structure S3 with 1-byte Padding ............... . 3-2 3-2 3-4 3-S 3-6 3-7 3-8 3-9 3-10 3-11 3-12 xvi Contents 3-7 3-8 3-9 3-11 3-12 3-13 3-20 3-26 3-26 3-27 3-13 3-21 3-22 3-23 3-24 3-25 Layout of S2 Using Word Alignment Rules ...................... Array of S 1 Structures, Not Naturally Aligned ................... Example of Union Memory Storage ............................ Storage in Union example After Assignment ..................... Syntax of Bit Field Declarations ............................... Sample Bit-Field Alignment in a Structure ...................... Syntax of an Array Declaration ............................... Magic Square .............................................. Storage of a Multidimensional Array ........................... ierarchy of Active Regions (Scopes) ............................ Two Declarations and One Definition with No Initialization ........ The Effect of Initializing a Global Variable ...................... 
The Effect of Linking Order on Variable Initialization ............. . . . . . . . . . . . . . 4-1 4-2 4-3 4-4 4-5 4-6 4-7 Hierarchy of C Scalar Data Types ............................. Keyword Listings in Encyclopedia .............................. Preprocessor Directive Listings in Encyclopedia .................. Other Listings in Encyclopedia ................................ Bitwise Operators ........................................... Syntax of a Function-Like Macro ............................. How a for Loop Is Executed ................................. . . . . . . . 5-1 5-2 5-3 Syntax of a Function Allusion ................................ . Syntax of a Function Call .................................... . Pass by Reference vs. Pass by Value ........................... . 5-6 5-7 5-8 6-1 Program Development in a Domain/OS System ................... . 6-2 8-1 8-2 Hierarchy of I/O Libraries ................................... . C Programs Access Data on Files Through Streams ............... . 8-1 8-3 1-1 Compiling and Executing a Simple Program ...................... 1-6 2-1 2-2 2-3 Legal and Illegal Identifiers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. Floating-Point Constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. Character Escape Codes ...................................... 2-4 2-8 2-8 3-1 3-2 3-3 Domain C's Arithmetic Data Types ............................. Legal and Illegal Declarations in Domain C ...................... Storage Class Summary ...... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 3-3 3-45 3-56 4-1 4-2 4-3 Binding and Precedence of Operators ........................... Predefined Macros and Names. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. Preprocessor Directives ....................................... 
4-11 4-15 4-16 3-14 3-15 3-16 3-17 3-18 3-19 3-20 3-28 3-29 3-30 3-30 3-31 3-32 3-35 3-37 3-40 3-48 3-58 3-58 3-59 4-14 4-17 4-17 4-18 4-44 4-65 4-84 Tables Contents xvii 4-4 4-5 4-6 4-7 4-8 4-9 4-10 4-11 4-12 4-13 The Bitwise AND Operator ................................... Examples Using the Bitwise Inclusive OR Operator ........... ; .... Example Using the XOR Operator ............................. Example Using the Bitwise Complement Operator ................ Integer Conversions Truth for C's Logical Operators .............................. Examples of Expressions Using the Logical Operators ............. Examples of Expressions Using the Relational Operators· ........... Relational Expressions ....................................... Example of #section Directive ................................ . . . . . . . . . 4-44 4-45 4-45 4-45 4-50 4-115 4-116 4-133 4-133 4-142 6-1 6-2 6-3 6-4 6-5 6-6 6-7 !binI cc Command Options .................................... C Compiler Options ......................................... Arguments to the -cpu and -M Options ........................ DEBUG Compilation Options ................................. The Effect of -def .......................................... The Effect of -D ........................................... Header Files ............................................... . . . . . . . 6-6 6-15 6-23 6-24 6-25 6-25 6-46 7-1 7-2 C Function Argument Conversions without Prototypes ............. . Domain C, Pascal, and FORTRAN Data Types .................. . 7-3 7-4 8-1 8-2 8-3 fopenO Text Modes ........................................ . File and Stream Properties of fopenO Modes .................... . UNIX I/O Functions ........................................ . 8-13 8-14 8-26 A-lI ISO Latin-I Codes .......................................... . A-2 B-1 B-2 ANSI C and C++ Extensions Supported by Domain C ............. . Domain Extensions to the C Language ......................... . 
E-1  C Function Argument Conversions without Prototype

Bug Alerts

Using typedefs for Arrays
The Dual Meanings of "static"
Integer Division and Remainder
Walking Off the End of an Array
Referencing Elements in a Multidimensional Array
The Dangling else
Side Effects
Side Effects in Relational Expressions
Confusing = with ==
Comparing Floating-Point Values
Passing Structures vs Passing Arrays
Opening a File

Chapter 1 An Overview of Domain C

This manual describes Domain® C, our implementation of the C programming language. In this chapter, we provide an overview of the C language, list some of the key Domain extensions, and show how to compile and execute a simple C program.

1.1 History of C

The C language was first developed in 1972 by Dennis M. Ritchie at AT&T Bell Labs as a systems programming language, that is, a language for writing operating systems and system utilities. Ritchie's intent in designing C was to give programmers a convenient means of accessing a machine's instruction set. This meant creating a language that was high-level enough to make programs readable and portable, but simple enough to map easily onto the underlying machine architecture.

C was so flexible, and enabled compilers to produce such efficient machine code, that in 1973 Ritchie and Ken Thompson rewrote most of the UNIX* operating system in C.
Since then, C and the UNIX system have had a close association, although in recent years C has become more popular as a general-purpose programming language.

* UNIX is a registered trademark of AT&T in the USA and other countries.

Although the power and flexibility of C are undisputed, C has also acquired a reputation for being a mysterious and messy language that promotes bad programming habits. Part of the problem is that C gives special meanings to many punctuation characters, such as asterisks, plus signs, braces, and angle brackets. Once a programmer has learned the C language, these symbols look quite commonplace, but there is no denying that a typical C program can be intimidating to the uninitiated.

The other, more serious, complaint concerns the relative dearth of rules. Other programming languages, such as Pascal, have relatively strict rules to protect programmers from making accidental blunders. It is assumed in Pascal, for instance, that if a programmer attempts to assign a floating-point number to a variable that is supposed to hold an integer, it is a mistake, and the compiler issues an error message. In C, the compiler quietly converts the floating-point value to an integer.

The C language was designed for experienced programmers. The compiler, therefore, assumes little about what the programmer does or does not intend to do. This can be summed up in the C tenet:

    Trust the programmer.

As a result, C programmers have tremendous liberty to write unusual code. In many instances, this freedom allows programmers to write useful programs that would be difficult to write in other languages. However, the freedom can be abused by inexperienced programmers who delight in writing needlessly tricky code. C is a powerful language, but it requires self-restraint and discipline.

You should be somewhat familiar with C before attempting to use this manual. If you are not, please consult a good C tutorial.
If you are familiar with C, you should be able to write programs in Domain C after reading this manual.

1.2 C Standards

Until recently, the only formal specification for the C language was a document written by Dennis Ritchie entitled The C Reference Manual. In 1977, Ritchie and Brian Kernighan expanded this document into a full-length book called The C Programming Language (sometimes called "the white book" because of its white cover). For years, The C Programming Language was the only C text and so acquired the status of a de facto standard. We refer to this book, and the language it defines, as the K&R standard.

In the early days of C, the language was used primarily on UNIX systems. Even though there were different versions of UNIX systems available, each used the same C compiler. The version of C running under a UNIX operating system is known as PCC (Portable C Compiler). Like the K&R standard, PCC became a de facto standard. In fact, PCC can be viewed as an implementation of the K&R standard. There are a few points about the C language, however, that the K&R standard does not define. In these cases, the PCC implementation has become the standard.

In February 1983, James Brodie of Motorola Corporation applied to the X3 Committee of the American National Standards Institute (ANSI) to draft a C standard. ANSI approved the application, and in March the X3J11 Technical Committee of ANSI was formed. X3J11 is composed of representatives from all the major C compiler developers (including Apollo), as well as representatives from several companies that program their applications in C. In the summer of 1983, the committee met for the first time, and it has been meeting four times a year since then. The final version of the C standard is expected to be approved by ANSI in 1988.

In addition to the K&R standard, the PCC implementation, and the ANSI standard, there is a new language based on C called C++.
C++ was developed by Bjarne Stroustrup at AT&T. It includes many of the features in the ANSI standard, as well as further extensions to make the language object-oriented.

Except for a few rare cases, Domain C is fully compatible with the K&R standard and with PCC. Therefore, programs compiled in a UNIX environment can be ported to Apollo machines without altering the source text, and vice versa. At the same time, Domain C supports many of the newer features introduced by ANSI and C++. In particular, Domain C supports the following:

• enum data type
• Function prototypes
• Reference variables
• Generic pointers

Finally, Domain C includes some features that are not available in any of the existing standards. These features enable you to take full advantage of the Domain/OS environment, though use of special Domain syntaxes will make your programs less portable.

Throughout this manual, we highlight all Domain-specific features in colored text. Everything printed in black is consistent with either the K&R standard or the ANSI standard. Where the two standards differ, we explicitly state the difference in the text. Appendix B contains a detailed list of ANSI and C++ features that Domain C supports.

1.3 Two Ways to Call C

Although there is only one Domain C compiler, there are two command line interfaces to it. By default, typing cc in a UNIX shell gives you the /bin/cc interface. Typing cc in an Aegis shell gives you the /com/cc interface.

The /com/cc interface is always available regardless of what shell you are running and which environments are installed on your node. If you are in a UNIX shell and have Aegis installed on your node, you can access the /com/cc interface by typing /com/cc on the command line. If Aegis is not installed on your node, the /com/cc interface will reside in /usr/apollo/lib/cc. Note, however, that you can also access the /com/cc interface by using the -YO option with the /bin/cc command.
See Chapter 6 for more information about this compiler option.

The /bin/cc interface is available only if a UNIX environment is installed on your node. If a UNIX environment is installed but you are running an Aegis shell, you can access the /bin/cc interface by typing /bin/cc on the command line.

The /bin/cc command first calls the UNIX preprocessor (cpp); then it invokes the Domain C compiler; after compilation, it invokes the UNIX link editor (ld). The /com/cc command only invokes the Domain C compiler (which includes the Aegis preprocessor). Unlike the /bin/cc command, /com/cc does not automatically invoke a link editor. See Chapter 6 for more information about the differences between /com/cc and /bin/cc.

1.3.1 Two Preprocessors

The C product supports two preprocessors: a UNIX preprocessor called cpp and an Aegis preprocessor that is bundled with the Domain C compiler. The UNIX preprocessor is automatically invoked whenever you execute the /bin/cc command. You can also invoke it as a stand-alone utility by executing the /usr/lib/cpp command. The Aegis preprocessor executes whenever you invoke the Domain C compiler. Note that when you compile in a UNIX environment, your source text is passed through both preprocessors: cpp first, and then the Aegis preprocessor.

In general, the two preprocessors behave identically. The key differences are:

• The two preprocessors use different methods for resolving relative pathnames in #include directives. See the description of the #include directive in Chapter 4 for more information about this difference.

• The two preprocessors support different sets of command options. See Chapter 6 for details about all command options.

• The UNIX preprocessor supports the #elif directive; the Aegis preprocessor does not.

• The Aegis preprocessor supports many Domain-specific directives and predefined macros that cpp does not support.
1.3.2 Two Styles of Object Code

Both /bin/cc and /com/cc produce COFF (Common Object File Format) object files. However, the two commands produce slightly different styles of COFF. The notable differences are:

• Object files produced by /bin/cc have a .o suffix. Object files produced by /com/cc have a .bin suffix.

• If you compile with /bin/cc, the resulting code will not be optimized by default. If you compile with /com/cc, your code will be optimized at optimization level 3. You can override both of these defaults with compiler options. See Chapter 6 for more information about optimization levels.

• If you compile with /bin/cc, all uninitialized global variables will be placed in the .bss section of the object file. If you compile with /com/cc, all global variables will be placed in named overlay sections. This becomes an issue in cross-language communication, as explained in Chapter 7.

• Object files compiled by /com/cc are executable if they contain a main() function and do not reference externally defined objects. All object files produced by /bin/cc must be processed by a binder before they can be executed. Note that since /bin/cc automatically invokes the link editor (ld), this difference is usually invisible.

1.3.3 Two Command Line Syntaxes

The /bin/cc and /com/cc commands have separate syntaxes and recognize entirely different sets of command line options (although the functionality overlaps to a large extent). Chapter 6 describes these differences in detail. Here, we briefly list some of the principal differences.

• The /bin/cc command accepts multiple source filenames on the command line. The /com/cc command accepts only one filename.

• With /bin/cc, you can specify the names of object files, which are passed to the link editor. The /com/cc command accepts only source files.

• When you compile with /bin/cc, all source filenames must have a .c suffix and all object filenames must have a .o suffix.
There are no suffix requirements with the /com/cc command.

• With the /bin/cc command, you must place compiler options before filenames. With /com/cc, compiler options are placed after the filename.

• The compiler options supported by /bin/cc are case-sensitive. The /com/cc compiler options are not case-sensitive.

1.4 A Sample Program

The best way to get started with Domain C is to write, compile, and execute a simple program. Here is one to get you started:

    /* Program name is "getting_started" */
    #include <stdio.h>

    int main( void )
    {
        int x, y;

        printf( "enter an integer -- " );
        scanf( "%d", &x );
        y = x * 2;
        printf( "\n%d is twice %d\n", y, x );
    }

1.4.1 Compiling and Executing

Suppose that you store this program in a file named getting_started.c. (If you use /bin/cc, you must enter the full name of the source file, including the .c suffix; with /com/cc, you may omit the .c suffix.) Compiling with /com/cc produces an executable object file named getting_started.bin; compiling with /bin/cc produces an executable binary file named a.out. To run these objects, just enter the name of the file. Table 1-1 summarizes the whole process.

Table 1-1. Compiling and Executing a Simple Program

    With /com/cc                     With /bin/cc

    $ /com/cc getting_started        $ /bin/cc getting_started.c
    No errors, No warnings.          No errors, No warnings.
    $ getting_started.bin            $ a.out
    Enter an integer -- 15           Enter an integer -- 15
    30 is twice 15                   30 is twice 15

1.5 Online Sample Programs

Many of the programs from this manual are stored online, along with sample programs from other Apollo manuals. These programs illustrate features of the C language, and demonstrate programming with Domain/OS graphics calls and system calls. There are two ways to access these online programs: with the getcc utility or with the Delphi system.

1.5.1 Accessing Sample Programs with getcc

The getcc utility enables you to extract a program from a master file that contains all sample programs.
The getcc utility prompts you for the name of the sample program and the pathname of the file to which you want it copied.

If the online examples are stored on your node, you can access getcc directly or through a link. To access them directly, you must change your working directory before invoking getcc:

    # In an Aegis shell
    $ wd /domain_examples/cc_examples
    $ getcc

    # In a UNIX shell
    $ cd /domain_examples/cc_examples
    $ getcc

To access the examples through a link, you need to create the following link before invoking getcc:

    # In an Aegis shell
    $ crl ~/com/getcc /domain_examples/cc_examples/getcc
    $ getcc

    # In a UNIX shell
    $ ln -s /domain_examples/cc_examples/getcc path_dir/getcc
    $ getcc
    #
    # where "path_dir" is a name of a directory on your list of search pathnames.

If the online examples are stored on a remote node, you need to create the following two links to invoke getcc:

    # In an Aegis shell
    $ crl /domain_examples/cc_examples //node/domain_examples/cc_examples
    $ crl ~/com/getcc //node/domain_examples/cc_examples/getcc
    $ getcc

    # In a UNIX shell
    $ ln -s //node/domain_examples/cc_examples /domain_examples/cc_examples
    $ ln -s //node/domain_examples/cc_examples/getcc path_dir/getcc
    $ getcc
    #
    # where "node" is the name of the node where the examples are stored, and
    # "path_dir" is a name of a directory on your list of search pathnames.

1.5.2 Accessing Sample Programs with Domain/Delphi

All of the sample programs are available through the Delphi online documentation system. To compile and run an example, enter the name of the program in the Domain/Delphi subject field. When the source for the program appears, cut it and paste it into another file. You can then compile and execute this file as you would any other source file. See the Retrieving Information With Domain/Delphi manual for more information.
Chapter 2 Program Organization

This chapter describes the following subjects:

• Lexical elements of a C program
• Organization of a C program
• Constants
• Declarations

2.1 Lexical Elements

The lexical elements of the C language comprise the characters that may appear in a C source file and the rules by which the Domain C compiler groups these characters into meaningful tokens. In particular, we describe the following syntactic objects:

• White space and newlines
• Comments
• Identifiers
• Keywords
• Constants

2.1.1 White Space and Newlines

In C source files, blanks, newlines, vertical tabs, horizontal tabs, and formfeeds are all considered to be white space characters. The main purpose of white space characters is to format source files so that they are more readable to humans. In general, the compiler ignores white space characters, except when they are used to separate tokens or when they appear within string literals.

The newline character also serves the special function of terminating preprocessor directives. See the "Preprocessor Directives" section in Chapter 4 for more information about preprocessor directives.

2.1.2 Comments

A comment is any series of characters beginning with /* and ending with */. The compiler ignores all comments. In the following example, a comment follows an assignment statement:

    average = total / number_of_components;   /* Find mean value. */

Comments may also span multiple lines, as in:

    /* This is a
       multi-line comment. */

Domain C allows comments to appear anywhere in the source file. Since the compiler interprets comments as nulls, this can result in unusual concatenations if you are not careful. For instance, the statement,

    int x/* This is an example */z;

becomes:

    int xz;

NOTE: Domain C's implementation of comments conforms to the PCC implementation. The ANSI standard, however, states that comments must be replaced by a single space character.
The C language does not support nested comments. The following, for example, will produce a compile-time error:

    /* This is an outer comment
     * /* This is an attempted inner comment -- WRONG */
     *
     * This will be interpreted as code.
     */

C identifies the beginning of a comment by the character sequence /*. It then strips all characters up to, and including, the end comment sequence */. What's left gets passed to the compiler to be further processed. In the example above, therefore, the preprocessor will delete everything up to the first */ sequence, but pass the rest to the compiler. So the compiler will attempt to process:

     *
     * This will be interpreted as code.
     */

Not recognizing these lines as valid C statements, the compiler will issue an error message. You can check for nested comments by compiling with the -comchk option (available with /com/cc only).

2.1.3 Spreading Source Code Across Multiple Lines

In C, you can start a statement or declaration at any column and spread it over as many lines as you want. In older versions of C, including the K&R standard, you cannot split a keyword or identifier across a line. Domain C, in conformance with the ANSI standard, defines the continuation character more generally, allowing you to use it to split identifiers and tokens as well as strings. For example, the compiler views the following two lines as the keyword switch:

    swit\
    ch

You can split a string or preprocessor directive across one or more lines. (See Chapter 3 for a definition of strings.)
To split a string or preprocessor directive, however, you must use the continuation character (\) at the end of the line to be split; for example:

    #define foo_macro(x,y,z) ((x) + (y))\
                             * ((z) - (x))

    printf("This is a very, very, very lengthy and very, very \
    uninteresting string.");

2.1.4 Identifiers

Identifiers, also called names, can consist of the following:

• Letters (ASCII decimal values 65-90 and 97-122)
• Digits
• Dollar sign ($)
• Underscore (_)

The first character must be a letter or an underscore. Identifiers that begin with an underscore are generally reserved for system use. In fact, the ANSI standard has reserved all names that begin with two underscores or an underscore followed by an uppercase letter. Note that the dollar sign is a Domain extension.

In addition, identifiers may not conflict with reserved keywords, which are listed in Figure 2-1. Table 2-1 lists some legal and illegal identifiers:

Table 2-1. Legal and Illegal Identifiers

    Identifier               Legal or Illegal

    meters                   Legal.
    green_eggs_and_ham       Legal.
    system_name              Legal.
    UPPER_AND_lower_case     Legal.
    20meters                 Illegal, because it starts with a digit.
    $name                    Illegal, because it starts with a dollar sign.
    name$                    Legal in Domain C, but nonstandard.
    int                      Illegal, because it is a reserved keyword.
    uo%#@good                Illegal, because it contains illegal characters.

Identifiers are unique up to 4096 characters. Because Domain C exceeds the limits required by the K&R and ANSI standards, long names may not be portable. The ANSI standard requires compilers to support names of up to 32 characters for local variables and 6 characters for global variables.

2.1.5 Case Sensitivity

In C, identifier names are always case-sensitive; that is, an identifier written in uppercase letters is considered different from the same identifier written in lowercase.
For example, the following three identifiers are all considered unique:

    kilograms
    KILOGRAMS
    Kilograms

Some Domain/OS programming languages (such as Pascal and FORTRAN) are case-insensitive. When writing a Domain C program that calls routines from these other languages, you must be aware of this difference in sensitivity. (See Chapter 7 for details on cross-language communication.)

Note that strings (discussed in Chapter 3) are also case-sensitive. That is, the system recognizes the following two strings as distinct:

    "THE RAIN IN SPAIN"
    "the rain in spain"

2.1.6 Keywords

Domain C supports the list of keywords shown in Figure 2-1. You cannot use keywords as identifiers; if you do, the compiler will report an error. You cannot abbreviate a keyword and you must enter keywords in lowercase letters only.

    auto        extern      sizeof
    break       float       static
    case        for         std_$call
    char        goto        struct
    continue    if          switch
    default     int         typedef
    do          long        union
    double      register    unsigned
    else        return      void
    enum        short       while

Figure 2-1. Domain C Keywords

2.2 Constants

There are four types of constants in C:

• Integer constants
• Floating-point constants
• Character constants
• String constants

Every constant has two properties: value and type. For example, the constant 15 has value 15 and type int.

2.2.1 Integer Constants

An integer constant is a simple number like 12, as opposed to an integer variable (like x or y) or an integer expression. Whenever you use an integer constant in your source code, Domain C represents it as an int (32 bits). You cannot change this default. However, you can append an l or L to any constant to specify that you want it long. For example, 55L is a constant with a decimal value of 55 and the storage size of a long int. Since long and int have the same meaning in Domain C, the l or L is redundant. You may still want to use it, though, if you are planning to port your programs to a non-Apollo machine.
If the constant value cannot fit in a long int, the results are unpredictable. However, the compiler will not report an error.

Domain C supports three forms of integer constants: decimal, octal, and hexadecimal. Decimal constants consist of one or more digits from 0-9 (but not starting with 0). Octal constants are formed by preceding the constant with a zero (0); hexadecimal constants are formed by preceding the constant with 0x or 0X. Hexadecimal constants consist of the digits 0-9 and the letters a-f (or A-F). Integer constants may not contain any punctuation such as commas or periods. The following examples show some legal constants in all three forms.

    Decimal     Octal       Hexadecimal
    -------     -----       -----------
    3           003         0x3
    8           010         0x8
    15          017         0xF
    16          020         0x10
    21          025         0x15
    -87         -0127       -0x57
    187         0273        0xBB
    255         0377        0xff

Strictly speaking, constants are always positive values. A negative constant is interpreted as a positive constant preceded by the unary negation operator. In practice, this distinction is moot.

Technically, an octal constant cannot contain the digits 8 and 9 since they are not part of the octal number set. The Domain C compiler accepts 8 and 9 in octal numbers but issues a warning message. For example, the statement

    x = 098;

compiles successfully, but a warning message appears. The compiler interprets this value to mean 9 eights plus 8 ones, so that 098 has a decimal value of 80. (The ANSI Standard does not support this feature.)

2.2.2 Floating-Point Constants

A floating-point constant is any number that contains a decimal point and/or exponent sign for scientific notation. All floating-point constants are of type double even if they can be accurately represented in four bytes. If the magnitude of a floating-point constant is too great or too small to be represented in a double, the C compiler will substitute a value that can be represented. This substitute value is not always predictable.
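As a small illustration of the rule that floating-point constants are of type double, the following sketch compares the size of an unsuffixed constant with the size of a double. The function names are hypothetical and are not part of Domain C; the sketch only assumes a standard C compiler.

```c
#include <stddef.h>

/* Size, in bytes, of an unsuffixed floating-point constant. */
size_t constant_size(void)
{
    return sizeof(3.141);
}

/* Returns 1 if the constant occupies the same storage as a double. */
int constant_is_double_sized(void)
{
    return constant_size() == sizeof(double);
}
```

On a system such as Domain C, where a float is four bytes and a double is eight, constant_size() would be expected to report the double size even though 3.141 could be held in a float.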
See Chapter 3 for a description of the representable ranges of floating-point types.

2.2.2.1 Scientific Notation

Scientific notation is a useful shorthand for writing lengthy floating-point values. In scientific notation, a value consists of two parts: a number called the mantissa followed by a power of 10 called the characteristic (or exponent). The letter e or E, standing for exponent, is used to separate the two parts. The floating-point constant 3e2, for instance, is interpreted as 3 * 10^2, or 300. Likewise, the value -2.5e-4 is interpreted as -2.5 * 10^-4, or -0.00025.

Table 2-2 shows some legal and illegal floating-point constants.

Table 2-2. Floating-Point Constants

    Constant        Legal or Illegal
    --------        ----------------
    3.              Legal.
    35              Legal - interpreted as an integer.
    3.141           Legal.
    3,500.45        Illegal - commas are illegal.
    .3333333333     Legal.
    4E              Illegal - the exponent sign must be followed by a number.
    0.3             Legal.
    3e2             Legal.
    4e3.6           Illegal - the exponent must be an integer.
    3.0E5           Legal.
    +3.6            Illegal - Domain C doesn't support a unary plus sign.
    0.4E-5          Legal.

2.2.3 Character Constants

A character constant is any printable character or legal escape sequence enclosed in single quotes. The value of a character constant is the integer ASCII (or ISO) value of the character. For example, the value of the constant 'x' is 120.

2.2.3.1 Escape Characters

Domain C supports several predefined character constants known as escape characters. They are listed in Table 2-3.

Table 2-3. Character Escape Codes

    Escape Code   Character         What It Does
    -----------   ---------         ------------
    \b            backspace         Moves the cursor back one space.
    \f            formfeed          Moves the cursor to the next logical page.
    \n            newline           Prints a newline.
    \r            carriage return   Prints a carriage return.
    \t            horizontal tab    Prints a horizontal tab.
    \v            vertical tab      Prints a vertical tab.
    \'            single quote      Prints a single quote.
    \"            double quote      Prints a double quote.
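Because an escape character is an ordinary character constant, it has an integer value like any other. The following sketch prints the value of each escape code in Table 2-3. The structure, table, and function names are hypothetical, and the values printed depend on the character set in use (ASCII is assumed in the discussion above).

```c
#include <stdio.h>

/* One row of Table 2-3: the escape code as written in source, and its value. */
struct escape_row {
    const char *spelling;
    char value;
};

static const struct escape_row escape_table[] = {
    { "\\b",  '\b' },
    { "\\f",  '\f' },
    { "\\n",  '\n' },
    { "\\r",  '\r' },
    { "\\t",  '\t' },
    { "\\v",  '\v' },
    { "\\'",  '\'' },
    { "\\\"", '\"' }
};

/* Prints each escape code with its integer value; returns the row count. */
int print_escape_values(void)
{
    int count = (int)(sizeof(escape_table) / sizeof(escape_table[0]));
    int i;

    for (i = 0; i < count; i++)
        printf("%-3s = %d\n", escape_table[i].spelling, escape_table[i].value);
    return count;
}
```

Calling print_escape_values() from main produces one line per escape code.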
In addition to the escape sequences listed in Table 2-3, C also supports escape character sequences of the form:

    \octal-number

and

    \xhex-number

which translate into the character represented by the octal or hexadecimal number. For example, if ASCII representations are being used, the letter 'a' may be written as '\141' or '\x61' and 'Z' as '\132' or '\x5A'. This syntax is most frequently used to represent the null character as '\0'. This is exactly equivalent to the numeric constant zero (0). When you use the octal format, you do not need to include the zero prefix as you would for a normal octal constant.

2.2.3.2 Multi-Character Constants

Each character in a character constant takes up one byte of storage; therefore, you can store up to a four-byte character constant in a 32-bit integer and up to a two-byte character constant in a 16-bit integer. For example, the following assignments are quite legal (though not recommended and probably not portable):

    {
        char      x;       /* one-byte integer */
        short int si;      /* two-byte integer */
        long int  li;      /* four-byte integer */

        x  = 'j';          /* one-byte character constant */
        si = 'ef';         /* two-byte character constant */
        li = 'abcd';       /* four-byte character constant */
    }

The variable si is assigned the value of 'e' and 'f', where each character takes up 8 bits of the 16-bit value. The Domain C compiler places the last character in the rightmost (least significant) byte. Therefore, the constant 'ef' will have a hexadecimal value of 6566. Since the order in which bytes are assigned is machine dependent, other machines may reverse the order, assigning f to the more significant byte. In that case, the resulting value would be 6665. For maximum portability, we recommend that you do not use multi-character constants.

2.2.4 String Constants

A string constant is any series of printable characters or escape characters enclosed in double quotes.
The compiler automatically appends a null character ('\0') to the end of the string so that the size of the array is one greater than the number of characters in the string. For example, "A short string" becomes an array with 15 elements:

    Index:      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
    Character:  A       s   h   o   r   t       s   t   r   i   n   g  \0

To span a string constant over more than one line, use the backslash character (\), also called the continuation character. The following, for instance, is legal:

    string = "This is a very long string that requires more \
    than one line";

Note that if you indent the second line, the spaces will be part of the string.

In Domain C, the length of a string constant is limited to 4095 characters including the trailing null character. This limit may differ on other implementations.

The type of a string is array of char, and strings obey the same conversion rules as other arrays. Except when a string appears as the operand of sizeof or as an initializer, it is converted to a pointer to the first element of the string. Note also that the null string, "", is legal, and contains a single trailing null character.

2.3 Program Organization

When you write a Domain C program, you can put all your source code into one file or spread it across many files. Figure 2-2 shows a simplified scheme for organizing C source files. The C language permits other file organizations that are not depicted in the figure. For example, most preprocessor directives may appear anywhere in a source file, and global declarations may appear between functions. The figure, however, depicts a general organization that reflects many C programs.

    Source File
        Preprocessor Directives
            #define
            #include
            #line
            ...
        Global Declarations
            typedef declarations
            definitions of variables with file scope
            definitions of variables with program scope
            allusions to variables and functions defined in another source file
        Function Definitions
            function signature (old-style signature or prototype)
            {
                local declarations
                statements
            }

Figure 2-2. Organization of a File of C Source Code

To help illustrate this organization, we provide the following commented program:

    /* Program name is "file_org_example" */
    #include <stdio.h>                              /* preprocessor directive */
    #define WEIGHTING_FACTOR 0.6                    /* preprocessor directive */

    typedef float THIRTY_TWO_BIT_REAL;              /* global typedef decl. */
    THIRTY_TWO_BIT_REAL correction_factor = 1.15;   /* global variable decl. */

    float average( float arg1, THIRTY_TWO_BIT_REAL arg2 )   /* prototype */
    {                                               /* start of function body */
        float mean;                                 /* local variable decl. */

        mean = (arg1 * WEIGHTING_FACTOR) +
               (arg2 * (1.0 - WEIGHTING_FACTOR));   /* assignment stmnt */
        return (mean * correction_factor);          /* return statement */
    }                                               /* end of function body */

    int main( void )                                /* prototype for main */
    {
        float value1, value2, result;               /* local variable declarations */

        printf( "Enter two values -- ");            /* statement */
        scanf( "%f%f", &value1, &value2);           /* statement */
        result = average( value1, value2 );         /* statement */
        printf( "The weighted average adjusted by a correction factor \
    of %4.2f is %5.2f\n", correction_factor, result);   /* statement */
    }                                               /* end of function */

In the following sections, we describe the various components of a C program.

2.3.1 Functions

As shown in both the figure and the example, functions are the primary organizational unit of C. A C program must contain one or more functions. The function called main has a special meaning. The C run-time system uses the first executable statement in main as the starting address of the entire program. Consequently, if you do not name one of the functions main, the program will have no starting address. Conversely, naming more than one function main will cause a compile-time or link-time error.
Unlike some other languages (such as Pascal), which support both procedures and functions, C supports only functions. However, a C function can emulate a Pascal procedure or a Pascal function. In other words, you can declare a C function that either returns or does not return a value to the calling program. See the description of the void type in Section 3.6 for more information about functions that behave like procedures.

Every function (main or not) adheres to the same rules of organization, and we detail these rules in Chapter 5. For now, we provide an overview.

2.3.2 The Begin and End Symbols: {}

In all structured programming languages it is necessary to mark where a block starts and finishes. A block is any logically distinct section of source code. In some languages, marking blocks is accomplished through keywords like begin and end. In C, you mark the beginning of a block of C code with the { symbol and the end with the } symbol. Because every function must contain at least one block, you need to specify { and } to denote the start and finish of a function.

In addition to delimiting a function, the { and } symbols serve to demarcate blocks in a variety of declarations and statements.

2.3.3 Statements

A function can contain zero or more statements. Chapter 4 describes all the statements that Domain C supports. Note that you cannot put a statement outside of a function.

2.3.4 Preprocessor Directives

Domain C supports a wide variety of preprocessor directives that serve purposes such as controlling conditional compilation, including header files, and defining program constants. Preprocessor directives begin with the # character. Although some preprocessor directives can be placed anywhere in a file, others can only be placed at specific junctures. For complete information on preprocessor directives, see the "Preprocessor Directives" listing in Chapter 4.
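The three purposes named above (including a header file, defining a program constant, and controlling conditional compilation) can be sketched in a few lines. The names MAX_ITEMS, TRACE, TRACE_MSG, and clamp_items are hypothetical and serve only as an illustration; the sketch assumes a standard C preprocessor.

```c
#include <stdio.h>              /* including a header file */

#define MAX_ITEMS 10            /* defining a program constant */

/* Conditional compilation: TRACE is a hypothetical switch that,
   when defined, turns TRACE_MSG into a call to printf. */
#ifdef TRACE
#define TRACE_MSG(text) printf("trace: %s\n", text)
#else
#define TRACE_MSG(text) ((void)0)
#endif

/* Limits a requested item count to MAX_ITEMS. */
int clamp_items(int requested)
{
    TRACE_MSG("clamp_items called");
    if (requested > MAX_ITEMS)
        return MAX_ITEMS;
    return requested;
}
```

When TRACE is not defined, the TRACE_MSG line compiles to an empty expression statement, so the tracing code costs nothing in the finished program.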
2.4 Declarations

With a few rare exceptions, every variable must be declared before it is referenced. A declaration serves to identify the data type and storage class of a variable, and may optionally give the variable an initial value. As Figure 2-2 shows, C supports declarations made both within a block and outside of a block. The position of the declaration affects the storage class of the variable, as explained later in this chapter.

In general, a variable declaration takes the following format:

    [storage_class_specifier] [data_type] variable_name [= initial_value];

where:

    storage_class_specifier   is an optional keyword that we describe later in Section 3.12.
    data_type                 is one of the data types described in Chapter 3.
    variable_name             is a legal identifier.
    initial_value             is an optional initializer for the variable. (We describe
                              variable initialization in Chapter 3.)

For example, here are a few sample variable declarations without storage class specifiers or initial values:

    int   age;                   /* an integer variable named age */
    float ph;                    /* a floating-point variable named ph */
    char  a_letter;              /* a character variable named a_letter */
    int   values[10];            /* an array of 10 integers named values */
    enum  {mon, wed, fri} days;  /* an enumerated variable named days */

It is legal to omit the data type in certain instances, although it is considered bad practice. You may omit the data type in global declarations and in local declarations that include a storage class specifier. In all of these cases, the data type defaults to int. (The proposed ANSI Standard does not support omitting the data type.)

2.4.1 Typedef Declarations

The C language allows you to create your own names for data types with the typedef keyword. Syntactically, a typedef is exactly like a variable declaration except that the declaration is preceded by the typedef keyword. Semantically, the variable name becomes a synonym for the data type rather than a variable that has memory allocated for it.
For example, the statement

    typedef long int FOUR_BYTE_INT;

makes the name FOUR_BYTE_INT synonymous with long int. The following two declarations are now identical:

    long int j;
    FOUR_BYTE_INT j;

A typedef declaration may appear anywhere a variable declaration may appear and obeys the same scoping rules as a normal declaration. You may not, however, include an initializer with a typedef. Once declared, a typedef name may be used anywhere that the type is allowed (such as in a declaration, cast operation, or sizeof operation).

By convention, typedef names are written in all uppercase so that they are not confused with variable names.

There are a number of uses for typedefs. They are especially useful for abstracting global types that can be used throughout a program, as shown in the following structure and array declaration:

    typedef struct {
        char month[4];
        int  day;
        int  year;
    } BIRTHDAY;

    typedef char A_LINE[80];    /* A_LINE is an array of 80 characters */

Another use of typedefs is to compensate for differences in C compilers. For example:

    #if SMALL_COMPUTER
        typedef int   SHORTINT;
        typedef long  LONGINT;
    #else
    #if BIG_COMPUTER
        typedef int   LONGINT;
        typedef short SHORTINT;
    #endif
    #endif

The idea here is that you may be writing code to run on two computers, a small computer where an int is two bytes, and a large computer where an int is four bytes. Instead of using short, long, and int, you can use SHORTINT and LONGINT and be assured that SHORTINT is two bytes and LONGINT is four bytes regardless of the machine.

You can also use typedefs to simplify complex declarations. Consider the following example:

    typedef float *PTRF, ARRAYF[], FUNCF();

This declares three new types called PTRF (a pointer to a float), ARRAYF (an array of floats), and FUNCF (a function returning a float).
These typedefs could then be used in declarations such as:

    PTRF  x[5];    /* a 5-element array of pointers to floats */
    FUNCF z;       /* a function returning a float */

2.4.2 Name Spaces

All identifiers (names) in a program fall into one of three name spaces. The three name spaces are:

    Structure, Union, and Enumeration Tags
        Tag names that immediately follow these type specifiers: struct, union, and enum. These types are described in Chapter 3.

    Member Names
        Names of members of a structure or union.

    All Other Names
        Any names that are not members of the preceding two classes.

Names in different name spaces never interfere with each other. That is, you can use the same name for an object in each of the three classes without these names affecting one another.

The following example uses the same name, overuse, in all three ways (this is an example of name spaces, not of good programming style):

    int main( void )
    {
        int overuse;            /* normal identifier */
        struct overuse {        /* tag name */
            float overuse;      /* member name */
            char  *p;
        };
    }

Note that each struct, union, or enum defines its own name space, so that different types can have the same member names without conflict. The following, for example, is legal:

    struct A {
        int   x;
        float y;
    };

    struct B {
        int   x;
        float y;
    };

The members in struct A are distinct from the members in struct B. Note that this is consistent with the ANSI standard, although it is an extension to the K&R standard.

Macro names do interfere with the other three name spaces. Therefore, when you specify a macro name, do not use this name in one of the other three name spaces.
For example, the following program fragment is incorrect because it contains a macro named square and a label named square:

    #define square(arg) arg * arg

    int main( void )
    {
    square:
    }

Chapter 3
Data Types and Storage Classes

Every variable and expression has a data type and every function has a return data type. The type determines how the bits are to be interpreted by the computer. This chapter describes all Domain C data types in the following order:

• Integer types (int, char, short, long, unsigned)
• Floating-point types (float, double)
• Enumerated types (enum)
• void
• Pointers
• Structures and unions (struct, union)
• Arrays

In addition to data type, every variable has a storage class, which defines its scope and duration. The latter half of this chapter describes storage classes.

3.1 Data Type Overview

The C language offers a moderately sized and useful set of data types. There are six different types of integers and two types of floating-point objects. These types (integers and floating-points) are called arithmetic types. Together with pointers and enumerated types, they are known as scalar types because all of the values lie along a linear scale. That is, any scalar value is less than, equal to, or greater than another scalar value of the same type.

In addition to scalar types, there are aggregate types, which are built by combining one or more scalar types. Aggregate types, which include arrays, structures, and unions, are useful for organizing logically related variables into physically adjacent groups.

There is also one type, void, that is neither scalar nor aggregate. Figure 3-1 shows the logical hierarchy of C data types.

Figure 3-1. Hierarchy of C Data Types

3.1.1 Scalar Types

There are eight reserved words for scalar data types, as shown in Figure 3-2.

    char        long        short
    float       unsigned    enum
    int         double

Figure 3-2. Scalar Type Keywords

The types char, int, float, double, and enum are basic types. The others (long, short, and unsigned) are qualifiers that modify a basic type in some way. You can think of the basic types as nouns and the qualifiers as adjectives.

An enumerated variable consists of an ordered group of identifiers. The only value you can assign to an enumerated variable is one of those identifiers. By default, the size of an enumeration variable is four bytes, but you can explicitly make it two bytes by using the short modifier. You can also use long to explicitly specify 4-byte enums. Applying short and long to enums is a Domain extension.

Table 3-1 shows the scalar data types supported by Domain C, their size, and their range of values. Types listed together in a group are synonymous.

Table 3-1. Domain C's Arithmetic Data Types

    Data Type               Size (in bytes)   Lowest Possible Value   Highest Possible Value
    ---------               ---------------   ---------------------   ----------------------
    int                     4                 -2147483648             +2147483647
    long
    long int

    unsigned int            4                 0                       +4294967295
    unsigned long
    unsigned long int

    short                   2                 -32768                  +32767
    short int

    unsigned short          2                 0                       +65535
    unsigned short int

    char                    1                 -128                    +127

    unsigned char           1                 0                       +255

    float                   4                 -0.29 * 10^38           +1.7 * 10^38

    double                  8                 -1.0 * 10^308           +1.0 * 10^308
    long float

    short enum              2                 -32768                  +32767

    enum                    4                 -2147483648             +2147483647
    long enum

    void                    none              N/A                     N/A

    pointers                4                 N/A                     N/A

3.1.2 Aggregate Types

The following briefly describes the supported aggregate data types:

arrays
    An array variable consists of a fixed number of elements of the same data type. The size of an array equals the number of elements times the size of each element.

structures
    A structure variable consists of one or more members, each having its own data type. For instance, a structure variable could be composed of two integers and one float. (A structure in C is similar to a fixed record in Pascal.)
    The size of a structure is the sum of the sizes of all the members, plus possible padding due to alignment rules.

union
    A union variable consists of one or more members, each having its own data type. The difference between a structure and a union is that all the members of a structure occupy separate (unique) addresses, but all the members of a union share the same address. (A union in C is similar to a variant record in Pascal.) The size of a union is equal to the size of its largest member.

3.2 Overview of Variable Initialization

C permits you to initialize certain variables when you declare them. Throughout this chapter, we detail variable initialization for specific data types. Here in this section we provide some general guidelines about initialization.

The following variables may not be initialized:

• Automatic structures, unions, and arrays
• Variables declared with the extern keyword

If you do not explicitly initialize a fixed variable, the run-time system initializes it to zero for you. Members of fixed aggregate types not explicitly assigned an initialization value are automatically initialized to zero. Automatic variables do not receive a default initialization. If you do not explicitly initialize them, they will start with unpredictable values.

Fixed variables may be initialized only with constant expressions (defined in the "expressions" listing of Chapter 4). Automatic scalar variables may be initialized with either constant or non-constant expressions.

If the data type of the initialization expression does not match the data type of the variable, the expression is converted as if a normal assignment were being made. For instance:

    int global_int = 1;     /* Fixed duration integer initialized to 1 */

    int main( void )
    {
        float f = 1;        /* Initialization value is converted to 1.0 */
        char  ...
        char  ...
        int   ... = global_int/2;   /* Automatic integer initialized
                                     * to 0 (after conversion). */
    }

Scalar initializations may optionally include surrounding braces. That is,

    int x = 1;

is the same as:

    int x = {1};

In practice, however, braces are generally reserved for initialization of aggregate types.

3.2.1 Old-Style Initialization

Some older compilers permit initialization without the equal sign. For example,

    int x 1;

is equivalent to the current:

    int x = 1;

To support programs written for these early C compilers, the Domain C compiler accepts the old-style initialization but issues a warning message. Do not use the old-style syntax for programs you are writing now.

3.3 Integer Data Types

Integers come in three different sizes and can be either signed (the default) or unsigned. With one exception, an integer declaration must include at least one of the type keywords: unsigned, long, short, int, or char. (The one exception is that a global declaration that does not contain a data type defaults to an int.) An integer declaration may also include combinations of these keywords.

To declare an integer variable, simply specify the name of one of the integer data types followed by the variable name. The following examples show all of the possible combinations of integer variables:

    int a;                  /* signed 32-bit integer */
    long int b;             /* same as int in Domain C */
    long c;                 /* same as int in Domain C */

    unsigned int d;         /* unsigned 32-bit integer */
    unsigned e;             /* same as unsigned int */
    unsigned long int f;    /* same as unsigned int in Domain C */
    unsigned long g;        /* same as unsigned int in Domain C */

    short h;                /* signed 16-bit integer */
    short int i;            /* same as short */

    unsigned short j;       /* unsigned 16-bit integer */
    unsigned short int k;   /* same as unsigned short */

    char m;                 /* signed 8-bit integer in Domain C */
    unsigned char n;        /* unsigned 8-bit integer */

The sizes of integer types are implementation-dependent. The K&R and ANSI standards only require that a short be no larger than an int, and an int be no larger than a long.
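A program can check these size relationships for itself rather than assume them. The following is a minimal sketch using only the standard sizeof operator and the CHAR_BIT constant from <limits.h>; the function names are hypothetical.

```c
#include <limits.h>

/* Returns 1 if short <= int <= long in size, the only ordering
   the standards require. */
int sizes_are_ordered(void)
{
    return sizeof(short) <= sizeof(int) && sizeof(int) <= sizeof(long);
}

/* Returns the number of bits in an int on the compiling machine. */
int int_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}
```

Under Domain C, int_bits() would be expected to report 32; other implementations may report a different width.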
Programs that depend on ints being 32 bits long, for example, may not be portable.

3.3.1 32-Bit Integers

You declare a signed 32-bit integer by specifying one of the following three data types:

• int
• long int
• long

Such variables can hold any integral value from -2147483648 (-0x80000000) through 2147483647 (0x7FFFFFFF) inclusive.

You declare an unsigned 32-bit integer with any of the following data types:

• unsigned int
• unsigned
• unsigned long int
• unsigned long

Unsigned 32-bit variables hold values from 0 through 4294967295.

The Domain system stores 32-bit integers in four contiguous bytes as illustrated in Figure 3-3. The most significant bit in the integer is bit 31; the least significant bit is bit 0. For signed 32-bit integers, bit 31 holds the sign bit. Negative signed integers are stored in two's-complement form.

    31 (MSB)                          16
    +----------------+----------------+
    |     Byte 0     |     Byte 1     |
    +----------------+----------------+
    |     Byte 2     |     Byte 3     |
    +----------------+----------------+
    15                         0 (LSB)

Figure 3-3. 32-Bit Integer Format

3.3.2 16-Bit Integers

You declare a signed 16-bit integer by specifying either of the following two data types:

• short
• short int

16-bit signed integer variables can hold any integral value from -32768 through +32767 inclusive.

You declare an unsigned 16-bit integer by specifying either of these two data types:

• unsigned short
• unsigned short int

Unsigned 16-bit variables can hold any value from 0 through 65535.

The Domain/OS system stores 16-bit integers in two contiguous bytes as illustrated in Figure 3-4. The most significant bit is bit 15; the least significant bit is bit 0. Negative signed integers are stored in two's-complement form. For signed 16-bit integers, bit 15 holds the sign bit.

    15 (MSB)                   0 (LSB)
    +----------------+----------------+
    |     Byte 0     |     Byte 1     |
    +----------------+----------------+

Figure 3-4. 16-Bit Integer Format

3.3.3 8-Bit Integers (Character Data Type)

In C, the distinction between characters and numbers is blurred.
There is a data type called char, but it is really just a 1-byte integer value that can be used to hold either characters or numbers.

Domain C supports two kinds of character data types: char and unsigned char. The char data type holds signed 8-bit quantities ranging from -128 through +127. The unsigned char data type holds unsigned 8-bit quantities ranging from 0 through 255. Since the ASCII values of characters range from 0 to 127, you can use either data type to hold keyboard characters. Here are two sample character variable definitions:

    char          c1;
    unsigned char c2;

After declaring c1 as a char, you can make either of the following assignments:

    c1 = 'A';
    c1 = 65;

In both cases, the decimal value 65 is loaded into the variable c1 since 65 is the ASCII code for the letter 'A'. Note that character constants are enclosed in single quotes. The quotes tell the compiler to get the numeric code value of the character. For instance, in the following example, a gets the value 5, whereas b gets the value 53 since that is the ASCII code for the character '5'.

    char a, b;
    a = 5;
    b = '5';

Figure 3-5 shows how the Domain/OS system stores character variables. If the variable is an unsigned char, then bit 7 contains the most significant bit (MSB), and bit 0 contains the least significant bit (LSB). If the variable is a char, then bit 7 contains the sign bit, and bit 0 contains the least significant bit. char variables with a negative value are stored in two's-complement form.

    7 (MSB)            0 (LSB)
    +------------------------+
    |                        |
    +------------------------+

Figure 3-5. Internal Representation of Character Variables

3.3.4 Initializing Integer Variables

You may initialize integer variables with integer or floating-point values. If the initialization expression is a floating-point value, it is converted to an integer before being assigned. If the variable has fixed duration, the initializer must be a constant expression.
Here are a few sample initializations:

    {
        int               x  = 50000;
        short int         y  = x/2;
        unsigned long int z  = x*y;
        static int        xx = 1.5;    /* converted to 1 */
        char              yy = -20;
        unsigned char     zz = 200;
    }

See Section 3.2 for details on how storage class affects initialization. See Section 4.3 for information about assignment conversions.

You can initialize character variables with integer or floating-point expressions. All of the following, for example, are legal:

    char zebra = 'g';      /* a character enclosed in single quotes. */
    char zebra = 103;      /* a small integer */
    char zebra = 0147;     /* an octal integer */
    char zebra = '\147';   /* a small integer preceded by a backslash
                            * and enclosed in single quotes */

Interestingly, all four formats produce the same results. The character constant 'g' causes the compiler to initialize zebra with the ASCII value of the letter g, which happens to be 103. By specifying the decimal integer value 103, we accomplish the same thing. The octal value 147 is also equal to 103. Finally, by preceding 147 with a backslash and enclosing it in single quotes, we tell the compiler to treat it as an octal number.

3.3.5 Integer Overflow

An overflow condition occurs whenever a value is too large to be represented in the bits allocated for it. Overflow for expressions containing unsigned objects is explicitly defined by the K&R and ANSI standards. Overflow for signed expressions, however, is implementation-dependent. Domain C handles both cases identically. When the Domain/OS system identifies an overflow condition, it truncates the most significant bits (including the sign bit).

When performing an operation on signed integers, an overflow condition may cause an unexpected change of sign in the answer. When performing an operation on unsigned integers, you can spot an overflow by recognizing an answer that is much smaller than anticipated.
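The unsigned case, whose behavior the standards define, can be sketched with two small helper functions. The names are hypothetical; the sketch relies only on the standard rule that converting a value to an unsigned type discards the bits that do not fit, and on the USHRT_MAX constant from <limits.h>.

```c
#include <limits.h>

/* Adds two unsigned short values. The cast discards any bits that do
   not fit, so the result is reduced modulo USHRT_MAX + 1. */
unsigned short add_unsigned_short(unsigned short a, unsigned short b)
{
    return (unsigned short)(a + b);
}

/* Returns 1 if the sum wrapped around, which shows up as a result
   that is smaller than one of the operands. */
int sum_overflowed(unsigned short a, unsigned short b)
{
    return add_unsigned_short(a, b) < a;
}
```

Adding 1 to the largest unsigned short value wraps around to 0, which is the "much smaller than anticipated" answer described above.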
Consider the following example: /* Program name is "int_overflow_example" */ #include int main( void ) { short x unsigned short y printi( "X printf( "y OxFFFO; OxFFFO; %hd\n" , x %hd\n" , y ) ; ) ; } The results are: x y -16 65520 In both cases, the same bit pattern results: 11111111 11110000 However, x is interpreted as a negative number whereas y is interpreted as a positive value. 3-10 Data Types and Storage Classes 3.4 Floating-Point Data Types Domain C supports three types, float, long float and double, for representing floatingpoint values. The float type is a single-precision floating-point type and the double type is double-precision. The long float type is a synonym for double (long float is an extension to the ANSI and K&R standards). You may not use the unsigned qualifier in a floating-point declaration. Here are a few sample declarations: float double long float tiger; giraffe; elephant; 3.4.1 Single-Precision Floating-Point Single-precision floating-point numbers (type float) occupy four contiguous bytes, as shown in Figure 3-6. The range of a float is approximately -.29*10 38 through 1.7*1038 . It is accurate to approximately seven digits. 23 31 s 16 22 Exponent + 127 Mantissa Mantissa (continued) 15 Figure 3-6. Single-Precision Floating-Point Format o (LSB) The first bit (bit 31) is the sign bit. The sign bit is set (S=l) to denote a negative number, and clear (S=O) to denote a positive number. The next eight bits contain the exponent plus 127. The following 23 bits contain the mantissa of the number without the leading 1. (The mantissa is stored in magnitude, not two's-complement, form.) The following example shows how Domain/OS stores the floating-point value +100.5. The four bytes contain the bit pattern shown in Figure 3-7. Data Types and Storage Classes 3-11 31 23 22 16 0 1 0 0 0 0 1 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 o 15 Figure 3-7. 
Internal Representation of +100.5

Breaking up the number into sign, exponent, and mantissa gives us the following information:

    sign                           0 (positive)
    exponent                       10000101 (133 in decimal)
    significant part of mantissa   1001001

The exponent is 133, and 133 is equal to 127 plus 6. Therefore, we view the mantissa bits as follows:

    bit 22 represents 2^5  * 1
    bit 21 represents 2^4  * 0
    bit 20 represents 2^3  * 0
    bit 19 represents 2^2  * 1
    bit 18 represents 2^1  * 0
    bit 17 represents 2^0  * 0
    bit 16 represents 2^-1 * 1

The quantity 100.5 is equal to (2^6 + 2^5 + 2^2 + 2^-1), where the 2^6 term comes from the leading 1 that is not stored.

3.4.2 Double-Precision Floating-Point

Double-precision floating-point numbers (type double and long float) are represented in eight bytes (64 bits). Figure 3-8 illustrates the format. A double has a range of approximately 10^-308 to 10^308 and is accurate to approximately 16 decimal digits.

[Figure 3-8. Double-Precision Floating-Point Format: bit 63 (MSB) is the sign, bits 62-52 hold the exponent plus 1023, and bits 51-0 (LSB) hold the mantissa.]

The first bit of the first word is the sign bit. The next 11 bits contain the exponent plus 1023. The remaining 52 bits hold the mantissa without the leading 1. (The mantissa is stored in magnitude form, not in two's-complement form.)

3.4.3 Initializing Floating-Point Variables

You may initialize floating-point variables with either integer or floating-point data. The data is converted to the variable's type as if a normal assignment were being made. For example:

    float  guava = 3.2;
    double pi    = 3.1415926535;
    float  z     = 5;

3.5 Enumerated Data Types

An enumerated data type consists of an ordered group of identifiers. Enumeration types are particularly useful when you want to create a unique set of values that may be associated with a variable. The compiler reports a warning if you attempt to assign a value that's not part of the declared set of legal values to an enum variable.
The possible formats of enumerated declarations are as follows:

    enum [tag_name] {id1 [=val] [{, idN [=val]}]} variable_name1 [{, variable_nameN}]

That is, to declare an enumerated variable, you must specify the keyword enum followed by an optional tag_name. The tag_name is not the name of the variable; rather it is the name of the enumerated type that you are declaring. After the optional tag_name, you optionally specify a list of identifiers separated by commas. This list of identifiers must be enclosed in braces. Each identifier may be followed by an optional constant expression that assigns a value to the enumeration constant. If no value is specified, the enumeration constant is assigned a value one greater than the value assigned to the previous enumeration constant in the list. If no values are specified for the entire list, the numbering begins at zero. Following the optional list of identifiers, you can optionally specify one or more variable_names.

A tag name cannot be used by itself; it must be preceded by the keyword enum. For example, compare the right and wrong ways to use the tag name forest:

    enum forest {maple, pine, fir} nordic;
    forest southern;       /* wrong */
    enum forest alpine;    /* right */

Here are five sample enumerated declarations:

    /* These two declarations have a tag name and a variable name. */
    enum citrus {lemon, lime, orange, carambola, grapefruit} c_fruits;
    enum beatles {John, Paul, George, Ringo} beatles_members;

    /* This declaration has a tag name and two variable names. */
    enum color {red, blue, yellow} used, not_used;

    /* This declaration has a variable name, but no tag name. */
    enum {one, two, three} cardinal_numbers;

    /* This declaration has a tag name, but no variable name. */
    enum ordinal_numbers {first, second, third};

Consider the third declaration. It declares an enumerated type called color with possible values of red, blue, and yellow.
Two variables, used and not_used, are defined to have this type. Therefore, variables used and not_used can have the values red, blue, or yellow. For example, you can make these assignments

    used = red;
    not_used = yellow;
    used = not_used;

but you cannot make this assignment:

    used = orange;    /* ILLEGAL: orange is not a value of color */

Because enumeration types are stored as integers, it is possible to assign integer values to an enumeration variable. However, the Domain C compiler will issue a warning message when it encounters such usages. For example, the assignment

    not_used = 5;

would produce the following warning message:

    ******** Line 6: [Warning #205] Enumeration type clash [not_used, 5] to the = operator.

You can avoid this warning by casting the integer expression to the enumeration type:

    not_used = (enum color) 5;

For details on how you can use enumerated variables within statements, see the "enumerated operations" listing in Chapter 4.

3.5.1 The Values of Enumerated Constants

Enumerated constants are the identifiers that make up the list of possible values of an enumerated type. For example, the enumerated constants for variables used and not_used are red, blue, and yellow. The Domain C compiler automatically associates an integer value with each enumerated constant. By default, the integer values of enumerated constants start at zero and increment by one with each constant. For example, in the declaration of tag name color, the compiler assigns red=0, blue=1, and yellow=2. Therefore:

    (yellow > red)     /* evaluates to 1 (true) */
    (yellow == red)    /* evaluates to 0 (false) */

You can override this numbering scheme by explicitly assigning a number to one or more enumerated constants.
For instance, the following initializations

    enum fruits {apple=3, pear=1, orange, banana, melon=(-1)};

result in the following integer representations:

    apple     3
    pear      1
    orange    2
    banana    3
    melon    -1

You can specify the values in any order and you do not have to supply consecutive integer values. If you do not explicitly assign an integer value, the system assigns a value by adding one to the previous constant's value. In our example, this means that both apple and banana have a value of 3. This is perfectly legal and means, in effect, that apple and banana are synonyms.

Some compilers allow previously defined enum constants to be used in the initializing expression, as in:

    enum vegetables {carrots=1, celery=carrots+2};

However, the Domain C compiler does not allow this syntax.

Since enumerated constants have an explicit or implicit value, you can use an enumerated constant in place of an integer to subscript an array. Note that the second dimension must have room for all three constants. For example:

    {
        enum {part_number=0, order_number, quantity} num;
        int part[1000][3];

        part[0][part_number] = 1357;     /* assign part[0][0] */
        part[0][order_number] = 22567;   /* assign part[0][1] */
        part[0][quantity] = 370;         /* assign part[0][2] */
    }

3.5.2 Initializing Enumerated Variables

You can initialize an enumerated variable when you define it; for example:

    enum citrus {lemon, lime, orange, carambola, grapefruit} c_fruits = lime;
    enum beatles {John, Paul, George, Ringo} beatles_members = John;
    enum eurofrancophones {France, Suisse, Belgique} la_langue = Belgique;
    enum color {red, blue, yellow} used = red, not_used = yellow;

If the enumerated variable has automatic duration, it may also be initialized by a previously declared variable with the same enumerated type. The following lines, for instance, initialize color to blue, hue to red, and shade to red:

    {
        static enum rainbow {red, blue, green} color = blue, hue = red;
        enum rainbow shade = hue;    /* Automatic variable initialized
                                      * with previously declared
                                      * variable.
                                      */
    }

3.5.3 Sized enums - Domain Extension

By default, the Domain C compiler allocates four bytes for all enumeration variables. However, if you know that the range of values being assigned to an enum variable is small, you can direct the compiler to allocate only two bytes by using the short type specifier. You can also use the long type specifier to indicate four-byte enums even though this is the default. For example:

    enum default_enum { ERR1, ERR2, ERR3, ERR4 };   /* four-byte enum
                                                     * type */
    long enum big_enum { ST0, ST1, ST2, ST3 };      /* four-byte enum type */
    short enum small_enum { cats, dogs };           /* two-byte enum type */

When mixed in expressions, enums behave exactly like their similarly sized integer counterparts. That is, an enum behaves like an int, a long enum acts like a long int, and a short enum acts like a short int. Note, however, that you will receive a warning message when you mix enum variables or constants with integer or floating-point types, or with differently typed enums.

3.6 The void Data Type

Domain C supports the void data type, which has become a common feature of modern C compilers. The void type is not a data type in the traditional sense. You cannot declare a simple variable as being void; for instance, a declaration like the following will cause an error:

    void x;

The void data type has three important purposes. The first is to indicate that a function does not return a value. For instance, you can write a function definition such as:

    void func( a, b )
    int a, b;
    {
    }

This indicates that the function does not return any useful value. Likewise, on the calling side, you would declare func() as:

    extern void func();

This informs the compiler that any attempt to use the returned value from func() is a mistake and should be flagged as an error.
For example, you could invoke func() as follows:

    func( x, y );

But you cannot assign the returned value to a variable:

    num = func( x, y );    /*
                            * This should produce an error
                            */

In situations where the function returns an actual value that you want to ignore, you can use void in a cast operation. In the following example, for instance, function print_line returns an integer error code, but we explicitly discard the returned value through a cast:

    /* Program name is "void_example2" */
    #include <stdio.h>
    #include <string.h>

    int print_line( char *string )
    {
        if (strlen( string ) > 80)    /* If line is too long, return
                                       * error */
            return -1;
        else
        {
            printf( "%s\n", string );
            return 1;
        }
    }

    int main( void )
    {
        char *string = "This is an example of a void function";

        (void) print_line( string );
    }

In the preceding example, the void cast is not required since the context makes it clear that the value returned by the function should be discarded. Nevertheless, the void cast enables you to make this explicit. You cannot use in any way an object that has been cast to void. That is, you cannot cast it to another type, you cannot pass it as an argument, and you cannot assign it to a variable.

Another purpose of void is to declare a function that takes no arguments. This is described in Section 5.4, which discusses prototypes.

Finally, the void type allows you to create generic pointers, as described in Section 3.7.3.

3.7 Pointer Data Types

The C language allows you to create a pointer to an object of any type. To declare a pointer variable, precede the pointer variable name with an asterisk (*). The following statements show some examples of pointer declarations.

    int *ip;           /* ip is a pointer to an int. */
    char *chp;         /* chp is a pointer to a char. */
    char *cp[];        /* cp is an array of pointers to chars. */
    float *fp();       /* fp is a function that returns a pointer
                        * to a float.
                        */
    float (*pfp)();    /* pfp is a pointer to a function that returns a
                        * float. */
    short **cpp;       /* cpp is a pointer to a pointer to a short. */

In the fifth declaration, we need to use parentheses to achieve correct binding. The rules for composing complex declarations such as this one are described in Section 3.11.

For details on using pointers in the action part of your program, see the "pointer operations" listing of Chapter 4.

3.7.1 Internal Representation of Pointers

Domain C stores pointers in the 32-bit structure shown in Figure 3-9.

[Figure 3-9. Pointer Variable Format: bits 31-16 hold the most significant part of the address, and bits 15-0 hold the least significant part.]

3.7.2 Initializing Pointers

You can initialize pointer variables with pointer expressions or with the constant zero (0). If the pointer variable has automatic duration, any pointer expression is legal. If the variable has fixed duration, the expression must be a pointer constant. Initialization with zero produces a null pointer. Due to dynamic conversions, it is possible to initialize a pointer with a function name, array name, string constant, or address of an object. The following examples show a variety of ways to initialize pointers.

    float *null_point = 0;            /* Null pointer */
    int i, *pi = &i;                  /* Address of i */
    static char *string = "string";   /* Pointer to "string" */
    float array[5], *pa = array;      /* Pointer to beginning of array */
    float *pa1 = array+2;             /* Pointer to third element of array */
    extern void f();                  /* Declare a function named f */
    void (*pf)() = f;                 /* Initialize pf to point to f */
    int *p_absolute = (int *) 0xFFAABB12;    /* Pointer to absolute
                                              * address */

3.7.3 Generic Pointers

In accordance with the ANSI standard, the Domain C compiler now allows you to create a generic pointer variable by declaring a pointer to void:

    void *genp;    /* genp is a generic pointer */

A generic pointer can be cast to any other pointer type.
Moreover, a generic pointer is implicitly converted to the destination type when it is assigned a pointer value or is assigned to a pointer variable. When a generic pointer is compared to a pointer of another type, it is implicitly converted to the other pointer type. For example:

    char *cp;
    float *fp;
    void *genp;

    genp = cp;         /* genp is implicitly converted to pointer to char. */
    fp = genp;         /* genp is implicitly converted to pointer to
                        * float. */
    if (cp == genp)    /* genp is implicitly converted to pointer to
                        * char. */

It is illegal to dereference a generic pointer without first casting it to a valid pointer type.

    float f = 2.0;
    void *genp;

    genp = &f;              /* ok */
    f = *genp;              /* ILLEGAL */
    f = *(float *)genp;     /* ok */

Generic pointers are particularly useful for functions that can return pointers to different types of objects. The classic example is malloc(), which dynamically allocates memory for different types of objects. Traditionally, malloc() returns a pointer to char, which must then be cast to the appropriate pointer type. For example:

    struct S {
        char str[10];
        int val;
    };

    int main( void )
    {
        extern char *malloc();
        struct S *ps;

        /* cast returned value to pointer to struct S. */
        ps = (struct S *) malloc( sizeof(struct S) );
    }

By redefining malloc() to return a pointer to void rather than a pointer to char, you can avoid casting the returned value because it will be implicitly converted:

    struct S {
        char str[10];
        int val;
    };

    int main( void )
    {
        extern void *malloc();
        struct S *ps;

        /* returned value is implicitly converted to type of ps. */
        ps = malloc( sizeof(struct S) );
    }

3.8 Structure and Union Data Types

Because structures and unions obey most of the same syntactic rules, we describe them together. A structure is an object that contains other objects. It is similar to a fixed record in Pascal.
The objects within a structure, called members or components, are usually named and can be of any data type, including other structures, unions, or arrays. For instance, a structure might contain an int, a float, and a char as members. A bit field is a special member that takes up from 1 to 32 bits of memory.

A union is similar to a structure, but instead of holding all of the members at once, it can hold only one at a time because each member has its storage allocated at the same address. It is similar to a variant record in Pascal. The compiler makes sure that enough space is allocated to hold the largest member.

For details on using structures and unions in statements, see the "structure and union operations" listing in Chapter 4.

3.8.1 Declaring a Structure or Union

The only difference between declaring a structure and a union is in the keywords struct and union. There are four basic types of structure and union declarations:

1. No tag name. If you do not specify a tag name, you should declare at least one variable. For instance, the following declares a structure variable called struct_example, which is a structure with three members:

    struct {
        int member_one;
        float member_two;
        char member_three;
    } struct_example;

2. Tag name and member declaration(s), but no variable name(s). This defines a name that can be used in place of the full structure specification in future declarations. For instance, after declaring

    struct S1 {int i; float f;};

you can declare:

    struct S1 x, y;

which declares x and y to be structures containing an int member named i and a float member named f.

3. Tag name, member declaration(s), and variable name(s). This type of declaration serves two purposes: it defines a tag name that can be used in subsequent declarations, and it declares specific variables. For example,

    union U {
        char ch[8];
        int i;
    } u1, u2, u3;

defines a type called U, and three variables (u1, u2, and u3) that have this type.

4.
Tag name and variable name(s), but no member declarations. This form of declaration may only be used if you have already defined the tag name. For example, after making the preceding declaration, we could write:

    union U u4;

This would define another variable, u4, with type U. Note that you cannot use the tag name by itself; it must be preceded by the keyword union or struct.

Tag names and member names are distinct from each other and from variable names so that a tag and a variable and a member may all have the same name without a conflict arising. The following, for example, is a legal declaration:

    struct x { int x; };
    char x;

The compiler will not confuse the three x's: their usage in the code makes it clear which one is being referenced.

A structure or union may not contain instances of itself, but it may contain pointers to itself. For example:

    struct S {
        int a, b;
        float c;
        struct S d;      /* THIS IS ILLEGAL! */
        struct S *d;     /* This is legal */
    };

It is possible to create structures and unions that reference each other as shown in the following example:

    union U1 {
        int a;
        union U2 *b;
    };

    union U2 {
        int a;
        union U1 *b;
    };

Each union contains an integer as the first component and a pointer to the other union as the second component. Note that it is possible to declare a pointer to U2 before U2 is ever declared. This is one of the few situations in the C language where you may use an identifier before it has been declared.

3.8.2 Internal Representation of Structures

Each member of a structure takes up the same amount of space that it would require if it were an unattached variable rather than a member of a structure. For instance, an int requires 32 bits whether it is used as a scalar variable or used as a member of a structure. The boundary alignment rules are somewhat different, however, as explained in the next section.
3.8.2.1 Alignment of Structure Members

The alignment of an object identifies the set of legal addresses at which that object can be allocated. Objects that are byte-aligned can be allocated anywhere; objects that are word-aligned can only be allocated at even addresses; objects that are longword-aligned can only be allocated at addresses that are evenly divisible by four.

Natural alignment means that an object's address is evenly divisible by its size. For example, a naturally aligned 4-byte object begins at an address that is evenly divisible by 4, and a naturally aligned 8-byte object begins at an address that is evenly divisible by 8. In general, natural alignment produces faster executable code, although the efficiency savings vary a great deal from one processor to another. Code for the 68000 family of processors runs slightly faster if objects are naturally aligned.

By default, all scalar objects are naturally aligned. The rules for structures and unions, and for structure and union members, however, are somewhat different. This section describes the default rules.

NOTE: The rules described in this section do not apply to bit fields. See Section 3.8.4 for information about the alignment of bit fields.

Alignment rules affect two properties of structures:

• How members are laid out in the record (that is, whether padding is inserted between members).

• How memory for the entire record is allocated.

3.8.2.2 Layout of Structure Members

The compiler lays out structure members based on word alignment rules. According to word alignment rules, all objects longer than a byte must be aligned on shortword boundaries (even addresses). chars may be aligned on odd or even addresses.

As illustrated in the following examples, the default alignment rules can produce padding (also called "holes" or "gaps") in a structure, but the padding is never larger than one byte.
Consider the following structure:

    typedef struct {
        long int a;
        char b;
        short c;
    } S1;

Figure 3-10 shows how the members are laid out. Note that there is a byte of padding inserted after b to ensure that c is aligned on a word boundary.

[Figure 3-10. Default Layout of Structure S1]

The total size of a structure must be a multiple of two bytes. This rule can result in padding at the end of a structure. (This rule also means that the smallest possible structure is 16 bits.) Figure 3-11 shows the layout of a structure that contains a gap in the middle and a gap at the end as a result of the default alignment rules.

    typedef struct {
        char c1;
        short s1;
        char c2;
    } S2;

[Figure 3-11. Layout of Structure S2]

3.8.2.3 Memory Allocation of Structures

Structures are always allocated on even addresses (word boundaries). In addition, they may be allocated on even larger boundaries if that allocation will produce natural alignment for some of the structure's members. The actual algorithm used by the compiler to decide how to allocate structures is somewhat complex. The general steps are as follows:

1. As the compiler lays out members, it assumes that the starting address of the structure is zero.

2. The compiler then notes which members are naturally aligned.

3. After all the members have been laid out, the compiler looks for the largest member that is naturally aligned. The compiler then allocates the entire structure on a boundary that matches the natural alignment for this member.

These rules will be clearer if we show how they work for a couple of examples. Consider the following structure type:

    typedef struct {
        float a;
        char b;
        short c;
    } S3;

The layout for this structure is shown in Figure 3-12.

    offset 0    a
    offset 2    a (continued)
    offset 4    b, followed by one byte of padding
    offset 6    c

Figure 3-12.
Naturally Aligned Structure S3 with 1-byte Padding

The compiler lays out the members according to word alignment rules, and assumes that the structure begins at address zero. For this structure, the alignment rules produce a layout in which all elements are naturally aligned. (Any member that starts at address zero is naturally aligned.) The compiler then searches for the largest member that is naturally aligned, which is a. The natural alignment of a is longword; therefore, structures of type S3 will be allocated on longword boundaries.

Consider a second example:

    typedef struct {
        short a;
        float b;
    } S4;

The layout is shown in Figure 3-13. In this case, a is naturally aligned, but b is not naturally aligned (because the address 2 is not evenly divisible by b's size, which is 4). Therefore, the compiler uses the natural alignment of a (word alignment) to allocate structures of type S4.

    offset 0    a
    offset 2    b
    offset 4    b (continued)

Figure 3-13. Layout of S4 Using Word Alignment Rules

You can usually guarantee that all members of a structure will be naturally aligned by arranging the members in descending order of size. This technique will always work if all the members are scalar objects. This technique may not work if one or more of the structure members is an aggregate. Arranging members in decreasing order of size also guarantees that there will be no padding between structure members. (There might still be a byte of padding at the end of the structure to make it an even number of bytes.)

In some instances, a structure that would normally be allocated on a longword or quadword boundary receives a different allocation because the structure is part of a larger aggregate type (such as a structure or array).
For example, consider the declaration of S1:

    typedef struct {
        long int x;
        short y;
    } S1;

The compiler can guarantee that an individual structure of type S1 will be allocated on a longword boundary (so that x and y will be naturally aligned), but if you declare an array of S1 structures, only half of them will be aligned on longword boundaries. Figure 3-14 shows the layout of an array of three S1 structures. Note that the second element is aligned on a shortword boundary, not a longword boundary.

[Figure 3-14: layout of an array of three S1 structures, with longword and shortword boundaries marked]

    #include <stdio.h>

    int j = 10;    /* Program scope */

    int main( void )
    {
        int j;     /* Block scope -- hides global j */

        for (j=0; j < 5; ++j)
            printf( "j: %d\n", j );
    }

There are two j's, one with program scope and the other with block scope. Although they have the same name, they are distinct variables. The j with block scope temporarily hides the other j, so the result of running the program is:

    j: 0
    j: 1
    j: 2
    j: 3
    j: 4

The j with program scope retains its value of 10.

3.12.2.2 Block Scope

A variable with block scope cannot be accessed outside of its block. Block scoping allows you to write sections of code without worrying about whether your variable names conflict with names used in other parts of the program.

It is also possible to declare a variable within a nested block. This temporarily hides any variables of the same name declared in outer blocks. This feature can be useful when you want to add some debugging code into a function. By creating a new block and declaring variables within it, you eliminate the possibility of naming conflicts. In addition, if you delete the debugging code at a later date, you need not look at the top of the function to find variable declarations that also need to be deleted. In the following example, we add some debugging code that prints the values of the first ten elements of an array.
    foo( )
    {
        int ar[20];
        int j;

        /* Begin debug code */
        {
            /* This j does not conflict with other j's. */
            int j;

            for (j=0; j < 10; ++j)
                printf( "%d\t", ar[j] );
        }
        /* End debug code */

3.12.2.3 Function Scope

The only names that have function scope are goto labels. Labels are active from the beginning to the end of a function. This means that labels must be unique within a function. Different functions, however, may use the same label names without creating conflicts.

3.12.2.4 File and Program Scope

Giving a variable file scope makes the variable active throughout the rest of the file. So if a file contains more than one function, all of the functions following the declaration are able to use the variable. To give a variable file scope, declare it outside of a function with the static keyword.

Variables with program scope, called global variables, are visible to routines in other files as well as their own file. To create a global variable, declare it outside of a function without the static keyword.

In the following program segment, j has program scope and k has file scope. Both variables can be accessed by routines in the same file, but only j can be accessed by routines in other files.

    int j;
    static int k;

    main ()
    {

Variables with file scope are particularly useful when you have a number of functions that operate on a shared data structure, but you don't want to make the data visible to other functions.

3.12.3 Duration of a Variable

The duration of a variable describes the lifetime of a variable's memory storage. There are two categories of duration: automatic and fixed. As the names imply, a fixed variable is one that is stationary, whereas an automatic variable is one whose memory storage is automatically allocated when its scope is entered during program execution.
This means that a fixed variable has memory allocated for it at program start-up time, and the variable is associated with a single memory location until the end of the program. An automatic variable has memory allocated for it whenever its scope is entered. The automatic variable refers to that memory address only as long as code within the scope is being executed. Once the scope of the automatic variable is exited, the compiler is free to assign that memory location to the next automatic variable it sees. If the scope is re-entered, a new address is allocated for the variable. There is no way to ensure that an automatic variable will retain its value from one scope entry to another.

The difference between fixed and automatic variables is especially important for initialized variables. Fixed variables are initialized only once whereas automatic variables are initialized each time their block is re-entered. Consider the following program:

    /* Program name is "example_of_static" */
    #include <stdio.h>

    void increment( void )
    {
        int j = 1;
        static int k = 1;

        j++;
        k++;
        printf( "j: %d\tk: %d\n", j, k );
    }

    int main( void )
    {
        increment();
        increment();
        increment();
    }

The increment() function increments two variables, j and k, both initialized to 1. j has automatic duration by default, while k has fixed duration because of the static keyword. The result of running the program is:

    j: 2    k: 2
    j: 2    k: 3
    j: 2    k: 4

When increment() is called the second time, memory for j is reallocated and j is reinitialized to 1. k, on the other hand, has still maintained its memory address and is not reinitialized, so its value of 2 from the first function call is still present. No matter how many times we call increment(), the value of j will always be 2, while k will increase by 1 every time we call it.
We can summarize this observation with the following rule: an automatic variable, when declared with an initializer, is re-initialized every time its block is re-entered; a fixed variable is initialized only once, at program start-up time.

Another important difference between automatic and fixed variables is that automatic variables are not initialized by default, whereas fixed variables get a default initial value of zero. If we rewrite the previous program without initializing the variables, we get:

    /* Program name is "init_example" */
    #include <stdio.h>

    void increment( void )
    {
        int j;
        static int k;

        j++;
        k++;
        printf( "j: %d\tk: %d\n", j, k );
    }

    int main( void )
    {
        increment();
        increment();
        increment();
    }

Executing the program on our machine results in:

    j: 52517483    k: 1
    j: 52517483    k: 2
    j: 52517483    k: 3

The values of j are random because the variable is never initialized. With each invocation of increment(), j receives a new memory allocation and acquires whatever "garbage" value happens to be at the new location. Because Domain C uses a stack-frame implementation, the garbage values are, in this simple example, the same each time. The C language, however, does not guarantee this. If you use a more complicated calling sequence, the results will be different.

The Domain C compiler issues the following warning if you attempt to use an uninitialized automatic variable before you have made an assignment to it:

    ******** Line 15: Warning: Variable "aut02" was not initialized before this use.
    No errors, 1 warning, C Compiler, Rev 4.82

Another difference between initializing variables with fixed and automatic duration is the kinds of expressions that may be used as an initializer. For scalar variables with automatic duration, the initializer may be any expression, so long as all of the variables in the expression have been previously declared.
For example, all of the following declarations are legal:

    {
        extern double f();
        int x = 10, y = x*x;
        float z = x + f(x);

For variables with fixed duration, on the other hand, the initialization expressions must be constant expressions.

We can summarize the differences between fixed and automatic variables as follows:

• Fixed variables maintain their values from one block invocation to another, but automatic variables lose their value each time the block is deactivated.

• Fixed variables get a default initialization value of zero if you do not explicitly initialize them. If you do not explicitly initialize an automatic variable, the compiler will not initialize it for you.

• The run-time system initializes fixed variables only once, whereas automatic variables, if they are declared with an initializer, are re-initialized each time their block is entered.

Bug Alert: The Dual Meanings of "static"

One of the most confusing aspects of storage-class declarations in C is that the static keyword seems to have two effects depending on where it appears. In a declaration within a block, static gives a variable fixed duration instead of automatic duration. Outside of a function, on the other hand, static has nothing to do with duration. Rather, it controls the scope of a variable, giving it file scope instead of program scope.

One way of reconciling these dual meanings is to think of static as signifying both file scoping and fixed duration. Within a block, the stricter block scoping rules override static's file scoping, so fixed duration is the only manifest result. Outside of a function, duration is already fixed, so file scoping is the only manifest result.

3.12.4 Storage Class Specifiers

As mentioned earlier, you can supply an optional storage class specifier when you declare a variable. There are four storage-class specifiers (auto, static, extern, and register).
Any of the storage class keywords may appear before or after the type name in a declaration, but by convention they come before the type name. (The ANSI standard classifies any other placement of a storage class specifier as an obsolescent feature.) The semantics of each keyword depends to some extent on the location of the declaration. Omitting a storage class specifier also has a meaning, as described below. Table 3-3 summarizes the scope and duration semantics of each storage class specifier.

auto
    The auto keyword, which makes a variable automatic, is legal only for variables with block scope. Since this is the default anyway, auto is somewhat superfluous and is rarely used.

static
    The static keyword may be applied to declarations both within and outside of a function (except for function arguments), but the meaning differs in the two cases. In declarations within a function, static causes the variable to have fixed duration instead of the default automatic duration. For variables declared outside of a function, the static keyword gives the variable file scope instead of program scope.

extern
    The extern specifier may be used for declarations both within and outside of a function (except for function arguments). In both cases, it signifies a global allusion, discussed in Section 3.13.

register
    The register keyword may be used only for variables declared within a function. It makes the variable automatic, but also passes a hint to the compiler to store the variable in a register whenever possible. You should use the register keyword for automatic variables that are accessed frequently. Compilers support this feature at various levels. Some don't support it at all, while others support as many as 20 concurrent register assignments. Note that it is illegal to apply the address-of operator (&) to any variable declared with register.

omitted
    For variables with block scope, omitting a storage class specifier is the same as specifying auto.
For variables declared outside of a function, omitting the storage class specifier gives the variable the same scope and duration as extern (program scope and fixed duration), but it causes the compiler to produce a global definition rather than an allusion.

Here are some sample declarations that contain storage class specifiers:

    auto     int    i;
    register short  quart;
    static   char   dog[] = "Fenster";
    extern   float  f;

Table 3-3. Storage Class Summary

    Storage Class        Outside of a Function   Within a Function     Function
    Specifier            (top-level)             (head-of-block)       Arguments
    -------------------  ----------------------  --------------------  --------------------
    auto or register     NOT ALLOWED             scope: block          scope: block
                                                 duration: automatic   duration: automatic

    static               scope: file             scope: block          NOT ALLOWED
                         duration: fixed         duration: fixed

    extern               scope: program          scope: block          NOT ALLOWED
                         duration: fixed         duration: fixed

    No storage class     scope: program          scope: block          scope: block
    specifier present    duration: fixed         duration: automatic   duration: automatic

3.12.5 The register Specifier

The register keyword enables you to help the compiler by giving it suggestions about which variables should be kept in registers. However, register is only a hint, not a directive; the compiler is free to ignore it. In fact, the Domain compiler is so efficient in allocating variables to registers that using the register keyword has little or no effect on most programs.

Since a variable declared with register might never be assigned a memory address, it is illegal to take the address of a register variable (registers are not addressable). This is true regardless of whether the variable is actually assigned to a register. You will get a compile-time error if you ever try to take the address of a variable declared with register.

3.13 Global Variables

A global variable (also called an external variable) is one that can be accessed by modules in different source files; that is, a global variable has program scope.
There are two types of declarations for global variables: allusions and definitions, as described in the next section.

3.13.1 Definitions and Allusions

The difference between an allusion and a definition in C is subtle but important. An allusion associates a data type with an identifier, but does not actually allocate any storage for it. A definition, on the other hand, actually allocates memory. For example, consider the following allusions and definitions:

           int x;    /* This is a definition */
    static int y;    /* This is a definition */
    extern int z;    /* This is an allusion  */

If you use the storage class specifier extern, you generate an allusion. If you use a storage class specifier other than extern, or if you omit a storage class specifier, then you generate a variable definition. The distinction between allusions and definitions is particularly important when creating global variables.

NOTE: At some points in this manual, the distinction between an allusion and a definition is unimportant. For these instances, we use the more general word "declaration."

Typically, you put all allusions in a header file, which can be included in other source files. This ensures that all source files use consistent allusions. Any change to a declaration in a header file is automatically propagated to all source files that include that header file.

3.13.2 Defining Global Variables

In Domain C, every global variable can be alluded to zero or more times (in different files), but must be defined at least once. It may be defined more than once in different files. You cannot, however, define a global variable more than once in the same file.

If you explicitly initialize a global variable in more than one file, the last initializer read by the linker is the variable's initial value at run time. Therefore, the order in which you list files in the bind or ld command determines the initial values of external variables.
If you do not initialize a global definition, its initial value defaults to 0. To demonstrate these rules, consider Figures 3-23, 3-24, and 3-25.

    t1.c:
        extern int x;   /*all*/
        main()
        {
            printf("%5d", x);
            f();
            g();
        }

    t2.c:
        extern int x;   /*all*/
        f()
        {
            printf("%5d", x);
        }

    t3.c:
        int x;          /*def*/
        g()
        {
            printf("%5d\n", x);
        }

    $ cc t1; cc t2; cc t3
    $ bind t1.bin t2.bin t3.bin -b t
    $ t
        0    0    0

    $ cc t1.c t2.c t3.c
    $ a.out
        0    0    0

Figure 3-23. Two Declarations and One Definition with No Initialization

    t1.c:
        extern int x;   /*all*/
        main()
        {
            printf("%5d", x);
            f();
            g();
        }

    t2.c:
        extern int x;   /*all*/
        f()
        {
            printf("%5d", x);
        }

    t3.c:
        int x = 5;      /*def*/
        g()
        {
            printf("%5d\n", x);
        }

    $ cc t1; cc t2; cc t3
    $ bind t1.bin t2.bin t3.bin -b t
    $ t
        5    5    5

    $ cc t1.c t2.c t3.c
    $ a.out
        5    5    5

Figure 3-24. The Effect of Initializing a Global Variable

    t1.c:
        extern int x;   /*all*/
        main()
        {
            printf("%5d", x);
            f();
            g();
        }

    t2.c:
        int x = 7;      /*def*/
        f()
        {
            printf("%5d", x);
        }

    t3.c:
        int x = 5;      /*def*/
        g()
        {
            printf("%5d\n", x);
        }

    $ cc t1; cc t2; cc t3
    $ bind t1.bin t2.bin t3.bin -b t
    $ t
        5    5    5
    $ bind t1.bin t3.bin t2.bin -b t
    $ t
        7    7    7

    $ cc t1.c t2.c t3.c
    $ a.out
        5    5    5
    $ cc t1.c t3.c t2.c
    $ a.out
        7    7    7

Figure 3-25. The Effect of Linking Order on Variable Initialization

For further clarification on global variables, we provide the following program fragments:

Here is FILE 1:

    int d1;              /* This is a definition of a global variable. */
    int d2 = 1;          /* This is a definition of a global variable with an
                          * initializer. */
    extern int d3;       /* This is an allusion to a global variable defined
                          * in FILE 2. */
    extern int d4 = 5;   /* THIS IS ILLEGAL! You cannot initialize an
                          * allusion. */

    int main( void )
    {
        int local;       /* This is a definition of a local variable. It is
                          * not exported by the binder. */
        extern char d5;  /* This is an allusion to a global variable
                          * defined in FILE 2. */
    }

Here is FILE 2:

    int d3 = 0;          /* This is a definition of a global variable. */
    char d5;             /* This is a definition of a global variable. */

    void some_function( void )
    {
        extern int d1;   /* This is an allusion to the variable defined on
                          * line #1 of FILE 1. */
    }

3.13.3 Portability Considerations Regarding Global Variables

If you are planning to port your Domain C programs to a different machine, take into account that not all compilers use the same strategy for external definitions and declarations. For maximum portability, follow these guidelines:

• Do not define the same global variable more than once in the same program. Domain C permits you to define a global variable multiple times, but other C compilers may be stricter.

• For each routine that refers to a global variable, declare the variable with the keyword extern, and without an initializer.

3.13.4 Sections

The Domain C compiler creates a named section for each globally defined variable. Sections are detailed in the Domain/OS Programming Environment Reference manual. When the object files are bound together, the linker makes sure that all global variables with the same name refer to the same named section.

3.14 Storage Class of Functions

Just like variables, functions also have a scope, although the rules are somewhat different. When discussing the storage class of functions, it is important to distinguish between function definitions and function allusions.

3.14.1 Function Definitions

A function definition is a complete function; that is, it consists of the data type that the function returns, the name of the function, optional parameters, parameter declarations, and the function body. For example, here is the definition of a function named fun:

    #include <stdio.h>

    int fun( int x, int y )
    {
        printf( "%d %d", x, y );
        return (x + y);
    }

By default, function definitions have global scope. In other words, you can call these routines from any place in the program (including some other file).
If you want the function definition to have file scope instead, then use the storage class specifier static. For example:

    #include <stdio.h>

    static int fun( int x, int y )
    {
        printf( "%d %d", x, y );
        return (x + y);
    }

By using static, you limit the scope of function fun to the file in which it is defined. Note that static is the only legal storage class specifier for a function definition.

3.14.2 Function Allusions

A function allusion identifies a function that is defined elsewhere, either in the same source file or in another source file. A function allusion can begin with the extern storage class specifier. It optionally contains the data type that the function can return, and concludes with the name of the function followed by an empty pair of parentheses. (This is the old-style type of function allusion; the new style uses prototypes, as described below.) For example:

    extern int fun();
    extern fun();
    int fun();

Note that you can omit either the type specifier or extern, but not both. If you omit both, the declaration will be interpreted as a function invocation.

Domain C supports a new syntax for function allusions called prototypes. A prototype enables you to specify the types and number of arguments that the function accepts. For example:

    extern int fun( int, char *, float );

Prototypes are described in detail in Chapter 5.

You can specify a function allusion either within a block or outside of a block. When declared within a block, it means that you can invoke that function within the block. When declared outside of a block, you can invoke the function anywhere from the declaration point to the end of the source file.

Technically, you do not need to declare functions that return an int since this is the default. However, it is good programming practice to declare all functions since it makes your programs easier to understand.

For more information on function allusions and definitions, see Chapter 5.
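To tie the two kinds of function declaration together, here is a minimal sketch in standard C. The names scale and add_scaled are hypothetical and chosen only for illustration; the fragment assumes a compiler that accepts prototypes.

```c
/* Prototype-style allusion: tells the compiler the argument and return
 * types of add_scaled before its definition appears. */
extern int add_scaled( int, int );

/* File-scope definition: static hides scale (a hypothetical helper)
 * from every other source file in the program. */
static int scale( int x )
{
    return x * 10;
}

/* Global definition: any file that contains a matching allusion can
 * call add_scaled, but only this file can call scale directly. */
int add_scaled( int x, int y )
{
    return scale( x ) + y;
}
```

Another source file could call add_scaled after repeating the allusion on the first line, typically by including a shared header file. An attempt to call scale from that file would fail at link time.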
3.15 Reference Variables - Domain Extension

The Domain C compiler supports reference variables as implemented in the C++ language. This discussion describes the most common usages of reference variables. For a more complete discussion, we recommend that you read The C++ Programming Language by Bjarne Stroustrup. (Reference variable features will not be activated if you compile with the -ntype option.)

A reference variable is a variable that refers to another object (an lvalue or an rvalue). Whenever a reference variable appears in an expression, the object it denotes is accessed. Reference variables have three main applications:

• Reference variables allow you to create aliases for a variable so that two or more names refer to the same object.

• Reference variables allow you to give names to constants, and, more importantly, to use the constants as lvalues. In effect, reference variables turn constants into variables.

• Reference variables provide a clean syntax for passing function arguments by reference.

These applications of reference variables are discussed in Chapter 5. The following section describes how to declare reference variables.

3.15.1 Declaring Reference Variables

To declare a reference variable, precede the variable name with the address-of operator (&) and include an initializer:

    int j;
    int &rj = j;          /* rj refers to j */
    float &rf = 3.141;    /* rf refers to the constant 3.141 */

The initializer is required because it specifies the object that the reference variable denotes. Having made these declarations, you can write:

    rj = 1;       /* assigns 1 to j */
    rj++;         /* increments j */
    rf *= rf;     /* squares 3.141 */

The last example is the most interesting because it uses a reference variable denoting a constant as an lvalue. This is legal because the compiler generates a temporary variable for all reference variables initialized with a constant value.
For example, the declaration,

    int &r = 0;

causes the compiler to generate a hidden temporary variable initialized to zero. Whenever r appears in an expression, this hidden variable is accessed.

3.16 The #attribute Modifier - Domain Extension

The Domain C compiler supports a declaration modifier called #attribute that enables you to access special features of the Domain C compiler. One of the purposes of #attribute is to turn off certain kinds of compiler optimizations. This feature is particularly useful for writing device drivers or other programs that access fixed memory locations.

Although it begins with the # character, #attribute is a reserved word, not a preprocessor statement. You use it when you declare or define a variable, tag name, or typedef. The #attribute modifier always takes one of the following arguments (called attribute specifiers) enclosed in brackets:

address
    Binds a variable to a specific virtual address.

device
    Informs the compiler that the variable is a device register. The device specifier is similar to volatile, but restricts optimizations even further.

section
    Specifies a named section in which to overlay the variable.

volatile
    Informs the compiler that the variable may change in ways that it cannot predict. Consequently, the compiler refrains from executing certain optimizations.

Each of these specifiers is described in detail in later sections. First, however, we provide some general information about the #attribute modifier.

3.16.1 Inheritance of Declaration Modifiers

The device, volatile, and section modifiers are inheritable in the type declaration hierarchy. That is, if you define a type in terms of some more primitive type that was declared with one or more of these modifiers, then the new type inherits those modifiers.
For example, the following declaration defines a type called SEMAPHORE and an array called resource:

    typedef int SEMAPHORE #attribute[volatile];
    SEMAPHORE resource[10];

The resource array inherits the volatile modifier from the definition of the SEMAPHORE typedef.

Note that this rule does not apply to the address specifier because this specifier is valid only in variable definitions, not in tag name or typedef declarations.

3.16.2 #attribute and Pointer Types

It is usually incorrect to associate the device and volatile specifiers with a pointer type. For example, declaring a pointer to a device register by means of the following declaration is almost certainly incorrect:

    int *iodata #attribute[device];

The correct specification is:

    typedef int DEVINT #attribute[device];
    DEVINT *iodata;

which declares a pointer to an int that carries the #attribute modifier, rather than assigning the modifier to the pointer itself.

3.16.3 The volatile Specifier

The syntax of the volatile specifier is:

    [specifier] data_type variable_name #attribute[volatile] [initializer]

where specifier can be extern, auto, static, register, or typedef.

The volatile specifier informs the compiler that memory contents may change in a way that the compiler cannot predict. There are two situations, in particular, where this might occur:

• The variable is in a shared memory location accessed by two or more processes.

• The variable can be accessed by two different access paths. (That is, multiple pointers with different base types refer to the same memory locations.)

In both of these situations, it is crucial that you tell the compiler not to perform certain optimizations as it normally would. For example, the following code causes optimizations leading to erroneous code.
    /* Program name is "volatile_example" */
    #include <stdio.h>

    #ifdef ATTR
    #  define VOL #attribute[volatile]
    #else
    #  define VOL
    #endif

    typedef int VINT VOL;

    void killer( int a, VINT b )
    {
        int j;
        int *p = &a + 1;

        j = b*(b+1);
        *p = 0;
        j = j + b*(b+1);
        printf( "b = %d\n", b*(b+1) );
    }

    int main( void )
    {
        killer( 1, 2 );
    }

In the preceding program, the compiler sees that the calculation b*(b+1) is done three times without any change to b. Since it appears wasteful to repeat the same calculation, the compiler makes the calculation only once and stores the result in a register. Then, instead of calculating it a second or third time, it simply fetches the value from the register.

The problem with this optimization is that b's value is indirectly changed between the first and second calculations: in this program, p is set to &a + 1, which on this implementation addresses the argument b, so the assignment *p = 0 overwrites b. Therefore, you must use #attribute[volatile] to tell the compiler to avoid the optimization.

Notice that the VOL macro, which supplies #attribute[volatile], is defined in a conditional compilation directive. Therefore, if we compile with the following compilation option:

    -def ATTR

and run the resulting program, we get the following results:

    b = 0

However, if we compile without the -def ATTR option, and we run the program, we get the following results:

    b = 6

3.16.4 The device Specifier

The syntax for the device specifier is:

    [specifier] data_type variable_name #attribute[device[([read,] [write])]] [initializer]

where specifier can be extern, auto, static, register, or typedef.

The device specifier informs the compiler that a device register (control or data) is mapped at a specific virtual address. The device specifier prevents the same optimizations that volatile prevents, and prevents two other optimizations as well.

The first optimization that device prevents concerns adjacent references. By default, the compiler optimizes certain adjacent references by merging them into one larger reference. The device specifier prevents this optimization.
For example, consider the following fragment:

    short int a, b;

    a = 0;
    b = 0;

By default, the compiler optimizes the two 16-bit assignments by merging them into one 32-bit assignment. (That is, at run time, the system assigns a 32-bit zero instead of assigning two 16-bit zeros.) By specifying the device specifier, you suppress this optimization.

The device specifier also prevents the compiler from generating gratuitous read-modify-write references for device registers. That is, declaring a variable with #attribute[device] causes the compiler to avoid using instructions that do unnecessary reads.

Now let's demonstrate device through some examples. Suppose kb in the following fragment is a device register that accepts characters from the keyboard:

    char c, c1, *kb;

    c  = *kb;
    c1 = *kb;

The purpose of the program is to read a character from the keyboard and store it in c, then read the next character and store it in c1. However, the C compiler, unaware that the value of *kb can be changed outside of the block, optimizes the code as follows: it reads *kb once, keeps the value in a register, and thus assigns identical values to both c and c1. Obviously, this is not what the programmer intended. To ensure that Domain C reads kb twice, declare it as:

    char *kb #attribute[device];

Another situation where normal optimization techniques can change the meaning of a program is in loop-invariant expressions. For instance, using kb again, suppose we have the program segment:

    int x;
    char c, *kb;

    while (x < 10)
    {
        c = *kb;
        foo(c);
        ++x;
    }

The purpose of the block is to read 10 successive characters from the keyboard and pass each to a function called foo. However, to the compiler, it looks like an inefficient program since c will be assigned the same value 10 times.
To optimize the program, the compiler may translate it as if it had been written:

    int x;
    char c, *kb;

    c = *kb;
    while (x < 10)
    {
        foo(c);
        ++x;
    }

To ensure that the compiler does not optimize your program in that manner, declare kb as follows:

    char *kb #attribute[device];

In addition to suppressing optimizations, you can also use device to specify that a device is either exclusively read from or exclusively written to. You achieve this by using the read and write options:

device(read)
    This attribute specifies read-only access for this variable or type. The intent is that, if you attempt to write to this variable, the compiler flags the attempt as invalid and issues an error message. Although the syntax is available, the read option currently has no effect.

device(write)
    This attribute specifies write-only access for this variable or type. The intent is that, if you attempt to read from this variable, the compiler flags the attempt as invalid and issues an error message. Although the syntax is available, the write option currently has no effect. It will be implemented in a future release of Domain C.

device(read,write)
    This attribute specifies both read and write access for this variable. Using it is identical to using device by itself (without any options).

device(write,read)
    Same as device(read,write).

For example, here are some sample declarations using device:

    typedef int a[10] #attribute[device(read)];    /* read access */
    char c  #attribute[device(write)];             /* write access */
    char c2 #attribute[device(read,write)];        /* read and write access */

3.16.5 The address Specifier

The syntax for the address specifier is:

    [specifier] data_type variable_name #attribute[address(constant)] [initializer]

where specifier can be auto, static, or register.

The address specifier binds a variable to the virtual address given by a constant.
You can use address for a variable definition only; therefore, you cannot use it with typedef or extern. The address specifier is useful for referencing objects at fixed locations in the address space (such as device registers, the PEB page, or certain system data structures). Typically, the compiler generates absolute addressing modes when accessing such an operand.

Using address by itself (without device or volatile) does not suppress any compiler optimizations. You should use it in conjunction with device or volatile.

The example below associates the variable peb_page with the hexadecimal virtual address FF7000.

    char peb_page #attribute[device, address(0xFF7000)];

3.16.6 The section Specifier

The syntax for the section specifier is:

    [extern] data_type variable_name #attribute[section(name)] [initializer]

where name is the named section in which to place the variable. Note that the #attribute[section] modifier is legal only for global declarations. You will receive an error if you attempt to use it with local declarations.

When you compile with /bin/cc, the compiler places all uninitialized global declarations in a section of the object file called .bss. All initialized global variables are placed in a section called .data. This is the standard format for UNIX object files. The /com/cc compiler, on the other hand, creates a special named section for each global variable, whether it is initialized or not. By default, the name of the section is the same as the global variable. (You can obtain the /bin/cc object file format by compiling with the -bss option.)

The section specifier enables you to mimic /com/cc behavior when you compile with /bin/cc. This is particularly useful for interacting with FORTRAN programs that use common blocks.
For example, suppose a FORTRAN program contains the following common block definition:

    integer*4    first
    real*8       second
    character*20 third
    common /com_block/ first, second, third

These declarations produce a named section called com_block in the object file that contains the three variables named first, second, and third. If you want to access these variables from a C program compiled with /bin/cc, you need to use the section specifier:

    typedef struct {
        int    first;
        double second;
        char   third[20];
    } COM_BLOCK;

    COM_BLOCK com_block #attribute[section(com_block)];

If you are compiling with /com/cc, the #attribute[section] modifier is unnecessary because /com/cc automatically creates a named section for each global variable. The binder then overlays sections that have the same name.

See Chapter 7 for more information about sharing global data with Pascal and FORTRAN programs.

Chapter 4 Code

This chapter describes the statements and operators that make up the action part of a Domain C function. We provide an overview at the beginning of this chapter. The remainder of the chapter is a Domain C encyclopedia. If you are just beginning to learn C, we suggest you read a good C tutorial textbook before trying to use this chapter.

This overview of Domain C code is divided into the following categories:

• Statements
• Operators
• Type Conversions
• Preprocessor Directives

4.1 Statements

There are many types of C statements: null statements, simple statements, compound statements, branching statements, and looping statements. The following sections briefly describe each of these types.

4.1.1 Null Statement

A null statement is simply a semicolon by itself. The null statement is sometimes used as the block of a for or while loop, when the action is specified in the loop.
The following loop, for instance, reads characters from stdin until end-of-file (EOF) is encountered:

    while ((c = getchar()) != EOF)
        ;    /* null statement */

4.1.2 Simple Statement

A simple statement consists of an expression followed by a semicolon. Here are a few examples of simple statements:

    x = 5;    /* a variable assignment */
    x++;      /* a variable increment */
    f(x);     /* a function call (see Chapter 5 for details) */

4.1.3 Compound Statement or Block

A compound statement or block has the following format:

    {
        declaration1
        ...
        declarationN
        statement1
        ...
        statementN
    }

That is, a compound statement consists of zero or more declarations followed by zero or more statements. A declaration can be any variable or typedef declaration. (Note that such a declaration has block scope.) A statement can be any null statement, simple statement, or compound statement. The body of a function is itself a block.

C programmers commonly use compound statements as the body of a loop. In the following example, two statements (an assignment statement and a function call) make up the compound statement:

    for (x = 1; x < 11; x++)
    {
        running_total = running_total + x;                 /* assignment statement */
        printf("running_total is %d\n", running_total);    /* function call */
    }

A right brace } marks the end of a compound statement; do not put a semicolon after this right brace.

4.1.4 Branching Statements

C supports two conditional branching statements: if (and if/else) and switch. The if and if/else statements test expressions and execute statements depending on the results of the test. The switch statement selects among several statements based on constant values. The case, default, and break keywords are optional elements of a switch statement.

C supports two unconditional branching statements: goto and return. The goto statement causes a jump to a label (or more specifically, a jump to the first statement following that label). All statements may be preceded by a label.
The return statement causes an unconditional return to the calling routine. You can optionally use return to pass data back to the caller.

4.1.5 Looping Statements

Domain C supports three looping statements: for, while, and do/while. These statements enable you to iterate through a block of code. Within a loop, you can use the continue and break statements. The continue statement causes a jump to the next iteration of the loop, while break transfers control to the first statement following the end of the loop.

4.2 Overview: Operators

Operators are the verbs of the C language that let you calculate values. C's rich set of operators is one of its distinguishing characteristics. The operator symbols are composed of one or more special characters. If an operator consists of more than one character, you should enter the characters without any intervening spaces:

    x <= y     /* legal expression */
    x < = y    /* illegal expression */

Each operator takes one or more operands. If you think of operators as verbs, then the operands are the subject and object of those verbs.

Domain C supports the following kinds of operators:

• Pointer operators
• Increment and decrement operators
• Cast operators
• sizeof operator
• Arithmetic operators
• Comparison (relational) operators
• Bit operators
• Logical operators
• Conditional expression operators
• Comma operator
• Assignment operator

We summarize these operators in this section.

For many of the operators, one or more of the operands must be an lvalue. An lvalue is an expression that refers to a region of storage that can be manipulated. In other words, an lvalue is any expression that you can use on the left side of an assignment operation. For example, all simple variables, like ints and floats, are lvalues. An element of an array is also an lvalue; however, an entire array is not. A member of a structure or union is an lvalue; an entire structure or union is not.
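The lvalue cases listed above can be illustrated with a short function. This is a sketch in standard C; the names point and lvalue_demo are hypothetical and used only for illustration.

```c
struct point { int x; int y; };

/* lvalue_demo is a hypothetical helper: each assignment target below
 * is one of the lvalue forms described in the text. */
int lvalue_demo( void )
{
    int n;
    int values[3];
    struct point p;
    int *ptr = &n;        /* the & operator requires an lvalue operand */

    n = 5;                /* simple variable */
    values[1] = n + 1;    /* array element: values[1] is now 6 */
    p.x = values[1];      /* structure member: p.x is now 6 */
    *ptr += p.x;          /* dereferenced pointer: n becomes 11 */

    /* (n + 1) = 7;  would be rejected by the compiler, because the
     * expression n + 1 does not refer to a region of storage. */

    return n;
}
```

The commented-out line shows the other side of the rule: an arithmetic result such as n + 1 is a value, not a storage location, so it cannot appear on the left side of an assignment.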
4.2.1 Pointer Operators

We begin this overview with a look at the pointer operators:

    *ptr_exp         Dereferences a pointer. That is, it finds the contents stored at the virtual address that ptr_exp holds.
    ptr->member      Dereferences a pointer to a structure or union, where member is a member of that structure or union.
    &lvalue          Finds the virtual address where the lvalue is stored.

See the "pointer operations" listing later in this chapter for details.

4.2.2 Increment and Decrement Operators

C supports the increment and decrement unary operators listed below.

    ++lvalue         Increments the current value of lvalue before lvalue is referenced.
    lvalue++         Increments the current value of lvalue after lvalue has been referenced.
    --lvalue         Decrements the current value of lvalue before lvalue is referenced.
    lvalue--         Decrements the current value of lvalue after lvalue has been referenced.

For details, see the "increment and decrement operators" listing later in this chapter.

4.2.3 Cast Operator

C supports the cast operator, which takes the following form:

    (data_type)exp   Casts the value of exp to a new data type.

For details, see the "cast operator" listing later in this chapter.

4.2.4 sizeof Operator

The following list provides an overview of the sizeof operator:

    sizeof exp           Calculates the size (in bytes) of exp.
    sizeof (data_type)   Calculates the size (in bytes) that a variable of this data_type takes up in memory.

For details, see the "sizeof" listing later in this chapter.

4.2.5 Arithmetic Operators

The following list summarizes the arithmetic operators:

    exp1 + exp2      Adds exp1 and exp2. An exp can be any integer expression or floating-point expression.
    exp1 - exp2      Subtracts exp2 from exp1. An exp can be any integer expression or floating-point expression.
    exp1 * exp2      Multiplies exp1 by exp2. An exp can be any integer expression or floating-point expression.
    exp1 / exp2      Divides exp1 by exp2. (Can perform integer or real division. If integer division, the / operator performs division and truncates the result to an integer.)
    exp1 % exp2      Finds modulo of exp1 divided by exp2. (That is, finds the remainder of an integer division.) An exp can be any integer expression.
    -exp             Negates the value of exp. (That is, it multiplies exp by -1.) exp can be any integer expression or floating-point expression.

For full details on these operators, see the "arithmetic operators" listing later in this chapter.

4.2.6 Comparison (Relational) Operators

Use the following operators to compare two expressions:

    exp1 < exp2      Evaluates to 1 (true) if exp1 is less than exp2; otherwise, evaluates to 0 (false).
    exp1 > exp2      Evaluates to 1 if exp1 is greater than exp2; otherwise, evaluates to 0.
    exp1 <= exp2     Evaluates to 1 if exp1 is less than or equal to exp2; otherwise, evaluates to 0.
    exp1 >= exp2     Evaluates to 1 if exp1 is greater than or equal to exp2; otherwise, evaluates to 0.
    exp1 == exp2     Evaluates to 1 if exp1 is equal to exp2; otherwise, evaluates to 0.
    exp1 != exp2     Evaluates to 1 if exp1 is not equal to exp2; otherwise, evaluates to 0.

For details, see the "relational operators" listing later in this chapter.

4.2.7 Bit Operators

Use operators from the following list to perform bit operations. Note that all operands in this list must be integers.

    exp1 << exp2     Left shifts the bits in exp1 by exp2 positions.
    exp1 >> exp2     Right shifts the bits in exp1 by exp2 positions.
    exp1 & exp2      Performs a bitwise AND operation.
    exp1 ^ exp2      Performs a bitwise exclusive OR operation.
    exp1 | exp2      Performs a bitwise inclusive OR operation.
    ~exp             Calculates the one's complement of exp.

For details, see the "bit operators" listing later in this chapter.

4.2.8 Logical Operators

The following list summarizes the three logical operators:

    exp1 && exp2     Performs a logical AND on the values of exp1 and exp2. In C, the value 0 is equivalent to false, and any nonzero value is equivalent to true.
    exp1 || exp2     Performs a logical OR on the values of exp1 and exp2.
    !exp             Calculates the logical negation of exp.

For details, see the "logical operators" listing later in this chapter.

4.2.9 Conditional Expression Operator

C supports the following conditional expression operator:

    exp1 ? exp2 : exp3   C shorthand for an if/else statement. If exp1 is true (nonzero), then the result is exp2. If exp1 is false (zero), then the result is exp3. Note that the conditional operator has the advantage that it can be used in some places that an if/else statement cannot.

For details, see the "conditional expression operator" listing later in this chapter.

4.2.10 Comma Operator

C supports the comma operator as follows:

    exp1, exp2       Separates two expressions. Note that all expressions return values. The value of a comma operation is equal to the value of exp2.

For details, see the "comma operator" listing later in this chapter.

4.2.11 Assignment Operators

Finally, C supports all of the following assignment operators:

    lvalue = exp     Sets lvalue (a variable name) to the value of exp.
    lvalue += exp    Sets lvalue equal to lvalue + exp.
    lvalue -= exp    Sets lvalue equal to lvalue - exp.
    lvalue *= exp    Sets lvalue equal to lvalue * exp.
    lvalue /= exp    Sets lvalue equal to lvalue / exp.
    lvalue %= exp    Sets lvalue equal to lvalue % exp.
    lvalue >>= exp   Sets lvalue equal to lvalue >> exp.
    lvalue <<= exp   Sets lvalue equal to lvalue << exp.
    lvalue &= exp    Sets lvalue equal to lvalue & exp.
    lvalue ^= exp    Sets lvalue equal to lvalue ^ exp.
    lvalue |= exp    Sets lvalue equal to lvalue | exp.

See the "assignment operators" listing later in this chapter.

4.2.12 Precedence and Associativity of Operators

All operators have two important properties associated with them called precedence and associativity. Both properties affect how operands are attached to operators. Operators with higher precedence have their operands bound, or grouped, to them before operators of lower precedence, regardless of the order in which they appear.
For example, the multiplication operator has higher precedence than the addition operator, so the two expressions,

    2 + 3 * 4
    3 * 4 + 2

both evaluate to 14. The operands 3 and 4 are grouped with the multiplication operator rather than the addition operator because the multiplication operator has higher precedence. If there were no precedence rules, and the compiler grouped operands to operators in left-to-right order, the first expression,

    2 + 3 * 4

would evaluate to 20. Table 4-1 lists every C operator in order of precedence.

In cases where operators have the same precedence, associativity (sometimes called binding) is used to determine the order in which operands are grouped with operators. Grouping occurs in either right-to-left or left-to-right order, depending on the operator. Right-to-left associativity means that the compiler starts on the right of the expression and works left. Left-to-right associativity means that the compiler starts on the left of the expression and works right. For example, the plus and minus operators have the same precedence and are both left-to-right associative:

    a + b - c;      /* add a to b, then subtract c */

The assignment operator, on the other hand, is right-associative:

    a = b = c;      /* assign c to b, then assign b to a */

4.2.13 Parentheses

The compiler groups operands and operators that appear within parentheses first, so you can use parentheses to specify a particular grouping order. For example:

    /* subtract 3 from 2, then multiply that by 4 -- result is -4 */
    (2 - 3) * 4

    /* multiply 3 and 4, then subtract from 2 -- result is -10 */
    2 - (3 * 4)

In the second case, the parentheses are unnecessary since multiplication has a higher precedence than subtraction. Nevertheless, parentheses serve a valuable stylistic function by making an expression more readable, even though they may be redundant from a semantic viewpoint.
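The grouping rules above can be checked with a short sketch. The function names below are invented for this illustration. Subtraction is used for the associativity case because, unlike a + b - c, regrouping a - b - c changes the result.

```c
/* Multiplication binds tighter than addition: 2 + (3 * 4). */
int by_precedence(void)
{
    return 2 + 3 * 4;
}

/* Parentheses force the addition to be grouped first. */
int by_parentheses(void)
{
    return (2 + 3) * 4;
}

/* Subtraction groups left to right: (a - b) - c. */
int left_to_right(int a, int b, int c)
{
    return a - b - c;
}

/* Explicit right-hand grouping, for comparison: a - (b - c). */
int regrouped(int a, int b, int c)
{
    return a - (b - c);
}

/* Assignment groups right to left: a = (b = c). */
int right_to_left(int c)
{
    int a, b;

    a = b = c;
    return a + b;
}
```

With the arguments 10, 4, and 3, left_to_right() yields 3 while regrouped() yields 9, which shows that the default grouping for subtraction starts on the left.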
In the event of nested parentheses, the compiler groups the expression enclosed by the innermost parentheses first.

4.2.14 Order of Evaluation

An important point to understand is that precedence and associativity have little to do with order of evaluation, another important property of expressions. The order of evaluation refers to the actual order in which the compiler evaluates operators. This is independent of the order in which the compiler groups operands to operators. For most operators, the compiler is free to evaluate subexpressions in any order it pleases. It may even reorganize the expression, so long as the reorganization does not affect the final result. For example, given the expression,

    (2 + 3) * 4

the compiler might first add 2 and 3, and then multiply by 4. On the other hand, a compiler is free to reorganize the expression into:

    (2 * 4) + (3 * 4)

since this gives the same result.

The order of evaluation can have a critical impact on expressions that contain side effects. Moreover, reorganization of expressions can sometimes cause overflow conditions.

Table 4-1. Binding and Precedence of Operators

    Class of operator       Operators in that class               Binding

    (HIGHEST precedence)
    primary                 ()  []  ->  .                         Left-to-Right
    unary                   cast operator  sizeof                 Right-to-Left
                            & (address of)  * (dereference)
                            - (reverse sign)  ~  !  ++  --
    multiplicative          *  /  %                               Left-to-Right
    additive                +  -                                  Left-to-Right
    shift                   <<  >>                                Left-to-Right
    relational              <  <=  >  >=                          Left-to-Right
    equality                ==  !=                                Left-to-Right
    bitwise AND             &                                     Left-to-Right
    bitwise exclusive OR    ^                                     Left-to-Right
    bitwise inclusive OR    |                                     Left-to-Right
    logical AND             &&                                    Left-to-Right
    logical OR              ||                                    Left-to-Right
    conditional             ? :                                   Right-to-Left
    assignment              =  +=  -=  *=  /=  %=                 Right-to-Left
                            >>=  <<=  &=  ^=  |=
    comma                   ,                                     Left-to-Right
    (LOWEST precedence)

4.3 Type Conversions

The C language allows you to mix arithmetic types in expressions with few restrictions.
For example, you can write:

    num = 3 * 2.1;

even though the expression on the right-hand side of the assignment is a mixture of two types, an int and a double. Also, the data type of num could be any scalar data type except a pointer.

To make sense out of an expression with mixed types, C performs conversions automatically. These implicit conversions make the programmer's job easier, but they put a greater burden on the compiler, since it is responsible for reconciling mixed types. This can be dangerous since the compiler may make conversions that you don't expect. For example, the expression,

    3.0 + 1/2

does not evaluate to 3.5 as you might expect. Instead, it evaluates to 3.0 because both operands of the division are ints, so 1/2 is an integer division: the fractional part is truncated, leaving a value of zero.

Implicit conversions, sometimes called quiet conversions or automatic conversions, occur under four circumstances:

1. In assignment statements, the value on the right side of the assignment is converted to the data type of the variable on the left side. These are called assignment conversions and are described in the "assignment operators" section of this chapter.

2. Whenever a char or short int appears in an expression, it is converted to an int. An unsigned char or unsigned short is converted to an unsigned int. These are called integral widening conversions.

3. In an arithmetic expression, objects are converted to conform to the conversion rules of the operator. These arithmetic conversions are described later in this section.

4. In certain situations, arguments to functions are converted. This type of conversion is described in Chapter 5.

As an example of the first type of conversion, suppose j is an int in the following statement:

    j = 2.6;

Before assigning the double constant to j, the compiler converts it to an int, giving it an integral value of 2. Note that the compiler truncates the fractional part rather than rounding to the closest integer.
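The two surprises described above, truncation on assignment and integer division inside a floating-point expression, can be sketched as follows. The function names are invented for this illustration, and the value to be converted is passed as a parameter so that the same helper can be tried with several inputs.

```c
/* Assignment conversion: the double is truncated, not rounded,
 * when it is stored in an int. */
int assign_truncates(double d)
{
    int j;

    j = d;
    return j;
}

/* 1/2 is an integer division that yields 0, so the sum is 3.0. */
double int_division_first(void)
{
    return 3.0 + 1/2;
}

/* Writing one operand as a double forces a floating-point division,
 * so the sum is 3.5. */
double real_division(void)
{
    return 3.0 + 1.0/2;
}
```

Calling assign_truncates(2.6) gives 2, matching the j = 2.6 example in the text.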
The second type of implicit conversion, called integral widening or integral promotion, is almost always invisible.

To understand the third type of implicit conversion, we first need to briefly describe how the compiler processes expressions. When the compiler encounters an expression, it divides it into subexpressions, where each subexpression consists of one operator and one or more objects, called operands, that are bound to the operator. For example, the expression,

    -3 / 4 + 2.5

contains three operators: -, /, and +. The operand to - is 3; there are two operands to /, -3 and 4; and there are two operands to +, -3/4 and 2.5. The minus operator is said to be a unary operator because it takes just one operand, whereas the division and addition operators are binary operators.

Each operator has its own rules for operand type agreement, but most binary operators require both operands to have the same type. If the types differ, the compiler converts one of the operands to agree with the other one. To decide which operand to convert, the compiler resorts to the hierarchy of data types shown in Figure 4-1, and converts the "lower" type to the "higher" type. For example:

    1 + 2.5

involves two types, an int and a double. Before evaluating it, the compiler converts the int into a double because double is higher than int in the type hierarchy. The conversion from an int to a double does not usually affect the result in any way. It is as if the expression were written:

    1.0 + 2.5

Figure 4-1. Hierarchy of C Scalar Data Types (from highest to lowest, following the rules below: long double, double, float, unsigned long int, long int, unsigned int, int)

The rules for implicit conversions in expressions can be summarized as follows. Note that these conversions occur after all integral widening conversions have taken place.

• If a pair of operands contains a long double, the other value is converted to long double.
• Otherwise, if one of the operands is a double, the other is converted to double.
• Otherwise, if one of the operands is a float, the other is converted to float.
• Otherwise, if one of the operands is an unsigned long int, the other is converted to unsigned long int.
• Otherwise, if one of the operands is a long int, then the other is converted to long int.
• Otherwise, if one of the operands is an unsigned int, then the other is converted to unsigned int.

In general, most implicit conversions are invisible. They occur without any obvious effect.

4.4 Overview: Preprocessor Directives

The compiler analyzes preprocessor directives before analyzing any statements or declarations. The preprocessor directives provide information to the compiler on how the code should be compiled. There is no limit to the number of preprocessor directives that a program can contain. Preprocessor directives (with the exception of #module, #section, and #systype) can appear on any line in a program.

Domain C supports the preprocessor directives shown in Table 4-3. Preprocessor directives always begin with the # character. In addition to these directives, Domain C supports the predefined macros and names shown in Table 4-2.

Table 4-2. Predefined Macros and Names

    Name or Macro    What It Does

    defined          A macro that returns 1 if the argument is defined; 0 if the argument is not defined.
    systype          A macro that sets the systype environment variable.
    __DATE__         A name that expands to the date at compilation time.
    __FILE__         A name that expands to the current source filename.
    __LINE__         A name that expands to the current line number in the source file.
    __TIME__         A name that expands to the time of compilation.
    __STDC__         A name that expands to 1 if prototyping is turned on; otherwise it expands to zero.

Table 4-3. Preprocessor Directives

    Preprocessor Directive    What It Does

    #debug *                  Marks source code for conditional compilation.
    #define, #undef           Defines and undefines constants and macros.
    #eject *                  Inserts a page break into the listing file.
    #elif *                   Same as an #else directive followed by an #if directive. (The #elif directive is supported by the UNIX preprocessor (cpp) but not by the preprocessor in the Domain C compiler. Therefore, use #elif only if you are compiling in a UNIX environment or explicitly specify the /bin/cc command.)
    #if, #ifdef, #ifndef,     Controls conditional compilation.
    #else, #endif
    #include                  Loads an include file.
    #line                     Resets the compiler's knowledge of the current source line number and filename.
    #list, #nolist            Enables and disables the listing of source code in the listing file.
    #module *                 Changes the internally stored name of the object module.
    #section *                Directs the binder to place instructions and data into named sections rather than the default sections.
    #systype *                Defines the target system on which the program will run.

    * Preprocessor directives marked with an asterisk can begin on any column; however, the other preprocessor directives must begin in the very first column of a line.

4.5 Encyclopedia of Domain C Code

The remainder of this chapter contains an alphabetical listing of all the elements that can make up the action part of a function. Figure 4-2 shows all the listings of C keywords in this encyclopedia, Figure 4-3 provides all the preprocessor directive listings, and Figure 4-4 gives all the other listings.

    break           if
    continue        return
    do/while        sizeof
    for             switch
    goto            while

Figure 4-2. Keyword Listings in Encyclopedia

    __DATE__                                #line
    #debug                                  #list
    #define, #undef                         #module
    #eject                                  #section
    #if, #ifdef, #ifndef, #else, #endif     __STDC__ and _BFMT_COFF
    #include                                #systype

Figure 4-3. Preprocessor Directive Listings in Encyclopedia

    arithmetic operators                expressions
    array operations                    increment and decrement operators
    assignment operators                logical operators
    bit operators                       pointer operations
    cast operations                     predefined macros
    comma operator                      relational operators
    conditional expression operator     structure and union operations
    enum operations

Figure 4-4. Other Listings in Encyclopedia

arithmetic operators

Operators used to perform arithmetic calculations.

FORMAT

    exp1 + exp2     Addition
    exp1 - exp2     Subtraction
    exp1 * exp2     Multiplication
    exp1 / exp2     Division
    exp1 % exp2     Modulo division
    -exp            Sign reversal

ARGUMENTS

    exp             Any constant or variable expression.

DESCRIPTION

The addition, subtraction, and multiplication (+, -, and *) operators perform the usual arithmetic operations in C programs. All of the arithmetic operators (except the unary sign reversal operator) bind from left to right. The operands may be any integral or floating-point value (except for the modulo operator, which accepts only integer operands). The addition and subtraction operators also accept pointer types as operands. Pointer arithmetic is described in the "pointer operations" section of this chapter.

C's modulo operator (%) produces the remainder of integer division, and so equals zero if the right operand divides the left operand exactly. This can be useful for something like determining whether or not it's a U.S. presidential election year. For example:

    if (year % 4 == 0)
        printf("This is a U.S. presidential election year.\n");
    else
        printf("There will not be a U.S. presidential election this year.\n");

As required by the ANSI standard, Domain C supports the following relationship between the remainder and division operators:

    a equals a%b + (a/b) * b     for any integer values of a and b

As with division expressions, the result of a remainder expression is undefined if the right operand is zero.

The additive inverse operator (-) multiplies its sole operand by -1. For example, if x is an integer with the value -8, then -x evaluates to 8.

Refer to the precedence rules at the beginning of this chapter for information about how these and other operators evaluate with respect to each other.
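The relationship between the remainder and division operators can be tested directly. The sketch below uses two invented helper names; identity_holds() assumes that its second argument is nonzero, since division by zero is undefined.

```c
/* Returns 1 when a == a%b + (a/b)*b holds. The caller must pass a
 * nonzero b. */
int identity_holds(int a, int b)
{
    return a == a % b + (a / b) * b;
}

/* Returns 1 when year is divisible by 4, the test used in the
 * election-year example above. */
int is_election_year(int year)
{
    return year % 4 == 0;
}
```

The identity holds for negative operands as well, whichever way a particular compiler rounds the quotient, because the remainder is defined to compensate for the rounding.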
Bug Alert: Integer Division and Remainder

When both operands of the division operator (/) are integers, the result is an integer. If both operands are positive, and the division is inexact, the fractional part is truncated:

    5/2      evaluates to    2
    7/2      evaluates to    3
    1/3      evaluates to    0

If either operand is negative, however, the compiler is free to round the result either up or down. In accord with the PCC implementation of C, the Domain C compiler always rounds up:

    -5/2     evaluates to    -2 (on Apollo machines) but -3 (on some machines)
    7/-2     evaluates to    -3 (on Apollo machines) but -4 (on some machines)
    -1/-3    evaluates to    0 (on Apollo machines) but -1 (on some machines)

By the same token, the sign of the result of a remainder operation is undefined by the K&R and ANSI standards:

    -5 % 2   evaluates to    1 or -1
    7 % -4   evaluates to    3 or -3

Domain C makes the sign of the result agree with the sign of the left-hand operand:

    -5 % 2   evaluates to    -1 (on Apollo machines)
    7 % -4   evaluates to    3 (on Apollo machines)

This is consistent with the PCC implementation. For portability reasons, you should avoid division and remainder operations with negative numbers since the results can vary from one compiler to another.

One way to avoid the sign problem during division is to always cast the operands to float or double. Even if the result is assigned to an integer, you are guaranteed that the compiler will convert to an integer by truncating the fractional part. For example:

    /* If j is an integer, it will be assigned the value -2. */
    j = (float) 5 / -2;

Although this is a portable solution, it is expensive, since it requires the CPU to perform floating-point arithmetic.

The sign of the remainder is a more difficult problem to circumvent because the operands must be integers; you cannot cast them to float or double. If you always want the sign to be positive, you can use the run-time library abs() function, which returns the absolute value of its argument:
    /* Ensures that the value assigned to j is never negative. */
    j = abs(k % m);

If the sign of the remainder is important to your program's operations, you should use the run-time library div() function, which computes the quotient and the remainder of its two arguments. The sign of both results is determined in a guaranteed and portable manner. (See the description of div() in the SysV Programmer's Reference manual or the BSD Programmer's Reference manual.)

array operations

Operations that may be performed with arrays.

DESCRIPTION

Chapter 3 explains how to declare array variables. Here we explain how to use array variables in statements. You assign a value to an element of an array by specifying an assignment statement of the following form:

    array_name[component_number] = value;

For example, given the following array declaration

    float r_array[1000];

you can assign the value 5.29 to element 3 with the following statement:

    r_array[3] = 5.29;

Note that the component_number must always be an integral value. Consider the following legal and illegal assignments:

    r_array[3] = 5.29;        /* legal */
    r_array['B'] = 5.29;      /* legal */
    r_array[143.5] = 5.29;    /* illegal */

The following program fragment assigns values to an integer array and shows the use of a simple index expression:

    int i, num[5];

    for (i = 0; i < 5; i++)
        num[i] = i;

The array num can hold five integers, and those five are assigned with a simple for loop. Notice that the loop begins its assignments with the zeroth element of the array. All C array subscripts, or indexes, begin at zero (array[0]). Some programming languages always begin at 1 (array[1]), while others allow the programmer to determine the initial subscript value, but C always starts counting at zero. This is important because it means that if you create an array of size n, no element with subscript n is defined; the last element has subscript n-1.
In the example above, num has these five elements:

    num[0]      /* first element */
    num[1]      /* second */
    num[2]      /* third */
    num[3]      /* fourth */
    num[4]      /* fifth */

Even though there is no num[5] element, the compiler does not complain if you assign something to it (or num[6], or num[12], or whatever), and that fact can create hard-to-find errors. When storing an array value, C looks at the array name and then uses the subscript value to determine the memory offset. No bounds checking occurs, as explained in the "Bug Alert: Walking Off the End of an Array."

Subscripting with enums

Domain C allows you to use an enumerated value as an array index. In the following code fragment, the value 3.14159 is assigned to array[2]:

    {
        enum subscripts { zero, one, two, three, four };
        float array[10];

        array[two] = 3.14159;
    }

Bug Alert: Walking Off the End of an Array

Unlike some programming languages, C does not require compilers to check array bounds. This means that you can attempt to access elements for which no memory has been allocated. The results are unpredictable. Sometimes you will access memory that has been allocated for other variables. Sometimes you will attempt to access special protected areas of memory and your program will abort. Usually this type of error occurs because you are off by one in testing for the end of the array. For example, consider the following program, which attempts to initialize every element of an array to zero:

    main()
    {
        int ar[10], j;

        for (j = 0; j <= 10; j++)
            ar[j] = 0;
    }

Since we have declared ar[] to hold ten elements, we can validly refer to elements 0 through 9. Our for loop, however, has an off-by-one bug in it. The loop runs from 0 through 10, so element 10 also gets assigned zero. Since there is no element 10, the program overwrites a portion of memory, very likely the portion of memory reserved for j. If that happens, the result is an infinite loop, because j is reset to zero each time the loop reaches the end of the array.
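One way to repair the loop is to test with < rather than <=, so that the last subscript used is one less than the array length. The sketch below is an illustration only; the macro AR_SIZE and the function clear_array() are invented names.

```c
#define AR_SIZE 10    /* hypothetical name for the array length */

/* Sets every element of ar to zero and returns the number of
 * elements written. The loop test is j < AR_SIZE, so the last
 * subscript used is AR_SIZE - 1. */
int clear_array(int ar[AR_SIZE])
{
    int j;

    for (j = 0; j < AR_SIZE; j++)
        ar[j] = 0;
    return j;
}
```

Using one named constant for both the declaration and the loop bound keeps the two from drifting apart if the array size changes later.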
Accessing Array Elements Through Pointers

One way to access array elements is to enter the array name followed by a subscript. Another way is through pointers. The declarations,

    short ar[4];
    short *p;

create an array of four variables of type short, called ar[0], ar[1], ar[2], and ar[3], and a variable named p that is a pointer to a short. Using the address-of operator (&), you can now make the assignment,

    p = &ar[0];

which assigns the address of array element 0 to p. If we dereference p,

    *p

we get the value of element ar[0]. Until the value of p is changed, the expressions ar[0] and *p refer to the same memory location. Due to the scaled nature of pointer arithmetic, the expression,

    *(p+3)

refers to the same memory contents as:

    ar[3]

In fact, for any integer expression e,

    *(p+e)

is the same as:

    ar[e]

This brings us to the first important relationship between arrays and pointers: Adding an integer to a pointer that points to the beginning of an array, and then dereferencing that expression, is the same as using the integer as a subscript value to the array.

The second important relationship is that an array name that is not followed by a subscript is interpreted as a pointer to the initial element of the array (except when an array name appears as the operand of the sizeof operator). That is, the expressions,

    ar

and

    &ar[0]

are exactly the same. Combining these two relationships, we arrive at the following important equivalence:

    ar[n]    is the same as    *(ar + n)

This relationship is unique to the C language and is one of C's most important features. When the C compiler sees an array name, it translates it into a pointer to the initial element of the array. Then the compiler interprets the subscript as an offset from the base address position. For example, the compiler interprets the expression ar[2] as a pointer to the first element of ar, plus an offset of 2 elements.
Due to scaling, the offset determines how many elements to skip, so an offset of 2 means skip two elements. The two expressions

    ar[2]
    *(ar+2)

are equivalent. In both cases, ar is a pointer to the initial element of the array, and 2 is an offset that tells the compiler to add 2 to the pointer value.

Because of this interrelationship, pointer variables and array names can be used interchangeably to reference array elements. It is important to remember, however, that the values of pointer variables can be changed whereas array names cannot be changed. This is because an array name by itself is not a variable; it refers to the address of the array variable. You cannot change the address of variables. This means that a naked array name (one without a subscript or indirection operator) cannot appear on the left-hand side of an assignment statement. For instance:

    float ar[5], *p;

    p = ar;             /* legal -- same as p = &ar[0] */
    ar = p;             /* illegal -- you may not assign to an array address */
    &p = ar;            /* illegal -- you may not assign to a pointer address */
    ar++;               /* illegal -- you may not increment an array address */
    ar[1] = *(p+3);     /* legal -- ar[1] is a variable */
    p++;                /* legal -- you may increment a pointer variable */
    ++ar[2];            /* legal -- increment element 2 of array */

In the above examples, note that scaling allows you to use the increment and decrement operators to point to the next or previous element of an array.

Passing Arrays as Function Arguments

In C, an array name that appears as a function argument is interpreted as the address of the first element of the array. For instance:

    int main( void )
    {
        extern float func( float [] );
        float x, farray[5];

        x = func( farray );     /* Same as func(&farray[0]) */
    }

On the receiving side, you need to declare the argument as a pointer to the initial element of an array. There are two ways to do this:

    func( float *ar )
    {
    }

or

    func( float ar[] )
    {
    }

The second example declares ar to be an array of indeterminate size.
You may omit the size specification because no storage is being allocated for the array. (You may include a size for documentation purposes.) The array has already been created in the calling routine, and what is being passed is really a pointer to the first element of the array. Since the compiler knows that array expressions result in pointers to the first element of the array, it converts ar into a pointer to a float, just like the first declaration. Functionally, therefore, the two versions are equivalent.

The choice of declaring a function argument as an array or a pointer has no effect on the compiler's operation; it is purely for human readability. To the compiler, ar simply points to a float; it is not an array. Because of the pointer-array equivalence, however, you can still access ar as if it were an array. But you cannot find out the size of the array in the calling function by using the sizeof operator on the argument. For example:

    /* Program name is "print_size" */
    #include <stdio.h>

    void print_size( float arg[] )
    {
        printf( "The size of arg is: %d\n", (int) sizeof(arg) );
    }

    int main( void )
    {
        float f_array[10];

        printf( "The size of f_array is: %d\n", (int) sizeof(f_array) );
        print_size( f_array );
    }

The results of running this program are:

    The size of f_array is: 40
    The size of arg is: 4

The variable f_array is an array of ten 4-byte floats, so the value 40 is its correct size in bytes. The variable arg, on the other hand, is converted to a pointer to a float. Pointers are four bytes long, so the size of arg is 4.

Because it is impossible for the called function to deduce the size of the passed array, it is often a good idea to pass the size of the array along with the base address.
This enables the receiving function to check array boundaries:

    #define MAX_SIZE 1000

    void foo( f_array, f_array_size )
    float f_array[];
    int f_array_size;
    {
        if (f_array_size > MAX_SIZE)
        {
            printf( "Array too large.\n" );
            exit( 1 );
        }
    }

You can obtain the number of elements in an array by dividing the size of the array by the size of each element. On the calling side, you would write:

    foo( f_array, sizeof(f_array)/sizeof(f_array[0]) );

Note that this expression works regardless of the type of element in f_array[].

Returning Arrays from Functions

The return statement can pass only one value back to the caller. It may therefore seem impossible to pass an array back to the caller, but it can be done. The trick is to define the called function so that it returns a pointer to the base type of the array. The following example demonstrates this method. In it, we pass an array of lowercase letters to the function toupper_string(), and it returns an array of uppercase letters.

    /* Program name is "returning_arrays". It demonstrates how a
     * function can return an array to the caller.
     */
    #include <stdio.h>
    #include <ctype.h>
    #include <string.h>

    /* Define a function that returns a pointer to a character */
    char *toupper_string( char *arg )
    {
        static char result[100];
        int i = 0;

        while (*arg)
            result[i++] = toupper( *arg++ );
        result[i] = '\0';       /* terminate the string on every call */
        return result;          /* pass back the address of the first element
                                 * of array 'result'. */
    }

    int main( void )
    {
        char x[100], *px;

        strcpy( x, "hi there" );
        px = toupper_string( x );   /* upon return from the function,
                                     * px points to the first element
                                     * of array result */
        printf( "%s => %s\n", x, px );
    }

NOTE: In the preceding example, we declare array result as static so that it will not disappear after function invocation. Note, though, that any assignment made through pointer px alters the contents of the array, so be careful.

Multidimensional Arrays

An array of arrays is a multidimensional array and is declared with consecutive pairs of brackets.
To access an element in a multidimensional array, you specify as many subscripts as are necessary. Consider the following array of arrays:

    int ar[2][3] = { { 0, 1, 2 }, { 3, 4, 5 } };

The array reference,

    ar[1][2]

is interpreted as

    *(ar[1] + 2)

which is further expanded to:

    *(*(ar+1)+2)

Recall that ar is an array of arrays. When *(ar+1) is evaluated, therefore, the 1 is scaled to the size of the object, which in this case is a 3-element array of ints (which we assume are four bytes long), and the 2 is scaled to the size of an int:

    *(int *) ((char *)ar + (1*3*4) + (2*4))

We put in the (char *) cast to turn off scaling because we have already made the scaling explicit. The (int *) cast ensures that we get all four bytes of the integer when we dereference the address. After doing the arithmetic, the expression becomes:

    *(int *) ((char *)ar + 20)

The value 20 has already been scaled, so it represents the number of bytes to skip. If ar starts at address 1000 (in hex), ar[1][2] refers to the int that begins at address 1014 (in hex), which is 5.

If you specify fewer subscripts than there are dimensions, the result is a pointer to the base type of the array. For example, given the 2-dimensional array declared above, you could make the reference,

    ar[1]

which is the same as:

    &ar[1][0]

The result is a pointer to an int.

Passing Multidimensional Arrays as Arguments

To pass a multidimensional array as an argument, you pass the array name as you would a single-dimension array. The value passed is a pointer to the initial element of the array, but in this case the initial element is itself an array. On the receiving side, you must declare the argument appropriately, as shown in the following example.

    f1()
    {
        int ar[5][6][7];

        f2( ar );
    }

    f2( received_arg )
    int received_arg[][6][7];
    {
    }

Again, you may omit the size of the array being passed, but you must specify the size of each element in the array.
Most compilers don't check bounds, so it doesn't really matter whether you specify the first size. For example, the compiler would interpret the declaration of received_arg as if it had been written:

    int (*received_arg)[6][7];

Another way to pass multidimensional arrays is to explicitly pass a pointer to the first element, and pass the dimensions of the array as additional arguments. In our example, the receiving function declares the argument as a pointer to a pointer to a pointer to an int.

    f1()
    {
        int ar[5][6][7];

        f2( ar, 5, 6, 7 );
    }

    f2( received_arg, dim1, dim2, dim3 )
    int ***received_arg;
    int dim1, dim2, dim3;
    {
    }

The advantage of this approach is that you need not know ahead of time the shape of the multidimensional array. The disadvantage is that you need to manually perform the indexing arithmetic to access an element. For example, to access ar[x][y][z] in f2(), you would need to write:

    *((int *)received_arg + x*dim2*dim3 + y*dim3 + z)

Each step in x skips dim2*dim3 ints and each step in y skips dim3 ints. Note that we need to cast received_arg to a pointer to an int because we are performing our own scaling. Although this method requires considerably more work on the programmer's part, it gives more flexibility to f2() since it can accept 3-dimensional arrays of any size and shape. Moreover, it is possible to define a macro that simplifies the indexing expression.

Bug Alert: Referencing Elements in a Multidimensional Array

One of the most common mistakes made by beginning C programmers, especially those familiar with another programming language, is to use a comma to separate subscripts,

    ar[1,2] = 0;    /* Legal, but probably wrong */

instead of:

    ar[1][2] = 0;   /* Correct */

The comma notation is used in some other languages, such as FORTRAN and Pascal. In C, however, this notation has a very different meaning because the comma is a C operator in its own right. The first statement above causes the compiler to evaluate the expression 1 and discard the result; then evaluate the expression 2.
The result of a comma expression is the value of the rightmost operand, so the value 2 becomes the subscript to ar. As a result, the array reference accesses element 2 of ar. If ar is a 2-dimensional array of ints, the type of ar[2] is a pointer to an int, so this mistake will produce a type incompatibility error. This can be misleading since the real mistake is using a comma instead of brackets.

EXAMPLE

    /* Program name is "bubble_sort". It sorts an array of ints
     * in ascending order using the bubble sort algorithm.
     */
    #define FALSE 0
    #define TRUE 1
    #include <stdio.h>
    #include <stdlib.h>   /* for exit() in the calling program */

    void bubble_sort( int list[], int list_size )
    {
        int j, k, temp, sorted = FALSE;

        while ( !sorted )
        {
            sorted = TRUE;   /* assume list is sorted */

            /* Print loop -- not part of bubble sort algorithm */
            for (k = 0; k < list_size; k++)
                printf( "%d\t", list[k] );
            printf( "\n" );
            /* End of print loop */

            for (j = 0; j < list_size - 1; j++)
            {
                if (list[j] > list[j+1])
                {
                    /* At least 1 element is out of order */
                    temp = list[j];
                    list[j] = list[j+1];
                    list[j+1] = temp;
                    sorted = FALSE;
                }
            }   /* end of for loop */
        }   /* end of while loop */
    }

The function accepts two parameters, a pointer to the first element of an array of ints and an int representing the size of the array. The following program calls bubble_sort() with a 10-element array.
    int main( void )
    {
        int i;
        static int list[] = { 13, 56, 23, 1, 89, 58, 20, 125, 86, 3 };

        bubble_sort( list, sizeof(list)/sizeof(list[0]) );
        exit( 0 );
    }

USING THIS EXAMPLE

Program execution results in the following output:

    13   56   23   1    89   58   20   125  86   3
    13   23   1    56   58   20   89   86   3    125
    13   1    23   56   20   58   86   3    89   125
    1    13   23   20   56   58   3    86   89   125
    1    13   20   23   56   3    58   86   89   125
    1    13   20   23   3    56   58   86   89   125
    1    13   20   3    23   56   58   86   89   125
    1    13   3    20   23   56   58   86   89   125
    1    3    13   20   23   56   58   86   89   125

The bubble sort is not very efficient, but it's a simple algorithm that illustrates array manipulation. The standard run-time library contains a much more efficient sorting function called qsort(), which is described in the SysV Programmer's Reference manual and the BSD Programmer's Reference manual.

assignment operators

Assign new values to variables.

FORMAT

    lvalue = exp      Simple assignment
    lvalue += exp     Addition and assignment
    lvalue -= exp     Subtraction and assignment
    lvalue *= exp     Multiplication and assignment
    lvalue /= exp     Division and assignment
    lvalue %= exp     Modulo division and assignment
    lvalue <<= exp    Left shift and assignment
    lvalue >>= exp    Right shift and assignment
    lvalue &= exp     Bitwise AND and assignment
    lvalue ^= exp     Bitwise XOR and assignment
    lvalue |= exp     Bitwise OR and assignment

ARGUMENTS

    lvalue    Any lvalue.
    exp       Any legal expression.

DESCRIPTION

The = is the fundamental assignment operator in C. The other assignment operators provide shorthand ways to represent common variable assignments. We begin with a discussion of =.

The Assignment (=) Operator

When C sees an equal sign, it evaluates the expression on the right side of the sign and assigns the result to the variable on the left side.
For example:

    x = 3;        /* assigns the value 3 to variable x */
    x = y;        /* assigns the value of y to x */
    x = (y*z);    /* performs the multiplication and assigns
                   * the result to x */

An assignment expression itself has a value, which is the same value that is assigned to the left-hand operand.

The assignment operator has right-to-left associativity, so the expression,

    a = b = c = d = 1;

is interpreted as:

    (a = (b = (c = (d = 1))));

First 1 is assigned to d, then d is assigned to c, then c is assigned to b, and finally, b is assigned to a. The value of the entire expression is 1. This is a convenient syntax for assigning the same value to more than one variable. Note, however, that each assignment may cause quiet conversions, so,

    int j;
    double f;

    f = j = 3.5;

assigns the truncated value 3 to both f and j. On the other hand,

    j = f = 3.5;

assigns 3.5 to f and 3 to j.

The Other Assignment Operators

C's assignment operators provide a handy way to avoid some keystrokes. Any statement in which the left-hand side of the assignment is repeated on the right is a candidate for an assignment operator. If you have a statement like this:

    i = i + 10;

you can use the assignment operator format to shorten the statement to:

    i += 10;

In other words, any statement of the form

    var = var op exp;    /* traditional form */

can be represented in the following shorthand form:

    var op= exp;         /* shorthand form */

The only internal difference between the two forms is that var is evaluated only once in the shorthand form. Most of the time this is not important; however, it is important when the left-hand operand contains side effects, as in the following example:

    int *ip;

    *ip++ += 1;          /* These two statements produce */
    *ip++ = *ip++ + 1;   /* different results. */

The second statement is ambiguous because C does not specify which assignment operand is evaluated first. See Section 4.2.14 for more information concerning order of evaluation.
Assignment Operators in Older C Compilers

Some older C compilers accept assignment operators written with the equal sign first (for example, =+ instead of +=). When the Domain C compiler encounters such an old-style operator, it processes it as if the two signs were reversed, and issues a warning message. Also, some compilers accept a space between the two signs. In those compilers, something like

    + =

is interpreted as

    +=

Since this can lead to ambiguous expressions, the Domain C compiler forbids the space between the operator and the equal sign.

Assignment Type Conversions

Whenever you assign a value to a variable, the value is converted to the variable's data type if possible. In the example below, for instance, the floating-point constant 3.5 is converted to an int so that i gets the integer value 3.

    main()
    {
        int i;

        i = 3.5;
    }

Unlike arithmetic conversions, which always expand the datum, assignment conversions can shorten the datum and therefore affect its value. For example, suppose c is a char, and you make the assignment:

    c = 882;

The binary representation of 882 is:

    00000011 01110010

It requires two bytes of storage, but the variable c has only one byte allocated for it, so the upper byte (with its two set bits) is not assigned to c. This is known as overflow, and the result is not defined by the ANSI and K&R standards for signed types. Domain C simply ignores the extra byte, so c would be assigned the right-most byte:

    01110010

This would erroneously give c the value of 114. The principle illustrated for chars also applies to shorts, ints, and long ints. For unsigned types, however, C has well-defined rules for dealing with overflow conditions. When an integer value x is converted to a smaller unsigned integer type, the result is the non-negative remainder of

    x / (U_MAX+1)

where U_MAX is the largest number that can be represented in the shorter unsigned type.
For example, if j is an unsigned short, which is two bytes, then the assignment

    j = 71124;

assigns to j the remainder of:

    71124 / (65535+1)

The remainder is 5588. Note that for non-negative numbers, and for negative numbers represented in two's complement notation, this is the same result that you would obtain by ignoring the extra bytes.

It is perfectly legal to assign an integer value to a floating-point variable. In this case, the integer value is implicitly converted to a floating-point type. If the floating-point type is capable of representing the integer, there is no change in value. If f is a double, the assignment

    f = 10;

is executed as if it had been written:

    f = 10.0;

This conversion is invisible. There are cases, however, where a floating-point type is not capable of exactly representing all integer values. Even though the range of floating-point values is generally greater than the range of integer values, the precision may not be as good for large numbers. In these instances, conversion of an integer to a floating-point value may result in a loss of precision. Consider the following example:

    #include <stdio.h>

    main()
    {
        long int j = 2147483600;
        float x;

        x = j;
        printf( "j is %ld\nx is %10f\n", j, x );
        exit( 0 );
    }

If you compile this program with the -nopt switch to ensure that x is not stored in a register, and then execute it, you get:

    j is 2147483600
    x is 2147483648.000000

The most risky mixture of integer and floating-point values is the case where a floating-point value is assigned to an integer variable. First, the fractional part is discarded. Then, if the resulting integer can fit in the integer variable, the assignment is made. In the following statement, assuming j is an int, the double value 2.5 is converted to the int value 2 before it is assigned.

    j = 2.5;

This causes a loss of precision which could have a dramatic impact on your program. The same truncation process occurs for negative values.
After the assignment,

    j = -5.8;

the value of j is -5. An equally serious situation occurs when the floating-point value cannot fit in an integer. For example:

    j = 999999999999.0;

This causes an overflow condition which will produce unpredictable results if it is not caught by the compiler. As a general rule, it is a good idea to keep floating-point and integer values separate unless you have a good reason for mixing them.

As is the case with assigning floating-point values to integer variables, there are two potential problems when assigning double values to float variables: loss of precision and overflow. In Domain C a double can represent approximately 16 decimal places, and a float can only represent 7 decimal places. If f is a float variable, and you make the assignment,

    f = 1.0123456789;

the computer rounds the double constant value before assigning it to f. The value actually assigned to f, therefore, will be 1.012346 (Domain C always rounds toward zero). The following example shows rounding due to conversions.

    /* Program name is "float_rounding". It shows how double values
     * can be rounded when assigned to a float.
     */
    #include <stdio.h>

    int main( void )
    {
        float f32;
        double f64;
        int i;

        for (i = 1, f64 = 0; i < 1000; ++i)
            f64 += 1.0/i;
        f32 = f64;
        printf( "Value of f64: %1.7f\n", f64 );
        printf( "Value of f32: %1.7f\n", f32 );
    }

The output is:

    Value of f64: 7.4844709
    Value of f32: 7.4844708

A more serious problem occurs when the value being assigned is too large to be represented in the variable. For example, the largest positive number that can be represented by a float is approximately 3.4e38. What happens if you try to execute the following assignment?

    f = 2e40;

The behavior is not defined by the K&R or ANSI standards. In this simple case, the compiler will recognize the problem and report a compile-time error. In other instances, however, a run-time error could result.
EXAMPLE

Following are examples of each assignment operator. In each case, x = 5 and y = 2 before the statement is executed.

    x = y;          /* x = 2 */
    x += y + 1;     /* x = 8 */
    x -= y * 3;     /* x = -1 */
    x *= y + 1;     /* x = 15 */
    x /= y;         /* x = 2 */
    x %= y;         /* x = 1 */
    x <<= y;        /* x = 20 */
    x >>= y;        /* x = 1 */
    x &= y;         /* x = 0 */
    x ^= y;         /* x = 7 */
    x |= y;         /* x = 7 */
    x = y = 1;      /* x = 1, y = 1 */

_BFMT__COFF

Refer to the __STDC__ listing later in this chapter.

bit operators

Access specific bits in an object.

FORMAT

    exp1 << exp2    Left shifts (logical shift) the bits in exp1 by exp2 positions
    exp1 >> exp2    Right shifts (logical or arithmetic shift) the bits in exp1 by exp2 positions
    exp1 & exp2     Performs a bitwise AND operation
    exp1 ^ exp2     Performs a bitwise exclusive OR operation
    exp1 | exp2     Performs a bitwise inclusive OR operation
    ~exp1           Performs a bitwise negation (one's complement) operation

ARGUMENTS

    exp1    Any integer expression.
    exp2    Any integer expression.

DESCRIPTION

Domain C supports the usual six bit operators, which we group for descriptive purposes into shift operators and logical operators.

Bit Shift Operators

The << and >> operators shift an integer left or right respectively. The operands must have integer type, and all automatic promotions are performed for each operand. For example, the following program fragment

    short int to_the_left = 53, to_the_right = 53;
    short int left_shifted_result, right_shifted_result;

    left_shifted_result  = to_the_left << 2;
    right_shifted_result = to_the_right >> 2;

sets left_shifted_result to 212 and right_shifted_result to 13. The results are clearer in binary:

    base 2              base 10
    0000000000110101    53
    0000000011010100    212    /* 53 shifted left 2 bits */
    0000000000001101    13     /* 53 shifted right 2 bits */

Shifting to the left is equivalent to multiplying by powers of two.
    x << y    is equivalent to    x multiplied by 2 to the power y

Shifting non-negative integers to the right is equivalent to dividing by powers of two:

    x >> y    is equivalent to    x divided by 2 to the power y

The << operator always fills the vacated rightmost bits with zeros. If exp1 is unsigned, the >> operator fills the vacated leftmost bits with zeros. If exp1 is signed, then >> fills the leftmost bits with ones (if the sign bit is 1) and zeros (if the sign bit is 0). In other words, if exp1 is signed, the >> operator preserves its sign.

NOTE: Not all compilers preserve the sign bit when doing bit shift operations on signed integers. The K&R and ANSI standards make this behavior implementation-defined. Domain C is consistent with the PCC implementation of C.

Make sure that the right operand is not larger than the size of the object being shifted. For example, the following produces unpredictable and nonportable results because ints have fewer than 50 bits:

    10 >> 50

You will also get nonportable results if the shift count (the second operand) is a negative value.

Bit Logical Operators

The logical bitwise operators are similar to the Boolean operators, except that they operate on every bit in the operand(s). For instance, the bitwise AND operator (&) compares each bit of the left operand to the corresponding bit in the right operand. If both bits are one, a one is placed at that bit position in the result. Otherwise, a zero is placed at that bit position.

The four logical operators perform logical operations on a bit-by-bit level using the following truth tables:

    bit x     bit x     AND      Inclusive OR   Exclusive OR
    of op1    of op2    (&)      (|)            (^)
      0         0        0          0              0
      0         1        0          1              1
      1         0        0          1              1
      1         1        1          1              0

    bit x     Bitwise Complement
    of op     (~)
      0         1
      1         0

    Figure 4-5. Bitwise Operators

Table 4-4 shows some examples of the bitwise AND operator.

Table 4-4.
The Bitwise AND Operator

    Expression     Hexadecimal Value    Binary Representation
    9430           0x24D6               00100100 11010110
    5722           0x165A               00010110 01011010
    9430 & 5722    0x0452               00000100 01010010

The bitwise inclusive OR operator (|) places a 1 in the resulting value's bit position if either operand has a bit set at the position (see Table 4-5).

Table 4-5. Examples Using the Bitwise Inclusive OR Operator

    Expression     Hexadecimal Value    Binary Representation
    9430           0x24D6               00100100 11010110
    5722           0x165A               00010110 01011010
    9430 | 5722    0x36DE               00110110 11011110

The bitwise exclusive OR (XOR) operator (^) sets a bit in the resulting value's bit position if either operand (but not both) has a bit set at the position (see Table 4-6).

Table 4-6. Example Using the XOR Operator

    Expression     Hexadecimal Value    Binary Representation
    9430           0x24D6               00100100 11010110
    5722           0x165A               00010110 01011010
    9430 ^ 5722    0x328C               00110010 10001100

The bitwise complement operator (~) reverses each bit in the operand (see Table 4-7).

Table 4-7. Example Using the Bitwise Complement Operator

    Expression     Hexadecimal Value    Binary Representation
    9430           0x24d6               00100100 11010110
    ~9430          0xdb29               11011011 00101001

break

Provides an early exit from for, while, and do/while loops and from switch statements.

FORMAT

    break;

DESCRIPTION

There are times when it is convenient to be able to exit from a loop without testing a condition at the top or bottom. The break statement allows you to exit immediately from the for, while, or do/while loop that encloses it. Execution resumes at the first statement after the end of the loop.

The break statement is also used to exit from switch statements. For more information on that use of break, see switch later in this encyclopedia.

EXAMPLE

    /* Program name is "break_example". This program finds what
     * number day (out of 365) a user-supplied date is in a year.
     * Leap years are ignored.
     */
    #include <stdio.h>

    int main( void )
    {
        int i, month_num, day, tot_days;
        static int m[13] = {0, 31, 28, 31, 30, 31, 30,
                            31, 31, 30, 31, 30, 31};
        char answer = 'y';

        printf( "\n" );

        /* The program asks for a month and day and then checks to
         * see if they are valid. If not, the break statement
         * terminates the while loop. Otherwise, the number day is
         * computed and printed.
         */
        while ((answer != 'n') && (answer != 'N'))
        {
            printf( "Enter the month and day separated by a space: " );
            scanf( "%d %d", &month_num, &day );
            fflush( stdin );
            if (month_num < 1 || month_num > 12 ||
                day < 1 || day > m[month_num])
            {
                printf( "You entered an invalid date\n" );
                break;
            }   /* end if */
            tot_days = 0;
            for (i = 1; i < month_num; i++)
                tot_days += m[i];
            tot_days += day;
            printf( "The date you entered is number %d of the year.\n",
                    tot_days );
            printf( "Again? " );
            scanf( "%c", &answer );
            fflush( stdin );
        }   /* end while */
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    Enter the month and day separated by a space: 7 13
    The date you entered is number 194 of the year.
    Again? y
    Enter the month and day separated by a space: 19 24
    You entered an invalid date

cast operations

Convert a value to another data type.

FORMAT

    (data_type) exp

ARGUMENTS

    data_type    Any scalar data type including a scalar data type created
                 through a typedef statement. data_type cannot be an
                 aggregate type, but it can be a pointer to an aggregate type.
    exp          Any scalar expression.

DESCRIPTION

To cast a value means to explicitly convert it to another data type.
For example, given the following two definitions:

    int y = 5;
    float x;

the following cast operation casts the value of y to float:

    x = (float) y;   /* x now equals 5.0 */

Here are four more casts (assume that j is a scalar variable):

    i = (float) j;            /* Cast j's value to float */
    i = (char *) j;           /* Cast j's value to a pointer to a char */
    i = (int (*)()) j;        /* Cast j's value to a pointer to a function
                               * returning an int */
    i = (float) (double) j;   /* Cast j's value first to a double
                               * and then to a float */

It is important to note that if exp is a variable, a cast does not change this variable's data type; it only changes the type of the variable's value for that one expression. For instance, in the preceding casting examples, the cast does not produce any permanent effect on variable j.

There are no restrictions on casting from one scalar type to another, except that you may not cast a void object to any other type. You should be careful when casting integers to pointers. If the integer value does not represent a valid address, the results are unpredictable.

The type specifier that makes up the cast expression is called an abstract declarator. The rules for composing abstract declarators are described in Chapter 3.

Casting Integers to Other Integers

It is possible to cast one integer into an integer of a different size and to convert a floating-point value, enumeration value, or pointer to an integer. Conversions from one type of integer to another fall into five cases (A-E) as shown in Table 4-8. Each of these conversions is described in the following sections.

Table 4-8. Integer Conversions

                                     Converted Type
    Original                                unsigned  unsigned  unsigned
    Type              char   short   int    char      short     int
    char               A      B       B      D         E         E
    short              C      A       B      C         D         E
    int (long)         C      C       A      C         C         D
    unsigned char      D      B       B      A         B         B
    unsigned short     C      D       B      C         A         B
    unsigned int       C      C       D      C         C         A

CASE A: Trivial Conversions

It is legal to "convert" a value to its current type by casting it, but this conversion has no effect.
CASE B: Integer Widening

Casting an integer to a larger size is fairly straightforward. The value remains the same but the storage area is widened. The compiler preserves the sign of the original value by filling the new leftmost bits with ones if the value is negative or with zeros if the value is non-negative. When converting from an unsigned integer, the value is never negative, so the new bits are always filled with zeros. The following table illustrates this principle.

                              hex         dec
    char i =                  37          55
    (short) i          =>     0037        55
    (int) i            =>     00000037    55

    char j =                  c3          -61
    (short) j          =>     ffc3        -61
    (int) j            =>     ffffffc3    -61

    unsigned char k =         37          55
    (short) k          =>     0037        55
    (int) k            =>     00000037    55

CASE C: Casting Integers to a Smaller Type

When an int value is cast to a narrower type (short or char), the excess bits on the left are discarded. The same is true when a short is cast to a char. For instance, if an int is cast to a short, the 16 leftmost bits are truncated. The following table of values illustrates these conversions.

                                  hex         dec
    signed long int i             cf34bf1     217271281
    (signed short int)i    =>     4bf1        19441
    (signed char)i         =>     f1          -15
    (unsigned char)i       =>     f1          241

Note that if, after casting to a signed type, the leftmost bit is 1, then the number is negative. However, if you cast to an unsigned type and after the shortening the leftmost bit is 1, then that 1 is part of the value (not the sign bit).

CASE D: Casting from Signed to Unsigned, and Vice Versa

When the original type and the converted type are the same size, only the interpretation of the value changes. That is, the internal representation of the value remains the same, but the sign bit is interpreted differently by the compiler. For instance:

                              hex         dec           hex         dec
    signed int i              fffffca9    -855          0000f2a1    62113
    (unsigned int)i    =>     fffffca9    4294966441    0000f2a1    62113

The hexadecimal notation shows that the numbers are the same internally, but the decimal notation shows that the compiler interprets them differently.
CASE E: Casting Signed to Unsigned and Widening

This case is equivalent to performing two conversions in succession. First, the value is converted to the signed widened type as described in case B, and then it is converted to unsigned as described in case D. In the table below, note that the new leftmost bits are filled with ones to preserve negativeness even though the final value is unsigned.

                                   hex         dec
    signed short int i             ff55        -171
    (unsigned long int)i    =>     ffffff55    4294967125

Casting Floating-Point Values to Integers

Casting floating-point values to integers may produce useless values if an overflow condition occurs. The conversion is made simply by truncating the fractional part of the number. For example, the floating-point value 3.712 is converted to the integer 3 and the floating-point value -504.2 is converted to -504. Here are some more examples:

    float f = 3.700, f2 = -502.2, f3 = 7.35e9;

    (int)f               =>  3
    (unsigned int)f      =>  3
    (char)f              =>  3
    (int)f2              =>  -502 in decimal or fffffe0a in hex
    (unsigned int)f2     =>  4294966794 in decimal or fffffe0a in hex
    (char)f2             =>  10 in decimal or 0a in hex
    (int)f3              =>  run-time error
    (unsigned int)f3     =>  run-time error
    (char)f3             =>  run-time error

Note that converting a large float to a char produces unpredictable results if the truncated value cannot fit in one byte. If the value cannot fit in four bytes, the run-time system issues an overflow error.

Casting Pointers to Integers

Pointers are treated like unsigned ints and obey the same conversion rules.

Casting Enumerated Values to Integers

When you cast an enumerated expression, the conversion goes through two steps. First, the enumerated value is converted to an int and then the int is converted to the final target data type. Note that the sign is preserved during these conversions.

Casting Double to Float and Vice Versa

When you cast a float up to a double, the system extends the number's precision without changing its true value.
However, when you cast a double down to a float, the system shrinks the number's precision and this shrinking may change the number's value due to rounding. The rounding generally occurs on the sixth or seventh decimal digit. Also, when you cast down from double to float, you run the risk of causing a run-time overflow error caused by a double that is too big or too small to fit within the confines of a float.

Casting Pointers to Pointers

You may cast a pointer of one type to a pointer to any other type. For example:

    int *int_p;
    float *float_p;
    struct S *str_p;
    extern foo( struct T * );

    int_p = (int *) float_p;
    float_p = (float *) str_p;
    foo( (struct T *) str_p );

The cast is required whenever you assign a pointer value to a pointer variable that has a different base type, and when you pass a pointer value as a parameter to a function that has been prototyped with a different pointer type. The only exception to this rule concerns generic pointers (pointers to void). You may assign any pointer value to a generic pointer without casting. See Chapter 3 for more information about generic pointers.

comma operator

Separates two expressions and returns the value of the latter.

FORMAT

    exp1, exp2

ARGUMENTS

    exp1    Any expression.
    exp2    Any expression.

DESCRIPTION

Use the comma operator to separate two expressions that are to be evaluated one right after the other. The comma operator is popular within for loops, as demonstrated by the following example:

    for (i = 10, j = 4; (i * j) < n; i++, j++);

In the preceding example, the comma operator allows you to initialize both i and j at the beginning of the loop. The comma operator also allows you to increment i and j together.

Note that all expressions return values. (See the "expressions" listing in this chapter for details.) When using a comma operator, the expression returns the value of the rightmost expression.
For example, the following statement sets variable j to 2:

    j = (x = 1, y = 2);

Note, however, that assignments such as these are considered poor programming style. You should confine use of the comma operator to for loops.

conditional expression operator

Alternative to if...else statement constructions.

FORMAT

    exp1 ? exp2 : exp3

ARGUMENTS

    exp1    Any expression.
    exp2    Any expression.
    exp3    Any expression.

DESCRIPTION

The conditional expression construction provides a shorthand way of coding an if...else condition. The syntax described above is equivalent to:

    if (exp1)
        exp2;
    else
        exp3;

When a conditional expression is executed, exp1 is evaluated first. If it is true (that is, nonzero), exp2 is evaluated and its result is the value of the conditional expression. If exp1 is false, exp3 is evaluated and its result is the value of the conditional expression.

There is no requirement that you put parentheses around the exp1 portion of the conditional expression, but doing so will improve your code's readability.

Both exp2 and exp3 must be assignment-compatible. If exp2 and exp3 are pointers to different types, then the compiler issues a warning.

The value of a conditional expression is either exp2 or exp3, whichever is selected. Note that the other expression is not evaluated. The type of the result is the type that would be produced if exp2 and exp3 were mixed in an expression. For instance, if exp2 is a char and exp3 is a double, the result type will be double regardless of whether exp2 or exp3 is selected.

EXAMPLE

    /* Program name is "conditional_exp_op_example".
     * This program reads four user-input numbers, adds
     * them together and prints the total. It then uses
     * the conditional expression to determine whether the
     * user wants to continue. If the answer is 'y'
     * or 'Y', a value of 1 (true) is assigned to again. If
     * the answer is anything else, a value of 0 (false) is assigned.
     */
    #include <stdio.h>

    int main( void )
    {
        int a, b, c, d, again, total;
        char answer;

        printf( "\n" );
        again = 1;
        while (again)
        {
            total = 0;
            printf( "Enter four numbers -- separated by spaces -- that\
     you want added together: " );
            scanf( "%d %d %d %d", &a, &b, &c, &d );
            fflush( stdin );
            total = a + b + c + d;
            printf( "\nThe total is: %d\n", total );
            printf( "Do you want to continue? " );
            scanf( "%c", &answer );
            again = (answer == 'y' || answer == 'Y') ? 1 : 0;
        }   /* end while */
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    Enter four numbers -- separated by spaces -- that you want added together: 20 30 40 50

    The total is: 140
    Do you want to continue? y
    Enter four numbers -- separated by spaces -- that you want added together: 1 2 3 4

    The total is: 10
    Do you want to continue? n

continue

Causes the next iteration of the enclosing for, while, or do/while loop to begin immediately.

FORMAT

    continue;

DESCRIPTION

The continue statement halts the current iteration of its enclosing for, while, or do/while loop and skips to the next iteration of the loop. In the while and do/while, this means the expression is tested immediately, and in the for loop, the third expression (if present) is evaluated.

EXAMPLE

    /* Program name is "continue_example". This program reads a
     * file of student names and test scores and computes each
     * student's average grade. However, the instructor has decided
     * to drop the score from the third test because she discovered
     * someone had found and distributed the answer sheet. So the
     * for loop includes a continue statement that tells it to read
     * over this test's score, excluding it from the averaging
     * calculations.
     */
    #include <stdio.h>

    int main( void )
    {
        int test_score, tot_score, i;
        float average;
        FILE *fp;
        char fname[10], lname[15];

        fp = fopen( "grades_data", "r" );
        printf( "\n\n" );
        while (!feof( fp ))   /* while not end of file */
        {
            tot_score = 0;
            fscanf( fp, "%s %s", fname, lname );
            printf( "\nStudent's name: %s %s\nGrades: ", fname, lname );
            for (i = 0; i < 5; i++)
            {
                fscanf( fp, "%d", &test_score );
                printf( "%d ", test_score );
                if (i == 2)   /* leave out this test score */
                    continue;
                tot_score += test_score;
            }   /* end for i */
            fscanf( fp, "\n" );   /* read end-of-line at end of */
                                  /* each student's data */
            average = tot_score/4.0;
            printf( "\nAverage test score: %4.1f\n", average );
        }   /* end while */
        fclose( fp );
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    Student's name: Barry Quigley
    Grades: 85 91 88 100 75
    Average test score: 87.8

    Student's name: Pepper Rosenberg
    Grades: 91 76 88 92 88
    Average test score: 86.8

    Student's name: Sue Connell
    Grades: 95 93 91 92 89
    Average test score: 92.3

__DATE__ and __TIME__ (predefined symbols)

Expands to the date and time of compilation.

FORMAT

Note that there are two underscores before and two underscores after each of these preprocessor symbols.

    __DATE__
    __TIME__

DESCRIPTION

The preprocessor recognizes these special predefined symbols and replaces their occurrences with the following:

    __DATE__    Expands to a string representing the date of program compilation.
    __TIME__    Expands to a string representing the time of program compilation.

EXAMPLE

The __DATE__ and __TIME__ macros are useful for recording the date and time a file was last compiled. For instance:

    /* Program name is "date_and_time_example". */
    #include <stdio.h>

    void print_version( void )
    {
        printf( "This utility last compiled on %s at %s\n",
                __DATE__, __TIME__ );
    }

    int main( void )
    {
        print_version();
    }

USING THIS EXAMPLE

If we execute this program,
we get the following output:

    $ date_and_time_example.bin
    This utility last compiled on Nov 16 1987 at 17:34:12


#debug (preprocessor directive)

Marks source code for conditional compilation. (Domain Extension)

FORMAT

ARGUMENT

Any line of source code.

DESCRIPTION

Domain C provides the #debug preprocessor control line, which marks source code for conditional compilation. If you compile with the -cond compiler option (explained in Chapter 6), lines prefixed with #debug are compiled. If you compile with the -ncond switch (which is the default), then lines prefixed by #debug are ignored. (Note that -cond and -ncond are /com/cc options; they are not available with /bin/cc.)

In general, you should use the conditional compilation preprocessor directives rather than #debug, since the former are portable and the latter is not. (See the "#if" listing of this encyclopedia for information on the conditional compilation directives.)

EXAMPLE

    /*  Program name is "debug_preprocessor_cmd".  Use this
     *  program to experiment with the -cond and -ncond
     *  compiler options
     */
    #include <stdio.h>

    int main( void )
    {
        char a_letter;

        printf( "Enter a letter -- " );
        scanf( "%c", &a_letter );
    #debug printf( "Echo the input -- %c\t%d\n", a_letter, a_letter );
    }

USING THIS EXAMPLE

If we compile with the -ncond switch (or without the -cond switch), we get the following results:

    Enter a letter -- r

If we compile with the -cond switch, we get these results instead:

    Enter a letter -- r
    Echo the input -- r       114


default

Refer to switch later in this encyclopedia.


#define and #undef (preprocessor directives)

Defines and undefines program constants and macros.

FORMAT

    #define macro_name macro_body                     Define constants
    #define macro_name( arg [{,arg}] ) macro_body     Define macros
    #undef macro_name                                 Undefine constants and macros

ARGUMENTS

macro_name    An identifier.

arg           An identifier.

macro_body    Any group of tokens.
If the macro_body is to span more than one line, you must place a backslash (\) at the end of each line that is continued (just as you would for a long string).

DESCRIPTION

A macro is a name that has an associated text string, called the macro body. By convention, macro names that represent constants consist of uppercase letters only. This makes it easy to distinguish macro names from variable names, which are generally composed of lowercase characters. In the following example, BIG_BUFF is the macro name and 512 is the macro body.

    #define BIG_BUFF 512

When a macro name appears outside of its definition (referred to as an invocation), it is replaced with its macro body. The act of text replacement is referred to as macro expansion. For example, having defined BIG_BUFF, you might write:

    char buf[BIG_BUFF];

During the preprocessing stage, this line of code would be translated into:

    char buf[512];

The simplest and most common use of macros is to represent numeric constant values, as in the case of BIG_BUFF.

There is another form of macro that is similar to a C function in that it takes arguments that can be used in the macro body. The syntax for this type of macro is shown in Figure 4-6. For example, you could write:

    #define MUL_BY_TWO(a) ((a) + (a))

Then you can use MUL_BY_TWO in your program just as you would use a function. For example, the macro invocation,

    j = MUL_BY_TWO(5);

is translated by the preprocessor into:

    j = ((5) + (5));

The actual argument 5 is substituted for the formal argument a wherever it appears in the macro body. The parentheses around a and around the macro body are necessary to ensure correct binding when the macro is expanded.

Note that macro arguments are not variables: they have no type, and no storage is allocated for them. Consequently, macro arguments do not conflict with variables that have the same name.
The following, for example, is perfectly legal:

    j = MUL_BY_TWO(a-1);

which, after expansion, becomes:

    j = ((a-1) + (a-1));

[Figure 4-6. Syntax of a Function-Like Macro]

This programming error will actually go unnoticed by the compiler, which will interpret the second semicolon as a null statement. The following, however, will cause a compile-time parsing error:

    int array[SIZE];

What makes this bug so difficult to find is that the line on which the error is reported looks perfectly legal.

The most pernicious example of this type of bug occurs when the resulting syntax, after replacement, is legal but is semantically different from what was intended, as when a macro whose body ends in (var == 1); is used as the condition of a while loop that calls foo(). The semicolon after (var == 1) is interpreted as a null statement and, more importantly, as the body of the while loop. As a result, the call to foo() is not part of the while body. If var equals one, you will get an infinite loop.

Domain C supports the -es option (with /com/cc) and the -E option (with /bin/cc) that let you execute just the preprocessor. This makes it much easier to find this type of bug, because you can inspect the source code after all of the macros have been expanded.

Bug Alert: Binding of Macro Arguments

A potential problem with macros is that argument expressions that are not carefully parenthesized can produce erroneous results due to operator precedence and binding. Consider the following macro:

    #define square( a ) a * a

square has the advantage that it will work regardless of the argument data types.
However, watch what happens when we pass it an arithmetic expression:

    j = 2 * square( 3 + 4 );

expands to:

    j = 2 * 3 + 4 * 3 + 4;

Because of operator precedence, the compiler interprets this expression as:

    j = (2 * 3) + (4 * 3) + 4;

which assigns the value of 22 to j, instead of 98. To avoid this problem, you should always enclose the macro body and macro arguments in parentheses:

    #define square( a ) ((a) * (a))

Now, the macro invocation expands to:

    j = 2 * ((3 + 4) * (3 + 4));

which produces the correct result.

No Type Checking for Macro Arguments

From an operational point of view, the macro MUL_BY_TWO may seem identical to the following function:

    int mul_by_two( a )
    int a;
    {
        return a+a;
    }

However, there is one significant difference: there is no type checking for macros. In the function version of mul_by_two, you must pass an integral value, and the function must return an int. In the macro version, you can substitute any type of value for a.

Suppose, for example, that f is a float variable. If you write,

    f = MUL_BY_TWO(2.5);

the preprocessor translates it into:

    f = ((2.5) + (2.5));

which assigns the value 5.0 to f. In contrast, if you write,

    f = mul_by_two(2.5);

the compiler takes one of two actions, depending on whether function prototypes are being used. In the presence of prototyping, the compiler converts 2.5 into an int, giving it a value of 2; adds two and two together, and returns 4 instead of 5.0. Without function prototypes, the compiler passes a double-precision 2.5 to the function, which interprets it as an int. This produces unpredictable results.

The lack of type checking for macro arguments can be a powerful feature if used with care. Consider the following macro, which returns the lesser of two arguments:

    #define MIN( a, b ) ((a) < (b) ? (a) : (b))

Note that this works regardless of whether a and b are integers or floating-point values. It is extremely difficult to write an equivalent function that works for all data types.
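The type behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions: the routine show_macro_types is a hypothetical name introduced here, and mul_by_two is written with the ANSI prototype syntax so that the conversion case (rather than the unpredictable non-prototype case) is the one shown.

```c
#include <stdio.h>

#define MUL_BY_TWO( a )  ((a) + (a))
#define MIN( a, b )      ((a) < (b) ? (a) : (b))

/* Function version, declared with the ANSI prototype syntax. */
int mul_by_two( int a )
{
    return a + a;
}

/* Hypothetical demonstration routine (not from the manual). */
void show_macro_types( void )
{
    double d = 2.5;

    printf( "%.1f\n", MUL_BY_TWO( d ) );   /* prints 5.0: the macro keeps the double    */
    printf( "%d\n",   mul_by_two( d ) );   /* prints 4: d is converted to the int 2     */
    printf( "%d\n",   MIN( 3, 7 ) );       /* prints 3: integer arguments               */
    printf( "%.2f\n", MIN( 2.5, 1.25 ) );  /* prints 1.25: floating-point arguments     */
}
```

Calling show_macro_types() from main prints the four values noted in the comments. The same MIN definition serves both the integer and the floating-point call, which is the flexibility the text describes.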
Another difference between macros and functions is that the preprocessor checks to make sure that the number of arguments in the definition is the same as the number of arguments in the invocation. The C compiler only does this type of checking for functions if you use the ANSI prototyping syntax in the function declaration. For example, a statement that invokes MUL_BY_TWO with the wrong number of arguments would produce a compile-time error. The analogous call to mul_by_two would produce a compile-time error only if the function is declared with the ANSI prototyping syntax. Otherwise, the statement would compile without errors, but would produce unpredictable results when executed.

Bug Alert: Using = to Define a Macro

A common mistake made in defining macros is to use the assignment operator as if you were initializing a variable. Instead of writing,

    #define MAX 100

you write:

    #define MAX = 100

This type of mistake can lead to obscure bugs. For example, the expression,

    for (j=MAX; j > 0; j--)

would expand to:

    for (j== 100; j > 0; j--)

Suddenly, the assignment is turned into a relational expression. The expression is legal, so the compiler will not complain, making the error difficult to track down.

Bug Alert: Space Between Left Parenthesis and Macro Name

Note in Figure 4-6 that the left parenthesis must come immediately after the macro name, without any intervening spaces. Insertion of a space usually results in a compile-time error, but occasionally obscure bugs can result. Consider the following macro:

    #define neg_a_plus_f(a) -(a) + f

The expression,

    j = neg_a_plus_f(x);

expands to:

    j = -(x) + f;

But watch what happens if we accidentally insert a space between the left parenthesis and the macro name in the definition:

    #define neg_a_plus_f (a) -(a) + f

Now, the expression expands to:

    j = (a) -(a) + f(x);

If a is a variable name and f is a function name, this will look like a perfectly legal expression to the compiler.

Macros vs. Functions

Macros and functions are similar in that they both enable a set of operations to be represented by a single name. Sometimes it is difficult to decide whether to implement an operation as a macro or as a function.

In general, macros execute more quickly than functions because there is none of the function overhead involved in copying arguments and maintaining stack frames. When trying to speed up slow programs, therefore, you should be on the lookout for small, heavily used functions that can be implemented as macros. Converting functions to macros, however, will have a noticeable impact on execution speed only if the function is called frequently.

Using macros can also have a significant impact on code size. The resulting executable object will probably be larger if you use macros, unless the equivalent function requires a lot of overhead.

The following lists summarize the advantages and disadvantages of macros compared to functions.

Advantages of Macros

• Macros are usually faster than functions since they avoid the function call overhead.

• The number of macro arguments is checked to match the definition. (The Domain C compiler also does this for functions if you use the new ANSI prototyping syntax.)

• No type restriction is placed on arguments, so one macro may serve for several data types.

Disadvantages of Macros

• Macro arguments are re-evaluated at each mention in the macro body, which can lead to unexpected behavior if an argument contains side effects.

• Function bodies are compiled once, so multiple calls to the same function can share the same code without repeating it each time. Macros, on the other hand, are expanded each time they appear in a program. As a result, a program with many large macros may be longer than a program that uses functions in place of the macros.

• Though macros check the number of arguments, they don't check the argument types. ANSI function prototypes check both the number of arguments and the argument types.
• It is more difficult to debug programs that contain macros because the source code goes through an additional layer of translation, making the object code even further removed from the source code.

Bug Alert: Side Effects in Macro Arguments

A potential hazard of macros involves side effect operators in argument expressions. Suppose, for instance, that we invoke the MIN macro as follows:

    a = MIN( b++, c );

The preprocessor translates this into:

    a = ((b++) < (c) ? (b++) : (c));

If b is less than c, it gets incremented twice, which is obviously not what is intended. To be on the safe side, you should never use a side effect operator in a macro invocation. Side effect operators include the increment and decrement operators, the assignment operators, and function invocations.

Removing a Macro Definition

Once defined, a macro name retains its meaning until the end of the source file, or until it is explicitly removed with an #undef directive. The most typical use of #undef is to remove a definition so you can redefine it.

According to the ANSI standard and most existing C compilers, it is illegal to redefine a macro without an intervening #undef statement, unless the two definitions are the same. This is a useful rule because it enables you to define the same macro in different header files. If you include multiple header files (and hence, multiple definitions of the same macro), your compiler will complain only if the definitions conflict.


do/while

Executes the statements within a loop until a specified condition is satisfied.

FORMAT

    do
        statement;
    while (exp);

ARGUMENTS

statement    A null statement, simple statement, or compound statement.

exp          Any expression.

DESCRIPTION

This is one of the three looping constructions available in C.
Unlike the for and while loops, however, the do/while performs statement first and then tests exp. If exp evaluates to nonzero (true), statement is executed again, but when exp evaluates to zero (false), execution of the loop stops. This type of loop is always executed at least once.

Two ways to jump out of a do/while loop prematurely (that is, before exp becomes false) are the following:

• Use break to transfer control to the first statement following the do/while loop.

• Use goto to transfer control to some labeled statement outside the loop.

EXAMPLE

    /*  Program name is "do.while_example".  This program finds the
     *  summation of an integer that a user supplies, and the summation
     *  of the squares of that integer.  The use of the do/while means
     *  that the code inside the loop is always executed at least once.
     */
    #include <stdio.h>

    int main( void )
    {
        int num, sum, square_sum;
        char answer;

        printf( "\n" );
        do
        {
            printf( "Enter an integer: " );
            scanf( "%d", &num );
            sum = (num*(num+1))/2;
            square_sum = (num*(num+1)*(2*num+1))/6;
            printf( "The summation of %d is: %d\n", num, sum );
            printf( "The summation of its squares is: %d\n",
                    square_sum );
            printf( "\nAgain? " );
            fflush( stdin );
            scanf( "%c", &answer );
        } while ((answer != 'n') && (answer != 'N'));
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    Enter an integer: 10
    The summation of 10 is: 55
    The summation of its squares is: 385

    Again? y
    Enter an integer: 25
    The summation of 25 is: 325
    The summation of its squares is: 5525

    Again? n


#eject (preprocessor directive)

Inserts a page break into the listing file. (Domain Extension)

FORMAT

    #eject

DESCRIPTION

Domain C supports the #eject directive, which inserts a page break (formfeed) into the listing file. The statement that follows the #eject command is output at the top of a new page. The #eject directive does not affect the object file in any way.


else

Refer to if later in this encyclopedia.
#else

Refer to #if later in this encyclopedia.


#endif

Refer to #if later in this encyclopedia.


enum operations

Operations that can be performed on enums.

DESCRIPTION

Chapter 3 explains how to define enumerated variables. Here, we explain how to use enumerated variables in the action part of your program.

In conformance with the ANSI standard, Domain C allows you to use enums where integers may be used. However, we recommend that you use enums only in the following situations:

• Assign an enumerated value to an enumerated variable.

• Compare an enumerated value or variable to another enumerated value or variable.

• Use an enumerated variable as an array subscript.

• Use an enumerated variable in switch control expressions, and use enumerated values in switch case labels.

• Pass an enumerated variable to a function or return an enumerated value from a function.

For example, here is a program fragment that shows some of the possible uses of enumerated variables:

    enum fruits {mango, apple, lemon, orange} tasty_fruits;

    tasty_fruits = mango;       /* assign enum value to an enum var. */

    if (tasty_fruits > apple)   /* compare enum var to enum value */
        printf( "A tart fruit.\n" );

    switch (tasty_fruits)       /* use enum var in a switch statement */
    {
        case apple  : printf( "Grown in temperate climates.\n" );
                      break;
        case mango  :
        case lemon  :
        case orange : printf( "Grown in tropical or semi-tropical \
    regions.\n" );
                      break;
    }


expressions

Combinations of operators and operands that evaluate to a single value.

DESCRIPTION

An expression consists of one or more operands and zero or more operators linked together to compute a value. For instance, a + 2 is a legal expression that results in the sum of a and 2. The variable a all by itself is also an expression, as is the constant 2, since they both represent a value. There are four important types of expressions:

• Constant expressions contain only constant values.
  For example, the following are all constant expressions:

      5
      5 + 6 * 13 / 3.0
      'a'

• Integral expressions are expressions that, after all automatic and explicit type conversions, produce a result that has one of the integer types. If j and k are integers, the following are all integral expressions:

      j
      j * k
      j / k + 3
      k - 'a'
      3 + (int) 5.0

• Float expressions are expressions that, after all automatic and explicit type conversions, produce a result that has one of the floating-point types. If x is a float or double, the following are floating-point expressions:

      x
      x + 3
      x / y * 5
      3.0
      3.0 - 2
      3 + (float) 4

• Pointer expressions are expressions that evaluate to an address value. These include expressions containing pointer variables, the address-of operator (&), string literals, and array names. If p is a pointer and j is an int, the following are pointer expressions:

      p
      &j
      p + 1
      "abc"
      (char *) 0x000fffff

All Expressions Have Values

One of the interesting features of C is that all expressions produce a value, called a byproduct value, as they are evaluated at run time. For many expressions, you won't know or care what this byproduct is. In some expressions, though, you can exploit this feature to write more compact code. Let us now look at a few examples.

Example 1

First, consider the following simple expression statement:

    x = 6;

The byproduct value of all assignment expressions is the value that gets assigned, which in this case is 6. However, we do not use this byproduct value in any way. The following example does use this byproduct value:

    y = x = 6;

The assignment operator (=) binds from right to left; therefore, C first evaluates the expression x = 6. The byproduct of this operation is 6, so C sees the second operation as

    y = 6

Example 2

Now, let us consider the following relational operator expression:

    (10 < j < 20)

It is certainly tempting to use an expression like the preceding to find out whether j is between 10 and 20. However, it won't work.
Since the relational operators bind from left to right, C first evaluates

    10 < j

Note that the byproduct of a relational operation is 0 if the comparison is false and 1 if the comparison is true. Pretend that j equals 5. Therefore, the expression 10 < j is false, and the byproduct is 0. Thus, the next expression that C evaluates is

    0 < 20

which evaluates to true (or 1), which is the wrong answer.

Example 3

Finally, consider the following fragment:

    static char a_char, c[20] = {"Valerie"}, *pc = c;

    while (a_char = *pc++)
    {
        . . .
    }

This while statement uses C's ability to both assign and test a value. Every iteration of while assigns a new value to variable a_char. The byproduct of an assignment is equal to the value that gets assigned. The byproduct value will remain nonzero until the end of the string is reached. When that happens, the byproduct value will become zero (false), and the while loop will end.


__FILE__

Refer to the __LINE__ listing later in this encyclopedia.


for

Executes the statement(s) within a loop as long as exp2 is true.

FORMAT

    for ( [exp1]; [exp2]; [exp3] )
        statement;

ARGUMENTS

exp1         An optional element of the command. It can be any expression, although it usually is some sort of assignment statement. exp1 is evaluated only once, before the first iteration of the loop.

exp2         An optional element of the command. It can be any expression, but is usually a relational expression. If omitted, exp2 is taken as being permanently true.

exp3         An optional element of the command. It can be any expression, but it usually serves as the iteration instructions for the loop. It is evaluated each time after statement has been executed.

statement    Can be a null statement, simple statement, or compound statement.

DESCRIPTION

This is one of the three looping constructions available in C. The other two are while and do/while. The for statement operates as follows:

1. First, exp1 is evaluated.
   This is usually an assignment expression that initializes one or more variables.

2. Then exp2 is evaluated. This is the conditional part of the statement.

3. If exp2 is false, program control exits the for statement and flows to the next statement in the program. If exp2 is true, statement is executed.

4. After statement is executed, exp3 is evaluated. Then the statement loops back to test exp2 again.

Note that exp1 is evaluated only once, whereas exp2 and exp3 are evaluated on each iteration. The operation of a for loop is shown pictorially in Figure 4-7.

[Figure 4-7. How a for Loop Is Executed]

Note that for loops can be written as while loops, and vice versa. For example, the for loop

    for (j = 0; j < 10; j++)
    {
        do_something();
    }

is the same as the following while loop:

    j = 0;
    while (j < 10)
    {
        do_something();
        j++;
    }

The for loop is used most commonly in situations when a variable has to be initialized and then updated on each iteration. Most loops have this kind of construction:

    for ( initialize_loop_variable; finished?; change_loop_variable)
        instructions;

where change_loop_variable can increment or decrement the loop variable, depending on what you want. And unlike some programming languages, which restrict you to changing the loop variable by +1 or -1 only, C lets you change the loop variable by any amount. If, for example, you want to make some change to just the even-numbered members of an array, you can write:

    for (i = 0; i < ARRAY_SIZE; i += 2)
        /* instructions */;

Any of the three expressions, or even the statement, can be omitted from a for loop, but the semicolons must appear. It is permissible, for example, to do all the work in the exp part of the loop and just have a semicolon appear in the statement section. This is convenient if you are scanning a fixed-length array to determine the length of the string stored in it.
The following for loop scans backward from the array's maximum size, reading over any blanks, end-of-line characters, or nulls, until it finds an alphanumeric character:

    for (i = ARRAY_SIZE-1; a[i] == ' ' || a[i] == '\n' ||
         a[i] == '\0'; i--)
        ;     /* null statement */

C also provides a way to combine several for loops into one. You can use the comma operator (,) to string together expressions. If you want to process two indexes in parallel operations, separate them with commas. For example:

    for (i = 0, j = 10; i < j; i++, j--)
        /* statement */;

The above loop initializes i to zero and j to 10 and loops through, incrementing i and decrementing j, until i equals j.

The following describes two ways to jump out of a for loop prematurely (that is, before exp2 becomes false):

• Use break to transfer control to the first statement following the for loop.

• Use goto to transfer control to some labeled statement outside the loop.

EXAMPLE

    /*  Program name is "for_example".  The following computes a
     *  permutation -- that is, P(n,m) = n!/(n-m)! -- using for
     *  loops to compute n! and (n-m)!
     */
    #include <stdio.h>
    #define SIZE 10

    int main( void )
    {
        int n, m, n_total, m_total, perm, i, j, mid, count;

        printf( "Enter the numbers for the permutation (n things " );
        printf( "taken m at a time)\nseparated by a space: " );
        scanf( "%d %d", &n, &m );
        n_total = m_total = 1;
        for (i = n; i > 0; i--)          /* compute n! */
            n_total *= i;
        for (i = n - m; i > 0; i--)      /* compute (n-m)! */
            m_total *= i;
        perm = n_total/m_total;
        printf( "P(%d,%d) = %d\n\n", n, m, perm );

    /*  This series of for loops prints a pattern of "Z's" and shows
     *  how loops can be nested and how you can either increment or
     *  decrement your loop variable.  The loops also show the proper
     *  placement of curly braces to indicate that the outer loops
     *  have multiple statements.
     */
        printf( "Now, print the pattern three times:\n\n" );
        mid = SIZE/2;

        /* controls how many times pattern is printed */
        for (count = 0; count < 3; count++)
        {
            for (j = 0; j < mid; j++)
            {
                /* loop for printing an individual line */
                for (i = 0; i < SIZE; i++)
                    if (i < mid - j || i > mid + j)
                        printf( " " );
                    else
                        printf( "Z" );
                printf( "\n" );
            }
            for (j = mid; j >= 0; j--)
            {
                for (i = 0; i <= SIZE; i++)
                    if (i < mid - j || i > mid + j)
                        printf( " " );
                    else
                        printf( "Z" );
                printf( "\n" );
            }
        }
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    Enter the numbers for the permutation (n things taken m at a time)
    separated by a space: 4 3
    P(4,3) = 24

    Now, print the pattern three times:

         Z
        ZZZ
       ZZZZZ
      ZZZZZZZ
     ZZZZZZZZZ
    ZZZZZZZZZZZ
     ZZZZZZZZZ
      ZZZZZZZ
       ZZZZZ
        ZZZ
         Z
         Z
        ZZZ
       ZZZZZ
      ZZZZZZZ
     ZZZZZZZZZ
    ZZZZZZZZZZZ
     ZZZZZZZZZ
      ZZZZZZZ
       ZZZZZ
        ZZZ
         Z
         Z
        ZZZ
       ZZZZZ
      ZZZZZZZ
     ZZZZZZZZZ
    ZZZZZZZZZZZ
     ZZZZZZZZZ
      ZZZZZZZ
       ZZZZZ
        ZZZ
         Z


goto

Unconditionally jumps to a specified label.

FORMAT

    goto label;

ARGUMENTS

label    This is the label to which you want the goto to jump.

DESCRIPTION

Few programming statements have produced as much debate as the goto statement. The goto statement is necessary in more rudimentary languages, but its use in high-level languages is generally frowned upon. Nevertheless, most high-level programming languages, including C, contain a goto statement for those rare situations where it can't be avoided.

The purpose of the goto statement is to enable program control to jump to some other spot. The destination spot is identified by a statement label, which is just a name followed by a colon. The label must be in the same function as the goto statement that references it.

With deeply nested logic there are times when it is cleaner and simpler to bail out with one goto rather than backing out of the nested statements. The most common and accepted use for a goto is to handle an extraordinary error condition.
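As a rough sketch of that error-handling use, the fragment below leaves two nested for loops with a single goto when it meets an invalid value. The function name sum_readings, the ROWS and COLS sizes, and the rule that negative values are invalid are all assumptions made for this illustration; they do not come from the manual.

```c
#include <stdio.h>

#define ROWS 3   /* hypothetical table dimensions */
#define COLS 4

/* Hypothetical helper: adds up a table of readings.  Returns the
 * total, or -1 if any reading is negative (treated here as an
 * extraordinary error condition). */
long sum_readings( int table[ROWS][COLS] )
{
    long total = 0;
    int  r, c;

    for (r = 0; r < ROWS; r++)
        for (c = 0; c < COLS; c++)
        {
            if (table[r][c] < 0)
                goto bad_reading;      /* leave both loops at once */
            total += table[r][c];
        }
    return total;

bad_reading:                           /* label is in the same function */
    fprintf( stderr, "Invalid reading at row %d, column %d\n", r, c );
    return -1;
}
```

A break statement would leave only the inner loop, so the same exit without goto would need an extra flag tested by the outer loop. The label and the goto are in the same function, as the rule above requires.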
The following sample program shows a goto that easily could be avoided through the use of a while loop, and also shows what an illegal goto looks like.

EXAMPLE

    /*  Program name is "goto_example".  This program finds the
     *  circumference and area of a circle when the user gives
     *  the circle's radius.
     */
    #include <stdio.h>
    #define PI 3.14159

    int main( void )
    {
        float cir, radius, area;
        char answer;
        extern void something_different( void );

    circles:
        printf( "Enter the circle's radius: " );
        scanf( "%f", &radius );
        cir = 2 * PI * radius;
        area = PI * (radius * radius);
        printf( "The circle's circumference is: %6.3f\n", cir );
        printf( "Its area is: %6.3f\n", area );
        printf( "\nAgain? y or n: " );
        fflush( stdin );
        scanf( "%c", &answer );
        if (answer == 'y' || answer == 'Y')
            goto circles;
        else
        {
            printf( "Do you want to try something different? " );
            fflush( stdin );
            scanf( "%c", &answer );
            if (answer == 'y' || answer == 'Y')
                /* goto different;     WRONG!  This label is in */
                /*                     another function.        */
                something_different();
        }  /* end else */
    }

    void something_different( void )
    {
    different:
        printf( "Hello.  This is something different.\n" );
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    Enter the circle's radius: 3.5
    The circle's circumference is: 21.991
    Its area is: 38.484

    Again? y or n: y
    Enter the circle's radius: 6.1
    The circle's circumference is: 38.327
    Its area is: 116.899

    Again? y or n: n
    Do you want to try something different? y
    Hello.  This is something different.


if

Tests one or more conditions and executes one or more statements according to the outcome of the tests.

FORMAT

    if (exp)               /* format 1 */
        statement

    if (exp)               /* format 2 */
        statement1
    else
        statement2

ARGUMENTS

exp          Any expression.

statement    Any null, simple, or compound statement. Note that a statement can itself be another if statement. Remember, a statement ends with a semicolon.
DESCRIPTION

The if and switch statements are the two conditional branching statements in C. The if statement can take either of the two forms shown in the Format section. In the first form, if exp evaluates to true (any nonzero value), C executes statement, while if exp is false (evaluates to zero), C simply falls through to the next line in the program. In the second form, if exp evaluates to true, C executes statement1, but if exp is false, statement2 is performed.

Note that a statement can itself be an if or if/else statement. Therefore, you can test multiple conditions with a command that looks like this:

    if (exp1)              /* multiple conditions */
        statement1
    else if (exp2)
        statement2
    else if (exp3)
        statement3
      .
      .
      .
    else
        statementN

The important thing to remember is that C executes at most one statement in the if...else and if...else/if...else constructions. Several expressions may indeed be true, but only the statement associated with the first true expression is executed. The system does not even look at subsequent expressions. For example:

    /* determine reason the South lost the Civil War */
    if (less_money)
        printf( "It had less money than the North.\n" );
    else if (fewer_supplies)
        printf( "It had fewer supplies than the North.\n" );
    else if (fewer_soldiers)
        printf( "It had fewer soldiers.\n" );
    else
    {
        printf( "Its agrarian society couldn't compete with the " );
        printf( "North's industrial one.\n" );
    }

All the expressions in the above code fragment could be evaluated to true, but the run-time system would only get as far as the first line and never even test the remaining expressions.

If you use a compound statement in one of the if constructions, remember to use the curly braces to indicate where the statement begins and ends. For example:

    if (x > y)
    {
        temp = x;
        x = y;
        y = temp;
    }
    else
        /* make next comparison */

Braces also are important when you nest if statements.
Since the else portion of the statement is optional, you may not have one for an inner if. However, C associates an else with the closest previous if unless you use braces to show that isn't what you want. For example:

    if (month == 12)
    {
        if (day == 25)
            printf( "Today is Christmas.\n" );
    }
    else                               /* month = November */
        printf( "It's not even December.\n" );

Without the braces, the else would be associated with the inner if statement, and so the no-December message would be printed for any day in December except December 25. Nothing would be printed if month did not equal 12.

Bug Alert: The Dangling else

Nested if statements create the problem of matching each else phrase to the right if statement. This is often called the dangling else problem. The general rule is:

    An else is always associated with the nearest previous if.

Each if statement, however, can have only one else phrase. It is important to format nested ifs correctly to avoid confusion. An else phrase should always be at the same indentation level as its associated if. However, don't be misled by indentations that look right even though the syntax is incorrect.

EXAMPLE

    /* Program name is "if.else_example". */
    #include <stdio.h>

    int main( void )
    {
        int age, of_age;
        char answer;

    /*  This if statement is an example of the second form (see
     *  "Description" section).
     */
        printf( "\nEnter an age: " );
        scanf( "%d", &age );
        if (age > 17)
            printf( "You're an adult.\n" );
        else
        {
            of_age = 18 - age;
            printf( "You have %d years before you're an adult.\n",
                    of_age );
        }  /* end else */

        printf( "\n" );
        printf( "This part will help you decide whether to jog \
    today.\n" );
        printf( "What is the weather like?\n" );
        printf( "raining = r\n" );
        printf( "cold = c\n" );
        printf( "muggy = m\n" );
        printf( "hot = h\n" );
        printf( "nice = n\n" );
        printf( "Enter one of the choices: " );
        fflush( stdin );
        scanf( "\n%c", &answer );

    /*  This illustrates the common "else if" idiom  */
        if (answer == 'r')
            printf( "It's too wet to jog today.  Don't bother.\n" );
        else if (answer == 'c')
            printf( "You'll freeze if you jog today.  Stay indoors.\n" );
        else if (answer == 'm')
            printf( "It's no fun to run in high humidity.  Skip it.\n" );
        else if (answer == 'h')
            printf( "You'll sweat to death if you try to jog today.  So \
    don't.\n" );
        else if (answer == 'n')
            printf( "You don't have any excuses.  You'd better go run.\n" );
        else
            printf( "You didn't give a valid answer.\n" );
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    Enter an age: 15
    You have 3 years before you're an adult.

    This part will help you decide whether to jog today.
    What is the weather like?
    raining = r
    cold = c
    muggy = m
    hot = h
    nice = n
    Enter one of the choices: r
    It's too wet to jog today.  Don't bother.


#if, #ifdef, #ifndef, #elif, #else, #endif (preprocessor directives) and defined (predefined macros)

Control conditional compilation.

FORMAT

    #if const_exp
    #else
    #elif                    (Supported only by the UNIX preprocessor)
    #endif
    #ifdef identifier
    #ifndef identifier
    defined (identifier)     Predefined macro
    defined identifier       Predefined macro

ARGUMENTS

const_exp     Any constant expression.

identifier    Any identifier.

DESCRIPTION

These preprocessor directives and predefined macros work together, so we explain them together in this one listing.

The #if, #else, and #endif Preprocessor Directives

Use these preprocessor directives to conditionally compile sections of your source code. For example, suppose you are writing a program that is to run on either a color or monochromatic node. Further suppose that although most of the program is independent of the target, a fraction of the program does depend on the target. In other words, the code for the color target is different from the code for the monochromatic target. To solve this problem you could just write two different programs.
However, this makes program debugging and maintenance much more expensive, since a change in one program would have to be duplicated in the other. A better solution is to use the conditional compilation preprocessor directives as follows:

    /* code applying to both color and monochromatic nodes */
    #if color
    /* code for color nodes only */
    #else
    /* code for monochromatic nodes only */
    #endif
    /* code applying to both color and monochromatic nodes */

The #if directive takes a constant expression as its sole argument. If this constant expression evaluates to nonzero, then all the code up to the next #else or #endif is compiled. If the constant expression evaluates to zero, then no code is compiled until the next #else or #endif directive.

There are a number of differences between the preprocessor conditional statements and the C language conditional statements:

• The conditional expression in an #if directive need not be enclosed in parentheses. (Parentheses may optionally be included.)

• Blocks of statements under the control of a conditional preprocessor directive are not enclosed in braces. Instead, they are bounded by an #else or #endif directive.

• Every #if block may contain only one #else block.

• Every #if block must end with an #endif directive.

• Any macros in the conditional expression are expanded before the expression is evaluated.

• If a conditional expression contains a name that has not been defined, it is replaced by the constant zero. For example, the sequence

    #undef x
    #if x

  expands to:

    #if 0

Conditional compilation is particularly useful during the debugging stage of program development, since you can turn sections of code on or off by changing the value of a macro.
Consider the following snippet:

    #if DEBUG
        if (exp_debug) {
            printf( "lhs = " );
            print_value( result );
            printf( " rhs = " );
            print_value( &rvalue );
            printf( "\n" );
        }
    #endif

If the macro DEBUG is a nonzero value, the if statement and printf() calls will be compiled. If DEBUG is zero, these statements will be ignored as if they were a comment. If DEBUG is not defined, it is the same as if it were defined to expand to zero.

Domain C has a command line option that lets you define macros before compilation begins. If you compile under the UNIX system, use the -D option to define macros. Under Aegis, use the -def option. To receive debug information, you would define the macro DEBUG to be some nonzero value:

    cc -DDEBUG=1 test           (under the UNIX system)
    cc -def DEBUG=1 test        (under the Aegis system)

Note that the #if and #endif directives control whether the enclosed C statements are compiled, not necessarily whether they are executed. In the above example, the printf() calls are executed only if the exp_debug variable has a nonzero value. This double-layer approach enables you to include the diagnostic statements in the executable program, but still decide each time you run the program whether you want them executed. If, for the final version, you need to reduce the size of the executable program, you can compile it with DEBUG set to zero.

Another common use of the conditional compilation mechanism is to choose between the old function declaration syntax and the new ANSI prototyping syntax:

    #if (__STDC__ == 1)
    extern int foo( char a, float b );
    extern char *goo( char *string );
    #else
    extern int foo();
    extern char *goo();
    #endif

By default, the compiler sets __STDC__ to 1 and uses the prototyping syntax to declare the types of each argument. If you compile with -ntype, the compiler uses the old function declaration syntax.
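As a rough illustration of the __STDC__ technique just described, the sketch below applies the same test to both the declaration and the definition of one function, so that a single source file can serve an ANSI compiler and an older one. The function name scale and its behavior are hypothetical and are not part of the manual's foo()/goo() example.

```c
/* Choose the declaration style at compile time.  If __STDC__ is not
 * defined at all, the preprocessor treats it as 0 and the #else branch
 * is compiled. */
#if (__STDC__ == 1)
extern int scale( int value, int factor );    /* ANSI prototype */
#else
extern int scale();                           /* old-style declaration */
#endif

/* The definition is selected the same way.  Only one of the two
 * function headers below ever reaches the compiler. */
#if (__STDC__ == 1)
int scale( int value, int factor )
#else
int scale( value, factor )
int value, factor;
#endif
{
    return value * factor;    /* hypothetical body, for illustration only */
}
```

A caller simply writes scale( 6, 7 ) in either case. Only the amount of argument checking the compiler can perform differs between the two branches.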
The #elif Directive

The #elif directive is supported by the UNIX preprocessor (cpp), but not by the Aegis preprocessor. Therefore, use #elif only if you are compiling in a UNIX environment or if you explicitly use the /bin/cc command.

The #elif directive is a shorthand for the combination of an #else directive followed by an #if directive. For example, the following sequence is written without #elif directives:

    #if (TEST == 0)
        printf( "No test\n" );
    #else
    #if (TEST == 1)
        printf( "Test #1\n" );
    #else
    #if (TEST == 2)
        printf( "Test #2\n" );
    #endif
    #endif
    #endif

Using #elif directives, you could rewrite this as:

    #if (TEST == 0)
        printf( "No test\n" );
    #elif (TEST == 1)
        printf( "Test #1\n" );
    #elif (TEST == 2)
        printf( "Test #2\n" );
    #endif

The #ifdef and #ifndef Preprocessor Directives

Use the #ifdef directive to determine whether an identifier is currently defined. In this context "defined" means that the identifier was used in a #define preprocessor directive or in the -D (/bin/cc) or -def (/com/cc) compiler option. #ifndef checks whether an identifier is not currently defined. For example:

    #ifdef TEST
        printf( "This is a test.\n" );
    #else
        printf( "This is not a test.\n" );
    #endif

If the macro TEST is defined, the first printf() call will be compiled. If TEST is not a defined macro, the second printf() call is compiled. Note that it doesn't matter what TEST expands to, only whether it exists or not. As with #if, an #ifdef or #ifndef block must be terminated by an #endif directive.

Another way to write the previous example is to use the preprocessor defined operator (an ANSI feature):

    #if defined TEST

or

    #if defined( TEST )

The parentheses around the macro name are optional. By definition,

    #if defined macro_name

is equivalent to:

    #ifdef macro_name

and the directive

    #if !defined macro_name

is equivalent to:

    #ifndef macro_name

The defined macro is particularly useful for performing logical operations.
For example:

    #if defined(Domain) && !defined(Aegis) && DEBUG

In most instances, you can use #if instead of #ifdef and #ifndef, since the macro name expands to zero if it is not defined. The one exception, where you need to use #ifdef or #ifndef, is when the macro is defined to zero. For example, you may want to define the macro FALSE to expand to zero. If you use an #if directive to test whether FALSE is defined, FALSE will be redefined even if it is already defined to expand to zero. More important, it won't be redefined if it is defined to something other than zero.

    #if !FALSE
    # define FALSE 0
    #endif

You can avoid both of these problems by using #ifndef.

    #ifndef FALSE
    # define FALSE 0
    #elif FALSE
    # undef FALSE
    # define FALSE 0
    #endif

#ifdef

Refer to the #if listing earlier in this chapter.

#ifndef

Refer to the #if listing earlier in this chapter.

#include (preprocessor directive)

Inserts an include file into the source code.

FORMAT

    #include <pathname>
    #include "pathname"

ARGUMENTS

    pathname    The pathname of the file that is to be included into the source code.

DESCRIPTION

The #include preprocessor directive inserts the contents of the specified file into the source file prior to compilation. For example, if you put the following #include into your source code

    f(x);
    #include "//lucas/eleven/rings.ins.c"
    g(x);

then the C preprocessor inserts the entire contents of the file into your source code between f(x) and g(x). After this insertion, the C compiler compiles the inserted lines just as it would compile any other lines of source code.

The #include directive enables you to create common definition files, called header files, to be shared by several source files. Header files traditionally have a .h suffix and contain data structure definitions, macro definitions, and any global data necessary for modules to communicate with each other. The Domain preprocessors support up to 12 levels of nested header files.
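When header files are nested, the same header can be reached by more than one path. One common way to keep its definitions from being compiled twice is to wrap the header in an #ifndef test, as sketched below. This is a general C idiom rather than something this manual prescribes. To keep the sketch in a single file, the contents of a hypothetical header named shapes.h are written out twice, as the compiler would see them after two #include directives. The names SHAPES_H, rect_t, and rect_area are assumptions made for illustration.

```c
/* ---- contents of the hypothetical header "shapes.h" ---- */
#ifndef SHAPES_H        /* first inclusion: SHAPES_H is not yet defined */
#define SHAPES_H

typedef struct {
    int width;
    int height;
} rect_t;

#endif /* SHAPES_H */

/* ---- the same header, reached again through another header ---- */
#ifndef SHAPES_H        /* SHAPES_H is now defined, so this block is skipped */
#define SHAPES_H

typedef struct {
    int width;
    int height;
} rect_t;               /* would otherwise be an illegal redefinition */

#endif /* SHAPES_H */

/* Uses the single definition of rect_t that survived preprocessing. */
int rect_area( rect_t r )
{
    return r.width * r.height;
}
```

Without the #ifndef test, the second copy of the typedef would reach the compiler and be rejected.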
The Domain/OS operating system supplies many header files (sometimes called "include files") that describe structures internal to the operating system. The C run-time library also includes a number of header files that must be included in order to invoke associated functions. See the SysV Programmer's Reference manual and the BSD Programmer's Reference manual for more information about run-time library header files.

By default, the C compiler automatically tries to include the following file at the beginning of each source file you compile:

    /usr/include/apollo_$std.h

This file sets up predefined, system-wide definitions. Because it is automatically included, you do not need to explicitly include this file in your code. If the compiler cannot locate /usr/include/apollo_$std.h, no action is taken and no error is reported. If the compiler does locate the file, it processes the file like any other include file.

In Domain/OS pathname strings, the backslash character (\) represents the parent directory. Consequently, when the compiler detects a backslash character in an include file string, it does not interpret it as a normal escape character. This special interpretation of backslash applies only to include files.

How the C Preprocessor Searches for Include Files

The #include directive has two forms:

    #include <filename>

or

    #include "filename"

If the filename is surrounded by angle brackets, the preprocessor looks in a list of implementation-defined places for the file. On Domain/OS systems, the compiler looks in the directory /usr/include unless alternative directories are specified with the -idir option (/com/cc) or the -I option (/bin/cc). (See the description of -idir and -I in Chapter 6 for more information about specifying search directories.)

If the filename is surrounded by double quotes, the preprocessor looks for the file according to the file specification rules of the operating system (described below).
If the preprocessor can't find the file there, it searches for the file as if it had been enclosed in angle brackets.

For header files enclosed in double quotes, the Domain/OS operating system distinguishes between two kinds of pathnames: relative pathnames and absolute pathnames. An absolute pathname begins with a slash (/), double slash (//), backslash (\), tilde (~), or period (.); for example, the following include files all have absolute pathnames:

    #include "//rastelli/six/plates.ins.c"
    #include "/ignatov/seven/clubs"
    #include "~/brunn/spinning.ins.c"
    #include "./noakes/passing/tricks.ins.c"

When pathname is an absolute pathname, the C preprocessor searches this pathname only. If the preprocessor does not find this pathname, it issues an error.

Relative pathnames begin with an identifier; for example, here are two relative pathnames:

    #include "jensby/jensen.ins.h"
    #include "my_include_file.h"

The search method for relative pathnames depends on whether you use the UNIX /bin/cc interface or the Aegis /com/cc interface. This difference is due to the fact that /bin/cc invokes the UNIX preprocessor (cpp) whereas /com/cc uses the Aegis preprocessor. With /bin/cc, relative pathnames are relative to the directory of the including source file. With /com/cc, relative pathnames are always relative to the working directory. These differences are described in more detail below.

Compiling with /com/cc

For relative pathnames delimited by double quotation marks ("pathname"), the compiler first searches for pathname in the working directory. If it is not there, the compiler searches any directories you specified with -idir. If it is not in any of them, the compiler searches directory /usr/include. If it is not there, the compiler issues an error.

Compiling with /bin/cc

For relative pathnames delimited by double quotation marks ("pathname"), the compiler searches for pathname in the following order:

1. The preprocessor searches in the directory of the including source file.

2. If it is not there, the preprocessor searches in the working directory.

3. If it is not there, the preprocessor searches in any directories you specified with -I.

4. If it is not in any of them, the preprocessor searches in directory /usr/include.

5. If it is not there, the preprocessor issues an error.

increment and decrement operators

Operators that you can use to increment or decrement variables.

FORMAT

    lvalue++     Increment, postfix form
    ++lvalue     Increment, prefix form
    lvalue--     Decrement, postfix form
    --lvalue     Decrement, prefix form

ARGUMENTS

    lvalue    Any previously declared integer or pointer lvalue. (See Section 4.2 for a definition of lvalue.) Note that although lvalue can be a pointer variable, it cannot be a pointer to a function.

DESCRIPTION

C's increment (++) and decrement (--) operators are good examples of the language's tendency toward compactness. The increment operator adds 1 to its operand, and the decrement operator subtracts 1 from its operand. So while in many languages statements must look something like these

    i = i + 1;
    j = j - 1;

to increment the variable i and decrement j, in C you can just type

    i++;
    j--;

The increment and decrement operators are unary. The operand must be a scalar lvalue; it is illegal to increment or decrement a constant, structure, or union. It is legal to increment or decrement pointer variables, but the meaning of adding 1 to a pointer is different from adding 1 to an arithmetic value. This is described in the "pointer arithmetic" section of this chapter.

There are two forms for each of the operators: postfix and prefix. Both forms increment or decrement the appropriate variable, but they do so at different times. The statement ++i (prefix form) increments i before using its value, while i++ (postfix form) increments it after its value has been used. This difference can be important to your program.
The postfix increment and decrement operators fetch the current value of the variable and store a copy of it in a temporary location. The compiler then increments or decrements the variable. The temporary copy, which has the variable's value before it was modified, is used in the expression. For example:

    /* Program name is "inc.dec_example1" */
    #include <stdio.h>

    int main( void )
    {
        int j = 5, k = 5;

        printf( "j: %d\t k: %d\n", j++, k-- );
        printf( "j: %d\t k: %d\n", j, k );
    }

The result is:

    j: 5     k: 5
    j: 6     k: 4

In the first printf() call, the initial values of j and k are used, but once they have been used they are incremented and decremented, respectively.

In contrast, the prefix increment and decrement operators modify their operands before they fetch the values:

    /* Program name is "inc.dec_example2" */
    #include <stdio.h>

    int main( void )
    {
        int j = 5, k = 5;

        printf( "j: %d\t k: %d\n", ++j, --k );
        printf( "j: %d\t k: %d\n", j, k );
    }

The result of this version is:

    j: 6     k: 4
    j: 6     k: 4

In many cases, you are interested only in the side effect, not in the result of the expression. In these instances, it doesn't matter which operator you use. For example, as a stand-alone statement, or as the third expression in a for loop, the side effect is the same whether you use the prefix or postfix version:

    x++;

is equivalent to:

    ++x;

and the statement

    for (j = 0; j <= 10; j++)

is equivalent to:

    for (j = 0; j <= 10; ++j)

You need to be careful, however, when you use the increment and decrement operators within an expression. Consider the following function that inserts newlines into a text string at regular intervals.

    #include <stdio.h>

    void break_line( int interval )
    {
        int c, j = 0;

        while ((c = getchar()) != '\n') {
            if ((j++ % interval) == 0)
                printf( "\n" );
            putchar( c );
        }
    }

This works because we use the postfix increment operator.
If we were to use the prefix increment operator, the function would break the first line one character early.

Precedence of Increment and Decrement Operators

Note in Table 4-1 that the increment and decrement operators have the same precedence, but bind from right to left. So the expression

    --j++

is evaluated as:

    --(j++)

This expression is illegal because j++ is not an lvalue as required by the -- operator. In general, you should avoid using multiple increment or decrement operators together.

Bug Alert: Side Effects

The increment and decrement operators and the assignment operators cause side effects. That is, they not only result in a value, but they change the value of a variable as well. A problem with side effect operators is that it is not always possible to predict the order in which the side effects occur. Consider the following statement:

    x = j * j++;

The C language does not specify which multiplication operand is to be evaluated first. One compiler may evaluate the left-hand operand first, while another evaluates the right-hand operand first. The results are different in the two cases. If j equals 5, and the left-hand operand is evaluated first, the expression will be interpreted as:

    x = 5 * 5;    /* x is assigned 25 */

If the right-hand operand is evaluated first, the expression becomes:

    x = 6 * 5;    /* x is assigned 30 */

Statements such as this one are not portable and should be avoided. The side effect problem also crops up in function calls because the C language does not guarantee the order in which arguments are evaluated. For example, the function call

    f( a, a++ )

is not portable because compilers are free to evaluate the arguments in any order they choose.

To prevent side effect bugs, follow this rule: If you use a side effect operator in an expression, do not use the affected variable anywhere else in the expression.
The ambiguous expression above, for instance, can be made unambiguous by breaking it into two assignments:

    x = j * j;
    ++j;

EXAMPLE

    /* Program name is "inc.dec_example3". */
    #include <stdio.h>

    int main( void )
    {
        int n, m, n_total, m_total, perm, i, num;

        /* The following computes a permutation -- that is,
         * P(n,m) = n!/(n-m)! -- using decrement operators
         * to compute n! and (n-m)!
         */
        printf( "Enter the numbers for the permutation (n" );
        printf( " things taken m at a time)\nseparated by a" );
        printf( " space: " );
        scanf( "%d %d", &n, &m );
        n_total = m_total = 1;
        for (i = n; i > 0; i--)          /* compute n! */
            n_total *= i;
        for (i = n - m; i > 0; i--)      /* compute (n-m)! */
            m_total *= i;
        perm = n_total/m_total;
        printf( "P(%d,%d) = %d\n\n", n, m, perm );

        /* This part shows the increment operator */
        printf( "\nAnd now, the squares of 1 to 5:\n" );
        for (n = 1; n <= 5; n++) {
            num = n*n;
            printf( "%d\n", num );
        }
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    Enter the numbers for the permutation (n things taken m at a time)
    separated by a space: 4 3
    P(4,3) = 24


    And now, the squares of 1 to 5:
    1
    4
    9
    16
    25

__LINE__ and __FILE__ (predefined symbols)

Predefined symbols that expand to the current line number and source filename.

FORMAT

    __LINE__
    __FILE__

Note that there are two underscores before and two underscores after each of these preprocessor symbols.

DESCRIPTION

The preprocessor recognizes these special predefined symbols and replaces their occurrences with the following:

    __LINE__    Expands to the source file line number on which it is invoked.
    __FILE__    Expands to the name of the file in which it is invoked.

The __LINE__ and __FILE__ macros are valuable diagnostic tools. Suppose, for example, that you want a check facility that compares two expressions for equality and, if they are unequal, calls an error reporting function with the source filename and the line number of the check failure.
    #include <stdio.h>

    #define CHECK( a, b ) \
        if ((a) != (b)) \
            fail( a, b, __FILE__, __LINE__ )

    void fail( int a, int b, char *p, int line )
    {
        printf( "Check failed in file %s at line %d: received %d, expected %d\n",
                p, line, a, b );
    }

At various points in a program, you can check to make sure that a variable x equals zero by including the following diagnostic:

    CHECK(x, 0);

Note that blank lines are included in the line count. Comment lines also are included.

The symbol substitutions are performed before any other preprocessor commands. Consequently, __FILE__ and __LINE__ get defined before any #include statements that might change their values.

Note that __LINE__ and __FILE__ are affected by the #line directive.

#line (preprocessor directive)

Lets you set the current source line number.

FORMAT

    #line integer ["filename"]     /* first form */
    #integer ["filename"]          /* second form */

ARGUMENTS

    integer       An integer constant.
    "filename"    A filename enclosed in double quotes.

DESCRIPTION

The #line preprocessor directive allows you to set the compiler's knowledge of the current source line number and (optionally) the current source file. The compiler reports errors in terms of the line numbers set by this directive. In addition, the debugger line number table is built with these line numbers. The debugger source file option is given the last "filename" in the source, as long as that file truly exists. (The compiler verifies the existence of the source file before it creates the debug entry.)

The word line may be omitted, as shown in the second form, but this feature is not portable. The optional filename must be enclosed in double quotes. The filename may be any legal pathname.

EXAMPLE

The following example illustrates the behavior of #line.

    /* Program name is "line_example". Example of #line
     * preprocessor directive. */
    #include <stdio.h>
    int main( void )
    {
        printf( "Current line: %d\nFilename: %s\n\n",
                __LINE__, __FILE__ );
    #line 100
        printf( "Current line: %d\nFilename: %s\n\n",
                __LINE__, __FILE__ );
    #line 200 "new_name"
        printf( "Current line: %d\nFilename: %s\n\n",
                __LINE__, __FILE__ );
    }

USING THIS EXAMPLE

Assuming that the source file for this program is called line_example.c, execution produces:

    Current line: 7
    Filename: line_example.c

    Current line: 101
    Filename: line_example.c

    Current line: 201
    Filename: new_name

The #line feature is particularly useful for programs that produce C source text. For instance, yacc (which stands for Yet Another Compiler Compiler) is a UNIX utility that facilitates building compilers. The yacc utility reads files written in the yacc language and produces a file written in the C language, which can then be compiled by a C compiler. A problem arises, however, if the C compiler encounters an error in the yacc-produced C file. You want to know which line in the original yacc file is causing the error, but the C compiler will report the error-producing line in the C text file. To solve this problem, yacc writes #line directives in the C source file so that the compiler is fooled into reporting errors based on the yacc line numbers rather than the C line numbers.

#list and #nolist (preprocessor directives)

Enable and disable the listing of source code in the listing file. (Domain Extension)

FORMAT

    #list
    #nolist

DESCRIPTION

The #list preprocessor directive enables the listing of source code in the listing file, while #nolist inhibits the listing of source code. For example, this sequence of preprocessor directives

    #nolist
    #include "/my_insert_files/beth.ins.c"
    #list

excludes the contents of /my_insert_files/beth.ins.c from the source listing.

Note that #list and #nolist have no effect on the compilation; they affect only the source listing file. The default is #list.
logical operators

Logical AND, OR, and NOT operators.

FORMAT

    exp1 && exp2     Logical AND
    exp1 || exp2     Logical OR
    !exp1            Logical NOT

ARGUMENTS

    exp1    Any expression.
    exp2    Any expression.

DESCRIPTION

The logical AND operator (&&) and the logical OR operator (||) evaluate the truth or falseness of pairs of expressions. The AND operator evaluates to 1 if and only if both expressions are true. The OR operator evaluates to 1 if either expression is true. To test whether y is greater than x and less than z, you would write:

    (x < y) && (y < z)

The logical negation operator (!) takes only one operand. If the operand is true, the result is false; if the operand is false, the result is true.

Recall that in C, true is equivalent to any nonzero value, and false is equivalent to zero. Table 4-9 shows the logical tables for each operator, along with the numerical equivalent. Note that all of the operators return 1 for true and 0 for false.

Table 4-9. Truth Table for C's Logical Operators

    Operand           Operator    Operand     Result
    zero              &&          zero        0
    nonzero           &&          zero        0
    zero              &&          nonzero     0
    nonzero           &&          nonzero     1
    zero              ||          zero        0
    nonzero           ||          zero        1
    zero              ||          nonzero     1
    nonzero           ||          nonzero     1
    not applicable    !           zero        1
    not applicable    !           nonzero     0

The operands to the logical operators may be integers or floating-point objects. The expression

    1 && -5

results in 1 because both operands are nonzero. The same is true of the expression

    0.5 && -5

Logical operators (and the comma and conditional operators) are the only operators for which the order of evaluation of the operands is defined. The compiler must evaluate operands from left to right. Moreover, the compiler is guaranteed not to evaluate an operand if it's unnecessary. For example, in the expression

    if ((a != 0) && (b/a == 6.0))

if a equals zero, the expression (b/a == 6.0) will not be evaluated. This rule can have unexpected consequences when one of the expressions contains side effects. (See the Bug Alert in this section.)
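The guarantee that the right-hand operand is skipped can be observed directly. In the sketch below, a counter records how often the right-hand operand of && is actually evaluated. The names quotient_calls, quotient_is, and ratio_is_six are hypothetical and exist only for this illustration.

```c
int quotient_calls = 0;    /* counts evaluations of the right-hand operand */

/* Hypothetical helper: returns 1 if b / a equals expected, otherwise 0. */
int quotient_is( double b, double a, double expected )
{
    quotient_calls++;
    return b / a == expected;
}

/* The left operand guards the right one.  When a is zero, && yields 0
 * without calling quotient_is(), so the division never takes place. */
int ratio_is_six( double a, double b )
{
    return (a != 0) && quotient_is( b, a, 6.0 );
}
```

Calling ratio_is_six( 0.0, 12.0 ) returns 0 and leaves quotient_calls unchanged, whereas ratio_is_six( 2.0, 12.0 ) returns 1 and increments the counter once.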
Table 4-10 shows a number of examples that use relational and logical operators. Note that the logical NOT operator has a higher precedence than the others. The AND operator has higher precedence than the OR operator. Both the logical AND and OR operators have lower precedence than the relational and arithmetic operators.

Table 4-10. Examples of Expressions Using the Logical Operators

Given the following declarations:

    int j = 0, m = 1, n = -1;
    float x = 2.5, y = 0.0;

    Expression                 Equivalent Expression              Result
    j && m                     (j) && (m)                         0
    j < m && n < m             (j < m) && (n < m)                 1
    m + n || !j                (m + n) || (!j)                    1
    x * 5 && 5 || m / n        ((x * 5) && 5) || (m / n)          1
    j <= 10 && x >= 1 && m     ((j <= 10) && (x >= 1)) && m       1
    !x || !n || m + n          ((!x) || (!n)) || (m + n)          0
    x * y < j + m || n         ((x * y) < (j + m)) || n           1
    (x > y) + !j || n++        ((x > y) + (!j)) || (n++)          1
    (j || m) + (x || ++n)      (j || m) + (x || (++n))            2

Bug Alert: Side Effects in Relational Expressions

Logical operators (and the conditional and comma operators) are the only operators for which the order of evaluation of the operands is defined. For these operators, operands must be evaluated from left to right. However, the system evaluates only as much of a relational expression as it needs to determine the result. In many cases, this means that the system does not need to evaluate the entire expression. For instance, consider the following expression:

    if ((a < b) && (c == d))

The system begins by evaluating (a < b). If a is not less than b, the system knows that the entire expression is false, so it will not evaluate (c == d). This can cause problems if some of the expressions contain side effects:

    if ((a < b) && (c == d++))

In this case, d is incremented only when a is less than b. This may or may not be what the programmer intended. In general, you should avoid using side effect operators in relational expressions.

EXAMPLE

    /* Program name is "logical_ops_example". This program shows how
     * logical operators are used. Notice that several logical
     * expressions can be strung together to create multiple
     * conditions. Also notice how the NOT operator (!) is used.
     * In the program itself, the integer variables are initialized
     * to zero, which C evaluates as being false. Then, if a question
     * is answered "yes", the appropriate variable is reset to 1.
     * C considers a nonzero value to be true.
     */
    #include <stdio.h>

    int main( void )
    {
        int won_lottery, enough_vacation, money_saved;
        char answer;

        won_lottery = enough_vacation = money_saved = 0;
        printf( "\nThis program determines whether you can " );
        printf( "take your next vacation in Europe.\n" );

        /* The leading space in each format skips any newline left in
         * the input by the previous answer. */
        printf( "Have you won the lottery? y or n: " );
        scanf( " %c", &answer );
        if (answer == 'y')
            won_lottery = 1;

        printf( "Do you have enough vacation days saved? y or n: " );
        scanf( " %c", &answer );
        if (answer == 'y')
            enough_vacation = 1;

        printf( "Have you saved enough money for the trip? y or n: " );
        scanf( " %c", &answer );
        if (answer == 'y')
            money_saved = 1;

        printf( "\n" );
        if (won_lottery) {
            printf( "Why do you need a program to decide if you" );
            printf( " can afford a trip to Europe?\n" );
        } /* end if */

        if (won_lottery || (enough_vacation && money_saved))
            printf( "Look out Paris!\n" );
        else if (enough_vacation && (!money_saved))
            printf( "You've got the time, but you haven't got the dollars.\n" );
        else if (!enough_vacation || (!money_saved)) {
            printf( "Tough luck. Try saving your money and " );
            printf( "vacation days next year.\n" );
        } /* end else/if */
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    This program determines whether you can take your next vacation in Europe.
    Have you won the lottery? y or n: y
    Do you have enough vacation days saved? y or n: n
    Have you saved enough money for the trip? y or n: n

    Why do you need a program to decide if you can afford a trip to Europe?
    Look out Paris!
#module (preprocessor directive)

Changes the name of the object module for debugging purposes, and, optionally, lets you define procedure and data section names. (Domain Extension)

FORMAT

    #module module_name [, psect_name [, dsect_name]]

ARGUMENTS

    module_name    An identifier that serves as the new name of the module.
    psect_name     An optional identifier. This is the name of the procedure section that the code will go into.
    dsect_name     An optional identifier. This is the name of the data section that the data will go into.

DESCRIPTION

The #module directive serves two purposes. First, it enables you to change the name of the object module for debugging purposes. Second, it allows you to define a procedure and data section for the code in the file. By defining sections, you can have some control over how the linker groups data and instructions in memory. This is described in more detail in the description of #section later in this chapter.

There may be at most one #module statement per source file, and it must appear before any other tokens, with the exception of the #systype directive. The module_name is required, but both psect_name and dsect_name are optional. If you include a psect_name or dsect_name, the specified section names are active until the end of the file or until a #section statement redefines one or both of the names. The following examples illustrate the legal syntaxes of #module.

    #module example                    /* defines "example" as the module name */

    #module example, proc_a            /* defines "proc_a" as the procedure
                                        * section */

    #module example, proc_a, data_a    /* defines "proc_a" as the
                                        * procedure section and "data_a"
                                        * as the data section */

    #module example, ,data_a           /* defines "data_a" as the data
                                        * section, but uses the default
                                        * name for the procedure section */

The dde and dbx utilities (the Domain/OS language-level debuggers) use the module name as the starting point when they search for functions and static variable names.
If you do not use #module, the compiler uses the source filename in uppercase, with underscores replacing any periods. For example, the default module name for test.1.c is test_1_c. See the Domain Distributed Debugging Environment (Domain/DDE) Reference manual for more information on accessing identifiers while debugging.

#nolist

Refer to the #list entry earlier in this chapter. (Domain Extension)

pointer operations

Operations performed with pointers.

DESCRIPTION

A pointer variable is a variable that can hold the address of an object. Chapter 3 describes how to declare pointer variables. Here, we describe how to use pointer variables in the code portion of your program. We discuss pointers to functions in Chapter 5.

We start with a discussion of the two principal pointer operations: finding an address and dereferencing a pointer.

Assigning an Address Value to a Pointer

To declare a pointer variable, you precede the variable name with an asterisk. The following declaration, for example, makes ptr a variable that can hold addresses of long int variables.

    long *ptr;

The data type, long in this case, refers to the type of variable that ptr can point to. To assign the virtual address of a variable to a pointer variable, you can use the address-of operator &. For instance, the following is legal:

    long *ptr;
    long long_var;

    ptr = &long_var;    /* Assign the address of long_var to ptr. */

But this is illegal:

    long *ptr;
    float float_var;

    ptr = &float_var;   /* ILLEGAL - because ptr can only store the
                         * address of a long int. */

The following program illustrates the difference between a pointer variable and an integer variable:

    /* Program name is "ptr_example1".
     */
    #include <stdio.h>

    int main( void )
    {
        int j = 1;
        int *pj;

        pj = &j;    /* Assign the address of j to pj */
        printf( "The value of j is: %d\n", j );
        printf( "The address of j is: %d\n", pj );
    }

The result is:

    The value of j is: 1
    The address of j is: 5219405

Dereferencing a Pointer

To dereference a pointer (get the value stored at the pointer address), use the * operator. The following program shows how dereferencing works.

    /* Program name is "ptr_example2". */
    #include <stdio.h>

    int main( void )
    {
        char *p_ch;
        char ch1 = 'A', ch2;

        printf( "The address of p_ch is %d\n", &p_ch );
        p_ch = &ch1;
        printf( "The value of p_ch is %d\n", p_ch );
        printf( "The dereferenced value of p_ch is %c\n", *p_ch );
        ch2 = *p_ch;
    }

The output from running this program is:

    The address of p_ch is 52194052
    The value of p_ch is 52194050
    The dereferenced value of p_ch is A

This is a roundabout and somewhat contrived example that assigns the character 'A' to both ch1 and ch2. It does, however, illustrate the effect of the dereference (*) operator. The variable ch1 is initialized to 'A'. The first printf() call displays the address of the pointer variable p_ch. In the next step, p_ch is assigned the address of ch1, which is also displayed. Finally, we display the dereferenced value of p_ch and assign it to ch2.

The expression *p_ch is interpreted as: "take the address value stored in p_ch and get the value stored at that address." This gives us a new way to look at the declaration. The data type in the pointer declaration indicates what type of value results when the pointer is dereferenced. For instance, the declaration

    float *fp;

means that when *fp appears as an expression, the result will be a float value.

The expression *fp can also appear on the left side of an assignment expression:

    *fp = 3.15;

In this case, we are storing a value (3.15) at the location designated by the pointer fp. Note that this is different from

    fp = 3.15;

which attempts to store the address 3.15 in fp.
This, by the way, is illegal since addresses are not the same as integers or floating-point values.

When assigning a value through a dereferenced pointer, it is important to make sure that the data types agree. Consider the following case:

    /* Program name is "ptr_example3". */
    #include <stdio.h>

    int main( void )
    {
        float f = 1.17e3, g;
        int *ip;

        ip = &f;
        g = *ip;
        printf( "The value of f is: %f\n", f );
        printf( "The value of g is: %f\n", g );
    }

The result is:

    The value of f is: 1170.000000
    The value of g is: 1150435328.000000

In the preceding example, instead of getting the value of f, g gets an erroneous value because ip is a pointer to an int, not a float. The Domain C compiler issues a warning message when a pointer type is unmatched. If you compile the preceding program, for instance, you receive the message:

    (0005)     ip = &f;
    ******** Line 5: Warning: Illegal pointer combination: incompatible types.
    No errors, 1 warning, C Compiler, Rev X.yy

Pointer Arithmetic

The following arithmetic operations with pointers are legal:

• You may add an integer to a pointer or subtract an integer from a pointer.

• You may use a pointer as an operand to the ++ and -- operators.

• You may subtract one pointer from another pointer.

All other arithmetic operations with pointers are illegal.

When you add an integer to a pointer or subtract an integer from a pointer, the compiler automatically scales the integer to the pointer's type. In this way, the integer always represents the number of objects to jump, not the number of bytes. For example, consider the following program fragment:

    int x[10], *p1x = x, *p2x;

    p2x = p1x + 3;

Since pointer p1x points to an int (an element of x) that is 4 bytes long, the expression p1x + 3 actually increments p1x by 12 (4 * 3) bytes, rather than by 3.

It is legal to subtract one pointer value from another, provided that the pointers point to the same type of object. This operation yields an integer value that represents the number of objects between the two pointers.
If the first pointer represents a lower address than the second pointer, the result is negative. For example,

    &a[3] - &a[0]

evaluates to 3, but

    &a[0] - &a[3]

evaluates to -3.

It is also legal to subtract an integral value from a pointer value. This type of expression yields a pointer value. The following examples illustrate some legal and illegal pointer expressions:

    long *p1, *p2;
    long a[5];
    int j;
    char *p3;

    p1 = a;          /* Same as p1 = &a[0] */
    p2 = p1 + 4;     /* legal */
    j = p2 - p1;     /* legal -- j is assigned 4 */
    j = p1 - p2;     /* legal -- j is assigned -4 */
    p1 = p2 - 2;     /* legal -- p1 points to a[2] */
    p3 = p1 - 1;     /* ILLEGAL -- different pointer types */
    j = p1 - p3;     /* ILLEGAL -- different pointer types */
    j = p1 + p2;     /* ILLEGAL -- can't add pointers */

Arrays and Pointers

Arrays and pointers have a close relationship in the C language. You can exploit this relationship in order to write more efficient code. See the discussion of "array operations" in this chapter for more information.

Casting a Pointer's Type

A pointer to one type may be cast to a pointer to any other type. For example, in the following statements, a pointer to an int is cast to a pointer to a char. Presumably, the function func() expects a pointer to a char, not a pointer to an int.

    int i, *p = &i;

    func( (char *) p );

As a second example, a pointer to a char is cast to a pointer to struct H. The extra parentheses are needed because the -> operator binds more tightly than the cast:

    struct H {int q;} x;
    char *genp = (char *) &x;

    ((struct H *) genp)->q

See the "casting operations" listing of this encyclopedia for more information about the cast operator.

It is always legal to assign any pointer type to a generic pointer, and vice versa, without a cast. For example:

    float x, *fp = &x;
    int j, *pj = &j;
    void *pv;

    pv = fp;    /* legal */
    pj = pv;    /* legal */

In both these cases, the pointers are implicitly cast to the target type before being assigned. See Section 3.7.3 for more information about generic pointers.
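The scaling rule for pointer subtraction and the cast-free round trip through a generic pointer can be sketched together. This is an illustrative sketch only: the function names elements_between, bytes_between, and read_through_generic are assumptions introduced here, and ptrdiff_t is the ANSI name for the result type of pointer subtraction rather than something this manual defines.

```c
#include <stddef.h>

/* Hypothetical helper: number of long objects from first to last.
 * Both pointers must point into the same array. */
ptrdiff_t elements_between( const long *first, const long *last )
{
    return last - first;    /* scaled by sizeof(long) automatically */
}

/* Hypothetical helper: the same distance measured in bytes, obtained
 * by casting both pointers to pointers to char before subtracting. */
ptrdiff_t bytes_between( const long *first, const long *last )
{
    return (const char *) last - (const char *) first;
}

/* Hypothetical helper: a pointer survives a trip through void *
 * without any cast in either direction. */
long read_through_generic( long *p )
{
    void *pv = p;       /* long * to void *, no cast needed */
    long *back = pv;    /* void * to long *, no cast needed */
    return *back;
}
```

For a long array a, elements_between( &a[0], &a[3] ) yields 3 and elements_between( &a[3], &a[0] ) yields -3, while bytes_between( &a[0], &a[3] ) yields 3 * sizeof(long).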
Assigning an Integer Value to a Pointer

You may assign an integer value to a pointer, but programs that use this feature are not portable. The following statements assign absolute address 0xFFF13000 to a pointer called abs_address.

    char *abs_address;

    abs_address = (char *) 0xFFF13000;

This feature is generally used to map variables to hardware registers whose addresses are fixed.

Null Pointers

The C language supports the notion of a null pointer, that is, a pointer that is guaranteed not to point to a valid object. A null pointer is any pointer assigned the integral value zero. For example:

    char *p;

    p = 0;    /* make p a null pointer */

In this one case (assignment of zero) you do not need to cast the integral expression to the pointer type.

Null pointers are particularly useful in control-flow statements since the zero-valued pointer evaluates to false, whereas all other pointer values evaluate to true. For example, the following while loop continues iterating until p is a null pointer:

    char *p;

    while (p)
    {
        /* iterate until p is a null pointer */
    }

This use of null pointers is particularly prevalent in applications that use arrays of pointers, as described later in this chapter.

The compiler does not prevent you from attempting to dereference a null pointer; however, doing so may trigger a run-time access violation. Therefore, if it is possible that a pointer variable is a null pointer, you should make some sort of test like the following when dereferencing it:

    if (px && *px)    /* if px = 0, expression will short-circuit
                       * before dereferencing occurs */

Null pointers are a portable feature.

EXAMPLE 1

    /* Program name is "pointer_example1". This program shows
     * how to access a one-dim. array through pointers. Function
     * count_chars returns the number of characters in the string
     * passed to it. Note that *arg is equivalent to a_word[0];
     * *(arg + 1) is equivalent to a_word[1] ...
     */
    #include <stdio.h>

    int count_chars( char *arg )
    {
        int count = 0;

        while (*arg++)
            count++;
        return count;
    }

    int main( void )
    {
        char a_word[30];
        int number_of_characters;

        printf( "Enter a word -- " );
        scanf( "%s", a_word );
        number_of_characters = count_chars( a_word );
        printf( "%s contains %d characters.\n", a_word, number_of_characters );
    }

EXAMPLE 2

    /* Program name is "pointer_example2". This program
     * demonstrates two ways to access a two-dim. array.
     */
    #include <stdio.h>

    int main( void )
    {
        int count = 0, which_name;
        char c1, c2;
        static char str[5][10] = {"Phil", "Sandi", "Barry", "David", "Amy"};
        static char *pstr[5] = {str[0], str[1], str[2], str[3], str[4]};
        /* pstr is an array of pointers. Each element in the array
         * points to the beginning of one of the arrays in str.
         */

        /* Prompt for information. */
        printf( "Which name do you want to retrieve?\n" );
        printf( "Enter 0 for the first name,\n" );
        printf( "      1 for the second name, etc. -- " );
        scanf( "%d", &which_name );

        /* Print name directly through array. */
        while (c1 = str[which_name][count++])
            printf( "%c", c1 );
        printf( "\n" );

        /* Print same name indirectly through an array of pointers. */
        while (c2 = *(pstr[which_name]++))
            printf( "%c", c2 );
        /* We could have also used the following statement instead of
         * the two previous ones: printf( "%s", pstr[which_name] );
         */
        printf( "\n" );
    }

USING THESE EXAMPLES

If we execute the first program, we get the following output:

    Enter a word -- Marilyn
    Marilyn contains 7 characters.

If we execute the second program, we get the following output:

    Which name do you want to retrieve?
    Enter 0 for the first name,
          1 for the second name, etc. -- 1
    Sandi
    Sandi

predefined macros

Provide information about the compiler or compilation environment.

DESCRIPTION

The Domain compilers support a number of predefined macros that provide information about the compiler or about the compilation environment.
In addition to the macros described in this section, Domain C also supports the following predefined macros:

__FILE__       Expands to the source file name.

__LINE__       Expands to the current line number in the source file.

__DATE__       Expands to the current date (of compilation).

__TIME__       Expands to the current time (of compilation).

__STDC__       Expands to 1 if ANSI-style function prototyping is in effect.

_BFMT__COFF    Expands to 1 if the compiler is producing COFF object code.

For more information about the __FILE__ and __LINE__ macros, see the entry under __FILE__ in this chapter. For more information about the __DATE__ and __TIME__ macros, see the entry under __DATE__ in this chapter. For more information about __STDC__ and _BFMT__COFF, see the entry under __STDC__ later in this chapter.

relational operators

Compare the values of two expressions.

FORMAT

    exp1 >  exp2    Greater than
    exp1 >= exp2    Greater than or equal to
    exp1 <  exp2    Less than
    exp1 <= exp2    Less than or equal to
    exp1 == exp2    Equal to
    exp1 != exp2    Not equal to

ARGUMENTS

exp1    Any expression.

exp2    Any expression.

DESCRIPTION

The relational operators perform the same way in C as they do in everything from fourth-grade arithmetic to Advanced Programming II. The two that are slightly unusual are == and !=, but even in these cases the differences are a matter of form, not substance. The equality operator (==) performs the same function as Pascal's = or FORTRAN's .EQ.; it just looks different.

Note that although the equality operator looks similar to the assignment operator (=), the two operators serve completely different purposes. Use the assignment operator when you want to assign a value to a variable, but use the equality operator when you want to test the value of an expression.

Bug Alert: Confusing = with ==

One of the most common mistakes made by beginners and experts alike is to use the assignment operator (=) instead of the equality operator (==).
For instance:

    while (j = 5)
        do_something();

What is intended, clearly, is that the do_something() function should only be invoked if j equals five. It should have been written:

    while (j == 5)
        do_something();

Note that the first version is syntactically legal since all expressions have a value. The value of the expression j = 5 is 5. Since this is a nonzero value, the while expression will always evaluate to true and do_something() will always be invoked.

#section

Note, however, that #section directives do not affect variables with fixed duration. Static data that has file scope resides in the module's section regardless of any #section directives. All global variables reside in special sections that cannot be affected by #section directives. If you are compiling with /bin/cc, initialized global variables are placed in .data and uninitialized global variables are put in .bss. If you are compiling with /com/cc, the compiler creates a named section for each global variable. You can override these defaults by using the #attribute[section] modifier, which is described in Chapter 3.

The following example illustrates the #section directive.

    #module section_example, psection1, dsection1

    main()
    {
    }

    #section(psection2)    /* dsection1 is still the active data section */
    void func1()
    {
    }

    #section(psection1, dsection2)
    void func2()
    {
    }

    #section( ,dsection1)
    void func3()
    {
    }

The preceding example creates four named sections that contain the program segments shown in the following chart.

Table 4-13. Example of #section Directive

    Named Section    What It Contains
    psection1        program instructions from main(), func2(), and func3()
    psection2        program instructions from func1()
    dsection1        data from main(), func1(), and func3()
    dsection2        data from func2()

relational operators

Note that all of these operators have lower precedence than the arithmetic operators.
The expression,

    a + b * e < d / f

is evaluated as if it had been written:

    (a + (b * e)) < (d / f)

Among the relational operators, >, >=, <, and <= have the same precedence. The == and != operators have lower precedence. All of the relational operators have left-to-right associativity. Table 4-11 illustrates how the compiler parses complex relational expressions.

Table 4-11. Examples of Expressions Using the Relational Operators

Given the following declarations:

    int j = 0, m = 1, n = -1;
    float x = 2.5, y = 0.0;

    Expression                Equivalent Expression             Result
    j > m                     j > m                             0
    m / n < x                 (m / n) < x                       1
    j <= m >= n               ((j <= m) >= n)                   1
    j <= x == m               ((j <= x) == m)                   1
    -x + j == y > n >= m      ((-x) + j) == ((y > n) >= m)      0
    x += (y >= n)             x = (x + (y >= n))                3.5
    ++j == m != y * 2         ((++j) == m) != (y * 2)           1

Relational expressions are often called Boolean expressions, in recognition of the nineteenth-century mathematician and logician, George Boole. Many programming languages, such as Pascal, have Boolean data types for representing true and false. The C language, however, represents these values with integers. Zero is equivalent to false, and any nonzero value is considered true.

The value of a relational expression is an integer, either 1 (indicating the expression is true) or 0 (indicating the expression is false). The examples in Table 4-12 illustrate how relational expressions are evaluated.

Table 4-12. Relational Expressions

    Expression    Value
    -1 < 0        1
    0 > 1         0
    5 == 5        1
    7 != -3       1
    1 >= -1       1
    1 > 10        0

Because Boolean values are represented as integers, it is perfectly legal to write:

    if (j)
        statement;

If j is any nonzero value, statement is executed; if j equals zero, statement is skipped. Likewise, the statement,

    if (isalpha( ch ))

is exactly the same as:

    if (isalpha( ch ) != 0)

The practice of using a function call as a Boolean expression is a common idiom in C.
It is especially effective for functions that return zero if an error occurs, since you can use a construct such as:

    if (func())
        proceed;
    else
        error handler;

Bug Alert: Comparing Floating-Point Values

It is very dangerous to compare floating-point values for equality because floating-point representations are inexact for some numbers. For example, the following expression, though algebraically true, can evaluate to false, depending on the machine's floating-point format and rounding:

    (1.0/3.0 + 1.0/3.0 + 1.0/3.0) == 1.0

When this evaluates to 0 (false), it is because the fraction 1.0/3.0 contains an infinite number of decimal places (0.33333...). The computer is only capable of holding a limited number of decimal places, so it rounds each occurrence of 1/3. As a result, the left-hand side of the expression may not equal 1.0 exactly.

This problem can occur in even more subtle ways. Consider the following code:

    double divide( double num, double denom )
    {
        return num/denom;
    }

    int main( void )
    {
        double c, a = 1.0, b = 3.0;

        c = a/b;
        if (c != divide(a,b))
            printf("Fuzzy doubles\n");
    }

Surprisingly, the value stored in c may not equal the value returned by divide(). This anomaly occurs when the computer can represent more decimal places for values stored in registers than for values stored in memory. Because the value returned by divide() is never stored in memory, it is not equal to the value c, which has been rounded for memory storage.

To avoid bugs caused by inexact floating-point representations, you should refrain from using strict equality comparisons with floating-point types.

EXAMPLE

    /* Program name is "relational_example". This program simply
     * does some mathematical calculations and along the way shows
     * C's relational operators in action.
     */
    #include <stdio.h>

    int main( void )
    {
        int num, i;

        printf( "\n" );
        num = 5;
        printf( "The number is: %d\n", num );
        for (i = 0; i <= 2; i++)
        {
            if (num < 25)
            {
                num *= num;
                printf( "The number squared is: %d\n", num );
            }
            else if (num == 25)
            {
                num *= 2;
                printf( "Then, when you double that, you get: %d\n", num );
            }
            else if (num > 25)
            {
                num -= 45;
                printf( "And when you subtract 45, you're back where " );
                printf( "you started at: %d\n", num );
            }
        }   /* end for */
        if (num != 5)
            printf( "The programmer made an error in setting up this example\n" );
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    The number is: 5
    The number squared is: 25
    Then, when you double that, you get: 50
    And when you subtract 45, you're back where you started at: 5

return

The mechanism for exiting from a called function.

FORMAT

    return;        /* first form */
    return exp;    /* second form */

ARGUMENTS

exp    Any valid C expression.

DESCRIPTION

The return statement causes a C program to exit from the function containing the return and go back to the calling block. It may or may not have an accompanying exp to evaluate. If there is no exp, the function returns an unpredictable value.

Functions can return only a single value directly via the return statement. The return value can be any type except an array or function. This means that it is possible to indirectly return more than a single value by passing a pointer to an aggregate type. It is also possible to return a structure or union directly, though Domain C implements this by returning a pointer to the structure or union.

A function may contain any number of return statements. The first one encountered in the normal flow of control is executed, and causes program control to be returned to the calling routine. If there is no return statement, program control returns to the calling routine when the right brace of the function is reached. In this case, the value returned is undefined.
The return value must be assignment-compatible with the type of the function. This means that the compiler uses the same rules for allowable types on either side of an assignment operator to determine allowable return types. For example, if f() is declared as a function returning an int, it is legal to return any arithmetic type, since they can all be converted to an int. It would be illegal, however, to return an aggregate type or a pointer, since these are incompatible types.

The following example shows a function that returns a float, and some legal return values.

    float f( void )
    {
        float f2;
        int a;
        char c;

        f2 = a;      /* OK, quietly converts a to float */
        return a;    /* OK, quietly converts a to float */
        f2 = c;      /* OK, quietly converts c to float */
        return c;    /* OK, quietly converts c to float */
    }

The C language is pickier about matching pointers. In the following example, f() is declared as a function returning a pointer to a char. Some legal and illegal return statements are shown.

    char *f()
    {
        char **cpp, *cp1, *cp2, ca[10];
        int *ip1, *ip2;

        cp1 = cp2;      /* OK, types match */
        return cp2;     /* OK, types match */
        cp1 = *cpp;     /* OK, types match */
        return *cpp;    /* OK, types match */

        /* An array name without a subscript gets converted
         * to a pointer to the first element.
         */
        cp1 = ca;       /* OK, types match */
        return ca;      /* OK, types match */

        cp1 = *cp2;     /* Error, mismatched types
                           (pointer to char vs. char) */
        return *cp2;    /* Error, mismatched types
                           (pointer to char vs. char) */
        cp1 = ip1;      /* Error, mismatched pointer types */
        return ip1;     /* Error, mismatched pointer types */
        return;         /* Produces undefined behavior;
                           should return (char *) */
    }

Note in the last statement that the behavior is undefined if you return nothing. The only time you can safely use return without an expression is when the function type is void. Conversely, if you return an expression for a function that is declared as returning void, you will receive a compile-time error.
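Two points from the description above can be sketched in a few lines: the quiet conversion of the returned expression to the function's type, and the technique of handing back more than one value by passing a pointer to an aggregate. The names struct div_result, half, and divide_ints are hypothetical and are introduced here only for illustration.

```c
/* Hypothetical aggregate used to hand back two results at once. */
struct div_result {
    int quotient;
    int remainder;
};

/* The returned int expression is converted to float, exactly as it
 * would be in an assignment to a float variable. */
float half( int n )
{
    return n / 2;    /* integer division happens first, then conversion */
}

/* Returns a status value directly, and two further values indirectly
 * through the pointer argument. */
int divide_ints( int num, int denom, struct div_result *out )
{
    if (denom == 0)
        return 0;                   /* failure: *out is left untouched */
    out->quotient  = num / denom;
    out->remainder = num % denom;
    return 1;                       /* success */
}
```

Because the function returns zero on failure, a caller can write if (divide_ints( 17, 5, &r )) and use r.quotient and r.remainder only in the true branch.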
EXAMPLE

    /* Program name is "return_example". This program finds the
     * length of a word that is entered.
     */
    #include <stdio.h>

    int find_length( char *string )
    {
        int i;

        for (i = 0; string[i] != '\0'; i++)
            ;
        return i;
    }

    int main( void )
    {
        char string[132];
        int result;

        printf( "This program finds the length of any word you " );
        printf( "enter.\n" );
        printf( "Enter the word: " );
        gets( string );
        result = find_length( string );
        printf( "This word contains %d characters.\n", result );
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    This program finds the length of any word you enter.
    Enter the word: Copenhagen
    This word contains 10 characters.

#section    (preprocessor directive)

Directs the binder to place code and data into the specified sections. (Domain Extension)

FORMAT

    #section( [psect_name,] dsect_name )
    #section( psect_name [,dsect_name] )

ARGUMENTS

psect_name    Optional, but you must include a psect_name or a dsect_name, or both. If you do include a psect_name, it must be an identifier. This identifier is the name of the procedure section that the code will go into.

dsect_name    Optional, but you must include a psect_name or a dsect_name, or both. If you do include a dsect_name, it must be an identifier. This identifier is the name of the data section that the data will go into.

DESCRIPTION

The #section directive instructs the linker to place instructions and data into named sections rather than the default sections. Every object module is composed of at least three sections: a procedure section, a data section, and a debug section. By default, the name of the procedure section is .text, and the names of the data sections are .data and .bss. The #section directive, as well as the #module directive, allows you to create additional sections. You can use this capability to group together code or data that is used frequently.
This way the system need not swap extra pages in and out of memory to execute a program. For more information about sections and the object file format, see the Domain/OS Programming Environment Reference manual.

Note that the following preprocessor directive is illegal:

    #section()

#section directives may appear anywhere in a file except within a function. Section names defined in a #section directive are in effect until the end of the file or until another #section directive redefines the current section names. By specifying the same section names in different source files, you can ensure that the resulting object code is grouped together in virtual memory.

sizeof

Unary operator that finds the size of an object.

FORMAT

    sizeof exp
    sizeof (type_name)

ARGUMENTS

exp          This is any expression.

type_name    This is the name of a predefined or user-defined data type, or the name of some variable. An example of a predefined data type is int. A user-defined data type could be the tag name of a structure.

DESCRIPTION

The sizeof operator accepts two types of operands: an expression or a data type. However, the expression may not have type function or void, or be a bit field. Moreover, the expression itself is not evaluated; the compiler only determines what type the result would be. Any side effects in the expression, therefore, will not have an effect. The result type of the sizeof operator is unsigned int.

If the operand is an expression, sizeof returns the number of bytes that the result occupies in memory:

    /* Returns the size of an int (4 if ints are four
     * bytes long) */
    sizeof(3 + 5)

    /* Returns the size of a double (8 if doubles are
     * eight bytes long) */
    sizeof(3.0 + 5)

For expressions, the parentheses are optional, so the following is legal:

    sizeof x

By convention, however, the parentheses are usually included.
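The statement that the operand of sizeof is not evaluated is easy to confirm with a side effect. The sketch below is illustrative only: the function names are assumptions made here, and it uses size_t, the ANSI name for the result type of sizeof, where this manual speaks of unsigned int. The second function shows the common idiom of dividing the size of an array by the size of one element.

```c
#include <stddef.h>

/* Hypothetical helper: applies sizeof to an expression that has a
 * side effect, then reports the variable's value afterwards. */
int sizeof_does_not_evaluate( void )
{
    int i = 0;
    size_t n = sizeof(i++);    /* only the type of i++ is examined */

    (void) n;                  /* the size itself is not needed here */
    return i;                  /* i was never incremented */
}

/* Hypothetical helper: number of elements in an array, computed from
 * the size of the whole array and the size of one element. */
size_t table_length( void )
{
    static const short table[12] = { 0 };

    return sizeof(table) / sizeof(table[0]);
}
```

The element-count idiom works only where the array itself is visible; applied to a pointer, sizeof reports the size of the pointer.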
The operand can also be a data type, in which case the result is the length in bytes of objects of that type:

    sizeof(char)     /* 1 on all machines */
    sizeof(short)    /* 2 on Domain machines */
    sizeof(float)    /* 4 on Domain machines */
    sizeof(int *)    /* 4 on Domain machines */

The parentheses are required if the operand is a data type.

Note that the results of most sizeof expressions are implementation dependent. The only result that is guaranteed is the size of a char, which is always 1.

In general, the sizeof operator is used to find the size of aggregate data objects such as arrays and structures.

EXAMPLE

You can also use the sizeof operator to obtain information about the sizes of objects in your C environment. The following, for example, prints the sizes of the basic data types:

    /* Program name is "sizeof_example". This program
     * demonstrates a few uses of the sizeof operator.
     */
    #include <stdio.h>

    int main( void )
    {
        printf( "TYPE\t\tSIZE\n\n" );
        printf( "char\t\t%d\n", sizeof(char) );
        printf( "short\t\t%d\n", sizeof(short) );
        printf( "int\t\t%d\n", sizeof(int) );
        printf( "float\t\t%d\n", sizeof(float) );
        printf( "double\t\t%d\n", sizeof(double) );
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    TYPE            SIZE

    char            1
    short           2
    int             4
    float           4
    double          8

__STDC__ and _BFMT__COFF    (predefined names)

FORMAT

__STDC__       If equal to 1, indicates that this compiler conforms to the ANSI standard. (Note that there are two underscores before and two underscores after this preprocessor symbol.)

_BFMT__COFF    If defined, indicates that this compiler generates COFF.

DESCRIPTION

The __STDC__ macro, if it expands to 1, signifies that the compiler conforms to the ANSI Standard. If it expands to any other value, or if it is not defined, you should assume that the compiler does not conform to the ANSI standard.
A common use of __STDC__ is to choose between the old function declaration syntax and the new ANSI prototyping syntax:

    #if (__STDC__ == 1)
    extern int foo( char a, float b );
    extern char *goo( char *string );
    #else
    extern int foo();
    extern char *goo();
    #endif

If the compiler conforms to the ANSI standard (__STDC__ equals 1), we use the prototyping syntax to declare the types of each argument. Otherwise, we use the old function declaration syntax.

By default, __STDC__ is defined unless you compile with the -ntype option (available with /com/cc only).

The _BFMT__COFF macro will be defined as 1 for compilers that generate COFF (as opposed to obj) code. For compilers that do not produce COFF, the macro will be undefined. Therefore, you can test the compiler with either an #if or an #ifdef directive:

    #ifdef _BFMT__COFF
    /* Use #attribute to create overlay
     * section */
    struct { int a; float b; } overlay #attribute[section(overlay)];
    #else
    struct { int a; float b; } overlay;
    #endif

structure and union operations

Operations that can be performed on structures and unions, and on structure and union members.

DESCRIPTION

In Chapter 3, we explained how to define structure and union variables. In this section, we show how to use structure and union variables in the body of a function. Domain C allows the following uses of structures and unions:

• You can reference a member of a structure or union.

• You can find the address of a structure or union with the address-of operator &.

• You can find the size of a structure or union with the sizeof operator.

• You can assign a structure or union to another structure or union of the same type.

• You can define a function that returns a structure or union.

• You can pass a structure or union as an argument to a function.

The following sections detail these uses.

Referencing Structure and Union Members

There are two methods for referencing a member of a structure or union,
depending on whether you have the structure itself or a pointer to the structure. Each method uses a special operator.

If you have the structure itself, you can enter the structure name and field name separated by the dot (.) operator. For instance, suppose you make the following declaration:

    struct vitalstat
    {
        char vs_name[19], vs_ssnum[11];
        short vs_month, vs_day, vs_year;
    } vs, *pvs = &vs;

To assign the date March 15, 1987 to vs, you would write:

    vs.vs_month = 3;
    vs.vs_day = 15;
    vs.vs_year = 1987;

The referenced field expression is just like any other variable, so you can use vs.vs_month anywhere you would normally use a short variable. The following statement, for instance, is perfectly legal:

    if (vs.vs_month > 12 || vs.vs_day > 31)
        printf( "Illegal Date.\n" );

The other way to reference a structure member is indirectly through a pointer to the structure. To reference a member through a pointer, use the right-arrow operator (->), which is formed by entering a dash followed by a right angle bracket. For example:

    if (pvs->vs_month > 12 || pvs->vs_day > 31)
        printf( "Illegal Date.\n" );

The right-arrow operator is actually a shorthand for dereferencing the pointer and using the dot operator. That is,

    pvs->vs_day

is the same as:

    (*pvs).vs_day

The pointer to a struct or union is usually a pointer variable, but Domain C also allows it to be an integer that contains the absolute address of a structure or union. (Using an integer in this context is not a portable feature; trying it triggers a warning.)

Operations on Structure and Union Members

In general, you can perform any operation on a structure member that you can on a normal variable of the same type. The only restriction is that you may not take the address of a bit field.
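The equivalence of the two member-access forms can be sketched as follows, reusing the vitalstat layout from the declaration above. The functions date_is_plausible and same_member are hypothetical helpers introduced here for illustration, and the lower bound of 1 on month and day is an added assumption, not part of the manual's test.

```c
struct vitalstat
{
    char  vs_name[19], vs_ssnum[11];
    short vs_month, vs_day, vs_year;
};

/* Hypothetical helper: the manual's range test, written with the
 * right-arrow operator because only a pointer is available. */
int date_is_plausible( const struct vitalstat *pvs )
{
    return pvs->vs_month >= 1 && pvs->vs_month <= 12 &&
           pvs->vs_day   >= 1 && pvs->vs_day   <= 31;
}

/* Hypothetical helper: pvs->vs_day and (*pvs).vs_day name the same
 * object, so their addresses compare equal. */
int same_member( struct vitalstat *pvs )
{
    return &pvs->vs_day == &(*pvs).vs_day;
}
```

A caller that holds the structure itself passes its address, as in date_is_plausible( &vs ), and continues to use the dot operator on vs in its own code.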
Structure and Union Assignment

Although it is not supported in the original K&R standard, Domain C and the ANSI Standard allow you to assign a structure or union to a structure or union variable, provided they share the same type. The following code extract shows some examples of structure assignments.

    struct { int a; float b; } s1, s2, sf(), *ps;

    s1 = s2;
    s2 = sf();
    ps = &s1;
    s2 = *ps;

Referencing Nested Members

Domain C allows you to access nested members of structures and unions without specifying the inner structure name. You need only enter the outer structure name and the member name you want. Consider the following nested structure:

    struct {
        int a;
        struct { float b, c; } in;
    } out;

Domain C provides two ways to access component b. First, you can use the traditional C method as shown below:

    out.in.b

Second, you can leave out the inner structure name, as in

    out.b

If the same name appears more than once in a structure with inner structures and you give only the component name, the compiler warns you that the reference is ambiguous. For example, consider the following definition:

    struct {
        union { int a, b; } first_union;
        union { char a, b; } second_union;
    } outer_struct;

The reference outer_struct.a is ambiguous since it is not clear whether it refers to outer_struct.first_union.a or outer_struct.second_union.a. If you use outer_struct.a as a reference, the compiler issues the following warning message:

    Warning: Ambiguous reference; more than one member named "a".

NOTE: This feature is not supported by the ANSI standard. Use the -std compiler option to identify these nonportable usages in source code.

Passing Structures as Function Arguments

There are two ways to pass structures as arguments: pass the structure itself (called pass by value) or pass a pointer to the structure (called pass by reference). The two methods are shown in the following example.
    VITALSTAT vs;

    func( vs );     /* Pass by value -- Passes an entire copy
                     * of the structure.
                     */
    func( &vs );    /* Pass by reference -- Passes the address
                     * of a structure.
                     */

Passing the address of a structure is usually faster because only a single pointer is copied to the argument area. Passing by value, on the other hand, requires that the entire structure be copied. There are only two circumstances when you should pass a structure by value:

• The structure is very small (approximately the same size as a pointer).
• You want to guarantee that the called function does not change the structure being passed. (When an argument is passed by value, the compiler generates a copy of the argument for the called function. The called function can only change the value of the copy, not the value of the argument on the calling side.)

In all other instances, you should pass structures by reference.

NOTE: Passing structures by value, though supported in almost all C compilers, is not part of the original K&R standard. It is required by the ANSI Standard.

Depending on which method you choose, you need to declare the argument on the receiving side as either a structure or a pointer to a structure:

    func( vs )
    VITALSTAT vs;       /* Pass by value -- the argument is a
                         * structure. */

or

    func( pvs )
    VITALSTAT *pvs;     /* Pass by reference -- the argument
                         * is a pointer to a structure. */

Note that the argument-passing method you choose determines which operator you should use in the function body: the dot operator if the structure is passed by value, and the right-arrow operator if the structure is passed by reference.

Bug Alert: Passing Structures vs. Passing Arrays

Passing structures is not the same as passing arrays. This inconsistency in the C language can cause confusion.

To pass an array in C, you simply specify the array name without a subscript. The compiler interprets the name as a pointer to the initial element of the array, so it really passes the array by reference. There is no way to pass an array by value (except to embed it in a structure and pass the structure by value).

With structures, however, the structure name is interpreted as the entire structure, not as a pointer to the beginning of the structure. If you use the same syntax that you use with arrays, therefore, you will get different semantics. For example:

    int ar[100];
    struct tag st;

    func( ar );     /* Passes a pointer to the first element of ar[] */
    func( st );     /* Passes an entire structure */

The inconsistency follows through to the receiving side. For example, the following two array versions are the same:

    func( ar )
    int ar[];       /* ar is converted to a pointer to an int */

    func( ar )
    int *ar;        /* ar is a pointer to an int */

But the following two structure versions are very different:

    func( st )
    struct tag st;      /* st is an entire structure */

    func( st )
    struct tag *st;     /* st is a pointer to a struct */

Returning Structures

Just as it is possible to pass a structure or a pointer to a structure, it is also possible to return a structure or a pointer to a structure. (Returning a structure is not supported in the original K&R standard, but is a common extension supported by most C compilers and by the ANSI standard.) The declaration of the function's return type must agree with the actual returned value. For example:

    struct tag f()          /* Define a function that returns */
                            /* a struct */
    {
        struct tag st;
        return st;          /* Return an entire struct */
    }

    struct tag *f1()        /* Define a function that returns */
                            /* a pointer to a struct */
    {
        static struct tag pst;
        return &pst;        /* Return the address of a struct */
    }

As with passing structures, you generally want to return pointers to structures because it is more efficient.
Note, however, that if you return a pointer to a structure, the structure must have fixed duration. Otherwise, it will cease to be valid once the function returns.

One situation where returning structures is particularly useful is when you want to return more than one value. The return statement can only send back one expression to the calling routine, but if that expression is a structure or a pointer to a structure, you can indirectly return any number of values. The following function, for instance, returns the sine, cosine, and tangent of its argument. The functions sin(), cos(), and tan() are part of the run-time library. Each accepts an argument measured in radians and returns the corresponding trigonometric value. If the argument is too large, however, the results will not be meaningful.

    #include <stdio.h>
    #include <math.h>       /* include file for trig */
                            /* functions */

    #define TOO_LARGE 100   /* Differs from one machine */
                            /* to another. */

    typedef struct {
        double sine, cosine, tangent;
    } TRIG;

    TRIG *get_trigvals( radian_val )
    double radian_val;
    {
        static TRIG result;

    /* If radian_val is too large, the sine, cosine and
     * tangent values will be meaningless.
     */
        if (radian_val > TOO_LARGE) {
            printf( "Input value too large -- cannot return meaningful results\n" );
            return NULL;    /* return null pointer -- defined in
                             * stdio.h. */
        }
        result.sine = sin( radian_val );
        result.cosine = cos( radian_val );
        result.tangent = tan( radian_val );
        return &result;
    }

Referencing a Member Through a Pointer to Another Structure

To be compatible with older versions of C that did not create separate name spaces for every structure and union, Domain C allows you to access members through pointers to other structures and unions. That is, the member name can be a member of any structure, not just the structure of which it is a member.
The following program fragment demonstrates this unusual feature of C:

    struct a { int a; char b; } x = {'ABCD', 'E'};
    struct b { char aa; int bb; } y;

    main()
    {
        int *i;

        i = &y;
        printf( "%c\n", i->a );
    }

Note that pointer variable i holds the address of structure variable y. Further note that a is a member of structure x, not structure y. Yet, we are able to refer to i->a in the printf() call. When C encounters i->a, it looks for any structure member whose name is a. If we execute the preceding program, we get the following output:

    A

NOTE: This functionality is not a portable feature since most modern C compilers do not support it.

switch

A conditional branching statement that selects among several statements based on constant values.

FORMAT

    switch ( exp )
    {
        case const_exp : [statement ...]
        [case const_exp : [statement ...]]
        [default : [statement ...]]
    }

ARGUMENTS

exp          The integer expression that the switch statement evaluates and then compares to the values in all the cases.

const_exp    An integer expression to which exp is compared. If const_exp matches exp, the accompanying statement is executed.

statement    Zero or more simple statements. (Note that if there is more than one simple statement, you do not need to enclose the statements in braces.)

DESCRIPTION

The expression immediately after the switch keyword must be enclosed in parentheses and must be an integral expression. That is, it can be char, short, int, or long, but not float, double, or long double.

NOTE: The K&R standard requires the switch expression to be of type int.

The expressions following the case keywords must be integral constant expressions, meaning they may not contain variables.

The semantics of the switch statement are straightforward. The switch expression is evaluated; if it matches one of the case labels, program flow continues with the statement that follows the matching case label.
If none of the case labels match the switch expression, program flow continues at the default label, if it exists. (Strictly speaking, the default label need not be the last label, though it is good style to put it last.) No two case labels may have the same value.

An important feature of the switch statement is that program flow continues from the selected case label until another control-flow statement is encountered or the end of the switch statement is reached. That is, the compiler executes any statements following the selected case label until a break, goto, or return statement appears. The break statement explicitly exits the switch construct, passing control to the statement following the switch statement. Since this is usually what you want, you should almost always include a break statement at the end of the statement list following each case label.

The following print_error() function, for example, prints an error message based on an error code passed to it.

    /* Prints error message based on error_code.
     * Function is declared with void because it
     * doesn't return anything.
     */
    #include <stdio.h>

    #define ERR_INPUT_VAL 1
    #define ERR_OPERAND   2
    #define ERR_OPERATOR  3
    #define ERR_TYPE      4

    void print_error( error_code )
    int error_code;
    {
        switch (error_code) {
            case ERR_INPUT_VAL:
                printf( "Error: Illegal input value.\n" );
                break;
            case ERR_OPERAND:
                printf( "Error: Illegal operand.\n" );
                break;
            case ERR_OPERATOR:
                printf( "Error: Unknown operator.\n" );
                break;
            case ERR_TYPE:
                printf( "Error: Incompatible data.\n" );
                break;
            default:
                printf( "Error: Unknown error code %d\n", error_code );
                break;
        }
    }

The break statements are necessary to prevent the function from printing more than one error message. The last break after the default case isn't really necessary, but it is a good idea to include it anyway for consistency's sake.

Sometimes you want to associate a group of statements with more than one case value. To obtain this behavior, you can enter consecutive case labels.
The following function, for instance, returns 1 if the argument is a punctuation character, or zero if it is anything else.

    /* This function returns 1 if the argument is a
     * punctuation character. Otherwise, it returns zero.
     */
    is_punc( arg )
    char arg;
    {
        switch (arg) {
            case '.':
            case ',':
            case ':':
            case ';':
            case '!':   return 1;
            default :   return 0;
        }
    }

Domain C allows the use of enum values as the control expressions and case labels of a switch statement. However, if you use an enum in one place, you must use enums elsewhere. That is, if exp is of type enum, then all the case labels must also be of type enum, while if exp is not an enum, none of the case labels may be enums. For example:

    /* Program name is "enums_in_a_switch". */
    #include <stdio.h>

    int main( void )
    {
        enum AUTHORS { Hemingway, Steinbeck, Twain };
        enum AUTHORS favorite = { Twain };

        switch (favorite) {
            case Hemingway:
                printf( "A Farewell To Arms\n" );
                break;
            case Steinbeck:
                printf( "The Grapes of Wrath\n" );
                break;
            case Twain:
                printf( "The Adventures of Tom Sawyer\n" );
                break;
    /*
     *      case 5 : printf("no author")
     *      THIS WOULD BE ILLEGAL SINCE 5 IS NOT AN ENUM VALUE.
     */
        }
    }

EXAMPLE

    /* program name is "switch_example". Read a student's grade
     * from the keyboard. Then the switch statement uses the grade
     * to decide which comment should be printed. Notice that the
     * cases allow for uppercase and lowercase letters to be
     * entered.
     */
    #include <stdio.h>

    int main( void )
    {
        char answer, grade;

        answer = 'y';
        printf( "\n\n" );
        while (answer == 'y' || answer == 'Y') {
            printf( "Enter student's grade: " );
            fflush( stdin );
            scanf( "%c", &grade );
            printf( "\nComments: " );
            switch (grade) {
                case 'A':
                case 'a':
                    printf( "Excellent\n" );
                    break;
                case 'B':
                case 'b':
                    printf( "Good\n" );
                    break;
                case 'C':
                case 'c':
                    printf( "Average\n" );
                    break;
                case 'D':
                case 'd':
                    printf( "Poor\n" );
                    break;
                case 'F':
                case 'f':
                    printf( "Failure\n" );
                    break;
                default:
                    printf( "Invalid grade\n" );
                    break;
            } /* end switch */
            printf( "\nAgain? " );
            fflush( stdin );
            scanf( "%c", &answer );     /* answer is a single char */
        } /* end while */
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    Enter student's grade: B
    Comments: Good
    Again? y
    Enter student's grade: c
    Comments: Average
    Again? n

#systype (preprocessor directive) and the systype() macro

Selects the target operating system. (Domain Extension)

FORMAT

    #systype systype_name         Preprocessor directive
    systype( systype_name )       Predefined macro

ARGUMENTS

systype_name    A string containing the name of an operating system. The string must be one of the following:

    • bsd4.1    Berkeley 4.1BSD (obsolete)
    • bsd4.2    Berkeley 4.2BSD
    • bsd4.3    Berkeley 4.3BSD
    • sys3      AT&T System III (obsolete)
    • sys5      AT&T System V Release 2
    • sys5.3    AT&T System V Release 3
    • any       program is independent of a particular UNIX system

DESCRIPTION

We divide this listing into an explanation of the preprocessor directive and the macro. First, we describe the preprocessor directive.

The #systype Preprocessor Directive

Because C programs are often written to run in UNIX environments, and because not all UNIX environments are the same, Domain C supports the #systype preprocessor directive, which allows you to define the version of the UNIX system for which your program is targeted.

The Domain C library contains two sets of routines.
One is compatible with the Bell Labs versions of the UNIX system (System V, Releases 2 and 3) and the other set is compatible with Berkeley's versions of the UNIX system (4.2BSD and 4.3BSD). All of the routines in both sets work properly in any Domain/OS environment. However, you may encounter problems if you attempt to mix functions from the two sets that interact with each other. In general, it is best to choose one set and stick with it whenever possible.

The two sets of functions overlap to a large extent. It is sometimes the case, however, that while function x exists in both sets, the semantics of the function (and in some cases its arguments) may be subtly different. As an illustration, consider the function setpgrp(). In the System V version, the function definition is:

    int setpgrp()

It is defined to set the process group ID of the calling process to the process ID of the calling process and return the new process group ID. In the 4.2BSD version of the UNIX system, there is an identically named function with similar semantics but a different calling sequence. The Berkeley function,

    setpgrp( pid, pgrp )
    int pid, pgrp;

sets the process group of the specified process (pid) to the specified pgrp. Zero is returned if successful; -1 is returned and errno is set on failure.

To avoid unexpected behavior, always know which set of functions you are accessing. The system chooses one set of functions over another based on a version selector called the systype. The systype affects both the compilation and the execution of a program. At compilation time, it determines which include files the compiler uses. At run time, it determines which set of functions is called and makes sure that the proper calling conventions are employed. The compiler stamps the object module with the systype that was in effect when the module was compiled.
When the program is executed, the loader checks this stamp and uses the semantics and calling sequences of the designated systype when invoking library functions.

There are several ways to define the systype, one of which is to place a #systype directive in the source file. You may define the systype only once per source file. Any subsequent definitions produce an error. Moreover, the #systype directive must be the first non-comment token in the source file. For instance, to set the systype to 4.2BSD, enter the following at the top of your source file:

    #systype bsd4.2

It is also legal to enclose the systype in double quotes:

    #systype "bsd4.2"

You also can define the target operating system with the -systype compile option (/com/cc only), which is described in Chapter 6. If you specify one systype on the command line and a different one in the file, the compiler reports an error.

If you do not explicitly specify a systype, the compiler inherits the systype from an environment variable called COMPILESYSTYPE. By default, this variable is set to sys5. If, for some reason, the COMPILESYSTYPE variable does not exist, the systype is inherited from another environment variable called SYSTYPE. This variable is always set. These environment variables are described in more detail in the Using the SysV Environment and Using the BSD Environment manuals.

NOTE: Be especially careful about using systype any. Most programs are not independent of a particular version. For example, programs running under the Aegis environment are systype sys5.

The systype Macro

Domain C supports a macro called systype that enables you to find out what the current UNIX systype is. By default, the systype is "sys5", but you can change it with the #systype preprocessor directive or with the -systype compiler option. The macro may be used only in an #if preprocessor directive.
It evaluates to 1 if the argument is the same as the systype, and evaluates to zero (0) if the argument differs from the systype. The quotes around the argument are optional. For example:

    #if systype("bsd4.2")
    #include "comments.4.2"
    #else
    #if systype(bsd4.3)
    #include "comments.4.3"
    #else
    #include "comments.bell"
    #endif
    #endif

__TIME__ (predefined symbol)

See the __DATE__ and __TIME__ listing earlier in this chapter.

while

Executes the statements within a loop as long as the specified condition is true.

FORMAT

    while ( exp ) statement

ARGUMENTS

exp          Any expression.

statement    Any simple or compound statement.

DESCRIPTION

This is one of the three looping constructions available in C. Like the for loop, the while loop tests exp and, if it is true (nonzero), executes statement. Once exp becomes false (zero), execution of the loop stops. Since exp could be false the first time it is tested, statement may not be performed even once.

The following describes two ways to jump out of a while loop prematurely (that is, before exp becomes false):

• Use break to transfer control to the first statement following the while loop.
• Use goto to transfer control to some labeled statement outside the loop.

EXAMPLE

    /* program name is "while_example". */
    #include <stdio.h>

    int main( void )
    {
        int count = 0, count2 = 0;
        char a_string[80], *ptr_to_a_string = a_string;

        printf( "Enter a string -- " );
        gets( a_string );
        while (*ptr_to_a_string++)
            count++;                /* A simple statement loop */
        printf( "The string contains %d characters.\n", count );
        printf( "The first word of the string is " );
        /* Stop at the first space, or at the end of a one-word string. */
        while (a_string[count2] != ' ' && a_string[count2] != '\0') {
                                    /* A compound statement loop */
            printf( "%c", a_string[count2] );
            count2++;
        }
        printf( "\n" );
    }

USING THIS EXAMPLE

If we execute this program, we get the following output:

    $ while_example.bin
    Enter a string -- Four score and seven years ago
    The string contains 30 characters.
    The first word of the string is Four
    $

Chapter 5

Functions

The main organizational unit of C is the function. Functions can appear in a program in three forms:

Function Definition    A declaration that actually defines what the function does, as well as the number and type of arguments.

Function Allusion      Declares a function that is defined elsewhere. A function allusion specifies what kind of value the function returns. (With the new prototyping feature, discussed in Section 5.4, it is also possible to specify the number and types of arguments in a function allusion.)

Function Call          Invokes a function, causing program execution to jump to the invoked function. When the called function returns, execution resumes at the point just after the call.

This chapter discusses function definitions, allusions, and calls, and other topics associated with functions, such as recursion and pointers to functions.

5.1 Function Definitions

The syntax of a function definition is shown below:

    [static] [return_type] function_name( [arg_name [, arg_name ...]] )
    [arg_declaration] [arg_declaration ...]
    {
        function_body
    }

You can specify any number of arguments, including zero. The return type defaults to int if you leave it blank. However, even if the return type is int, you should specify it explicitly to avoid confusion.

We break the discussion of function definitions into two parts: the function's preamble (everything before the left brace) and the function's body (from the left brace to the right brace).

5.1.1 Function Preamble

Domain C supports two forms for a function preamble: the old form specified by the K&R standard and the new form (called prototyping) specified by the ANSI standard and used in the C++ programming language. This section describes the K&R method; later sections describe the new prototyping feature.

The function's preamble must at the very least consist of the name of the function followed by a pair of parentheses.
All other parts of a function are optional. The other parts are:

• The static storage class specifier to give the function file scope.
• The data type of the value that the function intends to return. If you do not specify a data type, the compiler assumes that the function returns an int.
• The function's argument list, which is a list of identifiers separated by commas.
• One optional parameter declaration for every argument in the argument list. A parameter declaration takes the same format as a variable declaration. If you omit a type in the declaration, the type defaults to int. If there are no arguments in the argument list, do not specify any parameter declarations.

NOTE: You must put a semicolon after each parameter declaration, but never put a semicolon after the argument list.

If the function does not return an int, you must specify the true return type. If the function does not return any value, you should specify a return type of void. Before void became a common feature of C compilers, it was a convention to leave off the return type when there was no return value. The return type would default to int, but the context in which the function was used would usually make it clear that no meaningful value was returned. With modern C compilers such as Domain C, however, there is no excuse for omitting the return type.

5.1.1.1 Argument Declarations

Formal argument declarations obey the same rules as other variable declarations, with the following exceptions:

• The only legal storage class specifier is register. (The default duration is automatic, but the auto specifier is not legal in this context.)
• chars and shorts are passed as ints; floats are passed as doubles. (With the new ANSI prototyping feature, you can disable these automatic conversions.)
• A formal argument declared as an array is converted to a pointer to an object of the array type.
• A formal argument declared as a function is converted to a pointer to a function.
• You may not include an initializer in an argument declaration.

It is legal to omit an argument declaration, in which case the argument type defaults to int. This is considered very poor style, however.

Let us now examine several sample function preambles:

Example 1

Our first example shows the preamble of a function named ghost that accepts no arguments and returns no values; therefore, it simply looks like this:

    void ghost()

The data type void ensures that no value will be passed back to the calling function. Notice that we have to put an empty set of parentheses after the name of the function to remind the compiler that this is indeed a function. The fact that the parentheses are empty means that the function has no parameters.

Example 2

Our second example is a function named analyze that accepts a single floating-point number as an argument. Here's how to declare it:

    void analyze( x )
    float x;

Notice that we declared x's data type immediately after the function preamble. Also notice that we put a semicolon after the parameter declaration but not after the argument list.

Example 3

Our third example shows a function that accepts two integer arguments and returns a floating-point result. It looks as follows:

    float pythagorean( leg1, leg2 )
    int leg1;
    int leg2;

The keyword float identifies the data type of the returned answer. We declared leg1 and leg2 separately for clarity, though we could have written the function preamble like this instead:

    float pythagorean( leg1, leg2 )
    int leg1, leg2;

Example 4

Our fourth example accepts three arguments and returns a pointer to a character:

    char *razzmatazz( high, low, precision )
    long int high;
    short int low;
    double precision;

5.1.2 The Body of the Function

After the function preamble comes the body of the function.
The body of the function takes the following format:

    {
        local declaration1
          .
          .
        local declarationN
        statement1
          .
          .
        statementN
    }

For example, here is a sample function body:

    {
        int y, x;                               /* local declarations */

        scanf( "%d", &x );                      /* statement */
        y = 10 * x;                             /* statement */
        printf( "10 times %d is %d\n", x, y );  /* statement */
    }

You must enclose the function body in braces. Note that a function body can consist of braces and nothing else; for example, the following function body is perfectly legal:

    /* a good place holder for code not yet written */
    {
    }

Statements within the function body can use the following kinds of variables:

• The function's parameters (that is, the parameters defined in this function's preamble).
• The variables declared within this function (variables having block scope within this function).
• Variables with global scope or file scope. (See Chapter 2 for a complete discussion of variable scope.)

5.2 Function Allusions

A function allusion is a declaration of a function that is defined elsewhere, usually in a different source file. The main purpose of the function allusion is to tell the compiler what type of value the function returns. With the new prototyping feature, it is also possible to declare the number and types of arguments that the function takes. This feature is discussed in Section 5.4. The remainder of this section describes the old function allusion format. Note that Domain C supports both the old and the new formats.

By default, all functions are assumed to return an int. You are only strictly required, therefore, to include function allusions for functions that do not return an int. However, it is good style to include function allusions for all functions that you call.

The syntax for a function allusion is shown in Figure 5-1. If you omit the storage class, it defaults to extern, signifying that the function definition may appear in the same source file or in another source module.
The only other legal storage class is static, which indicates that the function is defined in the same source file. The data type in the function allusion should agree with the return type specified in the definition. If you omit the type, it defaults to int. Note that if you omit both the storage class and the data type, the expression is a function call with no arguments if it appears within a block; if it appears outside of a block, it is an allusion:

    f1();           /* Function allusion -- default type is int */

    main()
    {
        f2();       /* Function call */

[Figure 5-1. Syntax of a Function Allusion (syntax diagram; labeled elements include storage class and function name)]

Typically, a function allusion appears at the head of a block with other declarations. The scoping rules for function allusions are the same as for other variables declared with extern. Note, however, that the default storage class rules are different for functions than for other variables. For example, in the following declaration, the storage class of pflt and arr_flt[] defaults to auto, whereas the storage class of func_flt() defaults to extern.

    {
        float func_flt();
        float *pflt, arr_flt[10];

If this declaration appeared outside of a block, pflt and arr_flt[] would be global definitions, whereas func_flt() would still be a function allusion.

5.2.1 Forward References and Backward References

When we make a forward reference to a function, we mean that the function call appears in the source code prior to the function's definition or allusion. A backward reference to a function means that the function call appears in the source code after the function's definition or allusion. C unconditionally permits backward references, but restricts forward references. You can make a forward reference when either of the following conditions is true:

• The called function returns an int value
• The caller does not use the value returned by the called function

Stylistically, however, it is best to declare prototypes for all functions before they are invoked.
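The stylistic advice above can be sketched as follows: placing allusions at the top of the file turns every later call into a backward reference, so the return type is known even when the definition comes afterward. The function names half(), twice(), quarter(), and four_times() are hypothetical, chosen only for illustration, and the allusions use the prototype form.

```c
/* Function allusions: declare the return types (and, in prototype
 * form, the argument types) before any call appears. */
extern double half( double x );    /* extern: may be defined in any file  */
static int twice( int n );         /* static: defined later in this file  */

/* These calls precede the definitions of half() and twice(), but they
 * follow the allusions, so the compiler does not assume an int return. */
double quarter( double x )
{
    return half( half( x ) );
}

int four_times( int n )
{
    return twice( twice( n ) );
}

/* The definitions; their return types agree with the allusions. */
double half( double x )
{
    return x / 2.0;
}

static int twice( int n )
{
    return 2 * n;
}
```

Without the allusion for half(), a compiler that follows the old rules would assume that half() returns an int, and the double result would be misinterpreted.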
5.3 Function Calls

A function call, also called a function invocation, passes control to the specified function. The syntax for a function call is shown in Figure 5-2. A function call is an expression, and can appear anywhere an expression can appear. Unless they are declared as returning void, functions always return a value that is substituted for the function call. For example, if f() returns 1, the statement

    a = f()/3;

is equivalent to:

    a = 1/3;

It is also possible to call a function without using the return value. The statement

    f();

calls the function f(), but does not use the return value. If f() returns 1, the statement is equivalent to:

    1;

which is a legal C statement, although it is a no-op (no operation is performed, assuming f() has no side effects).

[Figure 5-2. Syntax of a Function Call (syntax diagram; labeled elements include function name and argument)]

5.3.1 Call by Value

Arguments to a function are a means of passing data to the function. Many programming languages (notably FORTRAN) pass arguments by reference, which means they pass a pointer to the argument. As a result, the called function can actually change the value of the argument. In C, arguments are passed by value, which means that a copy of the argument is passed to the function. The function can change the value of this copy, but cannot change the value of the argument in the calling routine. (Domain C supports a C++ extension that enables you to pass arguments by reference. This feature is described in Section 5.3.2.)

Figure 5-3 shows the difference. Note that the arrows in the pass-by-reference picture point in both directions, indicating that the calling and called function can send information to each other through arguments. In the pass-by-value diagram, the arrows go in only one direction because only the calling function can send information through arguments.
The argument that is passed is often called an actual argument, while the received copy is called a formal argument or formal parameter.

Figure 5-3. Pass by Reference vs. Pass by Value

Because C passes arguments by value, a function can assign values to the formal arguments without affecting the actual arguments. For example:

    /* Program name is "pass_by_val_example". */
    #include <stdio.h>

    int main( void )
    {
        extern void f( int );
        int a = 2;

        f( a );    /* pass a copy of "a" to "f()" */
        printf( "Value of \"a\" after return is %d\n", a );
    }

    void f( int received_arg )
    {
        received_arg = 3;    /* Assign 3 to argument copy */
    }

In the example above, the printf() function prints 2, not 3, because the formal argument, received_arg in f(), is just a copy of the actual argument a. The order of the actual arguments matches the order of the formal arguments, regardless of the names used. That is, the first actual argument is matched to the first formal argument, the second actual argument to the second formal argument, and so on. For correct results, the types of the corresponding actual and formal arguments should be the same.

If you do want a function to change the value of an object, you must pass a pointer to the object, and then make an assignment through the dereferenced pointer. The following, for example, is a function that swaps the values of two integer variables.

    /* Program name is "pass_by_ref_example". */
    #include <stdio.h>

    void swap( int *x, int *y )
    {
        register int temp;

        temp = *x;
        *x = *y;
        *y = temp;
    }

To call this function, you need to pass two addresses:

    int main( void )
    {
        int a = 2, b = 3;

        swap( &a, &b );
        printf( "a = %d\t b = %d\n", a, b );
    }

Executing this program yields:

    a = 3    b = 2

5.3.1.1 Automatic Argument Conversions

In the absence of prototyping, all scalar arguments smaller than an int are converted to int, and all float arguments are converted to double. If the formal argument is declared as a char or short, the receiving function assumes that it is getting an int, so the receiving side converts the int to the declared type. If the formal argument is declared as a float, the receiving function assumes that it is getting a double, so it converts the received argument to float. This means that every time a char, short, or float is passed, at least one conversion takes place on the sending side, where the argument is converted to int or double. In addition, the argument may be converted again on the receiving side if the formal argument is declared as a char, short, or float. Consider the following:

    {
        char a;
        short b;
        float c;

        foo( a, b, c );    /* a and b are promoted to ints, and
                            * c is promoted to double.
                            */

    foo( x, y, z )
    char x;     /* Received arg is converted from int to char. */
    short y;    /* Received arg is converted from int to short. */
    float z;    /* Received arg is converted from double to float. */
    {

Note that these conversions are invisible. So long as the types of the actual arguments match the types of the formal arguments, the arguments will be passed correctly. However, as discussed in Section 5.4, these conversions can affect the efficiency of your program. Prototyping enables you to turn off automatic argument conversions.

5.3.1.2 Passing an Array as an Argument

C does not pass arrays by value because this would involve too much value copying at run time (particularly for a large array).
Instead, C passes the address of the first element of the array. For more information about passing arrays as arguments, see the "array operations" section of Chapter 4.

5.3.1.3 Passing Structures and Unions as Arguments

The K&R standard permits you to pass a member of a structure or union as a function argument. In conformance with the new ANSI standard, Domain C also permits you to pass an entire structure or union as a function argument. For more information about passing structures and unions as arguments, see the "structure and union operations" section of Chapter 4.

5.3.2 Passing Arguments By Reference

Domain C supports a feature from the C++ language that allows you to declare reference variables. One way in which reference variables can be used is to pass arguments by reference. To pass arguments by reference, all you need to do is declare the formal arguments as reference variables. For example:

    void incr( int &x )
    {
        x++;
    }

The reference variable x becomes an alias for whatever value is passed as an actual argument. For instance, if you call incr() from main(), as shown below, the actual argument j will be incremented.

    int main( void )
    {
        extern void incr( int & );
        int j = 5;

        incr( j );
        printf( "Now, the value of j is: %d\n", j );
    }

Note that this same behavior can be obtained using pointers:

    void incr( x )
    int *x;
    {
        (*x)++;
    }

    int main( void )
    {
        extern void incr( int * );
        int j = 5;

        incr( &j );    /* pass the address of j explicitly */
        printf( "Now, the value of j is: %d\n", j );
    }

The principal difference between these two methods is that in the pointer version, you must explicitly pass the address of j. In the reference variable version, the address of j is obtained implicitly.

The actual argument to a reference variable can be an lvalue or an rvalue. If you pass a constant, however, the called function may not modify the value. Although the compiler will not report this error, the program will abort with a run-time error when it attempts to access read-only memory. For example, if you pass a constant to incr()

    incr( 5 );

the program will issue the following run-time error when it attempts to increment the constant:

    ?(sh) "./incr.bin" In routine "incr" - access violation (OS/fault handler)

These semantics differ somewhat from the semantics described in The C++ Programming Language, which states that the compiler treats reference arguments as if they are normal reference variables initialized with the values passed as actual arguments. This implies that if an rvalue is passed, the compiler should produce a temporary variable. This is, in fact, how Domain C works if you pass an expression rather than a constant. For example, the following invocation of incr() works because the compiler generates a temporary variable for the expression 2+3.

    incr( 2 + 3 );

5.4 Function Prototypes

Function prototyping is a feature introduced to the C language by Bjarne Stroustrup, the designer of the C++ programming language, and adopted by the ANSI X3J11 Technical Committee. Function prototypes in Domain C behave exactly as documented in the ANSI standard.

Function prototypes allow function declarations to include data type information about arguments. This has two main benefits:

• Function prototyping enables the compiler to check that the types of the actual arguments in the function call match the types of the formal arguments specified in the function declaration.
• Function prototyping turns off automatic argument conversions. Floating types are not converted to double and small integers are not converted to int. This can significantly speed up algorithms that make intensive use of small integer or floating-point data.

The format for declaring function prototypes is the same as the old function allusion syntax except that you enter types for each argument.
For example, the function allusion

    extern void func( int, float, char * );

declares a function that accepts three arguments: an int, a float, and a pointer to a char. The argument types may optionally be followed by variable names. For example, the previous declaration could be written:

    extern void func( int a, float b, char *pc );

The variable names have no meaning other than to make the type declarations easier to read and write. No storage is allocated for them, and the names do not conflict with real variables that have the same name. You may include the storage class register in a prototype, but it has no meaning.

Prototyping ensures that the right number of arguments are passed, and it prohibits you from passing arguments that cannot be quietly converted to the correct type. On the other hand, it does quietly convert arguments when it can. As a result, you may actually pass the wrong type of argument without receiving a compile-time error. However, you will receive a warning if the types do not match. If you attempt to call this function with

    func( j, x );

the compiler will report an error because the call contains only two arguments whereas the prototype specifies three arguments. Also, if the argument types cannot be converted to the types specified in the prototype, a compilation error occurs. The rules for converting arguments are the same as for assignments. The following, for example, should produce an error because the compiler cannot automatically convert a float to a pointer.

    {
        extern void f( int * );
        float x;

        f( x );    /* ILLEGAL -- cannot convert a float
                    * to a pointer */

If the compiler can quietly convert an argument to the type of its prototype, it does so. In the following example, for instance, j is converted to a float and x is converted to a short before they are passed.

    {
        extern void f( float, short );
        double x;
        long j;

        f( j, x );    /* OK -- long is converted to float,
                       * and double is converted to
                       * short.
                       */

Without prototyping, this example would produce erroneous results because f() would treat j as a float and x as a short, even though it is receiving a long and a double.

To declare a function that takes no arguments, use the void type specifier:

    extern int f( void );    /* This function takes no arguments. */

5.4.1 Function Definitions

The new prototyping feature also includes an alternative syntax for declaring arguments in a function definition. The old style, which is still supported, requires you to declare arguments after the function header. For example:

    int foo( x, y, z )
    int x;
    float y;
    char *z;
    {
    }

The new syntax allows you to declare the arguments within the function header:

    int foo( int x, float y, char *z )
    {

Note that the new syntax makes it easy to create prototype declarations from new-style function definitions. All you need to do is copy the function header, optionally precede it with extern, and end it with a semicolon. A prototype declaration of foo(), for example, would be:

    extern int foo( int x, float y, char *z );

Moreover, when you use the new syntax for declaring arguments in a function definition, the definition also serves as a prototype of the function for the remainder of the source file. That is, the compiler uses the type information specified in the definition to check the types of the arguments in all invocations of the function throughout the remainder of the source file. This type-checking does not occur if you use the old syntax. For instance:

    int foo( x, y, z )
    int x;
    float y;
    char *z;
    {
    }

    main()
    {
        char *a;
        float b;

        foo( a, b );    /* Will NOT produce a compile-time error */
    }

On the other hand:

    int foo( int x, float y, char *z )
    {
    }

    main()
    {
        char *a;
        float b;

        foo( a, b );    /* Will produce a compile-time error */
    }

5.4.2 Prototyping a Variable Number of Arguments

If a function accepts a variable number of arguments (printf() for example), you can use the ellipsis token "..." in the prototype declaration.
For example, the prototype for printf() is:

    int printf( char *format, ... );

This indicates that the first argument is a character string, and that there are an unspecified number of additional arguments. The ellipsis token may appear only as the last argument type in a prototype declaration. See the description of varargs in the Domain Programmer's Reference for SysV or BSD for more information about writing functions that accept a variable number of arguments.

5.4.3 Backwards Compatibility

The Domain C compiler continues to support the old syntax and semantics for function declarations and definitions. However, unless the -ntype switch is used (supported with /com/cc only), the compiler will issue an informational message whenever it encounters a function that is not prototyped. (Note, however, that the compiler reports informational messages only if you compile with -info 1 or a larger value.) The message informs you that the compiler is using the default prototype:

    func_name( ... );

The ellipsis notation represents an indeterminate number of arguments with indeterminate types.

When the compiler encounters a prototype allusion and an old-style definition for the same function, it expands the formal argument types using the old rules before checking the types against the prototype. The following example, for instance, produces a compile-time error because the expanded argument types in the definition are int and double, whereas the prototype specifies char and float.

    main()
    {
        extern void foo( char, float );

    void foo( x, y )
    char x;
    float y;

Note the distinction between this example and the following example, which uses the new-style definition.

    main()
    {
        extern void foo( char, float );

    void foo( char x, float y )

In this case, no argument expansions take place, so the prototype matches the definition.
5.4.4 Using Prototypes to Write More Efficient Functions

The following example shows how prototypes can be used to write more efficient functions by turning off the automatic conversion of floats to doubles. The sum_of_squares() function, shown below, is allowed to receive floats and perform float arithmetic, which can lead to significant savings in calculation time for large floating-point programs.

    #include <stdio.h>

    int main( void )
    {
        extern float sum_of_squares( float x, float y, float z );
        float x, y, z;

        printf( "Enter three floating-point numbers: " );
        scanf( "%f%f%f", &x, &y, &z );
        printf( "The sum of the squares of x, y, and z is: %f",
                sum_of_squares( x, y, z ) );
    }

    float sum_of_squares( float a, float b, float c )
    {
        return (a*a)+(b*b)+(c*c);
    }

Without prototyping, all three arguments would be converted to double before they were passed and then converted back to floats on the receiving side, making the function slower.

5.5 Returning a Value Back to the Caller

Here are C's rules for returning a value from the called function back to the caller:

1. Use the return statement to pass a value back to the caller.
2. A function can directly return at most one value to the calling function. However, the value can be a structure, union, or pointer, so it is possible to indirectly return more than one value.
3. If you specify the function's data type (in the function definition), then return passes back a value of this data type. However, if you specified the function's data type as void, then return passes back no value.
4. If you did not specify the function's data type, then return passes back an int value.

For more information about returning from a function, see the "return" section in Chapter 4.

5.5.1 Returning Values By Reference

Just as you can pass function arguments by reference, you can also return function values by reference.
To do this, you need to declare the return type as a reference variable, as shown below:

    int &foo( int x, float y )
    {
        static int j;

        return j;    /* returns the address of j */
    }

When you return from a function by reference, what gets returned is actually the address of the returned value. This can be a dangerous practice if the returned value is a constant, an expression, or an automatic variable. In each of these cases, there is no guarantee that the memory location will remain unchanged before the calling function accesses it. Constants are stored in read-only memory, which the compiler can use for other purposes as soon as the constant has been referenced. Automatic variables live on the stack and can be overwritten as soon as their defining blocks are exited. For expressions returned by reference, the compiler creates a temporary variable to store the expression value. There is no guarantee, therefore, that the address returned by a function that returns by reference will point to meaningful data at a later point in the program. For this reason, you should return only fixed duration variables by reference.

5.5.2 The #options Specifier -- Domain Extension

The #options specifier gives you some control over the use of registers within function calls. The syntax is:

    function_declaration #options( option [,option ... ] )

where function_declaration is an old-style declaration or a function prototype, and option is one of the following:

(option name not legible in the source)  Forces the function to place the return value in A0, in addition to D0. You must specify this option for Pascal routines that return pointers. See Chapter 6 for more information about cross-language communication.

abnormal  Warns the compiler that the function can produce an abnormal transfer of control. The compiler takes this warning into account when optimizing any routines that invoke this function.
The abnormal option, however, does not affect the function to which it applies (unless it calls itself recursively). The abnormal option is particularly useful for writing cleanup handlers.

noreturn  Indicates that the program terminates after invocation. The optimizer may remove any code following a call to a noreturn function since it is unreachable.

nosave  Indicates that the function will not save the contents of any registers. The nosave option should only be specified when declaring an assembly language program that does not follow the normal conventions for preserving registers. Routines written in C or other Domain high-level programming languages always preserve these registers. Note that assembly-language routines must preserve registers A5 and A6, which contain pointers to the current stack area and stack frame, respectively.

5.6 Recursive Functions

The C language supports recursive functions, which are functions that call themselves. The following example demonstrates a recursive method for calculating factorials:

    /* Program name is "recursive_example". */
    #include <stdio.h>

    int factorial( int n )
    {
        int result;

        if (n == 0)
            result = 1;
        else
            result = n * factorial( n - 1 );
        return result;
    }

    int main( void )
    {
        int a_positive_integer, answer;

        printf( "This program finds a factorial.\n" );
        /* 12! is the largest factorial that fits in a 32-bit int. */
        printf( "Enter an integer from 0 to 12 " );
        scanf( "%d", &a_positive_integer );
        answer = factorial( a_positive_integer );
        printf( "The factorial of %d is %d.\n",
                a_positive_integer, answer );
    }

5.7 Pointers to Functions

Pointers to functions are a powerful tool because they provide an elegant way to call different functions based on the input data. Before discussing pointers to functions, however, we need to describe more explicitly how the compiler interprets function declarations and invocations.

The syntax for declaring and invoking functions is very similar to the syntax for declaring and referencing arrays.
In the declaration,

    int ar[5];

the symbol ar is a pointer to the initial element of the array. When the symbol is followed by a subscript enclosed in brackets, the pointer is indexed and then dereferenced. An analogous process occurs with functions. In the declaration,

    extern int f();

the symbol f by itself is a pointer to a function. When a function is followed by a list of arguments enclosed in parentheses, the pointer is dereferenced (which is another way of saying the function is called). Note, however, that just as ar in

    int ar[5];

is a constant pointer, so too, f in

    extern int f();

is a constant pointer. Hence, it is illegal to assign a value to f. To declare a variable pointer to a function, you must precede the pointer name with an asterisk. For example,

    int (*pf)();    /* pf is a pointer to a function
                     * returning an int. */

declares a pointer variable that is capable of holding a pointer to a function that returns an int. The parentheses around *pf are necessary for correct grouping. Without them, the declaration

    int *pf()

would make pf a function returning a pointer to an int.

5.7.1 Assigning a Value to a Function Pointer

To obtain a pointer to a function, you merely enter a function name, without the argument list enclosed in parentheses. For example:

    {
        extern int f1();
        int (*pf)();

        pf = f1;    /* assign pointer to f1 to variable pf */

If you include the parentheses, then it is a function call. For example, if you write

    pf = f1();    /* ILLEGAL -- f1 returns an int,
                   * but pf is a pointer */

you will get a compiler error because you are attempting to assign the returned value of f1() (an int) to a pointer variable, which is illegal. If you write

    pf = &f1();    /* ILLEGAL -- cannot take the address
                    * of a function result. */

the compiler will attempt to assign the address of the returned value. This, too, is illegal.
Lastly, you could write:

    pf = &f1;    /* ILLEGAL -- &f1 is a pointer to
                  * a pointer, but pf is a pointer to
                  * a function. */

On older C compilers, this would also cause a compile error (or warning) because the compiler would interpret f1 as an address of a function, and the address-of (&) operator attempts to take the address of an address. C does not permit this. Even if it did, the result would be a pointer to a pointer to a function, which is incompatible with a simple pointer to a function. Domain C allows this syntax by ignoring the & operator, but the compiler does issue a warning message.

5.7.2 Return Type Agreement

The other important point to remember about assigning values to function pointers is that the return types must agree. If you declare a pointer to a function that returns an int, you must assign the address of a function that returns an int, not the address of a function that returns a char, a float, or some other type. If the types don't agree, you will receive a compile-time error. The following example shows some legal and illegal function pointer assignments.

    extern int if1(), if2(), (*pif)();
    extern float ff1(), (*pff)();
    extern char cf1(), (*pcf)();

    main()
    {
        pif = if1;    /* Legal -- types match */
        pif = cf1;    /* ILLEGAL -- type mismatch */
        pff = if2;    /* ILLEGAL -- type mismatch */
        pcf = cf1;    /* Legal -- types match */
        if1 = if2;    /* ILLEGAL -- Assignment to a constant */
    }

5.7.3 Calling a Function Using Pointers

To dereference a function pointer, thereby calling a function, you use the same syntax you use to declare the function pointer, except this time you include parentheses, and possibly arguments. For example:

    {
        extern int f1();
        int (*pf)();
        int answer;

        pf = f1;
        answer = (*pf)(a);    /* Calls the function f1() with
                               * argument a */

As with the declaration, the parentheses around *pf in the function call are essential to override default precedence rules.
Without them, pf would be a function returning a pointer to an int, rather than a pointer to a function. Note that the value of a dereferenced function pointer is whatever it was declared to be. In our case, we declared pf with the statement

    int (*pf)();

signifying that when it is dereferenced, it will evaluate to an int.

One peculiarity about dereferencing pointers to functions is that it does not matter how many asterisks you include. For example,

    (*pf)(a)

is the same as:

    (****pf)(a)

This odd behavior stems from two rules: first, that a function name by itself is converted to a pointer to the function; and second, that parentheses change the order of evaluation. The parentheses cause the expression ****pf to be evaluated before the argument list. Each time pf is dereferenced, it is converted back to a pointer because the argument list is still not present. Only after the compiler has exhausted all of the indirection operators does it move on to the argument list. The presence of the argument list makes the expression a function call.

It follows from this logic that you can dereference a pointer to a function without the indirection operator. That is,

    pf(a)

should be the same as:

    (*pf)(a)

This is, in fact, the case according to the ANSI standard. Older compilers, however, may not support this syntax. We recommend the second version because it is more portable, and reminds us that pf is a pointer variable.

5.7.4 Passing a Pointer to a Function as an Argument

You will sometimes want to pass a function pointer as an argument to another function. In this manner, you can call a function that can in turn call another function. We demonstrate this technique in the program that follows.

Consider the main function of this program. Notice that we assign the address of either function max() or function min() to variable ptr_to_a_function. Therefore, when we call function initial_checking(), we pass the address of one of these functions.
Function initial_checking() receives the address of either max() or min() in its formal argument pf. It then does some checking that applies regardless of whether max() or min() was passed. Finally, initial_checking() calls either max() or min() by dereferencing variable pf.

    /* Program name is "pointers_to_functions". This program
     * shows how to pass a function pointer as an argument to
     * another function.
     */
    #include <stdio.h>
    #include <stdlib.h>

    void initial_checking( int (*pf)(), int int1, int int2 )
    {
        int answer;

        if ((int1 <= 0) || (int2 <= 0))
        {
            printf( "You entered an illegal value.\n" );
            exit( 1 );
        }
        else
        {
            answer = (*pf)( int1, int2 );
            printf( "\nThe result is %d\n", answer );
        }
    }

    /* find the maximum of two integers */
    int max( int arg1, int arg2 )
    {
        if (arg1 > arg2)
            return arg1;
        else
            return arg2;
    }

    /* find the minimum of two integers */
    int min( int arg1, int arg2 )
    {
        if (arg1 < arg2)
            return arg1;
        else
            return arg2;
    }

    int main( void )
    {
        int (*ptr_to_a_function)(), value1, value2, reply;

        printf( "Enter two positive integers -- " );
        scanf( "%d%d", &value1, &value2 );
        printf( "\nEnter 0 to find the max of the two integers,\n" );
        printf( "Enter 1 to find the min of the two integers. -- " );
        scanf( "%d", &reply );
        if (reply)
            ptr_to_a_function = &min;
        else
            ptr_to_a_function = &max;
        initial_checking( ptr_to_a_function, value1, value2 );
    }

5.8 The main() Function

All C programs must contain a function called main(), which is always the first function executed in a C program. When main() returns, the program is done. The compiler treats the main() function like any other function, except that at run time, the host environment is responsible for providing two arguments. The first, called argc by convention, is an int that represents the number of arguments that are present on the command line when the program is invoked; the second, called argv by convention, is an array of pointers to the command line arguments.
The following program uses argc and argv[] to print out the list of arguments supplied to it when it is invoked:

    /* Program name is "echo". It prints the command line
     * arguments on stdout.
     */
    #include <stdio.h>

    int main( int argc, char *argv[] )
    {
        while (--argc > 0)
            printf( "%s ", *++argv );
        printf( "\n" );
    }

In UNIX systems, there is a program like this called echo. So, if you write at the command line,

    echo Alan Turing was a father of computing.

the system prints:

    Alan Turing was a father of computing.

Note that a pointer to the command itself is stored in argv[0]. This is why we use the prefix increment operator rather than the postfix operator to increment argv. Otherwise, the name of the command, echo, would be printed first.

When you invoke a program, each command line argument must be separated by one or more spaces. Note that the command line arguments are always passed to main() as character strings. If the arguments are intended to represent numeric data, you must explicitly convert them. Fortunately, there are several functions in the run-time library that convert a string into its numeric value. The function atoi(), for example, converts a string into an int, and atof() converts a string into a floating-point value (a double). The following program takes two arguments, and prints the first raised to the power of the second:

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int main( int argc, char *argv[] )
    {
        float x, y;

        if (argc < 3)
        {
            printf( "Usage: power \n" );
            printf( "Yields arg1 to arg2 power\n" );
            return 1;
        }
        x = atof( *++argv );
        y = atof( *++argv );
        printf( "%f\n", pow( x, y ) );
    }

The pow() function is part of the run-time library.

Chapter 6
C Program Development

This chapter describes how to produce an executable object file (that is, a finished program) from Domain C source code. There are three Domain/OS environments in which you can develop programs: Aegis, SysV, and BSD.
Where the development process differs depending on the environment, we describe each environment separately.

6.1 Program Development in a Domain/OS Environment

Briefly, you create an executable object file in the following steps:

1. Compile each file of source code that makes up the program. The compiler creates one object file for each file of source code.
2. Debug the program if it contains errors.
3. Link (bind) the object files if necessary. Linking is necessary if your program consists of more than one object file. The linker resolves external references; that is, it connects the different object files so that they can communicate with one another. Before linking, you may wish to package related object files into a library file with the UNIX archiver utility.

Figure 6-1 illustrates the general program development process. As described in later sections, the details differ somewhat depending on whether you are developing programs in an Aegis or UNIX environment.

Figure 6-1. Program Development in a Domain/OS System

This chapter details the compiler and provides brief overviews of the binder (linker), archiver, and debugger utilities. In addition to the traditional programming development scheme shown in Figure 6-1, you can also use the Domain Software Engineering Environment (DSEE) system to develop C programs. This chapter also contains a brief description of Domain/Dialogue, which is a product that simplifies the writing of user interfaces.

6.2 Compiling

There are two cc commands: one resides in /com/cc and the other resides in /bin/cc. Ultimately, both commands invoke the same compiler, which we refer to as Domain C. The syntaxes of the two cc commands, however, are somewhat different.
The /bin/cc command is the traditional UNIX command for compiling C source code. When you type cc in a UNIX shell, the system invokes /bin/cc. The /com/cc command is the traditional Aegis command for compiling C source code. The behavior of each of these commands is described in the following sections.

6.2.1 Compiling with /bin/cc

To invoke the Domain C compiler with the /bin/cc command, type cc in a UNIX shell, or /bin/cc in an Aegis shell. The /bin/cc command has the following format:

    $ /bin/cc [option1 ... optionN] pthnm1 [... pthnmN]

where pthnm is a pathname and option is a command line option for cc, cpp, or ld.

The /bin/cc command is actually a driver for other commands. If the command line contains files with .c suffixes (source files), /bin/cc begins by invoking the UNIX preprocessor (cpp). It passes along any options that are supported by the cpp utility. After cpp has finished processing the source files, /bin/cc sends them to the Domain C compiler. This is the same compiler that is invoked by /com/cc. However, /bin/cc implicitly passes along the -bss option so that the compiler will produce .bss sections rather than overlay sections (see the description of -bss in Section 6.3.5). The other principal difference between /bin/cc and /com/cc is that /bin/cc compiles without optimizations by default, whereas /com/cc compiles with optimization level 3 by default.

Finally, the /bin/cc command is capable of invoking the link editor to bind object modules. It will do so automatically unless you specify the -c option. (The -c option suppresses the linking stage and saves all object modules in files with a .o suffix.) The link editor links together all the object modules, including those just produced by /bin/cc, and creates an executable program named a.out. Note, however, that a.out is created only if all global symbols are resolved. After creating a.out, the /bin/cc command deletes all of the .o object files produced by the compilation.
By specifying options to the /bin/cc command, you can prevent deletion of these files. You can also have the executable binary written to a file other than a.out.

6.2.1.1 Some Compilation Examples

Let us consider a few compilation examples.

Example 1

Consider a C program consisting of only one file called complete_program.c. If we compile as follows from a UNIX shell

    $ cc complete_program.c

the compiler produces an executable object file named a.out. We can use the -o option to produce an executable with a different name. Alternatively, we could use the -c option to suppress the linking stage:

    $ cc -c complete_program.c

In this case, compilation would produce an object file named complete_program.o.

Example 2

Now consider a program broken into two files of source code, m.c and r.c. Assume that m.c contains the main() function. Further assume that somewhere in m.c, a call is made to a function stored in file r.c. Probably the easiest way to create an executable object file is to compile like this:

    $ cc m.c r.c

Assuming no errors, the preceding command creates three files: two object files (m.o and r.o) and an executable file named a.out.

Example 3

Suppose that we discovered a mistake in file m.c from Example 2. After changing the source code in m.c, we can recompile with the following command:

    $ cc m.c r.o

The preceding command creates a new m.o and a new a.out, but it does not affect r.o.

Example 4

Let us now use the same source files as in Example 2, except that this time, we will compile with the -c option as follows:

    $ cc -c m.c r.c

As before, the compiler compiles both m.c and r.c to produce m.o and r.o. This time, though, the -c option suppresses the linking of m.o and r.o.
Later, you can optionally link m.o and r.o with the cc command:

    $ cc m.o r.o

6.2.1.2 /bin/cc Compiler Errors

The compiler does not produce a .o file if there is an error in the source code or if compilation ends prematurely for some other reason (you type a CTRL/Q, for example). In addition, if you compile and one or more of the files contain errors, then the linking phase will be suppressed, and consequently, no a.out file will be produced. For instance, consider the following compilation:

    $ cc t1.c t2.c t3.c

If all three source files are error-free, then the compiler creates the following four files:

    t1.o  t2.o  t3.o  a.out

However, if t2.c had an error, then the compiler would have created the following two files only:

    t1.o  t3.o

Unlike many UNIX systems, the Domain /bin/cc command renames any existing .o files to .o.bak before compiling a .c file. For example, suppose that you compile file test.c to produce file test.o. Before beginning the compilation, the system will rename any existing test.o file to test.o.bak. If compilation succeeds, you will have two files: test.o and test.o.bak. If compilation fails, you will still retain the previous object module, but it will be renamed to test.o.bak.

If errors occur during compilation, the compiler writes diagnostic messages to stderr and flags the incorrect statements in the listing file. See Chapter 9 for a complete list of compiler error and warning messages.

6.2.1.3 Overview of /bin/cc Options

The /bin/cc command is really an interface to the preprocessor (cpp), the Domain C compiler, and the link editor (ld). Not all standard UNIX options are available. Furthermore, some unique options are provided by the Domain C compiler.

The compiler will interpret any command line argument beginning with a dash (-) as a compiler option. If the compiler doesn't recognize an option, it will assume that it is an option for the link editor (ld) and will pass it along.
The options it recognizes as preprocessor options are: -C, -D, -H, -I, and -U. The link editor options are: -a, -l, -m, -o, -r, -s, -t, -u, -x, -z, -L, -M, and -v. See the Domain/OS Programming Environment Reference manual for more information about ld.

The supported options are described in Table 6-1. Some of the Domain-specific options are described in more detail in Section 6.3. Note that unlike options to the /com/cc command, these options must be entered in the correct case. Also, you can enter multiple options with a single dash (-). For example:

    $ /bin/cc -almc test.c

Table 6-1. /bin/cc Command Options

-a  (default: -a)
    (ld option) Produces an object file for execution. This is the default. Use -r to retain relocation information in the object module. If you specify both -a and -r, the link editor will retain relocation information for all data except common symbols, which will be allocated.

-B name
    Assigns a prefix pathname to cc and ld for substitute compiler and linker passes. If name is not specified, the default is /usr/lib/o.

-c
    Suppresses the linking phase of the compilation and forces an object file to be produced, even if only one program is compiled.

-C
    (cpp option) Prevents the preprocessor from stripping comments.

-D name[=def]
    (cpp option) Defines name to the preprocessor, as if by #define. If no definition is given, defines name as 1. This is the same as the Aegis -def option described in Section 6.3.10. The -D option has lower precedence than -U; if both are used for the same name, the name will be undefined, regardless of the order in which the options appear.

-E
    Runs only the macro preprocessor on the named C programs, and sends the result to the standard output. This is similar to the Aegis -es option described in Section 6.3.11. Note, however, that -E passes the source file through the UNIX preprocessor (cpp), whereas -es processes the source file with the Domain preprocessor that is part of the Domain compiler.

-f
    Not supported.

-F
    Not supported.

-g
    Generates full run-time debugger information. See the Aegis -dbs option described in Section 6.3.9.

-H
    (cpp option) Prints to stderr the pathname of every file included during this compilation.

-I dir
    (cpp option) Changes the search path for #include files with names not beginning with a slash (/) and enclosed in double quotes rather than angle brackets. The preprocessor looks first in the directory of the source file in which the #include directive occurs, then in directories named in this option, and finally in directories on a standard list. Note that this option does not affect filenames enclosed in angle brackets. This option is similar to the Aegis -idir option (Section 6.3.14), though the search rules are somewhat different. See the description of #include in Chapter 4 for more information.

-lx
    Searches the library named libx.a. Libraries are searched in the order that they appear on the command line. The link editor searches for libraries in the directories specified by the environment variables LIBDIR and LLIBDIR (these generally resolve to /lib and /usr/lib). You can specify additional library directories with the -L option.

-Ldir
    (ld option) Changes the search path for libraries. By default, the compiler looks for libx.a libraries in the directories specified by LIBDIR and LLIBDIR. This option allows you to specify a different directory to search before these standard directories. This is useful if you have different versions of a library and you want to specify which one the link editor should use. Note that this option is only effective if it precedes a -l option.

-m
    (ld option) Produces a map or listing of the input/output sections on standard output. The map lists the name of each section, its starting address, and its size.

-M id  (default: -M any)
    Generates code for a particular class of processor. Legal values for id are:

        any     standard M68000 code
        160     DSP160 code
        460     DN460 code
        660     DN660 code
        90      DSP90 code
        330     DN330 code
        560     DN560 code
        570     DN570 code
        580     DN580 code
        3000    DN3000 and DN4000 code
        FPX     Floating-Point Accelerator Board
        PEB     Performance Enhancement Board

    This is the same as the -cpu option described in Section 6.3.8.

-o output  (default: -o a.out)
    Names the final output file output. By default, output is a.out. If you specify a different name, the system leaves any existing a.out file undisturbed. This is similar to the Aegis -b option described in Section 6.3.4.

-O
    Turns on compiler optimizations. This is the same as the Aegis -opt option described in Section 6.3.21. The default when you compile with /bin/cc is -opt 0.

-p
    Produces code that, when executed, creates a mon.out file that can be used by the prof utility to evaluate the program's performance. This is the same as the -prof option described in Section 6.3.23, and the same as the -qp option available in SysV environments.

-P
    Runs only the macro preprocessor on the named C programs, and leaves the result in corresponding files suffixed with .i. This is similar to the Aegis -esf option described in Section 6.3.11.

-pg
    (BSD only) Produces code that, when executed, creates a gmon.out file for use by the gprof utility. This is the same as the -qg option available in SysV environments.

-qg
    (SysV only) Produces code that, when executed, creates a gmon.out file for use by the gprof utility. This is the same as the -pg option available in BSD environments.

-qp
    (SysV only) Produces code that, when executed, creates a mon.out file that can be used by the prof utility to evaluate the program's performance. This is the same as the -prof option described in Section 6.3.23, and the same as the -p option available in BSD environments.

-r  (default: -a)
    (ld option) Retains relocation entries in the output object module. Relocation entries must be preserved if the object file will be specified in a future ld or bind command. -a is the default.

-s
    (ld option) Strips line number entries and symbol table information from the output object file. The option is equivalent to using the strip utility and is useful if you want to reduce the size of the object module. Note, however, that removing this information from a program makes it impossible to debug the program with a source-level debugger (dbx or dde).

-t
    Not supported. Use the -Y option.

-t[p0l]
    Finds only the preprocessor (p), compiler passes (0), or binder (l) in the files whose names are constructed by a -B option. In the absence of a -B option, the name is taken to be /usr/lib/n. The value -t "" is equivalent to -t0l. The -Y option performs the same function and is easier to use.

-T systype
    Defines the target system type (systype) for the compiled object. systype may be one of:

        any       version independent
        bsd4.1    Berkeley version 4.1BSD (obsolete)
        bsd4.2    Berkeley version 4.2BSD
        bsd4.3    Berkeley version 4.3BSD
        sys3      System III (obsolete)
        sys5      System V
        sys5.3    System V, Release 3

    This is the same as the Aegis -systype option described in Section 6.3.26.

-u symname
    Enters symname as an undefined symbol in the symbol table. This option is useful if you are using the cc command to load a library. The symbol table is initially empty and needs an unresolved reference to force ld to load the first routine.

-U name
    (cpp option) Removes any initial definition of name.

-v
    (ld option) Outputs a message giving information about the version of ld being used.

-w
    (BSD only) Suppresses warning diagnostics. This is the same as the Aegis -nwarn option described in Section 6.3.30.

-Wc,arg1[,arg2 ...]
    Hands off the arguments argi to pass c, where c is one of p, 0, or l, indicating the preprocessor, the compiler, or the binder. Using -W0 enables you to use /com/cc options that are not available with /bin/cc. For instance, to specify the -exp option, you could write:

        $ /bin/cc -W0,-exp foo.c

-x
    (ld option) Does not preserve local symbols in the output symbol table; enters external and static symbols only. This saves space in the object module, but still enables the link editor to resolve global references.

-Y
    Specifies a new pathname and directory for the locations of the tools and directories designated by the first argument. You can include only one letter or number per -Y option, but there is no limit to the number of -Y options per compilation. The valid letters and numbers, and their meanings, are as follows:

        p    Preprocessor
        0    Compiler
        l    Link editor
        S    Directory containing the start-up routine
        I    Default include directory searched by the preprocessor
        L    First default library directory searched by the link editor (ld)
        U    Second default library directory searched by the link editor

    If the location of a tool is being specified, the new pathname for the tool will be dir/tool. If more than one -Y option is applied to any one tool or directory, the last occurrence holds.

-z
    (ld option) Does not bind anything to address zero. This option enables the run-time system to detect null pointers.

6.2.2 Compiling with /com/cc

To compile a file of C source code using the /com/cc command, type cc from an Aegis shell or /com/cc from a UNIX shell.
The /com/cc command has the following format:

    $ cc source_file [option1 ... optionN]

For source_file, specify the pathname of the source file to be compiled. By convention, C source files usually end with a .c suffix, though the suffix is not required. Filenames may contain up to 256 characters, including the .c suffix. If the filename includes the .c suffix, you may omit the suffix on the command line. For example, to compile C source code stored in file test.c, you can enter either of the following commands:

    $ cc test
    $ cc test.c

Following the source filename, you can optionally enter one or more of the C compiler options listed in Table 6-2 and detailed in Section 6.3. Be sure to separate each option with at least one space.

If there are no errors in the source code and the compilation proceeds normally, the C compiler creates an object file and, optionally, a listing file. By default, the compiler gives the object file the .bin suffix and the listing file the .lst suffix. For example, in response to the command

    $ cc plot_data -l

the C compiler reads the file plot_data.c, and produces an object file named plot_data.bin and a listing file named plot_data.lst.

The /com/cc command preprocesses and compiles a single source file (plus any included header files), and produces a single object file. If your program contains more than one module, you must link the object files together with the /com/bind or /bin/ld command. These two commands perform similar operations: /com/bind invokes the Aegis binder; /bin/ld invokes the UNIX link editor. You can use either one to link object modules together. You also need to use the /com/bind or /bin/ld command if your program accesses routines in a user-supplied library. If your program consists of a single module that does not access user-supplied library routines, you do not need to explicitly invoke a linker.
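As an illustration of why a multi-module program needs a link step, the following minimal sketch shows an external reference between two modules. The file names r.c and m.c and the functions scale() and run() are hypothetical; both modules are shown in one listing for brevity, whereas in practice each would be a separate source file compiled on its own, and m.c would normally hold main().

```c
/* ---- r.c (hypothetical module): defines a function used elsewhere ---- */
int scale(int value)
{
    return value * 10;
}

/* ---- m.c (hypothetical module): knows scale() only by its declaration ---- */
extern int scale(int value);   /* external reference, resolved at link time */

int run(void)
{
    /* The compiler emits a reference to scale(); the linker (binder)
       connects it to the definition in the other object file. */
    return scale(4) + 2;
}
```

Compiled separately, the object file for m.c contains an unresolved reference to scale(); linking the two object files resolves it.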
For more information about the bind and ld commands, see the Domain/OS Programming Environment Reference manual.

6.2.3 /com/cc Compiler Errors

The compiler does not produce an object module if there is an error in the source code or if compilation ends prematurely for some other reason (you type a CTRL/Q, for example). Rather, it looks in the appropriate directory for a binary object module with the same name as the one it would have created, had it been successful. If such a file exists, the compiler changes its name by appending the additional suffix .bak (filename.bin.bak). For example, suppose your working directory contains the following files:

    abc.c      (the source file)
    abc.bin    (the object file)

Now suppose you recompile abc.c:

    $ cc abc

If the source file contains an error, the compiler does not create a new version of abc.bin. Instead, the compiler changes abc.bin's name to abc.bin.bak. If the compilation completes successfully, the compiler creates the new filename.bin file and deletes any previous filename.bin.bak file.

If errors occur during compilation, the compiler writes diagnostic messages to errout and flags the incorrect statements in the listing file. See Chapter 9 for a complete list of compiler error and warning messages.

6.2.3.1 Overview of /com/cc Compiler Options

Domain C supports the compiler options summarized in Table 6-2. You cannot abbreviate option names. The optional "n" prefix negates the effect of some options. For example, the -b compiler option causes the compiler to produce an object file; conversely, the -nb option prevents the compiler from producing an object file.

Table 6-2. C Compiler Options

-ac  (default: -ac)
    Produces absolute code. This is the default. Another option is -pic, which forces the compiler to produce position-independent code.

-alnchk, -nalnchk  (default: -alnchk)
    -alnchk displays messages about alignment of structures. -nalnchk suppresses alignment messages.

-b [pathname], -nb  (default: -b)
    Produces a binary output file. The optional pathname specifies a name for the output file. If you omit the pathname, the compiler appends the .bin suffix to the source file's name. -nb inhibits production of a binary output file.

-bss, -nbss  (default: -nbss)
    Puts uninitialized global variables in the .bss section of the object file. By default, all global variables are put in separate, named sections.

-comchk, -ncomchk  (default: -ncomchk)
    Checks to see if comment delimiters are balanced and generates a warning if they are not.

-cond, -ncond  (default: -ncond)
    Compiles lines beginning with the #debug preprocessor directive.

-cpu id  (default: -cpu any)
    Specifies the CPU type on which the program will run. The id argument can be any of the following: 90, 160, 330, 460, 560, 570, 580, 660, 3000, fpx, peb, and any. Using any causes the compiler to produce universal machine code that can run on any of the CPUs. (Note: This option replaces the -peb option supported in earlier releases.)

-db, -ndb  (default: -db)
    Generates minimal debugging information. When you debug a program compiled with this option, you can set breakpoints but you cannot examine variables.

-dbs  (default: -db)
    Generates full run-time debugging information and optimizes the generated object file (implies the -opt option).

-dba  (default: -db)
    Generates full run-time debugging information, but prevents optimization of the generated object file (implies -opt 0).

-def name[=value]
    Defines a name (works like the #define preprocessor directive). Each compilation command supports up to 128 -def options.

-es
    Causes the compiler to run only as a preprocessor. Writes the expanded source code to stdout.

-esf [pathname]
    Causes the compiler to run only as a preprocessor. Writes the expanded output to pathname or to stdout if pathname is omitted.

-exp, -nexp  (default: -nexp)
    Expands the code listing in the listing file to include the generated assembly-language code. This option implies the -l option.

-frnd
    Forces the compiler to write all floating-point operands to memory so that floating-point comparisons produce correct results.

-idir pathname
    Specifies a list of directories for the compiler to search to find #included filenames. Each compilation command supports up to 63 -idir options.

-indexl, -nindexl  (default: -nindexl)
    Produces a 32-bit index for all array references.

-info level  (default: -info 0)
    Controls the output of informational messages. There are four possible levels: 0, 1, 2, and 3. Each higher level causes the compiler to output additional informational messages to indicate potential errors in the source file. -info 0 suppresses informational messages.

-inlib pathname
    Specifies one or more libraries that are not currently installed but should be installed when the program is executed. These libraries are searched at compile time to determine whether indirect or absolute references should be generated.

-l [pathname], -nl  (default: -nl)
    Writes a listing file to filename.lst or to pathname.lst if pathname is specified. By default, this option is off, but it is automatically turned on by the -map and -exp options.

-map, -nmap  (default: -nmap)
    Inserts a symbol map in the listing file. This option implies the -l option.

-natural
    Makes natural alignment the default for this compilation.

-mgbl, -nmgbl
    Obsolete. As of SR10, this option is a no-op.

-msgs, -nmsgs  (default: -msgs)
    Controls output of the warning and error summary line. If -nmsgs is specified, the final message from the compiler is suppressed.

-opt, -nopt  (default: -opt)
    Causes the compiler to perform global program optimizations. The -nopt option suppresses optimizations.

-pic  (default: -ac)
    Produces position-independent object code. The default is to produce absolute code.

-prof
    Produces a .mon file that can be used by the prof utility to evaluate performance of the program.

-runtype systype  (default: sys5)
    Causes the compiler to use the run-time semantics of the specified systype regardless of the current environment setting. The possible systypes are: bsd4.2, bsd4.3, sys5, sys5.3, and any.

-std, -nstd  (default: -nstd)
    Causes the compiler to issue warning messages when nonstandard language elements are encountered.

-systype systype  (default: -systype sys5)
    Causes the compiler to stamp the object module for execution under a specific version of the UNIX system. The possible systypes are: bsd4.2, bsd4.3, sys5, sys5.3, and any.

-type, -ntype  (default: -type)
    Causes the compiler to recognize function prototypes and reference variables. Also defines __STDC__ to be 1.

-uline, -nuline  (default: -uline)
    Causes the compiler to recognize #line preprocessor directives. -nuline forces the compiler to ignore #line directives.

-version
    Causes the compiler to print its version number.

-warn, -nwarn  (default: -warn)
    Causes the compiler to display warning messages. The -nwarn option suppresses warning messages.

6.3 Domain Compiler Options

The following sections describe the /bin/cc and /com/cc options in more detail.

6.3.1 Absolute Code in User Space: -ac (/com/cc)

The -ac option is the default. It forces the compiler to produce absolute code, which generally executes faster than position-independent code. Unlike code produced with the -abs option, however, code produced with -ac uses indirect referencing for all global variables that are defined in global libraries. This includes global libraries currently installed as well as libraries specified with the -inlib option.

Refer to the Domain/OS Programming Environment Reference and Domain Assembler Reference manuals for more information about absolute and position-independent code.
6.3.2 Longword Alignment: -align and -nalign (/com/cc)

The -align and -nalign options are obsolete.

6.3.3 Displaying Messages about Alignment: -alnchk and -nalnchk (/com/cc)

When you use the -alnchk option, the compiler displays messages telling you whether your data is naturally aligned. Naturally aligned data increases efficiency at least slightly on any workstation, but the increase in efficiency is very significant on Series 10000 workstations. Use the -nalnchk option to suppress messages about alignment. The -alnchk option is the default.

6.3.4 Binary Output: -b/-nb (/com/cc) -o (/bin/cc)

The -b option (/com/cc) produces a binary object module file as output. This option takes the format:

    -b [pathname]

If you specify a pathname following -b, and your program compiles without errors, a binary file is created with the specified pathname and the suffix .bin. If you omit the pathname, the binary file is given the same name as the source file, except that .bin replaces .c as the suffix.

Specify -nb to suppress creation of an object module. This option is useful if you are compiling only to check for errors in your program. -b is the default.

The -o option (/bin/cc) allows you to direct the resulting object file to a file other than a.out, which is the default. This option takes the following format:

    -o pathname

Note that you must specify a pathname.

6.3.5 Global Variables in .bss Section: -bss/-nbss (/com/cc)

By default (-nbss), the /com/cc compiler creates a separate, named section for each global variable. The name of the section is the same as the name of the variable. In contrast, the /bin/cc compiler puts all initialized global variables in .data and all uninitialized global variables in .bss. The -bss option causes the /com/cc compiler to mimic the behavior of the /bin/cc compiler. This is useful if your program uses many global variables and you don't want a named section for each one.
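To make the distinction concrete, here is a minimal sketch with one initialized and one uninitialized global. The names table_size, error_count, and bump_errors() are hypothetical, and the section placement noted in the comments simply restates the behavior described above rather than output observed from the compiler.

```c
/* Initialized global: placed in .data by /bin/cc (or /com/cc with -bss).
   With the /com/cc default (-nbss), it gets its own section named table_size. */
int table_size = 64;

/* Uninitialized global: placed in .bss by /bin/cc (or /com/cc with -bss).
   With the /com/cc default (-nbss), it gets its own section named error_count.
   In either case, C guarantees that it starts out as zero. */
int error_count;

/* Increment the error counter and return its new value. */
int bump_errors(void)
{
    return ++error_count;
}
```

The program's behavior is the same under either option; only the layout of the object file changes.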
Even if you use the -bss option, you can still create named sections for global variables by using the #attribute[section] modifier. See Chapter 2 for more information about this modifier.

6.3.6 Comment Checking: -comchk/-ncomchk (/com/cc)

The -comchk option causes the compiler to check that comment pairs are balanced, that is, that there are no extra left comment delimiters (/*) before a right comment delimiter (*/). When -comchk is specified, the compiler returns a warning for every additional left comment delimiter. Using -comchk can help you identify a place in the program where some code was not compiled because the compiler assumed that it was part of a comment. The -ncomchk option inhibits this extra check. -ncomchk is the default.

For example, consider the following program fragment:

    /* This comment should be closed, but I forgot to do it!
    crash_flag = 10;   /* MUST occur or else disaster */

If we compile with -comchk, then the preprocessor will report the following warning:

    ******** Line 8: Warning: Unbalanced comment; another comment start found before end.

If we compile with -ncomchk (or simply without -comchk), then the preprocessor will not report the warning.

NOTE: Using -comchk only identifies a problem area in the source code; the option has absolutely no effect on the machine code generated.

6.3.7 Conditional Compilation: -cond/-ncond (/com/cc)

The -cond option invokes conditional compilation. When this option is on, lines marked with the #debug preprocessor directive are treated as source code lines and are compiled. If you compile with the -ncond option, the compiler treats the marked lines as comments. -ncond is the default.

6.3.8 Target Node Selection: -cpu cpu (/com/cc) -M cpu (/bin/cc)

Use the -cpu or -M option to select the target workstations that the compiled program can run on.
If you choose an appropriate target workstation, your program might run faster; however, if you choose an inappropriate target workstation, the run-time system will issue an error message telling you that the program cannot execute on this workstation.

The Domain C compiler can generate code in five possible modes:

• Code that will run on a DSP160, DN460, or DN660 workstation
• Code that will run on a workstation with the M68020 microprocessor and the M68881 floating-point coprocessor
• Code that will run on a workstation with a Performance Enhancement Board (PEB)
• Code that will run on any Apollo workstation
• Code that will run on a DN5xx-T with a floating-point accelerator (FPX) unit

You select the code generation mode through the argument that you specify immediately after -cpu or -M. Table 6-3 shows the possible arguments and the code generation mode that they select. Note that there are many possible arguments to -cpu and -M; however, many of them are synonyms. For example, -cpu 330 produces exactly the same code as -cpu 560.

The advantage of compiling with -cpu any is that the resulting program can run on any Apollo workstation. This is how Apollo compiles the programs that appear in your /com or /bin directories. -cpu any and -M any are the defaults.

Table 6-3. Arguments to the -cpu and -M Options

-cpu 160, -cpu 460, -cpu 660
    Generates code for the DSP160, DN460, and DN660 workstations.

-cpu 90, -cpu 330, -cpu 560, -cpu 570, -cpu 580, -cpu 3000
    Generates code for workstations with an M68020 processor and an M68881 floating-point unit (includes the DSP90, DN330, DN560, DN570, DN580, DN3000, and DN4000).

-cpu fpx
    Generates code for workstations with a floating-point accelerator (FPX) unit (includes DN5xx-T's).

-cpu peb
    Generates code for workstations with a PEB (includes the DN100, DN320, DN400, and the DN600, when equipped with an optional PEB).

-cpu any
    Generates code for any workstation.
The advantage of the processor-specific code generation modes is that the compiler generates code optimized for that particular processor, which makes the programs so compiled run faster. The advantage is seen mostly in programs that make heavy use of floating point. Programs that make heavy use of 32-bit integer multiply and divide might also show significant improvement.

NOTE: There is one caveat concerning programs compiled with the -cpu fpx option. The address of an instruction for a floating-point fault is not stored in the Instruction Address register (IADDR) as it is for programs compiled with the -cpu 330 and -cpu 3000 options. Consequently, fault handlers should not rely on this address when code is compiled with -cpu fpx. This warning applies only to assembly language fault handlers.

6.3.9 Debugger Output: -db/-ndb/-dbs/-dba (/com/cc) -g (/bin/cc)

The -db, -dba, -dbs, and -g options generate output for later use by dde and dbx, the language-level debuggers. These debuggers allow you to search for program errors using the program's variables, parameters, statement labels, and other program-defined symbols.

The output generated by the four compiler options allows the debuggers a particular level of access to the program. The -ndb option specifies no debugger access. Table 6-4 summarizes the access granted to the debuggers by each option. For an overview of the debuggers, see Section 6.8.

Table 6-4. DEBUG Compilation Options

    Compiler Option    Debugger Access
    -ndb               None.
    -db                Source line numbers (except lines optimized out) and functions.
    -dbs (or -g)       Same access as -db with the addition of local and global variables.
    -dba               Same access as -dbs but without any code optimization.

If you use the -db option, the compiler puts minimal debugger preparation information into the .bin file. This preparation is enough to enter the debugger and set breakpoints, but not enough to access symbols, such as variables and constants.
If you use the -dbs option, the compiler puts full debugger preparation information into the .bin file. This preparation allows you to set breakpoints and access symbols. When you use the -dbs option, the compiler sets the optimization level to 3. (You can override this by specifying a different optimization level with the -opt option.) The -g option to /bin/cc is the same as -dbs.

The -dba option is identical to the -dbs option except that when you use the -dba option, the compiler sets the -nopt option (even if you specify -opt).

For more complete details on these four options, see the Domain Distributed Debugging Environment Reference manual.

6.3.10 Name Definition: -def name[=value] (/com/cc) -Dname[=value] (/bin/cc)

The -def option lets you define a name and, optionally, its value at compilation time. It takes the format:

    -def name[=value]

This option has the same effect as the #define preprocessor directive. You may use as many as 128 -def options in a compile command line. If you do not use the optional =value component, the default value of the name is 1.

For example, consider the following simple program stored in file test.c:

    #include <stdio.h>

    int x = 0;

    int main( void )
    {
    #if env1
        x = 500;
    #else
    #if env2
        x = 1000;
    #endif
    #endif
        printf("x = %d\n", x);
    }

Tables 6-5 and 6-6 show the varying effects of three different compilation command lines:

Table 6-5. The Effect of -def

    Compilation Command      Result
    $ cc test                x = 0
    $ cc test -def env1      x = 500
    $ cc test -def env2      x = 1000

Table 6-6. The Effect of -D

    Compilation Command      Result
    $ cc test.c              x = 0
    $ cc -Denv1 test.c       x = 500
    $ cc -Denv2 test.c       x = 1000

If there are spaces in the value, be sure to surround the entire definition with quotes.
For example,

    $ /com/cc -def "rev_string=@"Revision 1.23 1-Jan-85@""

or

    $ /bin/cc '-Drev_string="Revision 1.23 1-Jan-85"'

is the same as

    #define rev_string "Revision 1.23 1-Jan-85"

Note that in an Aegis shell, any embedded quotation marks must be preceded with @.

The -D option behaves just like the -def option. Note, however, that unless you enclose the entire option in single quotes, you cannot put a space between the defined name and the equal sign or between the equal sign and the value.

6.3.11 Preprocessor Options: -es|-esf (/com/cc) -E|-P (/bin/cc)

Compilation actually consists of two phases: preprocessing and processing. During preprocessing, the preprocessor obeys all the preprocessor directives (such as #define, #if, #include) in your source code. It is not until processing that the compiler actually generates executable code.

By default, when you issue the cc command, you invoke both the preprocessor and the processor. However, by using the -es or -esf option (the -E or -P option with /bin/cc), you invoke only the preprocessor. The output from the preprocessor (known as the expanded source) can be studied and run through the processor if desired.

The two /com/cc options produce exactly the same expanded source; the only difference is where the expanded source is written. The options take the following format:

    -es
    -esf [pathname]

The -es option directs the expanded source to standard output. The -esf option takes an optional pathname as an argument. If you omit a pathname, the expanded source file gets the same name as the source file, but the .c suffix is replaced by the suffix .i. If you specify a pathname, the compiler uses that name and automatically appends the .i suffix, unless it is already present.
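The following is a minimal sketch of what the preprocessing phase does to a name that may or may not be defined on the command line. The names env1, LIMIT, and selected_limit are hypothetical and chosen only for illustration; the comments describe standard preprocessor behavior rather than the exact text that Domain C writes to the expanded source.

```c
/*
 * Illustrative only: env1, LIMIT, and selected_limit are hypothetical names.
 * In a #if expression, an identifier that has not been defined evaluates
 * to 0, so the #else branch is taken unless env1 is defined on the
 * command line (for example with -def env1 or -Denv1, which give it
 * the value 1).
 */
#if env1
#define LIMIT 500
#else
#define LIMIT 0
#endif

/*
 * After preprocessing, the directives above are gone and LIMIT has been
 * replaced by a literal constant: the expanded source contains
 * "return 0;" when env1 is not defined and "return 500;" when it is.
 */
int selected_limit(void)
{
    return LIMIT;
}
```

Comparing the expanded source produced with and without the definition is often the quickest way to confirm which branch of a conditional directive was actually compiled.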
NOTE: You cannot use -es or -esf in a command line that also contains -l, -b, or -exp.

The /bin/cc -E option behaves exactly like the -es option; the -P option functions exactly like the -esf option without a pathname argument.

6.3.12 Expanded Code Listing: -exp|-nexp (/com/cc) -S (/bin/cc)

If you use the -exp or -S option, the compiler produces an expanded listing file that contains a representation of the generated assembly-language code, interleaved with the source code. The listing also shows all macro expansions.

Note that using -exp causes the compiler to produce a listing file even if you did not use the -l option. However, if -nl appears on the command line after -exp, then the expanded code listing will be suppressed. -nexp is the default.

6.3.13 Floating-Point Accuracy: -frnd (/com/cc only)

The -frnd option forces the compiler to write all floating-point operands to memory and then fetch the memory contents before evaluating the expression. This ensures that each operand will have the same amount of precision so that floating-point comparisons will produce correct results.

If you do not compile with -frnd, floating-point operands may be kept in registers, which support more accuracy than memory. Consequently, when a register operand is compared with a memory operand, the result may not be what is expected. This is particularly true of equality comparisons. Consider the following C program:

    #include <stdio.h>

    double fetch( void )
    {
        return 1.1;
    }

    int main( void )
    {
        double x;

        x = fetch();
        if (x - 0.1 == 1.0)
            printf("Pass\n");
        else
            printf("Fail\n");
    }

If you compile with -cpu 3000, and without -frnd, this program fails because the values 0.1 and 1.1 cannot be represented exactly in base 2 floating-point. Thus, the quantity (x - 0.1) can only be approximated. This value is calculated in an 80-bit register, and then a compare is generated to see if this value is exactly equal to 1.0, which is stored in memory.
Since the register has more accuracy than memory, the comparison fails. If you compile with -frnd, the 80-bit register is stored (and rounded) in a single-precision 32-bit temporary memory location. Now when it is compared with 1.0, which is also stored in memory, the comparison passes.

6.3.14 Include Directories: -idir (/com/cc)

The -idir option tells the compiler to look for include files in the directories specified by the pathname. (Include files are detailed in the "#include" listing of Chapter 4.) This option allows you to postpone naming the directories for include files until compilation.

Suppose, for example, that different versions of an include file have the same name but reside in different directories. You might enter the filename in an #include command in your code, and then select the appropriate directory with -idir. You may use as many as 63 -idir options in each compilation command.

When you enclose the include filename in quotes in the #include control line, the compiler first searches the working directory, then the directories specified by -idir (if any), and finally, the directory /usr/include. When you enclose the include filename in angle brackets (< >), the compiler searches the -idir directories first, and then /usr/include.

Suppose that your source file contains these #include statements:

    #include "local_include_file"
    #include <global_include_file>

You then compile the program with the following options while in the same directory as the local include file:

    cc test -idir /personal -idir \impersonal

The C compiler resolves the include files by searching for filenames in the following order:

For local_include_file:

    1. local_include_file (in the working directory)
    2. /personal/local_include_file
    3. \impersonal/local_include_file
    4. /usr/include/local_include_file

For global_include_file:

    1. /personal/global_include_file
    2. \impersonal/global_include_file
    3. /usr/include/global_include_file

Note that the -idir directories are searched in the order in which they appear in the compilation command.
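The search order described above can be modeled with a short sketch. This is not how the compiler is implemented; include_candidates, MAX_CANDIDATES, and MAX_PATH_LEN are hypothetical names, and writing the working directory as "./" is an assumption made only so that the candidate list is easy to read.

```c
#include <stdio.h>

#define MAX_CANDIDATES 65   /* working directory + up to 63 -idir directories + /usr/include */
#define MAX_PATH_LEN   256

/*
 * Build, in search order, the pathnames that would be tried for one
 * include file.  A nonzero 'quoted' models  #include "name" ; zero models
 * #include <name> .  'out' must have room for MAX_CANDIDATES entries.
 * Returns the number of candidates written.
 */
int include_candidates(const char *name, int quoted,
                       const char *const idirs[], int n_idirs,
                       char out[][MAX_PATH_LEN])
{
    int count = 0;
    int i;

    if (quoted) {
        /* Quoted form only: the working directory is searched first. */
        snprintf(out[count++], MAX_PATH_LEN, "./%s", name);
    }

    /* The -idir directories, in the order given on the command line. */
    for (i = 0; i < n_idirs && count < MAX_CANDIDATES - 1; i++) {
        snprintf(out[count++], MAX_PATH_LEN, "%s/%s", idirs[i], name);
    }

    /* The standard include directory is always searched last. */
    snprintf(out[count++], MAX_PATH_LEN, "/usr/include/%s", name);

    return count;
}
```

Called with two -idir directories, the sketch yields four candidates for the quoted form and three for the angle-bracket form, matching the two lists in the example above.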
6.3.15 Array Reference Index: -indexl|-nindexl (/com/cc)

The -indexl option disables some optimizations and forces the compiler to use 32-bit indexing in subscript calculations for all array references. -nindexl, the default, causes the compiler to use the source code's array dimension information to determine whether to use 16-bit or 32-bit indexing.

6.3.16 Informational Messages: -info | -ninfo (/com/cc)

The Domain C compiler produces three types of messages:

    informational    Identifies aspects of the source file that will compile correctly, but could be rewritten to be more efficient or more portable.

    warning          Identifies aspects of the source file that may be correct, but are suspect. The compiler makes a "best guess" as to what the source means and produces an object file.

    error            Identifies syntactical or semantic errors that prevent the compiler from producing an object file.

By default, the compiler outputs warning and error messages but not informational messages. (You can suppress warning messages with the -nwarn option.)

The -info option causes the compiler to output informational messages. Informational messages are divided into four levels, where each higher level adds further messages. The level values are as follows:

    0    No messages (this is the default).
    1    Messages about old-style function definitions and allusions, and messages indicating that members of a structure are not naturally aligned.
    2    Messages indicating that the program could be written more efficiently (for example, a variable is declared but never used).
    3    Reserved for future use.
    4    Reserved for future use.

Note that each level includes the messages in all lower levels. For instance, if you specify -info 3, you will receive level 1, 2, and 3 messages, but not level 4 messages.
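The level 1 message about structure members that are "not naturally aligned" can be made concrete with a small check. The sketch below assumes the usual meaning of natural alignment for a scalar member, namely that its byte offset within the structure is a multiple of its own size; the manual does not spell out the exact rule the compiler applies. The structure record and both function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical record layout, used only for illustration. */
struct record {
    char  tag;
    long  count;
    short flags;
};

/*
 * Assumed definition: a scalar member is naturally aligned when its
 * offset from the start of the structure is a multiple of its size.
 */
int is_naturally_aligned(size_t offset, size_t size)
{
    return size != 0 && offset % size == 0;
}

/* Apply the check to every member of the example structure. */
int record_is_naturally_aligned(void)
{
    return is_naturally_aligned(offsetof(struct record, tag),   sizeof(char))
        && is_naturally_aligned(offsetof(struct record, count), sizeof(long))
        && is_naturally_aligned(offsetof(struct record, flags), sizeof(short));
}
```

Whether a given structure passes depends on the padding the compiler inserts, so the result of record_is_naturally_aligned can differ from one compiler or target to another.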
6.3.17 Installed Libraries: -inlib (/com/cc)

The -inlib option allows you to specify additional libraries that are not currently installed, but will be installed when you execute the program. The compiler needs this information to determine whether to use indirect or long absolute addressing modes.

If you are producing absolute code (the default), you must use this option to specify any library that is not currently installed, but should be installed when the program is executed. If you use the -pic option to produce position-independent code, you do not need to specify libraries that are not yet installed. See the Domain/OS Programming Environment Reference and the Domain Assembler Reference manuals for more information about absolute and position-independent code.

The -inlib option has a format similar to the -idir option. Instead of specifying the pathnames of directories, however, you specify the pathnames of files that you want to inlib. The following command line, for example, tells the compiler that the object files ~libs/my_lib and //node/libs/master_lib will be installed when examp.bin is loaded:

    $ cc examp -ac -inlib ~libs/my_lib //node/libs/master_lib

If you specify a library with the -inlib option, the compiler writes a library record into the object file so that the loader automatically inlibs the library when it loads the object file. If the library is not available at load time, an error occurs.

6.3.18 Listing File: -l|-nl (/com/cc)

The -l option causes the compiler to produce a listing file. The listing file contains the following information:

• The source code complete with line numbers. Note that line numbers start at 1 and increment by 1 (even if there is no code at a particular line in the source code). In addition, note that the listing file separately numbers lines from include files.
• Compilation statistics.
• An object module section summary.
• A list of the compiler options affecting code generation.
• Errors and warnings generated during the compilation.
• A count of the error and warning messages produced by compilation.

The format for the -l option is

    -l [pathname]

If you specify a pathname following -l, the listing file is created with the specified pathname and the suffix .lst. If you omit pathname, the listing file is given the same name as the source file, except that .lst replaces .c as the suffix.

The -nl option is the default, but note that -map and -exp contain an implicit -l.

6.3.19 Symbol Map: -map|-nmap (/com/cc)

If you use the -map option, Domain C creates a map file. A map file contains everything in the listing file (produced by using -l) plus a special symbolic map. The special symbolic map consists of two sections. The first section describes all the routines in the compiled file; for example, here is a sample first page:

    001 TEST_C   module(Psect = procedure$,Dsect = data$)
    002 q        function(Proc = 000000,Ecb = 000040,Stack Size = 16,
                          Psect = my-proc,Dsect = data$)
    003 main     function(Proc = 000000,Ecb = 00002C,Stack Size = 4,
                          Psect = procedure$,Dsect = data$)
    004          program(Proc = 00004C,Ecb = 000018,Stack Size = 0,
                          Psect = procedure$,Dsect = data$)

Let us consider this information on a line-by-line basis. The first line

    001 TEST_C   module(Psect = procedure$,Dsect = data$)

tells us the name of the module (TEST_C) and the names of the head-of-file procedure and data sections.

The second and third lines

    002 q        function(Proc = 000000,Ecb = 000040,Stack Size = 16,
                          Psect = my-proc,Dsect = data$)
    003 main     function(Proc = 000000,Ecb = 00002C,Stack Size = 4,
                          Psect = procedure$,Dsect = data$)

tell us the names (q and main) of the two user-supplied functions in the source code. The map supplies five pieces of information for each function. The first piece of information is the starting address of the function measured in bytes from the beginning of the procedure section.
The second piece of information is the offset in bytes of the ECB (Entry Control Block). The third piece of information is the stack size measured in bytes. The fourth and fifth pieces are the names of the procedure and data sections that the function is stored in.

The fourth line

    004          program(Proc = 00004C,Ecb = 000018,Stack Size = 0,
                          Psect = procedure$,Dsect = data$)

tells us the same information as the second and third lines, but for a special startup function provided by the compiler.

The second section of the special symbolic map contains an alphabetic listing of all the variables used in the source code. For example, here is a sample second page:

    002 arg1      var(+000014/S): long int
    002 c         var(-000006/S): char
    001 g         var(+000000/g): float
    003 j4        var(-000008/S): long int
    002 m         var(-000004/S): long int
    001 student   var(+000000/student): array[0..9] of char
    001 x         var(extern): long int
    001 y         var(+000000/y): long int

The preceding data tells us that the program referred to eight variables. Let us consider the second variable

    002 c         var(-000006/S): char

in greater detail:

    002           The number to the far left tells us where within the program the variable was declared. Top-level declarations get the number 001. A number higher than 001 indicates a variable declared in a function. For example, 002 means that this variable was declared in the first function of the file, 003 identifies a variable declared in the second function of the file, and so on.

    c             The name of the variable.

    (-000006/S)   This number and identifier tell us where the variable is actually stored at run time. If the identifier is "S", it means that the variable is stored on the stack. Otherwise, the identifier tells you the name of the section in which the variable is stored. The numerical part of the data is the offset (in bytes) from the beginning of the stack or the section.

    char          This tells us the data type of the stored variable.
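To connect the map entries to declarations, here is a hypothetical source fragment that is consistent with several entries in the sample second page. It is a reconstruction for illustration only, not the source that produced the sample map: the function body is invented, the placement of q in its own procedure section is not shown, and the variables of the second function (numbered 003 in the sample) are omitted.

```c
/*
 * Hypothetical fragment; only the kinds of declarations matter here.
 * The numbers in the comments refer to the leftmost column of the map.
 */
extern long x;        /* 001: a declaration, not a definition, so no offset or section */
long  y;              /* 001: top-level definition                                     */
float g;              /* 001: top-level definition                                     */
char  student[10];    /* 001: listed as array[0..9] of char                            */

/* First function in the file: its parameter and locals are numbered 002. */
long q(long arg1)
{
    char c = 'a';         /* 002: local variable, stored on the stack (S) */
    long m = arg1 + 1;    /* 002: local variable, stored on the stack (S) */

    return m + c;
}
```

In the sample map a parameter such as arg1 appears with a positive stack offset and locals such as c and m with negative offsets.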
Note that variable x does not have an offset or section name since it is a declaration, but not a definition.

The -nmap option suppresses creation of the special symbol map. -nmap is the default.

6.3.20 Error and Warning Summary: -msgs|-nmsgs (/com/cc)

The -msgs and -nmsgs options control the output of a summary message from the compiler. If -nmsgs is given, the final message from the compiler (shown below) is suppressed:

    XX errors, YY warnings, C Compiler, Rev n.nn

The default is -msgs.

6.3.21 Optimization Levels: -opt [n] (/com/cc) -O [n] (/bin/cc)

For /com/cc, the -opt 3 option is the default. For /bin/cc, the default is no optimization.

The -opt option allows you to specify the kinds of optimization performed on your source program, by means of an "optimization level." The syntax for the -opt option is:

    -opt [n]

where n is an integer between 0 and 4 that represents the optimization level. At -opt 0, very few optimizations are performed. For each higher optimization level, more optimizations are performed. If you specify -opt and omit the optimization level, the level defaults to -opt 3. If you omit the -opt option completely, the default option, -opt 3, is assumed. The obsolete option -nopt is equivalent to -opt 0.

Each higher level of optimization includes all optimizations performed at the lower levels of optimization. Because the compiler does more work at each higher level of optimization, it may take longer to compile your program at higher optimization levels.

It is important to note that the -dba option overrides anything you specify for the -opt option. If you want your code to be optimized, and want to use the debugger on your program, you should use the -dbs option rather than -dba. When you specify -dba, the -opt option is set to -opt 0, regardless of what you specified for -opt on the command line for the compilation. In addition, -dba represents an even lower level of optimization than -opt 0.
NOTE: If you wish to use the Debugger (described in Section 6.8) to debug a program compiled at -opt 3 or -opt 4, you may find that you get inaccurate values for some local variables at points in the source code where those variables are not actively in use. This happens because the value of the variable is assigned to a machine register, rather than being kept in the computer's main memory. The optimizer may decide that the main memory location for this variable does not need to be updated, because all uses of the variable in the source program can legally use the value of the variable that is retained in the machine register. In addition, the optimizer may merge some source statements together, or eliminate source statements entirely, when legal to do so.

When you are debugging with these optimizations, you may see what appear to be strange jumps in the control flow of the program. In addition, you may be unable to set a breakpoint at a particular source line because the generated code for that source line has been optimized away or merged with the code from another source line. It can be slightly more difficult to use the debugger with optimized code, but there is no reason to avoid using dde or dbx with the optimization levels discussed here. See the Domain Distributed Debugging Environment Reference manual for more details concerning the use of the debugger.

The following is a brief description of the optimizations performed at each optimization level. For a detailed discussion of compiler optimization techniques, consult a general compiler textbook.

-dba    Represents the lowest possible optimization level. It forces the -opt option to be -opt 0, and additionally suppresses some optimizations that are normally performed at -opt 0. In particular, the -dba option forces the compiler to store machine registers in main memory after every statement. Even with the -dba option, however, some optimizations are still performed.
For example, the compiler may:

• Rearrange expressions to minimize the number of registers needed to compute the expression.
• Generate faster short range branch instructions in place of long branches where possible.
• Compute constant expressions that appear in the source code, such as 2 * 3, rather than generating code to compute them.
• Compute multiple occurrences of the same expression only once.

Another example of simple constant folding performed at this level is the following:

    unsigned char small_range;

    if (small_range < 0)

In this example, small_range can never be less than zero because of its type. The compiler will therefore substitute the value FALSE for the expression "small_range < 0". The expression if FALSE means that the statements following if cannot be executed, so the compiler will not generate code for them.

-opt 0    Performs the optimizations described above. If -dba is not also set, the compiler will permit values to remain in registers across statements where it is legal to do so. Additionally, a sequence of generated code that is identical to another sequence may have all its instructions replaced by a branch to the other identical sequence of instructions.

-opt 1    Performs the following optimizations:

• Eliminates limited global "common subexpressions."
• Eliminates "dead code."
• Transforms integer multiplication by a constant into shift and add instructions rather than using direct multiply, where appropriate.
• Performs simple transformations for speed.
• Merges assignment statements where possible.

A common subexpression is an expression that appears two or more times in the program, with no intervening assignments to any component of the expression. In such cases, the expression need only be computed once, and the other occurrences of the expression can be replaced with the resulting value.
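As a sketch of the common subexpression case just described, the two functions below compute the same result. The first is the code as a programmer might write it; the second shows, at source level, the effect the optimizer is free to achieve. The function names are hypothetical, and the actual transformation is applied to the generated code rather than to the source.

```c
/* As written: (a + b) appears twice, with no assignment to a or b in between. */
int total_as_written(int a, int b, int c, int d)
{
    int x = (a + b) * c;
    int y = (a + b) * d;

    return x + y;
}

/* Equivalent form after common subexpression elimination. */
int total_after_cse(int a, int b, int c, int d)
{
    int sum = a + b;      /* the common subexpression, computed only once */
    int x = sum * c;
    int y = sum * d;

    return x + y;
}
```

If a or b were assigned between the two uses, the second occurrence of a + b would no longer be a common subexpression and would have to be recomputed.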
Dead code is code that cannot be executed because there is no execution path of the program that leads to the code.

-opt 2    Performs the following optimizations:

• Substitutes constants for "reaching" definitions.
• Folds global constants.

Assigning to a variable, or using the variable as a parameter in a function call, produces a definition of the variable. A particular definition of a variable is said to "reach" later uses of the variable if there are no other definitions between the original definition and the use of the variable. If the definition is an assignment of a constant to the variable, uses of the variable that are "reached" by the definition can be replaced with the constant value.

As constants are substituted for variable uses, the expressions in which the variable uses occurred are sometimes transformed into constant expressions that can be evaluated during compilation. This eliminates the need to generate code to compute the value of the expression. For example, in the statements

    a = 3;
    c = 2 * a;

there are no other definitions of the variable a between the original assignment and the use of a in the expression 2*a. So the compiler can substitute the value 3 in the expression 2*a. The expression then becomes 2*3, which is computed during compilation. As a result, the program does not perform a multiply when it executes. Instead, it merely assigns the already computed value 6 to c.

-opt 3    This is the default optimization level. At this level, the compiler performs the following optimizations:

• Live analysis on local variables.
• Redundant assignment statement elimination.
• Global register allocation.
• Instruction reordering.
• Removal of invariant expressions from loops.
• Exhaustive searches through each routine for global common subexpressions to eliminate. The -opt 1 and -opt 2 levels make only limited searches through the code for global common subexpressions.
Performing live analysis of local variables involves determining the areas of a routine where a variable is actively used. For example,

    j = k;
    if (i == 0) {
        i = 2;
        j = 3 * j;
        printf("%d\n", j);
    }
    else {
        k = i * 4;
        printf("%d", k);
    }

In the else clause of the example, there are no uses of j. If there are no further uses of the variable j on any execution path from the else clause to the end of the program, j is not actively used in the else clause or on execution paths from the else to other parts of the routine. j is therefore considered "dead" from the statement following the else to the end of the routine.

Within the if clause, there is a use of j. Therefore, j is actively used within the if clause, and is considered "live" within the if clause. If there are other uses of j that can be reached from the if clause, j is considered "live" along the paths that lead to those uses.

Live analysis is important because it allows the compiler to allocate local variables to machine registers for exactly as long as the variable's value is needed. When the variable becomes "dead," the register can be used for other variables or expression values. In general, the CPU can reference a value in a register faster than a value in the computer's main memory. Efficient use of registers increases the execution speed of your program.

Redundant assignment elimination performed at this optimization level may result in warning messages such as the following:

    ******** Line 14: [Warning 279] Value assigned to SMALL_RANGE is never used; assignment is eliminated by optimizer.

Consider the following example:

    main( )
    {
        int i, j;

        fscanf(stdin, "%d%d", &i, &j);
        if (i == 0)
            j = 3;
        printf("%d", i);
    }

There are no uses of the variable j after the assignment j=3. Since the value assigned to j is not used, the compiler can eliminate the assignment completely without changing the result computed in the program.
In fact, once the assignment is eliminated, the if portion of the statement isn't needed either, and can be eliminated. If we change the example so that j is used after the assignment,

    main( )
    {
        int i, j;

        fscanf(stdin, "%d%d", &i, &j);
        if (i == 0)
            j = 3;
        printf("%d\t%d", i, j);
    }

the assignment is no longer eliminated.

Global register allocation allows variables that are local to a routine to have their values placed in machine registers for faster access. In many cases, all definitions and uses of a local variable may occur in a register, and the copy of the variable in the computer's main memory is never used or updated. Keeping variables in registers makes your program execute faster.

The global register allocator treats the register variable declaration as advice, not as a directive. Variables declared as register receive special consideration for allocation to registers. However, if a variable is declared as register, but is not used, it will not be allocated to a register.

Instruction reordering changes the order in which some instructions are executed to take advantage of overlaps that are possible in some instruction sequences. Some integer instructions can execute at the same time as some floating-point instructions, as long as the integer instructions do not depend upon the result computed by the floating-point instruction.

A loop invariant expression is an expression whose value does not change during the execution of a loop. When invariant expressions are computed outside a loop, they are only computed once, instead of needlessly being computed on each pass through the loop. This makes the loop execute faster, and generally increases program execution speed. For example:

    for (i = 1; i <= 10; i++) {
        j = k * m;
        j = i + j;
    }

The expression k * m is invariant in the above example.
The compiler can safely transform this loop as follows:

    temp = k * m;
    for (i = 1; i <= 10; i++) {
        j = temp;
        j = i + j;
    }

After the invariant expression is removed from the loop, the example does only one multiply instead of 10 to make the assignment to j.

-opt 4    This is identical to -opt 3 in the present compiler. Future releases may use this level to perform additional optimizations.

6.3.22 Position-Independent Code: -pic (/com/cc)

By default, Domain compilers produce absolute or fixed position code. Absolute code programs are loaded at a fixed address (determined prior to load time). By default, absolute code programs are loaded at the low end of user virtual memory (hexadecimal address 8000). If the loader cannot load your program at the pre-determined address (because there is already a resident program), it reports an error.

The -pic option enables you to produce position-independent code (pic). Pic code can be loaded and run anywhere in virtual address space without relocating (modifying at load time) the procedure text. The procedure text is mapped at load time, which is a much faster operation than copying and relocating.

In general, absolute code runs faster than position-independent code, so you will not use the -pic option often. However, there are a few instances where you must use the -pic option. In particular, you should produce pic code for all routines that are to be entered into an installed library. In addition, you should produce pic code for the following:

• Programs that invoke other absolute code programs in-process with the pgm_$invoke system call in pgm_$wait mode.
• Programs that are dynamically loaded, such as IOS type managers, GPIO drivers, and shared libraries.

Refer to the Domain/OS Programming Environment Reference and Domain Assembler Reference manuals for more information about absolute and position-independent code.
6.3.23 Profiling: -prof (/com/cc) -p (/bin/cc)

The -prof and -p options force the compiler to produce code that, when executed, produces a .mon file that can later be used by the prof utility to identify bottlenecks in the program. For example, if you compile a program called test.c with the command

    $ /com/cc test -prof

then running the resulting program produces a file called test.mon in the working directory. To get performance statistics, execute the command:

    $ prof test.bin

This command displays the number of calls to each function and the amount of time spent in each function. If you don't specify the program name on the prof command line, prof assumes a.out. For example:

    $ /bin/cc -p test.c
    $ prof

For more information about the prof utility, see the SysV Command Reference manual.

Note that you can also use the dpak utility to obtain more detailed statistics about program performance. For details about dpak, refer to Analyzing Program Performance with DPAK.

6.3.24 Nonportable References: -std|-nstd (/com/cc)

The -std option causes the compiler to issue warning messages for nonportable language elements (that is, extensions to the K&R standard). If portability is an issue, pay attention to the warnings; otherwise, ignore them. The -nstd option suppresses reporting of nonstandard elements. -nstd is the default.

6.3.25 Run-Time UNIX Version Selection: -runtype systype (/com/cc)

When you execute a C program, the run-time environment uses the semantics of the systype stamped on the object module. By default, this is sys5, but you can change it with the #systype preprocessor directive or the -systype compile option.

Use the -runtype option to override the systype that is stamped on the object module. That is, you can use -runtype when you compile with one systype setting but want to execute the program with a different systype setting. Suppose, for example, that you want to use the C shell in a SysV environment.
Because the C shell is a BSD program, you need to compile it in a BSD environment. When you actually run the program, however, you want all filenames to resolve to the SysV tree. To accomplish this, you need to compile with -systype BSD4.3 and -runtype SysV.3.

Note that the -runtype setting only affects the run-time semantics for library calls; it does not affect the resolution of #include pathnames. See the discussion of -systype for more information.

6.3.26 UNIX Version Selection: -systype systype (/com/cc) -T systype (/bin/cc)

Because C programs are often written to run in UNIX environments, and because not all UNIX environments are the same, Domain C supports the #systype preprocessor directive and the -systype compilation option, which allow you to define the version of the UNIX system for which your program is targeted.

The Domain C library contains two sets of routines. One is compatible with Bell Labs versions of the UNIX system (System V, Releases 2 and 3) and the other set is compatible with Berkeley's versions of the UNIX system (4.2BSD and 4.3BSD). All of the routines in both sets work properly in any Domain environment. However, you may encounter problems if you attempt to mix functions from the two sets that interact with each other. In general, it is best to choose one set and stick with it whenever possible.

The two sets of functions overlap to a large extent. It is sometimes the case, however, that while function x exists in both sets, the semantics of the function (and in some cases its arguments) may be subtly different. As an illustration, consider the function setpgrp(). In the Bell Labs version, the function definition is:

    int setpgrp ( )

It is defined to "set the process group ID of the calling process to the process ID of the calling process and return the new process group ID." In the Berkeley versions of UNIX systems, there is an identically named function with similar semantics but a different calling sequence.
The Berkeley function,

    setpgrp( pid, pgrp )
    int pid, pgrp;

"sets the process group of the specified process pid to the specified pgrp. Zero is returned if successful; -1 is returned and errno is set on failure."

To avoid unexpected behavior, always know which set of functions you are accessing. The system chooses one set of functions over another based on a version selector called the systype. The systype can affect both the compilation and the execution of a program. At compilation time, it determines which include files the compiler uses. At run time, it determines which set of functions is called and makes sure that the proper calling conventions are employed. However, it is possible to compile with one systype and execute the program with a different systype by using the -runtype option.

To affect the execution of a program, the compiler stamps the object code with the systype that was in effect when the module was compiled. This is the systype specified by the -systype option, the #systype directive, or the -runtype option. Note that the -runtype option overrides all other systype specifications for determining how the object module is stamped. When the program is executed, the loader checks this stamp and uses the semantics and calling sequences of the designated systype when invoking library functions.

There are several ways to define the systype, one of which is to place a #systype directive in the source file. You may define the systype only once per source file. Any subsequent definitions produce an error. Moreover, the #systype directive must be the first non-comment token in the source file.

You also can define the target operating system with the -systype compile option.
The format of -systype is as follows:

    -systype systype

where systype can be any of the following:

• bsd4.1   Berkeley 4.1BSD (obsolete)
• bsd4.2   Berkeley 4.2BSD
• bsd4.3   Berkeley 4.3BSD
• sys3     Bell System III (obsolete)
• sys5     System V Release 2
• sys5.3   System V Release 3
• any      program is independent of a particular UNIX system

If you specify one systype on the command line and a different one in the file, the compiler reports an error. If you do not explicitly specify a systype, the compiler inherits the systype from an environment variable called COMPILESYSTYPE. By default, this variable is set to sys5. If, for some reason, the COMPILESYSTYPE variable does not exist, the systype is inherited from another environment variable called SYSTYPE. This variable is always set. These environment variables are described in more detail in the Using Your BSD Environment and Using Your SysV Environment manuals.

6.3.27 Function Prototypes: -type/-ntype (/com/cc)

By default (-type), Domain C expects function prototypes for all functions. If the compiler encounters an old-style function declaration or a function invocation prior to a prototype, it will issue an informational message (assuming -info is set to level 1 or above). If you are compiling older source files that do not contain prototypes, you should use the -ntype option, which suppresses these messages.

The -type option also turns on the reference variable feature. If you compile a file with -ntype, the compiler will issue errors when it encounters declarations of reference variables.

Finally, -type sets the predefined macro __STDC__ to 1. If -ntype is specified, this macro expands to zero.

6.3.28 Line Numbers: -uline/-nuline (/com/cc)

Use -uline and -nuline to enable or disable any #line preprocessor directives in your program. The #line and #line_number preprocessor directives establish nondefault line numbers. If you specify -uline, the compiler honors these preprocessor directives.
However, if you specify -nuline, the compiler ignores these preprocessor directives and therefore numbers statements according to its normal scheme. For details on #line, see the "#line" listing of Chapter 4. -uline is the default.

6.3.29 Version Number: -version (/com/cc)

The -version option causes the compiler to print the current version number of the compiler. Use this number when reporting APRs (Apollo Product Reports) to Customer Service. If you specify -version, you should not specify any other options, nor should you specify a source file. For example:

    $ cc -version
    cc (C compiler), revision 4.89

6.3.30 Warning Messages: -warn/-nwarn (/com/cc) -w (/bin/cc)

If you specify -warn, the compiler issues any warning messages generated by compilation. If you specify -nwarn, the compiler suppresses these warning messages. Note that -warn and -nwarn do not affect the warning summary; the warning summary is controlled by the -msgs and -nmsgs options described earlier in this section. The default is -warn.

6.4 Linking in a Domain Environment

There are two commands that enable you to link object modules to form an executable image. The /bin/ld utility is the standard UNIX link editor with some Domain enhancements. The bind command is the traditional Aegis binder. You can use either command regardless of whether the modules were compiled with /bin/cc or /com/cc.

6.4.1 The /bin/ld Utility

Use the UNIX link editor, /bin/ld, to combine several object modules into one executable program. You can invoke the link editor with the ld command or with the /bin/cc command. In fact, the link editor is automatically invoked by a /bin/cc command if the command line contains .o files or a source file containing the main() function. The input object modules can come from the following sources:

• Libraries created by ar (the UNIX archiver)
• Object modules created by the Domain C, Domain Pascal, or Domain FORTRAN compilers, or the Domain assembler.
• Object modules previously created by ld.
• Object modules created by bind (the Aegis binder).

One of the primary purposes of ld is to resolve external references. If there are any unresolved external references, ld will report them. The UNIX utility nm can also be used to perform a check of resolved and unresolved global symbols.

When the link editor is called by /bin/cc, a startup routine named /lib/crt0.o is linked with the program. This routine invokes main(). Assuming main() returns normally, crt0.o finishes by invoking exit(2).

Note that ld's output can either be executed (assuming that there is a start address) or used as input for a further ld run.

For syntax details on ld and its options, see the BSD Command Reference manual and the SysV Command Reference manual.

6.4.2 The bind Command

The format for the bind command is as follows:

    $ bind pthnm1 [...pthnmN] [option1 [...optionN]]

A pthnm must be the pathname of an object file (created by a compiler) or a library file (created by the librarian). Your bind command line must contain at least one pthnm. The available options are described in the Domain/OS Programming Environment Reference manual.

For example, suppose you write a program consisting of the source files named test_main.c, mod1.c, and mod2.c. To compile the source files using /com/cc, you issue the following three commands:

    $ cc test_main
    $ cc mod1
    $ cc mod2

The /com/cc command creates three object files named test_main.bin, mod1.bin, and mod2.bin. To create an executable file named complete_program with the /com/bind utility, enter the following command:

    $ bind test_main.bin mod1.bin mod2.bin -b complete_program

6.5 Archiving in a Domain Environment

Use the UNIX archiver, ar, to create and update library files. Once created, a library file can be used as input to the link editor, /bin/ld. As with most linkers, /bin/ld will optionally bind only those modules in a library file that resolve an outstanding external reference.
For syntax details on ar and its options, see the BSD Command Reference manual and the SysV Command Reference manual.

6.6 System Libraries

There are a number of libraries that come automatically with your operating system. One of these libraries, known as the standard C library, is available regardless of whether you run in an Aegis or UNIX environment. The standard library enables you to perform buffered I/O, memory management, double-precision math, and other functions. Though it is known as the "standard" library, there is no real standard for it. The ANSI C Subcommittee has proposed a standard for the C library, which is expected to be approved by the full ANSI Committee in 1988. In the meantime, the de facto standard is the UNIX library, which also agrees with the subset of library functions described by K&R. Domain/OS systems support the UNIX version of the standard library.

In addition to the standard library, Domain/OS also supports lower-level libraries that enable you to perform systems-type operations, such as creating and deleting directories, changing protection codes, and creating new processes. For more information about these library routines, see the BSD Programmer's Reference manual, the SysV Programmer's Reference manual, and the Aegis Programmer's Reference manual.

In addition, there are several libraries for performing graphics operations, I/O through streams, and for manipulating windows. For a complete list of manuals that describe these libraries, see the Technical Publications Master Index.

6.6.1 The Standard C Library

Although the standard C library exists in a single object file (/lib/clib), it is really a conglomeration of many special-purpose libraries. Each sublibrary contains routines that cover a particular area of functionality, such as I/O or memory management. Each sublibrary has an associated header file.
The header file contains the declarations for any related functions, macros, or data types needed to execute a set of library functions. Table 6-7 lists the standard header files. All header files for the standard library reside in /usr/include and can be included in your source file by surrounding the filename with angle brackets. For example,

    #include "/usr/include/stdio.h"

and

    #include <stdio.h>

are equivalent in a Domain/OS environment. The second method is preferred because it is more portable.

In some cases, the header files are not required, but we recommend that you include them anyway. Because they contain prototypes, they enable the compiler to perform type checking of arguments, and they also inhibit unnecessary argument type conversions.

Both the loader and the linkers (ld and bind) automatically search through clib for unresolved references. It is not necessary, therefore, to explicitly link routines from the standard library.

Table 6-7. Header Files

    Header File   Functions
    assert.h      Diagnostic functions.
    ctype.h       Character testing and mapping functions.
    curses.h      The curses screen control utility.
    errno.h       The errno global variable.
    malloc.h      Memory management functions.
    math.h        Double-precision math functions.
    setjmp.h      The setjmp() and longjmp() functions, which enable you to
                  bypass the normal function call and return discipline.
    signal.h      Functions that handle signals.
    stdio.h       Buffered I/O functions.
    string.h      String manipulation functions.
    strings.h     (BSD only) String manipulation functions.
    time.h        Time functions.
    varargs.h     Variable argument list macros.

6.6.2 Built-in Routines

Domain C supports built-in code (also called in-line code) for many of the routines declared in string.h, strings.h, and math.h. To obtain the built-in versions of these functions, you must include the header file.
The functions for which built-in versions exist are as follows:

    atan()     atan2()    cos()      exp()      fabs()
    log()      sin()      sqrt()     strcat()   strncat()
    strcpy()   strcmp()   strlen()   strncpy()  tan()

Normally, when you invoke a library function, the compiler produces code to pass control to the specified function at run time. This requires some overhead since local variables must be preserved and arguments must be passed. When you include the header file, the compiler simply inserts the function's object code wherever it is invoked. While this results in somewhat longer object files, it can produce much faster executable code, particularly when double-precision math functions are used heavily.

NOTE: The built-in functions do not support the error checking and recovery that normally accompany library routines. This is particularly important for the math.h functions, which check for overflow and assign meaningful values to errno. If your programs rely on this error handling, do not use the built-in routines.

6.7 Executing Programs in a Domain/OS Environment

The following sections describe how to execute a program in a UNIX or Aegis environment.

6.7.1 Executing in a UNIX Environment

To execute a program, simply enter its full pathname (including any suffixes). For example, to execute an object file named a.out, just enter

    $ a.out

By default, standard input and standard output for the program are directed to the keyboard and display, respectively. You can redirect standard input and output by using the shell's redirection notation (described in Using Your SysV Environment and Using Your BSD Environment). For example, to redirect standard output when you invoke a.out, type

    $ a.out > results

This command uses the character ">" to redirect standard output for a.out to the file results.

6.7.2 Executing in an Aegis Environment

To execute a program, simply enter its full pathname (including any suffixes).
For example, to execute an object file named complete_program, just enter

    $ complete_program

The operating system searches for a file named complete_program according to its usual search rules, then calls the loader utility. The loader utility is user transparent. It binds unresolved external symbols in your executable object file with global symbols in the language and system libraries. Then, it executes the program.

By default, standard input and standard output for the program are directed to the keyboard and display, respectively. You can redirect standard input and output by using the shell's redirection notation (described in Using Your Aegis Environment). For example, to redirect standard output when you invoke complete_program, type

    $ complete_program > results

This command uses the character ">" to redirect standard output for complete_program to the file named results.

NOTE: If the executable object has a suffix (such as .bin), you must not forget to type this suffix.

6.8 Debugging Programs in a Domain Environment

The Domain systems support two source-level debuggers: dde and dbx. The following sections describe these debuggers briefly. For more information about dde, refer to the Domain Distributed Debugging Environment Reference manual. For information about dbx, refer to the Domain/OS Programming Environment Reference manual.

6.8.1 The dde Utility

The Domain Distributed Debugger (dde) utility is a powerful screen-oriented debugger. To prepare a file for debugging with dde, you do not have to do anything special at bind time, but you do have to compile with the -db, -dba, or -dbs compiler options. -db provides minimal debugger preparation; -dba and -dbs provide full debugger preparation.

Use the following syntax to invoke dde:

    $ dde [-dde_options] target_program_name [target_program_options]

where target_program_name is the pathname of the program you want to debug.
For example, issue the following command to debug the executable object stored in file complete_program:

    $ dde complete_program

For complete details on dde and its commands, refer to the Domain Distributed Debugging Environment Reference manual. Note that dde works somewhat differently for C programs than for Pascal programs.

6.8.2 The dbx Utility

dbx is the traditional Berkeley UNIX source language debugger. Although it is usually available only on BSD systems, the Domain/OS version is available regardless of what environment you are running. Note also that, like dde, dbx can be used on programs compiled with /com/cc as well as on programs compiled with /bin/cc.

The command syntax for invoking dbx is:

    % dbx [options] [object_file [coredump]]

where object_file is the name of the program you want to debug. If you omit the object_file name, dbx attempts to debug the file a.out. If you specify a coredump filename, or if a file named core exists in the working directory, you can use dbx to examine the state of a program that has aborted prematurely.

For complete details about the dbx utility, refer to the Domain/OS Programming Environment Reference manual.

6.9 Program Development Tools

Domain/OS supports several programming tools that aid in program development, debugging, and source management. Some of these tools are listed below. Most of these utilities are described in detail in the Domain/OS Programming Environment Reference manual, although the DSEE facility has its own documentation set. Refer to the appropriate manual for more information about these tools.

cb
    Formats a C source file according to user-supplied rules so that it is consistent and readable.

lint
    Examines C source files and attempts to detect obscure bugs and nonportable usages. The lint utility is described in detail in Appendices C and D of this manual.
make
    Creates a program from input object modules according to a list of dependencies that the programmer supplies in a makefile. The make utility is described in the Domain/OS Programming Environment Reference manual.

sccs
    sccs stands for Source Code Control System, which is a collection of programs that help you maintain a record of versions of a program. The sccs utility is described in the Domain/OS Programming Environment Reference manual.

DPAK Package
    DPAK is a collection of three programs (DSPST, DPAT, and HPC) that allows you to analyze the performance of a program. It is particularly useful for isolating bottlenecks. The DPAK package is described in Analyzing Program Performance with DPAK.

DSEE Facility
    The DSEE (Domain Software Engineering Environment) package is a support environment for software development. DSEE helps engineers develop, manage, and maintain software projects; it is especially useful for large-scale projects involving a number of modules and developers.

Domain/Dialogue Package
    The Domain/Dialogue package is a tool for designing the interface to an application program and specifying how the interface should be presented to users of the application. The primary advantage of the Domain/Dialogue package is that it lets you create interfaces separately from the application code.

6.9.1 tb (Traceback)

If you execute a program and the system reports an error, you can use the tb utility to find out what routine triggered the error. You invoke tb by entering the command

    $ tb

immediately after a faulty execution of the program. (To execute tb in a Bourne shell, you must set the in process environment variable before executing the program.) For example, suppose you execute object file complete_program, encounter an error, and then invoke tb. The whole sequence might look like the following:

    $ complete_program
    Enter a value -- 2
    ?(sh) "./test.bin" - access violation (OS/fault handler)
    In routine "_doscan" line 320.
    $ tb
    access violation (from OS / fault handler)
    In routine "_doscan" line 320
    Called from "scanf" line 53
    Called from "my_rout" line 12
    Called from "main" line 6

tb first reports the error, which in this case is

    access violation (from OS / fault handler)

Then, tb shows the chain of calls leading from the routine in which the error occurred all the way back to the main program block. For example, the error was picked up at line 320 of routine _doscan, which was called by routine scanf, which was called by routine my_rout, which was called by routine main.

Given this information, it is probable, though not certain, that there is a problem at line 12 of routine my_rout. We make this presumption because my_rout is the deepest user-defined routine shown in the traceback. The Aegis Command Reference manual details the tb utility.

NOTE: If you compile a file with the -ndb option (/com/cc), then the functions stored in this file will not be included in the traceback.

Chapter 7 Cross-Language Communication

This chapter describes how to call Pascal and FORTRAN routines from a C program and how to share data between a C program and a FORTRAN or Pascal program. Because many Domain system routines are written in Pascal, the information in this chapter also applies to invoking system routines from C. Briefly, this chapter covers the following topics:

• Using function prototypes to declare parameters
• Understanding data type agreement of Domain C, Pascal, and FORTRAN
• Using reference variables to declare Pascal IN parameters
• Calling Pascal routines from a C program
• Calling FORTRAN routines from a C program
• Sharing data between routines written in different languages
• Using global names
• Calling system service routines

For detailed information about system calls, see the Domain/OS Calls Reference manual.
7.1 Suppressing Automatic Type Promotions of Arguments

When you call a C function without a prototype for that function being in scope, the compiler automatically converts the data types of the arguments according to the rules shown in Table 7-1. For communication among C functions, these conversions are usually invisible because the arguments are converted back to the type declared in the formal argument declaration. When calling routines written in other languages, however, it is important to suppress these conversions.

The simplest way to suppress these conversions is to declare the external routine with a function prototype. For instance, consider the following program:

    main()
    {
        short j = 3;
        float x = 3.141;

        ex_func( j, x );
    }

Because there is no prototype for ex_func(), the C compiler implicitly casts j to an int and x to a double before passing them to ex_func(). There is no problem if ex_func() is a C routine that expects two arguments of type short and float because the necessary conversions will occur on the receiving side. However, if ex_func() is a Pascal routine that expects arguments of type INTEGER16 and REAL passed by value, the function call will fail. This problem can be avoided by prototyping ex_func():

    int main( void )
    {
        extern void ex_func( short, float );
        short j = 3;
        float x = 3.141;

        ex_func( j, x );
    }

The prototype causes the C compiler to suppress the automatic argument type promotions. Note that the prototype should be used even if the function is a C routine because it turns on type checking, which can identify bugs that would otherwise go unnoticed. For more information about function prototypes, see Chapter 5.

NOTE: Prior to SR10, the Domain C compiler did not support function prototyping. Instead, Domain C supported the reserved word std_$call, which turned off automatic type promotions of arguments.
Domain C continues to support std_$call but it is viewed as an obsolete and inferior means for cross-language communication. We strongly urge you not to use std_$call for new programs and to convert your older programs that use std_$call to the new prototyping syntax. std_$call is described in Appendix E.

Table 7-1. C Function Argument Conversions without Prototypes

    Data Type of Argument   Data Type Actually Passed
    char                    int
    short                   int
    unsigned char           unsigned int
    unsigned short          unsigned int
    float                   double

7.2 Data Type Agreement in C, Pascal and FORTRAN

Table 7-2 shows equivalences among the three languages' data types. To call a Pascal or FORTRAN routine, make sure that the types declared in the C prototypes are compatible with the types in the definition.

7.2.1 Non-C Data Types

As Table 7-2 shows, the C language has no equivalent types for Pascal's BOOLEAN and SET types or for FORTRAN's LOGICAL and COMPLEX types. Section 7.5.4 shows how to simulate the BOOLEAN type in C, and Sections 7.6.6 and 7.6.7 show how to simulate FORTRAN's LOGICAL and COMPLEX data types.

It is also possible to simulate the SET type in C, but a description of this technique is beyond the scope of this manual. However, the Programming with Domain/OS Calls manual describes how to simulate sets in C. (For an interesting discussion of implementing SET functions in C, see C: A Reference Manual, by Samuel P. Harbison and Guy L. Steele Jr.)

7.2.2 Non-FORTRAN Data Types

There are a few C types that have no FORTRAN equivalents. Most of these, however, can be simulated in FORTRAN. Programming with Domain/OS Calls describes how to simulate C's structure, union, and enumerated data types. Section 7.6.5 describes how to pass pointers from Domain C to Domain FORTRAN.

There is no easy way to simulate C's unsigned types in FORTRAN. Therefore, if you pass an unsigned value to a FORTRAN routine, it will be interpreted as a signed value.
This will only make a difference when the high-order bit is set.

Table 7-2. Domain C, Pascal, and FORTRAN Data Types

    C                 Pascal               FORTRAN
    char, char enum   CHAR                 CHARACTER*1
    short             INTEGER, INTEGER16   INTEGER*2
    int, long         INTEGER32            INTEGER, INTEGER*4
    float             REAL, SINGLE         REAL, REAL*4
    double            DOUBLE               DOUBLE PRECISION, REAL*8
    short enum        enumerated types     INTEGER*2
    long enum, enum   INTEGER32            INTEGER*4
    struct            record               none
    union             variant record       none
    pointer (*)       pointer              none
    unsigned char     none                 none
    unsigned short    0..65535             none
    unsigned long     0..4294967295        none
    none              BOOLEAN              none
    none              SET                  none
    none              none                 LOGICAL
    none              none                 LOGICAL*2
    none              none                 LOGICAL*1
    none              none                 COMPLEX
    none              none                 COMPLEX*16

7.3 Data Type Agreement of Return Value

Just as the parameters must agree in type, so must the function return value. For example, if a Pascal function returns an INTEGER16 value, you must declare it in your C program as a function that returns a short. That is, if the Pascal declaration is

    FUNCTION func1 : INTEGER16;

then the C declaration should be:

    extern short func1( void );

All C declarations of Pascal procedures and FORTRAN subroutines should use the void type since these routines do not return a value. For instance, the Pascal procedure defined by

    PROCEDURE proc1;

should be declared as:

    extern void proc1( void );

7.3.1 Functions Returning Pointers

When Pascal returns the value of a function, it places it in one of two registers: a data register (D0) if the value being returned is not a pointer, or an address register (A0) if the value is a pointer. C normally expects values to be returned in a data register (D0). Therefore, when you prototype a Pascal function that returns a pointer, you need to tell the C compiler to fetch the returned value from the address register rather than the data register. You do this by appending #options[a0_return] to the prototype. For instance,
if pass_point() is a Pascal function that returns a pointer to an int, the prototype would be:

    extern int *pass_point() #options[a0_return];

FORTRAN has no syntax for declaring a function that returns a pointer. All FORTRAN functions return their values in a data register, as do C programs, so no special syntax is required.

7.4 Argument Passing Conventions

In addition to ensuring that arguments agree in type, you also need to compensate for different passing conventions. Domain FORTRAN passes all arguments by reference, and Domain Pascal passes most arguments by reference. This means that they pass the address of the argument rather than the value of the argument. In contrast, C passes all arguments (except arrays and functions) by value.

Although Domain C passes arguments by value, it provides two mechanisms to simulate passing by reference. The first is to explicitly pass the address of the argument. For example, if pas_func() is a Pascal procedure that expects an integer16 argument, you could invoke it from C with the following statements:

    int main( void )
    {
        extern void pas_func( short * );
        short x;

        pas_func( &x );
    }

Note that in the prototype of pas_func(), we declare the argument as a pointer.

There are two drawbacks with this method. First, it does not provide an easy means for passing constants or expressions since it is illegal to take the address of these.
For example, if you want to pass the constant 5 to pas_func(), you need to store the value in a variable first:

    int main( void )
    {
        extern void pas_func( short * );
        short x;

        /* pas_func( &5 );       ILLEGAL */
        x = 5;
        pas_func( &x );       /* Legal */
    }

Likewise, if you want to pass the product of two numbers, you must again store the product in a variable before passing it:

    int main( void )
    {
        extern void pas_func( short * );
        short x, y;

        /* pas_func( &(x*y) );   ILLEGAL */
        x *= y;
        pas_func( &x );       /* Legal */
    }

The other problem with passing addresses explicitly is that the prototype gives no indication of whether the argument is an IN or OUT parameter. That is, the declaration of pas_func() does not reveal whether pas_func() will modify the value of the argument or not. You cannot, therefore, assume that the value of x will be the same after the call as it was before the call.

Both of these limitations can be avoided by using reference variables, a Domain extension borrowed from C++. Reference variables are described in Sections 3.15 and 5.3.2. Declaring a parameter as a reference variable in a prototype causes the compiler to pass the argument by reference when the function is invoked. For example:

    int main( void )
    {
        extern void pas_func( short & );
        short x;

        pas_func( x );
    }

Note that reference variables make it legal to pass constants and expressions by reference:

    int main( void )
    {
        extern void pas_func( short & );

        pas_func( 5 );        /* Legal */
    }

Although this will work, the receiving routine, pas_func(), may not modify the constant value passed. If it attempts to modify this value, a run-time access error will occur.

Because there are two ways to pass arguments by reference (explicitly passing addresses or declaring arguments as reference variables), you can set up conventions to use one method in certain situations and the other method in different situations.
Domain/OS system calls, for example, use the two methods to distinguish between IN variables and all other types of parameters. In the insert files, all IN variables are declared as reference variables and all other parameters are declared as pointers. A single function might include a combination of pointers and reference variables.

7.5 Pascal Examples

Pascal can pass by reference or by value depending on how a parameter is declared. In Domain Pascal, there are five ways to declare a formal parameter: IN, OUT, IN OUT, VAR, or without a keyword. In the first four cases, the parameters are passed by reference. The Pascal keywords control what operations are legal within the Pascal routine. Consult the Domain Pascal Language Reference for information about these declaration specifiers.

When you declare a variable in Pascal without one of the declaration specifiers, it directs the compiler to use call-by-value semantics. This means that the Pascal routine will use a local copy of the parameter so that the formal and actual parameters are different objects. The actual parameter in the calling routine will remain unchanged despite any modifications that the called routine makes to the formal parameter. Domain Pascal uses two methods to achieve call-by-value semantics:

1. For nested routines (routines that are visible only within a single source file), Domain Pascal passes arguments by value just like C.

2. If the routine is globally visible, Domain Pascal assumes that it may be called by routines written in other languages, such as FORTRAN, that only support pass by reference. Therefore, the Pascal routine expects an address of the actual argument and then generates a local copy on the receiving side.

From a Pascal programmer's perspective, the two methods are equivalent since both achieve the call-by-value semantics (that is, the routine operates on a local copy of the argument).
From a C programmer's perspective, however, it is important to know which method is being used. If the first method is being used, you should declare and pass arguments as though you were invoking a function written in C. If the second method is used, you need to pass arguments by reference by either explicitly passing a pointer or by declaring the arguments as reference parameters. (You can force the Pascal compiler to use method 1 by declaring the globally visible routine with the val_param option.)

The following examples show how to pass various objects of different types and sizes to Pascal routines.

7.5.1 Passing Integers and Floating-Point Numbers

Passing characters, integers, and floating-point values to Pascal programs is fairly straightforward. The prototype for the Pascal function should be type compatible with the Pascal function definition. The actual arguments passed must be assignment compatible. To conform with Domain conventions, you should declare IN parameters as reference arguments and all other parameters as pointers.

Consider the following Pascal function that raises its first argument to the power specified by the second argument:

    MODULE power_p;

    FUNCTION power (IN arg1 : SINGLE;
                    IN pow  : INTEGER16) : DOUBLE;
    VAR
        temp  : DOUBLE;
        count : INTEGER16;
    BEGIN
        temp := arg1;
        count := pow;
        WHILE count > 1 DO
            BEGIN
            temp := arg1*temp;
            count := count-1;
            END;
        power := temp;
    END;

The C program below shows various ways to call power():

    /* Program name is "call_power_p". To execute it, you
     * need to compile this program and the Pascal routine
     * in file "power_p.pas", and then bind the two binaries.
     */
    #include <stdio.h>

    int main( void )
    {
        extern double power( float &, short & );
        float x = 2.5;
        short j = 5;
        double z;

        z = power( x, j );
        printf(" %f to the power of %d is %f\n", x, j, z );
        z = power( 3.0, 2 );
        printf(" %f to the power of %d is %f\n", 3.0, 2, z );
        z = power( 2, 3.0 );
        printf(" %f to the power of %d is %f\n", 2.0, 3, z );
    }

Note that both arguments are declared as reference variables in the prototype because they are declared as Pascal IN parameters. Because they are reference variables, it is legal to pass constants, as illustrated in the second and third invocations. In the third invocation, note that the types of the actual arguments do not match the types declared in the prototype. This is acceptable so long as the actual arguments are assignment-compatible with the prototype parameters. The compiler implicitly casts the first argument to float and the second argument to short before passing them. (The printf() arguments, by contrast, are not converted, so they must match the %f and %d conversions.)

7.5.2 Passing Character Arrays

Pascal supports both fixed-length and variable-length character arrays. In C, strings are fixed-length, but C's convention of ending strings with a null character makes them behave like variable-length strings. Having allocated an array of characters, you can store strings of any length in that array as long as they do not exceed the total length of the array.

To facilitate passing strings between the two languages, Domain Pascal supports two run-time functions: ptoc() and ctop(). The ptoc() function appends a null character to a Pascal variable-length string to make it a C-style string. The ctop() function helps convert a C-style null-terminated string into a Pascal-style variable-length string. These functions are primarily designed to simplify calling C functions from Pascal. As shown in the example in this section, though, they can also be used when a C function passes a string to a Pascal routine.
For more information about these functions, and about Pascal variable-length strings, see the Domain Pascal Language Reference manual.

Unlike other types of variables, C arrays are automatically passed by reference. Therefore, if a Pascal routine expects an array argument, you should prototype and invoke the routine as though it were written in C. Do not use reference parameters for array arguments. If you do, you will need to dereference the array before passing it, which will produce very unusual-looking code.

The Pascal program in our example takes one argument: a string with a maximum size of 256 characters. It copies the string to a variable-length string in order to find the length and then reverses the string. Because s is an IN OUT parameter, the reversed string is available to the calling C routine.

    MODULE reverse_string;

    TYPE
        str = ARRAY[1..256] OF CHAR;
    VAR
        var_string : VARYING [256] OF CHAR;
        temp       : CHAR;
        j          : INTEGER;
        len        : INTEGER;

    PROCEDURE reverse_string( IN OUT s : UNIV str );
    BEGIN
        j := 1;
        var_string.body := s;        {Copy s to variable-length string}
        CTOP( var_string );          {Set length of var-length string}
        len := var_string.length;
        WHILE j <= len/2 DO
            BEGIN
            temp := s[j];
            s[j] := s[len+1-j];
            s[len+1-j] := temp;
            j := j+1;
            END;
    END;

The following main() function calls reverse_string(). C automatically passes arrays by reference, so there is no need to precede the array name with an ampersand.

    /* Program name is "call_reverse_string". To execute this
     * program, you need to compile this source file with cc,
     * and the "reverse_string.pas" source file with pas, and
     * then link the two object modules.
     */
    #include <stdio.h>

    int main( void )
    {
        extern void reverse_string( char * );
        static char s[] = "reverse this string";

        reverse_string( s );
        printf( "%s\n", s );
    }

The output is:

    gnirts siht esrever

7.5.3 Passing Pointers

In both C and Pascal, pointers are 4-byte entities.
The example below shows a simple linked-list application. The C program creates the first element of the list and then calls the Pascal routine append() to add new elements to the list. The function printlist() is a C routine that prints the entire list. In addition to illustrating how to pass pointers, this example also shows the correspondence of Pascal records to C structures.

The Pascal program is:

    MODULE pointer_example;

    TYPE
        link = ^list;
        list = RECORD
            nex  : link;
            data : char;
        END;

    PROCEDURE append (firstrec : link;
                      IN val   : CHAR);
    VAR
        newdata : link;
    BEGIN
        new(newdata);        {Allocate memory for new element.}
        WHILE firstrec^.nex <> NIL DO
            firstrec := firstrec^.nex;
        firstrec^.nex := newdata;
        newdata^.data := val;
        newdata^.nex := NIL;
    END;

The C program is shown below. Note that C's NULL pointer (defined in <stdio.h>) corresponds to Pascal's NIL: the Pascal routine ends the list with NIL, and the C printlist() loop stops when it finds NULL.

    #include <stdio.h>

    struct list {
        struct list *next;
        char data;
    };

    int main( void )
    {
        extern void append( struct list **, char & );
        extern void printlist( struct list * );
        struct list first, *base;
        char ch = 'z';

        first.data = 'a';    /* assign 'a' to first element of list */
        first.next = NULL;
        base = &first;
        append( &base, 'b' );
        append( &base, ch );
        printlist( base );
    }

    /* printlist() prints the data in each member of the list. */
    void printlist( struct list *base )
    {
        while (base != NULL)
        {
            printf( "%c\n", base->data );
            base = base->next;
        }
    }

After compiling and binding these routines, the output is:

    a
    b
    z

7.5.4 Simulating the BOOLEAN Type

The Pascal BOOLEAN type is an 8-bit entity that evaluates to TRUE when its numeric value is -1 and to FALSE when its numeric value is 0. (In a packed record, a BOOLEAN uses only one bit.) The BOOLEAN type can be simulated in C with the char data type.

Suppose that you want to call the Pascal routine shown below. This routine takes a BOOLEAN argument and returns a BOOLEAN result (the opposite of the argument).
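Before looking at that routine, the -1/0 encoding itself can be sketched in standard C. This is a minimal illustration under stated assumptions, not part of the Domain interface: the names pas_boolean, PAS_TRUE, PAS_FALSE, to_pas_boolean(), and pas_not() are hypothetical, and signed char is used instead of plain char so that -1 is representable on compilers where plain char is unsigned.

```c
#include <assert.h>

/* Encoding described in the text: TRUE is -1 and FALSE is 0, in 8 bits. */
typedef signed char pas_boolean;            /* hypothetical name */
#define PAS_TRUE  ((pas_boolean)-1)
#define PAS_FALSE ((pas_boolean)0)

/* Convert any C truth value (zero or nonzero) to the -1/0 encoding. */
pas_boolean to_pas_boolean(int c_truth)
{
    return c_truth ? PAS_TRUE : PAS_FALSE;
}

/* Logical NOT that stays inside the -1/0 encoding. C's ! operator
 * yields 1 for true, which is not the TRUE value described above. */
pas_boolean pas_not(pas_boolean b)
{
    assert(b == PAS_TRUE || b == PAS_FALSE);
    return b ? PAS_FALSE : PAS_TRUE;
}
```

Routing every C truth value through a helper such as to_pas_boolean() avoids handing the other language a value like 1, which belongs to neither of the two encodings given above.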
    MODULE pass_boolean_p;

    FUNCTION log_not( IN bool_arg : BOOLEAN ) : BOOLEAN;
    BEGIN
        writeln('Pascal value of argument:', bool_arg);
        bool_arg := NOT bool_arg;
        writeln('Pascal value returned: ', bool_arg);
        log_not := bool_arg;
    END;

The C program below shows several ways to invoke log_not().

    /* Program name is "pass_boolean_c". To execute this
     * program, you must also obtain the Pascal program named
     * "pass_boolean_p". After compiling pass_boolean_p and
     * pass_boolean_c, you must bind them together.
     */
    #include <stdio.h>
    #define TRUE  -1
    #define FALSE 0

    int main( void )
    {
        extern char log_not( char & );
        char x;

        printf( "Numeric value of argument: %d\n", TRUE );
        x = log_not( TRUE );
        printf( "Numeric value returned: %d\n\n", x );
        printf( "Numeric value of argument: %d\n", FALSE );
        x = log_not( FALSE );
        printf( "Numeric value returned: %d\n\n", x );
    }

The output after compiling, binding, and executing is:

    Numeric value of argument: -1
    Pascal value of argument: TRUE
    Pascal value returned: FALSE
    Numeric value returned: 0

    Numeric value of argument: 0
    Pascal value of argument: FALSE
    Pascal value returned: TRUE
    Numeric value returned: -1

7.6 FORTRAN Examples

The following examples show how to pass various objects of different types and sizes to FORTRAN routines. Remember that FORTRAN does not make local copies of parameters: all arguments are passed by reference.

Unlike Pascal, FORTRAN does not include syntax to control whether a parameter can or can't be modified within the called function. To be safe, you should assume that the called function may modify any arguments you pass from C. Therefore, you should be careful about passing constants and expressions. If the FORTRAN routine attempts to modify constants or expressions, a run-time error will occur.
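One defensive habit that follows from this is to copy a constant into a writable temporary and pass the temporary's address. The sketch below is standard C with a C function standing in for the FORTRAN routine; halve_in_place() and halve_constant() are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for a by-reference routine that modifies its
 * argument, as a FORTRAN subroutine is free to do. */
void halve_in_place(float *value)
{
    assert(value != 0);
    *value = *value / 2.0f;
}

/* Pass the constant 3.0 safely: the callee writes to a local copy,
 * never to storage that holds a constant. */
float halve_constant(void)
{
    float tmp = 3.0f;        /* writable copy of the constant */

    halve_in_place(&tmp);
    return tmp;              /* the caller sees the modified copy */
}
```

The cost is one extra local variable per constant argument; the benefit is that a callee that unexpectedly modifies its argument changes only the temporary.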
There is one restriction concerning the types of data that you can pass to, or return from, a FORTRAN routine:

• You cannot return a character array of any size, including 1, from a FORTRAN function. For instance, a FORTRAN function declared as CHARACTER FUNCTION char_func() cannot be called from a C program.

As with Pascal, there are two methods for passing arguments from C to FORTRAN: explicitly pass addresses, or declare the arguments as reference variables so that the compiler will implicitly pass the address. Either method will work, although only the reference variable method enables you to pass constants and expressions.

The choice of which to use is largely a question of style. Using reference variables provides a cleaner interface since the implicit addressing is hidden. On the other hand, this cleanliness can be misleading. Someone reading the code must look at the prototype to realize that the arguments are being passed by reference rather than by value. The examples in Sections 7.6.2 through 7.6.7 illustrate both methods.

7.6.1 Names of FORTRAN Routines

The Domain system supports two FORTRAN compile commands: /bin/f77 and /com/ftn. Both commands compile FORTRAN source files, but the resulting object files differ slightly. One of the differences is that /bin/f77 appends an underscore to all global names. This includes names of functions and subroutines as well as names of common blocks. When you invoke a FORTRAN routine from C, you need to know whether the routine has an appended underscore or not.
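One way to keep that decision in a single place is a naming macro. The sketch below is standard C and entirely illustrative: FTN_APPENDS_UNDERSCORE, FTN_NAME, tri_area, and call_tri_area() are hypothetical names, and the routine is written in C here only so that the sketch links on its own. In a real program the definition would come from the FORTRAN object file.

```c
#include <assert.h>

/* Set to 1 when the FORTRAN routines were compiled by a command that
 * appends an underscore to global names, and to 0 otherwise. */
#ifndef FTN_APPENDS_UNDERSCORE
#define FTN_APPENDS_UNDERSCORE 1
#endif

#if FTN_APPENDS_UNDERSCORE
#define FTN_NAME(name) name##_      /* tri_area becomes tri_area_ */
#else
#define FTN_NAME(name) name
#endif

/* C stand-in for the external routine; arguments arrive by address. */
double FTN_NAME(tri_area)(float *base, float *height)
{
    assert(base != 0 && height != 0);
    return 0.5 * (double)*base * (double)*height;
}

/* Callers use the macro, so only the setting above changes when the
 * routine is rebuilt with the other compile command. */
double call_tri_area(float base, float height)
{
    return FTN_NAME(tri_area)(&base, &height);
}
```

With the setting at 1 the linker sees the symbol tri_area_; with 0 it sees tri_area, and none of the call sites need editing.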
For example, consider the following FORTRAN function definition:

          REAL*8 FUNCTION hypot( side1, side2 )
          REAL*4 side1, side2

If the function is compiled with /com/ftn, the C prototype would be:

    extern double hypot( float &, float & );

On the other hand, if the function is compiled with /bin/f77, the C prototype would be:

    extern double hypot_( float &, float & );

If you don't know how the function was compiled, you need to look at the object file to see whether the function name has an appended underscore. One way to look at the object file is with the nm command, described in the BSD Command Reference manual.

7.6.2 Passing Integers and Floating-Point Data

Passing integers and floating-point values to FORTRAN programs is fairly straightforward. The prototype for the FORTRAN function should be type compatible with the FORTRAN function definition. The actual arguments passed must be assignment compatible.

The example below shows a FORTRAN function that accepts the values of the two sides of a right-angle triangle and returns the length of the hypotenuse. The parameters are REAL*4 and the result is REAL*8.

          REAL*8 FUNCTION hypot(side1,side2)
          REAL*4 side1, side2
          hypot = SQRT((side1*side1) + (side2*side2))
          END

The first C program below shows how to declare and invoke hypot() using pointers as parameters. The second example illustrates the function call using reference variables.
    /* Passing floats using pointers */
    int main( void )
    {
        extern double hypot( float *, float * );
        float x = 3.0, y = 4.0;
        double z;

        z = hypot( &x, &y );
        /* Note that you cannot pass constants -- the following is
         * illegal:
         *     z = hypot( &3.0, &4.0 );
         */
    }

    /* Passing floats using reference variables */
    int main( void )
    {
        extern double hypot( float &, float & );
        float x = 3.0, y = 4.0;
        double z;

        z = hypot( x, y );
        /* Note that it is legal to pass constants */
        z = hypot( 3.0, 4.0 );
    }

7.6.3 Passing Character Data

Passing character data is the same as passing integers, with two exceptions:

• FORTRAN routines expect an additional argument for every character parameter specifying the size of the character array. (For a single character, the size is only one.)

• A FORTRAN routine cannot return character data. To return a character value, create a subroutine and return the character value in a parameter.

Consider the following FORTRAN case-inversion routine that takes two character arguments. The routine inverts the case of the first argument and returns the result through the second argument. The FORTRAN routine is:

          SUBROUTINE UPPER_LOWER( in_char, inverted )
          CHARACTER in_char, inverted
          IF (ICHAR(in_char) .LT. 97) THEN
              inverted = CHAR(ICHAR(in_char) + 32)
          ELSE
              inverted = CHAR(ICHAR(in_char) - 32)
          END IF
          END

The following C program shows how to call upper_lower(). Note that the first character is declared as a reference variable to allow us to pass character constants; the second parameter is declared as a pointer to prevent us from passing a constant. (Passing a constant would produce a run-time error when the FORTRAN routine attempts to modify its value.) Note also that the size parameters come at the end of the argument list. Both size parameters are declared as reference variables so that we can pass them as constants.

    /* Program name is "pass_char_cf". To execute this
     * program, you must also obtain the FORTRAN program named
     * "pass_char_f". After compiling pass_char_cf and
     * pass_char_f, you must bind them together.
     */
    #include <stdio.h>

    int main( void )
    {
        extern void upper_lower( char &, char *, short &, short & );
        char out_char, result;      /* 8-bit variables */
        short long_char;            /* 16-bit variable */

        out_char = 'A';
        long_char = 'b';
        printf( "Original Char\t\tCase-Inverted\n\n" );
        upper_lower( out_char, &result, 1, 1 );
        printf( "\t%c\t\t\t\t\t%c\n", out_char, result );
        upper_lower( 'b', &result, 1, 1 );
        printf( "\t%c\t\t\t\t\t%c\n", 'b', result );
        upper_lower( 81, &result, 1, 1 );       /* 81 is ASCII 'Q' */
        printf( "\t%c\t\t\t\t\t%c\n", 81, result );
    }

The result of program execution is:

    Original Char           Case-Inverted

        A                       a
        b                       B
        Q                       q

Because the hidden size parameters come at the end of the argument list, you can omit them without affecting your program.

7.6.4 Passing Arrays

There are three points to remember when passing arrays from C to FORTRAN:

• FORTRAN expects the size of each character array to be passed implicitly. In the C prototype for the FORTRAN routine, you should declare this extra argument as a short. Size arguments always come at the end of the argument list.

• FORTRAN and C access multidimensional arrays in a different order. In C, the rightmost subscript varies fastest, while in FORTRAN the leftmost subscript varies fastest.

• Unlike other variables, C arrays are passed by reference. Therefore, if a FORTRAN routine expects an array argument, you should prototype and invoke the routine as though it were written in C. Do not use reference parameters for array arguments. If you do, you will need to dereference the array before passing it, which will produce very unusual-looking code.

The following example illustrates how to pass a character array from C to FORTRAN. Note that you can declare the array in FORTRAN as a character string or as an array of type CHARACTER.
The two FORTRAN routines shown here return the last character of a string and the next-to-last character, respectively.

    C     Pass a string and get the last char.
          SUBROUTINE pass_char_array(ca, clen, outchar)
          CHARACTER ca(256)
          INTEGER*2 clen
          CHARACTER outchar
    C     Test for null string.
          IF (clen .LT. 1) THEN
              outchar = ' '
              RETURN
          ENDIF
          outchar = ca(clen)
          RETURN
          END

    C     Pass a string and get the next-to-last char.
          SUBROUTINE pass_char_string(ca, clen, outchar)
          CHARACTER*256 ca
          INTEGER*2 clen
          CHARACTER outchar
    C     Test for null string.
          IF (clen .LT. 1) THEN
              outchar = ' '
              RETURN
          ENDIF
          outchar = ca(clen-1:clen-1)
          RETURN
          END

The following C program calls these FORTRAN routines.

    /* Program name is "pass_char_array_c". To execute this
     * program, you must also obtain the FORTRAN program named
     * "pass_char_array_f". After compiling pass_char_array_c
     * and pass_char_array_f, you must bind them together.
     */
    #include <stdio.h>
    #include <string.h>

    int main( void )
    {
        extern void pass_char_string( char &, short &, char &,
                                      short &, short & );
        extern void pass_char_array( char *, short &, char &,
                                     short &, short & );
        char result;
        static char s1[] = "This is the first string";
        static char s2[] = "This is the second string";

        /* To pass an array declared as a reference variable, you
         * need to dereference the array. This is the WRONG way to
         * pass arrays.
         */
        pass_char_string( *s1, strlen(s1), result,
                          sizeof(s1), sizeof(result) );
        printf( "The second to last character is %c\n", result );

        /* To pass an array declared as a pointer, you just pass the
         * array name, as you would in a C-to-C function invocation.
         * This is the RIGHT way to pass arrays.
         */
        pass_char_array( s2, strlen(s2), result,
                         sizeof(s2), sizeof(result) );
        printf( "The last character is %c\n", result );
    }

The result is:

    The second to last character is n
    The last character is g

Note that we need to pass the length of the string twice. The first string length is for the clen argument explicitly declared in the FORTRAN routines.
The second length is the implicit array size that FORTRAN expects for every character argument. The last argument, sizeof(result), which evaluates to 1, is the length of the outchar parameter.

7.6.4.1 Passing Adjustable Arrays

The following example illustrates how to pass an adjustable array from C to FORTRAN. The C program passes two arguments: an array of integers and the size of the array. The FORTRAN routine uses the second argument to declare the size of the array. The routine then returns the average value of the array elements.

    C     Pass an array of long int and return the average.
          INTEGER*4 FUNCTION pass_int_array(larray, array_len)
          INTEGER*4 array_len
          INTEGER*4 larray(array_len)
          INTEGER*4 i, tot

          tot = 0
          DO i = 1,array_len
              tot = tot + larray(i)
              print *,'larray(',i,') = ',larray(i)
          END DO
          pass_int_array = tot / array_len
          RETURN
          END

The C program is:

    /* Program name is "pass_int_array_c". To execute this
     * program, you must also obtain the FORTRAN program named
     * "pass_int_array_f". After compiling pass_int_array_c
     * and pass_int_array_f, you must bind them together.
     */
    #include <stdio.h>

    int main( void )
    {
        extern int pass_int_array( int *, int & );
        static int average,
                   pass_array[] = { 325, 478, 982, 331, 21, 56, 79 };

        average = pass_int_array( pass_array,
                      sizeof(pass_array)/sizeof(pass_array[0]) );
        printf( "The average is: %d\n", average );
    }

Note that the array is declared as a pointer rather than a reference parameter so that we can pass the array name without dereferencing it; the length is declared as a reference variable so that we can pass the expression that computes the array's length.
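As an aside, the length expression and the averaging arithmetic can be checked in isolation with a minimal standard C sketch. Here a C function stands in for the FORTRAN routine and receives both arguments by address; ARRAY_LEN, average_by_reference(), and demo_average() are hypothetical names.

```c
#include <assert.h>

/* Element count of a true array. This does not work on a pointer,
 * because sizeof would then give the size of the pointer itself. */
#define ARRAY_LEN(a) ((int)(sizeof(a) / sizeof((a)[0])))

/* C stand-in for the averaging routine: both arguments arrive by
 * address, and the division is integer division, so it truncates. */
int average_by_reference(const int *values, const int *len)
{
    int i, tot = 0;

    assert(values != 0 && len != 0 && *len > 0);
    for (i = 0; i < *len; i++)
        tot += values[i];
    return tot / *len;
}

int demo_average(void)
{
    static int pass_array[] = { 325, 478, 982, 331, 21, 56, 79 };
    int len = ARRAY_LEN(pass_array);    /* 7 elements */

    return average_by_reference(pass_array, &len);
}
```

The seven elements sum to 2272, and 2272 / 7 truncates to 324, so demo_average() returns 324.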
The result of executing pass_int_array_c is:

    larray( 1) = 325
    larray( 2) = 478
    larray( 3) = 982
    larray( 4) = 331
    larray( 5) = 21
    larray( 6) = 56
    larray( 7) = 79
    The average is: 324

7.6.4.2 Passing Multidimensional Arrays

When you pass a multidimensional array, it is important to remember that in C the rightmost subscript varies fastest, while in FORTRAN the leftmost subscript varies fastest. The example below shows the consequences of this difference. The FORTRAN routine is:

          SUBROUTINE dyn_dim(arr, x, y)
          INTEGER*4 x, y
          INTEGER*4 arr(x, y)
          INTEGER*2 i, j

          WRITE(*,*)
          WRITE(*,*) 'This is the FORTRAN array:'
          DO i = 1, x
              DO j = 1, y
                  WRITE(*,*) 'arr(', i, ',', j, ') = ', arr(i, j)
              END DO
          END DO
          END

The C program is shown below. Note that the array is declared as a pointer to an int, just as it would be declared if dyn_dim() were a C function. The x and y arguments are declared as reference parameters so that we can pass constants.

    /* Program name is "multi_dim_array_c". To execute this
     * program, you must also obtain the FORTRAN program named
     * "multi_dim_array_f". After compiling multi_dim_array_c
     * and multi_dim_array_f, you must bind them together.
     */
    #include <stdio.h>
    #define DIM1 2
    #define DIM2 3

    int main( void )
    {
        extern void dyn_dim( int *, int &, int & );
        static int arr[DIM1][DIM2] = { { 1, 2, 3 }, { 4, 5, 6 } };
        short i, j;

        printf( "This is the C array:\n" );
        for (i = 0; i <= 1; i++)
            for (j = 0; j <= 2; j++)
                printf( "arr(%d,%d) = %d\n", i, j, arr[i][j] );
        dyn_dim( arr, DIM1, DIM2 );
    }

The result is:

    This is the C array:
    arr(0,0) = 1
    arr(0,1) = 2
    arr(0,2) = 3
    arr(1,0) = 4
    arr(1,1) = 5
    arr(1,2) = 6

    This is the FORTRAN array:
    arr( 1, 1) = 1
    arr( 1, 2) = 3
    arr( 1, 3) = 5
    arr( 2, 1) = 2
    arr( 2, 2) = 4
    arr( 2, 3) = 6

7.6.5 Passing Pointers

As an extension to the ANSI standard, Domain FORTRAN enables a FORTRAN routine to dereference pointers passed from C or Pascal programs. For complete details, consult the Domain FORTRAN User's Guide.
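Because every argument reaches the FORTRAN routine by reference, a pointer argument arrives as the address of a pointer. The sketch below shows that double indirection in standard C, with a C function standing in for the FORTRAN side. The names ipow() and update_through_pointer() are hypothetical, and the arithmetic simply mirrors the FORTRAN example in this section.

```c
#include <assert.h>

typedef struct { short s1, s2, s3, s4; } S;

/* Integer power by repeated multiplication (exponent >= 0). */
static short ipow(short base, short exponent)
{
    short result = 1;

    while (exponent-- > 0)
        result = (short)(result * base);
    return result;
}

/* C stand-in for a callee that receives a pointer by reference:
 * the parameter is the address of a pointer to S. */
void update_through_pointer(S **pp)
{
    S *p;

    assert(pp != 0 && *pp != 0);
    p = *pp;                        /* first dereference: the pointer */
    p->s1 = (short)(p->s1 + 1);     /* second dereference: the data   */
    p->s2 = ipow(2, p->s1);
    p->s3 = ipow(3, p->s1);
    p->s4 = ipow(4, p->s1);
}
```

A caller holding a structure s would write S *p = &s; update_through_pointer(&p); so the callee dereferences twice before it reaches the four short fields.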
In the following example, the C program passes the FORTRAN subroutine a pointer to a structure that contains four short integers. By using the POINTER statement, the FORTRAN subroutine is able to modify the structure elements.

Pay special attention to the C prototype for the FORTRAN routine. We declare the parameter as a pointer to a structure of type S, passed by reference. What actually gets passed, therefore, is a pointer to a pointer. We declare the parameter as a reference parameter so that we can pass a constant (the result of the address-of operator).

The FORTRAN subroutine is:

          SUBROUTINE pass_point(p1)
          INTEGER*4 p1
          INTEGER*2 a,b,c,d
          POINTER /p1/ a,b,c,d
          a = a+1
          b = 2**a
          c = 3**a
          d = 4**a
          END

The C program is:

    /* Program name is "pass_point_c". To execute this
     * program, you must also obtain the FORTRAN program named
     * "pass_point_f". After compiling pass_point_c and
     * pass_point_f, you must bind them together.
     */
    #include <stdio.h>

    typedef struct {
        short s1, s2, s3, s4;
    } S;

    int main( void )
    {
        extern void pass_point( S *& );   /* Parameter is a pointer to
                                           * S, passed by reference. */
        static S struct_pass = { 1, 1, 1, 1 };

        pass_point( &struct_pass );
        printf( "%d\n%d\n%d\n%d\n", struct_pass.s1, struct_pass.s2,
                struct_pass.s3, struct_pass.s4 );
    }

The result is:

    2
    4
    9
    16

7.6.6 Simulating the LOGICAL Types

Domain FORTRAN supports three LOGICAL types:

• LOGICAL and LOGICAL*4

• LOGICAL*2

• LOGICAL*1

The numbers refer to the length, in bytes, of the type. Note that the default is four bytes long. Each of these types describes an object that evaluates to TRUE when its numeric value is -1 and to FALSE when its numeric value is 0. In C you can simulate the logical types with integer types of the same size.

The following FORTRAN function accepts two arguments, a LOGICAL*1 and a LOGICAL*2, and returns a LOGICAL*4. Note that out_arg is modified, so we need to be careful not to pass the address of a constant.
          LOGICAL*4 FUNCTION pass_logical(in_arg, out_arg)
          LOGICAL*1 in_arg
          LOGICAL*2 out_arg
          PRINT *,'FORTRAN value of in_arg:',in_arg
          PRINT *,'FORTRAN value of out_arg:',out_arg
          out_arg = .NOT. out_arg
          pass_logical = in_arg .EQV. out_arg
          PRINT *,'FORTRAN value returned:',pass_logical
          END

The C program below shows how to invoke pass_logical().

    /* Program name is "pass_logical_c". To execute this
     * program, you must also obtain the FORTRAN program named
     * "pass_logical_f". After compiling pass_logical_c and
     * pass_logical_f, you must bind them together.
     */
    #include <stdio.h>
    #define TRUE  -1
    #define FALSE 0

    int main( void )
    {
        extern int pass_logical( char &, short * );
        char arg1 = TRUE;       /* 1 byte, matches LOGICAL*1 */
        short arg2 = TRUE;      /* 2 bytes, matches LOGICAL*2 */
        int result;

        printf( "C numeric value of arg1: %d\n", arg1 );
        printf( "C numeric value of arg2: %d\n", arg2 );
        result = pass_logical( arg1, &arg2 );
        printf( "C numeric value of arg2 after function call: %d\n", arg2 );
        printf( "C numeric value returned: %d\n\n", result );

        printf( "C numeric value of arg1: %d\n", arg1 );
        printf( "C numeric value of arg2: %d\n", arg2 );
        result = pass_logical( arg1, &arg2 );
        printf( "C numeric value of arg2 after function call: %d\n", arg2 );
        printf( "C numeric value returned: %d\n\n", result );
    }

The output after compiling, binding, and executing is:

    C numeric value of arg1: -1
    C numeric value of arg2: -1
    FORTRAN value of in_arg: T
    FORTRAN value of out_arg: T
    FORTRAN value returned: F
    C numeric value of arg2 after function call: 0
    C numeric value returned: 0

    C numeric value of arg1: -1
    C numeric value of arg2: 0
    FORTRAN value of in_arg: T
    FORTRAN value of out_arg: F
    FORTRAN value returned: T
    C numeric value of arg2 after function call: -1
    C numeric value returned: -1

7.6.7 Simulating the COMPLEX Types

Domain FORTRAN supports two sizes of complex data types.
The FORTRAN COMPLEX data type is stored as two 4-byte floating-point numbers, the first representing the real part and the second representing the imaginary part of a complex value. The COMPLEX*16 type is stored as two 8-byte floating-point numbers. It is easy to simulate both types in C via structures containing two floats or two doubles.

In the following example, the FORTRAN function accepts a COMPLEX argument and returns the square of the argument.

          COMPLEX FUNCTION sqr_comp( com_param )
          COMPLEX com_param
          sqr_comp = com_param * com_param
          END

The C program is:

    /* Program name is "pass_complex_c". To execute this
     * program, you must also obtain the FORTRAN program named
     * "pass_complex_f". After compiling pass_complex_c and
     * pass_complex_f, you must bind them together.
     */
    #include <stdio.h>

    typedef struct {
        float real;
        float imag;
    } COMPLEX;

    int main( void )
    {
        extern COMPLEX sqr_comp( COMPLEX * );
        static COMPLEX result, arg = { 2.5, 3.5 };

        printf( "Complex Number\t\t\tSquare of Number\n\n" );
        result = sqr_comp( &arg );
        printf( "(%f,%f)\t\t(%f,%f)\n", arg.real, arg.imag,
                result.real, result.imag );
    }

The result is:

    Complex Number              Square of Number

    (2.500000,3.500000)         (-6.000000,17.500000)

7.7 Data Sharing

As the previous sections illustrated, one way to share data between routines is by passing arguments. The following sections describe two other methods:

• Explicitly define and allude to global variables in the C and Pascal routines.

• Create overlay sections.

Before describing these two techniques, it will be helpful to explain how declarations of global variables get entered into the object file. This is especially important in C because the /bin/cc command and the /com/cc command handle global declarations differently.

7.7.1 Global Variable Declarations Using /com/cc

NOTE: The description in this section assumes that you do not use the -bss switch. If you specify this switch, the compiler will handle global variables as described in Section 7.7.2.
When the /com/cc compiler encounters a global definition, it creates a new section in the object file to hold the variable. The name of the new section is the same as the name of the variable. These sections are called overlay sections because the linker is allowed to overlay sections with the same name. If you include the file scope declaration:

    int x;

in three different source files, the compiler will produce an overlay section named x in each of the three resulting object files. When you link these object files together, the linker overlays the three sections with the same name so that there is only one section for the variable in the resulting executable file.

Because of this overlay technique, it is possible to initialize a global variable in more than one source file (although this is not recommended). The variable gets whatever initial value was overlaid last. (Sections are overlaid in the order in which the files are listed in the link command.) If none of the source files contains an initialization value, the linker initializes the variable to zero.

Note that this discussion refers to global definitions, not global allusions. If you allude to a global variable (precede the declaration with extern), the compiler enters the variable into the symbol table as an undefined name. It is up to the linker to resolve this reference by finding the definition in another object module. If the linker can't resolve an allusion, it reports an error.

7.7.2 Global Variable Declarations Using /bin/cc

Unlike the /com/cc compiler, the /bin/cc compiler makes a distinction between global definitions that contain an initializer and those that don't. If the compiler encounters a global definition with an initializer, it allocates space for the variable in the .data section of the object file, which is where local static data is also kept.
If the definition does not contain an initializer, the compiler treats the variable as "weakly defined": it enters the variable into the symbol table, but does not allocate any storage for it. When the linker attempts to resolve undefined references, it recognizes these "weakly defined" variables as a special case. If the linker cannot find memory allocated for a weakly defined variable in any of the other object modules, it allocates memory for it in a section named .bss. Eventually, therefore, all uninitialized global variables are placed in .bss. At run time, the entire section is initialized to zero.

To put a global variable in a named section, as is done with /com/cc, you must declare the variable with the #attribute[section] specifier, described in Section 3.16.6.

NOTE: When the .bss section is used, you must pass object modules through the linker before you can execute them. If you compile with the /bin/cc command, the linker is automatically invoked. However, if you compile with /com/cc and the -bss switch, you must explicitly invoke the linker yourself.

7.7.3 Case Sensitivity and Global Names

Unlike C, Pascal and FORTRAN are case-insensitive, which means that names written in lowercase are the same as names written in uppercase. By convention, both Pascal and FORTRAN export global variables to the linker as lowercase names. Therefore, all C global names that are accessed by FORTRAN or Pascal routines must also be lowercase. C global names that are shared only between C modules may use both uppercase and lowercase letters.

7.7.4 Data Sharing Between C and Pascal

There are two ways to declare global variables in Pascal and C such that the linker can resolve references:

• Declare the variables so that they are placed in the .data or .bss sections.

• Declare the variables so that they are placed in named overlay sections.
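On the C side, both approaches rest on the difference between defining a global and alluding to it. The following minimal sketch, in standard C and within a single file, shows the three forms involved: a definition with an initializer, a definition without one, and an allusion made with extern. The names shared_total, shared_calls, and add_three() are hypothetical, and the sketch says nothing about which object-file section a particular compiler chooses.

```c
#include <assert.h>

int shared_total = 1;   /* definition with an initializer */
int shared_calls;       /* definition without an initializer: the C
                         * language guarantees it starts at zero */

void add_three(void)
{
    /* Allusions: these declare the names without defining storage,
     * so they refer to the definitions above. */
    extern int shared_total;
    extern int shared_calls;

    assert(shared_calls >= 0);      /* sanity check on the counter */
    shared_total += 3;
    shared_calls++;
}
```

In a multi-module program the extern declarations would sit in a different source file from the definitions, and the linker, not the compiler, would connect the two.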
7.7.4.1 Declaring .data and .bss Global Variables

In Pascal, an external variable is defined with the DEFINE keyword and alluded to with the EXTERN keyword. All variables defined with DEFINE are placed in the .data section of the object file. Variables declared as EXTERN are listed as unresolved references in the symbol table. For compatible behavior in C, you must compile with /bin/cc or use the -bss switch with /com/cc.

There are several scenarios for declaring and defining variables in Pascal and C. The three most common are described below:

• Define a variable in Pascal and allude to it in C. For example, the Pascal source file might contain the following:

      VAR x : DEFINE INTEGER32 := 0;

  and the C file would contain:

      extern int x;

  In this case, the definition in the Pascal module causes the compiler to allocate space for x in the .data section. The C declaration produces an undefined reference to x in the symbol table, which is resolved by the linker.

• Define a variable in C (initialized) and allude to it in Pascal. For example, the C file would contain:

      int x = 10;

  and the Pascal source file would declare x as:

      VAR x : EXTERN INTEGER32;

  In this case, the definition of x in the C module forces the C compiler to allocate space for x in the .data section. The declaration of x in the Pascal file causes the compiler to produce an undefined reference to x in the symbol table, which is resolved by the linker.

• Define a variable in C (uninitialized) and allude to it in Pascal. For example, the C file would contain:

      int x;

  and the Pascal source file would declare x as:

      VAR x : EXTERN INTEGER32;

  In this case, the uninitialized definition of x in the C module causes the C compiler to make a "weakly defined" entry in the symbol table. The declaration of x in the Pascal file causes the compiler to produce an undefined reference to x in the symbol table. The linker then places x in the .bss section, initialized to zero, and resolves the Pascal reference.

It is also possible to define the same variable in C and in Pascal, as long as at most one of the definitions contains an initializer. If both definitions contain initializers, the linker will report an error.

In the following example, we define the global variable xx at the top of the C source file; the function main() prints the initial value of xx and then calls the C routine add_three(), which adds 3 to xx; finally, add_three() calls the Pascal procedure sub_two(), which subtracts 2 from xx.

The Pascal routine is:

    MODULE global_var_p;

    PROCEDURE sub_two;
    VAR
        xx : EXTERN INTEGER32;
    BEGIN
        xx := xx - 2;
        WRITELN('Value of xx after sub_two():', xx);
    END;

The C routines are:

    /* Program name is "global_var_c". To execute this program,
     * you must also obtain the Pascal program named
     * "global_var_p". After compiling global_var_p and
     * global_var_c, you must bind them together.
     */
    #include <stdio.h>

    int xx = 1;     /* Definition of xx */

    int main( void )
    {
        extern void add_three( void );

        printf( "Initial value of xx: %d\n", xx );
        add_three();
    }

    void add_three( void )
    {
        extern void sub_two( void );

        xx += 3;
        printf( "Value of xx after add_three(): %d\n", xx );
        sub_two();
    }

The result of executing the program is:

    Initial value of xx: 1
    Value of xx after add_three(): 4
    Value of xx after sub_two(): 2

7.7.4.2 Creating Overlay Data Sections

Both C and Pascal have syntaxes that enable you to produce named overlay sections for global data. Since the binder ensures that overlay sections with the same name refer to the same memory locations, this mechanism enables you to share data across procedures.

In Pascal, you create an overlay section with the syntax:

    VAR '(' section_name ')'
        declaration
        declaration

For instance, the following statements define an overlay section called example with two variables.
    VAR (example)
        x : INTEGER16;
        y : DOUBLE;

In C, there are two ways to create overlay sections. If you use the /com/cc compiler, you can create an overlay section simply by defining an external variable. All external variables are automatically stored in their own named sections. For instance, if compiled with /com/cc, the declarations shown below create three overlay sections called first_sec, second_sec, and example.

    int first_sec = 0;
    float second_sec = 1.0;
    struct {
        short x;
        double y;
    } example;

    main()
    {

Note that example contains two variables: x and y.

If you compile your program with /bin/cc, you need to use the special #attribute[section] syntax (described in Section 3.16.6) to create a named overlay section:

    int first_sec #attribute[section(first_sec)] = 0;
    float second_sec #attribute[section(second_sec)] = 1.0;
    struct {
        short x;
        double y;
    } example #attribute[section(example)];

    int main( void )
    {

Consider the example below. The Pascal program calculates the power of a number. The number, the exponent, and the resulting value are all located in an overlay section accessible to the calling C program.

The Pascal routine is:

    VAR (sec1)                  { All Pascal names are sent to }
                                { the binder in lowercase.     }
        exponent : INTEGER32;
        value    : SINGLE;
        result   : DOUBLE := 1.0;

    PROCEDURE power;
    VAR
        temp : SINGLE;
    BEGIN
        temp := exponent;
        WHILE (temp >= 1) DO
            BEGIN
            result := result*value;
            temp := temp-1;
            END
    END;

The /com/cc version of the program is:

    /* Program name is "section_example_c". To execute this
     * program, you must also obtain the Pascal program named
     * "section_example_p". After compiling section_example_p
     * and section_example_c, you must bind them together.
     */
    #include <stdio.h>

    struct {
        int exp;
        float val;
        double res;
    } sec1;

    int main( void )
    {
        extern void power( void );

        sec1.val = 5.1;
        sec1.exp = 3;
        power();
        printf( "%f to the power of %d is: %f\n",
                sec1.val, sec1.exp, sec1.res );
    }

The /bin/cc version of the program is:

    /* Program name is "section_example_c". To execute this
     * program, you must also obtain the Pascal program named
     * "section_example_p". After compiling section_example_p
     * and section_example_c, you must bind them together.
     */
    #include <stdio.h>

    struct {
        int exp;
        float val;
        double res;
    } sec1 #attribute[section(sec1)];

    int main( void )
    {
        extern void power( void );

        sec1.val = 5.1;
        sec1.exp = 3;
        power();
        printf( "%f to the power of %d is: %f\n",
                sec1.val, sec1.exp, sec1.res );
    }

The result is:

    5.100000 to the power of 3 is: 132.650993

Note that the names of the variables in the overlay section can be different in the two routines. Their sizes and types, however, should be the same.

7.7.5 Data Sharing Between FORTRAN and C

In FORTRAN, variables are declared external by placing them in a common block. A common block declaration creates an overlay data section. To communicate with a FORTRAN routine, the C program must create an overlay section with the same name. If you compile with /com/cc, you can create an overlay section simply by defining an external variable. If you compile with /bin/cc, you must use the special #attribute[section] syntax, as described in Section 3.16.6. For example:

The FORTRAN program is:

          COMMON /XVAR/ X
          INTEGER*4 X

The /com/cc declaration is:

    int xvar;

The /bin/cc declaration is:

    int xvar #attribute[section(xvar)];

Note that the C declaration corresponds to the name of the common block, not to the name of the variable in the common block.

If the FORTRAN common block contains more than one external variable, the C source file should define an external structure with the same name as the common block. The fields of the structure should correspond to the variables in the common block. For example, consider the following FORTRAN and C declarations.
Here are the declarations in the FORTRAN source file:

    COMMON /CNAME/ IFIELD, RFIELD
    INTEGER*4 IFIELD
    REAL RFIELD

Here is the /com/cc version of the declaration:

    struct {
        int ifield;
        float rfield;
    } cname;

Here is the /bin/cc version of the declaration:

    struct {
        int ifield;
        float rfield;
    } cname #attribute[section(cname)];

Note that the variable is declared as cname and not CNAME in the C programs. This is because all FORTRAN global names are exported to the linker in lowercase.

The example below illustrates this data-sharing mechanism. The C routine calls a FORTRAN subroutine that evaluates the natural log of a number. The number and the log of the number are global variables that can be accessed by both routines.

The FORTRAN routine is:

    SUBROUTINE GET_LOG
    REAL*4 NUM, LOG_OF_NUM
    COMMON /GLOBAL_VARS/ NUM, LOG_OF_NUM
    LOG_OF_NUM = LOG(NUM)
    END

The /com/cc version of the program is:

    /* Program name is "get_log_c". To execute this program,
     * you must also obtain the FORTRAN program named "get_log_f".
     * After compiling get_log_c and get_log_f, you must bind
     * them together.
     */
    #include <stdio.h>

    struct S {
        float cnum;
        float clog_of_num;
    } global_vars = { 1.0, 0.0 };

    int main( void )
    {
        extern void get_log( void );

        printf( "Number\t\t\tNatural Log of Number\n\n" );
        while (global_vars.cnum++ < 10)
        {
            get_log();
            printf( "%f\t\t\t%f\n", global_vars.cnum, global_vars.clog_of_num );
        }
    }

The /bin/cc version of the program is:

    /* Program name is "get_log_c". To execute this program,
     * you must also obtain the FORTRAN program named "get_log_f".
     * After compiling get_log_c and get_log_f, you must bind
     * them together.
     */
    #include <stdio.h>

    struct S {
        float cnum;
        float clog_of_num;
    } global_vars #attribute[section(global_vars)] = { 1.0, 0.0 };

    int main( void )
    {
        extern void get_log( void );

        printf( "Number\t\t\tNatural Log of Number\n\n" );
        while (global_vars.cnum++ < 10)
        {
            get_log();
            printf( "%f\t\t\t%f\n", global_vars.cnum, global_vars.clog_of_num );
        }
    }

If we compile, bind, and execute, the program produces the following results:

    Number              Natural Log of Number

    2.000000            0.693147
    3.000000            1.098612
    4.000000            1.386294
    5.000000            1.609438
    6.000000            1.791759
    7.000000            1.945910
    8.000000            2.079442
    9.000000            2.197225
    10.000000           2.302585

7.8 System Service Routines

System routines provide a variety of services, including direct manipulation of the display, error handling, interprocess communication, and general input and output. These routines are described in the Domain/OS Call Reference manual and the Programming With Domain/OS Calls manual.

The system routines follow the standard calling conventions described earlier in this chapter. You should treat them like Pascal routines when passing arguments and variables.

7.8.1 Insert Files

There are a number of header files distributed with the operating system and language software. A header file defines constants and type definitions used by the system service routines, as well as declarations of the system service routines themselves. Each system component has an associated header file. For example, there is a header file for serial I/O, for touchpad manipulation, and for error reporting.

All header files are distributed in the directory /usr/include. You can include a header file by specifying the full pathname enclosed in double quotes or by enclosing the filename in angle brackets:

    #include "/usr/include/component_name.h"

or

    #include <component_name.h>

where component_name is one of the header files listed in the Domain/OS Call Reference manual. Note that all header filenames end with a .h suffix.
The example below shows the files needed for a C program that uses the system I/O routines and system error-handling routines:

    #include
    #include
    #include

Always include the base.h header file first, since some of the other system header files rely on the definitions in this file.

7.8.2 Returned Status Code

Most system routines return a status code as a value of the system's STATUS_$T type. The status code indicates whether the routine completed successfully. The value of the status code is STATUS_$OK (defined as zero in base.h) for successful completion, positive for an error-level failure, and negative for a warning-level error.

You should check the value of the status code after each system call to find out whether errors occurred. Every nonzero status code is associated with descriptive error text. To analyze the status code and retrieve the text, use the error-handling routines described in the Programming with Domain/OS Calls manual.

7.8.3 Linking and Execution

The system service routines are included in preinstalled, shared libraries. References to identifiers in these libraries are resolved at execution time. Therefore, you do not need to specify any additional files when compiling or binding a program that calls system service routines. For more information about the linker, refer to Domain/OS Programming Environment Reference.

Chapter 8
Input and Output

Input and output are not built-in features of the C language. Instead, C comes with a standard run-time library that covers I/O functions and other operations. In addition to the standard run-time library, there is the UNIX run-time library, which enables you to perform I/O at a lower level, and the Domain/OS system library, which enables you to perform I/O at the lowest level. Altogether, there are three types of I/O, organized hierarchically.
Each higher-level function maps onto one or more lower-level functions, as shown in Figure 8-1.

Figure 8-1. Hierarchy of I/O Libraries (from highest level to lowest: Standard I/O, UNIX I/O, Domain/OS System Calls, I/O Devices)

Ultimately, all I/O is performed through Domain/OS system calls. The lower levels give you more flexibility, but they are more difficult to use and are not portable. All of the functions described in this chapter are available in all Domain/OS environments. Briefly, the three types of I/O are:

Standard I/O Functions
    The standard C I/O library (clib) enables you to open and close files, and to read and write data in a variety of formats. These functions provide automatic buffering by default, but you can override this mechanism. In addition to file I/O functions, the standard library also includes several functions for performing I/O to default input and output devices. The standard I/O functions are the most portable. They are implemented in most C libraries regardless of the operating system.

UNIX I/O Functions
    For users writing UNIX applications, these functions enable you to access files and devices via UNIX-compatible system calls. These functions offer many of the same capabilities as the standard I/O functions, but without buffering. In addition, the UNIX calls give you more control in assigning protection attributes to files.

Domain/OS System Calls
    At the lowest level, you can access the Domain/OS operating system directly. These calls are more complex than the other two groups and they do not provide any portability. On the other hand, they offer some features that are not available with the other functions. You should use these calls only if portability is not an issue. In particular, you should use the Domain/OS system calls to access mailboxes, perform GPIO operations on peripheral devices, and access files that have a system-defined structure.

This chapter primarily describes performing I/O operations using the standard I/O library.
For specific information about the standard I/O functions and the UNIX I/O functions, see the SysV Programmer's Reference and the BSD Programmer's Reference manuals. For information about Domain system calls, refer to the Programming with Domain/OS Calls manual.

8.1 General Remarks

The next few sections provide an overview of many of the I/O concepts that are common to both the standard buffered I/O library and the UNIX unbuffered library.

8.1.1 File Types

The Domain operating system supports many types of files, including the following:

• Headerless ASCII files
• Fixed-length record files
• Variable-length record files
• User-written type-manager files (extensible streams)
• No defined-record structure files

The Domain/OS system calls enable you to create and access any of these types. With the standard I/O library and UNIX functions, however, you can access only ASCII files. These are files that consist of a string of ASCII characters. You can create your own records within such a file by entering a delimiting character, but there is no predefined record structure. Also, you can read and write bytes in numeric rather than string formats, but it is your responsibility to keep track of how data is represented.

8.1.2 Streams and File Descriptors

C makes no distinction between devices such as a terminal or tape drive and logical files located on a disk. In all cases, I/O is performed through streams that are associated with the files or devices. A stream consists of an ordered series of bytes. You can think of it as a one-dimensional array of characters, as shown in Figure 8-2. Reading or writing to a file or device involves reading data from the stream or writing data onto the stream.

Figure 8-2. C Programs Access Data on Files Through Streams

To perform I/O operations, you must associate a stream with a file or device.
For the buffered I/O operations (the ones in the standard I/O library), you do this by declaring a pointer to a structure type called FILE. The FILE structure, which is defined in the stdio.h header file, contains several fields to hold such information as the file's name, its access mode, and a pointer to the next character in the stream.

The FILE structures provide the operating system with bookkeeping information, but your only means of access to the stream is the pointer to the FILE structure (called a file pointer). The file pointer, which you must declare in your program, holds the stream identifier returned by the fopen() function. You use the file pointer to read from, write to, or close the stream. A program may have more than one stream open simultaneously, although each implementation imposes a limit on the number of concurrent streams. The limit for Domain/OS systems is 31.

For UNIX unbuffered functions, you must also associate a stream with a file, but instead of identifying the file by a pointer to the stream, you identify it with a file descriptor. A file descriptor is a unique integer that identifies a particular stream. It is a component of the FILE structure. You can obtain a file descriptor with the open() function.

Even if you open a file with a standard I/O function, it is possible to extract the file descriptor and access the file through UNIX functions. Conversely, you can open a file with UNIX functions and then access it with standard I/O functions. You should not, however, mix UNIX read and write operations with standard I/O read and write operations.

8.2 The Standard I/O Library

The standard, buffered I/O library contains nearly 30 functions for accessing files and devices. We have divided the functions into two groups:

1) Those that access standard streams.
2) Those that access user-defined files and devices.

Before describing the specific functions, however, we discuss the buffering mechanism.
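The relationship between a file pointer and a file descriptor described in Section 8.1.2 can be sketched in a few lines. This is a minimal illustration rather than Domain/OS-specific code: the helper name descriptor_of() is hypothetical, and the sketch assumes a POSIX-style library in which fileno() is declared in stdio.h.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: exposes fileno() on POSIX hosts */
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: opens path with fopen() and returns the file
 * descriptor that underlies the resulting stream, or -1 if the file
 * cannot be opened.  The stream is closed before returning, so the
 * descriptor value is informational only.
 */
int descriptor_of(const char *path, const char *mode)
{
    FILE *fp;
    int fd;

    assert(path != NULL && mode != NULL);

    fp = fopen(path, mode);      /* fp is the file pointer */
    if (fp == NULL)
        return -1;

    fd = fileno(fp);             /* fd is the descriptor behind the stream */
    fclose(fp);
    return fd;
}
```

A caller sees a small nonnegative integer for any file that opens successfully. The value itself carries no meaning beyond identifying the open stream to the UNIX-level functions.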
8.2.1 Buffering

Compared to memory, secondary storage devices such as disk drives and tape drives are extremely slow. For most programs that involve I/O, the time taken to access these devices overshadows the time the CPU takes to perform operations. It is extremely important, therefore, to reduce the number of physical read and write operations as much as possible. Buffering is the simplest way to do this.

A buffer is an area where data is temporarily stored before being sent to its ultimate destination. Buffering provides more efficient data transfer because it enables the operating system to minimize accesses to I/O devices.

All operating systems use buffers to read from and write to I/O devices. That is, the operating system only accesses I/O devices in fixed-size chunks, called blocks. Typically, a block is 512 or 1024 bytes. In Domain/OS systems, blocks are 1024 bytes long by default. This means that even if you want to read only one character from a file, the operating system reads the entire block in which the character is located. For a single read operation, this isn't very efficient, but suppose you want to read 1000 characters from a file. If I/O were unbuffered, the system would perform 1000 disk seek and read operations. With buffered I/O, on the other hand, the system reads an entire block into memory and then fetches each character from memory when necessary. This saves 999 I/O operations.

The C run-time library contains an additional layer of buffering, which comes in two forms: line buffering and block buffering. In line buffering, the system stores characters until a newline character is encountered, or until the buffer is filled, and then sends the entire line to the operating system to be processed. This is what happens, for example, when you read data from the terminal. The data is saved in a buffer until you enter a newline character. At that point, the entire line is sent to the program.
In block buffering, the system stores characters until a block is filled, and then passes the entire block to the operating system. Note that these are not the same blocks used by the operating system. To distinguish between the two levels of buffering, we use the term user-level blocks to refer to blocks used by the standard I/O library, and kernel-level blocks for blocks used by the operating system. By default, all I/O streams that point to a file are block-buffered. Streams that point to your terminal (stdin and stdout) are line-buffered.

The buffered I/O library package includes a buffer manager that keeps buffers in memory as long as possible. So if you access the same portion of a stream more than once, there is a good chance that the system can avoid accessing the I/O device multiple times. Note, however, that this can create problems if the file is being shared by more than one process. For inter-process synchronization, you need to use UNIX unbuffered functions or Domain/OS system calls.

In both line buffering and block buffering, you can explicitly direct the system to flush the buffer at any time (with the fflush() function), sending whatever data is in the buffer to its destination.

Although line buffering and block buffering are more efficient than processing each character individually, they are unsatisfactory if you want each character to be processed as soon as it is input or output. For example, you may want to process characters as they are typed rather than waiting for a newline to be entered. C allows you to tune the buffering mechanism by changing the default size of the buffer. You can set the size to zero to turn buffering off entirely. Alternatively, you can use the UNIX unbuffered functions or Domain/OS system calls.
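The effect of turning buffering off can be observed with a short experiment. The sketch below is an illustration under stated assumptions, not Domain/OS-specific code: it uses the ISO C form of setvbuf() with the _IONBF and _IOFBF mode constants, the helper name visible_before_flush() is hypothetical, and the result for the block-buffered case assumes a typical implementation whose default buffer is larger than one byte.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes one byte to path through a stream whose
 * buffering mode is `mode` (_IOFBF, _IOLBF, or _IONBF).  Before the
 * writer is flushed or closed, it reads the file back through a
 * second stream.  Returns 1 if the byte was already in the file,
 * 0 if it was not, and -1 on error.
 */
int visible_before_flush(const char *path, int mode)
{
    FILE *out, *in;
    int seen;

    assert(path != NULL);

    out = fopen(path, "w");
    if (out == NULL)
        return -1;

    /* setvbuf() must precede any other operation on the stream. */
    if (setvbuf(out, NULL, mode, BUFSIZ) != 0) {
        fclose(out);
        return -1;
    }
    if (putc('x', out) == EOF) {
        fclose(out);
        return -1;
    }

    in = fopen(path, "r");       /* independent reader on the same file */
    if (in == NULL) {
        fclose(out);
        return -1;
    }
    seen = (getc(in) == 'x');

    fclose(in);
    fclose(out);
    return seen;
}
```

With _IONBF the byte goes straight to the operating system, so the reader is expected to see it. With _IOFBF the byte typically remains in the user-level buffer until the writer is flushed or closed, so the reader finds an empty file.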
There are several functions in the standard library that allow you to change the buffering parameters of a stream:

void setbuf( FILE *stream, char *buf )
    Assigns a specific buffer to a stream rather than using the default buffer. If you pass a null pointer as the buffer, then the stream is unbuffered.

void setbuffer( FILE *stream, char *buf, int size )
    (BSD library only) Same as setbuf(), but allows you to set the size of the buffer.

void setlinebuf( FILE *stream )
    (BSD library only) Changes stdout or stderr from block-buffered or unbuffered to line-buffered.

void setvbuf( FILE *stream, char *buf, int type, int size )
    (SysV library only) Assigns a specific buffer to a stream. You may specify block buffering, line buffering, or no buffering. If you specify block buffering, you may also specify the size of the block.

In most instances, buffering is invisible. The standard I/O functions make sure that all data is processed as if it were being handled immediately even though it is not. So long as you do not mix buffered calls with unbuffered calls, you should have no problem.

8.2.2 The Header File

To use any of the standard I/O functions, you should include the stdio.h header file. This file contains:

• Prototype declarations for all the I/O functions.
• Declaration of the FILE structure.
• Several useful macro constants, including stdin, stdout, stderr, EOF, and NULL. EOF is the value returned by many functions when the system reaches the end-of-file marker. NULL is the name for a null pointer.

8.2.3 Macros and Functions

A number of the standard I/O functions are implemented as macros rather than functions. Specifically, the macros are:

• getc()
• getchar()
• putc()
• putchar()
• ferror()
• clearerr()
• feof()
• fileno()

Because they are macros, you should not include side effect operators in the arguments when you invoke them. For example, putc(c, *fp++) causes erroneous results.
For getc() and putc(), you can get around this problem by using fgetc() and fputc(), which perform the same operation, but are implemented as true functions.

8.2.4 Error Handling

All standard I/O functions return either NULL or EOF for errors. Both names are defined in <stdio.h>, NULL as zero and EOF as -1. Some functions also return EOF when an end-of-file condition is encountered. There are also two flags in the FILE structure that indicate whether an error or end-of-file has occurred for the stream.

Because EOF is returned for both errors and end-of-files, it is often difficult to tell which of these conditions has occurred. Moreover, some functions, such as getw(), may return -1 as a valid return value. To find out for sure whether an end-of-file has occurred, you can call feof(), which checks the end-of-file flag and returns a nonzero value if an end-of-file has occurred. Similarly, the ferror() function checks the error flag. Neither of these functions, however, resets the flags. To reset the flags, use the clearerr() function. If either flag is set, the system will prevent you from performing further operations on the stream.

To summarize, the error-handling routines for standard I/O functions are:

void clearerr( FILE *stream )
    Resets the error and end-of-file indicators for the specified stream.

int feof( FILE *stream )
    Checks whether an end-of-file was encountered during a previous read operation.

int ferror( FILE *stream )
    Returns an integer error code (the value of errno) if an error occurred while reading from or writing to a stream.

The following function checks the error and end-of-file flags for a specified stream and returns one of four values based on the results. The clearerr() function sets both flags equal to zero.

    /*
     * If neither flag is set, stat will equal zero.
     * If error is set, but not eof, stat equals 1.
     * If eof is set, but not error, stat equals 2.
     * If both flags are set, stat equals 3.
     */
    #include <stdio.h>

    #define ERR_FLAG 1
    #define EOF_FLAG 2

    char stream_stat( FILE *fp )
    {
        char stat = 0;

        if (ferror( fp ))
            stat |= ERR_FLAG;
        if (feof( fp ))
            stat |= EOF_FLAG;
        clearerr( fp );
        return stat;
    }

8.2.5 File Position Indicators

One of the fields in each FILE structure is a file position indicator that points to the byte where the next character will be read from or written to. As you read from and write to the file, the operating system adjusts the file position indicator to point to the next byte. Although you can't directly access the file position indicator (at least not in a portable fashion), you can fetch and change its value through library functions (fseek() and ftell()), thus enabling you to access a stream in non-serial order.

Do not confuse the file pointer with the file position indicator. The file pointer identifies an open stream connected to a file or device. The file position indicator refers to a specific byte position within a stream.

8.2.6 I/O to Standard Devices

There are three streams that are automatically open: stdin, stdout, and stderr. All three point to your pad by default. The streams stdin and stdout are both line-buffered. The stderr stream, which is where error messages are output, is not buffered. At the command level, you can redirect the input and output by using the redirection commands or the pipe facility. To redirect the standard streams within programs, use the freopen() function.

The following is a list of all routines that perform input and output to stdin, stdout, and stderr.

int getchar( void )
    Reads the next character from the standard input stream. getchar() is identical to getc(stdin).

char *gets( char *string )
    Reads characters from stdin until a newline or end-of-file is encountered.

int printf( char *format, ... )
    Outputs one or more values according to user-defined formatting rules.

int putchar( char c )
    Outputs a single character to the standard output stream. putchar(c) is identical to putc(c, stdout).
int puts( char *string )
    Outputs a string of characters to stdout. It appends a newline character to the string.

int scanf( char *format, ... )
    Reads one or more values from stdin, interpreting each according to user-defined formatting rules.

The BSD Programmer's Reference and the SysV Programmer's Reference manuals describe each of these functions in detail. The following example, which reads user input and then writes output, uses several of these routines.

    /* Program name is "standard_io_example". */
    #include <stdio.h>

    #define RETURN 10    /* ASCII value of linefeed character */

    int main( void )
    {
        int age, i = 0;
        static char name[30], profession[30], age_prompt[] = "Age: ";
        static char prof_prompt[] = "Profession: ";

        printf( "Name: " );
        gets( name );
        puts( age_prompt );
        scanf( "%d", &age );
        getchar();    /* Flush linefeed character from buffer. */
        printf( "%s", prof_prompt );
        /* Stop at 29 characters to leave room for the null character. */
        while (((profession[i++] = getchar()) != RETURN) && (i < 29))
            ;
        profession[i] = '\0';
    }

A typical execution of the program, with user input, is:

    Name: John Doe
    Age:
    37
    Profession: Tech Writer

The gets() function reads characters from stdin until a linefeed character is encountered. Although it reads the linefeed character, it replaces it with a null character when it stores the string in memory. The puts() function automatically outputs a linefeed following the string.

The scanf() function takes an address of a variable as its argument. If you use the %s format, scanf() automatically appends a null character to the input string. scanf() does not read the linefeed character at the end of the input. As a result, the first character in the input buffer following a scanf() is often a linefeed character. You can discard this character by invoking getchar() once, as we did.

Unlike puts(), printf() does not output a linefeed after each string.

The getchar() function reads successive characters from stdin. If an error or end-of-file occurs, it returns EOF.
In our program, we call getchar() until it reads a linefeed character (ASCII value 10). We then append a null character to make it a true string.

8.2.7 I/O to Files

For each of the functions in the previous section, there is a corresponding function that is exactly the same except that it takes one additional argument, a pointer to a file. There are also additional functions for opening and closing files, listed below (they are listed alphabetically by function name).

int fclose( FILE *stream )
    Closes a stream.

FILE *fdopen( int filedes, char *type )
    Associates a stream with a file descriptor. This enables you to open a file with UNIX functions and then access it with standard I/O functions.

int fflush( FILE *stream )
    Flushes a buffer by writing out everything that has been buffered for the specified stream. The stream remains open.

int fgetc( FILE *stream )
    Same as getc(), but it is implemented as a function rather than a macro.

char *fgets( char *s, int n, FILE *stream )
    Reads a string from a specified input stream. Unlike gets(), fgets() enables you to specify a maximum number of characters to read and includes the terminating newline in the string.

int fileno( FILE *stream )
    Returns the file descriptor associated with a specified stream. This enables you to open a file with standard I/O functions, and then access it with UNIX functions.

FILE *fopen( char *filename, char *type )
    Opens and possibly creates a file, and associates a stream with it. fopen() takes two arguments: a pathname identifying the file, and a mode specification that determines what types of operations may be performed on the file. See Section 8.2.8 for more information about this function.

int fprintf( FILE *stream, char *format, ... )
    Exactly like printf(), except that output is to a specified file.

int fputc( int c, FILE *stream )
    Writes a character to a stream. This is the same as putc(), but it is implemented as a function rather than a macro.
int fputs( char *s, FILE *stream )
    Writes a string to a stream. This is like puts(), except that it does not append a newline to the stream.

int fread( void *ptr, unsigned size, unsigned nitems, FILE *stream )
    Reads a block of binary data from a stream. The arguments specify the size of the block and where it should be stored.

FILE *freopen( char *filename, char *type, FILE *stream )
    Closes a specified stream, and then reopens it for a new file. This is useful for recycling a stream, particularly stdin, stdout, and stderr.

int fscanf( FILE *stream, char *format, ... )
    Same as scanf(), except that data is read from a specified file.

int fseek( FILE *stream, long offset, int ptrname )
    Positions a stream marker. This function enables you to perform random access on a file.

long ftell( FILE *stream )
    Returns the position of a stream marker.

int fwrite( void *ptr, unsigned size, unsigned nitems, FILE *stream )
    Writes a block of binary data from a specified buffer to a specified stream.

int getc( FILE *stream )
    Reads a character from a specified stream.

int getw( FILE *stream )
    Reads the next word (four bytes) from a specified stream.

int putc( char c, FILE *stream )
    Writes a character to a specified stream.

int putw( int w, FILE *stream )
    Writes a word (four bytes) to a specified stream.

void rewind( FILE *stream )
    Sets the file position indicator to the beginning of the file for a specified stream.

int ungetc( int c, FILE *stream )
    Pushes a character onto a stream. The next call to getc() returns this character.

8.2.8 Opening and Closing a File

Before you can read from or write to a file, you must open it with the fopen() function. fopen() takes two arguments: the first is the file name and the second is the access mode. The text stream modes are shown in Table 8-1. Table 8-2 summarizes the properties of the fopen() modes.

When you open a file with one of the + modes, you may read and write to the file.
However, you cannot write and then read without an intervening fseek() or rewind() call. Likewise, you may not read and then write without an intervening fseek() or rewind() call, unless the read operation encounters an end-of-file.

If you use the append mode (a), it is impossible to overwrite existing data in the file. Whenever you write to the file, the data is appended at the end regardless of the stream marker's current position.

Table 8-1. fopen() Text Modes

    Mode    Description

    "r"     Open an existing text file for reading. The system
            initializes the file position indicator to point to the
            beginning of the file.

    "w"     Create a new text file for writing. If the file already
            exists, the system will truncate it to zero length,
            thereby destroying the file's previous contents. The file
            position indicator is initially set to the beginning of
            the file.

    "a"     Open an existing text file in append mode. You can write
            only at the end-of-file position. Even if you explicitly
            move the file position indicator, the system will reassign
            the indicator to point to the end of the file prior to any
            write operation.

    "r+"    Open an existing text file for reading and writing. The
            file position indicator is initially set to the beginning
            of the file.

    "w+"    Create a new text file for reading and writing. If the
            file already exists, the system will truncate it to zero
            length, thereby destroying the file's previous contents.

    "a+"    Open an existing file or create a new one in append mode.
            You can read data anywhere in the file, but you can only
            write data at the end-of-file marker.

The fopen() function returns a file pointer that you can use to access the file later in the program. The following function opens a text file called test with read access.
    #include <stdio.h>

    FILE *open_test( void )    /* Returns a pointer to a FILE */
                               /* struct */
    {
        FILE *fp;

        fp = fopen( "test", "r" );
        if (fp == NULL)
            fprintf( stderr, "Error opening file test\n" );
        return fp;
    }

Note how the file pointer fp is declared as a pointer to FILE. The fopen() function returns a null pointer (NULL) if an error occurs. If successful, fopen() returns a nonzero file pointer. The fprintf() function is exactly like printf(), except that it takes an extra argument indicating which stream the output should be sent to. In this case, we send the message to the standard I/O stream stderr. By default, this stream usually points to your terminal.

Table 8-2. File and Stream Properties of fopen() Modes

                                            Mode
    Property                            r    w    a    r+   w+   a+

    File must exist before open         *              *
    Truncates file to zero length            *              *
    Can read from stream                *              *    *    *
    Can write to stream                      *    *    *    *    *
    Can write to stream only at end               *              *

We have written the open_test() function more verbosely than is usual. Typically, the error test is combined with the file pointer assignment:

    if ((fp = fopen( "test", "r" )) == NULL)
        fprintf( stderr, "Error opening file test\n" );

The open_test() function is a little too specific to be useful since it can only open one file, called test, and only with read-only access. A more useful function, shown below, can open any file with any mode.

    #include <stdio.h>

    FILE *open_file( char *file_name, char *access_mode )
    {
        FILE *fp;

        if ((fp = fopen( file_name, access_mode )) == NULL)
            fprintf( stderr, "Error opening file %s with access mode %s\n",
                     file_name, access_mode );
        return fp;
    }

Our open_file() function is essentially the same as fopen(), except that it prints an error message if the file cannot be opened.

To open test from main(), you could write:

    #include <stdio.h>
    #include <stdlib.h>    /* for exit() */

    main()
    {
        extern FILE *open_file();

        if ((open_file( "test", "r" )) == NULL)
            exit( 1 );
    }

Note that the stdio.h header file is included in both routines.
You can include it in any number of different source files without causing conflicts.

8.2.8.1 Closing a File

To close a file, use the fclose() function:

    fclose( fp );

Closing a file frees the FILE structure that fp points to so that the operating system can use the structure for a different file. It also flushes any buffers associated with the stream. Domain/OS has a limit on the number of streams that can be open at once (128), so it's a good idea to close files when you're done with them. In any event, the system automatically closes all open streams when the program terminates normally. Domain/OS will close open files even when a program aborts abnormally, but it is more efficient for you to close the files yourself.

Bug Alert: Opening a File

In the statement

    if ((fp = fopen( "test", "r" )) == NULL)
        fprintf( stderr, "Error opening file test\n" );

the parentheses around

    fp = fopen( "test", "r" )

are necessary because == has higher precedence than =. Without the parentheses, fp is assigned the result of the comparison: one if fopen() returns a null pointer, zero if it returns a valid pointer. This is a common programming mistake.

8.2.9 Reading and Writing Data

Once you have opened a file, you use the file pointer to perform read and write operations. The standard I/O library supports three degrees of I/O granularity. That is, you can perform I/O operations on three different sizes of objects. The three degrees of granularity are as follows:

• One character at a time
• One line at a time
• One block at a time

Each of these methods has some pros and cons. In the following sections, we show three ways to write a simple function that copies the contents of one file to another. Each uses a different degree of granularity.

One rule that applies to all levels of I/O is that you cannot read from a stream and then write to it without an intervening call to fseek(), rewind(), or fflush().
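As a minimal sketch of the rule just stated, the function below reads one character from a stream opened for update and then overwrites it, with an fseek() call between the read and the write. The name capitalize_first() and its return convention are hypothetical and chosen only for illustration; they do not come from this manual.

```c
#include <stdio.h>
#include <ctype.h>

/* Hypothetical helper: capitalize the first character of a stream that
 * was opened for update (for example with "r+" or "w+").
 * Returns 0 on success and -1 on failure. */
int capitalize_first( FILE *fp )
{
    int c;

    rewind( fp );                           /* start at the beginning     */
    if ((c = getc( fp )) == EOF)            /* read one character ...     */
        return -1;
    if (fseek( fp, 0L, SEEK_SET ) != 0)     /* ... reposition the stream  */
        return -1;
    if (putc( toupper( c ), fp ) == EOF)    /* ... and only then write    */
        return -1;
    return (fflush( fp ) == 0) ? 0 : -1;    /* push the change out        */
}
```

A caller would typically open the file with fopen( name, "r+" ), pass the resulting file pointer to the helper, and close the stream afterward.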
The same rule holds for switching from write mode to read mode. These three functions are the only I/O functions that flush the buffers without disconnecting the stream.

8.2.9.1 One Character at a Time

There are four functions that read and write one character to a stream:

getc()     A macro that reads one character from a stream.
fgetc()    Same as getc(), but implemented as a function.
putc()     A macro that writes one character to a stream.
fputc()    Same as putc(), but implemented as a function.

Note that getc() and putc() are usually implemented as macros whereas fgetc() and fputc() are guaranteed to be functions. Because they are implemented as macros, putc() and getc() usually run much faster. In fact, on Apollo computers, they are almost twice as fast as fgetc() and fputc(). Because they are macros, however, they are susceptible to side effect problems (see Section 8.2.3). For example, the following is a dangerous call that may not work as expected:

    putc( 'x', fp[j++] );

If an argument contains side effect operators, you should use fgetc() or fputc(), which are guaranteed to be implemented as functions.

The following example uses getc() and putc() to copy one file to another.

    #include <stdio.h>
    #define FAIL 0
    #define SUCCESS 1

    int copyfile( char *infile, char *outfile )
    {
        FILE *fp1, *fp2;
        int c;                  /* int, so that it can hold EOF */

        if ((fp1 = fopen( infile, "r" )) == NULL)
            return FAIL;
        if ((fp2 = fopen( outfile, "w" )) == NULL)
        {
            fclose( fp1 );
            return FAIL;
        }
        while ((c = getc( fp1 )) != EOF)    /* Stop at end-of-file or error */
            putc( c, fp2 );
        fclose( fp1 );
        fclose( fp2 );
        return SUCCESS;
    }

The getc() function gets the next character from the specified stream and then moves the file position indicator one position. Successive calls to getc() read each character in a stream. When no more characters are available, getc() returns EOF, which is why the result is stored in an int and tested before it is written. The feof() function returns a nonzero value if the stream's end-of-file flag is set. Because the flag is set only after a read attempt reaches the end of the file, feof() is best called after getc() returns EOF, to distinguish an end-of-file from an error.

8.2.9.2 One Line at a Time

Another way to write this function is to read and write lines instead of characters. There are two line-oriented I/O functions: fgets() and fputs().
The prototype for fgets() is:

    char *fgets( char *s, int n, FILE *stream );

The three arguments have the following meanings:

s         A pointer to the first element of an array to which characters are written.
n         An integer that limits the number of characters read. In standard C, fgets() stores at most n-1 characters, leaving room for the terminating null character.
stream    The stream from which to read.

fgets() reads characters until it reaches a newline, an end-of-file, or the maximum number of characters specified. fgets() automatically inserts a null character after the last character written to the array. In the following copyfile() function, we specify the maximum to be one less than the array size, which is a conservative choice. fgets() returns NULL when it reaches the end-of-file. Otherwise, it returns the first argument.

The fputs() function writes the array identified by the first argument to the stream identified by the second argument. The prototype for fputs() is:

    int fputs( char *s, FILE *stream );

The two arguments have the following meanings:

s         A pointer to the first element of a null-terminated array from which characters are read.
stream    The stream to which characters are written.

One point worth mentioning is the difference between fgets() and gets() (the function that reads lines from stdin). Both functions append a null character after the last character written. However, gets() does not write the terminating newline character to the input array. fgets() does include the terminating newline character. Also, fgets() allows you to specify a maximum number of characters to read, whereas gets() reads characters indefinitely until it encounters a newline or end-of-file.

There is a similar difference between fputs() and puts(). puts() appends a newline to the end of each string it writes, but fputs() does not.

The following function illustrates how you might implement copyfile() using the line-oriented functions.

    #include <stdio.h>
    #define FAIL 0
    #define SUCCESS 1
    #define LINESIZE 100

    int copyfile( char *infile, char *outfile )
    {
        FILE *fp1, *fp2;
        char line[LINESIZE];

        if ((fp1 = fopen( infile, "r" )) == NULL)
            return FAIL;
        if ((fp2 = fopen( outfile, "w" )) == NULL)
        {
            fclose( fp1 );
            return FAIL;
        }
        while (fgets( line, LINESIZE-1, fp1 ) != NULL)
            fputs( line, fp2 );
        fclose( fp1 );
        fclose( fp2 );
        return SUCCESS;
    }

You might think that the copyfile() version that reads and writes lines would be faster than the version that reads and writes characters because it requires fewer function calls. Actually, though, the version using getc() and putc() is significantly faster. This is because Domain/OS systems implement fgets() and fputs() using fputc() and fgetc(). Since these are functions rather than macros, they tend to run more slowly.

8.2.9.3 One Block at a Time

In addition to character and line granularity, you can also access data in lumps called blocks. Note that these are user-level blocks, not kernel-level blocks. You can think of a block as an array. When you read or write a block, you need to specify the number of elements in the block and the size of each element. The two block I/O functions are fread() and fwrite(). The prototype for fread() is

    int fread( void *ptr, int size, int nmemb, FILE *stream );

The arguments represent the following data:

ptr       A pointer to an array in which to store the data.
size      The size of each element in the array.
nmemb     The number of elements to read.
stream    The file pointer.

fread() returns the number of elements actually read. This should be the same as the third argument unless an error occurs or an end-of-file condition is encountered.

The fwrite() function is the mirror image of fread(). It takes the same arguments, but instead of reading elements from the stream to the array, it writes elements from the array to the stream.

The following function shows how you might implement copyfile() using the block I/O functions.
Note that we test for an end-of-file condition by comparing the actual number of elements read (the value returned from fread()) with the number specified in the argument list. If they are different, it means that either an end-of-file or an error condition occurred. We use the ferror() function to find out which of the two possible events happened. If an error occurred, we print an error message and return an error code. Otherwise we return a success code. For the final fwrite() call we use the value of num_read as the number of elements to write, since it is less than BLOCKSIZE.

    #include <stdio.h>
    #define FAIL 0
    #define SUCCESS 1
    #define BLOCKSIZE 512

    typedef char DATA;

    int copyfile( char *infile, char *outfile )
    {
        FILE *fp1, *fp2;
        DATA block[BLOCKSIZE];
        int num_read;

        if ((fp1 = fopen( infile, "r" )) == NULL)
        {
            printf( "Error opening file %s for input.\n", infile );
            return FAIL;
        }
        if ((fp2 = fopen( outfile, "w" )) == NULL)
        {
            printf( "Error opening file %s for output.\n", outfile );
            fclose( fp1 );
            return FAIL;
        }
        while ((num_read = fread( block, sizeof(DATA), BLOCKSIZE, fp1 ))
               == BLOCKSIZE)
            fwrite( block, sizeof(DATA), num_read, fp2 );
        fwrite( block, sizeof(DATA), num_read, fp2 );   /* Final short block */
        if (ferror( fp1 ))      /* Test before the stream is closed */
        {
            printf( "Error reading file %s\n", infile );
            fclose( fp1 );
            fclose( fp2 );
            return FAIL;
        }
        fclose( fp1 );
        fclose( fp2 );
        return SUCCESS;
    }

Like fputs() and fgets(), the block I/O functions are usually implemented using the fputc() and fgetc() functions, so they are not as efficient as the macros putc() and getc(). Note also that these block sizes are independent of the blocks used for buffering. The buffer size, for instance, might be 1024 bytes. If the block size specified in a read operation is only 512 bytes, the operating system will still fetch 1024 bytes from the disk and store them in memory. Only the first 512 bytes, however, will be made available to the fread() function.
On the next fread() call, the operating system will fetch the remaining 512 bytes from memory rather than performing another disk access. The block sizes in fread() and fwrite() functions, therefore, do not affect the number of device I/O operations performed.

8.2.10 Random Access

The previous examples accessed files sequentially, beginning with the first byte and accessing each successive byte in order. For a function such as copyfile(), this is reasonable since you need to read and write each byte anyway. In this case, it's just as fast to access them sequentially as any other way. For many applications, however, you need to access particular bytes in the middle of the file. In these cases, it is more efficient to use C's two random access functions: fseek() and ftell().

The fseek() function moves the file position indicator to a specified position in a stream. The prototype for fseek() is:

    int fseek( FILE *stream, long int offset, int whence );

The three arguments are:

stream    A file pointer.
offset    An offset measured in characters (can be positive or negative).
whence    The starting position from which to count the offset.

There are three choices for the whence argument, all of which are designated by names defined in stdio.h:

SEEK_SET    The beginning of the file.
SEEK_CUR    The current position of the file position indicator.
SEEK_END    The end-of-file position.

For example, the statement

    stat = fseek( fp, 10, SEEK_SET );

moves the file position indicator to character 10 of the stream. This will be the next character read or written. Note that streams, like arrays, start at the zero position, so character 10 is actually the 11th character in the stream.

The value returned by fseek() is zero if the request is legal. If the request is illegal, fseek() returns a nonzero value. This can happen for a variety of reasons.
For example, the following is illegal if fp is opened for read-only access because it attempts to move the file position indicator beyond the end-of-file position:

    stat = fseek( fp, 1, SEEK_END );

Obviously, if SEEK_END is used with read-only files, the offset value must be less than or equal to zero. Likewise, if SEEK_SET is used, the offset value must be greater than or equal to zero.

The ftell() function takes just one argument, which is a file pointer, and returns the current position of the file position indicator. ftell() is used primarily to return to a specified file position after performing one or more I/O operations. For example, in most text editor programs, there is a command that allows the user to search for a specified character string. If the search fails, the cursor (and file position indicator) should return to its position prior to the search. This might be implemented as follows:

    cur_pos = ftell( fp );
    if (search( string ) == FAIL)
        fseek( fp, cur_pos, SEEK_SET );

Note that the position returned by ftell() is measured from the beginning of the file. The example in the next section illustrates random access, as well as some of the other I/O topics discussed in this chapter.

8.2.10.1 Printing a File in Sorted Order

Suppose you have a large data file composed of records. Let's assume that the file contains one thousand records, where each record is a VITALSTAT structure, as declared below in a file called vitalstat.h:

    #define NAME_LEN 19

    typedef char NAME[NAME_LEN];

    typedef struct date
    {
        unsigned day : 5, month : 5, year : 11;
    } DATE;

    typedef struct vitalstat
    {
        NAME vs_name;
        char vs_ssnum[11];
        DATE vs_date;
        char vs_jersey;
    } VITALSTAT;

Suppose further that the records are arranged randomly, but you want to print them alphabetically by the vs_name field. First, you need to sort the records. We can do this by creating an index for each record.
The following function reads the key field (vs_name) of every record, and stores them in an array of structures that contain just two fields: the record id (index) and the key. We assume that the data file has already been opened, so that the function is passed a file pointer. The include file recs.h contains the following:

    #include "vitalstat.h"
    #include <stdio.h>
    #define MAX_REC_NUM 1000

    typedef struct
    {
        int index;
        NAME key;
    } INDEX;

The function itself is:

    /* Reads up to max_rec_num records from a file and stores
     * the key field of each record in an index array. Returns the
     * number of key fields stored.
     */
    #include "recs.h"
    #include <stdlib.h>         /* for exit() */

    int get_records( FILE *data_file, INDEX names_index[], int max_rec_num )
    {
        long offset = 0;
        int k;

        for (k = 0; k < max_rec_num; k++)
        {
            if (fgets( names_index[k].key, NAME_LEN, data_file ) == NULL)
                break;          /* End-of-file or read error */
            offset += sizeof(VITALSTAT);
            if (fseek( data_file, offset, SEEK_SET ))
            {
                fprintf( stderr, "Problem accessing file\n" );
                exit( 1 );
            }
        }
        return k;
    }

The function reads the name field at the start of each record using fgets() (at most NAME_LEN-1 characters, followed by a null character) and stores it in the array names_index, then moves the file position indicator to the beginning of the next record with fseek(). In this way, we avoid reading extraneous parts of the record. In reality, of course, the I/O buffering mechanism fetches blocks of 1024 characters, so the entire records are read anyway. Within each buffer, however, we need only access the first field in each record. This saves us memory-to-memory data copying time, even though we don't save any device-to-memory processing time. For large records, which span blocks, this approach could also save you device-to-memory processing time.

The next task is to sort the array of INDEX structures. This function, which makes use of the library function qsort(), is shown below. The return value is a pointer to an ordered array of INDEX structures.

    /* Sort an array of INDEX structures by the key field.
     * There are index_count elements to be sorted. Returns a
     * pointer to the sorted array.
     */
    #include "recs.h"
    #include <stdlib.h>         /* for qsort() */
    #include <string.h>         /* for strcmp() */

    static int compare_func( const void *p, const void *q );  /* Defined in this file. */

    INDEX *sort_index( INDEX names_index[], int index_count )
    {
        int j;

        /* Assign values to the index field of each structure. */
        for (j = 0; j < index_count; j++)
            names_index[j].index = j;
        qsort( names_index, index_count, sizeof(INDEX), compare_func );
        return names_index;
    }

    static int compare_func( const void *p, const void *q )
    {
        return strcmp( ((const INDEX *) p)->key, ((const INDEX *) q)->key );
    }

The next step is to print out the records in their sorted order. We definitely need to use fseek() for this function because we need to jump around the file. We can compute the starting point of each record by multiplying the index value by the size of the VITALSTAT structure. If each VITALSTAT structure is 40 characters long, for example, record 50 will start at character 2000. After positioning the file position indicator with fseek(), we use fread() to read each record. Finally, we print each record with a printf() call.

    /* Print the records in a file in the order
     * indicated by the index array.
     */
    #include "recs.h"
    #include <stdlib.h>         /* for exit() */

    void print_indexed_records( FILE *data_file, INDEX index[], int index_count )
    {
        VITALSTAT vs;
        int j;

        for (j = 0; j < index_count; j++)
        {
            if (fseek( data_file, (long) sizeof(VITALSTAT) * index[j].index,
                       SEEK_SET ))
                exit( 1 );
            if (fread( &vs, sizeof(VITALSTAT), 1, data_file ) != 1)
                exit( 1 );
            printf( "%20.19s, %d, %d, %d, %.11s\n", vs.vs_name,
                    vs.vs_date.day, vs.vs_date.month, vs.vs_date.year,
                    vs.vs_ssnum );
        }
    }

To make this program complete, we need a main() function that calls these other functions. We have written main() so the filename can be passed as an argument.

    #include "recs.h"
    #include <stdlib.h>         /* for exit() */

    int main( int argc, char *argv[] )
    {
        extern int get_records();
        extern INDEX *sort_index();
        extern void print_indexed_records();
        FILE *data_file;
        static INDEX index[MAX_REC_NUM];
        char name_buf[256];     /* Holds a filename typed at the prompt */
        char *filename;
        int num_recs_read;

        if (argc != 2)
        {
            printf( "Error: must enter filename\n" );
            printf( "Filename: " );
            scanf( "%255s", name_buf );
            filename = name_buf;
        }
        else
            filename = argv[1];
        if ((data_file = fopen( filename, "r" )) == NULL)
        {
            printf( "Error opening file %s.\n", filename );
            exit( 1 );
        }
        num_recs_read = get_records( data_file, index, MAX_REC_NUM );
        sort_index( index, num_recs_read );
        print_indexed_records( data_file, index, num_recs_read );
        exit( 0 );
    }

8.3 UNIX Unbuffered I/O Functions

Although these functions are called "unbuffered," they do not bypass the disk buffering that occurs at the lowest levels of the operating system. These functions are called unbuffered because they do not use the additional layer of buffering employed by the standard I/O library.

Whereas the standard I/O functions access a stream through a stream pointer, UNIX unbuffered I/O functions operate through a file descriptor. A file descriptor is an integer that identifies a channel between a stream and a file or device. A unique file descriptor is returned whenever you open or create a file. Each process can support up to 20 file descriptors, numbered 0 through 19. By default the standard devices have the following file descriptors:

standard device    file descriptor
stdin              0
stdout             1
stderr             2

The basic UNIX I/O functions are shown in Table 8-3.

Table 8-3. UNIX I/O Functions

Function    What It Does
close()     Closes a file. This function breaks the connection between a file descriptor and a file, allowing you to reuse the descriptor.
creat()     Creates a new file or re-creates (overwrites) an existing file. This function enables you to assign specific protection attributes to a file.
lseek()     Moves a stream marker. This function is similar to fseek(), but it uses a file descriptor instead of a stream pointer.
open()      Opens a file.
This function is similar to fopen(), but it returns a file descriptor instead of a stream pointer.
read()      Reads a block. This function is similar to fread(), but blocks are unbuffered.
write()     Writes a block. This function is similar to fwrite(), but blocks are unbuffered.
unlink()    Deletes a file.

In addition to these functions, there are a number of functions that enable you to access directory files and change the protection attributes of data files, but these are beyond the range of this manual. For information about these functions, see the BSD Programmer's Reference and SysV Programmer's Reference manuals.

The following example is a file copy function using the UNIX I/O library.

    /* Program name is "unix_copy". */
    #include <stdio.h>          /* for perror() */
    #include <stdlib.h>         /* for exit() */
    #include <fcntl.h>          /* for open() and the O_ flags */
    #include <unistd.h>         /* for read(), write(), and close() */
    #define BUFSIZE 100

    void unix_copy( char *infile, char *outfile )
    {
        int fdin, fdout, nbuf;
        char buf[BUFSIZE];

        if ((fdin = open( infile, O_RDONLY )) == -1)
        {
            perror( "Error" );
            exit( 1 );
        }
        if ((fdout = open( outfile, O_WRONLY | O_CREAT, 0666 )) == -1)
        {
            perror( "Error" );
            exit( 1 );
        }
        while ((nbuf = read( fdin, buf, sizeof(buf) )) > 0)
            write( fdout, buf, nbuf );
        if (nbuf == -1)
            perror( "Error" );
        if (close( fdin ) == -1 || close( fdout ) == -1)
            perror( "Error" );
    }

This routine performs the same operation as the file copy functions listed previously using standard I/O calls. But in this function, we define our own buffering. Data is read in and written in 100-byte chunks.

8.3.1 UNIX I/O Error-Handling

Like the standard I/O functions, UNIX I/O functions return -1 or 0 when an error occurs, but they do not use the names EOF and NULL. Also, instead of setting flags in the FILE structure, they use a global variable called errno. This variable is assigned a positive integer value that represents a specific error message. Chapter 9 lists all the error codes and messages. After an error has occurred, you check to see which error it is by looking at the value of errno (errno is defined in <errno.h>, which you must include in the source file).
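The errno check described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name file_is_missing() is hypothetical, and the error name ENOENT ("no such file or directory") is the POSIX name, which is assumed here rather than taken from this manual.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if path does not exist, 0 if it can be
 * opened for reading, and -1 if open() fails for any other reason. */
int file_is_missing( const char *path )
{
    int fd;

    errno = 0;                      /* reset explicitly before the call */
    fd = open( path, O_RDONLY );
    if (fd != -1)
    {
        close( fd );                /* the open succeeded; release it   */
        return 0;
    }
    /* ENOENT is assumed to be the code for a nonexistent file. */
    return (errno == ENOENT) ? 1 : -1;
}
```

Testing errno only after a call has reported failure, as the helper does, avoids acting on a stale value left over from an earlier error.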
There is also a function called perror() that prints out the message corresponding to errno's current value. As with the standard I/O flags, you must explicitly reset errno.

Chapter 9

Diagnostic Messages

This chapter details the error, warning, and informational messages that the C compiler produces. An error indicates a problem severe enough to prevent the compiler from creating an executable object file. A warning is less severe than an error; a warning does not prevent the compiler from creating an executable object file. The warning message tells you about a potential ambiguity in your program for which the compiler believes it can generate the correct code. Informational messages are intended to inform you of potential problems in your program.

The C compiler always outputs error messages; warning messages can be suppressed by compiling with the -nwarn option. There are four levels of informational messages. You can select the level you want with the -info option. To suppress all informational messages, specify -info 0 or -ninfo (this is the default).

When the compiler outputs a diagnostic message, it lists the following information:

• The error, warning, or informational message number. This is an integer symbolizing a message. In Section 9.2, we list all messages by number.
• The line number in the source code where the problem was detected. (Occasionally, the given line number is one or more lines after the line containing the error.)
• The line of source code where the problem was detected.
• The actual message. The compiler includes in the message invalid symbols defined by the program, but it can identify only those symbols defined after preprocessor execution.

The Domain C compiler is designed to compile code as quickly as possible. This means that there are minimal error recovery mechanisms.
Although the compiler does attempt to recover from errors, a single mistake can produce cascading errors. Therefore, the cardinal rule of error-fixing in C is:

    Worry about the first reported error only!

For instance, if the compiler reports twenty errors, stare at the first one, for it may have indirectly triggered the other nineteen. Now, it is entirely possible that some or all of the other nineteen errors may be real errors that you will have to take action on, but don't waste your time on them until you are sure that they are real errors. Fix the first one and then recompile.

9.1 Common C Programming Mistakes

We draw your attention to the most commonly made C programming mistakes:

• Forgetting a semicolon at the end of a statement.
• Putting a semicolon where it is not needed, for instance, at the end of a preprocessor directive, or after a function's argument list.
• Forgetting to balance braces; that is, you must have the same number of left braces { and right braces }.
• Confusing = with ==. (This confusion will cause a run-time error, not a compile-time error.)
• Forgetting to use the ampersand (&) in front of an argument to the scanf function. (This will probably cause a run-time error, and possibly a compile-time warning.)

9.2 Domain C Compiler Messages

Here is a list of the C compiler error, warning, and informational messages:

1 ERROR unterminated comment.

You forgot to close a comment. Remember that you begin a comment with /* and close it with */.

2 ERROR Improper numeric constant.

For example, you entered a numeric constant of the form 0xreal. This implies that you are trying to specify a hexadecimal floating-point number. The number following 0x must be a hexadecimal integer.

3 ERROR Unterminated character string.

You started a string, but you did not finish it. Remember that you must enclose a string with double-quotes.
A common trigger for this error is calling printf() and forgetting to end the string before you list the data arguments.

4 ERROR Bad syntax TOKEN.

The compiler encountered TOKEN when it was expecting to find something else.

5 ERROR Illegal module name module_name.

A module name must be a legal identifier. See Chapter 2 for identifier rules. (The most common mistake is to begin the module name with a digit instead of a letter.)

6 ERROR Quoted string is too long; maximum size is 4095 characters.

You should break the long string into several shorter strings.

7 ERROR -DEF option has no name to define.

You specified the compiler option -def with the format:

    -def = value

rather than the correct form, which is:

    -def name = value

See Chapter 6 for details on -def.

8 WARNING Old-fashioned assignment operator; taken as assignment_operation.

This is only a warning. Some older C compilers let you use assignment operators in a format opposite to that of modern C compilers. For instance, some older C compilers let you use the assignment operator =+ instead of +=. You should use the modern format (i.e., +=).

9 ERROR "void" is illegal for identifier in this context.

For a complete discussion of void, see Chapter 3.

11 ERROR Missing right parenthesis on declaration.

In a declaration, the number of right parentheses must match the number of left parentheses.

12 ERROR storage class specifier is illegal in this context; default assumed.

You used a storage class specifier inappropriately. For example, you cannot specify auto or register when declaring a global variable. As a second example, you cannot specify static, auto, or extern on a parameter declaration. See Chapter 3 for a complete discussion of storage classes.

13 WARNING Old-fashioned initialization; missing "="

This is only a warning.
Some older C compilers allow you to initialize variables with the format

    data_type variable initial_value;

Domain C lets you use this format, but we suggest that you use the modern C format, which is

    data_type variable = initial_value;

14 ERROR Unrecognizable item token; syntax error in declaration.

There are many possible causes for this error. One common cause is that you put a semicolon after a function definition. (When you remove the semicolon many other errors will probably go away.) Sometimes this error occurs when the compiler is expecting to find an identifier but finds token instead. (See Chapter 5 for a discussion of function syntax.)

15 ERROR When allocated, size of array_name was zero.

Possibly, you omitted the array size when defining array_name and you also forgot to initialize the array; for example, compare the following two definitions:

    char str[];              /* wrong */
    char str[] = "Hello";    /* right */

Another possibility is that you declared an array improperly, and consequently, the compiler did not allocate any space for it. See Chapter 3 for details about array declaration.

18 ERROR Array dimension token is not an integer constant.

When you declare the number of elements in an array, the number must be a positive integer value. For example, compare the following two declarations:

    int a[3];      /* right */
    int a[3.2];    /* wrong */

See Chapter 3 for details about array declaration.

19 ERROR Array dimension token is either zero or negative.

When you declare the number of elements in an array, the number must be a positive integer value. See Chapter 3 for details about array declaration.

20 ERROR Too many enumerators for "enum" type; max is 1024.

See Chapter 3 for details on enumerated variables.

21 WARNING "long" or "short" in this context is meaningless and ignored.

long and short can only be applied as prefixes to the data types int, unsigned int, or float. They cannot be applied to any other data type.
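As a hedged illustration of the declaration forms that messages 13, 15, 18, and 19 above steer you toward, the sketch below uses the modern "=" initialization, a positive integer constant as an array dimension, and an initializer wherever the array size is omitted. The names NUM_SCORES, scores, greeting, and sum_scores() are hypothetical and are not taken from this manual.

```c
#include <stddef.h>

#define NUM_SCORES 3    /* array dimension: a positive integer constant */

int  scores[NUM_SCORES] = { 90, 85, 70 };   /* modern "=" initialization */
char greeting[] = "Hello";  /* size omitted, so the initializer sets it  */

/* Hypothetical helper: adds up the elements of scores[]. */
long sum_scores( void )
{
    long total = 0;
    size_t i;

    for (i = 0; i < NUM_SCORES; i++)
        total += scores[i];
    return total;
}
```

Because greeting is initialized with a five-character string, the compiler allocates six characters for it, including the terminating null character.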
22 WARNING "unsigned" in this context is meaningless and ignored.

The unsigned keyword can only be applied to integer data types. Floating-point data types cannot be made unsigned.

23 ERROR Identifier has not been declared.

If it seems like you did declare it, then just make sure that your spelling matches the spelling of the variable in the definition.

24 ERROR Multiple declaration of identifier; previous declaration was on line number.

Possibly, you used the same identifier as both a parameter and a local variable. For example, the following function will trigger this error:

    f(arg)
    int arg;
    {
        int arg;
    }

Another possibility is that you declared the same variable twice in the same block.

25 WARNING Repeated item token is ignored.

You probably repeated the same data type prefix (like long, short, or unsigned) twice in the same declaration.

26 ERROR Illegal type of constant token for "enum" type.

The compiler encountered token when it expected to encounter an integer value. See Chapter 3 for details about enumerated variables.

27 ERROR Improper use of "void" type for function; ("int" type assumed).

A function can return the type "void", but it cannot return an aggregate type that uses void as its base type.

28 WARNING "enum" constant number exceeds 16 bits.

Since enumerated constants are stored as signed short ints by default, any number over +32767 or under -32768 will cause an overflow problem. Use long enum to store larger constants. Chapter 3 explains enumerated constants.

30 ERROR Parameter identifier was not listed in the function declaration.

You have defined a function parameter named identifier, but you did not put identifier in the function heading. See Chapter 5 for details on function syntax.

31 ERROR Dynamic aggregate variable identifier cannot be initialized.

The only kind of local aggregate variable that can be initialized is a static one.
For instance, compare the following dynamic aggregate variable declarations:

    f()
    {                                       /* a function declaration */
                 int a[2] = {500, 400};     /* Illegal */
        auto     int a[2] = {500, 400};     /* Illegal */
        register int a[2] = {500, 400};     /* Illegal */
        extern   int a[2] = {500, 400};     /* Illegal */
        static   int a[2] = {500, 400};     /* Legal */
    }

35 WARNING In function function_name, parameter identifier was listed but never declared; ("int" type assumed).

This is only a warning. Assuming that you wanted identifier to be an int, you can ignore this warning. However, it is bad programming style to accept the default data types in parameter declarations. See Chapter 5 for details about proper function syntax.

36 ERROR Multiple declaration of identifier in parameter_list.

You've declared the same parameter more than once in the parameter list. See Chapter 5 for details on proper function syntax.

37 ERROR Cannot assign "void" from function_name.

You cannot assign a void function to a non-void lvalue. For example, the following function call triggers this error:

    int i;
    i = (void) printf( "Bon Jour\n" );

38 ERROR Label name on line number is outside of the scope of the goto.

You specified label name, but name is not defined within the current function. (See the "goto" listing for details.)

39 ERROR Improper parameter declaration token.

All the arguments in the argument list must be identifiers, but token is not an identifier.

40 ERROR Token et cetera is not an lvalue.

An "lvalue" is any C entity that can appear on the left side of an assignment statement. Here is a partial list of some things that are not lvalues, but which programmers often mistake for lvalues:

• An entire array (though a single component of an array is an lvalue)
• A constant specified by a #define statement.

A possible trigger for this error is to define an n-dimensional array, but to access it with fewer than n components.

41 ERROR Unknown type token in a structure or union.
The compiler was expecting a valid C data type and encountered token instead. Perhaps you misspelled the data type, or perhaps you put the data type in uppercase, or perhaps you just plain forgot the data type. See Chapter 3 for details about C data types.

42 ERROR Function declaration function_name is illegal in a structure or union.
You cannot declare a function as a component of a structure or union. Note that the compiler sees a function declaration as any phrase of the following form:

    IDENTIFIER()

Perhaps you were trying to declare an array and used parentheses instead of brackets.

43 ERROR "switch" expression type is not an integral type.
The C integral types are int, char, and enum. For details on switch, see the "switch" listing in Chapter 4.

44 ERROR Value is not of the correct type for the "switch" on line number.
The most common cause of this error is that you used a floating-point value in a case statement. (C will not convert the floating-point number to an integer value.) For details on switch, see the "switch" listing in Chapter 4.

45 WARNING "switch" expression type is unsigned, but constant name is negative.
The C compiler has detected a probable mistake in your programming logic since this constant will never be equal to the switch expression. For details on switch, see the "switch" listing in Chapter 4.

46 ERROR Value has already occurred as a "case" constant on line number.
You cannot specify the same value for a case statement more than once in the same switch statement. Note that the C compiler evaluates the case expression, so although you may have specified two different expressions, if they evaluate to the same value, then this error occurs. For details on switch, see the "switch" listing in Chapter 4.

47 ERROR Token is not a valid option specifier.
You put a pound sign (#) in column 1 and then followed it with some token other than an identifier or a number. For example,
the following expression triggers this error:

    # "Aloha"

48 ERROR Include file name is not a string; found token.
C expected to find a string pathname immediately following #include, but it found token instead. Remember that a string consists of characters enclosed in double-quotes or angle brackets. For example, compare the following #include statements:

    #include /sys/ins/test.ins.c      /* wrong */
    #include "/sys/ins/test.ins.c"    /* right */
    #include <...>                    /* right */

49 ERROR Nested includes are too deep (> 16).
A header file can itself contain header files, which themselves can contain header files, and so on, but there can be no more than 16 levels of header files. Since exceeding this depth is rather unlikely, this error is more likely to be caused by an include file that includes the file that had included it. (It's rather like two facing mirrors producing infinite reflections.) For instance, if program main.c lists "inc.c" as an include file, and file inc.c lists main.c as an include file, this error will be triggered.

50 ERROR Token is not a recognized option, or it does not begin in column 1.
This error is triggered by one of the following two mistakes. First, perhaps you mistakenly started a preprocessor directive in a column other than the leftmost column. Second, perhaps you put the pound sign # in column 1, but you did not put a legal preprocessor directive immediately after it. See Section 4.3 for an overview of preprocessor directives.

51 ERROR Include file pathname is not available.
You have specified an include file with the #include preprocessor directive, but the compiler cannot find it. Possibly, pathname does not exist or you have misspelled it, or perhaps network problems prevent the compiler from seeing the pathname.

53 ERROR Multiple declaration of identifier in a structure or union.
You cannot use the same identifier more than once in the same structure or union declaration.
For details on structure and union declarations, see Chapter 3.

55 ERROR Bad syntax in a struct/union/enum; token found.
See Chapter 3 for details on declaring structure, union, and enumerated variables. Possible triggers for this error include:

• You mistakenly separated two enumerated constants with a semicolon instead of a comma.
• You forgot to put a closing brace after the last enumerated constant.
• You mistakenly used a right parenthesis ) or bracket ] instead of a brace }.

56 ERROR Multiple definition of label, previous definition was on line number.
You cannot define the same label more than once in the same block. To correct the error, simply rename the second occurrence.

58 ERROR Bad syntax in a struct/union/enum; token found; assuming end of list.
An unneeded token has slipped into your declaration. Removing token should clear up the error. See Chapter 3 for details about declaring struct, union, and enum variables.

60 ERROR Improper use of token, only a variable or constant is valid here.
One possibility is that you mistakenly used a label name as the argument to a case statement. For example, the following program fragment causes this error:

    abc:
    switch (i)
    {
        case abc: break;
    }

61 ERROR Variable_name is not an array.
Probably, you've used variable_name as if it were an array, but it is not. Note that C interprets any expression of the form IDENTIFIER[] as an attempt to access an array. A second possibility is that you defined a 1-dimensional array, but you tried to access it as a 2-dimensional array.

62 ERROR Variable_name is not a pointer variable.
You tried to use variable_name in a way that only a pointer variable can be used. For instance, maybe you tried to dereference variable_name, but you can only dereference a pointer variable. See the "pointer operations" listing in Chapter 4 for details.

63 ERROR Variable_name is not a structure or union.
You used variable_name in a manner that is appropriate for a structure or union variable only. Note that C interprets expressions of the form

    identifier.token
    identifier->token

as an attempt to access a structure or union.

64 ERROR Identifier is not a member of struct_or_union_name.
It appears to the compiler that you are trying to access a member of a structure or union, but identifier is not a declared member of this structure or union. Perhaps you misspelled identifier or perhaps there is a mistake in your structure or union declaration. See the "structure and union operations" listing in Chapter 4 for information about using structures and unions in the body of a function, or see Chapter 3 for information on declaring structures and unions.

66 ERROR Bit field constant number is not an integer.
Bit fields must be integers. See Section 3.8.4 for details on bit fields.

67 ERROR Improper use of identifier, only a function reference is valid here.
You used an expression of the form identifier(token), but identifier was not a function name. A common mistake is to use parentheses () instead of brackets (for an array) or braces (for comments).

68 ERROR The types of token1 and token2 are not compatible with the operator_name operator.
See Chapter 4 for descriptions of all the operators. Common mistakes include:

• Using the modulo operator (%) for floating-point division.
• Using floating-point expressions as arguments to the bit-shift operators.

69 ERROR The type of variable_name is not compatible with the operator_name operator.
See Chapter 4 for descriptions of all the operators.

70 ERROR Incompatible operands [operand1, operand2] to the operator_name operator.
This error can be triggered in many ways. Possibly, you've misused the = operator. Another possibility is that you've called a function using a format like this

    answer = function();

but the data type the function will return cannot be converted to answer's data type.
For example, if the function returns void, then answer must be void also. If you've misused an operator, see Chapter 4. But, if you've had a problem calling a function, see Chapter 5.

71 ERROR Subscript [subscript] to array array_name is not of the correct type.
The implicit or explicit data type of subscript must be compatible with the int type. For instance, you'll get this error if subscript is a pointer variable, but you won't get this error if subscript is an integer or enumerated value.

72 WARNING No path to statement statement.
This is only a warning, but it could very well mean that there is a mistake in your coding. The warning tells you that there is no way that the program will ever reach statement. This warning is usually caused by a goto statement or by a return statement (if it is unconditionally called and if it is not the last line of the function).

73 ERROR No declaration for type type.
A superfluous comma is the culprit here; notice the right and wrong ways to use the comma inside a declaration:

    int i, ;     /* wrong, causes error 73 */
    int i,j;     /* right */
    int ,i ;     /* wrong, causes error 73 */

74 ERROR Function function_name may not be defined inside another function; (identifier begins the definition).
Unlike some other structured languages, C does not support nested functions. Since you probably already know that you cannot nest functions, you probably made some other mistake. Did you forget to end the previous function with a closing brace? Perhaps you mistakenly placed a pair of parentheses right after an identifier name. (This means "function definition" to the C compiler regardless of what you wanted it to mean.)

75 WARNING sizeof identifier is zero.
You have specified an expression whose storage allocation is zero bytes. Perhaps you mistakenly declared an array without an explicit size, and then forgot to supply an initial value that would allow the compiler to set its size.
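As an illustration of the cause suggested for warning 75, the following minimal sketch omits the array size and lets the initializer list determine it. It is written in ANSI-style C, which is an assumption about your compiler, and the names primes and primes_count are hypothetical; they do not come from this manual.

```c
#include <assert.h>
#include <stddef.h>

/* No explicit size is given, so the compiler counts the initializers.
   Without the initializer list, the array would have no size at all. */
static int primes[] = {2, 3, 5, 7};

/* Hypothetical helper: number of components in primes. */
size_t primes_count(void)
{
    /* sizeof is nonzero here because the initializer fixed the size. */
    assert(sizeof(primes) != 0);
    return sizeof(primes) / sizeof(primes[0]);
}
```

Calling primes_count() returns 4, the number of values in the initializer list.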
76 ERROR Illegal cast type for variable.
You cannot cast variable to the stated data type. See the "casting operations" listing in Chapter 4.

77 ERROR Cannot initialize external variable variable_name.
This error highlights one of the subtler distinctions in C: that between allusion and definition. The storage class specifier extern indicates that you are alluding to a variable. Note that you can initialize a variable when you define it, but you cannot initialize a variable when you allude to it.

78 WARNING Incompatible pointer and integer operands [operand1, operand2] to the operator_name operator.
See the "pointer operations" listing of Chapter 4. For example, consider some right and wrong ways to mix pointers and integers:

    VARIABLE = POINTER + INTEGER;            /* wrong */
    POINTER_VARIABLE = POINTER + INTEGER;    /* right */
    VARIABLE = *(POINTER + INTEGER);         /* right */

79 ERROR Illegal type of constant token for variable variable_name.
The compiler was expecting a constant of a particular data type. You supplied a constant, but it was of the wrong data type. For instance, code like the following triggers this error:

    struct {int a;} x = {3.14};    /* wrong */
    struct {int a;} x = {3};       /* right */

80 WARNING Illegal pointer combination: incompatible types.
C is flexible about converting data types; however, C is not so flexible that it allows you to mix pointer variables that point to two different types. For example, a pointer to an int cannot be assigned to a pointer to a char.

84 ERROR Named bit field identifier cannot have a size of 0.
A named bit field must have an integer value greater than or equal to 1. See Chapter 3 for details on structures and unions.

86 ERROR Unrecognized statement keyword.
You've used keyword (probably break) in an illegal context. For instance, you cannot use break outside of a for, while, or do/while loop or outside of a switch statement.

87 ERROR "goto" label expected; token found.
If you specify a token followed by a colon, C assumes that you are specifying a label. Although most computer languages accept numbers as labels, C only accepts identifiers as labels. (Remember, a number is not a legal identifier.) See Chapter 2 for a definition of identifiers.

89 WARNING Non-standard usage: partial member reference field_name resolved.
This is only a warning. Ignore the warning if you do not plan to port the program to another system. If you are trying to write portable code, then you will have to specify field_name in the standard way. (See the "structure and union operations" listing of Chapter 4 for details.)

90 WARNING Ambiguous reference; more than one member named identifier.
See the "structure and union operations" listing in Chapter 4 for details on this warning.

91 ERROR Illegal type for bit field identifier.
The only kind of data type that can be packed down into a bit field is an int or unsigned int. See Section 3.8.4 for details on bit fields.

92 ERROR Address operator is illegal for bit field identifier.
It is okay to take the address of a member of a structure or union. However, you cannot take the address of a bit field (even if the bit field starts on a byte boundary). For example, the following code triggers this error:

    struct x {unsigned a : 2;} y;
    z = &(y.a);

93 ERROR Input line too long; it has been truncated.
You have exceeded the line limit of 1024 characters.

94 WARNING Negative shift constant value may give undefined results.
The compiler is warning you that a negative shift might not give the expected results. Note that C supports both a left shift operator << and a right shift operator >>, so instead of trying to use a negative shift value, perhaps you should just use the other shift operator with a positive shift value. See the "bit operators" listing in Chapter 4 for details.

95 ERROR Too many include files.
There is no fixed limit on the number of include files.
In fact, even a program with a small number of #includes can cause this error if the include files themselves contain other include files.

96 ERROR Constant value token cannot be evaluated at compile time.
The compiler was expecting a constant that could be evaluated at compile time. Certain constants cannot be evaluated at compile time. For example, any constant that relies on an address cannot be evaluated until run time.

97 ERROR Label label is never defined.
You made one of three mistakes. First, perhaps you just plain forgot to define the label (see the "goto" listing in Chapter 4 for details on labels). Second, perhaps the spelling of the label does not match the spelling in the goto statement. Third, perhaps the label is defined in one function and the goto statement is in another function. (They must be in the same function.)

98 ERROR Line exceeds maximum length of number by number characters.
This line is too long; divide it into multiple lines.

99 ERROR Left brace ({) expected; token found.
There are many possible causes for this error. In particular, you should check to see that you are not missing a { immediately after the parameter declarations. Another possibility is that at the line prior to the line where the error was reported, there was a faulty function declaration.

100 ERROR Right brace (}) expected; token found.
You started a block with the left brace {, but the compiler did not encounter its matching right brace. Sometimes, this error occurs when you do a lot of nesting and forget to close a function or a loop. Another possibility is that you forgot to terminate a comment or a string.

101 ERROR Statement terminator expected; token found.
Probably, you forgot a semicolon at the end of a statement. Another possibility is that a letter somehow crept into a number; for example, maybe a line contained the numeric constant 1.2f3 instead of 1.2e3. A third possibility is that you forgot to enclose a compound statement within a pair of braces.
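The third cause listed for error 97 can be avoided by keeping a label and every goto that targets it inside one function. The following is a minimal sketch in ANSI-style C (an assumption about your compiler); the function first_negative and its parameters are hypothetical names chosen for illustration.

```c
#include <assert.h>

/* Hypothetical example: return the index of the first negative
   component of v, or -1 if there is none. */
int first_negative(const int *v, int n)
{
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        if (v[i] < 0)
            goto found;      /* the label is defined in this same function */
    }
    return -1;

found:                       /* label and goto share one function */
    return i;
}
```

Moving the found label into a different function would leave the goto without a visible target, which is the situation error 97 describes.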
102 ERROR Improper argument list; token found.
You tried to call a C library routine, but your list of arguments does not look right. If the mistake occurred in a printf() call, make sure that you put the comma in the right place, for example:

    printf("%d\n", count);    /* right */
    printf("%d\n," count);    /* wrong */

103 ERROR Keyword expr must begin with "("; token found.
The keywords if, switch, while, and for must be followed by a parenthesized expression. If something besides a comment comes between one of these keywords and a left parenthesis, the compiler issues this error.

104 ERROR Keyword expr must end with ")"; token found.
The keywords if, switch, while, and for must be followed by a parenthesized expression. Apparently, you started the parenthesized expression, but you forgot to finish it with a right parenthesis.

105 ERROR Colon (:) expected in "case" or "default"; case/default found.
This error may confuse you because it will probably be reported at the line beneath where the error actually occurred. For instance, if the error was reported at line 20, look at line 19. The problem can be remedied by putting a colon after the case statement.

106 ERROR "case" or "default" expected in "switch"; token found.
See the "switch" listing in Chapter 4.

107 ERROR More than one "default" case given for a "switch".
See the "switch" listing in Chapter 4.

108 ERROR Illegal return expression (expression et cetera) for "void" function.
The function heading specifies that the function will not return any value to the caller. Therefore, you cannot specify any expression with return. For instance, compare a legal and an illegal return for a void function:

    void f()
    {
        return;                 /* legal */
        return (expression);    /* illegal */
    }

109 ERROR Cannot initialize null array identifier.
This error is a byproduct of some other error. The other error prevented the compiler from allocating any space for the array.
An array with no space is a null array, and you cannot initialize a null array. Fix the other error and you'll cure this one too.

110 ERROR "while" expected in "do" statement; token found.
In a do/while loop, you must place the keyword while immediately after the closing } of the loop.

111 ERROR ";" expected in "for" statement; token found.
The parenthesized list that immediately follows the keyword for must contain exactly two semicolons.

112 ERROR String initializer too long for array_name; truncated to fit.
You declared array_name to hold n components, but you are initializing array_name with more than n values. It may surprise you to find that the following char array declaration triggers this error:

    char alpha[5] = {"abcde"};

Although it seems like a snug fit between the five-element array and the five-char initialization string, in actuality, the string "abcde" takes up six components. The sixth component is the terminating null character that Domain C automatically supplies for you. If you want to be sure that you've defined the perfect array size, just omit the array size as explained in Chapter 3.

113 WARNING Structure or union member member_name has size of zero.
This warning will be associated with an error that explains what went wrong when you declared member_name. If you fix the associated error, this warning should go away. For details about structure and union declarations, see Chapter 3.

115 WARNING Function function_name is declared as an argument.
This is only a warning. The compiler was expecting parameter declarations here, but got another function declaration instead. Remember that the compiler interprets any expression of the form IDENTIFIER() as a function declaration. So, when the compiler issues this warning you should ask yourself, "Did I put the parentheses in the right place?"

116 ERROR Improper expression; token found.
Many situations could have caused this error.
In particular, check for a missing comma or semicolon in a variable declaration. Conversely, check for an extra semicolon or comma.

117 ERROR Identifier expected; token found.
See Chapter 2 for a definition of identifier. The most common mistake is in trying to start an identifier with a digit rather than a letter. Another possible trigger for this error is that you used a keyword where an identifier was expected. See Chapter 2 for a list of Domain C keywords. Another possibility is that you were supposed to supply one or more identifiers inside a set of parentheses, but you supplied the parentheses without the identifiers.

118 ERROR Member name expected; token found.
Your source code contained an expression of the format

    structure_or_union_variable.

This format is illegal. The correct format is

    structure_or_union_variable.member

120 ERROR Illegal operation on pointer to function function_name.
You cannot do mathematical operations on a pointer to a function.

121 ERROR Function function_name returns more than 32K bytes.
You are trying to pass a very large amount of data back to the calling function. Probably, you are trying to pass back a structure that contains a very large array. Can you reduce the size of this array? If not, can you make the structure into a global variable that does not have to be returned to the caller?

122 WARNING Function function_name needs number bytes of stack, which approaches the maximum stack size of number bytes.
You are passing arguments to this function that take up a lot of space. The sort of argument that could trigger this warning is a structure that contains a large array. This warning informs you that additional arguments may result in an overflow.

123 WARNING Function function_name needs number bytes of stack, which exceeds the maximum stack size of number bytes.
You are passing arguments to this function that take up too much space.
The sort of argument that could trigger this warning is a structure or union that contains a large array. If the compiler issues this warning, then the resulting object file is non-executable.

124 ERROR Cannot add two pointers: pointer_exp1 + pointer_exp2.
C permits you to add an integer to a pointer, but you can never add a pointer to another pointer. Perhaps you were trying to add two dereferenced pointers and you did not use the proper syntax. In that case, please see the "pointer operations" listing in Chapter 4. Incidentally, don't forget that an array name is a pointer constant.

125 ERROR Illegal type of token for addition to a pointer.
If you add a value to a pointer, the value must be an integer. Apparently, you have added a value that is not an integer.

126 ERROR Cannot subtract two pointers of different type: pointer_var1 - pointer_var2.
For example, compare two possible pointer subtractions:

    int  i, *pi = &i;
    int  j, *pj = &j;
    char c, *pc = &c;

    i = pi - pc;    /* wrong  pi points to int, but pc points to char */
    i = pi - pj;    /* right  both pi and pj point to ints */

127 ERROR Either expression1 is not a pointer type, or expression2 is not an integer type.
Pointers and subtraction do not often mix. C permits you to subtract an integer value from a pointer; however, you cannot

• Subtract a non-integer value from a pointer.
• Subtract a pointer from any value.

For example, compare some proper and improper methods of subtraction:

    int *px;

    (px - 2)      /* right */
    (px - 2.2)    /* wrong */
    (2 - px)      /* wrong */
    (2.2 - px)    /* wrong */

128 ERROR "main" function cannot return a type whose size is greater than 4 bytes.
By default, the main function returns an int. You are trying to pass back something that cannot be converted to an int.

129 WARNING Ignoring data initialization for "switch" variable declarations.
A compound switch statement contains a block. Since C permits you to define variables on the block level,
you can define a variable within the switch statement. However, if you try to initialize this variable, C ignores the value. In other words, the variable will spring into existence when the program enters the block, but the variable will have a garbage value.

131 ERROR Cannot take the address of register variable variable_name.
Since a program ideally stores a register variable in a register, and since a CPU register has no address, you cannot find the address of a register variable. For more information on the register storage class specifier, see Chapter 3.

132 WARNING Multiple declaration of variable_name with data initialization, previous declaration was on line number.
You'll trigger this warning if you declare two or more variables of the same name and at least one of them contains an initialization value. If two or more variables have the same name but different initialization values, then the compiler will set the variable's value to the last initialization value. For example, given the following two initializations

    float s = 2.2;
    float s = 4.2;

the compiler will initialize s to 4.2.

133 ERROR Compiler failure, unexpected data init construct: construct.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

134 ERROR Compiler failure, Pascal-only error code.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

135 ERROR Floating point constant number conversion problem.
For some reason (probably overflow), the compiler could not convert number to the desired data type.

136 ERROR Compiler failure, register consistency.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

137 ERROR Compiler failure, no temp created.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.
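Returning to warning 129: one way to avoid depending on an initializer that the compiler ignores is to declare the variable without one and assign it inside each case. The following is a minimal sketch in ANSI-style C (an assumption about your compiler); the function days_in_month is a hypothetical example, and it deliberately ignores leap years.

```c
#include <assert.h>

/* Hypothetical example: number of days in a month (1 = January). */
int days_in_month(int month)
{
    assert(month >= 1 && month <= 12);

    switch (month) {
        int days;            /* an initializer written here would be skipped */
    case 2:
        days = 28;           /* leap years are ignored in this sketch */
        return days;
    case 4: case 6: case 9: case 11:
        days = 30;
        return days;
    default:
        days = 31;
        return days;
    }
}
```

Because control jumps directly to a case label, the variable is assigned on every path before it is read.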
138 ERROR Compiler failure, improper forward label at token.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

139 ERROR Compiler failure, pseudo pc consistency.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

140 ERROR Compiler failure, unknown tree node.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

141 ERROR Compiler failure, unknown top node.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

142 ERROR Compiler failure, no temp space.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

143 ERROR Compiler failure, lost value of node.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

144 ERROR Compiler failure, registers locked.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

145 ERROR Compiler failure, no emit inst.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

146 ERROR Compiler failure, procedure too large.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

147 ERROR Compiler failure, inst disp too large.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

148 ERROR Compiler failure, obj module too large.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

149 ERROR Compiler failure, no free space.
The error is in the compiler, not in your code.
Please contact your customer support representative or mail us an APR.

150 ERROR Compiler failure, short branch optimization.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

151 ERROR Compiler failure, data frame overflow.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

152 ERROR External variable definition identifier conflicts with procedure or data section name.
You cannot declare a global variable having the same name as a procedure or data section name.

154 ERROR Compiler failure, too many nodes.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

159 WARNING Variable variable_name was not initialized before this use.
You are using variable_name on the right side of an assignment operator, but you have not assigned variable_name a value yet, so using it may cause bizarre results.

160 ERROR Illegal bit field constant identifier; cannot be negative.
Bit fields must be positive integers. See Section 3.8.4 for details about bit fields.

161 ERROR Unknown or incomplete structure/union type name.
You mistakenly tried to declare a recursive structure or union. For example, the following declaration causes this error because name was not yet a declared data type when you attempted to use it:

    struct S {int x; struct S c;};     /* wrong */

Note that you can declare a pointer to this structure or union. For example, the following declaration is okay:

    struct S {int x; struct S *c;};    /* right */

162 ERROR Illegal option identifier for typedef.
You are using the #attribute address modifier in a typedef statement. In a typedef statement, #attribute address is illegal; however, #attribute volatile and #attribute device are legal. For details about the #attribute modifier, see Chapter 3.
Incidentally, by fixing this error, you will probably fix a lot of other errors.

163 ERROR Left bracket ([) expected; token found.
The C compiler expected a left bracket immediately after the #attribute modifier, but found token instead. For details about the #attribute modifier, see Chapter 3.

164 ERROR Right bracket (]) expected; token found.
The C compiler expects a right bracket just after the #attribute argument. For details about the #attribute modifier and its arguments, see Chapter 3.

165 ERROR Left parenthesis "(" expected; token found.
Possibly, you were using the #attribute address modifier, but you forgot to put an address (enclosed within parentheses) right after address. For details on #attribute address, see Chapter 3. Another possibility is that you forgot the left parenthesis in a #section preprocessor directive.

166 ERROR Right parenthesis ")" expected; token found.
Possibly, you were using the #attribute device modifier with the read or write options, but you forgot to close the list of read and write options with a right parenthesis. For example, compare the following declarations:

    int q #attribute[device(read)] = 2;    /* right */
    int q #attribute[device(read] = 2;     /* wrong, missing ")" */

If you correct this error, a lot of other errors will probably vanish. For details on #attribute device, see Chapter 3. Another possibility is that you were using the #section preprocessor directive, but forgot to put a comma between the two section names.

167 ERROR Number expected; token found.
You misused the #line preprocessor directive. Compare the right and wrong ways to use #line in the following examples:

    #23                      /* right */
    #23 "new_file.c"         /* right */
    #line 23                 /* right */
    #line 23 "new_file.c"    /* right */
    #line "new_file.c"       /* wrong, triggers error 167 */

For details about its correct use, see the "#line" listing in Chapter 4.

169 ERROR String expected; token found.
You forgot to enclose the file name in double quotes while using the #line preprocessor directive. Compare the right and wrong ways to use #line in the following examples:

    #23                      /* right */
    #23 "new_file.c"         /* right */
    #line 23                 /* right */
    #line 23 "new_file.c"    /* right */
    #23 new_file.c           /* wrong, triggers error 169. */
    #line new_file.c         /* wrong, triggers error 169. */

170 ERROR Dividing by zero in a compile-time constant expression.
Division by zero is illegal.

171 ERROR Compiler failure, store elimination failure.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

172 ERROR No static address for dynamic variable variable_name.
C issues this error when you assign the address of a dynamic variable to a static pointer. For example, consider the following statements:

    auto int x = 3;
    static *px = &x;

Since px is a static pointer variable, it cannot hold the address of the dynamic variable x. To correct the problem, make both variables dynamic or make both variables fixed. For an explanation of dynamic and static, see Chapter 3.

173 WARNING Comma expected but not found in data init list.
You must separate the elements of a data initialization list with commas; for example:

    int x[] = {2,3,5,7};    /* okay */
    int x[] = {2 3 5 7};    /* wrong */

174 ERROR Empty structure or union.
C prohibits you from declaring a structure or union without members. By fixing this error, you may indirectly also fix many other errors. For details on structures and unions, see Chapter 3.

175 ERROR Unknown type name in "sizeof".
The sizeof operator is evaluated at compile time, not run time. Therefore, if sizeof's operand is a partially constructed type, then this error is triggered.
For example, the following use of sizeof triggers this error because data type x is not fully constructed at the point when sizeof is called:

    struct x {unsigned int q: sizeof(struct x);};

176 ERROR Too many nested pointer references for debug tables.
You declared a structure or union having a member which is itself a structure or union, and one of the members of this structure or union is itself a structure or union, and so on, down 256 or more levels.

177 WARNING 8 or 9 found in an octal number.
This is only a warning, but if you get it, your program will probably produce bizarre run-time results. As in the rules of conventional math, the digits 8 and 9 are forbidden in a base 8 number. Note that in C, an octal number is any integer that begins with the digit 0. Did you mistakenly put a leading 0 in your decimal number?

178 ERROR Null dimension in a sub-array declaration.
A multidimensional array cannot have any null dimensions other than the first dimension. For example, consider the following array declarations:

    int x[3][5];       /* right */
    int x[3][5][];     /* wrong */
    int x[];           /* right */
    int x[][3][5];     /* right */

179 ERROR Invalid systype string.
See the "#systype" or "#if" listings in Chapter 4 for a list of valid systypes.

180 ERROR Compiler failure, limit exceeded; identifier.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

182 ERROR Cannot give more than one "systype".
You can put no more than one #systype preprocessor directive in a file. For details on #systype, see the "#systype" listing in Chapter 4.

183 ERROR Cannot take "systype" once other tokens are seen.
The only place that a #systype preprocessor directive can occur is as the first or the second token in the file. If it is the second token, then the only token that can precede it is the #module preprocessor directive. For details on #systype, see the "#systype" listing in Chapter 4.
184 ERROR Comma expected, token found.
Probably, you forgot a comma in a #module preprocessor directive. For example, compare the right and wrong ways to use #module:

    #module math, x$, y$     /* right */
    #module math, x$ y$      /* wrong, missing a comma */

185 ERROR Found "end-of-line" before end of definition.
You made a mistake in a preprocessor directive. Possibly, you forgot to close parentheses or quotes. See Chapter 4 for descriptions of all the preprocessor directives.

186 ERROR Redundant #module control line found; ignored.
A source file contains more than one #module preprocessor directive, but C allows one (at most) per file.

187 ERROR Procedure section name conflicts with a previously defined data section name or identifier.
Suppose you created a procedure section named x with the #section preprocessor directive. If, later in the same file, you use x as a data section name, you will trigger this error. Also, if you define x as a global variable, C will issue this error because a global variable named x is stored in a data section named x. The probable cause of this error is that you accidentally reversed the section names. See the "#section" listing in Chapter 4 for details.

188 ERROR Data section name conflicts with a previously defined procedure section name or identifier.
Suppose you created a data section named x with the #section preprocessor directive. If, later in the same file, you use x as a procedure section name, you will trigger this error. Also, if you define x as a global variable, C will issue this error because a global variable named x is stored in a data section named x. The probable cause of this error is that you accidentally reversed the section names. See the "#section" listing in Chapter 4 for details.

189 ERROR Extraneous data at end of control line; ignored.
A #section preprocessor directive should end with a right parenthesis ")". However, you have mistakenly put some more code after the right parenthesis.
This extra code may cause many errors. See the "#section" listing in Chapter 4 for details.

190 ERROR Compiler failure, invalid use of multiple sections and non-local goto to label identifier.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

191 ERROR Compiler failure, bad address constant.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

192 ERROR Compiler failure, invalid use of multiple sections and up-level referencing in routine identifier.
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

193 ERROR Illegal option token for parameter.
You cannot use #attribute address in a parameter declaration, though #attribute volatile and #attribute device are okay. For details on the #attribute modifier, see Chapter 3.

194 ERROR Data section name conflicts with a previously defined external variable.
One of your global variables matches the name of a data section. (See the "#module" listing of Chapter 4 for details on section names.)

195 ERROR Cannot take "#module" directive once other tokens are seen.
The #module preprocessor directive is optional, but when you use it, it must be the very first thing in the file.

196 ERROR "#section" directive may not appear within a function.
A #section directive can appear outside a function, but can never appear within a function. See the "#section" listing in Chapter 4 for details.

197 WARNING Identifier exceeds 32 characters, only identifier is recognized.
Your source code can contain names of up to 256 characters; however, for internal representation, Domain C truncates any name longer than 32 characters down to 32 characters.
Thus, the compiler sees the following two identifiers as identical even though we see them as unique:

    int accounts_receivable_kansas_city_kansas;
    int accounts_receivable_kansas_city_missouri;

198 WARNING Type of variable_name is illegal for member token.
Possibly, you are misusing the arrow operator ->. The arrow operator dereferences a pointer to a structure or union, so the compiler issues this warning if variable_name is not a pointer to a structure or union. Another possibility is that you misspelled the name of the structure or union. See the "structure and union operations" listing in Chapter 4 for details.

199 ERROR Non-unique member name requires struct/union or struct/union pointer.
You can trigger this rather rare error by using the same member name more than once in different structure or union declarations. For example, the following attempt to reference member a is doomed because the compiler cannot figure out whether you mean j.a or k.a:

    struct {int a;} j;
    struct {float a;} k;
    int *i;
    i->a = 10;

200 ERROR Illegal return type for function function_name; functions must return either an lvalue or void.
Functions cannot return arrays, but see Chapter 5 for a way around this restriction.

201 ERROR Internal error - error_message
The error is in the compiler, not in your code. Please contact your customer support representative or mail us an APR.

202 WARNING Value assigned to variable_name is never used; assignment eliminated by optimizer.
You made an assignment to a local, automatic variable, but never used that variable again. To make the program more efficient, the optimizer eliminated the assignment.

203 ERROR Illegal declaration of variable_name; cannot have an array of functions.
You have attempted to declare an array of functions as in:

    int f[]();

This is not allowed in C.
To declare an array of pointers to functions, which is legal, you can write:

    int (*f[])()

204 WARNING Wrong size for enum variable_name; original size type_name assumed.
This warning occurs when you declare an enum type with the char, short, or long qualifiers, and then use the type without the qualifier. For example:

    short enum color { green, red, blue };
    enum color hue;

The declaration of hue will generate this warning because it does not include the short specifier.

205 WARNING Enumeration type clash [variable_name, variable_name] to the operator operator.
Technically, it is legal to mix enums of one type with enums of another type, and to mix enums with integer types. However, Domain C reports a warning when it encounters one of these type clashes.

206 WARNING Address of array or function in this context is redundant and ignored.
This warning occurs when you precede a naked array name (i.e., one without a subscript) or a naked function name (i.e., one without the parentheses indicating invocation) with an ampersand. Naked array names and function names are implicitly converted to addresses, so the address-of operator is ignored.

207 ERROR Illegal type "void" for argument parameter_name.
You have attempted to pass an argument of type void. Recall that the void type used in a prototype means that the function accepts no arguments, not that it accepts an argument of type void.

208 ERROR Illegal use of "void" in a function prototype; void must be the only type specified.
The type void in a prototype means that the function takes no arguments, so it is invalid to specify void and additional parameter types.

209 ERROR Illegal use of "ellipsis" in a function prototype; no other elements may follow "...".
The ellipsis notation in a function prototype can only appear as the last parameter. It indicates that the function accepts an unspecified number of additional arguments.

210 ERROR Exceeded maximum number of allowable parameters ( > 64).
Domain C does not support functions that take more than 64 arguments.

211 ERROR Type of formal parameter parameter_name conflicts with prototype declaration.
This error occurs when the types declared in a prototype declaration do not match the parameter types in the function definition. For example:

    extern int foo( int );
    int foo( char x ) {};

212 INFO 1 Old-style function declaration encountered; default prototype function_name ( ... ) assumed.
This informational message indicates an old-style function declaration, such as:

    extern int f(), g();

213 INFO 1 No prototype in scope, default prototype function_name ( ... ) assumed.
When the compiler encounters a function invocation for a function that has not been prototyped, it assumes that the function returns an int and takes an unspecified number of arguments.

214 WARNING Cannot dereference "pointer to void".
You may not dereference a pointer that is declared to point to the void type.

215 ERROR Missing parameter name for argument argument_name of function_name.
You have forgotten to enter a parameter name in a prototype definition.

216 INFO 1 Although argument argument_name to function_name is assignment compatible, it does not match the declared argument type.
You have invoked a function with arguments that are assignment-compatible with the parameters declared in the prototype, but are not exactly the same type. For example:

    extern void foo( short, double );
    int a;
    float b;
    foo( a, b );

a will be converted to short, and b to double, but you will receive info messages telling you that the types of a and b are not the same as the parameter types declared in the prototype.

217 ERROR Invalid #options specifier, token.
The only valid #options specifiers are a0_return and d0_return.

218 ERROR Illegal declaration of variable_name; array of references not allowed.
You have attempted to create an array of reference variables, which is not allowed.
219 ERROR Uninitialized reference variable variable_name.
You have declared a reference variable, but failed to initialize it. You must initialize all reference variables.

220 ERROR Global or static reference variable variable_name; not implemented.
Currently, Domain C does not support global or static reference variables. Reference variables must be local and automatic, or be function parameters.

221 WARNING Incompatible combination of integer and pointer types.
This warning occurs when you attempt to assign an integer value to a pointer type, or vice versa. Assignments such as these are not portable.

222 ERROR Invalid runtype token.
You have compiled with the -runtype switch, but have specified a runtype that the compiler does not recognize. See Chapter 6 for a list of valid runtypes.

223 INFO 3 Unnaturally aligned load/store variable_name diminishes code quality.
You have referenced an object that is not naturally aligned.

224 ERROR Compiler failure, no case for object type.
Internal failure. Submit APR.

225 ERROR Argument to attribute_specifier attribute conflicts with value already specified for this type.
You have attempted to assign an attribute specifier to an object that has already been declared with a conflicting specifier.

226 ERROR Maximum specifiable alignment is alignment_value.
You may not specify alignment greater than 3 (octword boundaries).

227 ERROR Size of "@1" bits is invalid for specified type.
Pascal error. Not used by C compiler.

228 ERROR Structured types may not be UNALIGNED.
You have specified byte alignment for a structure or union. The minimum alignment for structures and unions is word alignment.

229 WARNING Specified attribute_specifier attribute conflicts with attributes of base type.
This warning occurs when you specify an attribute for an object of a user-specified type, and the type definition specifies a conflicting attribute.
For example:

    typedef struct { int a; short b; } S #attribute[natural];
    S s1 #attribute[align(1)];

230 ERROR Attribute_specifier attribute is inappropriate for target machine type.
Reserved for future use.

231 ERROR Attribute_specifier and attribute_specifier attributes may not both be specified.
You have specified two attributes that are mutually exclusive.

232 ERROR PHYSICAL attribute specified without an ADDRESS.
You have specified the physical attribute for a variable, but have failed to specify an address. You must specify an address attribute when you specify a physical attribute. See Chapter 2 for more information about these attributes.

233 ERROR Attribute is inappropriate in this context.
You have specified an inappropriate attribute. For example, specifying volatile for a function parameter will generate this error.

234 INFO 1 Actual alignment of variable_name (alignment) is less than natural alignment (natural_alignment).
This info message tells you that you have declared an object or type that is not naturally aligned.

235 INFO 1 Large bit field bit_field_name not on longword boundary.
The bit field named bit_field_name is not aligned on a longword boundary.

236 WARNING Invalid section attribute for static var - ignored.
You have used the #section specifier to indicate a named overlay section, but the section name you specified is not valid.

237 ERROR This section name conflicts with a previously defined global variable.
You have attempted to create a section with the same name as a previously defined global variable. This is not allowed. For example:

    int global_var = 1;
    int x #attribute[section(global_var)];
    main() { }

238 ERROR Bit field constant bit_field_name too large.
You may not declare a bit field larger than 32 bits.

239 INFO 1 Actual alignment of array elements array_name is less than natural alignment "@2".
This message occurs when you declare an array of structures where the size of the structure is not evenly divisible by the size of the largest member. For example:

    typedef struct { int a; short b; } S;

240 WARNING Alignment of array elements is dependent on the current default alignment environment.
This warning signifies that the alignment of array elements depends on the current alignment setting.

241 WARNING Size of array element rounded up from num to num bits.
Reserved for future use.


Appendix A
ISO Latin-1 Table

Domain C uses the ISO DIS 8859/1 character set, commonly known as Latin-1, for character data representation. The Latin-1 set also includes all ASCII characters in their standard positions. Table A-1 shows the decimal, octal, and hexadecimal values for all ISO Latin-1 characters.

You can use Latin-1 characters in comments or character strings, but are limited to using ASCII letters A-Z and a-z (decimal positions 65-90 and 97-122, respectively), digits, underscores (_), and dollar signs ($) in identifiers. This adheres to existing C standards.

Table A-1. ISO Latin-1 Codes

    oct  dec  hex  character      oct  dec  hex  character
      0    0    0  NUL  ^@         40   32   20  space
      1    1    1  SOH  ^A         41   33   21  !
      2    2    2  STX  ^B         42   34   22  "
      3    3    3  ETX  ^C         43   35   23  #
      4    4    4  EOT  ^D         44   36   24  $
      5    5    5  ENQ  ^E         45   37   25  %
      6    6    6  ACK  ^F         46   38   26  &
      7    7    7  BEL  ^G         47   39   27  '
     10    8    8  BS   ^H         50   40   28  (
     11    9    9  TAB  ^I         51   41   29  )
     12   10    A  LF   ^J         52   42   2A  *
     13   11    B  VT   ^K         53   43   2B  +
     14   12    C  FF   ^L         54   44   2C  ,
     15   13    D  CR   ^M         55   45   2D  -
     16   14    E  SO   ^N         56   46   2E  .
     17   15    F  SI   ^O         57   47   2F  /
     20   16   10  DLE  ^P         60   48   30  0
     21   17   11  DC1  ^Q         61   49   31  1
     22   18   12  DC2  ^R         62   50   32  2
     23   19   13  DC3  ^S         63   51   33  3
     24   20   14  DC4  ^T         64   52   34  4
     25   21   15  NAK  ^U         65   53   35  5
     26   22   16  SYN  ^V         66   54   36  6
     27   23   17  ETB  ^W         67   55   37  7
     30   24   18  CAN  ^X         70   56   38  8
     31   25   19  EM   ^Y         71   57   39  9
     32   26   1A  SUB  ^Z         72   58   3A  :
     33   27   1B  ESC  ^[         73   59   3B  ;
     34   28   1C  FS   ^\         74   60   3C  <
     35   29   1D  GS   ^]         75   61   3D  =
     36   30   1E  RS   ^^         76   62   3E  >
     37   31   1F  US   ^_         77   63   3F  ?

    oct  dec  hex  character      oct  dec  hex  character
    100   64   40  @              140   96   60  `
    101   65   41  A              141   97   61  a
    102   66   42  B              142   98   62  b
    103   67   43  C              143   99   63  c
    104   68   44  D              144  100   64  d
    105   69   45  E              145  101   65  e
    106   70   46  F              146  102   66  f
    107   71   47  G              147  103   67  g
    110   72   48  H              150  104   68  h
    111   73   49  I              151  105   69  i
    112   74   4A  J              152  106   6A  j
    113   75   4B  K              153  107   6B  k
    114   76   4C  L              154  108   6C  l
    115   77   4D  M              155  109   6D  m
    116   78   4E  N              156  110   6E  n
    117   79   4F  O              157  111   6F  o
    120   80   50  P              160  112   70  p
    121   81   51  Q              161  113   71  q
    122   82   52  R              162  114   72  r
    123   83   53  S              163  115   73  s
    124   84   54  T              164  116   74  t
    125   85   55  U              165  117   75  u
    126   86   56  V              166  118   76  v
    127   87   57  W              167  119   77  w
    130   88   58  X              170  120   78  x
    131   89   59  Y              171  121   79  y
    132   90   5A  Z              172  122   7A  z
    133   91   5B  [              173  123   7B  {
    134   92   5C  \              174  124   7C  |
    135   93   5D  ]              175  125   7D  }
    136   94   5E  ^              176  126   7E  ~
    137   95   5F  _              177  127   7F  del

    oct  dec  hex  character      oct  dec  hex  character
    204  132   84  IND            247  167   A7  §
    205  133   85  NEL            250  168   A8  ¨
    206  134   86  SSA            251  169   A9  ©
    207  135   87  ESA            252  170   AA  ª
    210  136   88  HTS            253  171   AB  «
    211  137   89  HTJ            254  172   AC  ¬
    212  138   8A  VTS            255  173   AD  SHY
    213  139   8B  PLD            256  174   AE  ®
    214  140   8C  PLU            257  175   AF  ¯
    215  141   8D  RI             260  176   B0  °
    216  142   8E  SS2            261  177   B1  ±
    217  143   8F  SS3            262  178   B2  ²
    220  144   90  DCS            263  179   B3  ³
    221  145   91  PU1            264  180   B4  ´
    222  146   92  PU2            265  181   B5  µ
    223  147   93  STS            266  182   B6  ¶
    224  148   94  CCH            267  183   B7  ·
    225  149   95  MW             270  184   B8  ¸
    226  150   96  SPA            271  185   B9  ¹
    227  151   97  EPA            272  186   BA  º
    233  155   9B  CSI            273  187   BB  »
    234  156   9C  ST             274  188   BC  ¼
    235  157   9D  OSC            275  189   BD  ½
    236  158   9E  PM             276  190   BE  ¾
    237  159   9F  APC            277  191   BF  ¿
    240  160   A0  NBSP           300  192   C0  À
    241  161   A1  ¡              301  193   C1  Á
    242  162   A2  ¢              302  194   C2  Â
    243  163   A3  £              303  195   C3  Ã
    244  164   A4  ¤              304  196   C4  Ä
    245  165   A5  ¥              305  197   C5  Å
    246  166   A6  ¦              306  198   C6  Æ

    oct  dec  hex  character      oct  dec  hex  character
    307  199   C7  Ç              347  231   E7  ç
    310  200   C8  È              350  232   E8  è
    311  201   C9  É              351  233   E9  é
    312  202   CA  Ê              352  234   EA  ê
    313  203   CB  Ë              353  235   EB  ë
    314  204   CC  Ì              354  236   EC  ì
    315  205   CD  Í              355  237   ED  í
    316  206   CE  Î              356  238   EE  î
    317  207   CF  Ï              357  239   EF  ï
    320  208   D0  Ð              360  240   F0  ð
    321  209   D1  Ñ              361  241   F1  ñ
    322  210   D2  Ò              362  242   F2  ò
    323  211   D3  Ó              363  243   F3  ó
    324  212   D4  Ô              364  244   F4  ô
    325  213   D5  Õ              365  245   F5  õ
    326  214   D6  Ö              366  246   F6  ö
    327  215   D7  ×              367  247   F7  ÷
    330  216   D8  Ø              370  248   F8  ø
    331  217   D9  Ù              371  249   F9  ù
    332  218   DA  Ú              372  250   FA  ú
    333  219   DB  Û              373  251   FB  û
    334  220   DC  Ü              374  252   FC  ü
    335  221   DD  Ý              375  253   FD  ý
    336  222   DE  Þ              376  254   FE  þ
    337  223   DF  ß              377  255   FF  ÿ
    340  224   E0  à
    341  225   E1  á
    342  226   E2  â
    343  227   E3  ã
    344  228   E4  ä
    345  229   E5  å
    346  230   E6  æ


Appendix B
Domain C Extensions

This appendix lists all extensions to the de facto C standard defined in The C Programming Language by Kernighan and Ritchie. The extensions listed in Table B-1 are compatible with the proposed ANSI C standard or with the C++ programming language. The ones listed in Table B-2 are unique to Domain C.

Table B-1. ANSI C and C++ Extensions Supported by Domain C

    Extension                                                  ANSI   C++
    function prototypes                                        yes    yes
    reference variables                                               yes
    void and (void *)                                          yes    yes
    __FILE__ and __LINE__ predefined symbols                   yes    yes
    __DATE__, __TIME__, and __STDC__ predefined symbols        yes
    structure and union assignment                             yes    yes
    union initialization                                       yes    yes
    defined preprocessor operator                              yes    yes
    unsigned short, unsigned long, and unsigned char types     yes    yes
    long constants                                             yes    yes
    passing structures and unions as arguments                 yes    yes

Table B-2. Domain Extensions to the C Language

    Extension
    sized enums (char, short, and long)
    #attribute specifier
    #option specifier
    std_$call keyword
    #section and #module preprocessor directives
    #debug preprocessor directive
    #eject preprocessor directive
    #list and #nolist preprocessor directives
    #systype preprocessor directive
    systype predefined macro
    _BFMT_COFF predefined name
    long float data type
    dollar sign ($) in identifiers
    partial specification of struct and union members


Appendix C
BSD lint: A C Program Checker

C.1 Introduction

The lint utility examines C source code, detecting any bugs or obscurities. It enforces the type rules of C more strictly than the C compilers do. It may also be used to enforce many portability restrictions involved in moving programs between different machines and/or operating systems. Furthermore, it detects certain constructions which, although technically "legal," are nonetheless wasteful, error-prone, or otherwise best avoided. lint accepts multiple input files and library specifications, and checks them for consistency.

The separation of function between lint and the C compilers has both historical and practical rationale. The compilers turn C programs into executable files rapidly and efficiently. This is possible, in part, because the compilers don't do sophisticated type checking, especially between separately-compiled programs. lint takes a more global, leisurely view of the program, looking much more carefully at the compatibilities.

This chapter discusses the use of lint, gives an overview of the implementation, and gives some hints on the writing of machine independent C code.

C.2 Summary of lint Options

The command currently has the form

    % lint [options] files ... library-descriptors ...

The following options are available:

    -a    Print messages about assignments of long objects to integers that are not long.
    -b    Print messages about unreachable break statements.
    -c    Complain about questionable casts.
    -h    Perform heuristic checks.
    -n    Don't do any library checking.
    -p    Perform portability checks.
    -s    Perform heuristic checks (same as -h).
    -u    Don't report unused or undefined externals.
    -v    Don't report unused arguments.
    -x    Report unused external declarations.

C.2.1 Usage

Suppose there are two C source files, file1.c and file2.c, that are ordinarily compiled and loaded together. Then the command

    $ lint file1.c file2.c

produces messages describing inconsistencies and inefficiencies in the programs. The following command

    $ lint -p file1.c file2.c

also produces these messages, as well as other messages that relate to the "portability" of the programs to other operating systems and machines. Replacing the -p by -h produces messages about constructions that, although legal, demonstrate poor programming style (according to lint). You may use both options

    $ lint -hp file1.c file2.c

to get both types of messages.

Many of the facts that lint needs to establish may, in reality, be impossible to discover. For example, it may not be possible to know whether a given function in a program ever gets called without also knowing the input data. Deciding whether exit is ever called is equivalent to solving the famous "halting problem," known to be recursively undecidable. Thus, most of the lint algorithms are a compromise. If a function is never mentioned, it can never be called. If a function is mentioned, lint assumes it can be called.

lint tries to give only relevant information. Messages of the form "xxx might be a bug" are easy to generate, but are acceptable only in proportion to the fraction of real bugs they uncover. If this fraction of real bugs is too small, lint loses credibility, and its "error" messages merely clutter up the output, obscuring other, possibly more important messages.
C.2.2 Unused Variables and Functions

As sets of programs evolve, previously used variables and arguments to functions may become unused. It isn't uncommon for external variables, or even entire functions, to become unnecessary, and yet not be removed from the source. These "errors of commission" rarely cause working programs to fail, but they are a source of inefficiency, and make programs harder to understand and change. Moreover, information about such unused variables and functions can occasionally help you to discover bugs; if a function does a necessary job and is never called, something is probably wrong.

lint complains about variables and functions that are defined but not otherwise mentioned. An exception is variables that are declared through explicit extern statements but are never referenced; thus, the statement

    extern float sin();

evokes no comment if sin is never used. Note that this agrees with the semantics of the Domain C compiler. In some cases, these unused external declarations might be of some interest; you can discover them by adding the -x option when you invoke lint.

Certain styles of programming require many functions to be written with similar interfaces; frequently, some of the arguments may be unused in many of the calls. The -v option suppresses the printing of complaints about unused arguments. When -v is in effect, lint produces no messages about unused arguments except for those arguments that are unused and also declared as register arguments. This can be considered an active (and preventable) waste of the register resources of the machine.

In one particular case, information about unused or undefined variables is more distracting than helpful. This is when lint is applied to some, but not all, files in a collection that is normally loaded together. Here, many of the functions and variables defined may not be used, and, conversely, many functions and variables defined elsewhere may be used.
Use the -u option to suppress the spurious messages that might otherwise appear.

C.2.3 Set/Used Information

lint attempts to detect cases where a variable is used before it is assigned a value. This isn't easy to detect. Many algorithms take a good deal of time and space, and still produce "error" messages about perfectly valid programs. lint detects local variables (automatic and register storage classes) whose first use appears physically earlier in the input file than the first assignment to the variable. It assumes that taking the address of a variable constitutes a "use," since the actual use may occur later, in a data-dependent fashion.

The restriction to the physical appearance of variables in the file makes the algorithm very simple and quick to implement, since the true flow of control need not be discovered. This genre of complaint has its roots in stylistic, rather than actual, error. Because static and external variables are initialized to zero, no meaningful information can be discovered about their uses. The algorithm deals correctly, however, with initialized automatic variables, and variables used in the expression that first sets them.

The set/used information also permits recognition of those local variables that are set and never used; these form a frequent source of inefficiencies, and may also be symptomatic of bugs.

C.2.4 Flow of Control

lint attempts to detect unreachable portions of the programs that it processes. It complains about unlabeled statements immediately following goto, break, continue, or return statements. It attempts to detect loops that can never be left at the bottom, detecting the special cases while(1) and for(;;) as infinite loops. lint also complains about loops that can't be entered at the top. As is often true when lint makes false accusations, this condition may not be a bug, but a complaint about programming style.
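The following fragment is a minimal sketch of the flow-of-control cases just described; it is not taken from the lint sources, and the function names sum_until_zero and clamp_nonneg are hypothetical. The first function uses the for(;;) form, which can be left only through break. The second contains an unlabeled statement directly after a return, which is the kind of statement lint would be expected to report as unreachable.

```c
#include <assert.h>

/* Hypothetical helper: adds array elements until a 0 terminator is seen. */
int sum_until_zero(const int *v)
{
    int total = 0;

    assert(v != 0);
    for (;;) {              /* for(;;) never falls out at the bottom */
        if (*v == 0)
            break;          /* the loop is left only through this break */
        total += *v++;
    }
    return total;
}

/* Hypothetical helper: maps negative values to 0. */
int clamp_nonneg(int n)
{
    if (n < 0)
        return 0;
    return n;
    n = -n;                 /* unlabeled statement after return: never executed */
}
```

Both functions compile and behave sensibly; the point is only that the last assignment in clamp_nonneg can never run, so deleting it changes nothing.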
lint has an important area of blindness in the flow of control algorithm: it can't detect functions that are called and never return. Thus, a call to exit may cause unreachable code that lint doesn't detect; the most serious effects of this are in the determination of returned function values (see Section C.2.5).

A break statement that can't be reached causes no message. Programs generated by yacc and lex may have hundreds of unreachable break statements. The -O option in the C compiler often eliminates the resulting object code inefficiency. Thus, these unreached statements are of little importance, there is typically nothing you can do about them, and the resulting messages would clutter up lint's output. If you want to see these messages, invoke lint with the -b option.

C.2.5 Function Values

Sometimes functions return values that are never used; sometimes programs incorrectly use function "values" that have never been returned. lint addresses this problem in a number of ways.

Locally, within a function definition, the appearance of both

    return ( expr );

and

    return ;

statements is cause for alarm; lint gives the message

    function name contains return(e) and return

The most serious difficulty with this is detecting when a function return is implied by flow of control reaching the end of the function. For example:

    f ( a ) {
        if ( a ) return ( 3 );
        g ();
    }

Notice that, if a tests false, f calls g and then returns with no defined return value; this triggers a complaint from lint. If g, like exit, never returns, the message is produced even though nothing is actually wrong.

In practice, some potentially serious bugs have been discovered by this feature. It also accounts for a substantial fraction of the "noise" messages produced by lint.

On a global scale, lint detects cases where a function returns a value, but this value is sometimes or always unused. When the value is always unused, it may constitute an inefficiency in the function definition.
When the value is sometimes unused, it may represent bad style (e.g., no testing for error conditions).

The dual problem of using a function value when the function does not return one is also detected. This is a serious problem that has been observed in "working" programs where, by chance, the desired function value was computed in the function return register.

C.2.6 Type Checking

lint enforces the type checking rules of C more strictly than compilers do. The additional checking goes on in four major areas: across certain binary operators and implied assignments, at the structure selection operators, between the definition and uses of functions, and in the use of enumerations.

Several operators have an implied balancing between types of the operands. The assignment, conditional ( ?: ), and relational operators have this property. The argument of a return statement and expressions used in initialization also undergo similar conversions. In these operations, char, short, int, long, unsigned, float, and double types may be freely intermixed. The types of pointers must agree exactly, except that arrays of x's can be intermixed with pointers to x's.

The type checking rules also require that, in structure references, the left operand of the "->" be a pointer to structure, the left operand of the "." be a structure, and the right operand of these operators be a member of the structure implied by the left operand. Similar checking is done for references to unions.

Strict rules apply to function argument and return value matching. The types float and double may be freely matched, as may the types char, short, int, and unsigned. Also, pointers can be matched with the associated arrays. Aside from this, all actual arguments must agree in type with their declared counterparts.

With enumerations, lint checks to see that enumeration variables or members are not mixed with other types or other enumerations.
Another check ensures that the only operations applied are =, initialization, ==, !=, and function arguments and return.

C.2.7 Type Casts

The type cast feature in C was introduced largely as an aid to producing more portable programs. Consider this assignment, where p is a character pointer:

    p = 1 ;

lint has reason to complain. Now, consider the assignment

    p = (char *)1 ;

in which a cast has been used to convert the integer to a character pointer. This assignment clearly signals the desired action. It seems harsh for lint to continue to complain about this. On the other hand, if this code is to be truly portable, such constructs should be examined carefully. The -c option controls the printing of comments about casts. When -c is in effect, casts are treated as though they were assignments subject to complaint; otherwise, all legal casts are passed without comment, no matter how strange the type mixing seems to be.

C.2.8 Nonportable Character Use

On most C implementations, characters take on only positive values. lint flags certain comparisons and assignments as illegal or nonportable. For example, the fragment

    char c;
    if( (c = getchar()) < 0 ) ....

works where the version of C allows a character to have a negative value, but fails on machines where characters always assume positive values. The real solution is to declare c an integer, since getchar is actually returning integer values. In any case, lint responds with "nonportable character comparison."

A similar issue arises with bitfields; when assignments of constant values are made to bitfields, the field may be too small to hold the value. This is especially true because, on some machines, bitfields are considered signed quantities. While it may seem unintuitive to consider that a 2-bit field declared as type int cannot hold the value 3, the problem disappears if the bitfield is declared to have type unsigned.
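
The two remedies named in Section C.2.8 can be sketched as follows. This is only an illustration: the names count_bytes, flags, and max_mode are hypothetical and do not come from this manual.

```c
#include <stdio.h>

/* Count the bytes on a stream. c is declared int, not char, so that the
   EOF value returned by getc() stays distinct from every valid byte. */
long count_bytes(FILE *fp)
{
    int c;
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}

/* A 2-bit field declared unsigned can hold the values 0 through 3. */
struct flags {
    unsigned int mode : 2;
};

/* Store 3 in the 2-bit field and read it back. */
unsigned int max_mode(void)
{
    struct flags f;

    f.mode = 3;
    return f.mode;
}
```

Declaring c as int keeps the end-of-file test meaningful whether the implementation treats char as signed or unsigned; declaring the field unsigned removes the question of whether its top bit is a sign bit.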
C.2.9 Assignments of "longs" to "ints"

Bugs may arise from the assignment of a long to an int, which loses accuracy in some implementations. This may happen in programs that have been incompletely converted to use typedefs. When a typedef variable is changed from int to long, the program can stop working because some intermediate results may be assigned to ints, losing accuracy. Since there are a number of legitimate reasons for assigning longs to ints, the detection of these assignments is enabled by the -a option.

C.2.10 Unorthodox Constructions

lint flags several perfectly legal, but somewhat unorthodox, constructions in the hope of promoting better code quality and clearer style, and even of pointing out bugs. The -h option enables these checks. For example, in the statement

    *p++ ;

the asterisk (*) does nothing. This provokes the message "null effect" from lint. In the following program fragment,

    unsigned x ;
    if( x < 0 ) ...

the test never succeeds. Similarly, the test

    if( x > 0 ) ...

is equivalent to

    if( x != 0 )

which may not be the intended action. lint accuses you of making a "degenerate unsigned comparison" in these cases. If the code says

    if( 1 != 0 ) ....

lint reports "constant in conditional context," since the comparison of 1 with 0 gives a constant result.

Another construction detected by lint involves operator precedence. Bugs arising from misunderstandings about the precedence of operators can be accentuated by spacing and formatting, making such bugs extremely hard to find. For example, the statements

    if ( x&077 == 0 ) ...

or

    x<<2 + 40

probably don't do what was intended. The best solution is to place such expressions in parentheses, and lint encourages this by an appropriate message.

Finally, when the -h option is in force, lint complains about variables that are redeclared in inner blocks in a way that conflicts with their use in outer blocks.
This is legal, but is considered by many to be bad style, often unnecessary, and frequently a bug.

C.2.11 Antiquated Syntax

lint attempts to discourage several forms of older syntax. These fall into two classes: assignment operators and initialization.

The older forms of assignment operators (e.g., =+, =-, ...) could cause ambiguous expressions, such as

    a =-1 ;

This expression could be interpreted as either

    a =- 1 ;

or

    a = -1 ;

It is especially perplexing when such ambiguity arises as the result of a macro substitution. The newer and preferred operators (+=, -=, etc.) don't cause such confusion. To spur the abandonment of the older forms, lint complains about these older operators.

A similar issue arises with initialization. Older versions of C allowed

    int x 1 ;

to initialize x to 1. This also caused syntactic difficulties. For example,

    int x ( -1 ) ;

looks somewhat like the beginning of a function declaration:

    int x ( y ) { . . .

and the compiler must read some distance past x to be sure what the declaration really is. Again, the problem is even more perplexing when the initializer involves a macro. The current syntax places an equal sign between the variable and the initializer:

    int x = -1 ;

This is free of any possible syntactic ambiguity.

C.2.12 Pointer Alignment

Certain pointer assignments may be reasonable on some machines, and illegal on others, due entirely to alignment restrictions. On machines where double-precision values may begin on any integer boundary, it is reasonable to assign integer pointers to double pointers. On other machines, double-precision values must begin on even word boundaries; thus, not all such assignments make sense. lint tries to detect cases where pointers are assigned to other pointers, and such alignment problems might arise. The message "possible pointer alignment problem" results from this situation whenever either the -p or -h option is in effect.
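
One common way to sidestep the alignment question altogether, offered here as a sketch and not as something this manual prescribes, is to copy the bytes instead of converting the pointer. The helper name load_double is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: fetch a double stored at an arbitrary byte offset.
   Converting (buf + offset) to a double pointer is the kind of assignment
   that can fail on machines with alignment restrictions; copying the bytes
   into a properly aligned local variable avoids the restriction. */
double load_double(const char *buf, size_t offset)
{
    double d;

    memcpy(&d, buf + offset, sizeof d);
    return d;
}
```

The copy costs a few instructions, but the code no longer depends on where the machine allows a double to begin.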
C.2.13 Multiple Uses and Side Effects

In complicated expressions, the best order in which to evaluate subexpressions may be highly machine dependent. For example, on machines in which the stack runs backwards, function arguments are probably best evaluated from right to left; on machines with a stack running forward, left to right seems most attractive. Function calls embedded as arguments of other functions may or may not be treated similarly to ordinary arguments. Similar issues arise with other operators which have side effects, such as the assignment operators and the increment and decrement operators.

So that the efficiency of C on a particular machine isn't unduly compromised, the C language leaves the order of evaluation of complicated expressions up to the local compiler. In fact, the various C compilers have considerable differences in the order in which they evaluate complicated expressions. In particular, if any variable is changed by a side effect, and also used elsewhere in the same expression, the result is explicitly undefined.

lint checks for the important special case where a simple scalar variable is affected. For example, the statement

    a[i] = b[i++]

draws the complaint:

    warning: i evaluation order undefined

C.3 Implementation Details

lint consists of two programs and a driver. The first program is a version of the Portable C Compiler (PCC). This compiler does lexical and syntax analysis on the input text, constructs and maintains symbol tables, and builds trees for expressions. Instead of writing an intermediate file which is passed to a code generator (as the other compilers do), lint produces an intermediate file which consists of lines of ASCII text. Each line contains an external variable name, an encoding of the context in which it was seen (use, definition, declaration, etc.), a type specifier, and a source file name and line number.
The information about variables local to a function or file is collected by accessing the symbol table, and examining the expression trees. Comments about local problems are produced as detected. The information about external names is collected onto an intermediate file. After all the source files and library descriptions have been collected, the intermediate file is sorted to bring together all information collected about a given external name. The second, rather small, program then reads the lines from the intermediate file and compares all of the definitions, declarations, and uses for consistency.

The driver controls this process, and is also responsible for making the options available to both passes of lint.

C.3.1 Portability

This section describes some of the differences between C implementations, and discusses the lint features that encourage portability.

Uninitialized external variables are treated differently in different implementations of C. Suppose two files contain a declaration without initialization, such as

    int a ;

outside of any function. The loader resolves these declarations and causes only a single word of storage to be set aside for a. Under some implementations, this isn't feasible, so each such declaration causes a word of storage to be set aside and called a. When loading or library editing takes place, this causes fatal conflicts that prevent the proper operation of the program. If lint is invoked with the -p option, it detects such multiple definitions.

A related difficulty comes from the amount of information retained about external names during the loading process. Names known externally to UNIX software have seven significant characters, with the upper/lowercase distinction preserved. On other systems, the number of characters used and the preservation of case distinction may not be handled the same way.
This leads to situations where programs that run fine under the UNIX system encounter loader problems on other systems. lint -p causes all external symbols to be mapped to one case and truncated to six characters, providing a worst-case analysis.

A number of differences arise in the area of character handling. The UNIX system uses 8-bit ASCII. Other systems may use other character lengths or even other encoding schemes (e.g., EBCDIC). Moreover, character strings go from high to low bit positions ("left to right") on some systems, and low to high ("right to left") on the others. Thus, code that attempts to construct strings out of character constants, or to use characters as indices into arrays, is suspect. lint is of little help here, except to flag multi-character character constants.

Other problems are likely to arise in shifting or masking words. C supports a bit-field facility that can be used to write much of this code in a reasonably portable way. Frequently, portability of such code can be enhanced by slight rearrangements in coding style. For example, consider the use of

    x &= 0177700

to clear the low order six bits of x. This constant assumes a 16-bit word. If the bit field feature cannot be used, the same effect can be obtained by writing the following, which does not depend on the word size and therefore works on many machines:

    x &= ~077 ;

The right shift operator is arithmetic shift on the PDP-11, and logical shift on most other machines. To obtain a logical shift on all machines, the left operand can be typed unsigned. Characters are considered signed integers on the PDP-11, and unsigned on the other machines. This persistence of the sign bit may be reasonably considered a bug in the PDP-11 hardware that has infiltrated itself into the C language. If there were a good way to discover the programs that would be affected, C could be changed; in any case, lint is no help here.

The above discussion may have made the problem of portability seem bigger than it in fact is.
The issues involved here are rarely subtle or mysterious, at least to the implementor of the program, although they can involve some work to straighten out. The most serious bar to the portability of UNIX system utilities has been the inability to mimic essential UNIX system functions on the other systems. The inability to seek to a random character position in a text file, or to establish a pipe between processes, has involved far more rewriting and debugging than any of the differences in C compilers. On the other hand, lint has been very helpful in moving the UNIX operating system and associated utility programs to other machines.

C.3.2 Suppressing Unwanted Output

Sometimes you want lint to refrain from citing various constructs that, while technically "wrong," are nevertheless there for a good reason. There may be valid reasons for "illegal" type casts, functions with a variable number of arguments, etc. Moreover, the flow of control information produced by lint often has blind spots, causing occasional spurious messages about perfectly reasonable programs. Thus, some way of controlling lint's output is often desirable.

The form that this mechanism should take is not at all clear. New keywords would require current and old compilers to recognize these keywords, if only to ignore them. This has both philosophical and practical problems. New preprocessor syntax suffers from similar problems.

What was finally done was to cause several words to be recognized by lint when they were embedded in comments. This required minimal preprocessor changes; the preprocessor just had to agree to pass comments through to its output, instead of deleting them as had been previously done. Thus, lint directives are invisible to the compilers, and the effect on systems with the older preprocessors is merely that the lint directives don't work.
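
As a sketch of what such comment-embedded directives look like in source, the fragment below uses two of the directives that this section goes on to describe, ARGSUSED and NOTREACHED. The functions fatal and checked_half are hypothetical names chosen for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error routine that never returns to its caller. */
void fatal(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    exit(1);
}

/* The second argument is deliberately unused; the directive below asks
   lint not to complain about it. */
/* ARGSUSED */
int checked_half(int n, int unused_flag)
{
    if (n >= 0)
        return n / 2;
    fatal("negative argument");
    /* NOTREACHED */
}
```

To a compiler both directives are ordinary comments. To lint, NOTREACHED asserts that control never flows off the end of checked_half, so the missing return after the call to fatal is not reported as an implied return without a value.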
The first directive is concerned with flow of control information; if a particular place in the program cannot be reached, but this is not apparent to lint, it can be asserted by the directive

    /* NOTREACHED */

at the appropriate spot in the program. Similarly, if you want to turn off strict type checking for the next expression, you can use the directive

    /* NOSTRICT */

lint reverts to the previous default after the next expression. The -v option can be turned on for one function by the directive

    /* ARGSUSED */

Complaints about variable number of arguments in calls to a function can be turned off by using this directive

    /* VARARGS */

before the function definition. Sometimes, it is desirable to check the first several arguments, and leave the later arguments unchecked. This can be done by following the VARARGS keyword immediately with a digit giving the number of arguments to be checked; thus, this causes the first two arguments to be checked, the others unchecked:

    /* VARARGS2 */

Finally, the directive

    /* LINTLIBRARY */

at the head of a file identifies this file as a library declaration file (see Section C.3.3).

C.3.3 Library Declaration Files

lint accepts certain library directives, such as

    -ly

and tests the source files for compatibility with these libraries by accessing library description files whose names are constructed from the library directives. These files all begin with the directive

    /* LINTLIBRARY */

followed by a series of dummy function definitions. The critical parts of these definitions are the declaration of the function return type, whether the dummy function returns a value, and the number and types of arguments to the function. You can use the VARARGS and ARGSUSED directives to specify features of the library functions.

lint library files are processed almost exactly like ordinary source files.
The only difference is that functions defined on a library file, but not used on a source file, draw no complaints. lint doesn't simulate a full library search algorithm, and complains if the source files contain a redefinition of a library routine.

By default, lint checks the programs it is given against a standard library file, which contains descriptions of the programs which are normally loaded when a C program is run. When the -p option is in effect, another file containing descriptions of the standard I/O library routines that are expected to be portable across various machines is checked. The -n option can be used to suppress all library checking.

Appendix D
SysV lint Utility

The lint program examines C language source programs, detecting a number of bugs and obscurities. It enforces the type rules of C language more strictly than the C compiler. It may also be used to enforce a number of portability restrictions involved in moving programs between different machines and/or operating systems. Another option detects a number of wasteful or error-prone constructions, which nevertheless are legal. lint accepts multiple input files and library specifications and checks them for consistency.

D.1 Usage

The lint command has the form:

    lint [options] files ... [library-descriptors ...]

where options are optional flags to control lint checking and messages; files are the files to be checked, which end with .c or .ln; and library-descriptors are the names of libraries to be used in checking the program.

The options that are currently supported by the lint command are:

    -a       Suppress messages about assignments of long values to variables that are not long.
    -b       Suppress messages about break statements that cannot be reached.
    -c       Only check for intra-file bugs; leave external information in files suffixed with .ln.
    -h       Do not apply heuristics (which attempt to detect bugs, improve style, and reduce waste).
    -n       Do not check for compatibility with either the standard or the portable lint library.
    -o name  Create a lint library from input files named llib-lname.ln.
    -p       Attempt to check portability.
    -u       Suppress messages about function and external variables used and not defined or defined and not used.
    -v       Suppress messages about unused arguments in functions.
    -x       Do not report variables referred to by external declarations but never used.

When more than one option is used, they should be combined into a single argument, such as -ab or -xha.

The names of files that contain C language programs should end with the suffix .c, which is mandatory for lint and the C compiler.

lint accepts certain arguments, such as:

    -lm

These arguments specify libraries that contain functions used in the C language program. The source code is tested for compatibility with these libraries. This is done by accessing library description files whose names are constructed from the library arguments. These files all begin with the comment:

    /* LINTLIBRARY */

which is followed by a series of dummy function definitions. The critical parts of these definitions are the declaration of the function return type, whether the dummy function returns a value, and the number and types of arguments to the function. The VARARGS and ARGSUSED comments can be used to specify features of the library functions. Section D.2 describes how it is done.

lint library files are processed almost exactly like ordinary source files. The only difference is that functions that are defined in a library file but are not used in a source file do not result in messages. lint does not simulate a full library search algorithm and will print messages if the source files contain a redefinition of a library routine.

By default, lint checks the programs it is given against a standard library file that contains descriptions of the programs that are normally loaded when a C language program is run.
When the -p option is used, another file is checked containing descriptions of the standard library routines that are expected to be portable across various machines. The -n option can be used to suppress all library checking.

D.2 lint Message Types

The following paragraphs describe the major categories of messages printed by lint.

D.2.1 Unused Variables and Functions

As sets of programs evolve and develop, previously used variables and arguments to functions may become unused. It is not uncommon for external variables or even entire functions to become unnecessary and yet not be removed from the source. These types of errors rarely cause working programs to fail, but are a source of inefficiency and make programs harder to understand and change. Also, information about such unused variables and functions can occasionally serve to discover bugs.

lint prints messages about variables and functions which are defined but not otherwise mentioned, unless the message is suppressed by means of the -u or -x option.

Certain styles of programming may permit a function to be written with an interface where some of the function's arguments are optional. Such a function can be designed to accomplish a variety of tasks, depending on which arguments are used. Normally lint prints messages about unused arguments; however, the -v option is available to suppress the printing of these messages. When -v is in effect, no messages are produced about unused arguments except for those arguments which are unused and also declared as register arguments. This can be considered an active (and preventable) waste of the register resources of the machine.

Messages about unused arguments can be suppressed for one function by adding the comment:

    /* ARGSUSED */

to the source code before the function. This has the effect of the -v option for only one function. Also, the comment:

    /* VARARGS */

can be used to suppress messages about variable number of arguments in calls to a function. The comment should be added before the function definition. In some cases, it is desirable to check the first several arguments and leave the later arguments unchecked. This can be done with a digit giving the number of arguments which should be checked. For example:

    /* VARARGS2 */

will cause only the first two arguments to be checked.

When lint is applied to some but not all files out of a collection that are to be loaded together, it issues complaints about unused or undefined variables. This information is, of course, more distracting than helpful. Functions and variables that are defined may not be used; conversely, functions and variables defined elsewhere may be used. The -u option suppresses the spurious messages.

D.2.2 Set/Used Information

lint attempts to detect cases where a variable is used before it is assigned a value. lint detects local variables (automatic and register storage classes) whose first use appears physically earlier in the input file than the first assignment to the variable. It assumes that taking the address of a variable constitutes a "use" since the actual use may occur at any later time, in a data-dependent fashion.

The restriction to the physical appearance of variables in the file makes the algorithm very simple and quick to implement since the true flow of control need not be discovered. It does mean that lint can print error messages about program fragments that are legal, but these programs would probably be considered bad on stylistic grounds. Because static and external variables are initialized to zero, no meaningful information can be discovered about their uses. The lint program does deal with initialized automatic variables.

The set/used information also permits recognition of those local variables that are set and never used.
These form a frequent source of inefficiencies and may also be symptomatic of bugs.

D.2.3 Flow of Control

lint attempts to detect unreachable portions of a program. It will print messages about unlabeled statements immediately following goto, break, continue, or return statements. It attempts to detect loops that cannot be left at the bottom and to recognize the special cases while(1) and for(;;) as infinite loops. lint also prints messages about loops that cannot be entered at the top. Valid programs may have such loops, but they are considered to be bad style. If you do not want messages about unreached portions of the program, use the -b option.

lint has no way of detecting functions that are called and never return. Thus, a call to exit may cause unreachable code which lint does not detect. The most serious effects of this are in the determination of returned function values (see Section D.2.4, "Function Values"). If a particular place in the program is thought to be unreachable in a way that is not apparent to lint, the comment

    /* NOTREACHED */

can be added to the source code at the appropriate place. This comment will inform lint that a portion of the program cannot be reached, and lint will not print a message about the unreachable portion.

Programs generated by yacc and especially lex may have hundreds of unreachable break statements, but messages about them are of little importance. There is typically nothing the user can do about them, and the resulting messages would clutter up the lint output. The recommendation is to invoke lint with the -b option when dealing with such input.

D.2.4 Function Values

Sometimes functions return values that are never used. Sometimes programs incorrectly use function values that have never been returned. lint addresses this problem in a number of ways.
Locally, within a function definition, the appearance of both

    return ( expr );

and

    return ;

statements is cause for alarm; lint will give the message

    function name has return(e) and return

The most serious difficulty with this is detecting when a function return is implied by flow of control reaching the end of the function. This can be seen with a simple example:

    f ( a ) {
        if ( a ) return ( 3 );
        g ();
    }

Notice that, if a tests false, f will call g and then return with no defined return value; this will trigger a message from lint. If g, like exit, never returns, the message will still be produced when in fact nothing is wrong. A comment

    /*NOTREACHED*/

in the source code will cause the message to be suppressed. In practice, some potentially serious bugs have been discovered by this feature.

On a global scale, lint detects cases where a function returns a value that is sometimes or never used. When the value is never used, it may constitute an inefficiency in the function definition that can be overcome by specifying the function as being of type void, or by casting the result of an individual call to void. For example:

    (void) fprintf(stderr, "File busy. Try again later!\n");

When the value is sometimes unused, it may represent bad style (e.g., not testing for error conditions).

The opposite problem, using a function value when the function does not return one, is also detected. This is a serious problem.

D.2.5 Type Checking

lint enforces the type checking rules of C language more strictly than the compilers do. The additional checking is in four major areas:

    • across certain binary operators and implied assignments
    • at the structure selection operators
    • between the definition and uses of functions
    • in the use of enumerations

There are several operators which have an implied balancing between types of the operands. The assignment, conditional ( ?: ), and relational operators have this property. The argument of a return statement and expressions used in initialization undergo similar conversions.
In these operations, char, short, int, long, unsigned, float, and double types may be freely intermixed. The types of pointers must agree exactly, except that arrays of x's can, of course, be intermixed with pointers to x's.

The type checking rules also require that, in structure references, the left operand of the "->" be a pointer to structure, the left operand of the "." be a structure, and the right operand of these operators be a member of the structure implied by the left operand. Similar checking is done for references to unions.

Strict rules apply to function argument and return value matching. The types float and double may be freely matched, as may the types char, short, int, and unsigned. Also, pointers can be matched with the associated arrays. Aside from this, all actual arguments must agree in type with their declared counterparts.

With enumerations, checks are made that enumeration variables or members are not mixed with other types or other enumerations and that the only operations applied are =, initialization, ==, !=, and function arguments and return values.

If it is desired to turn off strict type checking for an expression, the comment

    /* NOSTRICT */

should be added to the source code immediately before the expression. This comment will prevent strict type checking for only the next line in the program.

D.2.6 Type Casts

The type cast feature in C language was introduced largely as an aid to producing more portable programs. Consider the assignment

    p = 1 ;

where p is a character pointer. lint will print a message as a result of detecting this. Consider the assignment

    p = (char *)1 ;

in which a cast has been used to convert the integer to a character pointer. The programmer obviously had a strong motivation for doing this and has clearly signaled these intentions. Nevertheless, lint will continue to print messages about this.
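
The difference between the two assignments in Section D.2.6 can be sketched in a small function. The name int_to_pointer and its long parameter are assumptions made for illustration; the result of converting an integer to a pointer depends on the implementation.

```c
/* Hypothetical helper: turn an integer address into a character pointer. */
char *int_to_pointer(long address)
{
    char *p;

    /* Writing  p = address;  here would mix an integer with a pointer,
       which is the assignment that draws a message. */
    p = (char *)address;    /* the cast states that the conversion is deliberate */
    return p;
}
```

Even with the cast in place, such a conversion deserves a second look whenever the program is moved to a machine with a different pointer size or representation.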
D.2.7 Nonportable Character Use

On some systems, characters are signed quantities with a range from -128 to 127. On other C language implementations, characters take on only positive values. Thus, lint will print messages about certain comparisons and assignments as being illegal or nonportable. For example, the fragment

    char c;
    if( (c = getchar()) < 0 ) ...

will work on one machine but will fail on machines where characters always take on positive values. The real solution is to declare c as an integer since getchar is actually returning integer values. In any case, lint will print the message

    nonportable character comparison

A similar issue arises with bit fields. When assignments of constant values are made to bit fields, the field may be too small to hold the value. This is especially true because on some machines bit fields are considered as signed quantities. While it may seem illogical to consider that a 2-bit field declared of type int cannot hold the value 3, the problem disappears if the bit field is declared to have type unsigned.

D.2.8 Assignments of longs to ints

Bugs may arise from the assignment of a long to an int, which will truncate the contents. This may happen in programs that have been incompletely converted to use typedefs. When a typedef variable is changed from int to long, the program can stop working because some intermediate results may be assigned to ints, which are truncated. The -a option can be used to suppress messages about the assignment of longs to ints.

D.2.9 Strange Constructions

Several perfectly legal, but somewhat strange, constructions are detected by lint. The messages encourage better code quality and clearer style, and may even point out bugs. The -h option is used to suppress these checks. For example, in the statement

    *p++ ;

the * does nothing. This provokes the message

    null effect

from lint. The following program fragment:

    unsigned x ;
    if( x < 0 ) ...

results in a test that will never succeed.
Similarly, the test

    if( x > 0 ) ...

is equivalent to

    if( x != 0 )

which may not be the intended action. lint will print the message

    degenerate unsigned comparison

in these cases. If a program contains something similar to

    if( 1 != 0 )

lint will print the message

    constant in conditional context

since the comparison of 1 with 0 gives a constant result.

Another construction detected by lint involves operator precedence. Bugs which arise from misunderstandings about the precedence of operators can be accentuated by spacing and formatting, making such bugs extremely hard to find. For example, the statements

    if( x&077 == 0 ) ...

and

    x<<2 + 40

probably do not do what was intended. The best solution is to parenthesize such expressions, and lint encourages this by an appropriate message.

D.2.10 Old Syntax

Several forms of older syntax are now illegal. These fall into two classes: assignment operators and initialization.

The older forms of assignment operators (e.g., =+, =-, ...) could cause ambiguous expressions, such as:

    a =-1 ;

which could be taken as either

    a =- 1 ;

or

    a = -1 ;

The situation is especially perplexing if this kind of ambiguity arises as the result of a macro substitution. The newer and preferred operators (e.g., +=, -=, ...) have no such ambiguities. To encourage the abandonment of the older forms, lint prints messages about these old-fashioned operators.

A similar issue arises with initialization. The older language allowed

    int x 1 ;

to initialize x to 1. This also caused syntactic difficulties. For example, the initialization

    int x ( -1 ) ;

looks somewhat like the beginning of a function definition:

    int x ( y ) { . . .

and the compiler must read past x in order to determine the correct meaning. Again, the problem is even more perplexing when the initializer involves a macro.
The current syntax places an equal sign between the variable and the initializer:

    int x = -1 ;

This is free of any possible syntactic ambiguity.

D.2.11 Pointer Alignment

Certain pointer assignments may be reasonable on some machines and illegal on others due entirely to alignment restrictions. lint tries to detect cases where pointers are assigned to other pointers and such alignment problems might arise. The message

    possible pointer alignment problem

results from this situation.

D.2.12 Multiple Subexpressions and Side Effects

In complicated expressions, the best order in which to evaluate subexpressions may be highly machine dependent. For example, on machines in which the stack runs backwards, function arguments will probably be best evaluated from right to left. On machines with a stack running forward, left to right seems most attractive. Function calls embedded as arguments of other functions may or may not be treated similarly to ordinary arguments. Similar issues arise with other operators that have side effects, such as the assignment operators and the increment and decrement operators.

In order that the efficiency of the C language on a particular machine not be unduly compromised, the C language leaves the order of evaluation of complicated expressions up to the local compiler. In fact, the various C compilers have considerable differences in the order in which they will evaluate complicated expressions. In particular, if any variable is changed by a side effect and also used elsewhere in the same expression, the result is explicitly undefined.

lint checks for the important special case where a simple scalar variable is affected. For example, the statement

    a[i] = b[i++];

will cause lint to print the message

    warning: i evaluation order undefined

in order to call attention to this condition.

Appendix E

Using std_$call

This chapter describes how to use std_$call to invoke
Pascal and FORTRAN routines from a C program. The std_$call convention is obsolete and will not be supported in future releases of the operating system. This documentation, therefore, is designed to enable you to maintain old programs. Do not use std_$call for new programs. Moreover, we strongly recommend that you remove std_$call from your existing programs as soon as possible. See Chapter 7 for information about current cross-language communication techniques.

E.1 Data Type Agreement of Arguments

When you call a C function in the absence of a prototype, the compiler automatically converts the data types of the arguments according to the rules shown in Table E-1. All of these conversions are suppressed, however, if you declare the function with a function prototype.

Table E-1. C Function Argument Conversions Without Prototype

    Data Type of Argument     Data Type Actually Passed
    char                      int
    short                     int
    unsigned char             unsigned int
    unsigned short            unsigned int
    float                     double

E.2 Data Types of Constant Arguments

If you pass a constant to a Pascal or FORTRAN routine, you must make sure that the constant is the same size as the parameter declared in the Pascal or FORTRAN routine. The sections below describe the default sizes for constant expressions. For the sake of clarity, we recommend that you explicitly cast all constant expressions used as arguments to a std_$call routine even when the casting is not necessary. This does not produce any extraneous machine code.

NOTE: When you pass a constant or constant expression, the value is stored in read-only memory. Therefore, you cannot attempt to change the parameter's value in the called routine. This applies to all FORTRAN parameters and any Pascal parameters declared with VAR, OUT, or IN OUT.

E.2.1 Integer Constants

Normally in C, all integral parameters are passed as 32-bit integers. For std_$call invocations, however, the default is 16 bits.
That's because most of the Domain system calling sequences require 16-bit integers rather than 32-bit integers. The only times a constant expression is passed as 32 bits are when the value is too large to fit in 16 bits (i.e., if it is less than -32768 or greater than +32767), when it is explicitly cast to int or long, or when it has an "L" or "l" suffix. For instance, the following examples illustrate how different integer constants are passed to a std_$call routine.

    Constant Expression     Type Passed
    100                     short
    100*1000                long
    -25                     short
    -25L                    long
    (long) 25               long

E.2.2 Floating-Point Constants

All floating-point constants are represented as double. Therefore, they will agree in size with Pascal's DOUBLE and FORTRAN's REAL*8 data types. To pass a 4-byte floating-point constant, you must cast it to float.

E.2.3 Character Constants

C treats character constants like integer constants. Since they are always within the range 0 through 128, they are passed as 16-bit values. To pass a character constant as a character, you must cast it to char.

E.2.4 String Constants

C passes string constants as arrays of type char.

E.3 Data Type Agreement of Function Declarations

Just as the parameters must agree in type, so must the function itself. For example, if a Pascal function returns an INTEGER16 value, you must declare it in your C program as a short function. That is, if the Pascal declaration is

    FUNCTION func1 (invar : DOUBLE) : INTEGER16;

then the C declaration should be:

    std_$call short func1();

All C declarations of Pascal procedures and FORTRAN subroutines should use the void type since these routines do not return a value. For instance, the Pascal procedure defined by

    PROCEDURE proc1(invar : DOUBLE);

should be declared as:

    std_$call void proc1();

E.3.1 Functions Returning Pointers

In most cases, C treats pointers and integers interchangeably. For std_$call functions returning pointers, however, they are not the same.
When Pascal returns the value of a function, it places it in one of two registers: a data register or an address register. Although C normally expects values to be returned in a data register, it conforms to the Pascal convention for std_$call functions. For instance, in the following example, C expects the value of pass_point() to be returned in an address register, and the value of pass_data() to be returned in a data register.

    main()
    {
        std_$call int *pass_point();
        std_$call int pass_data();

When you declare a std_$call function that returns a pointer, make sure that the routine is also declared in Pascal as a function returning a pointer. Otherwise, the returned value will be put in one register while the C program is looking for it in a different register.

FORTRAN has no syntax for declaring a function that returns a pointer. All FORTRAN functions return their values in a data register. In C, you should never declare a std_$call FORTRAN function that returns a pointer.

E.3.2 Using std_$call

Pascal and FORTRAN usually pass arguments by reference; C generally passes arguments by value. To simplify cross-language communication, the Domain system uses a standard calling convention. In C, you signify that you are using the standard calling convention by declaring external Pascal and FORTRAN routines with the keyword std_$call before invoking them. This keyword tells the compiler that the C program will pass arguments according to the Domain system's standard calling convention. The syntax for std_$call is:

    std_$call function-declaration

For instance, all of the following are legal uses of std_$call:

    std_$call void string_match();
    std_$call int sum();
    std_$call char *sp();

NOTE: Do not use the storage class extern in a std_$call declaration.

The std_$call declaration has the following effects on function calls:

• All arguments in the function call are passed by reference rather than by value.
Essentially, the compiler adds an address-of operator (&) to every argument in the function call.

• All normal C argument conversions are suppressed.

• Integral constant expressions are passed as 16-bit values if they are in the range -32768 through +32767; otherwise, they are passed as 32-bit values.

The standard calling convention has some important consequences that are discussed in detail in this chapter. There are three general caveats that deserve special attention:

• Passing Arrays: When a FORTRAN or Pascal routine expects an array as an argument, you must pass it an array reference: either an array name or a dereferenced pointer. If you pass a pointer without dereferencing it, you will pass the address of the pointer. See Sections 7.5.3 and 7.6.4 for more information.

• Passing Integer Constants: Normally all integer arguments in C are expanded to 32 bits before they are passed. For std_$call functions, however, constant integer expressions are passed as 16 bits if they can fit in 16 bits. If they cannot fit in 16 bits, they are passed as 32 bits.

• Passing Constants and Expressions: Do not pass a constant or an expression to a Pascal or FORTRAN routine that attempts to change the value of the incoming argument. If you do, you will get an error or unpredictable results. The one exception to this rule occurs when you declare a Pascal parameter without a declaration keyword. In this case, Pascal generates a local copy of the argument that can be changed.

The program below shows two ways to call a Pascal routine named pass_example(), first by declaring it with std_$call and then by declaring it as a normal external routine and explicitly compensating for the different calling conventions.

    static float x=1.0 , y=1.0;

    void pass1 ()
    {
        std_$call void pass_example();

        pass_example(x,y);    /* x and y are passed by reference */
                              /* because it is a std_$call function.
 */
    }

    void pass2 ()
    {
        extern void pass_example();

        pass_example(&x,&y);  /* x and y are explicitly passed by reference. */
    }

In addition to the pass-by-reference and pass-by-value conventions, there are also complications created by the different data types supported by C, Pascal, and FORTRAN. The following sections describe the intricacies of data type agreement.

E.4 Pascal Examples

The following examples show how to pass various objects of different types and sizes to Pascal routines. In our examples, we always cast constant arguments even if the cast is unnecessary.

In Pascal, there are five ways to declare a formal parameter: IN, OUT, IN OUT, VAR, or without a keyword. In all five cases, the parameters are passed by reference, but the Pascal keywords control what operations are legal within the Pascal routine and whether or not a local copy of the parameter is generated. The only time a local copy is generated is when the parameter is declared with no keyword. In this case, the Pascal routine can change the value of the argument without affecting the argument in the calling C program. Whenever one of the parameter keywords is used, however, the Pascal argument and the corresponding C argument have the same address.

E.4.1 Passing Integers

Suppose you need to send a 16-bit integer and a 32-bit integer to a Pascal routine that returns a 32-bit integer. In our example, the Pascal function squares the second argument, multiplies the result by the first argument, and returns the product.

    MODULE pass_int_p;

    FUNCTION pass_int(invar1 : INTEGER16;
                      invar2 : INTEGER32) : INTEGER32;
    BEGIN
        invar2 := sqr(invar2);
        pass_int := invar1*invar2;
    END;

The C program below shows a variety of ways to call this routine. Note especially that constants that can fit in 16 bits are passed as shorts unless you cast them. Non-constant expressions are passed as ints unless you cast them to short.
    main ()
    {
        std_$call int pass_int();
        short arg1;               /* arg1 is 16 bits */
        int arg2,answer;          /* arg2 and answer are 32 bits */

        /* Initialize variables. */
        arg1=10;
        arg2=20;

        /* First we call pass_int with the correct-size arguments.
         * No casting is necessary. */
        answer = pass_int(arg1, arg2);
        printf("%d\t%d\t%d\n",arg1,arg2,answer);

        /* If we want to send arg2 as the first argument and arg1 as
         * the second, both arguments must be cast. */
        answer = pass_int((short) arg2, (long) arg1);
        printf("%d\t%d\t%d\n",arg2,arg1,answer);

        /* Any integer expression containing a variable is converted to
         * long int. */
        answer = pass_int((short) (arg1+arg2), (arg2-arg1));
        printf("%d\t%d\t%d\n",(arg1+arg2),(arg2-arg1),answer);

        /* By default, integral constant expressions are passed as
         * shorts. Both constants are short because they are in the
         * range: -32768 to 32767. */
        answer = pass_int((short) 10, (long) (20*3));
        printf("%d\t%d\t%d\n",10,60,answer);

        /* Append L to constant to make it long. */
        answer = pass_int((short) 5, 3L);
        printf("%d\t%d\t%d\n",5,3,answer);

        /* Chars may be sent as integer values, but they are 16 bits. */
        answer = pass_int((short) 'A', (long) 'B');
        printf("%d\t%d\t%d\n",'A','B',answer);
    }

If we execute the preceding program, the output is:

    10      20      4000
    20      10      2000
    30      10      3000
    10      60      36000
    5       3       45
    65      66      283140

E.4.2 Passing Floating-Point Numbers

The rules for passing floating-point numbers are similar to the rules for passing integers. All arguments must agree in type, either by declaration or by casting. By default, all floating-point expressions are passed as double values unless explicitly cast to float.

The example below shows a Pascal procedure that accepts two floating-point numbers; the first is four bytes and the second is eight bytes. The procedure sets the second argument to the square root of the first argument.
Note that this is not a function: the value is returned through the second argument, which is declared with VAR.

    MODULE pass_float_p;

    PROCEDURE pass_float(single_var : SINGLE;
                         VAR double_var : DOUBLE);
    BEGIN
        double_var := sqrt(single_var);
    END;

The C program below shows various ways to call pass_float(). Because double_var is declared with VAR and has its value changed in the Pascal routine, it is illegal to pass it a constant or an expression.

    #module pass_float_c

    main()
    {
        std_$call void pass_float();
        float arg1=0.0;
        double arg2=0.0;

        /* First we pass variables of the correct size. No casting is
         * needed. */
        printf("number\t\t\tsquare root\n\n");
        while (arg1<=2.0) {
            pass_float(arg1,arg2);
            printf("%f\t\t%f\n",arg1,arg2);
            arg1 += .25;
        }

        /* Expressions that contain at least one floating-point
         * variable are converted to double before being passed. */
        pass_float((float) (arg1+3),arg2);
        printf("%f\t\t%f\n",arg1+3,arg2);

        /* Floating-point constants must be cast or they will be passed
         * as doubles. */
        pass_float((float) 2e+3, arg2);
        printf("%f\t\t%f\n",(float) 2e+3,arg2);
    }

The output is:

    number              square root

    0.000000            0.000000
    0.250000            0.500000
    0.500000            0.707107
    0.750000            0.866025
    1.000000            1.000000
    1.250000            1.118034
    1.500000            1.224745
    1.750000            1.322876
    2.000000            1.414214
    5.250000            2.291288
    2000.000000         44.721359

E.4.3 Passing Character Data

When a Pascal routine expects a Pascal CHAR type, make sure that the argument you supply is passed as a reference to eight bits, not as a reference to 16 or 32 bits. You can do this either by passing a C char variable or by casting the argument to char. Be especially careful of char constants because, for std_$call functions, all integral constant arguments, including character constants, are passed as 2-byte integers if possible.

Consider the following Pascal case-inversion routine that accepts a character as an argument and returns the same character with the opposite case.
MODULE pass_char-p; FUNCTION upper lower (in_char : CHAR) : CHAR; BEGIN If «ord(in char) < 65) OR (ord(in char) > 122) OR «ord(in char) >= 91) AND (ord(in char) <= 96») THEN-upper lower := in char ELSE IF (ord(in char) <= 97) THEN upper lower := chr(ord(in char) + 32) ELSE upper_lower := chr(ord(in_char) -32); END; The C program below calls upper_IowerO in a variety of ways. Using sld_$call E-9 #module pass char c maine) -{ std_$call char upper lower(); char out char, result; short long_char, long_result; /* 8-bit variables */ /* 16-bit variables */ out char = 'A" long_char = 'B;; printf("Original Char\t\tCase-Inverted\n\n ll ) ; /* We do not have to cast out_char because it is one byte. */ result = upper lower(out char); printf("\t%c\t\t\t\t\t%c\n", out_char,result); /* We must cast long_char because it is two bytes. */ long result = upper lower«char) long char) printf ("\ t%c\ t \t \ t\ 't\ t%c\n ll , (char) long_char, long_result) ; /* This is the right way to pass a character constant. */ result = upper lower«char) 'c'); printf("\t%c\t\t\t\t\t%c\n", 'c' ,result); /* We can send integers if they can be represented in one byte. */ result = upper lower«char) 81); printf("\t%c\t\t\t\t\t%c\n", (char)81,result); /* IF WE PASS A CONSTANT WITHOUT CASTING IT, IT WON'T WORK. */ result = upper lower('c'); printf("\t%c\t\t\t\t\t%c\n", 'c', result); } The result of this program execution is: Original Char Case-Inverted a A B b c C Q q c Note that when we try to pass the constant 'c' without casting it, the result is an unprintable character. E.4.4 Passing Character Arrays Suppose you are calling a Pascal procedure that expects an array of char and the length of the string in the array. The Pascal program in our example takes two arguments: a string and the length of the string. It reverses the string and returns a pointer to the reversed string. E-IO Using std.;...$call MODULE pass_string-p; TYPE GENERIC STRING = ARRAY [1 .. 
256] OF CHAR; STRING~OINT = AGENERIC_STRING; FUNCTION reverse string (IN str UNIV GENERIC_STRING; IN len : INTEGERI6) : STRING_POINT; VAR length: INTEGERI6; temp : CHAR; .temp str : STATIC GENERIC STRING; BEGIN length := len; WHILE length > len/2 DO BEGIN temp := str[len-Iength+l] ; temp str[len-length+l] := str[length] ; temp-str[length] := temp; length := length-I; END; temp str[len+1] := CHR(O); reverse string := addr(temp str); END; The standard call declaration and some invocations appear below. Note that when Pascal expects an array argument, you must pass it either the name of the array or a dereferenced pointer to an array. For std_Scall invocations, an array name and a pointer are not the same. Using std_$call E-ll #module pass string c maine) { std $call char *reverse string(); char *sp="This is an example"; short len=O; /* C's short is equivalent to Pascal's *INTEGER16 */ /* A "real" array of char!! */ static char an_array[128]="This is the second example"; len = strlen(sp); /* * * * /* strlen() returns a 32-bit length which * is then converted to a short into */ Notice that we must DEREFERENCE the pointer "sp" , to make a true 'array-type' expression. Don't give "sp" by itself as an argument, as you would in normal C; you'll only send the ADDRESS of "sp"! */ printf("%s\n",sp); sp = reverse string(*sp,len); printf("%s\n\n",sp) ; /* reverse string(sp, len); WRONG! ! ! ! ! ! ! ! !T! ! ! ! ! ! ! ! ! ! ! !! */ /* You could return the value from "strlen" directly, but then * you must cast it to short since the value returned is an * into This next call returns the string back to the original. */ sp reverse string(*sp, (short)strlen(sp»; printf("%s\n\n",sp) ; /* A real array of char is passed as an array reference * since that is what the Pascal procedure actually expects. 
*/ printf ("%s\n", an_array) ; sp = reverse string(an array, (short)strlen(an_array»; printf("%s\n",sp); } The output is: This is an example elpmaxe na si sihT This is an example This is the second example elpmaxe dnoces eht si sihT E.4.S Passing Pointers Passing pointers between C and Pascal programs is fairly straightforward. In both cases, pointers .are 4-byte entities. The example below shows a simple linked-list application. The E-12 Using std_$call C program creates the first element of the list and then calls the Pascal routine appendO to add new elements to the list. The function printlistO is a C routine that prints the entire list. In addition to illustrating how to pass pointers, this example also shows the correspondence of Pascal records to C structures. The Pascal program is: MODULE pointer_example; TYPE link = Alist; list = RECORD nex : link; data : char; END; PROCEDURE append (firstrec val link; CHAR) ; VAR newdata link; BEGIN new(newdata) ; {allocate memory for new element.} WHILE firstrecA.nex <> NIL DO firstrec := firstrecA.nex; firstrecA.nex := newdata; newdataA.data := val; newdataA.nex := NIL; END; The C program is shown below. Note that C's NULL pointer (defined in static struct list { struct list *next; char data; }; main() { std ScalI void append(); extern void printlist(); struct list first,*base; char ch='z'; first.data 'a'; first.next = NULL; base = &first; /* assign value to first element of * linked list */ /* The first element is also the last * so set pointer to NULL */ /* base points to the beginning of the * list */ append(base,(char)'b'); append(base,ch); /* Must cast a char constant. */ printlist (base) ; } /* printlist() prints the data in each member of the list. 
 */
    void printlist(base)
    struct list *base;
    {
        do {
            printf("%c\n", base->data);
            base = base->next;
        } while (base != NULL);
    }

After compiling and binding these routines, the output is:

    a
    b
    z

E.4.6 Simulating the BOOLEAN Type

The Pascal BOOLEAN type is an 8-bit entity that evaluates to TRUE when its numeric value is -1 and to FALSE when its numeric value is 0. The BOOLEAN type can be simulated in C with the char data type.

Suppose that you want to call the Pascal routine shown below. This routine takes a BOOLEAN argument and returns a BOOLEAN result (the opposite of the argument).

    MODULE pass_boolean_p;

    FUNCTION bool(bool_arg : BOOLEAN) : BOOLEAN;
    BEGIN
        writeln('Pascal value of argument:',bool_arg);
        bool_arg := NOT bool_arg;
        writeln('Pascal value returned: ',bool_arg);
        bool := bool_arg;
    END;

The C program below shows several ways to invoke bool().

    #module pass_boolean_c
    #define TRUE ((char)-1)     /* Cast to char and set all bits to 1. */
    #define FALSE ((char)0)     /* Cast to char and set all bits to 0. */

    main()
    {
        std_$call char bool();
        int x;

        printf("Numeric value of argument: %d\n",TRUE);
        x = (bool(TRUE));
        printf("Numeric value returned: %d\n\n",x);
        printf("Numeric value of argument: %d\n",FALSE);
        x = (bool(FALSE));
        printf("Numeric value returned: %d\n\n",x);
    }

The output after compiling, binding and executing is:

    Numeric value of argument: -1
    Pascal value of argument: TRUE
    Pascal value returned: FALSE
    Numeric value returned: 0

    Numeric value of argument: 0
    Pascal value of argument: FALSE
    Pascal value returned: TRUE
    Numeric value returned: -1

E.5 FORTRAN Examples

The following examples show how to pass various objects of different types and sizes to FORTRAN routines. Remember that FORTRAN does not make local copies of parameters. Therefore, if you change the value of a parameter in a FORTRAN routine, the corresponding argument in the C program is also changed.
Do not pass constants or expressions as arguments if the FORTRAN routine attempts to change the argument value. Using std_$call E-15 There are a two restrictions concerning the types of data that you can pass to, or return from, a FORTRAN routine: • You cannot pass an assumed-size array from C to FORTRAN. In other words, the called FORTRAN routine cannot declare an array parameter with an asterisk, as in: SUBROUTINE assumed_size Car) INTEGER·4 arC·) • You cannot return a character array of any size, including 1, from a FORTRAN function. For instance, a FORTRAN function declared as CHARACTER FUNCTION char_ func () cannot be called from a C program. E.S.l Passing Integers Suppose you need to send a 16-bit integer and a 32-bit integer to a FORTRAN routine that returns a 32-bit integer. In our example, the FORTRAN function returns the sum of the two arguments squared. INTEGER*4 FUNCTION PASS_INT(invarl,invar2) INTEGER*2 invarl INTEGER*4 invar2 PASS_INT = (invarl*invarl) + (invar2*invar2) END The C program below shows a variety of ways to call this routine. Note especially that constants that can fit in 16 bits are passed as shorts unless you cast them. Non-constant expressions are passed as ints unless you cast them to short. E-16 Using std_$call #module pass int cf main() -{ std $call int pass int(); short argl; int arg2,answer; /* argl is 16 bits */ /* arg2 and answer are 32 bits */ /* Initialize variables. */ argl=10; arg2=20; /* First we call pass int with the correct-size arguments. * No casting is necessary. */ answer = pass int( argl, arg2); printf ("%d\ t%d\ t%d\n", argl, arg2, answer) ; /* If we want to send arg2 as the first argument and argl as * the second, both arguments must be cast. 
*/ answer = pass int «short) arg2 , (long) argl); printf ("%d\ t%d\ t%d\n" ,arg2, argl, answer) ; /* Any expression that contains a variable is converted to an * into */ answer = pass int «short) (argl+5), (arg2+arg1»; printf("%d\t%d\t%d\n", (argl+5), (argl+arg2) ,answer); /* By default, integral constant expressions are passed as * shorts. */ /* Both constants are short because they are in the range: * -32767 - 32767. */ answer = pass int( (short) 10 , (long) (20*3); printf ("%d\ t%d\ t%d\n" ,10,60, answer) ; /* Append L to constant to make it long. */ answer = pass int( (short) 5 , 3L); printf ("%d\ t%d\ t%d\n", 5,3, answer) ; /* Chars may be sent as integer values, but they are 16 bits. */ answer = pass int( (short) 'A' , (long) 'B'); printf ("%d\ t%d\ t%d\n", ' A' , 'B' ,answer) ; } The output is: 10 20 15 10 10 30 60 500 500 1125 3700 5 3 34 65 66 8581 20 E.S.2 Passing Floating-Point Numbers The rules for passing floating-point numbers are similar to the rules for passing integers. All arguments must agree in type, either by declaration or by casting. By default, all float- Using std_$call E-17 ing-point constants and expressions are passed as double values unless explicitly cast to float. The example below shows a FORTRAN subroutine that accepts the values of the two sides of a right-angle triangle and returns the length of the hypotenuse. The first parameter is four bytes and the second is eight bytes. The result is eight bytes. REAL*8 FUNCTION hypot(side1,side2) REAL*4 side1 REAL*8 side2 hypot = SQRT«sidel*sidel) + (side2*side2» END The C program below shows various ways to call hypotO. #module pass float cf maine) { std $call double hypot(); float argl=3.0; double arg2=4.0,result; printf("side1\t\t\side2\t\thypotenuse\n\n"); /* First we call it with the correct data types. */ result = hypot(argl, arg2); printf("%f\t%f\t%f\n",argl,arg2,result) ; /* If we reverse the order of the arguments, we must cast both. 
*/ result = hypot( (float) arg2, (double) argl); printf("%f\t%f\t%f\n",arg2,arg1,result); /* Any expression that contains a floating-point variable is * converted to double. */ result = hypot «float) (arg1+arg2), (arg2+2»; printf("%f\t%f\t%f\n", (arg1+arg2), (arg2+2),result); /* When we pass constant expressions, the float argument must * be cast. */ result = hypot( (float)7.5 , (double) 3.2); printf("%f\t%f\t%f\n",7.5,3.2,result); } The output is: E-18 Using std_$call side1 side2 hypotenuse 3.000000 4.000000 7.000000 7.500000 4.000000 3.000000 6.000000 3.200000 5.000000 5.000000 9.219544 8.154140 E.S.3 Passing Character Data When a FORTRAN routine expects a FORTRAN CHARACTER type, make sure that the argument you supply is passed as a reference to eight bits, not as a reference to 16 or 32 bits. You can do this either by passing a char variable or by casting the argument to char. Be especially careful of character constants, because for std_$call functions all integral constant arguments, including character constants, are passed as 2-byte integers if possible. Note that you cannot return a character from a FORTRAN function. To return a character variable, create a subroutine and return the character value in a parameter. Consider the following FORTRAN case-inversion routine that takes two character arguments. The routine inverts the case of the first argument and returns the result through the second argument. The FORTRAN routine is: SUBROUTINE UPPER LOWER(in char,inverted) CHARACTER in_char,inverted IF (ICHAR(in char) .LE. 97) THEN inverted- CHAR(ICHAR(in char) + 32) ELSE inverted = CHAR(ICHAR(in_char) - 32) END IF END The following C program calls upper_lowerO in a variety of ways. 
Using std_$call E-19 #module pass charf c main() { std $call void upper lower(); char out char,result; short long_char; /* 8-bit variables */ /* I6-bit variable */ out char = 'A'; long char = 'b'; printf("Original Char\t\tCase-Inverted\n\n"); /* We do not have to cast out char because it is 8 bits. */ upper lower(out char,result); printf("\t%c\t\t\t\t\t%c\n", out_char,result); /* The short int argument must be cast to char. */ upper lower«char) long char,result); printf ("\ t%c\ t \t \ t\ t \ t%c\n", 'b', result); /* This is the right way to pass a character constant. */ upper lower«char) 'c',result); printf("\t%c\t\t\t\t\t%c\n", 'c' ,result); /* You can send integers if they can be represented in 8 bits. */ upper lower«char) 8I,result); printf("\t%c\t\t\t\t\t%c\n", (char)8I,result); /* THIS DOESN'T WORK BECAUSE THE CONSTANT IS NOT CAST. */ upper lower('c',result); printf("\t%c\t\t\t\t\t%c\n", 'c', result); The result of program execution is: Original Char Case-Inverted a A b B c C Q q c Note that when we try to pass the constant Ie' without casting it, the result is an unprintable character. E-20 Using std_$call E.S.4 Passing Arrays There are two points to remember when passing arrays from C to FORTRAN: • FORTRAN and C access multidimensional arrays in a different order. In C, the rightmost subscript varies fastest while in FORTRAN the leftmost subscript varies fastest. • When FORTRAN expects an array argument, you must pass it either the name of the array or a dereferenced pointer to an array. For std_$call invocations, an array name and a pointer are not the same. The following example illustrates how to pass a character array from C to FORTRAN. Note that you can declare the array in FORTRAN as a character string or as an array of type CHARACTER. The two FORTRAN routines shown here return the last character of a string and the next-to-Iast character, respectively. C Pass a string and get the last char. 
SUBROUTINE pass char array(ca, clen, outchar) CHARACTER ca(256) INTEGER*2 clen CHARACTER outchar C Test for null string. IF (clen .LT. 1) THEN out char RETURN ENDIF outchar RETURN END = ca(clen) C Pass a string and get the next-to-last char. SUBROUTINE pass char string(ca, clen, outchar) CHARACTER*256 ca INTEGER*2 clen CHARACTER out char C Test for null string. IF (clen .LT. 1) THEN outchar return ENDIF out char RETURN END = ca(clen~1:clen-1) The following C program calls these FORTRAN routines. Using std_$call E-21 #module pass_char_array mainO { std $call void pass char string(); std-$call void pass-char-array(); char result, *sl "This-is the first string"; static char s2[] = "This is the second string"; short length; /* First we pass a dereferenced pointer. */ length = strlen(sl); pass char string(*sl,length,result); printf("The second to last character is %c\n",result); /* Then we pass an array. */ pass char array(s2, «short)strlen(s2»,result); printf("The last character is %c\n",result); } The result is: The second to last character is n The last character is g E.S.4.1 Passing Adjustable Arrays The following example illustrates how to pass an adjustable array from C to FORTRAN. The C program passes two arguments: an array of integers and the size of the array. The FORTRAN routine uses the second argument to declare the size of the array. The routine then returns the average value of the array elements. C Pass an array of long int and return the average. 
          INTEGER*4 FUNCTION pass_int_array(larray, array_len)
          INTEGER*4 array_len
          INTEGER*4 larray(array_len)
          INTEGER*4 i, tot

          tot = 0
          DO i = 1,array_len
              tot = tot + larray(i)
              print *,'larray(',i,') = ',larray(i)
          END DO
          pass_int_array = tot / array_len
          RETURN
          END

The C program is:

    #module pass_int_array

    main()
    {
        std_$call int pass_int_array();
        static int average, array_size,
                   pass_array[]={325,478,982,331,21,56,79};

        array_size=sizeof(pass_array)/4;
        average = pass_int_array(pass_array,array_size);
        printf("The average is: %d\n",average);
    }

The result is:

    larray( 1) = 325
    larray( 2) = 478
    larray( 3) = 982
    larray( 4) = 331
    larray( 5) = 21
    larray( 6) = 56
    larray( 7) = 79
    The average is: 324

E.5.4.2 Passing Multidimensional Arrays

When you pass a multidimensional array, it is important to remember that in C the rightmost subscript varies fastest while in FORTRAN the leftmost subscript varies fastest. The example below shows the consequences of this difference. The FORTRAN routine is:

          SUBROUTINE dyn_dim(arr, x, y)
          INTEGER*4 x, y
          INTEGER*4 arr(x, y)
          INTEGER*2 i, j

          WRITE(*,*)
          WRITE(*,*) 'This is the FORTRAN array:'
          DO i = 1, x
              DO j = 1, y
                  WRITE(*,*) 'arr(', i, ',', j, ') = ', arr(i, j)
              END DO
          END DO
          END

The C program is:

    #module multi_dim_array

    main()
    {
        std_$call void dyn_dim();
        static int arr[2][3]={1,2,3,4,5,6};
        short i,j;

        i=0; j=0;
        printf("This is the C array:\n");
        while (i<=1) {
            while (j<=2) {
                printf("arr(%d,%d) = %d\n",i,j,arr[i][j]);
                j++;
            }
            i++;
            j = 0;
        }
        dyn_dim(arr, (long)2, (long)3);
    }

The result is:

    This is the C array:
    arr(0,0) = 1
    arr(0,1) = 2
    arr(0,2) = 3
    arr(1,0) = 4
    arr(1,1) = 5
    arr(1,2) = 6

    This is the FORTRAN array:
    arr( 1, 1) = 1
    arr( 1, 2) = 3
    arr( 1, 3) = 5
    arr( 2, 1) = 2
    arr( 2, 2) = 4
    arr( 2, 3) = 6

E.5.5 Passing Pointers

As an extension to the ANSI standard, Domain FORTRAN enables a FORTRAN routine to dereference pointers passed from C or Pascal programs. For complete details, consult the Domain FORTRAN User's Guide.
In the following example, the C program passes the FORTRAN subroutine a pointer to a structure that contains four short integers. By using the POINTER statement, the FORTRAN subroutine is able to modify the structure elements.

The FORTRAN subroutine is:

      SUBROUTINE pass_point(p1)
      INTEGER*4 p1
      INTEGER*2 a, b, c, d
      POINTER /p1/ a, b, c, d
      a = a + 1
      b = 2**a
      c = 3**a
      d = 4**a
      END

The C program is:

#module pass_point_c
struct S {
    short s1, s2, s3, s4;
} struct_pass = {1, 1, 1, 1};

main()
{
    std_$call void pass_point();
    struct S *p;

    p = &struct_pass;
    pass_point(p);
    printf("%d\n%d\n%d\n%d\n", struct_pass.s1, struct_pass.s2,
           struct_pass.s3, struct_pass.s4);
}

The result is:

2
4
9
16

E.5.6 Simulating the LOGICAL Types

The FORTRAN LOGICAL type is a 4-byte entity that evaluates to TRUE when its numeric value is -1 and to FALSE when its numeric value is 0. Although FORTRAN allocates four bytes, it uses only one of them (the high byte). Therefore, in C you can simulate the logical type with either a char type or an int type. For the best results, we recommend the following:

• Declare in arguments (those that are not changed in the FORTRAN code) as chars.
• Declare out arguments (those that are changed in the FORTRAN routine) as a union of a char and an int.
• Declare FORTRAN functions that return a LOGICAL value as type char.

The following example shows all three cases. The FORTRAN function accepts an in argument and an out argument and returns a LOGICAL value.

      LOGICAL FUNCTION pass_logical(in_arg, out_arg)
      LOGICAL in_arg, out_arg
      PRINT *, 'FORTRAN value of in_arg:', in_arg
      PRINT *, 'FORTRAN value of out_arg:', out_arg
      out_arg = .NOT. out_arg
      pass_logical = in_arg .EQV. out_arg
      PRINT *, 'FORTRAN value returned:', pass_logical
      END

The C program below shows how to invoke pass_logical.

#module pass_logical_c
#define TRUE ((char)-1)    /* Cast to char and set all bits to 1. */
#define FALSE ((char)0)    /* Cast to char and set all bits to 0. */
main()
{
    std_$call char pass_logical();
    char arg1, result;
    union {
        char log;
        int filler;
    } arg2;

    arg1 = TRUE;
    arg2.log = TRUE;
    printf("C numeric value of arg1: %d\n", arg1);
    printf("C numeric value of arg2: %d\n", arg2.log);
    result = pass_logical(arg1, arg2);
    printf("C numeric value of arg2 after function call: %d\n", arg2.log);
    printf("C numeric value returned: %d\n\n", result);

    printf("C numeric value of arg1: %d\n", arg1);
    printf("C numeric value of arg2: %d\n", arg2.log);
    result = pass_logical(arg1, arg2);
    printf("C numeric value of arg2 after function call: %d\n", arg2.log);
    printf("C numeric value returned: %d\n\n", result);
}

The output after compiling, binding, and executing is:

C numeric value of arg1: -1
C numeric value of arg2: -1
FORTRAN value of in_arg: T
FORTRAN value of out_arg: T
FORTRAN value returned: F
C numeric value of arg2 after function call: 0
C numeric value returned: 0

C numeric value of arg1: -1
C numeric value of arg2: 0
FORTRAN value of in_arg: T
FORTRAN value of out_arg: F
FORTRAN value returned: T
C numeric value of arg2 after function call: -1
C numeric value returned: -1

E.5.7 Simulating the COMPLEX Type

The FORTRAN COMPLEX data type is stored as two 4-byte floating-point numbers, the first representing the real part and the second representing the imaginary part of a complex value. It is easy to simulate in C via a structure containing two floating-point members. In the following example, the FORTRAN function accepts a COMPLEX argument and returns the square of the argument.

      COMPLEX FUNCTION pass_complex(com_param)
      COMPLEX com_param
      pass_complex = com_param * com_param
      END

The C program is:

#module pass_complex_c
main()
{
    struct complex {
        float real;
        float imag;
    };
    std_$call struct complex pass_complex();
    static struct complex result, arg = {2.5, 3.5};

    printf("Complex Number\t\t\tSquare of Number\n\n");
    result = pass_complex(arg);
    printf("(%f,%f)\t\t(%f,%f)\n", arg.real, arg.imag,
           result.real, result.imag);
}

The result is:

Complex Number                  Square of Number

(2.500000,3.500000)             (-6.000000,17.500000)
2-13 " bitwise exclusive OR operator, 4-42, 4-45 & I, bitwise inclusive OR operator, II, logical OR operator, 4-115 address-of operator, 4-122, 5-22 declaring reference variables, 3-63 illegal with register variables, 3-56 bitwise AND operator, 4-42, 4-44 &&, logical AND operator, 4-115 #, preprocessor directive symbol, 4-15 4-42, 4-45 =, assignment operator, 4-34 to 4-40 confused with equal to operator (==). 4-132 erroneous use in macro definitions, 4-69 ==, equality operator, 4-132 confused with assignment operator (=), 4-132 Index 1 <, less than operator, 4-132 <=, less than or equal to operator, 4-132 «, shift left operator, 4-42 <>, #include directive, 4-104 >, greater than operator, 4-132 >=, greater than or equal to operator, 4-132 », shift right operator, 4-42 \ continuation character, 2-3 in strings, 2-10 in pathnames, 4-104 absO function, 4-21 absolute code, 6-39 -ac compiler option, 6-20 absolute pathnames, 4-104 abstract declarators, 3-41 -ac compiler option, 6-20 access modes, fopenO, 8-13 accuracy double type, 3-12 float type, 3-11 actual arguments, 5-8 \", double quote escape code, 2-8 addition operator (+), 4-19 \', single quote escape code, 2-8 address attribute specifier, 3-63, 3-68 \0, null character, 2-9, 3-40 in strings, 2-10 address-of operator (&), 4-122, 5-22 declaring reference variables, 3-63 illegal with register variables, 3-56 \b, backspace escape code, 2-8 \f, formfeed escape code, 2-8 \n, newline escape code, 2-8 \r carriage return escape code, 2-8 addresses assigning to pointer variables, 4-122 binding variables to, address attribute specifier, 3-68 \t, horizontal tab escape code, 2-8 adjustable arrays, passing from C to FORTRAN 7-19 ' \v, vertical tab escape code, 2-8 Aegis, executing programs in, 6-48 aggregate types, 3-2, 3-4 to 3-5 bitwise complement operator, 4-45 bitwise negation operator. 
See complement operator _, underscore appeJ;lded to FORTRAN routine names, 7-14 used in identifiers, 2-4 Numbers OX, prefix for hexadecimal constants, 2-6 Ox, prefix for hexadecimal constants, 2-6 aliases, 3-62 -align compiler option, 6-20 alignment bit field, 3-31 char type, 3-25 of object file sections, 6-20 pointer, C-8, D-10 structure, 3-24 to 3-25 allusions, 3-57, 5-1, 5-5 and initialization, 9-14 function, 3-61, 5-5 to 5-6 -alnchk compiler option, 6-20 A -a compiler option, 6-6 alphabetic letters, used in identifiers, 2-4 AND bitwise operator (&), 4-42, 4-44 logical operator (&&) , 4-115 aOJeturn, #options specifier, 5-19 ANSI standard list of features supported by Domain C, B-1 _STDC_ predefined name, 4-145 abnormal, #options specifier, 5-19 any systype, 6-41 a.out file, 1-6, 6-3 2 Index apollo_$std.h header file, 4-103 ar utility, 6-44 archiving, 6-44 argc, argument to mainO, 5-25 /* ARGSUSED */, lint comment, C-ll, D-2 arguments actual, 5-8 automatic conversions of, 5-3, 5-9 suppressing, 5-12, 7-2 to 7-3 table of, 7-3 command line, 5-25 declaring, 3-46, 5-3 to 5-4 default type of, 5-3 formal, 5-8 multidimensional arrays, 4-29 to 4-33 pass by reference, 4-148, 5-7, 5-11 to 5-14, 7-5 pass by value, 4-148, 5-7, 5-7 to 5-10 passing arrays, vs. passing structures, 4-150 passing arrays as, 4-26 to 4-82, 5-3, 5-10 passing conventions in C, FORTRAN, and Pascal, 7-5 to 7-7 passing functions as, 5-3 passing pointers to functions as, 5-24 to 5-25 passing structures, vs. 
passing arrays, 4-150 passing structures as, 4-148 to 4-149, 5-10 passing unions as, 5 -1 0 to macros binding of, 4-67 no type checking for, 4-67 to 4-82 side effects in, 4-71 type checking of, 5-12 variable number of, 5-15 argv, argument to mainO, 5-25 arithmetic, pointer, 4-124 to 4-165 arithmetic operators, 4-6, 4-19 to 4-21 arithmetic type conversions, 4-12 arithmetic types, 3-1 table, 3-3 array elements accessing through pointers, 4-24 to 4-82 assigning values to, 4-22 indexing, 4-22 with enums, 4-23 array names, 3-35 interpretation of, 4-24 naked, 4-25 arrays, 3-4, 3-35 to 3-41 adjustable, passing from C to FORTRAN, 7-19 and typedefs, 2-16 as function parameters, 3-36 base address of, 4-25 bounds checking, 4-23 char, 3-40 to 3-41 See also strings declaring, 3-35 example, 4-32 finding number of elements in, 4-27 finding the size of, 4-26 functions returning (illegal), 3-44 indexing with enums, 4-23 initializing, 3-36 interpreted as pointers, 4-24 memory allocation of, 4-23 multidimensional, 4-28 to 4-33 See also multidimensional arrays passing as arguments, 4-29 to 4-33 of char, passing from C to Pascal, 7-9 to 7-10 of functions (illegal), 3-44 of pointers, initializing, 3-36 of structures, 3-28 initializing, 3-39 operations on, 4-22 to 4-33 See also array elements passing as arguments, 4-26 to 4-82, 5-3, 5-10 passing as function arguments, 5-3 vs. 
passing structures, 4-150 passing from C to FORTRAN, 7-17 to 7-22 returning from functions, 4-28 to 4-82 size, 3-4, 3-35, 3-39 omitting, 3-36 size of index value, 6-29 storage of, 3-39 zero-sized, 9-4 ASCII codes character constants, 2-8 table of, A-I to A-3 ASCII files, 8-3 assembly language code, 6-27 declaring, 5-19 assignment conversions, 4-12, 4-36 to 4-82 assignment operator (=), 4-34 to 4-40 and structures and unions, 4-147 confused with equal to operator (==) , 4-132 erroneous use in macro definitions, 4-69 assignment operators, 4-8, 4-34 to 4-40 old-style, 4-36 Index 3 associativity of operators, 4-9 table of, 4-11 atanO function, built-in version of, 6-47 atan20 function, built-in version of, 6-47 atofO function, 5-26 atoiO function, 5-26 #attribute modifier, 3-63 to 3-70 and pointers, 3-64 inheritance of, 3-64 attribute specifiers address, 3-63, 3-68 device, 3-63, 3-66 to 3-68 section, 3-63, 3-69 volatile, 3-63, 3-64 to 3-66 auto storage class specifier, 3-55 automatic duration, 3-52 and initialization, 3-4, 3-54 automatic type conversions, 4-12 of operators. See associativity bit fields, 3-31 to 3-32 declaring, 3-31 illegal operations, 3-31 length, 3-31 order of assignment, 3-31 sign, 3-31 syntax for declaring, 3-31 unnamed, 3-31 bit operators, 4-7, 4-41, 4-42 to 4-45 bitwise AND operator (&), 4-42 bitwise exclusive OR operator n, 4-42 bitwise inclusive OR operator (i), 4-42 bitwise logical operators, 4-43 bitwise negation operator (-). See complement operator bitwise shift operators, sign preservation, 4-43 block. 
See compound statement block buffering, 8-5 block 110, 8-19 to 8-27 B - B compiler option, 6-6 -b compiler option, 6-20 to 6-21 backspace escape code (\b) , 2-8 backward references, to functions, 5-6 block scope, 3-48, 3-50 to 3-51 blocks, 8-5 begin symbol ({), 2-13 end symbol 0), 2-13 kernel-level, 8-5 user-level, 8-5 backwards compatibility, for function declarations, 5-16 body function, 5-4 to 5-5 macro, 4-64 base address, of arrays, 4-25 Boole, George, 4-133 base.h header file, 7-36 BOOLEAN, Pascal data type, 7-3 simulating in C, 7-12 to 7-14 begin block symbol ({), 2-13 _BFMT_COFF predefined name, 4-145 .bin filename suffix, 6-14, 6-20 debugger information, 6-24 Boolean data types, 4-133 boolean expressions. See comparison expressions bottlenecks, identifying with prof utility, 6-39 !bin/cc, command line syntax, 1-5 bounds checking, 4-23 !bin/cc command, 1-3, 6-3 to 6-13 and the preprocessor, 4-16, 4-99 creating named sections, 3-69 braces ({}) and if statements, 4-92 in enum declarations, 3-14 initialization, 3-5 binary operators, 4-13 bind utility, 6-14, 6-43, 6-44 to 6-45 global variables, 3-57 binding See also linking of macro arguments, 4-67 4 Index branching statements, 4-3 conditional, 4-91 break statement, 4-3, 4-46 to 4-48 unreachable, C-4 used to exit a switch statement, 4-155 to 4-165 breakpoints, 6-24 and optimized code, 6-34 Brodie, James, 1-2 C library, choosing version of, 4-160 C preprocessor (cpp) , 6-3 command options, 6-6 bsd4.3 systype, 6-41 c++ programming language, 1-3 features supported by Domain C, B-1 reference variables, 3-62, 5-12 -bss compiler option, 3-69, 6-3, 6-21, 7-26 calls, function, 5-7 to 5-12 .bss section, 3-69, 4-140, 6-21, 7-27 carriage return escape code (\r) , 2-8 buffer manager, 8-5 case keyword, 4-3, 4-154 buffering, 8-4 to 8-6 block, 8-5 line, 8-5 case label, 4-154 buffers, 8-4 cast operator, 4-5 bug alerts binding of macro arguments, 4-67 comparing floating-point values, 4-135 confusing = with ==, 4-132 ending a 
macro definition with a semicolon, 4-66 integer division and remainder, 4-21 opening a file, 8-15 passing structures vs. passing arrays, 4-150 referencing elements in a multidimensional array, 4-31 side effects, 4-109 side effects in macro arguments, 4-71 side effects in relational expressions, 4-117 space between left parenthesis and macro name, 4-69 the dangling else, 4-93 using = to define a macro, 4-69 walking off the end of an array, 4-23 casts, 4-49 to 4-53, C-6, D-7 abstract declarators, 3-41 double to float, 4-53 enum to integer, 4-52 float to double, 4-53 floating-point to integer, 4-52 generic pointers, 3-21 integer to floating-point, 4-21 integer to integer, 4-50 to 4-82 of pointers, 4-125 to 4-165 pointer to integer, 4-52 pointer to pointer, 4-53 to 4-82 to pointer, 4-31 to unsigned integer, 4-51 void, 3-19 bsd4.2 systype, 6-41 built-in routines, 6-47 to 6-48 builtins.h header file, 6-47 case sensitivity, 2-5 of global names, 7-27 cb utility, 6-50 cc command, 6-3 to 6-19 /com/cc, 6-13 to 6-14 differences between /bin/cc and /com/cc, 1-3 to 1-5, 3-69 #include preprocessor directive, 4-104 char, arrays. See strings c C beautifier. See cb utility C programming language Domain extensions, 1-3 history, 1-1 to 1-2 overview, 1-1 to 1-7 standards, 1-2 to 1-3 tenet of, 1-2 C Reference Manual. See K&R standard -C compiler option, 6-6 -c compiler option, 6-3, 6-6 char arrays, 3-40 to 3-41 char type, 3-2, 3-6 alignment, 3-25 range, 3-3 representation, 3-8 to 3-9 size, 3-3 char type specifier, 3-8 character codes, ASCII, A-I to A-3 character constants, 2-8 to 2-9, 3-8 escape characters, 2-8 multi-character, 2-9 character data, passing from C to FORTRAN, 7-16 to 7-17 Index 5 characteristic. of floating-point constants. 2-7 characters. nonportable use of. C-6. D-7 clearerrO function. 8-7. 8-8 clib. See standard C library closeO function. 8-26 closing a file. 8-15 code absolute. 6-39 dead. 6-35 fixed position. See absolute code relocatable. 
See position independent code COFF (common object file format). 4-145 /com/cc. 1-3 command line syntax. 1-5 -comchk compiler option. 2-3. 6-21 comma operator (.). 4-8. 4-54 erroneously used in multidimensional array references. 4-31 comma operator(.). in for statements. 4-85 command line arguments. 5-25 comments. 2-2 to 2-3 checking for balanced delimiters. 6-21 terminating. 9-2 common blocks. FORTRAN. 7-32 accessing from C. 3-69 common object file format. See COFF common subexpressions. 6-35 comparison operators. 4-6 See also relational operators compatibility. backwards. for function declarations. 5-16 compilation. conditional. 4-98, 6-22 compilation errors, 6-30 compilation statistics. 6-30 compilation warnings. 6-30 compile-time errors. 6-5 /com/cc. 6-14 to 6-19 compiler options. 6-19 to 6-43 -a. 6-6 -abs. 6-19 -aCt 6-20 -align. 6-20 -alnchk. 6-20 -B. 6-6 6 Index -b, 6-20 to 6-21 -bss. 3-'69, 6-3, 6-21. 7-26 -C, 6-6 -c. 6-3, 6-6 -comchk, 2-3, 6-21 -cond, 4-61, 6-22 -cpu, 6-22 to 6-23 -D, 4-98, 4-99. 6-24 to 6-26 -db, 6-23 to 6-24, 6-49 -dba, 6-23 to 6-24, 6-33, 6-34, 6-49 -dbs. 6-23 to 6-24. 6-49 -def, 4-98. 4-99, 6-24 to 6-26. 9-3 -E, 4-66, 6-26 -eSt 4-66, 6-26 -esf. 6-26 -exp, 6-27 -F, 6-7 -f, 6-7 -fpa, 6-27 -g. 6-23 to 6-24 -H, 6-7 -I, 4-104, 4-105, 6-8 -idir, 4-104, 4-105, 6-28 -indexl, 6-29 -info, 5-16. 6-29, 6-42. 9-1 -inlib, 6-30 -L, 6-8 -I. 6-8, 6-30 to 6-31 -M, 6-9, 6-22 to 6-23 -m, 6-8 -map, 6-31 to 6-32 -msgs, 6-33 -nalign, 6-20 -nalnchk, 6-20 -nb, 6-20 to 6-21 -nbss, 6-21 -ncomchk. 6-21 -ncond, 4-61. 6-22 -ndb, 6-23 to 6-24. 6-52 -nexp, 6-27 -nindexl, 6-29 -ninfo, 6-29, 9-1 -nl, 6-30 to 6-31 -nmap, 6-31 to 6-32 -nmsgs, 6-33 -nopt, 4-38, 6-24, 6-33 to 6-38 -nstd, 6-40 -ntype. 3-62. 4-98, 5-16, 6-42 -nuline, 6-42 -nwarn, 6-43, 9-1 -0, 6-33 to 6-38 -0, 6-9. 6-20 to 6-21 -opt. 
6-24, 6-33 to 6-38 -P, 6-10, 6-26 -p, 6-9, 6-39 to 6-40 -pg, 6-10 -pic, 6-30, 6-39 -prof, 6-39 to 6-40 -qg, 6-10 -qp, 6-10 -r, 6-10 -runtype, 6-40 -S, 6-27 -s, 6-10 -std, 4-148, 6-40 -systype, 4-161, 6-40 to 6-42 -T, 6-11, 6-40 to 6-42 -t, 6-11 -type, 6-42 -U, 6-11 -u, 6-11 -uline, 6-42 -V, 6-11 -W,6-12 -w, 6-43 -warn, 6-43 -x, 6-12 -Y, 1-3 -Y, 6-12 /com/cc, 6-15 to 6-19 invoking from /bin/cc, 6-12 affecting code generation, 6-30 /bin/cc, 6-6 to 6-13 constant expressions, 4-79 and initialization, 3-5 computing at compile-time, 6-34 in enum declarations, 3-14 constant folding, 6-35 constants, 2-6 to 2-10 character, 2-8 to 2-9, 3-8 escape characters, 2-8 enumeration, 3-15 to 3-16 floating-point, 2-7 to 2-8 magnitude, 2-7 scientific notation, 2-7 to 2-8 table, 2-8 type, 2-7 improper, 9-2 integer, 2-6 to 2-7 decimal, 2-6 hexadecimal, 2-6 long, 2-6 octal, 2-6 multi-character, 2-9 negative, 2-7 passing by reference, 7-7 sign, 2-7 string, 2-10 to 2-12 using as lvalues, 3-62 continuation character (\), 2-3 in strings, 2-10 COMPILESYSTYPE environment variable, 4-161, 6-42 continue statement, 4-3, 4-57 to 4-59 compiling programs, 6-3 to 6-19 for specific processors, 6-9, 6-22 to 6-23 introduction, 1-6 with /bin/cc, 6-3 to 6-13 with /com/cc, 6-13 conversions automatic argument, 7-2 to 7-3 suppressing, 5-12 table of, 7-3 of function arguments, 5-3, 5-9 type. See type conversions complement operator, bitwise (-), 4-42, 4-45 COMPLEX, FORTRAN data type, 7-3 simulating in C, 7-25 to 7-26 complex declarations, 3-42 to 3-45 compound blocks, and if statements, 4-92 compound statements, 4- 2 -cond compiler option, 4-61, 6-22 conditional branching statements, 4-91 conditional compilation, 4-98, 6-22 conditional expression operator, 4-8, 4-55 to 4-56 copying files, 8-16 to 8-25 cosO function, 4-151 built-in version of, 6-47 cpp UNIX C preprocessor, 4-104 UNIX C preprocessor. 
See preprocessor -cpu compiler option, 6-22 to 6-23 creatO function, 8-26 cross-language communication, 7-1 to 7-36 calling FORTRAN from C, 7-14 to 7-26 calling Pascal from C, 7-7 to 7-14 sharing data, 7-26 to 7-35 Index 7 D -D compiler option, 4-98, 4-99, 6-24 to 6-26 dangling else, 4-93 data sharing between C and FORTRAN, 7-32 to 7-35 sharing between C and Pascal, 7-27 to 7-32 .data section, 3-69, 4-140, 7-27 data sections, 4-140 changing name of, 4-119 data types, 2-14 aggregate, 3-2, 3-4 to 3-5 agreement between C, Pascal, FORTRAN, 7-3 to 7-4 arithmetic, 3-1 table, 3-3 array. See arrays C, FORTRAN, and Pascal, 7-4 casting. See casts char, 3-2, 3-6 range, 3-3 representation, 3-8 to 3-9 size, 3-3 double, 3-2, 3-11 accuracy, 3-12 range, 3-3 representation, 3-12 to 3-13 size, 3-3 enum, 3-2, 3-14 to 3-17 declaring, 3-14 range, 3-3 size, 3-3 type-checking, 3-15 float, 3-2, 3-11 accuracy, 3-11 range, 3-3 representation, 3-11 to 3-12 size, 3-3 floating-point, 3-11 to 3-13 hierarchy, 3-2 int, 3-2, 3-6 range, 3-3 representation, 3-6 to 3-7 size, 3-3 integer, 3-6 to 3-10 portability, 3-6 long, 3-2, 3-6 range, 3-3 representation, 3-6 to 3-7 size, 3-3 8 Index long enum, 3-17 long float, 3-11 representation, 3-12 to 3-13 not supported in C, 7-3 overview, 3-1 to 3-4 pointer See also pointers size, 3-3 qualifiers, 3-2 range, 3-3 scalar, 3-1, 3-2 to 3-3 hierarchy of, 4-14 short, 3-2, 3-6 range, 3-3 representation, 3-7 to 3-8 size, 3-3 short enum, 3-17 size of, 3-3, 4-143 struct. See structures union. 
See unions unsigned, 3-2, 3-6 integer overflow, 3-10 range, 3-3 size, 3-3 unsigned char, representation, 3-8 to 3-9 unsigned int, representation, 3-6 to 3-7 unsigned short, representation, 3-7 to 3-8 void, 3-2, 3-18 to 3-19 date, of program compilation, 4-60 _DATE_ predefined name, 4-15, 4-60 -db compiler option, 6-23 to 6-24, 6-49 -dba compiler option, 6-23 to 6-24, 6-33, 6-34, 6-49 dbg utility, 6-23, 6-49 to 6-52 -dbs compiler option, 6-23 to 6-24, 6-49 dbx utility, 6-23, 6-50 dead code, 6-35 #debug preprocessor directive, 4-61 to 4-62, 6-22 debug sections, 4-140 debuggers compiling for, 6-23 to 6-24 using on optmized code, 6-33 debugging code, adding to source files, 3-50 debugging programs, 6-49 to 6-50 using conditional compilation feature, 4-98 decimal integer constants, 2-6 decimal point, 2-7 declarations, 2-13 to 2-17 allusions, 3-57 argument, 3-46 array, 3-35 #attribute modifier, 3-63 to 3-70 complex, 3-42, 3-42 to 3-45 composing, 3-42 to 3-45 deciphering, 3-43 decomposing, 3-42 to 3-45 definitions, 3-57 enum, 3-14 examples, 2-14 function, 3-61 global, 2-11 global variable, 3-57, 3-57 to 3-60 head-of-block, 3-46 in a compound statement, 4-2 legal and illegal, 3-45 of bit fields, 3-31 of function arguments, 5-3 to 5-4, 5-3 to 5-4 pointer, 3-19 position of, 3-46 to 3-47 reference variable, 3-63 scope of, 3-48, 3-48 to 3-51 storage class. 
See storage class structure, 3-23 to 3-24 table of, 3-45 top-level, 3-46 typedef, 2-14 to 2-16 union, 3-23 to 3-24, 3-29 visibility of, 3-50 declarators, abstract, 3-41 decrement operators, 4-5, 4-106 to 4-110 and pointers, 4-124 precedence of, 4-108 -def compiler option, 4-98, 4-99, 6-24 to 6-26, 9-3 default initialization, of fixed variables, 3-53 default label, 4-3, 4-155 DEFINE, Pascal keyword, 7-27 #define preprocessor directive, 4-64 to 4-71, 6-24 defined names, 6-24 defined predefined macro, 4-15, 4-96 to 4-100 definitions function, 3-61, 5-1, 5-1 to 5-5 prototyping, 5-14 to 5-15 global variable, 3-57, 3-57 to 3-60 reaching, 6-35 Delphi online documentation system, 1-7 dereferencing operator (*), 4-123 descriptors, file. See file descriptors device attribute specifier, 3-63, 3-66 to 3-68 device registers, device attribute specifier, 3-66 devices, standard, 8-8 to 8-10 diagnostic messages, 9-1 to 9-37 directives, preprocessor. See preprocessor directives directories for header files, 6-28 /usrlinclude, 6-28, 6-45, 7-35 divO function, 4-21 division, integer, 4-21 division operator (/), 4-19 DN460 workstation, 6-22 DN5xx-T workstation, 6-22 DN660 workstation, 6-22 DNxxx, compiling code for, 6-9, 6-22 to 6-23 do/while statement, 4-3, 4-72 to 4-73 dollar sign ($), used in identifiers, 2-4 Domain extensions, 1-3 #attribute modifier, 3-63 to 3-70 #debug preprocessor directive, 4-61 dollar sign in identifiers, 2-4 #eject preprocessor directive, 4-74 #list preprocessor directive, 4-114 long float type, 3-11 name spaces, 2-17 #nolist preprocessor directive, 4-114 #options specifier, 5 -19 reference variables, 3-62 to 3-63 #section preprocessor directive, 4-140 to 4-142 short and long enum, 3-3, 3-17 systype predefined macro, 4-160 #systype preprocessor directive, 4-160 table of, B-1 to B-2 domain extensions, module preprocessor directive, 4-119 Domain/Dialogue, 6-51 Domain/OS environments, 1-3 to 1-5 dot operator (.). 
See structure member operator Index 9 double quote escape code (\") , 2-8 double quotes, delimiting strings, 2-10 double type, 3-2, 3-11 accuracy, 3-12 casting to float, 4-53 range, 3-3 representation, 3-12 to 3-13 size, 3-3 double-precision floating-point, 3-11, 3-12 to 3-13 dpak utilities, 6-50 drivers, GPIO, compiling with -pic, 6-39 DSEE (Domain Software Engineering Environment), 6-51 DSP160 workstation, 6-22 duration, 3-46, 3-52 to 3-54 automatic, 3-52 fixed, 3-52 E E, exponent in scientific notation, 2-7 enum type, 3-2, 3-14 to 3-17 declaring, 3-14 initializing, 3-17 range, 3-3 short and long, 3-3, 3-17 size, 3-3 type-checking, 3-15 enumerated data type. See enum type enumeration constants, 3-15 to 3-16 enums casting to integer, 4-52 in switch statements, 4-156 indexing arrays with, 4-23 initializing, 3-17 maximum number of enumerators, 9-5 names of, 2-16 operations on, 4-78 environment variables COMPILESYSTYPE, 4-161, 6-42 inprocess, 6-51 LIBDIR, 6-8 LLIBDIR, 6-8 SYSTYPE, 6-42 EOF macro, 8-6 e, exponent in scientific notation, 2-7 equality operator (==), 4-132 confused with assignment operator =, 4-132 -E compiler option, 4-66, 6-26 ermo, 6-47, 8-27 ECB. See entry control block ermo.h header file, 8-27 echo program, 5-26 error handling. for 110, 8-7 to 8-8 for unbuffered 110, 8-27 efficiency and built-in routines, 6-47 and prototypes, 5-17 register variables, 3-56 using macros for, 4-70 error messages, 9-1 to 9-37 errors, compile-time, 6-5, 6-30 errout stream, 6-14 #eject preprocessor directive, 4-74 -es compiler option, 6-26 elements, array. See array elements escape characters, 2-8 to 2-9 #elif preprocessor directive, 4-16, 4-99 escape codes. 
See escape characters ellipsis, used to specify a variable number of arguments, 5-15 -esf compiler option, 6-26 else clause, 4-91 to 4-95 #else preprocessor directive, 4-96 to 4-100 else statement, dangling, 4-93 end block symbol (}), 2-13 #endif preprocessor directive, 4-96 to 4-100 entry control block (ECB), 6-31 10 Index evaluation, order of, 4-10 to 4-11 and logical operators, 4-116 and side effects, 4-109 examples break_example, 4-47 bubble_sort, 4-32 callyowery, 7-9 callJeverse_string, 7-10 conditional_exp_op_example, 4-56 continue_example, 4-58 date_and_time_example, 4-60 debugyreprocessor_ cmd, 4-61 do. while_example, 4-73 echo, 5-26 floatJounding, 4-39 for_example, 4-85 get_IoLc, 7-34 global_var_c, 7-29 goto_example, 4-89 if. else_example , 4-94 inc. dec_example 1 , 4-107 inc.dec_example2, 4-107 inc. dec_example 3, 4-110 line_example, 4-112 logical_op_example, 4-117 multi_dim_array_c, 7-21 online, 1-6 to 1-7 pass_boolean_c; 7-13 pass_bYJeCexample, 5-9 pass_bLval_example, 5-8 pass_char_array_c, 7-19 pass_char_cf, 7-17 pass_complex_c, 7-25 pass_int_array, 7-20 pass_Iogical_c, 7-24 passyoint_c, 7-23 passyointer_c, 7-12 pointer_example 1, 4-128 pointer_example 2 , 4-129 print_size, 4-27 ptr_example2, 4-123 ptr_example3, 4-124 recursive_example, 5-20 relational_example, 4-136 return_example, 4-139 returning_arrays, 4-28 section_example_c, 7-31 sizeoCexample, 4-144 standard_io_example, 8-9 switch_example, 4-158 unix_copy, 8-27 while_example, 4-165 executable files, 6-48 executing programs, 6-48 to 6-49 introduction, 1-6 -exp compiler option, 6-27 expO function, built-in version of. 6-47 exponent, 2-7 in floating-point constants. 2-7 expressions. 4-79 to 4-81 boolean. 4-133 constant, 4-79 and initialization. 3-5 computing at compile-time, 6-34 in enum declarations, 3-14 float. 4-79 integer, overflow. 3-10 integral. 4-79 loop-invariant, 6-38 order of evaluation, 4-10 to 4-11, 4-109 See also operators parenthesized, 4-9 to 4-10 pointer. 
4-79 pointer arithmetic, 4-124 to 4-165 rearranging to optimize code. 6-34 relational, 4-115 to 4-118 side effects. 4-108 subexpressions. 4-13 extensible streams, 8-3 extensions. Domain. See Domain extensions EXTERN. pascal keyword. 7-27 extern storage class specifier, 3-55, 3-57, 5-5 , and array size. 3-36 and initialization, 3-4 function allusions. 3-61 external references. resolving. 6-43 external variable. See global variable F -F compiler option, 6-7 -f compiler option. 6-7 f77 command. 7-14 fabsO function, built-in version of. 6-47 false values. 4-133 and logical operators. 4-115 fault handlers. 6-23 fcloseO function. 8-10 fdopenO function. 8-10 feofO function, 8-7, 8-8. 8-17 ferrorO function, 8-7. 8-8. 8-19 expanded listing files. 6-27 fflushO function. 8-5. 8-10 expansion. macro, 4-64 fgetcO function. 8-10. 8-16 Index 11 fgets() function, 8-11, 8-17 to 8-27 error, 8-8 fields, bit. See bit fields float expressions, 4-79 file descriptors, 8-3 to 8-4, 8-25 float type, 3-2, 3-11 accuracy, 3-11 casting to double, 4-53 range, 3-3 representation, 3-11 to 3-12 size, 3-3 file names, in #include directive, 4-104 file pointers, 8-4 file position indicators, 8-8 _FILE_ predefined name, 4-111 file scope, 3-48, 3-51, 3-61 FILE structure, 8-4 file types, 8-3 filename suffixes .bak, 6-14 .bin, 6-14, 6-20 .c, 6-13 .h, 7-35 .i, 6-26 .1st, 6-14, 6-31 filenames _FILE_ predefined macro, 4-111 changing with #line directive, 4-112 filenoO macro, 8-7, 8-11 files .bin, debugger information in, 6-24 ASCII, 8-3 closing, 8-15 executable, 6-48 expanded listing, 6-27 fixed-length record, 8-3 header, 4-103 I/O to, 8-10 to 8-12 listing, 4-114, 6-30 to 6-31 map, 6-31 to 6-32 object, 6-3 specifying name of, 6-20 opening, 8-12 to 8-15 reading and writing, 8-16 to 8-27 source, 2-11, 6-3 types of, 8-3 variable-length record, 8-3 fixed duration, 3-52 and initialization, 3-4 initialization, 3-54 floating-point casting from integer to, 4-52 to 4-82 double-precision, 3-12 to 3-13 single-precision, 
3-11 to 3-12 floating-point accelerator (FPX) , 6-22 compiling code for, 6-9 floating-point accuracy, 6-27 floating-point constants, 2-7 to 2-8 magnitude, 2-7 scientific notation, 2-7 to 2-8 table, 2-8 type, 2-7 floating-point data passing from C to FORTRAN, 7-15 passing from C to Pascal, 7-8 to 7-9 floating-point data types, 3-11 to 3-13 floating-point expressions, rounding of, 4-135 floating-point overflow, 4-38 floating-point precision, 6-27 floating-point registers, 6-27 floating-point values comparing, 4-135 passing as arguments, 5-17 floating-point variables, initializing, 3-13 flow of control abnormal, 5-19 and lint utility, C-4, D-4 fopenO function, 8-11, 8-12 to 8-15 for loops, 4-84 for statement, 4-3, 4-83 to 4-87, 4-101, 4-102, 4-121 form feed, forcing with #eject directive, 4-74 fixed position code. See absolute code formal arguments, 5-8 See also arguments fixed-length record files, 8-3 formal parameters. See formal arguments flags end-of-file, 8-8 formfeed escape code (\f) , 2-8 12 Index FORTRAN data types, table of, 7-4 FORTRAN programming language, 4-31 calling routines from C, 7-14 to 7-26 names of routines, 7-14 to 7-15 type agreement with C, 7-3 to 7-4 FORTRAN programs, accessing common blocks from C, 3-69 forward refereences, of functions, 5-6 -fpa compiler option, 6-27 fprintfO function, 8-11, 8-14 fputcO function, 8-11, 8-16 fputsO function, 8-11, 8-17 to 8-27 fpx floating-point accelerator, compiling code for, 6-9 freadO function, 8-11, 8-19 to 8-27 freopenO function, 8-11 fscanfO function, 8-11 fseekO function, 8-11, 8-21 to 8-27 ftellO function, 8-11, 8-21 to 8-27 ftn command, 7-14 function allusions, 3-61, 5-5 to 5-6 syntax of, 5-6 function calls See also functions, invoking syntax of, 5-7 using pointers to functions, 5-23 to 5-25 function definitions, 2-11, 5-1, 5-1 to 5-2 prototyping, 5-14 to 5-15 body of, 5-4 to 5-5 calling, 5-7 to 5-12 default return type of, 5-2 defining, 5-1 definitions, 2-11 definitions of, 3-61 invoking, 5-7 to 
5-12 mainO, 5-25 to 5-27 nested, 9-13 pass by value, 5-7 to 5-10 passing as function arguments, 5-3 pointers to, 5-20 to 5-25 assigning values to, 5-21 to 5-22 calling functions using, 5-23 to 5-25 dereferencing, 5-23 passing as arguments, 5-24 to 5-25 return type agreement, 5-22 preamble of, 5-2 to 5-4 recursive, 5-20 return type of, 5-2 return value of, 5-2, 5-17 to 5-19 incorrectly used, C-4 returning arrays (illegal), 3-44, 4-28 to 4-82 returning functions (illegal), 3-44 returning pointers, 7-5 returning void, 4-138, 5-2 scope, 3-61 storage class of, 3-60 to 3-62, 5-6 unused, C-3, D-3 vs. macros, 4-70 fwriteO function, 8-12, 8-19 to 8-27 G function parameters. See arguments -g compiler option, 6-23 to 6-24 function prototypes, 5-12 to 5-17 and efficiency, 5 -1 7 backwards compatibility of, 5-16 turning on and off, 6-42 using _STDC_ to turn on and off, 4-145 using to suppress automatic argument prom ortions, 7-2 gaps, in structures. See padding garbage values, 3-53 generic pointers, 3-21 to 3-23 casting, 4-53, 4-126 getcO function, 8-7, 8-12, 8-16, 8-17 getcc utility, 1-6 function return values pointers to functions, 5-22 structures, 4-150 getsO function, 8-9 function scope, 3-48, 3-51 getwO function, 8-12 function signatures, 2-11 global declarations, 2-11 functions, 2-12, 5-1 to 5-27 allusions to, 3-61, 5-1, 5-5 to 5-6 and macros, 8-7 arrays of (illegal), 3-44 getcharO macro, 8-7, 8-9 global names, case sensitivity of, 7-27 global register allocation, 6-37 global variables, 3-48, 3-51, 3-57 to 3-60 allusions, 3-57 Index 13 and cross-language communication, 7-26 to 7-35 defining, 3-57 to 3-60 length of names, 2-4 placement in object file, 6-21 portability, 3-60 sharing data between C and FORTRAN, 7-32 to 7-35 sharing data between C and Pascal, 7-27 to 7-32 using /bin/cc, 7-27 using /com/cc, 7-26 to 7-27 I -I compiler option, 4-104, 4-105, 6-8 I/O. 
See input and output identifiers, 2-4 length, 2-4 table of legal and illegal, 2-4 uniqueness, 2-4 -idir compiler option, 4-104, 4-105, 6-28 #if preprocessor directive, 4-96 to 4-100 gmon.out file, 6-10 if statement, 4-3, 4-91 to 4-95 goto labels, scope of, 3-48, 3-51 #ifdef preprocessor directive, 4-96 to 4-100 goto statement, 4-3, 4-88 to 4-90 #ifndef preprocessor directive, 4-96 to 4-100 GPIO drivers, compiling with -pic, 6-39 implementation dependencies, sizes of objects, 4-144 gprof utility, 6-10 greater than operator (>), 4-132 greater than or equal to operator (>=), 4-132 grouping, of operators, 4-9 implicit type conversions, 4-12 IN parameters, 7-6 in-line code, 6-47 to 6-48 include directories, 6-28 H include files. See header files #include preprocessor directive, 4-103 to 4-105 -H compiler option, 6-7 inclusive OR, bitwise operator (|), 4-42 head-of-block declarations, 3-46 increment operator (++) and pointers, 2-16 postfix, use of, 5-26 header files, 4-103 apollo_$std.h, 4-103 base.h, 7-36 builtin.h, 6-47 default, 6-13 directories for, 6-28 errno.h, 8-27 list of standard, 6-46 macro definitions in, 4-71 nesting, 9-9 stdio.h, 8-4, 8-6 system, 7-35 to 7-36 hexadecimal constants, in escape codes, 2-9 hexadecimal integer constants, 2-6 hierarchy of data types, 3-2 of scalar data types, 4-14 of scopes, 3-48 history, of the C language, 1-1 to 1-2 holes, in structures. See padding horizontal tab escape code (\t), 2-8 increment operators, 4-5, 4-106 to 4-110 and pointers, 4-124 precedence of, 4-108 to 4-165 -indexl compiler option, 6-29 indirection operator (*). See dereferencing operator -info compiler option, 5-16, 6-29, 6-42, 9-1 informational messages, 6-29, 9-1 to 9-37 infinite loops, 4-23 initial values, 2-14 initialization and automatic duration, 3-4 and braces ({}), 3-5 and constant expressions, 3-5 and extern storage class specifier, 3-4 and fixed duration, 3-4 and type conversion, 3-5, 3-9 array,
3-36 using strings, 3-36 array of struct, 3-39 automatic variables, 3-54 default, 3-53 enum variables, 3-14, 3-17 fixed duration variables, 3-54 floating-point variables, 3-13 integer variables, 3-9 to 3-10 multidimensional array, 3-38 to 3-39 of aggregate objects, 9-7 old-style, 3-5, 9-4 overview, 3-4 to 3-5 pointer, 3-20 pointer to char, 3-41 string, 3-40 structure, 3-33 union, 3-33 to 3-35 initializations, allusions and definitions, 9-14 integer data types, 3-6 to 3-10 portability, 3-6 integer division, sign of result, 4-21 integer overflow, 3-10, 4-37 integer remainder. See modulo division operator integer widening, 4-50 integers 32-bit, 3-6 to 3-7 8-bit, 3-8 to 3-9 and pointers, 4-24, 4-126 casting, 4-50 to 4-82 conversions of, table of, 4-50 passing from C to FORTRAN, 7-15 passing from C to Pascal, 7-8 to 7-9 integral expressions, 4-79 -inlib compiler option, 6-30 integral promotions. See integral widening conversions inprocess environment variable, 6-51 integral widening conversions, 4-12 input, standard, redirecting, 6-49 invocation, of macros. See macro expansion input and output, 8-1 to 8-27 buffering, 8-4 to 8-6 closing a file, 8-15 error handling, 8-7 to 8-8 file pointers, 8-4 file position indicators, 8-8 granularity of, 8-16 opening a file, 8-12 to 8-15 random access, 8-21 to 8-25 reading data, 8-16 to 8-25 standard I/O library, 8-4 to 8-25 to files, 8-10 to 8-12 unbuffered functions, 8-25 to 8-27 writing data, 8-16 to 8-25 invocations, function, 5-7 to 5-12 insert files. 
See header files IOS type managers, compiling with -pic, 6-39 K K&R standard, 1-2 extensions to, 6-40 name spaces, 2-17 kernel-level blocks, 8-5 Kernighan, Brian, 1-2 keywords, 2-5 for scalar types, 3-2 table of, 2-5 installed libraries, 6-30 instruction address register (IADDR), 6-23 L instruction reordering, 6-38 L, integer constant suffix, 2-6 int type, 3-2, 3-6 assigning longs to, C-6, D-8 range, 3-3 representation, 3-6 to 3-7 size, 3-3 l, integer constant suffix, 2-6 integer constants, 2-6 to 2-7 decimal, 2-6 hexadecimal, 2-6 long, 2-6 octal, 2-6 -l compiler option, 6-8, 6-30 to 6-31 -L compiler option, 6-8 labels case, 4-154 default, 4-155 statement, 3-51, 4-88 ld link editor, 1-4, 6-43, 6-43 global variables, 3-57 left-to-right binding order, 4-9 #list preprocessor directive, 4-114 less than operator (<), 4-132 listing files, 4-114, 6-30 to 6-31 expanded, 6-27 less than or equal to operator (<=), 4-132 letters, alphabetic, used in identifiers, 2-4 lexical elements, of a C program, 2-1 to 2-5 /lib/clib.
See standard C library /lib/crt0.o, startup routine, 6-43 LIBDIR environment variable, 6-8 libraries Domain system calls, 8-2 for I/O, 8-1 installed, 6-30 lint, C-12 managing with ar utility, 6-44 shared, compiling with -pic, 6-39 specifying search order, 6-8 standard C, 6-44, 6-45 to 6-46 standard I/O, 8-2, 8-4 to 8-25 system, 6-44, 7-35 to 7-36 linking to, 7-36 UNIX I/O functions, 8-2 library directory, specifying default, 6-13 library records, 6-30 live analysis of local variables, 6-36 LLIBDIR environment variable, 6-8 local variables, length of names, 2-4 log() function, built-in version of, 6-47 LOGICAL, FORTRAN data type, 7-3 simulating in C, 7-23 to 7-25 logical operators, 4-7, 4-115 to 4-118 and order of evaluation, 4-116 bitwise, 4-43 truth table for, 4-115 long enum type, 3-3, 3-17 long float type, 3-11 See also double type representation, 3-12 to 3-13 long integer constants, 2-6 long type, 3-2, 3-6 range, 3-3 representation, 3-6 to 3-7 size, 3-3 longword alignment, 3-25 line buffering, 8-5 loop-invariant expressions, 3-67, 6-38 line numbers #line preprocessor directive, 4-112 _LINE_ predefined macro, 4-111 in listing file, 6-30 stripping from object file, 6-10 looping statements, 4-3 _LINE_ predefined name, 4-15, 4-111 #line preprocessor directive, 4-112 to 4-113, 9-26 enabling and disabling, 6-42 lines, spanning multiple, 2-3 link editor (ld), 1-4, 6-14 command options, 6-6 linking global variables, 3-57 named sections, 3-60 linking object modules, 6-43 to 6-44 and the #section preprocessor directive, 4-140 lint utility, 6-50 BSD version, C-1 to C-12 sysV version, D-1 to D-10 /* LINTLIBRARY */, lint comment, C-12, D-2 loops for, 4-84 infinite, 4-23 while, 4-84 lseek() function, 8-26 lvalues, 9-8 definition of, 4-4 using constants as, 3-62 M -M compiler option, 6-9, 6-22 to 6-23 -m compiler option, 6-8 M68020 microprocessor, 6-22 M68881 floating-point co-processor, 6-22 macro body, 4-64 macro expansion, 4-64 macro names, and name spaces, 2-17
macros advantages of, 4-70 and functions, 8-7 arguments to binding of, 4-67 no type-checking for, 4-67 to 4-82 side effects in, 4-71 body of, 4-64 calling, 4-64 defining, 4-64, 4-64 to 4-71 disadvantages of, 4-70 expansion of, 4-64 names of, 4-64 predefined, 4-15 See also predefined macros syntax of, 4-65 undefining, 4-71 vs. functions, 4-70 to 4-82 magnitude, floating-point constants, 2-7 main() function, 2-12, 5-25 to 5-27 make utility, 6-50 malloc function, return type, 3-21 mantissa, in floating-point constants, 2-7 -map compiler option, 6-31 to 6-32 map files, 6-31 to 6-32 math.h header file, built-in routines for, 6-47 memory array storage, 3-39 shared, volatile attribute specifier, 3-65 storage of multidimensional arrays, 3-39 structure representation, 3-24 to 3-29 union representation, 3-29 to 3-30 virtual, 6-39 memory allocation of arrays, 4-23 of automatic variables, 3-52 of strings, 3-41 of structures, 3-27 to 3-29 of unions, 3-30 memory storage. See memory allocation messages compile-time, 9-1 to 9-37 informational, 6-29 warning, suppressing, 6-43 #module preprocessor directive, 4-119 to 4-120, 9-28 modules, object changing name of, 4-119 section summary of, 6-30 modulo division operator (%), 4-19 sign of result, 4-21
mon.out file, 6-9 -msgs compiler option, 6-33 multi-character constants, 2-9 multidimensional arrays, 3-37 to 3-39, 4-28 to 4-33 initializing, 3-38 to 3-39 passing as arguments, 4-29 to 4-33 passing from C to FORTRAN, 7-21 storage, 3-39 multiplication operator (*), 4-19 N -nalign compiler option, 6-20 -nalnchk compiler option, 6-20 _NAME_ predefined name, 4-15 name spaces, 2-16 to 2-17 struct and union, 3-32 named sections for global variables, 7-26 section attribute specifier, 3-69 using to access FORTRAN common blocks, 3-69 names See also identifiers array, 4-25 interpretation of, 4-24 conflicting, 3-49, 3-50 defining at compilation time, 6-24 to 6-26 macro, 4-64 predefined, 4-15 struct and union member, name space, 2-16 structure and union members, 2-16 tag and member, 3-24 variable, 2-14 visibility of, 3-50 natural alignment, 3-25, 3-27 -nb compiler option, 6-20 to 6-21 -nbss compiler option, 6-21 -ncomchk compiler option, 6-21 -ncond compiler option, 4-61, 6-22 O -ndb compiler option, 6-23 to 6-24, 6-52 negation, bitwise operator (~).
See complement operator negative constants, 2-7 negative integers, representation, 3-7 OR bitwise exclusive operator (^), 4-42, 4-45 bitwise inclusive operator (|), 4-42, 4-45 -O compiler option, 6-33 to 6-38 nested members, structure and union, 4-147 newlines, 2-2 escape code (\n), 2-8 object file sections, specifying alignment of, 6-20 -nexp compiler option, 6-27 object files, 6-3 differences between Aegis and UNIX, 1-4 specifying name of, 6-20 NIL pointers, 7-11 -nindexl compiler option, 6-29 -ninfo compiler option, 6-29, 9-1 -nl compiler option, 6-30 to 6-31 nm command, 7-15 -nmap compiler option, 6-31 to 6-32 -nmsgs compiler option, 6-33 -o compiler option, 6-9, 6-20 to 6-21 object modules changing name of, 4-119 section summary of, 6-30 octal constants, 2-6 in escape codes, 2-9 old-style initialization, 3-5 no-ops, 5-7 online sample programs, 1-6 to 1-7 See also examples #nolist preprocessor directive, 4-114 open() function, 8-26 -nopt compiler option, 4-38, 6-24, 6-33 to 6-38 opening a file, 8-12 to 8-15 noreturn, #options specifier, 5-19 operating systems, 1-1 nosave, #options specifier, 5-19 operators, 4-3 to 4-11 See also expressions address-of (&), 5-22 arithmetic, 4-6, 4-19 to 4-21 assignment, 4-8, 4-34 to 4-40 old-style, 4-36 to structures and unions, 4-147 associativity of, 4-9 table of, 4-11 binary, 4-13 binding of, 3-42 See also associativity of operators bit, 4-7, 4-41, 4-42 to 4-45 cast, 4-5 casts.
See casts comma, 4-8, 4-54 comparison, 4-6 See also relational operators conditional expression, 4-8, 4-55 to 4-56 decrement, 4-5, 4-106 to 4-110 grouping of operands to, 4-9 increment, 4-5, 4-106 to 4-110 logical, 4-7, 4-115 to 4-118 bitwise, 4-43 /* NOSTRICT */, lint comment, C-11, D-7 NOT, logical operator (!), 4-115 not equal to operator (!=), 4-132 /* NOTREACHED */, lint comment, C-11 -nstd compiler option, 6-40 -ntype compiler option, 3-62, 4-98, 5-16, 6-42 -nuline compiler option, 6-42 null character \0, 2-9, 3-40 in string constants, 9-18 in strings, 2-10 inserted by fgets(), 8-17 NULL macro, 8-6 null pointers, 3-20, 4-126 to 4-165, 7-11 null statement, 4-2 null string, 2-10 -nwarn compiler option, 6-43, 9-1 operands, 4-3 to 4-11, 4-13 order of evaluation, 4-10 to 4-18 overview of, 4-3 to 4-11 pointer, 4-4, 4-122 to 4-130 pointer arithmetic, 4-124 to 4-165 postfix, 3-42, 4-106 precedence of, 3-42, 4-9 table of, 4-11 prefix, 3-42, 4-106 relational, 4-132 to 4-136 See also comparison operators side effects in, 4-117 side effect, 4-109 sizeof, 4-5, 4-26, 4-143 to 4-144 structure, 4-146 to 4-153 unary, 4-13 union, 4-146 to 4-153 -opt compiler option, 6-24, 6-33 to 6-38 optimization levels, 6-33 to 6-38 optimizations, 6-33 to 6-38 and noreturn #options specifier, 5-19 common subexpressions, 6-35 computing constant expressions at compile time, 6-34 constant folding, 6-35 dead code, 6-35 differences between /com/cc and /bin/cc, 6-3 global register allocation, 6-37 instruction reordering, 6-38 live analysis of local variables, 6-36 loop-invariant expressions, 3-67 reaching definitions, 6-35 rearranging expressions, 6-34 redundant assignment elimination, 6-37 removing loop-invariant expressions, 6-38 turning off, 3-63, 6-29 device attribute specifier, 3-66 volatile attribute specifier, 3-65 options.
See compiler options #options specifier, 5-19, 7-5 OR, logical operator (||), 4-115 order of evaluation, 4-10 to 4-18 and logical operators, 4-116 and side effects, 4-109 overlay sections, 7-26, 7-32 creating, 7-29 to 7-32 P -P compiler option, 6-10, 6-26 -p compiler option, 6-9, 6-39 to 6-40 padding in structures, 3-25 unnamed bit fields, 3-31 page break, forcing with #eject directive, 4-74 parameters. See arguments parentheses, 4-9 to 4-10 in macro definitions, 4-67 used to change precedence in declarations, 3-43 parenthesized expressions, 4-9 to 4-10 Pascal data types, table of, 7-4 Pascal programming language, 1-2, 4-31, 4-133 calling from C, 7-7 to 7-14 sharing data with C, 7-27 to 7-32 type agreement with C, 7-3 to 7-4 pass by reference, 4-148, 5-7, 5-11 to 5-14, 7-5 pass by value, 4-148, 5-7, 5-7 to 5-10 pathnames absolute, 4-104 in #include directives, 4-105 relative, 4-104 PCC. See portable C compiler peb performance enhancement board, compiling code for, 6-9 performance evaluating with prof utility, 6-9 evaluating with the gprof utility, 6-10 performance enhancement board, 6-22 compiling code for, 6-9 -pg compiler option, 6-10 organization, of programs, 2-1 to 2-17 pgm_$invoke system call, compiling with -pic, 6-39 OUT parameters, 7-6 pic.
See position independent code output, standard, redirecting, 6-49 -pic compiler option, 6-30, 6-39 overflow conditions, 4-10, 4-38 floating-point, 4-38 integer, 3-10, 4-37 pointer alignment, C-8 pointer arithmetic, 4-24, 4-124 to 4-165 scaling, 4-125 pointer expressions, 4-79 precision, loss of, 4-37, 4-38 pointer operators, 4-4 predefined macros defined, 4-15, 4-96 to 4-100 systype, 4-15, 4-160 to 4-162 table of, 4-15 pointers, 3-19 to 3-22 accessing array elements through, 4-24 to 4-82 alignment of, D-10 and #attribute modifier, 3-64 and increment operator (++), 2-16 and integers, 4-24 arithmetic with, 4-124 to 4-165 assigning integer values to, 4-126 assigning values to, 4-122 casting, 4-53 to 4-82, 4-125 to 4-165 casting to integer, 4-52 declaring, 3-19, 4-122 dereferencing, 4-123 to 4-165 functions returning, 7-5 generic, 3-21 to 3-23 casting, 4-126 initializing, 3-20 internal representation, 3-20 NIL, 7-11 null, 3-20, 4-126 to 4-165, 7-11 operations with, 4-122 to 4-130 passing as arguments, 5-9, 7-6 passing from C to FORTRAN, 7-22 to 7-23 passing from C to Pascal, 7-11 to 7-12 to char, initializing, 3-41 to functions, 5-20 assigning values to, 5-21 to 5-22 calling functions using, 5-23 to 5-25 dereferencing, 5-23 passing as arguments, 5-24 to 5-25 return type agreement, 5-22 to structures, 4-147 type compatibility of, 4-122, 4-124, 4-138 portability, C-9 to C-11 and integer data types, 3-6 and integer division, 4-21 and pointers to functions, 5-24 global variables, 3-60 Portable C Compiler (PCC), 1-2 position independent code (Pic), 6-39 postfix operators, 3-42, 4-106 pow() function, 5-27 preambles, of functions, 5-2 to 5-4 precedence of operators, 4-9 table of, 4-11 predefined names _DATE_, 4-60 _FILE_, 4-111 _LINE_, 4-111 _STDC_, 4-98, 4-145, 6-42 _TIME_, 4-60 _BFMT_COFF, 4-145 table of, 4-15 prefix operators, 3-42, 4-106 preprocessor differences between Aegis and UNIX, 1-4 execution, 6-26 macros.
See macros UNIX (cpp), 1-4, 4-16, 4-99 preprocessor directives, 2-11, 2-13 #debug, 4-61 to 4-62, 6-22 #define, 4-64 to 4-71, 6-24 #eject, 4-74 #elif, 4-16, 4-99 #else, 4-96 to 4-100 #if, 4-96 to 4-100 #ifdef, 4-96 to 4-100 #ifndef, 4-96 to 4-100 #include, 4-103 to 4-105 #line, 4-112 to 4-113, 9-26 enabling and disabling, 6-42 #list, 4-114 #module, 4-119 to 4-120, 9-28 #nolist, 4-114 #section, 4-140 to 4-142 #systype, 4-160 to 4-162, 6-40 #undef, 4-64 to 4-71 column position in source file, 4-16 overview of, 4-15 table of, 4-16 printf() function, 8-9 prototype for, 5-15 procedure section, 4-140 changing name of, 4-119 procedures, 2-12 processors, compiling code for specific, 6-9, 6-22 to 6-23 -prof compiler option, 6-39 to 6-40 prof utility, 6-9, 6-39 profiling programs See also prof utility prof utility, 6-9 program development, 6-1 to 6-52 program organization, 2-11 to 2-13 program scope, 3-48, 3-51 program start-up, 3-53 programming languages, systems, 1-1 programs compiling, 6-3 to 6-19 debugging, 6-49 to 6-50 developing, 6-1 to 6-3 executing, 6-48 to 6-49 online examples, 1-6 to 1-7 organization of, 2-1 to 2-17 prototypes, 3-62, 4-68 See also function prototypes putc() function, 8-7, 8-12, 8-16 putchar() macro, 8-7, 8-9 puts() function, 8-9 putw() function, 8-12 Q -qg compiler option, 6-10 -qp compiler option, 6-10 qsort() function, 4-33, 8-23 qualifiers, data type, 3-2 quiet type conversions, 4-12 quotes double delimiting strings, 2-10 escape code (\"), 2-8 surrounding filenames, 4-104 single, 2-8, 3-8 escape code (\'), 2-8 read() function, 8-26 read-only variables, 3-68 reading files, 8-16 to 8-25 recursive functions, 5-20 redirecting standard input, 6-49 redirecting standard output, 6-49 redundant assignment elimination, 6-37 reference variables, 3-62 to 3-63 declaring, 3-63 passing arguments by reference, 5-11 to 5-14 turning on and off, 6-42 using for cross-language communication, 7-7 using to return values from functions, 5-18 references backward, 5-6 external,
resolving, 6-43 forward, 5-6 register storage class specifier, 3-55, 3-56, 5-3 in prototypes, 5-13 register variables, 3-56 registers A0, 5-19 and optimized code, 6-34 controlling use of, 5-19 D0, 5-19 device, device attribute specifier, 3-66 floating-point, 6-27 global allocation of, 6-37 instruction address (IADDR), 6-23 preserving, 5-19 used for returning functions, 7-5 relational expressions, 4-115 to 4-118 side effects in, 4-117 relational operators, 4-132 to 4-136 See also comparison operators relative pathnames, 4-104 relocatable code. See position independent code relocation entries, retaining, 6-10 remainder, integer. See modulo division operator R -r compiler option, 6-10 reordering, instruction, 6-38 reserved names, 2-4 range, data types, 3-3 return statement, 4-3, 4-137, 4-137 to 4-139, 5-17 used to exit a switch statement, 4-155 to 4-165 reaching definitions, 6-35 return type, of functions, default, 5-2 random access, I/O, 8-21 to 8-25 return value of functions, 5-2, 5-17 to 5-19 by reference, 5-18 procedure, 4-140 changing name of, 4-119 rewind() function, 8-12 SEEK_CUR macro, 8-21 right-arrow operator (->).
See structure member operator (->) SEEK_END macro, 8-21 right-to-left binding order, 4-9 SET functions, implementing in C, 7-3 Ritchie, Dennis M., 1-1 setbuf() function, 8-6 rounding, 4-38 of floating-point expressions, 4-135 setbuffer() function, 8-6 row-major order, storage of multidimensional arrays, 3-39 setvbuf() function, 8-6 -runtype compiler option, 6-40 SEEK_SET macro, 8-21 setlinebuffer() function, 8-6 shared libraries, compiling with -pic, 6-39 shared memory, volatile attribute specifier, 3-65 S -S compiler option, 6-27 -s compiler option, 6-10 sample programs, online, 1-6 See also examples scalar types, 3-1, 3-2 to 3-3 hierarchy of, 4-14 scaling, pointer arithmetic, 4-125 scanf() function, 8-9 sccs utility, 6-50 scientific notation, 2-7 to 2-8 scope, 3-46, 3-48 to 3-51 block, 3-48, 3-50 to 3-51 file, 3-48, 3-51, 3-61 function, 3-48, 3-51 global, 3-48, 3-51 hierarchy of, 3-48 of functions, 3-61 program, 3-48, 3-51 section attribute specifier, 3-63, 3-69 #section preprocessor directive, 4-140 to 4-142 sections .bss, 7-27 .data, 7-27 data, 4-140 changing name of, 4-119 debug, 4-140 named, 3-60 for global variables, 7-26 overlay, 7-26, 7-32 creating, 7-29 to 7-32 shift left operator (<<), 4-42 shift operators, bitwise, 4-42 sign preservation, 4-43 shift right operator (>>), 4-42 short enum type, 3-3, 3-17 short type, 3-2, 3-6 range, 3-3 representation, 3-7 to 3-8 size, 3-3 side effects, 4-108, 4-109, C-8, D-10 in macro arguments, 4-71 in relational expressions, 4-117 sign reversal operator (-), 4-19 sign-preserving, during bitwise shift operations, 4-43 signatures, function, 2-11 simple statement, 4-2 sin() function, 4-151 built-in version of, 6-47 single quote escape code (\'), 2-8 single quotes, 2-8, 3-8 single-precision floating-point, 3-11, 3-11 to 3-12 size data types, 3-3 structure, 3-26 sizeof operator, 4-5, 4-143 to 4-144 abstract declarators, 3-41 applied to arrays, 4-26 strings, 2-10 sorting bubble_sort() function, 4-32 qsort() function, 4-33
stderr, 8-8 source files, 2-11, 6-3 line numbers in, 4-111 stdin, 8-8 sqrt() function, built-in version of, 6-47 stdout, 8-8 stack frame, 5-19 storage class, 3-46 to 3-56 duration. See duration function, 3-60 to 3-62 of functions, 5-6 scope. See scope table of, 3-56 stack size, 6-31 standard C library, 6-45 to 6-46 standard devices, 8-8 to 8-10 standard I/O library, 8-4 to 8-25 standard input, redirecting, 6-49 standard output, redirecting, 6-49 standards, for the C language, 1-2 to 1-3 start-up, of programs, 3-53 start-up routine, selecting directory of, 6-13 startup routine, 6-43 statement labels, 4-88 scope of, 3-51 statements, 2-13, 4-1 to 4-3 branching, 4-3 break, 4-3, 4-46 to 4-48, 4-155 compound, 4-2 continue, 4-3, 4-57 to 4-59 do/while, 4-3, 4-72 to 4-73 for, 4-3, 4-83 to 4-87, 4-101, 4-102, 4-121 goto, 4-3, 4-88 to 4-90 if, 4-3, 4-91 to 4-95 labeled, 4-88 looping, 4-3 null, 4-2 return, 4-3, 4-137 to 4-139, 4-155, 5-17 simple, 4-2 switch, 4-3, 4-154 to 4-159 while, 4-3, 4-164 to 4-165 static duration.
See fixed duration static storage class specifier, 3-48, 3-51, 3-52, 3-55, 3-61, 5-5 dual meanings of, 3-54 example of, 4-28 stdio.h header file, 8-4, 8-6 storage class specifiers, 2-14, 3-55 to 3-56 auto, 3-55 erroneous use of, 9-4 extern, 3-55, 3-57, 5-5 function allusions, 3-61 omitted, 3-55 register, 3-55, 3-56, 5-3 in prototypes, 5-13 static, 3-48, 3-51, 3-52, 3-55, 3-61, 5-5 dual meanings of, 3-54 strcat() function, built-in version of, 6-47 strcmp() function, built-in version of, 6-47 strcpy() function, built-in version of, 6-47 streams, 8-3 to 8-4 extensible, 8-3 string constants, 2-10 to 2-12 string.h header file, built-in routines for, 6-47 strings, 3-40 to 3-41 constant, 2-10 to 2-12 converted to pointer, 2-10 maximum size, 2-10 initializing, 3-40 maximum size of, 9-3 memory allocation, 3-41 null, 2-10 passing from C to Pascal, 7-9 to 7-10 size of, 9-18 terminating, 9-3 uppercase and lowercase, 2-5 used to initialize char arrays, 3-36 strings.h header file, built-in routines for, 6-47 status codes, returned by system routines, 7-36 strip utility, 6-10 STATUS_$T type, 7-36 strlen() function, built-in version of, 6-47 -std compiler option, 4-148, 6-40 strncat() function, built-in version of, 6-47 std_$call reserved word, 7-2, E-1 to E-27 _STDC_ predefined name, 4-15, 4-98, 4-145, 6-42 strncpy() function, built-in version of, 6-47 Stroustrup, Bjarne, 3-62 structure member operator (.), 4-146 structure member operator (->), 4-147 systype macro, 4-15 structure members, 4-146 to 4-153 alignment, 3-25, 3-31 layout, 3-25 nested, 4-147 referencing, 4-152 systype predefined macro, 4-160 to 4-162 structures, 3-22 to 3-34 alignment, 3-24 to 3-25 array of, 3-28 assigning values to, 4-147 bit fields, 3-31 to 3-32 declaring, 3-23 to 3-24 initializing, 3-33 members of. See structure members memory allocation, 3-27 to 3-29 name space, 3-32 names of, 2-16 operations on, 4-146 to 4-153 passing as arguments, 5-10 passing as function arguments, 4-148 to 4-149 vs.
passing arrays, 4-150 pointers to, 4-147 referencing each other, 3-24 representation, 3-24 to 3-29 returning from functions, 4-150 self-referential, 3-24 size, 3-4, 3-26 #systype preprocessor directive, 4-160 to 4-162, 6-40 systypes, 6-40 designating at compile-time, 6-40 T -T compiler option, 6-11, 6-40 to 6-42 -t compiler option, 6-11 tabs horizontal, escape code (\t), 2-8 vertical, escape code (\v), 2-8 tag names, 3-23 name space, 2-16 tags. See tag names tan() function, 4-151 built-in version of, 6-47 target cpu, compiling for, 6-22 tb utility. See traceback utility .text section, 4-140 Thompson, Ken, 1-1 subexpressions, 4-13, D-10 common, 6-35 time, of program compilation, 4-60 subtraction operator (-), 4-19 and pointers, 4-124 tokens, 2-1 suffixes, filename. See filename suffixes traceback utility (tb), 6-51 switch statement, 4-3, 4-154 to 4-159 symbol table, undefined symbols in, 6-11 true values, 4-133 and logical operators, 4-115 symbolic map. See map files two's-complement notation, 3-7 symbols, predefined. See predefined names type checking, C-5, D-6 sys5 systype, 6-41 default systype, 4-161 -type compiler option, 6-42 sys5.3 systype, 6-41 system libraries, 6-44 linking to, 7-36 system service routines, 7-35 to 7-36 systems programming language, 1-1 -systype compiler option, 4-161, 6-40 to 6-42 SYSTYPE environment variable, 6-42 _TIME_ predefined name, 4-15, 4-60 top-level declarations, 3-46 type conversions, 4-12 to 4-14 and initialization, 3-5, 3-9 arithmetic, 4-12 array to pointer, 4-24, 4-27 assignment, 4-12, 4-36 to 4-82 automatic, 4-12 casts, 4-49 to 4-53 floating-point to integer, 4-21 implicit, 4-12 integer, table of, 4-50 integer widening, 4-50 integral promotion. See integral widening conversions integral widening, 4-12 quiet, 4-12 rounding, 4-38 UNIX compiling for different versions, 6-40 different versions of, 4-160 echo program, 5-26 executing programs in, 6-48 type managers.
See IOS type managers unlink() function, 8-26 type specifiers, char, 3-8 unnamed bit fields, 3-31 type-checking none for macro arguments, 4-67 of function arguments, 5-12 of function return values, 4-137 unsigned char type, representation, 3-8 to 3-9 typedef declarations, 2-14 to 2-16 and arrays, 2-16 used to simplify declarations, 3-42 unsigned int type, representation, 3-6 to 3-7 unsigned integers, casting, 4-51 unsigned short type, representation, 3-7 to 3-8 typedef keyword, 2-14 unsigned type, 3-2, 3-6 integer overflow, 3-10 range, 3-3 size, 3-3 types. See data types unused functions, C-3, D-3 unused variables, C-3, D-3 U -U compiler option, 6-11 user-level blocks, 8-5 /usr/include directory, 4-104, 6-28, 6-45, 7-35 /usr/lib/o directory, 6-6 -u compiler option, 6-11 V -uline compiler option, 6-42 unary operators, 4-13 unbuffered I/O, 8-25 to 8-27 underscore (_), used in identifiers, 2-4 -v compiler option, 6-11 /* VARARGS */, lint comment, C-11, D-2 /* VARARGS2 */, lint comment, C-12 underscore, appended to FORTRAN routine names, 7-14 variable length record files, 8-3 ungetc() function, 8-12 variable names, 2-14 union members, 4-146 to 4-153 nested, 4-147 referencing, 4-152 unions, 3-22 to 3-34 assigning values to, 4-147 bit fields, 3-31 to 3-32 declaring, 3-23 to 3-24, 3-29 initializing, 3-33 to 3-35 members of. See union members memory allocation, 3-30 name space, 3-32 names of, 2-16 operations on, 4-146 to 4-153 passing as arguments, 5-10 referencing each other, 3-24 representation, 3-29 to 3-30 size, 3-4 variable number of arguments, 5-15 variables integer, initializing, 3-9 to 3-10 list of, 6-32 names of, 3-49 reference. See reference variables unused, C-3, D-3 version selector, systype, 4-161 vertical tab escape code (\v), 2-8 virtual address. See address virtual memory, 6-39 visibility, of names, 3-50 void, pointers to.
See generic pointers void type, 3-2, 3-18 to 3-19 casting, 4-49 functions returning, 4-138 illegal with arrays, 3-35 used as function return type, 5-2 volatile attribute specifier, 3-63, 3-64 to 3-66 word alignment, 3-25, 3-27 write() function, 8-26 write-only variables, 3-68 writing to files, 8-16 to 8-25 W -w compiler option, 6-12 -w compiler option, 6-43 -warn compiler option, 6-43 warning messages, 9-1 to 9-37 compilation, 6-30 suppressing, 6-43 while loops, 4-84 X X3J11 Technical Committee, 1-2 -x compiler option, 6-12 XOR operator. See exclusive OR operator Y while statement, 4-3, 4-164 to 4-165 -y compiler option, 1-3, 6-12 white space, 2-2 yacc utility, 4-113
Reader's Response
Please take a few minutes to send us the information we need to revise and improve our manuals from your point of view.
Document Title: Domain C Language Reference Order No.: 002093-A00 Date of Publication: July, 1988
What type of user are you? ____ System programmer; language ____ ____ Applications programmer; language ____ ____ System maintenance person ____ Manager/Professional ____ System Administrator ____ Technical Professional ____ Student Programmer ____ Novice ____ Other
How often do you use the Domain system? ____
What parts of the manual are especially useful for the job you are doing? ____
What additional information would you like the manual to include? ____
Please list any errors, omissions, or problem areas in the manual. (Identify errors by page, section, figure, or table number wherever possible. Specify additional index entries.) ____
Your Name Date Organization Street Address City State Zip
No postage necessary if mailed in the U.S.
NO POSTAGE NECESSARY IF MAILED IN THE UNITED STATES BUSINESS REPLY MAIL FIRST CLASS PERMIT NO.
78 CHELMSFORD, MA 01824 POSTAGE WILL BE PAID BY ADDRESSEE APOLLO COMPUTER INC. Technical Publications P.O. Box 451 Chelmsford, MA 01824