/*************************************************************************/ TSM_UTIL.TXT TURBO ASSEMBLER 4.0 This file documents the utilities that can be used with TASM 4.0. The following utilities are documented in this file: 1) H2ASH and H2ASH32 2) TCREF 1) Online Documentation for H2ASH and H2ASH32 ============================================== C/C++ modules within a program typically share definitions of types and data structures among themselves. They do this by including small files called header files, which contain type definitions and data structure definitions used by more than one module. A C/C++ header file usually has a name ending in H. Similarly, assembly modules use header files with names that end in .ASH (or .INC). Programs containing modules written in both C or C++ and modules written using Turbo Assembler must be able to share definitions of types and data structures. The H2ASH converter utility lets this happen. The converter has the following limitations: -- All executable statements are ignored; no code output is generated. -- #INCLUDEs are fully expanded in the output. -- Name conflicts can occur in the assembly output since C++ scoping rules differ greatly from assembly rules; using Ideal mode output can help. -- Multiple Inheritance, Virtual Base Classes, and Templates are not supported in this version of H2ASH. The H2ASH converter not only smooths the interface between C/C++ and assembly language, it also lets you write C++ base classes directly in assembly language (for maximum performance). H2ASH automatically converts base class declarations in C++ into equivalent assembly object definitions, and defines method names. We recommend that you use the following procedure for writing base classes in assembly language: 1) Write the class declaration in a C/C++ header file. 2) Include the C header file in all descendant classes, and in the C/C++ modules that use the base class. 3) Use the H2ASH converter on the header file to produce an assembly object header file; you can automate this by setting up a makefile. 4) Include the assembly object header file in the module where the assembly language methods are written. Command Line Interface for H2ASH -------------------------------- You can invoke H2ASH by typing the following command on the command line: H2ASH [[switches] [ ... ]] H2ASH triggers the printing of a help screen. By default, H2ASH accepts a C/C++ header file, and produces a Turbo Assembler (TASM) 3.0 compatible assembly include file with extension .ASH. The resulting file is not a self-contained assembly language program; rather, it contains TASM type declarations and EQU statements that correspond to declarations in the original C/C++ header file. H2ASH recognizes wild cards for filename matching. All H2ASH-specific command line switches are a submode of the -q switch, that is they all have the following form: -q[] [-] H2ASH also recognizes a majority of the Borland C++ command-line switches. In particular, word-alignment (-a), force C++ compile (-P), set output file name (-o), define symbol (-D), and the compilation model directives (-mt, -ms, -ml, ...), are all supported. For a definitive description of the BCC command-line interface, see the Borland C++ User's Guide. H2ASH allows the following switches: -q- turn off the H2ASH converter -qc pass c/c++ comments into the .ASH file -qn enable name-mangling -qf a 'strict' mode that strips out all untranslatable source code (as opposed to the default behavior of issuing a warning and partially translating the declaration in question) -qi generate Ideal mode code, as opposed to Masm mode code -qd generate EQUs for #DEFINE statements which appear on the command-line, in project configuration file, or in the original source program -qb generate EQUs for #DEFINE statements which are normally generated automatically by the compiler -qt generate TASM directives as needed (H2ASH will emit .DATA and .MODEL directives whenever static declarations appear in the original C/C++ header file. The IDEAL directive is generated whenever Ideal mode is used) -qs pass original source code into the .ASH file Processing #DEFINE directives and macros ---------------------------------------- H2ASH accepts C preprocessor #DEFINE directives and generates a corresponding EQU directive containing the symbol name and its associated value. Hexadecimal and octal constants force H2ASH to append an 'h' or 'q' radix specifier onto the end of the constant value. Decimal constants are passed directly to the output stream, while floating point numbers are treated as text literals. Here's an example of input and output: ---- H2ASH input ---- #define ONE 1 #define NEG_ONE -1 #define OCTAL_ONE 01 #define HEX_ONE 0x1 #define FLOAT_ONE 1.000 ---- H2ASH output ---- ONE EQU 1 NEG_ONE EQU <-1> OCTAL_ONE EQU 01q HEX_ONE EQU 01h FLOAT_ONE EQU <1.000> All other macro #DEFINEs generate text EQUs into the H2ASH output stream. However, if the #DEFINE contains any macro arguments, a warning is issued and the macro in question is not processed. For example, ---- H2ASH input ---- #define PI 22/7 #define LOOP_FOREVER while(1) { #define seq(s,t) strcmp(s,t) ---- H2ASH output ---- PI EQU <22/7> LOOP_FOREVER EQU H2ASH does not directly translate C preprocessor conditional #IF/#ELSE/#ENDIF directives into their TASM equivalents. Rather, it evaluates the constant-expression that is the argument to the conditional directive, and then passes the appropriate clause of the conditional into the output stream, as follows: ---- H2ASH input ---- #if 0 #define PI 3.14159 #else #define PI 22/7 #endif ---- H2ASH output ---- ZOO PI <22/7> Processing Simple Declarations ------------------------------ H2ASH does not process declarations that have storage class 'auto'; that is, any C/C++ declarations contained in compound-blocks are ignored. However, declarations that have static linkage are translated into TASM 'DB', 'DW', 'DD', .... statements, depending on the size or type of the variable. Further- more, declarations that have external linkage are translated into TASM 'GLOBAL' statements with an appropriate type specifier that is dependent on the size of the C/C++ variable. H2ASH ignores all initializes for declara- tions if the initializer expression is not a constant expression. If this situation occurs, H2ASH issues a warning message. For example, ---- H2ASH input ---- char a_char; unsigned char unsigned_char; signed char signed_char; int a_integer; unsigned int unsigned_integer; signed int signed_integer; long a_long; unsigned long unsigned_long; signed long signed_long; float a_float; double a_double; long double long_double; ---- H2ASH output ---- GLOBAL C a_char :BYTE GLOBAL C unsigned_char :BYTE GLOBAL C signed_char :BYTE GLOBAL C a_integer :WORD GLOBAL C unsigned_integer :WORD GLOBAL C signed_integer :WORD GLOBAL C a_long :DWORD GLOBAL C unsigned_long :DWORD GLOBAL C signed_long :DWORD GLOBAL C a_float :PWORD GLOBAL C a_double :QWORD GLOBAL C long_double :TBYTE H2ASH ignores the type modifiers 'const' and 'volatile'. However, the linkage specifiers 'pascal' and 'cdecl', do generate the corresponding TASM equivalents. Static variables, as required by the ANSI C standard, generate a default zero initializer expression of the appropriate width for the data type in question. ---- H2ASH input ---- const int const_int; volatile int volatile_int; static int static_int; static double static_double; extern int extern_int; int pascal int_pascal; int cdecl int_cdecl; ---- H2ASH output ---- GLOBAL C const_int :WORD GLOBAL C volatile_int :WORD .MODEL SMALL .DATA static_int DW 0 static_double DQ 0.0 GLOBAL C extern_int :WORD GLOBAL PASCAL int_pascal :WORD GLOBAL C int_cdecl :WORD Processing Pointer/Array Declarations ------------------------------------- Pointer declarations are handled in a similar manner to basic variable declarations, except that the 'PTR' type specifier is introduced for each level of pointer/reference indirection. Note that any initializer that requires the creation of a temporary variable, or is not a constant expression is ignored. For example, ---- H2ASH input ---- char *v1; int **v2; char far *v3; char near *v4; ---- H2ASH output ---- GLOBAL C v1 :NEAR PTR BYTE GLOBAL C v2 :NEAR PTR NEAR PTR WORD GLOBAL C v3 :FAR PTR BYTE GLOBAL C v4 :NEAR PTR BYTE Arrays are treated like basic variable declarations, where the type specifier consists of the element type of the array, followed by an optional repetition field that contains the dimension of the array. If the array is an incomplete declaration, (that is, if it's dimension is not fully specified) the repetition field is not passed into the output stream. For example, ---- H2ASH input ---- extern char a1[]; extern char a2[10]; extern char a3[][10]; extern char a4[10][20]; ---- H2ASH output ---- GLOBAL C a1: BYTE GLOBAL C a2: BYTE : 10 GLOBAL C a3: BYTE GLOBAL C a4: BYTE : 200 Processing TYPEDEF Declarations ------------------------------- C/C++ TYPEDEF declarations are mapped directly onto their equivalent TASM counterpart. When the typedef alias (the name of the typedef) is used as the type specifier in another declaration, the actual type of the typedef, as op- posed to the typedef alias itself, is emitted into the output stream. For example, ---- H2ASH input ---- typedef unsigned long ul; typedef ul alias_ul; alias_ul ul_object; ---- H2ASH output ---- ul TYPEDEF DWORD alias_ul TYPEDEF DWORD GLOBAL C ul_object :DWORD Processing ENUM Declarations ---------------------------- C/C++ enumeration types are converted directly into TASM 'ENUM' defini- tions. Each enumerator is given an explicit initializer if the value of the enumerator changes by more than one unit from the previously processed enum- erator, or if the enumerator already has an explicit initializer in the C/C++ source code. As usual, this initializer must contain a constant- expression. H2ASH automatically synthesizes a tag name if the enumeration does not contain a tag. This generated name is always unique within the compilation unit it occurs in, and can be in subsequent variable declarations. For example, ---- H2ASH input ---- enum FOO { a, b, c = -1, d = sizeof(enum FOO)}; enum { north, east, south, west }; ---- H2ASH output ---- FOO ENUM { a, b, c=-1, d=2, pad$0=256} tag$0 ENUM { north, east, south, west, pad$1=256} Finally, if the enumeration is not representable within a declaration of size BYTE, H2ASH generates an extra enumerator (which has a unique name), and appends it to the end of the enumeration list. This enumerator is used to induce TASM to pack the enumeration within a data type of size WORD. This situation can occur whenever you use the -b option, or whenever the value of an enumerator exceeds a value representable within a BYTE. For example, ---- H2ASH input ---- enum direction { north, east = 254, south, west }; enum color { red, green = 256, blue }; ---- H2ASH output ---- direction ENUM { north, east=254, south, west, pad$0=256} color ENUM { red, green=256, blue} Processing Simple Aggregate Declarations ---------------------------------------- Structures and unions are mapped by H2ASH onto their direct TASM STRUC and UNION counterparts. If a tag is not present in the aggregate declaration, H2ASH automatically generates one. The GLOBAL directive is used to represent an instance of the structure, where the type-field contains that tag name of the structure or union in question. For example, ---- H2ASH input ---- struct S1 { int field1; int field2; } S1_object; union U1 { int field1; int field2; int (*field3)(void); } U1_object; ---- H2ASH output ---- S1 STRUC @S1@field1 DW ? @S1@field2 DW ? S1 ENDS GLOBAL C S1_object :S1 U1 UNION @U1@field1 DW ? @U1@field2 DW ? @U1@field3 DW NEAR PTR ? U1 ENDS GLOBAL C U1_object :U1 Using the word-alignment (-a) option lets H2ASH explicitly output appropriate padding. For the exact structure-alignment rules, see the C++ Programmer's Guide. For example, ---- H2ASH input ---- struct S2 { char field1; int field2; }; ---- H2ASH output ---- S2 STRUC @S2@field1 DB ? DB 0 @S2@field2 DW ? S2 ENDS H2ASH treats nested structures as separate TASM STRUC declarations, and gives the nested structures specially mangled names. This is done because nested structures are not true members; they are merely in the lexical scope of the structure in which they are contained. However, if a nested structure is instantiated (for example, an object of the type of that structure is declared), the appropriate space is reserved in the enclosing structure as in the following example: ---- H2ASH input ---- struct a { int i; struct b { int j; }; int k; struct c { int l; } c_instance; }; ---- H2ASH output ---- @a@b STRUC @a@b@j DW ? @a@b ENDS @a@c STRUC @a@c@l DW ? @a@c ENDS a STRUC @a@i DW ? @a@k DW ? @a@c_instance @a@c <> a ENDS Processing Bitfield Declarations -------------------------------- Bitfields are represented by H2ASH as TASM RECORD declarations. In general, it generates two kinds of RECORDS. First, if a STRUCT contains only bitfield declarations and if each of the bitfields are of the same type, and if this type is sufficiently large to contain all of the fields, H2ASH generates a global RECORD declaration. H2ASH inserts a uniquely named padding field at the top of the RECORD so that proper alignment is achieved. For example, ---- H2ASH input ---- struct S1 { int field1: 1; int field2: 4; }; ---- H2ASH output ---- S1 RECORD { pad$0 :11 field2 :4 field1 :1 } If H2ASH is unable to generate a global RECORD declaration, a local one is emitted within the STRUC in which the original bitfield is declared. Furthermore, a member with the type of the tag of the local RECORD declaration is declared to reserve the appropriate amount of storage. H2ASH attempts to pack adjacent bitfields within a single record when adjacent bitfields are of the same type, and whenever the type of the bitfield is large enough to contain the adjacent fields. For example, ---- H2ASH input ---- struct S1 { int field1; char field2: 1; char field3: 8; // NOTE: cannot be packed with field2 }; ---- H2ASH output ---- S1 STRUC @S1@field1 DW ? S1 RECORD { pad$0 :7 field2 :1 } $bit$0 S1 <> S1 RECORD { field3 :8 } $bit$1 S1 <> S1 ENDS Processing Function/Operator Declarations ----------------------------------------- File scope function declarations are emitted as GLOBAL declarations with type NEAR or FAR, depending on the model settings. Prototypes for func- tions are ignored, because TASM does not support typed CALLs or typed PROTO statements. However, the prototype is encrypted by the mangled name of the function. For example, ---- H2ASH input ---- void fvv(void); extern efvv(void); int fiii(int, int); ---- H2ASH output ---- GLOBAL C @fvv$qv :NEAR ;or far, depending on model GLOBAL C @efvv$qv :NEAR ;or far, depending on model GLOBAL C @fiii$qii :NEAR ;or far, depending on model H2ASH ignores default arguments, as well as all function bodies. In both cases, H2ASH issues a warning and processes that declaration as if the default arguments and function bodies were not present. H2ASH also ignores static function declarations. In this case, the declaration is not emitted into the output stream. Here's an example: ---- H2ASH input ---- static sfvv(void); void dfii(int i = 0, int = sizeof(foo)); int fvv(int i) { return ++i; } ---- H2ASH output ---- ; warning, declaration of static function 'sfvv(...)' ignored void C @dfii$qii :NEAR ; warning, default arguments ignored void C @fvv$qi :NEAR ; warning, function body ignored H2ASH supports function and operator overloading by encoding function prototypes and operator names. Otherwise, H2ASH treats these declarations in exactly the same manner as ordinary functions. For example, ---- H2ASH input ---- int abs(int, int); int abs(int); struct alpha; int operator+(alpha, int); int operator+(int, alpha); ---- H2ASH output ---- GLOBAL C @abs$qii :NEAR GLOBAL C @abs$qi :NEAR GLOBAL C @$badd$q5alphai :NEAR GLOBAL C @$badd$qi5alpha :NEAR Processing Classes ------------------ C++ classes are mapped onto TASM STRUC declarations, just as C STRUCTS are. Nonstatic class data members are treated as ordinary STRUCT fields. Static class data members are treated as GLOBAL declarations, and are emitted after the class declaration in which they are declared. Nonvirtual function members are treated exactly like ordinary function definitions, except, that they receive a mangled name that encodes the tagname of class in which they are contained. Virtual function members are treated exactly like nonvirtual functions, except that they force H2ASH to allocate space for a virtual- function-table-pointer. This pointer has a mangled name containing the suffix 'vptr$', which is always unique throughout a single compilation unit. Finally, all 'special' member functions (constructors, copy constructors, assignment operators), are treated as ordinary function declarations. However, if they are compiler-synthesized members, H2ASH does not emit them. For example, ---- H2ASH input ---- class c { static int static_member; int normal_member; public: virtual void f1_virtual(void); virtual void f2_virtual(void); void f3_normal(void); void *operator ++(); c(); c(int&); ~c(); }; ---- H2ASH output ---- c STRUC @c@vptr$0 DW NEAR PTR ? @c@normal_member DW ? c ENDS GLOBAL C @c@static_member :WORD GLOBAL C @c@f1_virtual$qv :NEAR GLOBAL C @c@f2_virtual$qv :NEAR GLOBAL C @c@f3_normal$qv :NEAR GLOBAL C @c@$binc$qv :NEAR GLOBAL C @c@$bctr$qv :NEAR GLOBAL C @c@$bctr$qri :NEAR GLOBAL C @c@$bdtr$qv :NEAR H2ASH supports single inheritance. If a program using multiple inheri- tance is presented as input, or if virtual bases are present, H2ASH terminates and gives an appropriate error message. Within a derived class, a base class is represented as a member subobject, and is treated by H2ASH as if it were an ordinary member with a specially synthesized name: 'subobject$' which has the type of the base class in question. Again, this name is always unique. Virtual function table pointers are shared between the base and derived class; hence, no further space is allocated for the virtual-pointer within the derived class. For example, adding a derived class to the previous example: ---- H2ASH input ---- // previous definition for class c goes here. class d : c { int derived_member; virtual void f1_virtual(void); // virtual override of c::f1_virtual() d(); ~d(); }; ---- H2ASH output ---- d STRUC subobject$0 c <> @d@derived_member DW ? d ENDS GLOBAL C @d@f1_virtual$qv :NEAR GLOBAL C @d@$bctr$qv :NEAR GLOBAL C @d@$bdtr$qv :NEAR Pointers to class members are also supported, both pointers to data members and pointers to function members. Here's a simple example: ---- H2ASH input ---- class f{ public: int x; int * px; int foo( char * ) { return px } }; int f::*pointer_to_data_member; int (f::*pointer_to_function)(char *); ---- H2ASH output ---- f STRUC @f@x DW ? @f@px DW NEAR PTR ? f ENDS GLOBAL C @f@foo$qpzc :NEAR vb_data$mptr STRUC vb_data$member_offset DW 0 vb_data$vbcptr_offset DW 0 vb_data$mptr ENDS GLOBAL C pointer_to_data_member :vb_data$mptr vb_near_func$mptr STRUC vb_near_func$func_addr DW 0 vb_near_func$member_offset DW 0 vb_near_func$vbcptr_offset DW 0 vb_near_func$mptr ENDS GLOBAL C pointer_to_function :vb_near_func$mptr C++ Templates are not supported in this version of H2ASH. If a program containing templates is given as input, H2ASH outputs an error and terminates execution. 2) TCREF: The source module cross-reference utility =================================================== TABLE OF CONTENTS ----------------- 1. What is TCREF? 2. How to Use TCREF 3. Compatibility with TLINK 4. TCREF Options 5. TCREF Reports 1. WHAT IS TCREF? ----------------- TCREF is designed to produce two reports: a cross-reference list of where all global symbols are used and defined, and a list of individual modules and the symbols used within them. 2. HOW TO USE TCREF ------------------- TCREF accepts as input a group of .XRF files produced by TASM. These files contain cross-reference information for individual modules. From these input files, a single .REF file is produced that contains one or more reports in ASCII text. The command format follows: TCREF ',' For example, the following would take the FOO1.XRF, FOO2.XRF, and FOO3.XRF as input files and produce FOO.REF: TCREF foo1+foo2+foo3,foo Response files --------------- TCREF also accepts ASCII files as command strings. Simply precede the file name with an @@ sign to include a file in the command string. For example, TCREF @@dofoo where DOFOO contains foo1+foo2+foo3,foo will do the same thing as the previous example. 3. COMPATIBILITY WITH TLINK --------------------------- TCREF accepts command strings that TLINK accepts. TCREF ignores any irrelevant switches and fields, such as any libraries or MAP files, or switches that pertain only to the linker function. Similarly, if an .XRF file cannot be located, TCREF will simply ignore it. Beware! When using a TLINK response file, don't explicitly specify file extensions, since doing so will override TCREF's internal defaults and possibly result in disaster. For example, if the response file reads as foo1+foo2+foo3,foo.exe you should not use this file without modification with TCREF because the .REF file it creates will be named FOO.EXE, presumably overwriting your program. 4. TCREF OPTIONS ---------------- TCREF accepts all the switches present in TLINK, but most of them are discarded. TCREF only uses these options: /c makes GLOBAL report case-sensitive. /r generates LOCAL reports for all the specified modules. /p# sets report page length to # lines. /w# sets report page width to # columns. 5. TCREF REPORTS ---------------- TCREF takes great care to make semantic sense of symbols. Cross-reference information is useless when symbols with the same name but different meanings are lumped together. TCREF therefore takes into account the SCOPE of a symbol when producing its reports. Cross-reference information is always listed for the source file and source line number. Global (or linker-scope) report ------------------------------- TCREF's global report lists cross-reference information for global symbols as they appear to the linker. Use the /c switch if you want to produce case-sensitive reports. In this report, global symbols appear alphabetically in the left column. References, organized by source file, are listed in the right column. Wherever #'s appear indicates that definition occurs at that line. Here's an example symbol printout: Global Symbols Cref # = definition BAR TEST.ASM: 1 3 6 9 12 15 18 + 21 23 29 # TEST2.ASM: 2 4 6 #8 What does this tell you? The leading # sign before the TEST2.ASM indicates that BAR was defined somewhere in that module. For each source file, the source line at which the reference occurred is listed. This list can occupy more than one line, as in the case of the lines for TEST.ASM. The + character indicates that wrap has occurred. Finally, the # sign before the 8 indicates that a definition of BAR occurred in TEST2.ASM on line 8. The local (or module-scope) report ---------------------------------- If you specify /r on the command line, a local report will be made for each module. It will contain all the symbols used in that module, listed in alphabetical order. The /c switch will have no effect on these reports, since the appropriate case-sensitivity has already been determined at assembly time. Like global reports, references are organized by source file in the right column. A sample printout looks like this: Module TEST.ASM Symbols Cref # = definition UGH TEST.ASM: 1 3 6 9 12 15 18 + 21 23 29 # UGH.INC: #2 /**************************** END OF FILE ********************************/