9. Mangled identifier names

  9.1. Mangling identifier names
  9.2. Mangling namespace names
  9.3. Mangling types
  9.4. Other mangled names
  9.5. Mangled name examples

In a similar fashion to other C++ compilers, the C++ producer needs a method of mapping C++ identifiers to a form suitable for further processing, namely TDF tag names. This mangled name contains an encoding of the identifier name, its parent namespace or class and its type. Identifiers with C linkage are not mangled. The producer contains a built-in name unmangler which performs the reverse operation of transforming the mangled form of an identifier name back to the underlying identifier. This can be useful when analysing system linker errors.

Note that the type of an identifier forms part of its mangled name not only for functions, but also for variables. Many other compilers do not mangle variable names; however, the ISO C++ rules on namespaces and variables with C linkage make it necessary (this behaviour can be suppressed using the -j-n command-line option). Declaring the language linkage of a variable inconsistently can therefore lead to linking errors with the C++ producer which are not detected by other compilers. A common example is:

extern int errno ;

which, leaving aside whether errno is actually an external variable, should be:

extern "C" int errno ;

As described above, the mangled form of an identifier has three components: the identifier name, the identifier namespace and the identifier type. Two underscores (__) separate the name component from the namespace and type components. The mangling scheme is based on that described in the ARM (the Annotated C++ Reference Manual). The description below is not complete; the mangling and unmangling routines themselves should be consulted for a complete description.
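As a rough illustration of how the three components fit together, the following sketch joins a name, a namespace and a type component with the __ separator. The function join_mangled and its parameter names are hypothetical and are not part of the producer; each component is assumed to have been mangled already.

```cpp
#include <string>

// Hypothetical helper: combine already-mangled components into one name.
// `scope` is empty for the global namespace; `type` may be empty where
// the examples in section 9.5 show no type component (e.g. __ct__1A).
std::string join_mangled(const std::string& name,
                         const std::string& scope,
                         const std::string& type) {
    return name + "__" + scope + type;
}
```

With the components taken from the examples in section 9.5, join_mangled("a", "1A", "i") gives a__1Ai and join_mangled("b", "", "i") gives b__i.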

9.1. Mangling identifier names

Simple identifier names are mapped to themselves. Unicode characters of the forms \uxxxx and \Uxxxxxxxx are mapped to __kxxxx and __Kxxxxxxxx respectively, where the hex digits are output in their canonical lower-case form. Constructors are mapped to __ct and destructors to __dt. Conversion functions are mapped to __optype, where type is the mangled form of the conversion type. Overloaded operator functions, operator@, are mapped as follows:

Operator    Mapping    Operator    Mapping    Operator    Mapping
&           __ad       &=          __aad      []          __vc
->          __rf       ->*         __rm       =           __as
,           __cm       ~           __co       /           __dv
/=          __adv      ==          __eq       ()          __cl
>           __gt       >=          __ge       <           __lt
<=          __le       &&          __aa       ||          __oo
<<          __ls       <<=         __als      -           __mi
-=          __ami      --          __mm       !           __nt
!=          __ne       |           __or       |=          __aor
+           __pl       +=          __apl      ++          __pp
%           __md       %=          __amd      >>          __rs
>>=         __ars      *           __ml       *=          __aml
^           __er       ^=          __aer      delete      __dl
delete []   __vd       new         __nw       new []      __vn
?:          __cn       :           __cs       ::          __cc
.           __df       .*          __dm       abs         __ab
max         __mx       min         __mn       sizeof      __sz
typeid      __td       vtable      __tb

Note that this table contains a number of operators which are not part of C++ or cannot be overloaded in C++. These are used in the representation of target dependent integer constants.
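The name mappings of this section can be sketched as a table lookup plus a small formatting routine. This is only an illustration: the function names mangle_operator and mangle_ucn are hypothetical, the lookup table holds just a subset of the operators listed above, and the choice between the short and long Unicode forms is passed in explicitly because the text does not say how the producer selects it.

```cpp
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical lookup covering a subset of the operator table.
// Returns an empty string for operators not in the subset.
std::string mangle_operator(const std::string& op) {
    static const std::map<std::string, std::string> table = {
        {"=", "__as"},   {"==", "__eq"}, {"!", "__nt"}, {"!=", "__ne"},
        {"()", "__cl"},  {"[]", "__vc"}, {"+", "__pl"}, {"+=", "__apl"},
        {"new", "__nw"}, {"delete", "__dl"},
    };
    const auto it = table.find(op);
    return it == table.end() ? std::string() : it->second;
}

// Hypothetical helper for Unicode characters: \uxxxx maps to __kxxxx and
// \Uxxxxxxxx maps to __Kxxxxxxxx, with lower-case hex digits. The caller
// states which source form was used (an assumption of this sketch).
std::string mangle_ucn(std::uint32_t value, bool long_form) {
    char buf[16];
    std::snprintf(buf, sizeof buf, long_form ? "__K%08lx" : "__k%04lx",
                  static_cast<unsigned long>(value));
    return std::string(buf);
}
```

For instance, mangle_operator("!") yields __nt, matching the name component of bool operator! () in section 9.5.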

9.2. Mangling namespace names

The global namespace is mapped to an empty string. Simple namespace and class names are mapped as above, but are preceded by a series of decimal digits giving the length of the mangled name. Nested namespaces and classes are represented by a sequence of such namespace names, preceded by the number of elements in the sequence. This prefix takes the form Qdigit if there are fewer than 10 elements, or Q_digits_ if there are 10 or more. Note that members of anonymous classes or namespaces are local to their translation unit, and so do not have external tag names.
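The rules above might be sketched as follows. The helper names mangle_simple_name and mangle_scope are hypothetical, and the sketch assumes that the elements of a nested name are written outermost first and that each name is a simple identifier whose mangled length equals its source length (no Unicode escapes).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: length-prefix one simple namespace or class name,
// e.g. "A" becomes "1A".
std::string mangle_simple_name(const std::string& name) {
    return std::to_string(name.size()) + name;
}

// Hypothetical helper: mangle a scope given outermost first, e.g. {"N", "A"}
// for N::A. An empty vector stands for the global namespace.
std::string mangle_scope(const std::vector<std::string>& scope) {
    std::string out;
    const std::size_t n = scope.size();
    if (n >= 2) {
        // Nested names carry an element count: Qdigit below 10 elements,
        // Q_digits_ otherwise.
        out = (n < 10) ? "Q" + std::to_string(n)
                       : "Q_" + std::to_string(n) + "_";
    }
    for (const std::string& name : scope) {
        out += mangle_simple_name(name);
    }
    return out;
}
```

Under these assumptions a single class A gives 1A, as in the examples of section 9.5, while a class A nested in a namespace N would give Q21N1A.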

9.3. Mangling types

The mangling of types is essentially similar to that used in the tdfc2dump symbol table dump format. The type used in the mangled name for an identifier ignores the return type for a function and ignores the most significant bound for an array.

The built-in types are mapped in precisely the same way as in the symbol table dump. Class and enumeration types are mapped to their type names mangled in the same way as the namespace names above. The exception to this is that in a class member, the parent class is mapped to X.

The composite types are again mapped in a similar fashion to that in the dump file. For example, PCc represents const char *. The only difficult case concerns function parameter types, where the ARM T and N encodings are used for duplicate parameter types. The return type of a function type is included in the mangled form, except when the type is that of a function identifier itself. In the cases where the identifier is known always to represent a function (constructors, destructors etc.) the initial F indicating a function type is also omitted.

The types of template functions and classes are represented by the underlying template and the template arguments giving rise to the instance. Template classes are preceded by t; template functions are preceded by G rather than F. Type arguments are represented by Z followed by the type value; non-type arguments are represented by the argument type followed by the argument value. In the underlying type the template parameters are represented by m0, m1 etc. An alternative scheme, in which the mangled form of a template function includes the type of that instance, rather than the underlying template, can be enabled using the -j-f command-line option.

9.4. Other mangled names

The virtual function table for a class, when this is a variable with external linkage, is named __vt__type, where type is the mangled form of the class name. The virtual function table for a base class is named __vt__base, where base is a sequence of mangled class names specifying the base class. The run-time type information structure for a type, when this is a variable with external linkage, is named __ti__type, where type is the mangled form of the type name.
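A minimal sketch of these two naming rules, assuming the class or type name has already been mangled, is given below. The helper names are hypothetical, and the base class form __vt__base is not covered because the layout of the class name sequence is not spelled out here.

```cpp
#include <string>

// Hypothetical helpers; `type` is an already-mangled class or type name.
std::string vtable_name(const std::string& type) {
    return "__vt__" + type;   // virtual function table
}

std::string typeinfo_name(const std::string& type) {
    return "__ti__" + type;   // run-time type information structure
}
```

For the class A used in section 9.5, whose mangled name is 1A, these give __vt__1A and __ti__1A.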

9.5. Mangled name examples

The following gives some examples of the name mangling scheme:

class A {
    static int a ;               // a__1Ai
public :
    A () ;                       // __ct__1A
    A ( int ) ;                  // __ct__1Ai
    A ( const A & ) ;            // __ct__1ARCX
    virtual ~A () ;              // __dt__1A
    operator bool () ;           // __opb__1A
    bool operator! () ;          // __nt__1A
} ;

// virtual function table        __vt__1A
// run-time type information     __ti__1A

int f ( A *, int, A * ) ;        // f__FP1AiT1
int b = 2 ;                      // b__i
int c [3] ;                      // c__A_i

namespace N {
    int *p = 0 ;                 // p__1NPi
}