6. Classes

  6.1. Class layout
  6.2. Derived class layout
    6.2.1. Single inheritance
    6.2.2. Multiple inheritance
    6.2.3. Virtual inheritance
  6.3. Constructors and destructors
  6.4. Virtual function tables

6.1. Class layout

Consider a class with no base classes:

class A {
    // A's members
} ;

Each object of class A needs its own copy of the non-static data members of A and, for polymorphic types, a means of referencing the virtual function table and run-time type information for A. This is accomplished using a layout of the form:

[Diagram; labels: A, vptr A, typeid A, ptr typeid A, voff A, vfunc1, vfunc2, ..., vfuncn]
Figure 1. Class A

where the A component consists of the non-static data members and vptr A is a pointer to the virtual function table for A. For non-polymorphic classes the vptr A field is omitted; otherwise space for vptr A needs to be allocated within the class and the pointer needs to be initialised in each constructor for A. The precise layout of the virtual function table and the run-time type information is given below.

Two alternative ways of laying out the non-static data members within the class are implemented. The first, which is the default, gives them in the order in which they are declared in the class definition. The second lays out the public, the protected, and the private members in three distinct sections, the members within each section being given in the order in which they are declared. The second layout can be enabled using the -jo command-line option.

The offset of each member within the class (including vptr A) can be calculated in terms of the offset of the previous member. The first member has offset zero. The offset of any other member is given by the offset of the previous member plus the size of the previous member, rounded up to the alignment of the current member. The overall size of the class is given by the offset of the last member plus the size of the last member, rounded up using the token:

~comp_off : ( EXP OFFSET ) -> EXP OFFSET

which allows for any target dependent padding at the end of the class. The shape of the class is then a compound shape with this offset.
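The offset rule above can be sketched in C++ as follows. This is an illustrative model only: MemberInfo, Layout, round_up and lay_out are hypothetical names, sizes and alignments are assumed to be byte counts, and the final target dependent padding performed by ~comp_off is deliberately not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one member: its size and alignment in bytes.
struct MemberInfo {
    std::size_t size;
    std::size_t align;
};

// Result of the calculation: one offset per member, plus the unpadded end.
struct Layout {
    std::vector<std::size_t> offsets;  // offset of each member, in order
    std::size_t end;                   // last offset plus last size, before final padding
};

// Round 'offset' up to the next multiple of 'align' (align must be nonzero).
inline std::size_t round_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Each member is placed at the end of the previous member, rounded up to
// the alignment of the current member; the first member has offset zero.
inline Layout lay_out(const std::vector<MemberInfo>& members) {
    Layout result{{}, 0};
    for (const MemberInfo& m : members) {
        std::size_t off = round_up(result.end, m.align);
        result.offsets.push_back(off);
        result.end = off + m.size;
    }
    return result;
}
```

For example, members of size and alignment (1, 1), (4, 4) and (2, 2) are placed at offsets 0, 4 and 8, leaving an unpadded end offset of 10.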

Classes with no members need to be treated slightly differently. The shape of such a class is given by the token:

~cpp.empty.shape : () -> SHAPE

(recall that an empty class still has a nonzero size). The token:

~cpp.empty.offset : () -> EXP OFFSET

is used to represent the offset required for an empty class when it is used as a base class. This may be a zero offset.
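The two points about empty classes can be observed from ordinary C++. The class names Empty and Holder are hypothetical, and whether the empty base really occupies no space in Holder depends on the implementation, so the sketch only reports that result rather than asserting it.

```cpp
#include <cassert>

struct Empty {};

// An empty class used as a base class; its offset within Holder may be zero.
struct Holder : Empty {
    int value;
};

// A complete object of an empty class still has a nonzero size.
static_assert(sizeof(Empty) >= 1, "an empty class has nonzero size");

// Reports whether the empty base added nothing to the derived class size.
// This is implementation-specific, so it is returned rather than asserted.
inline bool empty_base_adds_no_size() {
    return sizeof(Holder) == sizeof(int);
}
```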

Bitfield members provide a slight complication to the picture above. The offset of a bitfield is additionally padded using the token:

~pad : ( EXP OFFSET, SHAPE, SHAPE ) -> EXP OFFSET

where the two shapes give the type underlying the bitfield and the bitfield itself.

The layout of unions is similar to that of classes except that all members have zero offset, and the size of the union is the maximum of the sizes of its members, suitably padded. Of course unions cannot be polymorphic and cannot have base classes.
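As a small check of the union rule, the following sketch uses a hypothetical union U and verifies the two properties stated above with standard C++ facilities.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical union used only for illustration.
union U {
    char   c;
    int    i;
    double d;
};

// All members of a union have zero offset.
static_assert(offsetof(U, c) == 0, "member c is at offset zero");
static_assert(offsetof(U, i) == 0, "member i is at offset zero");
static_assert(offsetof(U, d) == 0, "member d is at offset zero");

// The union is at least as large as its largest member (it may be padded).
static_assert(sizeof(U) >= sizeof(double), "union holds its largest member");
```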

Pointers to incomplete classes are represented by means of the alignment:

~cpp.empty.align : () -> ALIGNMENT

This token is also used for the alignment of a complete class if that class is never used in the generated TDF in a manner which requires it to be complete. This can lead to savings on the size of the generated code by preventing the need to define all the member offset tokens in order to find the shape of the class.

6.2. Derived class layout

The description of the implementation of derived classes will be given in terms of the example class hierarchy given by:

class A {
    // A's members
} ;

class B : public A {
    // B's members
} ;

class C : public A {
    // C's members
} ;

class D : public B, public C {
    // D's members
} ;

or, as a directed acyclic graph:

D -> B -> A
D -> C -> A
Figure 2. Class D

6.2.1. Single inheritance

The layout of class A is given by:

[Diagram; labels: A, vptr A, vtbl A, typeid A]
Figure 3. Class A

as above. Class B inherits all the members of class A plus those members explicitly declared within class B. In addition, class B inherits all the virtual member functions of A, some of which may be overridden in B, extended by any additional virtual functions declared in B. This may be represented as follows:

[Diagram; labels: A, B, vptr B, vptr B::A, vtbl B::A, vtbl B, typeid B]
Figure 4. Class B

where A denotes those members inherited from the base class and B denotes those members added in the derived class. Note that an object of class B contains a sub-object of class A. The fact that this sub-object is located at the start of B means that the base class conversion from B to A is trivial. Any base class with this property is called a primary base class.

Note that in theory two virtual function tables are required: the normal virtual function table for B, denoted by vtbl B, and a modified virtual function table for A, denoted by vtbl B::A, which takes into account any overriding virtual functions within B and points to B's run-time type information. The latter means that the dynamic type information for the A sub-object relates to B rather than A. However these two tables can usually be combined: if the virtual functions added in B are listed in the virtual function table after those inherited from A, and the form of the overriding is suitably well behaved (in the sense defined below), then vtbl B::A is an initial segment of vtbl B. In this case it is also possible to remove the vptr B field and use vptr B::A in its place (it has to be this way round to preserve the A sub-object). Thus the items shaded in the diagram, namely the vptr B field and the separate vtbl B::A table, can be removed.

The class C is similarly given by:

[Diagram; labels: A, C, vptr C, vptr C::A, vtbl C::A, vtbl C, typeid C]
Figure 5. Class C

6.2.2. Multiple inheritance

Class D is more complex because of the presence of multiple inheritance. D inherits all the members of B, including those which B inherits from A, plus all the members of C, including those which C inherits from A. It also inherits all of the virtual member functions from B and C, some of which may be overridden in D, extended by any additional virtual functions declared in D. This may be represented as follows:

[Diagram; labels: A, B, C, D, vptr D::B::A, vptr D::B, vptr D::C::A, vptr D::C, vptr D, vtbl D::B::A, vtbl D::B, vtbl D::C::A, vtbl D::C, vtbl D, typeid D, delta D::C]
Figure 6. Class D

Note that there are two copies of A in D because virtual inheritance has not been used.

The B base class of D is essentially similar to the single inheritance case already discussed; the C base class is different however. Note firstly that the C sub-object of D is located at a non-zero offset, delta D::C, from the start of the object. This means that the base class conversion from D to C consists of adding this offset (for pointer conversions things are further complicated by the need to allow for null pointers). Also vtbl D::C is not an initial segment of vtbl D, because vtbl D contains the virtual functions inherited from B first, followed by those inherited from C, followed by those first declared in D (there are other reasons as well). Thus vtbl D::C cannot be eliminated.
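The effect of the non-virtual hierarchy on base class conversions can be observed from ordinary C++. The sketch below reuses the class names from the example and adds hypothetical data members and helper functions; it relies only on behaviour guaranteed by the language, not on any particular layout.

```cpp
#include <cassert>
#include <cstddef>

struct A { int a = 0; };
struct B : A { int b = 0; };
struct C : A { int c = 0; };
struct D : B, C { int d = 0; };

// Reach the A sub-object through each of the two inheritance paths.
inline A* a_via_b(D* p) { return static_cast<B*>(p); }
inline A* a_via_c(D* p) { return static_cast<C*>(p); }

// The D to C pointer conversion; a null pointer must stay null, so the
// offset can only be added after a null check.
inline C* to_c(D* p) { return p; }

// Byte distance from the start of the object to its C sub-object
// (the quantity called delta D::C in the text).
inline std::ptrdiff_t delta_c(D& d) {
    return reinterpret_cast<char*>(to_c(&d)) - reinterpret_cast<char*>(&d);
}
```

Because inheritance is not virtual, a_via_b and a_via_c return different addresses for the same object, and to_c applied to a null pointer yields a null pointer rather than a pointer at the offset delta D::C.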

6.2.3. Virtual inheritance

Virtual inheritance introduces a further complication. Now consider the class hierarchy given by:

class A {
    // A's members
} ;

class B : virtual public A {
    // B's members
} ;

class C : virtual public A {
    // C's members
} ;

class D : public B, public C {
    // D's members
} ;

or, as a directed acyclic graph:

D -> B -> A
D -> C -> A    (B and C share a single A)
Figure 7. Class D

As before A is given by:

[Diagram; labels: A, vptr A, vtbl A, typeid A]
Figure 8. Class A

but now B is given by:

[Diagram; labels: ptr A, A, B, vptr B::A, vtbl B::A, vptr B, vtbl B, typeid B]
Figure 9. Class B

Rather than having the sub-object of class A directly as part of B, the class now contains a pointer, ptr A, to this sub-object. The virtual sub-objects are always located at the end of a class layout; their offset may therefore vary for different objects, although the offset of ptr A is always fixed. The ptr A field is initialised in each constructor for B. In order to perform the base class conversion from B to A, the contents of ptr A are taken (again provision needs to be made for null pointers in pointer conversions). In cases when the dynamic type of the B object can be determined statically it is possible to access the A sub-object directly by adding a suitable offset. Because this conversion is non-trivial (see below) the virtual function table vtbl B::A is not an initial segment of vtbl B and cannot be eliminated.

The class C is similarly given by:

[Diagram; labels: ptr A, A, C, vptr C::A, vtbl C::A, vptr C, vtbl C, typeid C]
Figure 10. Class C

Now the class D is given by:

[Diagram; labels: ptr A, A, B, C, D, vptr D::A, vptr D::B, vptr C::A, vtbl D::A, vtbl D::B, vtbl D::C, vptr D::C, vptr D, vtbl C, vtbl D, typeid D]
Figure 11. Class D

Note that there is a single A sub-object of D referenced by the ptr A fields in both the B and C sub-objects. The elimination of vtbl D::B is as above.
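The sharing of the single A sub-object can be checked from ordinary C++. As before, the data members and helper functions are hypothetical additions to the example hierarchy, and the sketch depends only on language guarantees rather than on the ptr A representation described here.

```cpp
#include <cassert>

struct A { int a = 0; };
struct B : virtual A { int b = 0; };
struct C : virtual A { int c = 0; };
struct D : B, C { int d = 0; };

// Reach the A sub-object through each of the two inheritance paths.
// With virtual inheritance both conversions find the same sub-object.
inline A* a_via_b(D* p) { return static_cast<B*>(p); }
inline A* a_via_c(D* p) { return static_cast<C*>(p); }
```

A write made through one path is therefore visible through the other, in contrast to the non-virtual hierarchy where D contains two copies of A.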

6.3. Constructors and destructors

The implementation of constructors and destructors, whether explicitly or implicitly defined, is slightly more complex than that of other member functions. For example, the constructors need to set up the internal vptr and ptr fields mentioned above.

The order of initialisation in a constructor is as follows:

  • The internal ptr fields giving the locations of the virtual base classes are initialised.

  • The constructors for the virtual base classes are called.

  • The constructors for the non-virtual direct base classes are called.

  • The internal vptr fields giving the locations of the virtual function tables are initialised.

  • The constructors for the members of the class are called.

  • The main constructor body is executed.

To ensure that each virtual base is only initialised once, if a class has a virtual base class then all its constructors have an implicit extra parameter of type int. The first two steps above are then only applied if this flag is nonzero. In normal applications of the constructor this argument will be 1; however, in base class initialisations such as those in the second and third steps above, it will be 0.
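The visible consequences of this ordering can be traced with a small C++ program. The log function and the Member class are hypothetical; the internal ptr and vptr initialisation steps and the extra int parameter are not observable from source code, so only the constructor calls are recorded.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Log of construction events, for illustration only.
inline std::vector<std::string>& ctor_log() {
    static std::vector<std::string> log;
    return log;
}

struct Member { Member() { ctor_log().push_back("member"); } };

struct A { A() { ctor_log().push_back("A"); } };
struct B : virtual A { B() { ctor_log().push_back("B"); } };
struct C : virtual A { C() { ctor_log().push_back("C"); } };

struct D : B, C {
    Member m;
    D() { ctor_log().push_back("D body"); }
};
```

Constructing a D records A, B, C, member, D body. The virtual base A appears once only, which corresponds to the B and C constructors being invoked with the flag set to 0.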

Note that similar steps to protect virtual base classes are not taken in an implicitly declared operator= function. The order of assignment in this case is as follows:

  • The assignment operators for the direct base classes (both virtual and non-virtual) are called.

  • The assignment operators for the members of the class are called.

  • A reference to the object assigned to (i.e. *this) is returned.

The order of destruction in a destructor is essentially the reverse of the order of construction:

  • The main destructor body is executed.

  • The destructors for the members of the class are called.

  • The internal vptr fields giving the locations of the virtual function tables are re-initialised.

  • The destructors for the non-virtual direct base classes are called.

  • The destructors for the virtual base classes are called.

  • If necessary the space occupied by the object is deallocated.

All destructors have an extra parameter of type int. The virtual base classes are only destroyed if this flag is nonzero when and-ed with 2. The space occupied by the object is only deallocated if this flag is nonzero when and-ed with 1. This deallocation is equivalent to inserting:

delete this ;

in the destructor. The operator delete function is called via the destructor in this way in order to implement the pseudo-virtual nature of these deallocation functions. Thus for normal destructor calls the extra argument is 2, for base class destructor calls it is 0, and for calls arising from a delete expression it is 3.
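The interpretation of the extra destructor argument can be summarised by the following sketch. The names DtorFlag, DtorActions and decode_dtor_flag are hypothetical and exist only to illustrate the two bit tests described above.

```cpp
#include <cassert>

// Hypothetical names for the two bits of the extra destructor argument.
enum DtorFlag : int {
    kDeallocate      = 1,  // deallocate the space occupied by the object
    kDestroyVirtuals = 2   // destroy the virtual base classes
};

// The optional steps selected by a given flag value.
struct DtorActions {
    bool destroy_virtual_bases;
    bool deallocate;
};

inline DtorActions decode_dtor_flag(int flag) {
    return DtorActions{(flag & kDestroyVirtuals) != 0,
                       (flag & kDeallocate) != 0};
}
```

With this decoding, a normal destructor call (2) destroys the virtual bases without deallocating, a base class destructor call (0) does neither, and a call arising from a delete expression (3) does both.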

The point at which the virtual function tables are initialised in the constructor, and the fact that they are re-initialised in the destructor, is to ensure that virtual functions called from base class initialisers are handled correctly (see ISO C++ 12.7).
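The behaviour this ordering is designed to produce can be seen in a short example. Base, Derived and the seen_in_ctor member are hypothetical names; the sketch shows only the language-level effect, not the vptr stores themselves.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seen_in_ctor;
    // While Base is being constructed the object's vptr refers to the
    // table for Base, so this call reaches Base::name.
    Base() { seen_in_ctor = name(); }
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};
```

For a Derived object, seen_in_ctor holds "Base", while a call to name() after construction has completed returns "Derived".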

A further complication arises from the need to destroy partially constructed objects if an exception is thrown in a constructor. A count is maintained of the number of base classes and members constructed within a constructor. If an exception is thrown then it is caught in the constructor, the constructed base classes and members are destroyed, and the exception is re-thrown. The count variable is used to determine which bases and members need to be destroyed.
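The required behaviour can be illustrated as follows, with hypothetical classes whose constructors and destructors record themselves in a log. The count variable itself is internal to the generated code and does not appear; the sketch shows only which destructors must run.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Log of construction and destruction events, for illustration only.
inline std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

struct Base {
    Base()  { event_log().push_back("Base()"); }
    ~Base() { event_log().push_back("~Base()"); }
};

struct Good {
    Good()  { event_log().push_back("Good()"); }
    ~Good() { event_log().push_back("~Good()"); }
};

struct Bad {
    Bad()  { throw std::runtime_error("Bad failed"); }
    ~Bad() { event_log().push_back("~Bad()"); }
};

struct Widget : Base {
    Good first;
    Bad  second;   // throws, so Widget is only partially constructed
    Widget()  { event_log().push_back("Widget body"); }
    ~Widget() { event_log().push_back("~Widget()"); }
};

// Returns false if construction failed with the expected exception.
inline bool try_build() {
    try { Widget w; }
    catch (const std::runtime_error&) { return false; }
    return true;
}
```

After try_build returns, the log holds Base(), Good(), ~Good(), ~Base(): only the base and member that were fully constructed are destroyed, in reverse order, and neither the Widget body nor ~Widget() runs.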

These partial destructors currently do not interact correctly with any exception specification on the constructor. Exceptions thrown within destructors are not correctly handled either.

6.4. Virtual function tables

The virtual functions in a polymorphic class are given in its virtual function table in the following order: firstly those virtual functions inherited from its direct base classes (which may be overridden in the derived class) followed by those first declared in the derived class in the order in which they are declared. Note that this can result in virtual functions inherited from virtual base classes appearing more than once. The virtual functions are numbered from 1 (this is slightly more convenient than numbering from 0 in the default implementation).

The virtual function table for this class has shape:

~cpp.vtab.type : ( NAT ) -> SHAPE

the argument being n + 1 where n is the number of virtual functions in the class (there is also a token:

~cpp.vtab.diag : () -> SHAPE

which is used in the diagnostic output for a generic virtual function table). The table is created using the token:

~cpp.vtab.make : ( EXP pti, EXP OFFSET, NAT, EXP NOF ) -> EXP vt

where the first expression gives the address of the run-time type information structure for the class, the second expression gives the offset of the vptr field within the class (i.e. voff), the integer constant is n + 1, and the final expression is a make_nof construct giving information on each of the n virtual functions.

The information given on each virtual function in this table has the form of a pointer to function member formed using the token:

~cpp.pmf.make : ( EXP PROC, EXP OFFSET, EXP OFFSET ) -> EXP pmf

as above, except that the third argument gives the offset of the base class in virtual function tables such as vtbl B::A. For pure virtual functions the function pointer in this token is given by:

~cpp.vtab.pure : () -> EXP PROC

In the default implementation this gives a function __TCPPLUS_pure which just calls abort.

To avoid duplicate copies of virtual function tables and run-time type information structures being created, the ARM algorithm is used. The virtual function table and run-time type information structure for a class are defined in the module containing the definition of the first non-inline, non-pure virtual function declared in that class. If such a function does not exist then duplicate copies are created in every module which requires them. In the former case the virtual function table will have an external tag name; in the latter case it will be an internal tag. This scheme can be overridden using the -jv command-line option, which causes local virtual function tables to be output for all classes.

Note that the discussion above applies to both simple virtual function tables, such as vtbl B above, and to those arising from base classes, such as vtbl B::A. We are now in a position to precisely determine when vtbl B::A is an initial segment of vtbl B and hence can be eliminated. Firstly, A must be the first direct base class of B and cannot be virtual. This is to ensure both that there are no virtual functions in vtbl B before those inherited from A, and that the corresponding base class conversion is trivial so that the pointers to function members of B comprising the virtual function table can be equally regarded as pointers to function members of A. The second requirement is that if a virtual function for A, f, is overridden in B then the return type for B::f cannot differ from the return type for A::f by a non-trivial conversion (recall that ISO C++ allows the return types to differ by a base class conversion). In the non-trivial conversion case the function entered in vtbl B::A needs to be, not B::f as in vtbl B, but a stub function which calls B::f and converts its return value to the return type of A::f.
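The second requirement can be illustrated with a covariant return type whose conversion is a base class conversion to a second base. The classes R1, R2 and R and their members are hypothetical additions to the A and B of the text; the stub function itself is generated by the implementation and is not visible in the source.

```cpp
#include <cassert>

struct R1 { int pad = 1; virtual ~R1() {} };
struct R2 { int tag = 2; virtual ~R2() {} };

// R2 is the second base of R, so on most implementations the conversion
// from R* to R2* has to adjust the pointer.
struct R : R1, R2 {};

struct A {
    virtual ~A() {}
    virtual R2* get() { return nullptr; }
};

struct B : A {
    R r;
    // Covariant override: the return type differs from that of A::get
    // by the base class conversion from R* to R2*.
    R* get() override { return &r; }
};
```

A call to get() through an A pointer must yield a correctly converted R2 pointer, so in the scheme described above the entry in vtbl B::A has to be a stub that calls B::get and converts the result, rather than B::get itself.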

The virtual function call mechanism is implemented using the token:

~cpp.vtab.func : ( EXP ppvt, SIGNED_NAT ) -> EXP ppmf

which has as its arguments a reference to the vptr field of the object the function is to be called for, and the number of the virtual function to be called. It returns a reference to the corresponding pointer to function member within the object's virtual function table. The function is then called by extracting the base class offset to be added, and the function to be called, from this reference using the tokens:

~cpp.pmf.delta : ( ALIGNMENT a, EXP ppmf ) -> EXP OFFSET ( a, a )
~cpp.pmf.func : ( EXP ppmf ) -> EXP PROC

described as part of the pointer to function member call mechanism above.
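The call sequence can be modelled in plain C++ as below. This is a much simplified sketch and not the token definitions: Pmf, VTable, Object and the helper functions are hypothetical, the offsets are treated as byte counts, and the table is assumed to reserve entry 0 so that virtual functions are numbered from 1.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for a pointer to function member: the function to
// call plus the base class offset to add to the object address.
struct Pmf {
    int (*func)(void* self);
    std::ptrdiff_t delta;
};

// Simplified virtual function table.
struct VTable {
    const void*    type_info;  // placeholder for the run-time type information
    std::ptrdiff_t voff;       // offset of the vptr field within the class
    const Pmf*     entries;    // entries[1..n]; entries[0] is unused here
};

struct Object {
    const VTable* vptr;
    int value;
};

// Analogue of ~cpp.vtab.func: from a reference to the vptr field and a
// function number, find the corresponding table entry.
inline const Pmf* vtab_func(const VTable* const* ppvt, int n) {
    return &(*ppvt)->entries[n];
}

// Extract the offset and the function from the entry, then make the call.
inline int call_virtual(Object* obj, int n) {
    const Pmf* ppmf = vtab_func(&obj->vptr, n);
    void* self = reinterpret_cast<char*>(obj) + ppmf->delta;
    return ppmf->func(self);
}

inline int get_value(void* self)  { return static_cast<Object*>(self)->value; }
inline int get_double(void* self) { return 2 * static_cast<Object*>(self)->value; }

// A table with two virtual functions, numbered 1 and 2.
inline const VTable& object_vtable() {
    static const Pmf entries[] = {{nullptr, 0}, {get_value, 0}, {get_double, 0}};
    static const VTable table = {nullptr, 0, entries};
    return table;
}
```

For an Object whose vptr refers to object_vtable() and whose value is 21, call_virtual with function number 1 returns 21 and with function number 2 returns 42.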