Dynamic_cast and typeid as (non) RTTI tools.

You very likely already know the dynamic_cast and typeid tools from the C++ language. They let you get information about the types of objects at run time. And yes… run time. It’s slow, right? So we would rather not use those tools at all. In this article, we will take a more detailed look at the behavior of dynamic_cast and typeid, and we will also show the cases in which they do not use RTTI.

First, let’s start with why we need those tools at all. Without RTTI we could not use exceptions: without information about the exception’s type, we wouldn’t be able to match the exception to the corresponding catch clause.
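As a small illustration of that matching, here is a sketch in which the handler is chosen by the type of the thrown object. The function name which_handler is made up for this example; the exception classes come from the standard <stdexcept> header.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: throws std::out_of_range and reports which handler caught it.
std::string which_handler() {
    std::string chosen = "none";
    try {
        throw std::out_of_range("index 7");
    } catch (const std::invalid_argument&) {
        chosen = "invalid_argument";   // unrelated sibling type: skipped
    } catch (const std::logic_error&) {
        chosen = "logic_error";        // base class of std::out_of_range: matches
    } catch (const std::exception&) {
        chosen = "exception";          // would also match, but is listed later
    }
    return chosen;
}
```

The handlers are tried in order, and the first one whose type matches the thrown object (or one of its base classes) wins.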

Another use case for RTTI is the implementation of types like std::any. Inside the any, we need to store information about the contained type in order to check later whether a cast to a concrete type is safe.
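A minimal sketch of how that check looks from the user’s side, assuming a C++17 compiler. The helper names int_or_minus_one and holds_string are invented for this example; std::any_cast and std::any::type are the standard interface.

```cpp
#include <any>
#include <string>
#include <typeinfo>

// Hypothetical helper: returns the stored int, or -1 if the any holds another type.
int int_or_minus_one(const std::any& value) {
    // The pointer overload of any_cast returns nullptr on a type mismatch.
    if (const int* p = std::any_cast<int>(&value)) {
        return *p;
    }
    return -1;
}

// Hypothetical helper: compares the stored type with std::string.
bool holds_string(const std::any& value) {
    return value.type() == typeid(std::string);
}
```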

Hopefully I have convinced you that RTTI can be useful in some cases. Now let’s have a look at its first incarnation, which is dynamic_cast.

The dynamic_cast

The first thing you need to know about RTTI is that it doesn’t work for non-polymorphic types. If you wonder what polymorphic types are, here is the explanation: a type is polymorphic when it either declares at least one virtual function or inherits from another polymorphic type.
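The definition can be checked at compile time with the standard std::is_polymorphic trait. The three struct names below are placeholders chosen for this sketch.

```cpp
#include <type_traits>

struct Plain {};                                         // no virtual functions
struct WithVirtual { virtual ~WithVirtual() = default; }; // declares a virtual function
struct Inherits : WithVirtual {};                         // inherits a virtual function

static_assert(!std::is_polymorphic<Plain>::value, "Plain is not polymorphic");
static_assert(std::is_polymorphic<WithVirtual>::value, "declares a virtual function");
static_assert(std::is_polymorphic<Inherits>::value, "inherits a virtual function");
```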

And it turns out that dynamic_cast can work for non-polymorphic types. How is that even possible? The answer is simple: not every case that dynamic_cast handles needs RTTI. We will prove that later in the article.

struct B {};
struct D : B {};

int main(){
    B* ptr = new D;
    dynamic_cast<D*>(ptr);
}

What do we want to achieve in the snippet above? We want to cast a pointer to the base class to a pointer to the derived class. If you are worried about the memory leak caused by not freeing the allocated memory, then stop: the example will not even compile. Here’s why.

The snippet requires the RTTI mechanism to keep the program safe. After all, you could try to cast to a different derived type, one that the pointed-to object is not an instance of. dynamic_cast handles this case by returning nullptr or throwing the std::bad_cast exception (depending on whether you cast a pointer or a reference). Since type B has no virtual function, it is a non-polymorphic type and RTTI cannot be used. This causes a compilation error: dynamic_cast wants to use RTTI, but it can’t.
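For comparison, here is a sketch of the same kind of downcast on a polymorphic hierarchy, where it compiles and shows both failure modes. Shape, Circle, Square and the two helper functions are names assumed for this example only.

```cpp
#include <typeinfo>   // std::bad_cast

struct Shape { virtual ~Shape() = default; };
struct Circle : Shape {};
struct Square : Shape {};

// Pointer form: a failed cast yields nullptr.
bool is_circle(Shape* s) {
    return dynamic_cast<Circle*>(s) != nullptr;
}

// Reference form: a failed cast throws std::bad_cast.
bool is_circle_ref(Shape& s) {
    try {
        (void)dynamic_cast<Circle&>(s);
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```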

Now let’s try the cast in the other direction, from the derived to the base class. The snippet below shows that situation.

struct B {};
struct D : B {};

int main(){
    D* ptr = new D;
    dynamic_cast<B*>(ptr);
    delete ptr;
}

This works correctly. An implicit conversion would be fine here too, but dynamic_cast won’t complain either. Why is that? The compiler knows at compile time that the cast is safe, which means dynamic_cast does not need RTTI.

In general, we can say that dynamic_cast is a tool for moving around the inheritance tree, up and down. Whether dynamic_cast uses RTTI depends only on whether the particular case needs it.

Using dynamic_cast can also make our intentions clearer. Whenever we say dynamic_cast, the reader knows we intend to cast to a base or derived class. Whenever we say static_cast, on the other hand, we mainly mean arithmetic conversions, converting constructor calls or user-defined conversion operators.

The dynamic_cast you (probably) didn’t know

C++, as I already mentioned in another post, is a very flexible tool. dynamic_cast, being part of C++, was also created in the spirit of maximum flexibility. This section is dedicated to the esoteric use cases of dynamic_cast.

Dynamic_cast<void*>

We said that dynamic_cast is meant for moving around the inheritance tree. One such move is the move to the most derived object. In C++ we can perform that move even without knowing the most derived object’s type, and this is where dynamic_cast<void*> can be used. Let’s have a look at an example:

#include <cassert>

struct B {int a; int b; virtual ~B()=default;};
struct C {int a; int b; virtual ~C()=default;};
struct D : C, B {int c; int d;};

int main(){
    D* ptrd = new D;
    B* ptrb = ptrd;
    C* ptrc = ptrd;

    assert(dynamic_cast<void*>(ptrb) == ptrd);
    assert(dynamic_cast<void*>(ptrc) == ptrd);
    delete ptrd;
}

In this case ptrb will typically have a different value than ptrd, since B is a subobject of D that does not sit at the beginning of the object (so its address is offset from the start of D).

This, of course, needs type information at run time, since we might not know the exact type of the most derived object. For this reason the structures B and C have virtual destructors (RTTI must be possible for those types).
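One practical use of this cast is an identity check: two pointers to different base subobjects belong to the same complete object exactly when their most derived addresses are equal. The sketch below assumes the names Left, Right, Both and same_complete_object, none of which come from the article.

```cpp
struct Left  { int l = 0; virtual ~Left() = default; };
struct Right { int r = 0; virtual ~Right() = default; };
struct Both : Left, Right {};

// Hypothetical helper: true when both pointers refer to subobjects
// of the same most derived object.
bool same_complete_object(const Left* a, const Right* b) {
    return dynamic_cast<const void*>(a) == dynamic_cast<const void*>(b);
}
```

Comparing a Left* with a Right* directly would not even compile, since the two types are unrelated. Going through the most derived address sidesteps that.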

Ambiguous casts

Did you ever wonder what happens if an object of some type (let’s say D) inherits twice from the type B and we would like to cast it to that type?

Let’s have a look at an example of such a cast:

struct B {int a; int b; virtual ~B()=default;};
struct C : B {int a; int b; virtual ~C()=default;};
struct D : C, B {int c; int d;};

Now, if we try to cast a pointer to D to a pointer to B, we will get the following (or a similar) error message:

error: ambiguous conversion from derived class 'D' to base class 'B':
    struct D -> struct C -> struct B
    struct D -> struct B
    B* ptrb = dynamic_cast<B*>(ptrd);

The compiler doesn’t know which B subobject we mean, so it refuses to cast. But you might wonder what happens if the ambiguity can only be detected at run time.

Let’s have a look at the following inheritance tree:

#include <cassert>

struct A{virtual ~A()=default;};
struct B{virtual ~B()=default;};
struct C : A, B{};
struct D : C, B{};

An object of type D has two subobjects of class B. Now we can try to dynamic_cast from A to B like so:

int main(){
    D* d = new D;
    A* a = dynamic_cast<A*>(d); // one subobject A
    B* b= dynamic_cast<B*>(a); // two possible B subobjects
    assert(b == nullptr);
    delete d;
}

In this case the result of the dynamic_cast is nullptr. The compiler cannot tell at compile time that the most derived object contains more than one B subobject, so the ambiguity is detected at run time and the cast fails.
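If we really need one of the B subobjects, one possible way out is to cast to an unambiguous intermediate class first. The sketch below is a variation on the hierarchy above: the second B is reached through a hypothetical helper class Extra instead of being a direct base of D, which avoids the compiler warning about an inaccessible direct base. The function names are also made up.

```cpp
struct A { virtual ~A() = default; };
struct B { virtual ~B() = default; };
struct C : A, B {};
struct Extra : B {};        // hypothetical: second path to a B subobject
struct D : C, Extra {};     // a D object contains two B subobjects

// Direct cross-cast: ambiguous at run time, so the result is nullptr.
B* direct(A* a) { return dynamic_cast<B*>(a); }

// Two steps: pick the unique C first, then convert to its B base.
B* through_c(A* a) {
    C* c = dynamic_cast<C*>(a);   // a D object contains exactly one C
    return c;                     // implicit upcast: C has exactly one B
}
```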

The typeid

Hopefully I have surprised you with some of dynamic_cast’s properties. Now it’s time for typeid.

typeid is a tool used to obtain information about a type, based either on an expression or on the type itself. The result of a typeid expression is a std::type_info object, defined in the <typeinfo> header.

Typeid != RTTI

Just like dynamic_cast, typeid does not always need the RTTI mechanism to work correctly. If the argument of the typeid expression is of a non-polymorphic type, then no run-time check is performed. Instead, the type information is determined at compile time.

As a consequence, it is easy to overlook whether we apply typeid to a polymorphic type or not, which sometimes gives undesirable effects like the one illustrated below:

#include <typeinfo>
#include <iostream>

struct A{};
struct B : A{};

std::type_info const& give_me_type(A&& a){
  return typeid(a);
}

int main(){
  auto& info = give_me_type(B{});
  std::cout << info.name() << std::endl;
}

In this case, the typeid expression inside the give_me_type function will always yield the type_info for type A, which is probably not what the user expected. In my case, the program displays the string “1A”. The exact string is implementation-defined, so it might differ between compilers.
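To make the difference visible, the following sketch puts a non-polymorphic and a polymorphic hierarchy side by side. All names in it (PlainBase, VirtualBase, type_of and so on) are assumptions made for the illustration.

```cpp
#include <typeinfo>

struct PlainBase {};
struct PlainDerived : PlainBase {};

struct VirtualBase { virtual ~VirtualBase() = default; };
struct VirtualDerived : VirtualBase {};

// PlainBase is not polymorphic: only the static type is reported.
const std::type_info& type_of(const PlainBase& x) { return typeid(x); }

// VirtualBase is polymorphic: the dynamic type is looked up at run time.
const std::type_info& type_of(const VirtualBase& x) { return typeid(x); }
```

Passing a PlainDerived yields the type_info of PlainBase, while passing a VirtualDerived yields the type_info of VirtualDerived.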

An even more interesting fact is that for non-polymorphic types the expression is not evaluated, because the only thing the compiler needs is the type of the expression. Any side effects of the expression will not happen (strings will not be displayed, files won’t be opened, etc.). A special case of such an expression is dereferencing a null pointer, which is safe here because the dereference is never executed:

int main(){
  auto& info = typeid(*((B*)nullptr));
  std::cout << info.name() << std::endl;
}

In this case, the compiler doesn’t mind that we are dereferencing a null pointer, as the dereference will not be executed on the machine anyway.
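The missing side effects can be observed with a simple counter. In the sketch below, calls, bump, bump_poly and the two probe functions are hypothetical names; the polymorphic variant is included only as a contrast, since there the operand does get evaluated.

```cpp
#include <typeinfo>

int calls = 0;
int bump() { return ++calls; }                 // observable side effect

struct Poly { virtual ~Poly() = default; };
Poly shared;
Poly& bump_poly() { ++calls; return shared; }  // same side effect, polymorphic result

// The operand has type int (not polymorphic): bump() is never called.
void probe_unevaluated() { (void)typeid(bump()); }

// The operand is a glvalue of polymorphic type: bump_poly() is called.
void probe_evaluated() { (void)typeid(bump_poly()); }
```

Some compilers warn about side effects inside a typeid operand, which is a helpful hint that one of these two situations applies.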

Yet another interesting example of unevaluated expressions are lambda expressions. Lambda expressions used to be disallowed in unevaluated operands. This is why the following code will not compile before C++20:

typeid([]{});

But why only before C++20? Since C++20, lambda expressions are allowed in unevaluated contexts. With GCC 10 the result of typeid([]{}).name() was Z4mainEUlvE_ when I wrote the lambda expression inside the main function.

Typeid(type-id)

The argument of the typeid expression does not need to be an expression. In fact, it can be just a type. The behavior is then the same as if the argument were a non-polymorphic expression of that type. We could even say that for non-polymorphic expressions a typeid of the form:

typeid(<expression>)

gets translated into

typeid(decltype(<expression>))

My personal recommendation is to be explicit about non-polymorphic types, so that we won’t be surprised that some of the expressions aren’t evaluated.
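The equivalence can be checked directly. In this sketch Plain and same_as_decltype are names assumed for the example. Note that decltype(ref) is const Plain&, and typeid drops both the reference and the top-level const.

```cpp
#include <typeinfo>

struct Plain {};

// Hypothetical check: typeid of a non-polymorphic expression agrees
// with typeid of its declared type.
bool same_as_decltype() {
    Plain p;
    const Plain& ref = p;
    return typeid(ref) == typeid(decltype(ref))
        && typeid(ref) == typeid(Plain);
}
```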

Typeid == RTTI

So now it’s time for polymorphic types!

In the case of polymorphic expressions, the expression needs to be evaluated. After all, the returned object depends on the run-time type.

If that’s the case, then dereferencing a null pointer must now be undefined behavior, mustn’t it? That assumption is wrong. Let’s have a look at the following code:

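#include <typeinfo>
#include <iostream>

struct B {virtual ~B()=default;}; // B must be polymorphic here

int main(){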
  try{
    typeid(*((B*)nullptr));    
  } catch(std::bad_typeid& e) {
    std::cout << e.what() << std::endl;  
  }
}

Why do I wrap the typeid in a try-catch block? Because the result of this typeid expression is actually not undefined behavior. Instead, an exception of type std::bad_typeid is thrown.
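This guarantee makes it possible to write a small wrapper that never throws. The sketch below assumes the names Poly and dynamic_type_or_null. Returning a pointer to the type_info is fine, because type_info objects live until the end of the program.

```cpp
#include <typeinfo>

struct Poly { virtual ~Poly() = default; };

// Hypothetical helper: returns the type_info for *p,
// or nullptr when typeid throws std::bad_typeid.
const std::type_info* dynamic_type_or_null(Poly* p) {
    try {
        return &typeid(*p);
    } catch (const std::bad_typeid&) {
        return nullptr;
    }
}
```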

This is a corner case of typeid’s behavior, and it applies only when the dereference is the outermost expression of the operand. If you dereference a null pointer in a subexpression, your program still has undefined behavior. Let’s see the following code:

struct polymorph{virtual ~polymorph()=default;};
struct test{polymorph n;};

polymorph& getn(test& t){return t.n;}

int main()
{
   try{
    typeid(((test*)nullptr)->n);    
  } catch(std::bad_typeid& e) {
    std::cout << e.what() << std::endl;  
  }

}

You won’t see any exception of the std::bad_typeid type here, because the pointer dereference is not the outermost expression (the member access is). The behavior of the above program is undefined.

Typeid’s result

We have been talking about typeid’s corner cases but haven’t actually mentioned the basic thing: what does typeid return?

The type of the typeid expression is const std::type_info&. There is something quite special about this object: it is guaranteed to live until the end of the program. What’s more, the actual dynamic type of the object returned by typeid might differ from std::type_info, as it can be any type derived from std::type_info.

This means developers do not need to care about the ownership and lifetime of the result of typeid.

The interface of the std::type_info

The interface is as follows:

  • equality operators (bool operator==(const type_info& rhs) const noexcept;). They determine whether two std::type_info instances describe the same type.
  • before function (bool before(const type_info& rhs) const noexcept;). It allows ordering the types. The order is guaranteed to be stable only within one program run: for two types A and B, before can report the order A -> B in one run and B -> A in the next.
  • hash_code function (size_t hash_code() const noexcept;). It returns a hash value that is the same for all type_info objects describing the same type, so we can use it in hashing containers. The value is not guaranteed to be unique across types and, just like the result of before, it might differ between program runs.
  • name function (const char* name() const noexcept;). It returns an implementation-defined string representation of the type.
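Since type_info itself cannot be copied, the standard library offers std::type_index (from <typeindex>, not discussed in this article) as a copyable wrapper built on hash_code and before. Here is a sketch of a lookup table keyed by dynamic type. The Animal hierarchy and sound_of are invented for the example.

```cpp
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

struct Animal { virtual ~Animal() = default; };
struct Dog : Animal {};
struct Cat : Animal {};

// Hypothetical lookup keyed by the dynamic type of the argument.
std::string sound_of(const Animal& a) {
    static const std::unordered_map<std::type_index, std::string> sounds = {
        {std::type_index(typeid(Dog)), "woof"},
        {std::type_index(typeid(Cat)), "meow"},
    };
    auto it = sounds.find(std::type_index(typeid(a)));
    return it != sounds.end() ? it->second : "unknown";
}
```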

What about constness?

By this time you might have started to wonder: “should two type_info objects that describe the same type, differing only by a const/volatile qualifier, compare equal?”. The answer is that qualifiers are not taken into account. Well, at least partially.

If the qualifier is applied at the top level of the type, then it’s not taken into consideration, so the following expression yields true:

typeid(int) == typeid(const int);

That was an easy example. Let’s have a look at pointers:

typeid(int*const) == typeid(int*);

Here the const refers to the pointer itself, so the typeids compare equal. However, when const refers to the type the pointer points to, the typeids do not compare equal:

typeid(int const*) == typeid(int*); // returns false
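The rules above, plus the fact that references are ignored as well, can be collected into a few self-checking functions. The function names are made up for this sketch.

```cpp
#include <typeinfo>

// Top-level cv-qualifiers are dropped.
bool top_level_const_ignored()   { return typeid(const int) == typeid(int); }
bool const_pointer_ignored()     { return typeid(int* const) == typeid(int*); }

// A qualifier on the pointee is part of the type.
bool pointee_const_significant() { return typeid(const int*) != typeid(int*); }

// A reference is replaced by the referred-to type before cv is dropped.
bool reference_ignored()         { return typeid(const int&) == typeid(int); }
```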

Value categories conversions

The interesting thing about typeid expressions is that their arguments are not converted according to the standard conversion rules. Specifically, you shouldn’t expect lvalue-to-rvalue, array-to-pointer or function-to-pointer conversions. In practice this means two things.

First, we can get different results than we expect, because the usual conversions do not apply.

Second, we get precise information about the type we give as the argument. For example, we are able to distinguish a function type from a pointer-to-function type. Let’s have a look:

// assuming the declarations: void foo(); void bar(int, int);
std::cout << typeid(foo).name() << std::endl;
std::cout << typeid(void(*)()).name() << std::endl;
std::cout << typeid(bar).name() << std::endl;
std::cout << typeid(void(*)(int, int)).name() << std::endl;

The exact printed strings might differ depending on the compiler, but we should be able to see the difference. In my case, the output was the following:

FvvE  # Function type returning Void and taking Void argument
PFvvE # Pointer to Function returning Void and taking Void argument
FviiE # Function type returning Void and taking two Integers as arguments
PFviiE # Pointer to Function returning Void and taking two Integers as arguments
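Because the mangled names are compiler-specific, comparing type_info objects is a more portable way to see the same effect. The sketch below also covers the array case. The names foo, numbers and the three check functions are placeholders.

```cpp
#include <typeinfo>

void foo() {}
int numbers[3] = {1, 2, 3};

// A function name keeps its function type inside typeid.
bool function_not_decayed() { return typeid(foo) != typeid(void (*)()); }

// Taking the address explicitly gives the pointer type.
bool address_is_pointer()   { return typeid(&foo) == typeid(void (*)()); }

// An array keeps its array type, including the bound.
bool array_not_decayed() {
    return typeid(numbers) == typeid(int[3]) && typeid(numbers) != typeid(int*);
}
```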

RTTI limits

Even the most popular tools in C++ have their dark corners, and the RTTI tools are no different. Since undefined behavior is so common in C++, it shouldn’t be surprising that you can cause undefined behavior with those tools as well.

The issue with dynamic_cast and typeid is that they actually use the value they are given as the argument (if the argument is of a polymorphic type). By using I mean dereferencing. So one way to shoot yourself in the foot is to pass to dynamic_cast or typeid an object under construction or destruction, whose dynamic type hasn’t been fully established yet or was already destroyed.

Let’s have a look at the example taken straight from the standard:

struct V { virtual void f();};
struct A : virtual V {};
struct B : virtual V { B(V*, A*);};

struct D : A, B {
  D() : B((A*)this, this) { }
};

B::B(V* v, A* a) {
  //B ctor - the D object is not yet fully constructed

  typeid(*this);                // correct: type_info for B
  typeid(*v);                   // correct: *v has type V, which is a base of B; yields type_info for B
  typeid(*a);                   // undefined behavior: A is not a base of B
  dynamic_cast<B*>(v);          // correct: v has type V*, and V is a base of B; yields a B*
  dynamic_cast<B*>(a);          // undefined behavior: a has type A*, and A is not a base of B
}

What’s happening in the snippet is that the constructor B::B takes two pointers, to the V and A types. B inherits from V, while A is not a base of B.

The D class inherits from A and B, and it passes its this pointer as both arguments of the B constructor. While the B constructor runs, the object is treated as a B, so typeid and dynamic_cast are well defined only for operands whose static type is B or one of its bases. V is a base of B, so *v is fine. A is not a base of B, and this is why dynamic_cast and typeid applied to a cause undefined behavior in the example above.
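The well-defined half of this rule is easy to observe: inside a constructor, typeid(*this) reports the class whose constructor is running, not the class of the object that will eventually exist. The names Base, Derived and seen_in_ctor in this sketch are assumptions.

```cpp
#include <typeinfo>

struct Base {
    const std::type_info* seen_in_ctor = nullptr;
    Base() {
        // Well-defined: the operand's static type is the constructor's own class.
        seen_in_ctor = &typeid(*this);
    }
    virtual ~Base() = default;
};

struct Derived : Base {};
```

After a Derived object is fully constructed, typeid applied through a Base reference reports Derived, while seen_in_ctor still points to the type_info of Base.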

Typeid as faster dynamic_cast

Have you heard about using typeid as a faster dynamic_cast? No? Then I will explain. Yes? Then I will tell you why you should be careful with it. Just hold on a second.

First, how can we use typeid to perform a safe and fast dynamic_cast? Let’s look at the definition below:

#include <typeinfo>

template <typename Dest, typename Src>
Dest* fast_dynamic_cast(Src* src){
  if(typeid(*src) == typeid(Dest)){
    return static_cast<Dest*>(src);
  }
  throw std::bad_cast();
}

We use typeid to check whether we can safely cast the pointer, and then perform a static_cast (which does no run-time check). But why is that faster at all, and why is it not safe to use? Let’s start with a working example:

struct A {virtual ~A()=default;};
struct B : A{};

int main(){
    A* a= new B;
    fast_dynamic_cast<B>(a);
}

This example works fine, although it isn’t good at showing how fast fast_dynamic_cast is in comparison to the regular dynamic_cast.

But let’s modify the example just a little bit:

struct A {virtual ~A()=default;};
struct B : A{};
struct C : B{};

int main(){
    A* a= new C; // now we create the C object
    fast_dynamic_cast<B>(a); // uncaught exception
}

and now we end up with an uncaught exception. What’s the reason for such behavior?

dynamic_cast checks the whole inheritance tree to be sure that the cast is safe. typeid, on the other hand, immediately jumps to the most derived object and returns the information about it. This is the reason why typeid is fast and also the reason why it is not a good fit for our use case (actually, we should say it’s useless as a general dynamic_cast replacement). Maybe the function should be called most_derived_cast instead of fast_dynamic_cast, simply to not confuse the users.
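The difference between the two checks can be put side by side. The sketch below is a variation of the function above that returns nullptr instead of throwing and uses the hypothetical name exact_type_cast, to make clear that it only accepts an exact match of the dynamic type.

```cpp
#include <typeinfo>

struct A { virtual ~A() = default; };
struct B : A {};
struct C : B {};

// Succeeds only when the dynamic type of *src is exactly Dest.
template <typename Dest, typename Src>
Dest* exact_type_cast(Src* src) {
    return typeid(*src) == typeid(Dest) ? static_cast<Dest*>(src) : nullptr;
}
```

For a pointer to a C object, exact_type_cast<B> fails because the dynamic type is C, while dynamic_cast<B*> succeeds because B is a base of C.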

Summary

As you can see, even though the RTTI mechanisms have been in the language for ages and more experienced developers have had time to get familiar with them, their behavior can still be surprising (or maybe you knew about all of this? Let me know). We did not cover exceptions in detail in this article, since I think they are perfectly explained on Andrzej Krzeminski’s blog.

I hope that I made those tools clearer for you and explained the possible dangers they carry with them. Whenever you are in doubt about the proper usage of the RTTI tools, feel free to come back to this post 🙂



7 responses to “Dynamic_cast and typeid as (non) RTTI tools.”

  1. Good overview of the topic, but the article is difficult to read. Please get an editor or a tutor who can teach you the rules of written English. Pay particular attention to your gratuitous comma use.

    • “Good overview of the topic, but the article is difficult to read” is grammatically wrong. You must remove the comma. Please get an editor or a tutor who can teach you the rules of written English. Pay particular attention to your gratuitous comma use.

    • @ABHISHEK The RTTI stands for Run-Time Type Information. It’s a “tool”, that allows you to acquire the information about the type (like its name or some unique id) when the program is running, rather than when it’s compiling.

      Cheers,
      David

  2. I disagree with the sentiment expressed by an individual above. I found this read on a what is essentially, a complex topic to have the expected level of wait-a-second-i-need-to-read-this-for-the-third-time.
    Thanks for taking the time to write this up, and taking the time to share your findings… Regardless of how hard they might be to express, or understand, by some. It is appreciated.
