C++ Reflection

My specialization at The Game Assembly

What is reflection?

In C++, the members of a class are only accessible through their names until the application is compiled. During compilation, the member names are converted to memory addresses and offsets, and the original structure is forgotten. Reflective programming means that a data type is self-aware even after compilation. In other words, the type carries metadata that describes it as it was before compilation.

The uses of reflection are vast. Programming languages that natively support reflection can use this metadata to provide features such as serialization and editor functionality automatically. Compared to languages where this code has to be written by hand, this can drastically improve iteration speed, lower the barrier of entry to a new codebase and decrease the amount of code that can contain bugs.

C++ does not natively support reflection, so my question was:

How close to real reflection could I get in C++?

For reflection to be usable, it needs a few basic features:

The variables of a type need to be accessible after compilation.
The metadata and registration code need to be generated automatically.

I’ll talk about my implementations and some of the issues I encountered in each area, then go over my conclusion and what future features might look like.

But for now, let’s get right into it.

Type and member iteration

I started by looking at different ways of describing a type’s variables so that they could be used after compilation. I decided to implement a system where code registers a type and its variables on startup, allowing them to be used globally from that point. I chose this because I don’t want inheritance to be a requirement for a type to use reflection. This basic implementation would allow the following code to work.

How I want the interface to work.
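As a rough illustration of an interface of this kind, here is a minimal sketch of a startup registry. The names `Registry`, `TypeInfo`, `Variable` and `Player` are assumptions made for this example and are not taken from the actual implementation.

```cpp
#include <functional>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical description of one reflected member variable.
struct Variable {
    std::string name;
    std::string typeName;
    std::function<void*(void*)> get; // pointer to the member inside the given object
};

// Hypothetical description of one reflected type.
struct TypeInfo {
    std::string name;
    std::vector<Variable> variables;
};

// Global registry that types add themselves to on startup.
class Registry {
public:
    static Registry& Instance() {
        static Registry instance;
        return instance;
    }

    TypeInfo& Register(std::type_index id, std::string name) {
        TypeInfo& info = myTypes[id];
        info.name = std::move(name);
        return info;
    }

    template <typename T>
    const TypeInfo* Find() const {
        const auto it = myTypes.find(std::type_index(typeid(T)));
        return it == myTypes.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::type_index, TypeInfo> myTypes;
};

// An ordinary struct: no base class is required.
struct Player {
    int health = 100;
    float speed = 2.5f;
};

// Runs before main() and registers Player together with its variables.
[[maybe_unused]] static const bool playerRegistered = [] {
    TypeInfo& info = Registry::Instance().Register(typeid(Player), "Player");
    info.variables.push_back({"health", "int",
        [](void* object) -> void* { return &static_cast<Player*>(object)->health; }});
    info.variables.push_back({"speed", "float",
        [](void* object) -> void* { return &static_cast<Player*>(object)->speed; }});
    return true;
}();
```

Generic code can then call `Registry::Instance().Find<Player>()` and walk `variables` without knowing anything else about `Player`.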

This code alone would be very powerful, as the variables of a type could then be used for anything, even if the calling code does not know the concrete type at compile time. However, simply knowing about a variable is not enough: it needs to be readable and/or writable for reflection to be of any use. So I looked at registering variables in a way that allows them to be accessed, and decided to use a macro that generates lambdas for each variable, one for reading it and one for writing to it.

The get variable lambda

The set variable lambda
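A macro that generates one get lambda and one set lambda per variable could look roughly like the sketch below. The macro name `REFLECT_VARIABLE`, the `Variable` struct and the `Transform` example type are hypothetical and only serve to illustrate the idea.

```cpp
#include <functional>
#include <string>
#include <vector>

struct Variable {
    std::string name;
    std::function<void*(void* object)> get;
    std::function<void(void* object, const void* value)> set;
};

// Hypothetical macro: the lambdas capture nothing, because the owning type
// and the member name are baked in when the macro is expanded.
#define REFLECT_VARIABLE(Type, Member)                                        \
    Variable{                                                                 \
        #Member,                                                              \
        [](void* object) -> void* {                                           \
            return &static_cast<Type*>(object)->Member;                       \
        },                                                                    \
        [](void* object, const void* value) {                                 \
            using MemberType = decltype(Type::Member);                        \
            static_cast<Type*>(object)->Member =                              \
                *static_cast<const MemberType*>(value);                       \
        }}

struct Transform {
    float x = 0.0f;
    float y = 0.0f;
};

inline std::vector<Variable> ReflectTransform() {
    return {REFLECT_VARIABLE(Transform, x), REFLECT_VARIABLE(Transform, y)};
}
```

The caller is responsible for passing a `value` pointer of the right type to `set`; a fuller version would store type information next to the lambdas so that this can be checked.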

However, this first implementation was quite naïve, as it didn’t account for constant variables. For constant variables to follow the same workflow, another function is needed: a read function that simply returns a constant pointer. The non-constant version also has to be modified, because the macro generates it for every variable and it therefore has to compile even when the variable is constant. The new version uses a const_cast so that the return value can be treated as a regular pointer, but that cast only runs if the variable isn’t constant to begin with.

Allowing for constant variables
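The sketch below shows one way this could be written, assuming C++17. All names are hypothetical. Because the lambda is not a template, both branches of the `if constexpr` are still compiled, which is why the `const_cast` is needed even though it is never executed for a constant member.

```cpp
#include <functional>
#include <string>
#include <type_traits>

struct Variable {
    std::string name;
    bool isConst;
    std::function<const void*(const void*)> read; // always available
    std::function<void*(void*)> write;            // nullptr for const members
};

// Hypothetical macro, extended to stay compilable for const members.
#define REFLECT_VARIABLE(Type, Member)                                        \
    Variable{                                                                 \
        #Member,                                                              \
        std::is_const_v<decltype(Type::Member)>,                              \
        [](const void* object) -> const void* {                               \
            return &static_cast<const Type*>(object)->Member;                 \
        },                                                                    \
        [](void* object) -> void* {                                           \
            using MemberType = decltype(Type::Member);                        \
            if constexpr (std::is_const_v<MemberType>) {                      \
                (void)object;                                                 \
                return nullptr; /* never hand out write access */             \
            } else {                                                          \
                /* the cast is a no-op here; it only makes the line valid */  \
                /* when the macro is expanded for a const member */           \
                return const_cast<std::remove_const_t<MemberType>*>(          \
                    &static_cast<Type*>(object)->Member);                     \
            }                                                                 \
        }}

struct Settings {
    const int version = 3;
    int volume = 5;
};
```

Callers check the result of `write` (or `isConst`) before writing, and fall back to `read` for constant members.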

With the ability to read and write data, the reflection system was proving to be feasible. However, I had a lot more features to add before the system would be usable.

A feature that gives programmers more control over how a type is used is the ability to mark its variables with attributes. Attributes can mean many things, from marking only certain variables for serialization to telling the editor that a variable should be shown as a slider between x and y. As I want to focus on using reflection in a game engine, letting programmers do these things significantly increases usability. So I implemented it. The implementation is basic, but it allows for all the functionality that I deemed necessary.

Field based attributes

The attributes are simply passed into the registration as variadic arguments, and the attributes themselves accept variadic arguments in their constructor, which removes most restrictions that would otherwise exist. The attribute arguments are stored in a std::variant. While this restricts which types can be passed, I never found myself needing another data type. If a new data type is needed, it only has to be added to the variant and to the GetArgument function, which tries to convert the stored value to the type requested by the caller.
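A minimal sketch of such an attribute, assuming C++17, is shown below. Only `std::variant` and the name `GetArgument` come from the text above; the chosen set of argument types, the `Attribute` and `Variable` classes and the attribute names in the usage note are assumptions for illustration.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>
#include <vector>

// The argument types an attribute can hold (an assumption for this sketch).
using AttributeArgument = std::variant<int, float, std::string>;

class Attribute {
public:
    // Variadic constructor: any number of arguments of the supported types.
    template <typename... Args>
    explicit Attribute(std::string name, Args&&... args) : myName(std::move(name)) {
        (myArguments.emplace_back(std::forward<Args>(args)), ...);
    }

    const std::string& Name() const { return myName; }

    // Tries to convert the stored argument to T.
    // Returns an empty optional if the index is out of range or no conversion exists.
    template <typename T>
    std::optional<T> GetArgument(std::size_t index) const {
        if (index >= myArguments.size()) return std::nullopt;
        return std::visit(
            [](const auto& value) -> std::optional<T> {
                using Stored = std::decay_t<decltype(value)>;
                if constexpr (std::is_convertible_v<Stored, T>) {
                    return static_cast<T>(value);
                } else {
                    return std::nullopt;
                }
            },
            myArguments[index]);
    }

private:
    std::string myName;
    std::vector<AttributeArgument> myArguments;
};

// A reflected variable that receives its attributes as variadic arguments.
struct Variable {
    std::string name;
    std::vector<Attribute> attributes;

    template <typename... Attributes>
    explicit Variable(std::string variableName, Attributes&&... attrs)
        : name(std::move(variableName)) {
        (attributes.emplace_back(std::forward<Attributes>(attrs)), ...);
    }

    const Attribute* FindAttribute(const std::string& attributeName) const {
        for (const Attribute& attribute : attributes) {
            if (attribute.Name() == attributeName) return &attribute;
        }
        return nullptr;
    }
};
```

With this, a registration such as `Variable("speed", Attribute("Range", 0.0f, 10.0f), Attribute("Serialize"))` lets an editor look up `"Range"` and read its bounds with `GetArgument<float>(0)` and `GetArgument<float>(1)`.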

The next task I decided to target was inheritance, as getting the variables of a type with public inheritance should return the variables of the inherited type as well. I decided to implement this with a limit of one inherited class for now, to prove that the system would work.

As the registration isn’t guaranteed to reflect the parent type first, I decided that on the first call of GetVariables() the reflected class adds the variables from the inherited type to its own list of variables. This works because the static_cast in the generated lambdas can convert between the derived type and the type it inherits from without issues.

Registering inheritance
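The lazy merge on the first call to `GetVariables()` can be sketched as follows. The class layout and the names `SetParent`, `Entity` and `Enemy` are assumptions. The explicit upcast lambda is also an addition made for this sketch, so that the pointer conversion stays well-defined without relying on the memory layout of the derived type.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Variable {
    std::string name;
    std::function<void*(void*)> get;
};

class TypeInfo {
public:
    explicit TypeInfo(std::string name) : myName(std::move(name)) {}

    const std::string& Name() const { return myName; }
    void AddVariable(Variable variable) { myVariables.push_back(std::move(variable)); }

    // Records the single parent and how to turn a pointer to this type
    // into a pointer to the parent.
    void SetParent(TypeInfo& parent, std::function<void*(void*)> upcast) {
        myParent = &parent;
        myUpcast = std::move(upcast);
    }

    // Parent variables are merged on the first call, so it does not matter
    // in which order the types were registered.
    const std::vector<Variable>& GetVariables() {
        if (myParent != nullptr && !myHasMergedParent) {
            myHasMergedParent = true;
            for (const Variable& inherited : myParent->GetVariables()) {
                myVariables.push_back({inherited.name,
                    [get = inherited.get, upcast = myUpcast](void* object) {
                        return get(upcast(object));
                    }});
            }
        }
        return myVariables;
    }

private:
    std::string myName;
    std::vector<Variable> myVariables;
    TypeInfo* myParent = nullptr;
    std::function<void*(void*)> myUpcast;
    bool myHasMergedParent = false;
};

struct Entity { int id = 0; };
struct Enemy : Entity { float damage = 1.0f; };

inline TypeInfo& EntityType() {
    static TypeInfo type = [] {
        TypeInfo info("Entity");
        info.AddVariable({"id",
            [](void* o) -> void* { return &static_cast<Entity*>(o)->id; }});
        return info;
    }();
    return type;
}

inline TypeInfo& EnemyType() {
    static TypeInfo type = [] {
        TypeInfo info("Enemy");
        info.AddVariable({"damage",
            [](void* o) -> void* { return &static_cast<Enemy*>(o)->damage; }});
        info.SetParent(EntityType(), [](void* o) -> void* {
            return static_cast<Entity*>(static_cast<Enemy*>(o));
        });
        return info;
    }();
    return type;
}
```

Calling `EnemyType().GetVariables()` returns `damage` followed by the inherited `id`, and later calls reuse the merged list.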

With this, inherited variables can now be iterated over from the derived type. Next, I worked on what would become one of the largest issues with the reflection system: templated types.

In languages that natively support reflection, data types usually inherit from a base ‘Object’ class that can contain basic information about the type, and if the type is templated, it contains data about its template arguments. In C++ there is no such base type. As such, when a templated type is instantiated, there is no way of detecting at runtime which template it came from and which arguments it was given. This isn’t an issue in regular C++, because when you name a template together with its arguments, the compiler creates the instantiation you need. But when trying to implement reflection, not being able to detect the template can create quite ugly code, as you need to check against each instantiation of a type to perform any logic for it. The solution that I came up with goes against one of my goals, that inheritance should not be necessary for a type.

However, due to the way C++ compilation works, I saw no other way than to let custom implementations inherit from a base. I decided to focus on vectors and implemented a ReflectionList type that allows the reflection system to iterate over a templated type without needing a special case for each instantiation. The individual instantiations still need to be registered for the reflection to work, as otherwise the system has no reference to them. I will talk about how I decided to implement that in the code generation part.

A generic reflection enabled list
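A sketch of the idea is shown below. Only the name `ReflectionList` comes from the text above; the base class `ReflectionListBase`, its member functions and the `ResizeTo` helper are assumptions for illustration.

```cpp
#include <cstddef>
#include <typeindex>
#include <typeinfo>
#include <vector>

// Non-templated base: the only thing generic code needs to know about.
class ReflectionListBase {
public:
    virtual ~ReflectionListBase() = default;
    virtual std::size_t Size() const = 0;
    virtual void* At(std::size_t index) = 0;
    virtual void AddDefault() = 0;
    virtual std::type_index ElementType() const = 0;
};

// Templated wrapper around std::vector. Every instantiation shares the base,
// so no special case per element type is needed. T must not be bool, because
// std::vector<bool> cannot hand out pointers to its elements.
template <typename T>
class ReflectionList : public ReflectionListBase {
public:
    std::size_t Size() const override { return myItems.size(); }
    void* At(std::size_t index) override { return &myItems.at(index); }
    void AddDefault() override { myItems.emplace_back(); }
    std::type_index ElementType() const override { return typeid(T); }

    std::vector<T>& Items() { return myItems; }

private:
    std::vector<T> myItems;
};

// Generic code, for example an editor widget, that grows any list.
inline void ResizeTo(ReflectionListBase& list, std::size_t count) {
    while (list.Size() < count) {
        list.AddDefault();
    }
}
```

An editor can use `ElementType()` to look up the reflected element type and then edit each element through the pointer returned by `At()`.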

Code generation

After implementing the basic system, the reflection works. It can be used for entirely generic editor and serialization functionality, which was my main goal, as those are the primary uses in game engines. However, the reflection code currently needs to be written manually, and one of the goals was that it should be generated automatically. As the reflection system was functional, I knew what the generated code needed to look like to register a type. This made working on a code generator a lot easier, as I knew exactly what I needed to support. As I had worked with code generators in previous projects, I decided to use one of them as a base to add features upon.

The main loop of the code generator is simple. It uses a few basic optimizations to make sure that it won’t slow down compilation too much.

The main code generating loop

First, it gets the timestamp of the last time it was run. It can then compare this to when each header file was last updated to see whether the file has been modified since it was last scanned. This minimizes the number of headers that need to be scanned and significantly speeds up the generation process.

Next, it checks if the output file is empty. I implemented this as a safeguard so that if the generator were to fail, you can clear the file and force it to regenerate all of the code from scratch. While this doesn’t always solve the issue, it usually provides a way out of a situation where you can’t compile the project because the generated code is incorrect.

Thirdly, it uses the timestamp and the force-scan flag to collect all the updated headers in the project.

If there are any new headers to look at, it loads the code that has already been generated. This allows previously found types and their namespaces to be loaded, even if they are not explicitly defined in the current header.

The new files are then scanned and their code is generated. I will talk about this in more depth later.

Lastly, it writes all the code into the target file and adds calls to all the generated functions to the main exposed function, whose name is defined by an argument passed to the generator. This argument allows the generator to run once for each project in a solution and generate a distinct function for that project.

Let us talk about the file scanning and the generation of code. I decided to implement a state stack in the file scanner to separate functionality and make the code easier to follow. The state stack separates contexts from each other, as a context does not need to know anything about what happens in another state, only about the lines that get passed to it. An example of how the state stack functions can be seen below.

The state stack
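To make the idea concrete, here is a deliberately naive sketch of a scanner built around a state stack. All class names are hypothetical, and the parsing rules (only `struct`, one declaration per line, brace counting) are simplifications that a real parser would need to go well beyond.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class Scanner;

// One parsing context. It only sees lines fed to it while it is on top of the stack.
class State {
public:
    virtual ~State() = default;
    virtual void OnLine(Scanner& scanner, const std::string& line) = 0;
};

class Scanner {
public:
    explicit Scanner(std::unique_ptr<State> root) { myStack.push_back(std::move(root)); }

    // Both requests are applied after the current state has finished its line.
    void Push(std::unique_ptr<State> state) { myPending = std::move(state); }
    void Pop() { myPopRequested = true; }

    void Feed(const std::string& line) {
        if (myStack.empty()) return;
        myStack.back()->OnLine(*this, line);
        if (myPopRequested) { myStack.pop_back(); myPopRequested = false; }
        if (myPending) { myStack.push_back(std::move(myPending)); }
    }

    std::vector<std::string> variables; // "Type::member" entries found so far

private:
    std::vector<std::unique_ptr<State>> myStack;
    std::unique_ptr<State> myPending;
    bool myPopRequested = false;
};

inline int BraceDelta(const std::string& line) {
    return static_cast<int>(std::count(line.begin(), line.end(), '{') -
                            std::count(line.begin(), line.end(), '}'));
}

// Ignores every line until the braces of a function body are balanced again.
class SkipBlockState : public State {
public:
    explicit SkipBlockState(int depth) : myDepth(depth) {}
    void OnLine(Scanner& scanner, const std::string& line) override {
        myDepth += BraceDelta(line);
        if (myDepth <= 0) scanner.Pop();
    }
private:
    int myDepth;
};

// Collects the member variables of one type and hands function bodies to SkipBlockState.
class TypeState : public State {
public:
    TypeState(std::string typeName, bool isOpen)
        : myTypeName(std::move(typeName)), myIsOpen(isOpen) {}

    void OnLine(Scanner& scanner, const std::string& line) override {
        const int delta = BraceDelta(line);
        if (!myIsOpen) { myIsOpen = delta > 0; return; }
        if (delta < 0) { scanner.Pop(); return; } // closing brace of the type
        if (delta > 0) { scanner.Push(std::make_unique<SkipBlockState>(delta)); return; }
        if (line.find('(') != std::string::npos) return; // function declaration
        const std::size_t end = line.find_first_of("=;");
        if (end == std::string::npos || end == 0) return;
        const std::size_t last = line.find_last_not_of(' ', end - 1);
        if (last == std::string::npos) return;
        const std::size_t space = line.find_last_of(' ', last);
        const std::size_t begin = (space == std::string::npos) ? 0 : space + 1;
        scanner.variables.push_back(myTypeName + "::" + line.substr(begin, last - begin + 1));
    }
private:
    std::string myTypeName;
    bool myIsOpen;
};

// Root state: waits for a type declaration and pushes a TypeState for it.
class FileState : public State {
public:
    void OnLine(Scanner& scanner, const std::string& line) override {
        const std::string keyword = "struct ";
        const std::size_t start = line.find(keyword);
        if (start == std::string::npos) return;
        const std::size_t nameStart = start + keyword.size();
        const std::size_t nameEnd = line.find_first_of(" {:", nameStart);
        scanner.Push(std::make_unique<TypeState>(
            line.substr(nameStart, nameEnd - nameStart), BraceDelta(line) > 0));
    }
};
```

A scanner is created with `Scanner scanner(std::make_unique<FileState>())` and fed one line at a time. Each state only handles the lines of its own context, so `TypeState` never sees the contents of a function body.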

A state stays on top of the stack until its context is finished. This allows areas such as function bodies to be ignored, while the type contexts can focus on parsing only the relevant lines. As most code for the reflection system can be generated using a single macro call per variable, the generated code is quite simple. The generator emits a function for each type, which runs the appropriate registration code. As I had already wrapped the lambda creation behind a macro, I simply needed to generate one line of code per variable.

Generated code
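The emitting side of such a generator can be sketched as two small functions. The macro names `REFLECT_TYPE` and `REFLECT_VARIABLE`, the `Register_` prefix and the `ScannedType` struct are hypothetical, and the sketch assumes type names without namespace qualifiers.

```cpp
#include <sstream>
#include <string>
#include <vector>

// What the scanner found for one type (hypothetical structure).
struct ScannedType {
    std::string name;
    std::vector<std::string> variables;
};

// Emits one registration function per type, with one macro call per variable.
inline std::string GenerateTypeFunction(const ScannedType& type) {
    std::ostringstream out;
    out << "static void Register_" << type.name << "()\n{\n";
    out << "    REFLECT_TYPE(" << type.name << ");\n";
    for (const std::string& variable : type.variables) {
        out << "    REFLECT_VARIABLE(" << type.name << ", " << variable << ");\n";
    }
    out << "}\n";
    return out.str();
}

// Emits the exposed entry point, named by the argument passed to the generator.
inline std::string GenerateEntryPoint(const std::string& functionName,
                                      const std::vector<ScannedType>& types) {
    std::ostringstream out;
    out << "void " << functionName << "()\n{\n";
    for (const ScannedType& type : types) {
        out << "    Register_" << type.name << "();\n";
    }
    out << "}\n";
    return out.str();
}
```

For a `Player` type with the variables `health` and `speed`, this produces a `Register_Player()` function with one `REFLECT_TYPE` line and two `REFLECT_VARIABLE` lines.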

There are a lot of potential issues with code generators, and some can be quite daunting when planning to implement one. The biggest is the risk of the generator failing and producing code that can’t be compiled, which could leave an end user stuck waiting for a fix before they can continue working.

I worried about this for quite a while. However, I realized that we have multiple finished game engines and projects, and that I could run the code generator on them and inspect the generated code. As there are a lot of different libraries in our engines, and most libraries follow slightly different coding standards, running the generator over them covered a lot of ground at once. While not a perfect solution, with the limited time frame that I had, I think this was one of the best paths I could have taken. It exposed multiple bugs in how I parsed variables and functions that extend over multiple lines, bugs in how I looked for template arguments, and a lot of smaller bugs. After a longer period of debugging and verifying results, the parser became quite stable. While it cannot be trusted to handle every coding standard, it consistently worked with the one we use in our game engines. I deemed that this was enough for now, as I wanted to move on to reflection of templated types.

Given that the code scanner only looks at header files, it knows nothing about the actual implementation of a type. Therefore, it cannot know which instantiations of a templated type exist. I decided that the best implementation was to allow programmers to define the types that are of interest to the project. This way, the code generator won’t have to guess the instantiations and won’t create a lot of unused versions of a templated type.

Reflecting templated types
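One possible way to let programmers declare the instantiations of interest is a marker macro that the compiler ignores and the generator searches for. This is purely an illustrative assumption: the macro name `REFLECT_INSTANTIATION` and the `ParseInstantiation` helper are hypothetical, and the actual mechanism may look different.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Expands to nothing for the compiler; it only exists for the generator to find.
#define REFLECT_INSTANTIATION(...)

// Written by the programmer in any header the generator scans.
REFLECT_INSTANTIATION(ReflectionList<int>)

// Generator side: extracts the instantiation from a scanned line, if present.
inline std::optional<std::string> ParseInstantiation(const std::string& line) {
    const std::string marker = "REFLECT_INSTANTIATION(";
    const std::size_t start = line.find(marker);
    const std::size_t end = line.rfind(')');
    if (start == std::string::npos || end == std::string::npos) {
        return std::nullopt;
    }
    const std::size_t begin = start + marker.size();
    if (end < begin) {
        return std::nullopt;
    }
    return line.substr(begin, end - begin);
}
```

The generator can then emit registration code for exactly the listed instantiations and nothing else.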

While not a fully automated solution, it works quite well. I would have preferred a way for the template instantiations to be automatically detected and registered. However, this would have changed the scope of the entire project and would not have been feasible before the deadline.

With that, the code generator is functional: it can scan header files and automatically update the appropriate code for the types in those headers. The final step to make it fully automatic was to call the code generator from the Visual Studio project. Visual Studio can run commands as pre-build, pre-link and post-build events. Running the code generator as a pre-build event allows the code to be generated before Visual Studio starts to build anything. All the generator requires is that you run:

call PreProcessor.exe FunctionToRegisterIn $(ProjectDir)

It will then automatically generate GeneratedReflection.h and GeneratedReflection.cpp. Include these in the project, and the project will have automatic C++ reflection, to the extent that I have described above.

Conclusion and results

Was I able to achieve full reflection in C++? No, but I did manage to recreate the aspects most used in game engines. As my final version covers the aspects I targeted, it can be used for automatic editor and serialization functionality without any workarounds. To showcase this, I integrated it into our game engine, and I’m very pleased with the resulting code. While the editor and serialization code are extremely similar, they achieve entirely different functionality using the same generic type interface.

Interfacing with the reflection

Iterating the type
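As an illustration of such generic code, the sketch below serializes any reflected object to text by iterating its variables. The `Variable` layout, the set of supported types and the `Light` example are assumptions. An editor would follow the same loop and replace the stream writes with widget calls.

```cpp
#include <functional>
#include <sstream>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <vector>

struct Variable {
    std::string name;
    std::type_index type;
    std::function<const void*(const void*)> read;
};

// Generic code: it knows nothing about the concrete type being serialized.
inline std::string Serialize(const std::vector<Variable>& variables, const void* object) {
    std::ostringstream out;
    for (const Variable& variable : variables) {
        const void* value = variable.read(object);
        out << variable.name << '=';
        if (variable.type == typeid(int)) {
            out << *static_cast<const int*>(value);
        } else if (variable.type == typeid(float)) {
            out << *static_cast<const float*>(value);
        } else if (variable.type == typeid(bool)) {
            out << (*static_cast<const bool*>(value) ? "true" : "false");
        } else {
            out << "<unsupported>";
        }
        out << '\n';
    }
    return out.str();
}

// Example type and its hand-written registration (normally generated).
struct Light {
    int id = 1;
    float intensity = 0.5f;
    bool enabled = true;
};

inline std::vector<Variable> LightVariables() {
    return {
        {"id", typeid(int),
         [](const void* o) -> const void* { return &static_cast<const Light*>(o)->id; }},
        {"intensity", typeid(float),
         [](const void* o) -> const void* { return &static_cast<const Light*>(o)->intensity; }},
        {"enabled", typeid(bool),
         [](const void* o) -> const void* { return &static_cast<const Light*>(o)->enabled; }},
    };
}
```

Only the branch on the variable type differs between a serializer and an editor, which is why the two pieces of code end up looking so similar.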

Rendering the widget

A type to edit

The generated editor window

The future / more features

There are a lot of aspects that I didn’t touch on. I’ll mention them here, even though I didn’t have time to implement them in this project.

Functions.

I only allow iteration of variables, meaning that I ignore functions. Theoretically, functions can be implemented in a similar way to variables, only needing different registration code. Reflecting functions would let you mark a function as a button and have it show up as a button in the editor, or mark it as a function that should be automatically exposed as a visual scripting node.
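Since this feature was not implemented, the following is only a speculative sketch of how a function registration could mirror the variable registration. It handles nothing but parameterless member functions, and the names `Function`, `REFLECT_FUNCTION` and `Door` are hypothetical.

```cpp
#include <functional>
#include <string>

struct Function {
    std::string name;
    std::function<void(void* object)> invoke;
};

// Hypothetical macro: wraps a parameterless member function in a lambda.
#define REFLECT_FUNCTION(Type, Name)                                          \
    Function{#Name, [](void* object) { static_cast<Type*>(object)->Name(); }}

struct Door {
    bool open = false;
    void Toggle() { open = !open; }
};
```

An editor could render each reflected `Function` as a button that calls `invoke` on the selected object.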

Optimizations.

Compiling the generated code can be quite slow, as the generator puts the code for all types in one file. It would be an improvement to split the generated code into multiple files. As usually only a few types are updated at a time, not needing to recompile the code for all types would vastly improve compilation times.

Type constructors.

It would not be difficult to generate constructors and other necessary functionality in such a way that they can be used even if the type is unknown at compile time. This could allow a data type to be marked as a component, so that when the engine later iterates over all the types, it can allocate pools and similar structures for all the components using the generated constructors.
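As with functions, this was not implemented, so the sketch below is only one possible shape for generated constructors. It stores a type-erased factory per type name. The names `Instance`, `TypeFactory`, `MakeFactory`, `Factories` and `Health` are hypothetical.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Type-erased owning pointer: the deleter remembers the real type.
using Instance = std::unique_ptr<void, void (*)(void*)>;

struct TypeFactory {
    std::function<Instance()> create;
};

// Generated once per type; usable later without knowing T.
template <typename T>
TypeFactory MakeFactory() {
    return TypeFactory{[]() -> Instance {
        return Instance(new T(), [](void* object) { delete static_cast<T*>(object); });
    }};
}

// Factories keyed by type name, filled in by the registration code.
inline std::unordered_map<std::string, TypeFactory>& Factories() {
    static std::unordered_map<std::string, TypeFactory> factories;
    return factories;
}

struct Health {
    int value = 100;
};
```

After `Factories()["Health"] = MakeFactory<Health>()`, the engine can create instances by name with `Factories().at("Health").create()`.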

Hopefully, you found my findings and experiments to be as interesting as I did. If you have any questions, don’t hesitate to contact me.