Reflecting Over Members of an Aggregate
Implementing ‘reflection’ qualities using standard C++
A little while back a friend of mine and I were talking about serialization of struct objects as raw bytes. He was working with generated objects that contain padding, but the objects needed to be serialized without the padding; for example:
struct Foo
{
    char data0;
    // 3 bytes padding here
    int data1;
};
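To make the padding concrete, here is a minimal sketch that compares the size of the struct with the size of its members. The names `packed_size` and `padding_bytes` are made up for this illustration. On a typical platform with a 4-byte int the difference is 3 bytes, but the exact figure is implementation-defined, so the checks below only assert what always holds.

```cpp
#include <cstddef>  // offsetof, std::size_t

struct Foo
{
    char data0;
    // padding is typically inserted here so that data1 is suitably aligned
    int data1;
};

// Bytes the members occupy on their own, without any padding.
constexpr std::size_t packed_size = sizeof(char) + sizeof(int);

// Bytes the compiler added on top of the members (hypothetical helper name).
constexpr std::size_t padding_bytes = sizeof(Foo) - packed_size;

// data1 can never start before the end of data0.
static_assert(offsetof(Foo, data1) >= sizeof(char));
// The struct is at least as large as its members put together.
static_assert(sizeof(Foo) >= packed_size);
```

A raw copy of the whole object would write `sizeof(Foo)` bytes, while the packed format we are after should only contain `packed_size` bytes.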
In the case he described, there are dozens of object types that need to be serialized, and all are:
- Generated by his organization (so they can’t be modified), and
- Guaranteed to be aggregates
Being a template meta-programmer, I thought it would be a fun challenge to try to solve this in a generic way using C++17, and in the process I accidentally discovered a generic solution for iterating over all members of any aggregate type.
Aggregates
Before I continue, it’s important to know what an aggregate type actually is, and what its properties are.
Simply put, an aggregate is one of two things:
- an array, or
- a struct/class with only public members and public base classes, and with no custom constructors
Formally there are more criteria than this, but the simplification is enough for our purposes.
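The simplified rules above can be checked with the standard `std::is_aggregate_v` trait from C++17. The three types below are hypothetical and exist only for this sketch.

```cpp
#include <type_traits>

struct Plain { int a; double b; };             // aggregate
struct WithCtor { WithCtor(int); int a; };      // has a user-provided constructor
class WithPrivate { int a; public: int b; };    // has a private data member

static_assert(std::is_aggregate_v<Plain>);
static_assert(std::is_aggregate_v<int[4]>);     // arrays are aggregates too
static_assert(!std::is_aggregate_v<WithCtor>);
static_assert(!std::is_aggregate_v<WithPrivate>);
```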
What is special about Aggregates?
Aggregates are special for a couple of reasons.
The first is that aggregates cannot have custom constructors; they can only use the implicitly generated ones (copy/move/default), or be aggregate initialized. This fact will be important in a moment.
The second is that, since C++17, aggregates can be used with structured bindings without any extra work needed from the author of the type; for example:
struct Foo{
    char a;
    int b;
};
...
auto [x,y] = Foo{'X', 42};
It’s also important to know that an aggregate can only be decomposed with structured-bindings into the exact number of members that the aggregate has, so the number of members must be known before the binding expression is used.
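As a small sketch of both points, the hypothetical function `bump_b` below binds the members by reference, so the write goes through to the original object, and the commented-out lines show bindings that a compiler would reject because the number of names does not match the number of members.

```cpp
struct Foo {
    char a;
    int  b;
};

// Binds by reference, so writes go through to the original object.
inline int bump_b(Foo& foo)
{
    auto& [a, b] = foo;  // exactly two names for exactly two members
    (void)a;             // unused in this sketch
    b += 1;
    return foo.b;
}

// auto& [only_one] = foo;   // ill-formed: too few names
// auto& [x, y, z]  = foo;   // ill-formed: too many names
```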
How does this help?
Knowing the above two points about aggregates is actually all we need to develop a generic solution. If we can find out how many members an aggregate object contains, then we will be able to decompose this object with structured bindings and do something with each member!
Detecting members in an aggregate
The obvious first question is how can we know how many members an aggregate holds?
The C++ language does not offer any sizeof-equivalent for the number of members, so we will have to compute this using some template trickery. This is where the first point about aggregates comes into play: an aggregate has no custom constructors, so it is built with aggregate initialization. This means that any aggregate T can be constructed from an expression T{args...}, where args... can hold anywhere from 0 arguments up to the total number of members in the aggregate itself.
So really the question we need to be asking now is: “what is the largest number of arguments I can aggregate-initialize this T from?”
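To see the range of valid argument counts in action, here is a minimal sketch using a hypothetical two-member aggregate named `Sample`. Any members without an initializer are value-initialized, and supplying one initializer too many is rejected by the compiler.

```cpp
struct Sample { int x; int y; };  // hypothetical two-member aggregate

constexpr Sample none{};          // both members value-initialized to 0
constexpr Sample one{1};          // y is value-initialized to 0
constexpr Sample both{1, 2};
// constexpr Sample three{1, 2, 3};  // ill-formed: more initializers than members

static_assert(none.x == 0 && none.y == 0);
static_assert(one.x == 1 && one.y == 0);
static_assert(both.x == 1 && both.y == 2);
```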
Testing if T is aggregate initializable
The first thing we need is a way to test that T is aggregate initializable at all. Since we don’t actually know the type of each member, we need a placeholder argument that the compiler will accept for any member type inside an unevaluated expression:
// A type that can be implicitly converted to *anything*
struct Anything {
    template <typename T>
    operator T() const; // We don't need to define this function
};
We don’t actually need to define the function at all; we only need to have the type itself so that the C++ type system can detect the implicit conversion to any valid argument type.
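As a quick sanity check, the sketch below uses the placeholder inside `decltype`, which never evaluates its operand, so the missing definition causes no link error. `Sample` is a hypothetical aggregate used only for illustration.

```cpp
#include <type_traits>

struct Anything {
    template <typename T>
    operator T() const;  // declared only, never defined
};

struct Sample { int x; double y; };  // hypothetical aggregate for illustration

// The expression is only inspected for its type, never executed.
static_assert(std::is_same_v<decltype(Sample{Anything{}, Anything{}}), Sample>);

// The conversion template lets the placeholder stand in for each member type.
static_assert(std::is_convertible_v<Anything, int>);
static_assert(std::is_convertible_v<Anything, double>);
```

Actually calling the conversion at run time would fail to link, which is fine here because the type is only ever meant to appear in unevaluated contexts.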
From here, all we really need is a simple trait that tests whether the expression T{ Anything{}... } is valid for a specific number of arguments. This is a perfect job for std::index_sequence along with std::void_t to evaluate the expression in a SFINAE context:
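#include <cstddef>      // std::size_t
#include <type_traits>  // std::void_t, std::true_type, std::false_type
#include <utility>      // std::index_sequence, std::make_index_sequence

namespace detail {
    // Primary template: chosen when T{ Anything{}... } is not a valid expression
    template <typename T, typename Is, typename=void>
    struct is_aggregate_constructible_from_n_impl
        : std::false_type{};

    // Specialization: chosen when T can be built from sizeof...(Is) arguments
    template <typename T, std::size_t...Is>
    struct is_aggregate_constructible_from_n_impl<
        T,
        std::index_sequence<Is...>,
        std::void_t<decltype(T{(void(Is),Anything{})...})>
    > : std::true_type{};
} // namespace detail

template <typename T, std::size_t N>
using is_aggregate_constructible_from_n = detail::is_aggregate_constructible_from_n_impl<T,std::make_index_sequence<N>>;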
With this, we can now test whether an aggregate can be aggregate-initialized from a given number of arguments:
struct Point{ int x, y; };
// Is constructible from these 3
static_assert(is_aggregate_constructible_from_n<Point,0>::value);
static_assert(is_aggregate_constructible_from_n<Point,1>::value);
static_assert(is_aggregate_constructible_from_n<Point,2>::value);
// Is not constructible for anything above
static_assert(!is_aggregate_constructible_from_n<Point,3>::value);
static_assert(!is_aggregate_constructible_from_n<Point,4>::value);
Testing the max number of initializer members
All we need now is to find the maximum number of arguments that an aggregate can be constructed with.
This could be done in a number of ways:
- Count iteratively from 0 up to the first failure,
- Count from some pre-defined high number down until we find our first success, or
- Binary search between two predefined values until we find the largest count that succeeds
The first two options grow in template instantiation depth with the number of members an aggregate has. The more members there are, the more iterations are required at compile time, which can increase both compile time and complexity.
The last option is more complex to understand, but should also need the fewest template instantiations and thus should reduce overall compile time.
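The binary-search idea is easier to follow as an ordinary constexpr function before seeing it expressed with templates. The sketch below is not the template solution used in this article; `largest_true` and `at_most_two` are hypothetical names, and the function assumes the predicate is true at `lo` and, once false, stays false for every larger value.

```cpp
#include <cstddef>

// Largest n in [lo, hi) for which pred(n) is true.
// Assumes pred(lo) is true and that pred never becomes true again after
// its first false result.
template <typename Pred>
constexpr std::size_t largest_true(std::size_t lo, std::size_t hi, Pred pred)
{
    while (hi - lo > 1) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (pred(mid)) {
            lo = mid;   // mid works, so the answer is mid or larger
        } else {
            hi = mid;   // mid fails, so the answer is below mid
        }
    }
    return lo;
}

// Stand-in for "can be constructed from n arguments" for a 2-member type.
constexpr bool at_most_two(std::size_t n) { return n <= 2; }

static_assert(largest_true(0, 32, at_most_two) == 2);
```

Each loop iteration halves the remaining range, so this function needs roughly log2(hi - lo) probes, whereas counting up from 0 needs one probe per member.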
For this part, it turns out that @Yakk on Stack Overflow already provided a brilliant solution doing exactly this (modified slightly for this article):
namespace detail {
    template <std::size_t Min, std::size_t Range, template <std::size_t N> class target>
    struct maximize
        : std::conditional_t<
            maximize<Min, Range/2, target>{} == (Min+Range/2)-1,
            maximize<Min+Range/2, (Range+1)/2, target>,
            maximize<Min, Range/2, target>
        >{};

    template <std::size_t Min, template <std::size_t N> class target>
    struct maximize<Min, 1, target>
        : std::conditional_t<
            target<Min>{},
            std::integral_constant<std::size_t,Min>,
            std::integral_constant<std::size_t,Min-1>
        >{};

    template <std::size_t Min, template <std::size_t N> class target>
    struct maximize<Min, 0, target>
        : std::integral_constant<std::size_t,Min-1>
    {};

    template <typename T>
    struct construct_searcher {
        template<std::size_t N>
        using result = is_aggregate_constructible_from_n<T, N>;
    };
} // namespace detail

template <typename T, std::size_t Cap=32>
using constructor_arity = detail::maximize< 0, Cap, detail::construct_searcher<T>::template result >;
This solution makes use of template template arguments, reusing the is_aggregate_constructible_from_n trait above to find the largest number of arguments that can construct a given aggregate, searching between 0 and Cap (default 32).
Testing our solution with the above Point type:
static_assert(constructor_arity<Point>::value == 2u);
Extracting members from an aggregate
Now that we know how many members a given aggregate type can be built from, we can leverage structured bindings to extract the elements out, and perform some basic operation on them.
For our purposes here, let’s simply call some function on each member, in the same way that std::visit does with visitor functions. Note that because structured bindings require the number of elements to be specified statically, we will need one overload per member count, so N overloads for extracting up to N members:
namespace detail {
    template <typename T, typename Fn>
    auto for_each_impl(T& /*agg*/, Fn&& /*fn*/, std::integral_constant<std::size_t,0>) -> void
    {
        // do nothing (0 members)
    }

    template <typename T, typename Fn>
    auto for_each_impl(T& agg, Fn&& fn, std::integral_constant<std::size_t,1>) -> void
    {
        auto& [m0] = agg;
        fn(m0);
    }

    template <typename T, typename Fn>
    auto for_each_impl(T& agg, Fn&& fn, std::integral_constant<std::size_t,2>) -> void
    {
        auto& [m0, m1] = agg;
        fn(m0); fn(m1);
    }

    template <typename T, typename Fn>
    auto for_each_impl(T& agg, Fn&& fn, std::integral_constant<std::size_t,3>) -> void
    {
        auto& [m0, m1, m2] = agg;
        fn(m0); fn(m1); fn(m2);
    }
    // ...
} // namespace detail

template <typename T, typename Fn>
void for_each_member(T& agg, Fn&& fn)
{
    detail::for_each_impl(agg, std::forward<Fn>(fn), constructor_arity<T>{});
}
We simply use integral_constant for tag dispatch on the member count, and forward a function Fn that is called on each member. Let’s test this quickly:
#include <iostream>

int main()
{
    const auto p = Point{1,2};
    // Prints each member of p on its own line
    for_each_member(p, [](auto x){
        std::cout << x << std::endl;
    });
}
This effectively gives us a working solution.
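For readers who want to try the whole idea in one file, here is a compact, self-contained sketch. It is deliberately not the code from this article: it swaps the binary search for a simpler linear count, uses `if constexpr` instead of tag dispatch, and only handles up to 3 members. The names `constructible_from_n`, `arity`, `Triple` and `sum_members` are made up for this illustration.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

struct Anything {
    template <typename T>
    operator T() const;  // declared only; used in unevaluated contexts
};

template <typename T, typename Is, typename = void>
struct constructible_from_n : std::false_type {};

template <typename T, std::size_t... Is>
struct constructible_from_n<T, std::index_sequence<Is...>,
    std::void_t<decltype(T{(void(Is), Anything{})...})>> : std::true_type {};

// Linear count (a simpler stand-in for the binary search): keep adding
// initializers until one more no longer compiles.
template <typename T, std::size_t N = 0>
constexpr std::size_t arity()
{
    if constexpr (constructible_from_n<T, std::make_index_sequence<N + 1>>::value) {
        return arity<T, N + 1>();
    } else {
        return N;
    }
}

template <typename T, typename Fn>
void for_each_member(T& agg, Fn&& fn)
{
    constexpr std::size_t n = arity<std::remove_cv_t<T>>();
    static_assert(n <= 3, "this sketch only handles up to 3 members");
    if constexpr (n == 1) {
        auto& [m0] = agg;
        fn(m0);
    } else if constexpr (n == 2) {
        auto& [m0, m1] = agg;
        fn(m0); fn(m1);
    } else if constexpr (n == 3) {
        auto& [m0, m1, m2] = agg;
        fn(m0); fn(m1); fn(m2);
    }
}

struct Triple { int a; int b; int c; };  // hypothetical test type

// Sums the members by visiting each one in declaration order.
inline int sum_members(const Triple& t)
{
    int total = 0;
    for_each_member(t, [&total](const auto& m) { total += m; });
    return total;
}
```

The linear count keeps the sketch short at the cost of one template instantiation per member, which is exactly the trade-off discussed in the previous section.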
Back to Serialization
Let’s tie this all together now by bringing in serialization. Now that we have an easy way to access each member of a struct, serialization becomes a simple matter of converting all members to a series of bytes with a simple callback.
If we ignore endianness, then serialization of packed data can be accomplished as simply as:
template <typename T>
auto to_packed_bytes(const T& data) -> std::vector<std::byte>
{
    auto result = std::vector<std::byte>{};
    // serialize each member!
    for_each_member(data, [&result](const auto& v){
        const auto* const begin = reinterpret_cast<const std::byte*>(&v);
        const auto* const end = begin + sizeof(v);
        result.insert(result.end(), begin, end);
    });
    return result;
}
...
auto data = Foo{'X', 42};
auto result = to_packed_bytes(data);
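If the byte order does matter, one option is to stop copying the object representation and instead emit each integer byte by byte in a fixed order. The sketch below is one way this could look; `append_little_endian` is a hypothetical helper, it assumes 8-bit bytes, and it only handles unsigned integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Appends an unsigned integer in little-endian order, regardless of the
// byte order of the host. Assumes 8-bit bytes.
template <typename UInt>
void append_little_endian(std::vector<std::byte>& out, UInt value)
{
    static_assert(std::is_unsigned_v<UInt>, "sketch handles unsigned integers only");
    for (std::size_t i = 0; i < sizeof(UInt); ++i) {
        // Move the i-th least significant byte to the bottom and keep only it.
        out.push_back(static_cast<std::byte>((value >> (8 * i)) & 0xFFu));
    }
}
```

A member callback could then dispatch to a helper like this for integer members and fall back to the raw copy for everything else.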
This is much easier than hand-writing a serialization function for each generated object. All that’s needed for this solution is to raise the upper bound Cap of constructor_arity when larger aggregates appear, and to have a matching number of the for_each_impl overloads mentioned earlier.
Closing Thoughts
This gave us an interesting solution to “reflecting” over members of any aggregate in a generic way – all using completely standard C++17.
Originally when I discovered this solution, I had thought that I was the first to encounter this particular method; however, while doing research for this write-up I discovered that the brilliant magic_get library beat me to it. Even so, this technique can still prove useful in any modern codebase, and can be used for a number of weird and wonderful things.
Outside of the serialization example that prompted this discovery, this can also be used in conjunction with other meta-programming utilities, such as getting the unmangled type name at compile time, to generate operator<< overloads for the purposes of printing aggregates on-the-fly.
Possible Improvements
This is just a basic outline of what can be done since this is a tutorial article. There are some possible improvements that are worth considering as well:
- We can propagate the CV-qualifiers of the type by changing T& to T&&, and auto& to auto&& in the bindings (which will then require some more std::forward-ing)
- We could detect the existence of specializations of std::get and std::tuple_size so that this works with more than just aggregates