Contributor

@tenpercent commented Jan 19, 2026

Summary

  • Add BUILD_TIME_OPTIMIZATION.md documenting techniques for reducing C++ template instantiation overhead
  • Covers: O(1) pack expansion, named functors, constexpr arrays, fold expressions
  • Includes before/after code examples and metrics from the optimization PRs
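
As an illustration of how the pack expansion and constexpr array techniques listed above can combine, here is a minimal sketch of inverting a permutation sequence. `Sequence`, `index_t`, `invert_permutation`, `inverse_impl`, and `sequence_map_inverse_t` are hypothetical stand-ins written for this example and are not taken from the PR; the sketch assumes a non-empty, valid permutation.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

using index_t = int; // assumption: stand-in for the library's index type

// Hypothetical stand-in for a compile-time integer sequence type.
template <index_t... Is>
struct Sequence
{
};

// Compute the inverse permutation with an ordinary constexpr loop.
// The loop is O(N) constant evaluation, but it adds no per-element
// template instantiations. Assumes a non-empty, valid permutation.
template <index_t... Is>
constexpr std::array<index_t, sizeof...(Is)> invert_permutation()
{
    constexpr index_t perm[] = {Is...};
    std::array<index_t, sizeof...(Is)> inv{};
    for(std::size_t i = 0; i < sizeof...(Is); ++i)
        inv[perm[i]] = static_cast<index_t>(i);
    return inv;
}

// Turn the constexpr array back into a type with a single pack expansion.
template <index_t... Is, std::size_t... Ks>
constexpr auto inverse_impl(Sequence<Is...>, std::index_sequence<Ks...>)
{
    constexpr auto inv = invert_permutation<Is...>();
    return Sequence<inv[Ks]...>{};
}

template <index_t... Is>
constexpr auto map_inverse(Sequence<Is...> seq)
{
    return inverse_impl(seq, std::make_index_sequence<sizeof...(Is)>{});
}

template <typename Seq>
using sequence_map_inverse_t = decltype(map_inverse(Seq{}));
```

With this sketch, `sequence_map_inverse_t<Sequence<2, 0, 1>>` names `Sequence<1, 2, 0>`: the array does the bookkeeping and the type is produced in one expansion rather than through a chain of recursive specializations.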

Tracking issue: #3575

@tenpercent force-pushed the mpodkory/build-time-docs branch from 5de5575 to 52fa8f6 on January 19, 2026 21:41
Add documentation for:
- sequence_map_inverse: O(N) to O(1) via pack expansion (95% time reduction)
- calculate_element_space_size: fold expression (73% time reduction)

Update case studies section with these optimizations.
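
To make the fold-expression item in the commit message concrete, here is a hedged sketch contrasting a recursive formulation with a fold expression. It assumes the common strided-layout formula, 1 plus the sum of (length - 1) * stride over all dimensions; the function names and the `std::array` interface are invented for this example and may differ from the real `calculate_element_space_size`.

```cpp
#include <array>
#include <cstddef>
#include <utility>

using index_t = int; // assumption: stand-in for the library's index type

// Recursive formulation: one function template instantiation per dimension.
template <std::size_t I = 0, std::size_t N>
constexpr index_t element_space_size_recursive(const std::array<index_t, N>& lengths,
                                               const std::array<index_t, N>& strides)
{
    if constexpr(I == N)
        return 1;
    else
        return (lengths[I] - 1) * strides[I] +
               element_space_size_recursive<I + 1>(lengths, strides);
}

// Fold-expression formulation: the whole sum is a single expression,
// so only one helper instantiation is needed per rank.
template <std::size_t N, std::size_t... Is>
constexpr index_t element_space_size_fold(const std::array<index_t, N>& lengths,
                                          const std::array<index_t, N>& strides,
                                          std::index_sequence<Is...>)
{
    return (index_t{1} + ... + ((lengths[Is] - 1) * strides[Is]));
}

template <std::size_t N>
constexpr index_t element_space_size(const std::array<index_t, N>& lengths,
                                     const std::array<index_t, N>& strides)
{
    return element_space_size_fold(lengths, strides, std::make_index_sequence<N>{});
}
```

For a packed 2x3x4 tensor with strides 12, 4, 1, both versions evaluate to 1 + 12 + 8 + 3 = 24.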
{
return Sequence<F{}(Number<Is>{})...>{};
}
using type = decltype(make(__make_integer_seq<std::integer_sequence, index_t, N>{}));
Collaborator

There are two things here: the compiler intrinsic __make_integer_seq and the removal of the recursive generation. How much do you gain from pure standard C++17 (removing the recursion only) versus the full optimization with the compiler intrinsic? Is this fixed in C++2x?

Contributor Author

As far as I can tell, there's no way to move away from recursive type generation without using the intrinsic. You can check out the related LLVM and GCC issues, such as this

Contributor Author

@tenpercent Jan 19, 2026

__make_integer_seq is actually wrapped in std::make_integer_sequence, but this is one more layer of template instantiation. Never mind: the main problem was drift between the documentation and the implementation. I updated the docs with the rationale for using the intrinsic.
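
For readers following this thread, here is a self-contained sketch of the pattern in the diff fragment above, written against the standard `std::make_integer_sequence` wrapper instead of calling `__make_integer_seq` directly. `Sequence`, `Number`, `sequence_gen_impl`, `sequence_gen_t`, and `Doubler` are stand-ins defined for this example only. Whether the standard wrapper lowers to a compiler builtin depends on the standard library and compiler in use, so treat this as the portable spelling and not as the code in the PR.

```cpp
#include <type_traits>
#include <utility>

using index_t = int; // assumption: stand-in for the library's index type

// Hypothetical stand-ins for the library's Sequence and Number types.
template <index_t... Is>
struct Sequence
{
};

template <index_t I>
using Number = std::integral_constant<index_t, I>;

// Apply functor F to every index in a single pack expansion.
template <typename F, index_t... Is>
constexpr auto sequence_gen_impl(std::integer_sequence<index_t, Is...>)
{
    return Sequence<F{}(Number<Is>{})...>{};
}

// The index pack comes from the standard wrapper, not the raw intrinsic.
template <index_t N, typename F>
using sequence_gen_t =
    decltype(sequence_gen_impl<F>(std::make_integer_sequence<index_t, N>{}));

// A named functor (hypothetical) mapping index I to 2 * I.
struct Doubler
{
    template <index_t I>
    constexpr index_t operator()(Number<I>) const
    {
        return 2 * I;
    }
};

static_assert(std::is_same_v<sequence_gen_t<4, Doubler>, Sequence<0, 2, 4, 6>>);
```

The generator itself contains no recursion; any remaining cost difference between this form and the direct intrinsic would come from the extra alias layer mentioned in the comment above, which this sketch does not measure.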


## Optimization Techniques

### 1. Replace O(N) Recursion with O(1) Pack Expansion
Collaborator

This example is for replacing recursion with pack expansion, but it also has a compiler intrinsic. Can we split that into two different examples? I think everyone would agree with the parameter pack expansion, but considering the compiler intrinsic is a separate issue.

Contributor Author

The compiler intrinsic doesn't exist without the parameter packs, so it's impossible to separate.
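
To illustrate the two aspects discussed in this thread, the sketch below puts a hand-written recursive generator next to a pack-expansion one. All names (`Sequence`, `iota_recursive`, `to_sequence`, `iota_pack_t`) are hypothetical and exist only for this example. Note that the pack-expansion version still needs an index pack from somewhere; here it comes from `std::make_integer_sequence`, which is the part the discussion about the intrinsic concerns.

```cpp
#include <type_traits>
#include <utility>

using index_t = int; // assumption: stand-in for the library's index type

template <index_t... Is>
struct Sequence
{
};

// Recursive generation: building N indices instantiates a chain of
// N + 1 class template specializations, each inheriting from the next.
template <index_t N, index_t... Is>
struct iota_recursive : iota_recursive<N - 1, N - 1, Is...>
{
};

template <index_t... Is>
struct iota_recursive<0, Is...>
{
    using type = Sequence<Is...>;
};

// Pack expansion: the indices arrive as one pack and are expanded once.
template <index_t... Is>
constexpr Sequence<Is...> to_sequence(std::integer_sequence<index_t, Is...>)
{
    return {};
}

template <index_t N>
using iota_pack_t = decltype(to_sequence(std::make_integer_sequence<index_t, N>{}));

// Both spellings name the same type.
static_assert(std::is_same_v<iota_recursive<4>::type, iota_pack_t<4>>);
```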

@@ -0,0 +1,327 @@
# Build Time Optimization
Collaborator

Let's move this down into the source, where we are making the code changes. It's not customer documentation, it's aimed at the developers.

Can we align on the goal of this doc? This is kind of all over the place. If it's general info, it should probably go in the tracking bug. In fact, the cleanest way is some comments in the tracking bug that link to documented changes in the source. Then the only need for a markdown file is to track files we need to work on and what has been optimized. The scripts should be documented, so that's not needed here.

Contributor Author

Yep, let's move it to, say, include/ck.

The high-level goal is to document the optimization attempts for the metaprogramming constructs that we have, and to collect the techniques in an accessible way.

When relying on the tracking issue, we need to keep in mind that the source code and the GitHub infrastructure are different sources of information. From recent discussions, I had the impression that we wanted to start storing design documentation in the source, and this file would be a start.

@tenpercent force-pushed the mpodkory/build-time-docs branch 10 times, most recently from ecef7c8 to 80c4f97 on January 20, 2026 01:12
Changes:
- Move to include/ck/ (developer documentation, not customer-facing)
- Add tracking issue link at top
- Fix section structure (sequential numbering 1-5)
- Remove mismatched transform_tensor_descriptor example
- Clarify O(N) constexpr loop vs template recursion distinction
- Remove "Case Studies" section (redundant with tracking issue)
- Simplify examples for clarity
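
The distinction between an O(N) constexpr loop and O(N) template recursion mentioned in the change list can be sketched as follows. Both versions do linear work, but the loop runs inside the constant evaluator within one instantiation, while the recursion creates up to one class template specialization per element. `find_recursive` and `find_loop` are hypothetical names used only for this illustration.

```cpp
#include <type_traits>

using index_t = int; // assumption: stand-in for the library's index type

// Template recursion: up to one specialization per element searched.
// Primary template handles the empty pack (target not found).
template <index_t Target, index_t Pos, index_t... Is>
struct find_recursive : std::integral_constant<index_t, -1>
{
};

template <index_t Target, index_t Pos, index_t Head, index_t... Tail>
struct find_recursive<Target, Pos, Head, Tail...>
    : std::conditional_t<Head == Target,
                         std::integral_constant<index_t, Pos>,
                         find_recursive<Target, Pos + 1, Tail...>>
{
};

// Constexpr loop: still O(N) evaluation steps, but a single function
// template instantiation regardless of where the target sits.
template <index_t Target, index_t... Is>
constexpr index_t find_loop()
{
    // Trailing 0 keeps the array non-empty; it is never inspected.
    constexpr index_t values[] = {Is..., 0};
    for(index_t i = 0; i < static_cast<index_t>(sizeof...(Is)); ++i)
    {
        if(values[i] == Target)
            return i;
    }
    return -1;
}

static_assert(find_recursive<7, 0, 5, 7, 9>::value == find_loop<7, 5, 7, 9>());
```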
@tenpercent force-pushed the mpodkory/build-time-docs branch from 80c4f97 to 71413bd on January 20, 2026 01:14

3 participants