Add @utf8HtmlLiterals directive for opt-in UTF-8 HTML string literals #12848
DamianEdwards wants to merge 1 commit into main
Implements the @utf8HtmlLiterals directive (with boolean token) that when
enabled causes the Razor compiler to emit HTML literal blocks as C# UTF-8
string literals ("..."u8) instead of regular string literals.
This allows the page's base class to provide a WriteLiteral(ReadOnlySpan<byte>)
overload that writes pre-encoded UTF-8 bytes directly to the output, avoiding
runtime UTF-16 to UTF-8 encoding and associated memory allocations.
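The saving described above can be illustrated with a minimal, self-contained sketch. It is not code from this PR; the variable names are made up for illustration, and it only assumes C# 11 or later for the u8 suffix.

```csharp
using System;
using System.Text;

// Regular literal: held as UTF-16, so producing UTF-8 output needs a
// runtime encode, which allocates a new byte[] on every call.
string html = "<p>Hello</p>";
byte[] encodedAtRuntime = Encoding.UTF8.GetBytes(html);

// UTF-8 literal: the C# compiler stores the already-encoded bytes in the
// assembly, and the expression is a ReadOnlySpan<byte> over that data.
ReadOnlySpan<byte> preEncoded = "<p>Hello</p>"u8;

// Both paths yield the same bytes; only the second skips the encoding step.
Console.WriteLine(preEncoded.SequenceEqual(encodedAtRuntime));
```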
Key changes:
- Add WriteHtmlUtf8StringLiterals flag to RazorCodeGenerationOptions
- Add Utf8HtmlLiteralsDirective and Utf8HtmlLiteralsDirectivePass
- Register directive for Legacy (.cshtml) files, gated on Version_11_0
- Modify CodeWriterExtensions to append u8 suffix when flag is set
- Modify RuntimeNodeWriter to pass flag from options to code writer
- Use documentNode.Options in lowering phase (respects directive passes)
- Relax directive keyword validation to allow digits (not just letters)
Fixes #8429
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Since this is for .NET 11 anyway, it seems like there's plenty of time to get the ROS overload into the runtime. Ideally the compiler should also detect whether such an overload exists and report an error if not; then people can polyfill easily on older runtimes. Also, we should probably have an LDM about this :)
The intent wasn't to get an overload into the runtime, at least not at this time. While we certainly could do that, it would make it slower in that case, not faster, as it would then convert from UTF-8 bytes to
For now, the goal here is to enable other .cshtml-based scenarios (i.e. non-MVC) to leverage this support and get the performance benefits, e.g. Razor Slices.
We don't do this for other directives when custom base classes are being used AFAIK, e.g. if I use
Didn't realize we discussed Razor compiler stuff there now, cool. LMK what the process is.
That at least answers my other (unasked) question about why this is .cshtml only.
Oh, I don't mean the C# LDM. There has been one Razor LDM meeting so far, and I was asleep at the time, but the plan is for there to at least be some committee that can sign off on things, I believe.
I know we don't, but IMO that is not a good thing, and something we should be better about in future. BUT this is also something we can discuss at LDM and see if anyone else agrees with me :)
Summary
Implements the @utf8HtmlLiterals directive (#8429) that, when enabled, causes the Razor compiler to emit HTML literal blocks as C# UTF-8 string literals ("..."u8) instead of regular string literals.
Motivation
HTML content in .cshtml files is emitted as WriteLiteral("html content") calls using regular C# string literals (UTF-16). At runtime, these strings must be encoded to UTF-8 on every request, causing measurable overhead in high-performance scenarios. C# UTF-8 string literals allow the compiler to pre-encode the bytes, eliminating runtime encoding and reducing memory allocations.
Usage
Generates:
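As a rough sketch of the usage and its effect, a page that opts in and a base class that accepts the resulting literals might look like the following. The page markup, the class names SketchPage and HelloPage, and the Execute method are all hypothetical, and the code the Razor compiler really generates will differ in detail; only the u8 suffix and the WriteLiteral(ReadOnlySpan<byte>) overload come from the description here.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical page source (illustrative markup, not taken from the PR):
//
//   @utf8HtmlLiterals true
//   <p>Hello</p>

// Hypothetical base class; it is not MVC's RazorPage or a Razor Slices type.
public abstract class SketchPage
{
    private readonly Stream _output;

    protected SketchPage(Stream output) => _output = output;

    // Overload hit by regular literals: encodes UTF-16 to UTF-8 at runtime.
    public void WriteLiteral(string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value); // allocates per call
        _output.Write(bytes, 0, bytes.Length);
    }

    // Overload hit by "..."u8 literals: the bytes are already UTF-8,
    // so they are copied to the output without any encoding step.
    public void WriteLiteral(ReadOnlySpan<byte> value) => _output.Write(value);

    public abstract void Execute();
}

// Approximates the shape of a generated page class (an assumption).
public sealed class HelloPage : SketchPage
{
    public HelloPage(Stream output) : base(output) { }

    public override void Execute()
    {
        // Directive absent or false would emit: WriteLiteral("<p>Hello</p>");
        // Directive true emits the same literal with the u8 suffix:
        WriteLiteral("<p>Hello</p>"u8);
    }
}

public static class Program
{
    public static void Main()
    {
        using var buffer = new MemoryStream();
        new HelloPage(buffer).Execute();

        // Decode only to show what was written to the stream.
        Console.WriteLine(Encoding.UTF8.GetString(buffer.ToArray()));
    }
}
```

Because a u8 literal has type ReadOnlySpan<byte>, a base class without the second overload would fail to compile the generated call, which is the situation raised in the discussion about detecting a missing overload.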
The page base class must provide a WriteLiteral(ReadOnlySpan<byte>) overload.
Key design decisions
- Boolean directive (@utf8HtmlLiterals true/false), consistent with @preservewhitespace
- Legacy files only (.cshtml: Razor Pages/MVC Views)
- Gated on Version_11_0 (Preview): prevents premature shipping
- Honored in _ViewImports.cshtml: enable globally, disable per-page
Changes
- Add WriteHtmlUtf8StringLiterals flag to RazorCodeGenerationOptions (immutable Flags enum pattern)
- Add Utf8HtmlLiteralsDirective and Utf8HtmlLiteralsDirectivePass
- Register directive for Legacy (.cshtml) files, gated on Version_11_0
- Modify CodeWriterExtensions to append u8 suffix when flag is set
- Modify RuntimeNodeWriter to pass flag from options to code writer
- Use documentNode.Options in lowering phase (respects directive pass modifications)
Fixes #8429