diff --git a/README.md b/README.md index b0ee268..8f379f4 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ A comprehensive guide to mastering macros in Rust, from basic concepts to advanc ## 📖 Overview -This tutorial is designed for Rust developers who want to deepen their understanding of macros, one of the most powerful—and often misunderstood—features of the language. Whether you're struggling with `macro_rules!`, exploring declarative macros, or diving into custom derive implementations, this guide will walk you through every step. +Welcome to the Rust Macros Tutorial! This guide is a hands-on, interactive way to learn about Rust's powerful metaprogramming system. You will learn about declarative macros (`macro_rules!`) and all three types of procedural macros: Derive, Attribute, and Function-like. Through editable code examples, you'll discover how to use macros for code generation and domain-specific abstractions, while learning best practices for safety and maintainability. ## 💻 Prerequisites diff --git a/src/SUMMARY.md b/src/SUMMARY.md index bd3840a..c981a1f 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -9,3 +9,9 @@ - [Hygiene](./macros_1.0/hygiene.md) - [Macros 2.0](./macros_2.0.md) - [Procedural macros](./procedural_macros.md) + - [TokenStream](./procedural_macros/token_stream.md) + - [proc-macro2, syn, quote](./procedural_macros/proc_macro2_syn_quote.md) + - [Syntax Tree](./procedural_macros/syntax_tree.md) + - [Span](./procedural_macros/span.md) + - [quote](./procedural_macros/quote.md) + - [Procedural Macro Crates](./procedural_macros/proc_macro_crates.md) diff --git a/src/procedural_macros.md b/src/procedural_macros.md index 175e016..4d35749 100644 --- a/src/procedural_macros.md +++ b/src/procedural_macros.md @@ -1 +1,18 @@ -# Procedural macros +# Procedural Macros + +Procedural macros allow you to run custom code during the compilation process. 
They operate with the same level of access as the compiler itself, including access to standard I/O streams and the file system. Because of this, procedural macros share the same security considerations as Cargo build scripts. However, they are also subject to several unique constraints: + +1. They must be defined in a dedicated crate whose crate type is `proc-macro`. +2. A `proc-macro` crate can only export procedural macros; it cannot export ordinary functions or types. +3. Procedural macros cannot be used within the same crate that defines them. +4. Defining them requires the compiler-provided `proc_macro` crate, which is only available within these specialized macro crates. + +These constraints can make testing and debugging procedural macros challenging, often creating a barrier for developers who are just starting out. + +Fortunately, [David Tolnay](https://github.com/dtolnay), a prominent figure in the Rust community, has created several essential crates that simplify this workflow. In this chapter, we will focus on three of them: [`syn`](https://github.com/dtolnay/syn), [`quote`](https://github.com/dtolnay/quote), and [`proc-macro2`](https://github.com/dtolnay/proc-macro2). These crates are the de facto standard for building procedural macros. + +By using these crates, we can write our macro logic in a standard Rust crate where it can be easily tested and debugged. Once the logic is verified, we can then integrate it into a formal `proc-macro` crate. + +Instead of jumping straight into the complexities of procedural macro crates, we will begin our journey by learning how to write "code that generates code" within a standard Rust environment. + +Let's go! 
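As a rough, standard-library-only sketch of what "code that generates code" can look like when it lives in an ordinary crate, the hypothetical `generate_greeter` function below builds source text as a `String`, so it can be checked with plain assertions. This is an illustration under stated assumptions: real macro logic would normally return a `proc_macro2::TokenStream` built with `quote!` rather than a string, and the function name and generated code are invented for this example.

```rust
/// Hypothetical generator: returns the source text of a function that
/// prints a greeting. A `String` stands in for a token stream here so the
/// sketch needs nothing beyond the standard library.
fn generate_greeter(fn_name: &str, target: &str) -> String {
    // `{{` and `}}` are escaped braces; `{fn_name}` and `{target}` are filled in.
    format!(
        "pub fn {fn_name}() {{ println!(\"Hello, {target}!\"); }}",
        fn_name = fn_name,
        target = target
    )
}

fn main() {
    // Because the generator is an ordinary function, it can be run,
    // printed, and unit-tested like any other Rust code.
    let code = generate_greeter("greet_world", "world");
    println!("{}", code);
}
```

The point of the design is only that the generating logic is a normal function: it can be exercised in tests without a `proc-macro` crate, and wrapped in one later.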
diff --git a/src/procedural_macros/proc_macro2_syn_quote.md b/src/procedural_macros/proc_macro2_syn_quote.md new file mode 100644 index 0000000..c339469 --- /dev/null +++ b/src/procedural_macros/proc_macro2_syn_quote.md @@ -0,0 +1,7 @@ +# The Procedural Macro Ecosystem: proc-macro2, syn, and quote + +Procedural macros function by manipulating the `proc_macro::TokenStream` type. However, the standard `proc_macro` crate is a compiler built-in and is restricted to crates explicitly defined as procedural macro libraries. To bypass these limitations and streamline development, the Rust ecosystem relies on three foundational crates: + +- **`proc-macro2`**: This crate offers an API almost identical to `proc_macro` but can be used in any environment (including tests and binaries). It acts as a bridge, facilitating seamless conversion to and from the compiler-native `proc_macro::TokenStream`. +- **`syn`**: A parsing library that provides a complete syntax tree for any valid Rust source code. It also makes defining custom syntax trees easy. +- **`quote`**: As the counterpart to `syn`, this crate enables you to convert Rust code templates back into a `proc_macro2::TokenStream`. Its support for "quasi-quoting" allows you to interpolate variables directly into the generated source code. diff --git a/src/procedural_macros/proc_macro_crates.md b/src/procedural_macros/proc_macro_crates.md new file mode 100644 index 0000000..f0a11cd --- /dev/null +++ b/src/procedural_macros/proc_macro_crates.md @@ -0,0 +1,154 @@ +# Procedural Macro Crates + +Congratulations, fellow Rustacean🦀! You've reached a major milestone. Now, you're ready to learn the final, essential step: how to package and use your procedural macros in a crate. + +## Creating a Proc-Macro Crate + +A dedicated library crate is mandatory for procedural macros because they must have the `proc-macro` crate type enabled. Crucially, they cannot be defined within the same crate that uses them. 
+ +To create one, run: + +```bash +cargo new --lib my_proc_macros +cd my_proc_macros +cargo add syn quote proc-macro2 --features syn/full +``` + +The `full` feature of `syn` is needed for item types such as `ItemFn`, which are used later in this chapter. Then, update your `Cargo.toml` to mark the library as a `proc-macro` crate: + +```toml +[lib] +proc-macro = true +``` + +## Function-like Procedural Macros + +Function-like macros are invoked with a trailing exclamation mark (like `println!`). + +```rust,ignore +use proc_macro::TokenStream; +use quote::quote; +use syn::{parse_macro_input, Ident}; + +// In `src/lib.rs`, define a function-like procedural macro. +#[proc_macro] +pub fn hello_macro(input: TokenStream) -> TokenStream { + // Parse the input tokens into a syn Ident + let name = parse_macro_input!(input as Ident); + let name_str = name.to_string(); + + let output = quote! { + println!("Hello, {}!", #name_str); + }; + + TokenStream::from(output) +} +``` + +We can now use `hello_macro` in a normal crate: + +```rust,ignore +use my_proc_macros::hello_macro; + +fn main() { + hello_macro!(world); +} +``` + +### The `input` Parameter +The `input` contains the tokens enclosed by whatever delimiters (parentheses `()`, brackets `[]`, or braces `{}`) are used when calling the macro. In the example above, the input is `world`. Upon expansion, the call `hello_macro!(world)` is replaced entirely by the macro's `output`. + +### The `parse_macro_input!` Macro +The `parse_macro_input!` macro parses the input `TokenStream` into a specific `syn` syntax tree node. If parsing fails, it returns early from the macro function with a compile error that points at the offending tokens. + +The basic syntax is `parse_macro_input!(<tokens> as <Type>)`. This convenience macro is specifically designed for use within `proc-macro` crates. + +## Attribute Procedural Macros + +Attribute macros define custom attributes that can be attached to items like functions or structs. + +```rust,ignore +use proc_macro::TokenStream; +use quote::quote; +use syn::{parse_macro_input, ItemFn}; + +// In `src/lib.rs`, define an attribute procedural macro. 
+#[proc_macro_attribute] +pub fn my_attribute(_attr: TokenStream, item: TokenStream) -> TokenStream { + // We parse the item (e.g., a function) the attribute is attached to + let input_item = parse_macro_input!(item as ItemFn); + + // We keep the original item as-is + let output = quote! { + #input_item + }; + + TokenStream::from(output) +} +``` + +We can use `my_attribute` in a normal crate: + +```rust,ignore +use my_proc_macros::my_attribute; + +#[my_attribute(attr1, attr2, key=value)] +pub fn foo() { + println!("Hello from foo!"); +} +``` + +### Parameters in Attribute Macros +- **`attr`**: The tokens inside the attribute's parentheses (e.g., `attr1, attr2`). +- **`item`**: The tokens for the item the attribute is attached to (e.g., a function, struct, or enum). + +Unlike function-like macros, which replace the call itself, an attribute macro replaces the **entire item** it is attached to with its `output`. + +## Derive Procedural Macros + +Derive macros create new items (usually trait implementations) for an existing item. + +```rust,ignore +use proc_macro::TokenStream; +use quote::quote; +use syn::{parse_macro_input, DeriveInput}; + +#[proc_macro_derive(MyDerive, attributes(my_helper))] +pub fn my_derive(input: TokenStream) -> TokenStream { + // Parse the entire struct/enum/union + let input = parse_macro_input!(input as DeriveInput); + let name = input.ident; + + let output = quote! { + impl MyTrait for #name { + fn hello() { + println!("Hello from my derive!"); + } + } + }; + + TokenStream::from(output) +} +``` + +We can use `MyDerive` in a normal crate: + +```rust,ignore +use my_proc_macros::MyDerive; + +#[derive(MyDerive)] +pub struct MyStruct { + #[my_helper] + pub field1: i32, +} +``` + +### The `input` Parameter +The `input` TokenStream represents the entire item (struct, enum, or union) that the `#[derive(...)]` attribute is decorating. + +### Append-only Expansion +Unlike attribute macros, a derive macro's output does **not** replace the input item. 
Instead, the output (usually an implementation block) is **appended** to the module or block containing the original item. + +--- + +*Happy macro programming!* 🦀 diff --git a/src/procedural_macros/quote.md b/src/procedural_macros/quote.md new file mode 100644 index 0000000..330ed01 --- /dev/null +++ b/src/procedural_macros/quote.md @@ -0,0 +1,31 @@ +# The `quote` Crate + +We have discussed how to parse input using `syn`. Now, it's time to generate output with `quote`. + +## quote! + +The `quote!` macro takes a template of Rust code and returns a `proc_macro2::TokenStream`. Its interpolation and repetition syntax resembles that of `macro_rules!`. + +|Macro Name|Interpolation|Interpolation Type|Repetition| +|:---|:---|:---|:---| +|macro_rules!|`$var`|metavariable|`$(<...>)[separator]<*\|?\|+>`| +|quote!|`#var`|any type implementing `ToTokens`|`#(<...>)[separator]*`| + +```rust,editable,compile_fail +fn main() { + let f: syn::ItemFn = syn::parse_quote!( + fn foo(x: i32) -> i32 { + println!("Hello World"); + } + ); + let fn_name = f.sig.ident; + let fn_input = f.sig.inputs; + let fn_output = f.sig.output; + let fn_block = f.block; + // Convert the function to be public + let token_stream = quote::quote! { + pub fn #fn_name (#fn_input) #fn_output #fn_block + }; + println!("{}", token_stream); +} +``` diff --git a/src/procedural_macros/span.md b/src/procedural_macros/span.md new file mode 100644 index 0000000..f891e70 --- /dev/null +++ b/src/procedural_macros/span.md @@ -0,0 +1,70 @@ +# Spans + +When parsing malformed input, simply stating what is wrong without indicating its location is not very helpful for debugging. + +To provide precise error locations, we use `Span`s. A `Span` is an opaque value representing a specific range of source code. While spans cannot be modified, they can be created or retrieved. Their primary purpose is error reporting, and every token carries an associated `Span`. 
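To make the role of a span more concrete, here is a minimal standard-library sketch. The `Span` and `Token` structs and the `tokenize` and `expect` helpers are hypothetical stand-ins invented for illustration; they are not the `proc_macro2::Span` API, which is opaque. The sketch only shows the idea that each token remembers a source range and that an error can report it.

```rust
/// Hypothetical span: a half-open byte range into the source text.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// Hypothetical token that remembers where it came from.
#[derive(Debug)]
struct Token {
    text: String,
    span: Span,
}

/// Splits the source on whitespace and records each token's byte range.
fn tokenize(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut start: Option<usize> = None;
    for (i, c) in src.char_indices() {
        match (c.is_whitespace(), start) {
            // A token begins at the first non-whitespace character.
            (false, None) => start = Some(i),
            // Whitespace ends the token that is currently open.
            (true, Some(s)) => {
                tokens.push(Token { text: src[s..i].to_string(), span: Span { start: s, end: i } });
                start = None;
            }
            _ => {}
        }
    }
    // Flush a token that runs to the end of the input.
    if let Some(s) = start {
        tokens.push(Token { text: src[s..].to_string(), span: Span { start: s, end: src.len() } });
    }
    tokens
}

/// Checks one token and, on mismatch, returns a message plus the token's span.
fn expect(token: &Token, expected: &str) -> Result<(), (String, Span)> {
    if token.text == expected {
        Ok(())
    } else {
        Err((format!("expected `{}`, found `{}`", expected, token.text), token.span))
    }
}

fn main() {
    let tokens = tokenize("< div ) hello");
    // The third token should have been `>`; the error carries its location.
    if let Err((msg, span)) = expect(&tokens[2], ">") {
        println!("Error: {} at bytes {}..{}", msg, span.start, span.end);
    }
}
```

Without the span, the message could only say that a `>` was expected somewhere; with it, the report can point at the exact token.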
+ +## Example with Coarse-grained Spans + +```rust,compile_fail +# use syn::{ +# parse::{Parse, ParseStream}, +# *, +# }; +# +# struct HtmlNode; +# impl Parse for HtmlNode { +# fn parse(input: ParseStream) -> Result<Self> { +# input.parse::<Token![<]>()?; +# input.parse::<Ident>()?; +# input.parse::<Token![>]>()?; +# input.parse::<LitStr>()?; +# input.parse::<Token![<]>()?; +# input.parse::<Token![/]>()?; +# input.parse::<Ident>()?; +# input.parse::<Token![>]>()?; +# Ok(HtmlNode) +# } +# } +# fn main() { + // `quote!` assigns the same span to all tokens inside the block. + let input = quote::quote! {
"Hello World"
}; + if let Err(e) = syn::parse2::<HtmlNode>(input) { + println!("Error: {} at {:?}", e, e.span()); + } +# } +``` + +## Example with Precise Spans + +```rust,editable,compile_fail +use std::str::FromStr; + +use proc_macro2::TokenStream; +use syn::{ + parse::{Parse, ParseStream}, + *, +}; + +struct HtmlNode; +impl Parse for HtmlNode { + fn parse(input: ParseStream) -> Result<Self> { + input.parse::<Token![<]>()?; + input.parse::<Ident>()?; + input.parse::<Token![>]>()?; + input.parse::<LitStr>()?; + input.parse::<Token![<]>()?; + input.parse::<Token![/]>()?; + input.parse::<Ident>()?; + input.parse::<Token![>]>()?; + Ok(HtmlNode) + } +} +fn main() { + // `TokenStream::from_str` assigns a unique span to each individual token. + let input = TokenStream::from_str(r#"
"Hello World"
"#).unwrap(); + if let Err(e) = syn::parse2::<HtmlNode>(input) { + println!("Error: {} at {:?}", e, e.span()); + } +} +``` diff --git a/src/procedural_macros/syntax_tree.md b/src/procedural_macros/syntax_tree.md new file mode 100644 index 0000000..96040cc --- /dev/null +++ b/src/procedural_macros/syntax_tree.md @@ -0,0 +1,159 @@ +# Syntax Tree + +A syntax tree is a hierarchical representation of source code. It transforms a flat, linear stream of tokens into a structured format that is easy to process and manipulate. The `syn` crate provides a complete syntax tree that can represent any valid Rust source code. We can also use `syn` to define our own syntax trees; for example, we can define a syntax tree for HTML, CSS, or any other DSL. + +A syntax tree is made up of syntax tree nodes. A syntax tree node can be a token, a group of tokens, or a value of a type that implements the `syn::parse::Parse` trait. + +## See a Syntax Tree Node in Action + +`syn::File` is a syntax tree (root) node that represents a full source file. + +```rust,editable,compile_fail +use quote::quote; + +fn main() { + let token_stream = quote! { + fn main(){ + println!("Hello, world!"); + } + }; + + let syntax_tree: syn::File = syn::parse2(token_stream).unwrap(); + + println!("{:#?}", syntax_tree); +} +``` + +> [!TIP] +> Don't worry if the output seems overwhelming. You don't need to understand it unless you are working with a full Rust source file. +> +> Furthermore, we won't be using `syn::File` in this tutorial. + +We will learn how to define our own syntax tree nodes. But first, let's explore some basic parsing techniques. + +## Parsing a Single Token + +### Token! + +[Token!](https://docs.rs/syn/latest/syn/macro.Token.html) is a type macro that expands to the Rust type representing a specific token. + +```rust,editable,compile_fail +use syn::*; + +fn main() { + // Parse the `pub` keyword + let input = quote::quote! {pub}; + let _token: Token![pub] = parse2(input).unwrap(); + // Or use parse_quote! 
+ let _token: Token![pub] = parse_quote! {pub}; + + // Parse the `struct` keyword + let _token: Token![struct] = parse_quote! {struct}; + + // Parse `+=` + let _token: Token![+=] = parse_quote! {+=}; + + // Parse `::` + let _token: Token![::] = parse_quote! {::}; + + // Error: `pub fn main() {}` is not a single token + // let _token: Token![pub] = parse_quote! {pub fn main() {}}; +} +``` + +### custom_keyword! + +```rust,editable,compile_fail +use syn::*; + +// We define custom keywords in a `kw` or `keywords` module by convention. +mod kw{ + syn::custom_keyword!(div); +} + +fn main() { + let _token: kw::div = parse_quote! {div}; +} +``` + +## Parsing a Syntax Tree Node + +```rust,editable,compile_fail +use syn::*; + +fn main() { + let _node: ItemFn = parse_quote! {fn main() {println!("Hello, world!")}}; + let _node: ItemStruct = parse_quote! {struct MyStruct {field: i32}}; + // `syn::DeriveInput` is a syntax tree node that represents any valid input to a derive macro. + let _node: DeriveInput = parse_quote! {#[derive(Debug)] struct MyStruct {field: i32}}; +} +``` + +## Parsing a Custom Syntax Tree Node + +There are two ways to parse a custom syntax tree node: + +1. Use a function or closure. +2. Define a custom syntax tree node type that implements the `syn::parse::Parse` trait. + +### Using a function or closure + +```rust,editable,compile_fail +use quote::*; +use syn::{ + parse::{ParseStream, Parser}, + *, +}; + +fn main() { + let input = quote! { +
<div>"Hello World"</div>
+ }; + // parse::Parser::parse2(|input: ParseStream| -> Result<()> { todo!() }, input).unwrap(); + // or + let parser = |input: ParseStream| -> Result<()> { + // `ParseStream::parse::<T>()` parses a syntax tree node of type `T`, + // advancing the cursor of the parse stream past it. + + // `<` + input.parse::<Token![<]>()?; + // `div` + input.parse::<Ident>()?; + // `>` + input.parse::<Token![>]>()?; + // `"Hello World"` + input.parse::<LitStr>()?; + // `<` + input.parse::<Token![<]>()?; + // `/` + input.parse::<Token![/]>()?; + // `div` + input.parse::<Ident>()?; + // `>` + input.parse::<Token![>]>()?; + Ok(()) + }; + parser.parse2(input).unwrap(); +} +``` + +### Defining a custom syntax tree node type by implementing the `syn::parse::Parse` trait + +```rust,ignore +struct HtmlNode{...} +impl Parse for HtmlNode{ + fn parse(input: ParseStream) -> Result<Self> { + todo!() + } +} +fn main(){ + let node: HtmlNode = parse_quote!{ +
<div>"Hello World"</div>
+ }; +} +``` + +> [!TIP] +> Complex tree nodes (such as `syn::File`) are composed of simpler tree nodes. +> +> I hope this gives you a clear idea of how to define a custom syntax tree, even for more complex structures. diff --git a/src/procedural_macros/token_stream.md b/src/procedural_macros/token_stream.md new file mode 100644 index 0000000..4cc5f0f --- /dev/null +++ b/src/procedural_macros/token_stream.md @@ -0,0 +1,40 @@ +# TokenStream + +**TLDR**: A procedural macro is a function that takes one `TokenStream` (two for attribute macros) as input and returns a `TokenStream` as output. + +Before the compiler calls our procedural macro, it converts the source code (the code to which the macro is applied) into a `TokenStream`. The compiler then calls our macro with this `TokenStream` as an argument. Finally, our procedural macro returns a new `TokenStream` as its result. + +A `TokenStream` is roughly equivalent to a `Vec<TokenTree>`. A `TokenTree` is very similar to the `tt` (Token Tree) metavariable type used in `macro_rules!`, with only a few minor differences. + +## See TokenStream in Action + +```rust,editable,compile_fail +use quote::quote; + +fn main() { + // Convert Rust code to a TokenStream. + let token_stream = quote! { + // Comments and whitespace are ignored. + + //! inner doc comment + // Note: `//! inner doc comment` is parsed as `#![doc = " inner doc comment"]` + + /// doc comment + // Note: `/// doc comment` is parsed as `#[doc = " doc comment"]` + fn print_sum(a: i32, b: i32) { + println!("{}", a + b); + } + }; + + println!("{}\n", token_stream.to_string()); + + for (i, tt) in token_stream.clone().into_iter().enumerate() { + println!("token {}:", i); + println!("source code: {}", tt); + println!("TokenTree: {:?}\n", tt); + } +} +``` + +> [!TIP] +> We'll talk about `quote!` in detail in the [quote](./quote.md) chapter.
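The description of a `TokenStream` as roughly a vector of token trees can be pictured with a small standard-library model. The `TokenTree` enum below is a hypothetical simplification written for this sketch: its variant names mirror those of `proc_macro2::TokenTree` (`Group`, `Ident`, `Punct`, `Literal`), but the payloads are plain strings and characters rather than the real types, and `count_leaves` and `render` are illustrative helpers, not library functions.

```rust
/// Hypothetical, simplified model of a token tree.
#[derive(Debug)]
enum TokenTree {
    /// A delimited sequence such as `( ... )`: opening delimiter plus nested trees.
    Group(char, Vec<TokenTree>),
    Ident(String),
    Punct(char),
    Literal(String),
}

/// Counts every non-group token, descending into groups.
fn count_leaves(stream: &[TokenTree]) -> usize {
    stream
        .iter()
        .map(|tt| match tt {
            TokenTree::Group(_, inner) => count_leaves(inner),
            _ => 1,
        })
        .sum()
}

/// Renders the model back to text, separating sibling trees with spaces.
fn render(stream: &[TokenTree]) -> String {
    let parts: Vec<String> = stream
        .iter()
        .map(|tt| match tt {
            TokenTree::Group(open, inner) => {
                // Pick the closing delimiter that matches the opening one.
                let close = match *open {
                    '(' => ')',
                    '[' => ']',
                    _ => '}',
                };
                format!("{}{}{}", open, render(inner), close)
            }
            TokenTree::Ident(name) => name.clone(),
            TokenTree::Punct(c) => c.to_string(),
            TokenTree::Literal(text) => text.clone(),
        })
        .collect();
    parts.join(" ")
}

fn main() {
    // Hand-built model of the tokens in `foo(a + 1)`.
    let stream = vec![
        TokenTree::Ident("foo".into()),
        TokenTree::Group(
            '(',
            vec![
                TokenTree::Ident("a".into()),
                TokenTree::Punct('+'),
                TokenTree::Literal("1".into()),
            ],
        ),
    ];
    println!("top-level trees: {}", stream.len());
    println!("leaf tokens: {}", count_leaves(&stream));
    println!("rendered: {}", render(&stream));
}
```

The model highlights why the structure is a tree rather than a flat list: the parenthesized group counts as a single top-level element, and its contents are only reached by descending into it.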