src/tokens.md
Lines changed: 47 additions & 56 deletions
@@ -109,17 +109,17 @@ r[lex.token.literal.suffix]
 #### Suffixes

 r[lex.token.literal.literal.suffix.intro]
-A suffix is a sequence of characters following the primary part of a literal (without intervening whitespace), of the same form as a non-raw identifier or keyword.
+A suffix is a sequence of characters following (without intervening whitespace) the primary part of a literal of the same form as a non-raw identifier or keyword.

 r[lex.token.literal.suffix.syntax]
 ```grammar,lexer
-SUFFIX -> IDENTIFIER_OR_KEYWORD _except `_`_
-
-SUFFIX_NO_E -> ![`e` `E`] SUFFIX
+SUFFIX ->
+`_` ^ XID_Continue+
+| XID_Start XID_Continue*
 ```

 r[lex.token.literal.suffix.validity]
-Any kind of literal (string, integer, etc) with any suffix is valid as a token.
+Any kind of literal (string, integer, etc.) with any suffix is valid as a token.

 A literal token with any suffix can be passed to a macro without producing an error. The macro itself will decide how to interpret such a token and whether to produce an error or not. In particular, the `literal` fragment specifier for by-example macros matches literal tokens with arbitrary suffixes.
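The point about macros accepting arbitrarily suffixed literal tokens can be illustrated with a minimal sketch. The macro name `literal_text` is hypothetical and chosen only for this example; it matches a `literal` fragment and turns it back into text with `stringify!`, so the suffixed token is never evaluated as a literal expression.

```rust
// Hypothetical macro, for illustration only: it accepts any literal token,
// including one with a suffix that no literal expression would accept.
macro_rules! literal_text {
    ($l:literal) => {
        stringify!($l)
    };
}

fn main() {
    // Neither `1suffix` nor `"string"suffix` is a valid literal expression,
    // but both are valid tokens, so the macro invocations compile.
    println!("{}", literal_text!(1suffix));
    println!("{}", literal_text!("string"suffix));
}
```

Writing `let x = 1suffix;` outside a macro would be rejected, because the suffix is only checked once the token is interpreted as a literal expression.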
 * A _binary literal_ starts with the character sequence `U+0030` `U+0062` (`0b`) and continues as any mixture (with at least one digit) of binary digits and underscores.

-r[lex.token.literal.int.restriction]
+r[lex.token.literal.int.suffix]
 Like any literal, an integer literal may be followed (immediately, without any spaces) by a suffix as described above. The suffix may not begin with `e` or `E`, as that would be interpreted as the exponent of a floating-point literal. See [Integer literal expressions] for the effect of these suffixes.

 Examples of integer literals which are accepted as literal expressions:
@@ -525,6 +528,35 @@ Examples of integer literals which are not accepted as literal expressions:
 # }
 ```

+r[lex.token.literal.int.invalid]
+##### Invalid integer literals
+
+r[lex.token.literal.int.invalid.intro]
+Certain integer literal forms are invalid. To avoid ambiguity, the tokenizer rejects them rather than splitting them into separate tokens.
+
+```rust,compile_fail
+0b0102; // This is not `0b010` followed by `2`.
+0o1279; // This is not `0o127` followed by `9`.
+0x80.0; // This is not `0x80` followed by `.` and `0`.
+0b101e; // This is not a suffixed literal or `0b101` followed by `e`.
+0b; // This is not an integer literal or `0` followed by `b`.
+0b_; // This is not an integer literal or `0` followed by `b_`.
+2em; // This is not a suffixed literal or `2` followed by `em`.
+2.0em; // This is not a suffixed literal or `2.0` followed by `em`.
+```
+
+r[lex.token.literal.int.out-of-range]
+It is an error to have an unsuffixed binary or octal literal followed without intervening whitespace by a decimal digit outside the range for its radix.
+
+r[lex.token.literal.int.period]
+It is an error to have an unsuffixed binary, octal, or hexadecimal literal followed without intervening whitespace by a period character (subject to the same restrictions on what may follow the period as in floating-point literals).
+
+r[lex.token.literal.int.exp]
+It is an error to have an unsuffixed binary or octal literal followed without intervening whitespace by the character `e` or `E`.
+
+r[lex.token.literal.int.empty-with-radix]
+It is an error for a radix prefix to not be followed, after any optional leading underscores, by at least one valid digit for its radix.
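The empty-with-radix rule can be sketched as a small check. This is only an illustration under stated assumptions, not how rustc's lexer is implemented: `radix_prefix_has_digit` is a hypothetical helper, and it looks only at the first character after any leading underscores.

```rust
/// Hypothetical helper, not part of rustc: returns `None` when `src` has no
/// radix prefix, otherwise whether the prefix is followed (after optional
/// leading underscores) by a digit that is valid for that radix.
fn radix_prefix_has_digit(src: &str) -> Option<bool> {
    let (radix, rest): (u32, &str) = if let Some(r) = src.strip_prefix("0b") {
        (2, r)
    } else if let Some(r) = src.strip_prefix("0o") {
        (8, r)
    } else if let Some(r) = src.strip_prefix("0x") {
        (16, r)
    } else {
        return None;
    };
    // Skip leading underscores, then inspect the first remaining character.
    let first = rest.trim_start_matches('_').chars().next();
    Some(matches!(first, Some(c) if c.is_digit(radix)))
}

fn main() {
    for src in ["0b", "0b_", "0b_1", "0x__f", "42"] {
        println!("{src:>6} -> {:?}", radix_prefix_has_digit(src));
    }
}
```

Under this sketch, `0b` and `0b_` are flagged as lacking a digit, matching the two corresponding lines in the `compile_fail` example above. The other rules (out-of-range digit, period, exponent) would need additional checks on what follows the digits.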
 The following lexical forms similar to number literals are _reserved forms_. Due to the possible ambiguity these raise, they are rejected by the tokenizer instead of being interpreted as separate tokens.
-
-r[lex.token.literal.reserved.out-of-range]
-* An unsuffixed binary or octal literal followed, without intervening whitespace, by a decimal digit out of the range for its radix.
-
-r[lex.token.literal.reserved.period]
-* An unsuffixed binary, octal, or hexadecimal literal followed, without intervening whitespace, by a period character (with the same restrictions on what follows the period as for floating-point literals).
-
-r[lex.token.literal.reserved.exp]
-* An unsuffixed binary or octal literal followed, without intervening whitespace, by the character `e` or `E`.
-
-r[lex.token.literal.reserved.empty-with-radix]
-* Input which begins with one of the radix prefixes but is not a valid binary, octal, or hexadecimal literal (because it contains no digits).
-
-r[lex.token.literal.reserved.empty-exp]
-* Input which has the form of a floating-point literal with no digits in the exponent.
-
-Examples of reserved forms:
+r[lex.token.literal.float.invalid-exponent]
+It is an error for a floating-point literal to have an exponent with no digits.

 ```rust,compile_fail
-0b0102; // this is not `0b010` followed by `2`
-0o1279; // this is not `0o127` followed by `9`
-0x80.0; // this is not `0x80` followed by `.` and `0`
-0b101e; // this is not a suffixed literal, or `0b101` followed by `e`
-0b; // this is not an integer literal, or `0` followed by `b`
-0b_; // this is not an integer literal, or `0` followed by `b_`
-2e; // this is not a floating-point literal, or `2` followed by `e`
-2.0e; // this is not a floating-point literal, or `2.0` followed by `e`
-2em; // this is not a suffixed literal, or `2` followed by `em`
-2.0em; // this is not a suffixed literal, or `2.0` followed by `em`
+2e; // This is not a floating-point literal or `2` followed by `e`.
+2.0e; // This is not a floating-point literal or `2.0` followed by `e`.