Commit 0a1928e

Merge pull request #2193 from ehuss/remove-reserved-number
Remove RESERVED_NUMBER
2 parents 50a1075 + 8c698a2 commit 0a1928e

1 file changed

Lines changed: 47 additions & 56 deletions

src/tokens.md

@@ -109,17 +109,17 @@ r[lex.token.literal.suffix]
109109
#### Suffixes
110110

111111
r[lex.token.literal.literal.suffix.intro]
112-
A suffix is a sequence of characters following the primary part of a literal (without intervening whitespace), of the same form as a non-raw identifier or keyword.
112+
A suffix is a sequence of characters following (without intervening whitespace) the primary part of a literal of the same form as a non-raw identifier or keyword.
113113

114114
r[lex.token.literal.suffix.syntax]
115115
```grammar,lexer
116-
SUFFIX -> IDENTIFIER_OR_KEYWORD _except `_`_
117-
118-
SUFFIX_NO_E -> ![`e` `E`] SUFFIX
116+
SUFFIX ->
117+
`_` ^ XID_Continue+
118+
| XID_Start XID_Continue*
119119
```
120120

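As an editorial illustration (not part of the commit), the shape of the revised `SUFFIX` rule can be sketched as a small matcher. This is a minimal sketch under stated assumptions: it handles ASCII input only, so `is_ascii_alphabetic` and `is_ascii_alphanumeric` stand in for `XID_Start` and `XID_Continue`, and the function name `suffix_len` is hypothetical. It does not model the `^` cut operator; a lone `_` is simply reported as not being a suffix.

```rust
/// Returns the byte length of a suffix at the start of `input`, if any.
/// ASCII-only approximation of:
///   SUFFIX -> `_` XID_Continue+ | XID_Start XID_Continue*
fn suffix_len(input: &str) -> Option<usize> {
    let bytes = input.as_bytes();
    // Assumption: ASCII letters, digits, and `_` approximate XID_Continue.
    let is_continue = |b: u8| b.is_ascii_alphanumeric() || b == b'_';

    let first = *bytes.first()?;
    // A leading `_` needs at least one more character; a leading letter
    // (our stand-in for XID_Start) is a valid suffix on its own.
    let min_len = if first == b'_' {
        2
    } else if first.is_ascii_alphabetic() {
        1
    } else {
        return None;
    };

    let len = bytes.iter().take_while(|&&b| is_continue(b)).count();
    (len >= min_len).then_some(len)
}

fn main() {
    for input in ["u8", "_foo", "_", "1abc", "em + 1"] {
        println!("{input:?} -> {:?}", suffix_len(input));
    }
}
```

The two branches mirror the two alternatives of the grammar rule; the only difference between them is the minimum length required after the first character.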
121121
r[lex.token.literal.suffix.validity]
122-
Any kind of literal (string, integer, etc) with any suffix is valid as a token.
122+
Any kind of literal (string, integer, etc.) with any suffix is valid as a token.
123123

124124
A literal token with any suffix can be passed to a macro without producing an error. The macro itself will decide how to interpret such a token and whether to produce an error or not. In particular, the `literal` fragment specifier for by-example macros matches literal tokens with arbitrary suffixes.
125125

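As an editorial illustration (not part of the commit), the paragraph above can be demonstrated with a by-example macro. The macro name `is_literal` is hypothetical. The sketch only shows that suffixed literal tokens are accepted by the `literal` fragment specifier; because the macro discards its input, the suffix is never interpreted.

```rust
// Hypothetical helper: expands to `true` when its input is a single literal
// token (with or without a suffix), and to `false` for anything else.
macro_rules! is_literal {
    ($l:literal) => {
        true
    };
    ($($other:tt)*) => {
        false
    };
}

fn main() {
    // Suffixes with no meaning as literal expressions are still valid tokens.
    assert!(is_literal!(1suffix));
    assert!(is_literal!("string"suffix));
    assert!(is_literal!(1.5foo));

    // A plain identifier is not a literal token, so the second arm matches.
    assert!(!is_literal!(suffix));

    println!("all literal checks passed");
}
```

Using the same tokens as literal expressions outside a macro (for example `let x = 1suffix;`) would be rejected, since only the macro decides how to interpret them here.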
@@ -443,15 +443,16 @@ r[lex.token.literal.int]
443443
r[lex.token.literal.int.syntax]
444444
```grammar,lexer
445445
INTEGER_LITERAL ->
446-
( BIN_LITERAL | OCT_LITERAL | HEX_LITERAL | DEC_LITERAL ) SUFFIX_NO_E?
446+
( BIN_LITERAL | OCT_LITERAL | HEX_LITERAL | DEC_LITERAL )
447+
^ !RESERVED_FLOAT SUFFIX?
447448
448449
DEC_LITERAL -> DEC_DIGIT (DEC_DIGIT|`_`)*
449450
450-
BIN_LITERAL -> `0b` `_`* BIN_DIGIT (BIN_DIGIT|`_`)*
451+
BIN_LITERAL -> `0b` ^ `_`* BIN_DIGIT (BIN_DIGIT|`_`)* ![`e` `E` `2`-`9`]
451452
452-
OCT_LITERAL -> `0o` `_`* OCT_DIGIT (OCT_DIGIT|`_`)*
453+
OCT_LITERAL -> `0o` ^ `_`* OCT_DIGIT (OCT_DIGIT|`_`)* ![`e` `E` `8`-`9`]
453454
454-
HEX_LITERAL -> `0x` `_`* HEX_DIGIT (HEX_DIGIT|`_`)*
455+
HEX_LITERAL -> `0x` ^ `_`* HEX_DIGIT (HEX_DIGIT|`_`)*
455456
456457
BIN_DIGIT -> [`0`-`1`]
457458
@@ -460,6 +461,8 @@ OCT_DIGIT -> [`0`-`7`]
460461
DEC_DIGIT -> [`0`-`9`]
461462
462463
HEX_DIGIT -> [`0`-`9` `a`-`f` `A`-`F`]
464+
465+
RESERVED_FLOAT -> `.` !(`.` | `_` | XID_Start)
463466
```
464467

465468
r[lex.token.literal.int.kind]
@@ -477,7 +480,7 @@ r[lex.token.literal.int.kind-oct]
477480
r[lex.token.literal.int.kind-bin]
478481
* A _binary literal_ starts with the character sequence `U+0030` `U+0062` (`0b`) and continues as any mixture (with at least one digit) of binary digits and underscores.
479482

480-
r[lex.token.literal.int.restriction]
483+
r[lex.token.literal.int.suffix]
481484
Like any literal, an integer literal may be followed (immediately, without any spaces) by a suffix as described above. The suffix may not begin with `e` or `E`, as that would be interpreted as the exponent of a floating-point literal. See [Integer literal expressions] for the effect of these suffixes.
482485

483486
Examples of integer literals which are accepted as literal expressions:
@@ -525,6 +528,35 @@ Examples of integer literals which are not accepted as literal expressions:
525528
# }
526529
```
527530

531+
r[lex.token.literal.int.invalid]
532+
##### Invalid integer literals
533+
534+
r[lex.token.literal.int.invalid.intro]
535+
Certain integer literal forms are invalid. To avoid ambiguity, the tokenizer rejects them rather than splitting them into separate tokens.
536+
537+
```rust,compile_fail
538+
0b0102; // This is not `0b010` followed by `2`.
539+
0o1279; // This is not `0o127` followed by `9`.
540+
0x80.0; // This is not `0x80` followed by `.` and `0`.
541+
0b101e; // This is not a suffixed literal or `0b101` followed by `e`.
542+
0b; // This is not an integer literal or `0` followed by `b`.
543+
0b_; // This is not an integer literal or `0` followed by `b_`.
544+
2em; // This is not a suffixed literal or `2` followed by `em`.
545+
2.0em; // This is not a suffixed literal or `2.0` followed by `em`.
546+
```
547+
548+
r[lex.token.literal.int.out-of-range]
549+
It is an error to have an unsuffixed binary or octal literal followed without intervening whitespace by a decimal digit outside the range for its radix.
550+
551+
r[lex.token.literal.int.period]
552+
It is an error to have an unsuffixed binary, octal, or hexadecimal literal followed without intervening whitespace by a period character (subject to the same restrictions on what may follow the period as in floating-point literals).
553+
554+
r[lex.token.literal.int.exp]
555+
It is an error to have an unsuffixed binary or octal literal followed without intervening whitespace by the character `e` or `E`.
556+
557+
r[lex.token.literal.int.empty-with-radix]
558+
It is an error for a radix prefix to not be followed, after any optional leading underscores, by at least one valid digit for its radix.
559+
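As an editorial illustration (not part of the commit), the four rules above for radix-prefixed literals can be sketched as a lookahead check. This is a minimal sketch under stated assumptions: the names `RadixError` and `check_radix_literal` are hypothetical, suffixes are not parsed, and ASCII letters stand in for `XID_Start` when deciding what may follow a period.

```rust
/// Which rule rejects a radix-prefixed literal (names are hypothetical).
#[derive(Debug, PartialEq)]
enum RadixError {
    EmptyWithRadix,
    OutOfRangeDigit,
    Exponent,
    Period,
}

/// Returns `None` if `input` has no radix prefix. Otherwise returns the byte
/// length of the prefix plus digits, or the rule that rejects the input.
fn check_radix_literal(input: &str) -> Option<Result<usize, RadixError>> {
    let (radix, rest) = if let Some(r) = input.strip_prefix("0b") {
        (2, r)
    } else if let Some(r) = input.strip_prefix("0o") {
        (8, r)
    } else if let Some(r) = input.strip_prefix("0x") {
        (16, r)
    } else {
        return None;
    };
    let bytes = rest.as_bytes();
    let is_digit = |b: u8| (b as char).is_digit(radix);

    // Skip optional leading underscores, then require one valid digit.
    let mut pos = 0;
    while bytes.get(pos).copied() == Some(b'_') {
        pos += 1;
    }
    if !matches!(bytes.get(pos).copied(), Some(b) if is_digit(b)) {
        return Some(Err(RadixError::EmptyWithRadix));
    }

    // Consume the remaining digits and underscores.
    while matches!(bytes.get(pos).copied(), Some(b) if b == b'_' || is_digit(b)) {
        pos += 1;
    }

    // Inspect what immediately follows. For hexadecimal, `e`, `E`, and all
    // decimal digits were already consumed as digits, so the first two arms
    // can only trigger for binary and octal.
    let verdict = match bytes.get(pos).copied() {
        Some(b'e' | b'E') => Err(RadixError::Exponent),
        Some(b) if b.is_ascii_digit() => Err(RadixError::OutOfRangeDigit),
        Some(b'.') => match bytes.get(pos + 1).copied() {
            Some(b'.' | b'_') => Ok(2 + pos),
            // Assumption: ASCII letters approximate XID_Start.
            Some(b) if b.is_ascii_alphabetic() => Ok(2 + pos),
            _ => Err(RadixError::Period),
        },
        _ => Ok(2 + pos),
    };
    Some(verdict)
}

fn main() {
    for input in ["0b0102", "0o1279", "0x80.0", "0b101e", "0b", "0b_", "0x80.foo"] {
        println!("{input:?} -> {:?}", check_radix_literal(input));
    }
}
```

Each error variant corresponds to one of the rules above (`out-of-range`, `period`, `exp`, and `empty-with-radix`). The decimal forms from the example block (`2em`, `2.0em`) are outside the scope of this sketch.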
528560
r[lex.token.literal.int.tuple-field]
529561
#### Tuple index
530562

@@ -559,7 +591,7 @@ r[lex.token.literal.float.syntax]
559591
```grammar,lexer
560592
FLOAT_LITERAL ->
561593
DEC_LITERAL (`.` DEC_LITERAL)? FLOAT_EXPONENT SUFFIX?
562-
| DEC_LITERAL `.` DEC_LITERAL SUFFIX_NO_E?
594+
| DEC_LITERAL `.` DEC_LITERAL SUFFIX?
563595
| DEC_LITERAL `.` !(`.` | `_` | XID_Start)
564596
565597
FLOAT_EXPONENT ->
@@ -601,52 +633,12 @@ Examples of floating-point literals which are not accepted as literal expression
601633
# }
602634
```
603635

604-
r[lex.token.literal.reserved]
605-
#### Reserved forms similar to number literals
606-
607-
r[lex.token.literal.reserved.syntax]
608-
```grammar,lexer
609-
RESERVED_NUMBER ->
610-
BIN_LITERAL [`2`-`9`]
611-
| OCT_LITERAL [`8`-`9`]
612-
| ( BIN_LITERAL | OCT_LITERAL | HEX_LITERAL ) `.` !(`.` | `_` | XID_Start)
613-
| ( BIN_LITERAL | OCT_LITERAL ) (`e`|`E`)
614-
| `0b` `_`* !BIN_DIGIT
615-
| `0o` `_`* !OCT_DIGIT
616-
| `0x` `_`* !HEX_DIGIT
617-
```
618-
619-
r[lex.token.literal.reserved.intro]
620-
The following lexical forms similar to number literals are _reserved forms_. Due to the possible ambiguity these raise, they are rejected by the tokenizer instead of being interpreted as separate tokens.
621-
622-
r[lex.token.literal.reserved.out-of-range]
623-
* An unsuffixed binary or octal literal followed, without intervening whitespace, by a decimal digit out of the range for its radix.
624-
625-
r[lex.token.literal.reserved.period]
626-
* An unsuffixed binary, octal, or hexadecimal literal followed, without intervening whitespace, by a period character (with the same restrictions on what follows the period as for floating-point literals).
627-
628-
r[lex.token.literal.reserved.exp]
629-
* An unsuffixed binary or octal literal followed, without intervening whitespace, by the character `e` or `E`.
630-
631-
r[lex.token.literal.reserved.empty-with-radix]
632-
* Input which begins with one of the radix prefixes but is not a valid binary, octal, or hexadecimal literal (because it contains no digits).
633-
634-
r[lex.token.literal.reserved.empty-exp]
635-
* Input which has the form of a floating-point literal with no digits in the exponent.
636-
637-
Examples of reserved forms:
636+
r[lex.token.literal.float.invalid-exponent]
637+
It is an error for a floating-point literal to have an exponent with no digits.
638638

639639
```rust,compile_fail
640-
0b0102; // this is not `0b010` followed by `2`
641-
0o1279; // this is not `0o127` followed by `9`
642-
0x80.0; // this is not `0x80` followed by `.` and `0`
643-
0b101e; // this is not a suffixed literal, or `0b101` followed by `e`
644-
0b; // this is not an integer literal, or `0` followed by `b`
645-
0b_; // this is not an integer literal, or `0` followed by `b_`
646-
2e; // this is not a floating-point literal, or `2` followed by `e`
647-
2.0e; // this is not a floating-point literal, or `2.0` followed by `e`
648-
2em; // this is not a suffixed literal, or `2` followed by `em`
649-
2.0em; // this is not a suffixed literal, or `2.0` followed by `em`
640+
2e; // This is not a floating-point literal or `2` followed by `e`.
641+
2.0e; // This is not a floating-point literal or `2.0` followed by `e`.
650642
```
651643

652644
r[lex.token.life]
@@ -771,7 +763,6 @@ r[lex.token.reserved.syntax]
771763
```grammar,lexer
772764
RESERVED_TOKEN ->
773765
RESERVED_GUARDED_STRING_LITERAL
774-
| RESERVED_NUMBER
775766
| RESERVED_POUNDS
776767
| RESERVED_RAW_IDENTIFIER
777768
| RESERVED_RAW_LIFETIME
