Merged
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,27 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),

## [Unreleased]

## [3.2.0]

Streaming batch parsing, severity classification for validation errors, RFC 5322 §4.4 obs-route support, and broader CFWS tolerance around addr-spec boundaries. All additions are non-breaking for v3.1 callers.

### Added
- `Parse::parseStream(iterable, string): Generator<ParsedEmailAddress>` — lazy batch parsing that yields one typed address at a time, reducing memory footprint for large inputs (CSV rows, pipelines, etc.). Each input item may itself contain multiple separator-delimited addresses.
- `ValidationSeverity` backed enum with `Critical`, `Warning`, `Info` cases. Callers can distinguish structural parse failures (`Critical`) from policy violations on syntactically well-formed addresses (`Warning`), which makes it possible to accept soft failures in non-SMTP contexts.
- `ParseErrorCode::severity(): ValidationSeverity` — every error code is now classified. 13 codes are Warning (UTF-8 rejection, C0/C1 controls, empty-quoted, FQDN requirement, IP global-range, length limits, punycode conversion); all others are Critical.
- `ParsedEmailAddress::invalidSeverity(): ?ValidationSeverity` — derived from `invalidReasonCode`; returns `null` when the address is valid.
- RFC 5322 §4.4 obs-route support: `<@host1,@host2:user@host3>` source-route prefixes are recognized and stripped; the real addr-spec becomes the parsed address. The route string is captured on `ParsedEmailAddress::$obsRoute`. Gated by `ParseOptions::$allowObsRoute` (default `false`; enabled in `rfc5322()` and `rfc2822()`).
- `ParseOptions::$allowObsRoute` property and `withAllowObsRoute()` fluent builder.
- `obs_route` field on the array output of `Parse::parse()` (populated when an obs-route is consumed; `null` otherwise).
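
The severity classification described above can be pictured with a minimal, self-contained sketch. Everything in it is a stand-in: `SketchSeverity`, `SketchErrorCode`, the three error cases, and the string backing values are assumptions made for illustration, not the library's actual enums or code list.

```php
<?php
// Hypothetical stand-ins: names, cases, and backing values are illustrative only.
enum SketchSeverity: string
{
    case Critical = 'critical';
    case Warning = 'warning';
    case Info = 'info';
}

enum SketchErrorCode: string
{
    case MissingAtSign = 'missing_at_sign';   // structural failure
    case LengthExceeded = 'length_exceeded';  // policy: length limits
    case FqdnRequired = 'fqdn_required';      // policy: FQDN requirement

    // Policy-style codes map to Warning; every other code is Critical.
    public function severity(): SketchSeverity
    {
        return match ($this) {
            self::LengthExceeded, self::FqdnRequired => SketchSeverity::Warning,
            default => SketchSeverity::Critical,
        };
    }
}

// Mirrors the nullable accessor: a null code means "valid", so there is no severity.
function sketchInvalidSeverity(?SketchErrorCode $code): ?SketchSeverity
{
    return $code?->severity();
}
```

Keeping the mapping in a single `match` with a `Critical` default means a newly added code is treated as a hard failure until someone deliberately downgrades it.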

### Changed
- RFC 5322 §3.2.2 CFWS: folding whitespace is now absorbed at dot-atom boundaries and around angle-addr delimiters via look-ahead in the whitespace handler. Previously-rejected inputs like `local @domain.com`, `local@ domain.com`, `< local@domain.com >`, `<local @ domain.com>`, and multi-line folded whitespace now parse successfully.
- Parser internal: added `STATE_OBS_ROUTE` state for absorbing obs-route prefixes; added `in_angle_addr` and `obs_route` tracking fields to the internal email-address accumulator.
- `composer stan` now runs with `--memory-limit=512M` to accommodate the larger codebase.

### Fixed
- None — no behavior regressions; only additions and tolerance expansions.

## [3.1.0]

Immutable `ParseOptions`, typed value-object output, structured error codes, and two new validation rules. All additions are non-breaking for v3.0 callers; readonly rule properties are a hard cutover for code that was mutating them directly (the factory methods and deprecated setters continue to work).
7 changes: 7 additions & 0 deletions README.md
@@ -45,6 +45,12 @@ if ($address->invalid) {

$result = Parse::getInstance()->parseMultiple('a@a.com, b@b.com');
foreach ($result->emailAddresses as $addr) { /* ... */ }

// Streaming for large batches (v3.2+) — yields one address at a time.
foreach (Parse::getInstance()->parseStream($csvRows) as $addr) {
    if ($addr->invalid) continue;
    // ...
}
```

### Advanced Usage with ParseOptions
@@ -166,6 +172,7 @@ $parser = new Parse(null, $options);
| `applyNfcNormalization` | `false` | Apply NFC Unicode normalization (RFC 6532 §3.1) |
| `validateDisplayNamePhrase` | `false` | Enforce RFC 5322 §3.2.5 phrase syntax on unquoted display names |
| `strictIdna` | `false` | Apply full IDNA2008 conformance on U-label domains (RFC 5891/5892/5893) |
| `allowObsRoute` | `false` | Accept RFC 5322 §4.4 obs-route source-routes like `<@host1,@host2:user@host3>` |
| **Length & Output** | | |
| `enforceLengthLimits` | `true` | Enforce RFC 5321 length limits (64/254/63) |
| `includeDomainAscii` | `false` | Include punycode `domain_ascii` in output |
18 changes: 10 additions & 8 deletions ROADMAP.md
@@ -38,22 +38,24 @@ Future plans by version. Items here are intent, not commitment — priority and
- [x] `strictIdna: bool` — apply full IDNA2008 conformance (`IDNA_USE_STD3_RULES | IDNA_CHECK_BIDI | IDNA_CHECK_CONTEXTJ | IDNA_NONTRANSITIONAL_TO_ASCII`) per RFC 5891/5892/5893. Enabled by default in `rfc6531()`.
- [x] Extended test coverage: 265 assertions (target: 250+).

-## v3.2 — Streaming, Severity Levels, Obsolete Syntax
+## v3.2 — Streaming, Severity Levels, Obsolete Syntax — shipped

**Batch streaming:**
-- [ ] `parseStream(iterable): Generator` — yield `ParsedEmailAddress` one at a time for large email lists, reducing memory footprint.
+- [x] `Parse::parseStream(iterable, string): Generator<ParsedEmailAddress>` — yields one typed address at a time; each input item may itself contain multiple separator-delimited addresses.

**Validation severity levels:**
-- [ ] Add a `ValidationSeverity` enum (`Critical`, `Warning`, `Info`) attached to each parsed address — allows callers to accept "soft" failures while rejecting hard ones.
+- [x] `ValidationSeverity` enum with `Critical`, `Warning`, `Info` cases.
+- [x] `ParseErrorCode::severity()` method classifying every code (13 Warning, rest Critical).
+- [x] `ParsedEmailAddress::invalidSeverity()` accessor returning the derived severity (or `null` when valid).

**Obsolete syntax extensions (RFC 5322 §4):**

-> Note: `obs-local-part` is already supported via `allowObsLocalPart` in v3.0. The items below cover the remaining obsolete forms.
+> Note: `obs-local-part` was already supported via `allowObsLocalPart` in v3.0.

-- [ ] `obs-route` handling for the `rfc5322()` preset.
-- [ ] CFWS (comments / folding whitespace) improvements.
-- [ ] `obs-angle-addr` support.
-- [ ] `obs-domain-list` syntax for the `rfc2822()` preset.
+- [x] `obs-route` handling — `ParseOptions::$allowObsRoute` gates acceptance of `<@host1,@host2:user@host3>` source-route prefixes; the route is captured on `ParsedEmailAddress::$obsRoute`. Enabled by default in `rfc5322()` and `rfc2822()`.
+- [x] `obs-angle-addr` — implied by obs-route support (it is the outer `[CFWS] "<" obs-route addr-spec ">" [CFWS]` form).
+- [x] `obs-domain-list` — the `*("," [CFWS] ["@" domain])` shape is consumed inside `STATE_OBS_ROUTE`.
+- [x] CFWS (comments / folding whitespace) improvements — look-ahead in the whitespace handler now absorbs CFWS at dot-atom boundaries (`local @domain`, `local@ domain`, `local @ domain`) and around angle-addr delimiters (`< local@domain >`, `<local @ domain>`), including folded whitespace (LF + WSP). Comments in these positions were already supported in v3.0.

## v4.0 — Breaking Modernization

39 changes: 39 additions & 0 deletions UPGRADE.md
@@ -1,5 +1,44 @@
# Upgrade Guide

## v3.1 → v3.2

v3.2 is fully additive, with no breaking changes. Two behavior changes are worth noting for callers who depended on the previous, stricter rejection behavior:

### Behavior Changes (Tolerance Expansions)

**CFWS around `@` and inside `<…>` is now accepted.** The v3.1 parser rejected these inputs as "Email address contains whitespace"; v3.2 treats them as RFC 5322 §3.2.2 folding whitespace:

```php
// All of these now parse successfully (v3.2+):
'local @domain.com' // trailing CFWS on local-part
'local@ domain.com' // leading CFWS on domain
'local @ domain.com' // both
'< local@domain.com >' // inside angle-addr
'<local @ domain.com>' // both, inside angle-addr
"local\n\t@domain.com" // folded whitespace
```

If your code validated that addresses are "tight" (no whitespace), re-check with the v3.2 definition — these now register as `invalid=false`.
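
For callers who still want to flag the newly tolerated forms, one option is a pre-check on the raw input before parsing. The sketch below is a regex approximation written for this guide, not the parser's look-ahead logic; `sketchHasBoundaryWhitespace` is a hypothetical helper name, and it does not understand quoted local-parts, so something like `"a @ b"@example.com` would be flagged as well.

```php
<?php
// Hypothetical helper: reports whitespace touching "@" or sitting just inside "<" / ">".
// A plain regex approximation; it is not aware of quoted strings or comments.
function sketchHasBoundaryWhitespace(string $raw): bool
{
    // \s also matches LF and tab, so folded whitespace next to "@" is caught too.
    return preg_match('/\s@|@\s|<\s|\s>/', $raw) === 1;
}

$samples = [
    'local@domain.com',
    'local @domain.com',
    '< local@domain.com >',
    "local\n\t@domain.com",
];

foreach ($samples as $sample) {
    // Collect or log the loose inputs here; this sketch only computes the flag.
    $isLoose = sketchHasBoundaryWhitespace($sample);
}
```

A display-name form such as `"John Doe" <a@b.com>` is not flagged, because the whitespace there sits outside the angle brackets rather than inside them.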

**Obs-route `<@host:addr>` is accepted in `rfc5322()` and `rfc2822()` presets.** Previously rejected as "Invalid character in domain"; now recognized, stripped, and the real addr-spec is exposed. The captured route is available as `$parsed->obsRoute`. Disabled in `rfc5321()` and legacy defaults — no change there. To opt out, call `->withAllowObsRoute(false)` on the preset.
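
To make the "recognized, stripped, and exposed" behavior concrete, here is a minimal sketch of splitting a source-routed angle-addr into its route and addr-spec. It is a simplified regex, assuming no CFWS and no empty list entries, and is not how the parser's state machine works; `sketchSplitObsRoute` and the array keys are hypothetical names.

```php
<?php
/**
 * Hypothetical helper: splits "<@host1,@host2:user@host3>" into route and addr-spec.
 *
 * @return array{route: ?string, addrSpec: string}
 */
function sketchSplitObsRoute(string $angleAddr): array
{
    // Route: one or more "@domain" entries separated by commas, terminated by ":".
    $pattern = '/^<(@[^,:@<>\s]+(?:,@[^,:@<>\s]+)*):([^<>]+)>$/';

    if (preg_match($pattern, $angleAddr, $m) === 1) {
        return ['route' => $m[1], 'addrSpec' => $m[2]];
    }

    // No route prefix: drop surrounding angle brackets and report a null route.
    return ['route' => null, 'addrSpec' => trim($angleAddr, '<>')];
}

$split = sketchSplitObsRoute('<@host1,@host2:user@host3>');
// $split['route'] is '@host1,@host2' and $split['addrSpec'] is 'user@host3'.
```

The route is returned without the trailing colon, matching the `@hostA,@hostB` shape this guide gives for `$obsRoute`.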

### Additions (Non-Breaking)

- **`Parse::parseStream(iterable, string): Generator`** — lazy batch parsing. Use it for large inputs where holding every `ParsedEmailAddress` in memory is undesirable.
- **`ValidationSeverity` enum** — `Critical` / `Warning` / `Info`. Access via `$parsed->invalidSeverity()` or `$errorCode->severity()`. Use it to distinguish "unparseable" from "policy-rejected but well-formed":
```php
if ($parsed->invalid && $parsed->invalidSeverity() === ValidationSeverity::Warning) {
    // Well-formed address rejected by a configured rule (UTF-8, FQDN, IP range, length).
    // Safe to accept in non-SMTP contexts if desired.
}
```
- **`ParsedEmailAddress::$obsRoute`** — captured obs-route prefix (e.g. `@hostA,@hostB`) when one was stripped. `null` for normal addresses.
- **`ParseOptions::$allowObsRoute`** (readonly) + `withAllowObsRoute()` builder.
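
The memory benefit of lazy batch parsing comes from PHP generators, which produce one value per iteration instead of building a full array. The sketch below illustrates only that idea with plain strings. `sketchStreamAddresses` and its `$separator` parameter are assumptions of this example and are not claimed to match `parseStream()` or its second argument; the naive `explode()` split would also break on separators inside quoted display names.

```php
<?php
/**
 * Hypothetical helper: lazily yields trimmed, non-empty pieces from each input item.
 *
 * @param iterable<string> $items
 * @return Generator<int, string>
 */
function sketchStreamAddresses(iterable $items, string $separator = ','): Generator
{
    foreach ($items as $item) {
        // Each item may hold several separator-delimited addresses.
        foreach (explode($separator, $item) as $piece) {
            $piece = trim($piece);
            if ($piece !== '') {
                yield $piece; // Execution pauses here until the consumer asks for more.
            }
        }
    }
}

$rows = ['a@a.com, b@b.com', 'c@c.com'];
foreach (sketchStreamAddresses($rows) as $address) {
    // Only the current address is held here, never the whole result list.
}
```

Because `$items` is typed as `iterable`, the source can itself be a generator (for example one reading a file row by row), so neither the input nor the output needs to be fully materialized.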

### Minimum Requirements (Unchanged)

PHP `^8.1`, `ext-mbstring`, `ext-intl`.

## v3.0 → v3.1

v3.1 is additive with one hard cutover: the 15 `ParseOptions` rule properties are now `readonly`. Factory presets and the deprecated setters still work. Everything else is new and non-breaking.
2 changes: 1 addition & 1 deletion composer.json
@@ -52,7 +52,7 @@
"test:coverage": "phpunit --coverage-html coverage",
"cs:check": "php-cs-fixer fix --dry-run --diff",
"cs:fix": "php-cs-fixer fix",
-"stan": "phpstan analyse",
+"stan": "phpstan analyse --memory-limit=512M",
"ci": [
"@cs:check",
"@stan",