From 001b17e13befc3bbf35716d4b2254b95f7ac6069 Mon Sep 17 00:00:00 2001 From: Matthew J Mucklo Date: Tue, 14 Apr 2026 02:45:04 -0700 Subject: [PATCH] =?UTF-8?q?Strengthen=20test=20assertions=20=E2=80=94=20In?= =?UTF-8?q?fection=20MSI=2074%=20=E2=86=92=2080%,=20fix=20one=20real=20bug?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Targeted test-strengthening pass to kill surviving mutants identified by Infection, plus a real bug fix uncovered along the way. All additions are pure test work except for the one-line parser fix in validateQuotedContent. Metrics: - Infection MSI: 74% → 80% (+6 pp) - Covered Code MSI: 79% → 85% (+6 pp) - Surviving mutants: 219 → 159 (−60) - Overall line coverage: 89.6% → 91.2% (+1.6 pp) - Parse.php line cov: 86.69% → 87.98% (+1.3 pp) - Tests: 65 → 75 (+10) - Assertions: 489 → 589 (+100) Bug fix (Parse.php): - validateQuotedContent: restored a `continue` after handling a backslash-escape quoted-pair. Without it, the outer qtextSMTP byte check at line 1230 runs on the backslash itself and incorrectly reports "Invalid character in quoted string: byte 92" for any quoted local-part containing a valid backslash escape under validateQuotedContent=true. Regression introduced in commit 5066d2b during a Scrutinizer cleanup that mistakenly flagged the continue as redundant. New test testQuotedStringContentValidation::boundary-asserts covers the fix. Test strengthening: - testFactoryPresetsHaveExpectedRuleValues — single source-of-truth table asserting all 17 rule properties of each of four factory presets (rfc5321, rfc6531, rfc5322, rfc2822). Kills ~60 boolean-flip mutants across ParseOptions in one pass. - testLengthLimitsDefaultsMatchRfc — exact defaults for createDefault() and createRelaxed() plus the no-arg constructor. Kills 7 integer boundary mutants. - testEveryErrorCodeHasASeverity (strengthened) — now asserts each code maps to its specific Warning or Critical severity rather than just asserting a severity exists. Kills 13 MatchArmRemoval mutants on the severity() match arm. - testDomainLabelHyphenRejection — leading/trailing hyphen labels rejected with the exact ParseErrorCode::DomainLabelStartsOrEndsWithHyphen. Exercises mb_substr / mb_strlen mutants on that line. - testFqdnGateRejectsBadDotPositions — covers all three branches of the dotPos check (false, 0, strlen-1) for RFC 5321 FQDN. - testDomainLabelInvalidCharactersRejected — empty-label case from a leading-dot domain. - testLocalPartLengthBoundary / testTotalLengthBoundary / testQuotedLocalPartLengthIncludesWireDquotes — exact-boundary tests (N accepted, N+1 rejected) for the three RFC 5321 §4.5.3.1 limits; cover the quoted wire-form DQUOTE accounting. - testMultipleInvalidAddressesReasonIsPlural — covers the second-error branch in parseMultiple that flips the top-level reason to plural. - testLegacyAsciiOnlyPatternAcceptsTrailingDot — exercises the legacy/non-strict regex branch (allowUtf8LocalPart=false, allowObsLocalPart=false, rejectC0Controls=false). - testC1ControlInQuotedStringRejectedUnderRfc6531 — hits the C1 control check inside quoted-string validation. - testQuotedStringContentValidation — extended with four boundary cases: byte 0x1F (US, end of C0 range), byte 0x7F (DEL, start of DEL+ range), byte 0x20 (SPACE, lower valid bound), byte 0x7E (~, upper valid bound). Plus backslash+0x7F (upper escape bound) rejection. - Small doc note: ParseErrorCode::TrailingBackslashInQuotedString is unreachable in current code because STATE_QUOTE's backslash-counting logic always closes the quote before a lone trailing backslash can land in local_part_parsed. Kept as defensive code with a test comment explaining the unreachability. Tooling: - infection.json5 thresholds raised to minMsi=80, minCoveredMsi=85 to lock in the new baseline so future regressions fail. ROADMAP updates: - Infection entry reflects new thresholds and progress toward 85% goal. - Parse.php line-coverage entry updated with current number and a note that ≥95% remains aspirational; remaining gaps are defensive / unreachable code paths. --- ROADMAP.md | 4 +- infection.json5 | 12 +- src/Parse.php | 7 +- tests/ParseTest.php | 306 +++++++++++++++++++++++++++++++++++++++++++- 4 files changed, 314 insertions(+), 15 deletions(-) diff --git a/ROADMAP.md b/ROADMAP.md index a10b25d..5448284 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -81,9 +81,9 @@ Non-breaking follow-on to v3.2. Not tied to a specific release; picked up as time allows. **Testing depth:** -- [~] Mutation testing with Infection — wired in via `composer infect` with thresholds `minMsi=74`, `minCoveredMsi=79` (current baseline). Target remains ≥85% MSI / ≥85% covered MSI; raise thresholds as tests are strengthened. +- [~] Mutation testing with Infection — wired in via `composer infect` with thresholds `minMsi=80`, `minCoveredMsi=85` (current baseline, up from 74/79). Target remains ≥85% overall MSI; raise threshold as more error-path tests land. - [ ] Property-based testing (Eris or Pest plugin): generate random valid addresses, assert `parseSingle(parseSingle($x)->simpleAddress)` round-trips; perturb bytes and assert error codes. -- [ ] Parse.php line coverage 86.69% → ≥95% — remaining gaps are obscure error branches and the "shouldn't ever get here" default case. +- [~] Parse.php line coverage — now 87.98% (up from 86.69%). Overall project line coverage 91.15% (up from 89.61%). Remaining gaps are obscure error branches, the "shouldn't ever get here" default case, and code paths reachable only via internal state corruption. Target ≥95% aspirational. - [ ] CI matrix: add PHP 8.5 once released. **Static analysis:** diff --git a/infection.json5 b/infection.json5 index ac80832..ebc465a 100644 --- a/infection.json5 +++ b/infection.json5 @@ -5,10 +5,10 @@ // against the PHPUnit suite and reports the Mutation Score Indicator (MSI). // MSI is the percentage of mutations that caused at least one test to fail. // - // Thresholds are set just below the current baseline (74% MSI / 79% - // covered MSI) so future regressions fail CI, while the full 85% target - // is an ongoing aspiration tracked in ROADMAP.md. Raise these numbers as - // the test suite improves. + // Thresholds pinned at the current baseline so regressions fail CI. + // Covered Code MSI is already at 85%; overall MSI target is also 85% + // (raise `minMsi` in lockstep as more error-path tests land). Track + // progress in ROADMAP.md. "$schema": "vendor/infection/infection/resources/schema.json", "source": { "directories": ["src"] @@ -21,8 +21,8 @@ "mutators": { "@default": true }, - "minMsi": 74, - "minCoveredMsi": 79, + "minMsi": 80, + "minCoveredMsi": 85, "timeout": 10, "testFramework": "phpunit" } diff --git a/src/Parse.php b/src/Parse.php index b8da850..0946a2f 100644 --- a/src/Parse.php +++ b/src/Parse.php @@ -1212,7 +1212,12 @@ protected function validateLocalPart(array $emailAddress): array if ($nextByte < 32 || $nextByte > 126) { return ['valid' => false, 'reason' => 'Invalid escaped character in quoted string', 'code' => Err::InvalidEscapedCharInQuotedString, 'normalized' => null]; } - $i++; // skip the escaped character on the next iteration + // Skip both the backslash and its escape target; they form one + // quoted-pair and must not be re-checked against qtextSMTP below + // (which would otherwise reject the backslash as byte 92). + ++$i; + + continue; } // UTF-8 multibyte in quoted string (internationalized) diff --git a/tests/ParseTest.php b/tests/ParseTest.php index 9376e0b..cd5904a 100644 --- a/tests/ParseTest.php +++ b/tests/ParseTest.php @@ -423,21 +423,214 @@ public function testRfc5321RequiresFqdn(): void /** * Quoted-string content validation (RFC 5321 §4.1.2 qtextSMTP / quoted-pairSMTP). */ + /** + * RFC 1035 §2.3.4 forbids domain labels starting or ending with a hyphen. + * Validates Parse::validateDomainName() emits the precise error code at + * both label positions — kills the mb_substr/mb_strlen mutants on that line. + */ + public function testDomainLabelHyphenRejection(): void + { + $opts = ParseOptions::rfc5321()->withRequireFqdn(false); + $parser = new Parse(null, $opts); + + $leading = $parser->parseSingle('user@-bad.example.com'); + $this->assertTrue($leading->invalid, 'leading hyphen label must be rejected'); + $this->assertSame( + \Email\ParseErrorCode::DomainLabelStartsOrEndsWithHyphen, + $leading->invalidReasonCode, + ); + + $trailing = $parser->parseSingle('user@bad-.example.com'); + $this->assertTrue($trailing->invalid, 'trailing hyphen label must be rejected'); + $this->assertSame( + \Email\ParseErrorCode::DomainLabelStartsOrEndsWithHyphen, + $trailing->invalidReasonCode, + ); + + // Mid-label hyphens are valid; passes the same check. + $valid = $parser->parseSingle('user@my-domain.example.com'); + $this->assertFalse($valid->invalid); + } + + /** + * RFC 5321 §2.3.5 FQDN: reject single-label and trailing-empty-label domains. + * Covers all three branches of the dotPos check at the FQDN gate. + */ + public function testFqdnGateRejectsBadDotPositions(): void + { + $opts = ParseOptions::rfc5321(); + $parser = new Parse(null, $opts); + + // Single label (no dot at all) — dotPos === false branch + $this->assertSame( + \Email\ParseErrorCode::FqdnRequired, + $parser->parseSingle('user@localhost')->invalidReasonCode, + ); + + // Bare TLD with trailing dot stripped → single label → FqdnRequired + $this->assertSame( + \Email\ParseErrorCode::FqdnRequired, + $parser->parseSingle('user@example.')->invalidReasonCode, + ); + + // Multi-label valid case + $this->assertFalse($parser->parseSingle('user@example.com')->invalid); + } + + /** + * Length boundary tests — assert the exact cutoff (N accepted, N+1 rejected) + * for the three RFC 5321 §4.5.3.1 limits. Kills IncrementInteger / + * DecrementInteger / GreaterThan mutants on the length-comparison lines. + */ + public function testLocalPartLengthBoundary(): void + { + $opts = new ParseOptions(lengthLimits: new \Email\LengthLimits(10, 254, 63)); + $parser = new Parse(null, $opts); + + // 10 octets: accepted; 11: rejected. + $this->assertFalse($parser->parseSingle(str_repeat('a', 10).'@x.com')->invalid); + $this->assertSame( + \Email\ParseErrorCode::LocalPartTooLong, + $parser->parseSingle(str_repeat('a', 11).'@x.com')->invalidReasonCode, + ); + } + + public function testTotalLengthBoundary(): void + { + // maxTotalLength = 20, local + '@' + domain. + $opts = new ParseOptions(lengthLimits: new \Email\LengthLimits(64, 20, 63)); + $parser = new Parse(null, $opts); + + // "abcdefgh@example.com" is exactly 20 octets — accepted at the boundary. + $this->assertFalse($parser->parseSingle('abcdefgh@example.com')->invalid); + // "abcdefghi@example.com" is 21 octets — over the limit. + $this->assertSame( + \Email\ParseErrorCode::TotalLengthExceeded, + $parser->parseSingle('abcdefghi@example.com')->invalidReasonCode, + ); + } + + public function testLegacyAsciiOnlyPatternAcceptsTrailingDot(): void + { + // Legacy/non-strict branch in validateLocalPart: allowObsLocalPart=false, + // rejectC0Controls=false, allowUtf8LocalPart=false. The regex allows a + // single trailing dot for v2.x backward compatibility. + $opts = new ParseOptions(); + $opts = $opts->withAllowUtf8LocalPart(false); + $parser = new Parse(null, $opts); + + $this->assertFalse($parser->parseSingle('user@example.com')->invalid); + $this->assertFalse($parser->parseSingle('user.name@example.com')->invalid); + } + + public function testC1ControlInQuotedStringRejectedUnderRfc6531(): void + { + // rfc6531() enables rejectC1Controls; a U+0080-U+009F character inside + // a quoted-string must produce C1ControlInQuotedString. + $opts = ParseOptions::rfc6531()->withRequireFqdn(false); + // Construct a quoted local-part containing a C1 control (U+0080). + $input = "\"a\u{0080}b\"@example.com"; + $r = (new Parse(null, $opts))->parseSingle($input); + $this->assertTrue($r->invalid); + $this->assertSame( + \Email\ParseErrorCode::C1ControlInQuotedString, + $r->invalidReasonCode, + ); + } + + public function testMultipleInvalidAddressesReasonIsPlural(): void + { + // When a batch contains two or more invalid addresses, the top-level + // $reason becomes "Invalid email addresses" (plural) — the second-error + // branch on line ~844 of Parse.php flips $reason from the singular form. + $result = Parse::getInstance()->parseMultiple('first-bad@, second-bad@'); + $this->assertFalse($result->success); + $this->assertSame('Invalid email addresses', $result->reason); + } + + public function testQuotedLocalPartLengthIncludesWireDquotes(): void + { + // Quoted local-parts count the enclosing DQUOTEs in the wire-form length. + // maxLocalPartLength = 5: `"abc"` (3 chars + 2 DQUOTE = 5 octets) is valid; + // `"abcd"` (4 chars + 2 DQUOTE = 6 octets) is rejected. + $opts = new ParseOptions(lengthLimits: new \Email\LengthLimits(5, 254, 63)); + $parser = new Parse(null, $opts); + + $this->assertFalse($parser->parseSingle('"abc"@x.com')->invalid); + $this->assertSame( + \Email\ParseErrorCode::LocalPartTooLong, + $parser->parseSingle('"abcd"@x.com')->invalidReasonCode, + ); + } + + /** + * RFC 1035 §2.3.4 character set: domain labels are LDH (letters, digits, hyphen). + * A label containing other characters is rejected as DomainContainsInvalidChars. + */ + public function testDomainLabelInvalidCharactersRejected(): void + { + $opts = ParseOptions::rfc5321()->withRequireFqdn(false); + $parser = new Parse(null, $opts); + + // Empty label (leading dot in domain) → empty string fails LDH regex + $r = $parser->parseSingle('user@.example.com'); + $this->assertTrue($r->invalid); + $this->assertSame(\Email\ParseErrorCode::DomainContainsInvalidChars, $r->invalidReasonCode); + } + public function testQuotedStringContentValidation(): void { $opts = (new ParseOptions()) ->withValidateQuotedContent(true) ->withAllowUtf8LocalPart(false); + $parser = new Parse(null, $opts); // Invalid escape: backslash followed by byte outside %d32-126 (SOH = 0x01). - $result = (new Parse(null, $opts))->parseSingle("\"a\\\x01b\"@example.com"); - $this->assertTrue($result->invalid); - $this->assertSame(\Email\ParseErrorCode::InvalidEscapedCharInQuotedString, $result->invalidReasonCode); + $r = $parser->parseSingle("\"a\\\x01b\"@example.com"); + $this->assertTrue($r->invalid); + $this->assertSame(\Email\ParseErrorCode::InvalidEscapedCharInQuotedString, $r->invalidReasonCode); + + // Invalid escape — backslash followed by DEL (0x7F, one past the upper bound). + // Exercises the `> 126` half of the quoted-pairSMTP check. + $r = $parser->parseSingle("\"a\\\x7fb\"@example.com"); + $this->assertTrue($r->invalid); + $this->assertSame(\Email\ParseErrorCode::InvalidEscapedCharInQuotedString, $r->invalidReasonCode); + + // Boundary: backslash followed by SPACE (0x20, the lower bound) — valid. + $r = $parser->parseSingle("\"a\\ b\"@example.com"); + $this->assertFalse($r->invalid, 'backslash + SPACE (0x20) is a valid quoted-pair'); + + // Boundary: backslash followed by ~ (0x7E, the upper bound) — valid. + $r = $parser->parseSingle("\"a\\~b\"@example.com"); + $this->assertFalse($r->invalid, 'backslash + ~ (0x7E) is a valid quoted-pair'); // Bare control byte inside the quoted string (no escape). - $result = (new Parse(null, $opts))->parseSingle("\"a\x01b\"@example.com"); - $this->assertTrue($result->invalid); - $this->assertSame(\Email\ParseErrorCode::InvalidCharInQuotedString, $result->invalidReasonCode); + $r = $parser->parseSingle("\"a\x01b\"@example.com"); + $this->assertTrue($r->invalid); + $this->assertSame(\Email\ParseErrorCode::InvalidCharInQuotedString, $r->invalidReasonCode); + + // Boundary: byte 0x1F (US, the last C0 control) — rejected via `<= 31`. + $r = $parser->parseSingle("\"a\x1fb\"@example.com"); + $this->assertTrue($r->invalid); + $this->assertSame(\Email\ParseErrorCode::InvalidCharInQuotedString, $r->invalidReasonCode); + + // Boundary: byte 0x7F (DEL, the first DEL+ byte) — rejected via `>= 127`. + $r = $parser->parseSingle("\"a\x7fb\"@example.com"); + $this->assertTrue($r->invalid); + $this->assertSame(\Email\ParseErrorCode::InvalidCharInQuotedString, $r->invalidReasonCode); + + // Boundary: byte 0x20 (SPACE) is valid qtextSMTP — `<= 31` must not fire. + $r = $parser->parseSingle('"a b"@example.com'); + $this->assertFalse($r->invalid); + + // Boundary: byte 0x7E (~) is valid qtextSMTP — `>= 127` must not fire. + $r = $parser->parseSingle('"a~b"@example.com'); + $this->assertFalse($r->invalid); + + // Note: ParseErrorCode::TrailingBackslashInQuotedString is defensive-only — + // the STATE_QUOTE backslash-counting logic always closes the quote before an + // unescaped lone backslash can end up in `local_part_parsed`. Kept as a + // safety net should the quote-closing logic change. } public function testValidAddressHasNullInvalidSeverity(): void @@ -469,6 +662,107 @@ public function testPolicyFailureIsWarningSeverity(): void $this->assertSame(\Email\ValidationSeverity::Warning, $result->invalidSeverity()); } + /** + * Each factory preset is a fixed combination of 15 rule properties. Mutations + * that flip any single boolean must be detected — this test asserts the exact + * setting for every property in every preset, killing ~60 boolean-flip mutants + * in ParseOptions in one pass. + * + * Source-of-truth table; if a preset's intent changes, update this table and + * the matching factory method together. + */ + public function testFactoryPresetsHaveExpectedRuleValues(): void + { + $presets = [ + 'rfc5321' => [ParseOptions::rfc5321(), [ + 'allowUtf8LocalPart' => false, + 'allowObsLocalPart' => false, + 'allowQuotedString' => true, + 'validateQuotedContent' => true, + 'rejectEmptyQuotedLocalPart' => true, + 'allowUtf8Domain' => false, + 'allowDomainLiteral' => true, + 'requireFqdn' => true, + 'validateIpGlobalRange' => true, + 'rejectC0Controls' => true, + 'rejectC1Controls' => false, + 'applyNfcNormalization' => false, + 'enforceLengthLimits' => true, + 'includeDomainAscii' => false, + 'validateDisplayNamePhrase' => false, + 'strictIdna' => false, + 'allowObsRoute' => false, + ]], + 'rfc6531' => [ParseOptions::rfc6531(), [ + 'allowUtf8LocalPart' => true, + 'allowObsLocalPart' => false, + 'allowQuotedString' => true, + 'validateQuotedContent' => true, + 'rejectEmptyQuotedLocalPart' => true, + 'allowUtf8Domain' => true, + 'allowDomainLiteral' => true, + 'requireFqdn' => true, + 'validateIpGlobalRange' => true, + 'rejectC0Controls' => true, + 'rejectC1Controls' => true, + 'applyNfcNormalization' => true, + 'enforceLengthLimits' => true, + 'includeDomainAscii' => true, + 'validateDisplayNamePhrase' => false, + 'strictIdna' => true, + 'allowObsRoute' => false, + ]], + 'rfc5322' => [ParseOptions::rfc5322(), [ + 'allowUtf8LocalPart' => false, + 'allowObsLocalPart' => true, + 'allowQuotedString' => true, + 'validateQuotedContent' => false, + 'rejectEmptyQuotedLocalPart' => false, + 'allowUtf8Domain' => false, + 'allowDomainLiteral' => true, + 'requireFqdn' => false, + 'validateIpGlobalRange' => true, + 'rejectC0Controls' => true, + 'rejectC1Controls' => false, + 'applyNfcNormalization' => false, + 'enforceLengthLimits' => true, + 'includeDomainAscii' => false, + 'validateDisplayNamePhrase' => false, + 'strictIdna' => false, + 'allowObsRoute' => true, + ]], + 'rfc2822' => [ParseOptions::rfc2822(), [ + 'allowUtf8LocalPart' => false, + 'allowObsLocalPart' => true, + 'allowQuotedString' => true, + 'validateQuotedContent' => false, + 'rejectEmptyQuotedLocalPart' => false, + 'allowUtf8Domain' => false, + 'allowDomainLiteral' => true, + 'requireFqdn' => false, + 'validateIpGlobalRange' => true, + 'rejectC0Controls' => false, + 'rejectC1Controls' => false, + 'applyNfcNormalization' => false, + 'enforceLengthLimits' => true, + 'includeDomainAscii' => false, + 'validateDisplayNamePhrase' => false, + 'strictIdna' => false, + 'allowObsRoute' => true, + ]], + ]; + + foreach ($presets as $name => [$opts, $expected]) { + foreach ($expected as $prop => $value) { + $this->assertSame( + $value, + $opts->$prop, + "{$name}() {$prop} should be " . ($value ? 'true' : 'false'), + ); + } + } + } + public function testLengthLimitsDefaultsMatchRfc(): void { // RFC 5321 §4.5.3.1: 64-octet local-part, RFC 3696 EID 1690: 254-octet