Merged
59 changes: 59 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,59 @@
# Contributing to email-parse

Thank you for your interest in contributing!

## Development setup

```bash
git clone git@github.com:mmucklo/email-parse.git
cd email-parse
composer install
composer ci # cs:check + PHPStan level 8 + PHPUnit
```

## Running tests

```bash
composer test # PHPUnit (fast — unit + YAML-driven test spec)
composer test:coverage # HTML coverage → coverage/
composer infect # Infection mutation testing (takes ~2–5 min)
composer bench # PhpBench performance benchmarks
composer stan # PHPStan level 8
composer cs:check # PHP CS Fixer (dry-run)
composer cs:fix # PHP CS Fixer (auto-fix)
composer ci # Full CI: cs:check → stan → test
```

## Adding test cases

Most parser tests live in `tests/testspec.yml`. Each entry specifies an input, options, and the expected output. Add new entries at the end of the file to cover new behavior or regressions. PHPUnit picks them up automatically.

For typed-API or property-based tests, add methods to `tests/ParseTest.php` or `tests/PropertyTest.php`.

## Code style

The project uses PHP CS Fixer with the committed `.php-cs-fixer.dist.php` config. Run `composer cs:fix` before pushing.

## Static analysis

PHPStan runs at level 8. If your change introduces a new type issue, either fix it or — if it's a tool limitation (e.g. a generic not expressible in PHP) — add it to `phpstan-baseline.neon` via `bin/phpstan analyse --generate-baseline`.

## Pull requests

- One logical change per PR.
- Include tests for new features and bug fixes.
- Keep `composer ci` green before requesting review.
- Commit messages: imperative mood, concise subject, body if the *why* isn't obvious from the diff.

## RFC compliance

When implementing validation rules, cite the specific RFC section in both code comments and the PR description. The project follows RFC 5321 (SMTP Mailbox), RFC 5322 (Internet Message Format), RFC 6531/6532 (EAI), and RFC 1035 (domain names). See [DESIGN.md](DESIGN.md) for the full reference.
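
As a minimal sketch of what such a citation could look like in a code comment (the two function names are hypothetical and are not part of email-parse; only the section numbers and length limits come from the RFCs):

```php
<?php

declare(strict_types=1);

/**
 * Illustrative helper, not taken from the email-parse source.
 */
function localPartLengthIsValid(string $localPart): bool
{
    // RFC 5321 §4.5.3.1.1: a local-part is at most 64 octets.
    return strlen($localPart) <= 64;
}

/**
 * Illustrative helper, not taken from the email-parse source.
 */
function domainLabelLengthIsValid(string $label): bool
{
    // RFC 1035 §2.3.4: a label is at most 63 octets; an empty label is
    // rejected here because it cannot appear inside a domain name.
    $length = strlen($label);

    return $length >= 1 && $length <= 63;
}
```

Placing the citation directly above the check lets a reviewer verify the rule against the RFC text without searching for it.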

## Reporting issues

Please include:
- PHP version
- `ParseOptions` configuration used (factory preset or custom)
- Input email string
- Expected vs actual output
- Whether the behavior matches the cited RFC or not
4 changes: 2 additions & 2 deletions ROADMAP.md
@@ -82,7 +82,7 @@ Not tied to a specific release; picked up as time allows.

**Testing depth:**
- [~] Mutation testing with Infection — wired in via `composer infect` with thresholds `minMsi=80`, `minCoveredMsi=85` (current baseline, up from 74/79). Target remains ≥85% overall MSI; raise threshold as more error-path tests land.
- [ ] Property-based testing (Eris or Pest plugin): generate random valid addresses, assert `parseSingle(parseSingle($x)->simpleAddress)` round-trips; perturb bytes and assert error codes.
- [x] Property-based testing: `tests/PropertyTest.php` with 9 property tests at 200 random iterations each, covering no-crash on arbitrary bytes (`parseSingle` and `parseMultiple`), determinism, reason+code consistency, severity classification, Stringable contract, toArray ↔ parse() round-trip, valid-address round-trip, and all-presets-never-crash. No extra dependency (native PHPUnit + `mt_rand`; deterministic via the `SEED` environment variable).
- [~] Parse.php line coverage — now 87.98% (up from 86.69%). Overall project line coverage 91.15% (up from 89.61%). Remaining gaps are obscure error branches, the "shouldn't ever get here" default case, and code paths reachable only via internal state corruption. Target ≥95% aspirational.
- [ ] CI matrix: add PHP 8.5 once released.

@@ -95,7 +95,7 @@ Not tied to a specific release; picked up as time allows.
- [ ] Profile the state machine under mailing-list-sized inputs. Likely hot path: `mb_substr` in the main loop — investigate byte iteration for pure-ASCII inputs.

**Community / documentation:**
- [ ] `CONTRIBUTING.md` with dev setup, CI expectations, and commit-style guidance.
- [x] `CONTRIBUTING.md` with dev setup, all `composer` scripts, test-case guidance, code-style rules, and RFC citation expectations.
- [ ] GitHub issue + pull-request templates.
- [ ] `CODE_OF_CONDUCT.md`.
- [ ] Examples directory or GitHub Pages cookbook (UTF-8 addresses, obs-route in practice, custom normalizers once they ship, Symfony/Laravel integration snippets).
34 changes: 5 additions & 29 deletions phpstan-baseline.neon
@@ -67,37 +67,13 @@ parameters:
path: tests/ParseTest.php

-
message: '#^Method Email\\Tests\\ParseTest\:\:fillReasonCode\(\) return type has no value type specified in iterable type array\.$#'
identifier: missingType.iterableValue
count: 1
path: tests/ParseTest.php

-
message: '#^Method Email\\Tests\\ParseTest\:\:normalizeActual\(\) has parameter \$result with no value type specified in iterable type array\.$#'
identifier: missingType.iterableValue
count: 1
path: tests/ParseTest.php

-
message: '#^Method Email\\Tests\\ParseTest\:\:normalizeActual\(\) return type has no value type specified in iterable type array\.$#'
identifier: missingType.iterableValue
count: 1
path: tests/ParseTest.php

-
message: '#^Method Email\\Tests\\ParseTest\:\:normalizeExpected\(\) has parameter \$result with no value type specified in iterable type array\.$#'
identifier: missingType.iterableValue
count: 1
path: tests/ParseTest.php

-
message: '#^Method Email\\Tests\\ParseTest\:\:normalizeExpected\(\) return type has no value type specified in iterable type array\.$#'
identifier: missingType.iterableValue
message: '#^Method Email\\Tests\\ParseTest\:\:testParseEmailAddresses\(\) has no return type specified\.$#'
identifier: missingType.return
count: 1
path: tests/ParseTest.php

-
message: '#^Method Email\\Tests\\ParseTest\:\:testParseEmailAddresses\(\) has no return type specified\.$#'
identifier: missingType.return
message: '#^Call to method PHPUnit\\Framework\\Assert\:\:assertIsBool\(\) with bool will always evaluate to true\.$#'
identifier: method.alreadyNarrowedType
count: 1
path: tests/ParseTest.php
path: tests/PropertyTest.php
222 changes: 222 additions & 0 deletions tests/PropertyTest.php
@@ -0,0 +1,222 @@
<?php

namespace Email\Tests;

use Email\Parse;
use Email\ParsedEmailAddress;
use Email\ParseOptions;
use Email\ValidationSeverity;

/**
* Property-based tests — randomized inputs verifying structural invariants.
*
* Each test generates N random inputs (strings or synthesized addresses) and
* asserts a single property that must hold for all of them. Failures point to
* edge cases missed by the hand-written unit tests.
*
* Deterministic via SEED envvar; defaults to time-based. Re-run a failure:
* SEED=<value> composer test
*/
class PropertyTest extends \PHPUnit\Framework\TestCase
{
private const ITERATIONS = 200;

private int $seed;

protected function setUp(): void
{
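$env = getenv('SEED');
// Honor SEED whenever it is set (including "0"); otherwise derive a seed from the clock.
// The explicit (int) cast avoids the implicit float-to-int conversion that `%` performs,
// which can raise a deprecation notice on PHP 8.1+.
$this->seed = ($env !== false && $env !== '') ? (int) $env : (int) (microtime(true) * 1000000);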
mt_srand($this->seed);
}

protected function tearDown(): void
{
// Emit the seed in case a failure needs reproduction.
fwrite(STDERR, " [seed={$this->seed}]");
}

/** Generate a random byte string of length 0–$maxLen. */
private function randomString(int $maxLen = 80): string
{
$len = mt_rand(0, $maxLen);
$s = '';
for ($i = 0; $i < $len; $i++) {
$s .= chr(mt_rand(0, 255));
}

return $s;
}

/** Generate a plausible (but not guaranteed valid) email-like string. */
private function randomEmailLike(): string
{
$atext = 'abcdefghijklmnopqrstuvwxyz0123456789!#$%&*+-/=?^_`{|}~';
$localLen = mt_rand(1, 15);
$local = '';
for ($i = 0; $i < $localLen; $i++) {
$local .= $atext[mt_rand(0, strlen($atext) - 1)];
}

$domainLen = mt_rand(1, 10);
$domain = '';
for ($i = 0; $i < $domainLen; $i++) {
$domain .= chr(mt_rand(ord('a'), ord('z')));
}

$tld = '';
$tldLen = mt_rand(2, 5);
for ($i = 0; $i < $tldLen; $i++) {
$tld .= chr(mt_rand(ord('a'), ord('z')));
}

return "{$local}@{$domain}.{$tld}";
}

/**
* parseSingle never throws, regardless of input. Every byte string yields
* a ParsedEmailAddress (valid or invalid), never an unhandled exception.
*/
public function testParseSingleNeverThrows(): void
{
$parser = Parse::getInstance();
for ($i = 0; $i < self::ITERATIONS; $i++) {
$input = $this->randomString();
$result = $parser->parseSingle($input);
$this->assertInstanceOf(ParsedEmailAddress::class, $result);
}
}

/**
* parseMultiple never throws on arbitrary input.
*/
public function testParseMultipleNeverThrows(): void
{
$parser = Parse::getInstance();
for ($i = 0; $i < self::ITERATIONS; $i++) {
$result = $parser->parseMultiple($this->randomString());
$this->assertIsBool($result->success);
}
}

/**
* Determinism: same input always produces same output.
*/
public function testParseIsDeterministic(): void
{
$parser = new Parse();
for ($i = 0; $i < self::ITERATIONS; $i++) {
$s = $this->randomString();
$a = $parser->parseSingle($s);
$b = $parser->parseSingle($s);
$this->assertSame($a->toArray(), $b->toArray(), "Non-deterministic on: " . bin2hex($s));
}
}

/**
* When invalid, both invalidReason and invalidReasonCode must be set.
* When valid, both must be null. No half-and-half states allowed.
*/
public function testInvalidImpliesBothReasonAndCode(): void
{
$parser = Parse::getInstance();
for ($i = 0; $i < self::ITERATIONS; $i++) {
$s = $this->randomString();
$r = $parser->parseSingle($s);
if ($r->invalid) {
$this->assertNotNull($r->invalidReason, "missing reason for [" . bin2hex($s) . "]");
$this->assertNotNull($r->invalidReasonCode, "missing code for [" . bin2hex($s) . "]");
} else {
$this->assertNull($r->invalidReason);
$this->assertNull($r->invalidReasonCode);
}
}
}

/**
* Every invalid address has a severity derived from its error code.
*/
public function testInvalidAlwaysHasSeverity(): void
{
$parser = Parse::getInstance();
for ($i = 0; $i < self::ITERATIONS; $i++) {
$r = $parser->parseSingle($this->randomString());
if ($r->invalid) {
$this->assertInstanceOf(ValidationSeverity::class, $r->invalidSeverity());
} else {
$this->assertNull($r->invalidSeverity());
}
}
}

/**
* Stringable: (string) $parsed === simpleAddress when valid, '' when invalid.
*/
public function testStringableContract(): void
{
$parser = Parse::getInstance();
for ($i = 0; $i < self::ITERATIONS; $i++) {
$r = $parser->parseSingle($this->randomString());
$expected = $r->invalid ? '' : $r->simpleAddress;
$this->assertSame($expected, (string) $r);
}
}

/**
* toArray() always returns exactly what the legacy parse(…, false) returns.
*/
public function testToArrayMatchesLegacyParse(): void
{
$parser = new Parse();
for ($i = 0; $i < self::ITERATIONS; $i++) {
$s = $this->randomString();
$legacy = $parser->parse($s, false);
$typed = $parser->parseSingle($s);
$this->assertSame($legacy, $typed->toArray(), "toArray drift for [" . bin2hex($s) . "]");
}
}

/**
* Synthesized addresses that parse as valid must round-trip:
* parseSingle($addr)->simpleAddress re-parses to the same simpleAddress.
* Candidates that parse as invalid are skipped.
*/
public function testValidAddressRoundTrip(): void
{
$parser = new Parse();
for ($i = 0; $i < self::ITERATIONS; $i++) {
$addr = $this->randomEmailLike();
$first = $parser->parseSingle($addr);

if ($first->invalid) {
continue;
}

$second = $parser->parseSingle($first->simpleAddress);
$this->assertFalse(
$second->invalid,
"Round-trip failed: {$addr} → {$first->simpleAddress} → invalid ({$second->invalidReason})",
);
$this->assertSame($first->simpleAddress, $second->simpleAddress);
}
}

/**
* All four factory presets never crash on random byte input.
*/
public function testAllPresetsNeverCrash(): void
{
$presets = [
ParseOptions::rfc5321(),
ParseOptions::rfc6531(),
ParseOptions::rfc5322(),
ParseOptions::rfc2822(),
];

for ($i = 0; $i < self::ITERATIONS; $i++) {
$input = $this->randomString();
foreach ($presets as $opts) {
$r = (new Parse(null, $opts))->parseSingle($input);
$this->assertInstanceOf(ParsedEmailAddress::class, $r);
}
}
}
}
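
The seed handling in `setUp()` is what makes a failing run reproducible: seeding `mt_rand` with the same value replays the same inputs. The stand-alone sketch below illustrates that idea outside PHPUnit. `randomHexStrings()` is a hypothetical helper that mirrors `randomString()`; it is not part of the test suite, and the seed `12345` is an arbitrary example value.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns $count random byte strings (hex-encoded for
 * readability), generated from a generator seeded with $seed.
 *
 * @return list<string>
 */
function randomHexStrings(int $seed, int $count, int $maxLen = 80): array
{
    mt_srand($seed); // Same seed, same sequence of mt_rand() results.

    $out = [];
    for ($n = 0; $n < $count; $n++) {
        $len = mt_rand(0, $maxLen);
        $bytes = '';
        for ($i = 0; $i < $len; $i++) {
            $bytes .= chr(mt_rand(0, 255));
        }
        $out[] = bin2hex($bytes);
    }

    return $out;
}

// Two runs with the same seed produce identical inputs.
$first = randomHexStrings(12345, 5);
$second = randomHexStrings(12345, 5);
var_dump($first === $second); // bool(true)
```

In the test class this corresponds to copying the `[seed=...]` value printed by `tearDown()` and re-running with `SEED=<value> composer test`.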