diff --git a/.github/workflows/good-egg.yml b/.github/workflows/good-egg.yml
index dc56c38..cb49c1c 100644
--- a/.github/workflows/good-egg.yml
+++ b/.github/workflows/good-egg.yml
@@ -14,5 +14,4 @@ jobs:
       - uses: 2ndSetAI/good-egg@v1
         with:
           github-token: ${{ secrets.GITHUB_TOKEN }}
-          scoring-model: v2
           skip-known-contributors: 'false'
diff --git a/README.md b/README.md
index 84e4782..7f465b8 100644
--- a/README.md
+++ b/README.md
@@ -3,8 +3,7 @@
 Good Egg
 
-Trust scoring for GitHub PR authors using graph-based analysis of
-contribution history.
+Trust scoring for GitHub PR authors based on contribution history.
 
 ## Why
 
@@ -122,23 +121,31 @@ See [docs/mcp-server.md](https://github.com/2ndSetAI/good-egg/blob/main/docs/mcp
 
 ## Scoring Models
 
-Good Egg supports two scoring models:
+Good Egg supports three scoring models:
 
 | Model | Name | Description |
 |-------|------|-------------|
-| `v1` | Good Egg (default) | Graph-based scoring from contribution history |
+| `v3` | Diet Egg (default) | All-time merge rate as sole signal |
 | `v2` | Better Egg | Graph score + merge rate + account age via logistic regression |
+| `v1` | Good Egg | Graph-based scoring from contribution history |
 
-To use v2, set `scoring_model: v2` in your `.good-egg.yml`, pass
-`--scoring-model v2` on the CLI, or set `scoring-model: v2` in the action
-input. See [Methodology](https://github.com/2ndSetAI/good-egg/blob/main/docs/methodology.md#better-egg-v2) for how the
-v2 model works.
+v3 is the default. To use an older model, set `scoring_model: v1` or
+`scoring_model: v2` in your `.good-egg.yml`, pass `--scoring-model v1` on
+the CLI, or set `scoring-model: v1` in the action input. See
+[Methodology](https://github.com/2ndSetAI/good-egg/blob/main/docs/methodology.md) for how each model works.
+
+### Fresh Egg Advisory
+
+Accounts less than 365 days old receive a "Fresh Egg" advisory in the
+output. This is informational only and does not affect the score. Fresh
+accounts correlate with lower merge rates in the validation data.
 
 ## How It Works
 
-Good Egg builds a weighted contribution graph from a user's merged PRs and
-runs personalized graph scoring to produce a trust score relative to your
-project. See [Methodology](https://github.com/2ndSetAI/good-egg/blob/main/docs/methodology.md) for details.
+The default v3 model (Diet Egg) scores contributors by their all-time merge
+rate: merged PRs divided by total PRs (merged + closed). Older models (v1,
+v2) build a weighted contribution graph and run personalized graph scoring.
+See [Methodology](https://github.com/2ndSetAI/good-egg/blob/main/docs/methodology.md) for details.
 
 ## Trust Levels
diff --git a/action.yml b/action.yml
index 6b94554..90d120f 100644
--- a/action.yml
+++ b/action.yml
@@ -23,7 +23,7 @@ inputs:
     required: false
     default: 'false'
   scoring-model:
-    description: 'Scoring model to use (v1 or v2)'
+    description: 'Scoring model to use (v1, v2, or v3)'
    required: false
   skip-known-contributors:
     description: 'Skip scoring for authors with merged PRs in the repo (true/false)'
@@ -40,7 +40,7 @@ outputs:
    description: 'GitHub username that was scored'
     value: ${{ steps.score.outputs.user }}
   scoring-model:
-    description: 'Scoring model that was used (v1 or v2)'
+    description: 'Scoring model that was used (v1, v2, or v3)'
     value: ${{ steps.score.outputs.scoring-model }}
   skipped:
     description: 'Whether scoring was skipped for an existing contributor (true/false)'
diff --git a/assets/pr-comment-screenshot.png b/assets/pr-comment-screenshot.png
index c20df9f..06da3b8 100644
Binary files a/assets/pr-comment-screenshot.png and b/assets/pr-comment-screenshot.png differ
diff --git a/docs/configuration.md b/docs/configuration.md
index 935eff3..29b240c 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -27,23 +27,24 @@ Configuration values are resolved in this order (highest priority first):
 
 ## Scoring Model
 
-Good Egg supports two scoring models. Set the model at the top level of the
-config file:
+Good Egg supports three scoring models. Set the model at the top level of
+the config file:
 
 ```yaml
-scoring_model: v1 # default -- graph-based scoring only
+scoring_model: v3 # default -- Diet Egg -- all-time merge rate as sole signal
 scoring_model: v2 # Better Egg -- graph + external features via logistic regression
+scoring_model: v1 # Good Egg -- graph-based scoring only
 ```
 
-When using v2, PR comments are branded "Better Egg" instead of "Good Egg".
-See [methodology.md](methodology.md#better-egg-v2) for how the v2 model
+PR comments are branded "Diet Egg", "Better Egg", or "Good Egg" depending
+on the model. See [methodology.md](methodology.md) for how each model
 works.
 
 ## Full YAML Schema
 
 ```yaml
-# Scoring model selection: v1 (default) or v2
-scoring_model: v1
+# Scoring model selection: v3 (default), v2, or v1
+scoring_model: v3
 
 # Skip scoring for authors who already have merged PRs in the target repo.
 # When true (the default), existing contributors get an EXISTING_CONTRIBUTOR
@@ -127,12 +128,19 @@ v2:
 
 ### scoring_model
 
-Selects the scoring model. Set to `v1` (default) for graph-only scoring or
-`v2` for the Better Egg combined model. When set to `v2`, the parameters
-under the `v2:` block are used and the graph construction is simplified
-(no self-contribution penalty, no language normalization in repo quality, no
-diversity/volume adjustment). Language match personalization weighting
-(`same_language_weight`) is retained in v2.
+Selects the scoring model. `v3` (default, Diet Egg) uses all-time merge rate
+as the sole signal with no graph construction. `v2` (Better Egg) combines a
+simplified graph score with merge rate and account age via logistic
+regression. `v1` (Good Egg) uses graph-based scoring only.
+
+When set to `v2`, the parameters under the `v2:` block are used and the
+graph construction is simplified (no self-contribution penalty, no language
+normalization in repo quality, no diversity/volume adjustment). Language
+match personalization weighting (`same_language_weight`) is retained in v2.
+
+v3 does not use graph construction, so the `graph_scoring`, `recency`,
+`edge_weights`, and `language_normalization` sections have no effect. The
+`thresholds` section still controls trust level classification.
 
 ### v2 (Better Egg)
 
@@ -215,7 +223,7 @@ The following environment variables override individual config values:
 | `GOOD_EGG_HIGH_TRUST` | `thresholds.high_trust` | float |
 | `GOOD_EGG_MEDIUM_TRUST` | `thresholds.medium_trust` | float |
 | `GOOD_EGG_HALF_LIFE_DAYS` | `recency.half_life_days` | int |
-| `GOOD_EGG_SCORING_MODEL` | `scoring_model` | str (`v1` or `v2`) |
+| `GOOD_EGG_SCORING_MODEL` | `scoring_model` | str (`v1`, `v2`, or `v3`) |
 | `GOOD_EGG_SKIP_KNOWN_CONTRIBUTORS` | `skip_known_contributors` | bool (`true`/`false`) |
 
 ## Programmatic Configuration
diff --git a/docs/github-action.md b/docs/github-action.md
index 1a98d4d..8e8e62c 100644
--- a/docs/github-action.md
+++ b/docs/github-action.md
@@ -37,7 +37,7 @@ This posts a trust score comment on each pull request:
 | `comment` | No | `true` | Post a PR comment with the trust score |
 | `check-run` | No | `false` | Create a check run with the trust score |
 | `fail-on-low` | No | `false` | Fail the action if trust level is LOW |
-| `scoring-model` | No | `v1` | Scoring model: `v1` (Good Egg) or `v2` (Better Egg) |
+| `scoring-model` | No | `v3` | Scoring model: `v3` (Diet Egg), `v2` (Better Egg), or `v1` (Good Egg) |
 | `skip-known-contributors` | No | `true` | Skip scoring for authors with merged PRs in the repo |
 
 ## Outputs
@@ -47,7 +47,7 @@ This posts a trust score comment on each pull request:
 | `score` | Normalized trust score (0.0 - 1.0) |
 | `trust-level` | Trust level: HIGH, MEDIUM, LOW, UNKNOWN, BOT, or EXISTING_CONTRIBUTOR |
 | `user` | GitHub username that was scored |
-| `scoring-model` | Scoring model used: `v1` (Good Egg) or `v2` (Better Egg) |
+| `scoring-model` | Scoring model used: `v3` (Diet Egg), `v2` (Better Egg), or `v1` (Good Egg) |
 | `skipped` | Whether scoring was skipped for an existing contributor (`true`/`false`) |
 
 ## Custom Configuration
@@ -179,9 +179,10 @@ jobs:
 
 You can check whether scoring was skipped via the `skipped` output.
 
-## Using Better Egg (v2)
+## Selecting a Scoring Model
 
-To use the v2 scoring model, set the `scoring-model` input:
+The default model is v3 (Diet Egg), which scores by all-time merge rate. To
+use an older model, set the `scoring-model` input:
 
 ```yaml
 jobs:
@@ -191,12 +192,12 @@ jobs:
     - uses: 2ndSetAI/good-egg@v0
       with:
         github-token: ${{ secrets.GITHUB_TOKEN }}
-        scoring-model: v2
+        scoring-model: v2 # or v1
 ```
 
-When using v2, PR comments are branded "Better Egg" and include a component
-score breakdown showing graph score, merge rate, and account age
-contributions. The `scoring-model` output reflects which model was used.
+PR comments are branded according to the model: "Diet Egg" for v3, "Better
+Egg" for v2, "Good Egg" for v1. v2 and v3 include a component score
+breakdown. The `scoring-model` output reflects which model was used.
 
 You can also set `scoring-model` via the `GOOD_EGG_SCORING_MODEL`
 environment variable, but the input takes precedence.
@@ -209,5 +210,7 @@ See the [examples/](../examples/) directory for complete workflow files:
   that posts a PR comment
 - [strict-workflow.yml](../examples/strict-workflow.yml) -- comment, check
   run, and fail-on-low
+- [diet-egg-workflow.yml](../examples/diet-egg-workflow.yml) -- v3
+  scoring model (default) with component breakdown
 - [better-egg-workflow.yml](../examples/better-egg-workflow.yml) -- v2
   scoring model with component breakdown
diff --git a/docs/library.md b/docs/library.md
index 2aa5efc..8c2894b 100644
--- a/docs/library.md
+++ b/docs/library.md
@@ -114,26 +114,22 @@ result = await score_pr_author(
 )
 ```
 
-### v2 (Better Egg) Configuration
+### Scoring Model Selection
 
-To use the v2 scoring model, set `scoring_model` on the config:
+The default model is v3 (Diet Egg). To use an older model, set
+`scoring_model` on the config:
 
 ```python
 from good_egg import GoodEggConfig, score_pr_author
 
-config = GoodEggConfig(
-    scoring_model="v2",
-    v2={
-        "graph": {"half_life_days": 180, "max_age_days": 730},
-        "features": {"merge_rate": True, "account_age": True},
-        "combined_model": {
-            "intercept": -0.8094,
-            "graph_score_weight": 1.9138,
-            "merge_rate_weight": -0.7783,
-            "account_age_weight": 0.1493,
-        },
-    },
-)
+# v3 (default) -- merge rate only
+config = GoodEggConfig()
+
+# v2 -- graph + merge rate + account age
+config = GoodEggConfig(scoring_model="v2")
+
+# v1 -- graph only
+config = GoodEggConfig(scoring_model="v1")
 
 result = await score_pr_author(
     login="octocat",
@@ -142,12 +138,13 @@ result = await score_pr_author(
     config=config,
 )
 
-# v2 results include component scores
+# v3 and v2 results include component scores
 if result.component_scores:
-    print(f"Graph score: {result.component_scores['graph_score']:.3f}")
-    print(f"Merge rate: {result.component_scores['merge_rate']:.3f}")
-    print(f"Log account age: {result.component_scores['log_account_age']:.3f}")
-    print(f"Normalized score: {result.normalized_score:.3f}")
+    print(f"Merge rate: {result.component_scores.get('merge_rate')}")
+
+# v3 includes a fresh account advisory
+if result.fresh_account and result.fresh_account.is_fresh:
+    print(f"Fresh account: {result.fresh_account.account_age_days} days old")
 
 print(f"Scoring model: {result.scoring_model}")
 ```
@@ -176,7 +173,7 @@ the following fields:
 |-------|------|-------------|
 | `user_login` | `str` | GitHub username that was scored |
 | `context_repo` | `str` | Repository used as scoring context |
-| `raw_score` | `float` | Pre-normalization score: graph score (v1) or logit (v2) |
+| `raw_score` | `float` | Pre-normalization score: merge rate (v3), logit (v2), or graph score (v1) |
 | `normalized_score` | `float` | Normalized score (0.0 - 1.0) |
 | `trust_level` | `TrustLevel` | HIGH, MEDIUM, LOW, UNKNOWN, BOT, or EXISTING_CONTRIBUTOR |
 | `account_age_days` | `int` | Age of the GitHub account in days |
@@ -185,9 +182,10 @@ the following fields:
 | `top_contributions` | `list[ContributionSummary]` | Top repositories contributed to |
 | `language_match` | `bool` | Whether the user's top language matches the context repo |
 | `flags` | `dict[str, bool]` | Flags (is_bot, is_new_account, etc.) |
-| `scoring_model` | `str` | Scoring model used: `v1` or `v2` |
-| `component_scores` | `dict[str, float] \| None` | Component breakdown (v2 only): `graph_score`, `merge_rate`, `log_account_age` |
+| `scoring_model` | `str` | Scoring model used: `v1`, `v2`, or `v3` |
+| `component_scores` | `dict[str, float]` | Component breakdown (v3: `merge_rate`; v2: `graph_score`, `merge_rate`, `log_account_age`) |
 | `scoring_metadata` | `dict[str, Any]` | Internal scoring details |
+| `fresh_account` | `FreshAccountAdvisory \| None` | Advisory for accounts under 365 days old (None for bots and existing contributors) |
 
 `TrustScore` is a Pydantic model, so you can serialize it:
diff --git a/docs/mcp-server.md b/docs/mcp-server.md
index 4cbe7ea..0f9e1fd 100644
--- a/docs/mcp-server.md
+++ b/docs/mcp-server.md
@@ -82,15 +82,17 @@ Returns the full trust score as JSON, including all fields from the
 |------|------|----------|-------------|
 | `username` | `string` | Yes | GitHub username to score |
 | `repo` | `string` | Yes | Target repository in `owner/repo` format |
-| `scoring_model` | `string` | No | Scoring model: `v1` (Good Egg, default) or `v2` (Better Egg) |
+| `scoring_model` | `string` | No | Scoring model: `v3` (Diet Egg, default), `v2` (Better Egg), or `v1` (Good Egg) |
 | `force_score` | `boolean` | No | Force full scoring even for known contributors (default: `false`) |
 
 **Returns:** Full `TrustScore` JSON with all fields (user_login,
 context_repo, raw_score, normalized_score, trust_level, account_age_days,
 total_merged_prs, unique_repos_contributed, top_contributions,
-language_match, flags, scoring_model, component_scores,
-scoring_metadata). When `scoring_model` is `v2`, the response includes
-`component_scores` with graph_score, merge_rate, and log_account_age.
+language_match, flags, scoring_model, component_scores,
+scoring_metadata, fresh_account). v3 includes `component_scores` with
+`merge_rate`. v2 includes `graph_score`, `merge_rate`, and
+`log_account_age`. The `fresh_account` field contains a Fresh Egg advisory
+for accounts under 365 days old (null for bots and existing contributors).
 
 ### check_pr_author
 
@@ -102,17 +104,21 @@ Returns a compact summary suitable for quick checks.
 |------|------|----------|-------------|
 | `username` | `string` | Yes | GitHub username to check |
 | `repo` | `string` | Yes | Target repository in `owner/repo` format |
-| `scoring_model` | `string` | No | Scoring model: `v1` (Good Egg, default) or `v2` (Better Egg) |
+| `scoring_model` | `string` | No | Scoring model: `v3` (Diet Egg, default), `v2` (Better Egg), or `v1` (Good Egg) |
 | `force_score` | `boolean` | No | Force full scoring even for known contributors (default: `false`) |
 
-**Returns (v1):**
+**Returns (v3, default):**
 
 ```json
 {
   "user_login": "octocat",
   "trust_level": "HIGH",
   "normalized_score": 0.82,
-  "total_merged_prs": 47
+  "total_merged_prs": 47,
+  "scoring_model": "v3",
+  "component_scores": {
+    "merge_rate": 0.82
+  }
 }
 ```
@@ -143,10 +149,10 @@ Returns an expanded breakdown with contributions, flags, and metadata.
 |------|------|----------|-------------|
 | `username` | `string` | Yes | GitHub username to analyse |
 | `repo` | `string` | Yes | Target repository in `owner/repo` format |
-| `scoring_model` | `string` | No | Scoring model: `v1` (Good Egg, default) or `v2` (Better Egg) |
+| `scoring_model` | `string` | No | Scoring model: `v3` (Diet Egg, default), `v2` (Better Egg), or `v1` (Good Egg) |
 | `force_score` | `boolean` | No | Force full scoring even for known contributors (default: `false`) |
 
-**Returns (v1):**
+**Returns (v3, default):**
 
 ```json
 {
@@ -154,7 +160,7 @@ Returns an expanded breakdown with contributions, flags, and metadata.
"context_repo": "octocat/Hello-World", "trust_level": "HIGH", "normalized_score": 0.82, - "raw_score": 0.0045, + "raw_score": 0.82, "account_age_days": 3650, "total_merged_prs": 47, "unique_repos_contributed": 12, @@ -171,11 +177,18 @@ Returns an expanded breakdown with contributions, flags, and metadata. "is_bot": false, "is_new_account": false }, - "scoring_metadata": {} + "scoring_model": "v3", + "component_scores": { + "merge_rate": 0.82 + }, + "scoring_metadata": { + "closed_pr_count": 10 + }, + "fresh_account": null } ``` -**Returns (v2):** +**Returns (v1):** ```json { @@ -183,7 +196,7 @@ Returns an expanded breakdown with contributions, flags, and metadata. "context_repo": "octocat/Hello-World", "trust_level": "HIGH", "normalized_score": 0.82, - "raw_score": 0.2871, + "raw_score": 0.0045, "account_age_days": 3650, "total_merged_prs": 47, "unique_repos_contributed": 12, @@ -200,12 +213,6 @@ Returns an expanded breakdown with contributions, flags, and metadata. "is_bot": false, "is_new_account": false }, - "scoring_model": "v2", - "component_scores": { - "graph_score": 0.78, - "merge_rate": 0.91, - "log_account_age": 3.45 - }, "scoring_metadata": {} } ``` diff --git a/docs/methodology.md b/docs/methodology.md index cf1bcba..74ff0f6 100644 --- a/docs/methodology.md +++ b/docs/methodology.md @@ -101,9 +101,14 @@ These raw weights are normalized to sum to 1.0, so actual values in the random w ### Scoring and Normalization -The directed graph is scored using personalized graph-based ranking with a damping factor (alpha) of 0.85. This produces a raw score for the user node. +**v3 (default):** The score is the alltime merge rate: merged PRs divided +by total PRs (merged + closed). This value is used directly as both +`raw_score` and `normalized_score`, since it is already in [0, 1]. No +graph construction is performed. 
-**v1:** Normalization converts the raw graph score to a 0-1 range: +**v1:** The directed graph is scored using personalized graph-based ranking +with a damping factor (alpha) of 0.85. Normalization converts the raw +graph score to a 0-1 range: ``` baseline = 1 / num_nodes @@ -162,11 +167,57 @@ GitHub rate limits bound how much data can be fetched per user. Good Egg is desi --- +## Diet Egg (v3) + +The v3 scoring model -- branded "Diet Egg" in PR comments -- is the default +since v3 was introduced. It uses alltime merge rate as the sole scoring +input, dropping graph score and log account age from the v2 logistic +regression. + +### Motivation + +Experimental data from the validation study showed that: + +1. **Graph score (hub_score) hurts performance for unknown contributors.** + The graph is most useful for contributors with extensive cross-project + history, but for the primary use case -- evaluating unfamiliar PR + authors -- it adds noise. +2. **Log account age adds nothing significant.** While statistically + significant in isolation, account age does not improve ranking + performance when combined with merge rate. + +Merge rate alone provides a simple, fast, and effective signal. v3 requires +no graph construction, no pagerank computation, and no logistic regression. + +### Fresh Egg Advisory + +Accounts less than 365 days old receive a "Fresh Egg" advisory in the +output. This is informational only and does not affect the score. In the +validation data, fresh accounts correlate with approximately 16 percentage +points lower merge rates. + +The advisory is attached to all scoring paths (v1, v2, v3) and to the +insufficient-data short circuit. It is not attached to bot accounts (whose +synthetic profiles have unreliable age data) or to existing contributor +early returns (where no profile data is fetched). 
+ +### Component Scores + +v3 output includes a single component: + +| Component | Description | +|-----------|-------------| +| `merge_rate` | Fraction of PRs that were merged: `merged / (merged + closed)` | + +The `scoring_metadata` contains `closed_pr_count` for transparency. + +--- + ## Better Egg (v2) The v2 scoring model -- branded "Better Egg" in PR comments -- extends the graph-based approach with external features combined via logistic regression. -It is opt-in via `scoring_model: v2` in configuration; v1 remains the default. +It is available via `scoring_model: v2` in configuration. ### Motivation diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md index b06ed52..68bf477 100644 --- a/docs/troubleshooting.md +++ b/docs/troubleshooting.md @@ -26,6 +26,27 @@ backoff. If you see persistent failures: | `Could not extract PR number` | Not a PR event | Ensure workflow triggers on `pull_request` | | `Invalid GITHUB_REPOSITORY` | Malformed env var | Check Actions environment | +## v3 (Diet Egg) Scoring + +### v3 score differs from v1 or v2 + +This is expected. v3 uses alltime merge rate as the sole signal, with no +graph construction. Prolific contributors who close many of their own PRs +(drafts, experiments) will have lower scores than in v1/v2 because those +closed PRs pull down their merge rate. + +### "Diet Egg" appears in PR comments + +PR comments use the "Diet Egg" branding when `scoring_model` is set to +`v3` (the default). This is intentional. To use an older model, set +`scoring_model: v1` or `scoring_model: v2`. + +### "Fresh Egg" advisory appears + +Accounts under 365 days old get a "Fresh Egg" advisory. This is +informational only and does not affect the score. It is not shown for bot +accounts or existing contributors. + ## v2 (Better Egg) Scoring ### v2 score differs significantly from v1 @@ -46,9 +67,7 @@ has sufficient rate limit remaining. 
### "Better Egg" appears in PR comments PR comments use the "Better Egg" branding when `scoring_model` is set to -`v2`. This is intentional and helps distinguish v2 results from v1. To -switch back, set `scoring_model: v1` or remove the setting (v1 is the -default). +`v2`. This is intentional and helps distinguish v2 results from v1. ## Getting Help diff --git a/examples/.good-egg.yml b/examples/.good-egg.yml index 0718e7b..025f3ee 100644 --- a/examples/.good-egg.yml +++ b/examples/.good-egg.yml @@ -1,6 +1,11 @@ # Good Egg configuration # Copy this file to your repository root as .good-egg.yml +# Scoring model: v3 (Diet Egg, default), v2 (Better Egg), or v1 (Good Egg) +# v3 uses alltime merge rate as the sole signal. v2 adds graph scoring and +# account age. v1 uses graph scoring only. +# scoring_model: v3 + # Skip scoring for authors who already have merged PRs in the target repo. # Set to false to always run full scoring. # skip_known_contributors: true @@ -84,7 +89,7 @@ language_normalization: # -------------------------------------------------------------------------- # v2 (Better Egg) scoring model # -------------------------------------------------------------------------- -# Uncomment the lines below to enable the v2 combined model. +# Uncomment the lines below to use the v2 combined model instead of v3. # See docs/methodology.md for how v2 works. # scoring_model: v2 diff --git a/examples/diet-egg-workflow.yml b/examples/diet-egg-workflow.yml new file mode 100644 index 0000000..6876b6e --- /dev/null +++ b/examples/diet-egg-workflow.yml @@ -0,0 +1,27 @@ +# Diet Egg (v3) workflow -- uses alltime merge rate as the sole scoring +# signal. This is the default model, so scoring-model does not need to be +# set explicitly. Shown here for clarity. 
+name: Diet Egg
+
+on:
+  pull_request:
+    types: [opened, reopened, synchronize]
+
+permissions:
+  pull-requests: write
+
+jobs:
+  score:
+    runs-on: ubuntu-latest
+    steps:
+      - id: egg
+        uses: 2ndSetAI/good-egg@v0
+        with:
+          github-token: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Print results
+        run: |
+          echo "Score: ${{ steps.egg.outputs.score }}"
+          echo "Trust level: ${{ steps.egg.outputs.trust-level }}"
+          echo "Model: ${{ steps.egg.outputs.scoring-model }}"
+          echo "User: ${{ steps.egg.outputs.user }}"
diff --git a/examples/library_usage.py b/examples/library_usage.py
index 71314d5..91b4353 100644
--- a/examples/library_usage.py
+++ b/examples/library_usage.py
@@ -1,4 +1,4 @@
-"""Example: Score a GitHub user with Good Egg."""
+"""Example: Score a GitHub user with Good Egg (v3 / Diet Egg)."""
 
 from __future__ import annotations
 
@@ -17,6 +17,7 @@ async def main() -> None:
     )
     print(f"User: {result.user_login}")
     print(f"Trust level: {result.trust_level}")
+    print(f"Scoring model: {result.scoring_model}")
 
     if result.flags.get("scoring_skipped"):
         pr_count = result.scoring_metadata.get("context_repo_merged_pr_count", 0)
@@ -26,6 +27,19 @@ async def main() -> None:
     print(f"Merged PRs: {result.total_merged_prs}")
     print(f"Unique repos: {result.unique_repos_contributed}")
 
+    # v3 component scores
+    if result.component_scores:
+        merge_rate = result.component_scores.get("merge_rate")
+        if merge_rate is not None:
+            print(f"Merge rate: {merge_rate:.0%}")
+
+    # Fresh Egg advisory
+    if result.fresh_account and result.fresh_account.is_fresh:
+        print(
+            f"Fresh account: {result.fresh_account.account_age_days} days old"
+            f" (< {result.fresh_account.threshold_days} days)"
+        )
+
 
 if __name__ == "__main__":
     asyncio.run(main())
diff --git a/src/good_egg/action.py b/src/good_egg/action.py
index eb21107..cb47dc0 100644
--- a/src/good_egg/action.py
+++ b/src/good_egg/action.py
@@ -80,7 +80,7 @@ async def run_action() -> None:
         os.environ.get("INPUT_SCORING_MODEL")
         or os.environ.get("INPUT_SCORING-MODEL")
     )
-    if scoring_model_input and scoring_model_input in ("v1", "v2"):
+    if scoring_model_input and scoring_model_input in ("v1", "v2", "v3"):
         config = config.model_copy(update={"scoring_model": scoring_model_input})
     if skip_known_input is not None:
         config = config.model_copy(
diff --git a/src/good_egg/cli.py b/src/good_egg/cli.py
index 618d13b..ea0e07b 100644
--- a/src/good_egg/cli.py
+++ b/src/good_egg/cli.py
@@ -28,9 +28,9 @@ def main() -> None:
 @click.option("--json", "output_json", is_flag=True, help="Output as JSON")
 @click.option(
     "--scoring-model",
-    type=click.Choice(["v1", "v2"]),
+    type=click.Choice(["v1", "v2", "v3"]),
     default=None,
-    help="Scoring model (v1 or v2)",
+    help="Scoring model (v1, v2, or v3)",
 )
 @click.option(
     "--force-score",
diff --git a/src/good_egg/config.py b/src/good_egg/config.py
index 576902b..0d44f8b 100644
--- a/src/good_egg/config.py
+++ b/src/good_egg/config.py
@@ -147,7 +147,7 @@ class V2Config(BaseModel):
 class GoodEggConfig(BaseModel):
     """Top-level configuration composing all sub-configs."""
 
-    scoring_model: Literal["v1", "v2"] = "v1"
+    scoring_model: Literal["v1", "v2", "v3"] = "v3"
     skip_known_contributors: bool = True
     graph_scoring: GraphScoringConfig = Field(default_factory=GraphScoringConfig)
     edge_weights: EdgeWeightConfig = Field(default_factory=EdgeWeightConfig)
diff --git a/src/good_egg/formatter.py b/src/good_egg/formatter.py
index ca72c83..4015e45 100644
--- a/src/good_egg/formatter.py
+++ b/src/good_egg/formatter.py
@@ -36,7 +36,11 @@ def _existing_contributor_context(score: TrustScore) -> tuple[int, str]:
 
 def _brand_name(score: TrustScore) -> str:
-    """Return 'Better Egg' for v2, 'Good Egg' for v1."""
-    return "Better Egg" if score.scoring_model == "v2" else "Good Egg"
+    """Return 'Diet Egg' for v3, 'Better Egg' for v2, 'Good Egg' for v1."""
+    if score.scoring_model == "v3":
+        return "Diet Egg"
+    if score.scoring_model == "v2":
+        return "Better Egg"
+    return "Good Egg"
 
 
 def format_markdown_comment(score: TrustScore) -> str:
@@ -67,7 +71,7 @@ def format_markdown_comment(score: TrustScore) -> str:
     ]
 
-    # v2 component score breakdown
-    if score.scoring_model == "v2" and score.component_scores:
+    # v2/v3 component score breakdown
+    if score.scoring_model in ("v2", "v3") and score.component_scores:
         lines.append("### Score Breakdown")
         lines.append("")
         lines.append("| Component | Value |")
@@ -122,6 +126,17 @@ def format_markdown_comment(score: TrustScore) -> str:
             lines.append(f"- {flag}")
         lines.append("")
 
+    # Fresh account advisory
+    if score.fresh_account and score.fresh_account.is_fresh:
+        lines.append("### Fresh Account")
+        lines.append("")
+        lines.append(
+            f"\U0001f423 Account is {score.fresh_account.account_age_days} days old"
+            f" (< {score.fresh_account.threshold_days} days)."
+            " Fresh accounts correlate with lower merge rates."
+        )
+        lines.append("")
+
     # Low trust note
     if score.trust_level == TrustLevel.LOW:
         lines.append("> **First-time contributor -- review manually**")
@@ -163,7 +178,7 @@ def format_cli_output(score: TrustScore, verbose: bool = False) -> str:
         f"Repos: {score.unique_repos_contributed}"
     )
 
-    if score.scoring_model == "v2" and score.component_scores:
+    if score.scoring_model in ("v2", "v3") and score.component_scores:
         lines.append("")
         lines.append("Component scores:")
         if "graph_score" in score.component_scores:
@@ -191,6 +206,13 @@ def format_cli_output(score: TrustScore, verbose: bool = False) -> str:
         if value:
             lines.append(f"  - {flag}")
 
+    if score.fresh_account and score.fresh_account.is_fresh:
+        lines.append("")
+        lines.append(
+            f"Fresh account: {score.fresh_account.account_age_days} days old"
+            f" (< {score.fresh_account.threshold_days} days)"
+        )
+
     if score.scoring_metadata:
         lines.append("")
         lines.append("Metadata:")
@@ -234,7 +256,7 @@ def format_check_run_summary(score: TrustScore) -> tuple[str, str]:
         f"Repos contributed to: {score.unique_repos_contributed}",
     ]
 
-    if score.scoring_model == "v2" and score.component_scores:
+    if score.scoring_model in ("v2", "v3") and score.component_scores:
         summary_lines.append("")
         summary_lines.append("**Score Breakdown:**")
         if "graph_score" in score.component_scores:
@@ -261,5 +283,12 @@ def format_check_run_summary(score: TrustScore) -> tuple[str, str]:
         summary_lines.append("")
         summary_lines.append(f"**Flags:** {', '.join(active_flags)}")
 
+    if score.fresh_account and score.fresh_account.is_fresh:
+        summary_lines.append("")
+        summary_lines.append(
+            f"**Fresh Account:** {score.fresh_account.account_age_days} days old"
+            f" (< {score.fresh_account.threshold_days} days)"
+        )
+
     summary = "\n".join(summary_lines)
     return title, summary
diff --git a/src/good_egg/mcp_server.py b/src/good_egg/mcp_server.py
index 95db190..7764aff 100644
--- a/src/good_egg/mcp_server.py
+++ b/src/good_egg/mcp_server.py
@@ -57,7 +57,7 @@ async def _scoring_resources(
     is closed on exit.
     """
     config = _get_config()
-    if scoring_model is not None and scoring_model in ("v1", "v2"):
+    if scoring_model is not None and scoring_model in ("v1", "v2", "v3"):
         config = config.model_copy(update={"scoring_model": scoring_model})
     if force_score:
         config = config.model_copy(update={"skip_known_contributors": False})
@@ -96,7 +96,7 @@ async def score_user(
     Args:
         username: GitHub username to score.
         repo: Target repository in owner/repo format.
-        scoring_model: Optional scoring model override (v1 or v2).
+        scoring_model: Optional scoring model override (v1, v2, or v3).
         force_score: Force full scoring even for known contributors.
     """
     try:
@@ -128,7 +128,7 @@ async def check_pr_author(
     Args:
         username: GitHub username to check.
         repo: Target repository in owner/repo format.
-        scoring_model: Optional scoring model override (v1 or v2).
+        scoring_model: Optional scoring model override (v1, v2, or v3).
         force_score: Force full scoring even for known contributors.
     """
     try:
@@ -169,7 +169,7 @@ async def get_trust_details(
     Args:
         username: GitHub username to analyse.
         repo: Target repository in owner/repo format.
-        scoring_model: Optional scoring model override (v1 or v2).
+        scoring_model: Optional scoring model override (v1, v2, or v3).
         force_score: Force full scoring even for known contributors.
     """
     try:
diff --git a/src/good_egg/models.py b/src/good_egg/models.py
index a569872..f4175a1 100644
--- a/src/good_egg/models.py
+++ b/src/good_egg/models.py
@@ -77,6 +77,14 @@ class ContributionSummary(BaseModel):
     stars: int = 0
 
 
+class FreshAccountAdvisory(BaseModel):
+    """Advisory for accounts less than one year old."""
+    is_fresh: bool
+    account_age_days: int
+    created_at: datetime | None = None
+    threshold_days: int = 365
+
+
 class TrustScore(BaseModel):
     """Complete trust score result."""
     user_login: str
@@ -94,3 +102,4 @@ class TrustScore(BaseModel):
     scoring_metadata: dict[str, Any] = {}
     scoring_model: str = "v1"
     component_scores: dict[str, float] = {}
+    fresh_account: FreshAccountAdvisory | None = None
diff --git a/src/good_egg/scorer.py b/src/good_egg/scorer.py
index 194d40c..8e7c4f5 100644
--- a/src/good_egg/scorer.py
+++ b/src/good_egg/scorer.py
@@ -12,6 +12,7 @@ from good_egg.graph_builder import TrustGraphBuilder
 from good_egg.models import (
     ContributionSummary,
+    FreshAccountAdvisory,
     TrustLevel,
     TrustScore,
     UserContributionData,
@@ -40,6 +41,8 @@ def score(
         }
         model_name = self.config.scoring_model
 
+        fresh_account = self._build_fresh_account_advisory(user_data)
+
         # ---- Bot short-circuit ----
         if user_data.profile.is_bot:
             return TrustScore(
@@ -65,15 +68,18 @@ def score(
                 account_age_days=user_data.profile.account_age_days,
                 flags=flags,
                 scoring_model=model_name,
+                fresh_account=fresh_account,
             )
 
         # ---- Suspected bot flag ----
         if user_data.profile.is_suspected_bot:
             flags["is_suspected_bot"] = True
 
+        if model_name == "v3":
+            return self._score_v3(user_data, context_repo, flags, fresh_account)
         if model_name == "v2":
-            return self._score_v2(user_data, context_repo, flags)
-        return self._score_v1(user_data, context_repo, flags)
+            return self._score_v2(user_data, context_repo, flags, fresh_account)
+        return self._score_v1(user_data, context_repo, flags, fresh_account)
 
     # ------------------------------------------------------------------
     # v1 scoring path
@@ -84,6 +90,7 @@ def _score_v1(
         user_data: UserContributionData,
         context_repo: str,
         flags: dict[str, bool],
+        fresh_account: FreshAccountAdvisory | None = None,
     ) -> TrustScore:
         """Original graph-based scoring pipeline."""
         login = user_data.profile.login
@@ -130,6 +137,7 @@ def _score_v1(
                 "graph_nodes": graph.number_of_nodes(),
                 "graph_edges": graph.number_of_edges(),
             },
+            fresh_account=fresh_account,
         )
 
     # ------------------------------------------------------------------
@@ -141,6 +149,7 @@ def _score_v2(
         user_data: UserContributionData,
         context_repo: str,
         flags: dict[str, bool],
+        fresh_account: FreshAccountAdvisory | None = None,
     ) -> TrustScore:
         """v2 scoring: simplified graph + logistic regression combined model."""
         login = user_data.profile.login
@@ -222,6 +231,71 @@ def _score_v2(
             },
             scoring_model="v2",
             component_scores=component_scores,
+            fresh_account=fresh_account,
         )
 
+    # ------------------------------------------------------------------
+    # v3 scoring path (Diet Egg)
+    # ------------------------------------------------------------------
+
+    def _score_v3(
+        self,
+        user_data: UserContributionData,
+        context_repo: str,
+        flags: dict[str, bool],
+        fresh_account: FreshAccountAdvisory | None = None,
+    ) -> TrustScore:
+        """v3 scoring: merge rate as sole signal."""
+        login = user_data.profile.login
+        total_prs = len(user_data.merged_prs)
+        unique_repos = len({pr.repo_name_with_owner for pr in user_data.merged_prs})
+
+        merged_count = len(user_data.merged_prs)
+        closed_count = user_data.closed_pr_count
+        total_author_prs = merged_count + closed_count
+        merge_rate = (
+            merged_count / total_author_prs if total_author_prs > 0 else 0.0
+        )
+
+        trust_level = self._classify(merge_rate, flags)
+        context_language = self._resolve_context_language(user_data, context_repo)
+        top_contributions = self._build_top_contributions(user_data)
+        language_match = self._check_language_match(user_data, context_language)
+
+        return TrustScore(
+            user_login=login,
+            context_repo=context_repo,
+            raw_score=merge_rate,
+            normalized_score=merge_rate,
+            trust_level=trust_level,
+            account_age_days=user_data.profile.account_age_days,
+            total_merged_prs=total_prs,
+            unique_repos_contributed=unique_repos,
+            top_contributions=top_contributions,
+            language_match=language_match,
+            flags=flags,
+            scoring_metadata={
+                "closed_pr_count": closed_count,
+            },
+            scoring_model="v3",
+            component_scores={"merge_rate": merge_rate},
+            fresh_account=fresh_account,
+        )
+
+    # ------------------------------------------------------------------
+    # Fresh account advisory
+    # ------------------------------------------------------------------
+
+    @staticmethod
+    def _build_fresh_account_advisory(
+        user_data: UserContributionData,
+    ) -> FreshAccountAdvisory:
+        """Build a fresh account advisory from user profile data."""
+        age_days = user_data.profile.account_age_days
+        return FreshAccountAdvisory(
+            is_fresh=age_days < 365,
+            account_age_days=age_days,
+            created_at=user_data.profile.created_at,
+        )
 
     # ------------------------------------------------------------------
diff --git a/tests/test_config.py b/tests/test_config.py
index 5711e97..ec4c36d 100644
--- a/tests/test_config.py
+++ b/tests/test_config.py
@@ -157,9 +157,9 @@ def test_custom_values(self) -> None:
 
 class TestScoringModelConfig:
-    def test_default_is_v1(self) -> None:
+    def test_default_is_v3(self) -> None:
         config = GoodEggConfig()
-        assert config.scoring_model == "v1"
+        assert config.scoring_model == "v3"
 
     def test_set_to_v2(self) -> None:
         config = GoodEggConfig(scoring_model="v2")
diff --git a/tests/test_formatter.py b/tests/test_formatter.py
index 8f52f15..c8c0b8e 100644
--- a/tests/test_formatter.py
+++ b/tests/test_formatter.py
@@ -11,7 +11,12 @@
     format_json,
     format_markdown_comment,
 )
-from good_egg.models import ContributionSummary, TrustLevel, TrustScore
+from 
good_egg.models import ( + ContributionSummary, + FreshAccountAdvisory, + TrustLevel, + TrustScore, +) def _make_score(**kwargs) -> TrustScore: @@ -332,3 +337,152 @@ def test_json_existing_contributor(self) -> None: parsed = json.loads(result) assert parsed["trust_level"] == "EXISTING_CONTRIBUTOR" assert parsed["flags"]["is_existing_contributor"] is True + + +class TestDietEggFormatting: + def _make_v3_score(self, **kwargs) -> TrustScore: + defaults = { + "user_login": "testuser", + "context_repo": "owner/repo", + "raw_score": 0.75, + "normalized_score": 0.75, + "trust_level": TrustLevel.HIGH, + "account_age_days": 500, + "total_merged_prs": 30, + "unique_repos_contributed": 8, + "top_contributions": [ + ContributionSummary( + repo_name="cool/project", + pr_count=15, + language="Python", + stars=1200, + ), + ], + "language_match": True, + "flags": {}, + "scoring_metadata": {"closed_pr_count": 10}, + "scoring_model": "v3", + "component_scores": {"merge_rate": 0.75}, + } + defaults.update(kwargs) + return TrustScore(**defaults) + + def test_markdown_diet_egg_header(self) -> None: + score = self._make_v3_score() + md = format_markdown_comment(score) + assert "Diet Egg" in md + assert "Better Egg" not in md + assert "Good Egg" not in md + + def test_markdown_component_breakdown(self) -> None: + score = self._make_v3_score() + md = format_markdown_comment(score) + assert "Score Breakdown" in md + assert "Merge Rate" in md + # graph_score and log_account_age should not appear + assert "Graph Score" not in md + assert "Account Age" not in md + + def test_cli_diet_egg_header(self) -> None: + score = self._make_v3_score() + out = format_cli_output(score) + assert "Diet Egg" in out + + def test_cli_verbose_component_scores(self) -> None: + score = self._make_v3_score() + out = format_cli_output(score, verbose=True) + assert "Component scores" in out + assert "Merge Rate" in out + assert "Graph Score" not in out + + def test_check_run_diet_egg_title(self) -> None: + score = 
self._make_v3_score() + title, summary = format_check_run_summary(score) + assert "Diet Egg" in title + assert "Score Breakdown" in summary + assert "Merge Rate" in summary + + def test_json_includes_v3_fields(self) -> None: + score = self._make_v3_score() + result = format_json(score) + parsed = json.loads(result) + assert parsed["scoring_model"] == "v3" + assert "merge_rate" in parsed["component_scores"] + assert "graph_score" not in parsed["component_scores"] + + +class TestFreshAccountFormatting: + def _make_fresh_score(self, is_fresh: bool = True, **kwargs) -> TrustScore: + fresh = FreshAccountAdvisory( + is_fresh=is_fresh, + account_age_days=200 if is_fresh else 500, + ) + defaults = { + "user_login": "newbie", + "context_repo": "owner/repo", + "raw_score": 0.6, + "normalized_score": 0.6, + "trust_level": TrustLevel.MEDIUM, + "account_age_days": 200 if is_fresh else 500, + "total_merged_prs": 5, + "unique_repos_contributed": 2, + "flags": {}, + "scoring_model": "v3", + "component_scores": {"merge_rate": 0.6}, + "fresh_account": fresh, + } + defaults.update(kwargs) + return TrustScore(**defaults) + + def test_markdown_fresh_account_shown(self) -> None: + score = self._make_fresh_score(is_fresh=True) + md = format_markdown_comment(score) + assert "Fresh Account" in md + assert "200 days old" in md + + def test_markdown_fresh_account_hidden(self) -> None: + score = self._make_fresh_score(is_fresh=False) + md = format_markdown_comment(score) + assert "Fresh Account" not in md + + def test_cli_fresh_account_shown(self) -> None: + score = self._make_fresh_score(is_fresh=True) + out = format_cli_output(score, verbose=True) + assert "Fresh account" in out + assert "200 days old" in out + + def test_cli_fresh_account_hidden_non_verbose(self) -> None: + score = self._make_fresh_score(is_fresh=True) + out = format_cli_output(score, verbose=False) + assert "Fresh account" not in out + + def test_cli_fresh_account_hidden_not_fresh(self) -> None: + score = 
self._make_fresh_score(is_fresh=False) + out = format_cli_output(score, verbose=True) + assert "Fresh account" not in out + + def test_check_run_fresh_account_shown(self) -> None: + score = self._make_fresh_score(is_fresh=True) + _, summary = format_check_run_summary(score) + assert "Fresh Account" in summary + assert "200 days old" in summary + + def test_check_run_fresh_account_hidden(self) -> None: + score = self._make_fresh_score(is_fresh=False) + _, summary = format_check_run_summary(score) + assert "Fresh Account" not in summary + + def test_json_fresh_account_serialized(self) -> None: + score = self._make_fresh_score(is_fresh=True) + result = format_json(score) + parsed = json.loads(result) + assert parsed["fresh_account"]["is_fresh"] is True + assert parsed["fresh_account"]["account_age_days"] == 200 + + def test_json_fresh_account_none(self) -> None: + score = TrustScore( + user_login="u", context_repo="o/r", fresh_account=None + ) + result = format_json(score) + parsed = json.loads(result) + assert parsed["fresh_account"] is None diff --git a/tests/test_models.py b/tests/test_models.py index 4dac06c..98611fe 100644 --- a/tests/test_models.py +++ b/tests/test_models.py @@ -5,6 +5,7 @@ from datetime import UTC, datetime, timedelta from good_egg.models import ( + FreshAccountAdvisory, MergedPR, RepoMetadata, TrustLevel, @@ -156,6 +157,58 @@ def test_backward_compat_no_new_fields(self) -> None: assert score.scoring_model == "v1" assert score.component_scores == {} + def test_fresh_account_field_default_none(self) -> None: + score = TrustScore(user_login="u", context_repo="o/r") + assert score.fresh_account is None + + def test_fresh_account_field_set(self) -> None: + advisory = FreshAccountAdvisory( + is_fresh=True, + account_age_days=100, + created_at=datetime(2025, 12, 1, tzinfo=UTC), + ) + score = TrustScore( + user_login="u", + context_repo="o/r", + fresh_account=advisory, + ) + assert score.fresh_account is not None + assert score.fresh_account.is_fresh 
is True + assert score.fresh_account.account_age_days == 100 + assert score.fresh_account.threshold_days == 365 + def test_top_contributions(self, sample_trust_score: TrustScore) -> None: assert len(sample_trust_score.top_contributions) == 2 assert sample_trust_score.top_contributions[0].repo_name == "elixir-lang/elixir" + + +class TestFreshAccountAdvisory: + def test_construction_fresh(self) -> None: + advisory = FreshAccountAdvisory(is_fresh=True, account_age_days=100) + assert advisory.is_fresh is True + assert advisory.account_age_days == 100 + assert advisory.created_at is None + assert advisory.threshold_days == 365 + + def test_construction_not_fresh(self) -> None: + advisory = FreshAccountAdvisory(is_fresh=False, account_age_days=500) + assert advisory.is_fresh is False + + def test_construction_with_created_at(self) -> None: + dt = datetime(2025, 6, 1, tzinfo=UTC) + advisory = FreshAccountAdvisory( + is_fresh=True, account_age_days=100, created_at=dt + ) + assert advisory.created_at == dt + + def test_serialization_roundtrip(self) -> None: + import json + advisory = FreshAccountAdvisory( + is_fresh=True, + account_age_days=200, + created_at=datetime(2025, 9, 1, tzinfo=UTC), + ) + data = json.loads(advisory.model_dump_json()) + restored = FreshAccountAdvisory(**data) + assert restored.is_fresh is True + assert restored.account_age_days == 200 diff --git a/tests/test_scorer.py b/tests/test_scorer.py index accbe6c..dc252f4 100644 --- a/tests/test_scorer.py +++ b/tests/test_scorer.py @@ -20,6 +20,8 @@ def _make_config(**overrides: object) -> GoodEggConfig: + if "scoring_model" not in overrides: + overrides["scoring_model"] = "v1" return GoodEggConfig(**overrides) # type: ignore[arg-type] @@ -659,7 +661,7 @@ async def test_skip_when_contributor_exists( mock_client.__aexit__ = AsyncMock(return_value=False) mock_client_cls.return_value = mock_client - config = GoodEggConfig(skip_known_contributors=True) + config = GoodEggConfig(skip_known_contributors=True, 
scoring_model="v1") result = await score_pr_author( login="testuser", repo_owner="my-org", @@ -690,7 +692,7 @@ async def test_no_skip_when_count_is_zero( mock_client.__aexit__ = AsyncMock(return_value=False) mock_client_cls.return_value = mock_client - config = GoodEggConfig(skip_known_contributors=True) + config = GoodEggConfig(skip_known_contributors=True, scoring_model="v1") result = await score_pr_author( login="testuser", repo_owner="my-org", @@ -716,7 +718,7 @@ async def test_force_score_bypasses_check( mock_client.__aexit__ = AsyncMock(return_value=False) mock_client_cls.return_value = mock_client - config = GoodEggConfig(skip_known_contributors=False) + config = GoodEggConfig(skip_known_contributors=False, scoring_model="v1") result = await score_pr_author( login="testuser", repo_owner="my-org", @@ -728,3 +730,221 @@ async def test_force_score_bypasses_check( assert result.trust_level != TrustLevel.EXISTING_CONTRIBUTOR mock_client.check_existing_contributor.assert_not_called() mock_client.get_user_contribution_data.assert_called_once() + + @pytest.mark.asyncio + @patch("good_egg.github_client.GitHubClient") + async def test_existing_contributor_has_no_fresh_account( + self, mock_client_cls: AsyncMock + ) -> None: + """Existing contributor early return should have fresh_account=None.""" + mock_client = AsyncMock() + mock_client.check_existing_contributor = AsyncMock(return_value=3) + mock_client.__aenter__ = AsyncMock(return_value=mock_client) + mock_client.__aexit__ = AsyncMock(return_value=False) + mock_client_cls.return_value = mock_client + + config = GoodEggConfig(skip_known_contributors=True, scoring_model="v1") + result = await score_pr_author( + login="testuser", + repo_owner="my-org", + repo_name="my-repo", + config=config, + token="fake-token", + ) + + assert result.trust_level == TrustLevel.EXISTING_CONTRIBUTOR + assert result.fresh_account is None + + +class TestV3Scoring: + def test_v3_merge_rate_as_score(self) -> None: + config = 
GoodEggConfig(scoring_model="v3") + scorer = TrustScorer(config) + prs, repos = _sample_prs_and_repos() + # 3 merged + 2 closed = 3/5 = 0.6 + data = _make_contribution_data(merged_prs=prs, repos=repos, closed_pr_count=2) + result = scorer.score(data, "my-org/my-app") + + assert abs(result.raw_score - 0.6) < 1e-9 + assert abs(result.normalized_score - 0.6) < 1e-9 + assert result.scoring_model == "v3" + + def test_v3_component_scores_shape(self) -> None: + config = GoodEggConfig(scoring_model="v3") + scorer = TrustScorer(config) + prs, repos = _sample_prs_and_repos() + data = _make_contribution_data(merged_prs=prs, repos=repos, closed_pr_count=5) + result = scorer.score(data, "my-org/my-app") + + assert list(result.component_scores.keys()) == ["merge_rate"] + assert 0.0 <= result.component_scores["merge_rate"] <= 1.0 + + def test_v3_no_graph_metadata(self) -> None: + config = GoodEggConfig(scoring_model="v3") + scorer = TrustScorer(config) + prs, repos = _sample_prs_and_repos() + data = _make_contribution_data(merged_prs=prs, repos=repos, closed_pr_count=5) + result = scorer.score(data, "my-org/my-app") + + assert "graph_nodes" not in result.scoring_metadata + assert "graph_edges" not in result.scoring_metadata + assert "closed_pr_count" in result.scoring_metadata + + def test_v3_trust_level_classification(self) -> None: + config = GoodEggConfig(scoring_model="v3") + scorer = TrustScorer(config) + prs, repos = _sample_prs_and_repos() + # 3 merged + 0 closed = 100% merge rate -> HIGH + data = _make_contribution_data(merged_prs=prs, repos=repos, closed_pr_count=0) + result = scorer.score(data, "my-org/my-app") + + assert result.trust_level == TrustLevel.HIGH + + def test_v3_low_merge_rate(self) -> None: + config = GoodEggConfig(scoring_model="v3") + scorer = TrustScorer(config) + prs = [ + MergedPR( + repo_name_with_owner="org/repo", + title="PR", + merged_at=datetime.now(UTC) - timedelta(days=1), + ), + ] + repos = { + "org/repo": RepoMetadata( + 
name_with_owner="org/repo",
+                stargazer_count=100,
+            ),
+        }
+        # 1 merged + 10 closed = 1/11 ~ 0.09 -> LOW
+        data = _make_contribution_data(merged_prs=prs, repos=repos, closed_pr_count=10)
+        result = scorer.score(data, "org/repo")
+
+        assert result.trust_level == TrustLevel.LOW
+
+    def test_v3_is_default_model(self) -> None:
+        config = GoodEggConfig()
+        assert config.scoring_model == "v3"
+
+    def test_v3_bot_short_circuit(self) -> None:
+        config = GoodEggConfig(scoring_model="v3")
+        scorer = TrustScorer(config)
+        data = _make_contribution_data(is_bot=True)
+        result = scorer.score(data, "org/repo")
+
+        assert result.trust_level == TrustLevel.BOT
+        assert result.scoring_model == "v3"
+
+    def test_v3_insufficient_data_short_circuit(self) -> None:
+        config = GoodEggConfig(scoring_model="v3")
+        scorer = TrustScorer(config)
+        data = _make_contribution_data(merged_prs=[])
+        result = scorer.score(data, "org/repo")
+
+        assert result.trust_level == TrustLevel.UNKNOWN
+        assert result.scoring_model == "v3"
+
+    def test_v3_all_merged_no_closed(self) -> None:
+        """Edge case: merged PRs with closed_pr_count=0 give a merge rate of 1.0."""
+        config = GoodEggConfig(scoring_model="v3")
+        scorer = TrustScorer(config)
+        prs = [
+            MergedPR(
+                repo_name_with_owner="org/repo",
+                title="PR",
+                merged_at=datetime.now(UTC) - timedelta(days=1),
+            ),
+        ]
+        repos = {
+            "org/repo": RepoMetadata(
+                name_with_owner="org/repo",
+                stargazer_count=100,
+            ),
+        }
+        data = _make_contribution_data(merged_prs=prs, repos=repos, closed_pr_count=0)
+        result = scorer.score(data, "org/repo")
+
+        assert result.raw_score == 1.0
+        assert result.normalized_score == 1.0
+
+
+class TestFreshAccountAdvisory:
+    def test_fresh_account_flagged_under_365(self) -> None:
+        config = GoodEggConfig(scoring_model="v3")
+        scorer = TrustScorer(config)
+        prs, repos = _sample_prs_and_repos()
+        data = _make_contribution_data(
+            days_old=200, merged_prs=prs, repos=repos
+        )
+        result = scorer.score(data, "org/repo")
+
+        assert result.fresh_account is not None
+        assert result.fresh_account.is_fresh is True
+        assert result.fresh_account.account_age_days == 200
+        assert result.fresh_account.threshold_days == 365
+
+    def test_fresh_account_not_flagged_over_365(self) -> None:
+        config = GoodEggConfig(scoring_model="v3")
+        scorer = TrustScorer(config)
+        prs, repos = _sample_prs_and_repos()
+        data = _make_contribution_data(
+            days_old=500, merged_prs=prs, repos=repos
+        )
+        result = scorer.score(data, "org/repo")
+
+        assert result.fresh_account is not None
+        assert result.fresh_account.is_fresh is False
+
+    def test_fresh_account_boundary_exactly_365(self) -> None:
+        config = GoodEggConfig(scoring_model="v3")
+        scorer = TrustScorer(config)
+        prs, repos = _sample_prs_and_repos()
+        data = _make_contribution_data(
+            days_old=365, merged_prs=prs, repos=repos
+        )
+        result = scorer.score(data, "org/repo")
+
+        assert result.fresh_account is not None
+        assert result.fresh_account.is_fresh is False
+
+    def test_fresh_account_none_on_bot_short_circuit(self) -> None:
+        config = GoodEggConfig(scoring_model="v3")
+        scorer = TrustScorer(config)
+        data = _make_contribution_data(is_bot=True, days_old=100)
+        result = scorer.score(data, "org/repo")
+
+        assert result.trust_level == TrustLevel.BOT
+        assert result.fresh_account is None
+
+    def test_fresh_account_on_insufficient_data(self) -> None:
+        config = GoodEggConfig(scoring_model="v3")
+        scorer = TrustScorer(config)
+        data = _make_contribution_data(merged_prs=[], days_old=100)
+        result = scorer.score(data, "org/repo")
+
+        assert result.trust_level == TrustLevel.UNKNOWN
+        assert result.fresh_account is not None
+        assert result.fresh_account.is_fresh is True
+
+    def test_fresh_account_populated_in_v1(self) -> None:
+        scorer = TrustScorer(_make_config())
+        prs, repos = _sample_prs_and_repos()
+        data = _make_contribution_data(
+            days_old=200, merged_prs=prs, repos=repos
+        )
+        result = scorer.score(data, "org/repo")
+
+        assert result.fresh_account is not None
+        assert result.fresh_account.is_fresh is True
+
+    def test_fresh_account_populated_in_v2(self) -> None:
+        config = GoodEggConfig(scoring_model="v2")
+        scorer = TrustScorer(config)
+        prs, repos = _sample_prs_and_repos()
+        data = _make_contribution_data(
+            days_old=200, merged_prs=prs, repos=repos
+        )
+        result = scorer.score(data, "org/repo")
+
+        assert result.fresh_account is not None
+        assert result.fresh_account.is_fresh is True
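
Reviewer note (not part of the patch): the v3 "Diet Egg" rule and the fresh-account advisory added above reduce to a few lines of arithmetic. A minimal standalone sketch, with hypothetical helper names that do not exist in the codebase:

```python
from dataclasses import dataclass


@dataclass
class DietEggResult:
    """Illustrative container mirroring the two v3 signals (not the real TrustScore)."""
    merge_rate: float
    is_fresh: bool


def diet_egg_score(
    merged_count: int,
    closed_count: int,
    account_age_days: int,
    fresh_threshold_days: int = 365,
) -> DietEggResult:
    """Sketch of v3 scoring: all-time merge rate as the sole score, plus an
    informational fresh-account advisory that does not affect the score."""
    total = merged_count + closed_count
    # Guard the zero-denominator case (no PRs at all), as _score_v3 does.
    merge_rate = merged_count / total if total > 0 else 0.0
    # Strictly under the threshold counts as fresh; exactly 365 days does not.
    is_fresh = account_age_days < fresh_threshold_days
    return DietEggResult(merge_rate=merge_rate, is_fresh=is_fresh)
```

For example, 3 merged plus 2 closed PRs give a merge rate of 0.6, and a 200-day-old account is flagged fresh while a 365-day-old one is not, matching the boundary test in the patch.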