
Conversation

@dklawren
Contributor

No description provided.

Copilot AI left a comment

Pull request overview

Adds a comprehensive pytest-based test suite and GitHub Actions workflow to automatically run linting, unit tests, and integration tests for the GitHub ETL pipeline.

Changes:

  • Added a large pytest test suite covering main.py functions (extraction, transformation, loading, orchestration).
  • Added pytest/coverage configuration and a testing guide documenting local + CI workflows.
  • Added a GitHub Actions workflow to run linting/tests on pull requests; expanded dependencies to include lint/format tools.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| test_main.py | New comprehensive unit/integration test suite for main.py. |
| requirements.txt | Adds dev tooling dependencies (black/flake8/mypy/isort) alongside existing test deps. |
| pytest.ini | Configures pytest discovery, verbosity, and coverage reporting. |
| TESTING.md | Documents how to run tests/linting locally and in CI, plus docker-based integration testing. |
| .github/workflows/tests.yml | Adds CI workflow for linting, pytest runs (unit + all), coverage artifacts, and docker-compose integration job. |

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.


Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

main.py:536

  • GITHUB_REPOS is split on commas but entries aren’t stripped/validated. Values like "owner/repo, owner/repo" (note the space) or a trailing comma will produce repo strings with leading whitespace or empty entries, which will break API URLs. Consider stripping whitespace and filtering out empty repo names before iterating.
    github_repos = []
    github_repos_str = os.getenv("GITHUB_REPOS")
    if github_repos_str:
        github_repos = github_repos_str.split(",")
    else:
        raise SystemExit(
            "Environment variable GITHUB_REPOS is required (format: 'owner/repo,owner/repo')"
        )
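One way to address this could look like the sketch below. The helper names `parse_repos` and `load_repos` are hypothetical and do not appear in main.py; the error message is copied from the existing code. Note one behavioral assumption: a value such as "," would now be treated as missing rather than producing empty repo names.

```python
import os


def parse_repos(raw):
    """Split a comma-separated repo list, dropping whitespace and empty entries."""
    return [entry.strip() for entry in (raw or "").split(",") if entry.strip()]


def load_repos(env=os.environ):
    """Read GITHUB_REPOS from the given mapping and exit if no repo remains."""
    repos = parse_repos(env.get("GITHUB_REPOS"))
    if not repos:
        raise SystemExit(
            "Environment variable GITHUB_REPOS is required (format: 'owner/repo,owner/repo')"
        )
    return repos
```

Passing the environment as a parameter is a design choice made here so that a test can supply a plain dict instead of patching `os.environ`.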

Member

@cgsheeh left a comment

I haven't looked over the tests in full yet, but wanted to leave comments about the higher-level items I noticed off the bat.

@dklawren requested a review from @cgsheeh on January 23, 2026 23:22
Member

@cgsheeh left a comment

There is a lot of content in this PR, so it might take a few iterations to review it fully.

We could land some of this code faster if we split it into smaller chunks. For example, we could add linting support and tests in one PR, and deal with the tests themselves in another.

requires-python = ">=3.14"
license = {text = "MPL-2.0"}
authors = [
{name = "Mozilla", email = "dev-platform@lists.mozilla.org"}
Member

Suggested change
{name = "Mozilla", email = "dev-platform@lists.mozilla.org"}
{name = "Mozilla", email = "conduit-team@mozilla.org"}



def test_ruff():
passed = subprocess.call(("ruff", "check", "main.py", "--target-version", "py314"))
Member

You can remove --target-version; ruff will infer it from requires-python in pyproject.toml.

Also, you can specify the root of the repo to ensure we lint all files going forward.

Suggested change
passed = subprocess.call(("ruff", "check", "main.py", "--target-version", "py314"))
passed = subprocess.call(("ruff", "check", "."))
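As a small illustration of how the exit code from such a call could be turned into a pass/fail result: `subprocess.call` returns the process exit status, and `ruff check` exits non-zero when it finds violations. The helper name `lint_passes` is hypothetical, and the current Python interpreter stands in for the linter so the snippet runs even where ruff is not installed.

```python
import subprocess
import sys


def lint_passes(cmd):
    """Run a lint command and report whether it exited with status 0."""
    return subprocess.call(cmd) == 0


# Stand-in commands: one process that exits cleanly and one that fails.
clean = lint_passes((sys.executable, "-c", "raise SystemExit(0)"))
dirty = lint_passes((sys.executable, "-c", "raise SystemExit(1)"))
```

Inside test_ruff, the check could then read `assert lint_passes(("ruff", "check", "."))`.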



def test_black():
cmd = ("black", "--diff", "main.py")
Member

Suggested change
cmd = ("black", "--diff", "main.py")
cmd = ("black", "--diff", ".")


import main

# =============================================================================
Member

We should create a tests/ directory for all the test files, with a conftest.py file in it where we define all the fixtures. This is the conventional layout for testing Python with pytest.
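A minimal sketch of what a tests/conftest.py could contain, assuming pytest is installed. The fixture name `sample_commit`, the helper `build_sample_commit`, and the dictionary fields are all hypothetical and not taken from this PR.

```python
# tests/conftest.py (hypothetical contents)
import pytest


def build_sample_commit(sha="abc123", author="octocat"):
    """Return a minimal commit-like dict; the field names are illustrative only."""
    return {"sha": sha, "author": author, "message": "Initial commit"}


@pytest.fixture
def sample_commit():
    """Fixture that any test under tests/ can request by argument name."""
    return build_sample_commit()
```

A test in a file such as tests/test_extract_commits.py could then be declared as `def test_something(sample_commit): ...`, and pytest would supply the fixture without an explicit import.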

# =============================================================================
# TESTS FOR EXTRACT_COMMITS
# =============================================================================

Member

Is this supposed to be a new test? Missing def test_name(): here if so.

Can you remove these long # ===== page separators? If you feel they are necessary, you should instead split these tests out into their own file like test_extract_commits.py.

@@ -0,0 +1,621 @@
# Testing Guide for GitHub ETL
Member

This file reads like an AI-generated description of the PR. Let's remove it, or update it to include only instructions on how to run the various testing commands. A doc that says "here are the tools we use and our coverage numbers" isn't very useful and is likely to bitrot quickly: it still references isort, mypy and others even though we have already switched to ruff.
