[Self-hosted] Missing base commit on main despite earlier covered ancestors (monorepo, batched merges) #860

@nikosatwork

Describe the bug

On a self-hosted Codecov instance used for a monorepo hosted on GitHub Enterprise, coverage for main (the default branch) sometimes shows “Missing base commit” even though:

  • Coverage has been uploaded for every main batch head (via a set of CI jobs plus carryforward for many flags).
  • There are earlier main head commits with coverage in Codecov that should be valid ancestors and usable as base.

Example sequence of main heads (all on the same branch, merged via batches):

  • ff5da5cc490e101779515fe1041a3620d3691b02 – shows correctly, base commit found
  • 5c1172b1e4cf9ce1df18a1180afdc95d10d1b2d6 – shows correctly, base commit found
  • 2cab47a0f48e29c2ac5e12275a22a9e0eff3738a – “Missing base commit” (unexpected)
  • 5ff4d21d4551b7510115773080d2ad13046b1b55 – “Missing base commit” (unexpected)

For the commit 2cab47a0f4... (and similarly for 5ff4d21d45…), worker logs show:

{"message": "Unable to find a parent commit that was properly found on Github", "commit": "2cab47a0f48e29c2ac5e12275a22a9e0eff3738a", "repoid": 15, "context": {"task_name": "app.tasks.commit_update.CommitUpdate"}}
{"message": "No parent commit was found to be carriedforward from", "commit": "2cab47a0f48e29c2ac5e12275a22a9e0eff3738a", "repoid": 15, "context": {"task_name": "app.tasks.upload.PreProcessUpload"}, "parent_tracing": []}
{"message": "Could not find parent for possible carryforward", "commit": "2cab47a0f48e29c2ac5e12275a22a9e0eff3738a", "repoid": 15}
{"message": "Neither the original nor updated base commit are known", "commit": "2cab47a0f48e29c2ac5e12275a22a9e0eff3738a", "repoid": 15, "context": {"task_name": "app.tasks.notify.Notify"}}

This is unexpected because there are earlier main branch commits (5c1172b…, ff5da5c…) with coverage that should be valid ancestors on the same branch, and we would expect Codecov to walk back and use the last covered ancestor as base.
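
To make it easier to see which tasks complain about which commit, the worker log lines above can be grouped by SHA. This is only a minimal sketch for triage: it assumes the worker emits one JSON object per line with the "message", "commit" and optional "context.task_name" fields shown above, and the helper name messages_by_commit is hypothetical.

```python
import json
from collections import defaultdict


def messages_by_commit(log_lines):
    """Group worker log messages by commit SHA, keeping the emitting task name."""
    grouped = defaultdict(list)
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines mixed into the log stream
        if not isinstance(record, dict):
            continue
        commit = record.get("commit")
        if commit is None:
            continue
        # Some records (like the third line above) carry no "context" object.
        task = record.get("context", {}).get("task_name", "unknown")
        grouped[commit].append((task, record.get("message", "")))
    return dict(grouped)
```

Feeding it the four lines above would yield a single key, 2cab47a0f48e29c2ac5e12275a22a9e0eff3738a, with one entry per task, which makes it easy to diff a "good" head against a "Missing base commit" head.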

Environment (please complete the following information):

  • Codecov deployment: Self-hosted Codecov on Kubernetes (GKE) using official self-hosted-* images
  • SCM: GitHub Enterprise (internal, “GitHub.CDS”)
  • Branch model: main-based, large batches landing into main (e.g., 10–60 PRs per batch)
  • CI system: Internal CI system (runs coverage jobs and uses Codecov CLI to upload)
  • Codecov version: 26.1.20, and all versions prior to this
  • Codecov CLI version: 11.2.4, and all versions prior to this
  • Codecov configuration:
codecov:
  max_report_age: false
  require_ci_to_pass: false
  allow_coverage_offsets: true
  notify:
    wait_for_ci: false

coverage:
  precision: 2
  round: down
  range: "50...70"
  status:
    project: false
    patch: false
    default_rules:
      flag_coverage_not_uploaded_behavior: exclude

comment: false

flag_management:
  default_rules:
    carryforward: true

To Reproduce

Steps to reproduce the behavior:

This is not a one-click UI reproduction; it shows up in a main-based monorepo with batched merges and multiple coverage flags. A simplified reproduction flow:

  1. On GitHub Enterprise, use a main-based workflow in which you regularly merge batches of many PRs into the default branch (main).
  2. For each batch landing into main:
    • Run a set of jobs determined by the files changed in the batch.
    • Each batch head runs coverage for ~5–10 flags and relies on flag_management.default_rules.carryforward: true to bring forward coverage for ~250 other flags from prior commits.
    • Upload coverage to the self-hosted Codecov instance with the Codecov CLI, pointing at the batch head SHA and the main branch.
  3. Allow this to run across multiple main heads, ensuring that Codecov shows coverage for at least some early main heads.
  4. In the Codecov UI, open the coverage page for each of those main commits and/or inspect the notifications/status.
    Observe that some will report “Missing base commit”, despite earlier commits on the same branch having coverage.

Simultaneously, in the worker logs for the problematic commits, observe:

  • Unable to find a parent commit that was properly found on Github
  • No parent commit was found to be carriedforward from
  • parent_tracing: []
  • Neither the original nor updated base commit are known

Expected behavior

Given that:

  • main heads ff5da5cc… and 5c1172b1e… have coverage recorded in Codecov for the same repo and branch, and
  • later main heads (2cab47a…, 5ff4d21d…) are in the same Git ancestry and also upload coverage,

we expect Codecov to:

  • Use the immediate parent as base when it has coverage (e.g., 5c1172b1e… as base for 2cab47a…, and 2cab47a… as base for 5ff4d21d4…), or
  • Failing that, walk back through the commit ancestry and find the closest ancestor with coverage (e.g., 5c1172b1e… as base for 5ff4d21d4…),

rather than reporting “Missing base commit” when covered ancestor commits exist on the same branch.

In other words, for pushes to the main branch in a monorepo with frequent batched merges and partial-per-batch coverage jobs, we expect Codecov to reliably find and use the last known main commit with coverage as base (following Git ancestry), as long as coverage exists and metadata is consistent.
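
The expectation above can be written down as a short sketch. This is not Codecov's actual base-selection code, which we have not inspected; it only illustrates the behavior we expect. The function name and the `parents` / `covered` inputs are hypothetical: `parents` would be built from Git (for example from `git rev-list --parents main`), and `covered` would be the set of SHAs for which Codecov holds a report.

```python
from collections import deque


def nearest_covered_ancestor(commit, parents, covered):
    """Return the closest ancestor of `commit` that has coverage, or None.

    parents: dict mapping a SHA to the list of its parent SHAs
             (first parent first, as Git reports them).
    covered: set of SHAs for which a coverage report exists.
    """
    queue = deque(parents.get(commit, []))
    seen = set(queue)
    while queue:
        sha = queue.popleft()
        if sha in covered:
            return sha  # immediate parent if covered, otherwise the nearest one
        for parent in parents.get(sha, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return None  # only in this case would "Missing base commit" be expected
```

The breadth-first walk returns the immediate parent when it has coverage and otherwise the ancestor reachable in the fewest steps, so under this model a head is left without a base only when no ancestor at all has coverage.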

Additional context

  • Repo is extremely large; each batch can merge 10–60 PRs.
  • Each main head only runs a subset of coverage jobs; many flags are carried forward using flag_management.default_rules.carryforward: true.
  • The pattern is:
    • Every main head runs some coverage jobs and uploads coverage.
    • Most flags are carried forward; ~5–10 flags are newly uploaded for that head.
    • However, only some main heads get a valid base commit; others intermittently show “Missing base commit”.
  • The worker logs suggest Codecov tries to find a parent with coverage but ends up with parent_tracing: [] even though older main commits with coverage exist, which makes it look like:
    • Either the ancestry walk is failing in some cases, or
    • Carryforward / partial coverage per head interacts with base selection in a way that causes Codecov to treat some commits as having no usable covered ancestor, despite coverage being present on earlier main heads.

Attachments

I am attaching the following files to illustrate the main history and context:

  • main-full-graph.txt, output of git --no-pager log --graph --oneline --decorate --max-count=1000 main, showing the full commit graph around the relevant main heads.
  • main-mainline.txt, output of git --no-pager log --graph --oneline --decorate --first-parent --max-count=120 main, showing the mainline (first-parent) history of main heads where batches landed.

These files are obfuscated to avoid exposing commit messages but preserve SHAs and structure.

main-full-graph.txt
main-mainline.txt

Questions

  1. Is this behavior expected with our configuration (self-hosted + flag_management.default_rules.carryforward: true + partial coverage per main head)?
  2. Under what conditions does Codecov decide that no parent can be used for carryforward, resulting in parent_tracing: [], even when earlier main commits have coverage?
  3. Is there a recommended configuration or best practice for large batched workflows like this to ensure that:
    • Every main head uses the last covered main ancestor as base, and
    • “Missing base commit” does not occur as long as at least one earlier main commit has coverage?
  4. Would explicitly passing a base via the CLI (e.g. using --parent-sha <known_covered_ancestor_sha>) be a supported and recommended workaround in this scenario, assuming the specified parent commit has coverage in Codecov? If so, are there any caveats or best practices for using --parent-sha in a large batched-merge monorepo like ours?
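
For question 4, this is roughly how we would pick the value for --parent-sha in CI. It is a sketch under stated assumptions: that the CLI's create-commit subcommand accepts --commit-sha, --branch and --parent-sha (worth confirming with `codecovcli create-commit --help` on the installed version), that `first_parent_history` is the newest-first output of `git rev-list --first-parent main`, and that `covered` is a set of SHAs we already know have coverage. The helper name is hypothetical, and the token and self-hosted URL options are omitted.

```python
def create_commit_args(head_sha, first_parent_history, covered, branch="main"):
    """Build a `codecovcli create-commit` argument list with an explicit parent.

    first_parent_history: SHAs, newest first, as printed by
        git rev-list --first-parent <branch>
    covered: set of SHAs known to have coverage in Codecov.
    """
    args = ["codecovcli", "create-commit",
            "--commit-sha", head_sha,
            "--branch", branch]
    # Only commits older than the head are candidate parents.
    older = list(first_parent_history)
    if head_sha in older:
        older = older[older.index(head_sha) + 1:]
    for sha in older:
        if sha in covered:
            args += ["--parent-sha", sha]  # closest covered mainline ancestor
            break
    return args  # no --parent-sha is added when nothing older is covered
```

The resulting list could be passed to `subprocess.run(args, check=True)` before the upload step. Whether overriding the parent this way is supported, and how it interacts with carryforward, is exactly what we would like confirmed.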
