Bug triage results: 2026-05-18
Triage pass over the open requires-triage queue, per the project Bug Triage Guide.
- Total issues processed: 20
- Labels applied to: 19
- Skipped: 1
- priority:high: 2
- priority:medium: 10
- priority:low: 7
Labels have already been applied and requires-triage removed from each issue listed under "Triaged". A reviewer should spot-check the calls and close this issue when satisfied. To correct a label, edit the affected issue directly.
Triaged
priority:high
- AbstractMethodError: CometBroadcastExchangeExec missing sparkContext() from BroadcastExchangeLike (#4318)
- Area labels: none
- Rationale: AbstractMethodError thrown on a supported code path (Comet 0.16 + Spark 3.5.6 broadcast joins); per the guide, an unhandled exception on a supported path is priority:high.
- Windows crash if frame overflow (#4307)
- Area labels: area:expressions
- Rationale: Native engine throws CometNativeException on a supported window-function query (the 18446744073709551615 index points to a u64 underflow in frame computation); a native crash on a supported path is priority:high.
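The index 18446744073709551615 cited for #4307 equals 2^64 - 1, the value an unsigned 64-bit subtraction wraps to when it goes one below zero. The sketch below only illustrates that arithmetic; the helper name and the "offset before index 0" scenario are assumptions for illustration, not the actual frame-computation code from the issue.

```python
U64_MODULUS = 2**64  # number of distinct values in an unsigned 64-bit integer


def wrapping_sub_u64(a: int, b: int) -> int:
    """Subtract b from a with u64 wraparound, as an unchecked native subtraction would."""
    return (a - b) % U64_MODULUS


# Hypothetical case: a frame bound computed as "one row before index 0".
underflowed_index = wrapping_sub_u64(0, 1)
print(underflowed_index)  # 18446744073709551615
```

An index of exactly this value in an error message is therefore a strong hint that a bound was decremented past zero rather than that a genuinely huge offset was requested.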
priority:medium
- Allocate Comet's parquet reader buffers from ArrowUtils.rootAllocator to enable zero-copy PyArrow UDF runner (#4294)
- Area labels: area:scan, area:ffi
- Rationale: Performance optimization for the columnar Python runner with a working bulk-copy fallback today; matches the guide's "performance regression with workaround" criterion.
- [FEATURE] Native scan support for VariantType columns (Iceberg + Spark 4.0) (#4295)
- Area labels: area:scan, native_iceberg_compat, spark 4.0
- Rationale: Missing native VariantType support causes whole-query fallback; functional gap with Spark fallback workaround is priority:medium.
- Implement JVM UDFs for all date/time expressions (#4311)
- Area labels: area:expressions
- Rationale: Compatibility-feature gap: replace native date/time expressions with JVM UDFs for full Spark parity; functional gap with workaround is priority:medium.
- Add support for native custom scalar UDFs (#4312)
- Area labels: area:expressions
- Implement JVM UDFs for JSON expressions (#4313)
- Area labels: area:expressions
- Rationale: Adds full Spark-compatible JSON expression support via JVM UDFs; missing-feature gap with workaround is priority:medium.
- Writes to Apache Iceberg Tables (#4322)
- Area labels: area:writer, native_iceberg_compat
- Rationale: New Iceberg write path is a major feature gap with the existing Spark write path as workaround; matches priority:medium.
- Frequent CI failures for Spark 4.0.2 / JDK 21 (#4327)
- Area labels: area:ci, spark 4.0
- Rationale: A flaky CI test would normally be priority:low, but the title says "frequent" failures on the standard Spark 4.0.2 / JDK 21 build; per the guide's escalation rule for CI consistently blocking merges, escalated to priority:medium.
- Credential Provider Support (#4332)
- Area labels: area:scan, native_iceberg_compat
- Rationale: Missing pluggable credential provider for Iceberg scans (only static creds today); functional gap with workaround is priority:medium.
- Comet JVM UDF implementations cannot be created in spark module (#4336)
- Area labels: area:expressions
- Rationale: Module / shading structure prevents implementing UDFs that need spark-module access; broken feature with workaround (place UDFs in common) is priority:medium.
- Implement TimeType support (#4288)
- Area labels: area:expressions (existing: EPIC, spark 4.1)
- Rationale: Issue already carried priority:medium from a prior reviewer; this pass added area:expressions and removed requires-triage.
priority:low
- Make CI run on the contributor forks (#4289)
- Area labels: area:ci
- Rationale: CI infrastructure rework with no functional impact; matches the guide's priority:low "tooling" example.
- [DISCUSS] Simplify regex engine + incompatibility config model (#4310)
- Area labels: area:expressions
- Rationale: Refactor / config-UX discussion with no underlying functional bug; user experience polish is priority:low.
- Drop support for Spark 3.4 (#4329)
- Area labels: none
- Rationale: Project-policy / versioning discussion; tooling-and-process item maps to priority:low.
- Enable spark.comet.exec.localTableScan.enabled when running Spark SQL tests (#4347)
- Area labels: spark sql tests
- Rationale: Test-infrastructure tweak so SQL suites exercise more of Comet; test-only / tooling change is priority:low.
- native_datafusion: tests asserting parquet-mr's permissive overflow/narrowing behavior cannot be made to pass (#4352)
- Area labels: area:scan, spark sql tests (existing: native_datafusion)
- Rationale: Architectural test-only mismatch; the workaround is to re-ignore the affected Spark tests. Test-only with workaround is priority:low.
- native_datafusion (Spark 3.x): shim's ParquetSchemaConvert translation produces an extra SparkException cause-chain layer (#4354)
- Area labels: area:scan, native_datafusion
- Rationale: Behavior difference visible only in Spark SQL test cause-chain assertions; tests stay ignored as a workaround. Test-only failure is priority:low.
- Change UDF signature to use ColumnarValue rather than raw Arrow types (#4358)
- Area labels: area:expressions
- Rationale: Internal API refactor with no user-facing functional bug; matches priority:low for tooling/internal cleanup.
Escalations to consider
- Frequent CI failures for Spark 4.0.2 / JDK 21 (#4327)
- Escalated from priority:low (CI flake) to priority:medium per the guide's rule "A priority:low CI flake is blocking PR merges consistently → escalate to priority:medium". The reviewer should confirm whether these failures are in fact blocking merges; if not, downgrade to priority:low.
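The escalation rule quoted above reduces to a single condition, which a reviewer can apply once they know whether merges are actually being blocked. This is a minimal sketch of that one rule only; the function name and boolean parameter are hypothetical and are not part of the Bug Triage Guide or the triage skill.

```python
def resolve_ci_flake_priority(blocks_merges_consistently: bool) -> str:
    """Return the priority label for a CI flake under the quoted escalation rule."""
    # A flaky CI test starts at priority:low; it is escalated only when the
    # flake is consistently blocking PR merges.
    if blocks_merges_consistently:
        return "priority:medium"
    return "priority:low"


# For #4327: keep priority:medium if confirmed blocking, otherwise downgrade.
print(resolve_ci_flake_priority(True))
print(resolve_ci_flake_priority(False))
```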
Skipped — needs more info
- Bug triage results: 2026-05-11 (#4287)
- This is the previous triage's summary issue. It is a meta issue, not a bug or feature request, and per this skill's rules ("Do not add labels to the summary issue itself") it should not carry a priority. The reviewer should close it when finished spot-checking the prior pass; requires-triage was left in place since this skill does not modify summary issues.
Notes on label availability
spark 4 is listed as a pre-existing area indicator, but the repo only has versioned labels (spark 3.x, spark 4.0, spark 4.1, spark 4.2). Where applicable, the most specific existing version label was used (spark 4.0 for [FEATURE] Native scan support for VariantType columns (Iceberg + Spark 4.0) (#4295) and Frequent CI failures for Spark 4.0.2 / JDK 21 (#4327)). No new labels were created.