# Bug triage results: 2026-05-18

Triage pass over the open `requires-triage` queue, per the project [Bug Triage Guide](https://github.com/apache/datafusion-comet/blob/main/docs/source/contributor-guide/bug_triage.md).

- Total issues processed: 20
- Labels applied to: 19
- Skipped: 1
- `priority:high`: 2
- `priority:medium`: 10
- `priority:low`: 7

Labels have already been applied and `requires-triage` removed from each issue listed under "Triaged". A reviewer should spot-check the calls and close this issue when satisfied. To correct a label, edit the affected issue directly.

## Triaged

### priority:high

- AbstractMethodError: CometBroadcastExchangeExec missing sparkContext() from BroadcastExchangeLike ([#4318](https://github.com/apache/datafusion-comet/issues/4318))
  - Area labels: none
  - Rationale: `AbstractMethodError` thrown on a supported code path (Comet 0.16 + Spark 3.5.6 broadcast joins); per the guide, an unhandled exception on a supported path is `priority:high`.
- Windows crash if frame overflow ([#4307](https://github.com/apache/datafusion-comet/issues/4307))
  - Area labels: `area:expressions`
  - Rationale: Native engine throws `CometNativeException` on a supported window-function query (the `18446744073709551615` index points to a u64 underflow in frame computation); a native crash on a supported path is `priority:high`.
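
The value `18446744073709551615` cited for #4307 equals 2^64 - 1, which is what a 64-bit unsigned subtraction yields when it wraps below zero. The sketch below only illustrates that arithmetic; the helper name `wrapping_sub_u64` is hypothetical and is not taken from Comet's frame-computation code.

```python
# Illustration only: emulate wrapping u64 subtraction with Python integers.
U64_MASK = (1 << 64) - 1  # largest value a u64 can hold


def wrapping_sub_u64(a: int, b: int) -> int:
    """Subtract b from a the way a wrapping 64-bit unsigned integer would."""
    return (a - b) & U64_MASK


# Stepping one position before index 0 wraps around to the maximum u64 value,
# which matches the index reported in the issue.
print(wrapping_sub_u64(0, 1))  # 18446744073709551615
```

Whether the actual failure comes from a subtraction of this shape is for the reviewer of #4307 to confirm.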

### priority:medium

- Allocate Comet's parquet reader buffers from ArrowUtils.rootAllocator to enable zero-copy PyArrow UDF runner ([#4294](https://github.com/apache/datafusion-comet/issues/4294))
  - Area labels: `area:scan`, `area:ffi`
  - Rationale: Performance optimization for the columnar Python runner with a working bulk-copy fallback today; matches the guide's "performance regression with workaround" criterion.
- [FEATURE] Native scan support for VariantType columns (Iceberg + Spark 4.0) ([#4295](https://github.com/apache/datafusion-comet/issues/4295))
  - Area labels: `area:scan`, `native_iceberg_compat`, `spark 4.0`
  - Rationale: Without native VariantType support, the whole query falls back to Spark; a functional gap with a Spark-fallback workaround is `priority:medium`.
- Implement JVM UDFs for all date/time expressions ([#4311](https://github.com/apache/datafusion-comet/issues/4311))
  - Area labels: `area:expressions`
  - Rationale: Compatibility gap: the issue proposes replacing native date/time expressions with JVM UDFs to reach full Spark parity; a functional gap with a workaround is `priority:medium`.
- Add support for native custom scalar UDFs ([#4312](https://github.com/apache/datafusion-comet/issues/4312))
  - Area labels: `area:expressions`
  - Rationale: New user-facing feature for registering custom UDFs (prototype in PR #4283); missing feature with workaround (use Spark UDFs) is `priority:medium`.
- Implement JVM UDFs for JSON expressions ([#4313](https://github.com/apache/datafusion-comet/issues/4313))
  - Area labels: `area:expressions`
  - Rationale: Adds full Spark-compatible JSON expression support via JVM UDFs; missing-feature gap with workaround is `priority:medium`.
- Writes to Apache Iceberg Tables ([#4322](https://github.com/apache/datafusion-comet/issues/4322))
  - Area labels: `area:writer`, `native_iceberg_compat`
  - Rationale: New Iceberg write path is a major feature gap with the existing Spark write path as workaround; matches `priority:medium`.
- Frequent CI failures for Spark 4.0.2 / JDK 21 ([#4327](https://github.com/apache/datafusion-comet/issues/4327))
  - Area labels: `area:ci`, `spark 4.0`
  - Rationale: A flaky CI test would normally be `priority:low`, but the title says "frequent" failures on the standard Spark 4.0.2 / JDK 21 build; per the guide's escalation rule for CI consistently blocking merges, escalated to `priority:medium`.
- Credential Provider Support ([#4332](https://github.com/apache/datafusion-comet/issues/4332))
  - Area labels: `area:scan`, `native_iceberg_compat`
  - Rationale: Missing pluggable credential provider for Iceberg scans (only static creds today); functional gap with workaround is `priority:medium`.
- Comet JVM UDF implementations cannot be created in `spark` module ([#4336](https://github.com/apache/datafusion-comet/issues/4336))
  - Area labels: `area:expressions`
  - Rationale: Module / shading structure prevents implementing UDFs that need `spark`-module access; broken feature with workaround (place UDFs in `common`) is `priority:medium`.
- Implement TimeType support ([#4288](https://github.com/apache/datafusion-comet/issues/4288))
  - Area labels: `area:expressions` (existing: `EPIC`, `spark 4.1`)
  - Rationale: Issue already carried `priority:medium` from a prior reviewer; this pass added `area:expressions` and removed `requires-triage`.

### priority:low

- Make CI run on the contributor forks ([#4289](https://github.com/apache/datafusion-comet/issues/4289))
  - Area labels: `area:ci`
  - Rationale: CI infrastructure rework with no functional impact; matches the guide's `priority:low` "tooling" example.
- [DISCUSS] Simplify regex engine + incompatibility config model ([#4310](https://github.com/apache/datafusion-comet/issues/4310))
  - Area labels: `area:expressions`
  - Rationale: Refactor / config-UX discussion with no underlying functional bug; user experience polish is `priority:low`.
- Drop support for Spark 3.4 ([#4329](https://github.com/apache/datafusion-comet/issues/4329))
  - Area labels: none
  - Rationale: Project-policy / versioning discussion; tooling-and-process item maps to `priority:low`.
- Enable spark.comet.exec.localTableScan.enabled when running Spark SQL tests ([#4347](https://github.com/apache/datafusion-comet/issues/4347))
  - Area labels: `spark sql tests`
  - Rationale: Test-infrastructure tweak so SQL suites exercise more of Comet; test-only / tooling change is `priority:low`.
- native_datafusion: tests asserting parquet-mr's permissive overflow/narrowing behavior cannot be made to pass ([#4352](https://github.com/apache/datafusion-comet/issues/4352))
  - Area labels: `area:scan`, `spark sql tests` (existing: `native_datafusion`)
  - Rationale: Architectural test-only mismatch; the workaround is to re-ignore the affected Spark tests. Test-only with workaround is `priority:low`.
- native_datafusion (Spark 3.x): shim's ParquetSchemaConvert translation produces an extra SparkException cause-chain layer ([#4354](https://github.com/apache/datafusion-comet/issues/4354))
  - Area labels: `area:scan`, `native_datafusion`
  - Rationale: Behavior difference visible only in Spark SQL test cause-chain assertions; tests stay ignored as a workaround. Test-only failure is `priority:low`.
- Change UDF signature to use ColumnarValue rather than raw Arrow types ([#4358](https://github.com/apache/datafusion-comet/issues/4358))
  - Area labels: `area:expressions`
  - Rationale: Internal API refactor with no user-facing functional bug; matches `priority:low` for tooling/internal cleanup.

## Escalations to consider

- Frequent CI failures for Spark 4.0.2 / JDK 21 ([#4327](https://github.com/apache/datafusion-comet/issues/4327))
  - Escalated from `priority:low` (CI flake) to `priority:medium` per the guide's rule "A `priority:low` CI flake is blocking PR merges consistently → escalate to `priority:medium`". The reviewer should confirm whether these failures are in fact blocking merges; if not, downgrade to `priority:low`.
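
The escalation rule quoted above reduces to a single decision, sketched here for the reviewer. The function name `ci_flake_priority` and its boolean input are assumptions made for illustration; the guide states the rule in prose only.

```python
def ci_flake_priority(blocks_merges_consistently: bool) -> str:
    """Apply the quoted escalation rule to a flaky CI test (illustrative sketch)."""
    # A CI flake defaults to priority:low; consistently blocking PR merges
    # escalates it one level to priority:medium.
    return "priority:medium" if blocks_merges_consistently else "priority:low"


print(ci_flake_priority(True))   # priority:medium
print(ci_flake_priority(False))  # priority:low
```

For #4327 the input is the open question: the label stays at `priority:medium` only if the reviewer confirms that merges are being blocked.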

## Skipped — needs more info

- Bug triage results: 2026-05-11 ([#4287](https://github.com/apache/datafusion-comet/issues/4287))
  - This is the previous triage's summary issue. It is a meta issue, not a bug or feature request, and per this skill's rules ("Do not add labels to the summary issue itself") it should not carry a priority. The reviewer should close it when finished spot-checking the prior pass; `requires-triage` was left in place since this skill does not modify summary issues.

## Notes on label availability

- The triage guide lists `spark 4` as a pre-existing area indicator, but the repo only has versioned labels (`spark 3.x`, `spark 4.0`, `spark 4.1`, `spark 4.2`). Where applicable, the most specific existing version label was used (`spark 4.0` for #4295 and #4327). No new labels were created.
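
A minimal sketch of the "most specific existing version label" choice described above. The label set is copied from this note; the helper `resolve_spark_label` and its lookup order are hypothetical and do not describe how the triage tooling is actually implemented.

```python
from typing import Optional

# Versioned labels that exist in the repo, per the note above.
EXISTING_LABELS = {"spark 3.x", "spark 4.0", "spark 4.1", "spark 4.2"}


def resolve_spark_label(spark_version: str) -> Optional[str]:
    """Return the most specific existing label for a version such as '4.0.2'."""
    parts = spark_version.split(".")
    candidates = []
    if len(parts) >= 2:
        candidates.append(f"spark {parts[0]}.{parts[1]}")  # e.g. "spark 4.0"
    candidates.append(f"spark {parts[0]}.x")               # e.g. "spark 3.x"
    candidates.append(f"spark {parts[0]}")                 # e.g. "spark 4"
    for label in candidates:
        if label in EXISTING_LABELS:
            return label
    return None  # no match: do not create a new label


print(resolve_spark_label("4.0.2"))  # spark 4.0
print(resolve_spark_label("4"))      # None
```

The second call shows why a bare `spark 4` could not be applied: no label with that name exists, so an issue without a known minor version gets no version label.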


Bug triage results: 2026-05-18 #4359
