Phase 9: Harmonise - Refactor Harmonise Phase to Support Polars-Based Processing #495

@lakshmi-kovvuri1

Description

Overview
PatchPhase now emits its output as a Polars LazyFrame, but HarmonisePhase still uses the legacy implementation and cannot consume that output. To keep the pipeline consistent and efficient, HarmonisePhase must be moved to a Polars-based implementation. The refactor should preserve existing behaviour while modernising the underlying processing.

Tech Approach

  • Update digital_land/phase/HarmonisePhase.py so it delegates to the Polars HarmonisePhase implementation.
  • Implement the full HarmonisePhase logic inside digital_land/phase_polars/transform/HarmonisePhase.py.
  • Convert existing HarmonisePhase rules into Polars lazy transformations.
  • Ensure HarmonisePhase accepts the LazyFrame produced by the new PatchPhase step.
  • Keep the same public interface so no downstream code needs changing.

Acceptance Criteria/Tests

  • HarmonisePhase accepts the LazyFrame produced by PatchPhase without errors.
  • Legacy HarmonisePhase behaviour is reproduced using Polars.
  • Outputs match the legacy HarmonisePhase phase (apart from expected improvements).
  • No Pandas or row‑based Python operations are used.
  • Test cases are added for the Polars implementation.
  • Refactored code complies with project formatting and style standards (black, flake8, PEP 8).

Resourcing & Dependencies
Depends on the Polars-based PatchPhase; no external teams are involved.

Metadata

Status

In Review / QA 🔎