Remove psycopg2 and migrate the sync Postgres driver to psycopg3 #68453

@Dev-iL

Context

#67801 switches the async Postgres driver default from asyncpg to psycopg3 (postgresql+psycopg_async://) and makes psycopg[binary] a hard dependency of apache-airflow-providers-postgres. To keep that PR reviewable, psycopg2-binary was deliberately kept as a hard dependency and the sync engine still uses psycopg2 (postgresql+psycopg2://, made explicit in #68314). This issue tracks the deferred follow-up: removing psycopg2 entirely and serving both sync and async from psycopg3.

Why deferred (and why it should happen)

  • Keeping both drivers doubles the installed footprint for no long-term benefit; psycopg3 serves sync and async from a single driver.
  • SQLAlchemy 2.1 changes the default dialect for postgresql:// URLs from psycopg2 to psycopg3, so the sync migration naturally aligns with the SQLA 2.1 upgrade. Early experiments with that upgrade already surfaced breakage (e.g. in celery) from the implicit driver flip, which is why #68314 (Make PostgreSQL SQLAlchemy driver explicit (postgresql+psycopg2://)) laid the explicit-URL groundwork.

Work

  1. Migrate the sync metadata engine default to psycopg3 (postgresql+psycopg://).
  2. Remove psycopg2-binary from the hard dependencies of providers/postgres. This is a breaking change and requires a major provider version bump.
  3. Auto-upgrade legacy URLs: once psycopg2 is no longer guaranteed to be installed, postgresql:// (and postgresql+psycopg2://) connection URLs must be automatically mapped to postgresql+psycopg:// when psycopg2 is absent, so existing deployment configs keep working.
  4. Validate bulk-write paths before any of them moves to psycopg3: driver benchmarks showed markedly slower batch INSERT/COPY with psycopg3 (see #67801, Switch the default async Postgres driver from asyncpg to psycopg3).
  5. Update deployment docs and PostgresHook as applicable.
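
As a rough illustration of item 3, the URL mapping could look something like the sketch below. This is not existing Airflow code: the helper names (upgrade_legacy_url, psycopg2_installed) and the have_psycopg2 override are hypothetical, and a real implementation would more likely go through SQLAlchemy's URL handling than plain string splitting.

```python
from importlib.util import find_spec
from typing import Optional

# Hypothetical sketch only: names and structure are assumptions, not Airflow API.
LEGACY_SCHEMES = ("postgresql", "postgresql+psycopg2")
PSYCOPG3_SCHEME = "postgresql+psycopg"


def psycopg2_installed() -> bool:
    """Return True if the psycopg2 package is importable in this environment."""
    return find_spec("psycopg2") is not None


def upgrade_legacy_url(url: str, have_psycopg2: Optional[bool] = None) -> str:
    """Rewrite a legacy Postgres URL to the psycopg3 dialect if psycopg2 is absent.

    `have_psycopg2` lets callers (and tests) override the environment probe.
    """
    if have_psycopg2 is None:
        have_psycopg2 = psycopg2_installed()

    scheme, sep, rest = url.partition("://")
    if not sep:
        # Not a scheme://... URL; leave it for the caller to reject.
        return url
    if scheme not in LEGACY_SCHEMES or have_psycopg2:
        # Either not a legacy Postgres URL, or psycopg2 can still serve it.
        return url
    return f"{PSYCOPG3_SCHEME}://{rest}"
```

With psycopg2 absent, postgresql://user:pw@host/db would become postgresql+psycopg://user:pw@host/db, while URLs that already name another driver (e.g. postgresql+psycopg_async://) pass through unchanged. Whether the rewrite should also log a deprecation warning is left open here.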

Acceptance criteria for closing

  • psycopg2 is no longer a hard dependency anywhere in the repo.
  • Legacy postgresql:// / postgresql+psycopg2:// configs either keep working via auto-upgrade or fail with a clear actionable error.
  • Docs describe psycopg3 as the single default Postgres driver for sync and async.
