Skip to content

MySQL support for sync-indexes #67

@koenvo

Description

@koenvo

MySQL support for sync-indexes

Background

sync-indexes currently creates partial expression indexes on PostgreSQL for high-cardinality identifier keys (e.g. one dataset per keyword). MySQL is a no-op.

Approach

MySQL 8.0.13+ supports functional indexes directly — no generated columns needed:

CREATE INDEX idx_dataset_identifier_keyword_ads_keyword_metrics
ON dataset (provider, dataset_type, (JSON_VALUE(identifier, '$.keyword' RETURNING CHAR(255))))
WHERE ...;  -- MySQL doesn't support partial indexes, so provider/dataset_type must be index columns

Note: MySQL has no partial index support, so provider and dataset_type must be leading columns in the index rather than a WHERE predicate.

For older MySQL (< 8.0.13), generated stored columns are the only option:

ALTER TABLE dataset
  ADD COLUMN keyword_txt VARCHAR(255)
    GENERATED ALWAYS AS (JSON_VALUE(identifier, '$.keyword' RETURNING CHAR(255))) STORED;

CREATE INDEX idx_dataset_identifier_keyword_ads_keyword_metrics
ON dataset (provider, dataset_type, keyword_txt);

This is more invasive (schema mutation, harder to roll back) and is not recommended unless targeting pre-8.0.13.

Scope

  • Detect MySQL dialect in create_identifier_indexes
  • Generate functional index SQL for MySQL 8.0.13+
  • Include provider and dataset_type as leading index columns (no partial index support)
  • Cast based on key_type: RETURNING UNSIGNED for int, RETURNING CHAR(255) for str
  • Skip (no-op + log warning) for MySQL < 8.0.13

Out of scope

  • Generated column approach for older MySQL versions

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions