Cursor is not deserialized correctly with multiple columns #1226

@nigel-nava

The bug:
Pausing and resuming a task from the dashboard fails with an error when the task has multiple cursor columns.

The error:

ActiveRecord::StatementInvalid
PG::InvalidDatetimeFormat: ERROR: invalid input syntax for type timestamp with time zone: "["2025-03-11 14:30:01.132797000", "7d826fb4-b75e-4393-a799-e8cacbb32624"]" CONTEXT: unnamed portal parameter $1 = '...' 

The code:

# base job class
class DataMigration::MaintenanceTasksBaseJob < MaintenanceTasks::TaskJob
  queue_as :latency_30s
end

class ExampleTask < MaintenanceTasks::Task
  def collection
    User.all # has "id" and "created_at" columns
  end

  def cursor_columns
    [ :created_at, :id ]
  end

  def process(user)
    puts "Processing #{user.id}, #{user.created_at}"
    sleep 1
  end
end

My fix:

# base job class
class DataMigration::MaintenanceTasksBaseJob < MaintenanceTasks::TaskJob
  queue_as :latency_30s

  private

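  def build_enumerator(run, cursor:)
    deserialized_cursor = cursor
    # On resume from the dashboard no cursor is passed in, and run.cursor
    # appears to hold the multi-column position as a JSON string, so parse it
    # back into an array before handing it to the enumerator.
    if cursor.nil? && @task.cursor_columns.length > 1 && run.cursor.is_a?(String)
      deserialized_cursor = JSON.parse(run.cursor)
    end
    super(run, cursor: deserialized_cursor)
  end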
end

I think what's happening is that pausing and resuming from the dashboard enqueues a new job with no serialized position. In ac3872a we assign cursor_position the value of run.cursor, but at that point the value is expected to be fully deserialized. When cursor_columns contains multiple columns, run.cursor is actually a JSON string, which JobIteration does not handle correctly.
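
To illustrate the suspected mismatch outside of Rails, here is a minimal sketch in plain Ruby. The helper name deserialize_cursor is hypothetical and is not part of maintenance_tasks or JobIteration. It assumes, as described above, that a multi-column cursor is stored as a JSON-encoded array string.

```ruby
require "json"

# Hypothetical helper (not part of maintenance_tasks or JobIteration).
# Assumption: a multi-column cursor is persisted as a JSON array string,
# while a single-column cursor is stored as a plain value.
def deserialize_cursor(raw_cursor, cursor_columns)
  # Only a String paired with more than one cursor column needs parsing.
  return raw_cursor unless raw_cursor.is_a?(String) && cursor_columns.length > 1

  JSON.parse(raw_cursor)
rescue JSON::ParserError
  # Not valid JSON: hand the value back untouched rather than guessing.
  raw_cursor
end

# The value from the error message, as it would look when stored as a string.
stored = JSON.generate(["2025-03-11 14:30:01.132797000", "7d826fb4-b75e-4393-a799-e8cacbb32624"])

p deserialize_cursor(stored, [:created_at, :id])
# => ["2025-03-11 14:30:01.132797000", "7d826fb4-b75e-4393-a799-e8cacbb32624"]
p deserialize_cursor("42", [:id])
# => "42"
```

Without the parsing step, the whole JSON string would be passed on as the value for the first cursor column (created_at), which matches the PG::InvalidDatetimeFormat error above.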

Labels: bug (Something isn't working)