Pass bulk import job to Spark with table ID instead of name #6502

@patchwork01

Description

User Story

As a user of Sleeper, once a job is submitted I want the system to remember which table the job is for by its ID, so that when I rename the table the job still runs against that table.

Description / Background

Under epic:

When a bulk import job is received in the starter lambda, the lambda looks up the Sleeper table by the table name. When it submits the job to be run in Spark, it passes the job to Spark as it was when it received it, including the table name instead of the ID.

We'd like the job to be passed to Spark by the table ID instead of the name, so that if the table is renamed between the starter lambda and the Spark driver, the job will still run.
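A minimal sketch of the idea, resolving the table name to its ID at submission time so everything downstream is insensitive to renames (the class, method names, and lookup API here are illustrative assumptions, not Sleeper's actual code):

```java
import java.util.Map;

public class TableIdSketch {

    // Stand-in for the table index lookup in the starter lambda;
    // the name and signature are assumptions for illustration.
    static String lookupTableId(String tableName) {
        Map<String, String> index = Map.of("my-table", "table-1234");
        return index.get(tableName);
    }

    public static void main(String[] args) {
        String jobTableName = "my-table";
        // Resolve the name to an ID once, when the job is received.
        String jobTableId = lookupTableId(jobTableName);
        // From here on the job carries the ID, so a rename of
        // "my-table" no longer affects the running job.
        System.out.println(jobTableId);
    }
}
```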

Technical Notes / Implementation Details

The job is written to S3 with BulkImportExecutor.WriteJobToBucket. The job is read in Spark with BulkImportJobLoaderFromS3, in BulkImportJobDriver.start.

The Sleeper table is then looked up by its name in BulkImportJobDriver.run. Because the lookup is by name, it currently has to happen outside the try/catch/finally that reports failures to the job tracker. Once we load the table properties by ID instead, the lookup can move inside the try/catch/finally, so any failure there is recorded in the job tracker.
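The control-flow change can be sketched as follows (a hypothetical simplification of the driver, with assumed names; the point is only that an ID-based load can sit inside the try so its failure reaches the tracker):

```java
import java.util.ArrayList;
import java.util.List;

public class DriverSketch {

    static List<String> trackerEvents = new ArrayList<>();

    // Stand-in for loading table properties by ID; throws if the ID is unknown.
    static String loadTablePropertiesById(String tableId) {
        if (!"table-1234".equals(tableId)) {
            throw new IllegalArgumentException("Unknown table ID: " + tableId);
        }
        return "properties-for-" + tableId;
    }

    static void run(String tableId) {
        try {
            // With an ID-based lookup the load sits inside the try block,
            // so a failure here is reported to the job tracker.
            String properties = loadTablePropertiesById(tableId);
            trackerEvents.add("finished: " + properties);
        } catch (RuntimeException e) {
            trackerEvents.add("failed: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        run("missing-id");
        System.out.println(trackerEvents.get(0));
    }
}
```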
