Data Transformation and Output Generation #3

@Oddonline

Data Transformation Implementation

Overview

Implement the data transformation pipeline from XML to CSV and JSON, with support for differential updates.

Requirements

XML Processing

  • Port existing XML processing code
  • Add validation and error handling
  • Implement entity extraction
  • Support incremental processing
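
As a rough illustration of incremental processing with per-entity error reporting, here is a minimal sketch built on the standard library's `xml.etree.ElementTree.iterparse`. The issue does not describe the XML schema, so the element and attribute names (`Line`, `Name`, `id`) and the helper `extract_entities` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def extract_entities(source, tag, required):
    """Stream elements named `tag`, yielding (record, errors) pairs.

    `tag` and `required` are illustrative; the real schema is not given.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Drop a namespace prefix such as "{http://...}Line" if present.
        local = elem.tag.rsplit("}", 1)[-1]
        if local != tag:
            continue
        record = dict(elem.attrib)
        for child in elem:
            record[child.tag.rsplit("}", 1)[-1]] = (child.text or "").strip()
        errors = [f"missing {name}" for name in required if not record.get(name)]
        yield record, errors
        elem.clear()  # free the element's children once it has been consumed


# Hypothetical sample input.
SAMPLE = b"""<Lines>
  <Line id="L1"><Name>Red</Name></Line>
  <Line id="L2"></Line>
</Lines>"""

for record, errors in extract_entities(io.BytesIO(SAMPLE), "Line", ["id", "Name"]):
    print(record, errors)
```

Malformed input makes `iterparse` raise `ET.ParseError`, which a caller could catch and record in a validation report.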

Data Transformation

  • CSV generation
  • JSON generation
  • Support for differential updates
  • Data validation
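
One possible shape for the CSV and JSON writers, assuming each entity has already been reduced to a flat dictionary. The function names `to_json` and `to_csv` and the field names in the usage lines are assumptions, not part of the existing code.

```python
import csv
import io
import json


def to_json(records):
    """Serialize records to JSON with stable key order, so diffs stay small."""
    return json.dumps(records, indent=2, sort_keys=True)


def to_csv(records, fieldnames):
    """Serialize records to CSV; missing fields become empty cells."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Hypothetical records for illustration.
lines = [{"id": "L1", "name": "Red"}, {"id": "L2"}]
print(to_csv(lines, ["id", "name"]))
print(to_json(lines))
```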

Output Generation

  • Complete dataset generation
  • Differential update files
  • Change logs
  • Validation reports
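
A differential update can be derived by comparing two snapshots keyed by entity ID. The sketch below assumes every entity has a stable identifier and that snapshots fit in memory; `diff_entities` is a hypothetical helper.

```python
def diff_entities(previous, current):
    """Compare two {entity_id: record} snapshots.

    Returns the records that were added, modified, or deleted.
    """
    added = {k: current[k] for k in current.keys() - previous.keys()}
    deleted = {k: previous[k] for k in previous.keys() - current.keys()}
    modified = {
        k: current[k]
        for k in current.keys() & previous.keys()
        if current[k] != previous[k]
    }
    return {"added": added, "modified": modified, "deleted": deleted}


# Hypothetical snapshots for illustration.
old = {"L1": {"name": "Red"}, "L2": {"name": "Blue"}}
new = {"L1": {"name": "Crimson"}, "L3": {"name": "Green"}}
print(diff_entities(old, new))
```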

Entity Types

  1. Lines
  2. Routes
  3. Route Points
  4. Journey Patterns
  5. Stop Sequences
  6. Service Journeys
  7. Passing Times
  8. Dated Journeys

File Structure

/data
  /current
    - lines.json
    - routes.json
    - ...
  /delta
    /{timestamp}
      - changes.json
      - added_entities.json
      - modified_entities.json
      - deleted_entities.json
  /archive
    /{timestamp}
      - complete dataset
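
The `/delta/{timestamp}` layout above could be written as follows. The file names come from the structure listed here; the timestamp format, the contents of `changes.json` (a count per change type), and the helper `write_delta` are assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

KINDS = ("added", "modified", "deleted")


def write_delta(root, delta, timestamp=None):
    """Write one delta directory under `root`/delta and return its path."""
    # Assumed timestamp format; the issue does not specify one.
    stamp = timestamp or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = Path(root) / "delta" / stamp
    target.mkdir(parents=True, exist_ok=True)
    for kind in KINDS:
        path = target / f"{kind}_entities.json"
        path.write_text(json.dumps(delta.get(kind, {}), indent=2), encoding="utf-8")
    # Assumed summary content: number of entities per change type.
    summary = {kind: len(delta.get(kind, {})) for kind in KINDS}
    (target / "changes.json").write_text(json.dumps(summary, indent=2), encoding="utf-8")
    return target
```

Passing an explicit `timestamp` keeps the output path reproducible, which may help when regenerating or testing a delta.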

Validation Rules

  • All required fields must be present
  • Each field must have the expected data type
  • References between entities must resolve (relationship integrity)
  • Field values must follow their expected format
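
The four rules above might be checked per record along these lines. This is a sketch under assumptions: the schema shape (field mapped to type, required flag, optional regex), the field names, and the helper `validate` are all hypothetical.

```python
import re


def validate(record, schema, known_ids=None):
    """Return a list of error strings for one record.

    schema:    {field: (expected_type, required, regex_or_None)}
    known_ids: {reference_field: set of valid target IDs}
    Regex patterns are assumed to be set only on string fields.
    """
    errors = []
    for field, (expected_type, required, pattern) in schema.items():
        value = record.get(field)
        if value is None:
            if required:
                errors.append(f"{field}: missing required field")
            continue
        if not isinstance(value, expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
        elif pattern and not re.fullmatch(pattern, value):
            errors.append(f"{field}: bad format")
    # Relationship integrity: every reference must point at a known ID.
    for field, ids in (known_ids or {}).items():
        if record.get(field) is not None and record[field] not in ids:
            errors.append(f"{field}: unknown reference {record[field]!r}")
    return errors


# Hypothetical schema for a route record.
ROUTE_SCHEMA = {
    "id": (str, True, r"R\d+"),
    "line_id": (str, True, None),
    "length": (int, False, None),
}
print(validate({"id": "R1", "line_id": "L9"}, ROUTE_SCHEMA, {"line_id": {"L1"}}))
```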

Notes

  • Data consistency must be maintained across complete and differential outputs
  • Processing must be efficient for large datasets
  • Partial updates should be supported
  • Data validation must be included
