Refactor: Improve Node table and relationship tables structure #127

@aheev

At the moment, the storage structure is untidy. Consider the `CreateNodeTable` function in storage_manager:

if (!entry->getStorage().empty()) {
    // Check if storage is Arrow backed
    if (entry->getStorage().substr(0, 8) == "arrow://") {
        ...
        // Create Arrow-backed node table
        tables[entry->getTableID()] = std::make_unique<ArrowNodeTable>(this, entry,
            &memoryManager, std::move(schemaCopy), std::move(arraysCopy), arrowId);
    } else {
        // Create parquet-backed node table
        tables[entry->getTableID()] =
            std::make_unique<ParquetNodeTable>(this, entry, &memoryManager);
    }
} else {
    // Create regular node table
    tables[entry->getTableID()] = std::make_unique<NodeTable>(this, entry, &memoryManager);
}

This nested if-else block is neither extensible nor clear, especially for non-native graph formats. As we add support for more file or graph formats, this logic will likely become increasingly convoluted.

Based on my (fairly limited) understanding, here’s an idea I’ve been playing with:

- Introduce a top-level `graph-format` config (unique per graph) with values such as native / graphAr / graph-std.
  For non-native formats like graphAr or graph-std, we would not need to inspect any other configuration; those formats
  would be responsible for structuring their own node and relationship tables. We could potentially apply the same
  approach to the native format as well, simply delegating table creation to the format itself.

- Retain a `storage` config for both node tables and relationship tables.

- Add an `immutability` config to indicate whether a node table is immutable.
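
To make the idea concrete, here is a rough sketch of what delegating table creation to the format could look like. Everything in it is hypothetical and only for illustration: `TableConfig`, `Table`, `GraphFormat`, `NativeFormat`, `GraphArFormat`, `FormatRegistry`, and `makeDefaultRegistry` do not exist in the codebase, and `Table` is a stand-in for the real `NodeTable` / `ArrowNodeTable` / `ParquetNodeTable` classes. The native format keeps the existing `storage` checks, while a non-native format ignores `storage` entirely.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical per-table config mirroring the three proposed settings.
struct TableConfig {
    std::string graphFormat = "native"; // proposed `graph-format`
    std::string storage;                // proposed `storage` (empty = regular table)
    bool immutable = false;             // proposed `immutability`
};

// Stand-in for the real table classes; it only records what was built.
struct Table {
    std::string kind;
    bool immutable;
};

// Each graph format is responsible for structuring its own node tables.
class GraphFormat {
public:
    virtual ~GraphFormat() = default;
    virtual std::unique_ptr<Table> createNodeTable(const TableConfig& config) const = 0;
};

// The native format keeps the current storage-based decision in one place.
class NativeFormat : public GraphFormat {
public:
    std::unique_ptr<Table> createNodeTable(const TableConfig& config) const override {
        std::string kind = "native";
        if (!config.storage.empty()) {
            // rfind(prefix, 0) == 0 is a prefix test.
            kind = config.storage.rfind("arrow://", 0) == 0 ? "arrow" : "parquet";
        }
        return std::make_unique<Table>(Table{kind, config.immutable});
    }
};

// A non-native format does not need to inspect `storage` at all.
class GraphArFormat : public GraphFormat {
public:
    std::unique_ptr<Table> createNodeTable(const TableConfig& config) const override {
        return std::make_unique<Table>(Table{"graphAr", config.immutable});
    }
};

// Maps `graph-format` values to the format that builds the tables.
class FormatRegistry {
public:
    void add(const std::string& name, std::unique_ptr<GraphFormat> format) {
        assert(format != nullptr);
        formats[name] = std::move(format);
    }

    std::unique_ptr<Table> createNodeTable(const TableConfig& config) const {
        auto it = formats.find(config.graphFormat);
        if (it == formats.end()) {
            throw std::invalid_argument("unknown graph-format: " + config.graphFormat);
        }
        return it->second->createNodeTable(config);
    }

private:
    std::map<std::string, std::unique_ptr<GraphFormat>> formats;
};

// Adding a new format would only mean one more `add` call here.
FormatRegistry makeDefaultRegistry() {
    FormatRegistry registry;
    registry.add("native", std::make_unique<NativeFormat>());
    registry.add("graphAr", std::make_unique<GraphArFormat>());
    return registry;
}
```

With something along these lines, the storage manager would call `registry.createNodeTable(config)` and never branch on the storage string itself; relationship tables could follow the same pattern with a second virtual method.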

PS: The 3rd point in the 0.14.0 release schedule could fit into this.
