At the moment, the storage structure is untidy. Consider the `CreateNodeTable` function in `storage_manager`:

    if (!entry->getStorage().empty()) {
        // Check if storage is Arrow backed
        if (entry->getStorage().substr(0, 8) == "arrow://") {
            ...
            // Create Arrow-backed node table
            tables[entry->getTableID()] = std::make_unique<ArrowNodeTable>(this, entry,
                &memoryManager, std::move(schemaCopy), std::move(arraysCopy), arrowId);
        } else {
            // Create parquet-backed node table
            tables[entry->getTableID()] =
                std::make_unique<ParquetNodeTable>(this, entry, &memoryManager);
        }
    } else {
        // Create regular node table
        tables[entry->getTableID()] = std::make_unique<NodeTable>(this, entry, &memoryManager);
    }

This nested `if-else` block is neither easy to extend nor especially clear, particularly for non-native graph formats. As we add support for additional file or graph formats, this logic is likely to become increasingly convoluted.
Based on my (fairly limited) understanding, here’s an idea I’ve been playing with:
- Introduce a top-level `graph-format` config (unique per graph) with values such as native / graphAr / graph-std.
  For non-native formats like graphAr or graph-std, we would not need to inspect any other configuration; those formats
  would be responsible for structuring their own node and relationship tables. We could potentially apply the same
  approach to the native format as well, simply delegating table creation to the format itself.
- Retain a `storage` config for both node tables and relationship tables.
- Add an `immutability` config to indicate whether a node table is immutable.
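
To make the idea a bit more concrete, here is a rough sketch of what delegating table creation to the format could look like. Everything in it is hypothetical and heavily simplified: `TableEntry`, `GraphFormat`, `NativeFormat`, `GraphArFormat`, `GraphArNodeTable`, `FormatRegistry` and `makeDefaultRegistry` are made-up names, and the table classes are stand-ins that do not match the real constructors in `storage_manager`. It only illustrates the shape of the dispatch, assuming the native format keeps reading the per-table `storage` config while a non-native format ignores it.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a catalog entry carrying the per-table configs.
struct TableEntry {
    std::string name;
    std::string storage; // per-table `storage` config, e.g. "" or "arrow://..."
    bool immutable;      // per-table `immutability` config
};

// Simplified stand-ins for the real table classes (not their real signatures).
struct Table {
    explicit Table(bool immutable) : immutable(immutable) {}
    virtual ~Table() = default;
    virtual std::string kind() const = 0;
    const bool immutable;
};
struct NativeNodeTable : Table {
    using Table::Table;
    std::string kind() const override { return "native"; }
};
struct ArrowNodeTable : Table {
    using Table::Table;
    std::string kind() const override { return "arrow"; }
};
struct ParquetNodeTable : Table {
    using Table::Table;
    std::string kind() const override { return "parquet"; }
};
struct GraphArNodeTable : Table { // hypothetical
    using Table::Table;
    std::string kind() const override { return "graphar"; }
};

// One implementation per `graph-format` value; each builds its own tables.
class GraphFormat {
public:
    virtual ~GraphFormat() = default;
    virtual std::unique_ptr<Table> createNodeTable(const TableEntry& entry) const = 0;
};

// Native format: the only place that still inspects the `storage` config.
class NativeFormat : public GraphFormat {
public:
    std::unique_ptr<Table> createNodeTable(const TableEntry& entry) const override {
        if (entry.storage.empty()) {
            return std::make_unique<NativeNodeTable>(entry.immutable);
        }
        if (entry.storage.rfind("arrow://", 0) == 0) { // starts with "arrow://"
            return std::make_unique<ArrowNodeTable>(entry.immutable);
        }
        return std::make_unique<ParquetNodeTable>(entry.immutable);
    }
};

// Non-native format: structures its tables itself and ignores `storage`.
class GraphArFormat : public GraphFormat {
public:
    std::unique_ptr<Table> createNodeTable(const TableEntry& entry) const override {
        return std::make_unique<GraphArNodeTable>(entry.immutable);
    }
};

// Maps a `graph-format` value to the object responsible for table creation.
class FormatRegistry {
public:
    void add(std::string name, std::unique_ptr<GraphFormat> format) {
        assert(format != nullptr);
        formats[std::move(name)] = std::move(format);
    }
    const GraphFormat& get(const std::string& name) const {
        auto it = formats.find(name);
        if (it == formats.end()) {
            throw std::invalid_argument("unknown graph-format: " + name);
        }
        return *it->second;
    }

private:
    std::map<std::string, std::unique_ptr<GraphFormat>> formats;
};

inline FormatRegistry makeDefaultRegistry() {
    FormatRegistry registry;
    registry.add("native", std::make_unique<NativeFormat>());
    registry.add("graphAr", std::make_unique<GraphArFormat>());
    return registry;
}
```

With something like this, `CreateNodeTable` would shrink to a single call along the lines of `registry.get(graphFormat).createNodeTable(entry)`, and supporting a new format would mean registering one more class rather than adding another branch. Relationship tables could presumably follow the same pattern with a `createRelTable` counterpart.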
PS: The 3rd point in the 0.14.0 release schedule could fit into this.