This repository was archived by the owner on Feb 6, 2026. It is now read-only.

semiotic-ai/graphdoc-server

Code style: black

graphdoc

subgraph documentation generation

Categorization

We adopt the methodology of Wretblad et al. [1] to define both the difficulty of documenting a table's column and the quality of a column's description.

Column Difficulty Categorization

| Difficulty Level | Description |
| --- | --- |
| Very Hard | Given the database name, the table name, the column name, example data from the database, and other columns in the table, it is impossible to accurately determine what the column description should be. |
| Hard | Given the database name, the table name, the column name, example data from the database, and other columns in the table, I am unsure what the column description should be. |
| Medium | Given the database name, the table name, the column name, example data from the database, and other columns in the table, I can accurately determine what the column description should be. |
| Easy | Given only the table name, the column name, and the other columns in the table, I can accurately determine what the column description should be. |
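
The four levels above can be read as a decision order: first ask whether the names alone suffice, then fall back to the full context. The following is a minimal sketch of that reading, not code from this repository; `ColumnDifficulty`, `Verdict`, and `classify_difficulty` are hypothetical names introduced only for illustration.

```python
from enum import Enum


class ColumnDifficulty(Enum):
    """Difficulty levels from the table above."""

    EASY = "Easy"
    MEDIUM = "Medium"
    HARD = "Hard"
    VERY_HARD = "Very Hard"


class Verdict(Enum):
    """Hypothetical annotator answer for a given amount of context."""

    ACCURATE = "accurate"      # "I can accurately determine the description"
    UNSURE = "unsure"          # "I am unsure what the description should be"
    IMPOSSIBLE = "impossible"  # "it is impossible to accurately determine"


def classify_difficulty(names_only: Verdict, full_context: Verdict) -> ColumnDifficulty:
    """Map two annotator verdicts to a difficulty level.

    names_only: verdict given only the table name, column name, and other columns.
    full_context: verdict given the database name and example data as well.
    """
    if names_only is Verdict.ACCURATE:
        return ColumnDifficulty.EASY
    if full_context is Verdict.ACCURATE:
        return ColumnDifficulty.MEDIUM
    if full_context is Verdict.UNSURE:
        return ColumnDifficulty.HARD
    return ColumnDifficulty.VERY_HARD
```

For example, a column that is unclear from names alone but clear once example data is shown would be classified as `ColumnDifficulty.MEDIUM`.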

Column Description Categorization (Gold)

| Classification | Description |
| --- | --- |
| Perfect | A perfect column description should contain enough information so that the interpretation of the column is completely free of ambiguity. It does not need to include any descriptions of the specific values inside the column to be considered perfect. The description should contain information about what table the column is referencing. For example, in a transaction database with columns such as NAME, AMOUNT, and DATE, we want "The name of the client that made the transaction" instead of "The name," to resolve the ambiguity of what the name refers to. Additionally, the column description should be a full and valid English sentence, with proper grammar, capitalization, and punctuation. For instance, instead of "nationality of drivers" when each instance refers to only one driver, it should be "The nationality of a driver." |
| Poor but Correct | The column description is correct but poor; there is room for improvement. |
| Incorrect | The column description is incorrect: it contains inaccurate or misleading information. It could still contain correct information, but any incorrect information automatically leads to an incorrect rating. |
| No Description | The column description is missing. |
| I Can’t Tell | It is impossible to tell the class of the description with the given information. |
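
The "Perfect" class requires, among other things, a full sentence with proper capitalization and punctuation. As a rough illustration, the surface part of that requirement could be checked as sketched below. This is an assumption-laden heuristic, not part of the methodology: `has_sentence_form` is a hypothetical helper, it only looks at the first and last characters, and it says nothing about ambiguity or grammar, so passing it is necessary but far from sufficient for a "Perfect" rating.

```python
def has_sentence_form(description: str) -> bool:
    """Return True if the text starts with an uppercase letter and ends with a period.

    Hypothetical surface check only; it cannot judge whether the
    description is unambiguous or grammatically valid.
    """
    text = description.strip()
    if not text:
        # A missing description has no sentence form at all.
        return False
    return text[0].isupper() and text.endswith(".")
```

Applied to the examples in the table, `has_sentence_form("nationality of drivers")` is `False`, while `has_sentence_form("The nationality of a driver.")` is `True`.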

Column Description Categorization (Generated)

| Quality Level | Description |
| --- | --- |
| Perfect | Matches the gold description without extra, redundant information. Redundant information is text that does not provide useful additional information. For example, appending "is a primary/foreign key" to the gold description is not redundant, whereas appending "is useful for retrieving data" is redundant. |
| Almost Perfect | Matches the gold description but is verbose, with redundant information; contains no incorrect or misleading information. |
| Poor but Correct | The column description is correct but poor, with room for improvement due to missing information. For example, "The Time column records the specific time at which a transaction occurred, formatted in a 24-hour HH:MM pattern," which lacks enough information to make a valid prediction beyond the primary purpose. |
| Incorrect | The column description is incorrect and contains inaccurate or misleading information. Any incorrect information automatically leads to an incorrect rating, even if some correct information is present. |
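
The rule that any incorrect information overrides everything else can be made explicit as a small rating function. The sketch below is illustrative only: `GeneratedQuality`, `Judgement`, and `rate_generated` are hypothetical names, and checking missing information before redundancy is an assumption, since the table does not say which applies when a description is both incomplete and verbose.

```python
from dataclasses import dataclass
from enum import Enum


class GeneratedQuality(Enum):
    """Quality levels for generated descriptions, from the table above."""

    PERFECT = "Perfect"
    ALMOST_PERFECT = "Almost Perfect"
    POOR_BUT_CORRECT = "Poor but Correct"
    INCORRECT = "Incorrect"


@dataclass(frozen=True)
class Judgement:
    """Hypothetical reviewer findings for one generated description."""

    has_incorrect_info: bool  # any inaccurate or misleading statement
    is_missing_info: bool     # lacks information present in the gold description
    has_redundant_info: bool  # adds text with no useful information


def rate_generated(judgement: Judgement) -> GeneratedQuality:
    """Map reviewer findings to a quality level."""
    # Any incorrect information leads to "Incorrect", regardless of the rest.
    if judgement.has_incorrect_info:
        return GeneratedQuality.INCORRECT
    # Assumption: missing information takes precedence over redundancy.
    if judgement.is_missing_info:
        return GeneratedQuality.POOR_BUT_CORRECT
    if judgement.has_redundant_info:
        return GeneratedQuality.ALMOST_PERFECT
    return GeneratedQuality.PERFECT
```

For instance, a description that matches the gold description but adds "is useful for retrieving data" would be rated `GeneratedQuality.ALMOST_PERFECT` under this sketch.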

Subgraphs

| Name | Explorer | GitHub | Creator | Type |
| --- | --- | --- | --- | --- |
| arbitrum-one-bridge | link | link | messari | messari: schema-bridge |
| gmx-forks | link | link | messari | messari: schema-derivatives-perpfutures |
| uniswap-v3-forks | link | link | messari | messari: schema-dex-amm-extended |
| bancor-v3 | link | link | messari | messari: schema-dex-amm-extended |
| aave-forks | link | link | messari | messari: schema-lending |
| opensea | link | link | messari | messari: schema-nft-marketplace |
| arrakis-finance | link | link | messari | messari: schema-yield |
| eigenlayer | link | link | messari | messari: schema-non-standard |
| livepeer | link | link | livepeer | livepeer: main |
| ens-subgraph | link | link | ens | ens: main |
| graph-network-arbitrum | link | link | e&n | graph: network arbitrum |
| known-origin | link | link | known-origin | known-origin |

References

  1. Wretblad, Niklas et al. Synthetic SQL Column Descriptions and Their Impact on Text-to-SQL Performance. arXiv preprint arXiv:2408.04691, 2024.
