Sloth

In this repository we present the code of Sloth, our solution for determining the largest overlap between two tables.

The code is provided in the "sloth.py" file, while the main files for preprocessing the datasets and for running the experiments are made available in the "data_preparation" and "experiments" folders, respectively.

The "examples" folder contains representative pairs of tables from Wikipedia, illustrating typical cases where the detected largest overlap differs significantly from traditional set similarity measures, such as Jaccard similarity and overlap set similarity.
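
For reference, the two set similarity measures mentioned above can be sketched as follows. This is only an illustration and not part of "sloth.py": the helper names (`cell_set`, `jaccard_similarity`, `overlap_set_similarity`) and the toy tables are hypothetical, tables are assumed to be plain lists of rows, and overlap set similarity is taken here as the size of the intersection.

```python
def cell_set(table):
    """Flatten a table (a list of rows) into the set of its distinct cell values."""
    return {cell for row in table for cell in row}


def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union (0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_set_similarity(a, b):
    """Number of values shared by the two sets (assumed definition, see lead-in)."""
    return len(a & b)


# Two toy tables (hypothetical data) with the same values arranged differently.
r = [["Rome", "Italy"], ["Paris", "France"]]
s = [["Rome", "France"], ["Paris", "Italy"]]

jaccard = jaccard_similarity(cell_set(r), cell_set(s))
overlap = overlap_set_similarity(cell_set(r), cell_set(s))

# Rows that appear identically in both tables.
shared_rows = {tuple(row) for row in r} & {tuple(row) for row in s}

print(jaccard, overlap, shared_rows)
```

Both measures treat the two toy tables as identical bags of values, even though no complete row of one appears in the other. Set-based measures ignore how values are arranged into rows and columns, which is why they can diverge from an overlap computed on the table structure.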