
A Reproducibility Study of TransTab: Learning Transferable Tabular Transformers Across Tables

Created by Alberto Tamajo, Jakub Dylag, Alessandro Nerla and Laurin Lanz.

TransTab architecture. Figure from the original paper.

Introduction

In this work, we verify the reproducibility of TransTab: Learning Transferable Tabular Transformers Across Tables as part of the COMP6258 module.

The ubiquity of tabular data in machine learning led Wang & Sun (2022) to introduce the Transferable Tabular Transformer (TransTab), a versatile tabular learning framework capable of modelling variable-column tables. They also proposed a novel technique that enables supervised or self-supervised pretraining on multiple tables, followed by finetuning on the target dataset. Given the potential impact of their work, we aim to verify their claims by attempting to reproduce their results. Specifically, we try to corroborate both the 'methods' and 'results' reproducibility of their paper.

The results of our reproducibility study are summarised in Report.pdf.

Experiment results

Our experiment results are saved in this repository as pickle files:

  • Supervised learning: supervised_learning.pickle
  • Feature Incremental Learning: incremental_learning.pickle
  • Transfer Learning: transfer_learning.pickle
  • Zero-Shot Learning: zeroshot_learning.pickle
  • Supervised and Self-supervised Pretraining: across_table_pretraining_finetuning.pickle
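The saved results can be inspected with Python's standard pickle module. Below is a minimal sketch of the load/dump round-trip; the dictionary contents shown are hypothetical, and the actual structure of our result files may differ, so inspect them after loading:

```python
import os
import pickle
import tempfile

# Toy stand-in for an experiment result (hypothetical structure).
results = {"supervised_learning": {"credit-g": {"auc": 0.75}}}

# Write to a temporary location so we don't clobber the repository's pickles.
path = os.path.join(tempfile.mkdtemp(), "supervised_learning.pickle")
with open(path, "wb") as f:
    pickle.dump(results, f)

# Loading one of the repository's files follows the same pattern.
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded == results)  # -> True
```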

How to run the reproducibility experiments

Clone this project

The first step is to clone this project:

git clone https://github.com/COMP6258-Reproducibility-Challenge/TransTab-Reproducibility.git
cd TransTab-Reproducibility/

Conda environment

The second step is to create a conda environment from our environment.yml:

conda env create -f environment.yml
conda activate TranstabReproducibility

Run the desired reproducibility experiment

The third step is to run the desired reproducibility experiment:

  • Supervised learning: python supervised_learning.py
  • Feature Incremental Learning: python incremental_learning.py
  • Transfer Learning: python transfer_learning.py
  • Zero-Shot Learning: python zeroshot_learning.py
  • Supervised and Self-supervised Pretraining: python supervised_selfsupervised_pretrain_finetuning.py

Google Colab alternative

Alternatively, you can upload this repository's files to a Google Colab session and use the Transtab.ipynb notebook.

Code

We verified TransTab's reproducibility using version 0.0.2 of the transtab code package; version 0.0.5 was released on 05/04/23. Below, we list our code alongside the files retrieved from the original repository.

  • Our code:
    • Rankings.ipynb
    • Transtab.ipynb
    • incremental_learning.py
    • supervised_learning.py
    • supervised_selfsupervised_pretrain_finetuning.py
    • transfer_learning.py
    • zeroshot_learning.py
  • Original code:
    • constants.py
    • dataset.py
    • evaluator.py
    • load.py
    • modeling_transtab.py
    • trainer.py
    • trainer_utils.py
    • transtab.py