tssarathi/ai-text-detection-mtl

COMP90051-P2

This repository contains the code for Project 2 of COMP90051 – Statistical Machine Learning (Semester 1, 2025) at the University of Melbourne. It forms part of the group submission by Group 28.

Team Members:

  • Ankita Holey
  • Richardo Husni
  • Sarathi Thirumalai Soundararajan

The project builds a classification model that distinguishes machine-generated text from human-written text, using domain adaptation and class-imbalance handling techniques. It was conducted as part of a Kaggle competition described in the Project Specification PDF, which gives further details on the data format, performance evaluation, and submission criteria.
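As an illustration of what imbalance handling can look like, the sketch below computes inverse-frequency class weights, a common heuristic for weighting a classification loss. It is not taken from this repository's notebooks; the function name `balanced_class_weights` and the toy labels are assumptions made for the example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class, inversely proportional to its frequency.

    Uses the common heuristic n_samples / (n_classes * class_count), so
    rarer classes receive larger weights. Hypothetical helper, not the
    project's own code.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}
```

For example, `balanced_class_weights([0, 0, 0, 1])` gives the minority class `1` a weight of `2.0` and the majority class `0` a weight of about `0.67`. Such weights could then be passed to a weighted loss function.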


Setup Instructions

1. Clone the Repository

If you haven't already cloned the repository, run:

git clone https://github.com/sthirumalais/COMP90051-A2.git
cd COMP90051-A2

2. Create the Conda Environment

Ensure you have Conda (Anaconda or Miniconda) installed, then create the environment:

conda env create -f environment.yml

This will create a Conda environment named p2 with all required dependencies.
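To confirm that the environment was created, one option is to inspect the output of `conda env list --json`. The following is a minimal sketch under that assumption; the helper names `env_names` and `conda_env_exists` are hypothetical and not part of this repository.

```python
import json
import subprocess
from pathlib import Path


def env_names(conda_json: str) -> set[str]:
    """Extract environment names from `conda env list --json` output."""
    env_paths = json.loads(conda_json).get("envs", [])
    # Each entry is a filesystem path; its last component is the env name
    # (the base environment shows up as the install directory name).
    return {Path(p).name for p in env_paths}


def conda_env_exists(name: str = "p2") -> bool:
    """Ask the local conda installation whether an environment exists."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise if conda itself reports an error
    )
    return name in env_names(result.stdout)
```

Calling `conda_env_exists("p2")` should return `True` once the environment has been created; it requires `conda` to be on the `PATH`.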

3. Activate the Environment

Activate the newly created environment:

conda activate p2

4. Install the Jupyter Kernel

Register the Jupyter kernel:

python -m ipykernel install \
--user \
--name COMP90051-P2 \
--display-name "COMP90051-P2"

5. Launch JupyterLab or Notebook

Start JupyterLab or Jupyter Notebook and verify that the COMP90051-P2 kernel is available:

jupyter lab
# or
jupyter notebook

Select the COMP90051-P2 kernel when prompted.
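If the kernel does not appear in the launcher, the registration can also be checked from a script. The sketch below parses the output of `jupyter kernelspec list --json`; the helper names are hypothetical, and the comparison is case-insensitive on the assumption that Jupyter may store kernel names in lower case.

```python
import json
import subprocess


def kernel_registered(kernelspec_json: str, name: str = "COMP90051-P2") -> bool:
    """Check whether `name` appears in `jupyter kernelspec list --json` output."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    # Compare case-insensitively, since the stored name may be lower-cased.
    return name.lower() in {spec_name.lower() for spec_name in specs}


def list_kernels_json() -> str:
    """Return the raw JSON listing of installed Jupyter kernels."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise if the jupyter command fails
    )
    return result.stdout
```

With the p2 environment active, `kernel_registered(list_kernels_json())` should return `True` after step 4 has been completed.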


Notebook Execution Order

To ensure reproducibility and correctness, run the notebooks in the following order:

  1. Pre-Processing.ipynb
  2. Feature-Engineering.ipynb
  3. Then run any of the model notebooks as needed.

Notes

  • Ensure your kernel is set to COMP90051-P2 in each notebook.
  • If you encounter issues with dependencies, try updating Conda and re-creating the environment.
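The execution order above can also be scripted rather than run by hand. The following is a minimal sketch that assumes `jupyter nbconvert` is available in the active environment and that the script is run from the directory containing the notebooks; `PIPELINE`, `nbconvert_command`, and `run_in_order` are hypothetical names, and the model notebooks are left out because their filenames are not listed here.

```python
import subprocess

# The two preparatory notebooks, in the order given above. Append the
# filenames of any model notebooks you want to run afterwards.
PIPELINE = ["Pre-Processing.ipynb", "Feature-Engineering.ipynb"]


def nbconvert_command(notebook: str, kernel: str = "COMP90051-P2") -> list[str]:
    """Build the nbconvert command that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", "--inplace",
        f"--ExecutePreprocessor.kernel_name={kernel}",
        notebook,
    ]


def run_in_order(notebooks: list[str]) -> None:
    """Execute each notebook sequentially, stopping at the first failure."""
    for notebook in notebooks:
        # check=True raises CalledProcessError if a notebook fails to run.
        subprocess.run(nbconvert_command(notebook), check=True)
```

Calling `run_in_order(PIPELINE)` executes both notebooks and overwrites them with their executed versions, so later notebooks find the outputs of earlier ones.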

Happy coding!

About

PyTorch-based multi-task model for detecting AI-generated text across imbalanced domains using TF-IDF features and sentence embeddings.
