This repository provides data augmentation pipelines for Aspect-Based Sentiment Analysis (ABSA).
It supports two complementary approaches:
- Agentic Pipeline (LangGraph + Ollama) – generates sentences with aspect–polarity pairs using a generator agent whose output is checked by an evaluator agent.
- Prompting Pipeline (Naive Generation) – directly prompts an LLM to produce aspect–polarity sentences without explicit validation.
The augmented data is used with the InstructABSA framework for training and evaluation.
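
The difference between the two approaches can be illustrated with a minimal sketch. It is not the repository's implementation: the function names, the sample layout, the retry limit, and the stub generator and evaluator are all assumptions made for illustration, and no LangGraph or Ollama calls are involved.

```python
from typing import Callable, Optional

# Hypothetical sample layout: {"sentence": ..., "aspect": ..., "polarity": ...}
Sample = dict
Generator = Callable[[str, str], str]        # (aspect, polarity) -> sentence
Evaluator = Callable[[str, str, str], bool]  # (sentence, aspect, polarity) -> accepted?


def generate_naive(generate: Generator, aspect: str, polarity: str) -> Sample:
    """Prompting strategy: keep whatever the generator returns."""
    return {"sentence": generate(aspect, polarity), "aspect": aspect, "polarity": polarity}


def generate_validated(
    generate: Generator,
    evaluate: Evaluator,
    aspect: str,
    polarity: str,
    max_attempts: int = 3,  # assumed retry budget
) -> Optional[Sample]:
    """Agentic strategy: regenerate until the evaluator accepts or attempts run out."""
    for _ in range(max_attempts):
        sentence = generate(aspect, polarity)
        if evaluate(sentence, aspect, polarity):
            return {"sentence": sentence, "aspect": aspect, "polarity": polarity}
    return None  # every attempt was rejected


# Stand-ins for the LLM-backed generator and evaluator agents.
def stub_generate(aspect: str, polarity: str) -> str:
    opinion = "great" if polarity == "positive" else "terrible"
    return f"The {aspect} was {opinion}."


def stub_evaluate(sentence: str, aspect: str, polarity: str) -> bool:
    # Toy check: the aspect term must appear in the sentence.
    return aspect in sentence
```

In this sketch a rejected sample is dropped rather than repaired, which is one plausible reason the validated route is slower and yields fewer, cleaner samples.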
- Augment datasets for ABSA in the Restaurant domain
- Two strategies:
  - Agentic: validated samples, slower but higher quality.
  - Prompting: faster generation, noisier but scalable.
- Uses local LLMs with Ollama and Hugging Face Transformers.
- Seamlessly integrates with InstructABSA for downstream experiments.
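
As a rough illustration of what generating with a local Ollama model can look like, the sketch below sends one prompt to Ollama's local REST endpoint (`/api/generate` on the default port 11434). The prompt wording and the helper names `build_prompt`, `build_request`, and `generate_sentence` are assumptions for illustration; the repository's scripts may be organised differently.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_prompt(aspect: str, polarity: str) -> str:
    """Assumed prompt wording; the prompts used in the repository may differ."""
    return (
        "Write one restaurant review sentence that expresses a "
        f"{polarity} opinion about the aspect '{aspect}'. "
        "Return only the sentence."
    )


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generation request for the given model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_sentence(model: str, aspect: str, polarity: str, timeout: float = 120.0) -> str:
    """Send the request to a running Ollama server and return the generated text."""
    request = build_request(model, build_prompt(aspect, polarity))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"].strip()
```

With the Ollama server running and the model pulled, `generate_sentence("qwen2.5:14b", "service", "negative")` would return a single generated sentence.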
git clone https://github.com/mohamad7395/Thesis.git
cd absa-augmentation
pip install -r requirements.txt

Follow the installation instructions from Ollama, then pull the models:
ollama pull qwen2.5:14b
ollama pull llama3:8b-instruct

Run the controlled agent-based data generation:
python run_agent.py

Run the naive prompting-based generation:
python run_prompting.py

SemEval Dataset
│
├── Agentic Pipeline ──> Augmented Data (validated)
└── Prompting Pipeline ─> Augmented Data (naive)
Augmented Data ──> InstructABSA ──> Model Training & Evaluation
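
The last step of the flow above, handing augmented data to training, could be prepared along these lines. This is a sketch under assumptions: the sample layout (a dict with a `sentence` key) and the deduplication rule are hypothetical, and the format InstructABSA actually expects is not shown here.

```python
from typing import Dict, List


def merge_datasets(original: List[Dict], augmented: List[Dict]) -> List[Dict]:
    """Append augmented samples to the original ones, skipping empty and duplicate sentences."""
    merged = list(original)
    # Compare sentences case-insensitively and ignore surrounding whitespace.
    seen = {sample["sentence"].strip().lower() for sample in original}
    for sample in augmented:
        key = sample["sentence"].strip().lower()
        if key and key not in seen:
            seen.add(key)
            merged.append(sample)
    return merged
```

Original samples are kept untouched and always come first, so an augmented sentence that repeats an original one is dropped instead of replacing it.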
This project builds on top of the original InstructABSA codebase.
On top of the baseline implementation, we added new scripts and generated datasets to support data augmentation experiments.
- Augmented Datasets
  Located in: InstructABSA/Dataset/Generated
- Experiment Results
  Located in: Thesis/All Results
- Experiment Scripts
  Additional Python scripts for running automated experiments are stored in: InstructABSA/Research

These paths reflect the extended functionality for generating augmented data and evaluating it within the InstructABSA framework.