Automated Synthetic Data Generation, Distillation, Quantization and Deployment Pipeline

ameen-91/textforge

Infero

Overview

TextForge automates the full workflow for building text classifiers: synthetic data generation, model distillation and training, quantization, and deployment. Models are optimized with ONNX Runtime and served through FastAPI.

Features

  • Automated synthetic data generation
  • Transformer model training
  • ONNX conversion with 8-bit quantization
  • Automated model API serving with FastAPI
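
To illustrate what 8-bit quantization means in general, here is a minimal pure-Python sketch of symmetric int8 quantization. It is not TextForge's implementation (this README only states that quantization happens as part of the ONNX conversion); the function names `quantize_int8` and `dequantize_int8` are hypothetical and exist only for this example.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to signed 8-bit integers.

    Illustrative only: returns the integer codes and the scale needed
    to map them back to approximate float values.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the int8 range used here.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from integer codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Each restored value differs from the original by at most half a step.
max_error = max(abs(r - w) for r, w in zip(restored, weights))
print(q, max_error <= scale / 2 + 1e-12)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error bounded by half the scale. Production toolchains typically add refinements such as per-channel scales and zero points, which this sketch omits.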

Installation

pip install textforge

Usage

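import os

import pandas as pd
from textforge.pipeline import Pipeline, PipelineConfig

# Assumption: the API key is read from an environment variable. The name
# TEXTFORGE_API_KEY is illustrative and is not defined by the library.
api_key = os.environ["TEXTFORGE_API_KEY"]

pipeline_config = PipelineConfig(
    api_key=api_key,
    labels=['business', 'education', 'entertainment', 'sports', 'technology'],
    query="Classify based on headlines",
    save_steps=200,
    eval_steps=200,
    epochs=10
)

# Load the input data
df = pd.read_csv('data.csv')

pipeline = Pipeline(pipeline_config)

pipeline.run(data=df, save=True, serve=True)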

License

This project is licensed under the MIT License - see the LICENSE file for details.
