This project uses DistilBERT, a lightweight version of the popular BERT model, to classify sentences as either active or passive. The model is fine-tuned on a small dataset of sentence examples and can predict the grammatical voice of any given sentence.
- Text Classification: Classifies sentences into two categories: Active or Passive.
- Transformer-based Model: Utilizes DistilBERT for high-performance NLP.
- Minimal Data: Fine-tuned with a small dataset, leveraging the power of transfer learning.
- Fast Inference: Thanks to the lightweight DistilBERT architecture.
Google Colab Link: Active/Passive Sentence Classifier
Hugging Face: ActiveVoice_PassiveVoice_Classifier
You can install the required libraries using pip:
pip install transformers tensorflow datasets numpy pandas matplotlib scikit-learn

The model is fine-tuned on a custom dataset with sentences labeled as active or passive. Below are the steps to fine-tune the DistilBERT model:
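The training data is a small set of sentences paired with voice labels. The actual dataset is not shown in this README; a minimal sketch of what the inputs might look like (these example sentences and the 0/1 label scheme are illustrative assumptions):

```python
# Illustrative toy dataset -- the real training data is not included here.
texts = [
    "The chef cooked the meal.",         # active
    "The meal was cooked by the chef.",  # passive
    "She wrote the report.",             # active
    "The report was written by her.",    # passive
]

# Integer labels suit sparse categorical cross-entropy: 0 = active, 1 = passive.
label2id = {"active": 0, "passive": 1}
labels = [0, 1, 0, 1]
```

With transfer learning, even a few hundred such pairs can be enough, since DistilBERT already encodes general English syntax.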
The input sentences are tokenized using the DistilBERT tokenizer, which converts them into a format that can be processed by the model. The tokenizer handles padding and truncation to ensure uniform input length.
from transformers import DistilBertTokenizer

tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
inputs = tokenizer(texts, return_tensors="tf", padding=True, truncation=True, max_length=128)

The DistilBERT model is loaded and prepared for fine-tuning. It is instantiated with a classification head for sequence classification.
from transformers import TFDistilBertForSequenceClassification

model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased')

The model is compiled with the Adam optimizer and the sparse categorical cross-entropy loss function, since this is a classification task.
import tensorflow as tf

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)

Note that the loss is configured with from_logits=True because the classification head outputs raw logits, not probabilities. Once the model is fine-tuned and saved, you can use it to classify new sentences as active or passive.
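Classifying a new sentence is a tokenize, predict, argmax pipeline. A minimal sketch, assuming the fine-tuned model and tokenizer from above are already loaded (the `predict_voice` helper and the 0 = active / 1 = passive mapping are illustrative, not part of the saved model):

```python
import numpy as np

# Assumed class order from training: index 0 = active, index 1 = passive.
ID2LABEL = {0: "active", 1: "passive"}

def logits_to_label(logits):
    """Return the name of the higher-scoring class from a 1-D pair of logits."""
    return ID2LABEL[int(np.argmax(logits))]

def predict_voice(model, tokenizer, sentence):
    """Tokenize one sentence and classify it as 'active' or 'passive'."""
    inputs = tokenizer(sentence, return_tensors="tf",
                       padding=True, truncation=True, max_length=128)
    logits = model(inputs).logits.numpy()[0]  # raw scores, shape (2,)
    return logits_to_label(logits)
```

For example, `predict_voice(model, tokenizer, "The ball was thrown by John.")` would be expected to return "passive" once the model is trained.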
Prayas Jadhav