In this project, I fine-tuned the Flan-T5 model with the LoRA approach to summarize dialogues from the DialogSum dataset. Next, using the toxicity evaluator "facebook/roberta-hate-speech-dynabench-r4-target", I computed a reward for each summary generated by the fine-tuned model. In the last step, I used the PPO reinforcement-learning approach to update the model weights, restricted to the small set of parameters defined by LoRA.
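A minimal sketch of the reward step described above. It assumes the hate-speech classifier returns two logits ordered as ("nothate", "hate"); in the actual pipeline these logits would come from the "facebook/roberta-hate-speech-dynabench-r4-target" model, but here they are stubbed so the conversion to a PPO reward is visible on its own.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toxicity_reward(logits):
    """Turn the classifier's two logits ("nothate", "hate")
    into a scalar reward for PPO: a common choice is the raw
    "nothate" logit, so less toxic text earns a higher reward."""
    return logits[0]

# Stubbed classifier output for one generated summary
# (assumed values, not from the real model):
logits = [2.0, -1.0]
probs = softmax(logits)      # probability of ("nothate", "hate")
reward = toxicity_reward(logits)
```

In the full pipeline this scalar would be passed per sample to the PPO trainer so the LoRA parameters are nudged toward less toxic summaries.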
azmozaffari/Text_RLHF