Steps:
Step1: Dataset Refactor
Add general dataset classes for the different modes: sft_dataset, preference_dataset (for RM and DPO), and rl_dataset. Keys such as prompt_key, chosen_key, and rejected_key specify how to read a local or HuggingFace dataset, so supporting a new dataset no longer requires writing a new dataset class.
The built-in datasets (e.g. open_assistant, HelpSteer3) will be kept so that others can accurately reproduce our results.
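To make the key-based idea concrete, here is a minimal sketch of a general preference dataset for a local JSONL file. The class name `KeyedPreferenceDataset` and its constructor are hypothetical and not part of the existing codebase; only the key names prompt_key, chosen_key, and rejected_key come from the proposal above.

```python
import json
from pathlib import Path


class KeyedPreferenceDataset:
    """Hypothetical sketch: reads a local JSONL file and maps configurable
    column names onto a fixed (prompt, chosen, rejected) schema."""

    def __init__(self, data_path, prompt_key="prompt",
                 chosen_key="chosen", rejected_key="rejected"):
        self.records = []
        with Path(data_path).open(encoding="utf-8") as f:
            for line in f:
                if not line.strip():
                    continue  # skip blank lines
                row = json.loads(line)
                # Normalize whatever column names the file uses.
                self.records.append({
                    "prompt": row[prompt_key],
                    "chosen": row[chosen_key],
                    "rejected": row[rejected_key],
                })

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        return self.records[idx]
```

With this shape, a file whose prompt column is called `context` would be loaded as `KeyedPreferenceDataset(path, prompt_key="context")`, with no dataset-specific class.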
After the refactor, the usage will become:
For the specially supported built-in datasets, the usage is the same as before.
For general datasets (local/hf), an example for DPO is below.
Step2: Decouple Train and Validation Dataset
The train and validation datasets are currently coupled, which means we need to write the same logic twice, once for train and once for eval, whenever we add support for a new dataset. Decoupling them removes that duplication.
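One way to picture the decoupling is a single builder that is called once per split, so the loading logic is written only once. This is a rough sketch under assumptions: `build_dataset`, `build_splits`, and the per-split config layout are hypothetical, and the builder returns a plain description rather than a real dataset object.

```python
def build_dataset(cfg):
    """Hypothetical single entry point: turns one dataset config into a
    description of what would be loaded. A real loader would return a dataset."""
    return {
        "data_path": cfg["data_path"],
        "prompt_key": cfg.get("prompt_key", "prompt"),
        "chosen_key": cfg.get("chosen_key", "chosen"),
        "rejected_key": cfg.get("rejected_key", "rejected"),
    }


def build_splits(data_cfg):
    """Build each split that is present, using the same code path for both."""
    return {
        split: build_dataset(data_cfg[split])
        for split in ("train", "validation")
        if split in data_cfg
    }
```

Because each split is built independently, a run can supply only a train dataset, or point validation at a different source, without touching the loading code.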
After this, the usage will become:
Step3: Multiple Datasets Support
After this, the usage will become:
data:
  train:
    # this dataset will override prompt_key and use the default values for other vars
    - data_path: /path/to/local/train_dataset_1.jsonl
      prompt_key: context
    # this dataset will use all the default values
    - data_path: /path/to/local/train_dataset_2.jsonl
  validation:
    - data_path: /path/to/local/val_dataset.jsonl
  default:
    # will use below vars as default values if dataset doesn't specify it
    dataset_name: BinaryPreferenceDataset
    prompt_key: prompt
    chosen_key: chosen
    rejected_key: rejected
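The default-resolution rule described in the config comments could be implemented roughly as follows. This is a sketch, not the project's implementation: `resolve_datasets` is a hypothetical helper, and the config is written as a plain Python dict mirroring the YAML above so that the example has no dependencies.

```python
def resolve_datasets(data_cfg):
    """Hypothetical helper: fill each dataset entry with values from the
    `default` section, letting per-dataset keys take precedence."""
    defaults = data_cfg.get("default", {})
    resolved = {}
    for split in ("train", "validation"):
        entries = data_cfg.get(split, [])
        # Later keys win in a dict merge, so entry values override defaults.
        resolved[split] = [{**defaults, **entry} for entry in entries]
    return resolved


# Same structure as the YAML config above.
data_cfg = {
    "train": [
        {"data_path": "/path/to/local/train_dataset_1.jsonl", "prompt_key": "context"},
        {"data_path": "/path/to/local/train_dataset_2.jsonl"},
    ],
    "validation": [
        {"data_path": "/path/to/local/val_dataset.jsonl"},
    ],
    "default": {
        "dataset_name": "BinaryPreferenceDataset",
        "prompt_key": "prompt",
        "chosen_key": "chosen",
        "rejected_key": "rejected",
    },
}

resolved = resolve_datasets(data_cfg)
```

Here the first train dataset keeps its own `prompt_key: context`, while the second train dataset and the validation dataset inherit every value from `default`.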
Related issues / discussions
#688, #830