We examine whether such stereotypes are mirrored by large language models (LLMs). We draw on the sociolinguistic literature on dialect perception to analyze traits commonly associated with dialect speakers. Based on these traits, we assess the dialect naming bias and dialect usage bias expressed by LLMs in two tasks: association task and decision task. To assess a model's dialect usage bias, we construct a novel evaluation corpus that pairs sentences from seven regional German dialects (e.g., Alemannic and Bavarian) with their standard German counterparts.
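To illustrate the shape of such a corpus, here is a minimal sketch of one dialect–standard pair. The field names and the example sentence are illustrative assumptions, not taken from the released corpus:

```python
from dataclasses import dataclass

@dataclass
class DialectPair:
    """One evaluation item: a dialect sentence paired with its standard German counterpart."""
    dialect: str            # e.g. "Bavarian" or "Alemannic"
    dialect_sentence: str
    standard_sentence: str

# Illustrative Bavarian example ("I like you"); not from the actual corpus
pair = DialectPair(
    dialect="Bavarian",
    dialect_sentence="I mog di.",
    standard_sentence="Ich mag dich.",
)
```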
- `scripts/init.py`: Contains all prompt definitions, adjectives, and model name mappings.
- `scripts/prompt_creation/create_tasks.py`: Creates prompt tasks based on your annotated data.
- `scripts/inference.py`: Runs model inference to generate predictions.
- `scripts/eval_implicit.py`: Evaluates model predictions for the association task.
- `scripts/extract_decision.py`: Extracts decision outputs from the model.
- `scripts/eval_decision.py`: Evaluates model predictions for the decision task.
- See `plots/` to generate the final plots.
You can access all data and outputs generated in this project here: Drive.
- Python 3.12 (recommended)
- Install dependencies with `pip install -r requirements.txt`
Annotation Data Placement
- Add your annotation files to the directory `data/annotated_data`.
Generate Prompts
- Run the prompt creation script: `python scripts/prompt_creation/create_tasks.py`
Dialect Usage Bias Inference
- Execute the following command: `python scripts/inference.py --model_name $MODEL_PATH --output_folder output/association_usage --gt_file data/prompts/tasks/association_usage.csv`
Dialect Naming Bias Inference
- Execute the following command: `python scripts/inference.py --model_name $MODEL_PATH --output_folder output/association_naming --gt_file data/prompts/tasks/association_naming.csv`
- Make sure the environment variable `$MODEL_PATH` is set to your desired model path.
- Run the implicit bias evaluation script: `python scripts/eval_association.py`
Extract Decisions
- Run the extraction script: `python scripts/extract_decision.py --model_name $MODEL_PATH`

Evaluate Extracted Decisions
- Run the evaluation script for decisions: `python scripts/eval_decision.py`
- In `scripts/init.py`, we specify the model name mapping function used by the evaluation scripts to clean up model names.
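The actual mapping lives in `scripts/init.py` and is not reproduced here; as a rough, hypothetical sketch of what such a cleanup function might do (the example model identifier and the suffix handling are assumptions):

```python
def clean_model_name(raw_name: str) -> str:
    """Map a raw model path/identifier to a short display name.

    Hypothetical sketch; the real mapping in scripts/init.py may differ.
    """
    name = raw_name.split("/")[-1]         # drop the org prefix, e.g. "meta-llama/"
    name = name.removesuffix("-Instruct")  # drop a common fine-tune suffix
    return name

print(clean_model_name("meta-llama/Llama-3.1-8B-Instruct"))  # Llama-3.1-8B
```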
- Outputs of all models will be shared in an online drive folder after acceptance, as they are too large to upload here.
This project is licensed under the xxxx - see the LICENSE.md file for details
