These experiments can be expensive because of the large number of inference calls they require, so we design them to be effective yet cheap.
We first collect all the language model responses: for every language and every model, we run inference to generate a response to each query. We store the responses and classify them into answer choices (we use classification tasks; with other accuracy or reward metrics you could instead classify them as correct/incorrect). We can then treat LSKExtractor and each of the baselines as a language selection method: the method only selects the language of the query, and since the corresponding language model response is already stored, we retrieve it from our stored responses rather than re-running inference. Below is the rough pipeline for running our code.
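As a concrete illustration, the stored responses can be thought of as a lookup table keyed by model, language, and query; a selection method then only needs to pick a language per query and read the cached result. The sketch below uses hypothetical field names and file layout, not the exact format our scripts produce:

```python
import json

# Hypothetical cache layout: one record per (model, language, query).
# Field names are illustrative and may differ from the actual script output.
def load_response_cache(path):
    """Index stored, already-classified responses by (model, language, query_id)."""
    with open(path) as f:
        records = json.load(f)
    return {(r["model"], r["language"], r["query_id"]): r for r in records}

def get_cached_response(cache, model, language, query_id):
    """Retrieve a stored response instead of calling the language model again."""
    return cache[(model, language, query_id)]
```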
To start, process all the data using the .ipynb notebooks in data/.
You can use our script translate_gpt.py to translate queries into different languages with a GPT model.
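For reference, translating a query with a GPT model boils down to a single chat completion call like the one below. This is a simplified sketch using the OpenAI Python client with an assumed prompt and model name, not the exact code in translate_gpt.py:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def translate_query(query, target_language, model="gpt-4o-mini"):
    """Translate one query into the target language (illustrative prompt, not the script's)."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": f"Translate the user's text into {target_language}. Return only the translation."},
            {"role": "user", "content": query},
        ],
    )
    return response.choices[0].message.content.strip()
```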
Now, you can run inference with run_inference.py. If you want to run inference without reasoning, use run_inference_nr.py.
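Conceptually, the inference step generates one response per (language, query) pair and saves the raw text for later classification. The sketch below uses the Hugging Face transformers pipeline with an assumed model name, file layout, and record fields; the real scripts additionally handle prompting, batching, and the reasoning/no-reasoning modes:

```python
import json
from transformers import pipeline

# Illustrative only: model name, paths, and record fields are assumptions.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-7B-Instruct")

def run_inference(queries, language, out_path):
    """Generate and store one raw response per query in the given language."""
    records = []
    for q in queries:
        out = generator(q["prompt"], max_new_tokens=256, do_sample=False)
        records.append({
            "query_id": q["id"],
            "language": language,
            "response": out[0]["generated_text"],
        })
    with open(out_path, "w") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
```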
It is crucial to then run parse_generations_to_classify.py, which maps the LLM responses to answer choices for our classification tasks.
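The parsing step amounts to mapping a free-form generation onto one of the task's answer choices. Below is a minimal sketch of that idea; the real script's matching rules are more careful, and the choice labels here are assumptions:

```python
import re

def classify_generation(generation, choices=("A", "B", "C", "D")):
    """Map a free-form LLM response to an answer choice, or None if no match is found."""
    # Look for a standalone choice letter, e.g. "The answer is B." or "(C)".
    match = re.search(r"\b([A-D])\b", generation.upper())
    if match and match.group(1) in choices:
        return match.group(1)
    return None
```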
Finally, you can run the EVALUATION_... scripts to evaluate LSKExtractor and the baselines.
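At a high level, each evaluation treats a method as a function from query to language and scores it against the cached responses, so no new inference is needed. A hedged sketch, reusing the hypothetical cache from above (method and field names are assumptions):

```python
def evaluate(select_language, cache, model, queries):
    """Accuracy of a language selection method, computed purely from cached responses."""
    correct = 0
    for q in queries:
        lang = select_language(q)                   # e.g., LSKExtractor or a baseline
        record = cache[(model, lang, q["id"])]      # retrieved, not re-generated
        correct += int(record["answer"] == q["gold_answer"])
    return correct / len(queries)

# Example baselines: always query in English vs. always in the query's source language.
# english_acc  = evaluate(lambda q: "en", cache, "gpt-4o-mini", queries)
# original_acc = evaluate(lambda q: q["source_language"], cache, "gpt-4o-mini", queries)
```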
